contiguous_split
- class pylibcudf.contiguous_split.ChunkedPack
A chunked version of pack().
This object can be used to pack (and therefore serialize) a table piece by piece through a user-provided staging buffer. This is useful when the end result should live in host memory but the memory footprint needs to stay under the caller's control.
Methods
build_metadata(self): Build the metadata for the packed representation.
create(Table input, size_t user_buffer_size, ...): Create a chunked packer.
get_total_contiguous_size(self): Get the total size of the packed data.
has_next(self): Check if the packer has more chunks to pack.
next(self, DeviceBuffer buf): Pack the next chunk into the provided device buffer.
pack_to_host(self, DeviceBuffer buf): Pack the entire table into a host buffer.
- build_metadata(self) -> memoryview
Build the metadata for the packed representation.
- Returns:
- memoryview of metadata suitable for passing to unpack_from_memoryviews.
- static create(Table input, size_t user_buffer_size, Stream stream, DeviceMemoryResource temp_mr)
Create a chunked packer.
- Parameters:
- input
The table to pack.
- user_buffer_size
Size of the staging buffer to pack into; must be at least 1MB.
- stream
Stream used for device memory operations and kernel launches.
- temp_mr
Memory resource for scratch allocations.
- Returns:
- New ChunkedPack object.
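The two size constraints in this class (the 1MB minimum on user_buffer_size, and the requirement documented for next() and pack_to_host() that the staging buffer be at least user_buffer_size bytes) can be checked up front. The following is a minimal sketch, not part of pylibcudf: the helper name `check_staging_sizes` is hypothetical, and reading "1MB" as 1024 * 1024 bytes is an assumption.

```python
# Assumption: "1MB" in the documentation is read as 2**20 bytes.
ONE_MIB = 1024 * 1024


def check_staging_sizes(user_buffer_size: int, staging_nbytes: int) -> None:
    """Hypothetical pre-flight check; raises ValueError on a bad size.

    user_buffer_size: the value intended for ChunkedPack.create().
    staging_nbytes:   the size in bytes of the staging buffer that will
                      later be handed to next() or pack_to_host().
    """
    if user_buffer_size < ONE_MIB:
        raise ValueError(
            f"user_buffer_size must be at least {ONE_MIB} bytes, "
            f"got {user_buffer_size}"
        )
    if staging_nbytes < user_buffer_size:
        raise ValueError(
            f"staging buffer ({staging_nbytes} bytes) is smaller than "
            f"user_buffer_size ({user_buffer_size} bytes)"
        )


# A 2 MiB staging buffer is acceptable for a 1 MiB user_buffer_size.
check_staging_sizes(ONE_MIB, 2 * ONE_MIB)
```

Checking both sizes before constructing the packer surfaces a mismatch as an ordinary Python exception rather than as a failure partway through packing.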
- get_total_contiguous_size(self) -> size_t
Get the total size of the packed data.
- Returns:
- Size of packed data.
- has_next(self) -> bool
Check if the packer has more chunks to pack.
- Returns:
- True if the packer has chunks still to pack.
- next(self, DeviceBuffer buf) -> size_t
Pack the next chunk into the provided device buffer.
- Parameters:
- buf
The device buffer to use as a staging buffer. It must be at least as large as the user_buffer_size used to construct the packer.
- Returns:
- Number of bytes packed.
Notes
This is stream-ordered with respect to the stream used when creating the ChunkedPack.
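The has_next()/next() pair is meant to be driven in a loop that reuses one staging buffer. The sketch below shows the shape of that loop using only the two documented methods. Everything else is an assumption made so the example runs without a GPU: `drain_chunks`, the `copy_out` callback, and `FakePacker` are hypothetical names, and `FakePacker` is a host-only stand-in, not a pylibcudf class.

```python
from typing import Callable, Protocol


class ChunkedPacker(Protocol):
    """Structural type covering only the two methods the loop relies on."""

    def has_next(self) -> bool: ...
    def next(self, buf) -> int: ...


def drain_chunks(
    packer: ChunkedPacker,
    staging,
    copy_out: Callable[[object, int], bytes],
) -> bytes:
    """Pull every chunk out of `packer` through a single staging buffer."""
    parts = []
    while packer.has_next():
        nbytes = packer.next(staging)  # number of bytes written into staging
        # Only the first `nbytes` bytes of the staging buffer are valid.
        # With a real device buffer, `copy_out` must be ordered after next()
        # on the packer's stream before the buffer is reused.
        parts.append(copy_out(staging, nbytes))
    return b"".join(parts)


class FakePacker:
    """Host-only stand-in (NOT part of pylibcudf) that mimics the chunking."""

    def __init__(self, payload: bytes, chunk_size: int):
        self._payload = payload
        self._chunk_size = chunk_size
        self._offset = 0

    def has_next(self) -> bool:
        return self._offset < len(self._payload)

    def next(self, buf: bytearray) -> int:
        chunk = self._payload[self._offset:self._offset + self._chunk_size]
        buf[:len(chunk)] = chunk
        self._offset += len(chunk)
        return len(chunk)


staging = bytearray(4)
packed = drain_chunks(
    FakePacker(b"0123456789", chunk_size=4),
    staging,
    lambda buf, n: bytes(buf[:n]),
)
```

With the real class, `packer` would come from ChunkedPack.create(), `staging` would be a device buffer of at least user_buffer_size bytes, and `copy_out` would perform a device-to-host copy. When the whole table should simply land in host memory, pack_to_host() does that in a single call.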
- pack_to_host(self, DeviceBuffer buf) -> tuple
Pack the entire table into a host buffer.
- Parameters:
- buf
The device buffer to use as a staging buffer. It must be at least as large as the user_buffer_size used to construct the packer.
- Returns:
- tuple of metadata and packed host data (as memoryviews)
- Raises:
- RuntimeError
If the copy to host fails or an incorrectly sized buffer is provided.
Notes
This is stream-ordered with respect to the stream used when creating the ChunkedPack, and it synchronizes that stream before returning.
- class pylibcudf.contiguous_split.PackedColumns
Column data in a serialized format.
Contains data from an array of columns in two contiguous buffers: one on host, which contains the table metadata, and one on device, which contains the table data.
For details, see cudf::packed_columns.
Methods
release(self): Releases and returns the underlying serialized metadata and gpu data.
- release(self) -> tuple
Releases and returns the underlying serialized metadata and gpu data.
Ownership of the memory is transferred to the returned buffers. After this call, self is empty.
- Returns:
- memoryview (of a HostBuffer)
The serialized metadata as contiguous host memory.
- gpumemoryview (of a rmm.DeviceBuffer)
The serialized gpu data as contiguous device memory.
- pylibcudf.contiguous_split.pack(Table input) -> PackedColumns
Deep-copy a table into a serialized contiguous memory format.
Use unpack or unpack_from_memoryviews later to deserialize the data back into a table.
- Parameters:
- input : Table
Table to pack.
- Returns:
- PackedColumns
The packed columns.
Examples
>>> packed = pylibcudf.contiguous_split.pack(...)
>>> # Either unpack the whole `PackedColumns` at once.
>>> pylibcudf.contiguous_split.unpack(packed)
>>> # Or unpack the two serialized buffers in `PackedColumns`.
>>> metadata, gpu_data = packed.release()
>>> pylibcudf.contiguous_split.unpack_from_memoryviews(metadata, gpu_data)
For details, see cudf::pack().
- pylibcudf.contiguous_split.unpack(PackedColumns input) -> Table
Deserialize the result of pack.
Copies the result of a serialized table into a table.
For details, see cudf::unpack().
- Parameters:
- input : PackedColumns
The packed columns to unpack.
- Returns:
- Table
Copy of the packed columns.
- pylibcudf.contiguous_split.unpack_from_memoryviews(memoryview metadata, gpumemoryview gpu_data) -> Table
Deserialize the result of pack.
Copies the result of a serialized table into a table.
For details, see cudf::unpack().
- Parameters:
- metadata : memoryview
The packed metadata to unpack.
- gpu_data : gpumemoryview
The packed gpu_data to unpack.
- Returns:
- Table
Copy of the packed columns.