contiguous_split

class pylibcudf.contiguous_split.ChunkedPack

A chunked version of pack().

This object packs (and therefore serializes) a table piece by piece through a user-provided staging buffer. It is useful when the result should end up in host memory but the memory footprint needs to be controlled.

Methods

build_metadata(self)

Build the metadata for the packed representation.

create(Table input, size_t user_buffer_size, ...)

Create a chunked packer.

get_total_contiguous_size(self)

Get the total size of the packed data.

has_next(self)

Check if the packer has more chunks to pack.

next(self, DeviceBuffer buf)

Pack the next chunk into the provided device buffer.

pack_to_host(self, DeviceBuffer buf)

Pack the entire table into a host buffer.

build_metadata(self) -> memoryview

Build the metadata for the packed representation.

Returns:
memoryview of metadata suitable for passing to unpack_from_memoryviews.

static create(Table input, size_t user_buffer_size, Stream stream, DeviceMemoryResource temp_mr)

Create a chunked packer.

Parameters:
input

The table to pack.

user_buffer_size

Size of the staging buffer to pack into. Must be at least 1MB.

stream

Stream used for device memory operations and kernel launches.

temp_mr

Memory resource for scratch allocations.

Returns:
New ChunkedPack object.
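
As a rough illustration of the size constraint on user_buffer_size, the sketch below validates a requested staging size and computes a lower bound on the number of next() calls a pack would need. It is plain Python and does not call pylibcudf. The names MIN_STAGING_BYTES, check_staging_size and min_next_calls are hypothetical, and reading "1MB" as 1 MiB (1,048,576 bytes) is an assumption.

```python
# Assumption: the documented "1MB" minimum means 1 MiB (1,048,576 bytes).
MIN_STAGING_BYTES = 1 << 20


def check_staging_size(user_buffer_size: int) -> int:
    """Return the size unchanged if it meets the assumed minimum, else raise."""
    if user_buffer_size < MIN_STAGING_BYTES:
        raise ValueError(
            f"user_buffer_size must be at least {MIN_STAGING_BYTES} bytes, "
            f"got {user_buffer_size}"
        )
    return user_buffer_size


def min_next_calls(total_size: int, user_buffer_size: int) -> int:
    """Lower bound on next() calls, using integer ceiling division."""
    size = check_staging_size(user_buffer_size)
    return -(-total_size // size)
```

The bound assumes each next() call writes no more than user_buffer_size bytes. The actual number of calls may be higher.
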

get_total_contiguous_size(self) -> size_t

Get the total size of the packed data.

Returns:
Size of packed data.

has_next(self) -> bool

Check if the packer has more chunks to pack.

Returns:
True if the packer has chunks still to pack.

next(self, DeviceBuffer buf) -> size_t

Pack the next chunk into the provided device buffer.

Parameters:
buf

The device buffer to use as a staging buffer. Must be at least as large as the user_buffer_size used to construct the packer.

Returns:
Number of bytes packed.

Notes

This is stream-ordered with respect to the stream used when creating the ChunkedPack.
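
The has_next()/next() protocol can be sketched as a simple drain loop. The example below is a pure-Python model so that it runs without a GPU: FakeChunkedPack is a hypothetical stand-in that only mimics the documented method names, and a bytearray plays the role of the device staging buffer. With the real class, the staging buffer would be a DeviceBuffer, and copying each chunk out of it would be a device-to-host copy that has to respect the stream ordering noted above.

```python
class FakeChunkedPack:
    """Hypothetical stand-in that mimics the documented method names only."""

    def __init__(self, payload: bytes, user_buffer_size: int):
        self._payload = payload
        self._chunk = user_buffer_size
        self._offset = 0

    def get_total_contiguous_size(self) -> int:
        return len(self._payload)

    def has_next(self) -> bool:
        return self._offset < len(self._payload)

    def next(self, buf: bytearray) -> int:
        # Copy at most one staging buffer's worth and report the byte count.
        chunk = self._payload[self._offset:self._offset + self._chunk]
        buf[:len(chunk)] = chunk
        self._offset += len(chunk)
        return len(chunk)


def drain(packer, staging: bytearray) -> bytearray:
    """Call next() while has_next() is true, collecting every chunk in order."""
    out = bytearray(packer.get_total_contiguous_size())
    written = 0
    while packer.has_next():
        n = packer.next(staging)  # number of bytes placed in the staging buffer
        out[written:written + n] = staging[:n]
        written += n
    return out


payload = bytes(range(256)) * 10
result = drain(FakeChunkedPack(payload, 1024), bytearray(1024))
```

Only one staging buffer is live at a time, which is what keeps the footprint bounded regardless of the total packed size.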

pack_to_host(self, DeviceBuffer buf) -> tuple

Pack the entire table into a host buffer.

Parameters:
buf

The device buffer to use as a staging buffer. Must be at least as large as the user_buffer_size used to construct the packer.

Returns:
tuple of metadata and packed host data (as memoryviews).

Raises:
RuntimeError

If the copy to host fails or an incorrectly sized buffer is provided.

Notes

This is stream-ordered with respect to the stream used when creating the ChunkedPack and syncs that stream before returning.
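
Because the result is a pair of host memoryviews, it can be written to a file or socket with ordinary Python tools. The sketch below shows one possible framing: two little-endian 64-bit lengths followed by the two buffers. The format and the frame/unframe helpers are hypothetical and not part of pylibcudf. Turning the recovered views back into a Table is outside the scope of this sketch.

```python
import struct

HEADER = "<QQ"  # two little-endian unsigned 64-bit lengths (hypothetical format)


def frame(metadata: memoryview, data: memoryview) -> bytes:
    """Concatenate both buffers behind a header holding their byte lengths."""
    header = struct.pack(HEADER, metadata.nbytes, data.nbytes)
    return header + metadata.tobytes() + data.tobytes()


def unframe(blob: bytes):
    """Split a framed blob back into (metadata, data) views over the blob."""
    meta_len, data_len = struct.unpack_from(HEADER, blob, 0)
    start = struct.calcsize(HEADER)
    view = memoryview(blob)
    meta = view[start:start + meta_len]
    data = view[start + meta_len:start + meta_len + data_len]
    return meta, data
```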

class pylibcudf.contiguous_split.PackedColumns

Column data in a serialized format.

Contains data from an array of columns in two contiguous buffers: one on the host, which contains the table metadata, and one on the device, which contains the table data.

For details, see cudf::packed_columns.

Methods

release(self)

Releases and returns the underlying serialized metadata and gpu data.

release(self) -> tuple

Releases and returns the underlying serialized metadata and gpu data.

Ownership of the memory is transferred to the returned buffers. After this call, self is empty.

Returns:
memoryview (of a HostBuffer)

The serialized metadata as contiguous host memory.

gpumemoryview (of a rmm.DeviceBuffer)

The serialized gpu data as contiguous device memory.

pylibcudf.contiguous_split.pack(Table input) -> PackedColumns

Deep-copy a table into a serialized contiguous memory format.

Use unpack or unpack_from_memoryviews later to deserialize the data back into a table.

Parameters:
input : Table

Table to pack.

Returns:
PackedColumns

The packed columns.

Examples

>>> packed = pylibcudf.contiguous_split.pack(...)
>>> # Either unpack the whole `PackedColumns` at once.
>>> pylibcudf.contiguous_split.unpack(packed)
>>> # Or unpack the two serialized buffers in `PackedColumns`.
>>> metadata, gpu_data = packed.release()
>>> pylibcudf.contiguous_split.unpack_from_memoryviews(metadata, gpu_data)

For details, see cudf::pack().

pylibcudf.contiguous_split.unpack(PackedColumns input) -> Table

Deserialize the result of pack.

Copies the result of a serialized table into a table.

For details, see cudf::unpack().

Parameters:
input : PackedColumns

The packed columns to unpack.

Returns:
Table

Copy of the packed columns.

pylibcudf.contiguous_split.unpack_from_memoryviews(memoryview metadata, gpumemoryview gpu_data) -> Table

Deserialize the result of pack.

Copies the result of a serialized table into a table.

For details, see cudf::unpack().

Parameters:
metadata : memoryview

The packed metadata to unpack.

gpu_data : gpumemoryview

The packed gpu_data to unpack.

Returns:
Table

Copy of the packed columns.