wordpiece_tokenize.hpp File Reference
#include <cudf/column/column.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/strings/strings_column_view.hpp>
#include <cudf/utilities/export.hpp>
#include <cudf/utilities/memory_resource.hpp>

Classes

struct  nvtext::wordpiece_vocabulary
 Vocabulary object to be used with nvtext::wordpiece_tokenize.
 

Namespaces

 nvtext
 NVText APIs.
 

Functions

std::unique_ptr< wordpiece_vocabulary > nvtext::load_wordpiece_vocabulary (cudf::strings_column_view const &input, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Create a wordpiece_vocabulary object from a strings column.
 
std::unique_ptr< cudf::column > nvtext::wordpiece_tokenize (cudf::strings_column_view const &input, wordpiece_vocabulary const &vocabulary, cudf::size_type max_words_per_row=0, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns the token ids for the input strings using a wordpiece tokenizer algorithm with the given vocabulary.
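
To illustrate the kind of result a wordpiece tokenizer produces, the following is a minimal CPU-only sketch of the usual greedy longest-match-first scheme. It is not the libcudf implementation and does not call the API above. The helper names (`build_vocab`, `tokenize_word`, `tokenize_row`) are hypothetical, and several behaviors are assumptions made for the example: a token's id is its position in the vocabulary list, continuation pieces carry a `##` prefix (the BERT convention), a word that cannot be fully matched maps to a single unknown-token id, and a `max_words` of 0 means no limit on words per row.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical CPU reference for illustration only; not the libcudf code.
// Assumption: a token's id is its index in the vocabulary list.
using vocab_map = std::unordered_map<std::string, int>;

vocab_map build_vocab(std::vector<std::string> const& entries)
{
  vocab_map vocab;
  for (int i = 0; i < static_cast<int>(entries.size()); ++i) {
    vocab.emplace(entries[i], i);
  }
  return vocab;
}

// Greedy longest-match-first split of one word into vocabulary ids.
// Pieces after the first are looked up with a "##" prefix (assumed convention).
// If any part of the word cannot be matched, the whole word maps to unk_id.
// Matching is byte-wise, so this sketch is only meaningful for ASCII input.
std::vector<int> tokenize_word(std::string const& word, vocab_map const& vocab, int unk_id)
{
  std::vector<int> ids;
  std::size_t start = 0;
  while (start < word.size()) {
    std::size_t end = word.size();
    int found       = -1;
    while (end > start) {
      std::string piece = word.substr(start, end - start);
      if (start > 0) { piece = "##" + piece; }
      auto it = vocab.find(piece);
      if (it != vocab.end()) {
        found = it->second;
        break;
      }
      --end;  // try a shorter candidate
    }
    if (found < 0) { return {unk_id}; }
    ids.push_back(found);
    start = end;
  }
  return ids;
}

// Splits a row on whitespace and tokenizes each word in turn.
// Assumption: max_words == 0 means "tokenize every word in the row".
std::vector<int> tokenize_row(std::string const& row,
                              vocab_map const& vocab,
                              int unk_id,
                              int max_words = 0)
{
  std::vector<int> ids;
  std::istringstream words(row);
  std::string word;
  int count = 0;
  while (words >> word) {
    if (max_words > 0 && count >= max_words) { break; }
    ++count;
    auto piece_ids = tokenize_word(word, vocab, unk_id);
    ids.insert(ids.end(), piece_ids.begin(), piece_ids.end());
  }
  return ids;
}
```

With the vocabulary `{"[UNK]", "un", "##aff", "##able", "the"}`, this sketch splits the word `unaffable` into the pieces `un`, `##aff`, `##able`. Whether nvtext::wordpiece_tokenize handles unmatched words, non-ASCII text, or the word limit the same way should be checked against the detailed description of the function.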