IVF-PQ#
IVF-PQ is an approximate nearest neighbor (ANN) algorithm. Like IVF-Flat, IVF-PQ splits the points into a number of clusters (specified by the parameter n_lists) and computes the nearest neighbors by searching only the closest clusters (their number is specified by the parameter n_probes). Unlike IVF-Flat, it also shrinks the vectors using a technique called product quantization.
#include <cuvs/neighbors/ivf_pq.hpp>
namespace cuvs::neighbors::ivf_pq
Index build parameters#
-
enum class codebook_gen
A type for specifying how PQ codebooks are created.
Values:
-
enumerator PER_SUBSPACE
-
enumerator PER_CLUSTER
-
struct index_params : public cuvs::neighbors::ann::index_params#
- #include <ivf_pq.hpp>
Public Functions
-
inline operator raft::neighbors::ivf_pq::index_params() const#
Build a raft IVF_PQ index params from an existing cuvs IVF_PQ index params.
Public Members
-
uint32_t n_lists = 1024#
The number of inverted lists (clusters)
Hint: the number of vectors per cluster (n_rows/n_lists) should be approximately 1,000 to 10,000.
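The hint above can be turned into a quick sanity check. The following is a minimal sketch, not part of cuVS: the helper name suggest_n_lists_range is hypothetical, and it only restates the 1,000 to 10,000 vectors-per-cluster guideline as arithmetic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not part of cuVS): returns the {min, max} values of n_lists
// that keep n_rows / n_lists between roughly 1,000 and 10,000 vectors per cluster.
inline std::pair<uint32_t, uint32_t> suggest_n_lists_range(uint64_t n_rows)
{
  assert(n_rows > 0);
  // About 10,000 vectors per cluster gives the fewest lists; about 1,000 gives the most.
  const uint64_t fewest = std::max<uint64_t>(1, n_rows / 10000);
  const uint64_t most   = std::max<uint64_t>(1, n_rows / 1000);
  return {static_cast<uint32_t>(fewest), static_cast<uint32_t>(most)};
}
```

For a dataset of 10 million rows this yields the range 1,000 to 10,000, so the default n_lists = 1024 sits near the lower end of the suggested range.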
-
uint32_t kmeans_n_iters = 20#
The number of iterations searching for kmeans centers (index building).
-
double kmeans_trainset_fraction = 0.5#
The fraction of data to use during iterative kmeans building.
-
uint32_t pq_bits = 8#
The bit length of the vector element after compression by PQ.
Possible values: [4, 5, 6, 7, 8].
Hint: the smaller the ‘pq_bits’, the smaller the index size and the better the search performance, but the lower the recall.
-
uint32_t pq_dim = 0#
The dimensionality of the vector after compression by PQ. When zero, an optimal value is selected using a heuristic.
NB: pq_dim * pq_bits must be a multiple of 8.
Hint: a smaller ‘pq_dim’ results in a smaller index size and better search performance, but lower recall. If ‘pq_bits’ is 8, ‘pq_dim’ can be set to any number, but multiples of 8 are desirable for good performance. If ‘pq_bits’ is not 8, ‘pq_dim’ should be a multiple of 8. For good performance, it is desirable that ‘pq_dim’ is a multiple of 32. Ideally, ‘pq_dim’ should also be a divisor of the dataset dim.
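As an illustration of how pq_bits and pq_dim interact, here is a small sketch that checks the two hard constraints stated above (pq_bits in [4, 8] and pq_dim * pq_bits divisible by 8) and computes the size of one encoded vector. Both function names are hypothetical and not part of cuVS; the size estimate ignores any padding or per-list metadata the real index may add, and the performance hints (multiples of 8 or 32) are deliberately not enforced.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of cuVS): checks only the documented hard constraints.
inline bool pq_params_valid(uint32_t pq_dim, uint32_t pq_bits)
{
  const bool bits_ok = pq_bits >= 4 && pq_bits <= 8;
  const bool size_ok = pq_dim > 0 && (pq_dim * pq_bits) % 8 == 0;
  return bits_ok && size_ok;
}

// Bytes occupied by the PQ codes of a single vector: pq_dim codes of pq_bits bits each.
inline uint32_t pq_code_bytes(uint32_t pq_dim, uint32_t pq_bits)
{
  assert(pq_params_valid(pq_dim, pq_bits));
  return pq_dim * pq_bits / 8;
}
```

For example, a 128-dimensional float vector takes 512 bytes uncompressed; with pq_dim = 64 and pq_bits = 8 its codes take 64 bytes, and with pq_bits = 4 they take 32 bytes.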
-
codebook_gen codebook_kind = codebook_gen::PER_SUBSPACE#
How PQ codebooks are created.
-
bool force_random_rotation = false#
Apply a random rotation matrix on the input data and queries even if dim % pq_dim == 0.
Note: if dim is not a multiple of pq_dim, a random rotation is always applied to the input data and queries to transform the working space from dim to rot_dim, which may be slightly larger than the original space and is a multiple of pq_dim (rot_dim % pq_dim == 0). However, this transform is not necessary when dim is a multiple of pq_dim (dim == rot_dim, hence no need to add “extra” data columns / features).
By default, if dim == rot_dim, the rotation transform is initialized with the identity matrix. When force_random_rotation == true, a random orthogonal transform matrix is generated regardless of the values of dim and pq_dim.
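The relationship between dim, pq_dim and rot_dim described above can be sketched as follows. This is an illustration only: the struct and function names are hypothetical, and taking pq_len as the ceiling of dim / pq_dim is an assumption (it is the smallest value for which rot_dim = pq_dim * pq_len is at least dim).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the rotated layout (not a cuVS type).
struct rotation_layout {
  uint32_t pq_len;             // number of components per subvector
  uint32_t rot_dim;            // dimensionality after the transform
  bool needs_random_rotation;  // false means the identity transform is sufficient
};

inline rotation_layout compute_rotation_layout(uint32_t dim,
                                               uint32_t pq_dim,
                                               bool force_random_rotation)
{
  assert(dim > 0 && pq_dim > 0);
  // Assumption: pq_len = ceil(dim / pq_dim), so that rot_dim >= dim.
  const uint32_t pq_len  = (dim + pq_dim - 1) / pq_dim;
  const uint32_t rot_dim = pq_len * pq_dim;
  // A random rotation is needed when the space is padded or when explicitly requested.
  return {pq_len, rot_dim, force_random_rotation || rot_dim != dim};
}
```

With dim = 100 and pq_dim = 32 the sketch gives pq_len = 4 and rot_dim = 128, so a random rotation is applied; with dim = 128 and pq_dim = 32 the dimensions already match and the rotation is only applied if force_random_rotation is set.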
-
bool conservative_memory_allocation = false#
By default, the algorithm allocates more space than necessary for individual clusters (list_data). This amortizes the cost of memory allocation and reduces the number of data copies during repeated calls to extend (extending the database).
The alternative is the conservative allocation behavior; when enabled, the algorithm always allocates the minimum amount of memory required to store the given number of records. Set this flag to true if you prefer to use as little GPU memory for the database as possible.
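The trade-off can be illustrated with a toy model of a single list that grows over repeated extend calls. The growth policy below (round the capacity up to the next power of two) is purely an assumption made for the illustration; the actual over-allocation policy of cuVS is an implementation detail, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical growth policy, for illustration only.
inline std::size_t next_capacity(std::size_t required, bool conservative)
{
  if (conservative) { return required; }  // allocate exactly what is needed
  std::size_t capacity = 1;
  while (capacity < required) { capacity *= 2; }  // round up to a power of two
  return capacity;
}

// Counts how many times the list storage has to be reallocated (and copied)
// while batches of the given sizes are appended one after another.
inline std::size_t count_reallocations(const std::vector<std::size_t>& batch_sizes,
                                       bool conservative)
{
  std::size_t size = 0, capacity = 0, reallocations = 0;
  for (std::size_t batch : batch_sizes) {
    size += batch;
    if (size > capacity) {
      capacity = next_capacity(size, conservative);
      ++reallocations;
    }
  }
  assert(capacity >= size);
  return reallocations;
}
```

Under this model, appending 100 batches of 10 records triggers 100 reallocations with the conservative policy and 7 with the over-allocating one, at the cost of up to 1,024 slots reserved for 1,000 records.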
Index search parameters#
-
struct search_params : public cuvs::neighbors::ann::search_params#
- #include <ivf_pq.hpp>
Public Functions
-
inline operator raft::neighbors::ivf_pq::search_params() const#
Build a raft IVF_PQ search params from an existing cuvs IVF_PQ search params.
Public Members
-
uint32_t n_probes = 20#
The number of clusters to search.
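Conceptually, n_probes controls a coarse search step: the query is compared against the cluster centers and only the n_probes closest lists are scanned. The following host-side sketch shows that selection under the assumption of a squared L2 metric; it is not the GPU implementation used by cuVS, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Squared L2 distance between two vectors of equal length.
inline float squared_l2(const std::vector<float>& a, const std::vector<float>& b)
{
  assert(a.size() == b.size());
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const float d = a[i] - b[i];
    acc += d * d;
  }
  return acc;
}

// Returns the ids of the n_probes cluster centers closest to the query, nearest first.
inline std::vector<uint32_t> select_probes(const std::vector<std::vector<float>>& centers,
                                           const std::vector<float>& query,
                                           uint32_t n_probes)
{
  const std::size_t n = std::min<std::size_t>(n_probes, centers.size());
  std::vector<float> dist(centers.size());
  for (std::size_t i = 0; i < centers.size(); ++i) { dist[i] = squared_l2(centers[i], query); }
  std::vector<uint32_t> ids(centers.size());
  std::iota(ids.begin(), ids.end(), 0u);
  // Only the first n positions need to be ordered.
  std::partial_sort(ids.begin(), ids.begin() + n, ids.end(),
                    [&dist](uint32_t a, uint32_t b) { return dist[a] < dist[b]; });
  ids.resize(n);
  return ids;
}
```

A larger n_probes scans more lists per query, which generally raises recall and lowers throughput; setting it to n_lists scans every list.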
-
cudaDataType_t lut_dtype = CUDA_R_32F#
Data type of look up table to be created dynamically at search time.
Possible values: [CUDA_R_32F, CUDA_R_16F, CUDA_R_8U]
The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimensionality. Note that the recall is slightly degraded when a low-precision type is selected.
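To get a feeling for the effect of lut_dtype, the look-up table size can be estimated as one entry per (subspace, codebook vector) pair. The sketch below is a rough estimate under that assumption; it ignores any other shared memory the search kernel needs, and lut_type is a hypothetical stand-in for the three documented cudaDataType_t values so that the snippet compiles without CUDA headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CUDA_R_32F, CUDA_R_16F and CUDA_R_8U.
enum class lut_type { fp32, fp16, u8 };

inline std::size_t lut_element_bytes(lut_type t)
{
  switch (t) {
    case lut_type::fp32: return 4;
    case lut_type::fp16: return 2;
    case lut_type::u8: return 1;
  }
  return 0;  // not reached for valid enumerators
}

// Rough estimate: pq_dim * pq_book_size entries, where pq_book_size = 1 << pq_bits.
inline std::size_t lut_bytes(uint32_t pq_dim, uint32_t pq_bits, lut_type t)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  const std::size_t pq_book_size = std::size_t{1} << pq_bits;
  return pq_dim * pq_book_size * lut_element_bytes(t);
}
```

With pq_dim = 128 and pq_bits = 8, the estimate is 128 KiB for CUDA_R_32F, 64 KiB for CUDA_R_16F and 32 KiB for CUDA_R_8U, which shows why a lower-precision table is more likely to fit into shared memory.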
-
cudaDataType_t internal_distance_dtype = CUDA_R_32F#
Storage data type for distance/similarity computed at search time.
Possible values: [CUDA_R_16F, CUDA_R_32F]
If the performance limiter at search time is device memory access, selecting FP16 will improve performance slightly.
-
double preferred_shmem_carveout = 1.0#
Preferred fraction of SM’s unified memory / L1 cache to be used as shared memory.
Possible values: [0.0 - 1.0] as a fraction of the sharedMemPerMultiprocessor.
Increasing the carveout helps the main search kernel achieve good GPU occupancy, but setting it too high leaves too little memory to be used as L1 cache. Note, this value is interpreted only as a hint. Moreover, a GPU usually allows only a fixed set of cache configurations, so the provided value is rounded up to the nearest configuration. Refer to the NVIDIA tuning guide for the target GPU architecture.
Note, this is a low-level tuning parameter that can have drastic negative effects on the search performance if tweaked incorrectly.
Index#
-
template<typename IdxT>
struct index : public cuvs::neighbors::ann::index#
- #include <ivf_pq.hpp>
IVF-PQ index.
In the IVF-PQ index, a database vector y is approximated with two level quantization:
y ≈ Q_1(y) + Q_2(y - Q_1(y))
The first level quantizer (Q_1), maps the vector y to the nearest cluster center. The number of clusters is n_lists.
The second quantizer encodes the residual, and it is defined as a product quantizer [1].
A product quantizer encodes a dim-dimensional vector with a pq_dim-dimensional vector. First we split the input vector into pq_dim subvectors (denoted by u), where each u vector contains pq_len distinct components of y:
u_1 = (y_1, y_2, …, y_{pq_len})
u_2 = (y_{pq_len+1}, …, y_{2*pq_len})
…
u_{pq_dim} = (y_{dim-pq_len+1}, …, y_{dim})
Then each subvector is encoded with a separate quantizer q_i, and the results are concatenated:
Q_2(y) = q_1(u_1), q_2(u_2), …, q_{pq_dim}(u_{pq_dim})
Each quantizer q_i outputs a code with pq_bits bits. The second-level quantizers are also defined by k-means clustering in the corresponding sub-space: the reproduction values are the centroids, and the set of reproduction values is the codebook.
When the data dimensionality dim is not a multiple of pq_dim, the feature space is transformed using a random orthogonal matrix to have rot_dim = pq_dim * pq_len dimensions (rot_dim >= dim).
The second-level quantizers are trained either for each subspace or for each cluster:
(a) codebook_gen::PER_SUBSPACE: creates pq_dim second-level quantizers - one for each slice of the data along features;
(b) codebook_gen::PER_CLUSTER: creates n_lists second-level quantizers - one for each first-level cluster.
In either case, the centroids are again found using k-means clustering, interpreting the data as having pq_len dimensions.
[1] Herve Jegou, Matthijs Douze, Cordelia Schmid. Product quantization for nearest neighbor search.
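The second-level encoding described above can be sketched on the host as follows. This is a minimal illustration, not the cuVS implementation: it assumes the PER_SUBSPACE layout with one codebook per subvector, a squared L2 metric, an input whose length is already a multiple of pq_dim, and one code stored per byte (the real index packs codes to pq_bits bits). The names codebook_t and pq_encode are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// codebook[c] is centroid c of one subspace and has pq_len components.
using codebook_t = std::vector<std::vector<float>>;

// Encodes y (typically a residual y - Q_1(y)) as pq_dim codes, one per subvector u_s.
inline std::vector<uint8_t> pq_encode(const std::vector<float>& y,
                                      const std::vector<codebook_t>& codebooks)
{
  const std::size_t pq_dim = codebooks.size();
  assert(pq_dim > 0 && y.size() % pq_dim == 0);
  const std::size_t pq_len = y.size() / pq_dim;
  std::vector<uint8_t> codes(pq_dim, 0);
  for (std::size_t s = 0; s < pq_dim; ++s) {
    assert(codebooks[s].size() <= 256);  // codes must fit into pq_bits <= 8 bits
    float best = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < codebooks[s].size(); ++c) {
      assert(codebooks[s][c].size() == pq_len);
      // Squared distance between subvector u_s and centroid c of subspace s.
      float dist = 0.0f;
      for (std::size_t j = 0; j < pq_len; ++j) {
        const float d = y[s * pq_len + j] - codebooks[s][c][j];
        dist += d * d;
      }
      if (dist < best) {
        best     = dist;
        codes[s] = static_cast<uint8_t>(c);
      }
    }
  }
  return codes;
}
```

Each output element is the index of the nearest centroid in the corresponding codebook, so a vector of dim floats is reduced to pq_dim small integers.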
- Template Parameters:
IdxT – type of the indices in the source dataset
Public Functions
-
index(raft::resources const &handle, const index_params &params, uint32_t dim)#
Construct an empty index. It needs to be trained and then populated.
-
uint32_t dim() const noexcept#
Dimensionality of the input data.
-
uint32_t dim_ext() const noexcept#
Dimensionality of the cluster centers: input data dim extended with vector norms and padded to 8 elems.
-
uint32_t rot_dim() const noexcept#
Dimensionality of the data after transforming it for PQ processing (rotated and augmented to be a multiple of pq_dim).
-
uint32_t pq_bits() const noexcept#
The bit length of an encoded vector element after compression by PQ.
-
uint32_t pq_dim() const noexcept#
The dimensionality of an encoded vector after compression by PQ.
-
uint32_t pq_len() const noexcept#
Dimensionality of a subspace, i.e. the number of vector components mapped to a subspace.
-
uint32_t pq_book_size() const noexcept#
The number of vectors in a PQ codebook (1 << pq_bits).
-
cuvs::distance::DistanceType metric() const noexcept#
Distance metric used for clustering.
-
codebook_gen codebook_kind() const noexcept#
How PQ codebooks are created.
-
uint32_t n_lists() const noexcept#
Number of clusters/inverted lists (first level quantization).
-
bool conservative_memory_allocation() const noexcept#
Whether to use conservative memory allocation when extending the list (cluster) data (see index_params.conservative_memory_allocation).
-
raft::mdspan<float, pq_centers_extents, raft::row_major> pq_centers() noexcept#
PQ cluster centers
codebook_gen::PER_SUBSPACE: [pq_dim , pq_len, pq_book_size]
codebook_gen::PER_CLUSTER: [n_lists, pq_len, pq_book_size]
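The two layouts differ only in their leading extent. As a small sketch, the shape can be derived from the index parameters as shown below; the function name is hypothetical, and the enum is a local mirror of the documented codebook_gen so that the snippet compiles without cuVS headers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Local mirror of the documented enum, for illustration only.
enum class codebook_gen { PER_SUBSPACE, PER_CLUSTER };

// Returns the extents [number of codebooks, pq_len, pq_book_size].
inline std::array<uint32_t, 3> pq_centers_shape(codebook_gen kind,
                                                uint32_t pq_dim,
                                                uint32_t n_lists,
                                                uint32_t pq_len,
                                                uint32_t pq_bits)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  const uint32_t pq_book_size = 1u << pq_bits;
  // One codebook per subspace, or one codebook per first-level cluster.
  const uint32_t n_codebooks = (kind == codebook_gen::PER_SUBSPACE) ? pq_dim : n_lists;
  return {n_codebooks, pq_len, pq_book_size};
}
```

For pq_dim = 32, n_lists = 1024, pq_len = 4 and pq_bits = 8 this gives [32, 4, 256] for PER_SUBSPACE and [1024, 4, 256] for PER_CLUSTER, so the per-cluster variant stores 32 times more codebook data in this configuration.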
-
raft::device_vector_view<uint8_t*, uint32_t, raft::row_major> data_ptrs() noexcept#
Pointers to the inverted lists (clusters) data [n_lists].
-
raft::device_vector_view<IdxT*, uint32_t, raft::row_major> inds_ptrs() noexcept#
Pointers to the inverted lists (clusters) indices [n_lists].
-
raft::device_matrix_view<float, uint32_t, raft::row_major> rotation_matrix() noexcept#
The transform matrix (original space -> rotated padded space) [rot_dim, dim]
-
raft::host_vector_view<IdxT, uint32_t, raft::row_major> accum_sorted_sizes() noexcept#
Accumulated list sizes, sorted in descending order [n_lists + 1]. The last value contains the total length of the index. The value at index zero is always zero.
That is, the content of this span is as if the list_sizes was sorted and then accumulated.
This span is used during search to estimate the maximum size of the workspace.
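The description above corresponds to a descending sort followed by an exclusive prefix sum with the total appended. A host-side sketch of that computation, with a hypothetical function name and plain std::vector in place of the raft views, could look like this:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Builds the [n_lists + 1] array: element 0 is zero, element i is the total size
// of the i largest lists, and the last element is the total length of the index.
inline std::vector<int64_t> accumulate_sorted_sizes(std::vector<uint32_t> list_sizes)
{
  std::sort(list_sizes.begin(), list_sizes.end(), std::greater<uint32_t>());
  std::vector<int64_t> accum(list_sizes.size() + 1, 0);
  for (std::size_t i = 0; i < list_sizes.size(); ++i) {
    accum[i + 1] = accum[i] + list_sizes[i];
  }
  assert(accum.front() == 0);
  return accum;
}
```

Because the sizes are sorted in descending order, element p of the result is an upper bound on the number of records contained in any p lists, which is what makes it useful for sizing a worst-case workspace.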
-
raft::device_vector_view<uint32_t, uint32_t, raft::row_major> list_sizes() noexcept#
Sizes of the lists [n_lists].
-
raft::device_matrix_view<float, uint32_t, raft::row_major> centers() noexcept#
Cluster centers corresponding to the lists in the original space [n_lists, dim_ext]
-
raft::device_matrix_view<float, uint32_t, raft::row_major> centers_rot() noexcept#
Cluster centers corresponding to the lists in the rotated space [n_lists, rot_dim]
Index build#
-
auto build(raft::resources const &handle, const cuvs::neighbors::ivf_pq::index_params &index_params, raft::device_matrix_view<const float, int64_t, raft::row_major> dataset) -> cuvs::neighbors::ivf_pq::index<int64_t>#
Build the index from the dataset for efficient search.
Usage example:
    using namespace cuvs::neighbors;
    // use default index parameters
    ivf_pq::index_params index_params;
    // create and fill the index from a [N, D] dataset
    auto index = ivf_pq::build(handle, index_params, dataset);
- Parameters:
handle – [in]
index_params – configure the index building
dataset – [in] a device matrix view to a row-major matrix [n_rows, dim]
- Returns:
the constructed ivf-pq index
-
void build(raft::resources const &handle, const cuvs::neighbors::ivf_pq::index_params &index_params, raft::device_matrix_view<const float, int64_t, raft::row_major> dataset, cuvs::neighbors::ivf_pq::index<int64_t> *idx)#
Build the index from the dataset for efficient search.
Usage example:
    using namespace cuvs::neighbors;
    // use default index parameters
    ivf_pq::index_params index_params;
    // construct an empty index, then create and fill it from a [N, D] dataset
    ivf_pq::index<int64_t> index(handle, index_params, dataset.extent(1));
    ivf_pq::build(handle, index_params, dataset, &index);
- Parameters:
handle – [in]
index_params – configure the index building
dataset – [in] raft::device_matrix_view to a row-major matrix [n_rows, dim]
idx – [out] pointer to ivf_pq::index
-
auto build(raft::resources const &handle, const cuvs::neighbors::ivf_pq::index_params &index_params, raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset) -> cuvs::neighbors::ivf_pq::index<int64_t>#
Build the index from the dataset for efficient search.
Usage example:
    using namespace cuvs::neighbors;
    // use default index parameters
    ivf_pq::index_params index_params;
    // create and fill the index from a [N, D] dataset
    auto index = ivf_pq::build(handle, index_params, dataset);
- Parameters:
handle – [in]
index_params – configure the index building
dataset – [in] a device matrix view to a row-major matrix [n_rows, dim]
- Returns:
the constructed ivf-pq index
-
void build(raft::resources const &handle, const cuvs::neighbors::ivf_pq::index_params &index_params, raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset, cuvs::neighbors::ivf_pq::index<int64_t> *idx)#
Build the index from the dataset for efficient search.
Usage example:
    using namespace cuvs::neighbors;
    // use default index parameters
    ivf_pq::index_params index_params;
    // construct an empty index, then create and fill it from a [N, D] dataset
    ivf_pq::index<int64_t> index(handle, index_params, dataset.extent(1));
    ivf_pq::build(handle, index_params, dataset, &index);
- Parameters:
handle – [in]
index_params – configure the index building
dataset – [in] raft::device_matrix_view to a row-major matrix [n_rows, dim]
idx – [out] pointer to ivf_pq::index
-
auto build(raft::resources const &handle, const cuvs::neighbors::ivf_pq::index_params &index_params, raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset) -> cuvs::neighbors::ivf_pq::index<int64_t>#
Build the index from the dataset for efficient search.
Usage example:
    using namespace cuvs::neighbors;
    // use default index parameters
    ivf_pq::index_params index_params;
    // create and fill the index from a [N, D] dataset
    auto index = ivf_pq::build(handle, index_params, dataset);
- Parameters:
handle – [in]
index_params – configure the index building
dataset – [in] a device matrix view to a row-major matrix [n_rows, dim]
- Returns:
the constructed ivf-pq index
-
void build(raft::resources const &handle, const cuvs::neighbors::ivf_pq::index_params &index_params, raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset, cuvs::neighbors::ivf_pq::index<int64_t> *idx)#
Build the index from the dataset for efficient search.
Usage example:
    using namespace cuvs::neighbors;
    // use default index parameters
    ivf_pq::index_params index_params;
    // construct an empty index, then create and fill it from a [N, D] dataset
    ivf_pq::index<int64_t> index(handle, index_params, dataset.extent(1));
    ivf_pq::build(handle, index_params, dataset, &index);
- Parameters:
handle – [in]
index_params – configure the index building
dataset – [in] raft::device_matrix_view to a row-major matrix [n_rows, dim]
idx – [out] pointer to ivf_pq::index
Index extend#
-
auto extend(raft::resources const &handle, raft::device_matrix_view<const float, int64_t, raft::row_major> new_vectors, std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices, const cuvs::neighbors::ivf_pq::index<int64_t> &idx) -> cuvs::neighbors::ivf_pq::index<int64_t>#
Extend the index with the new data.
Usage example:
    using namespace cuvs::neighbors;
    ivf_pq::index_params index_params;
    index_params.add_data_on_build = false;       // don't populate index on build
    index_params.kmeans_trainset_fraction = 1.0;  // use whole dataset for kmeans training
    // train the index from a [N, D] dataset
    auto index_empty = ivf_pq::build(handle, index_params, dataset);
    // fill the index with the data
    std::optional<raft::device_vector_view<const int64_t, int64_t>> no_op = std::nullopt;
    auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);
- Parameters:
handle – [in]
new_vectors – [in] a device matrix view to a row-major matrix [n_rows, idx.dim()]
new_indices – [in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).
idx – [inout]
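The meaning of passing std::nullopt as new_indices can be sketched on the host as follows. This only illustrates the documented semantics (an omitted index list stands for the continuous range [0...n_rows) when the index is empty); the function name is hypothetical and std::vector is used in place of the device views.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Returns the ids that the new vectors would be stored under.
inline std::vector<int64_t> resolve_new_indices(
  const std::optional<std::vector<int64_t>>& new_indices, int64_t n_rows, int64_t index_size)
{
  if (new_indices.has_value()) {
    // Explicit ids: exactly one per new vector.
    assert(static_cast<int64_t>(new_indices->size()) == n_rows);
    return *new_indices;
  }
  // Omitted ids stand for [0, n_rows), which is only meaningful for an empty index.
  assert(index_size == 0);
  std::vector<int64_t> ids(static_cast<std::size_t>(n_rows));
  std::iota(ids.begin(), ids.end(), int64_t{0});
  return ids;
}
```

When the index already contains data, explicit indices should be supplied so that the new records do not reuse ids that are already present.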
-
void extend(raft::resources const &handle, raft::device_matrix_view<const float, int64_t, raft::row_major> new_vectors, std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices, cuvs::neighbors::ivf_pq::index<int64_t> *idx)#
Extend the index with the new data.
Usage example:
    using namespace cuvs::neighbors;
    ivf_pq::index_params index_params;
    index_params.add_data_on_build = false;       // don't populate index on build
    index_params.kmeans_trainset_fraction = 1.0;  // use whole dataset for kmeans training
    // train the index from a [N, D] dataset
    auto index_empty = ivf_pq::build(handle, index_params, dataset);
    // fill the index with the data
    std::optional<raft::device_vector_view<const int64_t, int64_t>> no_op = std::nullopt;
    ivf_pq::extend(handle, new_vectors, no_op, &index_empty);
- Parameters:
handle – [in]
new_vectors – [in] a device matrix view to a row-major matrix [n_rows, idx.dim()]
new_indices – [in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).
idx – [inout]
-
auto extend(raft::resources const &handle, raft::device_matrix_view<const int8_t, int64_t, raft::row_major> new_vectors, std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices, const cuvs::neighbors::ivf_pq::index<int64_t> &idx) -> cuvs::neighbors::ivf_pq::index<int64_t>#
Extend the index with the new data.
Usage example:
    using namespace cuvs::neighbors;
    ivf_pq::index_params index_params;
    index_params.add_data_on_build = false;       // don't populate index on build
    index_params.kmeans_trainset_fraction = 1.0;  // use whole dataset for kmeans training
    // train the index from a [N, D] dataset
    auto index_empty = ivf_pq::build(handle, index_params, dataset);
    // fill the index with the data
    std::optional<raft::device_vector_view<const int64_t, int64_t>> no_op = std::nullopt;
    auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);
- Parameters:
handle – [in]
new_vectors – [in] a device matrix view to a row-major matrix [n_rows, idx.dim()]
new_indices – [in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).
idx – [inout]
-
void extend(raft::resources const &handle, raft::device_matrix_view<const int8_t, int64_t, raft::row_major> new_vectors, std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices, cuvs::neighbors::ivf_pq::index<int64_t> *idx)#
Extend the index with the new data.
Usage example:
    using namespace cuvs::neighbors;
    ivf_pq::index_params index_params;
    index_params.add_data_on_build = false;       // don't populate index on build
    index_params.kmeans_trainset_fraction = 1.0;  // use whole dataset for kmeans training
    // train the index from a [N, D] dataset
    auto index_empty = ivf_pq::build(handle, index_params, dataset);
    // fill the index with the data
    std::optional<raft::device_vector_view<const int64_t, int64_t>> no_op = std::nullopt;
    ivf_pq::extend(handle, new_vectors, no_op, &index_empty);
- Parameters:
handle – [in]
new_vectors – [in] a device matrix view to a row-major matrix [n_rows, idx.dim()]
new_indices – [in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).
idx – [inout]
-
auto extend(raft::resources const &handle, raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> new_vectors, std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices, const cuvs::neighbors::ivf_pq::index<int64_t> &idx) -> cuvs::neighbors::ivf_pq::index<int64_t>#
Extend the index with the new data.
Usage example:
    using namespace cuvs::neighbors;
    ivf_pq::index_params index_params;
    index_params.add_data_on_build = false;       // don't populate index on build
    index_params.kmeans_trainset_fraction = 1.0;  // use whole dataset for kmeans training
    // train the index from a [N, D] dataset
    auto index_empty = ivf_pq::build(handle, index_params, dataset);
    // fill the index with the data
    std::optional<raft::device_vector_view<const int64_t, int64_t>> no_op = std::nullopt;
    auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);
- Parameters:
handle – [in]
new_vectors – [in] a device matrix view to a row-major matrix [n_rows, idx.dim()]
new_indices – [in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).
idx – [inout]
-
void extend(raft::resources const &handle, raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> new_vectors, std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices, cuvs::neighbors::ivf_pq::index<int64_t> *idx)#
Extend the index with the new data.
Usage example:
    using namespace cuvs::neighbors;
    ivf_pq::index_params index_params;
    index_params.add_data_on_build = false;       // don't populate index on build
    index_params.kmeans_trainset_fraction = 1.0;  // use whole dataset for kmeans training
    // train the index from a [N, D] dataset
    auto index_empty = ivf_pq::build(handle, index_params, dataset);
    // fill the index with the data
    std::optional<raft::device_vector_view<const int64_t, int64_t>> no_op = std::nullopt;
    ivf_pq::extend(handle, new_vectors, no_op, &index_empty);
- Parameters:
handle – [in]
new_vectors – [in] a device matrix view to a row-major matrix [n_rows, idx.dim()]
new_indices – [in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).
idx – [inout]
Index search#
-
void search(raft::resources const &handle, const cuvs::neighbors::ivf_pq::search_params &search_params, cuvs::neighbors::ivf_pq::index<int64_t> &index, raft::device_matrix_view<const float, int64_t, raft::row_major> queries, raft::device_matrix_view<int64_t, int64_t, raft::row_major> neighbors, raft::device_matrix_view<float, int64_t, raft::row_major> distances)#
Search ANN using the constructed index.
See the ivf_pq::build documentation for a usage example.
Note, this function requires a temporary buffer to store intermediate results between CUDA kernel calls, which may lead to undesirable allocations and slowdown. To alleviate the problem, you can pass a pool memory resource or a large enough pre-allocated memory resource to reduce or entirely eliminate allocations happening within search. The exact size of the temporary buffer depends on multiple factors and is an implementation detail. However, you can safely specify a small initial size for the memory pool, so that only a few allocations happen to grow it during the first invocations of search.
    ...
    // use default search parameters
    ivf_pq::search_params search_params;
    // Use the same allocator across multiple searches to reduce the number of
    // cuda memory allocations
    ivf_pq::search(handle, search_params, index, queries1, out_inds1, out_dists1);
    ivf_pq::search(handle, search_params, index, queries2, out_inds2, out_dists2);
    ivf_pq::search(handle, search_params, index, queries3, out_inds3, out_dists3);
    ...
- Parameters:
handle – [in]
search_params – configure the search
index – [in] ivf-pq constructed index
queries – [in] a device matrix view to a row-major matrix [n_queries, index.dim()]
neighbors – [out] a device matrix view to the indices of the neighbors in the source dataset [n_queries, k]
distances – [out] a device matrix view to the distances to the selected neighbors [n_queries, k]
-
void search(raft::resources const &handle, const cuvs::neighbors::ivf_pq::search_params &search_params, cuvs::neighbors::ivf_pq::index<int64_t> &index, raft::device_matrix_view<const int8_t, int64_t, raft::row_major> queries, raft::device_matrix_view<int64_t, int64_t, raft::row_major> neighbors, raft::device_matrix_view<float, int64_t, raft::row_major> distances)#
Search ANN using the constructed index.
See the ivf_pq::build documentation for a usage example.
Note, this function requires a temporary buffer to store intermediate results between CUDA kernel calls, which may lead to undesirable allocations and slowdown. To alleviate the problem, you can pass a pool memory resource or a large enough pre-allocated memory resource to reduce or entirely eliminate allocations happening within search. The exact size of the temporary buffer depends on multiple factors and is an implementation detail. However, you can safely specify a small initial size for the memory pool, so that only a few allocations happen to grow it during the first invocations of search.
    ...
    // use default search parameters
    ivf_pq::search_params search_params;
    // Use the same allocator across multiple searches to reduce the number of
    // cuda memory allocations
    ivf_pq::search(handle, search_params, index, queries1, out_inds1, out_dists1);
    ivf_pq::search(handle, search_params, index, queries2, out_inds2, out_dists2);
    ivf_pq::search(handle, search_params, index, queries3, out_inds3, out_dists3);
    ...
- Parameters:
handle – [in]
search_params – configure the search
index – [in] ivf-pq constructed index
queries – [in] a device matrix view to a row-major matrix [n_queries, index.dim()]
neighbors – [out] a device matrix view to the indices of the neighbors in the source dataset [n_queries, k]
distances – [out] a device matrix view to the distances to the selected neighbors [n_queries, k]
-
void search(raft::resources const &handle, const cuvs::neighbors::ivf_pq::search_params &search_params, cuvs::neighbors::ivf_pq::index<int64_t> &index, raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> queries, raft::device_matrix_view<int64_t, int64_t, raft::row_major> neighbors, raft::device_matrix_view<float, int64_t, raft::row_major> distances)#
Search ANN using the constructed index.
See the ivf_pq::build documentation for a usage example.
Note, this function requires a temporary buffer to store intermediate results between CUDA kernel calls, which may lead to undesirable allocations and slowdown. To alleviate the problem, you can pass a pool memory resource or a large enough pre-allocated memory resource to reduce or entirely eliminate allocations happening within search. The exact size of the temporary buffer depends on multiple factors and is an implementation detail. However, you can safely specify a small initial size for the memory pool, so that only a few allocations happen to grow it during the first invocations of search.
    ...
    // use default search parameters
    ivf_pq::search_params search_params;
    // Use the same allocator across multiple searches to reduce the number of
    // cuda memory allocations
    ivf_pq::search(handle, search_params, index, queries1, out_inds1, out_dists1);
    ivf_pq::search(handle, search_params, index, queries2, out_inds2, out_dists2);
    ivf_pq::search(handle, search_params, index, queries3, out_inds3, out_dists3);
    ...
- Parameters:
handle – [in]
search_params – configure the search
index – [in] ivf-pq constructed index
queries – [in] a device matrix view to a row-major matrix [n_queries, index.dim()]
neighbors – [out] a device matrix view to the indices of the neighbors in the source dataset [n_queries, k]
distances – [out] a device matrix view to the distances to the selected neighbors [n_queries, k]
Index serialize#
-
void serialize(raft::resources const &handle, std::string &filename, const cuvs::neighbors::ivf_pq::index<int64_t> &index)#
Save the index to file.
Experimental, both the API and the serialization format are subject to change.
    #include <raft/core/resources.hpp>

    raft::resources handle;

    // create a string with a filepath
    std::string filename("/path/to/index");
    // create an index with `auto index = ivf_pq::build(...);`
    cuvs::neighbors::ivf_pq::serialize(handle, filename, index);
- Parameters:
handle – [in] the raft handle
filename – [in] the file name for saving the index
index – [in] IVF-PQ index
-
void deserialize(raft::resources const &handle, const std::string &filename, cuvs::neighbors::ivf_pq::index<int64_t> *index)#
Load index from file.
Experimental, both the API and the serialization format are subject to change.
    #include <raft/core/resources.hpp>

    raft::resources handle;

    // create a string with a filepath
    std::string filename("/path/to/index");
    using IdxT = int64_t;  // type of the index
    // create an empty index with `ivf_pq::index<IdxT> index(handle, index_params, dim);`
    cuvs::neighbors::ivf_pq::deserialize(handle, filename, &index);
- Parameters:
handle – [in] the raft handle
filename – [in] the name of the file that stores the index
index – [out] IVF-PQ index