libcudf  23.12.00

Files

file  sorting.hpp
 Column APIs for sort and rank.
 

Enumerations

enum class  cudf::rank_method : int32_t {
  cudf::FIRST, cudf::AVERAGE, cudf::MIN, cudf::MAX,
  cudf::DENSE
}
 Tie-breaker method to use for ranking the column.
 

Functions

std::unique_ptr< column > cudf::sorted_order (table_view const &input, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Computes the row indices that would produce input in a lexicographical sorted order.
 
std::unique_ptr< column > cudf::stable_sorted_order (table_view const &input, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Computes the row indices that would produce input in a stable lexicographical sorted order.
 
bool cudf::is_sorted (cudf::table_view const &table, std::vector< order > const &column_order, std::vector< null_order > const &null_precedence, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Checks whether the rows of a table are sorted in a lexicographical order.
 
std::unique_ptr< table > cudf::sort (table_view const &input, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Performs a lexicographic sort of the rows of a table.
 
std::unique_ptr< table > cudf::sort_by_key (table_view const &values, table_view const &keys, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Performs a key-value sort.
 
std::unique_ptr< table > cudf::stable_sort_by_key (table_view const &values, table_view const &keys, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Performs a key-value stable sort.
 
std::unique_ptr< column > cudf::rank (column_view const &input, rank_method method, order column_order, null_policy null_handling, null_order null_precedence, bool percentage, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Computes the ranks of the input column in sorted order.
 
std::unique_ptr< column > cudf::segmented_sorted_order (table_view const &keys, column_view const &segment_offsets, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns sorted order after sorting each segment in the table.
 
std::unique_ptr< column > cudf::stable_segmented_sorted_order (table_view const &keys, column_view const &segment_offsets, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns sorted order after stably sorting each segment in the table.
 
std::unique_ptr< table > cudf::segmented_sort_by_key (table_view const &values, table_view const &keys, column_view const &segment_offsets, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Performs a lexicographic segmented sort of a table.
 
std::unique_ptr< table > cudf::stable_segmented_sort_by_key (table_view const &values, table_view const &keys, column_view const &segment_offsets, std::vector< order > const &column_order={}, std::vector< null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Performs a stable lexicographic segmented sort of a table.
 

Detailed Description

Enumeration Type Documentation

◆ rank_method

enum class cudf::rank_method : int32_t

Tie-breaker method to use for ranking the column.

See also
cudf::make_rank_aggregation for more details.
Enumerator
FIRST 

stable sort order ranking (no ties)

AVERAGE 

mean of the FIRST ranks within each group of tied values

MIN 

minimum of the FIRST ranks within each group of tied values

MAX 

maximum of the FIRST ranks within each group of tied values

DENSE 

rank always increases by 1 between groups

Definition at line 53 of file aggregation.hpp.

Function Documentation

◆ is_sorted()

bool cudf::is_sorted ( cudf::table_view const &  table,
std::vector< order > const &  column_order,
std::vector< null_order > const &  null_precedence,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Checks whether the rows of a table are sorted in a lexicographical order.

Parameters
table  Table whose rows need to be compared for ordering
column_order  The expected sort order for each column. Size must be equal to table.num_columns() or empty. If empty, all columns are expected to be in ascending order.
null_precedence  The desired order of nulls compared to other elements for each column. Size must be equal to table.num_columns() or empty. If empty, null_order::BEFORE is assumed for all columns.
stream  CUDA stream used for device memory operations and kernel launches
Returns
true if sorted as expected, false if not
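
The lexicographic check described above can be illustrated with a small host-side reference. This is only a sketch under stated assumptions, not libcudf's GPU implementation: `is_sorted_host`, `host_table` and `sort_order` are hypothetical names, columns are plain `std::vector<int>` of equal length, and nulls are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for cudf::order and a table of integer columns.
enum class sort_order { ASCENDING, DESCENDING };
using host_table = std::vector<std::vector<int>>;  // host_table[column][row]

// Returns true when every adjacent pair of rows is in lexicographic order
// under the per-column directions. An empty column_order means all ascending.
bool is_sorted_host(host_table const& table, std::vector<sort_order> const& column_order)
{
  assert(column_order.empty() || column_order.size() == table.size());
  if (table.empty()) { return true; }
  std::size_t const num_rows = table.front().size();
  for (std::size_t r = 1; r < num_rows; ++r) {
    for (std::size_t c = 0; c < table.size(); ++c) {
      int const prev = table[c][r - 1];
      int const curr = table[c][r];
      if (prev == curr) { continue; }  // tie: defer to the next column
      bool const ascending = column_order.empty() || column_order[c] == sort_order::ASCENDING;
      bool const in_order  = ascending ? (prev < curr) : (prev > curr);
      if (!in_order) { return false; }
      break;  // this column decides the pair; later columns are irrelevant
    }
  }
  return true;
}
```

For the rows (1,5), (1,3), (2,9), the sketch reports sorted for {ASCENDING, DESCENDING} and not sorted when column_order is empty, because the second column breaks the tie in the first.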

◆ rank()

std::unique_ptr<column> cudf::rank ( column_view const &  input,
rank_method  method,
order  column_order,
null_policy  null_handling,
null_order  null_precedence,
bool  percentage,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Computes the ranks of input column in sorted order.

A rank indicates the position of each element in the sorted column; rank values start from 1.

input = { 3, 4, 5, 4, 1, 2}
Results for each rank_method:
FIRST = {3, 4, 6, 5, 1, 2}
AVERAGE = {3, 4.5, 6, 4.5, 1, 2}
MIN = {3, 4, 6, 4, 1, 2}
MAX = {3, 5, 6, 5, 1, 2}
DENSE = {3, 4, 5, 4, 1, 2}
Parameters
input  The column to rank
method  The ranking method used for tie breaking (same values)
column_order  The desired sort order for ranking
null_handling  Flag to include nulls during ranking. If nulls are not included, the corresponding rank will be null.
null_precedence  The desired order of nulls compared to other elements in the column
percentage  Flag to convert ranks to percentages in the range (0,1]
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned column's device memory
Returns
A column containing the rank of each element of input. The output column type is size_type by default, or double when method=rank_method::AVERAGE or percentage=true.
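
The tie-breaking rules can be reproduced on the host with a stable argsort followed by a pass over groups of equal values. The sketch below is a CPU reference under stated assumptions, not libcudf's implementation: `rank_host` and `host_rank_method` are hypothetical names, the input is a `std::vector<int>` without nulls, the order is ascending, percentage conversion is omitted, and every result is returned as `double` for simplicity.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical host-side mirror of the rank_method enumerators.
enum class host_rank_method { FIRST, AVERAGE, MIN, MAX, DENSE };

std::vector<double> rank_host(std::vector<int> const& input, host_rank_method method)
{
  std::size_t const n = input.size();
  // Stable argsort: order[k] is the input index of the k-th smallest element.
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) { return input[a] < input[b]; });

  std::vector<double> ranks(n);
  std::size_t dense = 0;  // number of distinct value groups seen so far
  for (std::size_t begin = 0; begin < n;) {
    // [begin, end) is one group of tied values in sorted order.
    std::size_t end = begin + 1;
    while (end < n && input[order[end]] == input[order[begin]]) { ++end; }
    ++dense;
    double const first_rank = static_cast<double>(begin + 1);  // ranks are 1-based
    double const last_rank  = static_cast<double>(end);
    for (std::size_t k = begin; k < end; ++k) {
      double r = 0.0;
      switch (method) {
        case host_rank_method::FIRST: r = static_cast<double>(k + 1); break;
        case host_rank_method::AVERAGE: r = (first_rank + last_rank) / 2.0; break;
        case host_rank_method::MIN: r = first_rank; break;
        case host_rank_method::MAX: r = last_rank; break;
        case host_rank_method::DENSE: r = static_cast<double>(dense); break;
      }
      ranks[order[k]] = r;  // scatter back to the original position
    }
    begin = end;
  }
  return ranks;
}
```

With input = {3, 4, 5, 4, 1, 2} the sketch produces the same five result rows as the example listed for rank().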

◆ segmented_sort_by_key()

std::unique_ptr<table> cudf::segmented_sort_by_key ( table_view const &  values,
table_view const &  keys,
column_view const &  segment_offsets,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Performs a lexicographic segmented sort of a table.

If segment_offsets contains values larger than the number of rows, the behavior is undefined.

Exceptions
cudf::logic_error  if values.num_rows() != keys.num_rows().
cudf::logic_error  if segment_offsets is not a size_type column.
Example:
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
values = { {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'} }
offsets = {0, 3, 7, 10}
result = cudf::segmented_sort_by_key(values, keys, offsets);
result is { 'c','b','a', 'g','f','e','d', 'j','i','h' }

If segment_offsets is empty or contains a single index, no values are sorted and the result is a copy of the values.

The segment_offsets are not required to include all indices. Any indices outside the specified segments will not be sorted.

Example: (offsets do not cover all indices)
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
values = { {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'} }
offsets = {3, 7}
result = cudf::segmented_sort_by_key(values, keys, offsets);
result is { 'a','b','c', 'g','f','e','d', 'h','i','j' }
Parameters
values  The table to reorder
keys  The table that determines the ordering of elements in each segment
segment_offsets  A size_type column containing the start offset index of each contiguous segment.
column_order  The desired order for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns are sorted in ascending order.
null_precedence  The desired order of a null element compared to other elements for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource to allocate any returned objects
Returns
table with elements in each segment sorted
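
The segment semantics, including offsets that do not cover every row, can be sketched on the host as follows. This is an illustrative CPU reference, not libcudf's implementation: `segmented_sorted_order_host` and `segmented_sort_by_key_host` are hypothetical names, keys are a single `std::vector<int>` column sorted ascending, offsets are assumed to be non-decreasing, and `std::stable_sort` is used purely for determinism.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sorts row indices by key inside each [offsets[i], offsets[i + 1]) range.
// Rows outside every range keep their original position.
std::vector<std::size_t> segmented_sorted_order_host(std::vector<int> const& keys,
                                                     std::vector<std::size_t> const& offsets)
{
  std::vector<std::size_t> order(keys.size());
  std::iota(order.begin(), order.end(), std::size_t{0});  // identity by default
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    // Assumption of this sketch: offsets are non-decreasing and in bounds.
    assert(offsets[i] <= offsets[i + 1] && offsets[i + 1] <= keys.size());
    auto const first = order.begin() + static_cast<std::ptrdiff_t>(offsets[i]);
    auto const last  = order.begin() + static_cast<std::ptrdiff_t>(offsets[i + 1]);
    std::stable_sort(first, last,
                     [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
  }
  return order;
}

// Gathers values through the segmented order computed from keys.
std::vector<char> segmented_sort_by_key_host(std::vector<char> const& values,
                                             std::vector<int> const& keys,
                                             std::vector<std::size_t> const& offsets)
{
  assert(values.size() == keys.size());
  auto const order = segmented_sorted_order_host(keys, offsets);
  std::vector<char> result(values.size());
  for (std::size_t i = 0; i < order.size(); ++i) { result[i] = values[order[i]]; }
  return result;
}
```

An empty or single-element offsets vector yields no ranges, so the loop body never runs and the values come back unchanged, matching the behavior described above.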

◆ segmented_sorted_order()

std::unique_ptr<column> cudf::segmented_sorted_order ( table_view const &  keys,
column_view const &  segment_offsets,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns sorted order after sorting each segment in the table.

If segment_offsets contains values larger than the number of rows, the behavior is undefined.

Exceptions
cudf::logic_error  if segment_offsets is not a size_type column.
Example:
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
offsets = {0, 3, 7, 10}
result = cudf::segmented_sorted_order(keys, offsets);
result is { 2,1,0, 6,5,4,3, 9,8,7 }

If segment_offsets is empty or contains a single index, no values are sorted and the result is a sequence of integers from 0 to keys.num_rows()-1.

The segment_offsets are not required to include all indices. Any indices outside the specified segments will not be sorted.

Example: (offsets do not cover all indices)
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
offsets = {3, 7}
result = cudf::segmented_sorted_order(keys, offsets);
result is { 0,1,2, 6,5,4,3, 7,8,9 }
Parameters
keys  The table that determines the ordering of elements in each segment
segment_offsets  A size_type column containing the start offset index of each contiguous segment.
column_order  The desired order for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns are sorted in ascending order.
null_precedence  The desired order of a null element compared to other elements for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource to allocate any returned objects
Returns
sorted order of the segment sorted table

◆ sort()

std::unique_ptr<table> cudf::sort ( table_view const &  input,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Performs a lexicographic sort of the rows of a table.

Parameters
input  The table to sort
column_order  The desired order for each column. Size must be equal to input.num_columns() or empty. If empty, all columns are sorted in ascending order.
null_precedence  The desired order of a null element compared to other elements for each column in input. Size must be equal to input.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned table's device memory
Returns
New table containing the desired sorted order of input

◆ sort_by_key()

std::unique_ptr<table> cudf::sort_by_key ( table_view const &  values,
table_view const &  keys,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Performs a key-value sort.

Creates a new table that reorders the rows of values according to the lexicographic ordering of the rows of keys.

Exceptions
cudf::logic_error  if values.num_rows() != keys.num_rows().
Parameters
values  The table to reorder
keys  The table that determines the ordering
column_order  The desired order for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns are sorted in ascending order.
null_precedence  The desired order of a null element compared to other elements for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned table's device memory
Returns
The reordering of values determined by the lexicographic order of the rows of keys.
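
A key-value sort can be viewed as two steps: compute the lexicographic sorted order of the key rows, then gather the values through that order. The host-side sketch below illustrates this view only; `sorted_order_host`, `sort_by_key_host` and `key_table` are hypothetical names, all key columns are ascending `std::vector<int>` without nulls, and a size mismatch trips an `assert` rather than throwing cudf::logic_error.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

using key_table = std::vector<std::vector<int>>;  // key_table[column][row]

// Row indices that would put the key rows in ascending lexicographic order.
std::vector<std::size_t> sorted_order_host(key_table const& keys)
{
  std::size_t const num_rows = keys.empty() ? 0 : keys.front().size();
  std::vector<std::size_t> order(num_rows);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    for (auto const& column : keys) {
      // The first column where the rows differ decides the comparison.
      if (column[a] != column[b]) { return column[a] < column[b]; }
    }
    return false;  // rows are equivalent
  });
  return order;
}

// Reorders values according to the sorted order of keys.
std::vector<std::string> sort_by_key_host(std::vector<std::string> const& values,
                                          key_table const& keys)
{
  auto const order = sorted_order_host(keys);
  assert(values.size() == order.size());  // mirrors the num_rows() precondition
  std::vector<std::string> result;
  result.reserve(values.size());
  for (std::size_t row : order) { result.push_back(values[row]); }
  return result;
}
```

Because `std::sort` is not stable, rows with fully equal keys may come out in any relative order in this sketch; swapping in `std::stable_sort` would model the stable variant.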

◆ sorted_order()

std::unique_ptr<column> cudf::sorted_order ( table_view const &  input,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Computes the row indices that would produce input in a lexicographical sorted order.

Parameters
input  The table to sort
column_order  The desired sort order for each column. Size must be equal to input.num_columns() or empty. If empty, all columns will be sorted in ascending order.
null_precedence  The desired order of nulls compared to other elements for each column. Size must be equal to input.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned column's device memory
Returns
A non-nullable column of elements containing the permuted row indices of input if it were sorted

◆ stable_segmented_sort_by_key()

std::unique_ptr<table> cudf::stable_segmented_sort_by_key ( table_view const &  values,
table_view const &  keys,
column_view const &  segment_offsets,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Performs a stable lexicographic segmented sort of a table.

The order of equivalent elements within each segment is preserved. If segment_offsets contains values larger than the number of rows, the behavior is undefined.

Exceptions
cudf::logic_error  if values.num_rows() != keys.num_rows().
cudf::logic_error  if segment_offsets is not a size_type column.
Example:
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
values = { {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'} }
offsets = {0, 3, 7, 10}
result = cudf::stable_segmented_sort_by_key(values, keys, offsets);
result is { 'c','b','a', 'g','f','e','d', 'j','i','h' }

If segment_offsets is empty or contains a single index, no values are sorted and the result is a copy of the values.

The segment_offsets are not required to include all indices. Any indices outside the specified segments will not be sorted.

Example: (offsets do not cover all indices)
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
values = { {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'} }
offsets = {3, 7}
result = cudf::stable_segmented_sort_by_key(values, keys, offsets);
result is { 'a','b','c', 'g','f','e','d', 'h','i','j' }
Parameters
values  The table to reorder
keys  The table that determines the ordering of elements in each segment
segment_offsets  A size_type column containing the start offset index of each contiguous segment.
column_order  The desired order for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns are sorted in ascending order.
null_precedence  The desired order of a null element compared to other elements for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource to allocate any returned objects
Returns
table with elements in each segment sorted

◆ stable_segmented_sorted_order()

std::unique_ptr<column> cudf::stable_segmented_sorted_order ( table_view const &  keys,
column_view const &  segment_offsets,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns sorted order after stably sorting each segment in the table.

The order of equivalent elements within each segment is preserved. If segment_offsets contains values larger than the number of rows, the behavior is undefined.

Exceptions
cudf::logic_error  if segment_offsets is not a size_type column.
Example:
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
offsets = {0, 3, 7, 10}
result = cudf::stable_segmented_sorted_order(keys, offsets);
result is { 2,1,0, 6,5,4,3, 9,8,7 }

If segment_offsets is empty or contains a single index, no values are sorted and the result is a sequence of integers from 0 to keys.num_rows()-1.

The segment_offsets are not required to include all indices. Any indices outside the specified segments will not be sorted.

Example: (offsets do not cover all indices)
keys = { {9, 8, 7, 6, 5, 4, 3, 2, 1, 0} }
offsets = {3, 7}
result = cudf::stable_segmented_sorted_order(keys, offsets);
result is { 0,1,2, 6,5,4,3, 7,8,9 }
Parameters
keys  The table that determines the ordering of elements in each segment
segment_offsets  A size_type column containing the start offset index of each contiguous segment.
column_order  The desired order for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns are sorted in ascending order.
null_precedence  The desired order of a null element compared to other elements for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource to allocate any returned objects
Returns
sorted order of the segment sorted table

◆ stable_sort_by_key()

std::unique_ptr<table> cudf::stable_sort_by_key ( table_view const &  values,
table_view const &  keys,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Performs a key-value stable sort.

Creates a new table that reorders the rows of values according to the lexicographic ordering of the rows of keys.

The order of equivalent elements is guaranteed to be preserved.

Exceptions
cudf::logic_error  if values.num_rows() != keys.num_rows().
Parameters
values  The table to reorder
keys  The table that determines the ordering
column_order  The desired order for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns are sorted in ascending order.
null_precedence  The desired order of a null element compared to other elements for each column in keys. Size must be equal to keys.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned table's device memory
Returns
The reordering of values determined by the lexicographic order of the rows of keys.

◆ stable_sorted_order()

std::unique_ptr<column> cudf::stable_sorted_order ( table_view const &  input,
std::vector< order > const &  column_order = {},
std::vector< null_order > const &  null_precedence = {},
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Computes the row indices that would produce input in a stable lexicographical sorted order.

The order of equivalent elements is guaranteed to be preserved.

Parameters
input  The table to sort
column_order  The desired sort order for each column. Size must be equal to input.num_columns() or empty. If empty, all columns will be sorted in ascending order.
null_precedence  The desired order of nulls compared to other elements for each column. Size must be equal to input.num_columns() or empty. If empty, all columns will be sorted with null_order::BEFORE.
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned column's device memory
Returns
A non-nullable column of elements containing the permuted row indices of input if it were sorted
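
The stability guarantee is easiest to see on keys that contain duplicates. The following host-side sketch is an illustration under stated assumptions rather than libcudf code: `stable_sorted_order_host` is a hypothetical name, the key is a single `std::vector<int>` column without nulls, and the `ascending` flag stands in for a one-element column_order.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Row indices in sorted key order; rows with equal keys keep their input order.
std::vector<std::size_t> stable_sorted_order_host(std::vector<int> const& keys,
                                                  bool ascending = true)
{
  std::vector<std::size_t> order(keys.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    return ascending ? keys[a] < keys[b] : keys[a] > keys[b];
  });
  return order;
}
```

For keys = {2, 1, 2, 1, 2} the ascending order is {1, 3, 0, 2, 4} and the descending order is {0, 2, 4, 1, 3}: in both cases the rows holding equal keys appear in their original relative order.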