Since #34361 and the introduction of special validity checks in #34408, we are aware that parts of the codebase handle nulls incorrectly when the input is a run-end encoded (REE) array (NEW) or a union array.
This issue tracks the compute kernels that need to change how they handle nulls in these array types. They should use the specialized checks to determine whether a value is a logical null instead of relying on validity bitmaps, which these array types do not have:
- `is_null`: [Python] `pa.compute.is_null()` returns incorrect answer for dense union arrays and segfaults for dense union scalars #34315
- `hash_count`: [C++] `hash_count` kernel miscounts when run-end encoded array contains null #35059
- `drop_null`: [C++] drop_null kernel ignores nulls when values is an union or a REE array #43851
- `count`: [C++][Compute] `count` kernel miscounts when run-end encoded array contains null #49888
- `true_unless_null`: [C++][Compute] `true_unless_null` kernel output incorrect when run-end encoded array contains null #49889

Component(s)
C++
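
To make the failure mode concrete, here is a minimal, self-contained sketch of why a kernel that only consults a top-level validity bitmap miscounts nulls in a run-end encoded array. It does not use the Arrow API: `ToyRunEndEncoded`, `NullCountFromTopLevelBitmap`, and `LogicalNullCount` are hypothetical names invented for illustration, and the layout is a simplified assumption (run ends as cumulative exclusive offsets, one optional value per run).

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a run-end encoded array (NOT Arrow's ArrayData).
// The array itself carries no validity bitmap; nulls live in the run values.
struct ToyRunEndEncoded {
  std::vector<int32_t> run_ends;                // cumulative, exclusive end of each run
  std::vector<std::optional<int32_t>> values;   // one entry per run; nullopt means null

  // Logical length is the end of the last run.
  int64_t length() const { return run_ends.empty() ? 0 : run_ends.back(); }
};

// What a kernel effectively computes when it looks only at the top-level
// validity bitmap: the bitmap is absent, so it concludes there are no nulls.
int64_t NullCountFromTopLevelBitmap(const ToyRunEndEncoded& /*array*/) {
  return 0;
}

// Logical null count: every null run contributes its full run length.
int64_t LogicalNullCount(const ToyRunEndEncoded& array) {
  int64_t count = 0;
  int32_t previous_end = 0;
  for (std::size_t i = 0; i < array.run_ends.size(); ++i) {
    if (!array.values[i].has_value()) {
      count += array.run_ends[i] - previous_end;
    }
    previous_end = array.run_ends[i];
  }
  return count;
}
```

For the logical array `[1, 1, null, null, null, 3]`, encoded as run ends `{2, 5, 6}` and values `{1, null, 3}`, the bitmap-only function reports 0 nulls while the logical count is 3. Union arrays have the same shape of problem, except that the null status of each slot has to be looked up in the child array selected by the type code.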