perf: Improve benchmarks for native row-to-columnar used by JVM shuffle #3290
andygrove wants to merge 6 commits into apache:main
Conversation
Add jvm_shuffle.rs benchmark that covers the full range of data types processed by `process_sorted_row_partition()` in JVM shuffle:

- Primitive columns (100 Int64 columns)
- Struct (flat with 5/10/20 fields)
- Nested struct (2 levels deep)
- Deeply nested struct (3 levels deep)
- List<Int64>
- Map<Int64, Int64>

This replaces the old row_columnar.rs, which only tested primitive columns. These benchmarks help measure the performance of the row-to-columnar conversion used by CometColumnarShuffle when writing shuffle data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## main #3290 +/- ##
============================================
+ Coverage 56.12% 60.13% +4.00%
- Complexity 976 1468 +492
============================================
Files 119 175 +56
Lines 11743 16085 +4342
Branches 2251 2665 +414
============================================
+ Hits 6591 9672 +3081
- Misses 4012 5066 +1054
- Partials 1140 1347 +207
Use div_ceil() instead of manual ceiling division and replace needless range loop with iterator pattern. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
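A minimal illustration of the two cleanups this commit describes (the examples below are mine, not the PR's exact diff): `div_ceil()` replacing manual ceiling division, and an iterator chain replacing a needless range loop.

```rust
fn main() {
    let num_fields: usize = 70;

    // Manual ceiling division ...
    let words_manual = (num_fields + 63) / 64;
    // ... replaced by the clearer stdlib equivalent (stable since Rust 1.73):
    let words = num_fields.div_ceil(64);
    assert_eq!(words, words_manual);

    let values = [1i64, 2, 3];
    // Needless range loop (clippy::needless_range_loop) ...
    let mut doubled = Vec::new();
    for i in 0..values.len() {
        doubled.push(values[i] * 2);
    }
    // ... replaced by the iterator pattern:
    let doubled_iter: Vec<i64> = values.iter().map(|v| v * 2).collect();
    assert_eq!(doubled, doubled_iter);

    println!("{} bitset words, doubled = {:?}", words, doubled_iter);
}
```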
| fn get_row_size(num_struct_fields: usize) -> usize { | ||
| // Top-level row has 1 column (the struct) | ||
| let top_level_bitset_width = SparkUnsafeRow::get_row_bitset_width(1); | ||
| // Struct pointer (offset + size) is 8 bytes |
| // Struct pointer (offset + size) is 8 bytes | |
| // Struct pointer (offset + size) is 8 bytes on 64bit architectures |
| let top_level_bitset_width = SparkUnsafeRow::get_row_bitset_width(1); | ||
| // Nested struct starts after top-level row header + pointer | ||
| let nested_offset = top_level_bitset_width + 8; |
Just a thought: there are a lot of magic eights here; would it be easy to name them? e.g. which one is the pointer size, which is the int64 size, etc.?
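A sketch of what the reviewer suggests: naming the magic eights. The constant names are illustrative, and `bitset_width` is a self-contained stand-in for `SparkUnsafeRow::get_row_bitset_width` (one 8-byte word of null bits per 64 fields, matching Spark's UnsafeRow layout).

```rust
const WORD_SIZE: usize = 8; // one UnsafeRow field slot on 64-bit
const STRUCT_POINTER_SIZE: usize = WORD_SIZE; // packed (offset, size) pair
const INT64_SIZE: usize = WORD_SIZE;

/// Stand-in for SparkUnsafeRow::get_row_bitset_width: one 8-byte
/// word of null-tracking bits per 64 fields.
fn bitset_width(num_fields: usize) -> usize {
    num_fields.div_ceil(64) * WORD_SIZE
}

fn get_row_size(num_struct_fields: usize) -> usize {
    // Top-level row has 1 column (the struct): null bitset + struct pointer
    let header = bitset_width(1) + STRUCT_POINTER_SIZE;
    // Nested struct: its own null bitset plus one slot per int64 field
    header + bitset_width(num_struct_fields) + num_struct_fields * INT64_SIZE
}

fn main() {
    // 8 (bitset) + 8 (pointer) + 8 (nested bitset) + 5 * 8 (values)
    assert_eq!(get_row_size(5), 64);
    println!("row size for 5 int64 fields: {} bytes", get_row_size(5));
}
```

With named constants, each `8` states which layout rule it encodes rather than relying on the comments alone.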
| // Fill nested struct with some data | ||
| for i in 0..num_struct_fields { | ||
| let value_offset = nested_offset + nested_bitset_width + i * 8; | ||
| let value = (i as i64) * 100; |
What is 100 here? Is it for alignment?
| false, | ||
| 0, | ||
| None, | ||
| &CompressionCodec::Zstd(1), |
Should we also check other codecs? 🤔
I might be wrong, but Spark uses LZ4 (via IO_COMPRESSION_CODEC) for shuffle:
private[spark] val IO_COMPRESSION_CODEC =
ConfigBuilder("spark.io.compression.codec")
.doc("The codec used to compress internal data such as RDD partitions, event log, " +
"broadcast variables and shuffle outputs. By default, Spark provides four codecs: " +
"lz4, lzf, snappy, and zstd. You can also use fully qualified class names to specify " +
"the codec")
.version("0.8.0")
.stringConf
.createWithDefaultString("lz4")
| } | ||
| /// Create a schema with nested structs: Struct<Struct<int64 fields>> | ||
| fn make_nested_struct_schema(num_fields: usize) -> DataType { |
I have a feeling make_nested_struct_schema and make_deeply_nested_struct_schema could be generalized into one helper?
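A possible shape for that generalization: one recursive builder taking the nesting depth as a parameter. A stand-in `DataType` enum keeps the example self-contained; the real helper would build arrow's `DataType::Struct` with named `Field`s.

```rust
// Stand-in for arrow's DataType, just enough to show the recursion.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Struct(Vec<DataType>),
}

/// Build a struct whose innermost level has `num_fields` Int64 columns,
/// wrapped in `depth` levels of structs: depth == 1 is the flat case,
/// depth == 2 the nested case, depth == 3 the deeply nested case.
fn make_struct_schema(depth: usize, num_fields: usize) -> DataType {
    if depth <= 1 {
        DataType::Struct(vec![DataType::Int64; num_fields])
    } else {
        DataType::Struct(vec![make_struct_schema(depth - 1, num_fields)])
    }
}

fn main() {
    assert_eq!(
        make_struct_schema(1, 2),
        DataType::Struct(vec![DataType::Int64, DataType::Int64])
    );
    assert_eq!(
        make_struct_schema(2, 1),
        DataType::Struct(vec![DataType::Struct(vec![DataType::Int64])])
    );
    println!("depth 3: {:?}", make_struct_schema(3, 1));
}
```

The two existing helpers would then become thin wrappers (or direct call sites) of `make_struct_schema(2, n)` and `make_struct_schema(3, n)`.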
| fn get_nested_row_size(num_inner_fields: usize) -> usize { | ||
| // Top-level row has 1 column (the outer struct) | ||
| let top_level_bitset_width = SparkUnsafeRow::get_row_bitset_width(1); | ||
| let struct_pointer_size = 8; |
This value should probably be a crate-level const?
comphead left a comment
Thanks @andygrove. WDYT, would it be nice to have a follow-up PR covering nested combinations of list/map/struct types?
Improve the row_columnar.rs benchmark to cover the full range of data types processed by process_sorted_row_partition() in JVM shuffle. These benchmarks help measure the performance of the row-to-columnar conversion used by CometColumnarShuffle when writing shuffle data.