perf: Improve benchmarks for native row-to-columnar used by JVM shuffle #3290
andygrove wants to merge 6 commits into apache:main
Conversation
Add jvm_shuffle.rs benchmark that covers the full range of data types processed by `process_sorted_row_partition()` in JVM shuffle:

- Primitive columns (100 Int64 columns)
- Struct (flat with 5/10/20 fields)
- Nested struct (2 levels deep)
- Deeply nested struct (3 levels deep)
- List<Int64>
- Map<Int64, Int64>

This replaces the old row_columnar.rs, which only tested primitive columns. These benchmarks help measure the performance of the row-to-columnar conversion used by CometColumnarShuffle when writing shuffle data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## main #3290 +/- ##
============================================
+ Coverage 56.12% 60.13% +4.00%
- Complexity 976 1468 +492
============================================
Files 119 175 +56
Lines 11743 16085 +4342
Branches 2251 2665 +414
============================================
+ Hits 6591 9672 +3081
- Misses 4012 5066 +1054
- Partials 1140 1347 +207
Use div_ceil() instead of manual ceiling division and replace needless range loop with iterator pattern. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
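A minimal illustration of the two cleanups this commit describes (the examples below are mine, not the PR's exact diff): `div_ceil()` replacing manual ceiling division, and an iterator chain replacing a needless range loop.

```rust
fn main() {
    let num_fields: usize = 70;

    // Manual ceiling division ...
    let words_manual = (num_fields + 63) / 64;
    // ... replaced by the clearer stdlib equivalent (stable since Rust 1.73):
    let words = num_fields.div_ceil(64);
    assert_eq!(words, words_manual);

    let values = [1i64, 2, 3];
    // Needless range loop (clippy::needless_range_loop) ...
    let mut doubled = Vec::new();
    for i in 0..values.len() {
        doubled.push(values[i] * 2);
    }
    // ... replaced by the iterator pattern:
    let doubled_iter: Vec<i64> = values.iter().map(|v| v * 2).collect();
    assert_eq!(doubled, doubled_iter);

    println!("{} bitset words, doubled = {:?}", words, doubled_iter);
}
```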
| fn get_row_size(num_struct_fields: usize) -> usize { | ||
| // Top-level row has 1 column (the struct) | ||
| let top_level_bitset_width = SparkUnsafeRow::get_row_bitset_width(1); | ||
| // Struct pointer (offset + size) is 8 bytes |
| // Struct pointer (offset + size) is 8 bytes | |
| // Struct pointer (offset + size) is 8 bytes on 64bit architectures |
| let top_level_bitset_width = SparkUnsafeRow::get_row_bitset_width(1); | ||
| // Nested struct starts after top-level row header + pointer | ||
| let nested_offset = top_level_bitset_width + 8; |
Just a thought: there are a lot of magic eights here; would it be easy to name them? e.g. which one is the pointer size, which is the int64 size, etc.?
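A sketch of what the reviewer suggests: naming the magic eights. The constant names are illustrative, and `bitset_width` is a self-contained stand-in for `SparkUnsafeRow::get_row_bitset_width` (one 8-byte word of null bits per 64 fields, matching Spark's UnsafeRow layout).

```rust
const WORD_SIZE: usize = 8; // one UnsafeRow field slot on 64-bit
const STRUCT_POINTER_SIZE: usize = WORD_SIZE; // packed (offset, size) pair
const INT64_SIZE: usize = WORD_SIZE;

/// Stand-in for SparkUnsafeRow::get_row_bitset_width: one 8-byte
/// word of null-tracking bits per 64 fields.
fn bitset_width(num_fields: usize) -> usize {
    num_fields.div_ceil(64) * WORD_SIZE
}

fn get_row_size(num_struct_fields: usize) -> usize {
    // Top-level row has 1 column (the struct): null bitset + struct pointer
    let header = bitset_width(1) + STRUCT_POINTER_SIZE;
    // Nested struct: its own null bitset plus one slot per int64 field
    header + bitset_width(num_struct_fields) + num_struct_fields * INT64_SIZE
}

fn main() {
    // 8 (bitset) + 8 (pointer) + 8 (nested bitset) + 5 * 8 (values)
    assert_eq!(get_row_size(5), 64);
    println!("row size for 5 int64 fields: {} bytes", get_row_size(5));
}
```

With named constants, each `8` states which layout rule it encodes rather than relying on the comments alone.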
| // Fill nested struct with some data | ||
| for i in 0..num_struct_fields { | ||
| let value_offset = nested_offset + nested_bitset_width + i * 8; | ||
| let value = (i as i64) * 100; |
What is 100 here? Is it for alignment?
| false, | ||
| 0, | ||
| None, | ||
| &CompressionCodec::Zstd(1), |
Should we also check other codecs? 🤔
I might be wrong, but Spark uses LZ4 (via IO_COMPRESSION_CODEC) for shuffle:
private[spark] val IO_COMPRESSION_CODEC =
ConfigBuilder("spark.io.compression.codec")
.doc("The codec used to compress internal data such as RDD partitions, event log, " +
"broadcast variables and shuffle outputs. By default, Spark provides four codecs: " +
"lz4, lzf, snappy, and zstd. You can also use fully qualified class names to specify " +
"the codec")
.version("0.8.0")
.stringConf
.createWithDefaultString("lz4")
| } | ||
| /// Create a schema with nested structs: Struct<Struct<int64 fields>> | ||
| fn make_nested_struct_schema(num_fields: usize) -> DataType { |
I have a feeling make_nested_struct_schema and make_deeply_nested_struct_schema could be generalized into one helper?
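A possible shape for that generalization: one recursive builder taking the nesting depth as a parameter. A stand-in `DataType` enum keeps the example self-contained; the real helper would build arrow's `DataType::Struct` with named `Field`s.

```rust
// Stand-in for arrow's DataType, just enough to show the recursion.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Struct(Vec<DataType>),
}

/// Build a struct whose innermost level has `num_fields` Int64 columns,
/// wrapped in `depth` levels of structs: depth == 1 is the flat case,
/// depth == 2 the nested case, depth == 3 the deeply nested case.
fn make_struct_schema(depth: usize, num_fields: usize) -> DataType {
    if depth <= 1 {
        DataType::Struct(vec![DataType::Int64; num_fields])
    } else {
        DataType::Struct(vec![make_struct_schema(depth - 1, num_fields)])
    }
}

fn main() {
    assert_eq!(
        make_struct_schema(1, 2),
        DataType::Struct(vec![DataType::Int64, DataType::Int64])
    );
    assert_eq!(
        make_struct_schema(2, 1),
        DataType::Struct(vec![DataType::Struct(vec![DataType::Int64])])
    );
    println!("depth 3: {:?}", make_struct_schema(3, 1));
}
```

The two existing helpers would then become thin wrappers (or direct call sites) of `make_struct_schema(2, n)` and `make_struct_schema(3, n)`.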
| fn get_nested_row_size(num_inner_fields: usize) -> usize { | ||
| // Top-level row has 1 column (the outer struct) | ||
| let top_level_bitset_width = SparkUnsafeRow::get_row_bitset_width(1); | ||
| let struct_pointer_size = 8; |
This value should probably be a crate-level const?
comphead left a comment
Thanks @andygrove. WDYT, would it be nice to have a follow-up PR covering nested combinations of list/map/struct types?
Improve the row_columnar.rs benchmark to cover the full range of data types processed by process_sorted_row_partition() in JVM shuffle. These benchmarks help measure the performance of the row-to-columnar conversion used by CometColumnarShuffle when writing shuffle data.