[SPARK-54796][CORE] Fix NPE caused by race condition between Executor initialization and shuffle migration#54136
[SPARK-54796][CORE] Fix NPE caused by race condition between Executor initialization and shuffle migration#54136ivoson wants to merge 3 commits intoapache:masterfrom
Conversation
JIRA Issue Information=== Bug SPARK-54796 === This comment was automatically generated by GitHub Actions |
Ngone51
left a comment
There was a problem hiding this comment.
Is it possible to switch the initialization order between BlockManager initialization and ShuffleManager initialization?
If impossible, I actually think we could add a new flag in BlockManagerInfo (or a pending list for those inital BlockManagers in terms of better memory utilization) to indicate whether the executor is ready for shuffle migration by sending a RPC signal (similar to LaunchedExecutor) after ShuffleManager initialized.
TBH, the current fix looks a bit complex to me.
Have thought about the other options: For adding a new flag in |
| // Wait for ShuffleManager to be initialized before handling shuffle migration requests. | ||
| // This can happen when an executor receives migration requests before its ShuffleManager | ||
| // is fully initialized. | ||
| waitForShuffleManagerInit(blockId) |
There was a problem hiding this comment.
Shall we maybe waitForShuffleManagerInit before BlockManager register with the driver?
There might be other potential cases or future cases affected by this issue if we fix in current way (for shuffle migration case only). My point is still that we should ensure the BlockManager is really ready before the driver picks it up as an candidate for some stuff.
What changes were proposed in this pull request?
Fixing the race condition between executor initialization and shuffle migration. When starting an
Executor, spark will:Shuffle migration request could be received before shuffle manager is initialized,
putBlockDataAsStreamwill be invoked and shuffleManager will be initialized asnull.Then all the later operations depending on
shuffleManagerwill fail withNPE.To fix the issue, this PR propose to:
putBlockDataAsStreamwhich could be called before the Executor is fully ready to handle migration requests.BlockManagerDecommissionerwill retry if the shuffle migration request hits the issue timeout waiting for shuffleManager to be initialized.Why are the changes needed?
Fixing the race condition leading to the
shuffleManagerin BlockManager to be initialized asnull.Does this PR introduce any user-facing change?
No
How was this patch tested?
UT added.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor 2.4.23