gh-128213: fast path for bytes creation from list and tuple #132590
eendebakpt wants to merge 29 commits into python:main
…sing PyNumber_AsSsize_t; fixed indentation
Misc/NEWS.d/next/Core_and_Builtins/2024-12-24-08-44-49.gh-issue-128213.Y71jDi.rst
Objects/bytesobject.c
            goto error;
        PyObject *const *items = PySequence_Fast_ITEMS(x);
        for (Py_ssize_t i = 0; i < size; i++) {
            if (!PyLong_Check(items[i])) {
If you're interested in speed, PyLong_CheckExact will allow you to use _PyLong_IsNonNegativeCompact and _PyLong_CompactValue to get the C int in a few cycles.
Interesting. For the common/happy path we now call PyNumber_AsSsize_t, which calls PyLong_AsSsize_t. That function does another PyLong_Check and then a call to _PyLong_IsCompact.
So we have some options:
- We could remove the call to PyLong_Check (as it is already covered by PyLong_AsSsize_t); that improves performance a bit for all cases.
- We could use PyLong_CheckExact with _PyLong_IsNonNegativeCompact. Fastest for exact ints, but non-exact ints become slower. I suspect non-exact ints are rare though.
- We add a fast path using PyLong_CheckExact and a fallback path using PyNumber_AsSsize_t for the non-exact ints. Fast for all cases, but takes a bit more code.
@markshannon Any preference? I am happy to work out any of the above.
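The third option above can be modeled at the Python level. This is only a rough sketch of the control flow, not the C code in the PR: `MyInt`, `item_to_byte` and `bytes_from_sequence` are hypothetical names, `type(item) is int` stands in for PyLong_CheckExact, and `operator.index` stands in for the PyNumber_AsSsize_t fallback.

```python
import operator


class MyInt(int):
    """Hypothetical non-exact int (a subclass), used to exercise the fallback."""


def item_to_byte(item):
    # Fast path: exact ints only (models PyLong_CheckExact).
    if type(item) is int:
        value = item
    else:
        # Fallback: anything that supports __index__
        # (roughly what PyNumber_AsSsize_t accepts).
        value = operator.index(item)
    if not 0 <= value <= 255:
        raise ValueError("bytes must be in range(0, 256)")
    return value


def bytes_from_sequence(seq):
    # Build into a mutable buffer, then freeze it as bytes.
    return bytes(bytearray(item_to_byte(x) for x in seq))
```

Both paths have to give the same result as the built-in constructor, so `bytes_from_sequence([1, 2, 255])` and `bytes_from_sequence((MyInt(7), True))` should match `bytes(...)` on the same inputs.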
PyNumber_AsSsize_t performs an incref/decref on the argument, which we can avoid by calling PyLong_AsSsize_t directly. (The incref is not very costly, since in the happy path the argument is in the 0 to 255 range and therefore a Python small int, but still.) I updated the PR with this approach.
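For context on the "small int" remark, a short sketch assuming a CPython interpreter, where small integers are preallocated and shared (an implementation detail, not a language guarantee). `is_shared` is a hypothetical helper name.

```python
def is_shared(n):
    # Parse the same value twice at runtime (avoids compile-time constant
    # sharing) and check whether both results are the very same object.
    return int(str(n)) is int(str(n))


# On CPython, every valid byte value maps to a shared small-int object.
all_byte_values_shared = all(is_shared(n) for n in range(256))
```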
Most important for the PR is making the code thread-safe, so I left the possible optimization with _PyLong_IsNonNegativeCompact out for now.
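As an illustration of why mutation during conversion matters for lists: a list can change while it is being converted, and not only from another thread, because an item's `__index__` can run arbitrary code. The sketch below uses a hypothetical class `Shrinker` (not part of the PR) and converts from a tuple snapshot, which cannot be modified underneath the loop.

```python
class Shrinker:
    """Hypothetical item whose __index__ empties the list that contains it."""

    def __init__(self, target):
        self.target = target

    def __index__(self):
        self.target.clear()  # mutate the source list mid-conversion
        return 1


items = []
items.extend([Shrinker(items), 2, 3])

# A tuple snapshot is immutable, so the conversion sees all three items
# even though the original list is emptied along the way.
snapshot = tuple(items)
result = bytes(snapshot)
```

After this runs, `result` holds three bytes while `items` is empty. Converting `items` directly would expose the loop to the shrinking list, which is the situation a fast path over a list has to guard against.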
Tuples are immutable, so why does creating a bytes object from a tuple require synchronization?
Tuples indeed do not require synchronization. In this PR exact lists and tuples use the path (using synchronization with
Continuation of #128214. This PR