On 8/26/22 12:22, Daniel Micay via devel wrote:
> Also, hardened_malloc doesn't use a thread cache for security reasons.
> It invalidates many of the security properties. If you compare
> to glibc malloc in the light configuration with tcache disabled in glibc
> malloc it will compare well, and hardened_malloc can scale better when
> given enough arenas. If you want to make the substantial security
> sacrifices required for a traditional thread cache, then I don't think
> hardened_malloc makes sense, which is why it doesn't include the option
> to do thread caching even though it'd be easy to implement. It may one
> day include the option to do thread batched allocation, but it isn't
> feasible to do it for deallocation without losing a ton of the strong
> security properties.
I'm an upstream glibc developer, but I've tried to remove my bias here and present
the facts as they are for the existing heap-based allocator that is in use by the
distributions today and why it's hard to change.
(1) Pick your own allocator vs. use the default.
We allow any end user to make those choices by interposing the final allocator with
an allocator of their choice depending on specific workload criteria. This means
that distributions don't have a strong incentive to change system allocators unless
they are making a strategic change in their core values or vision for the distribution
(as GrapheneOS does for security).
At the ELF level we make sure that we can interpose a new allocator, and we work
carefully to ensure that newer features at the compiler level can be supported
incrementally (_FORTIFY_SOURCE=3 and __builtin_dynamic_object_size) by newer
allocators.
In summary: If the "good enough" allocator doesn't meet your requirements, then
you can use one of the alternatives.
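As a rough illustration of what interposition looks like in practice, here is a
minimal sketch of a replacement allocator. It is a toy under stated assumptions,
not a usable allocator: the names arena, bump_alloc and interposed_calls are
hypothetical, it never reuses memory, it is not thread-safe, and it covers only
malloc, free, calloc and realloc. A real replacement should also provide the
aligned allocation entry points (aligned_alloc, posix_memalign, memalign) and
malloc_usable_size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy parameters: a fixed 1 MiB arena, 16-byte alignment. */
#define ARENA_SIZE ((size_t)1 << 20)
#define ALIGNMENT  ((size_t)16)
#define HEADER     ((size_t)16)   /* room to record the requested size */

static_assert(ARENA_SIZE % ALIGNMENT == 0, "arena must be a multiple of the alignment");

static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;          /* bytes handed out so far */
unsigned long interposed_calls;    /* how often our malloc was reached */

/* Carve the next block off the arena; memory is never given back. */
static void *bump_alloc(size_t size)
{
    if (size > ARENA_SIZE) {
        errno = ENOMEM;
        return NULL;
    }
    size_t rounded = (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
    if (HEADER + rounded > ARENA_SIZE - arena_used) {
        errno = ENOMEM;
        return NULL;
    }
    unsigned char *block = arena + arena_used;
    memcpy(block, &size, sizeof size);   /* remember the size for realloc */
    arena_used += HEADER + rounded;
    return block + HEADER;
}

/* These definitions take precedence over the ones in libc. */
void *malloc(size_t size)
{
    interposed_calls++;
    return bump_alloc(size);
}

void free(void *ptr)
{
    (void)ptr;   /* toy allocator: nothing is ever reused */
}

void *calloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size) {
        errno = ENOMEM;
        return NULL;
    }
    void *ptr = bump_alloc(count * size);
    if (ptr != NULL)
        memset(ptr, 0, count * size);
    return ptr;
}

void *realloc(void *ptr, size_t size)
{
    if (ptr == NULL)
        return bump_alloc(size);
    size_t old_size;
    memcpy(&old_size, (unsigned char *)ptr - HEADER, sizeof old_size);
    void *fresh = bump_alloc(size);
    if (fresh != NULL)
        memcpy(fresh, ptr, old_size < size ? old_size : size);
    return fresh;
}
```

Built as a shared object (for example `gcc -shared -fPIC -o libtoy.so toy.c`,
where the file names are placeholders) it can be loaded ahead of libc with
`LD_PRELOAD=./libtoy.so`, or it can simply be linked into the program. Either
way the application's calls resolve to these definitions without rebuilding
glibc.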
(2) Switching the default vs. improving the default.
It is arguably lower TCO for all distributions using glibc to improve glibc's
malloc. Some improvements can't be made, but some buy enough benefit that there
is no strong reason to change allocators.
For example:
- jemalloc/tcmalloc used a fast per-thread cache.
- glibc implemented fast per-thread caching in 2.26 (2017) (DJ Delorie's work)
- Chromium started using safe-linking pointer hardening.
- glibc implemented safe-linking pointer hardening for fastbins and tcache (2020) (Eyal Itkin's work)
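For readers unfamiliar with safe-linking, the idea can be sketched in a few
lines. This is a simplified model of glibc's PROTECT_PTR/REVEAL_PTR macros, not
the glibc source: the function names are hypothetical, addresses are modeled as
uint64_t, and a 4 KiB page shift and 16-byte chunk alignment are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12    /* assumed 4 KiB pages */
#define CHUNK_ALIGN 16u   /* assumed chunk alignment on 64-bit */

static_assert((CHUNK_ALIGN & (CHUNK_ALIGN - 1)) == 0, "alignment must be a power of two");

/* Mask a free-list "next" pointer with bits from the address of the slot
   that stores it, so the stored value depends on ASLR-randomized bits. */
static inline uint64_t protect_ptr(uint64_t slot_addr, uint64_t next)
{
    return (slot_addr >> PAGE_SHIFT) ^ next;
}

/* XOR is its own inverse, so revealing reuses the same operation. */
static inline uint64_t reveal_ptr(uint64_t slot_addr, uint64_t stored)
{
    return protect_ptr(slot_addr, stored);
}

/* A revealed pointer that is not chunk-aligned indicates a corrupted link. */
static inline int looks_aligned(uint64_t revealed)
{
    return (revealed & (CHUNK_ALIGN - 1)) == 0;
}
```

An attacker who overwrites the stored link with a raw address, without knowing
the slot's address bits, will in most cases end up with a revealed pointer that
fails the alignment check, and the allocator can abort instead of following it.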
Next steps for glibc's malloc are probably:
- Improve internal fragmentation [1]
- Round-robin arena assignment with uniform arena assignment as a goal.
- Provide a packed arena for sub-16-byte allocations to improve utilization.
  - We have seen some C++ workloads/frameworks that create trillions of 13-byte objects.
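To put a number on the 13-byte case, the chunk-size rounding can be sketched as
below. It follows the shape of glibc's request2size calculation, and the
constants are assumptions for a typical 64-bit target (8-byte size field,
16-byte alignment, 32-byte minimum chunk). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for glibc malloc on a typical 64-bit target. */
enum { SIZE_SZ = 8, MALLOC_ALIGNMENT = 16, MINSIZE = 32 };

static_assert(MINSIZE % MALLOC_ALIGNMENT == 0, "minimum chunk must be aligned");

/* Bytes of heap consumed by a chunk that serves a request of the given size. */
static inline size_t chunk_size(size_t request)
{
    size_t mask = (size_t)MALLOC_ALIGNMENT - 1;
    size_t padded = (request + SIZE_SZ + mask) & ~mask;
    return padded < MINSIZE ? MINSIZE : padded;
}

/* Bytes in the chunk that the caller did not ask for. */
static inline size_t wasted_bytes(size_t request)
{
    return chunk_size(request) - request;
}
```

Under these assumptions a 13-byte request occupies a 32-byte chunk, so 19 of the
32 bytes are overhead. That is the utilization gap a packed arena for small
objects would be aiming at.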
(3) Requirements vs. change.
While Facebook/BSD (jemalloc), Google (tcmalloc), Microsoft (mimalloc) have very
good allocators, issues seen with those allocators can be more difficult to
correct because of the impact those changes have on wider workloads beyond
distribution workloads.
For example, if GrapheneOS, with its own goals, and Fedora, with its own goals,
had a conflict of interest over the direction of the allocator, e.g. cost vs.
security, what kind of choice would the hardened_malloc maintainers make?
Upstream glibc has largely been aligned with traditional distribution
requirements for a long time, and continues to be aligned with the notion
of a "general purpose" distribution via the contributors and deep network
of developers in the distributions:
https://sourceware.org/glibc/wiki/MAINTAINERS#Distribution_Maintainers
---
The combination of (1), (2) and (3) means that for general purpose
distributions, staying with glibc's malloc means being part of
an ecosystem of distributions that use the same allocator and
benefit from wide application testing, development, and support
when required.
It would be easier to approach glibc upstream and convince them that the
default allocator in glibc should be replaced with hardened_malloc or
jemalloc or tcmalloc or mimalloc...
--
Cheers,
Carlos.
[1] https://patchwork.sourceware.org/project/glibc/patch/xn4jz19fts.fsf@greed...