F35 Change: Memory Constraints macros for RPM (System-Wide Change proposal)
by Ben Cotton
https://fedoraproject.org/wiki/Changes/MemoryConstraintsMacros
== Summary ==
Introduce macros, similar to openSUSE's
[https://build.opensuse.org/package/show/openSUSE:Factory/memory-constraints
memory-constraints], for optionally limiting build parallelism for
packages whose builds are memory-bound.
== Owner ==
* Name: [[User:salimma|Michel Alexandre Salim]]
* Email: michel AT michel-slm DOT name
== Detailed Description ==
Some source packages use more memory per build thread than the RAM:CPU
ratio of some of our builders provides. Further, this ratio can differ
between build servers and between architectures.
At the moment, such packages
([https://src.fedoraproject.org/rpms/ceph/blob/d7454e4e0a98208dc569553b901a...
ceph], [https://src.fedoraproject.org/rpms/chromium/blob/baaf27b384295d6288ef367d...
chromium], [https://src.fedoraproject.org/rpms/mcrouter/blob/a0f7ecad2ccc51c4214646b0...
mcrouter]) have to implement their own logic for determining the safe
amount of parallelism, and redefine `_smp_build_ncpus`.
When this proposal is implemented, they can instead declaratively
specify the amount of RAM needed per build thread, e.g.
%limit_build -m 8192
to declare that each build thread should be allocated 8 GB of RAM.
Since Koji supports
[https://docs.pagure.org/koji/release_notes/release_notes_1.18/#system-cha...
setting default values for macros], there will be a macro for the
default memory limit (name TBD) that, if set, will be used to cap
`_smp_build_ncpus` unless overridden by `%limit_build -m`.
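The capping described above can be illustrated with a minimal sketch. This is not the implementation of the proposed macros or of openSUSE's memory-constraints package; the function names, the floor of one job, and the way the default limit is passed in are all assumptions made for illustration.

```python
def capped_jobs(ncpus, mem_total_mb, mem_per_job_mb=None,
                default_mem_per_job_mb=None):
    """Return how many parallel build jobs fit within CPU count and memory.

    Hypothetical helper: mem_per_job_mb stands in for `%limit_build -m`,
    default_mem_per_job_mb for the builder-wide default macro (name TBD).
    """
    # An explicit per-package limit overrides the builder-wide default.
    limit = mem_per_job_mb if mem_per_job_mb is not None else default_mem_per_job_mb
    if limit is None:
        return ncpus  # no memory constraint configured
    if limit <= 0:
        raise ValueError("memory per job must be positive")
    by_memory = mem_total_mb // limit
    # Assumption: always allow at least one job so the build can proceed.
    return max(1, min(ncpus, by_memory))


def mem_total_mb(meminfo_text):
    """Extract MemTotal (reported in kB) from /proc/meminfo-style text, in MB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemTotal not found")
```

Under these assumptions, a builder with 16 CPUs and 32 GB of RAM building a package that declares 8192 MB per thread would run 4 jobs instead of 16.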
I'm proposing to tentatively call the macro package
`build-constraints-rpm-macros` to allow the possibility of adding
macros for related needs e.g. [https://pagure.io/copr/copr/issue/1678
build timeouts] to the same package.
== Benefit to Fedora ==
This change simplifies maintaining specs for software that is
memory-bound rather than CPU-bound on our build servers.
It could potentially improve build reliability for these packages, by
reducing the number of jobs failing because of OOM errors, and reduce
the need for package maintainers to debug these failures.
By keeping the user-facing API aligned with what openSUSE is using, we
open up the possibility to collaborate with them and with the upstream
RPM project to get such macros upstreamed into RPM itself (see
[https://github.com/rpm-software-management/rpm/pull/821 previous
attempt]). Note that upstreaming is not in scope for this Change.
== Scope ==
* Proposal owners:
** Introduce new macros
** Update known packages to use the new macros, replacing their custom
`_smp_build_ncpus` overrides
* Other developers:
** The proposal owners might not catch all instances of such logic.
Individual package maintainers can try refactoring their packages
using these new macros.
* Release engineering: [https://pagure.io/releng/issue/10188 #10188]
No mass rebuild needed. Affected packages should be rebuilt using the new macro.
* Policies and guidelines: The packaging guidelines can be updated to
recommend using these macros for build-time memory-bound packages.
* Trademark approval: N/A (not needed for this Change)
* Alignment with Objectives: N/A
== Upgrade/compatibility impact ==
No impact; this affects package building only. Also, the use of the new
macros is optional.
== How To Test ==
1. Install `build-constraints-rpm-macros`
2. Modify the spec to set `%limit_build -m AMOUNT_IN_MB` in `%build`
3. Rebuild in Koji and make sure it passes on all supported architectures
== User Experience ==
== Dependencies ==
This can optionally be added as a dependency of `redhat-rpm-config`
and `epel-rpm-macros`, depending on how many packages need it.
== Contingency Plan ==
* Contingency mechanism: Revert changed packages to their previous way
of capping the number of build jobs.
* Contingency deadline: beta
* Blocks release? No
== Documentation ==
Previous discussion:
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.o...
openSUSE implementation:
https://build.opensuse.org/package/show/openSUSE:Factory/memory-constraints
--
Ben Cotton
He / Him / His
Fedora Program Manager
Red Hat
TZ=America/Indiana/Indianapolis
Re: libc_malloc_debug.so vs .so.0 in glibc 2.34
by Richard W.M. Jones
On Tue, Aug 24, 2021 at 10:42:32AM +0530, Siddhesh Poyarekar wrote:
> On Tue, Aug 24, 2021 at 2:52 AM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> > Anyway this is not the reason why I'm writing this email on Fedora
> > devel list. In the Fedora package we have:
> >
> > $ rpm -qf /usr/lib64/libc_malloc_debug.so
> > glibc-devel-2.34-1.fc35.x86_64
> > $ rpm -qf /usr/lib64/libc_malloc_debug.so.0
> > glibc-2.34-1.fc35.x86_64
> >
> > I'm wondering why it was decided to put the symlink and the file in
> > the two different packages?
> >
> > Is it intended that end users use:
> >
> > LD_PRELOAD=/usr/lib64/libc_malloc_debug.so.0 # (1)
>
> This one is correct.
Thanks - I made this change to nbdkit:
https://gitlab.com/nbdkit/nbdkit/-/commit/8972831aa2a32d4b5820465d37c1827...
Will the .0 ever change?
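
For readers who want to try form (1), here is a minimal sketch of launching a program with the debugging DSO preloaded. The helper name is hypothetical, and the `MALLOC_CHECK_` value of 3 is an assumed, commonly used setting; check the glibc documentation for the semantics on your version.

```python
def malloc_debug_env(base_env,
                     lib_path="/usr/lib64/libc_malloc_debug.so.0",
                     check_level="3"):
    """Return a copy of base_env with malloc checking enabled.

    Hypothetical helper: lib_path is the form (1) path from this thread;
    check_level is an assumed MALLOC_CHECK_ value, not a recommendation.
    """
    env = dict(base_env)  # do not modify the caller's environment
    existing = env.get("LD_PRELOAD", "")
    # Keep any libraries that are already preloaded; ld.so accepts a
    # colon-separated list, and the debug DSO is placed first.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    env["MALLOC_CHECK_"] = check_level
    return env
```

Passing the result as `env=` to `subprocess.run` (for example with `os.environ` as `base_env`) would start the child process with the checks enabled, provided the library is installed at that path.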
...
> > If it's (2) then that's a symlink to (1), so why have the possibility
> > of installing only the glibc package thus getting only the file
> > /usr/lib64/libc_malloc_debug.so.0 which is not useful on its own?
> >
> > Also is this feature intended for developers (glibc-devel) or everyone
> > (glibc)? (My preference is "everyone" - we've found that asking bug
> > reporters to enable MALLOC_CHECK_ in an ad hoc way can be a good way
> > to see if a bug is a memory corruption problem.)
>
> The intent is to separate debugging from core functionality to improve
> security and performance (and make it easier for us to improve malloc
> in future but that's unrelated in this context), so it will likely end
> up in glibc-utils (that's where mtrace is) or its own package. That
> will give administrators the option to remove libc_malloc_debug.so.0
> from the system. We (i.e. glibc package maintainers) are yet to agree
> on the right place for this, mainly due to forgetting about it after
> the mass rebuild.
Makes sense, thanks.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
Re: libc_malloc_debug.so vs .so.0 in glibc 2.34
by Siddhesh Poyarekar
[cursed email aliases!]
On Tue, Aug 24, 2021 at 10:42 AM Siddhesh Poyarekar <sipoyare(a)redhat.com> wrote:
>
> On Tue, Aug 24, 2021 at 2:52 AM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> > Anyway this is not the reason why I'm writing this email on Fedora
> > devel list. In the Fedora package we have:
> >
> > $ rpm -qf /usr/lib64/libc_malloc_debug.so
> > glibc-devel-2.34-1.fc35.x86_64
> > $ rpm -qf /usr/lib64/libc_malloc_debug.so.0
> > glibc-2.34-1.fc35.x86_64
> >
> > I'm wondering why it was decided to put the symlink and the file in
> > the two different packages?
> >
> > Is it intended that end users use:
> >
> > LD_PRELOAD=/usr/lib64/libc_malloc_debug.so.0 # (1)
>
> This one is correct.
>
> > or
> >
> > LD_PRELOAD=/usr/lib64/libc_malloc_debug.so # (2)
> >
> > If it's (1) then that is a file, so why bother with the symlink?
>
> I have an open bug and PR to resolve this:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1985048
> https://src.fedoraproject.org/rpms/glibc/pull-request/37
>
> I waited because we were too close to the mass rebuild and then it
> fell off my plate, sorry :/
>
> > If it's (2) then that's a symlink to (1), so why have the possibility
> > of installing only the glibc package thus getting only the file
> > /usr/lib64/libc_malloc_debug.so.0 which is not useful on its own?
> >
> > Also is this feature intended for developers (glibc-devel) or everyone
> > (glibc)? (My preference is "everyone" - we've found that asking bug
> > reporters to enable MALLOC_CHECK_ in an ad hoc way can be a good way
> > to see if a bug is a memory corruption problem.)
>
> The intent is to separate debugging from core functionality to improve
> security and performance (and make it easier for us to improve malloc
> in future but that's unrelated in this context), so it will likely end
> up in glibc-utils (that's where mtrace is) or its own package. That
> will give administrators the option to remove libc_malloc_debug.so.0
> from the system. We (i.e. glibc package maintainers) are yet to agree
> on the right place for this, mainly due to forgetting about it after
> the mass rebuild.
>
> Siddhesh
krita build failure with inlining on ppc64le only
by Richard Shaw
I'm trying to finalize all the builds for OpenEXR/Imath 3.1 in my side tag
but am having a tough time with krita, which fails with a few errors [1]
like this:
In file included from /usr/include/eigen3/Eigen/src/Core/arch/AltiVec/MatrixProduct.h:18,
                 from /usr/include/eigen3/Eigen/Core:350,
                 from /usr/include/eigen3/Eigen/Dense:1,
                 from /builddir/build/BUILD/krita-4.4.5/libs/global/KisBezierUtils.cpp:36:
/usr/include/eigen3/Eigen/src/Core/arch/AltiVec/MatrixProductMMA.h: In function 'Eigen::internal::ploadRhsMMA<float, float __vector(4)>(float const*, float __vector(4)&)void':
/usr/include/eigen3/Eigen/src/Core/arch/AltiVec/MatrixProductCommon.h:215:28: error: inlining failed in call to 'always_inline' 'Eigen::internal::ploadRhs<float, float __vector(4)>(float const*)float __vector(4)': target specific option mismatch
Thoughts?
Interestingly, though I don't know whether it's related, vips with %check
enabled also failed one test on ppc64le only.
Thanks,
Richard
[1] https://kojipkgs.fedoraproject.org//work/tasks/6679/74326679/build.log
libc_malloc_debug.so vs .so.0 in glibc 2.34
by Richard W.M. Jones
I can't say I'm a huge fan of the new libc_malloc_debug.so library
which is now required to do malloc checking:
* Debugging features in malloc such as the MALLOC_CHECK_ environment
variable (or the glibc.malloc.check tunable), mtrace() and mcheck()
have now been disabled by default in the main C library. Users
looking to use these features now need to preload a new debugging
DSO libc_malloc_debug.so to get this functionality back.
[from https://sourceware.org/pipermail/libc-alpha/2021-August/129718.html]
I just added this to a project and it's a lot more work than the old way:
https://gitlab.com/nbdkit/nbdkit/-/commit/362e0fdcae37db876e13b944102a5c1...
Anyway this is not the reason why I'm writing this email on Fedora
devel list. In the Fedora package we have:
$ rpm -qf /usr/lib64/libc_malloc_debug.so
glibc-devel-2.34-1.fc35.x86_64
$ rpm -qf /usr/lib64/libc_malloc_debug.so.0
glibc-2.34-1.fc35.x86_64
I'm wondering why it was decided to put the symlink and the file in
the two different packages?
Is it intended that end users use:
LD_PRELOAD=/usr/lib64/libc_malloc_debug.so.0 # (1)
or
LD_PRELOAD=/usr/lib64/libc_malloc_debug.so # (2)
If it's (1) then that is a file, so why bother with the symlink?
If it's (2) then that's a symlink to (1), so why have the possibility
of installing only the glibc package thus getting only the file
/usr/lib64/libc_malloc_debug.so.0 which is not useful on its own?
Also is this feature intended for developers (glibc-devel) or everyone
(glibc)? (My preference is "everyone" - we've found that asking bug
reporters to enable MALLOC_CHECK_ in an ad hoc way can be a good way
to see if a bug is a memory corruption problem.)
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
OpenColorIO 2.0: armv7hf only linker error
by Richard Shaw
I'm working on updating OpenColorIO to 2.0.1 and building in a side tag,
but the build failed, on armv7hf only, with:
usr/lib/libpystring.so /usr/lib/libyaml-cpp.so.0.6.3
../testutils/libtestutils.a -lm ../../src/apputils/libapputils.a
/usr/bin/ld: CMakeFiles/test_cpu_exec.dir/Processor_tests.cpp.o (symbol from plugin): in function `OpenColorIO_v2_0::ProcessorMetadata::ProcessorMetadata()':
(.text+0x0): multiple definition of `typeinfo name for OpenColorIO_v2_0::ProcessorCache<unsigned int, std::shared_ptr<OpenColorIO_v2_0::Processor> >'; CMakeFiles/test_cpu_exec.dir/Config_tests.cpp.o (symbol from plugin):(.text+0x0): first defined here
/usr/bin/ld: CMakeFiles/test_cpu_exec.dir/Processor_tests.cpp.o (symbol from plugin): in function `OpenColorIO_v2_0::ProcessorMetadata::ProcessorMetadata()':
(.text+0x0): multiple definition of `typeinfo for OpenColorIO_v2_0::ProcessorCache<unsigned int, std::shared_ptr<OpenColorIO_v2_0::Processor> >'; CMakeFiles/test_cpu_exec.dir/Config_tests.cpp.o (symbol from plugin):(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
Ideas?
Thanks,
Richard