From: Nico Pache npache@redhat.com
Redhat: enable Kfence on production servers
KFENCE is a low-overhead memory error detector that can be deployed in production. Enabling it allows for better and quicker bug reporting and fixing.
The Kscale team has done some performance testing and has concluded there is no noticeable performance impact.
Enable for ELN and remove the Fedora-specific config definition. We may want to consider enabling CONFIG_KFENCE_DEFERRABLE on Fedora... This config allows the CPU wakeup to be deferred, which is better suited to power-constrained systems, and those are more likely to be running Fedora.
Signed-off-by: Nico Pache npache@redhat.com
diff --git a/redhat/configs/common/debug/CONFIG_KFENCE b/redhat/configs/common/debug/CONFIG_KFENCE
new file mode 100644
index blahblah..blahblah 100644
--- /dev/null
+++ b/redhat/configs/common/debug/CONFIG_KFENCE
@@ -0,0 +1 @@
+# CONFIG_KFENCE is not set
diff --git a/redhat/configs/common/generic/CONFIG_KFENCE b/redhat/configs/common/generic/CONFIG_KFENCE
index blahblah..blahblah 100644
--- a/redhat/configs/common/generic/CONFIG_KFENCE
+++ b/redhat/configs/common/generic/CONFIG_KFENCE
@@ -1 +1 @@
-# CONFIG_KFENCE is not set
+CONFIG_KFENCE=y
diff --git a/redhat/configs/common/generic/CONFIG_KFENCE_DEFERRABLE b/redhat/configs/common/generic/CONFIG_KFENCE_DEFERRABLE
new file mode 100644
index blahblah..blahblah 100644
--- /dev/null
+++ b/redhat/configs/common/generic/CONFIG_KFENCE_DEFERRABLE
@@ -0,0 +1 @@
+# CONFIG_KFENCE_DEFERRABLE is not set
diff --git a/redhat/configs/fedora/generic/CONFIG_KFENCE_NUM_OBJECTS b/redhat/configs/common/generic/CONFIG_KFENCE_NUM_OBJECTS
rename from redhat/configs/fedora/generic/CONFIG_KFENCE_NUM_OBJECTS
rename to redhat/configs/common/generic/CONFIG_KFENCE_NUM_OBJECTS
index blahblah..blahblah 100644
--- a/redhat/configs/fedora/generic/CONFIG_KFENCE_NUM_OBJECTS
+++ b/redhat/configs/common/generic/CONFIG_KFENCE_NUM_OBJECTS
diff --git a/redhat/configs/common/generic/CONFIG_KFENCE_SAMPLE_INTERVAL b/redhat/configs/common/generic/CONFIG_KFENCE_SAMPLE_INTERVAL
new file mode 100644
index blahblah..blahblah 100644
--- /dev/null
+++ b/redhat/configs/common/generic/CONFIG_KFENCE_SAMPLE_INTERVAL
@@ -0,0 +1 @@
+CONFIG_KFENCE_SAMPLE_INTERVAL=100
diff --git a/redhat/configs/common/generic/CONFIG_KFENCE_STATIC_KEYS b/redhat/configs/common/generic/CONFIG_KFENCE_STATIC_KEYS
new file mode 100644
index blahblah..blahblah 100644
--- /dev/null
+++ b/redhat/configs/common/generic/CONFIG_KFENCE_STATIC_KEYS
@@ -0,0 +1 @@
+CONFIG_KFENCE_STATIC_KEYS=y
diff --git a/redhat/configs/fedora/generic/CONFIG_KFENCE b/redhat/configs/fedora/generic/CONFIG_KFENCE
deleted file mode 100644
index blahblah..blahblah 0
--- a/redhat/configs/fedora/generic/CONFIG_KFENCE
+++ /dev/null
@@ -1 +0,0 @@
-CONFIG_KFENCE=y
diff --git a/redhat/configs/fedora/generic/CONFIG_KFENCE_SAMPLE_INTERVAL b/redhat/configs/fedora/generic/CONFIG_KFENCE_SAMPLE_INTERVAL
deleted file mode 100644
index blahblah..blahblah 0
--- a/redhat/configs/fedora/generic/CONFIG_KFENCE_SAMPLE_INTERVAL
+++ /dev/null
@@ -1 +0,0 @@
-CONFIG_KFENCE_SAMPLE_INTERVAL=0
-- https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748
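To make the effect of the patch easier to follow, here is a rough Python sketch of how one-file-per-option config fragments might resolve to an effective setting. It is a simplified model, not the actual kernel-ark merge scripts: the `resolve` helper and the directory priority lists are hypothetical, and only the fragment contents are taken from the diff in this MR.

```python
from typing import Dict, List, Optional

# A subset of the fragment contents after this MR, copied from the diff.
# Keys are paths relative to redhat/configs/.
FRAGMENTS: Dict[str, str] = {
    "common/debug/CONFIG_KFENCE": "# CONFIG_KFENCE is not set",
    "common/generic/CONFIG_KFENCE": "CONFIG_KFENCE=y",
    "common/generic/CONFIG_KFENCE_DEFERRABLE": "# CONFIG_KFENCE_DEFERRABLE is not set",
    "common/generic/CONFIG_KFENCE_SAMPLE_INTERVAL": "CONFIG_KFENCE_SAMPLE_INTERVAL=100",
}


def resolve(option: str, search_order: List[str],
            fragments: Dict[str, str]) -> Optional[str]:
    """Return the fragment line that wins for `option`, or None if no
    directory in `search_order` provides one.

    `search_order` lists directories from lowest to highest priority, so a
    fragment found in a later directory overrides an earlier one.
    """
    result = None
    for directory in search_order:
        content = fragments.get(f"{directory}/{option}")
        if content is not None:
            result = content.strip()
    return result


if __name__ == "__main__":
    # Hypothetical priority lists; the real ordering is defined by the
    # kernel-ark build tooling and may differ.
    generic = ["common/generic", "fedora/generic"]
    debug = ["common/generic", "common/debug"]

    print(resolve("CONFIG_KFENCE", generic, FRAGMENTS))
    print(resolve("CONFIG_KFENCE", debug, FRAGMENTS))
    print(resolve("CONFIG_KFENCE_SAMPLE_INTERVAL", generic, FRAGMENTS))
```

Under these assumptions, once the Fedora-specific fragments are deleted there is nothing left to override the common ones, so Fedora and ELN would pick up the same values, while a debug fragment can still turn the option off.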
From: Nico Pache on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9085650...
There will most likely be `CKI::Failed kernel-results` here. It's a known issue with the parser. The test actually passes. I will work on fixing that now. *probably should have done that first*
From: Justin M. Forbes on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9086053...
Curious on CONFIG_KFENCE_STATIC_KEYS, was there benchmarking done both enabled and disabled? Just asking because the help text seemed to indicate it was "only recommended when using very large sample intervals, or performance has carefully been evaluated with this option." I would not consider 100 to be a "very large sample interval" so is it worth the trade off?
From: Nico Pache on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9086112...
No, no other config was tested. The test suite that was run was pretty thorough and noted no change in performance. I'll send you the details in a private note.
As for why I enabled it, per the help text: `static keys is normally recommended`
From: Justin M. Forbes on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9086179...
Actually, commit 4f612ed3f748962cbef1316ff3d323e2b9055b6e changed it to depend on CONFIG_EXPERT, which is not set, so there is no need for a CONFIG_KFENCE_STATIC_KEYS entry either; it will be set to off.
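One way to double-check a point like this is to inspect the final `.config` of a built kernel rather than the fragments. The following is a minimal Python sketch of such a check. The `parse_kconfig` helper is hypothetical, and the sample text is illustrative rather than output from a real build.

```python
from typing import Dict

NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text: str) -> Dict[str, str]:
    """Parse kernel .config text into {option: value}.

    Lines of the form '# CONFIG_FOO is not set' are recorded as 'n'.
    Other comments and blank lines are ignored.
    """
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            # Drop the leading '# ' and the trailing ' is not set'.
            name = line[2:-len(NOT_SET_SUFFIX)]
            values[name] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            values[name] = value
    return values


# Illustrative sample only; not taken from an actual kernel build.
SAMPLE = """
# Hypothetical excerpt of a generated .config
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=100
# CONFIG_KFENCE_DEFERRABLE is not set
# CONFIG_KFENCE_STATIC_KEYS is not set
"""

if __name__ == "__main__":
    config = parse_kconfig(SAMPLE)
    for option in ("CONFIG_KFENCE", "CONFIG_KFENCE_SAMPLE_INTERVAL",
                   "CONFIG_KFENCE_STATIC_KEYS"):
        # Options absent from the file are reported as 'absent'.
        print(option, "=", config.get(option, "absent"))
```

On an installed system the same function could be pointed at the contents of the kernel's config file, assuming the distribution ships one.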
From: Nico Pache on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9086310...
Interesting... Good catch. Let me circle back with the Kscale team and see if they can test with this changed to =N. Based on that commit, it seems the performance issues arise in systems with a very large number of CPUs, which I'm not sure were tested.
From: Bruno Goncalves on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9166975...
@npache I used the bot to test this MR, but it failed to build for x86_64: https://gitlab.com/redhat/red-hat-ci-tools/kernel/cki-internal-pipelines/cki... trusted-contributors/-/jobs/2349465582 The failure doesn't seem related to the MR though...
On Tue, Apr 19, 2022 at 7:48 AM Bruno Goncalves (via Email Bridge) cki-gitlab@redhat.com wrote:
From: Bruno Goncalves on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9166975...
@npache I used the bot to test this MR, but it failed to build for x86_64: https://gitlab.com/redhat/red-hat-ci-tools/kernel/cki-internal-pipelines/cki... trusted-contributors/-/jobs/2349465582 The failure doesn't seem related to the MR though...
It is indeed unrelated. You are attempting to build with a broken GCC. This bug was fixed in the gcc-12.0.1-0.16 builds.
Justin
From: Nico Pache on gitlab.com https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1748#note_9169626...
Kscale team confirms no performance degradation with STATIC_KEYS=n. It was tested on larger and newer systems as well.
kernel@lists.fedoraproject.org