On 7/19/19 1:09 PM, Jon Masters wrote:
On 7/4/19 6:12 PM, Jon Masters wrote:
> I think we have identified the root cause of the 32-bit builder issue.
> Many thanks to Paul and Peter for assistance in debugging. Here's my
> write-up, and we'll work with the vendor on a suitable mitigation to
> work around any errata:
The hardware vendor has reproduced what I believe to be an erratum.
Meanwhile, I've made a test kernel that forces CONFIG_HIGHPTE to off:
With this kernel, you still get LPAE but leaf level PTEs are no longer
allocated from high memory. This is because I believe the erratum is
caused by stage 1 page table walks in the guest trapping to stage 2
(hypervisor), e.g. for Access bit updates on the host. When those occur,
I believe the guest IPA (guest memory address) is truncated to 32 bits,
but only for page table entry walks. I think normal translation faults
are unaffected by this (TBC).
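
To make the suspected failure mode concrete, here is a minimal Python
sketch of what a 32-bit truncation would do to a PTE page address. It is
only an illustration of the hypothesis above, not a model of the
hardware. The high address 0x140163000 and the helper name
truncate_to_32_bits are made up for the example; 0x40163000 is the PTE
page that shows up in the log further down.

```python
# Illustration only: how a suspected 32-bit truncation of the guest IPA
# would affect PTE pages below and above 4 GiB.

def truncate_to_32_bits(ipa: int) -> int:
    """Keep only the low 32 bits of an IPA (the suspected erratum behaviour)."""
    return ipa & 0xFFFF_FFFF

low_pte_page = 0x40163000       # PTE page below 4 GiB (value from the log)
high_pte_page = 0x1_4016_3000   # hypothetical PTE page above 4 GiB

for ipa in (low_pte_page, high_pte_page):
    reported = truncate_to_32_bits(ipa)
    status = "unchanged" if reported == ipa else "WRONG page reported"
    print(f"IPA {ipa:#x} -> reported {reported:#x} ({status})")
```

If the hypothesis holds, only PTE pages that live above 4 GiB in guest
IPA space are misreported, which is why keeping PTEs out of high memory
would sidestep the problem.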
Normally, we don't allocate PGDs (high level page table pieces) from
high memory (we allocate those from kernel memory caches) but we DO
allocate PTEs specifically from what might be high memory, except when
we force CONFIG_HIGHPTE to off. The patch I'm using is attached.
It's currently being tested. If it works, I'm curious for input on
temporarily carrying this in Fedora. In theory it means an LPAE system
could starve for PTEs if it has many many processes running, but in
practice I'm willing to bet LPAE is mostly used by Fedora for the 32-bit
builders and that few people would actually complain if we did this.
This stayed up for 3+ days. Eventually, there were a couple of faults
that I thought were a problem but it turns out that they weren't and
just generated noise on the host kernel log. So it looks good to go with
the hack that I proposed and that's going to be in Fedora's 5.2 kernel.
The host saw a couple of exits due to speculative page walks in the
guest. They hit my previous logic because of the S1 PTW, but this time
the HPFAR was correct, unlike what we would expect when the 32-bit range
limit is hit.
[359524.820107] JCM: WARNING: Mismatched FIPA and PA translation detected!
[359524.899630] JCM: Hyper faulting far: 0x40163000
[359524.955044] JCM: Guest faulting far: 0xb6dbbf48 (gfn: 0x4016)
[359525.025047] JCM: Guest TTBCR: 0xb5023500, TTBR0: 0x4c99ca80
[359525.092963] JCM: Guest PGD address: 0x4c99ca90
[359525.147312] JCM: Guest PGD: 0x58bf7003
[359525.193319] JCM: Guest PMD address: 0x58bf7db0
[359525.247671] JCM: Guest PMD: 0x40163003
[359525.293678] JCM: Guest PTE address: 0x40163dd8
[359525.348030] JCM: Guest PTE: 0x420000367508fdf
[359525.401338] JCM: Manually translated as: 0xb6dbbf48->0x367508000
[359525.474465] JCM: Faulting IPA page: 0x40163000
[359525.528814] JCM: Faulting PTE page: 0x40163000
[359525.583166] JCM: *** debugging data ***
[359525.630215] JCM: FAR_EL2: 0xb6dbbf48
[359525.674133] JCM: HPFAR_EL2: 0x401630
[359525.718052] JCM: ESR_EL2: 0x8200008b
[359525.761972] JCM: FAR_EL1: 0x4f2e50005b89b4
[359525.812149] JCM: ESR_EL1: 0x20b
[359525.850852] JCM: *** debugging data ***
[359525.897899] JCM: Fault occurred while performing S1 PTW -fixing
[359525.969985] JCM: corrected fault_ipa: 0x40163000
[359526.026423] JCM: Corrected gfn: 0x4016
[359526.072427] JCM: handle access fault
[359526.116347] JCM: ret: 0x1
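
For anyone who wants to check the register values by hand, here is a
small Python sketch that decodes ESR_EL2 and HPFAR_EL2 from the log
above. The field positions (EC in bits [31:26], S1PTW in bit 7, fault
status code in bits [5:0], and HPFAR_EL2 holding IPA bits [..:12]
starting at bit 4) are as I understand the ARMv8 definitions. The 64 KiB
host page size is an assumption, inferred from the logged gfn of 0x4016.

```python
# Decode the ESR_EL2 and HPFAR_EL2 values printed in the log above.
ESR_EL2 = 0x8200008B
HPFAR_EL2 = 0x401630
HOST_PAGE_SHIFT = 16  # assumption: 64 KiB host pages (matches gfn 0x4016)

def decode_esr(esr: int) -> dict:
    """Split out the syndrome fields relevant to this fault."""
    return {
        "ec": (esr >> 26) & 0x3F,    # exception class
        "s1ptw": (esr >> 7) & 0x1,   # fault on a stage 1 page table walk
        "fsc": esr & 0x3F,           # fault status code
    }

def hpfar_to_ipa(hpfar: int) -> int:
    """HPFAR_EL2 stores the faulting IPA page number starting at bit 4."""
    return (hpfar >> 4) << 12

fields = decode_esr(ESR_EL2)
fault_ipa = hpfar_to_ipa(HPFAR_EL2)
gfn = fault_ipa >> HOST_PAGE_SHIFT

print(f"EC={fields['ec']:#x} S1PTW={fields['s1ptw']} FSC={fields['fsc']:#x}")
print(f"fault IPA={fault_ipa:#x} gfn={gfn:#x}")
```

EC 0x20 is an instruction abort from a lower exception level, S1PTW is
set, and FSC 0x0b is an access flag fault at level 3, which lines up
with the "handle access fault" line in the log.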
You can see the FAR reported pfn 0x4016 and that's what we expected, so
the above was just noise from my test kernel on the host, which was
monitoring a bit too carefully and did not actually need to fix anything
this time.
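
The manual translation in the log can be reproduced with a short Python
sketch of a 3-level LPAE stage 1 walk. It is simplified and assumes the
TTBR0 walk starts at level 1 with 8-byte descriptors and that bits
[39:12] of each descriptor hold the next address. GUEST_MEM is a
stand-in for guest memory that holds only the three descriptors printed
in the log, and the function name walk is made up for the example.

```python
# Re-run the guest stage 1 walk using only values from the log.
GUEST_MEM = {
    0x4C99CA90: 0x58BF7003,          # Guest PGD
    0x58BF7DB0: 0x40163003,          # Guest PMD
    0x40163DD8: 0x420000367508FDF,   # Guest PTE
}
TTBR0 = 0x4C99CA80
VA = 0xB6DBBF48               # Guest faulting far
ADDR_MASK = 0xFF_FFFF_F000    # descriptor bits [39:12]

def walk(ttbr0: int, va: int, mem: dict) -> tuple:
    """Return (pgd_addr, pmd_addr, pte_addr, output_page) for va."""
    pgd_addr = ttbr0 + ((va >> 30) & 0x3) * 8                         # VA[31:30]
    pmd_addr = (mem[pgd_addr] & ADDR_MASK) + ((va >> 21) & 0x1FF) * 8  # VA[29:21]
    pte_addr = (mem[pmd_addr] & ADDR_MASK) + ((va >> 12) & 0x1FF) * 8  # VA[20:12]
    output_page = mem[pte_addr] & ADDR_MASK
    return pgd_addr, pmd_addr, pte_addr, output_page

pgd_addr, pmd_addr, pte_addr, output_page = walk(TTBR0, VA, GUEST_MEM)
print(f"PGD address: {pgd_addr:#x}")
print(f"PMD address: {pmd_addr:#x}")
print(f"PTE address: {pte_addr:#x}")
print(f"Translated:  {VA:#x} -> {output_page:#x}")
print(f"PTE page:    {pte_addr & ~0xFFF:#x}")
```

The PTE page comes out as 0x40163000, the same page the HPFAR reported,
so there was nothing to correct in this case.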
Computer Architect | Sent with my Fedora powered laptop