On 05/10/2017 at 06:44 PM, Hatayama, Daisuke wrote:
> On 05/10/2017 at 12:16 PM, Hatayama, Daisuke wrote:
>>> -----Original Message-----
>>> From: Xunlei Pang [mailto:xpang@redhat.com]
>>> On 05/10/2017 at 09:54 AM, Hatayama, Daisuke wrote:
>>>> Pang,
>>>>
>>>> Thanks for cc'ing to me.
>>>>
>>>>> -----Original Message-----
>>>>> From: Xunlei Pang [mailto:xlpang@redhat.com]
>>>>> Sent: Tuesday, May 9, 2017 8:52 PM
>>>>> To: kexec(a)lists.fedoraproject.org
>>>>> Cc: Xunlei Pang <xlpang(a)redhat.com>
>>>>> Subject: [PATCH] kdumpctl: use "apicid" other than "initial apicid"
>>>>>
>>>>> We met a problem on AMD machines: when using "nr_cpus=4" for
>>>>> kdump and the crash happens on a cpu other than cpu0, the kdump
>>>>> kernel fails to boot and eventually resets.
>>>>>
>>>>> After some debugging, we found that it got stuck in the kernel path
>>>>> do_boot_cpu()-> ... ->wakeup_secondary_cpu_via_init():
>>>>> apic_icr_write(APIC_INT_LEVELTRIG|APIC_INT_ASSERT|APIC_DM_INIT,
>>>>> phys_apicid);
>>>>> that is, it got stuck sending INIT from AP to BP and reset, which
>>>>> is exactly what "disable_cpu_apicid=X" tries to solve. Printing
>>>>> the value of @phys_apicid showed that it was the value of "apicid"
>>>>> rather than that of "initial apicid" shown by /proc/cpuinfo.
>>>>>
>>>>> As described in x86 specification:
>>>>> "In MP systems, the local APIC ID is also used as a processor ID by the
>>>>> BIOS and the operating system. Some processors permit software to modify
>>>>> the APIC ID. However, the ability of software to modify the APIC ID is
>>>>> processor model specific. Because of this, operating system software
>>>>> should avoid writing to the local APIC ID register. The value returned by
>>>>> bits 31-24 of the EBX register (when the CPUID instruction is executed with
>>>>> a source operand value of 1 in the EAX register) is always the Initial APIC
>>>>> ID (determined by the platform initialization). This is true even if software
>>>>> has changed the value in the Local APIC ID register."
>>>>>
>>>>> From kernel commit 151e0c7de("x86, apic, kexec: Add disable_cpu_apicid
>>>>> kernel parameter"), we can see that generic_processor_info() uses
>>>>> a)@apicid and b)read_apic_id() to compare with @disabled_cpu_apicid.
>>>>>
>>>>> a)@apicid, which is actually the @phys_apicid mentioned above, comes from
>>>>> the following calltrace (on the problematic AMD machine):
>>>>> generic_processor_info+0x37/0x300
>>>>> acpi_register_lapic+0x30/0x90
>>>>> acpi_parse_lapic+0x40/0x50
>>>>> acpi_table_parse_entries_array+0x171/0x1de
>>>>> acpi_boot_init+0xed/0x50f
>>>>> The value of @apicid (from the ACPI MADT) is equal to the value of "apicid"
>>>>> shown by /proc/cpuinfo, as confirmed by our debug printk.
>>>>> b)read_apic_id() gets the value from the LAPIC ID register, which is "apicid"
>>>>> as well.
>>>>>
>>>>> The value of "initial apicid", in contrast, comes from the cpuid instruction.
>>>>>
>>>>> One example of "apicid" and "initial apicid" of cpu0 from /proc/cpuinfo
>>>>> on an AMD machine:
>>>>> apicid : 32
>>>>> initial apicid : 0
>>>>>
>>>>> Therefore, we should assign the /proc/cpuinfo "apicid" to
>>>>> "disable_cpu_apicid=X".
>>>>> We've never met this issue before because we usually tested "nr_cpus=1",
>>>>> mostly on Intel machines, where "apicid" and "initial apicid" have the
>>>>> same value in most cases.
>>>>>
>>>> For my understanding, could you show me the following information
>>>> on the AMD machines?
>>>>
>>>> - dmesg | grep "ACPI: LAPIC"
>>>> - /proc/cpuinfo
>>> # dmesg | grep "ACPI: LAPIC"
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x11] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x12] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x13] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x14] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x15] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x16] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x17] enabled)
>>> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
>>>
>>> # cat /proc/cpuinfo (there are 8 cpus, paste 4 cpus here)
>>> processor : 0
>>> vendor_id : AuthenticAMD
>>> cpu family : 21
>>> model : 2
>>> model name : AMD FX(tm)-8350 Eight-Core Processor
>>> stepping : 0
>>> microcode : 0x600084f
>>> cpu MHz : 4000.000
>>> cache size : 2048 KB
>>> physical id : 0
>>> siblings : 8
>>> core id : 0
>>> cpu cores : 4
>>> apicid : 16
>>> initial apicid : 0
>> Thanks for this information.
>>
>> I had wrongly assumed that the MADT lists the initial APIC id, at least
>> for the BSP. I cannot recall exactly why I got this wrong, but looking
>> back at Intel's Architectures Software Developer's Manual, I found
>> the description "5. As part of the boot-strap code, the BSP creates
>> an ACPI table and/or an MP table and adds its initial APIC ID to
>> these tables as appropriate." in 8.4.3 MP Initialization Protocol
>> Algorithm for MP Systems, so I guess this was probably the reason.
> I couldn't find an Intel machine with different "apicid" and "initial apicid",
> so it's hard to verify that.
>
> Maybe it's different for AMD: I tested three different AMD machines, and on
> all of them the apicid from the ACPI table has the same value as the
> /proc/cpuinfo "apicid".
>
> For AMD:
> 1) apicid is initialized by
> init_amd(): c->apicid = hard_smp_processor_id(); // calls read_apic_id()
> 2) initial apicid is initialized by
> generic_identify(): c->initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF;
I'm not saying the Intel machine works like this.
I'm just explaining how I got it wrong...
It is correct to pass the local "apicid" from /proc/cpuinfo to the
disable_cpu_apicid parameter, because the MADT lists local APIC IDs, which
are not necessarily the initial ones.
> Maybe I can apply this patch only for AMD machines for safety?
I don't think such a limitation is necessary, because your patch has no
additional impact on systems where the local apicid is equal to the
initial apicid.
Ok, thanks for the explanation.
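
For illustration, the choice discussed above (passing the /proc/cpuinfo
"apicid" of the boot cpu, rather than its "initial apicid", to
disable_cpu_apicid) can be sketched as below. This is only a sketch in
Python, not the actual kdumpctl change: the helper names apicid_of() and
boot_cpu_param() and the sample text are hypothetical, and the sample
values are taken from the cpuinfo output quoted earlier in the thread.

```python
# Sample text in /proc/cpuinfo format. Hypothetical one-cpu excerpt whose
# values mirror the AMD output quoted earlier in this thread.
SAMPLE_CPUINFO = """\
processor\t: 0
vendor_id\t: AuthenticAMD
apicid\t\t: 16
initial apicid\t: 0
"""


def apicid_of(cpuinfo_text, cpu):
    """Return the "apicid" field (not "initial apicid") of the given cpu.

    Returns None if the cpu or the field is not present in the text.
    """
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line between cpu blocks
        key = key.strip()
        value = value.strip()
        if key == "processor":
            current = int(value)
        elif key == "apicid" and current == cpu:
            # Exact key match, so "initial apicid" is never picked up.
            return int(value)
    return None


def boot_cpu_param(cpuinfo_text):
    """Build the "disable_cpu_apicid=X" string from cpu0's "apicid"."""
    apicid = apicid_of(cpuinfo_text, 0)
    if apicid is None:
        return None
    return "disable_cpu_apicid=%d" % apicid


if __name__ == "__main__":
    print(boot_cpu_param(SAMPLE_CPUINFO))
```

On a real system the text would come from reading /proc/cpuinfo. Matching
the key exactly is the point of the sketch, since a loose match on
"apicid" would also hit the "initial apicid" line.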
>> Then, in this system, cpu0 has 16 as its APIC id. Is this the same
>> system as you mentioned in the patch description? The patch description
>> says that the APIC id of cpu0 is 32. Or could the APIC id change
>> at each boot, or in the worst case at each kdump kexec? The latter case
>> would mean that disable_cpu_apicid doesn't work well on such a system.
>>
> Sorry, I got them from two different AMD machines; the APIC ID stays
> the same across reboots.
>
So, on the AMD machines, the BSP's local APIC ID is unchanged until boot time
of the kdump 2nd kernel. Then, disable_cpu_apicid works well on them.
The condition for disable_cpu_apicid to work well is that the BSP's local
APIC ID is kept unchanged until boot time of the kdump 2nd kernel.
I think it is necessary to confirm when the local APIC ID can change in
general, and whether the BSP's local APIC ID can change.
I have no idea about these now.
However, honestly, I guess such a case is unlikely to happen except
for some bug...
Yes, I'm personally not worried about it; the spec also recommends that
software not touch the value of the local APIC ID.
Thanks,
Xunlei
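
As a small worked example of the rule quoted from the x86 specification
earlier in the thread (the Initial APIC ID is bits 31-24 of EBX after
CPUID with EAX=1), the following Python sketch mirrors the kernel
expression (cpuid_ebx(1) >> 24) & 0xFF. The function name
initial_apic_id() and the EBX values are made up for illustration; they
were not read from any real machine.

```python
def initial_apic_id(ebx):
    """Extract bits 31-24 of a 32-bit EBX value (CPUID leaf 1)."""
    return (ebx >> 24) & 0xFF


if __name__ == "__main__":
    # Hypothetical EBX values, chosen only to exercise the bit field.
    for ebx in (0x00040800, 0x10000800, 0xFF123456):
        print(hex(ebx), "->", initial_apic_id(ebx))
```

This value stays fixed by platform initialization, whereas the "apicid"
field comes from the local APIC ID register, which is why the two can
differ on the AMD machines discussed above.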