Ok, thanks for the clarification.
Coiby Xu <coxu(a)redhat.com> wrote on Mon, Apr 25, 2022 at 11:35 AM:
On Sat, Apr 23, 2022 at 10:47:57PM +0800, Kairui Song wrote:
>Coiby Xu <coxu(a)redhat.com> wrote on Sat, Apr 2, 2022 at 11:25:
>>
>> A NIC may get a different name in the kdump kernel from the 1st kernel
>> in cases like,
>> - kernel assigned network interface names are not persistent e.g. [1]
>> - there is a udev rule to rename the NIC in the 1st kernel but the
>>   kdump initrd may not have that rule e.g. [2]
>>
>> If NM tries to match a NIC with a connection profile based on NIC name
>> i.e. connection.interface-name, it will fail in the above cases. So we
>> should remove the line connection.interface-name=XX from the connection
>> file. With this line deleted, multiple NICs may be matched by a
>> connection profile but we don't need to worry about a wrong NIC being
>> brought up by NM since we have explicitly asked NM to only bring up the
>> NICs needed by kdump via /etc/NetworkManager/conf.d/10-kdump-netif.conf.
>> Note we don't need to do this for user-created NICs like vlan, bridge and
>> bond.
>>
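For illustration, a minimal sketch of the keyfile edit described above, run against a throwaway sample profile rather than a real one. The profile contents and the helper name strip_interface_name are made up for this example. It also deletes the line with sed's d command instead of blanking it with s///g as the patch does, which is an assumption about what is acceptable here, not what the patch implements.

```shell
# Hypothetical helper: drop the interface-name key from an NM keyfile so
# the profile is no longer tied to one specific NIC name.
strip_interface_name() {
    _tmp=$(mktemp)
    sed '/^interface-name=/d' "$1" > "$_tmp" && mv "$_tmp" "$1"
}

# Sample connection profile, invented for this example.
profile=$(mktemp)
cat > "$profile" << 'EOF'
[connection]
id=eth0
type=ethernet
interface-name=eth0

[ipv4]
method=auto
EOF

strip_interface_name "$profile"
cat "$profile"
```

With the key gone, which device the profile applies to is decided by the allowlist in 10-kdump-netif.conf, as the commit message says.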
>> A remaining issue is passing the name of a NIC via the kdumpnic dracut
>> command line parameter, which requires passing ifname=<interface>:<MAC>
>> to have a fixed NIC name. But we can simply drop this requirement. kdumpnic
>> is needed because kdump needs to get the IP by NIC name and use the IP
>> to create a dumping folder named "{IP}-{DATE}". We can simply pass the
>> IP to the kdump kernel directly via a new dracut command line parameter
>> kdumpip instead. In addition to the benefit of simplifying the code,
>> there are three other benefits brought by this approach,
>> - make use of whatever network is available to transfer the vmcore. As
>>   long as we have a network, we don't care which NIC is active.
>> - if the IP obtained in the kdump kernel is different from the one in the
>>   1st kernel, "{IP}-{DATE}" would better tell where the dumped vmcore
>>   comes from.
>> - without passing ifname=<interface>:<MAC> to the kdump initrd, the
>>   issue of there being two interfaces with the same MAC address for
>>   Azure Hyper-V NIC SR-IOV [3] is resolved automatically.
>>
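As a rough sketch of the build-time step described above: the IP is taken from the first inet line of "ip addr show"-style output and later used as the prefix of the dump folder name. The sample output, the helper name first_ip_from_addr_output and the date format are assumptions made for this example. The pipeline mirrors the grep/head/awk chain in the patch, except that cut strips the prefix length instead of the ${var%%/*} expansion.

```shell
# Hypothetical helper: print the first address (without the prefix
# length) found in "ip addr show"-style text read from stdin.
first_ip_from_addr_output() {
    grep '[ ]*inet' | head -n 1 | awk '{print $2}' | cut -d/ -f1
}

# Sample output, invented for this example (documentation address range).
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.10/24 brd 192.0.2.255 scope global dynamic eth0
    inet6 fe80::5054:ff:fe12:3456/64 scope link'

kdumpip=$(printf '%s\n' "$sample" | first_ip_from_addr_output)

# "{IP}-{DATE}" folder name; the date format here is an assumption.
dump_dir="${kdumpip}-$(date +%Y-%m-%d-%T)"
echo "$dump_dir"
```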
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1121778
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=810107
>> [3] https://bugzilla.redhat.com/show_bug.cgi?id=1962421
>>
>> Signed-off-by: Coiby Xu <coxu(a)redhat.com>
>> ---
>> dracut-kdump.sh | 18 ++++--------------
>> dracut-module-setup.sh | 38 ++++++++++++++------------------------
>> 2 files changed, 18 insertions(+), 38 deletions(-)
>>
>> diff --git a/dracut-kdump.sh b/dracut-kdump.sh
>> index b69bc98..e27be61 100755
>> --- a/dracut-kdump.sh
>> +++ b/dracut-kdump.sh
>> @@ -475,22 +475,12 @@ save_vmcore_dmesg_ssh()
>> get_host_ip()
>> {
>> if is_nfs_dump_target || is_ssh_dump_target; then
>> - kdumpnic=$(getarg kdumpnic=)
>> - if [ -z "$kdumpnic" ]; then
>> - derror "failed to get kdumpnic!"
>> + kdumpip=$(getarg kdumpip=)
>> + if [ -z "$kdumpip" ]; then
>> + derror "failed to get kdumpip!"
>> return 1
>> fi
>> - if ! kdumphost=$(ip addr show dev "$kdumpnic" | grep '[ ]*inet'); then
>> - derror "wrong kdumpnic: $kdumpnic"
>> - return 1
>> - fi
>> - kdumphost=$(echo "$kdumphost" | head -n 1 | awk '{print $2}')
>> - kdumphost="${kdumphost%%/*}"
>> - if [ -z "$kdumphost" ]; then
>> - derror "wrong kdumpnic: $kdumpnic"
>> - return 1
>> - fi
>> - HOST_IP=$kdumphost
>> + HOST_IP=$kdumpip
>> fi
>> return 0
>> }
>> diff --git a/dracut-module-setup.sh b/dracut-module-setup.sh
>> index c05666e..8b28141 100755
>> --- a/dracut-module-setup.sh
>> +++ b/dracut-module-setup.sh
>> @@ -210,26 +210,6 @@ kdump_get_perm_addr() {
>> fi
>> }
>>
>> -# Prefix kernel assigned names with "kdump-". EX: eth0 -> kdump-eth0
>> -# Because kernel assigned names are not persistent between 1st and 2nd
>> -# kernel. We could probably end up with eth0 being eth1, eth0 being
>> -# eth1, and naming conflict happens.
>> -kdump_setup_ifname() {
>> - local _ifname
>> -
>> - # If ifname already has 'kdump-' prefix, we must be switching from
>> - # fadump to kdump. Skip prefixing 'kdump-' in this case as adding
>> - # another prefix may truncate the ifname. Since an ifname with
>> - # 'kdump-' is already persistent, this should be fine.
>> - if [[ $1 =~ eth* ]] && [[ ! $1 =~ ^kdump-* ]]; then
>> - _ifname="kdump-$1"
>> - else
>> - _ifname="$1"
>> - fi
>> -
>> - echo "$_ifname"
>> -}
>> -
>> kdump_copy_nmconnection_file() {
>> local _dev _nmconnection_file_path _nmconnection_name _initrd_nmconnection_file_path
>> local _cloned_nmconnection_file_path _uniq_name _old_uuid _old_name _uuid _per_mac
>> @@ -264,6 +244,8 @@ kdump_copy_nmconnection_file() {
>> _per_mac=$(kdump_get_perm_addr "$_dev")
>> if [[ "$_per_mac" != 'not set' ]]; then
>> echo -n "except:mac:$_per_mac," >> "/tmp/$$-netif_allowlist"
>> + # Ask NM to not match a connection profile based on interface-name
>> + sed -i -E "s/^interface-name=.*$//g" "${initdir}/$_initrd_nmconnection_file_path"
>> else
>> echo -n "except:interface-name:$_dev," >> "/tmp/$$-netif_allowlist"
>> fi
>> fi
>> @@ -428,7 +410,7 @@ kdump_get_remote_ip() {
>> # initramfs accessing giving destination
>> # $1: destination host
>> kdump_install_net() {
>> - local _destaddr _route _netdev _conpath kdumpnic
>> + local _destaddr _route _netdev _conpath _kdumpip
>> local _znet_netdev _znet_conpath
>> # each netowrk interface is managed by a NM connection profile
>> declare -A nmconnection_map
>> @@ -441,9 +423,16 @@ kdump_install_net() {
>> _route=$(kdump_get_ip_route "$_destaddr")
>> _netdev=$(kdump_get_ip_route_field "$_route" "dev")
>> _conpath=$(get_nmcli_connection_apath_by_ifname "$_netdev")
>> - kdumpnic=$(kdump_setup_ifname "$_netdev")
>>
>> + if ! _kdumpip=$(ip addr show dev "$_netdev" | grep '[ ]*inet'); then
>> + derror "Failed to get IP of $_netdev"
>> + return 1
>> + fi
>> +
>> + _kdumpip=$(echo "$_kdumpip" | head -n 1 | awk '{print $2}')
>> + _kdumpip="${_kdumpip%%/*}"
>> _znet_netdev=$(find_online_znet_device)
>> +
>> if [[ -n $_znet_netdev ]]; then
>> _znet_conpath=$(get_nmcli_connection_apath_by_ifname "$_znet_netdev")
>> if ! (kdump_setup_znet "$_znet_netdev" "$_znet_conpath"); then
>> @@ -463,6 +452,7 @@ kdump_install_net() {
>> elif kdump_is_vlan "$_netdev"; then
>> kdump_setup_vlan "$_netdev"
>> fi
>> +
>> kdump_copy_nmconnection_file "$_netdev"
>> kdump_install_nm_netif_allowlist
>>
>> @@ -483,8 +473,8 @@ kdump_install_net() {
>> # the default gate way for network dump, eth1 in the fence kdump path will
>> # call kdump_install_net again and we don't want eth1 to be the default
>> # gateway.
>> - if [[ ! -f ${initdir}/etc/cmdline.d/60kdumpnic.conf ]]; then
>> - echo "kdumpnic=$kdumpnic" > "${initdir}/etc/cmdline.d/60kdumpnic.conf"
>> + if [[ ! -f ${initdir}/etc/cmdline.d/60kdumpip.conf ]]; then
>> + echo "kdumpip=$_kdumpip" > "${initdir}/etc/cmdline.d/60kdumpip.conf"
>> fi
>> }
>>
>> --
>> 2.34.1
>
>Great idea!
Thanks!
>
>Just one concern: could there be any weird DHCP corner case? For example,
>if the kdump kernel took a long time to boot, e.g. 5 min, and your
>DHCP IP expired during that time window, the kdump kernel would be
>using an expired IP and conflict with other machines. The minimal
>lease time of a DHCP address is 1 hr, so this could be a rare corner
>case, but it is possible in theory.
It seems I don't fully understand you. In the kdump kernel, we still use
DHCP to get an IP so there should be no conflict. kdumpip is only used to
create a dumping folder named "{IP}-{DATE}" and we don't assign
kdumpip to a NIC.
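To make that point concrete, here is a small sketch of how the value is consumed in the kdump kernel: it is read from the command line and only ends up in a folder name, never in any address assignment. The helper get_cmdline_value is a hypothetical stand-in for dracut's getarg so that the example runs outside an initramfs, and the sample command line is invented.

```shell
# Hypothetical stand-in for dracut's getarg: print the value of the last
# "key=value" word found in a command line string.
get_cmdline_value() {
    _key=$1
    _value=
    for _word in $2; do
        case $_word in
            "$_key="*) _value=${_word#"$_key="} ;;
        esac
    done
    printf '%s\n' "$_value"
}

# Sample kernel command line, invented for this example.
cmdline='ro rd.neednet kdumpip=192.0.2.10'

HOST_IP=$(get_cmdline_value kdumpip "$cmdline")
if [ -z "$HOST_IP" ]; then
    echo "failed to get kdumpip!" >&2
else
    # The value only labels the dump folder; the NIC itself still gets
    # its address from DHCP (or whatever the connection profile says).
    echo "dump folder prefix: $HOST_IP"
fi
```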
Oh, yes, you are right, I didn't read the patch carefully enough.
Since kdumpip is only used to create the dump dir, this is fine.
Now I'm a bit concerned about kdumpip being embedded inside the initramfs.
kdumpip could be dynamic, but the initramfs isn't. This is probably
okay, since it's only used for naming the dump dir...
IIUC, the dump dir's IP prefix could now be out of sync with the
actual IP address of the machine when the dump happens, and will stay
out of sync for a long time until the initramfs gets rebuilt.
Just not sure if this will confuse anyone...
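To illustrate the out-of-sync case, a rough sketch of a check that compares the IP recorded at initramfs build time with the address the machine has now. This is not part of the patch. The helper name kdumpip_is_stale, the temporary file standing in for 60kdumpip.conf and both addresses are assumptions made for this example.

```shell
# Hypothetical check: succeed when the IP recorded in a kdumpip cmdline
# snippet ($1) differs from the address passed as $2.
kdumpip_is_stale() {
    _recorded=$(sed -n 's/^kdumpip=//p' "$1" | head -n 1)
    [ "$_recorded" != "$2" ]
}

# Stand-in for etc/cmdline.d/60kdumpip.conf as written at build time.
conf=$(mktemp)
echo "kdumpip=192.0.2.10" > "$conf"

# Pretend the DHCP server has handed out a different address since then.
current_ip="192.0.2.57"

if kdumpip_is_stale "$conf" "$current_ip"; then
    echo "recorded kdumpip no longer matches $current_ip"
fi
```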