Are there any Fedora Xen kernel gurus reading this list?
I'm trying to get my head around a problem reported by one of our atl1
(network driver) users. I can duplicate his problem. It happens
when we load the atl1 module while running 2.6.20-2925.5.fc7xen. Here's
the oops:
Unable to handle kernel paging request at ffff880382180ae8 RIP:
[<ffffffff80323985>] swiotlb_map_page+0x54/0x125
PGD 1100067 PUD 0
Oops: 0000 [1] SMP
last sysfs file: /class/net/eth0/address
CPU 0
Modules linked in: atl1 mii i915 drm netloop netbk blktap blkbk ipt_MASQUERADE iptable_nat
nf_nat xt_physdev bridge w83627ehf hwmon i2c_isa eeprom nf_conntrack_netbios_ns ipt_REJECT
nf_conntrack_ipv4 ipt_LOG ipt_recent iptable_filter ip_tables ip6t_LOG nf_conntrack_ipv6
xt_state nf_conntrack nfnetlink xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 video
sbs i2c_ec dock button battery asus_acpi backlight ac parport_pc lp parport snd_hda_intel
snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_pcm_oss snd_mixer_oss i2c_i801 i2c_core snd_pcm serio_raw snd_timer snd iTCO_wdt
iTCO_vendor_support pcspkr soundcore snd_page_alloc shpchp sr_mod cdrom floppy sg
dm_snapshot dm_zero dm_mirror dm_mod ata_piix ata_generic libata sd_mod scsi_mod ext3 jbd
mbcache ehci_hcd ohci_hcd uhci_hcd
Pid: 2710, comm: ip Not tainted 2.6.20-2925.5.fc7xen #1
RIP: e030:[<ffffffff80323985>] [<ffffffff80323985>]
swiotlb_map_page+0x54/0x125
RSP: e02b:ffff880019f2bd28 EFLAGS: 00010246
RAX: 7fffffffffffffff RBX: ffff880000f4c070 RCX: 000000007003975d
RDX: ffff880001fb5000 RSI: ffff881881f6ec58 RDI: 0000000000000012
RBP: 0000000000000002 R08: 0000000000000002 R09: ffff88002ba8a580
R10: ffff880038383780 R11: 00000000000000d0 R12: 00000000000005f0
R13: ffff88002ba8a880 R14: 0000000000000000 R15: ffff88002ba8a580
FS: 00002aaaaaac6820(0000) GS:ffffffff80579000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000019d24000 CR4: 0000000000002620
Process ip (pid: 2710, threadinfo ffff880019f2a000, task ffff880026a73080)
Stack: ffff88002ba8a880 0000000000000000 ffff880019f99000 ffff880019db1800
0000000000000001 ffffffff883d2be0 ffff880019f99000 ffff88002ba8a000
ffff880000f4c000 00000000fffffff4 ffff88002ba8a580 ffff88002ba8a000
Call Trace:
[<ffffffff883d2be0>] :atl1:atl1_alloc_rx_buffers+0x180/0x220
[<ffffffff883d2caf>] :atl1:atl1_up+0x2f/0x660
[<ffffffff883d37eb>] :atl1:atl1_open+0x2b/0x60
[<ffffffff803d5050>] dev_open+0x2f/0x6e
[<ffffffff803d3548>] dev_change_flags+0x5a/0x11a
[<ffffffff80407f20>] devinet_ioctl+0x235/0x59c
[<ffffffff8020ba24>] __might_sleep+0x26/0xd0
[<ffffffff803cbf9b>] sock_ioctl+0x1c8/0x1e5
[<ffffffff80240762>] do_ioctl+0x21/0x6b
[<ffffffff802304db>] vfs_ioctl+0x25c/0x275
[<ffffffff8024a735>] sys_ioctl+0x59/0x78
[<ffffffff8025b300>] tracesys+0xb2/0xb7
Code: 48 8b 14 ca 48 21 c2 48 c1 e2 0c 48 85 db 48 8d 04 3a 74 11
RIP [<ffffffff80323985>] swiotlb_map_page+0x54/0x125
RSP <ffff880019f2bd28>
CR2: ffff880382180ae8
<3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
Call Trace:
[<ffffffff80295804>] down_read+0x15/0x23
[<ffffffff8029e75f>] acct_collect+0x42/0x18e
[<ffffffff80215467>] do_exit+0x200/0x809
[<ffffffff80262c8c>] do_page_fault+0x129c/0x134b
[<ffffffff8020e887>] link_path_walk+0xc5/0xd7
[<ffffffff802b07bf>] __rmqueue+0x50/0xf9
[<ffffffff80260447>] error_exit+0x0/0x6e
[<ffffffff80323985>] swiotlb_map_page+0x54/0x125
[<ffffffff883d2be0>] :atl1:atl1_alloc_rx_buffers+0x180/0x220
[<ffffffff883d2caf>] :atl1:atl1_up+0x2f/0x660
[<ffffffff883d37eb>] :atl1:atl1_open+0x2b/0x60
[<ffffffff803d5050>] dev_open+0x2f/0x6e
[<ffffffff803d3548>] dev_change_flags+0x5a/0x11a
[<ffffffff80407f20>] devinet_ioctl+0x235/0x59c
[<ffffffff8020ba24>] __might_sleep+0x26/0xd0
[<ffffffff803cbf9b>] sock_ioctl+0x1c8/0x1e5
[<ffffffff80240762>] do_ioctl+0x21/0x6b
[<ffffffff802304db>] vfs_ioctl+0x25c/0x275
[<ffffffff8024a735>] sys_ioctl+0x59/0x78
[<ffffffff8025b300>] tracesys+0xb2/0xb7
I'm pretty sure what's happening is that swiotlb -- apparently used by Xen to do
I/O memory mapping -- doesn't like pci_map_page(). Here's the driver code snippet
that does the DMA mapping for our receive buffers:
static u16 atl1_alloc_rx_buffers(struct atl1_adapter *adapter)
{
[snip]
	skb_reserve(skb, NET_IP_ALIGN);
	skb->dev = netdev;
	buffer_info->alloced = 1;
	buffer_info->skb = skb;
	buffer_info->length = (u16) adapter->rx_buffer_len;
	page = virt_to_page(skb->data);
	offset = (unsigned long)skb->data & ~PAGE_MASK;
	buffer_info->dma = pci_map_page(pdev, page, offset,
					adapter->rx_buffer_len,
					PCI_DMA_FROMDEVICE);
	rfd_desc->buffer_addr = cpu_to_le64(buffer_info->dma);
	rfd_desc->buf_len = cpu_to_le16(adapter->rx_buffer_len);
	rfd_desc->coalese = 0;
[snip]
}
The Xen kernel seems to work okay if I change pci_map_page() to pci_map_single(), but
there are other DMA mappings in the code that seem to /need/ pci_map_page(). (We
inherited this driver from the vendor, BTW.)
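For reference, here is a minimal userspace sketch of the page/offset arithmetic
the snippet relies on. It assumes 4 KiB pages, and every demo_ name is
hypothetical (nothing here comes from the kernel or the driver). All it shows is
that a (page, offset) pair and a plain virtual address name the same byte, which
is presumably why pci_map_single() can stand in for this particular mapping. It
says nothing about what swiotlb does with the page once it has it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. All demo_ names are illustrative only. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uintptr_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Stand-in for the (struct page *, offset) pair passed to pci_map_page(). */
struct demo_page_ref {
	uintptr_t page_base;	/* address of the first byte of the page */
	uintptr_t offset;	/* byte offset of the buffer within that page */
};

/* Split an address the way the driver does: page part and in-page offset. */
struct demo_page_ref demo_split(uintptr_t addr)
{
	struct demo_page_ref ref;

	ref.page_base = addr & DEMO_PAGE_MASK;
	ref.offset = addr & ~DEMO_PAGE_MASK;
	return ref;
}

/* Recombine the pair into the address a single-buffer mapping would take. */
uintptr_t demo_join(struct demo_page_ref ref)
{
	assert(ref.offset < DEMO_PAGE_SIZE);
	return ref.page_base + ref.offset;
}

/* Return 1 if [addr, addr + len) spans more than one page, else 0. */
int demo_crosses_page(uintptr_t addr, size_t len)
{
	assert(len > 0);
	return (addr & DEMO_PAGE_MASK) != ((addr + len - 1) & DEMO_PAGE_MASK);
}
```

Calling demo_split() on a buffer address and then demo_join() on the result
gives back the original address. demo_crosses_page() is only there as a quick
way to check whether a given buffer length would spill past the page being
mapped.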
Should swiotlb_map_page() work? I can't even /find/ the function in 2.6.21-rc5
(non-xen).
Thanks,
Jay