On 02/23/13 02:30, Mauro Carvalho Chehab wrote:
Em Sat, 23 Feb 2013 02:09:01 +0100
poma <pomidorabelisima(a)gmail.com> escreveu:
> On 02/23/13 01:09, Mauro Carvalho Chehab wrote:
>> Em Fri, 22 Feb 2013 18:48:25 -0500
>> Josh Boyer <jwboyer(a)redhat.com> escreveu:
>>
>>> On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote:
>>>> On 02/22/13 19:15, Bruno Wolff III wrote:
>>>>> On Fri, Feb 22, 2013 at 10:17:17 -0500,
>>>>> Josh Boyer <jwboyer(a)redhat.com> wrote:
>>>>>> Just a quick heads up that the 3.9 merge window kernels are being
built
>>>>>> in rawhide now. I've tried to at least test boot kernels on
my machine
>>>>>> before submitting them to koji, but testers beware. Merge
window
>>>>>> kernels can be tricky.
>>>>>
>>>>
>>>> Euphemism for broken? ;)
>>>
>>> No. A euphemism for that would be "operationally challenged".
>>>
>>>>> I have git3 running on three machines and things seem fine. I'll
be
>>>>> trying out git5 very soon.
>>>>
>>>> Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64
>>>> (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220
>>>> (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013
>>>> …
>>>> =============================================
>>>> [ INFO: possible recursive locking detected ]
>>>> 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted
>>>> ---------------------------------------------
>>>> kworker/0:1/36 is trying to acquire lock:
>>>> (hdl->lock){+.+...}, at: [<ffffffffa042d745>]
find_ref_lock+0x25/0x60
>>>> [videodev]
>>>>
>>>> but task is already holding lock:
>>>> (hdl->lock){+.+...}, at: [<ffffffffa0430888>]
>>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev]
>>>>
>>>> other info that might help us debug this:
>>>> Possible unsafe locking scenario:
>>>>
>>>> CPU0
>>>> ----
>>>> lock(hdl->lock);
>>>> lock(hdl->lock);
>>>>
>>>> *** DEADLOCK ***
>>>>
>>>> May be due to missing lock nesting notation
>>>>
>>>> 3 locks held by kworker/0:1/36:
>>>> #0: (events){.+.+.+}, at: [<ffffffff8108d870>]
>>>> process_one_work+0x190/0x680
>>>> #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>]
>>>> process_one_work+0x190/0x680
>>>> #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>]
>>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev]
>>>>
>>>> stack backtrace:
>>>> Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1
>>>> Call Trace:
>>>> [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80
>>>> [<ffffffff81021d69>] ? sched_clock+0x9/0x10
>>>> [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100
>>>> [<ffffffff810db092>] lock_acquire+0xa2/0x1f0
>>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
>>>> [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0
>>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
>>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
>>>> [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190
>>>> [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev]
>>>> [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev]
>>>> [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev]
>>>> [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180
[videodev]
>>>> [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv]
>>>> [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv]
>>>> [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130
>>>> [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70
>>>> [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10
>>>> [<ffffffff813890ae>] local_pci_probe+0x3e/0x70
>>>> [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20
>>>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680
>>>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680
>>>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0
>>>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270
>>>> [<ffffffff810938fd>] kthread+0xed/0x100
>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0
>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>> …
>>>
>>>> ------------[ cut here ]------------
>>>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100()
>>>> …
>>>> Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1
>>>> Call Trace:
>>>> [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0
>>>> [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50
>>>> [<ffffffff81378dd0>] check_for_stack+0xa0/0x100
>>>> [<ffffffff81379180>] debug_dma_map_page+0x100/0x140
>>>> [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620
>>>> [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0
>>>> [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470
>>>> [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0
>>>> [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190
>>>> [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170
>>>> [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2]
>>>> [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015]
>>>> [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170
>>>> [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90
[dvb_usb_af9015]
>>>> [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2]
>>>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680
>>>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680
>>>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0
>>>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270
>>>> [<ffffffff810938fd>] kthread+0xed/0x100
>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0
>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>> ---[ end trace b4a4208b0ed39626 ]---
>>>
>>> Mauro, any idea on these two?
>>
>> The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying
>> initialization to happen after registering the device. There's a patch for
it,
>> pending upstream merge.
>>
>
> This one is gone after after applying
> 7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of
>
git://linuxtv.org/media_build
OK.
>> The second one seems to be a bad lock. I think I saw some discussion about
>> that at the media ML, but I can't remember the odd details. Perhaps
there's
>> a fix for it already. It should be noticed that there's a pending pull
>> request with about ~600 patches for 3.9 that is pending its merge.
>> The ivtv fix should be there. I hope Linus will merge it soon.
>>
>
> This one stayed;
> …
> ------------[ cut here ]------------
> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0()
> Hardware name: M720-US3
> ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack
> [addr=ffff8801283d7be1]
> Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF)
> wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)…
> rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF)
> videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)…
> videodev(OF)… video i2c_algo_bit… (irrelevant omitted)
> Pid: 38, comm: kworker/2:1 Tainted: GF O
> 3.9.0-0.rc0.git5.1.fc18.x86_64 #1
> Call Trace:
> usb 4-2: pl2303 converter now attached to ttyUSB0
> [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0
> [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50
> [<ffffffff81386467>] check_for_stack+0xa7/0xf0
> [<ffffffff81388228>] debug_dma_map_page+0x118/0x150
> [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630
> [<ffffffff811be091>] ? set_track+0x61/0x1b0
> [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900
> [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540
> [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0
> [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0
> [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50
> [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180
> [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2]
> [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015]
> [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210
> [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210
> [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015]
> [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2]
> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690
> [<ffffffff8108e3db>] process_one_work+0x20b/0x690
> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690
> [<ffffffff8108ec10>] worker_thread+0x110/0x380
> [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260
> [<ffffffff8109569d>] kthread+0xed/0x100
> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190
> [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0
> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190
> ---[ end trace 4f4bc1c2abe300b5 ]---
> --
Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines,
almost all memories can do DMA, including the area used by stack. That's
not true on other archs like arm.
uname -m
x86_64
Morning is wiser than evening. :)
I suspect that Antti never tried to run this driver outside x86.
Let me look into it...
static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req)
{
#define BUF_LEN 63
#define REQ_HDR_LEN 8 /* send header size */
#define ACK_HDR_LEN 2 /* rece header size */
struct af9015_state *state = d_to_priv(d);
int ret, wlen, rlen;
u8 buf[BUF_LEN];
...
ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen);
int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf,
u16 rlen)
{
int ret, actual_length;
...
ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev,
d->props->generic_bulk_ctrl_endpoint), wbuf, wlen,
&actual_length, 2000);
That's the problem: "buf" is at stack, and it is passed directly to
usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause
data to be lost even on x86).
In order to fix it, the driver should be doing, instead, something like:
static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req)
{
...
u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC);
...
kfree(buf);
...
}
You shouldn't blame the merge windows kernel here. This never worked
on arm, even on older kernel versions ;)
Given that a zillion of code is in the rush, to work at all is miracle
enough.
A more permanent fix would be to allocate a temporary 80 bytes
buffer
embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to
always use this temporary buffer for the control URB transfers.
Antti,
Could you please tale a look in it?
Thanks!
Mauro
Mauro, with your patch 'af9015.c-3.9.0-0.rc0.git5.1.fc18.x86_64.diff'
now the media tree is fine, *but* network… here we go again, rock you
like a hurricane! :)
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:933 check_unmap+0x407/0x8a0()
Hardware name: M720-US3
skge 0000:01:09.0: DMA-API: device driver failed to check map
error[device address=0x0000000105d52522] [size=90 bytes] [mapped as single]
Modules linked in:
…uetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4
nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack
ip6_tables… vhost_net tun macvtap… macvlan… nfsd auth_rpcgss nfs_acl
lockd… r8169… mii skge… sunrpc
Pid: 0, comm: swapper/3 Tainted: GF O
3.9.0-0.rc0.git5.1.fc18.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50
[<ffffffff81387a47>] check_unmap+0x407/0x8a0
[<ffffffff810ab62f>] ? try_to_wake_up+0x1ff/0x350
[<ffffffff813880fe>] debug_dma_unmap_page+0x5e/0x70
[<ffffffffa00750ac>] ? skge_intr+0x2c/0x570 [skge]
[<ffffffffa0073525>] skge_poll+0x1b5/0x9c0 [skge]
[<ffffffff815e7672>] net_rx_action+0x172/0x380
[<ffffffff810737e0>] __do_softirq+0x100/0x400
[<ffffffff81021dd3>] ? native_sched_clock+0x13/0x80
[<ffffffff81739c7c>] call_softirq+0x1c/0x30
[<ffffffff8101c435>] do_softirq+0xa5/0xe0
[<ffffffff81073cd5>] irq_exit+0xd5/0xe0
[<ffffffff8173a5b3>] do_IRQ+0x63/0xe0
[<ffffffff8172f732>] common_interrupt+0x72/0x72
<EOI> [<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff8172f154>] ? _raw_spin_unlock_irq+0x34/0x50
[<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff810a3ddc>] finish_task_switch+0x7c/0x120
[<ffffffff810a3d9f>] ? finish_task_switch+0x3f/0x120
[<ffffffff8172cd02>] __schedule+0x442/0xa00
[<ffffffff8172d5f9>] schedule+0x29/0x70
[<ffffffff81023abf>] cpu_idle+0x12f/0x140
[<ffffffff8171a70e>] start_secondary+0x269/0x26b
---[ end trace d41e441560f83f6a ]---
Mapped at:
[<ffffffff813881b9>] debug_dma_map_page+0xa9/0x150
[<ffffffffa0074c83>] skge_xmit_frame+0x163/0x560 [skge]
[<ffffffff815e8330>] dev_hard_start_xmit+0x260/0x6f0
[<ffffffff8160bb0e>] sch_direct_xmit+0xfe/0x2a0
[<ffffffff815e8a07>] dev_queue_xmit+0x247/0x970
…
--
Cheers,
poma