3.9 merge window kernels

The wireless card suddenly does...

Problems with git5 on one of three...

Josh Boyer

Friday, 22 February 2013 Fri, 22 Feb '13

9:17 a.m.

Just a quick heads up that the 3.9 merge window kernels are being built in rawhide now. I've tried to at least test boot kernels on my machine before submitting them to koji, but testers beware. Merge window kernels can be tricky. ARM people: the -git4 kernel build has the big ARM merge contained in it. I've tried to go through and make a reasonable estimation of what all the ARM related config options should be set to, but you really need to look those over. Also... the config file setup is confusing, particularly config-armv7. That seems to be used by only kernel-armv7/kernel-armv7hl but not by the omap and tegra armv7 configs. I think that is leading to some duplication of config settings. Anyway, look it over please. PPC/S390x people: You're set. Your architectures are apparently boring and predictable, so there were no config file changes needed. josh

Show replies by date

Peter Robinson

Friday, 22 February Fri, 22 Feb

9:36 a.m.

New subject: [fedora-arm] 3.9 merge window kernels

On Fri, Feb 22, 2013 at 3:17 PM, Josh Boyer <jwboyer(a)redhat.com> wrote:

...

Thanks, it's changed a fair bit again and this round omap goes away too, I'll try and get it sorted over the weekend. Peter

Bruno Wolff III

12:15 p.m.

On Fri, Feb 22, 2013 at 10:17:17 -0500, Josh Boyer <jwboyer(a)redhat.com> wrote:

...

I have git3 running on three machines and things seem fine. I'll be trying out git5 very soon.

poma

5:29 p.m.

On 02/22/13 19:15, Bruno Wolff III wrote:

...

On Fri, Feb 22, 2013 at 10:17:17 -0500, Josh Boyer <jwboyer(a)redhat.com> wrote: > Just a quick heads up that the 3.9 merge window kernels are being built > in rawhide now. I've tried to at least test boot kernels on my machine > before submitting them to koji, but testers beware. Merge window > kernels can be tricky.

Euphemism for broken? ;)

...

I have git3 running on three machines and things seem fine. I'll be trying out git5 very soon.

Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64 (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220 (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013 … ============================================= [ INFO: possible recursive locking detected ] 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted --------------------------------------------- kworker/0:1/36 is trying to acquire lock: (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] but task is already holding lock: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] v4l2_ctrl_add_handler+0x78/0xf0 [videodev] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(hdl->lock); lock(hdl->lock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/0:1/36: #0: (events){.+.+.+}, at: [<ffffffff8108d870>] process_one_work+0x190/0x680 #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>] process_one_work+0x190/0x680 #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] v4l2_ctrl_add_handler+0x78/0xf0 [videodev] stack backtrace: Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Call Trace: [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80 [<ffffffff81021d69>] ? sched_clock+0x9/0x10 [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100 [<ffffffff810db092>] lock_acquire+0xa2/0x1f0 [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0 [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190 [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev] [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev] [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev] [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv] [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv] [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130 [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70 [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff813890ae>] local_pci_probe+0x3e/0x70 [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20 [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 [<ffffffff8108d870>] ? process_one_work+0x190/0x680 [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 [<ffffffff810938fd>] kthread+0xed/0x100 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 … ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100() … Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Call Trace: [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0 [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50 [<ffffffff81378dd0>] check_for_stack+0xa0/0x100 [<ffffffff81379180>] debug_dma_map_page+0x100/0x140 [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620 [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0 [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470 [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0 [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190 [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170 [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2] [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015] [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170 [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2] [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 [<ffffffff8108d870>] ? process_one_work+0x190/0x680 [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 [<ffffffff810938fd>] kthread+0xed/0x100 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 ---[ end trace b4a4208b0ed39626 ]--- … Cheers, poma

Josh Boyer

5:48 p.m.

On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote:

...

On 02/22/13 19:15, Bruno Wolff III wrote: > On Fri, Feb 22, 2013 at 10:17:17 -0500, > Josh Boyer <jwboyer(a)redhat.com> wrote: >> Just a quick heads up that the 3.9 merge window kernels are being built >> in rawhide now. I've tried to at least test boot kernels on my machine >> before submitting them to koji, but testers beware. Merge window >> kernels can be tricky. > Euphemism for broken? ;)

No. A euphemism for that would be "operationally challenged".

...

> I have git3 running on three machines and things seem fine. I'll be > trying out git5 very soon. Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64 (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220 (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013 … ============================================= [ INFO: possible recursive locking detected ] 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted --------------------------------------------- kworker/0:1/36 is trying to acquire lock: (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] but task is already holding lock: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] v4l2_ctrl_add_handler+0x78/0xf0 [videodev] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(hdl->lock); lock(hdl->lock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/0:1/36: #0: (events){.+.+.+}, at: [<ffffffff8108d870>] process_one_work+0x190/0x680 #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>] process_one_work+0x190/0x680 #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] v4l2_ctrl_add_handler+0x78/0xf0 [videodev] stack backtrace: Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Call Trace: [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80 [<ffffffff81021d69>] ? sched_clock+0x9/0x10 [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100 [<ffffffff810db092>] lock_acquire+0xa2/0x1f0 [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0 [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190 [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev] [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev] [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev] [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv] [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv] [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130 [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70 [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff813890ae>] local_pci_probe+0x3e/0x70 [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20 [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 [<ffffffff8108d870>] ? process_one_work+0x190/0x680 [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 [<ffffffff810938fd>] kthread+0xed/0x100 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 …

...

------------[ cut here ]------------ WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100() … Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Call Trace: [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0 [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50 [<ffffffff81378dd0>] check_for_stack+0xa0/0x100 [<ffffffff81379180>] debug_dma_map_page+0x100/0x140 [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620 [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0 [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470 [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0 [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190 [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170 [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2] [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015] [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170 [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2] [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 [<ffffffff8108d870>] ? process_one_work+0x190/0x680 [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 [<ffffffff810938fd>] kthread+0xed/0x100 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 ---[ end trace b4a4208b0ed39626 ]---

Mauro, any idea on these two? josh

Mauro Carvalho Chehab

6:09 p.m.

Em Fri, 22 Feb 2013 18:48:25 -0500 Josh Boyer <jwboyer(a)redhat.com> escreveu:

...

On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote: > On 02/22/13 19:15, Bruno Wolff III wrote: > > On Fri, Feb 22, 2013 at 10:17:17 -0500, > > Josh Boyer <jwboyer(a)redhat.com> wrote: > >> Just a quick heads up that the 3.9 merge window kernels are being built > >> in rawhide now. I've tried to at least test boot kernels on my machine > >> before submitting them to koji, but testers beware. Merge window > >> kernels can be tricky. > > > > Euphemism for broken? ;) No. A euphemism for that would be "operationally challenged". > > I have git3 running on three machines and things seem fine. I'll be > > trying out git5 very soon. > > Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64 > (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220 > (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013 > … > ============================================= > [ INFO: possible recursive locking detected ] > 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted > --------------------------------------------- > kworker/0:1/36 is trying to acquire lock: > (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60 > [videodev] > > but task is already holding lock: > (hdl->lock){+.+...}, at: [<ffffffffa0430888>] > v4l2_ctrl_add_handler+0x78/0xf0 [videodev] > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > ---- > lock(hdl->lock); > lock(hdl->lock); > > *** DEADLOCK *** > > May be due to missing lock nesting notation > > 3 locks held by kworker/0:1/36: > #0: (events){.+.+.+}, at: [<ffffffff8108d870>] > process_one_work+0x190/0x680 > #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>] > process_one_work+0x190/0x680 > #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] > v4l2_ctrl_add_handler+0x78/0xf0 [videodev] > > stack backtrace: > Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 > Call Trace: > [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80 > [<ffffffff81021d69>] ? sched_clock+0x9/0x10 > [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100 > [<ffffffff810db092>] lock_acquire+0xa2/0x1f0 > [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] > [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0 > [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] > [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] > [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190 > [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] > [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev] > [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev] > [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev] > [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv] > [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv] > [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130 > [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70 > [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10 > [<ffffffff813890ae>] local_pci_probe+0x3e/0x70 > [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20 > [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 > [<ffffffff8108d870>] ? process_one_work+0x190/0x680 > [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 > [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 > [<ffffffff810938fd>] kthread+0xed/0x100 > [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 > [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 > [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 > … > ------------[ cut here ]------------ > WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100() > … > Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 > Call Trace: > [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0 > [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50 > [<ffffffff81378dd0>] check_for_stack+0xa0/0x100 > [<ffffffff81379180>] debug_dma_map_page+0x100/0x140 > [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620 > [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0 > [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470 > [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0 > [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190 > [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170 > [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2] > [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015] > [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170 > [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] > [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2] > [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 > [<ffffffff8108d870>] ? process_one_work+0x190/0x680 > [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 > [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 > [<ffffffff810938fd>] kthread+0xed/0x100 > [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 > [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 > [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 > ---[ end trace b4a4208b0ed39626 ]--- Mauro, any idea on these two?

The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying initialization to happen after registering the device. There's a patch for it, pending upstream merge. The second one seems to be a bad lock. I think I saw some discussion about that at the media ML, but I can't remember the odd details. Perhaps there's a fix for it already. It should be noticed that there's a pending pull request with about ~600 patches for 3.9 that is pending its merge. The ivtv fix should be there. I hope Linus will merge it soon. Regards, Mauro

poma

7:09 p.m.

On 02/23/13 01:09, Mauro Carvalho Chehab wrote:

...

Em Fri, 22 Feb 2013 18:48:25 -0500 Josh Boyer <jwboyer(a)redhat.com> escreveu: > On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote: >> On 02/22/13 19:15, Bruno Wolff III wrote: >>> On Fri, Feb 22, 2013 at 10:17:17 -0500, >>> Josh Boyer <jwboyer(a)redhat.com> wrote: >>>> Just a quick heads up that the 3.9 merge window kernels are being built >>>> in rawhide now. I've tried to at least test boot kernels on my machine >>>> before submitting them to koji, but testers beware. Merge window >>>> kernels can be tricky. >>> >> >> Euphemism for broken? ;) > > No. A euphemism for that would be "operationally challenged". > >>> I have git3 running on three machines and things seem fine. I'll be >>> trying out git5 very soon. >> >> Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64 >> (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220 >> (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013 >> … >> ============================================= >> [ INFO: possible recursive locking detected ] >> 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted >> --------------------------------------------- >> kworker/0:1/36 is trying to acquire lock: >> (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60 >> [videodev] >> >> but task is already holding lock: >> (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >> >> other info that might help us debug this: >> Possible unsafe locking scenario: >> >> CPU0 >> ---- >> lock(hdl->lock); >> lock(hdl->lock); >> >> *** DEADLOCK *** >> >> May be due to missing lock nesting notation >> >> 3 locks held by kworker/0:1/36: >> #0: (events){.+.+.+}, at: [<ffffffff8108d870>] >> process_one_work+0x190/0x680 >> #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>] >> process_one_work+0x190/0x680 >> #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >> >> stack backtrace: >> Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >> Call Trace: >> [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80 >> [<ffffffff81021d69>] ? sched_clock+0x9/0x10 >> [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100 >> [<ffffffff810db092>] lock_acquire+0xa2/0x1f0 >> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >> [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0 >> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >> [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190 >> [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] >> [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev] >> [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev] >> [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev] >> [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv] >> [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv] >> [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130 >> [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70 >> [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10 >> [<ffffffff813890ae>] local_pci_probe+0x3e/0x70 >> [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20 >> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >> [<ffffffff810938fd>] kthread+0xed/0x100 >> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >> … > >> ------------[ cut here ]------------ >> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100() >> … >> Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >> Call Trace: >> [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0 >> [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50 >> [<ffffffff81378dd0>] check_for_stack+0xa0/0x100 >> [<ffffffff81379180>] debug_dma_map_page+0x100/0x140 >> [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620 >> [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0 >> [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470 >> [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0 >> [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190 >> [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170 >> [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2] >> [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015] >> [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170 >> [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] >> [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2] >> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >> [<ffffffff810938fd>] kthread+0xed/0x100 >> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >> ---[ end trace b4a4208b0ed39626 ]--- > > Mauro, any idea on these two? The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying initialization to happen after registering the device. There's a patch for it, pending upstream merge.

This one is gone after after applying 7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of git://linuxtv.org/media_build

...

The second one seems to be a bad lock. I think I saw some discussion about that at the media ML, but I can't remember the odd details. Perhaps there's a fix for it already. It should be noticed that there's a pending pull request with about ~600 patches for 3.9 that is pending its merge. The ivtv fix should be there. I hope Linus will merge it soon.

This one stayed; … ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0() Hardware name: M720-US3 ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack [addr=ffff8801283d7be1] Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF) wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)… rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF) videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)… videodev(OF)… video i2c_algo_bit… (irrelevant omitted) Pid: 38, comm: kworker/2:1 Tainted: GF O 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 Call Trace: usb 4-2: pl2303 converter now attached to ttyUSB0 [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 [<ffffffff81386467>] check_for_stack+0xa7/0xf0 [<ffffffff81388228>] debug_dma_map_page+0x118/0x150 [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630 [<ffffffff811be091>] ? set_track+0x61/0x1b0 [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900 [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540 [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0 [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0 [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50 [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180 [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2] [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015] [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210 [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210 [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2] [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 [<ffffffff8108e3db>] process_one_work+0x20b/0x690 [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 [<ffffffff8108ec10>] worker_thread+0x110/0x380 [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260 [<ffffffff8109569d>] kthread+0xed/0x100 [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0 [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 ---[ end trace 4f4bc1c2abe300b5 ]--- -- Cheers, poma

Mauro Carvalho Chehab

7:30 p.m.

Em Sat, 23 Feb 2013 02:09:01 +0100 poma <pomidorabelisima(a)gmail.com> escreveu:

...

On 02/23/13 01:09, Mauro Carvalho Chehab wrote: > Em Fri, 22 Feb 2013 18:48:25 -0500 > Josh Boyer <jwboyer(a)redhat.com> escreveu: > >> On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote: >>> On 02/22/13 19:15, Bruno Wolff III wrote: >>>> On Fri, Feb 22, 2013 at 10:17:17 -0500, >>>> Josh Boyer <jwboyer(a)redhat.com> wrote: >>>>> Just a quick heads up that the 3.9 merge window kernels are being built >>>>> in rawhide now. I've tried to at least test boot kernels on my machine >>>>> before submitting them to koji, but testers beware. Merge window >>>>> kernels can be tricky. >>>> >>> >>> Euphemism for broken? ;) >> >> No. A euphemism for that would be "operationally challenged". >> >>>> I have git3 running on three machines and things seem fine. I'll be >>>> trying out git5 very soon. >>> >>> Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64 >>> (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220 >>> (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013 >>> … >>> ============================================= >>> [ INFO: possible recursive locking detected ] >>> 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted >>> --------------------------------------------- >>> kworker/0:1/36 is trying to acquire lock: >>> (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60 >>> [videodev] >>> >>> but task is already holding lock: >>> (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >>> >>> other info that might help us debug this: >>> Possible unsafe locking scenario: >>> >>> CPU0 >>> ---- >>> lock(hdl->lock); >>> lock(hdl->lock); >>> >>> *** DEADLOCK *** >>> >>> May be due to missing lock nesting notation >>> >>> 3 locks held by kworker/0:1/36: >>> #0: (events){.+.+.+}, at: [<ffffffff8108d870>] >>> process_one_work+0x190/0x680 >>> #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>] >>> process_one_work+0x190/0x680 >>> #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >>> >>> stack backtrace: >>> Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >>> Call Trace: >>> [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80 >>> [<ffffffff81021d69>] ? sched_clock+0x9/0x10 >>> [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100 >>> [<ffffffff810db092>] lock_acquire+0xa2/0x1f0 >>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>> [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0 >>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>> [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190 >>> [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] >>> [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev] >>> [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev] >>> [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev] >>> [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv] >>> [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv] >>> [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130 >>> [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70 >>> [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10 >>> [<ffffffff813890ae>] local_pci_probe+0x3e/0x70 >>> [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20 >>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >>> [<ffffffff810938fd>] kthread+0xed/0x100 >>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>> … >> >>> ------------[ cut here ]------------ >>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100() >>> … >>> Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >>> Call Trace: >>> [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0 >>> [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50 >>> [<ffffffff81378dd0>] check_for_stack+0xa0/0x100 >>> [<ffffffff81379180>] debug_dma_map_page+0x100/0x140 >>> [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620 >>> [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0 >>> [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470 >>> [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0 >>> [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190 >>> [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170 >>> [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2] >>> [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015] >>> [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170 >>> [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] >>> [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2] >>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >>> [<ffffffff810938fd>] kthread+0xed/0x100 >>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>> ---[ end trace b4a4208b0ed39626 ]--- >> >> Mauro, any idea on these two? > > The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying > initialization to happen after registering the device. There's a patch for it, > pending upstream merge. > This one is gone after after applying 7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of git://linuxtv.org/media_build

OK.

...

> The second one seems to be a bad lock. I think I saw some discussion about > that at the media ML, but I can't remember the odd details. Perhaps there's > a fix for it already. It should be noticed that there's a pending pull > request with about ~600 patches for 3.9 that is pending its merge. > The ivtv fix should be there. I hope Linus will merge it soon. > This one stayed; … ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0() Hardware name: M720-US3 ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack [addr=ffff8801283d7be1] Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF) wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)… rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF) videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)… videodev(OF)… video i2c_algo_bit… (irrelevant omitted) Pid: 38, comm: kworker/2:1 Tainted: GF O 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 Call Trace: usb 4-2: pl2303 converter now attached to ttyUSB0 [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 [<ffffffff81386467>] check_for_stack+0xa7/0xf0 [<ffffffff81388228>] debug_dma_map_page+0x118/0x150 [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630 [<ffffffff811be091>] ? set_track+0x61/0x1b0 [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900 [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540 [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0 [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0 [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50 [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180 [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2] [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015] [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210 [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210 [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2] [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 [<ffffffff8108e3db>] process_one_work+0x20b/0x690 [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 [<ffffffff8108ec10>] worker_thread+0x110/0x380 [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260 [<ffffffff8109569d>] kthread+0xed/0x100 [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0 [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 ---[ end trace 4f4bc1c2abe300b5 ]--- --

Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines, almost all memories can do DMA, including the area used by stack. That's not true on other archs like arm. I suspect that Antti never tried to run this driver outside x86. Let me look into it... static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) { #define BUF_LEN 63 #define REQ_HDR_LEN 8 /* send header size */ #define ACK_HDR_LEN 2 /* rece header size */ struct af9015_state *state = d_to_priv(d); int ret, wlen, rlen; u8 buf[BUF_LEN]; ... ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen); int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf, u16 rlen) { int ret, actual_length; ... ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev, d->props->generic_bulk_ctrl_endpoint), wbuf, wlen, &actual_length, 2000); That's the problem: "buf" is at stack, and it is passed directly to usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause data to be lost even on x86). In order to fix it, the driver should be doing, instead, something like: static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) { ... u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC); ... kfree(buf); ... } You shouldn't blame the merge windows kernel here. This never worked on arm, even on older kernel versions ;) A more permanent fix would be to allocate a temporary 80 bytes buffer embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to always use this temporary buffer for the control URB transfers. Antti, Could you please tale a look in it? Thanks! Mauro

poma

11:47 p.m.

On 02/23/13 02:30, Mauro Carvalho Chehab wrote:

...

Em Sat, 23 Feb 2013 02:09:01 +0100 poma <pomidorabelisima(a)gmail.com> escreveu: > On 02/23/13 01:09, Mauro Carvalho Chehab wrote: >> Em Fri, 22 Feb 2013 18:48:25 -0500 >> Josh Boyer <jwboyer(a)redhat.com> escreveu: >> >>> On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote: >>>> On 02/22/13 19:15, Bruno Wolff III wrote: >>>>> On Fri, Feb 22, 2013 at 10:17:17 -0500, >>>>> Josh Boyer <jwboyer(a)redhat.com> wrote: >>>>>> Just a quick heads up that the 3.9 merge window kernels are being built >>>>>> in rawhide now. I've tried to at least test boot kernels on my machine >>>>>> before submitting them to koji, but testers beware. Merge window >>>>>> kernels can be tricky. >>>>> >>>> >>>> Euphemism for broken? ;) >>> >>> No. A euphemism for that would be "operationally challenged". >>> >>>>> I have git3 running on three machines and things seem fine. I'll be >>>>> trying out git5 very soon. >>>> >>>> Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64 >>>> (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220 >>>> (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013 >>>> … >>>> ============================================= >>>> [ INFO: possible recursive locking detected ] >>>> 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted >>>> --------------------------------------------- >>>> kworker/0:1/36 is trying to acquire lock: >>>> (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60 >>>> [videodev] >>>> >>>> but task is already holding lock: >>>> (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >>>> >>>> other info that might help us debug this: >>>> Possible unsafe locking scenario: >>>> >>>> CPU0 >>>> ---- >>>> lock(hdl->lock); >>>> lock(hdl->lock); >>>> >>>> *** DEADLOCK *** >>>> >>>> May be due to missing lock nesting notation >>>> >>>> 3 locks held by kworker/0:1/36: >>>> #0: (events){.+.+.+}, at: [<ffffffff8108d870>] >>>> process_one_work+0x190/0x680 >>>> #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>] >>>> process_one_work+0x190/0x680 >>>> #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >>>> >>>> stack backtrace: >>>> Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >>>> Call Trace: >>>> [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80 >>>> [<ffffffff81021d69>] ? sched_clock+0x9/0x10 >>>> [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100 >>>> [<ffffffff810db092>] lock_acquire+0xa2/0x1f0 >>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>>> [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0 >>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>>> [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190 >>>> [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] >>>> [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev] >>>> [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev] >>>> [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev] >>>> [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv] >>>> [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv] >>>> [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130 >>>> [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70 >>>> [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10 >>>> [<ffffffff813890ae>] local_pci_probe+0x3e/0x70 >>>> [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20 >>>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >>>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >>>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >>>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >>>> [<ffffffff810938fd>] kthread+0xed/0x100 >>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>> … >>> >>>> ------------[ cut here ]------------ >>>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100() >>>> … >>>> Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >>>> Call Trace: >>>> [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0 >>>> [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50 >>>> [<ffffffff81378dd0>] check_for_stack+0xa0/0x100 >>>> [<ffffffff81379180>] debug_dma_map_page+0x100/0x140 >>>> [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620 >>>> [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0 >>>> [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470 >>>> [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0 >>>> [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190 >>>> [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170 >>>> [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2] >>>> [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015] >>>> [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170 >>>> [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] >>>> [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2] >>>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >>>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >>>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >>>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >>>> [<ffffffff810938fd>] kthread+0xed/0x100 >>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>> ---[ end trace b4a4208b0ed39626 ]--- >>> >>> Mauro, any idea on these two? >> >> The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying >> initialization to happen after registering the device. There's a patch for it, >> pending upstream merge. >> > > This one is gone after after applying > 7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of > git://linuxtv.org/media_build OK. >> The second one seems to be a bad lock. I think I saw some discussion about >> that at the media ML, but I can't remember the odd details. Perhaps there's >> a fix for it already. It should be noticed that there's a pending pull >> request with about ~600 patches for 3.9 that is pending its merge. >> The ivtv fix should be there. I hope Linus will merge it soon. >> > > This one stayed; > … > ------------[ cut here ]------------ > WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0() > Hardware name: M720-US3 > ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack > [addr=ffff8801283d7be1] > Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF) > wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)… > rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF) > videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)… > videodev(OF)… video i2c_algo_bit… (irrelevant omitted) > Pid: 38, comm: kworker/2:1 Tainted: GF O > 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 > Call Trace: > usb 4-2: pl2303 converter now attached to ttyUSB0 > [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 > [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 > [<ffffffff81386467>] check_for_stack+0xa7/0xf0 > [<ffffffff81388228>] debug_dma_map_page+0x118/0x150 > [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630 > [<ffffffff811be091>] ? set_track+0x61/0x1b0 > [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900 > [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540 > [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0 > [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0 > [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50 > [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180 > [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2] > [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015] > [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210 > [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210 > [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] > [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2] > [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 > [<ffffffff8108e3db>] process_one_work+0x20b/0x690 > [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 > [<ffffffff8108ec10>] worker_thread+0x110/0x380 > [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260 > [<ffffffff8109569d>] kthread+0xed/0x100 > [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 > [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0 > [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 > ---[ end trace 4f4bc1c2abe300b5 ]--- > -- Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines, almost all memories can do DMA, including the area used by stack. That's not true on other archs like arm.

uname -m x86_64 Morning is wiser than evening. :)

...

I suspect that Antti never tried to run this driver outside x86. Let me look into it... static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) { #define BUF_LEN 63 #define REQ_HDR_LEN 8 /* send header size */ #define ACK_HDR_LEN 2 /* rece header size */ struct af9015_state *state = d_to_priv(d); int ret, wlen, rlen; u8 buf[BUF_LEN]; ... ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen); int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf, u16 rlen) { int ret, actual_length; ... ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev, d->props->generic_bulk_ctrl_endpoint), wbuf, wlen, &actual_length, 2000); That's the problem: "buf" is at stack, and it is passed directly to usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause data to be lost even on x86). In order to fix it, the driver should be doing, instead, something like: static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) { ... u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC); ... kfree(buf); ... } You shouldn't blame the merge windows kernel here. This never worked on arm, even on older kernel versions ;)

Given that a zillion of code is in the rush, to work at all is miracle enough.

...

A more permanent fix would be to allocate a temporary 80 bytes buffer embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to always use this temporary buffer for the control URB transfers. Antti, Could you please tale a look in it? Thanks! Mauro

Mauro, with your patch 'af9015.c-3.9.0-0.rc0.git5.1.fc18.x86_64.diff' now the media tree is fine, *but* network… here we go again, rock you like a hurricane! :) ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:933 check_unmap+0x407/0x8a0() Hardware name: M720-US3 skge 0000:01:09.0: DMA-API: device driver failed to check map error[device address=0x0000000105d52522] [size=90 bytes] [mapped as single] Modules linked in: …uetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack ip6_tables… vhost_net tun macvtap… macvlan… nfsd auth_rpcgss nfs_acl lockd… r8169… mii skge… sunrpc Pid: 0, comm: swapper/3 Tainted: GF O 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 Call Trace: <IRQ> [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 [<ffffffff81387a47>] check_unmap+0x407/0x8a0 [<ffffffff810ab62f>] ? try_to_wake_up+0x1ff/0x350 [<ffffffff813880fe>] debug_dma_unmap_page+0x5e/0x70 [<ffffffffa00750ac>] ? skge_intr+0x2c/0x570 [skge] [<ffffffffa0073525>] skge_poll+0x1b5/0x9c0 [skge] [<ffffffff815e7672>] net_rx_action+0x172/0x380 [<ffffffff810737e0>] __do_softirq+0x100/0x400 [<ffffffff81021dd3>] ? native_sched_clock+0x13/0x80 [<ffffffff81739c7c>] call_softirq+0x1c/0x30 [<ffffffff8101c435>] do_softirq+0xa5/0xe0 [<ffffffff81073cd5>] irq_exit+0xd5/0xe0 [<ffffffff8173a5b3>] do_IRQ+0x63/0xe0 [<ffffffff8172f732>] common_interrupt+0x72/0x72 <EOI> [<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff8172f154>] ? _raw_spin_unlock_irq+0x34/0x50 [<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810a3ddc>] finish_task_switch+0x7c/0x120 [<ffffffff810a3d9f>] ? finish_task_switch+0x3f/0x120 [<ffffffff8172cd02>] __schedule+0x442/0xa00 [<ffffffff8172d5f9>] schedule+0x29/0x70 [<ffffffff81023abf>] cpu_idle+0x12f/0x140 [<ffffffff8171a70e>] start_secondary+0x269/0x26b ---[ end trace d41e441560f83f6a ]--- Mapped at: [<ffffffff813881b9>] debug_dma_map_page+0xa9/0x150 [<ffffffffa0074c83>] skge_xmit_frame+0x163/0x560 [skge] [<ffffffff815e8330>] dev_hard_start_xmit+0x260/0x6f0 [<ffffffff8160bb0e>] sch_direct_xmit+0xfe/0x2a0 [<ffffffff815e8a07>] dev_queue_xmit+0x247/0x970 … -- Cheers, poma

Mauro Carvalho Chehab

Saturday, 23 February Sat, 23 Feb

6:01 a.m.

Em Sat, 23 Feb 2013 06:47:39 +0100 poma <pomidorabelisima(a)gmail.com> escreveu:

...

On 02/23/13 02:30, Mauro Carvalho Chehab wrote: > Em Sat, 23 Feb 2013 02:09:01 +0100 > poma <pomidorabelisima(a)gmail.com> escreveu: > >> On 02/23/13 01:09, Mauro Carvalho Chehab wrote: >>> Em Fri, 22 Feb 2013 18:48:25 -0500 >>> Josh Boyer <jwboyer(a)redhat.com> escreveu: >>> >>>> On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote: >>>>> On 02/22/13 19:15, Bruno Wolff III wrote: >>>>>> On Fri, Feb 22, 2013 at 10:17:17 -0500, >>>>>> Josh Boyer <jwboyer(a)redhat.com> wrote: >>>>>>> Just a quick heads up that the 3.9 merge window kernels are being built >>>>>>> in rawhide now. I've tried to at least test boot kernels on my machine >>>>>>> before submitting them to koji, but testers beware. Merge window >>>>>>> kernels can be tricky. >>>>>> >>>>> >>>>> Euphemism for broken? ;) >>>> >>>> No. A euphemism for that would be "operationally challenged". >>>> >>>>>> I have git3 running on three machines and things seem fine. I'll be >>>>>> trying out git5 very soon. >>>>> >>>>> Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64 >>>>> (mockbuild(a)bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220 >>>>> (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013 >>>>> … >>>>> ============================================= >>>>> [ INFO: possible recursive locking detected ] >>>>> 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted >>>>> --------------------------------------------- >>>>> kworker/0:1/36 is trying to acquire lock: >>>>> (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60 >>>>> [videodev] >>>>> >>>>> but task is already holding lock: >>>>> (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >>>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >>>>> >>>>> other info that might help us debug this: >>>>> Possible unsafe locking scenario: >>>>> >>>>> CPU0 >>>>> ---- >>>>> lock(hdl->lock); >>>>> lock(hdl->lock); >>>>> >>>>> *** DEADLOCK *** >>>>> >>>>> May be due to missing lock nesting notation >>>>> >>>>> 3 locks held by kworker/0:1/36: >>>>> #0: (events){.+.+.+}, at: [<ffffffff8108d870>] >>>>> process_one_work+0x190/0x680 >>>>> #1: ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>] >>>>> process_one_work+0x190/0x680 >>>>> #2: (hdl->lock){+.+...}, at: [<ffffffffa0430888>] >>>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev] >>>>> >>>>> stack backtrace: >>>>> Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >>>>> Call Trace: >>>>> [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80 >>>>> [<ffffffff81021d69>] ? sched_clock+0x9/0x10 >>>>> [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100 >>>>> [<ffffffff810db092>] lock_acquire+0xa2/0x1f0 >>>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>>>> [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0 >>>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>>>> [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev] >>>>> [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190 >>>>> [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev] >>>>> [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev] >>>>> [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev] >>>>> [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev] >>>>> [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv] >>>>> [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv] >>>>> [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130 >>>>> [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70 >>>>> [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10 >>>>> [<ffffffff813890ae>] local_pci_probe+0x3e/0x70 >>>>> [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20 >>>>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >>>>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >>>>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >>>>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >>>>> [<ffffffff810938fd>] kthread+0xed/0x100 >>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>>> … >>>> >>>>> ------------[ cut here ]------------ >>>>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100() >>>>> … >>>>> Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 >>>>> Call Trace: >>>>> [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0 >>>>> [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50 >>>>> [<ffffffff81378dd0>] check_for_stack+0xa0/0x100 >>>>> [<ffffffff81379180>] debug_dma_map_page+0x100/0x140 >>>>> [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620 >>>>> [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0 >>>>> [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470 >>>>> [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0 >>>>> [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190 >>>>> [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170 >>>>> [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2] >>>>> [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015] >>>>> [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170 >>>>> [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] >>>>> [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2] >>>>> [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680 >>>>> [<ffffffff8108d870>] ? process_one_work+0x190/0x680 >>>>> [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0 >>>>> [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270 >>>>> [<ffffffff810938fd>] kthread+0xed/0x100 >>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>>> [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0 >>>>> [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80 >>>>> ---[ end trace b4a4208b0ed39626 ]--- >>>> >>>> Mauro, any idea on these two? >>> >>> The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying >>> initialization to happen after registering the device. There's a patch for it, >>> pending upstream merge. >>> >> >> This one is gone after after applying >> 7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of >> git://linuxtv.org/media_build > > OK. > >>> The second one seems to be a bad lock. I think I saw some discussion about >>> that at the media ML, but I can't remember the odd details. Perhaps there's >>> a fix for it already. It should be noticed that there's a pending pull >>> request with about ~600 patches for 3.9 that is pending its merge. >>> The ivtv fix should be there. I hope Linus will merge it soon. >>> >> >> This one stayed; >> … >> ------------[ cut here ]------------ >> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0() >> Hardware name: M720-US3 >> ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack >> [addr=ffff8801283d7be1] >> Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF) >> wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)… >> rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF) >> videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)… >> videodev(OF)… video i2c_algo_bit… (irrelevant omitted) >> Pid: 38, comm: kworker/2:1 Tainted: GF O >> 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 >> Call Trace: >> usb 4-2: pl2303 converter now attached to ttyUSB0 >> [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 >> [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 >> [<ffffffff81386467>] check_for_stack+0xa7/0xf0 >> [<ffffffff81388228>] debug_dma_map_page+0x118/0x150 >> [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630 >> [<ffffffff811be091>] ? set_track+0x61/0x1b0 >> [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900 >> [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540 >> [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0 >> [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0 >> [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50 >> [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180 >> [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2] >> [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015] >> [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210 >> [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210 >> [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] >> [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2] >> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 >> [<ffffffff8108e3db>] process_one_work+0x20b/0x690 >> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 >> [<ffffffff8108ec10>] worker_thread+0x110/0x380 >> [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260 >> [<ffffffff8109569d>] kthread+0xed/0x100 >> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 >> [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0 >> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 >> ---[ end trace 4f4bc1c2abe300b5 ]--- >> -- > > Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines, > almost all memories can do DMA, including the area used by stack. That's > not true on other archs like arm. > uname -m x86_64 Morning is wiser than evening. :)

Well, sometimes, even on x86, the stack may happen to be on a memory area where DMA transfers aren't allowed (or maybe two drivers may be trying to do DMA transfers at the stack at the same time).

...

> I suspect that Antti never tried to run this driver outside x86. > > Let me look into it... > > static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) > { > #define BUF_LEN 63 > #define REQ_HDR_LEN 8 /* send header size */ > #define ACK_HDR_LEN 2 /* rece header size */ > struct af9015_state *state = d_to_priv(d); > int ret, wlen, rlen; > u8 buf[BUF_LEN]; > ... > ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen); > > > > int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf, > u16 rlen) > { > int ret, actual_length; > > ... > ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev, > d->props->generic_bulk_ctrl_endpoint), wbuf, wlen, > &actual_length, 2000); > > > That's the problem: "buf" is at stack, and it is passed directly to > usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause > data to be lost even on x86). > > In order to fix it, the driver should be doing, instead, something like: > > static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) > { > ... > u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC); > > ... > > kfree(buf); > ... > } > > You shouldn't blame the merge windows kernel here. This never worked > on arm, even on older kernel versions ;) > Given that a zillion of code is in the rush, to work at all is miracle enough.

Agreed.

...

> A more permanent fix would be to allocate a temporary 80 bytes buffer > embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to > always use this temporary buffer for the control URB transfers. > > Antti, > > Could you please tale a look in it? > > Thanks! > Mauro > Mauro, with your patch 'af9015.c-3.9.0-0.rc0.git5.1.fc18.x86_64.diff' now the media tree is fine

...

--- linux/drivers/media/usb/dvb-usb-v2/af9015.c.orig 2013-02-23 05:57:03.911801068 +0100 +++ linux/drivers/media/usb/dvb-usb-v2/af9015.c 2013-02-23 05:57:21.327733911 +0100 @@ -35,7 +35,7 @@ #define ACK_HDR_LEN 2 /* rece header size */ struct af9015_state *state = d_to_priv(d); int ret, wlen, rlen; - u8 buf[BUF_LEN]; + u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC); u8 write = 1; buf[0] = req->cmd; @@ -114,6 +114,8 @@ /* read request, copy returned data to return buf */ if (!write) memcpy(req->data, &buf[ACK_HDR_LEN], req->data_len); + + kfree(buf); error: return ret; }

Yeah, getting hid of stack usage there is enough to solve such problem ;)

...

, *but* network… here we go again, rock you like a hurricane! :)

:) I can't you help much with the net stack. I barely touched on that area of the Kernel.

...

------------[ cut here ]------------ WARNING: at lib/dma-debug.c:933 check_unmap+0x407/0x8a0() Hardware name: M720-US3 skge 0000:01:09.0: DMA-API: device driver failed to check map error[device address=0x0000000105d52522] [size=90 bytes] [mapped as single] Modules linked in: …uetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack ip6_tables… vhost_net tun macvtap… macvlan… nfsd auth_rpcgss nfs_acl lockd… r8169… mii skge… sunrpc

Funny enough, it is also a DMA error... I'm starting to suspect that maybe some change at the dma stack made it a little more pedantic ;) Two similar errors on different drivers is enough for me to take a deeper look into it to solve the mystery ;) Basically, those got generated by lib/dma-debug.c. On the af9015 case, this is a simple test, done at debug_dma_map_page(). It checks if the area is in stack and print an error. I suspect that, on x86, the driver will still run, on most cases. In the case of the network driver, it detected that debug_dma_mapping_error() was never called. Also, I believe that the network driver will still work, and this is more a warning that bad things might happen with the driver, if the check is not done. This error indicator seems to be cleaned by this code, at arch/x86/include/asm/dma-mapping.h: static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr) { struct dma_map_ops *ops = get_dma_ops(dev); debug_dma_mapping_error(dev, dma_addr); if (ops->mapping_error) return ops->mapping_error(dev, dma_addr); return (dma_addr == DMA_ERROR_CODE); } From Documentation/DMA-API.txt: int dma_mapping_error(struct device *dev, dma_addr_t dma_addr) In some circumstances dma_map_single and dma_map_page will fail to create a mapping. A driver can check for these errors by testing the returned dma address with dma_mapping_error(). A non-zero return value means the mapping could not be created and the driver should take appropriate action (e.g. reduce current DMA mapping usage or delay and try again later). I _suspect_ that the network driver is calling dma_map_single() or dma_map_page() but not calling dma_map_error(). The only network driver there on the list of drivers on your error is r8169. Weird enough, at least on linux-next, r8169 seems ok: mapping = dma_map_single(d, rtl8169_align(data), rx_buf_sz, DMA_FROM_DEVICE); if (unlikely(dma_mapping_error(d, mapping))) { if (net_ratelimit()) netif_err(tp, drv, tp->dev, "Failed to map RX DMA!\n"); goto err_out; } Perhaps it was not on r8169 driver, but on some other network driver that does a similar logic, but without checking for errors. $ git grep -L dma_mapping_error $(git grep -l -e dma_map_single -e dma_map_page drivers/net/) drivers/net/ethernet/broadcom/bcm63xx_enet.c drivers/net/ethernet/brocade/bna/bnad.c drivers/net/ethernet/cadence/at91_ether.c drivers/net/ethernet/cadence/macb.c drivers/net/ethernet/dec/tulip/de4x5.c drivers/net/ethernet/freescale/fec.c drivers/net/ethernet/freescale/fec_mpc52xx.c drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c drivers/net/ethernet/freescale/gianfar.c drivers/net/ethernet/freescale/ucc_geth.c drivers/net/ethernet/i825xx/lib82596.c drivers/net/ethernet/ibm/emac/core.c drivers/net/ethernet/intel/e1000/e1000_ethtool.c drivers/net/ethernet/lantiq_etop.c drivers/net/ethernet/marvell/mv643xx_eth.c drivers/net/ethernet/marvell/pxa168_eth.c drivers/net/ethernet/mellanox/mlx4/en_tx.c drivers/net/ethernet/natsemi/sonic.c drivers/net/ethernet/octeon/octeon_mgmt.c drivers/net/ethernet/realtek/8139cp.c drivers/net/ethernet/renesas/sh_eth.c drivers/net/ethernet/seeq/sgiseeq.c drivers/net/ethernet/sfc/net_driver.h drivers/net/ethernet/sgi/meth.c drivers/net/ethernet/smsc/smc911x.h drivers/net/ethernet/smsc/smc91x.h drivers/net/ethernet/stmicro/stmmac/chain_mode.c drivers/net/ethernet/stmicro/stmmac/ring_mode.c drivers/net/ethernet/stmicro/stmmac/stmmac_main.c drivers/net/ethernet/sun/niu.c drivers/net/ethernet/sun/sunbmac.c drivers/net/ethernet/sun/sunhme.c drivers/net/ethernet/ti/cpmac.c drivers/net/ethernet/ti/davinci_cpdma.c drivers/net/ethernet/toshiba/ps3_gelic_net.c drivers/net/ethernet/tundra/tsi108_eth.c drivers/net/ethernet/xilinx/ll_temac_main.c drivers/net/ethernet/xilinx/xilinx_axienet_main.c drivers/net/fddi/defxx.c drivers/net/wireless/brcm80211/brcmsmac/dma.c Oh, that's scary... Ok, a few of the above may be false-positives, but someone should take a deeper look on it ;) Regards, Mauro

...

Pid: 0, comm: swapper/3 Tainted: GF O 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 Call Trace: <IRQ> [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 [<ffffffff81387a47>] check_unmap+0x407/0x8a0 [<ffffffff810ab62f>] ? try_to_wake_up+0x1ff/0x350 [<ffffffff813880fe>] debug_dma_unmap_page+0x5e/0x70 [<ffffffffa00750ac>] ? skge_intr+0x2c/0x570 [skge] [<ffffffffa0073525>] skge_poll+0x1b5/0x9c0 [skge] [<ffffffff815e7672>] net_rx_action+0x172/0x380 [<ffffffff810737e0>] __do_softirq+0x100/0x400 [<ffffffff81021dd3>] ? native_sched_clock+0x13/0x80 [<ffffffff81739c7c>] call_softirq+0x1c/0x30 [<ffffffff8101c435>] do_softirq+0xa5/0xe0 [<ffffffff81073cd5>] irq_exit+0xd5/0xe0 [<ffffffff8173a5b3>] do_IRQ+0x63/0xe0 [<ffffffff8172f732>] common_interrupt+0x72/0x72 <EOI> [<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff8172f154>] ? _raw_spin_unlock_irq+0x34/0x50 [<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810a3ddc>] finish_task_switch+0x7c/0x120 [<ffffffff810a3d9f>] ? finish_task_switch+0x3f/0x120 [<ffffffff8172cd02>] __schedule+0x442/0xa00 [<ffffffff8172d5f9>] schedule+0x29/0x70 [<ffffffff81023abf>] cpu_idle+0x12f/0x140 [<ffffffff8171a70e>] start_secondary+0x269/0x26b ---[ end trace d41e441560f83f6a ]--- Mapped at: [<ffffffff813881b9>] debug_dma_map_page+0xa9/0x150 [<ffffffffa0074c83>] skge_xmit_frame+0x163/0x560 [skge] [<ffffffff815e8330>] dev_hard_start_xmit+0x260/0x6f0 [<ffffffff8160bb0e>] sch_direct_xmit+0xfe/0x2a0 [<ffffffff815e8a07>] dev_queue_xmit+0x247/0x970 … -- Cheers, poma

-- Cheers, Mauro

Josh Boyer

6:48 a.m.

On Sat, Feb 23, 2013 at 7:01 AM, Mauro Carvalho Chehab <mchehab(a)redhat.com> wrote:

...

> ------------[ cut here ]------------ > WARNING: at lib/dma-debug.c:933 check_unmap+0x407/0x8a0() > Hardware name: M720-US3 > skge 0000:01:09.0: DMA-API: device driver failed to check map > error[device address=0x0000000105d52522] [size=90 bytes] [mapped as single] > Modules linked in: > …uetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 > nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack > ip6_tables… vhost_net tun macvtap… macvlan… nfsd auth_rpcgss nfs_acl > lockd… r8169… mii skge… sunrpc Funny enough, it is also a DMA error... I'm starting to suspect that maybe some change at the dma stack made it a little more pedantic ;)

Yes. In 3.8.

...

Two similar errors on different drivers is enough for me to take a deeper look into it to solve the mystery ;) Basically, those got generated by lib/dma-debug.c. On the af9015 case, this is a simple test, done at debug_dma_map_page(). It checks if the area is in stack and print an error. I suspect that, on x86, the driver will still run, on most cases. In the case of the network driver, it detected that debug_dma_mapping_error() was never called. Also, I believe that the network driver will still work, and this is more a warning that bad things might happen with the driver, if the check is not done.

No, you don't call debug_dma_mapping_error directly. That's used by the DMA-API debugging functions itself. It's not something drivers call.

...

I _suspect_ that the network driver is calling dma_map_single() or dma_map_page() but not calling dma_map_error().

Close. It's calling pci_map_single and not calling pci_dma_mapping_error. Those are PCI wrappers around the same functions, but that's what they should be using.

...

The only network driver there on the list of drivers on your error is r8169. Weird enough, at least on linux-next, r8169 seems ok:

Er... his error was triggered by skge, which is a network driver.

...

$ git grep -L dma_mapping_error $(git grep -l -e dma_map_single -e dma_map_page drivers/net/)

That's the wrong function to grep for in most cases. I don't think the prbolem is as wide-spread as you fear. Poma, you should probably just start filing bugs for things you hit. In this case, the skge backtrace is just a warning but it can be fixed up relatively easily. josh

poma

4:56 p.m.

On 02/23/13 13:48, Josh Boyer wrote:

...

On Sat, Feb 23, 2013 at 7:01 AM, Mauro Carvalho Chehab <mchehab(a)redhat.com> wrote: >> ------------[ cut here ]------------ >> WARNING: at lib/dma-debug.c:933 check_unmap+0x407/0x8a0() >> Hardware name: M720-US3 >> skge 0000:01:09.0: DMA-API: device driver failed to check map >> error[device address=0x0000000105d52522] [size=90 bytes] [mapped as single] >> Modules linked in: >> …uetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 >> nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack >> ip6_tables… vhost_net tun macvtap… macvlan… nfsd auth_rpcgss nfs_acl >> lockd… r8169… mii skge… sunrpc > > Funny enough, it is also a DMA error... > > I'm starting to suspect that maybe some change at the dma stack made > it a little more pedantic ;) Yes. In 3.8. > Two similar errors on different drivers is enough for me to take a deeper > look into it to solve the mystery ;) > > Basically, those got generated by lib/dma-debug.c. > > On the af9015 case, this is a simple test, done at debug_dma_map_page(). > It checks if the area is in stack and print an error. I suspect that, on > x86, the driver will still run, on most cases. > > In the case of the network driver, it detected that > debug_dma_mapping_error() was never called. Also, I believe that the > network driver will still work, and this is more a warning that bad > things might happen with the driver, if the check is not done. No, you don't call debug_dma_mapping_error directly. That's used by the DMA-API debugging functions itself. It's not something drivers call. > I _suspect_ that the network driver is calling dma_map_single() or > dma_map_page() but not calling dma_map_error(). Close. It's calling pci_map_single and not calling pci_dma_mapping_error. Those are PCI wrappers around the same functions, but that's what they should be using. > The only network driver there on the list of drivers on your error is r8169. > Weird enough, at least on linux-next, r8169 seems ok: Er... his error was triggered by skge, which is a network driver. > $ git grep -L dma_mapping_error $(git grep -l -e dma_map_single -e dma_map_page drivers/net/) That's the wrong function to grep for in most cases. I don't think the prbolem is as wide-spread as you fear. Poma, you should probably just start filing bugs for things you hit. In this case, the skge backtrace is just a warning but it can be fixed up relatively easily.

https://bugzilla.redhat.com/show_bug.cgi?id=914994 Josh, Mauro thanks for the overview. :) Antti, cheers. ;) poma

Antti Palosaari

Sunday, 24 February Sun, 24 Feb

6:22 p.m.

On 02/24/2013 12:56 AM, poma wrote:

...

On 02/23/13 13:48, Josh Boyer wrote: > On Sat, Feb 23, 2013 at 7:01 AM, Mauro Carvalho Chehab > <mchehab(a)redhat.com> wrote: >>> ------------[ cut here ]------------ >>> WARNING: at lib/dma-debug.c:933 check_unmap+0x407/0x8a0() >>> Hardware name: M720-US3 >>> skge 0000:01:09.0: DMA-API: device driver failed to check map >>> error[device address=0x0000000105d52522] [size=90 bytes] [mapped as single] >>> Modules linked in: >>> …uetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 >>> nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack >>> ip6_tables… vhost_net tun macvtap… macvlan… nfsd auth_rpcgss nfs_acl >>> lockd… r8169… mii skge… sunrpc >> >> Funny enough, it is also a DMA error... >> >> I'm starting to suspect that maybe some change at the dma stack made >> it a little more pedantic ;) > > Yes. In 3.8. > >> Two similar errors on different drivers is enough for me to take a deeper >> look into it to solve the mystery ;) >> >> Basically, those got generated by lib/dma-debug.c. >> >> On the af9015 case, this is a simple test, done at debug_dma_map_page(). >> It checks if the area is in stack and print an error. I suspect that, on >> x86, the driver will still run, on most cases. >> >> In the case of the network driver, it detected that >> debug_dma_mapping_error() was never called. Also, I believe that the >> network driver will still work, and this is more a warning that bad >> things might happen with the driver, if the check is not done. > > No, you don't call debug_dma_mapping_error directly. That's used by the > DMA-API debugging functions itself. It's not something drivers call. > >> I _suspect_ that the network driver is calling dma_map_single() or >> dma_map_page() but not calling dma_map_error(). > > Close. It's calling pci_map_single and not calling pci_dma_mapping_error. > Those are PCI wrappers around the same functions, but that's what they > should be using. > >> The only network driver there on the list of drivers on your error is r8169. >> Weird enough, at least on linux-next, r8169 seems ok: > > Er... his error was triggered by skge, which is a network driver. > >> $ git grep -L dma_mapping_error $(git grep -l -e dma_map_single -e dma_map_page drivers/net/) > > That's the wrong function to grep for in most cases. I don't think the > prbolem is as wide-spread as you fear. > > Poma, you should probably just start filing bugs for things you hit. > In this case, the skge backtrace is just a warning but it can be fixed > up relatively easily. > https://bugzilla.redhat.com/show_bug.cgi?id=914994 Josh, Mauro thanks for the overview. :) Antti, cheers. ;) poma

I cannot see these warnings at all. What is Kernel option to enable those debug(?) warnings? From which menu it could be located when make menuconfig ? Antti -- http://palosaari.fi/

poma

Monday, 25 February Mon, 25 Feb

6:05 p.m.

On 02/25/13 01:22, Antti Palosaari wrote: […]

...

>> Poma, you should probably just start filing bugs for things you hit. >> In this case, the skge backtrace is just a warning but it can be fixed >> up relatively easily. >> > > https://bugzilla.redhat.com/show_bug.cgi?id=914994 > Josh, Mauro thanks for the overview. :) > Antti, cheers. ;) > > poma I cannot see these warnings at all. What is Kernel option to enable those debug(?) warnings? From which menu it could be located when make menuconfig ?

http://kojipkgs.fedoraproject.org/packages/kernel/3.9.0/0.rc0.git7.1.fc19... Even after reapplying Stephen's "skge: check for PCI dma mapping errors"[1] from David's 'net-next' tree, no luck. … WARNING: at lib/dma-debug.c:933 check_unmap+0x47b/0x960() skge 0000:01:09.0: DMA-API: device driver failed to check map error[device address=0x000000010287094a] [size=90 bytes] [mapped as single] … Cheers, poma [1] http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=d...

Josh Boyer

7:12 p.m.

On Tue, Feb 26, 2013 at 01:05:05AM +0100, poma wrote:

...

On 02/25/13 01:22, Antti Palosaari wrote: […] >>> Poma, you should probably just start filing bugs for things you hit. >>> In this case, the skge backtrace is just a warning but it can be fixed >>> up relatively easily. >>> >> >> https://bugzilla.redhat.com/show_bug.cgi?id=914994 >> Josh, Mauro thanks for the overview. :) >> Antti, cheers. ;) >> >> poma > > I cannot see these warnings at all. What is Kernel option to enable > those debug(?) warnings? From which menu it could be located when make > menuconfig ? > http://kojipkgs.fedoraproject.org/packages/kernel/3.9.0/0.rc0.git7.1.fc19... Even after reapplying Stephen's "skge: check for PCI dma mapping errors"[1] from David's 'net-next' tree, no luck. … WARNING: at lib/dma-debug.c:933 check_unmap+0x47b/0x960() skge 0000:01:09.0: DMA-API: device driver failed to check map error[device address=0x000000010287094a] [size=90 bytes] [mapped as single] …

I'm confused. You pointed to the build I did this morning, which definitely doesn't include the commit you mentioned. Are you saying you took this morning's build and applied the patch yourself? If so, did it really get applied? Do you have logs? Do you have more than just that tiny snippet of error message? josh

poma

8:18 p.m.

On 02/26/13 02:12, Josh Boyer wrote:

...

On Tue, Feb 26, 2013 at 01:05:05AM +0100, poma wrote: > On 02/25/13 01:22, Antti Palosaari wrote: > […] >>>> Poma, you should probably just start filing bugs for things you hit. >>>> In this case, the skge backtrace is just a warning but it can be fixed >>>> up relatively easily. >>>> >>> >>> https://bugzilla.redhat.com/show_bug.cgi?id=914994 >>> Josh, Mauro thanks for the overview. :) >>> Antti, cheers. ;) >>> >>> poma >> >> I cannot see these warnings at all. What is Kernel option to enable >> those debug(?) warnings? From which menu it could be located when make >> menuconfig ? >> > > http://kojipkgs.fedoraproject.org/packages/kernel/3.9.0/0.rc0.git7.1.fc19... > Even after reapplying Stephen's "skge: check for PCI dma mapping > errors"[1] from David's 'net-next' tree, no luck. > … > WARNING: at lib/dma-debug.c:933 check_unmap+0x47b/0x960() > skge 0000:01:09.0: DMA-API: device driver failed to check map > error[device address=0x000000010287094a] [size=90 bytes] [mapped as single] > … I'm confused. You pointed to the build I did this morning, which definitely doesn't include the commit you mentioned. Are you saying you took this morning's build and applied the patch yourself? If so, did it really get applied? Do you have logs? Do you have more than just that tiny snippet of error message?

build.log is for Antti's eyes only. :) Logs are virtually the same, with or without that *old* commit, which is actually replaced by this one[1]. So of course it isn't in 3.9.0-0.rc0.git7.1.fc19.x86_64. ;) Building particular module groups(skge & Co.) rather than build an entire kernel tree isn't big deal, likewise. You know that better than me. ;) The real question is whether someone will help squeeze this bug. Cheers, poma [1] http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=a...

Antti Palosaari

Tuesday, 26 February Tue, 26 Feb

1:51 a.m.

On 02/26/2013 04:18 AM, poma wrote:

...

On 02/26/13 02:12, Josh Boyer wrote: > On Tue, Feb 26, 2013 at 01:05:05AM +0100, poma wrote: >> On 02/25/13 01:22, Antti Palosaari wrote: >> […] >>>>> Poma, you should probably just start filing bugs for things you hit. >>>>> In this case, the skge backtrace is just a warning but it can be fixed >>>>> up relatively easily. >>>>> >>>> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=914994 >>>> Josh, Mauro thanks for the overview. :) >>>> Antti, cheers. ;) >>>> >>>> poma >>> >>> I cannot see these warnings at all. What is Kernel option to enable >>> those debug(?) warnings? From which menu it could be located when make >>> menuconfig ? >>> >> >> http://kojipkgs.fedoraproject.org/packages/kernel/3.9.0/0.rc0.git7.1.fc19... >> Even after reapplying Stephen's "skge: check for PCI dma mapping >> errors"[1] from David's 'net-next' tree, no luck. >> … >> WARNING: at lib/dma-debug.c:933 check_unmap+0x47b/0x960() >> skge 0000:01:09.0: DMA-API: device driver failed to check map >> error[device address=0x000000010287094a] [size=90 bytes] [mapped as single] >> … > > I'm confused. You pointed to the build I did this morning, which > definitely doesn't include the commit you mentioned. Are you saying you > took this morning's build and applied the patch yourself? If so, did it > really get applied? Do you have logs? Do you have more than just that > tiny snippet of error message? > build.log is for Antti's eyes only. :) Logs are virtually the same, with or without that *old* commit, which is actually replaced by this one[1]. So of course it isn't in 3.9.0-0.rc0.git7.1.fc19.x86_64. ;) Building particular module groups(skge & Co.) rather than build an entire kernel tree isn't big deal, likewise. You know that better than me. ;) The real question is whether someone will help squeeze this bug.

I want just know if I could reproduce that AF9015 error message or was it just warning. Is it something I have to fix for Kernel 3.10 or earlier. I am running Fedora 17 AMD64 and could surely compile any Kernel needed. regards Antti -- http://palosaari.fi/

poma

2:44 a.m.

On 02/26/13 08:51, Antti Palosaari wrote: […]

...

A-haa! You're talking about AF9015. Sorry. See if you can hit the bug with this one: rpm -ivh http://kojipkgs.fedoraproject.org//packages/kernel/3.9.0/0.rc0.git7.1.fc1... http://kojipkgs.fedoraproject.org//packages/kernel/3.9.0/0.rc0.git7.1.fc1... Cheers, poma

Antti Palosaari

2:54 a.m.

On 02/26/2013 09:51 AM, Antti Palosaari wrote:

...

On 02/26/2013 04:18 AM, poma wrote: > On 02/26/13 02:12, Josh Boyer wrote: >> On Tue, Feb 26, 2013 at 01:05:05AM +0100, poma wrote: >>> On 02/25/13 01:22, Antti Palosaari wrote: >>> […] >>>>>> Poma, you should probably just start filing bugs for things you hit. >>>>>> In this case, the skge backtrace is just a warning but it can be >>>>>> fixed >>>>>> up relatively easily. >>>>>> >>>>> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=914994 >>>>> Josh, Mauro thanks for the overview. :) >>>>> Antti, cheers. ;) >>>>> >>>>> poma >>>> >>>> I cannot see these warnings at all. What is Kernel option to enable >>>> those debug(?) warnings? From which menu it could be located when make >>>> menuconfig ? >>>> >>> >>> http://kojipkgs.fedoraproject.org/packages/kernel/3.9.0/0.rc0.git7.1.fc19... >>> >>> Even after reapplying Stephen's "skge: check for PCI dma mapping >>> errors"[1] from David's 'net-next' tree, no luck. >>> … >>> WARNING: at lib/dma-debug.c:933 check_unmap+0x47b/0x960() >>> skge 0000:01:09.0: DMA-API: device driver failed to check map >>> error[device address=0x000000010287094a] [size=90 bytes] [mapped as >>> single] >>> … >> >> I'm confused. You pointed to the build I did this morning, which >> definitely doesn't include the commit you mentioned. Are you saying you >> took this morning's build and applied the patch yourself? If so, did it >> really get applied? Do you have logs? Do you have more than just that >> tiny snippet of error message? >> > > build.log is for Antti's eyes only. :) > Logs are virtually the same, with or without that *old* commit, which is > actually replaced by this one[1]. > So of course it isn't in 3.9.0-0.rc0.git7.1.fc19.x86_64. ;) > Building particular module groups(skge & Co.) rather than build an > entire kernel tree isn't big deal, likewise. > You know that better than me. ;) > The real question is whether someone will help squeeze this bug. I want just know if I could reproduce that AF9015 error message or was it just warning. Is it something I have to fix for Kernel 3.10 or earlier. I am running Fedora 17 AMD64 and could surely compile any Kernel needed.

OK, found it finally. Kernel hacking ---> Enable debugging of DMA-API usage There was rather many DVB USB drivers using USB bulk buffers from the stack. I will fix at least some of those, lets say for 3.10. regards Antti -- http://palosaari.fi/

poma

3:17 a.m.

On 02/26/13 09:54, Antti Palosaari wrote: […]

...

> I want just know if I could reproduce that AF9015 error message or was > it just warning. Is it something I have to fix for Kernel 3.10 or > earlier. I am running Fedora 17 AMD64 and could surely compile any > Kernel needed. OK, found it finally. Kernel hacking ---> Enable debugging of DMA-API usage

True. grep CONFIG_DMA_API_DEBUG /boot/config-3.* /boot/config-3.7.8-202.fc18.x86_64:# CONFIG_DMA_API_DEBUG is not set /boot/config-3.7.9-201.fc18.x86_64:# CONFIG_DMA_API_DEBUG is not set /boot/config-3.8.0-1.fc18.x86_64:# CONFIG_DMA_API_DEBUG is not set /boot/config-3.9.0-0.rc0.git5.1.fc18.x86_64:CONFIG_DMA_API_DEBUG=y /boot/config-3.9.0-0.rc0.git6.1.fc19.x86_64:CONFIG_DMA_API_DEBUG=y /boot/config-3.9.0-0.rc0.git7.1.fc19.x86_64:CONFIG_DMA_API_DEBUG=y

...

There was rather many DVB USB drivers using USB bulk buffers from the stack. I will fix at least some of those, lets say for 3.10. regards Antti

poma

6:16 p.m.

On 02/26/13 10:17, poma wrote:

...

On 02/26/13 09:54, Antti Palosaari wrote: […] >> I want just know if I could reproduce that AF9015 error message or was >> it just warning. Is it something I have to fix for Kernel 3.10 or >> earlier. I am running Fedora 17 AMD64 and could surely compile any >> Kernel needed. > > OK, found it finally. > Kernel hacking ---> Enable debugging of DMA-API usage

On /boot/vmlinuz-3.9.0-0.rc0.git7.1.NONE_DMA_API_DEBUG.fc18.x86_64 with 'CONFIG_DMA_API_DEBUG is not set', "WARNING(S): at lib/dma-debug.c…" are masked. Thanks for pointing out.

...

After applying http://git.linuxtv.org/anttip/media_tree.git/shortlog/refs/heads/dvb_usb_... on '7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of git://linuxtv.org/media_build';, all clear. :) Tested on 3.9.0-0.rc0.git7.1.fc19.x86_64 with mplayer and mythbackend. Even so, on /boot/vmlinuz-3.9.0-0.rc0.git7.1.NONE_DEBUG.fc18.x86_64 with disabled debugging options, none of those warnings exist, including skge's and solved ivtv's locking issue. Besides there is no speed reduction regarding network part - skge. Josh was right. OK, Antti, Mauro, Josh thank you again. ;) poma

Antti Palosaari

Saturday, 23 February Sat, 23 Feb

2:30 p.m.

On 02/23/2013 03:30 AM, Mauro Carvalho Chehab wrote:

...

Em Sat, 23 Feb 2013 02:09:01 +0100 poma <pomidorabelisima(a)gmail.com> escreveu: > On 02/23/13 01:09, Mauro Carvalho Chehab wrote:

...

> This one stayed; > … > ------------[ cut here ]------------ > WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0() > Hardware name: M720-US3 > ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack > [addr=ffff8801283d7be1] > Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF) > wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)… > rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF) > videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)… > videodev(OF)… video i2c_algo_bit… (irrelevant omitted) > Pid: 38, comm: kworker/2:1 Tainted: GF O > 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 > Call Trace: > usb 4-2: pl2303 converter now attached to ttyUSB0 > [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 > [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 > [<ffffffff81386467>] check_for_stack+0xa7/0xf0 > [<ffffffff81388228>] debug_dma_map_page+0x118/0x150 > [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630 > [<ffffffff811be091>] ? set_track+0x61/0x1b0 > [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900 > [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540 > [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0 > [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0 > [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50 > [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180 > [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2] > [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015] > [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210 > [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210 > [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] > [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2] > [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 > [<ffffffff8108e3db>] process_one_work+0x20b/0x690 > [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 > [<ffffffff8108ec10>] worker_thread+0x110/0x380 > [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260 > [<ffffffff8109569d>] kthread+0xed/0x100 > [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 > [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0 > [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 > ---[ end trace 4f4bc1c2abe300b5 ]--- > -- Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines, almost all memories can do DMA, including the area used by stack. That's not true on other archs like arm. I suspect that Antti never tried to run this driver outside x86.

Yep, that's true! I will generally test only x86 64bit and let the others test x86 32bit. Basically x86 is only which is tested by multiple users. MIPS and ARM are something that could be nice to test frequently, but I have no hw...

...

Let me look into it... static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) { #define BUF_LEN 63 #define REQ_HDR_LEN 8 /* send header size */ #define ACK_HDR_LEN 2 /* rece header size */ struct af9015_state *state = d_to_priv(d); int ret, wlen, rlen; u8 buf[BUF_LEN]; ... ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen); int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf, u16 rlen) { int ret, actual_length; ... ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev, d->props->generic_bulk_ctrl_endpoint), wbuf, wlen, &actual_length, 2000); That's the problem: "buf" is at stack, and it is passed directly to usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause data to be lost even on x86). In order to fix it, the driver should be doing, instead, something like: static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) { ... u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC); ... kfree(buf); ... }

I remember patch set about two years ago which removes usb_control_msg() stacked buffers from the all the /media/ USB drivers. At that time stacked use of usb_bulk_msg() was left as it was OK(?).

...

You shouldn't blame the merge windows kernel here. This never worked on arm, even on older kernel versions ;) A more permanent fix would be to allocate a temporary 80 bytes buffer embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to always use this temporary buffer for the control URB transfers. Antti, Could you please tale a look in it?

I am happy to alloc() all the buffers what are used by drivers I made, if that is the reason of these bugs (or even if it is the correct way despite that bug or not). regards Antti -- http://palosaari.fi/

Mauro Carvalho Chehab

2:41 p.m.

Em Sat, 23 Feb 2013 22:30:57 +0200 Antti Palosaari <crope(a)iki.fi> escreveu:

...

On 02/23/2013 03:30 AM, Mauro Carvalho Chehab wrote: > Em Sat, 23 Feb 2013 02:09:01 +0100 > poma <pomidorabelisima(a)gmail.com> escreveu: > >> On 02/23/13 01:09, Mauro Carvalho Chehab wrote: >> This one stayed; >> … >> ------------[ cut here ]------------ >> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0() >> Hardware name: M720-US3 >> ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack >> [addr=ffff8801283d7be1] >> Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF) >> wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)… >> rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF) >> videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)… >> videodev(OF)… video i2c_algo_bit… (irrelevant omitted) >> Pid: 38, comm: kworker/2:1 Tainted: GF O >> 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 >> Call Trace: >> usb 4-2: pl2303 converter now attached to ttyUSB0 >> [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 >> [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 >> [<ffffffff81386467>] check_for_stack+0xa7/0xf0 >> [<ffffffff81388228>] debug_dma_map_page+0x118/0x150 >> [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630 >> [<ffffffff811be091>] ? set_track+0x61/0x1b0 >> [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900 >> [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540 >> [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0 >> [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0 >> [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50 >> [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180 >> [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2] >> [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015] >> [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210 >> [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210 >> [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] >> [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2] >> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 >> [<ffffffff8108e3db>] process_one_work+0x20b/0x690 >> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 >> [<ffffffff8108ec10>] worker_thread+0x110/0x380 >> [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260 >> [<ffffffff8109569d>] kthread+0xed/0x100 >> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 >> [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0 >> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 >> ---[ end trace 4f4bc1c2abe300b5 ]--- >> -- > > Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines, > almost all memories can do DMA, including the area used by stack. That's > not true on other archs like arm. > > I suspect that Antti never tried to run this driver outside x86. Yep, that's true! I will generally test only x86 64bit and let the others test x86 32bit. Basically x86 is only which is tested by multiple users. MIPS and ARM are something that could be nice to test frequently, but I have no hw... > Let me look into it... > > static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) > { > #define BUF_LEN 63 > #define REQ_HDR_LEN 8 /* send header size */ > #define ACK_HDR_LEN 2 /* rece header size */ > struct af9015_state *state = d_to_priv(d); > int ret, wlen, rlen; > u8 buf[BUF_LEN]; > ... > ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen); > > > > int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf, > u16 rlen) > { > int ret, actual_length; > > ... > ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev, > d->props->generic_bulk_ctrl_endpoint), wbuf, wlen, > &actual_length, 2000); > > > That's the problem: "buf" is at stack, and it is passed directly to > usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause > data to be lost even on x86). > > In order to fix it, the driver should be doing, instead, something like: > > static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) > { > ... > u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC); > > ... > > kfree(buf); > ... > } I remember patch set about two years ago which removes usb_control_msg() stacked buffers from the all the /media/ USB drivers. At that time stacked use of usb_bulk_msg() was left as it was OK(?).

No, that's not ok. Probably people just forgot usb_bulk_msg() when that patch got submitted.

...

> You shouldn't blame the merge windows kernel here. This never worked > on arm, even on older kernel versions ;) > > A more permanent fix would be to allocate a temporary 80 bytes buffer > embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to > always use this temporary buffer for the control URB transfers. > > Antti, > > Could you please tale a look in it? I am happy to alloc() all the buffers what are used by drivers I made, if that is the reason of these bugs (or even if it is the correct way despite that bug or not).

IMO, the better is to allocate it before. The only drawback is that a mutex is needed to serialize the buffer usage, otherwise the remote control and CA logic may race with usual ioctl handling that might try to access the device's registers. I would put that logic in the dvb-usb/dvb-usb-v2 core. With this new 3.8 "pedantic" dma mode, people will start complain about it anyway on all drivers. Regards, Mauro

Antti Palosaari

5:07 p.m.

On 02/23/2013 10:41 PM, Mauro Carvalho Chehab wrote:

...

Em Sat, 23 Feb 2013 22:30:57 +0200 Antti Palosaari <crope(a)iki.fi> escreveu: > On 02/23/2013 03:30 AM, Mauro Carvalho Chehab wrote: >> Em Sat, 23 Feb 2013 02:09:01 +0100 >> poma <pomidorabelisima(a)gmail.com> escreveu: >> >>> On 02/23/13 01:09, Mauro Carvalho Chehab wrote: > >>> This one stayed; >>> … >>> ------------[ cut here ]------------ >>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0() >>> Hardware name: M720-US3 >>> ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack >>> [addr=ffff8801283d7be1] >>> Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF) >>> wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)… >>> rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF) >>> videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)… >>> videodev(OF)… video i2c_algo_bit… (irrelevant omitted) >>> Pid: 38, comm: kworker/2:1 Tainted: GF O >>> 3.9.0-0.rc0.git5.1.fc18.x86_64 #1 >>> Call Trace: >>> usb 4-2: pl2303 converter now attached to ttyUSB0 >>> [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0 >>> [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50 >>> [<ffffffff81386467>] check_for_stack+0xa7/0xf0 >>> [<ffffffff81388228>] debug_dma_map_page+0x118/0x150 >>> [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630 >>> [<ffffffff811be091>] ? set_track+0x61/0x1b0 >>> [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900 >>> [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540 >>> [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0 >>> [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0 >>> [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50 >>> [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180 >>> [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2] >>> [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015] >>> [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210 >>> [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210 >>> [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015] >>> [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2] >>> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 >>> [<ffffffff8108e3db>] process_one_work+0x20b/0x690 >>> [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690 >>> [<ffffffff8108ec10>] worker_thread+0x110/0x380 >>> [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260 >>> [<ffffffff8109569d>] kthread+0xed/0x100 >>> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 >>> [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0 >>> [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190 >>> ---[ end trace 4f4bc1c2abe300b5 ]--- >>> -- >> >> Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines, >> almost all memories can do DMA, including the area used by stack. That's >> not true on other archs like arm. >> >> I suspect that Antti never tried to run this driver outside x86. > > Yep, that's true! I will generally test only x86 64bit and let the > others test x86 32bit. Basically x86 is only which is tested by multiple > users. MIPS and ARM are something that could be nice to test frequently, > but I have no hw... > >> Let me look into it... >> >> static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) >> { >> #define BUF_LEN 63 >> #define REQ_HDR_LEN 8 /* send header size */ >> #define ACK_HDR_LEN 2 /* rece header size */ >> struct af9015_state *state = d_to_priv(d); >> int ret, wlen, rlen; >> u8 buf[BUF_LEN]; >> ... >> ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen); >> >> >> >> int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf, >> u16 rlen) >> { >> int ret, actual_length; >> >> ... >> ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev, >> d->props->generic_bulk_ctrl_endpoint), wbuf, wlen, >> &actual_length, 2000); >> >> >> That's the problem: "buf" is at stack, and it is passed directly to >> usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause >> data to be lost even on x86). >> >> In order to fix it, the driver should be doing, instead, something like: >> >> static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req) >> { >> ... >> u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC); >> >> ... >> >> kfree(buf); >> ... >> } > > I remember patch set about two years ago which removes usb_control_msg() > stacked buffers from the all the /media/ USB drivers. At that time > stacked use of usb_bulk_msg() was left as it was OK(?). No, that's not ok. Probably people just forgot usb_bulk_msg() when that patch got submitted.

It was that patch set which did that fix for usb_control_msg() https://lkml.org/lkml/2011/4/3/67 https://bugzilla.kernel.org/show_bug.cgi?id=15977 So now same for the usb_bulk_msg()

...

>> You shouldn't blame the merge windows kernel here. This never worked >> on arm, even on older kernel versions ;) >> >> A more permanent fix would be to allocate a temporary 80 bytes buffer >> embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to >> always use this temporary buffer for the control URB transfers. >> >> Antti, >> >> Could you please tale a look in it? > > I am happy to alloc() all the buffers what are used by drivers I made, > if that is the reason of these bugs (or even if it is the correct way > despite that bug or not). IMO, the better is to allocate it before. The only drawback is that a mutex is needed to serialize the buffer usage, otherwise the remote control and CA logic may race with usual ioctl handling that might try to access the device's registers. I would put that logic in the dvb-usb/dvb-usb-v2 core.

I would like to put those transfer buffers to the driver state. For the DVB USB it was not possible as there was no pointer to state early enough, like identify state & download firmware phases. For the DVB USB V2 there is and it is very simply to move buffers to the state.

...

With this new 3.8 "pedantic" dma mode, people will start complain about it anyway on all drivers. Regards, Mauro

regards Antti -- http://palosaari.fi/

4075

days inactive

4080

days old

kernel@lists.fedoraproject.org

Manage subscription

23 comments

7 participants

tags (0)

participants (7)

Antti Palosaari
Bruno Wolff III
Josh Boyer
Josh Boyer
Mauro Carvalho Chehab
Peter Robinson
poma

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

3.9 merge window kernels