* [PATCH] hid: usbhid: hid-core: fix recursive deadlock @ 2015-11-18 19:25 Ioan-Adrian Ratiu 2015-11-18 20:37 ` Jiri Kosina 2015-11-20 20:19 ` [PATCH v2] " Ioan-Adrian Ratiu 0 siblings, 2 replies; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-18 19:25 UTC (permalink / raw) To: jikos; +Cc: pinglinux, linux-usb, linux-input, linux-kernel The critical section protected by usbhid->lock in hid_ctrl() is too big and in rare cases causes a recursive deadlock because of its call to hid_input_report(). This deadlock reproduces on newer wacom tablets like 056a:033c because the wacom driver in its irq handler ends up calling hid_hw_request() from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means is that it submits a report to reschedule a proximity read through a sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) before calling hid_input_report(). When the irq kicks in on the same cpu, it also tries to grab the lock resulting in a recursive deadlock. The proper fix is to shrink the critical section in hid_ctrl() to protect only the instructions which modify usbhid, thus move the lock after the hid_input_report() call and the deadlock dissapears. Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com> --- drivers/hid/usbhid/hid-core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index 36712e9..5dd426f 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb) struct usbhid_device *usbhid = hid->driver_data; int unplug = 0, status = urb->status; - spin_lock(&usbhid->lock); - switch (status) { case 0: /* success */ if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN) @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb) hid_warn(urb->dev, "ctrl urb status %d received\n", status); } + spin_lock(&usbhid->lock); + if (unplug) { usbhid->ctrltail = usbhid->ctrlhead; } else { -- 2.6.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-18 19:25 [PATCH] hid: usbhid: hid-core: fix recursive deadlock Ioan-Adrian Ratiu @ 2015-11-18 20:37 ` Jiri Kosina 2015-11-18 21:05 ` Ioan-Adrian Ratiu 2015-11-20 20:19 ` [PATCH v2] " Ioan-Adrian Ratiu 1 sibling, 1 reply; 20+ messages in thread From: Jiri Kosina @ 2015-11-18 20:37 UTC (permalink / raw) To: Ioan-Adrian Ratiu; +Cc: pinglinux, linux-usb, linux-input, linux-kernel On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote: > The critical section protected by usbhid->lock in hid_ctrl() is too > big and in rare cases causes a recursive deadlock because of its call > to hid_input_report(). > > This deadlock reproduces on newer wacom tablets like 056a:033c because > the wacom driver in its irq handler ends up calling hid_hw_request() > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means > is that it submits a report to reschedule a proximity read through a > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) > before calling hid_input_report(). When the irq kicks in on the same > cpu, it also tries to grab the lock resulting in a recursive deadlock. > > The proper fix is to shrink the critical section in hid_ctrl() to > protect only the instructions which modify usbhid, thus move the lock > after the hid_input_report() call and the deadlock dissapears. I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), isn't it? -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock @ 2015-11-18 21:05 ` Ioan-Adrian Ratiu 0 siblings, 0 replies; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-18 21:05 UTC (permalink / raw) To: Jiri Kosina; +Cc: pinglinux, linux-usb, linux-input, linux-kernel On Wed, 18 Nov 2015 21:37:42 +0100 (CET) Jiri Kosina <jikos@kernel.org> wrote: > On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote: > > > The critical section protected by usbhid->lock in hid_ctrl() is too > > big and in rare cases causes a recursive deadlock because of its call > > to hid_input_report(). > > > > This deadlock reproduces on newer wacom tablets like 056a:033c because > > the wacom driver in its irq handler ends up calling hid_hw_request() > > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means > > is that it submits a report to reschedule a proximity read through a > > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) > > before calling hid_input_report(). When the irq kicks in on the same > > cpu, it also tries to grab the lock resulting in a recursive deadlock. > > > > The proper fix is to shrink the critical section in hid_ctrl() to > > protect only the instructions which modify usbhid, thus move the lock > > after the hid_input_report() call and the deadlock dissapears. > > I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), > isn't it? > That was my first attempt, yes, but the deadlock still happens with interrupts disabled. It is very weird, I know. I tried many configurations, like disabling PREEMPT_RT and other stuff which might affect the call stack in this case, but the only two methods which actually avoid the deadlock are: 1. don't call wacom_intuos_schedule_prox_event() / hid_hw_request() from the wacom driver 2. shrink the critical region to not cover hid_input_report() inside hid_ctrl() I am very open to any ideas on how to better fix this, just to be able to use a mainline kernel with my device without out of tree patching :) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock @ 2015-11-18 21:05 ` Ioan-Adrian Ratiu 0 siblings, 0 replies; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-18 21:05 UTC (permalink / raw) To: Jiri Kosina Cc: pinglinux-Re5JQEeQqe8AvxtiuMwx3w, linux-usb-u79uwXL29TY76Z2rM5mHXA, linux-input-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Wed, 18 Nov 2015 21:37:42 +0100 (CET) Jiri Kosina <jikos-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote: > > > The critical section protected by usbhid->lock in hid_ctrl() is too > > big and in rare cases causes a recursive deadlock because of its call > > to hid_input_report(). > > > > This deadlock reproduces on newer wacom tablets like 056a:033c because > > the wacom driver in its irq handler ends up calling hid_hw_request() > > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means > > is that it submits a report to reschedule a proximity read through a > > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) > > before calling hid_input_report(). When the irq kicks in on the same > > cpu, it also tries to grab the lock resulting in a recursive deadlock. > > > > The proper fix is to shrink the critical section in hid_ctrl() to > > protect only the instructions which modify usbhid, thus move the lock > > after the hid_input_report() call and the deadlock dissapears. > > I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), > isn't it? > That was my first attempt, yes, but the deadlock still happens with interrupts disabled. It is very weird, I know. I tried many configurations, like disabling PREEMPT_RT and other stuff which might affect the call stack in this case, but the only two methods which actually avoid the deadlock are: 1. don't call wacom_intuos_schedule_prox_event() / hid_hw_request() from the wacom driver 2. shrink the critical region to not cover hid_input_report() inside hid_ctrl() I am very open to any ideas on how to better fix this, just to be able to use a mainline kernel with my device without out of tree patching :) -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-18 21:05 ` Ioan-Adrian Ratiu @ 2015-11-18 23:58 ` Josh Cartwright -1 siblings, 0 replies; 20+ messages in thread From: Josh Cartwright @ 2015-11-18 23:58 UTC (permalink / raw) To: Ioan-Adrian Ratiu Cc: Jiri Kosina, pinglinux, linux-usb, linux-input, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1766 bytes --] On Wed, Nov 18, 2015 at 11:05:44PM +0200, Ioan-Adrian Ratiu wrote: > On Wed, 18 Nov 2015 21:37:42 +0100 (CET) > Jiri Kosina <jikos@kernel.org> wrote: > > > On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote: > > > > > The critical section protected by usbhid->lock in hid_ctrl() is too > > > big and in rare cases causes a recursive deadlock because of its call > > > to hid_input_report(). > > > > > > This deadlock reproduces on newer wacom tablets like 056a:033c because > > > the wacom driver in its irq handler ends up calling hid_hw_request() > > > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means > > > is that it submits a report to reschedule a proximity read through a > > > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) > > > before calling hid_input_report(). When the irq kicks in on the same > > > cpu, it also tries to grab the lock resulting in a recursive deadlock. > > > > > > The proper fix is to shrink the critical section in hid_ctrl() to > > > protect only the instructions which modify usbhid, thus move the lock > > > after the hid_input_report() call and the deadlock dissapears. > > > > I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), > > isn't it? > > > > That was my first attempt, yes, but the deadlock still happens with interrupts > disabled. It is very weird, I know. I think your best course of action is to figure out why this is the case, instead of continuing with trying to solve the symptoms. Do you have actual callstacks showing the cases where you hit? That might be useful to share (your lockdep picture cuts out the callstacks). Also, have you tried without the PREEMPT_RT patch in the picture at all? Josh [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock @ 2015-11-18 23:58 ` Josh Cartwright 0 siblings, 0 replies; 20+ messages in thread From: Josh Cartwright @ 2015-11-18 23:58 UTC (permalink / raw) To: Ioan-Adrian Ratiu Cc: Jiri Kosina, pinglinux-Re5JQEeQqe8AvxtiuMwx3w, linux-usb-u79uwXL29TY76Z2rM5mHXA, linux-input-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA [-- Attachment #1: Type: text/plain, Size: 1795 bytes --] On Wed, Nov 18, 2015 at 11:05:44PM +0200, Ioan-Adrian Ratiu wrote: > On Wed, 18 Nov 2015 21:37:42 +0100 (CET) > Jiri Kosina <jikos-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > > > On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote: > > > > > The critical section protected by usbhid->lock in hid_ctrl() is too > > > big and in rare cases causes a recursive deadlock because of its call > > > to hid_input_report(). > > > > > > This deadlock reproduces on newer wacom tablets like 056a:033c because > > > the wacom driver in its irq handler ends up calling hid_hw_request() > > > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means > > > is that it submits a report to reschedule a proximity read through a > > > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) > > > before calling hid_input_report(). When the irq kicks in on the same > > > cpu, it also tries to grab the lock resulting in a recursive deadlock. > > > > > > The proper fix is to shrink the critical section in hid_ctrl() to > > > protect only the instructions which modify usbhid, thus move the lock > > > after the hid_input_report() call and the deadlock dissapears. > > > > I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), > > isn't it? > > > > That was my first attempt, yes, but the deadlock still happens with interrupts > disabled. It is very weird, I know. I think your best course of action is to figure out why this is the case, instead of continuing with trying to solve the symptoms. Do you have actual callstacks showing the cases where you hit? That might be useful to share (your lockdep picture cuts out the callstacks). Also, have you tried without the PREEMPT_RT patch in the picture at all? Josh [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-18 23:58 ` Josh Cartwright (?) @ 2015-11-19 6:47 ` Ioan-Adrian Ratiu 2015-11-19 9:10 ` Jiri Kosina -1 siblings, 1 reply; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-19 6:47 UTC (permalink / raw) To: Josh Cartwright Cc: Jiri Kosina, pinglinux, linux-usb, linux-input, linux-kernel On Wed, 18 Nov 2015 17:58:56 -0600 Josh Cartwright <joshc@ni.com> wrote: > On Wed, Nov 18, 2015 at 11:05:44PM +0200, Ioan-Adrian Ratiu wrote: > > On Wed, 18 Nov 2015 21:37:42 +0100 (CET) > > Jiri Kosina <jikos@kernel.org> wrote: > > > > > On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote: > > > > > > > The critical section protected by usbhid->lock in hid_ctrl() is too > > > > big and in rare cases causes a recursive deadlock because of its call > > > > to hid_input_report(). > > > > > > > > This deadlock reproduces on newer wacom tablets like 056a:033c because > > > > the wacom driver in its irq handler ends up calling hid_hw_request() > > > > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means > > > > is that it submits a report to reschedule a proximity read through a > > > > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) > > > > before calling hid_input_report(). When the irq kicks in on the same > > > > cpu, it also tries to grab the lock resulting in a recursive deadlock. > > > > > > > > The proper fix is to shrink the critical section in hid_ctrl() to > > > > protect only the instructions which modify usbhid, thus move the lock > > > > after the hid_input_report() call and the deadlock dissapears. > > > > > > I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), > > > isn't it? > > > > > > > That was my first attempt, yes, but the deadlock still happens with > > interrupts disabled. It is very weird, I know. > > I think your best course of action is to figure out why this is the > case, instead of continuing with trying to solve the symptoms. Do you > have actual callstacks showing the cases where you hit? That might be > useful to share (your lockdep picture cuts out the callstacks). > > Also, have you tried without the PREEMPT_RT patch in the picture at all? > > Josh Yes, of course I tried it without PREEMPT_RT_FULL :) This happens on vanilla mainline kernels (only after 4.4-rc1 which introduced support for this kind of tablets). I also backported all the wacom patches to 4.1 non-RT and the same deadlock happens. I've sent another email with some lockdep traces and printk's on a running vanilla linux-next, maybe it didn't get through, here are the links again: First part of lockdep report: http://imgur.com/clLsCWe Second part: http://imgur.com/Wa2PzRl Here are some printk's of mine while reproducing + debugging the issue: http://imgur.com/SETOHT7 I'll continue to research this more in depth, but progress is slow because I don't have much time, I'm doing this in my spare time because it's my girlfriend's tablet. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-19 6:47 ` Ioan-Adrian Ratiu @ 2015-11-19 9:10 ` Jiri Kosina 2015-11-19 16:33 ` Ioan-Adrian Ratiu 0 siblings, 1 reply; 20+ messages in thread From: Jiri Kosina @ 2015-11-19 9:10 UTC (permalink / raw) To: Ioan-Adrian Ratiu Cc: Josh Cartwright, pinglinux, linux-usb, linux-input, linux-kernel On Thu, 19 Nov 2015, Ioan-Adrian Ratiu wrote: > First part of lockdep report: > http://imgur.com/clLsCWe > > Second part: > http://imgur.com/Wa2PzRl > > Here are some printk's of mine while reproducing + debugging the issue: > http://imgur.com/SETOHT7 So the real problem is that Intuos driver is calling hid_hw_request() (which tries to grab the lock in usbhid_submit_report()) while handling the CTRL IRQ (lock gets acquired there). So the proper way to fix seems to be delaying the scheduling of the proximity read event in wacom_intuos_inout() to workqueue. > I'll continue to research this more in depth, but progress is slow > because I don't have much time, I'm doing this in my spare time because > it's my girlfriend's tablet. Oh, now I understand the level of severity of this bug! :-) Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-19 9:10 ` Jiri Kosina @ 2015-11-19 16:33 ` Ioan-Adrian Ratiu 2015-11-19 21:34 ` Jiri Kosina 0 siblings, 1 reply; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-19 16:33 UTC (permalink / raw) To: Jiri Kosina Cc: Josh Cartwright, pinglinux, linux-usb, linux-input, linux-kernel On Thu, 19 Nov 2015 10:10:19 +0100 (CET) Jiri Kosina <jikos@kernel.org> wrote: > On Thu, 19 Nov 2015, Ioan-Adrian Ratiu wrote: > > > First part of lockdep report: > > http://imgur.com/clLsCWe > > > > Second part: > > http://imgur.com/Wa2PzRl > > > > Here are some printk's of mine while reproducing + debugging the issue: > > http://imgur.com/SETOHT7 > > So the real problem is that Intuos driver is calling hid_hw_request() > (which tries to grab the lock in usbhid_submit_report()) while handling > the CTRL IRQ (lock gets acquired there). > > So the proper way to fix seems to be delaying the scheduling of the > proximity read event in wacom_intuos_inout() to workqueue. > > > I'll continue to research this more in depth, but progress is slow > > because I don't have much time, I'm doing this in my spare time because > > it's my girlfriend's tablet. > > Oh, now I understand the level of severity of this bug! :-) > > Thanks, > Yes, exactly, you are beginning to understand! :) When I've put my 2 variants above to solve this deadlock, by "removing the call from wacom" at 1) I was trying to say exactly this, removing it from the irq to a workqueue. But please understand further my reasoning for submitting this patch. Consider if this is a bug in the wacom driver or in the usbhid core? IMO this is a usbhid bug: the critical region in hid_ctrl() is too big, there is no reason for the call to hid_input_report() to be protected by usbhid->lock. The correct way to fix this deadlock is to fix the critical section in usbhid, not remove the call from the wacom irq. If wacom wants to reschedule in the irq, it should not deadlock on usbhid. "Fixing" the wacom call would just work around the critical region bug inside usbhid. I hope I've made myself clear this time; I really needed to explain this patch better :( sorry. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-19 16:33 ` Ioan-Adrian Ratiu @ 2015-11-19 21:34 ` Jiri Kosina 2015-11-20 20:08 ` Ioan-Adrian Ratiu 0 siblings, 1 reply; 20+ messages in thread From: Jiri Kosina @ 2015-11-19 21:34 UTC (permalink / raw) To: Ioan-Adrian Ratiu Cc: Josh Cartwright, pinglinux, linux-usb, linux-input, linux-kernel On Thu, 19 Nov 2015, Ioan-Adrian Ratiu wrote: > But please understand further my reasoning for submitting this patch. > Consider if this is a bug in the wacom driver or in the usbhid core? IMO > this is a usbhid bug: the critical region in hid_ctrl() is too big, > there is no reason for the call to hid_input_report() to be protected by > usbhid->lock. Hmm, it's actually true that we might not need usbhid->lock during hid_input_report() at the end of the day, as we shouldn't be doing any URB-related operations there, neither iofl are being manipulated. If you have already done the full analysis that shows that usbhid->lock is indeed not needed, this absolutely needs to go into changelog as proper justification. Could you please reformulate the changelog in this respect and resubmit? Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-19 21:34 ` Jiri Kosina @ 2015-11-20 20:08 ` Ioan-Adrian Ratiu 0 siblings, 0 replies; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-20 20:08 UTC (permalink / raw) To: Jiri Kosina Cc: Josh Cartwright, pinglinux, linux-usb, linux-input, linux-kernel On Thu, 19 Nov 2015 22:34:18 +0100 (CET) Jiri Kosina <jikos@kernel.org> wrote: > Could you please reformulate the changelog in this respect and resubmit? Yes, of course, I tried to reformulate the problem and solution as clear and succint as I could in v2, which I'll send shortly. Thank you very much for your patience and feedback. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock 2015-11-18 21:05 ` Ioan-Adrian Ratiu (?) (?) @ 2015-11-19 8:56 ` Jiri Kosina -1 siblings, 0 replies; 20+ messages in thread From: Jiri Kosina @ 2015-11-19 8:56 UTC (permalink / raw) To: Ioan-Adrian Ratiu; +Cc: pinglinux, linux-usb, linux-input, linux-kernel On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote: > > > The critical section protected by usbhid->lock in hid_ctrl() is too > > > big and in rare cases causes a recursive deadlock because of its call > > > to hid_input_report(). > > > > > > This deadlock reproduces on newer wacom tablets like 056a:033c because > > > the wacom driver in its irq handler ends up calling hid_hw_request() > > > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means > > > is that it submits a report to reschedule a proximity read through a > > > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb) > > > before calling hid_input_report(). When the irq kicks in on the same > > > cpu, it also tries to grab the lock resulting in a recursive deadlock. > > > > > > The proper fix is to shrink the critical section in hid_ctrl() to > > > protect only the instructions which modify usbhid, thus move the lock > > > after the hid_input_report() call and the deadlock dissapears. > > > > I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), > > isn't it? > > That was my first attempt, yes, but the deadlock still happens with interrupts > disabled. That unfortunately however directly implies that your explanation above isn't actually correct description of the real problem. So we'd better first understand the problem rather than papering it over with more or less random fixes. First, have you tried to run your usecase on your system with lockdep enabled? Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock 2015-11-18 19:25 [PATCH] hid: usbhid: hid-core: fix recursive deadlock Ioan-Adrian Ratiu 2015-11-18 20:37 ` Jiri Kosina @ 2015-11-20 20:19 ` Ioan-Adrian Ratiu 2015-11-29 10:29 ` Ioan-Adrian Ratiu 2015-12-01 16:36 ` Jiri Kosina 1 sibling, 2 replies; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-20 20:19 UTC (permalink / raw) To: jikos; +Cc: pinglinux, linux-usb, linux-input, linux-kernel, joshc The critical section protected by usbhid->lock in hid_ctrl() is too big and because of this it causes a recursive deadlock. "Too big" means the case statement and the call to hid_input_report() do not need to be protected by the spinlock (no URB operations are done inside them). The deadlock happens because in certain rare cases drivers try to grab the lock while handling the ctrl irq which grabs the lock before them as described above. For example newer wacom tablets like 056a:033c try to reschedule proximity reads from wacom_intuos_schedule_prox_event() calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report() which tries to grab the usbhid lock already held by hid_ctrl(). There are two ways to get out of this deadlock: 1. Make the drivers work "around" the ctrl critical region, in the wacom case for ex. by delaying the scheduling of the proximity read request itself to a workqueue. 2. Shrink the critical region so the usbhid lock protects only the instructions which modify usbhid state, calling hid_input_report() with the spinlock unlocked, allowing the device driver to grab the lock first, finish and then grab the lock afterwards in hid_ctrl(). This patch implements the 2nd solution. Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com> --- drivers/hid/usbhid/hid-core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index 36712e9..5dd426f 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb) struct usbhid_device *usbhid = hid->driver_data; int unplug = 0, status = urb->status; - spin_lock(&usbhid->lock); - switch (status) { case 0: /* success */ if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN) @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb) hid_warn(urb->dev, "ctrl urb status %d received\n", status); } + spin_lock(&usbhid->lock); + if (unplug) { usbhid->ctrltail = usbhid->ctrlhead; } else { -- 2.6.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock 2015-11-20 20:19 ` [PATCH v2] " Ioan-Adrian Ratiu @ 2015-11-29 10:29 ` Ioan-Adrian Ratiu 2016-01-21 1:28 ` Jason Gerecke 2015-12-01 16:36 ` Jiri Kosina 1 sibling, 1 reply; 20+ messages in thread From: Ioan-Adrian Ratiu @ 2015-11-29 10:29 UTC (permalink / raw) To: jikos; +Cc: pinglinux, linux-usb, linux-input, linux-kernel, joshc On Fri, 20 Nov 2015 22:19:02 +0200 Ioan-Adrian Ratiu <adi@adirat.com> wrote: > The critical section protected by usbhid->lock in hid_ctrl() is too > big and because of this it causes a recursive deadlock. "Too big" means > the case statement and the call to hid_input_report() do not need to be > protected by the spinlock (no URB operations are done inside them). > > The deadlock happens because in certain rare cases drivers try to grab > the lock while handling the ctrl irq which grabs the lock before them > as described above. For example newer wacom tablets like 056a:033c try > to reschedule proximity reads from wacom_intuos_schedule_prox_event() > calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report() > which tries to grab the usbhid lock already held by hid_ctrl(). > > There are two ways to get out of this deadlock: > 1. Make the drivers work "around" the ctrl critical region, in the > wacom case for ex. by delaying the scheduling of the proximity read > request itself to a workqueue. > 2. Shrink the critical region so the usbhid lock protects only the > instructions which modify usbhid state, calling hid_input_report() > with the spinlock unlocked, allowing the device driver to grab the > lock first, finish and then grab the lock afterwards in hid_ctrl(). > > This patch implements the 2nd solution. > > Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com> > --- > drivers/hid/usbhid/hid-core.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c > index 36712e9..5dd426f 100644 > --- a/drivers/hid/usbhid/hid-core.c > +++ b/drivers/hid/usbhid/hid-core.c > @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb) > struct usbhid_device *usbhid = hid->driver_data; > int unplug = 0, status = urb->status; > > - spin_lock(&usbhid->lock); > - > switch (status) { > case 0: /* success */ > if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN) > @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb) > hid_warn(urb->dev, "ctrl urb status %d received\n", status); > } > > + spin_lock(&usbhid->lock); > + > if (unplug) { > usbhid->ctrltail = usbhid->ctrlhead; > } else { Hello again Can this please be merged in the 4.4? There are quite a few people who have their tablets deadlock and don't know how to patch their kernels so are stuck waiting for a new release. The severity of this issue is much bigger than I initially thought. Since I've posted on the wacom mailing list that I have a fix for this deadlock I've recived lots of email from people complaining of the same problem on a wide range of tablets. Some of those people know how to patch a kernel, some found this patch on the mailing list and tested it and confirmed that it works on the wacom mailing list (you can verify the deadlock + fix on that mailing list). Thank you, Adrian ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock 2015-11-29 10:29 ` Ioan-Adrian Ratiu @ 2016-01-21 1:28 ` Jason Gerecke 0 siblings, 0 replies; 20+ messages in thread From: Jason Gerecke @ 2016-01-21 1:28 UTC (permalink / raw) To: Jiri Kosina Cc: Jiri Kosina, Ping Cheng, linux-usb, Linux Input, linux-kernel, joshc, Ioan-Adrian Ratiu Jason --- Now instead of four in the eights place / you’ve got three, ‘Cause you added one / (That is to say, eight) to the two, / But you can’t take seven from three, / So you look at the sixty-fours.... On Sun, Nov 29, 2015 at 2:29 AM, Ioan-Adrian Ratiu <adi@adirat.com> wrote: > On Fri, 20 Nov 2015 22:19:02 +0200 > Ioan-Adrian Ratiu <adi@adirat.com> wrote: > >> The critical section protected by usbhid->lock in hid_ctrl() is too >> big and because of this it causes a recursive deadlock. "Too big" means >> the case statement and the call to hid_input_report() do not need to be >> protected by the spinlock (no URB operations are done inside them). >> >> The deadlock happens because in certain rare cases drivers try to grab >> the lock while handling the ctrl irq which grabs the lock before them >> as described above. For example newer wacom tablets like 056a:033c try >> to reschedule proximity reads from wacom_intuos_schedule_prox_event() >> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report() >> which tries to grab the usbhid lock already held by hid_ctrl(). >> >> There are two ways to get out of this deadlock: >> 1. Make the drivers work "around" the ctrl critical region, in the >> wacom case for ex. by delaying the scheduling of the proximity read >> request itself to a workqueue. >> 2. Shrink the critical region so the usbhid lock protects only the >> instructions which modify usbhid state, calling hid_input_report() >> with the spinlock unlocked, allowing the device driver to grab the >> lock first, finish and then grab the lock afterwards in hid_ctrl(). >> >> This patch implements the 2nd solution. >> >> Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com> >> --- >> drivers/hid/usbhid/hid-core.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c >> index 36712e9..5dd426f 100644 >> --- a/drivers/hid/usbhid/hid-core.c >> +++ b/drivers/hid/usbhid/hid-core.c >> @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb) >> struct usbhid_device *usbhid = hid->driver_data; >> int unplug = 0, status = urb->status; >> >> - spin_lock(&usbhid->lock); >> - >> switch (status) { >> case 0: /* success */ >> if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN) >> @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb) >> hid_warn(urb->dev, "ctrl urb status %d received\n", status); >> } >> >> + spin_lock(&usbhid->lock); >> + >> if (unplug) { >> usbhid->ctrltail = usbhid->ctrlhead; >> } else { > > Hello again > > Can this please be merged in the 4.4? There are quite a few people who have > their tablets deadlock and don't know how to patch their kernels so are stuck > waiting for a new release. > > The severity of this issue is much bigger than I initially thought. Since I've > posted on the wacom mailing list that I have a fix for this deadlock I've > recived lots of email from people complaining of the same problem on a wide > range of tablets. > > Some of those people know how to patch a kernel, some found this patch on the > mailing list and tested it and confirmed that it works on the wacom mailing > list (you can verify the deadlock + fix on that mailing list). > > Thank you, > Adrian On Tue, Dec 1, 2015 at 4:00 PM, Ping Cheng <pinglinux@gmail.com> wrote: > Hi Jiri, > > Since wacom_intuos_schedule_prox_event() was introduced on March 5, > 2015, the call was first used in kernel 3.19. Shouldn't we back-port > this patch to other stable versions? > > Thank you. > > Ping I've been prompted to resurrect this question after fielding yet another question from a user encountering a locked-up system. We received a surprising number of reports of people encountering this issue in the past several weeks and that's only going to get worse as 4.4 works its way into distros. Integrating back even just to 4.4 instead of 3.19 would be appreciated since this issue has (so far) only ever been encountered with the new Intuos tablets that were added in 4.4... Jason --- Now instead of four in the eights place / you’ve got three, ‘Cause you added one / (That is to say, eight) to the two, / But you can’t take seven from three, / So you look at the sixty-fours.... ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock @ 2016-01-21 1:28 ` Jason Gerecke 0 siblings, 0 replies; 20+ messages in thread From: Jason Gerecke @ 2016-01-21 1:28 UTC (permalink / raw) To: Jiri Kosina Cc: Jiri Kosina, Ping Cheng, linux-usb, Linux Input, linux-kernel, joshc, Ioan-Adrian Ratiu Jason --- Now instead of four in the eights place / you’ve got three, ‘Cause you added one / (That is to say, eight) to the two, / But you can’t take seven from three, / So you look at the sixty-fours.... On Sun, Nov 29, 2015 at 2:29 AM, Ioan-Adrian Ratiu <adi@adirat.com> wrote: > On Fri, 20 Nov 2015 22:19:02 +0200 > Ioan-Adrian Ratiu <adi@adirat.com> wrote: > >> The critical section protected by usbhid->lock in hid_ctrl() is too >> big and because of this it causes a recursive deadlock. "Too big" means >> the case statement and the call to hid_input_report() do not need to be >> protected by the spinlock (no URB operations are done inside them). >> >> The deadlock happens because in certain rare cases drivers try to grab >> the lock while handling the ctrl irq which grabs the lock before them >> as described above. For example newer wacom tablets like 056a:033c try >> to reschedule proximity reads from wacom_intuos_schedule_prox_event() >> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report() >> which tries to grab the usbhid lock already held by hid_ctrl(). >> >> There are two ways to get out of this deadlock: >> 1. Make the drivers work "around" the ctrl critical region, in the >> wacom case for ex. by delaying the scheduling of the proximity read >> request itself to a workqueue. >> 2. Shrink the critical region so the usbhid lock protects only the >> instructions which modify usbhid state, calling hid_input_report() >> with the spinlock unlocked, allowing the device driver to grab the >> lock first, finish and then grab the lock afterwards in hid_ctrl(). >> >> This patch implements the 2nd solution. >> >> Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com> >> --- >> drivers/hid/usbhid/hid-core.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c >> index 36712e9..5dd426f 100644 >> --- a/drivers/hid/usbhid/hid-core.c >> +++ b/drivers/hid/usbhid/hid-core.c >> @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb) >> struct usbhid_device *usbhid = hid->driver_data; >> int unplug = 0, status = urb->status; >> >> - spin_lock(&usbhid->lock); >> - >> switch (status) { >> case 0: /* success */ >> if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN) >> @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb) >> hid_warn(urb->dev, "ctrl urb status %d received\n", status); >> } >> >> + spin_lock(&usbhid->lock); >> + >> if (unplug) { >> usbhid->ctrltail = usbhid->ctrlhead; >> } else { > > Hello again > > Can this please be merged in the 4.4? There are quite a few people who have > their tablets deadlock and don't know how to patch their kernels so are stuck > waiting for a new release. > > The severity of this issue is much bigger than I initially thought. Since I've > posted on the wacom mailing list that I have a fix for this deadlock I've > recived lots of email from people complaining of the same problem on a wide > range of tablets. > > Some of those people know how to patch a kernel, some found this patch on the > mailing list and tested it and confirmed that it works on the wacom mailing > list (you can verify the deadlock + fix on that mailing list). > > Thank you, > Adrian On Tue, Dec 1, 2015 at 4:00 PM, Ping Cheng <pinglinux@gmail.com> wrote: > Hi Jiri, > > Since wacom_intuos_schedule_prox_event() was introduced on March 5, > 2015, the call was first used in kernel 3.19. Shouldn't we back-port > this patch to other stable versions? > > Thank you. > > Ping I've been prompted to resurrect this question after fielding yet another question from a user encountering a locked-up system. We received a surprising number of reports of people encountering this issue in the past several weeks and that's only going to get worse as 4.4 works its way into distros. Integrating back even just to 4.4 instead of 3.19 would be appreciated since this issue has (so far) only ever been encountered with the new Intuos tablets that were added in 4.4... Jason --- Now instead of four in the eights place / you’ve got three, ‘Cause you added one / (That is to say, eight) to the two, / But you can’t take seven from three, / So you look at the sixty-fours.... -- To unsubscribe from this list: send the line "unsubscribe linux-input" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock @ 2016-01-21 9:36 ` Jiri Kosina 0 siblings, 0 replies; 20+ messages in thread From: Jiri Kosina @ 2016-01-21 9:36 UTC (permalink / raw) To: Jason Gerecke Cc: Ping Cheng, linux-usb, Linux Input, linux-kernel, joshc, Ioan-Adrian Ratiu On Wed, 20 Jan 2016, Jason Gerecke wrote: > I've been prompted to resurrect this question after fielding yet another > question from a user encountering a locked-up system. We received a > surprising number of reports of people encountering this issue in the > past several weeks and that's only going to get worse as 4.4 works its > way into distros. Integrating back even just to 4.4 instead of 3.19 > would be appreciated since this issue has (so far) only ever been > encountered with the new Intuos tablets that were added in 4.4... I am in favor of pushing this to -stable. As you guys seem to be the ones able to test it with your hardware, would you guys (Ioan-Adrian? Jason?) be willing to prepare the backport and submit it once it passes your testing? Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock @ 2016-01-21 9:36 ` Jiri Kosina 0 siblings, 0 replies; 20+ messages in thread From: Jiri Kosina @ 2016-01-21 9:36 UTC (permalink / raw) To: Jason Gerecke Cc: Ping Cheng, linux-usb-u79uwXL29TY76Z2rM5mHXA, Linux Input, linux-kernel-u79uwXL29TY76Z2rM5mHXA, joshc-acOepvfBmUk, Ioan-Adrian Ratiu On Wed, 20 Jan 2016, Jason Gerecke wrote: > I've been prompted to resurrect this question after fielding yet another > question from a user encountering a locked-up system. We received a > surprising number of reports of people encountering this issue in the > past several weeks and that's only going to get worse as 4.4 works its > way into distros. Integrating back even just to 4.4 instead of 3.19 > would be appreciated since this issue has (so far) only ever been > encountered with the new Intuos tablets that were added in 4.4... I am in favor of pushing this to -stable. As you guys seem to be the ones able to test it with your hardware, would you guys (Ioan-Adrian? Jason?) be willing to prepare the backport and submit it once it passes your testing? Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock 2015-11-20 20:19 ` [PATCH v2] " Ioan-Adrian Ratiu 2015-11-29 10:29 ` Ioan-Adrian Ratiu @ 2015-12-01 16:36 ` Jiri Kosina 2015-12-02 0:00 ` Ping Cheng 1 sibling, 1 reply; 20+ messages in thread From: Jiri Kosina @ 2015-12-01 16:36 UTC (permalink / raw) To: Ioan-Adrian Ratiu; +Cc: pinglinux, linux-usb, linux-input, linux-kernel, joshc On Fri, 20 Nov 2015, Ioan-Adrian Ratiu wrote: > The critical section protected by usbhid->lock in hid_ctrl() is too > big and because of this it causes a recursive deadlock. "Too big" means > the case statement and the call to hid_input_report() do not need to be > protected by the spinlock (no URB operations are done inside them). > > The deadlock happens because in certain rare cases drivers try to grab > the lock while handling the ctrl irq which grabs the lock before them > as described above. For example newer wacom tablets like 056a:033c try > to reschedule proximity reads from wacom_intuos_schedule_prox_event() > calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report() > which tries to grab the usbhid lock already held by hid_ctrl(). > > There are two ways to get out of this deadlock: > 1. Make the drivers work "around" the ctrl critical region, in the > wacom case for ex. by delaying the scheduling of the proximity read > request itself to a workqueue. > 2. Shrink the critical region so the usbhid lock protects only the > instructions which modify usbhid state, calling hid_input_report() > with the spinlock unlocked, allowing the device driver to grab the > lock first, finish and then grab the lock afterwards in hid_ctrl(). > > This patch implements the 2nd solution. > > Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com> Applied to for-4.5/core branch. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock 2015-12-01 16:36 ` Jiri Kosina @ 2015-12-02 0:00 ` Ping Cheng 0 siblings, 0 replies; 20+ messages in thread From: Ping Cheng @ 2015-12-02 0:00 UTC (permalink / raw) To: Jiri Kosina Cc: Ioan-Adrian Ratiu, linux-usb, linux-input, linux-kernel, joshc, Benjamin Tissoires Hi Jiri, On Tue, Dec 1, 2015 at 8:36 AM, Jiri Kosina <jikos@kernel.org> wrote: > On Fri, 20 Nov 2015, Ioan-Adrian Ratiu wrote: > >> The critical section protected by usbhid->lock in hid_ctrl() is too >> big and because of this it causes a recursive deadlock. "Too big" means >> the case statement and the call to hid_input_report() do not need to be >> protected by the spinlock (no URB operations are done inside them). >> >> The deadlock happens because in certain rare cases drivers try to grab >> the lock while handling the ctrl irq which grabs the lock before them >> as described above. For example newer wacom tablets like 056a:033c try >> to reschedule proximity reads from wacom_intuos_schedule_prox_event() >> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report() >> which tries to grab the usbhid lock already held by hid_ctrl(). >> >> There are two ways to get out of this deadlock: >> 1. Make the drivers work "around" the ctrl critical region, in the >> wacom case for ex. by delaying the scheduling of the proximity read >> request itself to a workqueue. >> 2. Shrink the critical region so the usbhid lock protects only the >> instructions which modify usbhid state, calling hid_input_report() >> with the spinlock unlocked, allowing the device driver to grab the >> lock first, finish and then grab the lock afterwards in hid_ctrl(). >> >> This patch implements the 2nd solution. >> >> Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com> > > Applied to for-4.5/core branch. Thanks, Since wacom_intuos_schedule_prox_event() was introduced on March 5, 2015, the call was first used in kernel 3.19. Shouldn't we back-port this patch to other stable versions? Thank you. Ping ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2016-01-21 9:36 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-11-18 19:25 [PATCH] hid: usbhid: hid-core: fix recursive deadlock Ioan-Adrian Ratiu 2015-11-18 20:37 ` Jiri Kosina 2015-11-18 21:05 ` Ioan-Adrian Ratiu 2015-11-18 21:05 ` Ioan-Adrian Ratiu 2015-11-18 23:58 ` Josh Cartwright 2015-11-18 23:58 ` Josh Cartwright 2015-11-19 6:47 ` Ioan-Adrian Ratiu 2015-11-19 9:10 ` Jiri Kosina 2015-11-19 16:33 ` Ioan-Adrian Ratiu 2015-11-19 21:34 ` Jiri Kosina 2015-11-20 20:08 ` Ioan-Adrian Ratiu 2015-11-19 8:56 ` Jiri Kosina 2015-11-20 20:19 ` [PATCH v2] " Ioan-Adrian Ratiu 2015-11-29 10:29 ` Ioan-Adrian Ratiu 2016-01-21 1:28 ` Jason Gerecke 2016-01-21 1:28 ` Jason Gerecke 2016-01-21 9:36 ` Jiri Kosina 2016-01-21 9:36 ` Jiri Kosina 2015-12-01 16:36 ` Jiri Kosina 2015-12-02 0:00 ` Ping Cheng
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.