All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <greg@kroah.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com, USB list <linux-usb@vger.kernel.org>
Subject: USB: core: Fix bug caused by duplicate interface PM usage counter
Date: Fri, 19 Apr 2019 21:17:35 +0200	[thread overview]
Message-ID: <20190419191735.GA5970@kroah.com> (raw)

On Fri, Apr 19, 2019 at 01:52:38PM -0400, Alan Stern wrote:
> The syzkaller fuzzer reported a bug in the USB hub driver which turned
> out to be caused by a negative runtime-PM usage counter.  This allowed
> a hub to be runtime suspended at a time when the driver did not expect
> it.  The symptom is a WARNING issued because the hub's status URB is
> submitted while it is already active:
> 
> 	URB 0000000031fb463e submitted while active
> 	WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363  
> 
> The negative runtime-PM usage count was caused by an unfortunate
> design decision made when runtime PM was first implemented for USB.
> At that time, USB class drivers were allowed to unbind from their
> interfaces without balancing the usage counter (i.e., leaving it with
> a positive count).  The core code would take care of setting the
> counter back to 0 before allowing another driver to bind to the
> interface.
> 
> Later on when runtime PM was implemented for the entire kernel, the
> opposite decision was made: Drivers were required to balance their
> runtime-PM get and put calls.  In order to maintain backward
> compatibility, however, the USB subsystem adapted to the new
> implementation by keeping an independent usage counter for each
> interface and using it to automatically adjust the normal usage
> counter back to 0 whenever a driver was unbound.
> 
> This approach involves duplicating information, but what is worse, it
> doesn't work properly in cases where a USB class driver delays
> decrementing the usage counter until after the driver's disconnect()
> routine has returned and the counter has been adjusted back to 0.
> Doing so would cause the usage counter to become negative.  There's
> even a warning about this in the USB power management documentation!
> 
> As it happens, this is exactly what the hub driver does.  The
> kick_hub_wq() routine increments the runtime-PM usage counter, and the
> corresponding decrement is carried out by hub_event() in the context
> of the hub_wq work-queue thread.  This work routine may sometimes run
> after the driver has been unbound from its interface, and when it does
> it causes the usage counter to go negative.
> 
> It is not possible for hub_disconnect() to wait for a pending
> hub_event() call to finish, because hub_disconnect() is called with
> the device lock held and hub_event() acquires that lock.  The only
> feasible fix is to reverse the original design decision: remove the
> duplicate interface-specific usage counter and require USB drivers to
> balance their runtime PM gets and puts.  As far as I know, all
> existing drivers currently do this.

Nice work, that took a lot of debugging, thanks for resolving this.

greg k-h

WARNING: multiple messages have this Message-ID (diff)
From: Greg KH <greg@kroah.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com, USB list <linux-usb@vger.kernel.org>
Subject: Re: [PATCH] USB: core: Fix bug caused by duplicate interface PM usage counter
Date: Fri, 19 Apr 2019 21:17:35 +0200	[thread overview]
Message-ID: <20190419191735.GA5970@kroah.com> (raw)
Message-ID: <20190419191735.hwqGJHZT_IlTPl2vAuy0wpUjXQdFNJFJzJVvMt8_oxU@z> (raw)
In-Reply-To: <Pine.LNX.4.44L0.1904191348580.1406-100000@iolanthe.rowland.org>

On Fri, Apr 19, 2019 at 01:52:38PM -0400, Alan Stern wrote:
> The syzkaller fuzzer reported a bug in the USB hub driver which turned
> out to be caused by a negative runtime-PM usage counter.  This allowed
> a hub to be runtime suspended at a time when the driver did not expect
> it.  The symptom is a WARNING issued because the hub's status URB is
> submitted while it is already active:
> 
> 	URB 0000000031fb463e submitted while active
> 	WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363  
> 
> The negative runtime-PM usage count was caused by an unfortunate
> design decision made when runtime PM was first implemented for USB.
> At that time, USB class drivers were allowed to unbind from their
> interfaces without balancing the usage counter (i.e., leaving it with
> a positive count).  The core code would take care of setting the
> counter back to 0 before allowing another driver to bind to the
> interface.
> 
> Later on when runtime PM was implemented for the entire kernel, the
> opposite decision was made: Drivers were required to balance their
> runtime-PM get and put calls.  In order to maintain backward
> compatibility, however, the USB subsystem adapted to the new
> implementation by keeping an independent usage counter for each
> interface and using it to automatically adjust the normal usage
> counter back to 0 whenever a driver was unbound.
> 
> This approach involves duplicating information, but what is worse, it
> doesn't work properly in cases where a USB class driver delays
> decrementing the usage counter until after the driver's disconnect()
> routine has returned and the counter has been adjusted back to 0.
> Doing so would cause the usage counter to become negative.  There's
> even a warning about this in the USB power management documentation!
> 
> As it happens, this is exactly what the hub driver does.  The
> kick_hub_wq() routine increments the runtime-PM usage counter, and the
> corresponding decrement is carried out by hub_event() in the context
> of the hub_wq work-queue thread.  This work routine may sometimes run
> after the driver has been unbound from its interface, and when it does
> it causes the usage counter to go negative.
> 
> It is not possible for hub_disconnect() to wait for a pending
> hub_event() call to finish, because hub_disconnect() is called with
> the device lock held and hub_event() acquires that lock.  The only
> feasible fix is to reverse the original design decision: remove the
> duplicate interface-specific usage counter and require USB drivers to
> balance their runtime PM gets and puts.  As far as I know, all
> existing drivers currently do this.

Nice work, that took a lot of debugging, thanks for resolving this.

greg k-h

             reply	other threads:[~2019-04-19 19:17 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-19 19:17 Greg KH [this message]
2019-04-19 19:17 ` [PATCH] USB: core: Fix bug caused by duplicate interface PM usage counter Greg KH
  -- strict thread matches above, loose matches on Subject: below --
2019-04-19 17:52 Alan Stern
2019-04-19 17:52 ` [PATCH] " Alan Stern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190419191735.GA5970@kroah.com \
    --to=greg@kroah.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.