From: Jonathan Liu <net147@gmail.com>
To: Jeffy Chen <jeffy.chen@rock-chips.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	briannorris@chromium.org, stern@rowland.harvard.edu,
	mka@chromium.org, dianders@chromium.org,
	AMAN DEEP <aman.deep@samsung.com>,
	stable@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-usb@vger.kernel.org
Subject: Re: [v6] usb: ohci: Proper handling of ed_rm_list to handle race condition between usb_kill_urb() and finish_unlinks()
Date: Sun, 25 Mar 2018 12:21:48 +1100
Message-ID: <CANwerB1BBnQhGoCrna48dtwb36iqC1UcsYJ-1Xnk6K0ce2OiGg@mail.gmail.com>
In-Reply-To: <20180208035501.10711-1-jeffy.chen@rock-chips.com>

Hi,

On 8 February 2018 at 14:55, Jeffy Chen <jeffy.chen@rock-chips.com> wrote:
> From: AMAN DEEP <aman.deep@samsung.com>
>
> There is a race condition between finish_unlinks->finish_urb() function
> and usb_kill_urb() in ohci controller case. The finish_urb calls
> spin_unlock(&ohci->lock) before usb_hcd_giveback_urb() function call,
> then if during this time, usb_kill_urb is called for another endpoint,
> then new ed will be added to ed_rm_list at beginning for unlink, and
> ed_rm_list will point to newly added.
>
> When finish_urb() is completed in finish_unlinks() and ed->td_list
> becomes empty as in below code (in finish_unlinks() function):
>
>         if (list_empty(&ed->td_list)) {
>                 *last = ed->ed_next;
>                 ed->ed_next = NULL;
>         } else if (ohci->rh_state == OHCI_RH_RUNNING) {
>                 *last = ed->ed_next;
>                 ed->ed_next = NULL;
>                 ed_schedule(ohci, ed);
>         }
>
> The *last = ed->ed_next will make ed_rm_list to point to ed->ed_next
> and previously added ed by usb_kill_urb will be left unreferenced by
> ed_rm_list. This causes usb_kill_urb() hang forever waiting for
> finish_unlink to remove added ed from ed_rm_list.
>
> The main reason for hang in this race condition is addition and removal
> of ed from ed_rm_list in the beginning during usb_kill_urb and later
> *last is modified in finish_unlinks().
>
> As suggested by Alan Stern, the solution for proper handling of
> ohci->ed_rm_list is to remove ed from the ed_rm_list before finishing
> any URBs. Then at the end, we can add ed back to the list if necessary.
>
> This properly handle the updated ohci->ed_rm_list in usb_kill_urb().
>
> Fixes:977dcfdc6031("USB:OHCI:don't lose track of EDs when a controller dies")
> Acked-by: Alan Stern <stern@rowland.harvard.edu>
> CC: <stable@vger.kernel.org>
> Signed-off-by: Aman Deep <aman.deep@samsung.com>
> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
> ---
>
> Changes in v6:
> This is a resend of Aman Deep's v5 patch [0], which solved the hang we
> hit [1]. (Thanks Aman :)
>
> The v5 has some format issues, so i slightly adjust the commit message.
>
> [0] https://www.spinics.net/lists/linux-usb/msg129010.html
> [1] https://bugs.chromium.org/p/chromium/issues/detail?id=803749
>
>  drivers/usb/host/ohci-q.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/usb/host/ohci-q.c b/drivers/usb/host/ohci-q.c
> index b2ec8c399363..4ccb85a67bb3 100644
> --- a/drivers/usb/host/ohci-q.c
> +++ b/drivers/usb/host/ohci-q.c
> @@ -1019,6 +1019,8 @@ static void finish_unlinks(struct ohci_hcd *ohci)
>                  * have modified this list.  normally it's just prepending
>                  * entries (which we'd ignore), but paranoia won't hurt.
>                  */
> +               *last = ed->ed_next;
> +               ed->ed_next = NULL;
>                 modified = 0;
>
>                 /* unlink urbs as requested, but rescan the list after
> @@ -1077,21 +1079,22 @@ static void finish_unlinks(struct ohci_hcd *ohci)
>                         goto rescan_this;
>
>                 /*
> -                * If no TDs are queued, take ED off the ed_rm_list.
> +                * If no TDs are queued, ED is now idle.
>                  * Otherwise, if the HC is running, reschedule.
> -                * If not, leave it on the list for further dequeues.
> +                * If the HC isn't running, add ED back to the
> +                * start of the list for later processing.
>                  */
>                 if (list_empty(&ed->td_list)) {
> -                       *last = ed->ed_next;
> -                       ed->ed_next = NULL;
>                         ed->state = ED_IDLE;
>                         list_del(&ed->in_use_list);
>                 } else if (ohci->rh_state == OHCI_RH_RUNNING) {
> -                       *last = ed->ed_next;
> -                       ed->ed_next = NULL;
>                         ed_schedule(ohci, ed);
>                 } else {
> -                       last = &ed->ed_next;
> +                       ed->ed_next = ohci->ed_rm_list;
> +                       ohci->ed_rm_list = ed;
> +                       /* Don't loop on the same ED */
> +                       if (last == &ohci->ed_rm_list)
> +                               last = &ed->ed_next;
>                 }
>
>                 if (modified)

I am experiencing a hang in a USB call from userspace with OHCI
(full-speed USB device) after updating from Linux 4.14.15 to 4.14.24,
and noticed this commit.

Here is the Linux 4.14.24 kernel stack trace (extracted from SysRq+w
and amended with addr2line):
[<c05cbd64>] (__schedule) from [<c05cc148>] (schedule+0x50/0xb4)
    kernel/sched/core.c:2792
[<c05cc148>] (schedule) from [<c042789c>] (usb_kill_urb.part.3+0x78/0xa8)
    include/asm-generic/preempt.h:59
[<c042789c>] (usb_kill_urb.part.3) from [<c0433a3c>] (usbdev_ioctl+0x1288/0x1cf0)
    drivers/usb/core/urb.c:690
[<c0433a3c>] (usbdev_ioctl) from [<c0217db8>] (do_vfs_ioctl+0x9c/0x8ec)
    drivers/usb/core/devio.c:1835
[<c0217db8>] (do_vfs_ioctl) from [<c021863c>] (SyS_ioctl+0x34/0x5c)
    fs/ioctl.c:47
[<c021863c>] (SyS_ioctl) from [<c0107820>] (ret_fast_syscall+0x0/0x54)
    include/linux/file.h:39

Afterwards the kernel does not respond to disconnecting and reconnecting
the full-speed USB device, but I can connect and disconnect a high-speed
USB device on the same port and communicate with it without problems,
since it uses EHCI (OHCI is the companion controller). If I connect the
full-speed USB device again, it is still unresponsive. The userspace
application remains hung throughout.

Could this commit be causing the issue?

Thanks.

Regards,
Jonathan


