From: Oliver Neukum <oneukum@suse.com>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Alan Stern <stern@rowland.harvard.edu>
Cc: Andrey Konovalov <andreyknvl@google.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Colin Ian King <colin.king@canonical.com>,
	Arnd Bergmann <arnd@arndb.de>,
	USB list <linux-usb@vger.kernel.org>,
	syzbot <syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Subject: Re: [PATCH] USB: cdc-wdm: Call wake_up_all() when clearing WDM_IN_USE bit.
Date: Thu, 25 Jun 2020 11:56:08 +0200
Message-ID: <1593078968.28236.15.camel@suse.com>
In-Reply-To: <0c43caf8-1135-1d38-cb57-9c0f84c4394d@i-love.sakura.ne.jp>

[-- Attachment #1: Type: text/plain, Size: 3571 bytes --]

On Monday, 08.06.2020, at 11:24 +0900, Tetsuo Handa wrote:

Hi,

sorry for the late reply. I had an emergency to take
care of.

> On 2020/05/31 0:47, Alan Stern wrote:
> > On Sat, May 30, 2020 at 05:25:11PM +0200, Oliver Neukum wrote:
> > > On Thursday, 28.05.2020, at 16:58 -0400, Alan Stern wrote:

> > > > This sounds like a bug in the driver.  What would it do if someone had a 
> > > 
> > > Arguably yes. I will introduce a timeout. Unfortunately flush()
> > > requires a non-interruptible sleep, as you cannot sanely return EAGAIN.
> > 
> > But maybe you can kill some URBs and drop some data.
> 
> You mean call usb_kill_urb() via kill_urbs() ?

I have to correct myself. We can return -EINTR.
But ultimately that is no solution. We would not hang,
but we could not close the fd either.

> As far as I tested, it seems that usb_kill_urb() sometimes fails to call
> wdm_out_callback() despite the comment for usb_kill_urb() says
> 
>  * This routine cancels an in-progress request.  It is guaranteed that
>  * upon return all completion handlers will have finished and the URB
>  * will be totally idle and available for reuse.  These features make
>  * this an ideal way to stop I/O in a disconnect() callback or close()
>  * function.  If the request has not already finished or been unlinked
>  * the completion handler will see urb->status == -ENOENT.

It looks like it does exactly what the description says. Cancelling
a URB is by necessity racy. The URB can always complete
before you manage to kill it.

> . Is something still wrong? Or just replacing
> 
> 		BUG_ON(test_bit(WDM_IN_USE, &desc->flags) &&
> 		       !test_bit(WDM_DISCONNECTING, &desc->flags));
> 
> with
> 
> 		wait_event(desc->wait, !test_bit(WDM_IN_USE, &desc->flags) ||
> 			   test_bit(WDM_DISCONNECTING, &desc->flags));
> 
> in the patch shown below is sufficient?
> 
> diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
> index e3db6fbeadef..3e92e79ce0a0 100644
> --- a/drivers/usb/class/cdc-wdm.c
> +++ b/drivers/usb/class/cdc-wdm.c
> @@ -151,7 +151,7 @@ static void wdm_out_callback(struct urb *urb)
>  	kfree(desc->outbuf);
>  	desc->outbuf = NULL;
>  	clear_bit(WDM_IN_USE, &desc->flags);
> -	wake_up(&desc->wait);
> +	wake_up_all(&desc->wait);
>  }
>  
>  static void wdm_in_callback(struct urb *urb)
> @@ -424,6 +424,7 @@ static ssize_t wdm_write
>  	if (rv < 0) {
>  		desc->outbuf = NULL;
>  		clear_bit(WDM_IN_USE, &desc->flags);
> +		wake_up_all(&desc->wait);
>  		dev_err(&desc->intf->dev, "Tx URB error: %d\n", rv);
>  		rv = usb_translate_errors(rv);
>  		goto out_free_mem_pm;
> @@ -587,15 +588,16 @@ static int wdm_flush(struct file *file, fl_owner_t id)
>  {
>  	struct wdm_device *desc = file->private_data;
>  
> -	wait_event(desc->wait,
> -			/*
> -			 * needs both flags. We cannot do with one
> -			 * because resetting it would cause a race
> -			 * with write() yet we need to signal
> -			 * a disconnect
> -			 */
> -			!test_bit(WDM_IN_USE, &desc->flags) ||
> -			test_bit(WDM_DISCONNECTING, &desc->flags));
> +	/*
> +	 * needs both flags. We cannot do with one because resetting it would
> +	 * cause a race with write() yet we need to signal a disconnect
> +	 */
> +	if (!wait_event_timeout(desc->wait, !test_bit(WDM_IN_USE, &desc->flags) ||
> +				test_bit(WDM_DISCONNECTING, &desc->flags), 20 * HZ)) {
> +		kill_urbs(desc);

No. We cannot kill all URBs just because the owner of one fd wants to
flush.

In fact we have multiple code paths that can reach the same hang.
Could you test the attached patches?
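
As a reading aid for the first patch, the order of the checks after the
wait in wdm_flush() can be written as a pure function. This is only a
userspace sketch and not part of the patches: flush_result() and its
parameters are hypothetical names, and usb_translate_errors() is left
out for brevity. It relies on wait_event_interruptible_timeout()
returning 0 on timeout, a negative value when a signal arrives, and a
positive value when the condition became true.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not in the patch: maps the outcome of the wait in
 * wdm_flush() to the value flush() would return.
 *
 * rv            result of wait_event_interruptible_timeout():
 *               0 = timed out, < 0 = interrupted by a signal,
 *               > 0 = condition became true
 * disconnecting stands in for test_bit(WDM_DISCONNECTING, &desc->flags)
 * werr          stands in for desc->werr (0 or a negative errno)
 */
static int flush_result(long rv, bool disconnecting, int werr)
{
	if (disconnecting)	/* checked first: desc->intf may be gone */
		return -ENODEV;
	if (rv == 0)		/* timeout expired with I/O still pending */
		return -EIO;
	if (rv < 0)		/* a signal ended the sleep early */
		return -EINTR;
	/* simplification: the patch passes werr through usb_translate_errors() */
	return werr;
}
```

The disconnect test comes before the timeout test on purpose: after a
disconnect the caller gets -ENODEV even if the wait also timed out.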

	Regards
		Oliver

[-- Attachment #2: 0001-CDC-WDM-fix-hangs-in-flush.patch --]
[-- Type: text/x-patch, Size: 3137 bytes --]

From 27cd2e25b37af973b61b77217fa2dad822889ff8 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum@suse.com>
Date: Wed, 24 Jun 2020 10:52:03 +0200
Subject: [PATCH 1/2] CDC-WDM: fix hangs in flush()

When flushing, a task must wait for at most a bounded time, as a hardware
failure could mean eternal sleep. So an arbitrary timeout is introduced.
Simply making the syscall interruptible will not do the job,
as while the syscall would not hang, the fd would be unclosable.

In addition a flush() and a write() may be waiting for the same
IO to complete. Hence completion of output must use wake_up_all(),
even in error handling.

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
---
 drivers/usb/class/cdc-wdm.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index e3db6fbeadef..ec5412773c57 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -58,6 +58,9 @@ MODULE_DEVICE_TABLE (usb, wdm_ids);
 
 #define WDM_MAX			16
 
+/* flush() needs to be uninterruptible, but we cannot wait forever */
+#define WDM_FLUSH_TIMEOUT	(30 * HZ)
+
 /* CDC-WMC r1.1 requires wMaxCommand to be "at least 256 decimal (0x100)" */
 #define WDM_DEFAULT_BUFSIZE	256
 
@@ -151,7 +154,7 @@ static void wdm_out_callback(struct urb *urb)
 	kfree(desc->outbuf);
 	desc->outbuf = NULL;
 	clear_bit(WDM_IN_USE, &desc->flags);
-	wake_up(&desc->wait);
+	wake_up_all(&desc->wait);
 }
 
 static void wdm_in_callback(struct urb *urb)
@@ -424,6 +427,7 @@ static ssize_t wdm_write
 	if (rv < 0) {
 		desc->outbuf = NULL;
 		clear_bit(WDM_IN_USE, &desc->flags);
+		wake_up_all(&desc->wait); /* for flush() */
 		dev_err(&desc->intf->dev, "Tx URB error: %d\n", rv);
 		rv = usb_translate_errors(rv);
 		goto out_free_mem_pm;
@@ -586,8 +590,9 @@ static ssize_t wdm_read
 static int wdm_flush(struct file *file, fl_owner_t id)
 {
 	struct wdm_device *desc = file->private_data;
+	int rv;
 
-	wait_event(desc->wait,
+	rv = wait_event_interruptible_timeout(desc->wait,
 			/*
 			 * needs both flags. We cannot do with one
 			 * because resetting it would cause a race
@@ -595,11 +600,16 @@ static int wdm_flush(struct file *file, fl_owner_t id)
 			 * a disconnect
 			 */
 			!test_bit(WDM_IN_USE, &desc->flags) ||
-			test_bit(WDM_DISCONNECTING, &desc->flags));
+			test_bit(WDM_DISCONNECTING, &desc->flags),
+			WDM_FLUSH_TIMEOUT);
 
 	/* cannot dereference desc->intf if WDM_DISCONNECTING */
 	if (test_bit(WDM_DISCONNECTING, &desc->flags))
 		return -ENODEV;
+	if (!rv)
+		return -EIO;
+	if (rv < 0)
+		return -EINTR;
 	if (desc->werr < 0)
 		dev_err(&desc->intf->dev, "Error in flush path: %d\n",
 			desc->werr);
@@ -656,6 +666,14 @@ static int wdm_open(struct inode *inode, struct file *file)
 		goto out;
 	}
 
+	/*
+	 * in case flush() had timed out
+	 */
+	usb_kill_urb(desc->command);
+	spin_lock_irq(&desc->iuspin);
+	desc->werr = 0;
+	spin_unlock_irq(&desc->iuspin);
+
 	/* using write lock to protect desc->count */
 	mutex_lock(&desc->wlock);
 	if (!desc->count++) {
-- 
2.16.4


[-- Attachment #3: 0002-CDC-WDM-fix-race-reporting-errors-in-flush.patch --]
[-- Type: text/x-patch, Size: 1252 bytes --]

From d588b8034b734ecce0575ae1110d3ab5a386e049 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum@suse.com>
Date: Thu, 25 Jun 2020 11:53:54 +0200
Subject: [PATCH 2/2] CDC-WDM: fix race reporting errors in flush

In case a race was lost and multiple fds used,
an error could be reported multiple times. To fix
this a spinlock must be taken.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
---
 drivers/usb/class/cdc-wdm.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index ec5412773c57..e9e8277a0c69 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -610,11 +610,16 @@ static int wdm_flush(struct file *file, fl_owner_t id)
 		return -EIO;
 	if (rv < 0)
 		return -EINTR;
-	if (desc->werr < 0)
-		dev_err(&desc->intf->dev, "Error in flush path: %d\n",
-			desc->werr);
 
-	return usb_translate_errors(desc->werr);
+	spin_lock_irq(&desc->iuspin);
+	rv = desc->werr;
+	desc->werr = 0;
+	spin_unlock_irq(&desc->iuspin);
+
+	if (rv < 0)
+		dev_err(&desc->intf->dev, "Error in flush path: %d\n", rv);
+
+	return usb_translate_errors(rv);
 }
 
 static __poll_t wdm_poll(struct file *file, struct poll_table_struct *wait)
-- 
2.16.4
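
The second patch reads and clears desc->werr under a lock so that only
one caller sees a given error. A userspace sketch of that read-and-clear
pattern follows. It is an analogy under stated assumptions, not driver
code: struct fake_desc and take_write_error() are hypothetical names, and
a pthread mutex stands in for the iuspin spinlock.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in for the relevant part of struct wdm_device. */
struct fake_desc {
	pthread_mutex_t lock;	/* plays the role of desc->iuspin */
	int werr;		/* last write error, 0 if none */
};

/*
 * Fetch the pending error and reset it in one critical section.
 * A second caller racing with the first finds werr already cleared,
 * so each error is handed out exactly once.
 */
static int take_write_error(struct fake_desc *d)
{
	int err;

	pthread_mutex_lock(&d->lock);
	err = d->werr;
	d->werr = 0;
	pthread_mutex_unlock(&d->lock);

	return err;
}
```

Without the lock, two flushers could both read the same nonzero werr
before either resets it, which is the duplicate report the patch
description mentions.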

