From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Oliver Neukum <oneukum@suse.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Bjorn Mork <bjorn@mork.no>
Cc: Andrey Konovalov <andreyknvl@google.com>,
Alan Stern <stern@rowland.harvard.edu>,
Colin Ian King <colin.king@canonical.com>,
Arnd Bergmann <arnd@arndb.de>,
USB list <linux-usb@vger.kernel.org>,
syzbot <syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com>,
syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH] USB: cdc-wdm: Make wdm_flush() interruptible and add wdm_fsync().
Date: Fri, 25 Sep 2020 00:09:41 +0900
Message-ID: <b27841ab-a88c-13e2-a66f-6df7af1f46b4@i-love.sakura.ne.jp>
In-Reply-To: <c7e9ab26-7a6b-7adb-0208-cf946a97b7d8@i-love.sakura.ne.jp>
Based on https://lkml.kernel.org/r/20200923092136.14824-1-oneukum@suse.com from Oliver Neukum <oneukum@suse.com>:
syzbot is reporting a hung task at wdm_flush() [1], because there is a
circular dependency: wdm_flush(), called from filp_close() for
/dev/cdc-wdm0, waits forever for /dev/raw-gadget to be closed, while close()
for /dev/raw-gadget cannot be called until close() for /dev/cdc-wdm0 completes.
Tetsuo Handa considered that such a circular dependency is a usage error [2]
which corresponds to unresponsive broken hardware [3]. But Alan Stern
responded that we should be prepared for such hardware [4]. Therefore,
this patch changes wdm_flush() to use wait_event_interruptible_timeout(),
which gives up after 30 seconds, because hardware that remains silent must
be ignored. The 30-second value is arbitrary.
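As a rough illustration of the policy described above (not the driver code
itself), the mapping from the wait result to the value returned to the
caller can be modelled in plain userspace C. The function name
map_wait_result() and its parameters are hypothetical; the real code
additionally passes the stored error through usb_translate_errors(), which
this sketch omits.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the return-value policy.
 * wait_rv mimics wait_event_interruptible_timeout():
 *   0  -> timeout expired while the condition was still false
 *   <0 -> the sleep was interrupted by a signal
 *   >0 -> the condition became true in time
 */
int map_wait_result(long wait_rv, bool disconnecting, int pending_werr)
{
	if (disconnecting)
		return -ENODEV;	/* device is gone; this takes priority */
	if (wait_rv == 0)
		return -EIO;	/* hardware stayed silent until the timeout */
	if (wait_rv < 0)
		return -EINTR;	/* caller was interrupted by a signal */
	return pending_werr;	/* 0 on success, or the stored write error */
}
```

Checking the disconnect flag first means a device that disappears during
the wait is reported as -ENODEV even if the timeout also expired.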
Changing wait_event() to wait_event_interruptible_timeout() makes error
reporting from the close() syscall less reliable. To compensate, this patch
also implements wdm_fsync(), which does not use a timeout. Those who want to
be very sure that data has gone out to the device are now advised to call
fsync(), with the caveat that fsync() can return -EINVAL when running on
older kernels which do not implement wdm_fsync().
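From the application side, the advice above might look like the following
minimal sketch. The names flush_status, classify_fsync() and flush_wdm() are
made up for illustration, and the caller is assumed to have already opened
the cdc-wdm character device and written its command to it.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum flush_status { FLUSH_OK, FLUSH_UNSUPPORTED, FLUSH_FAILED };

/* Interpret the return value and errno of an fsync() call. */
enum flush_status classify_fsync(int ret, int err)
{
	if (ret == 0)
		return FLUSH_OK;
	if (err == EINVAL)
		return FLUSH_UNSUPPORTED;	/* older kernel without wdm_fsync() */
	return FLUSH_FAILED;			/* e.g. EIO, ENODEV, EINTR */
}

/* Wait until pending data has gone out to the device, if supported. */
enum flush_status flush_wdm(int fd)
{
	int ret = fsync(fd);

	return classify_fsync(ret, ret < 0 ? errno : 0);
}
```

A caller would typically treat FLUSH_UNSUPPORTED as "no confirmation is
available on this kernel" rather than as a fatal error.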
This patch also fixes three more problems (listed below) found during
exhaustive discussion and testing.
Since multiple threads can concurrently call wdm_write()/wdm_flush(),
we need to use wake_up_all() whenever clearing WDM_IN_USE in order to
make sure that all waiters are woken up. Also, error reporting needs
to use a fetch-and-clear approach so that the same error is not reported
multiple times.
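The fetch-and-clear idea can be sketched in userspace with C11 atomics. This
is only an analogy: the driver protects desc->werr with a spinlock rather
than an atomic exchange, and the names pending_werr, record_error() and
fetch_and_clear_error() are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for desc->werr; zero means "no pending error". */
static atomic_int pending_werr;

/* Completion path: remember the most recent write error. */
void record_error(int err)
{
	atomic_store(&pending_werr, err);
}

/*
 * Reporting path: read the error and reset it in one atomic step, so
 * that only one of several concurrent callers ever sees a given error.
 */
int fetch_and_clear_error(void)
{
	return atomic_exchange(&pending_werr, 0);
}
```

With a separate read followed by a clear, two callers could both observe
the same non-zero value before either resets it; the single exchange closes
that window.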
Since wdm_flush() checks WDM_DISCONNECTING, wdm_write() should check
WDM_DISCONNECTING as well.
In wdm_flush(), since locks are not held, it is not safe to dereference
desc->intf after checking that WDM_DISCONNECTING is not set [5]. Thus,
remove dev_err() from wdm_flush().
[1] https://syzkaller.appspot.com/bug?id=e7b761593b23eb50855b9ea31e3be5472b711186
[2] https://lkml.kernel.org/r/27b7545e-8f41-10b8-7c02-e35a08eb1611@i-love.sakura.ne.jp
[3] https://lkml.kernel.org/r/79ba410f-e0ef-2465-b94f-6b9a4a82adf5@i-love.sakura.ne.jp
[4] https://lkml.kernel.org/r/20200530011040.GB12419@rowland.harvard.edu
[5] https://lkml.kernel.org/r/c85331fc-874c-6e46-a77f-0ef1dc075308@i-love.sakura.ne.jp
Reported-by: syzbot <syzbot+854768b99f19e89d7f81@syzkaller.appspotmail.com>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Debugged-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Alan Stern <stern@rowland.harvard.edu>
---
drivers/usb/class/cdc-wdm.c | 72 ++++++++++++++++++++++++++++---------
1 file changed, 55 insertions(+), 17 deletions(-)
diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index 7f5de956a2fc..02d0cfd23bb2 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -58,6 +58,9 @@ MODULE_DEVICE_TABLE (usb, wdm_ids);
#define WDM_MAX 16
+/* we cannot wait forever at flush() */
+#define WDM_FLUSH_TIMEOUT (30 * HZ)
+
/* CDC-WMC r1.1 requires wMaxCommand to be "at least 256 decimal (0x100)" */
#define WDM_DEFAULT_BUFSIZE 256
@@ -151,7 +154,7 @@ static void wdm_out_callback(struct urb *urb)
kfree(desc->outbuf);
desc->outbuf = NULL;
clear_bit(WDM_IN_USE, &desc->flags);
- wake_up(&desc->wait);
+ wake_up_all(&desc->wait);
}
static void wdm_in_callback(struct urb *urb)
@@ -393,6 +396,9 @@ static ssize_t wdm_write
if (test_bit(WDM_RESETTING, &desc->flags))
r = -EIO;
+ if (test_bit(WDM_DISCONNECTING, &desc->flags))
+ r = -ENODEV;
+
if (r < 0) {
rv = r;
goto out_free_mem_pm;
@@ -424,6 +430,7 @@ static ssize_t wdm_write
if (rv < 0) {
desc->outbuf = NULL;
clear_bit(WDM_IN_USE, &desc->flags);
+ wake_up_all(&desc->wait); /* for wdm_wait_for_response() */
dev_err(&desc->intf->dev, "Tx URB error: %d\n", rv);
rv = usb_translate_errors(rv);
goto out_free_mem_pm;
@@ -583,28 +590,58 @@ static ssize_t wdm_read
return rv;
}
-static int wdm_flush(struct file *file, fl_owner_t id)
+static int wdm_wait_for_response(struct file *file, long timeout)
{
struct wdm_device *desc = file->private_data;
+ long rv; /* Use long here because (int) MAX_SCHEDULE_TIMEOUT < 0. */
+
+ /*
+ * Needs both flags. We cannot do with one because resetting it would
+ * cause a race with write() yet we need to signal a disconnect.
+ */
+ rv = wait_event_interruptible_timeout(desc->wait,
+ !test_bit(WDM_IN_USE, &desc->flags) ||
+ test_bit(WDM_DISCONNECTING, &desc->flags),
+ timeout);
- wait_event(desc->wait,
- /*
- * needs both flags. We cannot do with one
- * because resetting it would cause a race
- * with write() yet we need to signal
- * a disconnect
- */
- !test_bit(WDM_IN_USE, &desc->flags) ||
- test_bit(WDM_DISCONNECTING, &desc->flags));
-
- /* cannot dereference desc->intf if WDM_DISCONNECTING */
+ /*
+ * To report the correct error. This is best effort.
+ * We are inevitably racing with the hardware.
+ */
if (test_bit(WDM_DISCONNECTING, &desc->flags))
return -ENODEV;
- if (desc->werr < 0)
- dev_err(&desc->intf->dev, "Error in flush path: %d\n",
- desc->werr);
+ if (!rv)
+ return -EIO;
+ if (rv < 0)
+ return -EINTR;
+
+ spin_lock_irq(&desc->iuspin);
+ rv = desc->werr;
+ desc->werr = 0;
+ spin_unlock_irq(&desc->iuspin);
+
+ return usb_translate_errors(rv);
+
+}
+
+/*
+ * You need to send a signal when you react to malicious or defective hardware.
+ * Also, don't abort when fsync() returned -EINVAL, for older kernels which do
+ * not implement wdm_fsync() will return -EINVAL.
+ */
+static int wdm_fsync(struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return wdm_wait_for_response(file, MAX_SCHEDULE_TIMEOUT);
+}
- return usb_translate_errors(desc->werr);
+/*
+ * Same with wdm_fsync(), except it uses finite timeout in order to react to
+ * malicious or defective hardware which ceased communication after close() was
+ * implicitly called due to process termination.
+ */
+static int wdm_flush(struct file *file, fl_owner_t id)
+{
+ return wdm_wait_for_response(file, WDM_FLUSH_TIMEOUT);
}
static __poll_t wdm_poll(struct file *file, struct poll_table_struct *wait)
@@ -729,6 +766,7 @@ static const struct file_operations wdm_fops = {
.owner = THIS_MODULE,
.read = wdm_read,
.write = wdm_write,
+ .fsync = wdm_fsync,
.open = wdm_open,
.flush = wdm_flush,
.release = wdm_release,
--
2.25.1