From: "Bjørn Mork" <bjorn@mork.no>
To: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Cc: Oliver Neukum <oneukum@suse.de>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] usbnet: Fix a race between usbnet_stop() and the BH
Date: Fri, 28 Aug 2015 10:55:26 +0200
Message-ID: <87mvxbzta9.fsf@nemi.mork.no>
In-Reply-To: <55E01750.4010202@rosalab.ru> (Eugene Shatokhin's message of "Fri, 28 Aug 2015 11:09:52 +0300")

Eugene Shatokhin <eugene.shatokhin@rosalab.ru> writes:

> 25.08.2015 00:01, Bjørn Mork wrote:
>> Eugene Shatokhin <eugene.shatokhin@rosalab.ru> writes:
>>
>>> The race may happen when a device (e.g. YOTA 4G LTE Modem) is
>>> unplugged while the system is downloading a large file from the Net.
>>>
>>> Hardware breakpoints and Kprobes with delays were used to confirm that
>>> the race does actually happen.
>>>
>>> The race is on skb_queue ('next' pointer) between usbnet_stop()
>>> and rx_complete(), which, in turn, calls usbnet_bh().
>>>
>>> Here is a part of the call stack with the code where the changes to the
>>> queue happen. The line numbers are for the kernel 4.1.0:
>>>
>>> *0 __skb_unlink (skbuff.h:1517)
>>>      prev->next = next;
>>> *1 defer_bh (usbnet.c:430)
>>>      spin_lock_irqsave(&list->lock, flags);
>>>      old_state = entry->state;
>>>      entry->state = state;
>>>      __skb_unlink(skb, list);
>>>      spin_unlock(&list->lock);
>>>      spin_lock(&dev->done.lock);
>>>      __skb_queue_tail(&dev->done, skb);
>>>      if (dev->done.qlen == 1)
>>>          tasklet_schedule(&dev->bh);
>>>      spin_unlock_irqrestore(&dev->done.lock, flags);
>>> *2 rx_complete (usbnet.c:640)
>>>      state = defer_bh(dev, skb, &dev->rxq, state);
>>>
>>> At the same time, the following code repeatedly checks if the queue is
>>> empty and reads these values concurrently with the above changes:
>>>
>>> *0  usbnet_terminate_urbs (usbnet.c:765)
>>>      /* maybe wait for deletions to finish. */
>>>      while (!skb_queue_empty(&dev->rxq)
>>>          && !skb_queue_empty(&dev->txq)
>>>          && !skb_queue_empty(&dev->done)) {
>>>              schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS));
>>>              set_current_state(TASK_UNINTERRUPTIBLE);
>>>              netif_dbg(dev, ifdown, dev->net,
>>>                    "waited for %d urb completions\n", temp);
>>>      }
>>> *1  usbnet_stop (usbnet.c:806)
>>>      if (!(info->flags & FLAG_AVOID_UNLINK_URBS))
>>>          usbnet_terminate_urbs(dev);
>>>
>>> As a result, it is possible, for example, that the skb is removed from
>>> dev->rxq by __skb_unlink() before the check
>>> "!skb_queue_empty(&dev->rxq)" in usbnet_terminate_urbs() is made. It is
>>> also possible in this case that the skb is added to dev->done queue
>>> after "!skb_queue_empty(&dev->done)" is checked. So
>>> usbnet_terminate_urbs() may stop waiting and return while dev->done
>>> queue still has an item.
>>
>> Exactly what problem will that result in?  The tasklet_kill() will wait
>> for the processing of the single element done queue, and everything will
>> be fine.  Or?
>
> Given enough time, what prevents defer_bh() from calling
> tasklet_schedule(&dev->bh) *after* usbnet_stop() calls tasklet_kill()?
>
> Consider the following situation (assuming '&&' are changed to '||' in
> that while loop in usbnet_terminate_urbs() as they should be):
>
> CPU0                            CPU1
> usbnet_stop()                   defer_bh() with list == dev->rxq
>   usbnet_terminate_urbs()
>                                 __skb_unlink() removes the last
>                                 skb from dev->rxq.
>                                 dev->rxq, dev->txq and dev->done
>                                 are now empty.
>   while (!skb_queue_empty()...)
>     The loop ends because all 3
>     queues are now empty.
>
>   usbnet_terminate_urbs() ends.
>
> usbnet_stop() continues:
>   usbnet_status_stop(dev);
>   ...
>   del_timer_sync (&dev->delay);
>   tasklet_kill (&dev->bh);
>                                 __skb_queue_tail(&dev->done, skb);
>                                 if (dev->done.qlen == 1)
>                                   tasklet_schedule(&dev->bh);
>
> The BH is scheduled at this point, which is not what was intended. The
> race window is small, but still.

I guess you are right.  At least I cannot prove that you are not :)

There is a bit too much complexity involved here for me...



Bjørn
