netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Bjørn Mork" <bjorn@mork.no>
To: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Cc: David Miller <davem@davemloft.net>,
	oneukum@suse.com, netdev@vger.kernel.org,
	linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] usbnet: Fix two races between usbnet_stop() and the BH
Date: Mon, 24 Aug 2015 15:29:47 +0200	[thread overview]
Message-ID: <87io84byr8.fsf@nemi.mork.no> (raw)
In-Reply-To: <55DB0C1C.5030607@rosalab.ru> (Eugene Shatokhin's message of "Mon, 24 Aug 2015 15:20:44 +0300")

Eugene Shatokhin <eugene.shatokhin@rosalab.ru> writes:

> 19.08.2015 15:31, Bjørn Mork пишет:
>> Eugene Shatokhin <eugene.shatokhin@rosalab.ru> writes:
>>
>>> The problem is not in the reordering but rather in the fact that
>>> "dev->flags = 0" is not necessarily atomic
>>> w.r.t. "clear_bit(EVENT_RX_KILL, &dev->flags)", and vice versa.
>>>
>>> So the following might be possible, although unlikely:
>>>
>>> CPU0             CPU1
>>>                   clear_bit: read dev->flags
>>>                   clear_bit: clear EVENT_RX_KILL in the read value
>>>
>>> dev->flags=0;
>>>
>>>                   clear_bit: write updated dev->flags
>>>
>>> As a result, dev->flags may become non-zero again.
>>
>> Ah, right.  Thanks for explaining.
>>
>>> I cannot prove yet that this is an impossible situation. If anyone
>>> can, please explain. If so, this part of the patch will not be needed.
>>
>> I wonder if we could simply move the dev->flags = 0 down a few lines to
>> fix both issues?  It doesn't seem to do anything useful except for
>> resetting the flags to a sane initial state after the device is down.
>>
>> Stopping the tasklet rescheduling etc depends only on netif_running(),
>> which will be false when usbnet_stop is called.  There is no need to
>> touch dev->flags for this to happen.
>
> That was one of the first ideas we discussed here. Unfortunately, it
> is probably not so simple.
>
> Setting dev->flags to 0 makes some delayed operations do nothing and,
> among other things, not to reschedule usbnet_bh().

Yes, but I believe that is merely a side effect.  You should never need
to clear multiple flags to get the desired behaviour.

> As you can see in drivers/net/usb/usbnet.c, usbnet_bh() can be called
> as a tasklet function and as a timer function in a number of
> situations (look for the usage of dev->bh and dev->delay there).
>
> netif_running() is indeed false when usbnet_stop() runs, usbnet_stop()
> also disables Tx. This seems to be enough for many cases where
> usbnet_bh() is scheduled, but I am not so sure about the remaining
> ones, namely:
>
> 1. A work function, usbnet_deferred_kevent(), may reschedule
> usbnet_bh(). Looks like the workqueue is only stopped in
> usbnet_disconnect(), so a work item might be processed while
> usbnet_stop() works. Setting dev->flags to 0 makes the work function
> do nothing, by the way. See also the comment in usbnet_stop() about
> this.
>
> A work item may be placed to this workqueue in a number of ways, by
> both usbnet module and the mini-drivers. It is not too easy to track
> all these situations.

That's an understatement :)



> 2. rx_complete() and tx_complete() may schedule execution of
> usbnet_bh() as a tasklet or a timer function. These two are URB
> completion callbacks.
>
> It seems, new Rx and Tx URBs cannot be submitted when usbnet_stop()
> clears dev->flags, indeed. But it does not prevent the completion
> handlers for the previously submitted URBs from running concurrently
> with usbnet_stop(). The latter waits for them to complete (via
> usbnet_terminate_urbs(dev)) but only if FLAG_AVOID_UNLINK_URBS is not
> set in info->flags. rndis_wlan, however, sets this flag for a few
> hardware models. So - no guarantees here as well.

FLAG_AVOID_UNLINK_URBS looks like it should be replaced by the newer
ability to keep the status urb active. I believe that must have been the
real reason for adding it, based on the commit message and the effect
the flag will have:

 commit 1487cd5e76337555737cbc55d7d83f41460d198f
 Author: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
 Date:   Thu Jul 30 19:41:20 2009 +0300

    usbnet: allow "minidriver" to prevent urb unlinking on usbnet_stop
    
    rndis_wlan devices freeze after running usbnet_stop several times. It appears
    that firmware freezes in state where it does not respond to any RNDIS commands
    and device have to be physically unplugged/replugged. This patch lets
    minidrivers to disable unlink_urbs on usbnet_stop through new info flag.
    
    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
    Cc: David Brownell <dbrownell@users.sourceforge.net>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>



The rx urbs will not be resubmitted in any case, and there are of course
no tx urbs being submitted.  So the only effect of this flag is on the
status/interrupt urb, which I can imagine some RNDIS devices wants
active all the time. 

So FLAG_AVOID_UNLINK_URBS should probably be removed and replaced calls
to usbnet_status_start() and usbnet_status_stop().  This will require
testing on some of the devices with the original firmware problem
however.

In any case: I do not think this flag should be considered when trying
to make usbnet_stop behaviour saner.  It's only purpose is to
deliberately break usbnet_stop by not actually stopping.


> If someone could list the particular bits of dev->flags that should be
> cleared to make sure no deferred call could reschedule usbnet_bh(),
> etc... Well, it would be enough to clear these first and use
> dev->flags = 0 later, after tasklet_kill() and del_timer_sync(). I
> cannot point out these particular bits now.


I don't think any of the flags must be cleared.  The sequence

        dev_close(dev->net);
	usbnet_terminate_urbs(dev);
	usbnet_purge_paused_rxq(dev);
	del_timer_sync (&dev->delay);
	tasklet_kill (&dev->bh);

should prevent any rescheduling of usbnet_bh

> Besides, it is possible, although unlikely, that new event bits will
> be added to dev->flags in the future. And one will need to keep track
> of these to see if they should be cleared as well. I'd prever to play
> safer for now and clear them all.

I don't think we should ever make a flag which will _have_ to be reset
for usbnet_stop.  The only reason for clearing all flags is to reset the
state before the next open.

Yes, I see that we currently need to clear EVENT_DEV_OPEN in
usbnet_stop, but I really don't see what this flag gives us which isn't
already provided by netif_running().  It looks like a duplicate.


Bjørn

  reply	other threads:[~2015-08-24 13:29 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-20 18:13 Several races in "usbnet" module (kernel 4.1.x) Eugene Shatokhin
2015-07-21 12:04 ` Oliver Neukum
     [not found]   ` <1437480243.3823.5.camel-IBi9RG/b67k@public.gmane.org>
2015-07-24 17:38     ` Eugene Shatokhin
2015-07-27 12:29       ` Oliver Neukum
2015-07-27 13:53         ` Eugene Shatokhin
2015-07-21 13:07 ` Oliver Neukum
     [not found] ` <55AD3A41.2040100-irhHPgl+04UvJsYlp49lxw@public.gmane.org>
2015-07-21 14:22   ` Oliver Neukum
2015-07-22 18:33     ` Eugene Shatokhin
2015-07-23  9:15       ` Oliver Neukum
2015-07-24 14:41         ` Eugene Shatokhin
2015-07-27 10:00           ` Oliver Neukum
2015-07-27 14:23             ` Eugene Shatokhin
2015-08-14 16:55     ` Eugene Shatokhin
2015-08-14 16:58       ` [PATCH] usbnet: Fix two races between usbnet_stop() and the BH Eugene Shatokhin
2015-08-19  1:54         ` David Miller
     [not found]           ` <20150818.185407.1667358232705414236.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2015-08-19  7:57             ` Eugene Shatokhin
2015-08-19 10:54               ` Bjørn Mork
2015-08-19 11:59                 ` Eugene Shatokhin
2015-08-19 12:31                   ` Bjørn Mork
2015-08-24 12:20                     ` Eugene Shatokhin
2015-08-24 13:29                       ` Bjørn Mork [this message]
2015-08-24 17:00                         ` Eugene Shatokhin
2015-08-25 12:31                         ` Oliver Neukum
2015-08-24 17:43                   ` David Miller
     [not found]                     ` <20150824.104328.554582952440857559.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2015-08-24 18:06                       ` Alan Stern
2015-08-24 18:21                         ` Alan Stern
2015-08-25 12:36                           ` Oliver Neukum
2015-08-24 18:35                         ` David Miller
2015-08-24 18:12                     ` Eugene Shatokhin
2015-07-23  9:43   ` Several races in "usbnet" module (kernel 4.1.x) Oliver Neukum
2015-07-23 11:39     ` Eugene Shatokhin
2015-08-24 20:13 ` [PATCH 0/2] usbnet: Fix 2 problems in usbnet_stop() Eugene Shatokhin
2015-08-24 20:13   ` [PATCH 1/2] usbnet: Get EVENT_NO_RUNTIME_PM bit before it is cleared Eugene Shatokhin
2015-08-25 13:01     ` Oliver Neukum
     [not found]       ` <1440507709.13824.6.camel-IBi9RG/b67k@public.gmane.org>
2015-08-25 14:16         ` Bjørn Mork
2015-08-25 14:22     ` Oliver Neukum
2015-08-26  2:44     ` David Miller
2015-08-24 20:13   ` [PATCH 2/2] usbnet: Fix a race between usbnet_stop() and the BH Eugene Shatokhin
2015-08-24 21:01     ` Bjørn Mork
2015-08-28  8:09       ` Eugene Shatokhin
2015-08-28  8:55         ` Bjørn Mork
2015-08-28 10:42           ` Eugene Shatokhin
2015-08-31  7:32             ` Bjørn Mork
2015-08-31  8:50               ` Eugene Shatokhin
2015-09-01  7:58                 ` Oliver Neukum
2015-09-01 13:54                   ` Eugene Shatokhin
2015-09-01 14:05                   ` [PATCH] " Eugene Shatokhin
2015-09-08  7:24                     ` Eugene Shatokhin
2015-09-08  7:37                       ` Bjørn Mork
2015-09-08  7:48                         ` Oliver Neukum
2015-09-08 20:18                     ` David Miller
2015-09-01  7:57         ` [PATCH 2/2] " Oliver Neukum
2015-08-26  2:45     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87io84byr8.fsf@nemi.mork.no \
    --to=bjorn@mork.no \
    --cc=davem@davemloft.net \
    --cc=eugene.shatokhin@rosalab.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=oneukum@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).