From: Harris, James R <james.r.harris@intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] spdk_bdev_close() threading
Date: Thu, 04 Apr 2019 23:35:17 +0000
Message-ID: <24E34F29-CFE1-453A-984E-ABBDB7B5CEAC@intel.com>
In-Reply-To: <FBE7E039FA50BF47A673AD0BD3CD56A8462F9B88@HASMSX105.ger.corp.intel.com>

On 4/4/19, 3:09 PM, "SPDK on behalf of Stojaczyk, Dariusz" <spdk-bounces@lists.01.org on behalf of dariusz.stojaczyk@intel.com> wrote:

<snip>

    > 
    > Thinking about this more though - how do we avoid this race?  We don't want
    > bdev to call the user's remove callback after they've closed the descriptor.
    > We ensure this today since the remove callback gets called on the same
    > thread that the descriptor was opened on.  The user will have to be aware
    > that if the descriptor was opened on thread A and is closed on thread B, it
    > must be ready to handle a remove callback on thread A at any point up until
    > spdk_bdev_close() returns on thread B.  That sounds messy.
    
    If the application opens a desc on thread A and closes on B, it needs to
    handle multi-threading anyway. Generally, you could still open and close the
    descriptor on a single thread if you want to keep things simple.
    
    >  A typical first
    > reaction to getting a remove callback is to close the descriptor - meaning the
    > user has to add its own locking to ensure it doesn't try to close the same
    > descriptor on two different threads.
    
    But how could we possibly avoid that with multi-threading?
    
    > 
    > What if we just add a helper function (spdk_bdev_desc_get_thread?) to
    > return the thread that the descriptor was opened on.  Then at least the user
    > doesn't have to keep track of where it was opened itself.  This may still add
    > some work on the caller side, but I think trying to push this into the bdev
    > layer itself will actually create even more work (and trickier work) on the
    > caller side to handle the hot remove race.
    
    I managed to push it into the bdev layer with barely 70 lines of code:
    https://review.gerrithub.io/c/spdk/spdk/+/450112

This doesn't handle the race I mentioned above.  Consider this case:

Thread A opens a bdev.
Thread B later decides it's going to close the bdev for some reason.  Maybe the vhost session, nvmf subsystem, or iscsi target node is being deleted.  There may be other reasons in user applications (outside of SPDK) that use the bdev layer.  So thread B calls spdk_bdev_close().
At the same time, that bdev is hot-removed.  Thread A will then call back to the user to report that the bdev was hot-removed.  This thread calls spdk_bdev_close() to close the descriptor (which is a normal reaction to getting a hot-remove notification).
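
To make the race concrete, recall how the callback gets wired up - the hot-remove callback is passed to spdk_bdev_open() and fires on the opening thread.  Roughly like this (the function and context names here are made up; the spdk_bdev_open() signature is the real one):

#include "spdk/bdev.h"

/* Hot-remove callback - invoked on the thread that called
 * spdk_bdev_open(), i.e. thread A in the scenario above.  The natural
 * reaction is to call spdk_bdev_close() here, which is exactly what
 * can race with a close on thread B. */
static void
my_hotremove_cb(void *remove_ctx)
{
        /* close the descriptor, release per-bdev state, etc. */
}

static int
my_open_bdev(struct spdk_bdev *bdev, void *ctx, struct spdk_bdev_desc **desc)
{
        return spdk_bdev_open(bdev, true, my_hotremove_cb, ctx, desc);
}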

To prevent these two threads from racing to call spdk_bdev_close(), the caller needs to add locking or some other kind of synchronization.  This is on top of the extra locking added in this patch.
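
Concretely, every caller would end up duplicating a guarded close in both paths - something like this sketch, where my_bdev_ctx and my_guarded_close are made-up caller code, not bdev API:

#include <pthread.h>
#include <stdbool.h>
#include "spdk/bdev.h"

struct my_bdev_ctx {
        struct spdk_bdev_desc   *desc;
        pthread_mutex_t         lock;
        bool                    closed;
};

/* Called from both the hot-remove callback (thread A) and the teardown
 * path (thread B).  Whichever runs second sees closed == true and does
 * nothing, so the descriptor is only closed once. */
static void
my_guarded_close(struct my_bdev_ctx *ctx)
{
        pthread_mutex_lock(&ctx->lock);
        if (!ctx->closed) {
                spdk_bdev_close(ctx->desc);
                ctx->closed = true;
        }
        pthread_mutex_unlock(&ctx->lock);
}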

I guess I don't understand why this can't be handled in vhost.  Have vhost send a message to the thread that constructed the SCSI device when it wants to destruct it (which closes the underlying bdevs).  Then that thread doesn't have to worry about any races - if the bdev was hot-removed by the time the message arrived, it just doesn't call spdk_bdev_close().
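
Something like this sketch - the context struct and function names are invented, but spdk_thread_send_msg() is the real API:

#include "spdk/bdev.h"
#include "spdk/thread.h"

struct my_scsi_dev_ctx {
        struct spdk_bdev_desc   *desc;          /* NULL once closed */
        struct spdk_thread      *thread;        /* thread that opened desc */
};

/* Always runs on the thread that opened the descriptor, so it cannot
 * race with the hot-remove callback.  If the bdev was already
 * hot-removed, the callback closed the desc and set it to NULL, and
 * this becomes a no-op. */
static void
my_destruct_on_open_thread(void *arg)
{
        struct my_scsi_dev_ctx *ctx = arg;

        if (ctx->desc != NULL) {
                spdk_bdev_close(ctx->desc);
                ctx->desc = NULL;
        }
}

/* Can be called from any thread that decides to delete the device. */
static void
my_request_destruct(struct my_scsi_dev_ctx *ctx)
{
        spdk_thread_send_msg(ctx->thread, my_destruct_on_open_thread, ctx);
}

The spdk_bdev_desc_get_thread() helper I suggested earlier would just remove the need to track ctx->thread by hand.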

I know there are some cases where we have to do locking in SPDK.  But I'd rather avoid adding it if possible.  This stuff is inherently difficult to really test and ensure we get right.  This patch may only be about 120 lines of code, but we don't have any tests to make sure it's correct.

-Jim

    
    It doesn't affect any use cases that close the descriptor on the same
    thread that opened it - it only lifts those single-thread limitations.
    
    D.
    
    > 
    > -Jim

