From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: Valentin Vidic <Valentin.Vidic@CARNet.hr>
Cc: drbd-user@lists.linbit.com, "Jens Axboe" <axboe@kernel.dk>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	linux-block@vger.kernel.org, xen-devel@lists.xenproject.org,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [DRBD-user] [PATCH] xen-blkback: Switch to closed state after releasing the backing device
Date: Fri, 7 Sep 2018 15:28:28 +0200
Message-ID: <20180907132828.GC11834@soda.linbit>
In-Reply-To: <20180907121348.GM26705@gavran.carpriv.carnet.hr>

On Fri, Sep 07, 2018 at 02:13:48PM +0200, Valentin Vidic wrote:
> On Fri, Sep 07, 2018 at 02:03:37PM +0200, Lars Ellenberg wrote:
> > Very frequently it is *NOT* the "original user", that "still" holds it
> > open, but udev, or something triggered-by-udev.
> > 
> > So double-checking the udev rules,
> > or the "lvm global_filter" settings may help.
> > You could instrument DRBD to log current->{pid,comm} on open and close,
> > so you can better detect who the "someone" is in the message above.
> 
> Don't think there is anything else holding the device open, because it
> is possible to change state to Secondary a few seconds later. But I will
> try to print those values in case anything interesting comes up.
> 
> > Adding a small retry loop in the script may help as well.
> 
> Yes, that is an option, but it would still leave those nasty "State
> change failed" messages in the log. I guess there is no way to check
> the value of DRBD device->open_cnt from userspace?

We don't expose that, no.
But even if we did, that would not be racefree :-)

The last (or even: any?) "close" of a block device that used to be open
for WRITE triggers a udev "change" event, and thus a udev run.
At a minimum, (systemd-)udev itself will do a read-only open and an ioctl;
more likely, blkid and possibly pvscan and similar tools will run as well.
All of them should be read-only openers, and all of them should be "short".
But they will race with the drbd demotion.

You can watch that happen with
# udevadm monitor &
then open and close for write:
# : > /dev/sda
(or any other block device ...)

To exercise the drbd vs udev race
(you can leave the udevadm monitor running if you want):
## clear dmesg
# dmesg -c > /dev/null
## promote
# drbdadm primary s0 ;
## open/close for write
# : > /dev/drbd0 ;
## the close has triggered a udev run,
## which will now race with demotion,
## which will "sometimes" fail:
# drbdadm secondary s0 ;
## wait a bit, and retry
# sleep 1; drbdadm secondary s0 ; dmesg

### --- example output on some test box right now: ----------------
@root@ava:~# udevadm monitor &
...
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent
...
@root@ava:~# dmesg -c > /dev/null; drbdadm primary s0 ; : > /dev/drbd0 ; drbdadm secondary s0  ; sleep 1; drbdadm secondary s0 ; dmesg
KERNEL[609638.990320] change   /devices/virtual/block/drbd0 (block)
KERNEL[609638.991364] change   /devices/virtual/block/drbd0 (block)
UDEV  [609639.008879] change   /devices/virtual/block/drbd0 (block)
0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup-84 secondary 0' terminated with exit code 11
UDEV  [609639.011652] change   /devices/virtual/block/drbd0 (block)
KERNEL[609640.017356] change   /devices/virtual/block/drbd0 (block)
KERNEL[609640.018074] change   /devices/virtual/block/drbd0 (block)
[609613.882751] block drbd0: role( Secondary -> Primary )
[609613.889998] block drbd0: State change failed: Device is held open by someone
[609613.894280] block drbd0:   state = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated r----- }
[609613.897609] block drbd0:  wanted = { cs:WFConnection ro:Secondary/Unknown ds:UpToDate/Outdated r----- }
[609614.909537] block drbd0: role( Primary -> Secondary )
[609614.909662] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
UDEV  [609640.024588] change   /devices/virtual/block/drbd0 (block)
UDEV  [609640.028731] change   /devices/virtual/block/drbd0 (block)
@root@ava:~#
### --------------------------------------------------------------

Obviously change s0 and drbd0 appropriately. Exactly hitting the time
window where udev has the device open may be tricky: on a busy system
even the sleep 1 may not be enough (so even the second invocation may
still fail), while on an idle system udev may already be done, or may
not even have started yet, when the first "secondary" runs.
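
The small retry loop suggested earlier in this thread could look roughly
like the sketch below. The helper name "retry" and the attempt count and
delay in the usage comment are made up for illustration; nothing like
this ships with drbd-utils. As already noted, every failed attempt will
still log a "State change failed" message.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD [ARGS...]
# Hypothetical helper: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between attempts. Returns 0 on the first success,
# otherwise the exit code of the last failed attempt.
# Plain sh has no local variables, so attempts/delay/n/rc are global.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    rc=1
    while [ "$n" -le "$attempts" ]; do
        rc=0
        "$@" || rc=$?
        [ "$rc" -eq 0 ] && return 0
        # Do not sleep after the final attempt.
        [ "$n" -lt "$attempts" ] && sleep "$delay"
        n=$((n + 1))
    done
    return "$rc"
}

# Example (assumed values, adjust resource name and timing):
#   retry 5 1 drbdadm secondary s0
```

This only papers over the race; it does not remove the short-lived
openers that cause it.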

Your best bet is to review your udev rules,
and make sure "drbd*" will not be looked at
by blkid or pvscan or similar.
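
As a starting point for that review, a sketch like the following lists
udev rule lines that mention blkid or pvscan by name. The function name
"find_probers" is made up, the directories in the usage comment are the
usual udev rule locations and may differ on your distribution, and
"grep -r --include" assumes GNU grep. It will not catch probers that are
started indirectly (for example from a helper script or a systemd unit),
so treat an empty result as a hint, not as proof.

```shell
#!/bin/sh
# find_probers DIR...
# Hypothetical helper: print "file:line:text" for every line in *.rules
# files below the given directories that mentions blkid or pvscan.
# Directories that do not exist are skipped silently.
find_probers() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # grep exits 1 when nothing matches; that is not an error here.
        grep -rnE --include='*.rules' 'blkid|pvscan' "$dir" || :
    done
}

# Example (assumed, common rule directories):
#   find_probers /etc/udev/rules.d /run/udev/rules.d /usr/lib/udev/rules.d
```

Then check whether the rules it prints can match "drbd*" devices.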

See also https://github.com/systemd/systemd/commit/fee854ee8ccde
and https://github.com/systemd/systemd/issues/9371
for getting rid of even the unconditional open by systemd-udevd itself.

Note,
I'm not saying blkback has no issue; I don't know.
I'm just pointing out that there are other things
that may cause the same effects.

    Lars
