linux-lvm.redhat.com archive mirror
From: "Thomas Deutschmann" <whissi@whissi.de>
To: "LVM general discussion and development" <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] lvcreate hangs forever during snapshot creation when suspending volume
Date: Wed, 3 Aug 2022 03:37:16 +0200
Message-ID: <000c01d8a6d9$908f63e0$b1ae2ba0$@whissi.de>
In-Reply-To: <63e2968a-cb32-fda5-8810-43f722ae2d28@gmail.com>

Hi,

Zdenek Kabelac wrote:
> So as guessed earlier - unrelated to lvm2.

Yes. And thank you for pointing me to fsfreeze.


> You likely need to discover what is wrong with your 'raid' device ?
> Was your raid array fully synchronized ?
> 
> Do you have only problem with one particular  MD 'raid' on your system - or 
> any other 'raid' you attach/create will suffer the same problem ?
>
> Is it 'nvme' related on your system ?
>
> Are the 'individual' nvme  devices running fine - just when they are mixed 
> together into a single array you get these  'fsfreeze' troubles ?

I think it is unrelated to mdraid, because I was able to reproduce
the problem with a single NVMe device that I had removed from the RAID array.

However, I tried different kernel versions:

- The brand-new 5.19 shows the same problem
- 5.16 was the last working kernel

So I ran a bisect, which revealed

> md: add support for REQ_NOWAIT
> 
> commit 021a24460dc2 ("block: add QUEUE_FLAG_NOWAIT") added support
> for checking whether a given bdev supports handling of REQ_NOWAIT or not.
> Since then commit 6abc49468eea ("dm: add support for REQ_NOWAIT and enable
> it for linear target") added support for REQ_NOWAIT for dm. This uses
> a similar approach to incorporate REQ_NOWAIT for md based bios.
> 
> This patch was tested using t/io_uring tool within FIO. A nvme drive
> was partitioned into 2 partitions and a simple raid 0 configuration
> /dev/md0 was created.
> 
> md0 : active raid0 nvme4n1p1[1] nvme4n1p2[0]
>       937423872 blocks super 1.2 512k chunks
> 
> Before patch:
> 
> $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
> 
> Running top while the above runs:
> 
> $ ps -eL | grep $(pidof io_uring)
> 
>   38396   38396 pts/2    00:00:00 io_uring
>   38396   38397 pts/2    00:00:15 io_uring
>   38396   38398 pts/2    00:00:13 iou-wrk-38397
> 
> We can see iou-wrk-38397 io worker thread created which gets created
> when io_uring sees that the underlying device (/dev/md0 in this case)
> doesn't support nowait.
> 
> After patch:
> 
> $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
> 
> Running top while the above runs:
> 
> $ ps -eL | grep $(pidof io_uring)
> 
>   38341   38341 pts/2    00:10:22 io_uring
>   38341   38342 pts/2    00:10:37 io_uring
> 
> After running this patch, we don't see any io worker thread
> being created which indicated that io_uring saw that the
> underlying device does support nowait. This is the exact behaviour
> noticed on a dm device which also supports nowait.
> 
> For all the other raid personalities except raid0, we would need
> to train pieces which involves make_request fn in order for them
> to correctly handle REQ_NOWAIT.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f51d46d0e7cb5b8494aa534d276a9d8915a2443d

as the bad commit.
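
For readers who want to repeat such a bisect, here is a minimal sketch. It is not necessarily the procedure used here: the tags v5.16 and v5.19 are taken from the kernel versions named above, while the `bisect_verdict` helper, its time limit, and the reproducer command are hypothetical. The helper assumes GNU timeout(1), which exits with status 124 when the limit expires.

```shell
# Run a command with a time limit and print a verdict for "git bisect":
#   good  the command finished successfully
#   bad   the command was still running when the limit expired (a hang)
#   skip  the command failed for some other reason
# Caveat: a process stuck in uninterruptible sleep may not react to the
# signal sent by timeout(1); in that case this helper blocks as well.
bisect_verdict() {
    limit=$1
    shift
    # Send the command's own output to stderr so only the verdict is on stdout.
    timeout "$limit" "$@" >&2
    case $? in
        0)   echo good ;;
        124) echo bad ;;    # GNU timeout exits with 124 when the limit is hit
        *)   echo skip ;;
    esac
}

# Assumed workflow in a kernel checkout; every step needs a build and a reboot:
#   git bisect start
#   git bisect bad v5.19
#   git bisect good v5.16
#   ... build and boot the kernel git checked out, then:
#   git bisect "$(bisect_verdict 60 <reproducer command>)"
```

Because each step needs a reboot into the freshly built kernel, the verdict is passed to `git bisect` by hand instead of using `git bisect run`.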

Building the latest kernel with this commit reverted (together with the
follow-up fix 0f9650bd838efe5c52f7e5f40c3204ad59f1964d) fixes the problem
for me.
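
The revert could be done roughly as follows. This is a sketch, assuming a Linux kernel git checkout in which both commits are present and revert cleanly. The `revert_in_order` helper is hypothetical; the two commit IDs are the ones named above.

```shell
# Revert the given commits one at a time, in the order listed,
# and stop at the first revert that does not apply cleanly.
revert_in_order() {
    for commit in "$@"; do
        git revert --no-edit "$commit" || return 1
    done
}

# Assumed usage inside the kernel checkout. The follow-up fix is listed
# first so that it is undone before the commit it builds on:
#   revert_in_order 0f9650bd838efe5c52f7e5f40c3204ad59f1964d \
#                   f51d46d0e7cb5b8494aa534d276a9d8915a2443d
```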

What I do _not_ understand yet: It's a change in the md driver -- how could
that change affect the single device I pulled out of the array?

However, during the bisect I only tested against the mdraid array. Maybe there
is another "nowait" issue or a specific problem with "REQ_NOWAIT" and
the Dell OEM NVMe devices...
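
One way to compare how the md array and the bare device advertise nowait support is to read the per-queue flag from sysfs. This sketch assumes the running kernel exposes a `queue/nowait` attribute under /sys/block. The `queue_nowait` helper and the device names in the usage comment are hypothetical.

```shell
# Print the queue "nowait" flag of a block device, or "unknown" if the
# attribute is not readable. The optional second argument overrides the
# sysfs root (default: /sys); it exists only to make the helper testable.
queue_nowait() {
    attr="${2:-/sys}/block/$1/queue/nowait"
    if [ -r "$attr" ]; then
        cat "$attr"
    else
        echo unknown
    fi
}

# Assumed device names; substitute the ones on the affected system:
#   for dev in md0 nvme0n1; do
#       printf '%s: %s\n' "$dev" "$(queue_nowait "$dev")"
#   done
```

The flag only shows what each queue advertises; by itself it does not explain a hang.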

I'll post to LKML shortly, thanks!


-- 
Regards,
Thomas





Thread overview: 9+ messages
2022-07-30 16:33 [linux-lvm] lvcreate hangs forever during snapshot creation when suspending volume Thomas Deutschmann
2022-08-01 17:29 ` Zdenek Kabelac
2022-08-01 17:34   ` Zdenek Kabelac
2022-08-01 20:34     ` Thomas Deutschmann
2022-08-02  3:01       ` Thomas Deutschmann
2022-08-02 14:52         ` Zdenek Kabelac
2022-08-03  1:37           ` Thomas Deutschmann [this message]
2022-08-28 22:38             ` Thomas Deutschmann
2022-08-29 10:32               ` Zdenek Kabelac
