From: Cesare Leonardi <celeonar@gmail.com>
To: Zdenek Kabelac <zkabelac@redhat.com>,
	LVM general discussion and development <linux-lvm@redhat.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: LVM RAID: task mdX_raid1:221 blocked for more than 120 seconds
Date: Mon, 26 Nov 2018 12:31:41 +0100
Message-ID: <45bbdff9-b88c-4533-8aa5-9976564ed2bf@gmail.com>
In-Reply-To: <6dc75647-bec9-564d-23fa-aeb626678cf6@redhat.com>

Resending: I erroneously replied only to Zdenek, sorry.


On 26/11/18 09:49, Zdenek Kabelac wrote:
> It does look like 'freeze' happens during LV  resize of device
> (just wild guess from bug=913138)
> 
> To track down the issue - there would need to be probably some 
> communication with bug reporters - they would need to expose what they 
> were doing plus state
> of dm tables and number of other things.

I can provide details about this bug, which I filed:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=913119

It concerns a desktop PC with two SSDs (Samsung 850 EVO) on which I
built RAID1 using LVM.
# pvs
   PV         VG  Fmt  Attr PSize    PFree
   /dev/sdb3  vg0 lvm2 a--  <250,00g 15,98g
   /dev/sdc3  vg0 lvm2 a--  <250,00g 15,98g

# lvs
   LV    VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
   home  vg0 rwi-aor--- 200,00g                                    100,00
   root  vg0 rwi-aor---  30,00g                                    100,00
   swap0 vg0 rwi-aor---   4,00g                                    100,00

It's a desktop PC running Debian unstable, so it's rebooted quite often
due to frequent updates.
The freezes happen during normal work, without any resizing or other
LVM maintenance going on. Most of the time I noticed the freeze while I
was using Thunderbird. But eventually they resolve by themselves: I wait
a few minutes and the system suddenly becomes responsive again.
Sometimes I've noticed freezes without any message in dmesg: maybe they
resolved before the kernel's 120-second hung-task threshold.
But most of the time another freeze happens soon afterwards (it could be
1-2 hours but also minutes), so a reboot is really necessary.
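
To tell apart the freezes that the kernel reported from the ones it did
not, the kernel log can be scanned for the hung-task message quoted in
the subject of this thread. The following is only a minimal sketch: the
function name find_hung_tasks is made up for illustration, and the
regular expression assumes the wording "task NAME:PID blocked for more
than N seconds" seen in the subject line.

```python
import re

# Pattern modelled on the message in this thread's subject line:
#   "task mdX_raid1:221 blocked for more than 120 seconds"
HUNG_TASK = re.compile(
    r"task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (task name, pid, seconds) for every hung-task line found."""
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("comm"),
                 int(match.group("pid")),
                 int(match.group("secs")))
            )
    return hits
```

It can be fed the lines of saved dmesg output, for example
find_hung_tasks(open("dmesg.txt")), where dmesg.txt is a hypothetical
file name.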

I've not noticed any corruption due to these freezes, but they are often
very long and very disruptive. The only reliable workaround I found was
to reboot with:
scsi_mod.use_blk_mq=0 dm_mod.use_blk_mq=0

Or to reboot with Debian kernel 4.16.16 (linux-image-4.16.0-2-amd), the
last one that works without problems and also the last one before the
Debian maintainers enabled SCSI_MQ_DEFAULT and DM_MQ_DEFAULT.
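
After rebooting with those parameters it is worth confirming that they
actually took effect. A minimal sketch, assuming the kernel exposes both
parameters under /sys/module/<module>/parameters/use_blk_mq (an
assumption: newer kernels may no longer provide these files); the
function name read_blk_mq_flags is hypothetical.

```python
from pathlib import Path

# Modules whose use_blk_mq parameter is set by the workaround above.
MODULES = ("scsi_mod", "dm_mod")


def read_blk_mq_flags(sysfs_root="/sys/module"):
    """Return {module: value} where value is the file content ('Y' or 'N'),
    or None if the parameter is not exposed on this kernel."""
    flags = {}
    for module in MODULES:
        param = Path(sysfs_root) / module / "parameters" / "use_blk_mq"
        try:
            flags[module] = param.read_text().strip()
        except OSError:
            flags[module] = None
    return flags


if __name__ == "__main__":
    for module, value in read_blk_mq_flags().items():
        shown = value if value is not None else "not exposed"
        print(f"{module}.use_blk_mq = {shown}")
```

With the workaround active, both values would be expected to read N.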

The only evidence I have is that with blk-mq disabled the problem
doesn't happen, so it looks like an interaction with blk-mq.
I've read in the RHEL8 release notes that it will enable blk-mq by
default, so I wonder whether this has happened to others. I have a
fedora-server 29 VM, upgraded from 28, but there, if I recall correctly,
SCSI_MQ_DEFAULT and DM_MQ_DEFAULT are not set.
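
Rather than relying on memory, the two options can be read from the
kernel build configuration. A minimal sketch, assuming the distribution
installs it as /boot/config-<kernel release> (as Debian does; other
distributions may differ) and that the options appear under the usual
CONFIG_ prefix; the function name mq_defaults is hypothetical.

```python
import platform
from pathlib import Path

OPTIONS = ("CONFIG_SCSI_MQ_DEFAULT", "CONFIG_DM_MQ_DEFAULT")


def mq_defaults(config_text):
    """Map each option to its value ('y', ...), 'not set' or 'absent'."""
    result = {opt: "absent" for opt in OPTIONS}
    for raw in config_text.splitlines():
        line = raw.strip()
        for opt in OPTIONS:
            if line.startswith(opt + "="):
                result[opt] = line.split("=", 1)[1]
            elif line == f"# {opt} is not set":
                result[opt] = "not set"
    return result


if __name__ == "__main__":
    # Assumed location of the running kernel's build configuration.
    path = Path("/boot") / f"config-{platform.release()}"
    try:
        for opt, value in mq_defaults(path.read_text()).items():
            print(f"{opt}: {value}")
    except OSError as err:
        print(f"cannot read {path}: {err}")
```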

> Anyway without way more info such bug report is meaningless.

Please ask, I'll do my best to provide any info you need.

Cesare.


