From: Heming Zhao <heming.zhao@suse.com>
To: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Cc: linux-lvm@redhat.com, teigland@redhat.com, martin.wilck@suse.com
Subject: Re: [linux-lvm] lvmpolld causes high cpu load issue
Date: Wed, 17 Aug 2022 16:43:43 +0800
Message-ID: <20220817084343.33la7o6fdh5txul4@c73>
In-Reply-To: <6fa27852-e898-659f-76a5-52f50f0de898@gmail.com>

On Wed, Aug 17, 2022 at 10:06:35AM +0200, Zdenek Kabelac wrote:
> Dne 17. 08. 22 v 4:03 Heming Zhao napsal(a):
> > On Tue, Aug 16, 2022 at 12:26:51PM +0200, Zdenek Kabelac wrote:
> > > Dne 16. 08. 22 v 12:08 Heming Zhao napsal(a):
> > > > Ooh, very sorry, the subject is wrong: it is not IO performance but high cpu
> > > > load that is triggered by pvmove.
> > > > 
> > > > On Tue, Aug 16, 2022 at 11:38:52AM +0200, Zdenek Kabelac wrote:
> > > > > Dne 16. 08. 22 v 11:28 Heming Zhao napsal(a):
> > > > > > Hello maintainers & list,
> > > > > > 
> > > > > > I bring a story:
> > > > > > One SUSE customer suffered an lvmpolld issue which caused a dramatic decrease
> > > > > > in IO performance.
> > > > > > 
> > > > > > How to trigger:
> > > > > > When the machine connects a large number of LUNs (eg 80~200), a pvmove (eg, moving a single
> > > > > > disk to a new one, with a cmd like: pvmove disk1 disk2) makes the system suffer high
> > > > > > cpu load. But when the system connects ~10 LUNs, the performance is fine.
> > > > > > 
> > > > > > We found two workarounds:
> > > > > > 1. set lvm.conf 'activation/polling_interval=120'.
> > > > > > 2. write a special udev rule, which makes udev ignore the watch event for mpath devices:
> > > > > >       echo 'ENV{DM_UUID}=="mpath-*", OPTIONS+="nowatch"' >\
> > > > > >        /etc/udev/rules.d/90-dm-watch.rules
> > > > > > 
> > > > > > Applying either one of the two makes the performance issue disappear.
> > > > > > 
> > > > > > ** the root cause **
> > > > > > 
> > > > > > lvmpolld periodically requests info to update the pvmove status.
> > > > > > 
> > > > > > On every polling_interval, lvm2 updates the VG metadata. The update job
> > > > > > calls sys_close, which triggers a systemd-udevd IN_CLOSE_WRITE event, eg:
> > > > > >      2022-<time>-xxx <hostname> systemd-udevd[pid]: dm-179: Inotify event: 8 for /dev/dm-179
> > > > > > (8 is IN_CLOSE_WRITE.)
> > > > > > 
> > > > > > These VGs' underlying devices are multipath devices. So when lvm2 updates metadata,
> > > > > > even though pvmove writes only a little data, the sys_close action triggers udev's "watch"
> > > > > > mechanism, which gets notified frequently about a process that has written to the
> > > > > > device and closed it. This causes frequent, pointless re-evaluation of the udev
> > > > > > rules for these devices.
> > > > > > 
> > > > > > My question: do the LVM2 maintainers have any idea how to fix this bug?
> > > > > > 
> > > > > > In my view, could lvm2 keep the VG devices' fds open and only drop them once pvmove finishes?
> > > > > 
> > > > > Hi
> > > > > 
> > > > > Please provide more info about the lvm2 metadata and also some 'lvs -avvvvv'
> > > > > trace so we can get a better picture of the layout - also the versions of
> > > > > lvm2, systemd and the kernel in use.
> > > > > 
> > > > > pvmove progresses by mirroring each segment of an LV - so if there are
> > > > > a lot of segments, then each such update may trigger a udev watch rule
> > > > > event.
> > > > > 
> > > > > But ATM I can hardly imagine how this could cause a 'dramatic'
> > > > > performance decrease - maybe there is something wrong with the udev rules on
> > > > > the system ?
> > > > > 
> > > > > What is the actual impact ?
> > > > > 
> > > > > Note - pvmove was never designed as a high performance operation (in fact it
> > > > > tries not to eat all the disk bandwidth as such)
> > > > > 
> > > > > Regards
> > > > > Zdenek
> > > > 
> > > > My mistake, I will write it here again:
> > > > The subject is wrong: it is not IO performance but high cpu load that is triggered by pvmove.
> > > > 
> > > > There is no IO performance issue.
> > > > 
> > > > When the system is connected to 80~200 LUNs, the cpu load increases by 15~20 and the
> > > > cpu usage by ~20%, which corresponds to about 5~6 cores and at times
> > > > leaves those cores fully utilized.
> > > > In other words: a single pvmove process costs 5-6 (sometimes 10) cores of
> > > > utilization. That is abnormal & unacceptable.
> > > > 
> > > > The lvm2 version is 2.03.05, the kernel is 5.3 and systemd is v246.
> > > > 
> > > > BTW:
> > > > I changed this mail's subject from:  lvmpolld causes IO performance issue
> > > > to: lvmpolld causes high cpu load issue
> > > > Please use this mail for later discussion.
> > > 
> > > 
> > > Hi
> > > 
> > > Could you please retest with a recent version of lvm2. There have
> > > certainly been some improvements in scanning - the older releases might
> > > have shown higher CPU usage with a longer list of devices.
> > > 
> > > Regards
> > > 
> > > Zdenek
> > 
> > The highest lvm2 version in SUSE products is lvm2-2.03.15; does this
> > version include the improvement changes?
> > Would you mind pointing out which commits are related to the improvements?
> > I don't have a reproducible env, so I need a few more details before
> > asking the customer to try a new version.
> > 
> 
> 
> Please try to reproduce your customer's problem and see if the newer version
> solves the issue.   Otherwise we could waste hours on theoretical
> discussions about what might or might not have helped with this problem. Having a
> reproducer is a starting point for fixing it, if the problem is still there.
> 
> Here is one commit that may possibly affect CPU load:
> 
> d2522f4a05aa027bcc911ecb832450bc19b7fb57
> 
> 
> Regards
> 
> Zdenek

I gave a short explanation of the root cause in my previous mail, and
workaround <2> also matches my analysis.

The machine connects lots of LUNs. A pvmove of one disk triggers lvm2 to
update all of the underlying mpath devices (80~200). I guess the update job is
vg_commit(), which writes the latest metadata, and the metadata lives on
all PVs. The update job finishes with close(2), which triggers a udevd
IN_CLOSE_WRITE event on hundreds of devices. Every IN_CLOSE_WRITE triggers
the mpath udev rules (11-dm-mpath.rules) to start scanning devices. So in the
real world the system is flooded with hundreds of multipath processes and the
cpu load becomes high.
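
For reference, here is a rough sketch (untested on the customer system; it
assumes udevadm is available, and the journal check only works if
systemd-udevd debug logging is enabled so the "Inotify event" lines quoted
above get logged) of how the event flood could be confirmed while a pvmove
is running:

    # watch udev events on block devices while pvmove runs; every metadata
    # commit should show up as a burst of events on the dm-* mpath devices
    udevadm monitor --udev --subsystem-match=block

    # count the "Inotify event" lines logged by systemd-udevd in the last
    # 10 minutes
    journalctl -u systemd-udevd --since "10 min ago" | grep -c "Inotify event"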

Thanks,
Heming

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



Thread overview: 23+ messages
2022-08-16  9:28 [linux-lvm] lvmpolld causes IO performance issue Heming Zhao
2022-08-16  9:38 ` Zdenek Kabelac
2022-08-16 10:08   ` [linux-lvm] lvmpolld causes high cpu load issue Heming Zhao
2022-08-16 10:26     ` Zdenek Kabelac
2022-08-17  2:03       ` Heming Zhao
2022-08-17  8:06         ` Zdenek Kabelac
2022-08-17  8:43           ` Heming Zhao [this message]
2022-08-17  9:46             ` Zdenek Kabelac
2022-08-17 10:47               ` Heming Zhao
2022-08-17 11:13                 ` Zdenek Kabelac
2022-08-17 12:39                 ` Martin Wilck
2022-08-17 12:54                   ` Zdenek Kabelac
2022-08-17 13:41                     ` Martin Wilck
2022-08-17 15:11                       ` David Teigland
2022-08-18  8:06                         ` Martin Wilck
2022-08-17 15:26                       ` Zdenek Kabelac
2022-08-17 15:58                         ` Demi Marie Obenour
2022-08-18  7:37                           ` Martin Wilck
2022-08-17 17:35                         ` Gionatan Danti
2022-08-17 18:54                           ` Zdenek Kabelac
2022-08-17 18:54                             ` Zdenek Kabelac
2022-08-17 19:13                             ` Gionatan Danti
2022-08-18 21:13                   ` Martin Wilck
