linux-lvm.redhat.com archive mirror
From: "John Stoffel" <john@stoffel.org>
To: LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] cache on SSD makes system unresponsive
Date: Thu, 19 Oct 2017 15:09:24 -0400
Message-ID: <23016.63588.505141.142275@quad.stoffel.home>
In-Reply-To: <f30cd99b-2ea1-2766-a8c9-fd61c2f2e41c@member.fsf.org>


Oleg> Recently I have decided to try out LVM cache feature on one of
Oleg> our Dell NX3100 servers running CentOS 7.4.1708 with 110Tb disk
Oleg> array (hardware RAID5 with H710 and H830 Dell adapters).  Two
Oleg> SSD disks each 256Gb are in hardware RAID1 using H710 adapter
Oleg> with primary and extended partitions so I decided to make ~240Gb
Oleg> LVM cache to see if system I/O may be improved.  The server is
Oleg> running Bareos storage daemon and beside sshd and Dell
Oleg> OpenManage monitoring does not have any other services.
Oleg> Unfortunately testing did not go as I expected; nonetheless, in
Oleg> the end the system is up and running with no data corrupted.

Can you give more details about the system?  Is this providing storage
services (NFS), or is it just a backup server?

How did you set up your LVM config and your cache config?  Did you
mirror the two SSDs using MD, then add the device into your VG and use
that to set up the lvcache?

I ask because I'm running lvcache at home on my main file/kvm server
and I've never seen this problem.  But!  I suspect you're running a
much older kernel, lvm config, etc.  Please post the full details of
your system if you can. 

Oleg> Initially I tried the default writethrough mode, and after
Oleg> running a dd read test with a 250Gb file the system became
Oleg> unresponsive for roughly 15min with cache allocation around 50%.
Oleg> Writing to disks did seem to speed up the system, though only
Oleg> marginally, around 10% in my tests.  I did manage to pull more
Oleg> than 32Tb via backup from different hosts, and once the system
Oleg> became unresponsive to ssh and icmp requests, though only for a
Oleg> very short time.

Can you run 'top' or 'vmstat -admt 10' on the console while you're
running your tests to see what the system does?  How does memory look
on this system when you're NOT running lvcache?
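
If top or vmstat are hard to keep running once the box starts to
crawl, the same numbers can be logged straight from /proc/meminfo.
The following is only a rough sketch, not something from the original
report: the function names, the choice of fields and the 10 second
interval are all assumptions made for illustration.

```python
import time

# Assumed subset of /proc/meminfo fields worth watching during a cache test.
FIELDS = ("MemFree", "MemAvailable", "Dirty", "Writeback", "Slab", "SwapFree")


def parse_meminfo(text):
    """Return {name: number} for each 'Name:  value [kB]' line in text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])  # kB for most fields
    return values


def format_sample(values, fields=FIELDS):
    """Render the chosen fields on one line; missing fields show as n/a."""
    return " ".join(f"{name}={values.get(name, 'n/a')}" for name in fields)


def watch(count, interval=10, path="/proc/meminfo"):
    """Print `count` timestamped samples, sleeping `interval` seconds between."""
    for i in range(count):
        with open(path) as fh:
            values = parse_meminfo(fh.read())
        print(time.strftime("%H:%M:%S"), format_sample(values), flush=True)
        if i + 1 < count:
            time.sleep(interval)
```

Calling watch(360) from a console session would log about an hour of
samples; redirecting the output to a file on a disk that is not behind
the cache keeps the log readable after a reboot.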

Do you have any swap space configured on the system?  It might make
sense to allocate 10-20GB of swap space.
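
As a quick way to answer the swap question, the sketch below sums the
Size column of /proc/swaps, which the kernel reports in KiB.  It is an
illustration only: the helper name is made up, and the 10 GiB figure in
the usage comment is just the lower end of the range suggested above.

```python
def total_swap_kib(text):
    """Sum the Size column (KiB) over all entries in /proc/swaps content."""
    total = 0
    for line in text.splitlines()[1:]:  # first line is the column header
        cols = line.split()
        # Columns: Filename, Type, Size, Used, Priority
        if len(cols) >= 3 and cols[2].isdigit():
            total += int(cols[2])
    return total


# Usage on a live system (left as a comment so the sketch has no side effects):
#   with open("/proc/swaps") as fh:
#       kib = total_swap_kib(fh.read())
#   print("swap below 10 GiB" if kib < 10 * 1024 * 1024 else "swap looks ok")
```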

Oleg> I thought it might be something with the cache mode, so I switched to
Oleg> writeback via lvconvert and ran the dd read test again with the 250Gb
Oleg> file; however, that time everything went completely unexpectedly.  The
Oleg> system became slow to respond to simple user interactions like listing
Oleg> files and running top, and then became completely unresponsive for
Oleg> about half an hour.  Switching to the main console via iLO I saw a lot
Oleg> of OOM messages; the kernel tried to survive and therefore randomly
Oleg> killed almost all processes.  Eventually I did manage to reboot and
Oleg> immediately uncached the array.

Oleg> My question is about the very strange behavior of LVM cache.  Well, I
Oleg> might expect no performance boost or even I/O degradation; however, I
Oleg> do not expect to run out of memory and then have OOM kick in.  That
Oleg> server has only 12Gb RAM; however, it runs only sshd, the bareos SD
Oleg> daemon and the OpenManage java based monitoring system, so no RAM
Oleg> problems were noticed for the last few years running without LVM cache.

Oleg> Any ideas what may be wrong?  I have a second NX3200 server with a
Oleg> similar hardware setup, and it will be switched to FreeBSD 11.1 with
Oleg> ZFS very soon; however, I may try to install CentOS 7.4 first and see
Oleg> if the problem can be reproduced.

Oleg> LVM2 installed is version lvm2-2.02.171-8.el7.x86_64.


Oleg> Thank you!
Oleg> Oleg


