From: Oleg Cherkasov <o1e9@member.fsf.org>
To: LVM general discussion and development <linux-lvm@redhat.com>,
	John Stoffel <john@stoffel.org>
Subject: Re: [linux-lvm] cache on SSD makes system unresponsive
Date: Sat, 21 Oct 2017 16:33:07 +0200	[thread overview]
Message-ID: <88f3c8a9-8c55-a74f-c9cb-4b8aa18a28fc@member.fsf.org> (raw)
In-Reply-To: <23018.20452.919839.109594@quad.stoffel.home>

On 20. okt. 2017 21:35, John Stoffel wrote:
>>>>>> "Oleg" == Oleg Cherkasov <o1e9@member.fsf.org> writes:
> 
> Oleg> On 19. okt. 2017 21:09, John Stoffel wrote:
>>>
> 
> Oleg> RAM 12Gb, swap around 12Gb as well.  /dev/sda is a hardware RAID1, the
> Oleg> rest are RAID5.
> 
> Interesting, it's all hardware RAID devices from what I can see.

That is exactly what I wrote in my first message!

> 
> Can you show the *exact* commands you used to make the cache?  Are
> you using lvcache, or bcache?  they're two totally different beasts.
> I looked into bcache in the past, but since you can't remove it from
> an LV, I decided not to use it.  I use lvcache like this:

I used lvcache, of course; here are the commands from my bash history:

lvcreate -L 1G -n primary_backup_lv_cache_meta primary_backup_vg /dev/sda5

### Allocate ~247G on /dev/sda5, i.e. what is left of the VG
lvcreate -l 100%FREE -n primary_backup_lv_cache primary_backup_vg /dev/sda5

lvconvert --type cache-pool --cachemode writethrough \
    --poolmetadata primary_backup_vg/primary_backup_lv_cache_meta \
    primary_backup_vg/primary_backup_lv_cache

lvconvert --type cache \
    --cachepool primary_backup_vg/primary_backup_lv_cache \
    primary_backup_vg/primary_backup_lv

### lvconvert failed because it needed some extra extents in the VG,
### so I had to reduce the cache LV and try again:

lvreduce -L 200M primary_backup_vg/primary_backup_lv_cache
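A note for anyone retracing these steps: as written, `-L 200M` sets the LV to an absolute size of 200M, while `-L -200M` (with a leading minus) reduces it by 200M. The relative form is usually what is wanted when freeing a few extents for the conversion:

```shell
# Shrink the cache-data LV by 200M (note the leading minus);
# without it, lvreduce truncates the LV to an absolute 200M.
lvreduce -L -200M primary_backup_vg/primary_backup_lv_cache
```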

### so this time it worked ok:

lvconvert --type cache-pool --cachemode writethrough \
    --poolmetadata primary_backup_vg/primary_backup_lv_cache_meta \
    primary_backup_vg/primary_backup_lv_cache

lvconvert --type cache \
    --cachepool primary_backup_vg/primary_backup_lv_cache \
    primary_backup_vg/primary_backup_lv
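For reference, the pool could also be assembled in fewer steps: lvcreate can create a cache-pool directly and allocate the metadata LV for it implicitly. A sketch using the names from this thread (the size is an assumption):

```shell
# Create the cache pool (data + metadata) in one step on the SSD PV
lvcreate --type cache-pool -L 246G --cachemode writethrough \
    -n primary_backup_lv_cache primary_backup_vg /dev/sda5

# Attach the pool to the origin LV
lvconvert --type cache \
    --cachepool primary_backup_vg/primary_backup_lv_cache \
    primary_backup_vg/primary_backup_lv
```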

### The exact output of `lvs -a -o +devices` is gone, of course,
because I have since uncached the LV; however, it looked as in the
docs, so it did not raise any suspicions for me.
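For completeness, the cache attachment (including the hidden sub-LVs) can be inspected at any time, and the cache removed again, with something like:

```shell
# Show hidden sub-LVs, segment types, sizes and backing devices
lvs -a -o lv_name,segtype,lv_size,devices primary_backup_vg

# Detach the cache, flushing dirty blocks back to the origin first
lvconvert --uncache primary_backup_vg/primary_backup_lv
```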

> How was the performance before your caching tests?  Are you looking
> for better compression of your backups?  I've used bacula (which
> Bareos is based on) for years, but recently gave up because the
> restores sucked to do.  Sorry for the side note.  :-)

The performance was good; no complaints about the aging hardware.
However, having a spare SSD, I wanted to test whether caching would
improve anything, and I did not expect a trivial dd to bring the whole
system to its knees.

> Any messages from the console?

Unfortunately, nothing in the logs.  As I wrote before, I saw a lot of
OOM killer messages on a killing spree.

> Oleg> User stat:
> Oleg> 02:00:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
> Oleg> 02:10:01 PM     all      0.22      0.00      0.08      0.05      0.00     99.64
> Oleg> 02:20:35 PM     all      0.21      0.00      5.23     20.58      0.00     73.98
> Oleg> 02:30:51 PM     all      0.23      0.00      0.43     31.06      0.00     68.27
> Oleg> 02:40:02 PM     all      0.06      0.00      0.15     18.55      0.00     81.24
> Oleg> Average:        all      0.19      0.00      1.54     17.67      0.00     80.61
> 
> That looks ok to me... nothing obvious there at all.

Same here ...

> Are you writing to a spool disk, before you then write the data into
> bacula's backup system?

Well, the Bareos SD was down at that time for testing, so it was:

dd if=some_250G_file of=/dev/null status=progress

Basically the first command after allocating the LV cache.
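Worth mentioning: streaming 250G through the page cache can itself put a 12G box under memory pressure. If re-testing, direct I/O takes the page cache out of the picture (a sketch; assumes the filesystem supports O_DIRECT):

```shell
# Repeat the read test bypassing the page cache; bs=1M keeps
# direct reads reasonably efficient
dd if=some_250G_file of=/dev/null bs=1M iflag=direct status=progress
```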

> 
> I think you're running into a RedHat bug at this point.  I'd probably
> move to Debian and run my own kernel with the latest patches for MD, etc.

I have to stay with CentOS, and moving to Debian would not necessarily
solve the problem.

> 
> You might even be running into problems with your HW RAID controllers
> and how Linux talks to them.
> 
> Any chance you could post more details?

The HW RAID controllers are a PERC H710 and an H810.  Posting the
extremely verbose MegaCli output would not help, I guess.  Firmware is
up to date according to the BIOS maintenance monitor.
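If it helps, just the controller cache policy (rather than the full dump) can be pulled out with something like this (the MegaCli binary name and option spelling vary between versions, so treat it as a sketch):

```shell
# Logical-drive properties for all adapters; keep only the
# cache-policy lines
MegaCli -LDInfo -Lall -aALL | grep -i 'cache policy'
```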
