Linux-Raid Archives on lore.kernel.org
* use ssd as write-journal or lvm-cache?
@ 2021-02-17  3:27 d tbsky
From: d tbsky @ 2021-02-17  3:27 UTC (permalink / raw)
  To: linux-raid

Hi:
   I want to use an SSD to cache my mdadm-raid5 + lvm storage, but I
wonder whether I should use it as lvm-cache or as an mdadm write
journal. lvm-cache has the benefit that it also acts as a read cache,
but I suspect full-stripe writes are the key thing I need, so I prefer
to use the SSD as an mdadm write journal. Is there any other reason I
should use lvm-cache instead of the mdadm write-journal?

ps: my SSD is Intel DC grade, so I think enabling the write-back mode
of the cache is not a problem.
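To make it concrete, the two setups I am comparing would look roughly
like this (device names and sizes are only examples, not my real
layout):

```shell
# Option A: RAID5 created with an mdadm write journal on the SSD
mdadm --create /dev/md0 --level=5 --raid-devices=4 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 \
      --write-journal /dev/nvme0n1p1

# Option B: lvm-cache in write-back mode layered on the plain RAID5
pvcreate /dev/md0 /dev/nvme0n1p1
vgcreate vg0 /dev/md0 /dev/nvme0n1p1
lvcreate -n data -l 100%PVS vg0 /dev/md0   # data LV on the array only
lvcreate --type cache-pool -L 100G -n cpool vg0 /dev/nvme0n1p1
lvconvert --type cache --cachemode writeback \
          --cachepool vg0/cpool vg0/data
```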


* Re: use ssd as write-journal or lvm-cache?
From: Roman Mamedov @ 2021-02-17  6:09 UTC (permalink / raw)
  To: d tbsky; +Cc: linux-raid

On Wed, 17 Feb 2021 11:27:58 +0800
d tbsky <tbskyd@gmail.com> wrote:

> Hi:
>    I want to use an SSD to cache my mdadm-raid5 + lvm storage, but I
> wonder whether I should use it as lvm-cache or as an mdadm write
> journal. lvm-cache has the benefit that it also acts as a read cache,
> but I suspect full-stripe writes are the key thing I need, so I prefer
> to use the SSD as an mdadm write journal. Is there any other reason I
> should use lvm-cache instead of the mdadm write-journal?
> 
> ps: my SSD is Intel DC grade, so I think enabling the write-back mode
> of the cache is not a problem.

Why not both? It's not like you have to use the entire SSD for one or the
other. And it's very unlikely anything will be bottlenecked by concurrent
access to the SSD from both mechanisms.

If choosing one, I would prefer LVM caching, since it also benefits
reads. And the mdadm write journal feature sounds[1] more like a
reliability enhancement than a performance one.

In any case, in order not to add a single point of failure to the
array, it is better not to rely on the SSD being a "datacenter" one
(anything can fail), but to use a RAID1 of two SSDs.

[1] https://lwn.net/Articles/665299/
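For illustration, such a split could look like this (purely
hypothetical device names and partition layout):

```shell
# small RAID1 of two SSD partitions as the mdadm write journal
mdadm --create /dev/md1 --level=1 --raid-devices=2 \
      /dev/nvme0n1p1 /dev/nvme1n1p1
mdadm --create /dev/md0 --level=5 --raid-devices=4 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 \
      --write-journal /dev/md1
# the rest of the two SSDs, mirrored, as an lvm-cache pool device
mdadm --create /dev/md2 --level=1 --raid-devices=2 \
      /dev/nvme0n1p2 /dev/nvme1n1p2
```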

-- 
With respect,
Roman


* Re: use ssd as write-journal or lvm-cache?
From: Peter Grandi @ 2021-02-17  8:45 UTC (permalink / raw)
  To: list Linux RAID

>> [...] ps: my SSD is Intel DC grade, so I think enabling the
>> write-back mode of the cache is not a problem. [...]

Not all "enterprise" grade flash SSD models have persistent
buffers though, so it is better to check.

> In any case, in order not to add a single point of failure to
> the array, it is better not to rely on the SSD being a "datacenter"
> one (anything can fail), but to use a RAID1 of two SSDs.

The main point of "enterprise" flash SSD models is (as the
original poster wrote) the ability to enable write-back (and thus
much, much higher committed write rates), provided they have
persistent buffering. Higher "endurance" and reliability are
secondary points. With redundant units, having models with
persistent buffering is even more important.


* Re: use ssd as write-journal or lvm-cache?
From: d tbsky @ 2021-02-17  8:52 UTC (permalink / raw)
  To: Roman Mamedov; +Cc: linux-raid

Roman Mamedov <rm@romanrm.net> wrote:
>
> Why not both? It's not like you have to use the entire SSD for one or the
> other. And it's very unlikely anything will be bottlenecked by concurrent
> access to the SSD from both mechanisms.

   if I use both, then data may be written twice to the same SSD,
which seems wasteful.

> Choosing one, I would prefer LVM caching, since it also gives benefit for
> reads. And the mdadm write journal feature sounds[1] more like of a
> reliability, not a performance enhancement.

   with any kind of cache, the data is eventually written to disk. so
if the data can be written to disk in an optimized way, that should
speed up the whole thing. a read cache is fine, but I don't know how
much benefit it will bring.

> In any case, in order to not add a single point of failure to the array,
> better rely not on SSD being a "datacenter" one (anything can fail), but use a
> RAID1 of two SSDs.

   yes, RAID1 is a must. a "datacenter" SSD can protect the ssd-cache
so it won't suffer from a power outage, so I think I can enable
write-back mode safely.


* Re: use ssd as write-journal or lvm-cache?
From: Peter Grandi @ 2021-02-17  9:12 UTC (permalink / raw)
  To: list Linux RAID

> I want to use an SSD to cache my mdadm-raid5 + lvm storage.

I am not sure that layering MDADM on top of DM/LVM2 is always a
good idea; I tend to prefer to keep things simple.

> but I wonder whether I should use it as lvm-cache or as an mdadm
> write journal. lvm-cache has the benefit that it also acts as a
> read cache, but I suspect full-stripe writes are the key thing
> I need.

It depends on your load: does the small chance of hitting the RAID5
"write hole" matter to it? MDRAID was used for a long time without a
write journal, as the "write hole" issue happens rarely and does not
always matter. Anyhow, with a write journal slow "resyncs" are
avoided, which may also be convenient (using the journal as a write
buffer is not that coherent).

Also, the MDRAID write journal is usually (and should be) a lot
smaller than a whole flash SSD unit, so you can use a small part
of a flash SSD for write journaling and most of it for caching.
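For example, a split along those lines (sizes purely illustrative, on
a hypothetical /dev/nvme0n1):

```shell
# a few GiB for the write journal, the rest for caching
parted -s /dev/nvme0n1 mklabel gpt \
       mkpart journal 1MiB 4GiB \
       mkpart cache 4GiB 100%
```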

Whether caching is useful, and whether DM/LVM2 caching in
particular is useful, depends a lot on the specific load.

Apart from DM/LVM2 caching there is also "bcache", and there is
a new and fairly reliable filesystem type, "bcachefs", that
integrates caching and quite neatly provides multiple tiers of
storage (it also has some RAID aspects, but those can be ignored).

https://www.reddit.com/r/bcachefs/comments/l44lmj/list_of_some_useful_links_for_bcachefs/


* Re: use ssd as write-journal or lvm-cache?
From: Roman Mamedov @ 2021-02-17 10:17 UTC (permalink / raw)
  To: Peter Grandi; +Cc: list Linux RAID

On Wed, 17 Feb 2021 09:45:12 +0100
pg@mdraid.list.sabi.co.UK (Peter Grandi) wrote:

> The main point of "enterprise" flash SSD models is (as the
> original poster wrote) the ability to enable write-back (and thus
> much, much higher committed write rates), provided they have
> persistent buffering.

I read the OP as meaning enabling the write-back mode of LVM cache,
where loss of the SSD may lead to data loss for the entire array, and
justifying that simply by the SSD being a reliable datacenter one. My
objection was to that, but perhaps I indeed understood that part wrong.

-- 
With respect,
Roman


* Re: use ssd as write-journal or lvm-cache?
From: d tbsky @ 2021-02-17 13:50 UTC (permalink / raw)
  To: Peter Grandi; +Cc: list Linux RAID

Peter Grandi <pg@mdraid.list.sabi.co.uk> wrote:
> Also, the MDRAID write journal is usually (and should be) a lot
> smaller than a whole flash SSD unit, so you can use a small part
> of a flash SSD for write journaling and most of it for caching.

I thought the journal's write-back mode would need a large amount of
SSD space, like bcache, which avoids random writes at all costs.
But reading the documentation again, it says "The flush conditions
could be free in-kernel memory cache space is low". Since memory
won't be too large compared to a normal SSD, maybe a small Optane SSD
is best for the mdadm write-journal.


* Re: use ssd as write-journal or lvm-cache?
From: antlists @ 2021-02-17 18:03 UTC (permalink / raw)
  To: Peter Grandi, list Linux RAID

On 17/02/2021 09:12, Peter Grandi wrote:
>> I want to use an SSD to cache my mdadm-raid5 + lvm storage.

> I am not sure that layering MDADM on top of DM/LVM2 is always a
> good idea; I tend to prefer to keep things simple.
> 
Is that what the OP is doing? I don't think putting raid on top of lvm
is a good idea, which is why I put lvm on top of raid.

Whatever, I guess you're better off putting the cache on the bottom
layer, just above the actual hardware. In my case, that would mean
caching the raid.

Cheers,
Wol


* Re: use ssd as write-journal or lvm-cache?
From: Peter Grandi @ 2021-02-17 20:50 UTC (permalink / raw)
  To: list Linux RAID

> I thought the journal's write-back mode would need a large amount
> of SSD space, like bcache, which avoids random writes at all costs.

The write journal is supposed to buffer a few stripes to avoid
the write hole. Consider the case of a 2-drive write-journal
arrangement: you would effectively be adding a RAID1 component
to your RAID5 set for recently updated data; then why use RAID5?
Also consider the size of journals for filesystem types that
have one: typically 32MiB-128MiB.

> but reading the documentation again, it says "The flush conditions
> could be free in-kernel memory cache space is low".

That's another issue with the Linux defaults for the buffer
system; it usually buffers too much if there is no 'sync'.

> since memory won't be too large compared to a normal SSD,

I am not sure I understand why that is relevant; what happens
there depends on 'sync' behaviour and the filesystem and
buffer-cache flushing interval, if any.

> maybe a small Optane SSD is best for the mdadm write-journal.

I don't quite follow the reasoning before this, but Optane is a
very good choice for a persistent write buffer, as it is
non-volatile and has much faster and smaller writes than flash
chips.


Thread overview: 9+ messages
2021-02-17  3:27 use ssd as write-journal or lvm-cache? d tbsky
2021-02-17  6:09 ` Roman Mamedov
2021-02-17  8:45   ` Peter Grandi
2021-02-17 10:17     ` Roman Mamedov
2021-02-17  8:52   ` d tbsky
2021-02-17  9:12 ` Peter Grandi
2021-02-17 13:50   ` d tbsky
2021-02-17 20:50     ` Peter Grandi
2021-02-17 18:03   ` antlists
