* It is possible to put write cache on ssd?
@ 2010-06-04 8:52 Mario
2010-06-07 19:14 ` Bill Davidsen
0 siblings, 1 reply; 12+ messages in thread
From: Mario @ 2010-06-04 8:52 UTC (permalink / raw)
To: linux-raid
Hello,
I have seen that the only hardware raid controllers that can go faster than
linux md raid are the ones with a BBU (battery backup unit).
In fact, thanks to the battery, those controllers can do more aggressive write
caching without the risk of losing data.
Obviously a standard pc has no BBU to use with linux software raid.
Now I see that the latest hardware raid controllers replace the battery AND ram
with a small flash disk.
So I ask: if I add a fast (but small) ssd to a linux server, is there a
way for linux md raid to use it as a cache to get safer writes and a faster raid?
Thanks in advance for your interest.
Mario
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-04 8:52 It is possible to put write cache on ssd? Mario
@ 2010-06-07 19:14 ` Bill Davidsen
2010-06-08 4:54 ` Ian Dall
2010-06-08 7:31 ` Mario
0 siblings, 2 replies; 12+ messages in thread
From: Bill Davidsen @ 2010-06-07 19:14 UTC (permalink / raw)
To: Mario; +Cc: linux-raid
Mario wrote:
> Hello,
> I have seen that the only hardware raid controllers that can go faster than
> linux md raid are the controllers with BBU unit.
>
> Infact thanks to the battery the controllers can have a more aggressive write
> caching without the risk of losing data.
>
> Obviously in a standard pc there is not BBU to use with linux software raid.
>
> Now I see that latest hardware raid controllers exchange battery AND ram with a
> little flash disk.
>
> So I ask: if I add a fast (with little size) ssd to a linux server is there a
> way for linux md raid to use it as a cache to have safer writes and faster raid?
>
> Thanks in advance for interest.
>
>
Actually playing with that now. I got an Intel SATA 40GB SSD, and I am
trying various combinations of things to put on it. One thing which I
hoped would benefit was to put a f/s journal on SSD and then use the
option to push all through the journal (data=journal) in hopes that it
would then free the RAM needed for cache and thus speed operation.
Since none of that has generated the performance I hoped, I'm now
looking at a kernel patch to overflow the cache in RAM into the SSD,
stealing code from the mmap to make some address space on the SSD. At
the moment that works poorly (ok, doesn't work) and I'm going to have to
rethink the way I do things and probably write a whole bunch of code to
do it. Not sure if I want to do that, it's unlikely to be a candidate
for mainline unless I put a ton of time into learning the corner cases.
I also played with mirroring and write mostly, etc. Does provide a
general solution, at least in my tests.
--
Bill Davidsen <davidsen@tmr.com>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-07 19:14 ` Bill Davidsen
@ 2010-06-08 4:54 ` Ian Dall
2010-06-08 19:28 ` Bill Davidsen
2010-06-08 22:48 ` David Rees
2010-06-08 7:31 ` Mario
1 sibling, 2 replies; 12+ messages in thread
From: Ian Dall @ 2010-06-08 4:54 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Mario, linux-raid
On Mon, 2010-06-07 at 15:14 -0400, Bill Davidsen wrote:
> Mario wrote:
> > [...]
> > So I ask: if I add a fast (with little size) ssd to a linux server is there a
> > way for linux md raid to use it as a cache to have safer writes and faster raid?
> >
> > Thanks in advance for interest.
> >
> Actually playing with that now. I got an Intel SATA 40GB SSD, and I am
> trying various combinations of things to put on it. One thing which I
> hoped would benefit was to put a f/s journal on SSD and then use the
> option to push all through the journal (data=journal) in hopes that it
> would then free the RAM needed for cache and thus speed operation.
>
> Since none of that has generated the performance I hoped,
Interesting. If it's the X25-V that you have, write performance is
nothing to write home about even compared to a single hard drive, let
alone a raid. By journaling data as well (as metadata), you just add
extra write overhead, possibly even a new bottleneck.
What happens if you journal only the metadata? The hoped-for advantage
would be to avoid seeks between the areas used for the journal and the
data.
The characteristics of these SSD devices seem to be that they get
faster as they get bigger (like the chips are effectively in a kind of
raid).
> I'm now
> looking at a kernel patch to overflow the cache in RAM into the SSD,
> stealing code from the mmap to make some address space on the SSD.
Again, I wonder if write performance is good enough for this to pay off.
How does that compare with just using the ssd for swap and possibly
tweaking some parameters to encourage the kernel to use swap more? This
would effectively free up more ram for buffers.
> At
> the moment that works poorly (ok, doesn't work) and I'm going to have to
> rethink the way I do things and probably write a whole bunch of code to
> do it. Not sure if I want to do that, it's unlikely to be a candidate
> for mainline unless I put a ton of time into learning the corner cases.
>
> I also played with mirroring and write mostly, etc. Does provide a
> general solution, at least in my tests.
Do you mean "does NOT"?
--
Ian Dall <ian@beware.dropbear.id.au>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-07 19:14 ` Bill Davidsen
2010-06-08 4:54 ` Ian Dall
@ 2010-06-08 7:31 ` Mario
2010-06-08 12:23 ` CoolCold
1 sibling, 1 reply; 12+ messages in thread
From: Mario @ 2010-06-08 7:31 UTC (permalink / raw)
To: linux-raid
Bill Davidsen <davidsen <at> tmr.com> writes:
> Actually playing with that now. I got an Intel SATA 40GB SSD, and I am
> trying various combinations of things to put on it. One thing which I
> hoped would benefit was to put a f/s journal on SSD and then use the
> option to push all through the journal (data=journal) in hopes that it
> would then free the RAM needed for cache and thus speed operation.
>
>
Probably it is due to two things:
1) To see any advantage you have to test in a server situation where
many threads are writing while others are reading. In that case
writes to the hard disk can be delayed a lot to give lower latency to reads.
2) A lot of tuning is needed to force linux to keep data in the journal
as long as it can.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-08 7:31 ` Mario
@ 2010-06-08 12:23 ` CoolCold
2010-06-09 7:49 ` Mario
2010-06-09 11:06 ` MRK
0 siblings, 2 replies; 12+ messages in thread
From: CoolCold @ 2010-06-08 12:23 UTC (permalink / raw)
To: Mario; +Cc: linux-raid
Maybe something like
http://github.com/facebook/flashcache/blob/master/doc/flashcache-doc.txt
would be interesting for you.
On Tue, Jun 8, 2010 at 11:31 AM, Mario <mgiammarco@gmail.com> wrote:
> Bill Davidsen <davidsen <at> tmr.com> writes:
>
>
>> Actually playing with that now. I got an Intel SATA 40GB SSD, and I am
>> trying various combinations of things to put on it. One thing which I
>> hoped would benefit was to put a f/s journal on SSD and then use the
>> option to push all through the journal (data=journal) in hopes that it
>> would then free the RAM needed for cache and thus speed operation.
>>
>>
>
> Probably it is due to two things:
>
> 1) to see advantages you have to test in a server situation where a
> lot of thread are writing AND others are reading. So in this case
> writes to hard disk can be delayed a lot to give less latency to reads.
> 2) A lot of tuning is due to force linux to keep data on journal
> as long as it can.
>
>
>
--
Best regards,
[COOLCOLD-RIPN]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-08 4:54 ` Ian Dall
@ 2010-06-08 19:28 ` Bill Davidsen
2010-06-08 22:48 ` David Rees
1 sibling, 0 replies; 12+ messages in thread
From: Bill Davidsen @ 2010-06-08 19:28 UTC (permalink / raw)
To: Ian Dall; +Cc: Mario, linux-raid
Ian Dall wrote:
> On Mon, 2010-06-07 at 15:14 -0400, Bill Davidsen wrote:
>
>> Mario wrote:
>>
>>> [...]
>>> So I ask: if I add a fast (with little size) ssd to a linux server is there a
>>> way for linux md raid to use it as a cache to have safer writes and faster raid?
>>>
>>> Thanks in advance for interest.
>>>
>>>
>> Actually playing with that now. I got an Intel SATA 40GB SSD, and I am
>> trying various combinations of things to put on it. One thing which I
>> hoped would benefit was to put a f/s journal on SSD and then use the
>> option to push all through the journal (data=journal) in hopes that it
>> would then free the RAM needed for cache and thus speed operation.
>>
>> Since none of that has generated the performance I hoped,
>>
>
> Interesting. If its the X25-V that you have, write performance is
> nothing to write home about even compared to a single hard drive, let
> alone a raid. By journaling data as well (as metadata), you just add
> extra write overhead, possibly even a new bottleneck.
>
>
There was a claim that if you journal data, the memory buffers
would be released after the journal was written. Looking at the code I
didn't think so, but the idea was that a burst of less than 10GB or so
would get out of memory to the SSD and then be pulled back more slowly
without blowing everything out of memory cache. Always better to actually
try stuff than to look at the code and pontificate about what it will do
under dynamic conditions.
The best thing I found was some code I was playing with in 2.6.27 or so,
which limited the cache used by any one fd, so that there was cache for
other programs. This shortened the initial burst of fast writes (writes were
going to buffer, not disk) but didn't hurt 10GB write time, and left the
system working for other programs.
> What happens if you journal only the metadata? The hoped for advantage
> would be to avoid seeks between the areas used for the journal and the
> data.
>
>
I've tried putting the journal (and bitmap) on other devices, even on a
ramdisk; it only helps for certain loads.
> The characteristics of these SSD devices seems to be that they get
> faster as they get bigger (like the chips are effectively in a kind of
> raid).
>
>
>> I'm now
>> looking at a kernel patch to overflow the cache in RAM into the SSD,
>> stealing code from the mmap to make some address space on the SSD.
>>
>
> Again, I wonder if write performance is good enough for this to pay off.
> How does that compare with just using the ssd for swap and possibly
> tweaking some parameters to encourage the kernel to use swap more? This
> would effectively free up more ram for buffers.
>
>
>> At
>> the moment that works poorly (ok, doesn't work) and I'm going to have to
>> rethink the way I do things and probably write a whole bunch of code to
>> do it. Not sure if I want to do that, it's unlikely to be a candidate
>> for mainline unless I put a ton of time into learning the corner cases.
>>
>> I also played with mirroring and write mostly, etc. Does provide a
>> general solution, at least in my tests.
>>
>
> Do you mean "does NOT"?
>
>
>
--
Bill Davidsen <davidsen@tmr.com>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-08 4:54 ` Ian Dall
2010-06-08 19:28 ` Bill Davidsen
@ 2010-06-08 22:48 ` David Rees
2010-06-09 9:31 ` Ian Dall
1 sibling, 1 reply; 12+ messages in thread
From: David Rees @ 2010-06-08 22:48 UTC (permalink / raw)
To: Ian Dall; +Cc: Bill Davidsen, Mario, linux-raid
On Mon, Jun 7, 2010 at 9:54 PM, Ian Dall <ian@beware.dropbear.id.au> wrote:
> On Mon, 2010-06-07 at 15:14 -0400, Bill Davidsen wrote:
>> Actually playing with that now. I got an Intel SATA 40GB SSD, and I am
>> trying various combinations of things to put on it. One thing which I
>> hoped would benefit was to put a f/s journal on SSD and then use the
>> option to push all through the journal (data=journal) in hopes that it
>> would then free the RAM needed for cache and thus speed operation.
>>
>> Since none of that has generated the performance I hoped,
>
> Interesting. If its the X25-V that you have, write performance is
> nothing to write home about even compared to a single hard drive, let
> alone a raid. By journaling data as well (as metadata), you just add
> extra write overhead, possibly even a new bottleneck.
Depends on whether you are talking about small, seeky writes or large
writes. Even the X25-V will kill any rotating drive in small seeky
writes, but if you are trying to write big files faster than ~40MB/s,
the rotating disk might win depending on the exact one you are
comparing it to.
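To put rough numbers on the small-vs-large distinction, here is a
back-of-the-envelope sketch in Python. Only the ~40MB/s SSD write rate comes
from this thread; the access times and the HDD bandwidth are assumed,
illustrative figures, and write_time_s() and compare() are hypothetical
helpers, not a benchmark.

```python
MB = 1_000_000

# Assumed, illustrative device figures. Only the ~40MB/s SSD write
# bandwidth is taken from the thread; the rest are guesses.
SSD = {"access_s": 0.0001, "bandwidth_bps": 40 * MB}
HDD = {"access_s": 0.010, "bandwidth_bps": 100 * MB}


def write_time_s(total_bytes, request_bytes, access_s, bandwidth_bps):
    """Seconds to write total_bytes in request_bytes chunks, paying one
    access penalty (seek or flash access) per request plus transfer time."""
    requests = -(-total_bytes // request_bytes)  # ceiling division
    return requests * access_s + total_bytes / bandwidth_bps


def compare(total_bytes, request_bytes):
    """Return the modelled write time for each device, keyed by name."""
    devices = (("ssd", SSD), ("hdd", HDD))
    return {name: write_time_s(total_bytes, request_bytes, **dev)
            for name, dev in devices}


if __name__ == "__main__":
    print("100MB in 4KB random writes:", compare(100 * MB, 4096))
    print("1000MB in one streaming write:", compare(1000 * MB, 1000 * MB))
```

With these assumed figures the SSD wins the 4KB case by a wide margin and the
rotating disk wins the streaming case, which is the crossover described above.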
>> I also played with mirroring and write mostly, etc. Does provide a
>> general solution, at least in my tests.
>
> Do you mean "does NOT"?
write-mostly DOES work well in my tests...
-Dave
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-08 12:23 ` CoolCold
@ 2010-06-09 7:49 ` Mario
2010-06-09 11:06 ` MRK
1 sibling, 0 replies; 12+ messages in thread
From: Mario @ 2010-06-09 7:49 UTC (permalink / raw)
To: linux-raid
CoolCold <coolthecold <at> gmail.com> writes:
>
> May be something like
> http://github.com/facebook/flashcache/blob/master/doc/flashcache-doc.txt
> will be interesting for you.
>
Great thing! Is it working or is it in alpha stage?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-08 22:48 ` David Rees
@ 2010-06-09 9:31 ` Ian Dall
0 siblings, 0 replies; 12+ messages in thread
From: Ian Dall @ 2010-06-09 9:31 UTC (permalink / raw)
To: David Rees; +Cc: Bill Davidsen, Mario, linux-raid
On Tue, 2010-06-08 at 15:48 -0700, David Rees wrote:
> On Mon, Jun 7, 2010 at 9:54 PM, Ian Dall <ian@beware.dropbear.id.au> wrote:
> > On Mon, 2010-06-07 at 15:14 -0400, Bill Davidsen wrote:
> >> Actually playing with that now. I got an Intel SATA 40GB SSD, and I am
> >> trying various combinations of things to put on it. One thing which I
> >> hoped would benefit was to put a f/s journal on SSD and then use the
> >> option to push all through the journal (data=journal) in hopes that it
> >> would then free the RAM needed for cache and thus speed operation.
> >>
> >> Since none of that has generated the performance I hoped,
> >
> > Interesting. If its the X25-V that you have, write performance is
> > nothing to write home about even compared to a single hard drive, let
> > alone a raid. By journaling data as well (as metadata), you just add
> > extra write overhead, possibly even a new bottleneck.
>
> Depends on whether you are talking about small, seeky writes or large
> writes. Even the X25-V will kill any rotating drive in small seeky
> writes,
Of course, low (almost 0) seek time is the forte of SSDs, which is
why, it seems to me, swap would be an ideal application. I may be wrong
about that, but I imagine that paging is a kind of semi-random pattern.
> >> I also played with mirroring and write mostly, etc. Does provide a
> >> general solution, at least in my tests.
> >
> > Do you mean "does NOT"?
>
> write-mostly DOES work well in my tests...
Ah! My understanding of "write-mostly" is that it is used in a mirror
(raid 1), which means that you need enough SSD to store your entire
filesystem, and the rotating disk is just redundancy. So:
capacity = capacity of SSD, speed ~= speed of SSD (until the write-behind
queue is full), probability of failure ~= P(Fail SSD)*P(Fail rotating
media). If the reliability of SSD is good enough (I don't know), is this
much of a win?
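The capacity and failure estimates above can be written out as a tiny Python
sketch. It assumes the two devices fail independently, and the failure
probabilities and sizes in the example are made-up placeholders, not measured
values; both function names are hypothetical.

```python
def mirror_capacity(ssd_bytes, hdd_bytes):
    """A two-way mirror can hold no more than its smaller member."""
    return min(ssd_bytes, hdd_bytes)


def mirror_loss_probability(p_fail_ssd, p_fail_hdd):
    """Chance that both members fail in the same period, assuming the
    failures are independent (an assumption, not a measured fact)."""
    return p_fail_ssd * p_fail_hdd


if __name__ == "__main__":
    # Placeholder figures: a 40GB SSD mirrored with a 1TB disk, with
    # assumed failure probabilities of 3% and 5% over some period.
    print(mirror_capacity(40 * 10**9, 10**12))
    print(mirror_loss_probability(0.03, 0.05))
```

The point is only that the usable size is capped by the SSD while the loss
probability is the product of the two, so the rotating disk buys redundancy
rather than space.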
Regards,
Ian
--
Ian Dall <ian@beware.dropbear.id.au>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-08 12:23 ` CoolCold
2010-06-09 7:49 ` Mario
@ 2010-06-09 11:06 ` MRK
2010-06-09 16:21 ` Aryeh Gregor
1 sibling, 1 reply; 12+ messages in thread
From: MRK @ 2010-06-09 11:06 UTC (permalink / raw)
To: CoolCold; +Cc: Mario, linux-raid
On 06/08/2010 02:23 PM, CoolCold wrote:
> May be something like
> http://github.com/facebook/flashcache/blob/master/doc/flashcache-doc.txt
> will be interesting for you.
There is another one: bcache
http://lkml.indiana.edu/hypermail//linux/kernel/1004.0/01051.html
http://lkml.org/lkml/2010/4/23/376
seems beta stage but nearing completion.
I have not yet investigated all the differences to flashcache. At first
sight, bcache seems to cache reads, flashcache seems to cache writes.
But be warned that if you use a flash disk as a cache for a big RAID I
believe it's gonna wear out very quickly.
Consider e.g. bcache caches reads, so on every cache miss it reads
through the RAID and writes to the flash. If you have indexing programs
that scrub the whole array (which is clearly larger than the SSD) they
are going to transform all reads to writes to the flash disk. This is
going to burn it quickly. I am not sure an SSD is a good medium for
caching purposes. Try to buy RAM for this.
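A rough way to size that wear concern, as a Python sketch: if every full scan
of the array misses the cache, roughly one array's worth of data gets written
to the flash per scan. The endurance model (capacity times program/erase
cycles, divided by write amplification) and the numbers in the example are
assumptions for illustration, not data for any particular SSD, and
scans_until_worn() is a hypothetical helper.

```python
def scans_until_worn(array_bytes, ssd_bytes, pe_cycles,
                     write_amplification=1.0):
    """Number of full-array scans a read cache could absorb before the
    flash reaches its assumed endurance.

    Assumes every scan writes about array_bytes into the cache and that
    total endurance is ssd_bytes * pe_cycles / write_amplification.
    """
    endurance_bytes = ssd_bytes * pe_cycles / write_amplification
    return endurance_bytes / array_bytes


if __name__ == "__main__":
    # Placeholder figures: 4TB array, 40GB SSD, 5000 P/E cycles (assumed).
    print(scans_until_worn(4 * 10**12, 40 * 10**9, 5000))
```

Under those placeholder figures a nightly full-array indexing run would use up
the assumed endurance within a couple of months, so the ratio of array size to
SSD size matters a lot here.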
OTOH caching writes has another big problem:
http://www.legitreviews.com/news/7225/
read the grey box:
It says there that, due to the internal workings of SSDs, without a
supercapacitor it is not possible to guarantee data integrity upon power
loss. This is true EVEN if you are running it with cache off (so I
suppose it is true even if you are using cache flushes or barriers).
Do you really want to lose data upon power loss? Caching writes on an
SSD is risky...
At least bcache caches only reads, this should be safe I think.
Using a SSD for filesystem journal when the filesystem is on a HDD I
suppose would not yield big improvements because the bottleneck will
always be the HDD writes, which won't stay much behind journal commits.
At most you are going to get the speed you have without a journal (like
ext2).
Same problem with write-mostly/write-behind I think. I don't know how
long the queue is that holds data already committed to the SSD and not
yet committed to the HDD, but it can't be too long. I'm reading "man
md" right now and it's not extremely clear on this. I have the
impression the queue between the two is either the
/sys/block/hdddevice/queue/nr_requests or it uses the write-intent
bitmap (if set). In the nr_requests case, it's gonna be very short, so
the SSD can give you quick bursts but sustained performance will be
that of the HDD. In the write-intent bitmap case, the delay between the
two can probably be unbounded, but be warned that if the HDD is even
just a bit behind the SSD, it's like not having the RAID: if the SSD
fails, you lose data (might be a lot of data; will also probably need fsck).
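The burst-then-throttle behaviour can be sketched with a small Python model,
whatever it is that actually bounds the queue. Expressing the limit in bytes
and the rates as constants is a simplification on my part, and write_behind()
is a hypothetical helper, not anything taken from md.

```python
def write_behind(total_bytes, fast_bps, slow_bps, max_backlog_bytes):
    """Model a writer that is acknowledged by the fast device while the
    slow device drains a bounded backlog.

    Returns (seconds until the writer finishes,
             bytes still owed to the slow device at that moment).
    """
    if fast_bps <= slow_bps:
        # The slow device keeps up, so no backlog ever builds.
        return total_bytes / fast_bps, 0.0
    # The backlog grows at the difference of the two rates until it is full.
    fill_s = max_backlog_bytes / (fast_bps - slow_bps)
    burst_bytes = fast_bps * fill_s
    if total_bytes <= burst_bytes:
        seconds = total_bytes / fast_bps
        return seconds, (fast_bps - slow_bps) * seconds
    # Once the backlog is full the writer is throttled to the slow rate.
    seconds = fill_s + (total_bytes - burst_bytes) / slow_bps
    return seconds, float(max_backlog_bytes)


if __name__ == "__main__":
    # Toy units: the fast device is twice as quick, backlog limit is 100.
    print(write_behind(100, 200, 100, 100))
    print(write_behind(1000, 200, 100, 100))
```

The second value returned is the amount of data that exists only on the fast
device at that moment, which is what would be at risk if that device died.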
What do you think?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-09 11:06 ` MRK
@ 2010-06-09 16:21 ` Aryeh Gregor
2010-06-10 12:08 ` MRK
0 siblings, 1 reply; 12+ messages in thread
From: Aryeh Gregor @ 2010-06-09 16:21 UTC (permalink / raw)
To: MRK; +Cc: CoolCold, Mario, linux-raid
On Wed, Jun 9, 2010 at 7:06 AM, MRK <mrk@shiftmail.org> wrote:
> Same problem with write-mostly/write-behind I think. I don't know how long
> is the queue that holds data already committed to the SSD and not yet
> committed to the HDD but it can't be too long. I'm reading the "man md"
> right now and it's not extremely clear on this. I have the impression the
> queue between the two it's either the /sys/block/hdddevice/queue/nr_requests
> or it uses the write-intent bitmap (if set). In case of the nr_requests,
> it's gonna be very short so the SSD can give you quick bursts but continuous
> performance will be that of the HDD.
I tried this once and posted some bonnie++ results:
https://kerneltrap.org/mailarchive/linux-raid/2010/1/31/6742263
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: It is possible to put write cache on ssd?
2010-06-09 16:21 ` Aryeh Gregor
@ 2010-06-10 12:08 ` MRK
0 siblings, 0 replies; 12+ messages in thread
From: MRK @ 2010-06-10 12:08 UTC (permalink / raw)
To: Aryeh Gregor; +Cc: CoolCold, Mario, linux-raid
On 06/09/2010 06:21 PM, Aryeh Gregor wrote:
> On Wed, Jun 9, 2010 at 7:06 AM, MRK<mrk@shiftmail.org> wrote:
>
>> Same problem with write-mostly/write-behind I think. I don't know how long
>> is the queue that holds data already committed to the SSD and not yet
>> committed to the HDD but it can't be too long. I'm reading the "man md"
>> right now and it's not extremely clear on this. I have the impression the
>> queue between the two it's either the /sys/block/hdddevice/queue/nr_requests
>> or it uses the write-intent bitmap (if set). In case of the nr_requests,
>> it's gonna be very short so the SSD can give you quick bursts but continuous
>> performance will be that of the HDD.
>>
> I tried this once and posted some bonnie++ results:
>
> https://kerneltrap.org/mailarchive/linux-raid/2010/1/31/6742263
>
Thanks for your tests. The write-mostly array seems to go roughly as
fast as the SSD itself if I interpret your tests correctly (have you
really saturated the write-behind queue?). An HDD-only test would have
been interesting though (with SSDs failed and removed).
Secondly:
I now realize that the write-behind distance is settable (man mdadm, see
--write-behind= ). However, it says there that it needs the write-intent
bitmap to work. This makes me think that it is not really safe upon SSD
failure. Is the data in the write-behind queue also kept in RAM, or does
it exist only on the SSD device (pointed to by the bitmap)? In the
second case, if the SSD dies, the HDD will likely be corrupt; it's not
really like having a RAID. In the first case, I don't understand why it
should need the write-intent bitmap active.
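For reference, here is how I read the man page: a sketch in Python that only
builds and prints the mdadm command line, without running anything. The device
names are hypothetical placeholders, mdadm_create_args() is a helper made up
for this illustration, and 256 is the default write-behind value as far as I
can tell from "man mdadm"; please check the options against your own mdadm
version before using them.

```python
import shlex


def mdadm_create_args(md_dev, fast_dev, slow_dev, write_behind=256):
    """Build (but do not run) an mdadm command for a RAID1 whose slow
    member is flagged write-mostly, with write-behind enabled.

    --write-behind needs a write-intent bitmap, hence --bitmap=internal.
    --write-mostly applies to the devices listed after it.
    """
    return [
        "mdadm", "--create", md_dev,
        "--level=1", "--raid-devices=2",
        "--bitmap=internal",
        f"--write-behind={write_behind}",
        fast_dev,
        "--write-mostly", slow_dev,
    ]


if __name__ == "__main__":
    # Hypothetical device names; substitute your own.
    print(shlex.join(mdadm_create_args("/dev/md0", "/dev/sdX1", "/dev/sdY1")))
```

Listing the SSD before --write-mostly and the HDD after it is what makes reads
prefer the SSD, while writes to the HDD may lag by up to the write-behind
limit.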
^ permalink raw reply [flat|nested] 12+ messages in thread