* mix ssd and hdd in single volume
@ 2017-04-01 6:06 UGlee
2017-04-02 0:13 ` Duncan
2017-04-03 12:23 ` Austin S. Hemmelgarn
0 siblings, 2 replies; 6+ messages in thread
From: UGlee @ 2017-04-01 6:06 UTC
To: linux-btrfs
We are working on a small NAS server for home users. The product is
equipped with a small, fast SSD (around 60-120 GB) and a large HDD (2 TB
to 4 TB).
We have two choices:
1. using bcache to accelerate I/O operations
2. combining the SSD and HDD into a single btrfs volume.
Bcache is certainly designed for our purpose. But bcache requires
complex configuration and can only be set up on a clean disk. Also, in our
tests on Ubuntu 16.04, data inconsistency was observed at least once,
resulting in the total loss of the data on the HDD.
So we wonder whether simply putting the SSD and HDD into a single btrfs
volume, in whatever mode, would also make general read operations (mostly
readdir and getxattr) significantly faster than on a single HDD without
an SSD.
* Re: mix ssd and hdd in single volume
From: Duncan @ 2017-04-02 0:13 UTC
To: linux-btrfs
UGlee posted on Sat, 01 Apr 2017 14:06:11 +0800 as excerpted:
> We are working on a small NAS server for home user. The product is
> equipped with a small fast SSD (around 60-120GB) and a large HDD (2T to
> 4T).
>
> We have two choices:
>
> 1. using bcache to accelerate io operation
> 2. combining SSD and HDD into a single btrfs volume.
>
> Bcache is certainly designed for our purpose. But bcache requires
> complex configuration and can only start from clean disk. Also in our
> test in Ubuntu 16.04, data inconsistence was observed at least once,
> resulting total HDD data lost.
>
> So we wonder if simply putting SSD and HDD into a single btrfs volume,
> in whatever mode, the general read operation (mostly readdir and
> getxattr) will also be significantly faster than a single HDD without
> SSD.
At present, bcache, or possibly the lvmcache alternative, is the only
recommended way of creating a single btrfs on top of a mixed-size ssd/hdd
pair.
The problem is that, while such features have been considered, there's no
present method of telling btrfs to use the smaller ssd for hotter content.
The btrfs chunk allocator simply doesn't have that option at present.
Which would leave you with the choice of single, raid1 or raid0 modes.
Raid1 requires two copies on separate devices, which would mean the extra
space on the larger hdd would be wasted/unusable, and the read-mode
mirror choice algorithm is purely even/odd PID-based, so on single reads
you'd have a 50% chance of fast ssd reads and a 50% chance of slow hdd
reads. With single mode, the allocator allocates to the device with the
most space available first, so until the free space equalized between the
two, all chunks would end up on the larger/slower hdd. And raid0 would
allocate evenly (allocate-wide policy) to both, again wasting the extra
space on the larger device while only giving you overall about the same
speed as two hdds would give you, tho now and then, unpredictably, you'd
get the full speed of the ssd.
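As a rough illustration of the three behaviors described above, here is a
minimal Python sketch. It is a toy model, not btrfs code: the function
names, the 1 GiB chunk size and the 120/2000 GiB device sizes are all
assumptions made up for the example, metadata overhead is ignored, and the
PID rule is reduced to the reader's PID modulo two.

```python
"""Toy model of a two-device btrfs made of a small ssd and a large hdd."""


def allocate_single(free_gib, chunks, chunk_gib=1):
    """Single mode: each new chunk goes to the device with the most free space."""
    free = dict(free_gib)  # work on a copy so the caller's dict is untouched
    placed = {name: 0 for name in free}
    for _ in range(chunks):
        target = max(free, key=free.get)
        if free[target] < chunk_gib:
            break  # no device can hold another chunk
        free[target] -= chunk_gib
        placed[target] += 1
    return placed


def usable_gib(profile, ssd_gib, hdd_gib):
    """Approximate usable space on exactly two devices (hypothetical helper)."""
    small = min(ssd_gib, hdd_gib)
    if profile == "raid1":
        return small  # every chunk needs one copy on each device
    if profile == "raid0":
        return 2 * small  # every chunk is striped across both devices
    if profile == "single":
        return ssd_gib + hdd_gib
    raise ValueError(f"unknown profile: {profile}")


def raid1_read_device(pid, devices=("ssd", "hdd")):
    """Raid1 reads: the mirror is picked from even/odd PID (simplified)."""
    return devices[pid % 2]


if __name__ == "__main__":
    # The first 1000 chunks all land on the hdd, since it has more free space.
    print(allocate_single({"ssd": 120, "hdd": 2000}, chunks=1000))
    # raid1 and raid0 both leave most of the hdd unusable.
    print(usable_gib("raid1", 120, 2000), usable_gib("raid0", 120, 2000))
    # An even PID reads from one mirror, an odd PID from the other.
    print(raid1_read_device(4242), raid1_read_device(4243))
```

In this model the ssd receives no chunks in single mode until the hdd's
free space has dropped to the ssd's level, which is the opposite of what a
cache device should do.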
The default two-device setup, FWIW, is raid1-mode metadata for safety,
single-mode data.
As you can see, none of those are ideal from a fast-small-ssd as cache to
a large-slow-hdd perspective, thus the recommendation of bcache or
lvmcache if that's what you want/need.
The other alternative, of course, is separate filesystems, using a
combination of symlinks, partitioning and bind-mounts, to arrange for
frequently accessed and performance-critical stuff such as root and /home
to be on the smaller/faster ssd, while the larger/slower hdd is used for
stuff like a user's multimedia partition/filesystem. That's actually
what I've done here and I'm *very* happy with the result, but it's the
type of solution that must either be customized per-installation, or
perhaps be setup by a special-purpose distro installer designed with that
sort of use-case target in mind. It's /not/ the sort of thing you can do
in a NAS product and expect mass-market users to actually read and
understand the docs in order to use the product in an optimal way.
Meanwhile, since you appear to be designing a mass-market product, it's
worth mentioning that btrfs is considered, certainly by its devs and
users on this list, to be "still stabilizing, not fully stable and
mature." As such, making and having backups at the ready for any data
considered to be more valuable than the time and resources necessary to
make those backups is strongly recommended, even more so than when the
filesystem is considered stable and mature (tho certainly the rule
applies even then, but try telling that to a mass-market user...).
Additionally, since btrfs /is/ still stabilizing, we recommend that users
run relatively new kernels. We best support the latest two kernels in
either the current kernel series (thus 4.10 and 4.9 at present) or the
mainline LTS series (thus 4.9 and 4.4 at present), and further recommend
that users at least loosely follow the list in order to keep up with
current btrfs developments and possible issues they may confront.
That doesn't sound like a particularly good choice for a mass-market NAS
product to me. Of course there's rockstor and others out there already
shipping such products, but they're risking their reputation and the
safety of their customers' data in the process. There are certainly a
few customers out there with the time, desire and technical know-how to
keep the recommended backups and follow current kernels and the list, and
that's exactly the sort of people you'll find already here. But that's
not sufficiently mass-market to appeal to most vendors.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: mix ssd and hdd in single volume
From: Marat Khalili @ 2017-04-03 8:30 UTC
To: linux-btrfs
On 02/04/17 03:13, Duncan wrote:
> Meanwhile, since you appear to be designing a mass-market product, it's
> worth mentioning that btrfs is considered, certainly by its devs and
> users on this list, to be "still stabilizing, not fully stable and
> mature." [...] That doesn't sound like a particularly good choice for a mass-market NAS
> product to me. Of course there's rockstor and others out there already
> shipping such products, but they're risking their reputation and the
> safety of their customer's data in the process, altho there's certainly a
> few customers out there with the time, desire and technical know-how to
> ensure the recommended backups and following current kernel and list, and
> that's exactly the sort of people you'll find already here. But that's
> not sufficiently mass-market to appeal to most vendors.
You may want to look here:
https://www.synology.com/en-global/dsm/Btrfs
Somebody forgot to tell Synology, which already supports btrfs on all
hardware-capable devices. I think the Rubicon has been crossed in
'mass-market NAS[es]', for better or worse.
--
With Best Regards,
Marat Khalili
* Re: mix ssd and hdd in single volume
From: Roman Mamedov @ 2017-04-03 8:41 UTC
To: Marat Khalili; +Cc: linux-btrfs
On Mon, 3 Apr 2017 11:30:44 +0300
Marat Khalili <mkh@rqc.ru> wrote:
> You may want to look here: https://www.synology.com/en-global/dsm/Btrfs
> . Somebody forgot to tell Synology, which already supports btrfs in all
> hardware-capable devices. I think Rubicon has been crossed in
> 'mass-market NAS[es]', for good or not.
AFAIR Synology did not come to this list asking for (any kind of) advice
prior to implementing that (else they would have gotten the same kind of post
from Duncan and others), and it's not the Btrfs developers' job to run an
outreach program to contact vendors and educate them not to use Btrfs.
I don't remember seeing them actively contribute improvements or fixes,
especially for the RAID5 or RAID6 features (which they ADVERTISE on that page
as a fully working part of their product). That seems neither honest to end
users nor like playing nicely with the upstream developers. What the upstream
gets instead is just those end users coming here one by one some years later,
asking how to fix a broken Btrfs RAID5 on an embedded box running some 3.10 or
3.14 kernel.
--
With respect,
Roman
* Re: mix ssd and hdd in single volume
From: Austin S. Hemmelgarn @ 2017-04-03 12:23 UTC
To: matianfu, linux-btrfs
On 2017-04-01 02:06, UGlee wrote:
> We are working on a small NAS server for home user. The product is
> equipped with a small fast SSD (around 60-120GB) and a large HDD (2T
> to 4T).
>
> We have two choices:
>
> 1. using bcache to accelerate io operation
> 2. combining SSD and HDD into a single btrfs volume.
>
> Bcache is certainly designed for our purpose. But bcache requires
> complex configuration and can only start from clean disk. Also in our
> test in Ubuntu 16.04, data inconsistence was observed at least once,
> resulting total HDD data lost.
>
> So we wonder if simply putting SSD and HDD into a single btrfs volume,
> in whatever mode, the general read operation (mostly readdir and
> getxattr) will also be significantly faster than a single HDD without
> SSD.
Have you tried dm-cache? The general idea is similar to bcache, but
it's been much more reliable in my experience, and it's possible to add
it to an existing device without any need for reprovisioning (although
the existing device can't have any pending writes, otherwise you might
get some data corruption).
Additionally, given what you've said, write-through mode should cover
what you need in terms of performance, and may be more reliable on
bcache than writeback mode.
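To make the write-through versus writeback distinction concrete, here is a
minimal Python sketch of a block cache in front of a slow device. It is a
toy model under stated assumptions, not how bcache or dm-cache is
implemented: the ToyCache class, its method names and the dict-based
"devices" are all hypothetical.

```python
class ToyCache:
    """Toy ssd cache in front of an hdd, in write-through or writeback mode."""

    def __init__(self, mode):
        if mode not in ("writethrough", "writeback"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.ssd = {}       # cached blocks (fast device)
        self.hdd = {}       # backing blocks (slow device)
        self.dirty = set()  # blocks whose newest data exists only on the ssd

    def write(self, block, data):
        self.ssd[block] = data
        if self.mode == "writethrough":
            self.hdd[block] = data  # the hdd is updated on every write
        else:
            self.dirty.add(block)   # the hdd is updated later, if ever

    def read(self, block):
        if block in self.ssd:
            return self.ssd[block]  # cache hit: served from the ssd
        data = self.hdd[block]      # cache miss: fetch from the hdd
        self.ssd[block] = data
        return data

    def lose_cache(self):
        """Simulate losing the ssd; return the blocks whose newest data is gone."""
        lost = sorted(self.dirty)
        self.ssd.clear()
        self.dirty.clear()
        return lost


if __name__ == "__main__":
    for mode in ("writethrough", "writeback"):
        cache = ToyCache(mode)
        cache.write(1, "xattr data")
        print(mode, "blocks lost with the ssd:", cache.lose_cache())
```

In write-through mode the hdd always holds a complete copy, so reads of
cached blocks are still fast while a failed or inconsistent cache costs no
data. In writeback mode any block still marked dirty is lost along with
the ssd.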
* Re: mix ssd and hdd in single volume
From: Duncan @ 2017-04-07 3:12 UTC
To: linux-btrfs
Roman Mamedov posted on Mon, 03 Apr 2017 13:41:07 +0500 as excerpted:
> On Mon, 3 Apr 2017 11:30:44 +0300 Marat Khalili <mkh@rqc.ru> wrote:
>
>> You may want to look here: https://www.synology.com/en-global/dsm/Btrfs
>> . Somebody forgot to tell Synology, which already supports btrfs in all
>> hardware-capable devices. I think Rubicon has been crossed in
>> 'mass-market NAS[es]', for good or not.
>
> AFAIR Synology did not come to this list asking for (any kind of) advice
> prior to implementing that (else they would have gotten the same kind of
> post from Duncan and others)[.] I don't remember seeing them actively
> contribute improvements or fixes especially for the RAID5 or RAID6
> features (which they ADVERTISE on that page as a fully working part of
> their product).
> That doesn't seem honest to end users or playing nicely with the
> upstream developers. What the upstream gets instead is just those
> end-users coming here one by one some years later, asking how to fix
> a broken Btrfs RAID5 on an embedded box running some 3.10 or 3.14
> kernel.
And of course then the user gets the real state of btrfs and of btrfs
raid56 mode, particularly back that far, explained to them. Along with
that we'll explain that any data on it is in all likelihood lost data,
with little to no chance at recovery.
And we'll point out that if there was serious value in the data, they
would have investigated the state of the filesystem before they put that
data on it, and of course, as I've already said, they'd have had backups
for anything that was of more value than the time/resources/hassle of
doing those backups.
And if they're lucky, that NAS will have /been/ the backup, and they'll
still have the actual working copy at least, and can make another backup
ASAP just in case that working copy dies too.
But if they're unlucky...
Of course the user will then blame the manufacturer, but by that time the
warranty will be up, and even if not, while they /might/ get their money
back, that won't get their data back.
And the manufacturer will get a bad name, but by then having taken the
money and run they'll be on to something else or perhaps be bought out by
someone bigger or be out of business.
And all the user will be able to do is chalk it up to experience, and
mourn the loss of their kids' baby pictures/videos or their wedding
videos, or whatever. If they're /really/ lucky, they'll have put them on
facebook or youtube or whatever, and can retrieve at least those, from
there.
Meanwhile, the user, having been once burned, may never use the by then
much improved btrfs, or even worse, never trust anything Linux, again.
Oh, well. The best we can do here is warn those that /do/ value their
data enough to do their research first, so they /do/ have those backups
or at least use something a bit more mature than btrfs raid56 mode. And of
course continue to work on full btrfs stabilization. And I like to
think we're reasonably good at those warnings, anyway. The
stabilization, too, but that takes time and patience, plus coder skill,
the last of which I personally don't have, so I just pitch in where
I can, answering questions, etc.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman