From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Martin Steigerwald <martin@lichtvoll.de>,
	Kai Krakow <hurikhan77@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Use fast device only for metadata?
Date: Mon, 8 Feb 2016 08:29:29 -0500
Message-ID: <56B89839.1060709@gmail.com>
In-Reply-To: <56B8962C.6050302@gmx.com>

On 2016-02-08 08:20, Qu Wenruo wrote:
> On 02/08/2016 08:24 PM, Austin S. Hemmelgarn wrote:
>> On 2016-02-07 15:59, Martin Steigerwald wrote:
>>> On Sunday, 7 February 2016, 21:07:13 CET, Kai Krakow wrote:
>>>> On Sun, 07 Feb 2016 11:06:58 -0800,
>>>> Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>>> Hello,
>>>>>
>>>>> I have a large home directory on a spinning disk that I regularly
>>>>> synchronize between different computers using unison. That takes ages,
>>>>> even though the number of changed files is typically small. I suspect
>>>>> most of the time is spent walking through the file system and checking
>>>>> mtimes.
>>>>>
>>>>> So I was wondering if I could possibly speed up this operation by
>>>>> storing all btrfs metadata on a fast SSD. It seems that
>>>>> mkfs.btrfs allows me to put the metadata in raid1 or dup mode, and the
>>>>> file contents in single mode. However, I could not find a way to tell
>>>>> btrfs to use a device *only* for metadata. Is there a way to do that?
>>>>>
>>>>> Also, what is the difference between using "dup" and "raid1" for the
>>>>> metadata?
>>>>
>>>> You may want to try bcache. It will speed up random access, which is
>>>> probably the main cause of your slow sync. Unfortunately, it requires
>>>> you to reformat your btrfs partitions to add a bcache superblock. But
>>>> it's worth the effort.
>>>>
>>>> I use a nightly rsync to USB3 disk, and bcache reduced it from 5+ hours
>>>> to typically 1.5-3 depending on how much data changed.
>>>
>>> An alternative is using dm-cache; I think it doesn't need to recreate
>>> the filesystem.
>> That's correct, dm-cache can use a regular underlying storage device.
>> This of course has potential implications for a multi-device filesystem
>> (it can seriously confuse BTRFS and cause data corruption), but it works
>> just fine for a single device filesystem.  This makes it a bit easier to
>> test run, but also means you need more devices (internally, it uses 3,
>> one backing device, one cache device, and a metadata device for
>> persistently mapping between the two).  It's really easy to set up
>> though if you have a recent version of LVM built with dm-cache support.
>>
>> In general, bcache takes a bit more setup, but avoids the multi-device
>> issues, and importantly, doesn't require LVM or dmsetup (which are
>> usually pretty big packages on many distros).  The caveat with bcache
>> though is that there have been issues in the past with data integrity
>> when used with BTRFS, but if you're on a recent kernel (at least 4.0 if
>> you're using BTRFS for actual data storage), you should have no issues.
>
> And I just want to add more about using a device *only* for metadata.
>
> The short answer is, unfortunately, NO.
>
> 1) Even with bcache/dm-cache, small data writes may still be cached
>
> I'm not quite sure about the details of dm-cache/bcache, but as long as
> the filesystem on top is Btrfs, it won't be possible to limit
> data/metadata to/from a specific device.
>
> IIRC, bcache or a similar method may cache most random r/w of metadata,
> but it's still quite possible for it to cache a lot of random r/w of data.
>
> And depending on the sector size (minimal data block size) and leaf size
> (metadata block size), it's even more likely to cache small data rather
> than metadata under specific workloads, as the default sectorsize is 4K
> but the default leafsize is 16K.
The mention of dm-cache/bcache was more intended as an alternative, 
since BTRFS currently can't do what Nikolaus was trying to achieve. 
Neither will give quite the performance profile that a dedicated 
metadata device might, but they should still significantly improve 
general performance.  In essence, these function for BTRFS like L2ARC on 
an SSD does for ZFS.
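
As a toy model of Qu's point 1 (a rough sketch, not how bcache or
dm-cache actually decide what to cache): a block-layer cache sees only
request offsets and sizes, not whether the blocks hold Btrfs data or
metadata. The Request type, the admit() policy, and the 64 KiB cutoff
are all made up for illustration; only the 4K sectorsize and 16K
leafsize defaults come from this thread.

```python
from dataclasses import dataclass

SECTORSIZE = 4 * 1024   # default Btrfs data block size, per the thread
LEAFSIZE = 16 * 1024    # default Btrfs metadata block size, per the thread
CUTOFF = 64 * 1024      # hypothetical size cutoff, not a real cache default


@dataclass
class Request:
    kind: str   # "data" or "metadata"; known to Btrfs, invisible to the cache
    size: int   # bytes; all the cache can see besides the offset


def admit(req: Request) -> bool:
    """Hypothetical admission policy: cache anything at or below the cutoff."""
    return req.size <= CUTOFF


requests = [
    Request("metadata", LEAFSIZE),
    Request("data", SECTORSIZE),
    Request("data", 2 * SECTORSIZE),
    Request("data", 1024 * 1024),
]

# The policy never looks at `kind`, so small data writes get cached
# right alongside metadata blocks.
cached = [r for r in requests if admit(r)]
cached_kinds = sorted({r.kind for r in cached})
print(cached_kinds)
```

Under these assumptions, both kinds end up in the cache, which is why
neither tool can act as a metadata-only device.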
>
> 2) Btrfs doesn't have any special preference in chunk allocation.
>
> Btrfs just allocates chunks in order of unallocated space.
> So even if there is a huge TB- or PB-scale spinning device and a GB-scale
> SSD, btrfs will just treat them according to unallocated space.
On at least the project page, there is a suggestion to provide this 
functionality.  In a way, it's essentially equivalent to the external 
journal device supported by ext4, XFS, OCFS2 and some other filesystems, 
and as such, I'd say it's a feature we should seriously consider looking 
at implementing eventually, even if just for feature parity, and even if 
we speed up metadata operations in BTRFS.
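
To make Qu's point 2 concrete, here is a minimal sketch of allocation
ordered purely by unallocated space. It is a simplification that
assumes the "single" profile and fixed 1 GiB chunks; the real allocator
also handles striping and redundancy profiles. The device names, sizes,
and the allocate_chunk() helper are hypothetical.

```python
GIB = 1024 ** 3


def allocate_chunk(unallocated: dict[str, int], chunk_size: int) -> str:
    """Pick the device with the most unallocated space and carve a chunk from it."""
    device = max(unallocated, key=unallocated.get)
    if unallocated[device] < chunk_size:
        raise RuntimeError("no device has room for another chunk")
    unallocated[device] -= chunk_size
    return device


# Hypothetical two-device filesystem: a 4 TiB spinning disk and a 64 GiB SSD.
unallocated = {"hdd": 4096 * GIB, "ssd": 64 * GIB}

# Allocate ten 1 GiB chunks; the chunk type (data or metadata) plays no role
# in the choice of device.
placements = [allocate_chunk(unallocated, GIB) for _ in range(10)]
print(placements)
```

In this model every chunk lands on the spinning disk until its
unallocated space drops to the SSD's level, so there is no way to steer
metadata chunks onto the small, fast device.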

