From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mout.gmx.net ([212.227.15.15]:62403 "EHLO mout.gmx.net"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751764AbcBHNUw (ORCPT ); Mon, 8 Feb 2016 08:20:52 -0500
Subject: Re: Use fast device only for metadata?
To: "Austin S. Hemmelgarn", Martin Steigerwald, Kai Krakow
References: <874mdktk4t.fsf@vostro.rath.org>
        <20160207210713.7e4661a8@jupiter.sol.kaishome.de>
        <1507413.RERLDqpHyU@merkaba> <56B888FF.5080605@gmail.com>
Cc: linux-btrfs@vger.kernel.org
From: Qu Wenruo
Message-ID: <56B8962C.6050302@gmx.com>
Date: Mon, 8 Feb 2016 21:20:44 +0800
MIME-Version: 1.0
In-Reply-To: <56B888FF.5080605@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 02/08/2016 08:24 PM, Austin S. Hemmelgarn wrote:
> On 2016-02-07 15:59, Martin Steigerwald wrote:
>> On Sunday, 7 February 2016, 21:07:13 CET, Kai Krakow wrote:
>>> On Sun, 07 Feb 2016 11:06:58 -0800, Nikolaus Rath wrote:
>>>> Hello,
>>>>
>>>> I have a large home directory on a spinning disk that I regularly
>>>> synchronize between different computers using unison. That takes ages,
>>>> even though the amount of changed files is typically small. I suspect
>>>> most if the time is spend walking through the file system and checking
>>>> mtimes.
>>>>
>>>> So I was wondering if I could possibly speed-up this operation by
>>>> storing all btrfs metadata on a fast, SSD drive. It seems that
>>>> mkfs.btrfs allows me to put the metadata in raid1 or dup mode, and the
>>>> file contents in single mode. However, I could not find a way to tell
>>>> btrfs to use a device *only* for metadata. Is there a way to do that?
>>>>
>>>> Also, what is the difference between using "dup" and "raid1" for the
>>>> metadata?
>>>
>>> You may want to try bcache. It will speedup random access which is
>>> probably the main cause for your slow sync.
>>> Unfortunately it requires you to reformat your btrfs partitions to add a
>>> bcache superblock. But it's worth the efforts.
>>>
>>> I use a nightly rsync to USB3 disk, and bcache reduced it from 5+ hours
>>> to typically 1.5-3 depending on how much data changed.
>>
>> An alternative is using dm-cache, I think it doesn't need to recreate the
>> filesystem.
> That's correct, dm-cache can use a regular underlying storage device.
> This of course has potential implications for a multi-device filesystem
> (it can seriously confuse BTRFS and cause data corruption), but it works
> just fine for a single device filesystem. This makes it a bit easier to
> test run, but also means you need more devices (internally, it uses 3,
> one backing device, one cache device, and a metadata device for
> persistently mapping between the two). It's really easy to set up
> though if you have a recent version of LVM built with dm-cache support.
>
> In general, bcache takes a bit more setup, but avoids the multi-device
> issues, and importantly, doesn't require LVM or dmsetup (which are
> usually pretty big packages on many distros). The caveat with bcache
> though is that there have been issues in the past with data integrity
> when used with BTRFS, but if you're on a recent kernel (at least 4.0 if
> you're using BTRFS for actual data storage), you should have no issues.

I just want to add a bit more about using a device *only* for metadata.

The short answer is, unfortunately, NO.

1) Even with bcache/dm-cache, small data writes may still be cached

I'm not very familiar with dm-cache/bcache, but as long as the filesystem
on top is Btrfs, it isn't possible to restrict data or metadata to a
specific device.
IIRC, bcache or a similar method may cache most random r/w of metadata,
but it is still quite possible for it to cache a lot of random r/w of
data as well.

And depending on the sector size (minimal data block size) and leaf size
(metadata block size), it is even more likely to cache small data rather
than metadata under specific workloads, since the default sectorsize is
4K but the default leafsize is 16K.

2) Btrfs has no special preference in chunk allocation

Btrfs just allocates chunks in order of unallocated space. So even if
there is a huge TB or PB spinning device and a GB-level SSD, btrfs will
just treat them according to their unallocated space.

BTW, to really locate the bottleneck, it's better to use perf to find
which function btrfs spends most of its time in. Although it's a known
fact that btrfs is quite slow at metadata modification compared to other
filesystems, I'm still not sure whether that's the root cause here.

Thanks,
Qu
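
P.S. To make point 2 a bit more concrete, here is a toy model in Python.
It is only a sketch of the "most unallocated space wins" rule described
above, not the kernel's allocator: the function name, the device sizes
and the 1 GiB chunk size are all made up for illustration, and real
profiles such as raid1 or dup place several stripes per chunk.

```python
from collections import Counter

GIB = 1024 ** 3


def allocate_chunks(devices, requests):
    """Toy chunk allocator (hypothetical, for illustration only).

    devices:  dict mapping device name -> unallocated bytes
    requests: list of (kind, size_in_bytes), kind is "data" or "metadata"
    Returns a list of (kind, device_name) placements.
    """
    free = dict(devices)  # work on a copy, leave the caller's dict alone
    placement = []
    for kind, size in requests:
        # The chunk kind is deliberately ignored here: the only criterion
        # is which device currently has the most unallocated space.
        name = max(free, key=free.get)
        if free[name] < size:
            raise RuntimeError("no device has enough unallocated space")
        free[name] -= size
        placement.append((kind, name))
    return placement


# Assumed example setup: a 4 TiB spinning disk and a 64 GiB SSD.
devices = {"hdd": 4096 * GIB, "ssd": 64 * GIB}
requests = [("data", 1 * GIB), ("metadata", 1 * GIB)] * 8

placement = allocate_chunks(devices, requests)
metadata_devices = Counter(dev for kind, dev in placement if kind == "metadata")
print(metadata_devices)  # Counter({'hdd': 8})
```

In this model every metadata chunk ends up on the big spinning disk, and
the SSD is not touched until the HDD's unallocated space drops to the
SSD's level, which is the opposite of what the original question wants.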