From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Goffredo Baroncelli <kreijack@libero.it>, linux-btrfs@vger.kernel.org
Cc: Michael <mclaud@roznica.com.ua>, Hugo Mills <hugo@carfax.org.uk>,
	Martin Svec <martin.svec@zoner.cz>,
	Wang Yugui <wangyugui@e16-tech.com>
Subject: Re: [RFC][PATCH V3] btrfs: ssd_metadata: storing metadata on SSD
Date: Sat, 30 May 2020 12:59:54 +0800	[thread overview]
Message-ID: <69939407-de18-e455-6c85-cd10683894be@gmx.com> (raw)
In-Reply-To: <20200405082636.18016-1-kreijack@libero.it>



On 2020/4/5 4:26 PM, Goffredo Baroncelli wrote:
>
> Hi all,
>
> This is an RFC; I wrote this patch because I find the idea interesting
> even though it adds more complexity to the chunk allocator.
>
> The core idea is to store the metadata on the ssd and to leave the data
> on the rotational disks. BTRFS looks at the rotational flag to tell the
> two kinds of disk apart.
>
> This new mode is enabled by passing the option ssd_metadata at mount time.
> This allocation policy is the "preferred" one; if it doesn't permit a
> chunk allocation, the "classic" one is used.
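>
> For example, with the patch applied the mode would be enabled at mount
> time like this (the device name is only illustrative):
>
>   # mount -o ssd_metadata /dev/sda /mnt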

One thing to improve here: in fact we could use existing members to
store the device-related info:
- btrfs_dev_item::seek_speed
- btrfs_dev_item::bandwidth (I would tend to rename it to IOPS)

In fact, what you're trying to do is to provide a policy to allocate
chunks based on each device's performance characteristics.
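
As a very rough, untested sketch (not code from the patch), such a
policy could be just another sort key for the per-device info, assuming
the seek_speed value from btrfs_dev_item had been copied into struct
btrfs_device_info when the device list is gathered:

static int btrfs_cmp_device_info_by_seek(const void *a, const void *b)
{
        const struct btrfs_device_info *di_a = a;
        const struct btrfs_device_info *di_b = b;

        /* Faster-seeking devices (e.g. SSDs) sort first. */
        if (di_a->seek_speed != di_b->seek_speed)
                return di_a->seek_speed > di_b->seek_speed ? -1 : 1;
        /* Otherwise keep the usual ordering by available space. */
        if (di_a->max_avail != di_b->max_avail)
                return di_a->max_avail > di_b->max_avail ? -1 : 1;
        return 0;
}

The allocator would then sort the candidate devices with such a
comparator for metadata chunks and keep the current one for data chunks.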

I believe it would be super awesome, but to get it upstream I guess we
would prefer a more flexible framework, so it may be pretty slow to merge.

But still, thanks for your awesome idea.

Thanks,
Qu


>
> Some examples (/dev/sd[abc] are ssd, /dev/sd[ef] are rotational):
>
> Non-striped profile: metadata->raid1, data->raid1
> The data is stored on /dev/sd[ef], the metadata on /dev/sd[abc].
> When /dev/sd[ef] are full, data chunks are allocated on /dev/sd[abc]
> as well.
>
> Striped profile: metadata->raid6, data->raid6
> raid6 requires 3 disks at minimum, so /dev/sd[ef] alone are not enough
> for a raid6 data profile. To allow a data chunk allocation, the raid6
> data chunks are spread over all the disks, /dev/sd[abcdef].
> The raid6 metadata chunks, instead, are allocated on /dev/sd[abc] only,
> because those disks are enough to host them.
>
> Changelog:
> v1: - first version
> v2: - rebased to v5.6.2
>     - corrected the comparison of the rotational disks (>= instead of >)
>     - added a rotational flag to struct btrfs_device_info to
>       simplify the comparison function (btrfs_cmp_device_info*())
> v3: - fixed the collision between BTRFS_MOUNT_DISCARD_ASYNC and
>       BTRFS_MOUNT_SSD_METADATA.
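>
> Rough illustration only (not the actual patch code): the "preferred,
> then classic" behaviour described above boils down to something like
> the following sketch, where find_free_dev_extents_on(), nonrot_devices
> and all_devices are made-up names used just for the sketch, and
> BTRFS_MOUNT_SSD_METADATA is the new mount flag:
>
>   if (btrfs_test_opt(fs_info, SSD_METADATA) &&
>       (type & BTRFS_BLOCK_GROUP_METADATA)) {
>           /* Preferred: try to place the chunk on non-rotational devices only. */
>           ret = find_free_dev_extents_on(fs_info, type, nonrot_devices);
>           if (ret != -ENOSPC)
>                   return ret;
>   }
>   /* Classic: fall back to the normal allocation over all devices. */
>   return find_free_dev_extents_on(fs_info, type, all_devices);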
>
> Below I collected some data to highlight the performance improvement.
>
> Test setup:
> As a test I performed a "dist-upgrade" of a Debian system from stretch
> to buster. The test used an image of a Debian stretch [1] with the
> needed packages already under /var/cache/apt/archives/ (so no
> networking was involved).
> For each test I formatted the filesystem from scratch, un-tarred the
> image and then ran "apt-get dist-upgrade" [2]. For each disk(s)/filesystem
> combination I measured the time of apt dist-upgrade with and without
> the flag "force-unsafe-io", which reduces the use of sync(2)/fsync(2).
> The ssd was 20GB, the hdd 230GB.
>
> I considered the following scenarios:
> - btrfs over ssd
> - btrfs over ssd + hdd with my patch enabled
> - btrfs over bcache over hdd+ssd
> - btrfs over hdd (very, very slow....)
> - ext4 over ssd
> - ext4 over hdd
>
> The test machine was an "AMD A6-6400K" with 4GB of RAM, of which about
> 3GB was used as cache/buffers.
>
> Data analysis:
>
> Of course btrfs is slower than ext4 when a lot of sync/flush calls are
> involved. Using apt on a rotational disk was a dramatic experience, and
> IMHO this should be replaced by using the btrfs snapshot capabilities.
> But this is another (not easy) story.
>
> Unsurprisingly, bcache performs better than my patch. But this is an
> expected result, because bcache can also cache the data chunks (reads
> can go directly to the ssd). bcache is about 60% slower than the
> btrfs-ssd reference when there are a lot of sync/flush calls, and only
> about 20% slower in the other case.
>
> Regarding the test with force-unsafe-io (fewer sync/flush calls), my
> patch reduces the time from +256% (hdd only) to +113%, which I consider
> a good result considering how small the patch is.
>
>
> Raw data:
> The data below is the "real" time (as returned by the time command)
> consumed by apt.
>
>
> Test description       real (mmm:ss)   Delta %
> --------------------   -------------   -------
> btrfs hdd w/sync            142:38       +533%
> btrfs ssd+hdd w/sync         81:04       +260%
> ext4 hdd w/sync              52:39       +134%
> btrfs bcache w/sync          35:59        +60%
> btrfs ssd w/sync             22:31    reference
> ext4 ssd w/sync              12:19        -45%
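>
> (The Delta % column is computed against the "btrfs ssd" reference row:
> e.g. for "btrfs hdd w/sync", 142:38 = 8558 s versus 22:31 = 1351 s,
> and (8558 - 1351) / 1351 ~= +533%.)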
>
>
>
> Test description       real (mmm:ss)   Delta %
> --------------------   -------------   -------
> btrfs hdd                    56:02       +256%
> ext4 hdd                     51:32       +228%
> btrfs ssd+hdd                33:30       +113%
> btrfs bcache                 18:57        +20%
> btrfs ssd                    15:44     reference
> ext4 ssd                     11:49        -25%
>
>
> [1] I created the image using "debootstrap stretch", then installed a
> set of packages with the following commands:
>
>   # debootstrap stretch test/
>   # chroot test/
>   # mount -t proc proc proc
>   # mount -t sysfs sys sys
>   # apt --option=Dpkg::Options::=--force-confold \
>         --option=Dpkg::options::=--force-unsafe-io \
>         install mate-desktop-environment* xserver-xorg vim \
>         task-kde-desktop task-gnome-desktop
>
> Then I updated the release from stretch to buster by changing the file
> /etc/apt/sources.list, and downloaded the packages for the dist-upgrade:
>
>   # apt-get update
>   # apt-get --download-only dist-upgrade
>
> Then I created a tar of this image.
> Before the dist-upgrade, the space used was about 7GB, with 2281
> packages installed. After the dist-upgrade, the space used was 9GB,
> with 2870 packages. The upgrade installed/updated about 2251 packages.
>
>
> [2] The actual command was a bit more complex, in order to avoid an
> interactive session:
>
>   # mkfs.btrfs -m single -d single /dev/sdX
>   # mount /dev/sdX test/
>   # cd test
>   # time tar xzf ../image.tgz
>   # chroot .
>   # mount -t proc proc proc
>   # mount -t sysfs sys sys
>   # export DEBIAN_FRONTEND=noninteractive
>   # time apt-get -y --option=Dpkg::Options::=--force-confold \
>         --option=Dpkg::options::=--force-unsafe-io dist-upgrade
>
>
> BR
> G.Baroncelli
>

Thread overview: 28+ messages
2020-04-05  8:26 [RFC][PATCH V3] btrfs: ssd_metadata: storing metadata on SSD Goffredo Baroncelli
2020-04-05  8:26 ` [PATCH] btrfs: add ssd_metadata mode Goffredo Baroncelli
2020-04-14  5:24   ` Paul Jones
2020-10-23  7:23   ` Wang Yugui
2020-10-23 10:11     ` Adam Borowski
2020-10-23 11:25       ` Qu Wenruo
2020-10-23 12:37         ` Wang Yugui
2020-10-23 12:45           ` Qu Wenruo
2020-10-23 13:10           ` Steven Davies
2020-10-23 13:49             ` Wang Yugui
2020-10-23 18:03           ` Goffredo Baroncelli
2020-10-24  3:26             ` Paul Jones
2020-04-05 10:57 ` [RFC][PATCH V3] btrfs: ssd_metadata: storing metadata on SSD Graham Cobb
2020-04-05 18:47   ` Goffredo Baroncelli
2020-04-05 21:58     ` Adam Borowski
2020-04-06  2:24   ` Zygo Blaxell
2020-04-06 16:43     ` Goffredo Baroncelli
2020-04-06 17:21       ` Zygo Blaxell
2020-04-06 17:33         ` Goffredo Baroncelli
2020-04-06 17:40           ` Zygo Blaxell
2020-05-29 16:06 ` Hans van Kranenburg
2020-05-29 16:40   ` Goffredo Baroncelli
2020-05-29 18:37     ` Hans van Kranenburg
2020-05-30  4:59 ` Qu Wenruo [this message]
2020-05-30  6:48   ` Goffredo Baroncelli
2020-05-30  8:57     ` Paul Jones
2020-04-27 15:06 Torstein Eide
2020-04-28 19:31 ` Goffredo Baroncelli
