From: Hans van Kranenburg
To: Zygo Blaxell, Holger Hoffstätte
Cc: linux-btrfs
Subject: Re: Q: what exactly does SSD mode still do?
Date: Sat, 28 Mar 2020 22:31:26 +0100

On 3/28/20 8:35 PM, Zygo Blaxell wrote:
> On Fri, Mar 27, 2020 at 11:29:52AM +0100, Holger Hoffstätte wrote:
>> On 3/26/20 11:21 PM, Hans van Kranenburg wrote:
>>> 2) Metadata "cluster allocator" write behavior:
>>> *empty_cluster = SZ_64K  # nossd
>>> *empty_cluster = SZ_2M  # ssd
>>> This happens in extent-tree.c.
>> 2M used to be a common erase block size on SSDs. Or maybe it's just
>> a nice round number..  ¯\(ツ)/¯
> As a side-effect, 2M write clusters close the write hole on raid5/6 if you
> have an array that is a power of 2 data disks wide.  This capability is
> wasted when it's only available through the 'ssd' mount option.
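
The power-of-2 remark above can be checked with a bit of arithmetic. A minimal sketch, assuming the usual 64KiB per-device stripe element on btrfs raid5/6 (STRIPE_LEN below is that assumption; the function names are made up for illustration). A 2M cluster only covers whole stripes when the full data width divides 2M evenly, and the cluster start has to sit on a stripe boundary as well, which this does not check:

```python
STRIPE_LEN = 64 * 1024       # assumed per-device stripe element size
CLUSTER = 2 * 1024 * 1024    # empty_cluster with 'ssd' (SZ_2M)


def full_stripe_bytes(data_disks):
    """Bytes of data in one full raid5/6 stripe (parity excluded)."""
    return data_disks * STRIPE_LEN


def cluster_covers_whole_stripes(data_disks, cluster=CLUSTER):
    """True if the cluster size is an exact multiple of the data width."""
    return cluster % full_stripe_bytes(data_disks) == 0


# Data-disk counts (1..32) for which a 2M cluster is a whole number of stripes.
aligned = [n for n in range(1, 33) if cluster_covers_whole_stripes(n)]
print(aligned)
```

With these numbers only 1, 2, 4, 8, 16 and 32 data disks qualify; 3 data disks give a 192KiB data width, which does not divide 2M.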

Search for SSD_SPREAD in free-space-cache.c. There's this cont1_bytes
which is a fallback, so you'll have to run full SSD_SPREAD mode for this
to happen IINM. for a huge braindump
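
To make the fallback concrete, here is a rough Python model of how the cluster search picks its thresholds. It is reconstructed from memory of btrfs_find_space_cluster() in kernels of that era, so treat the exact expressions as an assumption and check free-space-cache.c before relying on them. The function name cluster_thresholds and the 4KiB sector size are made up for this sketch:

```python
SECTORSIZE = 4096  # assumed sector size


def cluster_thresholds(nbytes, empty_size, ssd_spread, is_metadata,
                       sectorsize=SECTORSIZE):
    """Return (cont1_bytes, min_bytes) for a cluster search.

    cont1_bytes: size the first contiguous piece of the cluster must have.
    min_bytes:   smallest free extent allowed to join the cluster at all.
    """
    if ssd_spread:
        # Whole request plus the empty_cluster slack must be one free chunk.
        cont1_bytes = min_bytes = nbytes + empty_size
    elif is_metadata:
        # One contiguous piece the size of the request is enough.
        cont1_bytes = nbytes
        min_bytes = sectorsize
    else:
        cont1_bytes = max(nbytes, (nbytes + empty_size) >> 2)
        min_bytes = sectorsize
    return cont1_bytes, min_bytes


# 16KiB metadata request with the 2M 'ssd' empty_cluster:
print(cluster_thresholds(16384, 2 * 1024 * 1024, ssd_spread=False, is_metadata=True))
print(cluster_thresholds(16384, 2 * 1024 * 1024, ssd_spread=True, is_metadata=True))
```

If the model is right, plain 'ssd' lets a metadata cluster start from a single 16KiB free extent and collect small fragments around it, while 'ssd_spread' insists on one free chunk of the full 2M plus the request.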

While running Linux 4.9 back then, I actually had to use 'ssd_spread'
for metadata (not for data, which was possible thanks to that 'bug') to
prevent metadata writes from running around in circles while writing the
extent tree. With 4.19, I can just use 'ssd', and TBH I have no idea
which change in between got rid of that insane amount of write overhead.
So I never continued researching the behavior of different options
(empty_cluster, cont1_bytes combinations).

> The behavior could be quite useful if it was properly integrated with
> the raid5/6 stuff:  set *empty_cluster = block group data width, make
> sure it's aligned to raid5/6 stripe boundaries, and use it for both data
> and metadata.
> It works by effectively making partially-filled clusters read-only.
> If we can guarantee that clusters are aligned to raid5/6 data/parity block
> boundaries, then btrfs can't allocate new data in partially filled raid5/6
> stripes, so it won't break the parity relation and won't have write hole.
>> cheers,
>> Holger
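
The invariant behind the stripe-aligned cluster idea quoted above can be shown with a toy model. This is not btrfs code: StripeAlignedAllocator, its methods and the 64KiB stripe element are all hypothetical, and the sketch ignores free-space reuse entirely. It only demonstrates that if the allocator skips to the next full-stripe boundary at every commit, no stripe ever holds data from two different commits, so already-committed parity is never rewritten:

```python
STRIPE_LEN = 64 * 1024  # assumed per-device stripe element size


class StripeAlignedAllocator:
    """Toy bump allocator whose cluster equals one full raid5/6 data stripe."""

    def __init__(self, data_disks, stripe_len=STRIPE_LEN):
        self.cluster = data_disks * stripe_len  # block group data width
        self.cursor = 0                         # next free byte offset

    def alloc(self, nbytes):
        """Hand out nbytes at the cursor; return the start offset."""
        start = self.cursor
        self.cursor += nbytes
        return start

    def commit(self):
        """Seal the partially filled stripe by rounding the cursor up."""
        self.cursor = -(-self.cursor // self.cluster) * self.cluster

    def stripe_of(self, offset):
        """Index of the full stripe that contains this byte offset."""
        return offset // self.cluster


alloc = StripeAlignedAllocator(data_disks=3)
first = alloc.alloc(16384)    # written in transaction 1
alloc.commit()
second = alloc.alloc(16384)   # written in transaction 2
print(alloc.stripe_of(first), alloc.stripe_of(second))
```

The cost is the unused tail of each sealed stripe, which a real implementation would have to reclaim some other way.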



