All of lore.kernel.org
 help / color / mirror / Atom feed
* Using async discard by default with SSDs?
@ 2022-07-21 18:43 Neal Gompa
  2022-07-21 19:21 ` Roman Mamedov
  2022-07-25 19:08 ` David Sterba
  0 siblings, 2 replies; 13+ messages in thread
From: Neal Gompa @ 2022-07-21 18:43 UTC (permalink / raw)
  To: Btrfs BTRFS; +Cc: Chris Mason, Josef Bacik, Chris Murphy

Hey all,

I was talking with Chris Mason today at a Fedora Hatch event about
async discards (as we were thinking about doing this in Fedora some
time ago[1]), and he seemed to consider it reasonable to make it so
Btrfs uses async discards by default when being formatted on SSDs.

He and I couldn't think of a reason why not to, other than the
potential lack of "discard=none" option to turn off discards if the
user wanted it to. Do we already have this option? Are there any other
reasons not to do this? Or is this something we should have changed in
Btrfs so everyone gets async discards by default going forward?

Thanks in advance and best regards,
Neal

[1]: https://pagure.io/fedora-btrfs/project/issue/6

--
真実はいつも一つ!/ Always, there's only one truth!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-21 18:43 Using async discard by default with SSDs? Neal Gompa
@ 2022-07-21 19:21 ` Roman Mamedov
  2022-07-25 19:08 ` David Sterba
  1 sibling, 0 replies; 13+ messages in thread
From: Roman Mamedov @ 2022-07-21 19:21 UTC (permalink / raw)
  To: Neal Gompa; +Cc: Btrfs BTRFS, Chris Mason, Josef Bacik, Chris Murphy

On Thu, 21 Jul 2022 14:43:16 -0400
Neal Gompa <ngompa13@gmail.com> wrote:

> potential lack of "discard=none" option to turn off discards if the
> user wanted it to. Do we already have this option?

Like on Ext4, XFS and others, the "nodiscard" mount option is available:
https://btrfs.readthedocs.io/en/latest/btrfs-man5.html

-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-21 18:43 Using async discard by default with SSDs? Neal Gompa
  2022-07-21 19:21 ` Roman Mamedov
@ 2022-07-25 19:08 ` David Sterba
  2022-07-26 20:00   ` Chris Murphy
  1 sibling, 1 reply; 13+ messages in thread
From: David Sterba @ 2022-07-25 19:08 UTC (permalink / raw)
  To: Neal Gompa; +Cc: Btrfs BTRFS, Chris Mason, Josef Bacik, Chris Murphy

On Thu, Jul 21, 2022 at 02:43:16PM -0400, Neal Gompa wrote:
> Hey all,
> 
> I was talking with Chris Mason today at a Fedora Hatch event about
> async discards (as we were thinking about doing this in Fedora some
> time ago[1]), and he seemed to consider it reasonable to make it so
> Btrfs uses async discards by default when being formatted on SSDs.
> 
> He and I couldn't think of a reason why not to, other than the
> potential lack of "discard=none" option to turn off discards if the
> user wanted it to. Do we already have this option? Are there any other
> reasons not to do this? Or is this something we should have changed in
> Btrfs so everyone gets async discards by default going forward?

I think it's safe to use by default, with the usual question who could
be affected by that negatively. The async triggers, timeouts and
thresholds are preset conservatively and so far there have been no
complaints.

The tunables have been hidden under debug, but there are also some stats
(like how many bytes could be discarded in the next round), so it would
be good to also publish that when it's on by default.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-25 19:08 ` David Sterba
@ 2022-07-26 20:00   ` Chris Murphy
  2022-07-26 21:36     ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Murphy @ 2022-07-26 20:00 UTC (permalink / raw)
  To: David Sterba, Neal Gompa; +Cc: Btrfs BTRFS, Chris Mason, Josef Bacik



On Mon, Jul 25, 2022, at 3:08 PM, David Sterba wrote:

> I think it's safe to use by default, with the usual question who could
> be affected by that negatively. The async triggers, timeouts and
> thresholds are preset conservatively and so far there have been no
> complaints.
>
> The tunables have been hidden under debug, but there are also some stats
> (like how many bytes could be discarded in the next round), so it would
> be good to also publish that when it's on by default.

In my testing, discard=async rather quickly submits recently freed metadata blocks for gc, blocks referred by the backup roots in the super. Is this a concern for recovery? Is there a way to exclude the most recent ~5 generations of metadata from gc, and could it improve the usefulness of backup roots for recovery?

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-26 20:00   ` Chris Murphy
@ 2022-07-26 21:36     ` David Sterba
  2022-07-26 22:10       ` Chris Murphy
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2022-07-26 21:36 UTC (permalink / raw)
  To: Chris Murphy
  Cc: David Sterba, Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik

On Tue, Jul 26, 2022 at 04:00:42PM -0400, Chris Murphy wrote:
> 
> 
> On Mon, Jul 25, 2022, at 3:08 PM, David Sterba wrote:
> 
> > I think it's safe to use by default, with the usual question who could
> > be affected by that negatively. The async triggers, timeouts and
> > thresholds are preset conservatively and so far there have been no
> > complaints.
> >
> > The tunables have been hidden under debug, but there are also some stats
> > (like how many bytes could be discarded in the next round), so it would
> > be good to also publish that when it's on by default.
> 
> In my testing, discard=async rather quickly submits recently freed
> metadata blocks for gc, blocks referred by the backup roots in the
> super. Is this a concern for recovery? Is there a way to exclude the
> most recent ~5 generations of metadata from gc, and could it improve
> the usefulness of backup roots for recovery?

What is quickly here? The default timeout is set to 2 minutes and that's
what I've seen when testing that on a fresh filesystem. There's some
additional logic that's adaptive and derived from the latency increase
caused by the discard, but this could be observed on some specific
delete heavy workloads.

There's no logic that would take into account generation so it could
interfere with the backup root feature, but it's been unreliable already
and there are no guarantees to keep the recently freed blocks unallocated.
I've heared some feedback that the feature should be removed or
improved, I'm personally in the "removed" camp.

The block groups eligible for discard are evaluated based on time and
size criteria so I think adding a generation is straightforward, rough
guess is to btrfs_discard_check_filter(). This affects overall
performance though, FB guys wanted faster discard for their workload so
adding delay based on generation could make it worse. They now use the
default settings, though there are tunables, and I'd like to keep
defaults working for most users. If somebody evaluates the effects of
additional generational we can change it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-26 21:36     ` David Sterba
@ 2022-07-26 22:10       ` Chris Murphy
  2022-07-27 14:50         ` Chris Murphy
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Murphy @ 2022-07-26 22:10 UTC (permalink / raw)
  To: David Sterba; +Cc: Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik



On Tue, Jul 26, 2022, at 5:36 PM, David Sterba wrote:
> On Tue, Jul 26, 2022 at 04:00:42PM -0400, Chris Murphy wrote:
>> 
>> 
>> On Mon, Jul 25, 2022, at 3:08 PM, David Sterba wrote:
>> 
>> > I think it's safe to use by default, with the usual question who could
>> > be affected by that negatively. The async triggers, timeouts and
>> > thresholds are preset conservatively and so far there have been no
>> > complaints.
>> >
>> > The tunables have been hidden under debug, but there are also some stats
>> > (like how many bytes could be discarded in the next round), so it would
>> > be good to also publish that when it's on by default.
>> 
>> In my testing, discard=async rather quickly submits recently freed
>> metadata blocks for gc, blocks referred by the backup roots in the
>> super. Is this a concern for recovery? Is there a way to exclude the
>> most recent ~5 generations of metadata from gc, and could it improve
>> the usefulness of backup roots for recovery?
>
> What is quickly here?

Seconds, less than a minute, all the blocks pointed to by backup roots are empty (zeros).

 The default timeout is set to 2 minutes and that's
> what I've seen when testing that on a fresh filesystem. 

Ok I'll retest with 5.19 series. The last time I tested time to drive gc backup rootswas circa 5.8.

There's some
> additional logic that's adaptive and derived from the latency increase
> caused by the discard, but this could be observed on some specific
> delete heavy workloads.
>
> There's no logic that would take into account generation so it could
> interfere with the backup root feature, but it's been unreliable already
> and there are no guarantees to keep the recently freed blocks unallocated.
> I've heared some feedback that the feature should be removed or
> improved, I'm personally in the "removed" camp.

I have no strong opinion. If it very rarely works, I'm less concerned. But if it can sometimes work a significant minority of the time, some folks occasionally benefit in a pretty significant way.

--
Chris Murphy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-26 22:10       ` Chris Murphy
@ 2022-07-27 14:50         ` Chris Murphy
  2022-07-27 14:56           ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Murphy @ 2022-07-27 14:50 UTC (permalink / raw)
  To: David Sterba; +Cc: Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik



On Tue, Jul 26, 2022, at 6:10 PM, Chris Murphy wrote:
> On Tue, Jul 26, 2022, at 5:36 PM, David Sterba wrote:

>> What is quickly here?
>
> Seconds, less than a minute, all the blocks pointed to by backup roots 
> are empty (zeros).
>
>  The default timeout is set to 2 minutes and that's
>> what I've seen when testing that on a fresh filesystem. 
>
> Ok I'll retest with 5.19 series. The last time I tested time to drive 
> gc backup rootswas circa 5.8.

With 5.19, backup roots remain available for a surprisingly long time, definitely more than 2 minutes. It's not an exhaustive test, just a dozen samples over a half hour, but I never once saw zeros returned. Two out of those dozen, I saw the block already overwritten with a leaf of a newer generation and a completely different owner (e.g. csum tree written at the block address for the backup tree root)  - which would make that backup root useless.

What is a likely target kernel version to make discard=async the default? The merge window for 5.20 closes August 14. Is 5.21 a practical target?

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-27 14:50         ` Chris Murphy
@ 2022-07-27 14:56           ` David Sterba
  2022-07-27 15:14             ` Chris Murphy
                               ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: David Sterba @ 2022-07-27 14:56 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik

On Wed, Jul 27, 2022 at 10:50:01AM -0400, Chris Murphy wrote:
> On Tue, Jul 26, 2022, at 6:10 PM, Chris Murphy wrote:
> > On Tue, Jul 26, 2022, at 5:36 PM, David Sterba wrote:
> >> What is quickly here?
> >
> > Seconds, less than a minute, all the blocks pointed to by backup roots 
> > are empty (zeros).
> >
> >  The default timeout is set to 2 minutes and that's
> >> what I've seen when testing that on a fresh filesystem. 
> >
> > Ok I'll retest with 5.19 series. The last time I tested time to drive 
> > gc backup rootswas circa 5.8.
> 
> With 5.19, backup roots remain available for a surprisingly long time,
> definitely more than 2 minutes. It's not an exhaustive test, just a
> dozen samples over a half hour, but I never once saw zeros returned.
> Two out of those dozen, I saw the block already overwritten with a
> leaf of a newer generation and a completely different owner (e.g. csum
> tree written at the block address for the backup tree root)  - which
> would make that backup root useless.
> 
> What is a likely target kernel version to make discard=async the
> default? The merge window for 5.20 closes August 14. Is 5.21 a
> practical target?

The changes for the next merge window are supposed to be done a week or
two before it opens, but as this is a simple change I think I can
squeeze it in.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-27 14:56           ` David Sterba
@ 2022-07-27 15:14             ` Chris Murphy
  2022-07-27 17:47               ` David Sterba
  2022-07-27 15:26             ` Neal Gompa
  2022-08-05 15:14             ` David Sterba
  2 siblings, 1 reply; 13+ messages in thread
From: Chris Murphy @ 2022-07-27 15:14 UTC (permalink / raw)
  To: David Sterba; +Cc: Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik



On Wed, Jul 27, 2022, at 10:56 AM, David Sterba wrote:
> On Wed, Jul 27, 2022 at 10:50:01AM -0400, Chris Murphy wrote:
>> What is a likely target kernel version to make discard=async the
>> default? The merge window for 5.20 closes August 14. Is 5.21 a
>> practical target?
>
> The changes for the next merge window are supposed to be done a week or
> two before it opens, but as this is a simple change I think I can
> squeeze it in.

For 5.20?

I'm not aware of any conflict with fstrim. But I wonder if there's a preference to coordinate the change with util-linux folks? 

Currently, util-linux provides fstrim.timer which runs fstrim.service every Monday at 00:00 local time. The command is:

ExecStart=/usr/sbin/fstrim --listed-in /etc/fstab:/proc/self/mountinfo --verbose --quiet-unsupported

I'm not sure how they'd go about implementing an exception for Btrfs, either entirely or only if a discard mount option is detected. But I can ask?

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-27 14:56           ` David Sterba
  2022-07-27 15:14             ` Chris Murphy
@ 2022-07-27 15:26             ` Neal Gompa
  2022-08-05 15:14             ` David Sterba
  2 siblings, 0 replies; 13+ messages in thread
From: Neal Gompa @ 2022-07-27 15:26 UTC (permalink / raw)
  To: dsterba, Chris Murphy, Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik

On Wed, Jul 27, 2022 at 8:01 AM David Sterba <dsterba@suse.cz> wrote:
>
> On Wed, Jul 27, 2022 at 10:50:01AM -0400, Chris Murphy wrote:
> > On Tue, Jul 26, 2022, at 6:10 PM, Chris Murphy wrote:
> > > On Tue, Jul 26, 2022, at 5:36 PM, David Sterba wrote:
> > >> What is quickly here?
> > >
> > > Seconds, less than a minute, all the blocks pointed to by backup roots
> > > are empty (zeros).
> > >
> > >  The default timeout is set to 2 minutes and that's
> > >> what I've seen when testing that on a fresh filesystem.
> > >
> > > Ok I'll retest with 5.19 series. The last time I tested time to drive
> > > gc backup rootswas circa 5.8.
> >
> > With 5.19, backup roots remain available for a surprisingly long time,
> > definitely more than 2 minutes. It's not an exhaustive test, just a
> > dozen samples over a half hour, but I never once saw zeros returned.
> > Two out of those dozen, I saw the block already overwritten with a
> > leaf of a newer generation and a completely different owner (e.g. csum
> > tree written at the block address for the backup tree root)  - which
> > would make that backup root useless.
> >
> > What is a likely target kernel version to make discard=async the
> > default? The merge window for 5.20 closes August 14. Is 5.21 a
> > practical target?
>
> The changes for the next merge window are supposed to be done a week or
> two before it opens, but as this is a simple change I think I can
> squeeze it in.

It would definitely be nice to see this for 5.20. :)


-- 
真実はいつも一つ!/ Always, there's only one truth!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-27 15:14             ` Chris Murphy
@ 2022-07-27 17:47               ` David Sterba
  2022-07-28 12:27                 ` Chris Murphy
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2022-07-27 17:47 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik

On Wed, Jul 27, 2022 at 11:14:01AM -0400, Chris Murphy wrote:
> 
> 
> On Wed, Jul 27, 2022, at 10:56 AM, David Sterba wrote:
> > On Wed, Jul 27, 2022 at 10:50:01AM -0400, Chris Murphy wrote:
> >> What is a likely target kernel version to make discard=async the
> >> default? The merge window for 5.20 closes August 14. Is 5.21 a
> >> practical target?
> >
> > The changes for the next merge window are supposed to be done a week or
> > two before it opens, but as this is a simple change I think I can
> > squeeze it in.
> 
> For 5.20?

Yes, 5.20.

> I'm not aware of any conflict with fstrim. But I wonder if there's a
> preference to coordinate the change with util-linux folks? 

The -o discard and fstrim are independent and can be used at the same
time.

> Currently, util-linux provides fstrim.timer which runs fstrim.service every Monday at 00:00 local time. The command is:
> 
> ExecStart=/usr/sbin/fstrim --listed-in /etc/fstab:/proc/self/mountinfo --verbose --quiet-unsupported
> 
> I'm not sure how they'd go about implementing an exception for Btrfs,
> either entirely or only if a discard mount option is detected. But I
> can ask?

No need for an exception, during mount btrfs track ranges that have been
trimmed and skips them when discard is requested by fstrim, so it's a
no-op.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-27 17:47               ` David Sterba
@ 2022-07-28 12:27                 ` Chris Murphy
  0 siblings, 0 replies; 13+ messages in thread
From: Chris Murphy @ 2022-07-28 12:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik



On Wed, Jul 27, 2022, at 1:47 PM, David Sterba wrote:
> On Wed, Jul 27, 2022 at 11:14:01AM -0400, Chris Murphy wrote:
>> 
>> 
>> On Wed, Jul 27, 2022, at 10:56 AM, David Sterba wrote:
>> > On Wed, Jul 27, 2022 at 10:50:01AM -0400, Chris Murphy wrote:
>
>> > The changes for the next merge window are supposed to be done a week or
>> > two before it opens, but as this is a simple change I think I can
>> > squeeze it in.
>> 
>> For 5.20?
>
> Yes, 5.20.

Well I'll let Josef and Chris arm wrestle over who sends you a patch when they can get ahold of some round tuits. Once it's merged in 5.20, sounds easily backported for Fedora to carry a patch for 5.19.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Using async discard by default with SSDs?
  2022-07-27 14:56           ` David Sterba
  2022-07-27 15:14             ` Chris Murphy
  2022-07-27 15:26             ` Neal Gompa
@ 2022-08-05 15:14             ` David Sterba
  2 siblings, 0 replies; 13+ messages in thread
From: David Sterba @ 2022-08-05 15:14 UTC (permalink / raw)
  To: dsterba, Chris Murphy, Neal Gompa, Btrfs BTRFS, Chris Mason, Josef Bacik

On Wed, Jul 27, 2022 at 04:56:40PM +0200, David Sterba wrote:
> On Wed, Jul 27, 2022 at 10:50:01AM -0400, Chris Murphy wrote:
> > On Tue, Jul 26, 2022, at 6:10 PM, Chris Murphy wrote:
> > > On Tue, Jul 26, 2022, at 5:36 PM, David Sterba wrote:
> > >> What is quickly here?
> > >
> > > Seconds, less than a minute, all the blocks pointed to by backup roots 
> > > are empty (zeros).
> > >
> > >  The default timeout is set to 2 minutes and that's
> > >> what I've seen when testing that on a fresh filesystem. 
> > >
> > > Ok I'll retest with 5.19 series. The last time I tested time to drive 
> > > gc backup rootswas circa 5.8.
> > 
> > With 5.19, backup roots remain available for a surprisingly long time,
> > definitely more than 2 minutes. It's not an exhaustive test, just a
> > dozen samples over a half hour, but I never once saw zeros returned.
> > Two out of those dozen, I saw the block already overwritten with a
> > leaf of a newer generation and a completely different owner (e.g. csum
> > tree written at the block address for the backup tree root)  - which
> > would make that backup root useless.
> > 
> > What is a likely target kernel version to make discard=async the
> > default? The merge window for 5.20 closes August 14. Is 5.21 a
> > practical target?
> 
> The changes for the next merge window are supposed to be done a week or
> two before it opens, but as this is a simple change I think I can
> squeeze it in.

This will have to be postponed, Filipe found some problems,
transaction abort can happen due to ENOSPC when the system under load.
This does not happen in most cases but for a feature supposed to be on
by default such thing needs to be fixed first.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-08-05 15:19 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-21 18:43 Using async discard by default with SSDs? Neal Gompa
2022-07-21 19:21 ` Roman Mamedov
2022-07-25 19:08 ` David Sterba
2022-07-26 20:00   ` Chris Murphy
2022-07-26 21:36     ` David Sterba
2022-07-26 22:10       ` Chris Murphy
2022-07-27 14:50         ` Chris Murphy
2022-07-27 14:56           ` David Sterba
2022-07-27 15:14             ` Chris Murphy
2022-07-27 17:47               ` David Sterba
2022-07-28 12:27                 ` Chris Murphy
2022-07-27 15:26             ` Neal Gompa
2022-08-05 15:14             ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.