On Wed, 22 Mar 2023 01:38:42 -0700 Christoph Hellwig wrote:
> On Tue, Mar 21, 2023 at 05:26:49PM -0400, Josef Bacik wrote:
> > We got the defaults based on our testing with our workloads inside of
> > FB. Clearly this isn't representative of normal desktop usage, but
> > we've also got a lot of workloads, so we figured that if it made the
> > whole fleet happy it would probably be fine everywhere.
> >
> > That being said, this is tunable for a reason; your workload seems to
> > generate a lot of freed extents and discards. We can probably mess
> > with the async code to pause discarding if there's no other
> > activity happening on the device at the moment, but tuning it to let
> > more discards through at a time is also completely valid. Thanks,
>
> FYI, discard performance differs a lot between different SSDs.
> It used to be pretty horrible for most devices early on, and then a
> certain hyperscaler started requiring decent performance for enterprise
> drives, so many of them are good now. A lot less so for the typical
> consumer drive, especially at the lower end of the spectrum.
>
> And that's just NVMe; the still-shipping SATA SSDs are a different
> story, not helped by the fact that we don't even support ranged
> discards for them in Linux.

Josef, what signal did you use to decide that a value was good enough? Did you crank the number up until the discard backlog cleared in a reasonable time?

I still don't understand what I should take into account when changing the default, or whether I should change it at all. Is it fine if it takes a week to get through the discard backlog? (Or a day? An hour? A minute?) Is it fine to send discards as fast as the device allows instead of doing 10 IOPS?

Does the IOPS limit account for a device-wear tradeoff? Then a low IOPS limit makes sense. Or is the IOPS limit just a way to reserve most of the bandwidth for non-discard workloads? Then I would say unlimited IOPS would make more sense as a default for btrfs.

--
Sergei
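[For readers following the thread: the tunable under discussion is the async-discard IOPS limit that btrfs exposes through sysfs. A minimal sketch of inspecting and raising it; the paths assume a kernel with async discard support (mounted with `discard=async`), and the value 1000 is purely illustrative, not a recommendation from the thread:]

```shell
# Find the filesystem UUID for the mount in question (here: /)
FSID=$(findmnt -no UUID /)

# Inspect the current discard backlog and the throttle
cat /sys/fs/btrfs/$FSID/discard/discardable_extents
cat /sys/fs/btrfs/$FSID/discard/discardable_bytes
cat /sys/fs/btrfs/$FSID/discard/iops_limit      # default is 10, as mentioned above

# Let more discards through per second (illustrative value, needs root)
echo 1000 > /sys/fs/btrfs/$FSID/discard/iops_limit
```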