On 9/1/16 8:12 PM, Chris Murphy wrote:
> On Thu, Sep 1, 2016 at 12:47 PM, Austin S. Hemmelgarn wrote:
>
>> 2. Snapper's default snapshot creation configuration is absolutely
>> pathological in nature, generating insane amounts of background resource
>> usage and taking up huge amounts of space. If this were changed, you would
>> be a lot less dependent on being able to free up snapshots based on space
>> usage.
>
> That's diplomatic.
>
> They know all of this already though, but instead of toning down
> snapper defaults, they're amping up the volume by enabling quotas
> instead.
>
> There is only one logical reason for this that I can think of. They're
> trying to increase problem reports, presumably in order to smooth out
> noisy data, maybe even by getting better bug reports like Ronan's. But
> I think this is a specious policy.

There's no conspiracy to leverage the openSUSE user base to generate bug
reports, any more than enabling any other feature in Tumbleweed before
SLES is a conspiracy. We've enabled qgroups by default so that snapper
can make sane decisions based on space usage. That's it.

>> It's poor choices like this that fall into the category of 'Ooh, this
>> looks cool, let's do it!' made by major distros that are most of the
>> reason that BTRFS has such a bad reputation right now.
>
> Over on the Factory list, they're trying to have this two ways. First
> they're saying quotas are stable as they've implemented them in the
> Leap 4.4 kernel. And they consider the btrfs-progs man page warning,
> that quotas aren't yet stable even in 4.7 and aren't recommended
> unless the user will use them, to be a bug that should be removed from
> their copy of the man page.

Yep. That's a bug in the man page. We do consider them stable. I see
every btrfs bug that gets reported against SLE12 SP2, upon which the
Leap kernel is based. Have there been qgroups bugs over the development
cycle? You bet.
There's a reason that, if you look at the commit log for qgroups over
the past year, you'll see a bunch of fixes from SUSE developers.

I explained what I think Ronan's issue is in another part of the thread
just now. I don't think that's a severe issue at all. Annoying? Sure,
but I'm more concerned with the underlying ENOSPC issue. Without more
info, I don't know what the cause of it is or when it was introduced.

We, like every other group of file system developers, run xfstests
pretty religiously. Since qgroups are becoming a bigger part of the
btrfs experience for our products, we test them specifically. Yes,
there are xfstests /just/ for qgroups, but we also make it a point to
run the entire xfstests suite with and without qgroups enabled. Since
the requirement for snapper was to have accurate space tracking, that's
what we've focused on.

I obviously can't open up the SLES bugzilla to the world, so you're
going to have to take my word on this. For our 4.4-based kernel there
are currently three qgroup-related bugs. The first is a report about
how annoying it is to see old qgroup items for removed subvolumes. The
second is an accounting bug that is old and the developer just hasn't
gotten around to closing it yet. The third is a real issue, where users
can hit the qgroup limit and are then stuck, similar to how it used to
be when you'd hit ENOSPC and couldn't remove files or subvolumes.

My gut feeling is that it's the same kind of problem: removing files
involves allocating blocks to CoW the metadata, and when you've hit
your quota limit, you can't allocate those blocks. I expect the
solution will be similar to the ENOSPC fix, except that rather than
keeping a reserve pool around, we can just allow the CoW, knowing full
well that the intention is to release space. My team is working on that
today and I expect a fix shortly.

-Jeff

-- 
Jeff Mahoney
SUSE Labs
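[Editor's note: the stuck-at-quota scenario and the proposed fix can be
sketched as a toy model. This is illustrative Python, not btrfs code;
every name in it is hypothetical. The point it demonstrates is the one
made above: deleting a file first allocates blocks to CoW metadata, so
a filesystem sitting exactly at its qgroup limit can refuse the very
operation that would free space, unless the allocator exempts
allocations made on behalf of a release.]

```python
# Toy model of the qgroup-limit deadlock described above.
# Illustrative only; all names are hypothetical, not btrfs internals.

class QuotaError(Exception):
    pass

class ToyQgroup:
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def alloc(self, blocks, for_release=False):
        # Naive rule: refuse any allocation past the limit.
        # Proposed fix: allow the temporary CoW allocation when the
        # operation's stated intent is to release space.
        if self.used + blocks > self.limit and not for_release:
            raise QuotaError("qgroup limit exceeded")
        self.used += blocks

    def free(self, blocks):
        self.used -= blocks

def delete_file(qg, file_blocks, cow_metadata_blocks, allow_release_cow):
    # Deleting a file first CoWs metadata (an allocation), then frees
    # the file's blocks along with the temporary metadata copies.
    qg.alloc(cow_metadata_blocks, for_release=allow_release_cow)
    qg.free(file_blocks + cow_metadata_blocks)

# A filesystem exactly at its quota limit:
qg = ToyQgroup(limit=100)
qg.alloc(100)

# Naive behavior: the delete itself fails with a quota error.
try:
    delete_file(qg, file_blocks=10, cow_metadata_blocks=1,
                allow_release_cow=False)
    print("delete succeeded")
except QuotaError:
    print("stuck: cannot delete at quota limit")

# With the release-aware exemption, the delete proceeds and frees space.
delete_file(qg, file_blocks=10, cow_metadata_blocks=1,
            allow_release_cow=True)
print("used after delete:", qg.used)  # 90
```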