From: Dave Chinner <david@fromorbit.com>
To: Mike Fleetwood <mike.fleetwood@googlemail.com>
Cc: Tarik Ceylan <Tarik.Ceylan@ruhr-uni-bochum.de>,
	linux-xfs@vger.kernel.org, sandeen@sandeen.net
Subject: Re: How to reliably measure fs usage with reflinks enabled?
Date: Wed, 16 May 2018 10:13:42 +1000
Message-ID: <20180516001342.GK23861@dastard>
In-Reply-To: <CAMU1PDgMH_K1N71SmkwDj159wTjFM3js-hhFGr8uE_GhmLn5mA@mail.gmail.com>

On Tue, May 15, 2018 at 02:52:30PM +0100, Mike Fleetwood wrote:
> On 15 May 2018 at 02:29, Dave Chinner <david@fromorbit.com> wrote:
> > So the reflink code reserved ~7GB of space in the filesystem (less
> > than 1%) for its own reflink related metadata if it ever needs it.
> > It hasn't used it yet but we need to make sure that it's available
> > when the filesystem is near ENOSPC. Hence it's considered used space
> > because users cannot store user data in that space.
> >
> > The change I plan to make is to reduce the user reported filesystem
> > size rather than account for it as used space. IOWs, you'd see a
> > filesystem size of 889G instead of 896G, but have only 8.8GB used.
> > It means exactly the same thing and will behave exactly the same
> > way, it's just a different space accounting technique....
> 
> I'm one of the authors of GParted and it uses the reported file system
> size [1] and compares it to the block device size to see if the file
> system fills the partition or not and whether to show unallocated space
> to the user and advise them to grow the file system to fill the block
> device [2].  As such we prefer that the reported size of the file system
> match the highest offset that the file system can write to in the block
> device.

I think that's a narrow, use-case-specific assumption. There is
absolutely no guarantee that the filesystem on a device fills the
entire device or that the filesystem space reported by df/statvfs
accurately reflects the size of the underlying block device.
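
To illustrate where the df numbers come from, here is a minimal
Python sketch (not part of the original message) that derives size,
used and available figures from statvfs(2) in the way df
conventionally does. The function name df_figures is hypothetical.
Nothing in these figures refers to the size of the underlying block
device; they are whatever the filesystem chooses to report.

```python
import os

def df_figures(mount_point="/"):
    """Return (size, used, avail) in 1K blocks, derived from statvfs."""
    st = os.statvfs(mount_point)
    unit = st.f_frsize                    # fragment size: the unit of f_blocks
    size = st.f_blocks * unit // 1024     # total size reported by the filesystem
    used = (st.f_blocks - st.f_bfree) * unit // 1024
    avail = st.f_bavail * unit // 1024    # space available to unprivileged users
    return size, used, avail

if __name__ == "__main__":
    print(df_figures("/"))
```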

Filesystems are moving towards a virtualised world where space usage
and capacity are kept separate from the capacity of the underlying
storage provider. That's firmly the direction we are moving in with XFS:

https://www.spinics.net/lists/linux-xfs/msg12216.html

so we can support subvolumes:

https://www.youtube.com/watch?v=wG8FUvSGROw

via a virtual block address space that remaps the filesystem space
accounting away from the underlying physical block device:

https://lwn.net/SubscriberLink/753650/32230c15f3453808/

This will completely break any assumption that the filesystem size
is related to the underlying storage device(s).

GParted deals very firmly with a specific aspect of disk based
storage - managing partitions on a physical block device.
Filesystems need to move beyond physical block devices - sanely
supporting sparse virtual block devices has been on everyone's
enterprise filesystem wish list for years.

GParted doesn't have to support these new features - it can simply
turn them off for filesystems it creates on physical disk
partitions, but we're doing stuff to support the storage models
needed for container hosting, virtualisation, efficient backups and
cloning, etc. If that means we have to break assumptions that legacy
infrastructure makes to support those new features, then so be it....

<snip>

> [2] For full disclosure, because tools for various FSs under-report
>     their file system size, there is a heuristic that there must be at
>     least 2% difference before unallocated space and grow file system
>     recommendation is generated so under-reporting the FS size by less
>     than 1% wouldn't actually be an issue for us.

So, an ext3 example on a small root filesystem:

$ grep sda1 /proc/partitions 
   8        1    9984366 sda1
$ df -k /
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/root        9696448 8615892    581340  94% /
$

Just under 3% difference between fs reported size and the block
device size, and obviously GParted has been fine with this sort of
discrepancy on ext3 for the past 15+ years. IIRC the XFS metadata
reservations max out at around 3% of total filesystem space, so
GParted should be just fine with us hiding them by reducing total
filesystem size...
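
For reference, the "just under 3%" figure can be reproduced from the
two numbers in the example above. This is a small Python sketch, not
something from the original thread; the function name
size_shortfall_percent is hypothetical, and it assumes both values
are in 1K units, as /proc/partitions and df -k report them.

```python
def size_shortfall_percent(device_kib, fs_kib):
    """Percentage by which the reported fs size falls short of the device size."""
    return 100.0 * (device_kib - fs_kib) / device_kib

device_kib = 9984366   # sda1 size from /proc/partitions
fs_kib = 9696448       # 1K-blocks column from df -k /

pct = size_shortfall_percent(device_kib, fs_kib)
print(f"{pct:.2f}%")   # prints 2.88%
```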

> Just providing an app author's point of view.

*nod*.

We're aware that we need to let existing apps continue to work on
existing formats and features. But we need to break from the old
ways to do what people are asking us to do, so we're not going to
lock ourselves in. If we're not breaking old things and making
people unhappy, then we're not making sufficient progress.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
