From: Steven Davies <btrfs-list@steev.me.uk>
To: Hans van Kranenburg <hans@knorrie.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Exploring referenced extents
Date: Wed, 13 May 2020 21:08:05 +0100
Message-ID: <ac22328714cb200989294a451fc9930b@steev.me.uk>
In-Reply-To: <3e9446ef-955b-351c-8238-9ca07ee38bf6@knorrie.org>
On 2020-05-11 02:21, Hans van Kranenburg wrote:
> Hi!
Thanks for your insights!
> On 5/9/20 1:11 PM, Steven Davies wrote:
>> For curiosity I'm trying to write a tool which will show me the size of
>> data extents belonging to which files in a snapshot are exclusive to
>> that snapshot as a way to show how much space would be freed if the
>> snapshot were to be deleted, and which files in the snapshot are taking
>> up the most space.
<snip lots of useful information>
This is what I was missing when I read the documentation:
>> find what files reference it #1
>> for each referencing file:
>>   determine which subvolumes it lives in #2
>
> For this, we delegate the work to the running linux kernel code, to ask
> it who's using the extent at this disk_bytenr.
>
> https://python-btrfs.readthedocs.io/en/stable/btrfs.html#btrfs.ioctl.logical_to_ino_v2
>
> The main thing you're looking for is the ignore_offset option, which
> will give you a list of *any* user of *any* data in that extent, instead
> of only the first 4096 bytes in it which disk_bytenr itself is part of.
I did rework the script, though not the way you suggested: I still walk
the file tree and look up the extents, because my subvolumes are small
and stored on relatively fast SSDs, and this way lets me narrow the
search to a single directory. It seems to work now, although it isn't
pretty yet. It has already told me why the oldest snapshot of my /
subvolume is so large: it contains a dump of linux-firmware that isn't
shared with anything else.
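For illustration, here is a minimal sketch of the accounting step
described above: given the extents each file in the snapshot points at,
and the set of subvolume roots that reference each extent, add up the
bytes that only the snapshot itself uses. It is not taken from the linked
script. The function name, the dictionary layouts and the sample numbers
are all made up for the example, and how the references are collected
(for instance through logical_to_ino_v2 with ignore_offset, as discussed
above) is left out.

```python
from collections import defaultdict


def exclusive_bytes_per_file(file_extents, extent_roots, snapshot_root):
    """Sum, per file, the bytes of extents referenced only by snapshot_root.

    file_extents: {path: [(disk_bytenr, num_bytes), ...]}   (hypothetical layout)
    extent_roots: {disk_bytenr: set of subvolume root ids}  (hypothetical layout)
    """
    totals = defaultdict(int)
    for path, extents in file_extents.items():
        seen = set()
        for bytenr, num_bytes in extents:
            if bytenr in seen:
                continue  # one file may point at the same extent more than once
            seen.add(bytenr)
            # Exclusive means: the snapshot's root is the only referencing root.
            if extent_roots.get(bytenr, set()) == {snapshot_root}:
                totals[path] += num_bytes
    return dict(totals)


# Made-up sample data: root 300 is the snapshot, root 5 is another subvolume.
file_extents = {
    "lib/firmware/a.bin": [(1000, 4096), (9000, 8192)],
    "etc/fstab": [(5000, 4096)],
    "var/log/old.log": [(7000, 4096)],
}
extent_roots = {1000: {300}, 9000: {300}, 5000: {5, 300}, 7000: {300}}

result = exclusive_bytes_per_file(file_extents, extent_roots, 300)
# Largest consumers first, as a starting point for a tree-like report.
ranking = sorted(result.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

Note that an extent shared by two files inside the same snapshot is
counted once for each file here, so the per-file figures can add up to
more than the space that deleting the snapshot would actually free.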
Next job - make it into a tree-like utility.
https://github.com/daviessm/btrfs-snapshots-diff/blob/4003a3fdec70c2a0de348e75a6576f9342754f54/btrfs-subvol-size.py
--
Steven Davies