linux-btrfs.vger.kernel.org archive mirror
* python-btrfs v10
@ 2019-01-18 23:10 Hans van Kranenburg
From: Hans van Kranenburg @ 2019-01-18 23:10 UTC
  To: linux-btrfs

I just tagged v10 of the python btrfs library.

https://github.com/knorrie/python-btrfs/

Sometime in the next few days I'll update PyPI and prepare Debian
packages for unstable and stretch-backports.

Note: I will be converting the branches in git for Debian packaging to
the DEP14 standard, also see
https://github.com/knorrie/python-btrfs/issues/14

---- >8 ----

Here's the summary of changes (from CHANGES):

python-btrfs v10, Jan 18, 2019
  * Switch to LGPL-3.0 license.
  * Add reference documentation to all code, using sphinx autodoc. An
    online version of the result in html format is available at:
    https://python-btrfs.readthedocs.io/en/latest/btrfs.html
  * All parts of the code that are covered by reference documentation
    are now considered to be the public API of the library. An effort
    will be made not to break this API in the future.
  * The FileSystem class now provides a context manager, to help prevent
    leaking the open file descriptor used internally.
  * The mounted_filesystems function in btrfs.utils module has been
    replaced by mounted_filesystem_paths. It is the responsibility of
    the caller to create the FileSystem objects for these paths and use
    the new context manager while doing so.
  * Add the fs_usage module, containing the FsUsage class, providing
    detailed usage reporting for a filesystem. This code replaces old
    similar functions, which were incomplete and buggy.
  * Adapt munin and nagios plugins to use the new FsUsage object.
  * IOCTLs: IOC_SYNC, FIDEDUPERANGE, IOC_GET_FEATURES
  * Data structures: FreeSpaceInfo, FreeSpaceExtent, FreeSpaceBitmap.
  * Introduce ItemNotFoundError, currently only raised by the block
    group lookup function in FileSystem. It inherits from IndexError for
    backwards compatibility.
  * Many small changes to improve the object pretty printer.
  * Fixes:
    - By default the maximum number of items generated by a single call
      to search or search_v2 would be limited to 2^32-1. Fix this, so
      it's unlimited. The search functions return an iterator and call
      the search ioctl of the linux kernel multiple times internally.
  * Examples added:
    - A new show_usage, which shows information from the new usage
      reporting.
    - space_calculator: Offline usable space calculator, based on disk
      sizes and profiles provided on the command line.
    - file_dedupe_range: Example about using the deduplication ioctl.
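
To illustrate two of the API notes above (the context manager and
ItemNotFoundError inheriting from IndexError), here is a minimal sketch.
It deliberately does not import the library: FileSystemSketch and its
block_group method are hypothetical stand-ins, and only the two patterns
themselves are taken from the changelog.

```python
import os


class ItemNotFoundError(IndexError):
    """Stand-in: subclassing IndexError keeps old handlers working."""


class FileSystemSketch:
    """Hypothetical stand-in for an object that keeps an fd open."""

    def __init__(self, path):
        self.path = path
        # Open the directory read-only, as a handle for later ioctl calls.
        self.fd = os.open(path, os.O_RDONLY)

    def close(self):
        # Safe to call more than once.
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # never swallow exceptions raised in the with block

    def block_group(self, vaddr):
        # Stand-in lookup that never finds anything.
        raise ItemNotFoundError("no block group at vaddr {}".format(vaddr))


def demo(path="."):
    with FileSystemSketch(path) as fs:
        try:
            fs.block_group(0)
        except IndexError as e:  # a handler written for the old behaviour
            caught = type(e).__name__
    # The fd has been closed by __exit__ at this point.
    return caught, fs.fd
```

Code that already catches IndexError keeps working unchanged, while new
code can catch the more specific exception, and the with block makes
sure the descriptor is closed even when the body raises.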

---- >8 ----

My favourite one for this release is the new detailed usage reporting.

By using an algorithm that is based on the behaviour of the chunk
allocator in kernel code, it simulates what would happen if all
remaining unallocated raw disk space of the filesystem were used up by
newly written data. While doing so, it takes the current usage pattern
of the filesystem into account (data vs metadata ratio).

https://github.com/knorrie/python-btrfs/commit/41f7f3ca8f32566cf2182949c3724295e3776fc8

After running the simulation of the chunk allocator algorithm, there
might still be unallocated raw disk space left when no further chunk
allocations using the current target profiles for data and metadata are
possible. These amounts of "unallocatable" raw disk space are also
available in the report.

Moreover, when re-running the same simulation, using the current usage
pattern, and starting with all available disk space as unallocated, we
can figure out the amounts of raw disk space that would be unallocatable
when converting all data to the target profiles, while taking into
account that there might be different sized attached block devices
spoiling the fun.

By combining both of those results, it's possible to report the amount
of unallocatable raw disk space that could be "reclaimed" for use when
doing btrfs balance to rewrite data.
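
To make the idea a bit more concrete, here is a heavily simplified
sketch. It is not the code from the library: it assumes a single RAID1
target profile, fixed 1 GiB device extents, and an allocator that always
picks the two devices with the most unallocated space. It ignores the
data vs metadata ratio entirely. All function names are made up for the
illustration, and the subtraction in reclaimable() is my reading of
"combining both of those results".

```python
GiB = 1024 ** 3


def leftover_after_raid1(unallocated, chunk=GiB):
    """Simulate RAID1 chunk allocation, return raw bytes left over.

    unallocated: list with unallocated raw bytes per device.
    Each step takes one chunk-sized extent from each of the two devices
    that currently have the most unallocated space.
    """
    free = sorted(unallocated, reverse=True)
    while len(free) >= 2 and free[1] >= chunk:
        free[0] -= chunk
        free[1] -= chunk
        free.sort(reverse=True)
    # Whatever is left cannot be used for another RAID1 chunk.
    return sum(free)


def reclaimable(device_sizes, unallocated, chunk=GiB):
    """Unallocatable space that a rewrite of all data could win back.

    now:   leftover when continuing from the current allocation state.
    fresh: leftover when starting over with everything unallocated.
    """
    now = leftover_after_raid1(unallocated, chunk)
    fresh = leftover_after_raid1(device_sizes, chunk)
    return max(0, now - fresh)
```

With devices of 4, 2 and 2 GiB, a fresh start leaves nothing behind, but
if only the 4 GiB device still has unallocated space, all 4 GiB of it is
unallocatable right now and reclaimable by rewriting data. With devices
of 6, 2 and 2 GiB, 2 GiB stays unallocatable no matter what.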

And if your head is not yet spinning right now, by combining all of
that, it's quite easy to predict whether a btrfs balance convert has
any chance of completing. Just watch the output of the new
show_usage.py example, or use your own cobbled-together script that
does things with attributes of your fs.usage() to follow progress.

The code works for all currently implemented block group profiles in btrfs.

But it's python, so isn't this very slow? I have a 72TiB btrfs
filesystem here with about 69k block groups, and crawling through the
whole chunk tree while retrieving it from the kernel, as well as doing
both the chunk allocator simulations for remaining unallocated space and
for a fully empty filesystem takes two and a half seconds in total. I
think that isn't bad.

---- >8 ----

So, it has been more than a year... What happened? Well, at first I
started writing tutorial style documentation. It can be seen in still
highly WIP state at
https://github.com/knorrie/python-btrfs/blob/tutorial/tutorial/README.md

While doing so, I ran into a bug, and another one... And a lot of things
that just didn't work for RAID0/5/6 yet.

Then I started fixing that by writing the new usage reporting code. When
this got into shape, I realized that no one other than myself would ever
be able to use all of that without proper reference documentation about
all objects and attributes.

So, I started writing reference documentation for it, and then for the
rest of the code. While doing so, I went over all of the code in the
library again, and fixed a huge number of small issues.

And, I'm happy with the end result. :) I think the reference docs will
help users a lot to find out what can be done using the library, and
have a fun time exploring.

---- >8 ----

Future work (or RSN and SIYH):

1) There have been repeated requests to promote some of the things that
are in examples/ to programs that will be put in /usr/bin. Quite a few
of them are very useful for end users. I will start a separate
discussion about this RSN on this mailing list. It will require
improving command line argument handling and writing man pages, and I
don't know if the btrfs team would like me to start polluting the
btrfs-* tab completion space again after they've tried to get rid of
all of those btrfs-* commands.

This is the very first thing I want to work on now.

2) Tutorial style documentation.

I will move the current work on tutorial style documentation as
referenced above into the sphinx stuff, so I can cross reference
everything. No ETA, since this is just my hobby project. ;]

Have fun, and don't hesitate to talk to me on IRC (Knorrie) or email me
if you run into problems using all of this. o/

--
Hans van Kranenburg
