* python-btrfs v10 preview... detailed usage reporting and a tutorial
From: Hans van Kranenburg @ 2018-09-23 21:54 UTC
To: linux-btrfs

Hi all,

I'm planning for a python-btrfs release to happen in about a week. All
new changes are in the develop branch:

https://github.com/knorrie/python-btrfs/commits/develop

tl;dr: check out the two new examples added in the latest git commits
and see if they provide correct info!

## Detailed usage reporting

The new FsUsage object provides information on different levels
(physical allocated bytes on devices, virtual space usage, etc) and also
contains code to estimate how much space is still actually really
available before ENOSPC happens (e.g. the values that would ideally
show up in df output). It works for all allocation profiles!

Two examples have been added, which use the new code. I would appreciate
extra testing. Please try them and see if the reported numbers make sense:

space_calculator.py
-------------------
Best to be initially described as a CLI version of the well-known
webbased btrfs space calculator by Hugo. ;] Throw a few disk sizes at
it, choose data and metadata profile and see how much space you would
get to store actual data.

See commit message "Add example to calculate usable and wasted space"
for example output.

show_usage.py
-------------
The contents of the old show_usage.py example that simply showed a list
of block groups are replaced with a detailed usage report of an existing
filesystem.

See commit message "A new show usage example!" for example output.

## A btrfs tutorial!

A while ago I started creating documentation for python-btrfs in
tutorial style. By playing around with an example filesystem we learn
where btrfs puts our data on the disks, what a chunk, a block group and
an extent are, how we can quickly look up interesting things in metadata
and how cows climb trees, moo.

https://github.com/knorrie/python-btrfs/issues/11
https://github.com/knorrie/python-btrfs/blob/tutorial/tutorial/README.md

I'm not sure yet if I'm going to 'ship' the first few pages already,
since it's still very much a work in progress, but in any case feedback
/ ideas are welcome. Have a look!

## Other changes

Other changes are the addition of the sync, fideduperange and
get_features ioctl calls and a workaround for python 3.7 which breaks
the struct module api.

## P.S.

And finally, when doing the above, I discovered a few extra unintended
features and bugs in the btrfs chunk allocator (Did you know RAID10
block groups are limited to 5GiB in size? Did you know that when the
last chunk added on a disk is of DUP type, it could end up having an end
beyond the limit of a device?). I still have to actually test the second
one by causing it to happen. If anyone is interested to help with that,
please ask about it.

The bugs are all related to repeated kernel code all over the place
containing a lot of if statements dealing with different kinds of
allocation profiles and their exceptions. What I ended up doing is
making a few helper functions instead, see the commit "Add volumes.py,
handling device / chunk logic". It would probably be nice to do the same
in the kernel code, which would also solve the mentioned bugs and
prevent new similar ones from happening.

Have fun,

--
Hans van Kranenburg
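For readers who want a feel for what a space calculator like the one described above has to work out, here is a minimal sketch. It is not the code from space_calculator.py and does not use the python-btrfs API: `PROFILES` and `usable_space` are hypothetical names, every chunk is assumed to take a 1 GiB stripe per device, devices with the most free space are picked first, DUP is left out, and the minimum device counts are assumptions about the profiles as they behaved around the time of this thread.

```python
# Hypothetical sketch: NOT the python-btrfs API and not the kernel allocator.
# Sizes are whole GiB; every chunk takes one 1 GiB stripe per chosen device.

# profile -> (min devices, max devices or None for "all", data stripes fn)
PROFILES = {
    "single": (1, 1, lambda n: n),
    "raid1": (2, 2, lambda n: 1),
    "raid0": (2, None, lambda n: n),
    "raid10": (4, None, lambda n: n // 2),
    "raid5": (2, None, lambda n: n - 1),
    "raid6": (3, None, lambda n: n - 2),
}


def usable_space(disk_sizes, profile):
    """Return (usable, wasted) in GiB for the given disks and profile."""
    min_devs, max_devs, data_stripes = PROFILES[profile]
    free = list(disk_sizes)
    usable = 0
    while True:
        # Devices that still have room, most free space first.
        candidates = sorted(
            (i for i, f in enumerate(free) if f > 0),
            key=lambda i: free[i],
            reverse=True,
        )
        if max_devs is not None:
            candidates = candidates[:max_devs]
        if profile == "raid10":
            # RAID10 needs an even number of stripes.
            candidates = candidates[: len(candidates) // 2 * 2]
        if len(candidates) < min_devs:
            break
        # Allocate one chunk: one stripe on every chosen device.
        for i in candidates:
            free[i] -= 1
        usable += data_stripes(len(candidates))
    # Whatever is left on the devices can no longer be allocated.
    return usable, sum(free)
```

In this model `usable_space([5, 1, 1], "raid1")` returns `(2, 3)`: once the two small disks are full, the remaining 3 GiB on the large disk has no partner device and is wasted.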
* Re: python-btrfs v10 preview... detailed usage reporting and a tutorial
From: Adam Borowski @ 2018-09-23 23:19 UTC
To: Hans van Kranenburg; +Cc: linux-btrfs

On Sun, Sep 23, 2018 at 11:54:12PM +0200, Hans van Kranenburg wrote:
> Two examples have been added, which use the new code. I would appreciate
> extra testing. Please try them and see if the reported numbers make sense:
>
> space_calculator.py
> -------------------
> Best to be initially described as a CLI version of the well-known
> webbased btrfs space calculator by Hugo. ;] Throw a few disk sizes at
> it, choose data and metadata profile and see how much space you would
> get to store actual data.
>
> See commit message "Add example to calculate usable and wasted space"
> for example output.
>
> show_usage.py
> -------------
> The contents of the old show_usage.py example that simply showed a list
> of block groups are replaced with a detailed usage report of an existing
> filesystem.

I wonder, perhaps at least some of the examples could be elevated to
commands meant to be run by end-user? Ie, installing them to /usr/bin/,
dropping the extension? They'd probably need less generic names, though.


Meow!
--
⢀⣴⠾⠻⢶⣦⠀ 10 people enter a bar:
⣾⠁⢰⠒⠀⣿⡁ • 1 who understands binary,
⢿⡄⠘⠷⠚⠋⠀ • 1 who doesn't,
⠈⠳⣄⠀⠀⠀⠀ • and E who prefer to write it as hex.
* Re: python-btrfs v10 preview... detailed usage reporting and a tutorial
From: Hans van Kranenburg @ 2018-10-08 0:03 UTC
To: Adam Borowski; +Cc: linux-btrfs

Hi,

On 09/24/2018 01:19 AM, Adam Borowski wrote:
> On Sun, Sep 23, 2018 at 11:54:12PM +0200, Hans van Kranenburg wrote:
>> Two examples have been added, which use the new code. I would appreciate
>> extra testing. Please try them and see if the reported numbers make sense:
>>
>> space_calculator.py
>> -------------------
>> Best to be initially described as a CLI version of the well-known
>> webbased btrfs space calculator by Hugo. ;] Throw a few disk sizes at
>> it, choose data and metadata profile and see how much space you would
>> get to store actual data.
>>
>> See commit message "Add example to calculate usable and wasted space"
>> for example output.
>>
>> show_usage.py
>> -------------
>> The contents of the old show_usage.py example that simply showed a list
>> of block groups are replaced with a detailed usage report of an existing
>> filesystem.
>
> I wonder, perhaps at least some of the examples could be elevated to
> commands meant to be run by end-user? Ie, installing them to /usr/bin/,
> dropping the extension? They'd probably need less generic names, though.

Some of the examples are very useful, and I keep using them frequently.
That's actually also the reason that, for now, I have just copied
examples/ to /usr/share/doc/python3-btrfs/examples for the Debian
package, so that they're easily available on all systems that I work on.

Currently the examples collection is serving a few purposes. It's my
poor man's testing framework, which covers all functionality of the lib.
It displays all the things that you can do. There's a rich git commit
message history on them, which I plan to transform into documentation
and tutorial stuff later.

So, yes, a bunch of the things are quite useful actually. The new
show_usage and space_calculator are examples of what is possible, and
they start to rise above the small debugging-level thingies.

So what would be candidates to be promoted to 'official' utils?

0) Ah, btrfs-heatmap

Yeah, that's the thing it all started with. I started writing all of the
code to be able to debug why my filesystems were allocating raw disk
space all the time and not reusing the free already allocated space.

But, that one is already done.
https://github.com/knorrie/btrfs-heatmap/

1) Custom btrfs balance

If really needed (and luckily, the need for it is mostly removed after
solving the -o ssd issues) I always use balance_least_used.py instead of
regular btrfs balance. I think it totally makes sense to do the analysis
of what block groups to feed to balance in what order in user space.

I also used another custom script to feed block groups with highly
fragmented free space to balance to try repairing filesystems that had
been using the cluster data extent allocator. That's not in examples,
but when you combine show_free_space_fragmentation with parts of
balance_least_used, you get the idea.

The best example I can think of here is a program that uses the new
usage information to find out how to feed block groups to balance to
actually get a balanced filesystem with minimal amount of wasted raw
space, and then do exactly that in the quickest way possible while
providing interesting progress information, instead of just brute force
rewriting all of the data and having no idea what's actually happening.

2) Advanced usage reporting

Something like the new show_usage, but hey, when using python with some
batteries included, I guess we can relatively easily do a nice html or
pdf output with pie and bar charts which provide the user with
information about the filesystem.

Just having users run that when they're asking for help on IRC and share
the result would be nice. :o)

3) The space calculator

Yup, obviously.

4) Maybe show_orphan_cleaner_progress

I use that one now and then to get a live view on mass-removal of
subvolumes (backup snapshot expiry), but it's very close to a debug
tool. Or maybe I'm already spoiled and used to it now, and I don't
realize any more how frustrating it must be to see disk IO and cpu go
all places and have no idea about what btrfs is doing.

5) So much more...

So... the examples are just basic test coverage. There is so much more
that can be done.

And yes, to be able to write a small thingie that uses the lib, you
already have to know a lot about btrfs. -> That's why I started writing
the tutorial.

And yes, when promoting things like the new show_usage example to
programs that are easily available, users will probably start parsing
the output of them with sed and awk which is a total abomination and the
absolute opposite of the purpose of the library. So be it. Let it go. :D
"The code never bothered me any way".

The interesting question that remains is where the result should go.

btrfs-heatmap is a thing of its own now, but it's a bit of the "show
case" example using the lib, with its own collection of documentation
and even possibility to script it again.

Shipping the 'binaries' in the python3-btrfs package wouldn't be the
right thing, so where should they go? apt-get install btrfs-moar-utils-yolo?

Or should btrfs-progs start to use this to accelerate improvement for
providing a richer collection of useful progs for things that are not on
essential level (like, you won't need them inside initramfs, so they can
use python)?

--
Hans van Kranenburg
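The idea in 1), deciding in user space which block groups to feed to balance and in what order, can be sketched roughly as follows. This is not the code of balance_least_used.py; `BlockGroup`, `balance_order` and the 50% cutoff are hypothetical, and the block group numbers would in practice come from the filesystem's metadata instead of being typed in by hand.

```python
from dataclasses import dataclass


@dataclass
class BlockGroup:
    # Hypothetical stand-in for block group info read from the filesystem.
    vaddr: int   # virtual start address, in bytes
    length: int  # size of the block group, in bytes
    used: int    # bytes in use inside the block group

    @property
    def used_pct(self):
        return 100 * self.used / self.length


def balance_order(block_groups, max_used_pct=50):
    """Return block groups at or below the cutoff, least used first."""
    picked = [bg for bg in block_groups if bg.used_pct <= max_used_pct]
    return sorted(picked, key=lambda bg: bg.used_pct)


if __name__ == "__main__":
    GiB, MiB = 1 << 30, 1 << 20
    groups = [
        BlockGroup(0 * GiB, GiB, 900 * MiB),
        BlockGroup(1 * GiB, GiB, 100 * MiB),
        BlockGroup(2 * GiB, GiB, 400 * MiB),
    ]
    for bg in balance_order(groups):
        print(f"vaddr {bg.vaddr} used {bg.used_pct:.0f}%")
```

Each selected block group could then be handed to balance one at a time, for instance by restricting balance to that block group's virtual address with its vrange filter, so that the least amount of data is rewritten for the most raw space freed.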
* Re: python-btrfs v10 preview... detailed usage reporting and a tutorial
From: Adam Borowski @ 2018-10-08 5:42 UTC
To: Hans van Kranenburg; +Cc: linux-btrfs

On Mon, Oct 08, 2018 at 02:03:44AM +0200, Hans van Kranenburg wrote:
> And yes, when promoting things like the new show_usage example to
> programs that are easily available, users will probably start parsing
> the output of them with sed and awk which is a total abomination and the
> absolute opposite of the purpose of the library. So be it. Let it go. :D
> "The code never bothered me any way".

It's not like some deranged person would parse the output of, say,
show_file in Perl...

> The interesting question that remains is where the result should go.
>
> btrfs-heatmap is a thing of its own now, but it's a bit of the "show
> case" example using the lib, with its own collection of documentation
> and even possibility to script it again.
>
> Shipping the 'binaries' in the python3-btrfs package wouldn't be the
> right thing, so where should they go? apt-get install btrfs-moar-utils-yolo?

At least in Debian, moving executables between packages is a matter of
versioned Replaces (+Conflicts: old), so if at any point you decide
differently it's not a problem. So btrfs-moar-utils-yolo should work well.

> Or should btrfs-progs start to use this to accelerate improvement for
> providing a richer collection of useful progs for things that are not on
> essential level (like, you won't need them inside initramfs, so they can
> use python)?

You might want your own package that's agile and btrfs-progs for things
declared to be rock stable (WRT command-line API, not necessarily
stability of code).


Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ 10 people enter a bar: 1 who understands binary,
⢿⡄⠘⠷⠚⠋⠀ 1 who doesn't, D who prefer to write it as hex,
⠈⠳⣄⠀⠀⠀⠀ and 1 who narrowly avoided an off-by-one error.
* Re: python-btrfs v10 preview... detailed usage reporting and a tutorial
From: Nikolay Borisov @ 2018-09-24 8:08 UTC
To: Hans van Kranenburg, linux-btrfs

On 24.09.2018 00:54, Hans van Kranenburg wrote:
<snip>
>
> The bugs are all related to repeated kernel code all over the place
> containing a lot of if statements dealing with different kinds of
> allocation profiles and their exceptions. What I ended up doing is
> making a few helper functions instead, see the commit "Add volumes.py,
> handling device / chunk logic". It would probably be nice to do the same
> in the kernel code, which would also solve the mentioned bugs and
> prevent new similar ones from happening.

Would you care to report each bug separately so they can be triaged and
fixed?

>
> Have fun,
>
* Re: python-btrfs v10 preview... detailed usage reporting and a tutorial
From: Hans van Kranenburg @ 2018-09-28 23:04 UTC
To: Nikolay Borisov, linux-btrfs

On 09/24/2018 10:08 AM, Nikolay Borisov wrote:
>>
>> The bugs are all related to repeated kernel code all over the place
>> containing a lot of if statements dealing with different kinds of
>> allocation profiles and their exceptions. What I ended up doing is
>> making a few helper functions instead, see the commit "Add volumes.py,
>> handling device / chunk logic". It would probably be nice to do the same
>> in the kernel code, which would also solve the mentioned bugs and
>> prevent new similar ones from happening.
>
> Would you care to report each bug separately so they can be triaged and
> fixed?

In case of the RAID10 5GiB thing I think I was mixing up things. When
doing mkfs you end up with a RAID10 chunk of 5GiB (dunno why, didn't
research), but when mounting and pointing balance at it, I get a 10GiB
one back for it, so that's ok.

For the DUP thing, I sent an explanation ("DUP dev_extent might overlap
something next to it"), which hasn't attracted much attention yet. I'm
preparing a pile of patches to volumes.[ch] to fix this, clean up things
that I ran into and make the logic a bit less convoluted.

--
Hans van Kranenburg
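To make the DUP concern a bit more concrete: the kind of situation described is a device extent that ends beyond the end of its device, or that overlaps whatever sits next to it. A sanity check for that could look roughly like the sketch below. It is only an illustration with made-up numbers; `check_dev_extents` is a hypothetical helper, not part of python-btrfs or of the kernel patches mentioned above, and real extents would be read from the device tree of an actual filesystem.

```python
def check_dev_extents(dev_size, extents):
    """Check (offset, length) device extents, in bytes, on one device.

    Returns a list of human-readable problem descriptions; the list is
    empty when no extent overlaps another or runs past the device end.
    """
    problems = []
    prev_end = 0
    for offset, length in sorted(extents):
        end = offset + length
        if offset < prev_end:
            problems.append(
                f"extent at {offset} overlaps previous extent ending at {prev_end}"
            )
        if end > dev_size:
            problems.append(
                f"extent at {offset} ends at {end}, beyond device size {dev_size}"
            )
        prev_end = max(prev_end, end)
    return problems


if __name__ == "__main__":
    # Made-up layout: the second copy of a DUP chunk does not fit any more.
    for problem in check_dev_extents(10, [(0, 4), (4, 4), (8, 4)]):
        print(problem)
```

With the made-up layout above, the third extent starts at 8 and has length 4 on a device of size 10, so it is reported as ending beyond the device.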
Thread overview: 6+ messages
2018-09-23 21:54 python-btrfs v10 preview... detailed usage reporting and a tutorial Hans van Kranenburg
2018-09-23 23:19 ` Adam Borowski
2018-10-08  0:03   ` Hans van Kranenburg
2018-10-08  5:42     ` Adam Borowski
2018-09-24  8:08 ` Nikolay Borisov
2018-09-28 23:04   ` Hans van Kranenburg