* Questions related to BCacheFS
@ 2023-11-18 19:15 Martin Steigerwald
2023-11-18 19:50 ` Kent Overstreet
0 siblings, 1 reply; 15+ messages in thread
From: Martin Steigerwald @ 2023-11-18 19:15 UTC (permalink / raw)
To: linux-bcachefs
Hi!
Awesome that BCacheFS is finally merged! Many thanks to everyone who made
this happen. I appreciate it!
I am writing an article about BCacheFS. I am willing to provide a link
once it is published. It will be in German language.
I do have a few questions:
1) Is discard supported? fstrim says it is not. However /sys/fs/bcachefs/
UUID/options/discard shows "1". BCacheFS User manual Principles of
Operation mentions it as a device option. I am not completely sure how
these work. Auto-detected and just IOCTL for fstrim missing?
2) What are the plans for scrubbing? Right now it is not yet implemented,
right?
3) Is the documentation of mount and other options in
https://bcachefs.org/bcachefs-principles-of-operation.pdf complete? If
not, care to elaborate what is missing?
4) What are the plans or ideas for documentation? I specially ask as there
does not seem to be a manpage like mount.bcachefs or mkfs.bcachefs yet.
There is no mention of bcachefs in mount manpage either. And no bcachefs
manpage in section 5 like with btrfs or xfs. There is a bcachefs manpage
in section 8 which for example for a complete list of mount options refers
to above Principles of Operation user manual. And it has information on
"bcachefs format" and some other sub commands. I bet it is still too early
or maybe you have different plans on how to go about documentation.
Anything you can share already regarding this?
5) Is the feature implementation status on bcachefs.org up-to-date? How
about the one in Principles of Operation user manual? Is any of these more
up-to-date? If anything is missing from these, care to elaborate?
6) What is the status for xxhash checksums? They are mentioned as an
option in the output of "bcachefs format". Yet no mention of it in
bcachefs manpage nor in Principles of Operation user manual.
7) On mounting BCacheFS without compression enabled on 6.7-rc1, shortly
before rc2, commit 791c8ab095f71327899023223940dd52257a4173 also LZ4
compression modules lz4hc_compress and lz4_compress are loaded. Why?
8) Regarding bcachefs-tools. More out of curiosity, cause there is already
a bcachefs-tools package in Debian repo, albeit only version 1.2. I see a
"debian" directory, however version number is 1.0.8-2~bpo8+1 while
compiling via make gives version 1.33. So I suppose packaging information
is not up to date? For now I am going with "make install" from bcachefs-
tools git repo, as package in Debian repo is outdated.
9) What is the preferred way to report bugs? Mailing list? Kernel bug
tracker? Both? Anything else?
10) Anything you think an article about BCacheFS should absolutely
mention?
There may be more at a later time. :)
Best,
--
Martin
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Questions related to BCacheFS
2023-11-18 19:15 Questions related to BCacheFS Martin Steigerwald
@ 2023-11-18 19:50 ` Kent Overstreet
2023-11-18 20:57 ` Martin Steigerwald
0 siblings, 1 reply; 15+ messages in thread
From: Kent Overstreet @ 2023-11-18 19:50 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-bcachefs
On Sat, Nov 18, 2023 at 08:15:50PM +0100, Martin Steigerwald wrote:
> Hi!
>
> Awesome that BCacheFS is finally merged! Many thanks to everyone who made
> this happen. I appreciate it!
>
> I am writing an article about BCacheFS. I am willing to provide a link
> once it is published. It will be in German language.
>
> I do have a few questions:
>
> 1) Is discard supported? fstrim says it is not. However /sys/fs/bcachefs/
> UUID/options/discard shows "1". BCacheFS User manual Principles of
> Operation mentions it as a device option. I am not completely sure how
> these work. Auto-detected and just IOCTL for fstrim missing?
Yes, it's supported. There's no need for fstrim support because we
discard buckets as soon as they become empty.
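
For anyone who wants to check the setting from a script, here is a minimal sketch that reads the sysfs file mentioned in the question. The path layout /sys/fs/bcachefs/UUID/options/discard is taken from the question above; the function name and the sysfs_root parameter are hypothetical and exist only so the sketch can be tried against a scratch directory.

```python
from pathlib import Path


def discard_enabled(fs_uuid: str, sysfs_root: str = "/sys/fs/bcachefs") -> bool:
    """Return True if the discard option file for this filesystem reads "1".

    Assumption: the option is exposed as <sysfs_root>/<UUID>/options/discard,
    as described in the question; this helper is illustrative only.
    """
    option_file = Path(sysfs_root) / fs_uuid / "options" / "discard"
    # sysfs values end with a newline, so strip before comparing.
    return option_file.read_text().strip() == "1"
```

Calling discard_enabled("<your filesystem UUID>") on a mounted filesystem would report the current value; it says nothing about whether the underlying device actually honours discards.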
>
> 2) What are the plans for scrubbing? Right now it is not yet implemented,
> right?
Yes, it's very much planned.
> 3) Is the documentation of mount and other options in
> https://bcachefs.org/bcachefs-principles-of-operation.pdf complete? If
> not, care to elaborate what is missing?
The master option list is here:
https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/opts.h#n122
A few of these are hidden; OPT_FORMAT and OPT_MOUNT options are the
options you'll be looking for - and OPT_DEVICE for device specific
options.
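
To illustrate how such flags can be used to pick out the relevant options, here is a rough sketch in Python. It is not a transcription of opts.h: the flag names mirror OPT_FORMAT, OPT_MOUNT and OPT_DEVICE from the text above, but the HIDDEN marker, the table layout and all entries except "discard" (which the manual describes as a device option) are made up for illustration.

```python
from enum import Flag, auto
from typing import List, NamedTuple


class OptFlag(Flag):
    FORMAT = auto()  # may be set at format time (OPT_FORMAT)
    MOUNT = auto()   # may be set at mount time (OPT_MOUNT)
    DEVICE = auto()  # device specific option (OPT_DEVICE)
    HIDDEN = auto()  # hypothetical marker for options not meant for users


class Option(NamedTuple):
    name: str
    flags: OptFlag


# Placeholder table; only "discard" is named in this thread.
OPTIONS = [
    Option("discard", OptFlag.DEVICE),
    Option("example_format_only", OptFlag.FORMAT),
    Option("example_format_and_mount", OptFlag.FORMAT | OptFlag.MOUNT),
    Option("example_hidden", OptFlag.MOUNT | OptFlag.HIDDEN),
]


def visible_options(options: List[Option], wanted: OptFlag) -> List[str]:
    """Names of non-hidden options that carry at least one wanted flag."""
    return [
        opt.name
        for opt in options
        if (opt.flags & wanted) and not (opt.flags & OptFlag.HIDDEN)
    ]
```

For example, visible_options(OPTIONS, OptFlag.MOUNT) lists the placeholder mount options and leaves out the hidden one.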
> 4) What are the plans or ideas for documentation? I specially ask as there
> does not seem to be a manpage like mount.bcachefs or mkfs.bcachefs yet.
> There is no mention of bcachefs in mount manpage either. And no bcachefs
> manpage in section 5 like with btrfs or xfs. There is a bcachefs manpage
> in section 8 which for example for a complete list of mount options refers
> to above Principles of Operation user manual. And it has information on
> "bcachefs format" and some other sub commands. I bet it is still too early
> or maybe you have different plans on how to go about documentation.
> Anything you can share already regarding this?
I'm entirely too busy with just writing code - I'd love to have more
time for documentation, but it's hard :) But, there are people starting
to contribute to the man pages, so I expect that will improve.
> 5) Is the feature implementation status on bcachefs.org up-to-date? How
> about the one in Principles of Operation user manual? Is any of these more
> up-to-date? If anything is missing from these, care to elaborate?
Reasonably up to date, yes. The main areas that still need work and
testing are snapshots and erasure coding; with snapshots it's looking
like just minor bugs are left and fleshing out features, erasure coding
is improving but still needs quite a bit of work.
> 6) What is the status for xxhash checksums? They are mentioned as an
> option in the output of "bcachefs format". Yet no mention of it in
> bcachefs manpage nor in Principles of Operation user manual.
I initially had concerns about whether that code was actually solid - I
think it's been resolved; I'll just want to hear some positive feedback
from people using it before I add it to the documentation.
> 7) On mounting BCacheFS without compression enabled on 6.7-rc1, shortly
> before rc2, commit 791c8ab095f71327899023223940dd52257a4173 also LZ4
> compression modules lz4hc_compress and lz4_compress are loaded. Why?
We just add hard dependencies on the compression modules because a) the
crypto interface (that lets you use them as runtime dependencies)
_sucks_ and b) the lz4 modules at least are pretty small. zstd is bigger
though, so making these runtime dependencies would be a worthwhile
enhancement for anyone who's interested.
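
A small sketch for checking which of the two LZ4 modules named in the question are currently loaded. It parses text in the format of /proc/modules, where the module name is the first field of each line; the function names are hypothetical, and taking the text as an argument is just a choice to keep the sketch independent of the running system.

```python
from typing import Set

# Module names as given in the question above.
LZ4_MODULES = {"lz4_compress", "lz4hc_compress"}


def loaded_modules(proc_modules_text: str) -> Set[str]:
    """Collect module names from text in /proc/modules format."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def loaded_lz4_modules(proc_modules_text: str) -> Set[str]:
    """Return the subset of the LZ4 modules that appear as loaded."""
    return LZ4_MODULES & loaded_modules(proc_modules_text)
```

On a live system one would pass in the contents of /proc/modules, e.g. loaded_lz4_modules(open("/proc/modules").read()).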
> 8) Regarding bcachefs-tools. More out of curiosity, cause there is already
> a bcachefs-tools package in Debian repo, albeit only version 1.2. I see a
> "debian" directory, however version number is 1.0.8-2~bpo8+1 while
> compiling via make gives version 1.33. So I suppose packaging information
> is not up to date? For now I am going with "make install" from bcachefs-
> tools git repo, as package in Debian repo is outdated.
Yeah I'm in contact with the debian maintainer, it should be updated
soon.
> 9) What is the preferred way to report bugs? Mailing list? Kernel bug
> tracker? Both? Anything else?
Mailing list is good, or the github bugtrackers (that really should be
linked on the website):
https://github.com/koverstreet/bcachefs/issues/
https://github.com/koverstreet/bcachefs-tools/issues/
> 10) Anything you think an article about BCacheFS should absolutely
> mention?
Would personally love to see some non-phoronix benchmarks :)
I've put a ton of effort into performance, my goal is a COW filesystem
that can compete with XFS on performance and scalability - which is a
tall order! but we're getting close.
With the btree write buffer rewrite (still not quite merged, any day
now) - I'm pushing _900k_ iops, 4k random writes - through the COW write
path.
This is in my benchmarking/profiling mode, with checksums off and data
reads/writes to the device turned off - i.e. just showing bcachefs
overhead. So not real world numbers, but indicative of how well we can
scale.
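
To put the quoted figure into perspective, here is the plain arithmetic for converting an IOPS number into throughput. It assumes "4k" means 4096 bytes per write, which is an assumption on top of the text; the function name is hypothetical.

```python
def throughput_bytes_per_second(iops: int, block_size_bytes: int) -> int:
    """Throughput implied by a given IOPS rate at a fixed block size."""
    return iops * block_size_bytes


IOPS = 900_000          # figure quoted above
BLOCK_SIZE = 4 * 1024   # assumption: "4k" = 4096 bytes

bytes_per_second = throughput_bytes_per_second(IOPS, BLOCK_SIZE)
gib_per_second = bytes_per_second / 2**30
```

With these inputs bytes_per_second is 3,686,400,000, or roughly 3.43 GiB/s of random writes worth of work going through the write path.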
Cheers,
Kent
* Re: Questions related to BCacheFS
2023-11-18 19:50 ` Kent Overstreet
@ 2023-11-18 20:57 ` Martin Steigerwald
2023-11-18 21:07 ` Kent Overstreet
2023-12-28 22:29 ` deletion time of big files (was: Re: Questions related to BCacheFS) Martin Steigerwald
0 siblings, 2 replies; 15+ messages in thread
From: Martin Steigerwald @ 2023-11-18 20:57 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Hi Kent.
Thanks for answering so timely. Feel free to skip answering during rest of
the weekend :)
Kent Overstreet - 18.11.23, 20:50:24 CET:
> > 10) Anything you think an article about BCacheFS should absolutely
> > mention?
>
> Would personally love to see some non-phoronix benchmarks :)
I see. Well thing is, I am not really satisfied about Samsung 980 Pro 2 TB
NVME SSD performance on this ThinkPad T14 AMD Gen 1 under Linux, so not
sure whether performance benchmarks would be suitable on that setup. At
least not without going about a firmware upgrade again and hoping it helps
this time, if available. However I remember not really liking to dig out
the firmware upgrade from an ISO image for Samsung not providing via LVFS.
Also benchmarking may more be in scope of a later article if at all, cause
I think even with just explaining about BCacheFS the article will become
long enough :). It is challenging to get benchmarking right and obtain
actually meaningful results. And before getting it wrong, I'd rather skip
or delay that. But anyway: Any suggestion for a specific benchmark?
Any advice about Phoronix benchmarks? I bet the one I saw was with some
debug option on, that may better be off. I think it has been:
CONFIG_BCACHEFS_DEBUG_TRANSACTIONS? I did not check whether Michael
Larabel did a new one already with that turned off.
As far as I understand one specific performance related aspect of BCacheFS
would be low latencies due to the frontend / backend architecture which in
principle is based on what has been there in BCache already. I am
intending to explore a bit into that concept in my article.
> I've put a ton of effort into performance, my goal is a COW filesystem
> that can compete with XFS on performance and scalability - which is a
> tall order! but we're getting close.
>
> With the btree write buffer rewrite (still not quite merged, any day
> now) - I'm pushing _900k_ iops, 4k random writes - through the COW write
> path.
>
> This is in my benchmarking/profiling mode, with checksums off and data
> reads/writes to the device turned off - i.e. just showing bcachefs
> overhead. So not real world numbers, but indicative of how well we can
> scale.
Interesting. Only thing regarding performance I noticed so far that
deleting an almost 8 GiB large DVD ISO image file took a bit longer than
instant, but I was using Dolphin on Plasma, so not sure whether this tiny
delay was filesystem or GUI related.
Also I found that free space with "df -hT" was only 35,8 GiB initially,
now 36 GiB of 40 GiB instead of the initial 37 GiB after making the
filesystem, but I bet that may just be related to allocation behavior.
Some kind of chunk allocated but not freed again so it can be reused
later. But I need to dig into this a bit deeper. I read about some
reservation as well, but need to dig that up again.
I'd really love to dig a bit into what makes BCacheFS unique, also in
comparison with BTRFS and maybe a bit also ZFS. Also to explain: "Why yet
another filesystem?" to the reader :). My own hope is that indeed BCacheFS
will improve on some of the performance issues with BTRFS. And also with
BCacheFS you can have cache devices which AFAIK is still not implemented
for BTRFS. There was VFS Hot Data Tracking + BTRFS part patches on BTRFS
mailing list some longer time ago, but AFAIK they never went in.
Best,
--
Martin
* Re: Questions related to BCacheFS
2023-11-18 20:57 ` Martin Steigerwald
@ 2023-11-18 21:07 ` Kent Overstreet
2023-11-18 23:15 ` Martin Steigerwald
2023-12-28 22:29 ` deletion time of big files (was: Re: Questions related to BCacheFS) Martin Steigerwald
1 sibling, 1 reply; 15+ messages in thread
From: Kent Overstreet @ 2023-11-18 21:07 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-bcachefs
On Sat, Nov 18, 2023 at 09:57:50PM +0100, Martin Steigerwald wrote:
> Hi Kent.
>
> Thanks for answering so timely. Feel free to skip answering during rest of
> the weekend :)
>
> Kent Overstreet - 18.11.23, 20:50:24 CET:
> > > 10) Anything you think an article about BCacheFS should absolutely
> > > mention?
> >
> > Would personally love to see some non-phoronix benchmarks :)
>
> I see. Well thing is, I am not really satisfied about Samsung 980 Pro 2 TB
> NVME SSD performance on this ThinkPad T14 AMD Gen 1 under Linux, so not
> sure whether performance benchmarks would be suitable on that setup. At
> least not without going about a firmware upgrade again and hoping it helps
> this time, if available. However I remember not really liking to dig out
> the firmware upgrade from an ISO image for Samsung not providing via LVFS.
> Also benchmarking may more be in scope of a later article if at all, cause
> I think even with just explaining about BCacheFS the article will become
> long enough :). It is challenging to get benchmarking right and obtain
> actually meaningful results. And before getting it wrong, I'd rather skip
> or delay that. But anyway: Any suggestion for a specific benchmark?
>
> Any advice about Phoronix benchmarks? I bet the one I saw was with some
> debug option on, that may better be off. I think it has been:
> CONFIG_BCACHEFS_DEBUG_TRANSACTIONS? I did not check whether Michael
> Larabel did a new one already with that turned off.
>
> As far as I understand one specific performance related aspect of BCacheFS
> would be low latencies due to the frontend / backend architecture which in
> principle is based on what has been there in BCache already. I am
> intending to explore a bit into that concept in my article.
The low latency stuff actually wasn't in bcache - that work came later.
Things like
- six locks - so we have intent locks that don't block readers, and
only need to take write locks for the actual btree node update
- asynchronous interior btree node updates; in bcache when we split a
node we have to wait for writes to complete before updating the
parent node, in bcachefs work after IO completion is fully
asynchronous
- the big one that no other filesystem has: a 'btree_trans' object that
tracks all btree locks, and can be unlocked and then relocked when we
do an operation that might block (at the cost of a potential
transaction restart at relock() time) - we never have to block with
btree locks held.
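
The locking rules from the first point can be sketched as a toy model. This is a single-threaded illustration under stated assumptions, not the kernel's six lock code: the class and method names are hypothetical, and where the real locks would wait, this model simply returns False. It encodes only what is said above (intent does not block readers, write is taken only for the actual update) plus the assumptions that intent locks exclude each other and that a write lock needs the readers to be gone.

```python
class SixLock:
    """Toy model of a lock with read, intent and write states."""

    def __init__(self):
        self.readers = 0          # number of current read holders
        self.intent_owner = None  # at most one intent holder at a time
        self.write_held = False   # True while the intent holder is writing

    def try_read(self) -> bool:
        # Readers are blocked only by an active write lock, not by intent.
        if self.write_held:
            return False
        self.readers += 1
        return True

    def try_intent(self, owner) -> bool:
        # Intent locks exclude each other but coexist with readers.
        if self.intent_owner is not None:
            return False
        self.intent_owner = owner
        return True

    def try_write(self, owner) -> bool:
        # Only the intent holder may take write, and only without readers.
        if owner is None or self.intent_owner != owner:
            return False
        if self.readers or self.write_held:
            return False
        self.write_held = True
        return True

    def unlock_read(self) -> None:
        self.readers -= 1

    def unlock_write(self) -> None:
        self.write_held = False

    def unlock_intent(self) -> None:
        self.intent_owner = None
```

The point of the intent state is that a thread can announce "I am going to modify this node" early, keeping other modifiers out, while lookups continue until the short moment the write lock is actually held.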
> > I've put a ton of effort into performance, my goal is a COW filesystem
> > that can compete with XFS on performance and scalability - which is a
> > tall order! but we're getting close.
> >
> > With the btree write buffer rewrite (still not quite merged, any day
> > now) - I'm pushing _900k_ iops, 4k random writes - through the COW write
> > path.
> >
> > This is in my benchmarking/profiling mode, with checksums off and data
> > reads/writes to the device turned off - i.e. just showing bcachefs
> > overhead. So not real world numbers, but indicative of how well we can
> > scale.
>
> Interesting. Only thing regarding performance I noticed so far that
> deleting an almost 8 GiB large DVD ISO image file took a bit longer than
> instant, but I was using Dolphin on Plasma, so not sure whether this tiny
> delay was filesystem or GUI related.
It could be that we still have work to do; there are plenty of higher
level filesystem operations that I haven't specifically benchmarked. If
you do happen to do a head to head comparison with other filesystems and
find that unlink (or anything else) is slow - please report it!
> Also I found that free space with "df -hT" was only 35,8 GiB initially,
> now 36 GiB of 40 GiB instead of the initial 37 GiB after making the
> filesystem, but I bet that may just be related to allocation behavior.
> Some kind of chunk allocated but not freed again so it can be reused
> later. But I need to dig into this a bit deeper. I read about some
> reservation as well, but need to dig that up again.
That's the copygc reserve.
> I'd really love to dig a bit into what makes BCacheFS unique, also in
> comparison with BTRFS and maybe a bit also ZFS. Also to explain: "Why yet
> another filesystem?" to the reader :). My own hope is that indeed BCacheFS
> will improve on some of the performance issues with BTRFS. And also with
> BCacheFS you can have cache devices which AFAIK is still not implemented
> for BTRFS. There was VFS Hot Data Tracking + BTRFS part patches on BTRFS
> mailing list some longer time ago, but AFAIK they never went in.
Performance with more than a few snapshots is a big selling point vs.
btrfs - Dave Chinner did some comparisons awhile back, bcachefs beats
the pants off of btrfs in snapshot scalability :)
* Re: Questions related to BCacheFS
2023-11-18 21:07 ` Kent Overstreet
@ 2023-11-18 23:15 ` Martin Steigerwald
2023-11-18 23:42 ` Kent Overstreet
0 siblings, 1 reply; 15+ messages in thread
From: Martin Steigerwald @ 2023-11-18 23:15 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Thanks again, Kent.
Kent Overstreet - 18.11.23, 22:07:27 CET:
> > As far as I understand one specific performance related aspect of
> > BCacheFS would be low latencies due to the frontend / backend
> > architecture which in principle is based on what has been there in
> > BCache already. I am intending to explore a bit into that concept in
> > my article.
>
> The low latency stuff actually wasn't in bcache - that work came later.
So the frontend / backend architecture is not that much of what makes
BCacheFS unique? Important to know as it seems I may have misunderstood
something here.
I may need to shift the approach to my article a bit then. Good that I
asked early on.
--
Martin
* Re: Questions related to BCacheFS
2023-11-18 23:15 ` Martin Steigerwald
@ 2023-11-18 23:42 ` Kent Overstreet
2023-11-19 11:13 ` Martin Steigerwald
0 siblings, 1 reply; 15+ messages in thread
From: Kent Overstreet @ 2023-11-18 23:42 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-bcachefs
On Sun, Nov 19, 2023 at 12:15:19AM +0100, Martin Steigerwald wrote:
> Thanks again, Kent.
>
> Kent Overstreet - 18.11.23, 22:07:27 CET:
> > > As far as I understand one specific performance related aspect of
> > > BCacheFS would be low latencies due to the frontend / backend
> > > architecture which in principle is based on what has been there in
> > > BCache already. I am intending to explore a bit into that concept in
> > > my article.
> >
> > The low latency stuff actually wasn't in bcache - that work came later.
>
> So the frontend / backend architecture is not that much of what makes
> BCacheFS unique? Important to know as it seems I may have misunderstood
> something here.
The "filesystem on top of a database" is the main thing that makes
bcachefs unique - you have that right.
bcache had much of the core btree design - log structured btree nodes
with eytzinger search trees; that's how we got a high enough performance
btree to make the "filesystem on top of a database" thing practical.
But the btree in bcache was, from a performance POV, prototype quality -
stable, but a lot of performance corner cases unfinished.
The latency work, real iterators, and the whole transaction layer came
later :)
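
For readers unfamiliar with the term, the Eytzinger layout mentioned above stores a sorted set of keys in the breadth-first order of an implicit binary search tree, so that a lookup walks from index k to 2k or 2k+1 instead of jumping around a sorted array. The sketch below shows the general technique only; it is not taken from bcache or bcachefs, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence


def eytzinger_layout(sorted_keys: Sequence[int]) -> List[Optional[int]]:
    """Place sorted keys into 1-based breadth-first (Eytzinger) order."""
    n = len(sorted_keys)
    layout: List[Optional[int]] = [None] * (n + 1)  # slot 0 stays unused
    keys = iter(sorted_keys)

    def fill(k: int) -> None:
        # In-order walk of the implicit tree assigns keys in sorted order.
        if k <= n:
            fill(2 * k)
            layout[k] = next(keys)
            fill(2 * k + 1)

    fill(1)
    return layout


def eytzinger_contains(layout: List[Optional[int]], key: int) -> bool:
    """Search the layout by descending from the root at index 1."""
    n = len(layout) - 1
    k = 1
    while k <= n:
        if layout[k] == key:
            return True
        # Go to the right child if the node is smaller, else to the left.
        k = 2 * k + (1 if layout[k] < key else 0)
    return False
```

For the keys 1 to 7 the layout becomes 4, 2, 6, 1, 3, 5, 7 (after the unused slot 0): the root sits first and each level of the tree is stored contiguously, which is what makes the access pattern friendly to caches and prefetching.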
* Re: Questions related to BCacheFS
2023-11-18 23:42 ` Kent Overstreet
@ 2023-11-19 11:13 ` Martin Steigerwald
2023-11-19 16:43 ` Martin Steigerwald
2023-11-19 23:10 ` Kent Overstreet
0 siblings, 2 replies; 15+ messages in thread
From: Martin Steigerwald @ 2023-11-19 11:13 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Take your time and have a great Sunday :)
Kent Overstreet - 19.11.23, 00:42:05 CET:
> On Sun, Nov 19, 2023 at 12:15:19AM +0100, Martin Steigerwald wrote:
> > Kent Overstreet - 18.11.23, 22:07:27 CET:
> > > > As far as I understand one specific performance related aspect of
> > > > BCacheFS would be low latencies due to the frontend / backend
> > > > architecture which in principle is based on what has been there in
> > > > BCache already. I am intending to explore a bit into that concept
> > > > in
> > > > my article.
> > >
> > > The low latency stuff actually wasn't in bcache - that work came
> > > later.
> >
> > So the frontend / backend architecture is not that much of what makes
> > BCacheFS unique? Important to know as it seems I may have
> > misunderstood
> > something here.
>
> The "filesystem on top of a database" is the main thing that makes
> bcachefs unique - you have that right.
Phew! Seems I did not get it completely wrong then :)
> bcache had much of the core btree design - log structured btree nodes
> with eytzinger search trees; that's how we got a high enough performance
> btree to make the "filesystem on top of a database" thing practical.
>
> But the btree in bcache was, from a performance POV, prototype quality -
> stable, but a lot of performance corner cases unfinished.
>
> The latency work, real iterators, and the whole transaction layer came
> later :)
So it is fair to say that being based on BCache enabled that low latency
work?
I am trying to find a balance here. The audience of the article are
experienced sys admins working in small and large organizations, but not
(primarily) kernel hackers. So I like to explain possible reasons to
consider BCacheFS while also having BTRFS and ZFS and XFS, but without
going into so much detail that one needs to be a kernel developer to
understand it. For XFS it is easy as while there was a proof of concept
with subvolumes and snapshots based on XFS in files on XFS, AFAIR from
Dave Chinner, some years back, it does not have those advanced features
(yet). For ZFS one can always argue it is not in the mainline kernel. But
regarding BTRFS it becomes important to really explain something. I
started to do some BCacheFS / BTRFS feature comparison chart, but I also
like to explain on the benefits BCacheFS can have in the text and a bit
about the background of those benefits. Of course also mentioning that
BCacheFS is in development for more than 10 years, even without taking all
the work for BCache itself into account.
So my idea currently is to explain that the BCache BTree and/or frontend/
backend architecture, not sure how to best word it, enabled a database
approach to a filesystem to be feasible. And that in a sense it also
enabled the latency work. And I can mention sixlocks and all the nice
other stuff you mentioned. However… I may not explain exactly what those
nice things are and how they work for example. For two reasons: 1. I need
to understand those nice features myself, 2. limit of pages I may use and
scope of the article. Currently I understood that sixlocks for certain use
cases are the next best thing since the invention of a wheel or something
like that. But not much more :)
Would something like that be accurate enough in your opinion?
I will review the user manual once again, I read about the database
approach, and aim to find a good balance. Cause for certain I won't be
getting 24 pages for the article :)
I got two setbacks regarding trying BCacheFS on my laptop yesterday and
today:
1) Linux 6.7-rc1, as mentioned almost rc2, did not hibernate on ThinkPad
T14 AMD Gen 1. It just blanked screen and nothing happened. So I went back
to 6.6.1 temporarily. I do not really intend to do a git bisect between
rc1 and 6.6 on a production laptop. It would be very time-consuming and
possibly be dangerous. I may go back to 6.7 even without hibernation, just
using standby over night for a while. I bet BCacheFS is compatible with
hibernation (as in writing memory contents to encrypted swap and resuming
from it)?
2) But after I removed 6.7-rc1 yesterday night after having booted into
6.6.1 this morning I was greeted by GRUB command line. As I had a meeting
scheduled my primary aim was to get things running again. I certainly did
not really get the humorous aspect about it. I thought I'd had copied etc
in the broken state to another place, but do not find it anymore, maybe
I copied it to /root or the GRML live distro accidentally and so it is gone.
Also I missed to copy the broken /boot to another place. So I am not
really able to do any forensic analysis of what might have happened. I do
not recall having seen any error messages either, but I may have missed
something. I reviewed hook and script for initial RAM disk in bcachefs-
tools repo, but did not find anything in there that could have caused grub
2 not finding its config file and modules anymore. Also it booted into 6.7
and even 6.6.1 then after having executed "make install" from bcachefs-
tools directory. Also I see nothing being done with grub itself within
bcachefs-tools. So this really is quite the mystery for me. I am on Devuan
Ceres (based on Debian Sid) so maybe something else got messed up. Will
review their bug trackers. I really have no idea what went wrong here,
luckily I was able to recover with GRML. I use LVM on LUKS for BTRFS and
test BCacheFS filesystem.
Maybe I need to continue testing BCacheFS on a virtual machine, but I'd
really love to have a BCacheFS filesystem on my laptop and actually really
using it for something. Well I am going to leave it at that and probably
research on this after the weekend. Now it is time to have a Sunday off.
Best,
--
Martin
* Re: Questions related to BCacheFS
2023-11-19 11:13 ` Martin Steigerwald
@ 2023-11-19 16:43 ` Martin Steigerwald
2023-11-19 23:10 ` Kent Overstreet
1 sibling, 0 replies; 15+ messages in thread
From: Martin Steigerwald @ 2023-11-19 16:43 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Martin Steigerwald - 19.11.23, 12:13:29 CET:
> Maybe I need to continue testing BCacheFS on a virtual machine, but I'd
> really love to have a BCacheFS filesystem on my laptop and actually
> really using it for something. Well I am going to leave it at that and
> probably research on this after the weekend. Now it is time to have a
> Sunday off.
Well will stay away from that 6.7-almost-rc2 kernel on this laptop for now
after having had experienced and recovered from:
parent transid verify failed + level verify failed
https://lore.kernel.org/linux-btrfs/9221302.CDJkKcVGEf@lichtvoll.de/T/#u
Not sure what caused it, but for now it appears to me that it is not safe
for me to run this kernel on this laptop.
--
Martin
* Re: Questions related to BCacheFS
2023-11-19 11:13 ` Martin Steigerwald
2023-11-19 16:43 ` Martin Steigerwald
@ 2023-11-19 23:10 ` Kent Overstreet
2023-11-20 17:34 ` Martin Steigerwald
2023-12-03 16:58 ` Martin Steigerwald
1 sibling, 2 replies; 15+ messages in thread
From: Kent Overstreet @ 2023-11-19 23:10 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-bcachefs
On Sun, Nov 19, 2023 at 12:13:29PM +0100, Martin Steigerwald wrote:
> Take your time and have a great Sunday :)
>
> Kent Overstreet - 19.11.23, 00:42:05 CET:
> > On Sun, Nov 19, 2023 at 12:15:19AM +0100, Martin Steigerwald wrote:
> > > Kent Overstreet - 18.11.23, 22:07:27 CET:
> > > > > As far as I understand one specific performance related aspect of
> > > > > BCacheFS would be low latencies due to the frontend / backend
> > > > > architecture which in principle is based on what has been there in
> > > > > BCache already. I am intending to explore a bit into that concept
> > > > > in
> > > > > my article.
> > > >
> > > > The low latency stuff actually wasn't in bcache - that work came
> > > > later.
> > >
> > > So the frontend / backend architecture is not that much of what makes
> > > BCacheFS unique? Important to know as it seems I may have
> > > misunderstood
> > > something here.
> >
> > The "filesystem on top of a database" is the main thing that makes
> > bcachefs unique - you have that right.
>
> Phew! Seems I did not get it completely wrong then :)
>
> > bcache had much of the core btree design - log structured btree nodes
> > with eytzinger search trees; that's how we got a high enough performance
> > btree to make the "filesystem on top of a database" thing practical.
> >
> > But the btree in bcache was, from a performance POV, prototype quality -
> > stable, but a lot of performance corner cases unfinished.
> >
> > The latency work, real iterators, and the whole transaction layer came
> > later :)
>
> So it is fair to say that being based on BCache enabled that low latency
> work?
>
> I am trying to find a balance here. The audience of the article are
> experienced sys admins working in small and large organizations, but not
> (primarily) kernel hackers. So I like to explain possible reasons to
> consider BCacheFS while also having BTRFS and ZFS and XFS, but without
> going into so much detail that one needs to be a kernel developer to
> understand it. For XFS it is easy as while there was a proof of concept
> with subvolumes and snapshots based on XFS in files on XFS, AFAIR from
> Dave Chinner, some years back, it does not have those advanced features
> (yet). For ZFS one can always argue it is not in the mainline kernel. But
> regarding BTRFS it becomes important to really explain something. I
> started to do some BCacheFS / BTRFS feature comparison chart, but I also
> like to explain on the benefits BCacheFS can have in the text and a bit
> about the background of those benefits. Of course also mentioning that
> BCacheFS is in development for more than 10 years, even without taking all
> the work for BCache itself into account.
>
> So my idea currently is to explain that the BCache BTree and/or frontend/
> backend architecture, not sure how to best word it, enabled a database
> approach to a filesystem to be feasible. And that in a sense it also
> enabled the latency work. And I can mention sixlocks and all the nice
> other stuff you mentioned. However… I may not explain exactly what those
> nice things are and how they work for example. For two reasons: 1. I need
> to understand those nice features myself, 2. limit of pages I may use and
> scope of the article. Currently I understood that sixlocks for certain use
> cases are the next best thing since the invention of a wheel or something
> like that. But not much more :)
>
> Would something like that be accurate enough in your opinion?
Yeah, that all sounds reasonable :)
It would be a great project to get all this stuff documented better...
when I have more free time... :)
> I will review the user manual once again, I read about the database
> approach, and aim to find a good balance. Cause for certain I won't be
> getting 24 pages for the article :)
>
>
> I got two setbacks regarding trying BCacheFS on my laptop yesterday and
> today:
>
> 1) Linux 6.7-rc1, as mentioned almost rc2, did not hibernate on ThinkPad
> T14 AMD Gen 1. It just blanked screen and nothing happened. So I went back
> to 6.6.1 temporarily. I do not really intend to do a git bisect between
> rc1 and 6.6 on a production laptop. It would be very time-consuming and
> possibly be dangerous. I may go back to 6.7 even without hibernation, just
> using standby over night for a while. I bet BCacheFS is compatible with
> hibernation (as in writing memory contents to encrypted swap and resuming
> from it)?
I believe so - I have not tested hibernation specifically.
>
> 2) But after I removed 6.7-rc1 yesterday night after having booted into
> 6.6.1 this morning I was greeted by GRUB command line. As I had a meeting
> scheduled my primary aim was to get things running again. I certainly did
> not really get the humorous aspect about it. I thought I had copied etc
> in the broken state to another place, but do not find it anymore, maybe
> I copied it to /root of the GRML live distro accidentally and so it is gone.
> Also I missed copying the broken /boot to another place. So I am not
> really able to do any forensic analysis of what might have happened. I do
> not recall having seen any error messages either, but I may have missed
> something. I reviewed hook and script for initial RAM disk in bcachefs-
> tools repo, but did not find anything in there that could have caused grub
> 2 not finding its config file and modules anymore. Also it booted into 6.7
> and even 6.6.1 then after having executed "make install" from bcachefs-
> tools directory. Also I see nothing being done with grub itself within
> bcachefs-tools. So this really is quite the mystery for me. I am on Devuan
> Ceres (based on Debian Sid) so maybe something else got messed up. Will
> review their bug trackers. I really have no idea what went wrong here,
> luckily I was able to recover with GRML. I use LVM on LUKS for BTRFS and
> test BCacheFS filesystem.
>
> Maybe I need to continue testing BCacheFS on a virtual machine, but I'd
> really love to have a BCacheFS filesystem on my laptop and actually really
> using it for something. Well I am going to leave it at that and probably
> research on this after the weekend. Now it is time to have a Sunday off.
The Nixos install process with bcachefs as root is pretty smooth, just
make sure to set up a separate filesystem for /boot!
* Re: Questions related to BCacheFS
2023-11-19 23:10 ` Kent Overstreet
@ 2023-11-20 17:34 ` Martin Steigerwald
2023-12-03 16:58 ` Martin Steigerwald
1 sibling, 0 replies; 15+ messages in thread
From: Martin Steigerwald @ 2023-11-20 17:34 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Kent Overstreet - 20.11.23, 00:10:01 CET:
> > Maybe I need to continue testing BCacheFS on a virtual machine, but I'd
> > really love to have a BCacheFS filesystem on my laptop and actually
> > really using it for something. Well I am going to leave it at that
> > and probably research on this after the weekend. Now it is time to
> > have a Sunday off.
>
> The Nixos install process with bcachefs as root is pretty smooth, just
> make sure to set up a separate filesystem for /boot!
Thanks for that idea.
While it might have been that those BTRFS filesystems that got corrupted
were corrupted by a "thou shalt not resume from this hibernation image"
I will indeed wait before I try the 6.7 kernel again on this laptop. Also I
will likely wait for an updated bcachefs-tools package. I still don't know
why GRUB greeted me with a command prompt after having removed the 6.7
kernel again. My current idea on why the BTRFS corruption happened is that
kernel 6.6.1 resumed from hibernation after I fixed up the boot loader in
two GRML sessions, accessing / and /boot, but not doing anything to
invalidate the hibernation image on the swap volume. Reminder to self:
Absolutely do that next time!
I had to restore / and /home BTRFS filesystem as well as /boot, which got
GRUB broken once more:
xfsprogs-6.5.0 with grub 2.12~rc1-12: unknown filesystem
https://lore.kernel.org/linux-xfs/1889442.tdWV9SEqCh@lichtvoll.de/T/#t
So yes I am very well aware of the advantages of a separate /boot. But
even with that GRUB played tricks on me. Maybe that is why I have seen Ext4
in Ext3 mode used for /boot. It does not gain new features.
I may investigate an alternative boot loader… it is not the first time GRUB
broke on new filesystem features and it is not even surprising. It would be
best to have GRUB use a very simple file system that does not gain new
features. A bootfs filesystem just for the purposes of a boot loader to
store a few files and be done with it.
Anyway, it will be a VM for now.
Ciao,
--
Martin
* Re: Questions related to BCacheFS
2023-11-19 23:10 ` Kent Overstreet
2023-11-20 17:34 ` Martin Steigerwald
@ 2023-12-03 16:58 ` Martin Steigerwald
2023-12-18 16:50 ` Martin Steigerwald
1 sibling, 1 reply; 15+ messages in thread
From: Martin Steigerwald @ 2023-12-03 16:58 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Kent Overstreet - 20.11.23, 00:10:01 CET:
> > I got two setbacks regarding trying BCacheFS on my laptop yesterday
> > and
> > today:
> >
> > 1) Linux 6.7-rc1, as mentioned almost rc2, did not hibernate on
> > ThinkPad T14 AMD Gen 1. It just blanked screen and nothing happened.
> > So I went back to 6.6.1 temporarily. I do not really intend to do a
> > git bisect between rc1 and 6.6 on a production laptop. It would be
> > very time-consuming and possibly be dangerous. I may go back to 6.7
> > even without hibernation, just using standby over night for a while.
> > I bet BCacheFS is compatible with hibernation (as in writing memory
> > contents to encrypted swap and resuming from it)?
>
> I believe so - I have not tested hibernation specifically.
Today I tried with 6.7-rc4 and the issue of non-working hibernation does
not seem to be related to BCacheFS, because the module was not loaded and
it again did not hibernate.
--
Martin
* Re: Questions related to BCacheFS
2023-12-03 16:58 ` Martin Steigerwald
@ 2023-12-18 16:50 ` Martin Steigerwald
0 siblings, 0 replies; 15+ messages in thread
From: Martin Steigerwald @ 2023-12-18 16:50 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Martin Steigerwald - 03.12.23, 17:58:36 CET:
> > I believe so - I have not tested hibernation specifically.
>
> Today I tried with 6.7-rc4 and the issue of non working hibernation does
> not seem to be related to BCacheFS, cause the module was not loaded and
> it again did not hibernate.
Finally.
6.7-rc6 does hibernate again. How awesome is that?
However, the bcachefs-tools package
https://bugs.debian.org/1057295
is broken regarding mounting via the mount command unless one moves
mount.bcachefs out of the way :)
Then it mounts okay. And hibernation still works with a mounted BCacheFS
as well.
So that obstacle is gone.
Best,
--
Martin
* deletion time of big files (was: Re: Questions related to BCacheFS)
2023-11-18 20:57 ` Martin Steigerwald
2023-11-18 21:07 ` Kent Overstreet
@ 2023-12-28 22:29 ` Martin Steigerwald
2023-12-29 18:48 ` Kent Overstreet
1 sibling, 1 reply; 15+ messages in thread
From: Martin Steigerwald @ 2023-12-28 22:29 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Hi Kent!
Martin Steigerwald - 18.11.23, 21:57:50 CET:
> Interesting. Only thing regarding performance I noticed so far that
> deleting an almost 8 GiB large DVD ISO image file took a bit longer than
> instant, but I was using Dolphin on Plasma, so not sure whether this
> tiny delay was filesystem or GUI related.
Meanwhile I have a working BCacheFS test setup on my laptop. Currently
with 6.7-rc7.
I can confirm the longer-than-instant deletion times for almost 8 GiB
large DVD ISO image files. It took 3,4 seconds for two of them.
It appears to me that there is some optimization potential hidden in that.
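One rough way to separate filesystem time from GUI overhead is to time the
unlink call directly. The following is only a sketch under assumptions, not
the method behind the numbers above: the helper names make_test_file and
time_unlink are made up for illustration, and the directory and file size
are placeholders to be chosen by whoever runs it.

```python
import os
import tempfile
import time


def make_test_file(directory: str, size_bytes: int, chunk: int = 1 << 20) -> str:
    """Write size_bytes of real (non-sparse) data and sync it to disk.

    Hypothetical helper: real data is written so the filesystem has
    actual extents to free when the file is deleted.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    block = os.urandom(chunk)
    with os.fdopen(fd, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(block[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())
    return path


def time_unlink(path: str) -> float:
    """Return the wall-clock seconds spent inside os.unlink(path)."""
    start = time.perf_counter()
    os.unlink(path)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: replace the temporary directory with a directory on the
    # filesystem under test, and raise the size towards several GiB.
    with tempfile.TemporaryDirectory() as target_dir:
        test_path = make_test_file(target_dir, 16 * (1 << 20))
        print(f"unlink took {time_unlink(test_path):.4f} s")
```

Pointing make_test_file at a directory on the BCacheFS mount and using a
size in the multi-GiB range would approximate the case described above.
Page cache state and device speed will influence the result, so repeated
runs are advisable.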
Best,
--
Martin
* Re: deletion time of big files (was: Re: Questions related to BCacheFS)
2023-12-28 22:29 ` deletion time of big files (was: Re: Questions related to BCacheFS) Martin Steigerwald
@ 2023-12-29 18:48 ` Kent Overstreet
2023-12-30 10:51 ` Martin Steigerwald
0 siblings, 1 reply; 15+ messages in thread
From: Kent Overstreet @ 2023-12-29 18:48 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-bcachefs
On Thu, Dec 28, 2023 at 11:29:19PM +0100, Martin Steigerwald wrote:
> Hi Kent!
>
> Martin Steigerwald - 18.11.23, 21:57:50 CET:
> > Interesting. Only thing regarding performance I noticed so far that
> > deleting an almost 8 GiB large DVD ISO image file took a bit longer than
> > instant, but I was using Dolphin on Plasma, so not sure whether this
> > tiny delay was filesystem or GUI related.
>
> Meanwhile I have a working BCacheFS test setup on my laptop. Currently
> with 6.7-rc7.
>
> I can confirm on the longer than instant deletion times for almost 8 GiB
> large DVD ISO image files. It took 3,4 seconds for two of them.
>
> It appears to me that there is some optimization potential hidden in that.
Bug me again in a few weeks/months if I don't get to it; Dave's fs
metadata benchmarks were also pointing out things that need to be
optimized, this is probably related - but I'm going to be deep in
the disk space accounting rewrite for a good while
* Re: deletion time of big files (was: Re: Questions related to BCacheFS)
2023-12-29 18:48 ` Kent Overstreet
@ 2023-12-30 10:51 ` Martin Steigerwald
0 siblings, 0 replies; 15+ messages in thread
From: Martin Steigerwald @ 2023-12-30 10:51 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs
Kent Overstreet - 29.12.23, 19:48:55 CET:
> > I can confirm on the longer than instant deletion times for almost 8
> > GiB large DVD ISO image files. It took 3,4 seconds for two of them.
> >
> > It appears to me that there is some optimization potential hidden in
> > that.
>
> Bug me again in a few weeks/months if I don't get to it; Dave's fs
> metadata benchmarks were also pointing out things that need to be
> optimized, this is probably related - but I'm going to be deep in
> the disk space accounting rewrite for a good while
Alright. All in due time. It is not that I have gazillions of such large
files lying around or even that amount of storage capacity available. I
intend to format an external 4TB NVME SSD with encrypted BCacheFS. And for
testing, copy over 1 TB of data from an external 2TB NVME SSD to it and see
how that goes.
I am happy that BCacheFS went into the mainline kernel and I would rather
see things done right than fast.
Have a Happy New Year.
Best,
--
Martin