* Is stability a joke?
@ 2016-09-11  8:55 Waxhead
  2016-09-11  9:56 ` Steven Haigh
                   ` (3 more replies)
  0 siblings, 4 replies; 93+ messages in thread
From: Waxhead @ 2016-09-11  8:55 UTC (permalink / raw)
  To: linux-btrfs

I have been following BTRFS for years and have recently been starting to 
use BTRFS more and more, and as always BTRFS' stability is a hot topic.
Some say that BTRFS is a dead-end research project, while others claim 
the opposite.

Taking a quick glance at the wiki does not say much about what is safe 
to use or not, and it also points to some who are using BTRFS in production.
While BTRFS can apparently work well in production it does have some 
caveats, and finding out which features are safe can be problematic. 
I especially think that new users of BTRFS can easily be bitten if 
they do not do a lot of research on it first.

The Debian wiki for BTRFS (which is recent, by the way) contains a bunch 
of warnings and recommendations and is, for me, a bit better than the 
official BTRFS wiki when it comes to deciding which features to use.

The Nouveau graphics driver has a nice feature matrix on its webpage, 
and I think that BTRFS should perhaps consider doing something like that 
on its official wiki as well.

For example, something along the lines of the following (the statuses are 
taken out of thin air, just for demonstration purposes):

Kernel version 4.7
+----------------------------+--------+-----+-------+-------+--------+-------+--------+
| Feature / Redundancy level | Single | Dup | Raid0 | Raid1 | Raid10 | Raid5 | Raid 6 |
+----------------------------+--------+-----+-------+-------+--------+-------+--------+
| Subvolumes                 | Ok     | Ok  | Ok    | Ok    | Ok     | Bad   | Bad    |
+----------------------------+--------+-----+-------+-------+--------+-------+--------+
| Snapshots                  | Ok     | Ok  | Ok    | Ok    | Ok     | Bad   | Bad    |
+----------------------------+--------+-----+-------+-------+--------+-------+--------+
| LZO Compression            | Bad(1) | Bad | Bad   | Bad(2)| Bad    | Bad   | Bad    |
+----------------------------+--------+-----+-------+-------+--------+-------+--------+
| ZLIB Compression           | Ok     | Ok  | Ok    | Ok    | Ok     | Bad   | Bad    |
+----------------------------+--------+-----+-------+-------+--------+-------+--------+
| Autodefrag                 | Ok     | Bad | Bad(3)| Ok    | Ok     | Bad   | Bad    |
+----------------------------+--------+-----+-------+-------+--------+-------+--------+

(1) Some explanation here...
(2) Some explanation there....
(3) And some explanation elsewhere...

...etc...etc...

I would therefore like to propose that some sort of feature / stability 
matrix for the latest kernel is added to the wiki, preferably somewhere 
where it is easy to find. It would be nice to archive old matrices as 
well in case someone runs a somewhat older kernel (we who use Debian tend 
to like older kernels). In my opinion it would make things a bit easier 
and perhaps a bit less scary too. Remember, if you get bitten badly once 
you tend to stay away from it all just in case; if you on the other 
hand know what bites, you can safely pet the fluffy end instead :)

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11  8:55 Is stability a joke? Waxhead
@ 2016-09-11  9:56 ` Steven Haigh
  2016-09-11 10:23 ` Martin Steigerwald
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 93+ messages in thread
From: Steven Haigh @ 2016-09-11  9:56 UTC (permalink / raw)
  To: linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 3832 bytes --]

This. So much this.

After being burned badly by the documentation / wiki etc. making RAID5/6
seem stable, I think it's a joke how the features of BTRFS are promoted.

A lot of what is marked as 'Implemented' or 'Complete' is little more than
an "in theory, it works" - but will eat your data.

Having a simple reference as to the status of what is going on, and what
will eat your data, would probably save terabytes of data in the next few
months - and a lot of reputation for BTRFS...

On 11/09/16 18:55, Waxhead wrote:
> I have been following BTRFS for years and have recently been starting to
> use BTRFS more and more and as always BTRFS' stability is a hot topic.
> Some says that BTRFS is a dead end research project while others claim
> the opposite.
> 
> Taking a quick glance at the wiki does not say much about what is safe
> to use or not and it also points to some who are using BTRFS in production.
> While BTRFS can apparently work well in production it does have some
> caveats, and finding out what features is safe or not can be problematic
> and I especially think that new users of BTRFS can easily be bitten if
> they do not do a lot of research on it first.
> 
> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
> of warnings and recommendations and is for me a bit better than the
> official BTRFS wiki when it comes to how to decide what features to use.
> 
> The Nouveau graphics driver have a nice feature matrix on it's webpage
> and I think that BTRFS perhaps should consider doing something like that
> on it's official wiki as well
> 
> For example something along the lines of .... (the statuses are taken
> our of thin air just for demonstration purposes)
> 
> Kernel version 4.7
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Feature / Redundancy level | Single | Dup | Raid0 | Raid1 | Raid10 | Raid5 | Raid 6 |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Subvolumes                 | Ok     | Ok  | Ok    | Ok    | Ok     | Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Snapshots                  | Ok     | Ok  | Ok    | Ok    | Ok     | Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | LZO Compression            | Bad(1) | Bad | Bad   | Bad(2)| Bad    | Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | ZLIB Compression           | Ok     | Ok  | Ok    | Ok    | Ok     | Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Autodefrag                 | Ok     | Bad | Bad(3)| Ok    | Ok     | Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> 
> 
> (1) Some explanation here...
> (2) Some explanation there....
> (3) And some explanation elsewhere...
> 
> ...etc...etc...
> 
> I therefore would like to propose that some sort of feature / stability
> matrix for the latest kernel is added to the wiki preferably somewhere
> where it is easy to find. It would be nice to archive old matrix'es as
> well in case someone runs on a bit older kernel (we who use Debian tend
> to like older kernels). In my opinion it would make things bit easier
> and perhaps a bit less scary too. Remember if you get bitten badly once
> you tend to stay away from from it all just in case, if you on the other
> hand know what bites you can safely pet the fluffy end instead :)


-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11  8:55 Is stability a joke? Waxhead
  2016-09-11  9:56 ` Steven Haigh
@ 2016-09-11 10:23 ` Martin Steigerwald
  2016-09-11 11:21   ` Zoiled
  2016-09-12 12:48   ` Swâmi Petaramesh
  2016-09-12 13:53 ` Chris Mason
  2016-09-12 14:27 ` David Sterba
  3 siblings, 2 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-11 10:23 UTC (permalink / raw)
  To: Waxhead; +Cc: linux-btrfs

On Sunday, 11 September 2016, 10:55:21 CEST, Waxhead wrote:
> I have been following BTRFS for years and have recently been starting to
> use BTRFS more and more and as always BTRFS' stability is a hot topic.
> Some says that BTRFS is a dead end research project while others claim
> the opposite.

First off: on my systems BTRFS definitely runs too stably for a research 
project. Actually, I have had zero issues with the stability of BTRFS on *any* 
of my systems, both at the moment and over the last half year.

The only issue I had, until about half a year ago, was BTRFS getting stuck 
seeking free space on a highly fragmented RAID 1 + compress=lzo /home. This 
went away with either kernel 4.4 or 4.5.

Additionally, I have never lost even a single byte of data on my own BTRFS 
filesystems. I had a checksum failure on one of the SSDs, but BTRFS RAID 1 
repaired it.
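
For reference, a scrub re-reads all data and metadata against the checksums 
and, with RAID 1, repairs from the intact copy, roughly like this (the mount 
point is just an example):

  # btrfs scrub start /home
  # btrfs scrub status /home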


Where do I use BTRFS?

1) On this ThinkPad T520 with two SSDs. /home and / in RAID 1, another data 
volume as single. In case you can read German, search blog.teamix.de for 
BTRFS.

2) On my music box ThinkPad T42 for /home. I did not bother to change / so far 
and may never do so for this laptop. It has a slow 2.5 inch hard disk.

3) I used it on a workstation at work as well, for a data volume in RAID 1. But 
that workstation is no more (not due to a filesystem failure).

4) On a server VM for /home with Maildirs and Owncloud data. /var is still on 
Ext4, but I want to migrate it as well. Whether I will ever change /, I don't know.

5) On another server VM, a backup VM which I currently use with borgbackup. 
With borgbackup I actually wouldn't really need BTRFS, but well…

6) On *all* of my external eSATA based backup hard disks, for snapshotting older 
states of the backups (see the example below).
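
As an example of what I mean by snapshotting the backups (the paths are made 
up, and the data has to live in a subvolume for this to work):

  # btrfs subvolume snapshot -r /mnt/backup/data /mnt/backup/data-2016-09-11
  # btrfs subvolume list /mnt/backup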

> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
> of warnings and recommendations and is for me a bit better than the
> official BTRFS wiki when it comes to how to decide what features to use.

Nice page. I wasn't aware of this one.

If you use BTRFS with Debian, I suggest usually running the recent backports 
kernel, currently 4.6.
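
Something like this, assuming a plain jessie install that already has the 
jessie-backports entry in sources.list:

  # apt-get update
  # apt-get install -t jessie-backports linux-image-amd64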

Hmmm, maybe I had better remove that compress=lzo mount option. I never saw any 
issue with it, though. I will research what they say about it.
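
Switching should be possible without unmounting, roughly like this (zlib 
instead of lzo only affects newly written data; /home is just an example, and 
/etc/fstab needs the same change to survive a reboot):

  # mount -o remount,compress=zlib /home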

> The Nouveau graphics driver have a nice feature matrix on it's webpage
> and I think that BTRFS perhaps should consider doing something like that
> on it's official wiki as well

BTRFS also has a feature matrix. The links to it are in the "News" section 
however:

https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature

The thing is: this just seems to be a matrix of when a feature was implemented, 
not of when it is considered to be stable. I think this could be done with 
colors or so, like red for not supported, yellow for implemented and green for 
production ready.

Another hint you can get by reading the SLES 12 release notes. SUSE has dared to 
support BTRFS for quite a while – frankly, I think for SLES 11 SP 3 this was 
premature, at least for the initial release without updates. I have a VM with 
BTRFS that I can break very easily, with BTRFS saying it is full while it still 
has 2 GB free. But well… this still seems to happen for some people, according 
to the threads on the BTRFS mailing list.
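
As far as I know, the usual workaround for that situation is a filtered balance 
that compacts the mostly empty chunks so the space becomes allocatable again 
(the mount point and the percentage are just examples):

  # btrfs filesystem df /mnt
  # btrfs balance start -dusage=5 /mnt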

SUSE doesn't support all of BTRFS. They even put features they do not support 
behind an "allow_unsupported=1" module option:

https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-314697

But they even seem to contradict themselves by claiming they support RAID 0, 
RAID 1 and RAID 10, but not RAID 5 or RAID 6, and then putting RAID behind that 
module option – or I misunderstood their RAID statement:

"Btrfs is supported on top of MD (multiple devices) and DM (device mapper) 
configurations. Use the YaST partitioner to achieve a proper setup. 
Multivolume Btrfs is supported in RAID0, RAID1, and RAID10 profiles in SUSE 
Linux Enterprise 12, higher RAID levels are not yet supported, but might be 
enabled with a future service pack."

and they only support BTRFS on MD for RAID. They also do not support 
compression yet. They do not even support big metadata.

https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221

Interestingly enough, Red Hat only supports BTRFS as a technology preview, even 
with RHEL 7.

> For example something along the lines of .... (the statuses are taken
> our of thin air just for demonstration purposes)

I'd say feel free to work with the feature matrix already there and fill in 
information about stability. I think it makes sense, though, to discuss first 
how to do it while still keeping it manageable.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 10:23 ` Martin Steigerwald
@ 2016-09-11 11:21   ` Zoiled
  2016-09-11 11:43     ` Martin Steigerwald
  2016-09-12 12:48   ` Swâmi Petaramesh
  1 sibling, 1 reply; 93+ messages in thread
From: Zoiled @ 2016-09-11 11:21 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs

Martin Steigerwald wrote:
> Am Sonntag, 11. September 2016, 10:55:21 CEST schrieb Waxhead:
>> I have been following BTRFS for years and have recently been starting to
>> use BTRFS more and more and as always BTRFS' stability is a hot topic.
>> Some says that BTRFS is a dead end research project while others claim
>> the opposite.
> First off: On my systems BTRFS definately runs too stable for a research
> project. Actually: I have zero issues with stability of BTRFS on *any* of my
> systems at the moment and in the last half year.
>
> The only issue I had till about half an year ago was BTRFS getting stuck at
> seeking free space on a highly fragmented RAID 1 + compress=lzo /home. This
> went away with either kernel 4.4 or 4.5.
>
> Additionally I never ever lost even a single byte of data on my own BTRFS
> filesystems. I had a checksum failure on one of the SSDs, but BTRFS RAID 1
> repaired it.
>
>
> Where do I use BTRFS?
>
> 1) On this ThinkPad T520 with two SSDs. /home and / in RAID 1, another data
> volume as single. In case you can read german, search blog.teamix.de for
> BTRFS.
>
> 2) On my music box ThinkPad T42 for /home. I did not bother to change / so far
> and may never to so for this laptop. It has a slow 2,5 inch harddisk.
>
> 3) I used it on Workstation at work as well for a data volume in RAID 1. But
> workstation is no more (not due to a filesystem failure).
>
> 4) On a server VM for /home with Maildirs and Owncloud data. /var is still on
> Ext4, but I want to migrate it as well. Whether I ever change /, I don´t know.
>
> 5) On another server VM, a backup VM which I currently use with borgbackup.
> With borgbackup I actually wouldn´t really need BTRFS, but well…
>
> 6) On *all* of my externel eSATA based backup harddisks for snapshotting older
> states of the backups.
In other words, you are one of those who claim the opposite :) I have 
also myself run btrfs on a "toy" filesystem since 2013 without any 
issues, but this is more or less irrelevant, since some people have 
experienced data loss thanks to unstable features that are not clearly 
marked as such.
And making the claim that you have not lost a single byte of data does not 
make sense - how did you test this? SHA256 against a backup? :)
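
A crude way to do such a check, assuming the live data and the backup hold 
the same tree (the paths are made up):

  $ cd /home && find . -type f -print0 | sort -z | xargs -0 sha256sum > /tmp/live.sums
  $ cd /backup/home && sha256sum -c --quiet /tmp/live.sums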
>> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
>> of warnings and recommendations and is for me a bit better than the
>> official BTRFS wiki when it comes to how to decide what features to use.
> Nice page. I wasn´t aware of this one.
>
> If you use BTRFS with Debian, I suggest to usually use the recent backport
> kernel, currently 4.6.
>
> Hmmm, maybe I better remove that compress=lzo mount option. Never saw any
> issue with it, tough. Will research what they say about it.
My point exactly: you did not know about this, and hence the risk of your 
data being gnawed on.
>> The Nouveau graphics driver have a nice feature matrix on it's webpage
>> and I think that BTRFS perhaps should consider doing something like that
>> on it's official wiki as well
> BTRFS also has a feature matrix. The links to it are in the "News" section
> however:
>
> https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature
I disagree, this is not a feature / stability matrix. It is clearly a 
changelog by kernel version.
> Thing is: This just seems to be when has a feature been implemented matrix.
> Not when it is considered to be stable. I think this could be done with colors
> or so. Like red for not supported, yellow for implemented and green for
> production ready.
Exactly, just like the Nouveau matrix. It clearly shows what you can 
expect from it.
> Another hint you can get by reading SLES 12 releasenotes. SUSE dares to
> support BTRFS since quite a while – frankly, I think for SLES 11 SP 3 this was
> premature, at least for the initial release without updates, I have a VM that
> with BTRFS I can break very easily having BTRFS say it is full, while it is
> has still 2 GB free. But well… this still seems to happen for some people
> according to the threads on BTRFS mailing list.
>
> SUSE doesn´t support all of BTRFS. They even put features they do not support
> behind a "allow_unsupported=1" module option:
>
> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-314697
>
> But they even seem to contradict themselves by claiming they support RAID 0,
> RAID 1 and RAID10, but not RAID 5 or RAID 6, but then putting RAID behind that
> module option – or I misunderstood their RAID statement
>
> "Btrfs is supported on top of MD (multiple devices) and DM (device mapper)
> configurations. Use the YaST partitioner to achieve a proper setup.
> Multivolume Btrfs is supported in RAID0, RAID1, and RAID10 profiles in SUSE
> Linux Enterprise 12, higher RAID levels are not yet supported, but might be
> enabled with a future service pack."
>
> and they only support BTRFS on MD for RAID. They also do not support
> compression yet. They even do not support big metadata.
>
> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221
>
> Interestingly enough RedHat only supports BTRFS as a technology preview, even
> with RHEL 7.
>
I would much rather prefer to rely on the btrfs wiki as the source, and 
not on the distros' ideas about what is reliable or not. The Debian wiki is 
nice, but there should honestly not be any need for it if the btrfs wiki 
had the relevant information.
>> For example something along the lines of .... (the statuses are taken
>> our of thin air just for demonstration purposes)
> I´d say feel free to work with the feature matrix already there and fill in
> information about stability. I think it makes sense tough to discuss first on
> how to do it with still keeping it manageable.
>
> Thanks,
I am afraid the changelog is not a stability/status feature matrix, as 
you yourself have mentioned, but absolutely, I could have edited the wiki 
and seen what happened :)


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 11:21   ` Zoiled
@ 2016-09-11 11:43     ` Martin Steigerwald
  2016-09-11 12:05       ` Martin Steigerwald
  2016-09-11 12:30       ` Waxhead
  0 siblings, 2 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-11 11:43 UTC (permalink / raw)
  To: Zoiled; +Cc: linux-btrfs

On Sunday, 11 September 2016, 13:21:30 CEST, Zoiled wrote:
> Martin Steigerwald wrote:
> > Am Sonntag, 11. September 2016, 10:55:21 CEST schrieb Waxhead:
> >> I have been following BTRFS for years and have recently been starting to
> >> use BTRFS more and more and as always BTRFS' stability is a hot topic.
> >> Some says that BTRFS is a dead end research project while others claim
> >> the opposite.
> > 
> > First off: On my systems BTRFS definately runs too stable for a research
> > project. Actually: I have zero issues with stability of BTRFS on *any* of
> > my systems at the moment and in the last half year.
> > 
> > The only issue I had till about half an year ago was BTRFS getting stuck
> > at
> > seeking free space on a highly fragmented RAID 1 + compress=lzo /home.
> > This
> > went away with either kernel 4.4 or 4.5.
> > 
> > Additionally I never ever lost even a single byte of data on my own BTRFS
> > filesystems. I had a checksum failure on one of the SSDs, but BTRFS RAID 1
> > repaired it.
> > 
> > 
> > Where do I use BTRFS?
> > 
> > 1) On this ThinkPad T520 with two SSDs. /home and / in RAID 1, another
> > data
> > volume as single. In case you can read german, search blog.teamix.de for
> > BTRFS.
> > 
> > 2) On my music box ThinkPad T42 for /home. I did not bother to change / so
> > far and may never to so for this laptop. It has a slow 2,5 inch harddisk.
> > 
> > 3) I used it on Workstation at work as well for a data volume in RAID 1.
> > But workstation is no more (not due to a filesystem failure).
> > 
> > 4) On a server VM for /home with Maildirs and Owncloud data. /var is still
> > on Ext4, but I want to migrate it as well. Whether I ever change /, I
> > don´t know.
> > 
> > 5) On another server VM, a backup VM which I currently use with
> > borgbackup.
> > With borgbackup I actually wouldn´t really need BTRFS, but well…
> > 
> > 6) On *all* of my externel eSATA based backup harddisks for snapshotting
> > older states of the backups.
> 
> In other words, you are one of those who claim the opposite :) I have
> also myself run btrfs for a "toy" filesystem since 2013 without any
> issues, but this is more or less irrelevant since some people have
> experienced data loss thanks to unstable features that are not clearly
> marked as such.
> And making a claim that you have not lost a single byte of data does not
> make sense, how did you test this? SHA256 against a backup? :)

Do you have any proof like that with *any* other filesystem on Linux?

No, my claim is a bit weaker: BTRFS' own scrubbing feature, and no I/O errors 
when rsyncing my data over to the backup drive – BTRFS checks checksums on 
read as well – and yes, I know BTRFS uses a weaker hashing algorithm, I think 
crc32c. Yet this is still more than what I can say about *any* other 
filesystem I have used so far. To my current knowledge neither XFS nor Ext4/3 
provide data checksumming. They do have metadata checksumming, and I found 
contradicting information on whether XFS may support data checksumming in the 
future, but up to now there is no *proof* *whatsoever* from the side of the 
filesystem that the data is what it was when I saved it initially. There may 
be bit errors rotting on any of your Ext4 and XFS filesystems without you even 
noticing for *years*. I think that is still unlikely, but it can happen; I have 
seen this years ago after restoring a backup with bit errors from a hardware 
RAID controller.

Of course, I rely on the checksumming feature within BTRFS – which may have 
errors. But even that is more than with any other filesystem I had before.

And I do not scrub daily, especially not the backup disks, but for all scrubs 
up to now, no issues. So, granted, my claim has been a bit bold. Right now I 
have no up-to-this-day scrubs, so all I can say is that I am not aware of any 
data losses up to the point in time when I last scrubbed my devices. I am just 
redoing the scrubbing now on my laptop.

> >> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
> >> of warnings and recommendations and is for me a bit better than the
> >> official BTRFS wiki when it comes to how to decide what features to use.
> > 
> > Nice page. I wasn´t aware of this one.
> > 
> > If you use BTRFS with Debian, I suggest to usually use the recent backport
> > kernel, currently 4.6.
> > 
> > Hmmm, maybe I better remove that compress=lzo mount option. Never saw any
> > issue with it, tough. Will research what they say about it.
> 
> My point exactly: You did not know about this and hence the risk of your
> data being gnawed on.

Well, I do follow the BTRFS mailing list to some extent, and I recommend anyone 
who uses BTRFS in production to do the same. And: so far I see no data loss from 
using that option, and for me personally that is exactly what counts. :)

Still: information on which features are stable with which version of the kernel 
and btrfs-progs is important. I totally agree with that, and there is not the 
slightest need to discuss it.

But also, just saying "I wasn't aware" is no excuse either. BTRFS is not 
officially declared fully production ready. Just read this:

https://btrfs.wiki.kernel.org/index.php/Main_Page#Stability_status

It just talks about the disk format being stable and somewhat cowardly avoids 
any statement regarding production stability. If I read this, I would think: 
okay, I may use this, but I had better check back more closely and be prepared 
to upgrade kernels and read the BTRFS mailing list.

That said, the statement avoids clarity to some extent and I think it would be 
better to formulate it in a clearer way.

> >> The Nouveau graphics driver have a nice feature matrix on it's webpage
> >> and I think that BTRFS perhaps should consider doing something like that
> >> on it's official wiki as well
> > 
> > BTRFS also has a feature matrix. The links to it are in the "News" section
> > however:
> > 
> > https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature

> I disagree, this is not a feature / stability matrix. It is a clearly a
> changelog by kernel version.

It is a *feature* matrix. I fully said it is not about stability, but about 
implementation – I wrote this just a sentence after that one. There is no need 
whatsoever to further discuss this, as I never claimed that it is a feature / 
stability matrix in the first place.

> > Thing is: This just seems to be when has a feature been implemented
> > matrix.
> > Not when it is considered to be stable. I think this could be done with
> > colors or so. Like red for not supported, yellow for implemented and
> > green for production ready.
> 
> Exactly, just like the Nouveau matrix. It clearly shows what you can
> expect from it.
> 
> > Another hint you can get by reading SLES 12 releasenotes. SUSE dares to
> > support BTRFS since quite a while – frankly, I think for SLES 11 SP 3 this
> > was premature, at least for the initial release without updates, I have a
> > VM that with BTRFS I can break very easily having BTRFS say it is full,
> > while it is has still 2 GB free. But well… this still seems to happen for
> > some people according to the threads on BTRFS mailing list.
> > 
> > SUSE doesn´t support all of BTRFS. They even put features they do not
> > support behind a "allow_unsupported=1" module option:
> > 
> > https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-314697
> > 
> > But they even seem to contradict themselves by claiming they support RAID
> > 0, RAID 1 and RAID10, but not RAID 5 or RAID 6, but then putting RAID
> > behind that module option – or I misunderstood their RAID statement
> > 
> > "Btrfs is supported on top of MD (multiple devices) and DM (device mapper)
> > configurations. Use the YaST partitioner to achieve a proper setup.
> > Multivolume Btrfs is supported in RAID0, RAID1, and RAID10 profiles in
> > SUSE
> > Linux Enterprise 12, higher RAID levels are not yet supported, but might
> > be
> > enabled with a future service pack."
> > 
> > and they only support BTRFS on MD for RAID. They also do not support
> > compression yet. They even do not support big metadata.
> > 
> > https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221
> > 
> > Interestingly enough RedHat only supports BTRFS as a technology preview,
> > even with RHEL 7.
> 
> I would much rather prefer to rely on the btrfs wiki as the source and
> not distro's ideas about what is reliable or not. The Debian wiki is
> nice, but there should honestly not be any need for it if the btrfs wiki
> had the relevant information.

See, this is what you prefer. And then there is reality.

It seems reality doesn't match what you prefer. You can now spend time 
complaining about this, or… offer your help to improve the situation.

If you choose the complaining path, I am out, and would rather spend my time 
enjoying using BTRFS as I do. Maybe reviewing that compress=lzo thing.

When I first read your subject "Is stability a joke?" I wondered whether to even 
answer it. Fortunately your post has been a bit more than this complaint.

And trust me, I have been there. I complained about stability here myself. And 
I found that it didn't help my cause very much.

> >> For example something along the lines of .... (the statuses are taken
> >> our of thin air just for demonstration purposes)
> > 
> > I´d say feel free to work with the feature matrix already there and fill
> > in
> > information about stability. I think it makes sense tough to discuss first
> > on how to do it with still keeping it manageable.

> I am afraid the changelog is not a stability/status feature matrix as
> you yourself have mentioned, but absolutely I could have edited the wiki
> and see what happened :)

I think a good next step would be to ask developers / users about feature 
stability and then update the wiki. If that is important to you, I suggest 
you invest some energy in doing that. And ask for help. This mailing list is 
a good place for it.

I already gave you my idea on what works for me.

There is just one thing where I won't go even a single step further: the 
complaining path. As it leads to no desirable outcome.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 11:43     ` Martin Steigerwald
@ 2016-09-11 12:05       ` Martin Steigerwald
  2016-09-11 12:39         ` Waxhead
  2016-09-11 17:11         ` Duncan
  2016-09-11 12:30       ` Waxhead
  1 sibling, 2 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-11 12:05 UTC (permalink / raw)
  To: Zoiled; +Cc: linux-btrfs

On Sunday, 11 September 2016, 13:43:59 CEST, Martin Steigerwald wrote:
> > >> The Nouveau graphics driver have a nice feature matrix on it's webpage
> > >> and I think that BTRFS perhaps should consider doing something like
> > >> that
> > >> on it's official wiki as well
> > > 
> > > BTRFS also has a feature matrix. The links to it are in the "News"
> > > section
> > > however:
> > > 
> > > https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature
> > 
> > I disagree, this is not a feature / stability matrix. It is a clearly a
> > changelog by kernel version.
> 
> It is a *feature* matrix. I fully said its not about stability, but about 
> implementation – I just wrote this a sentence after this one. There is no
> need  whatsoever to further discuss this as I never claimed that it is a
> feature / stability matrix in the first place.
> 
> > > Thing is: This just seems to be when has a feature been implemented
> > > matrix.
> > > Not when it is considered to be stable. I think this could be done with
> > > colors or so. Like red for not supported, yellow for implemented and
> > > green for production ready.
> > 
> > Exactly, just like the Nouveau matrix. It clearly shows what you can
> > expect from it.

I mentioned this matrix as a good *starting* point. And I think it would be 
easy to extend it:

Just add another column called "Production ready". Then research / ask about 
the production stability of each feature. The only challenge is: who is 
authoritative on that? I'd certainly ask the developer of a feature, but I'd 
also consider user reports to some extent.

Maybe that's the real challenge.

If you wish, I'd go through each feature there and give my own estimation. But 
I think there are others who are deeper into this.

I do think, for example, that scrubbing and auto RAID repair are stable, except 
for RAID 5/6. Device statistics and RAID 0 and 1 I also consider to be stable. 
I think RAID 10 is stable as well, but as I do not run it, I don't know. For me 
skinny-metadata is stable too. For me, so far, even compress=lzo seems to be 
stable, but for others it may not be.
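
By device statistics I mean the per-device error counters, e.g. (the mount 
point is again just an example):

  # btrfs device stats /home

which report read, write, flush, corruption and generation errors per device.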

Since what kernel version? Now, there you go. I have no idea. All I know is 
that I started with BTRFS at kernel 2.6.38 or 2.6.39 on my laptop, but not as 
RAID 1 at that time.

See, the implementation time of a feature is much easier to assess. Maybe 
that is part of the reason why there is no stability matrix: maybe no one 
*exactly* knows *for sure*. How could you? So I would even put a footnote on 
that "production ready" column explaining "Considered to be stable by developer 
and user opinions".

Of course, additionally it would be good to read about experiences of corporate 
usage of BTRFS. I know at least Fujitsu, SUSE, Facebook and Oracle are using it, 
but I don't know in what configurations and with what experiences. One Oracle 
developer invests a lot of time in bringing BTRFS-like features to XFS, Red Hat 
still favors XFS over BTRFS, and even SLES defaults to XFS for /home and other 
non-/ filesystems. That also tells a story.

Some ideas you can get from the SUSE release notes. Even if you do not want to 
use SLES, they tell you something, and I bet they are one of the better sources 
of information regarding your question that you can get at this time. Because I 
believe SUSE developers invested some time in assessing the stability of 
features, as they would carefully assess what they can support in enterprise 
environments. There is also someone from Fujitsu who shared experiences in a 
talk; I can search for the URL to the slides again.

I bet Chris Mason and other BTRFS developers at Facebook have some idea of 
what they use within Facebook as well. To what extent they are allowed to talk 
about it… I don't know. My personal impression is that as soon as Chris went 
to Facebook he became quite quiet. Maybe just due to being busy. Maybe due to 
Facebook being concerned much more about its own privacy than about that of 
its users.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 11:43     ` Martin Steigerwald
  2016-09-11 12:05       ` Martin Steigerwald
@ 2016-09-11 12:30       ` Waxhead
  2016-09-11 14:36         ` Martin Steigerwald
  1 sibling, 1 reply; 93+ messages in thread
From: Waxhead @ 2016-09-11 12:30 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs

Martin Steigerwald wrote:
> Am Sonntag, 11. September 2016, 13:21:30 CEST schrieb Zoiled:
>> Martin Steigerwald wrote:
>>> Am Sonntag, 11. September 2016, 10:55:21 CEST schrieb Waxhead:
>>>> I have been following BTRFS for years and have recently been starting to
>>>> use BTRFS more and more and as always BTRFS' stability is a hot topic.
>>>> Some says that BTRFS is a dead end research project while others claim
>>>> the opposite.
>>> First off: On my systems BTRFS definately runs too stable for a research
>>> project. Actually: I have zero issues with stability of BTRFS on *any* of
>>> my systems at the moment and in the last half year.
>>>
>>> The only issue I had till about half an year ago was BTRFS getting stuck
>>> at
>>> seeking free space on a highly fragmented RAID 1 + compress=lzo /home.
>>> This
>>> went away with either kernel 4.4 or 4.5.
>>>
>>> Additionally I never ever lost even a single byte of data on my own BTRFS
>>> filesystems. I had a checksum failure on one of the SSDs, but BTRFS RAID 1
>>> repaired it.
>>>
>>>
>>> Where do I use BTRFS?
>>>
>>> 1) On this ThinkPad T520 with two SSDs. /home and / in RAID 1, another
>>> data
>>> volume as single. In case you can read german, search blog.teamix.de for
>>> BTRFS.
>>>
>>> 2) On my music box ThinkPad T42 for /home. I did not bother to change / so
>>> far and may never to so for this laptop. It has a slow 2,5 inch harddisk.
>>>
>>> 3) I used it on Workstation at work as well for a data volume in RAID 1.
>>> But workstation is no more (not due to a filesystem failure).
>>>
>>> 4) On a server VM for /home with Maildirs and Owncloud data. /var is still
>>> on Ext4, but I want to migrate it as well. Whether I ever change /, I
>>> don´t know.
>>>
>>> 5) On another server VM, a backup VM which I currently use with
>>> borgbackup.
>>> With borgbackup I actually wouldn´t really need BTRFS, but well…
>>>
>>> 6) On *all* of my externel eSATA based backup harddisks for snapshotting
>>> older states of the backups.
>> In other words, you are one of those who claim the opposite :) I have
>> also myself run btrfs for a "toy" filesystem since 2013 without any
>> issues, but this is more or less irrelevant since some people have
>> experienced data loss thanks to unstable features that are not clearly
>> marked as such.
>> And making a claim that you have not lost a single byte of data does not
>> make sense, how did you test this? SHA256 against a backup? :)
> Do you have any proof like that with *any* other filesystem on Linux?
>
> No, my claim is a bit weaker: BTRFS own scrubbing feature and well no I/O
> errors on rsyncing my data over to the backup drive - BTRFS checks checksum on
> read as well –, and yes I know BTRFS uses a weaker hashing algorithm, I think
> crc32c. Yet this is still more than what I can say about *any* other
> filesystem I used so far. Up to my current knowledge neither XFS nor Ext4/3
> provide data checksumming. They do have metadata checksumming and I found
> contradicting information on whether XFS may support data checksumming in the
> future, but up to now, no *proof* *whatsoever* from side of the filesystem
> that the data is, what it was when I saved it initially. There may be bit
> errors rotting on any of your Ext4 and XFS filesystem without you even
> noticing for *years*. I think thats still unlikely, but it can happen, I have
> seen this years ago after restoring a backup with bit errors from a hardware
> RAID controller.
>
> Of course, I rely on the checksumming feature within BTRFS – which may have
> errors. But even that is more than with any other filesystem I had before.
>
> And I do not scrub daily, especially not the backup disks, but for any scrubs
> up to now, no issues. So, granted, my claim has been a bit bold. Right now I
> have no up-to-this-day scrubs so all I can say is that I am not aware of any
> data losses up to the point in time where I last scrubbed my devices. Just
> redoing the scrubbing now on my laptop.
The way I see it, BTRFS is the best filesystem we have got so far. It is also 
the first (to my knowledge) that provides checksums of both data and 
metadata. My point was simply that such an extraordinary claim requires 
some evidence. I am not saying it is unlikely that you have never lost a 
byte, I am just saying that it is a fantastic thing to claim.
>>>> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
>>>> of warnings and recommendations and is for me a bit better than the
>>>> official BTRFS wiki when it comes to how to decide what features to use.
>>> Nice page. I wasn´t aware of this one.
>>>
>>> If you use BTRFS with Debian, I suggest to usually use the recent backport
>>> kernel, currently 4.6.
>>>
>>> Hmmm, maybe I better remove that compress=lzo mount option. Never saw any
>>> issue with it, tough. Will research what they say about it.
>> My point exactly: You did not know about this and hence the risk of your
>> data being gnawed on.
> Well I do follow BTRFS mailinglist to some extent and I recommend anyone who
> uses BTRFS in production to do this. And: So far I see no data loss from using
> that option and for me personally it is exactly that what counts. J
>
> Still: An information on what features are stable with what version of kernel
> and btrfs-progrs is important. I totally agree with that and there is not the
> slighted need to discuss about it.
>
> But also just saying: I wasn´t aware is no excuse either. BTRFS is not
> officially declared fully production ready. Just read this:
>
> https://btrfs.wiki.kernel.org/index.php/Main_Page#Stability_status
>
> It just talks about the disk format being stable and a bit cowardly avoids any
> statement regarding production stability. If I´d read this, I´d think: Okay, I
> may use this, but I better check back more closely and be prepared to upgrade
> kernels and read BTRFS mailinglist.
>
> That said, the statement avoids clarity to some extent and I think it would be
> better for formulate it in a clearer way.
>
Regarding the stability status, it can give the false impression that if, 
for example, feature XYZ was introduced 5-6 kernel releases back, it is 
stable and good to go, which may or may not be the case. Yes, it indicates 
that btrfs is stabilizing, but as you said it is not very precise.
>>>> The Nouveau graphics driver have a nice feature matrix on it's webpage
>>>> and I think that BTRFS perhaps should consider doing something like that
>>>> on it's official wiki as well
>>> BTRFS also has a feature matrix. The links to it are in the "News" section
>>> however:
>>>
>>> https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature
>> I disagree, this is not a feature / stability matrix. It is a clearly a
>> changelog by kernel version.
> It is a *feature* matrix. I fully said its not about stability, but about
> implementation – I just wrote this a sentence after this one. There is no need
> whatsoever to further discuss this as I never claimed that it is a feature /
> stability matrix in the first place.
Ok, I was a bit unclear... it is a feature matrix, but not a feature + 
stability matrix.
My mistake - sorry about that.
>>> Thing is: This just seems to be when has a feature been implemented
>>> matrix.
>>> Not when it is considered to be stable. I think this could be done with
>>> colors or so. Like red for not supported, yellow for implemented and
>>> green for production ready.
>> Exactly, just like the Nouveau matrix. It clearly shows what you can
>> expect from it.
>>
>>> Another hint you can get by reading SLES 12 releasenotes. SUSE dares to
>>> support BTRFS since quite a while – frankly, I think for SLES 11 SP 3 this
>>> was premature, at least for the initial release without updates, I have a
>>> VM that with BTRFS I can break very easily having BTRFS say it is full,
>>> while it is has still 2 GB free. But well… this still seems to happen for
>>> some people according to the threads on BTRFS mailing list.
>>>
>>> SUSE doesn´t support all of BTRFS. They even put features they do not
>>> support behind a "allow_unsupported=1" module option:
>>>
>>> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-314697
>>>
>>> But they even seem to contradict themselves by claiming they support RAID
>>> 0, RAID 1 and RAID10, but not RAID 5 or RAID 6, but then putting RAID
>>> behind that module option – or I misunderstood their RAID statement
>>>
>>> "Btrfs is supported on top of MD (multiple devices) and DM (device mapper)
>>> configurations. Use the YaST partitioner to achieve a proper setup.
>>> Multivolume Btrfs is supported in RAID0, RAID1, and RAID10 profiles in
>>> SUSE
>>> Linux Enterprise 12, higher RAID levels are not yet supported, but might
>>> be
>>> enabled with a future service pack."
>>>
>>> and they only support BTRFS on MD for RAID. They also do not support
>>> compression yet. They even do not support big metadata.
>>>
>>> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221
>>>
>>> Interestingly enough RedHat only supports BTRFS as a technology preview,
>>> even with RHEL 7.
>> I would much rather prefer to rely on the btrfs wiki as the source and
>> not distro's ideas about what is reliable or not. The Debian wiki is
>> nice, but there should honestly not be any need for it if the btrfs wiki
>> had the relevant information.
> See, this is what you prefer. And then there is reality.
>
> It seems reality doesn´t match what you prefer. You can now spend time
> complaining about this, or… offer your help to improve the situation.
I do not intend to be hostile. This depends on how you view my mail 
and how you interpret my response.
Regardless of whether you consider my mail to be a complaint or not, the 
intention is of course to improve the situation, and by posting to the 
mailing list I hope to do exactly that.
>
> If you choose the complaining path, I am out, and rather spend my time
> enjoying to use BTRFS as I do. Maybe reviewing that compress=lzo thing.
>
> As I first read your subject "Is stability a joke?" I wondered whether to even
> answer this. Fortunately your post has been a bit more than this complaint.
The subject was chosen for the following reason:
I like BTRFS and think the filesystem is fantastic. Many of the issues 
people run into are on older kernels, and the BTRFS wiki is not very 
clear on the stability stuff. Ergo, should not the stability of the 
filesystem be taken a bit more seriously from a documentation point of view?
> And trust me, I have been there. I complained myself about stability here. And
> I found that it didn´t help my cause very much.
>
>>>> For example something along the lines of .... (the statuses are taken
>>>> our of thin air just for demonstration purposes)
>>> I´d say feel free to work with the feature matrix already there and fill
>>> in
>>> information about stability. I think it makes sense tough to discuss first
>>> on how to do it with still keeping it manageable.
>> I am afraid the changelog is not a stability/status feature matrix as
>> you yourself have mentioned, but absolutely I could have edited the wiki
>> and see what happened :)
> I think what would be a good next step would be to ask developers / users
> about feature stability and then update the wiki. If thats important to you, I
> suggest you invest some energy in doing that. And ask for help. This
> mailinglist is a good idea.
>
> I already gave you my idea on what works for me.
>
> There is just one thing I won´t go further even a single step: The complaining
> path. As it leads to no desirable outcome.
>
> Thanks,
My intention was not to be hostile, and if my response sounded a bit harsh 
to you then by all means I apologize for that.
I pointed out what I felt needed improvement and I also supplied an 
example of how to improve it. I am sorry, but I do not myself see any 
part of my mail that comes off as complaining. It may have been 
unnecessary for me to point out your "did not lose a single byte" claim, 
as this was obviously more a figure of speech, and if it felt a bit 
hostile, I'm sorry about that. I hope that I have with this cleared up 
any potential confusion. Have a still nice day, Sir :)


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 12:05       ` Martin Steigerwald
@ 2016-09-11 12:39         ` Waxhead
  2016-09-11 13:02           ` Hugo Mills
                             ` (2 more replies)
  2016-09-11 17:11         ` Duncan
  1 sibling, 3 replies; 93+ messages in thread
From: Waxhead @ 2016-09-11 12:39 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: linux-btrfs

Martin Steigerwald wrote:
> Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
>>>>> The Nouveau graphics driver have a nice feature matrix on it's webpage
>>>>> and I think that BTRFS perhaps should consider doing something like
>>>>> that
>>>>> on it's official wiki as well
>>>> BTRFS also has a feature matrix. The links to it are in the "News"
>>>> section
>>>> however:
>>>>
>>>> https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature
>>> I disagree, this is not a feature / stability matrix. It is a clearly a
>>> changelog by kernel version.
>> It is a *feature* matrix. I fully said its not about stability, but about
>> implementation – I just wrote this a sentence after this one. There is no
>> need  whatsoever to further discuss this as I never claimed that it is a
>> feature / stability matrix in the first place.
>>
>>>> Thing is: This just seems to be when has a feature been implemented
>>>> matrix.
>>>> Not when it is considered to be stable. I think this could be done with
>>>> colors or so. Like red for not supported, yellow for implemented and
>>>> green for production ready.
>>> Exactly, just like the Nouveau matrix. It clearly shows what you can
>>> expect from it.
> I mentioned this matrix as a good *starting* point. And I think it would be
> easy to extent it:
>
> Just add another column called "Production ready". Then research / ask about
> production stability of each feature. The only challenge is: Who is
> authoritative on that? I´d certainly ask the developer of a feature, but I´d
> also consider user reports to some extent.
>
> Maybe thats the real challenge.
>
> If you wish, I´d go through each feature there and give my own estimation. But
> I think there are others who are deeper into this.
That is exactly the same reason I don't edit the wiki myself. I could of 
course get it started and hopefully someone would correct what I write, 
but I feel that if I start this off I don't have deep enough knowledge 
to make a proper start. Perhaps I will change my mind about this.
>
> I do think for example that scrubbing and auto raid repair are stable, except
> for RAID 5/6. Also device statistics and RAID 0 and 1 I consider to be stable.
> I think RAID 10 is also stable, but as I do not run it, I don´t know. For me
> also skinny-metadata is stable. For me so far even compress=lzo seems to be
> stable, but well for others it may not.
>
> Since what kernel version? Now, there you go. I have no idea. All I know I
> started BTRFS with Kernel 2.6.38 or 2.6.39 on my laptop, but not as RAID 1 at
> that time.
>
> See, the implementation time of a feature is much easier to assess. Maybe
> thats part of the reason why there is not stability matrix: Maybe no one
> *exactly* knows *for sure*. How could you? So I would even put a footnote on
> that "production ready" row explaining "Considered to be stable by developer
> and user oppinions".
>
> Of course additionally it would be good to read about experiences of corporate
> usage of BTRFS. I know at least Fujitsu, SUSE, Facebook, Oracle are using it.
> But I don´t know in what configurations and with what experiences. One Oracle
> developer invests a lot of time to bring BTRFS like features to XFS and RedHat
> still favors XFS over BTRFS, even SLES defaults to XFS for /home and other non
> /-filesystems. That also tells a story.
>
> Some ideas you can get from SUSE releasenotes. Even if you do not want to use
> it, it tells something and I bet is one of the better sources of information
> regarding your question you can get at this time. Cause I believe SUSE
> developers invested some time to assess the stability of features. Cause they
> would carefully assess what they can support in enterprise environments. There
> is also someone from Fujitsu who shared experiences in a talk, I can search
> the URL to the slides again.
By all means, SUSE's wiki is very valuable. I just said that I *prefer* 
to have that stuff on the BTRFS wiki and feel that is the right place 
for it.
>
> I bet Chris Mason and other BTRFS developers at Facebook have some idea on
> what they use within Facebook as well. To what extent they are allowed to talk
> about it… I don´t know. My personal impression is that as soon as Chris went
> to Facebook he became quite quiet. Maybe just due to being busy. Maybe due to
> Facebook being concerned much more about the privacy of itself than of its
> users.
>
> Thanks,


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 12:39         ` Waxhead
@ 2016-09-11 13:02           ` Hugo Mills
  2016-09-11 14:59             ` Martin Steigerwald
                               ` (2 more replies)
  2016-09-11 14:54           ` Martin Steigerwald
  2016-09-11 17:46           ` Marc MERLIN
  2 siblings, 3 replies; 93+ messages in thread
From: Hugo Mills @ 2016-09-11 13:02 UTC (permalink / raw)
  To: Waxhead; +Cc: Martin Steigerwald, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 5221 bytes --]

On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
> Martin Steigerwald wrote:
> >Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
> >>>>Thing is: This just seems to be when has a feature been implemented
> >>>>matrix.
> >>>>Not when it is considered to be stable. I think this could be done with
> >>>>colors or so. Like red for not supported, yellow for implemented and
> >>>>green for production ready.
> >>>Exactly, just like the Nouveau matrix. It clearly shows what you can
> >>>expect from it.
> >I mentioned this matrix as a good *starting* point. And I think it would be
> >easy to extent it:
> >
> >Just add another column called "Production ready". Then research / ask about
> >production stability of each feature. The only challenge is: Who is
> >authoritative on that? I´d certainly ask the developer of a feature, but I´d
> >also consider user reports to some extent.
> >
> >Maybe thats the real challenge.
> >
> >If you wish, I´d go through each feature there and give my own estimation. But
> >I think there are others who are deeper into this.
> That is exactly the same reason I don't edit the wiki myself. I
> could of course get it started and hopefully someone will correct
> what I write, but I feel that if I start this off I don't have deep
> enough knowledge to do a proper start. Perhaps I will change my mind
> about this.

   Given that nobody else has done it yet, what are the odds that
someone else will step up to do it now? I would say that you should at
least try. Yes, you don't have as much knowledge as some others, but
if you keep working at it, you'll gain that knowledge. Yes, you'll
probably get it wrong to start with, but you probably won't get it
*very* wrong. You'll probably get it horribly wrong at some point, but
even the more knowledgeable people you're deferring to didn't identify
the problems with parity RAID until Zygo and Austin and Chris (and
others) put in the work to pin down the exact issues.

   So I'd strongly encourage you to set up and maintain the stability
matrix yourself -- you have the motivation at least, and the knowledge
will come with time and effort. Just keep reading the mailing list and
IRC and bugzilla, and try to identify where you see lots of repeated
problems, and where bugfixes in those areas happen.

   So, go for it. You have a lot to offer the community.

   Hugo.

> >I do think for example that scrubbing and auto raid repair are stable, except
> >for RAID 5/6. Also device statistics and RAID 0 and 1 I consider to be stable.
> >I think RAID 10 is also stable, but as I do not run it, I don´t know. For me
> >also skinny-metadata is stable. For me so far even compress=lzo seems to be
> >stable, but well for others it may not.
> >
> >Since what kernel version? Now, there you go. I have no idea. All I know I
> >started BTRFS with Kernel 2.6.38 or 2.6.39 on my laptop, but not as RAID 1 at
> >that time.
> >
> >See, the implementation time of a feature is much easier to assess. Maybe
> >thats part of the reason why there is not stability matrix: Maybe no one
> >*exactly* knows *for sure*. How could you? So I would even put a footnote on
> >that "production ready" row explaining "Considered to be stable by developer
> >and user oppinions".
> >
> >Of course additionally it would be good to read about experiences of corporate
> >usage of BTRFS. I know at least Fujitsu, SUSE, Facebook, Oracle are using it.
> >But I don´t know in what configurations and with what experiences. One Oracle
> >developer invests a lot of time to bring BTRFS like features to XFS and RedHat
> >still favors XFS over BTRFS, even SLES defaults to XFS for /home and other non
> >/-filesystems. That also tells a story.
> >
> >Some ideas you can get from SUSE releasenotes. Even if you do not want to use
> >it, it tells something and I bet is one of the better sources of information
> >regarding your question you can get at this time. Cause I believe SUSE
> >developers invested some time to assess the stability of features. Cause they
> >would carefully assess what they can support in enterprise environments. There
> >is also someone from Fujitsu who shared experiences in a talk, I can search
> >the URL to the slides again.
> By all means, SUSE's wiki is very valuable. I just said that I
> *prefer* to have that stuff on the BTRFS wiki and feel that is the
> right place for it.
> >
> >I bet Chris Mason and other BTRFS developers at Facebook have some idea on
> >what they use within Facebook as well. To what extent they are allowed to talk
> >about it… I don´t know. My personal impression is that as soon as Chris went
> >to Facebook he became quite quiet. Maybe just due to being busy. Maybe due to
> >Facebook being concerned much more about the privacy of itself than of its
> >users.
> >
> >Thanks,
> 

-- 
Hugo Mills             | How do you become King? You stand in the marketplace
hugo@... carfax.org.uk | and announce you're going to tax everyone. If you
http://carfax.org.uk/  | get out alive, you're King.
PGP: E2AB1DE4          |                                        Harry Harrison

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 12:30       ` Waxhead
@ 2016-09-11 14:36         ` Martin Steigerwald
  0 siblings, 0 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-11 14:36 UTC (permalink / raw)
  To: Waxhead; +Cc: linux-btrfs

On Sunday, 11 September 2016, 14:30:51 CEST, Waxhead wrote:
> > I think what would be a good next step would be to ask developers / users
> > about feature stability and then update the wiki. If thats important to
> > you, I suggest you invest some energy in doing that. And ask for help.
> > This mailinglist is a good idea.
> > 
> > I already gave you my idea on what works for me.
> > 
> > There is just one thing I won´t go further even a single step: The
> > complaining path. As it leads to no desirable outcome.
> > 
> > Thanks,
> 
> My intention was not to be hostile, and if my response sounded a bit harsh 
> to you then by all means I do apologize for that.

Okay, maybe I read something into your mail that you didn´t intend to put 
there. Sorry. Let us focus on the constructive way to move forward with this.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 12:39         ` Waxhead
  2016-09-11 13:02           ` Hugo Mills
@ 2016-09-11 14:54           ` Martin Steigerwald
  2016-09-11 15:19             ` Martin Steigerwald
  2016-09-11 20:21             ` Chris Murphy
  2016-09-11 17:46           ` Marc MERLIN
  2 siblings, 2 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-11 14:54 UTC (permalink / raw)
  To: Waxhead; +Cc: linux-btrfs

On Sunday, 11 September 2016, 14:39:14 CEST, Waxhead wrote:
> Martin Steigerwald wrote:
> > Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
> >>>>> The Nouveau graphics driver have a nice feature matrix on it's webpage
> >>>>> and I think that BTRFS perhaps should consider doing something like
> >>>>> that
> >>>>> on it's official wiki as well
> >>>> 
> >>>> BTRFS also has a feature matrix. The links to it are in the "News"
> >>>> section
> >>>> however:
> >>>> 
> >>>> https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature
[…]
> > I mentioned this matrix as a good *starting* point. And I think it would
> > be
> > easy to extent it:
> > 
> > Just add another column called "Production ready". Then research / ask
> > about production stability of each feature. The only challenge is: Who is
> > authoritative on that? I´d certainly ask the developer of a feature, but
> > I´d also consider user reports to some extent.
> > 
> > Maybe thats the real challenge.
> > 
> > If you wish, I´d go through each feature there and give my own estimation.
> > But I think there are others who are deeper into this.
> 
> That is exactly the same reason I don't edit the wiki myself. I could of
> course get it started and hopefully someone will correct what I write,
> but I feel that if I start this off I don't have deep enough knowledge
> to do a proper start. Perhaps I will change my mind about this.

Well, one thing would be to start with the column and fill in the easier 
stuff first. And if it is not known since which kernel version a feature has 
been stable, but it is known to be stable, I suggest conservatively putting in 
the first kernel version where people think it became stable, or, in doubt, 
even putting 4.7 into it. It can still be lowered to an earlier kernel version 
later.

Well: I made a tiny start. I linked "Features by kernel version" more 
prominently on the main page, so it is easier to find and also added the 
following warning just above the table:

"WARNING: The "Version" row states at which version a feature has been merged 
into the mainline kernel. It does not tell anything about at which kernel 
version it is considered mature enough for production use."

Now I wonder: would adding a "Production ready" column, stating the first 
kernel version known to be stable, make sense in this table? What do you think? 
I can add the column and give some first rough, conservative estimates for a 
few features.

What do you think? Is this a good place?

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 13:02           ` Hugo Mills
@ 2016-09-11 14:59             ` Martin Steigerwald
  2016-09-11 20:14             ` Chris Murphy
  2016-09-12 12:20             ` Austin S. Hemmelgarn
  2 siblings, 0 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-11 14:59 UTC (permalink / raw)
  To: Hugo Mills; +Cc: Waxhead, linux-btrfs

On Sunday, 11 September 2016, 13:02:21 CEST, Hugo Mills wrote:
> On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
> > Martin Steigerwald wrote:
> > >Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
> > >>>>Thing is: This just seems to be when has a feature been implemented
> > >>>>matrix.
> > >>>>Not when it is considered to be stable. I think this could be done
> > >>>>with
> > >>>>colors or so. Like red for not supported, yellow for implemented and
> > >>>>green for production ready.
> > >>>
> > >>>Exactly, just like the Nouveau matrix. It clearly shows what you can
> > >>>expect from it.
> > >
> > >I mentioned this matrix as a good *starting* point. And I think it would
> > >be
> > >easy to extent it:
> > >
> > >Just add another column called "Production ready". Then research / ask
> > >about production stability of each feature. The only challenge is: Who
> > >is authoritative on that? I´d certainly ask the developer of a feature,
> > >but I´d also consider user reports to some extent.
> > >
> > >Maybe thats the real challenge.
> > >
> > >If you wish, I´d go through each feature there and give my own
> > >estimation. But I think there are others who are deeper into this.
> > 
> > That is exactly the same reason I don't edit the wiki myself. I
> > could of course get it started and hopefully someone will correct
> > what I write, but I feel that if I start this off I don't have deep
> > enough knowledge to do a proper start. Perhaps I will change my mind
> > about this.
> 
>    Given that nobody else has done it yet, what are the odds that
> someone else will step up to do it now? I would say that you should at
> least try. Yes, you don't have as much knowledge as some others, but
> if you keep working at it, you'll gain that knowledge. Yes, you'll
> probably get it wrong to start with, but you probably won't get it
> *very* wrong. You'll probably get it horribly wrong at some point, but
> even the more knowledgable people you're deferring to didn't identify
> the problems with parity RAID until Zygo and Austin and Chris (and
> others) put in the work to pin down the exact issues.
> 
>    So I'd strongly encourage you to set up and maintain the stability
> matrix yourself -- you have the motivation at least, and the knowledge
> will come with time and effort. Just keep reading the mailing list and
> IRC and bugzilla, and try to identify where you see lots of repeated
> problems, and where bugfixes in those areas happen.
> 
>    So, go for it. You have a lot to offer the community.

Yep! Fully agreed.

-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 14:54           ` Martin Steigerwald
@ 2016-09-11 15:19             ` Martin Steigerwald
  2016-09-11 20:21             ` Chris Murphy
  1 sibling, 0 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-11 15:19 UTC (permalink / raw)
  To: Waxhead; +Cc: linux-btrfs

On Sunday, 11 September 2016, 16:54:25 CEST, you wrote:
> Am Sonntag, 11. September 2016, 14:39:14 CEST schrieb Waxhead:
> > Martin Steigerwald wrote:
> > > Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin 
Steigerwald:
> > >>>>> The Nouveau graphics driver have a nice feature matrix on it's
> > >>>>> webpage
> > >>>>> and I think that BTRFS perhaps should consider doing something like
> > >>>>> that
> > >>>>> on it's official wiki as well
> > >>>> 
> > >>>> BTRFS also has a feature matrix. The links to it are in the "News"
> > >>>> section
> > >>>> however:
> > >>>> 
> > >>>> https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature
> 
> […]
> 
> > > I mentioned this matrix as a good *starting* point. And I think it would
> > > be
> > > easy to extent it:
> > > 
> > > Just add another column called "Production ready". Then research / ask
> > > about production stability of each feature. The only challenge is: Who
> > > is
> > > authoritative on that? I´d certainly ask the developer of a feature, but
> > > I´d also consider user reports to some extent.
> > > 
> > > Maybe thats the real challenge.
> > > 
> > > If you wish, I´d go through each feature there and give my own
> > > estimation.
> > > But I think there are others who are deeper into this.
> > 
> > That is exactly the same reason I don't edit the wiki myself. I could of
> > course get it started and hopefully someone will correct what I write,
> > but I feel that if I start this off I don't have deep enough knowledge
> > to do a proper start. Perhaps I will change my mind about this.
> 
> Well one thing would be to start with the column and start filling the more
> easy stuff. And if its not known since what kernel version, but its known to
> be stable I suggest to conservatively just put the first kernel version
> into it where people think it is stable or in doubt even put 4.7 into it.
> It can still be reduced to lower kernel versions.
> 
> Well: I made a tiny start. I linked "Features by kernel version" more
> prominently on the main page, so it is easier to find and also added the
> following warning just above the table:
> 
> "WARNING: The "Version" row states at which version a feature has been
> merged into the mainline kernel. It does not tell anything about at which
> kernel version it is considered mature enough for production use."
> 
> Now I wonder: Would adding a "Production ready" column, stating the first
> known to be stable kernel version make sense in this table? What do you
> think? I can add the column and give some first rough, conservative
> estimations on a few features.
> 
> What do you think? Is this a good place?

It isn´t as straightforward to add this column as I thought. If I add it 
after "Version" then the following fields are not aligned anymore, even though 
they use some kind of identifier – but that identifier also doesn´t match the 
row title. After reading up on the MediaWiki syntax I came to the conclusion 
that I need to add the new column in every data row as well, and cannot just 
assign values to some rows and leave out what is not known yet.

! Feature !! Version !! Description !! Notes
{{FeatureMerged
|name=scrub
|version=3.0
|text=Read all data and verify checksums, repair if possible.
}}
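
Just to illustrate what I mean, a data row with such an extra field could 
then look roughly like this (the parameter name is invented here, and the 
Template:FeatureMerged page itself would of course need a matching parameter 
and table cell added before it renders anything):

{{FeatureMerged
|name=scrub
|version=3.0
|stable=?   <!-- hypothetical extra field: first kernel considered production ready -->
|text=Read all data and verify checksums, repair if possible.
}}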

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 12:05       ` Martin Steigerwald
  2016-09-11 12:39         ` Waxhead
@ 2016-09-11 17:11         ` Duncan
  2016-09-12 12:26           ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 93+ messages in thread
From: Duncan @ 2016-09-11 17:11 UTC (permalink / raw)
  To: linux-btrfs

Martin Steigerwald posted on Sun, 11 Sep 2016 14:05:03 +0200 as excerpted:

> Just add another column called "Production ready". Then research / ask
> about production stability of each feature. The only challenge is: Who
> is authoritative on that? I´d certainly ask the developer of a feature,
> but I´d also consider user reports to some extent.

Just a note that I'd *not* call it "production ready".  Btrfs in general 
is considered "stabilizing, not yet fully stable and mature", as I 
normally put it.  Thus, I'd call that column "stabilized to the level of 
btrfs in general", or perhaps just "stabilized", with a warning note giving 
the longer form.

Because "production ready" can mean many things to many people.  The term 
seems to come from a big enterprise stack, with enterprise generally both 
somewhat conservative in deployment, and having backups and often hot-
spare-redundancy available, because lost time is lost money, and lost 
data has serious legal and financial implications.

But by the same token, /because/ they have the resources for fail-over, 
etc, large enterprises can and occasionally do deploy still stabilizing 
technologies, knowing they have fall-backs if needed, that smaller 
businesses and individuals often don't have.

Which is in my mind what's going on here.  Some places may be using it in 
production, but if they're sane, they have backups and even fail-over 
available.  Which is quite a bit different from saying it's "production 
ready" on a lone machine, possibly with backups available but which 
would take some time to bring systems back up, and if it's a time-is-money 
environment, then...

Which again is far different than individual users, some of which 
unfortunately may not even have backups.

If "production ready" is taken to be the first group, with fail-overs 
available, etc, it means something entirely different than it does in the 
second and third cases, and I'd argue that while btrfs is ready for the 
first and can in some cases be ready for the second and the third if they 
have backups, it's definitely *not* "production ready" for the segment of 
the third that don't even have backups.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 12:39         ` Waxhead
  2016-09-11 13:02           ` Hugo Mills
  2016-09-11 14:54           ` Martin Steigerwald
@ 2016-09-11 17:46           ` Marc MERLIN
  2016-09-20 16:33             ` Chris Murphy
  2 siblings, 1 reply; 93+ messages in thread
From: Marc MERLIN @ 2016-09-11 17:46 UTC (permalink / raw)
  To: Waxhead; +Cc: Martin Steigerwald, linux-btrfs

On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
> That is exactly the same reason I don't edit the wiki myself. I could of
> course get it started and hopefully someone will correct what I write, but I
> feel that if I start this off I don't have deep enough knowledge to do a
> proper start. Perhaps I will change my mind about this.

My first edits to the wiki were made when I had barely started with btrfs
myself, simply to write down answers to questions I had asked on the list
and that were not present on the wiki yet.

You don't have to be 100% right about everything: if something is wrong,
it'll likely bother someone and they'll go edit your changes, which is
more motivating and less work for them than writing it all from scratch
themselves.
You can also add a small disclaimer "to the best of my knowledge",
etc...

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 13:02           ` Hugo Mills
  2016-09-11 14:59             ` Martin Steigerwald
@ 2016-09-11 20:14             ` Chris Murphy
  2016-09-12 12:20             ` Austin S. Hemmelgarn
  2 siblings, 0 replies; 93+ messages in thread
From: Chris Murphy @ 2016-09-11 20:14 UTC (permalink / raw)
  To: Hugo Mills, Waxhead, Martin Steigerwald, Btrfs BTRFS

On Sun, Sep 11, 2016 at 7:02 AM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
>> Martin Steigerwald wrote:
>> >Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
>> >>>>Thing is: This just seems to be when has a feature been implemented
>> >>>>matrix.
>> >>>>Not when it is considered to be stable. I think this could be done with
>> >>>>colors or so. Like red for not supported, yellow for implemented and
>> >>>>green for production ready.
>> >>>Exactly, just like the Nouveau matrix. It clearly shows what you can
>> >>>expect from it.
>> >I mentioned this matrix as a good *starting* point. And I think it would be
>> >easy to extent it:
>> >
>> >Just add another column called "Production ready". Then research / ask about
>> >production stability of each feature. The only challenge is: Who is
>> >authoritative on that? I´d certainly ask the developer of a feature, but I´d
>> >also consider user reports to some extent.
>> >
>> >Maybe thats the real challenge.
>> >
>> >If you wish, I´d go through each feature there and give my own estimation. But
>> >I think there are others who are deeper into this.
>> That is exactly the same reason I don't edit the wiki myself. I
>> could of course get it started and hopefully someone will correct
>> what I write, but I feel that if I start this off I don't have deep
>> enough knowledge to do a proper start. Perhaps I will change my mind
>> about this.
>
>    Given that nobody else has done it yet, what are the odds that
> someone else will step up to do it now? I would say that you should at
> least try. Yes, you don't have as much knowledge as some others, but
> if you keep working at it, you'll gain that knowledge. Yes, you'll
> probably get it wrong to start with, but you probably won't get it
> *very* wrong. You'll probably get it horribly wrong at some point, but
> even the more knowledgable people you're deferring to didn't identify
> the problems with parity RAID until Zygo and Austin and Chris (and
> others) put in the work to pin down the exact issues.
>
>    So I'd strongly encourage you to set up and maintain the stability
> matrix yourself -- you have the motivation at least, and the knowledge
> will come with time and effort. Just keep reading the mailing list and
> IRC and bugzilla, and try to identify where you see lots of repeated
> problems, and where bugfixes in those areas happen.
>
>    So, go for it. You have a lot to offer the community.

I agree.

Mistakes happen, but there should be explanations. My suggestion is to
keep it really concise. It will be easier to maintain, less prone to
interpretation (better to get it flat out wrong, which gives it a
better chance of being caught and fixed, than to be vague), and
more readable.

I recently received a beating over on the opensuse-factory list because
I said they should revert enabling quotas by default. And the reason for
that was that hanging out here and connecting the only dots that are
really available upstream suggested they're ready for testing, not
production usage. But suse/opensuse have a different opinion, and yet that
opinion hadn't been expressed on this list until recently. On the one
hand I'm a little annoyed that the developer-to-user communication is
lacking significantly enough that this miscommunication can happen; on
the other hand I realize they're up to their eyeballs doing the things
they do best, which is fixing bugs and adding features. I don't know
that anyone has a perfect idea of what is "stable" until it passes the
test of time.

So you can expect things to change. Something might become clearly
less stable than we thought; more likely it'll become more stable than
it is now, but in slow, not so obvious ways.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 14:54           ` Martin Steigerwald
  2016-09-11 15:19             ` Martin Steigerwald
@ 2016-09-11 20:21             ` Chris Murphy
  1 sibling, 0 replies; 93+ messages in thread
From: Chris Murphy @ 2016-09-11 20:21 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: Waxhead, Btrfs BTRFS

On Sun, Sep 11, 2016 at 8:54 AM, Martin Steigerwald <martin@lichtvoll.de> wrote:
> Am Sonntag, 11. September 2016, 14:39:14 CEST schrieb Waxhead:
>> Martin Steigerwald wrote:
>> > Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
>> >>>>> The Nouveau graphics driver have a nice feature matrix on it's webpage
>> >>>>> and I think that BTRFS perhaps should consider doing something like
>> >>>>> that
>> >>>>> on it's official wiki as well
>> >>>>
>> >>>> BTRFS also has a feature matrix. The links to it are in the "News"
>> >>>> section
>> >>>> however:
>> >>>>
>> >>>> https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature
> […]
>> > I mentioned this matrix as a good *starting* point. And I think it would
>> > be
>> > easy to extent it:
>> >
>> > Just add another column called "Production ready". Then research / ask
>> > about production stability of each feature. The only challenge is: Who is
>> > authoritative on that? I´d certainly ask the developer of a feature, but
>> > I´d also consider user reports to some extent.
>> >
>> > Maybe thats the real challenge.
>> >
>> > If you wish, I´d go through each feature there and give my own estimation.
>> > But I think there are others who are deeper into this.
>>
>> That is exactly the same reason I don't edit the wiki myself. I could of
>> course get it started and hopefully someone will correct what I write,
>> but I feel that if I start this off I don't have deep enough knowledge
>> to do a proper start. Perhaps I will change my mind about this.
>
> Well one thing would be to start with the column and start filling the more
> easy stuff. And if its not known since what kernel version, but its known to
> be stable I suggest to conservatively just put the first kernel version into
> it where people think it is stable or in doubt even put 4.7 into it. It can
> still be reduced to lower kernel versions.
>
> Well: I made a tiny start. I linked "Features by kernel version" more
> prominently on the main page, so it is easier to find and also added the
> following warning just above the table:
>
> "WARNING: The "Version" row states at which version a feature has been merged
> into the mainline kernel. It does not tell anything about at which kernel
> version it is considered mature enough for production use."
>
> Now I wonder: Would adding a "Production ready" column, stating the first
> known to be stable kernel version make sense in this table? What do you think?
> I can add the column and give some first rough, conservative estimations on a
> few features.
>
> What do you think? Is this a good place?

Yes. Again I'd emphasize keeping it simple, even at some risk of
oversimplification. There can be the "bird's eye view" matrix, with
some footnotes to further qualify things. And the rest of the wiki for
details that are often repeated on the list but shouldn't need to be
repeated on the list. And then the list for conversations/evaluations.
It is really a lot easier to make the wiki too verbose, making good
info harder to find. Wherever possible just take a stand, have an
opinion, defer explanations to links to posts. That way people who
want to read volumes of source material can, those who just need to
get on with business can too, rather than having to filter a volume of
stale material.

"Production ready" comes with some assumptions, like stable hardware. In
what ways is Btrfs less tolerant of device problems than other
configurations? In what ways is Btrfs more tolerant? That might be a
good thread down the road.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 13:02           ` Hugo Mills
  2016-09-11 14:59             ` Martin Steigerwald
  2016-09-11 20:14             ` Chris Murphy
@ 2016-09-12 12:20             ` Austin S. Hemmelgarn
  2016-09-12 12:59               ` Michel Bouissou
                                 ` (2 more replies)
  2 siblings, 3 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-12 12:20 UTC (permalink / raw)
  To: Hugo Mills, Waxhead, Martin Steigerwald, linux-btrfs

On 2016-09-11 09:02, Hugo Mills wrote:
> On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
>> Martin Steigerwald wrote:
>>> Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
>>>>>> Thing is: This just seems to be when has a feature been implemented
>>>>>> matrix.
>>>>>> Not when it is considered to be stable. I think this could be done with
>>>>>> colors or so. Like red for not supported, yellow for implemented and
>>>>>> green for production ready.
>>>>> Exactly, just like the Nouveau matrix. It clearly shows what you can
>>>>> expect from it.
>>> I mentioned this matrix as a good *starting* point. And I think it would be
>>> easy to extent it:
>>>
>>> Just add another column called "Production ready". Then research / ask about
>>> production stability of each feature. The only challenge is: Who is
>>> authoritative on that? I´d certainly ask the developer of a feature, but I´d
>>> also consider user reports to some extent.
>>>
>>> Maybe thats the real challenge.
>>>
>>> If you wish, I´d go through each feature there and give my own estimation. But
>>> I think there are others who are deeper into this.
>> That is exactly the same reason I don't edit the wiki myself. I
>> could of course get it started and hopefully someone will correct
>> what I write, but I feel that if I start this off I don't have deep
>> enough knowledge to do a proper start. Perhaps I will change my mind
>> about this.
>
>    Given that nobody else has done it yet, what are the odds that
> someone else will step up to do it now? I would say that you should at
> least try. Yes, you don't have as much knowledge as some others, but
> if you keep working at it, you'll gain that knowledge. Yes, you'll
> probably get it wrong to start with, but you probably won't get it
> *very* wrong. You'll probably get it horribly wrong at some point, but
> even the more knowledgable people you're deferring to didn't identify
> the problems with parity RAID until Zygo and Austin and Chris (and
> others) put in the work to pin down the exact issues.
FWIW, here's a list of what I personally consider stable (as in, I'm 
willing to bet against reduced uptime to use this stuff on production 
systems at work and personal systems at home):
1. Single device mode, including DUP data profiles on single device 
without mixed-bg.
2. Multi-device raid0, raid1, and raid10 profiles with symmetrical 
devices (all devices are the same size).
3. Multi-device single profiles with asymmetrical devices.
4. Small numbers (max double digit) of snapshots, taken at infrequent 
intervals (no more than once an hour).  I use single snapshots regularly 
to get stable images of the filesystem for backups, and I keep hourly 
ones of my home directory for about 48 hours.
5. Subvolumes used to isolate parts of a filesystem from snapshots.  I 
use this regularly to isolate areas of my filesystems from backups.
6. Non-incremental send/receive (no clone source, no parents, no 
deduplication).  I use this regularly for cloning virtual machines (a 
minimal example follows this list).
7. Checksumming and scrubs using any of the profiles I've listed above.
8. Defragmentation, including autodefrag.
9. All of the compat_features, including no-holes and skinny-metadata.
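
As a concrete illustration of items 4 and 6, the minimal example I mean is 
roughly the following (paths are made up, and this is a sketch rather than 
my exact commands):

# read-only snapshot as a stable source for a backup
btrfs subvolume snapshot -r /home /home/.snapshots/home-2016-09-12
# full (non-incremental) send into another mounted btrfs filesystem
btrfs send /home/.snapshots/home-2016-09-12 | btrfs receive /mnt/backup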

Things I consider stable enough that I'm willing to use them on my 
personal systems but not systems at work:
1. In-line data compression with compress=lzo.  I use this on my laptop 
and home server system.  I've never had any issues with it myself, but I 
know that other people have, and it does seem to make other things more 
likely to have issues.
2. Batch deduplication.  I only use this on the back-end filesystems for 
my personal storage cluster, and only because I have multiple copies as 
a result of GlusterFS on top of BTRFS.  I've not had any significant 
issues with it, and I don't remember any reports of data loss resulting 
from it, but it's something that people should not be using if they 
don't understand all the implications.

Things that I don't consider stable but some people do:
1. Quotas and qgroups.  Some people (such as SUSE) consider these to be 
stable.  There are still a couple of known issues with them, however, such 
as returning the wrong errno when a quota is hit (it should return -EDQUOT, 
but instead returns -ENOSPC).
2. RAID5/6.  There are a few people who use this, but it's generally 
agreed to be unstable.  There are still at least 3 known bugs which can 
cause complete loss of a filesystem, and there's also a known issue with 
rebuilds taking insanely long, which puts data at risk as well.
3. Multi device filesystems with asymmetrical devices running raid0, 
raid1, or raid10.  The issue I have here is that it's much easier to hit 
free-space errors than it should be on a reliable system.  It's 
possible to avoid this with careful planning (for example, a 3 disk raid1 
profile with 1 disk exactly twice the size of the other two will work 
fine, albeit with more load on the larger disk).
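
To put concrete numbers on that last example: with, say, a 2 TB disk plus 
two 1 TB disks in raid1, every chunk gets its two copies on two different 
devices, so the usable space is min(4 TB / 2, 4 TB - 2 TB) = 2 TB and all 
three disks fill up together.  Make the big disk 3 TB instead and its last 
1 TB can never get a second copy, so you can hit ENOSPC while the filesystem 
still looks like it has free space.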

There's probably some stuff I've missed, but that should cover most of 
the widely known features.  The problem ends up being that what counts as 
'stable' depends a lot on who you ask.  SUSE obviously 
considers qgroups stable (they're enabled by default in all current SUSE 
distributions), but I wouldn't be willing to use them, and I'd be 
willing to bet most of the developers wouldn't either.

As far as what I consider stable, I've been using just about everything 
I listed above in the first two lists for the past year or so with no 
issues that were due to BTRFS itself (I've had some hardware issues, but 
BTRFS actually saved my data in those cases).  I'm also not a typical 
user though, both in terms of use cases (I use LVM for storing VM images 
and then set ACL's on the device nodes so I can use them as a regular 
user, and I do regular maintenance on all the databases on my systems), 
and relative knowledge of the filesystem (I've fixed BTRFS filesystems 
by hand with a hex editor before, not something I ever want to do again, 
but I know I can do it if I need to), and both of those impact my 
confidence in using some features.
>
>    So I'd strongly encourage you to set up and maintain the stability
> matrix yourself -- you have the motivation at least, and the knowledge
> will come with time and effort. Just keep reading the mailing list and
> IRC and bugzilla, and try to identify where you see lots of repeated
> problems, and where bugfixes in those areas happen.
Exactly this.  Most people don't start working on something for the 
first time with huge amounts of preexisting knowledge about it.  Heaven 
knows I didn't, both when I first started using Linux and when I started 
using BTRFS.  One of the big advantages of open source in this respect 
though is that you can generally find people willing to help you without 
much effort, and there's generally relatively good support.

As far as documentation goes though, we [BTRFS] really do need to get our 
act together.  It really doesn't look good to have most of the best 
documentation be in the distros' wikis instead of ours.  I'm not trying 
to say the distros shouldn't be documenting BTRFS, but when Debian (for 
example) has better documentation of the upstream version of BTRFS than 
the upstream project itself does, that starts to look bad.
>
>    So, go for it. You have a lot to offer the community.
>
>    Hugo.
>
>>> I do think for example that scrubbing and auto raid repair are stable, except
>>> for RAID 5/6. Also device statistics and RAID 0 and 1 I consider to be stable.
>>> I think RAID 10 is also stable, but as I do not run it, I don´t know. For me
>>> also skinny-metadata is stable. For me so far even compress=lzo seems to be
>>> stable, but well for others it may not.
>>>
>>> Since what kernel version? Now, there you go. I have no idea. All I know I
>>> started BTRFS with Kernel 2.6.38 or 2.6.39 on my laptop, but not as RAID 1 at
>>> that time.
>>>
>>> See, the implementation time of a feature is much easier to assess. Maybe
>>> thats part of the reason why there is not stability matrix: Maybe no one
>>> *exactly* knows *for sure*. How could you? So I would even put a footnote on
>>> that "production ready" row explaining "Considered to be stable by developer
>>> and user oppinions".
>>>
>>> Of course additionally it would be good to read about experiences of corporate
>>> usage of BTRFS. I know at least Fujitsu, SUSE, Facebook, Oracle are using it.
>>> But I don´t know in what configurations and with what experiences. One Oracle
>>> developer invests a lot of time to bring BTRFS like features to XFS and RedHat
>>> still favors XFS over BTRFS, even SLES defaults to XFS for /home and other non
>>> /-filesystems. That also tells a story.
>>>
>>> Some ideas you can get from SUSE releasenotes. Even if you do not want to use
>>> it, it tells something and I bet is one of the better sources of information
>>> regarding your question you can get at this time. Cause I believe SUSE
>>> developers invested some time to assess the stability of features. Cause they
>>> would carefully assess what they can support in enterprise environments. There
>>> is also someone from Fujitsu who shared experiences in a talk, I can search
>>> the URL to the slides again.
>> By all means, SUSE's wiki is very valuable. I just said that I
>> *prefer* to have that stuff on the BTRFS wiki and feel that is the
>> right place for it.
>>>
>>> I bet Chris Mason and other BTRFS developers at Facebook have some idea on
>>> what they use within Facebook as well. To what extent they are allowed to talk
>>> about it… I don´t know. My personal impression is that as soon as Chris went
>>> to Facebook he became quite quiet. Maybe just due to being busy. Maybe due to
>>> Facebook being concerned much more about the privacy of itself than of its
>>> users.
>>>
>>> Thanks,
>>
>


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 17:11         ` Duncan
@ 2016-09-12 12:26           ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-12 12:26 UTC (permalink / raw)
  To: linux-btrfs

On 2016-09-11 13:11, Duncan wrote:
> Martin Steigerwald posted on Sun, 11 Sep 2016 14:05:03 +0200 as excerpted:
>
>> Just add another column called "Production ready". Then research / ask
>> about production stability of each feature. The only challenge is: Who
>> is authoritative on that? I´d certainly ask the developer of a feature,
>> but I´d also consider user reports to some extent.
>
> Just a note that I'd *not* call it "production ready".  Btrfs in general
> is considered "stabilizing, not yet fully stable and mature", as I
> normally put it.  Thus, I'd call that column "stabilized to the level of
> btrfs in general", or perhaps just "stabilized", with a warning note with
> the longer form.
>
> Because "production ready" can mean many things to many people.  The term
> seems to come from a big enterprise stack, with enterprise generally both
> somewhat conservative in deployment, and having backups and often hot-
> spare-redundancy available, because lost time is lost money, and lost
> data has serious legal and financial implications.
>
> But by the same token, /because/ they have the resources for fail-over,
> etc, large enterprises can and occasionally do deploy still stabilizing
> technologies, knowing they have fall-backs if needed, that smaller
> businesses and individuals often don't have.
>
> Which is in my mind what's going on here.  Some places may be using it in
> production, but if they're sane, they have backups and even fail-over
> available.  Which is quite a bit different than saying it's "production
> ready" on an only machine, possibly with backups available but which
> would take some time to bring systems back up, and if it's a time is
> money environment, then...
>
> Which again is far different than individual users, some of which
> unfortunately may not even have backups.
>
> If "production ready" is taken to be the first group, with fail-overs
> available, etc, it means something entirely different than it does in the
> second and third cases, and I'd argue that while btrfs is ready for the
> first and can in some cases be ready for the second and the third if they
> have backups, it's definitely *not* "production ready" for the segment of
> the third that don't even have backups.
>
And definitely not for the segment of the second and third who believe 
that RAID is a backup.

It brings to mind one of my friends who I was explaining my home 
server's storage stack to.  When I explained that I could lose three of 
the four primary disks and one of the SSD's and the system would still 
keep running and be pretty much fully usable, his first response was 
'Oh, so you have three backups of the primary disk and one of the 
SSD?'.  We had a long discussion after that where I explained that RAID 
was not a backup, it was for keeping things working when a disk failed 
so you didn't have to restore from a backup.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 10:23 ` Martin Steigerwald
  2016-09-11 11:21   ` Zoiled
@ 2016-09-12 12:48   ` Swâmi Petaramesh
  1 sibling, 0 replies; 93+ messages in thread
From: Swâmi Petaramesh @ 2016-09-12 12:48 UTC (permalink / raw)
  To: Btrfs BTRFS

On Sunday, 11 September 2016 12:23:23, you wrote:
> First off: On my systems BTRFS definately runs too stable for a research 
> project. Actually: I have zero issues with stability of BTRFS on *any* of
> my  systems at the moment and in the last half year.

I have been using BTRFS for 3+ years on 10+ machines.

I have never lost any serious data.

My usual mount options are « noatime, autodefrag, compress=lzo » on mechanical 
HDs and « ssd, noatime, compress=lzo » on SSDs.

I usually use automated snapshotting, using SuSE’s excellent « snapper » tool, 
even though the distros I use are not SuSE, but Arch, Fedora, Mint and Ubuntu.

I use « chattr +C » and no snapshots on database dirs, VM dirs and the like.
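
Concretely, that translates into something like this (UUIDs and paths are 
placeholders, not my real ones):

# /etc/fstab
# mechanical HD:
UUID=…  /data  btrfs  noatime,autodefrag,compress=lzo  0  0
# SSD:
UUID=…  /      btrfs  ssd,noatime,compress=lzo  0  0

# disable CoW on a freshly created (still empty) database or VM image
# directory; files created in it afterwards inherit the attribute
chattr +C /srv/vm-images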

In such conditions, BTRFS performs pretty well on all of my SSD machines. For 
years it has stayed OK - and the SSD wear reported by SMART seems normal.

On all my machines with mechanical HDs, the system progressively slows down 
over months, to the point that it becomes hardly usable (e.g. 10 minutes to 
boot…).

It is worth noting that even destroying all snapshots and performing manual 
defrags + rebalance doesn’t help. Once a given FS has become deathly slow, it 
remains deathly slow.

I’ve also experienced corruption of a dozen files after a power failure, which « 
btrfs check » could not fix. These files have remained broken since (with no 
further effect).

I also use a BTRFS RAID 1 on top of 2 bcache devices, which works perfectly — and 
the caching SSD protects the system from the slowdown effect…

My 2 cents.

-- 
Swâmi Petaramesh <swami@petaramesh.org> http://petaramesh.org PGP 9076E32E


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 12:20             ` Austin S. Hemmelgarn
@ 2016-09-12 12:59               ` Michel Bouissou
  2016-09-12 13:14                 ` Austin S. Hemmelgarn
  2016-09-12 14:04                 ` Lionel Bouton
  2016-09-15  1:05               ` Nicholas D Steeves
  2016-09-15  5:55               ` Kai Krakow
  2 siblings, 2 replies; 93+ messages in thread
From: Michel Bouissou @ 2016-09-12 12:59 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Hugo Mills, Waxhead, Martin Steigerwald, linux-btrfs

On Monday, 12 September 2016, 08:20:20, Austin S. Hemmelgarn wrote:
> FWIW, here's a list of what I personally consider stable (as in, I'm 
> willing to bet against reduced uptime to use this stuff on production 
> systems at work and personal systems at home):
> 1. Single device mode, including DUP data profiles on single device 
> without mixed-bg.
> 2. Multi-device … raid1, … with symmetrical 
> devices (all devices are the same size).
> 4. Small numbers (max double digit) of snapshots, taken at infrequent 
> intervals (no more than once an hour).  I use single snapshots regularly 
> to get stable images of the filesystem for backups, and I keep hourly 
> ones of my home directory for about 48 hours.
> 5. Subvolumes used to isolate parts of a filesystem from snapshots.  I 
> use this regularly to isolate areas of my filesystems from backups.
> 6. Non-incremental send/receive (no clone source, no parent's, no 
> deduplication).  I use this regularly for cloning virtual machines.
> 7. Checksumming and scrubs using any of the profiles I've listed above.
> 8. Defragmentation, including autodefrag.
> 9. All of the compat_features, including no-holes and skinny-metadata.

I would also agree that all this is perfectly stable in my own experience. (I 
removed above what I didn’t personally use, or didn’t use long enough to 
vouch for it).

> Things I consider stable enough that I'm willing to use them on my 
> personal systems but not systems at work:
> 1. In-line data compression with compress=lzo.  I use this on my laptop 
> and home server system.  I've never had any issues with it myself, but I 
> know that other people have, and it does seem to make other things more 
> likely to have issues.

I never had problems with lzo compression, although I suspect that it (in 
conjunction with snapshots) adds much fragmentation, which may relate to the 
extremely bad performance I get over time with mechanical HDs.

> 2. Batch deduplication.

Every time I tried to use any of the available dedup tools, either it 
immediately failed miserably, or it failed after eating all of my machine’s 
RAM. It didn’t eat my data, though.

My 2 cents…

-- 
Michel Bouissou <michel.bouissou@umontpellier.fr>
Ingénieur Systèmes
Université de Montpellier - DSIN


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 12:59               ` Michel Bouissou
@ 2016-09-12 13:14                 ` Austin S. Hemmelgarn
  2016-09-12 14:04                 ` Lionel Bouton
  1 sibling, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-12 13:14 UTC (permalink / raw)
  To: Michel Bouissou; +Cc: Hugo Mills, Waxhead, Martin Steigerwald, linux-btrfs

On 2016-09-12 08:59, Michel Bouissou wrote:
> Le lundi 12 septembre 2016, 08:20:20 Austin S. Hemmelgarn a écrit :
>> FWIW, here's a list of what I personally consider stable (as in, I'm
>> willing to bet against reduced uptime to use this stuff on production
>> systems at work and personal systems at home):
>> 1. Single device mode, including DUP data profiles on single device
>> without mixed-bg.
>> 2. Multi-device … raid1, … with symmetrical
>> devices (all devices are the same size).
>> 4. Small numbers (max double digit) of snapshots, taken at infrequent
>> intervals (no more than once an hour).  I use single snapshots regularly
>> to get stable images of the filesystem for backups, and I keep hourly
>> ones of my home directory for about 48 hours.
>> 5. Subvolumes used to isolate parts of a filesystem from snapshots.  I
>> use this regularly to isolate areas of my filesystems from backups.
>> 6. Non-incremental send/receive (no clone source, no parent's, no
>> deduplication).  I use this regularly for cloning virtual machines.
>> 7. Checksumming and scrubs using any of the profiles I've listed above.
>> 8. Defragmentation, including autodefrag.
>> 9. All of the compat_features, including no-holes and skinny-metadata.
>
> I would also agree that all this is perfectly stable in my own experience. (I
> removed above what I didn’t personnally use, or didn’t use long enough to
> vouch for it).
FWIW, the multi-device single and raid0 profiles that I listed are not 
something I personally use, but I do include testing of them in my 
regular testing of BTRFS, and it's never hit anything unique to that 
configuration, so I'd be willing to use them if I had a use case for it. 
  The raid10 profile I have used in the past, but these days I usually 
run raid1 on top of LVM raid0 volumes, which gives essentially the 
same net result, but with better performance and slightly better 
reliability (I can actually fix a double device loss in this 
configuration half the time, albeit with a lot of manual work).
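
For reference, that layout is roughly the following (VG, LV and device 
names are made up, not my actual setup):

# two striped LVs, each across a different pair of disks
lvcreate -i 2 -L 1T -n btr0 vg0 /dev/sda1 /dev/sdb1
lvcreate -i 2 -L 1T -n btr1 vg0 /dev/sdc1 /dev/sdd1
# btrfs raid1 for data and metadata across the two striped LVs
mkfs.btrfs -d raid1 -m raid1 /dev/vg0/btr0 /dev/vg0/btr1
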
>
>> Things I consider stable enough that I'm willing to use them on my
>> personal systems but not systems at work:
>> 1. In-line data compression with compress=lzo.  I use this on my laptop
>> and home server system.  I've never had any issues with it myself, but I
>> know that other people have, and it does seem to make other things more
>> likely to have issues.
>
> I never had problems with lzo compression, although I suspect that it (in
> conjuction with snapshots) adds much fragmentation that may relate to the
> extremely bad performance I get over time with mechanical HDs.
FWIW, the issues with compression in general (everyone talks about lzo 
compression, but almost nobody uses zlib even in testing, so we really 
have no indication that it doesn't have the same issue) seem to show up 
only when there are lots of read errors on compressed data.  I usually 
replace my storage devices before they get to that point, so it makes 
sense that I've never seen any issues.
>
>> 2. Batch deduplication.
>
> Every time I tried to use any of the available dedup tools, either it
> immediately failed miserably, or it failed after eating all of my machine’s
> RAM. It didn’t eat my data, although.
Like I said, I only use it on one set of filesystems, and in this case, 
I've been using it through a script which finds files with duplicate 
data.  Instead of running duperemove over the whole filesystem, I use 
this script to call it directly on a list of files which have duplicate 
data, which speeds up the scanning phase in duperemove and really cuts 
down on the RAM usage, albeit at the cost of taking longer in the 
initial phase to find duplicates.  Some day I may post the script, but 
at the moment it looks horrible, isn't all that efficient itself, and 
has a couple of bugs that cause it to not reliably catch all the 
duplicated data, so I'm not too keen to share it in its current 
condition.  The actual process itself isn't all that hard though, you 
can parse the output of diff to get the info, or you can do a manual 
block comparison like duperemove and use files to store the results to 
save on RAM usage.
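
For anyone who wants to experiment with the same idea, here's a crude 
sketch of the approach (emphatically not my script; whole-file hashing 
makes the scan slow, the size cutoff is arbitrary, and xargs may split a 
duplicate group across duperemove invocations):

# list candidate files, group them by content hash, and hand only the
# files that actually have duplicates to duperemove
find /data -type f -size +1M -print0 \
  | xargs -0 md5sum \
  | sort \
  | uniq -w32 -D \
  | cut -c35- \
  | xargs -r -d '\n' duperemove -dh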

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11  8:55 Is stability a joke? Waxhead
  2016-09-11  9:56 ` Steven Haigh
  2016-09-11 10:23 ` Martin Steigerwald
@ 2016-09-12 13:53 ` Chris Mason
  2016-09-12 17:36   ` Zoiled
  2016-09-12 14:27 ` David Sterba
  3 siblings, 1 reply; 93+ messages in thread
From: Chris Mason @ 2016-09-12 13:53 UTC (permalink / raw)
  To: Waxhead, linux-btrfs



On 09/11/2016 04:55 AM, Waxhead wrote:
> I have been following BTRFS for years and have recently been starting to
> use BTRFS more and more and as always BTRFS' stability is a hot topic.
> Some says that BTRFS is a dead end research project while others claim
> the opposite.
>
> Taking a quick glance at the wiki does not say much about what is safe
> to use or not and it also points to some who are using BTRFS in production.
> While BTRFS can apparently work well in production it does have some
> caveats, and finding out what features is safe or not can be problematic
> and I especially think that new users of BTRFS can easily be bitten if
> they do not do a lot of research on it first.
>
> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
> of warnings and recommendations and is for me a bit better than the
> official BTRFS wiki when it comes to how to decide what features to use.
>
> The Nouveau graphics driver have a nice feature matrix on it's webpage
> and I think that BTRFS perhaps should consider doing something like that
> on it's official wiki as well
>
> For example something along the lines of .... (the statuses are taken
> our of thin air just for demonstration purposes)
>

The out-of-thin-air part is a little confusing; I'm not sure if you're 
basing this on reports you've read?

I'm in favor of flagging device replace with raid5/6 as not supported yet. 
That seems to be where most of the problems are coming in.

The compression framework shouldn't allow one algorithm to work well while 
the other is unusable.

There were problems with autodefrag related to snapshot-aware defrag, 
so Josef disabled the snapshot-aware part.

In general, we put btrfs through heavy use at Facebook.  The crcs have 
found serious hardware problems that the other filesystems missed.

We've also uncovered performance problems and some serious bugs, both 
in btrfs and the other filesystems.  With the other filesystems the 
fixes were usually upstream (doubly true for the most serious problems), 
and with btrfs we usually had to make the fixes ourselves.

-chris

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 12:59               ` Michel Bouissou
  2016-09-12 13:14                 ` Austin S. Hemmelgarn
@ 2016-09-12 14:04                 ` Lionel Bouton
  1 sibling, 0 replies; 93+ messages in thread
From: Lionel Bouton @ 2016-09-12 14:04 UTC (permalink / raw)
  To: Michel Bouissou, Austin S. Hemmelgarn
  Cc: Hugo Mills, Waxhead, Martin Steigerwald, linux-btrfs

Hi,

On 12/09/2016 14:59, Michel Bouissou wrote:
>  [...]
> I never had problems with lzo compression, although I suspect that it (in 
> conjuction with snapshots) adds much fragmentation that may relate to the 
> extremely bad performance I get over time with mechanical HDs.

I had about 30 btrfs filesystems on 2TB drives for a Ceph cluster with
compress=lzo and a background process which detected files recently
written to and defragmented/recompressed them using zlib when they
reached an arbitrary fragmentation level (so the fs was a mix of lzo,
zlib and "normal" extents).

With our usage pattern, our Ceph cluster is faster with plain compress=zlib
than with the lzo-then-zlib mechanism (which tried to make writes
faster but was in fact counterproductive), so we made the switch to
compress=zlib this winter.

On these compress=lzo filesystems, at least 12 were often (up to
several times a week) corrupted by defective hardware controllers. I
never had any crash related to BTRFS under these conditions (at the time
with late 3.19 and 4.1.5 + Gentoo patches kernels). Is there a bug
open somewhere listing the kernel versions affected and the kind of usage
that could reproduce any lzo-specific problem (or any problem made worse
by lzo)?

Lionel

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11  8:55 Is stability a joke? Waxhead
                   ` (2 preceding siblings ...)
  2016-09-12 13:53 ` Chris Mason
@ 2016-09-12 14:27 ` David Sterba
  2016-09-12 14:54   ` Austin S. Hemmelgarn
  2016-09-12 16:27   ` Is stability a joke? (wiki updated) David Sterba
  3 siblings, 2 replies; 93+ messages in thread
From: David Sterba @ 2016-09-12 14:27 UTC (permalink / raw)
  To: Waxhead; +Cc: linux-btrfs

Hi,

first, thanks for choosing a catchy subject, this always helps. While it
will serve as another beating stick to those who enjoy bashing btrfs,
I'm glad to see people answer in a constructive way.

On Sun, Sep 11, 2016 at 10:55:21AM +0200, Waxhead wrote:
> I have been following BTRFS for years and have recently been starting to 
> use BTRFS more and more and as always BTRFS' stability is a hot topic.
> Some says that BTRFS is a dead end research project while others claim 
> the opposite.

I take the 'research' part, as it's still possible to implement features
that were not expected years ago, but it's not research in the sense of
"will the shadowing and clones approach to b-trees work?", i.e. the original
paper by Ohad Rodeh from 2007.  That we can still add various metadata and
structures to the same underlying format proves that the design is sound
and flexible, building on the same primitives, only extending the
logical structure.

But I'm sure you'll find people who still claim that btrfs is broken by
design, because they heard somebody say that [1].

> Taking a quick glance at the wiki does not say much about what is safe 
> to use or not and it also points to some who are using BTRFS in production.
> While BTRFS can apparently work well in production it does have some 
> caveats, and finding out what features is safe or not can be problematic 
> and I especially think that new users of BTRFS can easily be bitten if 
> they do not do a lot of research on it first.

That's a valid point, the wiki lacks that, and the userspace tools do not
warn about or prevent the use of features deemed unsafe. In the enterprise
SLES kernel we can afford to draw a line where the support from our side
ends, regardless of the upstream status of the features. Doing that in
the upstream kernel is a bit different: the release and update schedules
are not the same, code is not selectively backported, etc.

> The Debian wiki for BTRFS (which is recent by the way) contains a bunch 
> of warnings and recommendations and is for me a bit better than the 
> official BTRFS wiki when it comes to how to decide what features to use.

The 'wiki problem' is real: for too long people have gone to distro wikis for
generic information, so even if our k.org wiki is up to date, it's not
the primary source anyway. Changing that back is a long-term goal.

> The Nouveau graphics driver have a nice feature matrix on it's webpage 
> and I think that BTRFS perhaps should consider doing something like that 
> on it's official wiki as well
> 
> For example something along the lines of .... (the statuses are taken 
> our of thin air just for demonstration purposes)
> 
> Kernel version 4.7
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Feature / Redundancy level | Single | Dup | Raid0 | Raid1 | Raid10 | 
> Raid5 | Raid 6 |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Subvolumes                 | Ok     | Ok  | Ok    | Ok    | Ok   | Bad 
>    | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Snapshots                  | Ok     | Ok  | Ok    | Ok    | Ok     | 
> Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | LZO Compression            | Bad(1) | Bad | Bad   | Bad(2)| Bad    | 
> Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | ZLIB Compression           | Ok     | Ok  | Ok    | Ok    | Ok     | 
> Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> | Autodefrag                 | Ok     | Bad | Bad(3)| Ok    | Ok     | 
> Bad   | Bad    |
> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
> 
> (1) Some explanation here...
> (2) Some explanation there....
> (3) And some explanation elsewhere...
> 
> ...etc...etc...
> 
> I therefore would like to propose that some sort of feature / stability 
> matrix for the latest kernel is added to the wiki preferably somewhere 
> where it is easy to find. It would be nice to archive old matrix'es as 
> well in case someone runs on a bit older kernel (we who use Debian tend 
> to like older kernels). In my opinion it would make things bit easier 
> and perhaps a bit less scary too. Remember if you get bitten badly once 
> you tend to stay away from from it all just in case, if you on the other 
> hand know what bites you can safely pet the fluffy end instead :)

Somebody has put that table on the wiki, so it's a good starting point.
I'm not sure we can fit everything into one table: some combinations do
not bring new information, and we'd need an n-dimensional matrix to get the
whole picture.

d.


[1] https://btrfs.wiki.kernel.org/index.php/FAQ#..._btrfs_is_broken_by_design_.28aka._Edward_Shishkin.27s_.22Unbound.28.3F.29_internal_fragmentation_in_Btrfs.22.29

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 14:27 ` David Sterba
@ 2016-09-12 14:54   ` Austin S. Hemmelgarn
  2016-09-12 16:51     ` David Sterba
  2016-09-12 16:27   ` Is stability a joke? (wiki updated) David Sterba
  1 sibling, 1 reply; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-12 14:54 UTC (permalink / raw)
  To: dsterba, Waxhead, linux-btrfs

On 2016-09-12 10:27, David Sterba wrote:
> Hi,
>
> first, thanks for choosing a catchy subject, this always helps. While it
> will serve as another beating stick to those who enjoy bashing btrfs,
> I'm glad to see people answer in a constructive way.
>
> On Sun, Sep 11, 2016 at 10:55:21AM +0200, Waxhead wrote:
>> I have been following BTRFS for years and have recently been starting to
>> use BTRFS more and more and as always BTRFS' stability is a hot topic.
>> Some says that BTRFS is a dead end research project while others claim
>> the opposite.
>
> I take the 'research' part, as it's still possible to implement features
> that were not expected years ago, but not a research in the sense "will
> shadowing and clones approach to b-trees work?", ie. the original paper
> by Ohad Rodeh from 2007.  That we can still add various metadata and
> structures to the same underlying format proves that the design is sound
> and flexible, building on the same primitives, only extending the
> logical structure.
>
> But I'm sure you'll find people who still claim that btrfs is broken by
> design, because they heared somebody say that [1].
There have been a lot of things that hurt BTRFS's reputation.  I don't 
see this one as quite as much of an issue as the fiasco that was the 
original merge with mainline, and the fact that a lot of distros added 
'support' before it was at all ready.  IMHO, if a distro is going to 
provide a filesystem as their default, then their default configuration 
needs to work with no user intervention for more than 90% of their 
users, and BTRFS has not been and still is not there, at least with its 
default options.
>
>> Taking a quick glance at the wiki does not say much about what is safe
>> to use or not and it also points to some who are using BTRFS in production.
>> While BTRFS can apparently work well in production it does have some
>> caveats, and finding out what features is safe or not can be problematic
>> and I especially think that new users of BTRFS can easily be bitten if
>> they do not do a lot of research on it first.
>
> That's a valid point, the wiki lacks that, the usrespace tools do not
> warn or prevent before using features deemed unsafe. In the enterprise
> SLES kernel we can afford to draw a line where the support from our side
> ends, regardless of the upstream status of the features. Doing that in
> the upstream kernel is a bit different, the release and update schedules
> are not the same, code is not selectively backported etc.
The other problem though is that what is considered unsafe varies by use 
case and a bunch of other factors.  As an example, SUSE obviously 
considers qgroups safe, but I and a number of other list regulars do 
not.  Similarly, LZO compression is something I would consider safe 
under specific circumstances (namely, if you have reliable hardware and 
can expect to be able to replace a failing component before you get 
storms of read or write failures), but not in others.
>
>> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
>> of warnings and recommendations and is for me a bit better than the
>> official BTRFS wiki when it comes to how to decide what features to use.
>
> The 'wiki problem' is real, for too long people went to distro wikis for
> generic information, so even if our k.org wiki is up to date, it's not
> the primary source anyway. Changing that back is a long-term goal.
>
>> The Nouveau graphics driver have a nice feature matrix on it's webpage
>> and I think that BTRFS perhaps should consider doing something like that
>> on it's official wiki as well
>>
>> For example something along the lines of .... (the statuses are taken
>> our of thin air just for demonstration purposes)
>>
>> Kernel version 4.7
>> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
>> | Feature / Redundancy level | Single | Dup | Raid0 | Raid1 | Raid10 |
>> Raid5 | Raid 6 |
>> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
>> | Subvolumes                 | Ok     | Ok  | Ok    | Ok    | Ok   | Bad
>>    | Bad    |
>> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
>> | Snapshots                  | Ok     | Ok  | Ok    | Ok    | Ok     |
>> Bad   | Bad    |
>> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
>> | LZO Compression            | Bad(1) | Bad | Bad   | Bad(2)| Bad    |
>> Bad   | Bad    |
>> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
>> | ZLIB Compression           | Ok     | Ok  | Ok    | Ok    | Ok     |
>> Bad   | Bad    |
>> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
>> | Autodefrag                 | Ok     | Bad | Bad(3)| Ok    | Ok     |
>> Bad   | Bad    |
>> +----------------------------+--------+-----+-------+-------+--------+-------+--------+
>>
>> (1) Some explanation here...
>> (2) Some explanation there....
>> (3) And some explanation elsewhere...
>>
>> ...etc...etc...
>>
>> I therefore would like to propose that some sort of feature / stability
>> matrix for the latest kernel is added to the wiki preferably somewhere
>> where it is easy to find. It would be nice to archive old matrix'es as
>> well in case someone runs on a bit older kernel (we who use Debian tend
>> to like older kernels). In my opinion it would make things bit easier
>> and perhaps a bit less scary too. Remember if you get bitten badly once
>> you tend to stay away from from it all just in case, if you on the other
>> hand know what bites you can safely pet the fluffy end instead :)
>
> Somebody has put that table on the wiki, so it's a good starting point.
> I'm not sure we can fit everything into one table, some combinations do
> not bring new information and we'd need n-dimensional matrix to get the
> whole picture.
Agreed, especially because some things are only bad in specific 
circumstances (for example, snapshots generally work fine on almost 
anything until you get into the range of more than about 250, at which 
point they start causing issues).
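
For anyone wondering how close they are to that ballpark, a quick way to
count the snapshots on a mounted filesystem is something like this (a
trivial sketch; /mnt is a placeholder mount point):

   # -s restricts the listing to snapshots, as opposed to all subvolumes
   btrfs subvolume list -s /mnt | wc -l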
>
> d.
>
>
> [1] https://btrfs.wiki.kernel.org/index.php/FAQ#..._btrfs_is_broken_by_design_.28aka._Edward_Shishkin.27s_.22Unbound.28.3F.29_internal_fragmentation_in_Btrfs.22.29

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 14:27 ` David Sterba
  2016-09-12 14:54   ` Austin S. Hemmelgarn
@ 2016-09-12 16:27   ` David Sterba
  2016-09-12 16:56     ` Austin S. Hemmelgarn
  2016-09-12 19:57     ` Martin Steigerwald
  1 sibling, 2 replies; 93+ messages in thread
From: David Sterba @ 2016-09-12 16:27 UTC (permalink / raw)
  To: dsterba, Waxhead, linux-btrfs

On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > I therefore would like to propose that some sort of feature / stability 
> > matrix for the latest kernel is added to the wiki preferably somewhere 
> > where it is easy to find. It would be nice to archive old matrix'es as 
> > well in case someone runs on a bit older kernel (we who use Debian tend 
> > to like older kernels). In my opinion it would make things bit easier 
> > and perhaps a bit less scary too. Remember if you get bitten badly once 
> > you tend to stay away from from it all just in case, if you on the other 
> > hand know what bites you can safely pet the fluffy end instead :)
> 
> Somebody has put that table on the wiki, so it's a good starting point.
> I'm not sure we can fit everything into one table, some combinations do
> not bring new information and we'd need n-dimensional matrix to get the
> whole picture.

https://btrfs.wiki.kernel.org/index.php/Status

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 14:54   ` Austin S. Hemmelgarn
@ 2016-09-12 16:51     ` David Sterba
  2016-09-12 17:31       ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 93+ messages in thread
From: David Sterba @ 2016-09-12 16:51 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: dsterba, Waxhead, linux-btrfs

On Mon, Sep 12, 2016 at 10:54:40AM -0400, Austin S. Hemmelgarn wrote:
> > Somebody has put that table on the wiki, so it's a good starting point.
> > I'm not sure we can fit everything into one table, some combinations do
> > not bring new information and we'd need n-dimensional matrix to get the
> > whole picture.
> Agreed, especially because some things are only bad in specific 
> circumstances (For example, snapshots generally work fine on almost 
> anything, until you get into the range of more than about 250, then they 
> start causing issues).

The performance aspect could be hard to estimate. Each feature has some
cost; we can document the expected hit, but the actual runtime
performance of various combinations is unpredictable. I'd rather let the
tools do what the user asks for, as we might not even be able to detect
that there are bad external factors. I think that 250 snapshots would
perform better on an ssd than on a rotational disk. In the end this
leads to the "dos & don'ts".

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 16:27   ` Is stability a joke? (wiki updated) David Sterba
@ 2016-09-12 16:56     ` Austin S. Hemmelgarn
  2016-09-12 17:29       ` Filipe Manana
                         ` (2 more replies)
  2016-09-12 19:57     ` Martin Steigerwald
  1 sibling, 3 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-12 16:56 UTC (permalink / raw)
  To: dsterba, Waxhead, linux-btrfs

On 2016-09-12 12:27, David Sterba wrote:
> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>>> I therefore would like to propose that some sort of feature / stability
>>> matrix for the latest kernel is added to the wiki preferably somewhere
>>> where it is easy to find. It would be nice to archive old matrix'es as
>>> well in case someone runs on a bit older kernel (we who use Debian tend
>>> to like older kernels). In my opinion it would make things bit easier
>>> and perhaps a bit less scary too. Remember if you get bitten badly once
>>> you tend to stay away from from it all just in case, if you on the other
>>> hand know what bites you can safely pet the fluffy end instead :)
>>
>> Somebody has put that table on the wiki, so it's a good starting point.
>> I'm not sure we can fit everything into one table, some combinations do
>> not bring new information and we'd need n-dimensional matrix to get the
>> whole picture.
>
> https://btrfs.wiki.kernel.org/index.php/Status

Some things to potentially add based on my own experience:

Things listed as TBD status:
1. Seeding: Seems to work fine the couple of times I've tested it, 
however I've only done very light testing, and the whole feature is 
pretty much undocumented.
2. Device Replace: Works perfectly as long as the filesystem itself is 
not corrupted, all the component devices are working, and the FS isn't 
using any raid56 profiles.  Works fine if only the device being replaced 
is failing.  I've not done much testing WRT replacement when multiple 
devices are suspect, but what I've done seems to suggest that it might 
be possible to make it work, but it doesn't currently.  On raid56 it 
sometimes works fine, sometimes corrupts data, and sometimes takes an 
insanely long time to complete (putting data at risk from subsequent 
failures while the replace is running).
3. Balance: Works perfectly as long as the filesystem is not corrupted 
and nothing throws any read or write errors.  IOW, only run this on a 
generally healthy filesystem.  Similar caveats to those for replace with 
raid56 apply here too.  (A minimal command sketch for replace and 
balance follows after this list.)
4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if 
the FS is healthy.
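
As a point of reference, on a healthy filesystem points 2 and 3 boil
down to invocations roughly like these (device names and mount point are
placeholders):

   # replace a device in place; with -r, reads from the old device are
   # avoided where another good copy exists
   btrfs replace start -r /dev/sdb /dev/sdd /mnt
   btrfs replace status /mnt
   # rewrite/redistribute chunks; the usage filters restrict the balance
   # to chunks that are at most 75% full
   btrfs balance start -dusage=75 -musage=75 /mnt
   btrfs balance status /mnt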

Other stuff:
1. Compression: The specific known issue is that compressed extents 
don't always get recovered properly when dealing with lots of failed 
reads.  This can be demonstrated by generating a large raid1 filesystem 
with huge numbers of small (1MB) readily compressible files, then 
putting that on top of a dm-flakey or dm-error target set to give a high 
read-error rate, then mounting and running cat `find .` > /dev/null from 
the top level of the FS multiple times in a row (a rough repro sketch 
follows below).
2. Send: The particular edge case appears to be caused by metadata 
corruption on the sender and results in send choking on the same file 
every time you try to run it.  The quick fix is to copy the contents of 
the file to another file and rename that over the original.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 16:56     ` Austin S. Hemmelgarn
@ 2016-09-12 17:29       ` Filipe Manana
  2016-09-12 17:42         ` Austin S. Hemmelgarn
  2016-09-12 20:08       ` Chris Murphy
  2016-09-19  3:47       ` Zygo Blaxell
  2 siblings, 1 reply; 93+ messages in thread
From: Filipe Manana @ 2016-09-12 17:29 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: dsterba, Waxhead, linux-btrfs

On Mon, Sep 12, 2016 at 5:56 PM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2016-09-12 12:27, David Sterba wrote:
>>
>> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>>>>
>>>> I therefore would like to propose that some sort of feature / stability
>>>> matrix for the latest kernel is added to the wiki preferably somewhere
>>>> where it is easy to find. It would be nice to archive old matrix'es as
>>>> well in case someone runs on a bit older kernel (we who use Debian tend
>>>> to like older kernels). In my opinion it would make things bit easier
>>>> and perhaps a bit less scary too. Remember if you get bitten badly once
>>>> you tend to stay away from from it all just in case, if you on the other
>>>> hand know what bites you can safely pet the fluffy end instead :)
>>>
>>>
>>> Somebody has put that table on the wiki, so it's a good starting point.
>>> I'm not sure we can fit everything into one table, some combinations do
>>> not bring new information and we'd need n-dimensional matrix to get the
>>> whole picture.
>>
>>
>> https://btrfs.wiki.kernel.org/index.php/Status
>
>
> Some things to potentially add based on my own experience:
>
> Things listed as TBD status:
> 1. Seeding: Seems to work fine the couple of times I've tested it, however
> I've only done very light testing, and the whole feature is pretty much
> undocumented.
> 2. Device Replace: Works perfectly as long as the filesystem itself is not
> corrupted, all the component devices are working, and the FS isn't using any
> raid56 profiles.  Works fine if only the device being replaced is failing.
> I've not done much testing WRT replacement when multiple devices are
> suspect, but what I've done seems to suggest that it might be possible to
> make it work, but it doesn't currently.  On raid56 it sometimes works fine,
> sometimes corrupts data, and sometimes takes an insanely long time to
> complete (putting data at risk from subsequent failures while the replace is
> running).
> 3. Balance: Works perfectly as long as the filesystem is not corrupted and
> nothing throws any read or write errors.  IOW, only run this on a generally
> healthy filesystem.  Similar caveats to those for replace with raid56 apply
> here too.
> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
> is healthy.

Virtually all other features work fine if the fs is healthy...

>
> Other stuff:
> 1. Compression: The specific known issue is that compressed extents don't
> always get recovered properly on failed reads when dealing with lots of
> failed reads.  This can be demonstrated by generating a large raid1
> filesystem image with huge numbers of small (1MB) readliy compressible
> files, then putting that on top of a dm-flaky or dm-error target set to give
> a high read-error rate, then mounting and running cat `find .` > /dev/null
> from the top level of the FS multiple times in a row.

> 2. Send: The particular edge case appears to be caused by metadata
> corruption on the sender and results in send choking on the same file every
> time you try to run it.  The quick fix is to copy the contents of the file
> to another file and rename that over the original.

I don't remember having seen such case at least for the last 2 or 3
years, all the problems I've seen/solved or seen fixes from others
were all related to bugs in the send algorithm and definitely not any
metadata corruption.
So I wonder what evidence you have about this.

>



-- 
Filipe David Manana,

"People will forget what you said,
 people will forget what you did,
 but people will never forget how you made them feel."

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 16:51     ` David Sterba
@ 2016-09-12 17:31       ` Austin S. Hemmelgarn
  2016-09-15  1:07         ` Nicholas D Steeves
  2016-09-19 15:38         ` Is stability a joke? David Sterba
  0 siblings, 2 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-12 17:31 UTC (permalink / raw)
  To: dsterba, Waxhead, linux-btrfs

On 2016-09-12 12:51, David Sterba wrote:
> On Mon, Sep 12, 2016 at 10:54:40AM -0400, Austin S. Hemmelgarn wrote:
>>> Somebody has put that table on the wiki, so it's a good starting point.
>>> I'm not sure we can fit everything into one table, some combinations do
>>> not bring new information and we'd need n-dimensional matrix to get the
>>> whole picture.
>> Agreed, especially because some things are only bad in specific
>> circumstances (For example, snapshots generally work fine on almost
>> anything, until you get into the range of more than about 250, then they
>> start causing issues).
>
> The performance aspect could be hard to estimate. Each feature has some
> cost, we can document what's expected hit but various combinations and
> actual runtime performance is unpredictable. I'd rather let the tools do
> what the user asks for, as we might not be able to even detect there are
> some bad external factors. I think that 250 snapshots would perform
> better on an ssd than a rotational disk. In the end this leads to the
> "dos & don'ts".
>
In general yes in this case, but performance starts to degrade 
exponentially beyond a certain point.  The difference between (for 
example) 10 and 20 snapshots is not as big as between 1000 and 1010. 
The problem here is that we don't really have a best-current-practices 
(BCP) document that anyone ever reads.  A lot of stuff that may seem 
obvious to us after years of working with BTRFS isn't going to be to a 
newcomer, and it's a lot more likely that some random person will get 
things right if we have a good, central BCP document than if it stays 
as scattered tribal knowledge.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 13:53 ` Chris Mason
@ 2016-09-12 17:36   ` Zoiled
  2016-09-12 17:44     ` Waxhead
  2016-09-15  1:12     ` Nicholas D Steeves
  0 siblings, 2 replies; 93+ messages in thread
From: Zoiled @ 2016-09-12 17:36 UTC (permalink / raw)
  To: Chris Mason, Waxhead, linux-btrfs

Chris Mason wrote:
>
>
> On 09/11/2016 04:55 AM, Waxhead wrote:
>> I have been following BTRFS for years and have recently been starting to
>> use BTRFS more and more and as always BTRFS' stability is a hot topic.
>> Some says that BTRFS is a dead end research project while others claim
>> the opposite.
>>
>> Taking a quick glance at the wiki does not say much about what is safe
>> to use or not and it also points to some who are using BTRFS in 
>> production.
>> While BTRFS can apparently work well in production it does have some
>> caveats, and finding out what features is safe or not can be problematic
>> and I especially think that new users of BTRFS can easily be bitten if
>> they do not do a lot of research on it first.
>>
>> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
>> of warnings and recommendations and is for me a bit better than the
>> official BTRFS wiki when it comes to how to decide what features to use.
>>
>> The Nouveau graphics driver have a nice feature matrix on it's webpage
>> and I think that BTRFS perhaps should consider doing something like that
>> on it's official wiki as well
>>
>> For example something along the lines of .... (the statuses are taken
>> our of thin air just for demonstration purposes)
>>
>
> The out of thin air part is a little confusing, I'm not sure if you're 
> basing this on reports you've read?
>
Well, to be honest, I used "whatever I felt was right" more or less in 
that table, and as I wrote, it was only for demonstration purposes, to 
show how such a table could look.
> I'm in favor flagged device replace with raid5/6 as not supported yet. 
> That seems to be where most of the problems are coming in.
>
> The compression framework shouldn't allow one to work well with the 
> other unusable.
OK, good to know.  However, from the Debian wiki as well as the link to 
the mailing list, only LZO compression is mentioned (as far as I 
remember), and I have no idea myself how much difference there is 
between the LZO and the ZLIB code.
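
For what it's worth, both compressors are selected the same way at mount
time, so at least trying either is cheap (a sketch; the device is a
placeholder):

   mount -o compress=lzo  /dev/sdX /mnt   # LZO: faster, lighter compression
   mount -o compress=zlib /dev/sdX /mnt   # ZLIB: slower, better ratio
   # compress-force=lzo / compress-force=zlib variants also exist; newly
   # written data uses whatever the filesystem is currently mounted with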
>
> There were  problems with autodefrag related to snapshot-aware defrag, 
> so Josef disabled the snapshot aware part.
>
> In general, we put btrfs through heavy use at facebook.  The crcs have 
> found serious hardware problems the other filesystems missed.
>
> We've also uncovered performance problems and a some serious bugs, 
> both in btrfs and the other filesystems.  With the other filesystems 
> the fixes were usually upstream (doubly true for the most serious 
> problems), and with btrfs we usually had to make the fixes ourselves.
>
> -chris
>
I'll just pop this in here since I assume most people will read the 
responses to your comment:

I think I made my point. The wiki lacks some good documentation on 
what's safe to use and what's not. Yesterday I (Svein Engelsgjerd) did 
put a table on the main wiki, and someone has since moved it to a 
status page and also improved the layout a bit. It is a tad more complex 
than my version, but also a lot better for slightly more advanced 
users, and it actually made my view on things a bit clearer as well.

I am glad that I by bringing this up (hopefully) contributed slightly to 
improving the documentation a tiny bit! :)


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 17:29       ` Filipe Manana
@ 2016-09-12 17:42         ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-12 17:42 UTC (permalink / raw)
  To: fdmanana; +Cc: dsterba, Waxhead, linux-btrfs

On 2016-09-12 13:29, Filipe Manana wrote:
> On Mon, Sep 12, 2016 at 5:56 PM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2016-09-12 12:27, David Sterba wrote:
>>>
>>> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>>>>>
>>>>> I therefore would like to propose that some sort of feature / stability
>>>>> matrix for the latest kernel is added to the wiki preferably somewhere
>>>>> where it is easy to find. It would be nice to archive old matrix'es as
>>>>> well in case someone runs on a bit older kernel (we who use Debian tend
>>>>> to like older kernels). In my opinion it would make things bit easier
>>>>> and perhaps a bit less scary too. Remember if you get bitten badly once
>>>>> you tend to stay away from from it all just in case, if you on the other
>>>>> hand know what bites you can safely pet the fluffy end instead :)
>>>>
>>>>
>>>> Somebody has put that table on the wiki, so it's a good starting point.
>>>> I'm not sure we can fit everything into one table, some combinations do
>>>> not bring new information and we'd need n-dimensional matrix to get the
>>>> whole picture.
>>>
>>>
>>> https://btrfs.wiki.kernel.org/index.php/Status
>>
>>
>> Some things to potentially add based on my own experience:
>>
>> Things listed as TBD status:
>> 1. Seeding: Seems to work fine the couple of times I've tested it, however
>> I've only done very light testing, and the whole feature is pretty much
>> undocumented.
>> 2. Device Replace: Works perfectly as long as the filesystem itself is not
>> corrupted, all the component devices are working, and the FS isn't using any
>> raid56 profiles.  Works fine if only the device being replaced is failing.
>> I've not done much testing WRT replacement when multiple devices are
>> suspect, but what I've done seems to suggest that it might be possible to
>> make it work, but it doesn't currently.  On raid56 it sometimes works fine,
>> sometimes corrupts data, and sometimes takes an insanely long time to
>> complete (putting data at risk from subsequent failures while the replace is
>> running).
>> 3. Balance: Works perfectly as long as the filesystem is not corrupted and
>> nothing throws any read or write errors.  IOW, only run this on a generally
>> healthy filesystem.  Similar caveats to those for replace with raid56 apply
>> here too.
>> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
>> is healthy.
>
> Virtually all other features work fine if the fs is healthy...
I would add more, but I don't often have the time to test broken 
filesystems...

TBH though, that's most of the issue I see with BTRFS in general at the 
moment.  RAID5/6 works fine, as long as all the devices keep working and 
you don't try to replace them and don't lose power.  Qgroups appear to 
work fine as long as no other bug shows up (other than the issues with 
accounting and returning ENOSPC instead of EDQUOT).  We do so much 
testing on pristine filesystems, but most of the utilities and less 
widely used features have had near zero testing on filesystems that are 
in bad shape.  If you pay attention, many (possibly most?) of the 
recently reported bugs are from broken (or poorly curated) filesystems, 
not some random kernel bug.  New features are nice, but they generally 
don't improve stability, and for BTRFS to be truly production ready 
outside of constrained environments like FaceBook, it needs to not choke 
on encountering a FS with some small amount of corruption.
>
>>
>> Other stuff:
>> 1. Compression: The specific known issue is that compressed extents don't
>> always get recovered properly on failed reads when dealing with lots of
>> failed reads.  This can be demonstrated by generating a large raid1
>> filesystem image with huge numbers of small (1MB) readliy compressible
>> files, then putting that on top of a dm-flaky or dm-error target set to give
>> a high read-error rate, then mounting and running cat `find .` > /dev/null
>> from the top level of the FS multiple times in a row.
>
>> 2. Send: The particular edge case appears to be caused by metadata
>> corruption on the sender and results in send choking on the same file every
>> time you try to run it.  The quick fix is to copy the contents of the file
>> to another file and rename that over the original.
>
> I don't remember having seen such case at least for the last 2 or 3
> years, all the problems I've seen/solved or seen fixes from others
> were all related to bugs in the send algorithm and definitely not any
> metadata corruption.
> So I wonder what evidence you have about this.
For the compression-related issue, I can still reproduce it, but it 
takes a while.

As for the send issues, I do still see these on rare occasions, but only 
on 2+ year old filesystems, and I think the last time I saw it happen 
was more than 3 months ago.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 17:36   ` Zoiled
@ 2016-09-12 17:44     ` Waxhead
  2016-09-15  1:12     ` Nicholas D Steeves
  1 sibling, 0 replies; 93+ messages in thread
From: Waxhead @ 2016-09-12 17:44 UTC (permalink / raw)
  To: Chris Mason, Waxhead, linux-btrfs

Zoiled wrote:
> Chris Mason wrote:
>>
>>
>> On 09/11/2016 04:55 AM, Waxhead wrote:
>>> I have been following BTRFS for years and have recently been 
>>> starting to
>>> use BTRFS more and more and as always BTRFS' stability is a hot topic.
>>> Some says that BTRFS is a dead end research project while others claim
>>> the opposite.
>>>
>>> Taking a quick glance at the wiki does not say much about what is safe
>>> to use or not and it also points to some who are using BTRFS in 
>>> production.
>>> While BTRFS can apparently work well in production it does have some
>>> caveats, and finding out what features is safe or not can be 
>>> problematic
>>> and I especially think that new users of BTRFS can easily be bitten if
>>> they do not do a lot of research on it first.
>>>
>>> The Debian wiki for BTRFS (which is recent by the way) contains a bunch
>>> of warnings and recommendations and is for me a bit better than the
>>> official BTRFS wiki when it comes to how to decide what features to 
>>> use.
>>>
>>> The Nouveau graphics driver have a nice feature matrix on it's webpage
>>> and I think that BTRFS perhaps should consider doing something like 
>>> that
>>> on it's official wiki as well
>>>
>>> For example something along the lines of .... (the statuses are taken
>>> our of thin air just for demonstration purposes)
>>>
>>
>> The out of thin air part is a little confusing, I'm not sure if 
>> you're basing this on reports you've read?
>>
> Well to be honest I used "whatever I felt was right" more or less in 
> that table and as I wrote it was only for demonstration purposes only 
> to show how such a table could look.
>> I'm in favor flagged device replace with raid5/6 as not supported 
>> yet. That seems to be where most of the problems are coming in.
>>
>> The compression framework shouldn't allow one to work well with the 
>> other unusable.
> Ok good to know , however from the Debian wiki as well as the link to 
> the mailing list only LZO compression are mentioned (as far as I 
> remember) and I have no idea myself how much difference there is 
> between LZO and the ZLIB code,
>>
>> There were  problems with autodefrag related to snapshot-aware 
>> defrag, so Josef disabled the snapshot aware part.
>>
>> In general, we put btrfs through heavy use at facebook.  The crcs 
>> have found serious hardware problems the other filesystems missed.
>>
>> We've also uncovered performance problems and a some serious bugs, 
>> both in btrfs and the other filesystems.  With the other filesystems 
>> the fixes were usually upstream (doubly true for the most serious 
>> problems), and with btrfs we usually had to make the fixes ourselves.
>>
>> -chris
>>
> I'll just pop this in here since I assume most people will read the 
> response from your comment:
>
> I think I made my point. The wiki lacks some good documentation on 
> what's safe to use and what's not. Yesterday I (Svein Engelsgjerd) did 
> put a table on the main wiki and someone have moved that away to a 
> status page and also improved the layout a bit. It is a tad more 
> complex than my version, but also a lot better for the slightly more 
> advanced users and it actually made my view on things a bit clearer as 
> well.
>
> I am glad that I by bringing this up (hopefully) contributed slightly 
> to improving the documentation a tiny bit! :)
>
>
>
Just for the record - sorry for using my "crap mail" - I sometimes 
forget to change to the correct sender. I am therefore Svein Engelsgjerd 
a.k.a. Waxhead a.k.a. "Zoiled" :)
...sorry for the confusion


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 16:27   ` Is stability a joke? (wiki updated) David Sterba
  2016-09-12 16:56     ` Austin S. Hemmelgarn
@ 2016-09-12 19:57     ` Martin Steigerwald
  2016-09-12 20:21       ` Pasi Kärkkäinen
  1 sibling, 1 reply; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-12 19:57 UTC (permalink / raw)
  To: dsterba, Waxhead, linux-btrfs

Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > > I therefore would like to propose that some sort of feature / stability
> > > matrix for the latest kernel is added to the wiki preferably somewhere
> > > where it is easy to find. It would be nice to archive old matrix'es as
> > > well in case someone runs on a bit older kernel (we who use Debian tend
> > > to like older kernels). In my opinion it would make things bit easier
> > > and perhaps a bit less scary too. Remember if you get bitten badly once
> > > you tend to stay away from from it all just in case, if you on the other
> > > hand know what bites you can safely pet the fluffy end instead :)
> > 
> > Somebody has put that table on the wiki, so it's a good starting point.
> > I'm not sure we can fit everything into one table, some combinations do
> > not bring new information and we'd need n-dimensional matrix to get the
> > whole picture.
> 
> https://btrfs.wiki.kernel.org/index.php/Status

Great.

I made two minor adaptions. I added a link to the Status page to my warning 
before the kernel log by feature page. And I also mentioned that at the time 
the page was last updated, the latest kernel version was 4.7. Yes, that's some 
extra work to update the kernel version, but I think it's beneficial to 
explicitly mention the kernel version the page talks about. Everyone who 
updates the page can update the version within a second.

-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 16:56     ` Austin S. Hemmelgarn
  2016-09-12 17:29       ` Filipe Manana
@ 2016-09-12 20:08       ` Chris Murphy
  2016-09-13 11:35         ` Austin S. Hemmelgarn
  2016-09-19  3:47       ` Zygo Blaxell
  2 siblings, 1 reply; 93+ messages in thread
From: Chris Murphy @ 2016-09-12 20:08 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: David Sterba, Waxhead, Btrfs BTRFS

On Mon, Sep 12, 2016 at 10:56 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:

>
> Things listed as TBD status:
> 1. Seeding: Seems to work fine the couple of times I've tested it, however
> I've only done very light testing, and the whole feature is pretty much
> undocumented.

Mostly OK.

Odd behaviors:
- mount seed (ro), add device, remount mountpoint: this just changed
the mounted fs volume UUID
- if two sprouts for a seed exist, ambiguous which is remounted rw,
you'd have to check
- remount should probably be disallowed in this case somehow; require
explicit mount of the sprout

btrfs fi usage crash when multiple device volume contains seed device
https://bugzilla.kernel.org/show_bug.cgi?id=115851
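
For context, the sequence I'm poking at above is roughly the following
(device names are placeholders):

   btrfstune -S 1 /dev/sdX          # mark the fs as a seed (read-only)
   mount -o ro /dev/sdX /mnt        # mount the seed
   btrfs device add /dev/sdY /mnt   # add a writable device -> sprout
   mount -o remount,rw /mnt         # the step with the odd UUID behavior
   # optionally detach the seed from the sprout afterwards:
   btrfs device delete /dev/sdX /mnt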


> 2. Device Replace: Works perfectly as long as the filesystem itself is not
> corrupted, all the component devices are working, and the FS isn't using any
> raid56 profiles.  Works fine if only the device being replaced is failing.
> I've not done much testing WRT replacement when multiple devices are
> suspect, but what I've done seems to suggest that it might be possible to
> make it work, but it doesn't currently.  On raid56 it sometimes works fine,
> sometimes corrupts data, and sometimes takes an insanely long time to
> complete (putting data at risk from subsequent failures while the replace is
> running).
> 3. Balance: Works perfectly as long as the filesystem is not corrupted and
> nothing throws any read or write errors.  IOW, only run this on a generally
> healthy filesystem.  Similar caveats to those for replace with raid56 apply
> here too.
> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
> is healthy.

Concur.


Missing from the matrix:

- default file system for distros recommendation
e.g. between enospc and btrfsck status, I'd say in general this is not
currently recommended by upstream (short of having a Btrfs kernel
developer on staff)

- enospc status
e.g. there's new stuff in 4.8 that probably still needs to shake out,
and Jeff's found some metadata accounting problem resulting in enospc
where there's tons of unallocated space available.
e.g. I have empty block groups, and they are not being deallocated,
they just stick around, and this is with 4.7 and 4.8 kernels; so
whatever was at one time automatically removing totally empty bg's
isn't happening anymore.

- btrfsck status
e.g. btrfs-progs 4.7.2 still warns against using --repair, and also
lists it under dangerous options; while that remains the case, Btrfs
can't be considered stable or recommended by default
e.g. There are still way too many separate repair tools for Btrfs.
Depending on how you count there are at least 4, and more realistically
8 ways, scattered across multiple commands. This excludes btrfs
check's -E, -r, and -s flags, and it ignores how the sequence of steps
affects the success rate. The permutations are just excessive. It's
definitely not easy to know how to fix a Btrfs volume should things go
wrong.
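
To illustrate the spread, off the top of my head the repair-ish paths a
user can end up being pointed at include roughly the following (not a
recommended sequence, just to show how scattered it is; device and mount
point are placeholders):

   btrfs scrub start /mnt                 # online; fixes bad copies where redundancy exists
   mount -o usebackuproot /dev/sdX /mnt   # try a backup tree root ('recovery' on older kernels)
   btrfs rescue super-recover /dev/sdX    # restore the superblock from a backup copy
   btrfs rescue zero-log /dev/sdX         # throw away a corrupted log tree
   btrfs rescue chunk-recover /dev/sdX    # rebuild the chunk tree
   btrfs check /dev/sdX                   # offline check, read-only by default
   btrfs check --repair /dev/sdX          # the 'dangerous' one
   btrfs restore /dev/sdX /some/dir       # scrape files off a filesystem that won't mount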


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 19:57     ` Martin Steigerwald
@ 2016-09-12 20:21       ` Pasi Kärkkäinen
  2016-09-12 20:35         ` Martin Steigerwald
  2016-09-12 20:48         ` Waxhead
  0 siblings, 2 replies; 93+ messages in thread
From: Pasi Kärkkäinen @ 2016-09-12 20:21 UTC (permalink / raw)
  To: Martin Steigerwald; +Cc: dsterba, Waxhead, linux-btrfs

On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
> Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
> > On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > > > I therefore would like to propose that some sort of feature / stability
> > > > matrix for the latest kernel is added to the wiki preferably somewhere
> > > > where it is easy to find. It would be nice to archive old matrix'es as
> > > > well in case someone runs on a bit older kernel (we who use Debian tend
> > > > to like older kernels). In my opinion it would make things bit easier
> > > > and perhaps a bit less scary too. Remember if you get bitten badly once
> > > > you tend to stay away from from it all just in case, if you on the other
> > > > hand know what bites you can safely pet the fluffy end instead :)
> > > 
> > > Somebody has put that table on the wiki, so it's a good starting point.
> > > I'm not sure we can fit everything into one table, some combinations do
> > > not bring new information and we'd need n-dimensional matrix to get the
> > > whole picture.
> > 
> > https://btrfs.wiki.kernel.org/index.php/Status
> 
> Great.
> 
> I made to minor adaption. I added a link to the Status page to my warning in 
> before the kernel log by feature page. And I also mentioned that at the time 
> the page was last updated the latest kernel version was 4.7. Yes, thats some 
> extra work to update the kernel version, but I think its beneficial to 
> explicitely mention the kernel version the page talks about. Everyone who 
> updates the page can update the version within a second.
> 

Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not 4.7, I wonder what the status of feature X is.." 

Should we also add a column for kernel version, so we can add "feature X is known to be OK on Linux 3.18 and later"..  ?
Or add those to "notes" field, where applicable? 


-- Pasi

> -- 
> Martin


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 20:21       ` Pasi Kärkkäinen
@ 2016-09-12 20:35         ` Martin Steigerwald
  2016-09-12 20:44           ` Chris Murphy
  2016-09-12 20:48         ` Waxhead
  1 sibling, 1 reply; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-12 20:35 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: dsterba, Waxhead, linux-btrfs

Am Montag, 12. September 2016, 23:21:09 CEST schrieb Pasi Kärkkäinen:
> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
> > Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
> > > On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
> > > > > I therefore would like to propose that some sort of feature /
> > > > > stability
> > > > > matrix for the latest kernel is added to the wiki preferably
> > > > > somewhere
> > > > > where it is easy to find. It would be nice to archive old matrix'es
> > > > > as
> > > > > well in case someone runs on a bit older kernel (we who use Debian
> > > > > tend
> > > > > to like older kernels). In my opinion it would make things bit
> > > > > easier
> > > > > and perhaps a bit less scary too. Remember if you get bitten badly
> > > > > once
> > > > > you tend to stay away from from it all just in case, if you on the
> > > > > other
> > > > > hand know what bites you can safely pet the fluffy end instead :)
> > > > 
> > > > Somebody has put that table on the wiki, so it's a good starting
> > > > point.
> > > > I'm not sure we can fit everything into one table, some combinations
> > > > do
> > > > not bring new information and we'd need n-dimensional matrix to get
> > > > the
> > > > whole picture.
> > > 
> > > https://btrfs.wiki.kernel.org/index.php/Status
> > 
> > Great.
> > 
> > I made to minor adaption. I added a link to the Status page to my warning
> > in before the kernel log by feature page. And I also mentioned that at
> > the time the page was last updated the latest kernel version was 4.7.
> > Yes, thats some extra work to update the kernel version, but I think its
> > beneficial to explicitely mention the kernel version the page talks
> > about. Everyone who updates the page can update the version within a
> > second.
> 
> Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not
> 4.7, I wonder what the status of feature X is.."
> 
> Should we also add a column for kernel version, so we can add "feature X is
> known to be OK on Linux 3.18 and later"..  ? Or add those to "notes" field,
> where applicable?

That was my initial idea, and it may be better than a generic kernel version 
for all features, even if we fill in 4.7 for any of the features in the table 
that are known to work okay.

For RAID 1 I am willing to say it has been stable since kernel 3.14, as this 
was the kernel I used when I switched /home and / to Dual SSD RAID 1 on this 
ThinkPad T520.


-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 20:35         ` Martin Steigerwald
@ 2016-09-12 20:44           ` Chris Murphy
  2016-09-13 11:28             ` Austin S. Hemmelgarn
  2016-09-14  5:53             ` Marc Haber
  0 siblings, 2 replies; 93+ messages in thread
From: Chris Murphy @ 2016-09-12 20:44 UTC (permalink / raw)
  To: Martin Steigerwald
  Cc: Pasi Kärkkäinen, David Sterba, Waxhead, Btrfs BTRFS

On Mon, Sep 12, 2016 at 2:35 PM, Martin Steigerwald <martin@lichtvoll.de> wrote:
> Am Montag, 12. September 2016, 23:21:09 CEST schrieb Pasi Kärkkäinen:
>> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
>> > Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
>> > > On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>> > > > > I therefore would like to propose that some sort of feature /
>> > > > > stability
>> > > > > matrix for the latest kernel is added to the wiki preferably
>> > > > > somewhere
>> > > > > where it is easy to find. It would be nice to archive old matrix'es
>> > > > > as
>> > > > > well in case someone runs on a bit older kernel (we who use Debian
>> > > > > tend
>> > > > > to like older kernels). In my opinion it would make things bit
>> > > > > easier
>> > > > > and perhaps a bit less scary too. Remember if you get bitten badly
>> > > > > once
>> > > > > you tend to stay away from from it all just in case, if you on the
>> > > > > other
>> > > > > hand know what bites you can safely pet the fluffy end instead :)
>> > > >
>> > > > Somebody has put that table on the wiki, so it's a good starting
>> > > > point.
>> > > > I'm not sure we can fit everything into one table, some combinations
>> > > > do
>> > > > not bring new information and we'd need n-dimensional matrix to get
>> > > > the
>> > > > whole picture.
>> > >
>> > > https://btrfs.wiki.kernel.org/index.php/Status
>> >
>> > Great.
>> >
>> > I made to minor adaption. I added a link to the Status page to my warning
>> > in before the kernel log by feature page. And I also mentioned that at
>> > the time the page was last updated the latest kernel version was 4.7.
>> > Yes, thats some extra work to update the kernel version, but I think its
>> > beneficial to explicitely mention the kernel version the page talks
>> > about. Everyone who updates the page can update the version within a
>> > second.
>>
>> Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not
>> 4.7, I wonder what the status of feature X is.."
>>
>> Should we also add a column for kernel version, so we can add "feature X is
>> known to be OK on Linux 3.18 and later"..  ? Or add those to "notes" field,
>> where applicable?
>
> That was my initial idea, and it may be better than a generic kernel version
> for all features. Even if we fill in 4.7 for any of the features that are
> known to work okay for the table.
>
> For RAID 1 I am willing to say it works stable since kernel 3.14, as this was
> the kernel I used when I switched /home and / to Dual SSD RAID 1 on this
> ThinkPad T520.

Just to cut yourself some slack, you could skip 3.14 because it's EOL
now, and just go from 4.4.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 20:21       ` Pasi Kärkkäinen
  2016-09-12 20:35         ` Martin Steigerwald
@ 2016-09-12 20:48         ` Waxhead
  2016-09-13  8:38           ` Timofey Titovets
  1 sibling, 1 reply; 93+ messages in thread
From: Waxhead @ 2016-09-12 20:48 UTC (permalink / raw)
  To: Pasi Kärkkäinen, Martin Steigerwald; +Cc: dsterba, linux-btrfs

Pasi Kärkkäinen wrote:
> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
>>
>> Great.
>>
>> I made to minor adaption. I added a link to the Status page to my warning in
>> before the kernel log by feature page. And I also mentioned that at the time
>> the page was last updated the latest kernel version was 4.7. Yes, thats some
>> extra work to update the kernel version, but I think its beneficial to
>> explicitely mention the kernel version the page talks about. Everyone who
>> updates the page can update the version within a second.
>>
> Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not 4.7, I wonder what the status of feature X is.."
>
> Should we also add a column for kernel version, so we can add "feature X is known to be OK on Linux 3.18 and later"..  ?
> Or add those to "notes" field, where applicable?
>
>
> -- Pasi
>
I think a separate column would be the best solution. Archiving the 
status page per kernel version (as I suggested) would lead to issues 
too: if something that appears to be just fine in 4.6 is found to be 
horribly broken in, for example, 4.10, the archive would still indicate 
that it WAS ok at the time even if it perhaps was not. Then you have 
regressions - something that worked in 4.4 may not work in 4.9. I still 
think the best idea is to simply label the status as ok / broken since 
4.x, as those who really want to use a broken feature would probably do 
the research to see if it used to work. Besides, if something that used 
to work goes haywire, it should be fixed quickly :)


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 20:48         ` Waxhead
@ 2016-09-13  8:38           ` Timofey Titovets
  2016-09-13 11:26             ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 93+ messages in thread
From: Timofey Titovets @ 2016-09-13  8:38 UTC (permalink / raw)
  To: Waxhead
  Cc: Pasi Kärkkäinen, Martin Steigerwald, dsterba, linux-btrfs

https://btrfs.wiki.kernel.org/index.php/Status
I suggest marking RAID1/10 as 'mostly OK':
on btrfs, RAID1/10 is safe for the data, but not for the applications
that use it, i.e. it does not hide an I/O error from the application
even when the error could be masked.
https://www.spinics.net/lists/linux-btrfs/msg56739.html

/* Retested with upstream 4.7.2 - not fixed */


-- 
Have a nice day,
Timofey.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-13  8:38           ` Timofey Titovets
@ 2016-09-13 11:26             ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-13 11:26 UTC (permalink / raw)
  To: Timofey Titovets, Waxhead
  Cc: Pasi Kärkkäinen, Martin Steigerwald, dsterba, linux-btrfs

On 2016-09-13 04:38, Timofey Titovets wrote:
> https://btrfs.wiki.kernel.org/index.php/Status
> I suggest to mark RAID1/10 as 'mostly ok'
> as on btrfs RAID1/10 is safe to data, but not for application that uses it.
> i.e. it not hide I/O error even if it's can be masked.
> https://www.spinics.net/lists/linux-btrfs/msg56739.html
>
> /* Retest it with upstream 4.7.2 - not fixed */
This doesn't match what my own testing indicates, at least for raid1 
mode.  I run similar tests myself every time a new stable kernel version 
comes out (but only on the most recent stable version) once I get my own 
patches rebased onto it, and I haven't seen issues like this in any of 
the 4.7 kernels, nor do I recall any issues like this in any of the 4.6 
kernels.  In fact, I've actually dealt with systems with failing disks 
using BTRFS raid1 mode, including one at work just yesterday where the 
SATA cable had worked loose from vibrations and was causing significant 
data corruption.  It survived just fine, as have all the other systems 
I've dealt with which had hardware issues while running BTRFS in raid1 
mode.

The indicated behavior would be consistent with issues seen sometimes 
when using compression however, but the OP in the linked message made no 
indication of there being any in-line compression involved.
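
For what it's worth, the shape of the raid1 test I keep re-running is
roughly this (loop devices, sizes and the corruption offset are
placeholders; depending on where the chunks landed you may need to
clobber a different range to actually hit the file's extents):

   truncate -s 2G img0 img1
   losetup /dev/loop0 img0; losetup /dev/loop1 img1
   mkfs.btrfs -f -d raid1 -m raid1 /dev/loop0 /dev/loop1
   mount /dev/loop0 /mnt
   dd if=/dev/urandom of=/mnt/testfile bs=1M count=512
   umount /mnt
   # corrupt part of one mirror, staying clear of the superblock copies
   dd if=/dev/zero of=/dev/loop1 bs=1M seek=256 count=512 conv=notrunc
   mount /dev/loop0 /mnt
   # reads should come back correct from the intact copy (csum errors in
   # dmesg), and scrub should rewrite the damaged copy from the good one
   md5sum /mnt/testfile
   btrfs scrub start -Bd /mnt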


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 20:44           ` Chris Murphy
@ 2016-09-13 11:28             ` Austin S. Hemmelgarn
  2016-09-13 11:39               ` Martin Steigerwald
  2016-09-14  5:53             ` Marc Haber
  1 sibling, 1 reply; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-13 11:28 UTC (permalink / raw)
  To: Chris Murphy, Martin Steigerwald
  Cc: Pasi Kärkkäinen, David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-12 16:44, Chris Murphy wrote:
> On Mon, Sep 12, 2016 at 2:35 PM, Martin Steigerwald <martin@lichtvoll.de> wrote:
>> Am Montag, 12. September 2016, 23:21:09 CEST schrieb Pasi Kärkkäinen:
>>> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
>>>> Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
>>>>> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
>>>>>>> I therefore would like to propose that some sort of feature /
>>>>>>> stability
>>>>>>> matrix for the latest kernel is added to the wiki preferably
>>>>>>> somewhere
>>>>>>> where it is easy to find. It would be nice to archive old matrix'es
>>>>>>> as
>>>>>>> well in case someone runs on a bit older kernel (we who use Debian
>>>>>>> tend
>>>>>>> to like older kernels). In my opinion it would make things bit
>>>>>>> easier
>>>>>>> and perhaps a bit less scary too. Remember if you get bitten badly
>>>>>>> once
>>>>>>> you tend to stay away from from it all just in case, if you on the
>>>>>>> other
>>>>>>> hand know what bites you can safely pet the fluffy end instead :)
>>>>>>
>>>>>> Somebody has put that table on the wiki, so it's a good starting
>>>>>> point.
>>>>>> I'm not sure we can fit everything into one table, some combinations
>>>>>> do
>>>>>> not bring new information and we'd need n-dimensional matrix to get
>>>>>> the
>>>>>> whole picture.
>>>>>
>>>>> https://btrfs.wiki.kernel.org/index.php/Status
>>>>
>>>> Great.
>>>>
>>>> I made to minor adaption. I added a link to the Status page to my warning
>>>> in before the kernel log by feature page. And I also mentioned that at
>>>> the time the page was last updated the latest kernel version was 4.7.
>>>> Yes, thats some extra work to update the kernel version, but I think its
>>>> beneficial to explicitely mention the kernel version the page talks
>>>> about. Everyone who updates the page can update the version within a
>>>> second.
>>>
>>> Hmm.. that will still leave people wondering "but I'm running Linux 4.4, not
>>> 4.7, I wonder what the status of feature X is.."
>>>
>>> Should we also add a column for kernel version, so we can add "feature X is
>>> known to be OK on Linux 3.18 and later"..  ? Or add those to "notes" field,
>>> where applicable?
>>
>> That was my initial idea, and it may be better than a generic kernel version
>> for all features. Even if we fill in 4.7 for any of the features that are
>> known to work okay for the table.
>>
>> For RAID 1 I am willing to say it works stable since kernel 3.14, as this was
>> the kernel I used when I switched /home and / to Dual SSD RAID 1 on this
>> ThinkPad T520.
>
> Just to cut yourself some slack, you could skip 3.14 because it's EOL
> now, and just go from 4.4.
That reminds me, we should probably make it clear that this is for 
_upstream_ mainline kernel versions, not for versions from some 
arbitrary distro, and that people should check the distro's 
documentation for that info.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 20:08       ` Chris Murphy
@ 2016-09-13 11:35         ` Austin S. Hemmelgarn
  2016-09-15 18:01           ` Chris Murphy
  0 siblings, 1 reply; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-13 11:35 UTC (permalink / raw)
  To: Chris Murphy; +Cc: David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-12 16:08, Chris Murphy wrote:
> On Mon, Sep 12, 2016 at 10:56 AM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>
>>
>> Things listed as TBD status:
>> 1. Seeding: Seems to work fine the couple of times I've tested it, however
>> I've only done very light testing, and the whole feature is pretty much
>> undocumented.
>
> Mostly OK.
>
> Odd behaviors:
> - mount seed (ro), add device, remount mountpoint: this just changed
> the mounted fs volume UUID
> - if two sprouts for a seed exist, ambiguous which is remounted rw,
> you'd have to check
> - remount should probably be disallowed in this case somehow; require
> explicit mount of the sprout
>
> btrfs fi usage crash when multiple device volume contains seed device
> https://bugzilla.kernel.org/show_bug.cgi?id=115851
Yeah, like I said, I've only done very light testing.  I kind of lost 
interest in seeding when overlayfs went mainline, as it offers pretty 
much everything I care about that seeding does, and it's filesystem 
agnostic.
>
>
>> 2. Device Replace: Works perfectly as long as the filesystem itself is not
>> corrupted, all the component devices are working, and the FS isn't using any
>> raid56 profiles.  Works fine if only the device being replaced is failing.
>> I've not done much testing WRT replacement when multiple devices are
>> suspect, but what I've done seems to suggest that it might be possible to
>> make it work, but it doesn't currently.  On raid56 it sometimes works fine,
>> sometimes corrupts data, and sometimes takes an insanely long time to
>> complete (putting data at risk from subsequent failures while the replace is
>> running).
>> 3. Balance: Works perfectly as long as the filesystem is not corrupted and
>> nothing throws any read or write errors.  IOW, only run this on a generally
>> healthy filesystem.  Similar caveats to those for replace with raid56 apply
>> here too.
>> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
>> is healthy.
>
> Concur.
>
>
> Missing from the matrix:
>
> - default file system for distros recommendation
> e.g. between enospc and btrfsck status, I'd say in general this is not
> currently recommended by upstream (short of having a Btrfs kernel
> developer on staff)
I'd add the whole UUID issue to that too.
>
> - enospc status
> e.g. there's new stuff in 4.8 that probably still needs to shake out,
> and Jeff's found some metadata accounting problem resulting in enospc
> where there's tons of unallocated space available.
> e.g. I have empty block groups, and they are not being deallocated,
> they just stick around, and this is with 4.7 and 4.8 kernels; so
> whatever was at one time automatically removing totally empty bg's
> isn't happening anymore.
FWIW, that's still working on my systems.
>
> - btrfsck status
> e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists
> it under dangerous options also;  while that's true, Btrfs can't be
> considered stable or recommended by default
> e.g. There's still way too many separate repair tools for Btrfs.
> Depending on how you count there's at least 4, and more realistically
> 8 ways, scattered across multiple commands. This excludes btrfs
> check's -E, -r, and -s flags. And it ignores sequence in the success
> rate. The permutations are just excessive. It's definitely not easy to
> know how to fix a Btrfs volume should things go wrong.
I assume you're counting balance and scrub in that; with check that 
gives 3. What are you considering the 4th?

In the case of just balance, scrub, and check, the differentiation there 
makes more sense IMHO than combining them: check only runs on offline 
filesystems (and as much as we want online fsck, I doubt that will 
happen any time soon), while scrub and balance operate on online 
filesystems and do two semantically different things.
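
To make the distinction concrete, typical invocations look something like
this (the mount point and device are hypothetical):

  # check: offline only, run against the raw device while unmounted
  btrfs check /dev/sdc1
  # scrub: online, re-reads everything and verifies (and where possible
  # repairs) data against checksums
  btrfs scrub start /mnt/data
  # balance: online, rewrites block groups; here only data block groups
  # that are at most 25% used
  btrfs balance start -dusage=25 /mnt/data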


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-13 11:28             ` Austin S. Hemmelgarn
@ 2016-09-13 11:39               ` Martin Steigerwald
  0 siblings, 0 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-13 11:39 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Chris Murphy, Pasi Kärkkäinen, David Sterba, Waxhead,
	Btrfs BTRFS

Am Dienstag, 13. September 2016, 07:28:38 CEST schrieb Austin S. Hemmelgarn:
> On 2016-09-12 16:44, Chris Murphy wrote:
> > On Mon, Sep 12, 2016 at 2:35 PM, Martin Steigerwald <martin@lichtvoll.de> 
wrote:
> >> Am Montag, 12. September 2016, 23:21:09 CEST schrieb Pasi Kärkkäinen:
> >>> On Mon, Sep 12, 2016 at 09:57:17PM +0200, Martin Steigerwald wrote:
> >>>> Am Montag, 12. September 2016, 18:27:47 CEST schrieb David Sterba:
> >>>>> On Mon, Sep 12, 2016 at 04:27:14PM +0200, David Sterba wrote:
[…]
> >>>>> https://btrfs.wiki.kernel.org/index.php/Status
> >>>> 
> >>>> Great.
> >>>> 
> >>>> I made to minor adaption. I added a link to the Status page to my
> >>>> warning
> >>>> in before the kernel log by feature page. And I also mentioned that at
> >>>> the time the page was last updated the latest kernel version was 4.7.
> >>>> Yes, thats some extra work to update the kernel version, but I think
> >>>> its
> >>>> beneficial to explicitely mention the kernel version the page talks
> >>>> about. Everyone who updates the page can update the version within a
> >>>> second.
> >>> 
> >>> Hmm.. that will still leave people wondering "but I'm running Linux 4.4,
> >>> not 4.7, I wonder what the status of feature X is.."
> >>> 
> >>> Should we also add a column for kernel version, so we can add "feature X
> >>> is
> >>> known to be OK on Linux 3.18 and later"..  ? Or add those to "notes"
> >>> field,
> >>> where applicable?
> >> 
> >> That was my initial idea, and it may be better than a generic kernel
> >> version for all features. Even if we fill in 4.7 for any of the features
> >> that are known to work okay for the table.
> >> 
> >> For RAID 1 I am willing to say it works stable since kernel 3.14, as this
> >> was the kernel I used when I switched /home and / to Dual SSD RAID 1 on
> >> this ThinkPad T520.
> > 
> > Just to cut yourself some slack, you could skip 3.14 because it's EOL
> > now, and just go from 4.4.
> 
> That reminds me, we should probably make a point to make it clear that
> this is for the _upstream_ mainline kernel versions, not for versions
> from some arbitrary distro, and that people should check the distro's
> documentation for that info.

I´d do the following:

Really state the first kernel version known to work stably for each feature.

But before the table state this:

1) Instead of just the first kernel version known to work stably for a 
feature, recommend using the latest upstream kernel, or alternatively the 
latest upstream LTS kernel for those users who want to play it a bit safer.

2) For stable distros such as SLES, RHEL, Ubuntu LTS and Debian Stable, 
recommend checking the distro documentation. Note that some distro kernels 
track upstream kernels quite closely, like the Debian backports kernel or 
the Ubuntu kernel backports PPA.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 20:44           ` Chris Murphy
  2016-09-13 11:28             ` Austin S. Hemmelgarn
@ 2016-09-14  5:53             ` Marc Haber
  1 sibling, 0 replies; 93+ messages in thread
From: Marc Haber @ 2016-09-14  5:53 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Martin Steigerwald, Pasi Kärkkäinen, David Sterba,
	Waxhead, Btrfs BTRFS

On Mon, Sep 12, 2016 at 02:44:35PM -0600, Chris Murphy wrote:
> Just to cut yourself some slack, you could skip 3.14 because it's EOL
> now, and just go from 4.4.

Don't the btrfs-tools used to create the filesystem also play a huge
role in this game?
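
For instance, which features mkfs.btrfs enables by default depends on the
progs version; just as an illustration of what to look at:

  mkfs.btrfs --version
  # list the filesystem features this particular mkfs.btrfs knows about
  mkfs.btrfs -O list-all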

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 12:20             ` Austin S. Hemmelgarn
  2016-09-12 12:59               ` Michel Bouissou
@ 2016-09-15  1:05               ` Nicholas D Steeves
  2016-09-15  8:02                 ` Martin Steigerwald
  2016-09-16  7:13                 ` Helmut Eller
  2016-09-15  5:55               ` Kai Krakow
  2 siblings, 2 replies; 93+ messages in thread
From: Nicholas D Steeves @ 2016-09-15  1:05 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Hugo Mills, Waxhead, Martin Steigerwald, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 9683 bytes --]

On Mon, Sep 12, 2016 at 08:20:20AM -0400, Austin S. Hemmelgarn wrote:
> On 2016-09-11 09:02, Hugo Mills wrote:
> >On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
> >>Martin Steigerwald wrote:
> >>>Am Sonntag, 11. September 2016, 13:43:59 CEST schrieb Martin Steigerwald:
> >>>>>>Thing is: This just seems to be when has a feature been implemented
> >>>>>>matrix.
> >>>>>>Not when it is considered to be stable. I think this could be done with
> >>>>>>colors or so. Like red for not supported, yellow for implemented and
> >>>>>>green for production ready.
> >>>>>Exactly, just like the Nouveau matrix. It clearly shows what you can
> >>>>>expect from it.
> >>>I mentioned this matrix as a good *starting* point. And I think it would be
> >>>easy to extent it:
> >>>
> >>>Just add another column called "Production ready". Then research / ask about
> >>>production stability of each feature. The only challenge is: Who is
> >>>authoritative on that? I´d certainly ask the developer of a feature, but I´d
> >>>also consider user reports to some extent.
> >>>
> >>>Maybe thats the real challenge.
> >>>
> >>>If you wish, I´d go through each feature there and give my own estimation. But
> >>>I think there are others who are deeper into this.
> >>That is exactly the same reason I don't edit the wiki myself. I
> >>could of course get it started and hopefully someone will correct
> >>what I write, but I feel that if I start this off I don't have deep
> >>enough knowledge to do a proper start. Perhaps I will change my mind
> >>about this.
> >
> >   Given that nobody else has done it yet, what are the odds that
> >someone else will step up to do it now? I would say that you should at
> >least try. Yes, you don't have as much knowledge as some others, but
> >if you keep working at it, you'll gain that knowledge. Yes, you'll
> >probably get it wrong to start with, but you probably won't get it
> >*very* wrong. You'll probably get it horribly wrong at some point, but
> >even the more knowledgable people you're deferring to didn't identify
> >the problems with parity RAID until Zygo and Austin and Chris (and
> >others) put in the work to pin down the exact issues.
> FWIW, here's a list of what I personally consider stable (as in, I'm willing
> to bet against reduced uptime to use this stuff on production systems at
> work and personal systems at home):
> 1. Single device mode, including DUP data profiles on single device without
> mixed-bg.
> 2. Multi-device raid0, raid1, and raid10 profiles with symmetrical devices
> (all devices are the same size).
> 3. Multi-device single profiles with asymmetrical devices.
> 4. Small numbers (max double digit) of snapshots, taken at infrequent
> intervals (no more than once an hour).  I use single snapshots regularly to
> get stable images of the filesystem for backups, and I keep hourly ones of
> my home directory for about 48 hours.
> 5. Subvolumes used to isolate parts of a filesystem from snapshots.  I use
> this regularly to isolate areas of my filesystems from backups.
> 6. Non-incremental send/receive (no clone source, no parent's, no
> deduplication).  I use this regularly for cloning virtual machines.
> 7. Checksumming and scrubs using any of the profiles I've listed above.
> 8. Defragmentation, including autodefrag.
> 9. All of the compat_features, including no-holes and skinny-metadata.
> 
> Things I consider stable enough that I'm willing to use them on my personal
> systems but not systems at work:
> 1. In-line data compression with compress=lzo.  I use this on my laptop and
> home server system.  I've never had any issues with it myself, but I know
> that other people have, and it does seem to make other things more likely to
> have issues.
> 2. Batch deduplication.  I only use this on the back-end filesystems for my
> personal storage cluster, and only because I have multiple copies as a
> result of GlusterFS on top of BTRFS.  I've not had any significant issues
> with it, and I don't remember any reports of data loss resulting from it,
> but it's something that people should not be using if they don't understand
> all the implications.
> 
> Things that I don't consider stable but some people do:
> 1. Quotas and qgroups.  Some people (such as SUSE) consider these to be
> stable.  There are a couple of known issues with them still however (such as
> returning the wrong errno when a quota is hit (should be returning -EDQUOT,
> instead returns -ENOSPC)).
> 2. RAID5/6.  There are a few people who use this, but it's generally agreed
> to be unstable.  There are still at least 3 known bugs which can cause
> complete loss of a filesystem, and there's also a known issue with rebuilds
> taking insanely long, which puts data at risk as well.
> 3. Multi device filesystems with asymmetrical devices running raid0, raid1,
> or raid10.  The issue I have here is that it's much easier to hit errors
> regarding free space than a reliable system should be.  It's possible to
> avoid with careful planning (for example, a 3 disk raid1 profile with 1 disk
> exactly twice the size of the other two will work fine, albeit with more
> load on the larger disk).
> 
...
> As far as documentation though, we [BTRFS] really do need to get our act
> together.  It really doesn't look good to have most of the best
> documentation be in the distro's wikis instead of ours.  I'm not trying to
> say the distros shouldn't be documenting BTRFS, but the point at which
> Debian (for example) has better documentation of the upstream version of
> BTRFS than the upstream project itself does, that starts to look bad.

I would have loved to have this feature-to-stability list when I
started working on the Debian documentation!  I started it because I
was saddened by the number of horror-story "adventures with btrfs"
articles and posts I had read, combined with the perspective of
certain members within the Debian community that it was a toy fs.

Are my contributions to that wiki of a high enough quality that I
can work on the upstream one?  Do you think the broader btrfs
community is interested in citations and curated links to discussions?

eg: if a company wants to use btrfs, they check the status page, see a
feature they want is still in the yellow zone of stabilisation, and
then follow the links to familiarise themselves with past discussions.
I imagine this would also help individuals or grad students more
quickly familiarise themselves with the available literature before
choosing a specific project.  If regular updates from SUSE, STRATO,
Facebook, and Fujitsu are also publicly available the k.org wiki would
be a wonderful place to syndicate them!

Sincerely,
Nicholas

> >
> >   So, go for it. You have a lot to offer the community.
> >
> >   Hugo.
> >
> >>>I do think for example that scrubbing and auto raid repair are stable, except
> >>>for RAID 5/6. Also device statistics and RAID 0 and 1 I consider to be stable.
> >>>I think RAID 10 is also stable, but as I do not run it, I don´t know. For me
> >>>also skinny-metadata is stable. For me so far even compress=lzo seems to be
> >>>stable, but well for others it may not.
> >>>
> >>>Since what kernel version? Now, there you go. I have no idea. All I know I
> >>>started BTRFS with Kernel 2.6.38 or 2.6.39 on my laptop, but not as RAID 1 at
> >>>that time.
> >>>
> >>>See, the implementation time of a feature is much easier to assess. Maybe
> >>>thats part of the reason why there is not stability matrix: Maybe no one
> >>>*exactly* knows *for sure*. How could you? So I would even put a footnote on
> >>>that "production ready" row explaining "Considered to be stable by developer
> >>>and user oppinions".
> >>>
> >>>Of course additionally it would be good to read about experiences of corporate
> >>>usage of BTRFS. I know at least Fujitsu, SUSE, Facebook, Oracle are using it.
> >>>But I don´t know in what configurations and with what experiences. One Oracle
> >>>developer invests a lot of time to bring BTRFS like features to XFS and RedHat
> >>>still favors XFS over BTRFS, even SLES defaults to XFS for /home and other non
> >>>/-filesystems. That also tells a story.
> >>>
> >>>Some ideas you can get from SUSE releasenotes. Even if you do not want to use
> >>>it, it tells something and I bet is one of the better sources of information
> >>>regarding your question you can get at this time. Cause I believe SUSE
> >>>developers invested some time to assess the stability of features. Cause they
> >>>would carefully assess what they can support in enterprise environments. There
> >>>is also someone from Fujitsu who shared experiences in a talk, I can search
> >>>the URL to the slides again.
> >>By all means, SUSE's wiki is very valuable. I just said that I
> >>*prefer* to have that stuff on the BTRFS wiki and feel that is the
> >>right place for it.
> >>>
> >>>I bet Chris Mason and other BTRFS developers at Facebook have some idea on
> >>>what they use within Facebook as well. To what extent they are allowed to talk
> >>>about it… I don´t know. My personal impression is that as soon as Chris went
> >>>to Facebook he became quite quiet. Maybe just due to being busy. Maybe due to
> >>>Facebook being concerned much more about the privacy of itself than of its
> >>>users.
> >>>
> >>>Thanks,
> >>
> >
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 17:31       ` Austin S. Hemmelgarn
@ 2016-09-15  1:07         ` Nicholas D Steeves
  2016-09-15  1:13           ` Steven Haigh
  2016-09-19 15:38         ` Is stability a joke? David Sterba
  1 sibling, 1 reply; 93+ messages in thread
From: Nicholas D Steeves @ 2016-09-15  1:07 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: dsterba, Waxhead, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 707 bytes --]

On Mon, Sep 12, 2016 at 01:31:42PM -0400, Austin S. Hemmelgarn wrote:
> In general yes in this case, but performance starts to degrade exponentially
> beyond a certain point.  The difference between (for example) 10 and 20
> snapshots is not as much as between 1000 and 1010. The problem here is that
> we don't really have a BCP document that anyone ever reads.  A lot of stuff
> that may seem obvious to us after years of working with BTRFS isn't going to
> be to a newcomer, and it's a lot more likely that some random person will
> get things right if we have a good, central BCP document than if it stays as
> scattered tribal knowledge.

"Scattered tribal knowledge"...exactly!  :-D

Cheers,
Nicholas

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 17:36   ` Zoiled
  2016-09-12 17:44     ` Waxhead
@ 2016-09-15  1:12     ` Nicholas D Steeves
  1 sibling, 0 replies; 93+ messages in thread
From: Nicholas D Steeves @ 2016-09-15  1:12 UTC (permalink / raw)
  To: Zoiled; +Cc: Chris Mason, Waxhead, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 545 bytes --]

On Mon, Sep 12, 2016 at 07:36:57PM +0200, Zoiled wrote:
> Ok good to know , however from the Debian wiki as well as the link to the
> mailing list only LZO compression are mentioned (as far as I remember) and I
> have no idea myself how much difference there is between LZO and the ZLIB
> code,

I tried my best to not make any over-claims, and to always have
supporting citations, which is why only LZO compression is mentioned.
If anyone sees any inaccuracies, please let me know and I'll address
them without hesitation.

Sincerely,
Nicholas

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-15  1:07         ` Nicholas D Steeves
@ 2016-09-15  1:13           ` Steven Haigh
  2016-09-15  2:14             ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
  0 siblings, 1 reply; 93+ messages in thread
From: Steven Haigh @ 2016-09-15  1:13 UTC (permalink / raw)
  To: Nicholas D Steeves; +Cc: Austin S. Hemmelgarn, dsterba, Waxhead, linux-btrfs

On 2016-09-15 11:07, Nicholas D Steeves wrote:
> On Mon, Sep 12, 2016 at 01:31:42PM -0400, Austin S. Hemmelgarn wrote:
>> In general yes in this case, but performance starts to degrade 
>> exponentially
>> beyond a certain point.  The difference between (for example) 10 and 
>> 20
>> snapshots is not as much as between 1000 and 1010. The problem here is 
>> that
>> we don't really have a BCP document that anyone ever reads.  A lot of 
>> stuff
>> that may seem obvious to us after years of working with BTRFS isn't 
>> going to
>> be to a newcomer, and it's a lot more likely that some random person 
>> will
>> get things right if we have a good, central BCP document than if it 
>> stays as
>> scattered tribal knowledge.
> 
> "Scattered tribal knowledge"...exactly!  :-D

+1 also.

I haven't been following this closely due to other commitments - but I'm 
happy to see the progress on the 'stability matrix' added to the wiki 
page.

It may seem trivial to people who live, eat, and breathe BTRFS, but for 
others, it saves stress, headaches and data loss.

I can't emphasise enough how important getting this part right is, at 
least until some future date where *everything* just works.

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix (was: Is stability a joke?)
  2016-09-15  1:13           ` Steven Haigh
@ 2016-09-15  2:14             ` Christoph Anton Mitterer
  2016-09-15  9:49               ` stability matrix Hans van Kranenburg
  2016-09-19 15:27               ` stability matrix (was: Is stability a joke?) David Sterba
  0 siblings, 2 replies; 93+ messages in thread
From: Christoph Anton Mitterer @ 2016-09-15  2:14 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 5269 bytes --]

Hey.

As for the stability matrix...

In general:
- I think another column should be added, which tells when and for
  which kernel version the feature-status of each row was 
  revised/updated the last time and especially by whom.
  If a core dev makes a statement on a particular feature, this
  probably means much more, than if it was made by "just" a list
  regular.
  And yes I know, in the beginning it already says "this is for 4.7"...
  but let's be honest, it's pretty likely when this is bumped to 4.8
  that not each and every point will be thoroughly checked again.
- Optionally even one further column could be added, that lists bugs
  where the specific cases are kept record of (if any).
- Perhaps a 3rd Status like "eats-your-data" which is worse than
  critical, e.g. for things where it's known that there is a high
  chance for still getting data corruption (RAID56?)


Perhaps there should be another section that lists general caveats
and pitfalls including:
- defrag/auto-defrag causes ref-link break-up (which in turn can cause
  extensive extra space to be eaten up)
- nodatacow files are not yet[0] checksummed, which means that any
  errors (especially silent data corruption) will not be noticed AND
  which also means the data itself cannot be repaired even in case of
  RAIDs (only the RAIDs are made consistent again); see the sketch
  after this list
- subvolume UUID attacks discussed in the recent thread
- fs/device UUID collisions
  - the accidental corruption that can happen in case colliding
    fs/device UUIDs appear in a system (and telling the user that
    this is e.g. the case when dd'ing an image or using LVM
    snapshots, probably also when having btrfs on MD RAID1 or RAID10)
  - the attacks that are possible when UUIDs are known to an attacker
- in-band dedupe
  candidate extents are IIRC not bitwise compared by the kernel before
  de-duping, as is done with offline dedupe.
  Even if this is considered safe by the community... I think users
  should be told.
- btrfs check --repair (and others?)
  Telling people that this may often cause more harm than good.
- even mounting a fs ro may cause it to be changed
- DB/VM-image like IO patterns + nodatacow + (!)checksumming
  + (auto)defrag + snapshots
  a)
  People typically may have the impression:
  btrfs = checksummed => all is guaranteed to be "valid" (or at least
  any corruption is noticed)
  However this isn't the case for nodatacow'ed files, which in turn is
  kinda "mandatory" for DB/VM-image like IO patterns, cause otherwise
  these would fragment too heavily (see (b)).
  Despite what some people claim, none of the major DBs or VM-image
  formats do general checksumming on their own, most don't even support
  it, some that do wouldn't do it without app support, and a few "just"
  don't do it per default.
  Thus one should alert people to this situation and that they may not
  get this "correctness" guarantee here.
  b)
  IIRC, it doesn't even help to simply not use nodatacow on such files
  and use auto-defrag instead to counteract the fragmentation, as
  that doesn't perform too well on large files.
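
As an illustration of the nodatacow point above (a sketch only, the
directory path is made up):

  # nodatacow is usually set per directory via the C attribute; it only
  # takes effect for files created in there afterwards
  mkdir -p /srv/vm-images
  chattr +C /srv/vm-images
  # files created here are nodatacow: less fragmentation, but also no
  # data checksums, so silent corruption is neither detected nor
  # repairable from other RAID copies
  lsattr -d /srv/vm-images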




For specific features:
- Autodefrag
  - didn't that also cause reflinks to be broken up? that should be
    mentioned then as well, as it is (more or less) for defrag, and
    people could otherwise assume it's not the case for autodefrag
    (which I did initially)
  - wasn't it said that autodefrag performs badly with files > ~1GB?
    Perhaps that should be mentioned too
- defrag
  "extents get unshared" is IMO not an adequate description for the end
  user... it should perhaps link to the defrag article and there
  explain in detail that any ref-linked files will be broken up, which
  means space usage will increase, and may especially explode in case
  of snapshots (see the sketch after this list)
- all the RAID56 related points
  wasn't there recently a thread that discussed a more serious bug,
  where parity was wrongly re-calculated which in turn caused actual
  data corruption?
  I think if that's still an issue "write hole still exists, parity
  not checksummed" is not enough but one should emphasize that data may
  easily be corrupted.
- RAID*
  No userland tools for monitoring/etc.
- Device replace 
  IIRC, CM told me that this may cause severe troubles on RAID56
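
To illustrate the ref-link point, a sketch of the kind of sequence where
space usage can jump (file names are made up):

  # create a cheap reflink copy: both names share the same extents
  cp --reflink=always big.img big-copy.img
  # defragmenting one of them rewrites its extents and un-shares them,
  # so the two files now take up space twice
  btrfs filesystem defragment big.img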


Also, the current matrix talks about "auto-repair"... what's that? (=>
IMO this should be explained).


Last but not least, perhaps this article may also be the place to
document 3rd party things and how stably they work with btrfs.
For example:
- Which grub version supports booting from it? Which features does it
  [not] support (e.g. which RAIDs, skinny-extents, etc.)?
- Which forensic tools (e.g. things like testdisk) do work with btrfs?
- Which dedupe userland tools are still maintained/working (and are
  they stable?)



Cheers,
Chris.



[0] Yeah I know, a number of list regulars constantly tried to convince
    me that this wasn't possible per se, but a recent discussion I had
    with CM seemed to have revealed (unless I understood it wrong) that
    it wouldn't be generally impossible at all.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5930 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 12:20             ` Austin S. Hemmelgarn
  2016-09-12 12:59               ` Michel Bouissou
  2016-09-15  1:05               ` Nicholas D Steeves
@ 2016-09-15  5:55               ` Kai Krakow
  2016-09-15  8:05                 ` Martin Steigerwald
  2 siblings, 1 reply; 93+ messages in thread
From: Kai Krakow @ 2016-09-15  5:55 UTC (permalink / raw)
  To: linux-btrfs

Am Mon, 12 Sep 2016 08:20:20 -0400
schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>:

> On 2016-09-11 09:02, Hugo Mills wrote:
> > On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:  
> >> Martin Steigerwald wrote:  
>  [...]  
>  [...]  
>  [...]  
>  [...]  
> >> That is exactly the same reason I don't edit the wiki myself. I
> >> could of course get it started and hopefully someone will correct
> >> what I write, but I feel that if I start this off I don't have deep
> >> enough knowledge to do a proper start. Perhaps I will change my
> >> mind about this.  
> >
> >    Given that nobody else has done it yet, what are the odds that
> > someone else will step up to do it now? I would say that you should
> > at least try. Yes, you don't have as much knowledge as some others,
> > but if you keep working at it, you'll gain that knowledge. Yes,
> > you'll probably get it wrong to start with, but you probably won't
> > get it *very* wrong. You'll probably get it horribly wrong at some
> > point, but even the more knowledgable people you're deferring to
> > didn't identify the problems with parity RAID until Zygo and Austin
> > and Chris (and others) put in the work to pin down the exact
> > issues.  
> FWIW, here's a list of what I personally consider stable (as in, I'm 
> willing to bet against reduced uptime to use this stuff on production 
> systems at work and personal systems at home):
> 1. Single device mode, including DUP data profiles on single device 
> without mixed-bg.
> 2. Multi-device raid0, raid1, and raid10 profiles with symmetrical 
> devices (all devices are the same size).
> 3. Multi-device single profiles with asymmetrical devices.
> 4. Small numbers (max double digit) of snapshots, taken at infrequent 
> intervals (no more than once an hour).  I use single snapshots
> regularly to get stable images of the filesystem for backups, and I
> keep hourly ones of my home directory for about 48 hours.
> 5. Subvolumes used to isolate parts of a filesystem from snapshots.
> I use this regularly to isolate areas of my filesystems from backups.
> 6. Non-incremental send/receive (no clone source, no parent's, no 
> deduplication).  I use this regularly for cloning virtual machines.
> 7. Checksumming and scrubs using any of the profiles I've listed
> above. 8. Defragmentation, including autodefrag.
> 9. All of the compat_features, including no-holes and skinny-metadata.
> 
> Things I consider stable enough that I'm willing to use them on my 
> personal systems but not systems at work:
> 1. In-line data compression with compress=lzo.  I use this on my
> laptop and home server system.  I've never had any issues with it
> myself, but I know that other people have, and it does seem to make
> other things more likely to have issues.
> 2. Batch deduplication.  I only use this on the back-end filesystems
> for my personal storage cluster, and only because I have multiple
> copies as a result of GlusterFS on top of BTRFS.  I've not had any
> significant issues with it, and I don't remember any reports of data
> loss resulting from it, but it's something that people should not be
> using if they don't understand all the implications.

I could at least add one "don't do it":

Don't use BFQ patches (it's an IO scheduler) if you're using btrfs.
Some people like to use it especially for running VMs and desktops
because it provides very good interactivity while maintaining very good
throughput. But it completely destroyed my btrfs beyond repair at least
twice, either while actually using a VM (in VirtualBox) or during high
IO loads. I now stick to the deadline scheduler instead which provides
very good interactivity for me, too, and the corruptions didn't occur
again so far.
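
For reference, switching the scheduler per block device is trivial (the
device name is just an example):

  cat /sys/block/sda/queue/scheduler
  # prints something like "noop [deadline] cfq"; the bracketed one is active
  echo deadline > /sys/block/sda/queue/scheduler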

The story with BFQ has always been the same: System suddenly freezes
during moderate to high IO until all processes stop working (no process
shows D state, tho). Only hard reboot possible. After rebooting, access
to some (unrelated) files may fail with "errno=-17 Object already
exists" which cannot be repaired. If it affects files needed during
boot, you are screwed because file system goes RO.

-- 
Regards,
Kai

Replies to list-only preferred.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-15  1:05               ` Nicholas D Steeves
@ 2016-09-15  8:02                 ` Martin Steigerwald
  2016-09-16  7:13                 ` Helmut Eller
  1 sibling, 0 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-15  8:02 UTC (permalink / raw)
  To: Nicholas D Steeves; +Cc: Austin S. Hemmelgarn, Hugo Mills, Waxhead, linux-btrfs

Hello Nicholas.

Am Mittwoch, 14. September 2016, 21:05:52 CEST schrieb Nicholas D Steeves:
> On Mon, Sep 12, 2016 at 08:20:20AM -0400, Austin S. Hemmelgarn wrote:
> > On 2016-09-11 09:02, Hugo Mills wrote:
[…]
> > As far as documentation though, we [BTRFS] really do need to get our act
> > together.  It really doesn't look good to have most of the best
> > documentation be in the distro's wikis instead of ours.  I'm not trying to
> > say the distros shouldn't be documenting BTRFS, but the point at which
> > Debian (for example) has better documentation of the upstream version of
> > BTRFS than the upstream project itself does, that starts to look bad.
> 
> I would have loved to have this feature-to-stability list when I
> started working on the Debian documentation!  I started it because I
> was saddened by number of horror story "adventures with btrfs"
> articles and posts I had read about, combined with the perspective of
> certain members within the Debian community that it was a toy fs.
> 
> Are my contributions to that wiki of a high enough quality that I
> can work on the upstream one?  Do you think the broader btrfs
> community is interested in citations and curated links to discussions?
> 
> eg: if a company wants to use btrfs, they check the status page, see a
> feature they want is still in the yellow zone of stabilisation, and
> then follow the links to familiarise themselves with past discussions.
> I imagine this would also help individuals or grad students more
> quickly familiarise themselves with the available literature before
> choosing a specific project.  If regular updates from SUSE, STRATO,
> Facebook, and Fujitsu are also publicly available the k.org wiki would
> be a wonderful place to syndicate them!

I definitely think the quality of your contributions is high enough, and 
others can also proofread and add their experiences, so… By *all* means, go 
ahead *already*.

It doesn´t all fit inside the table directly, I bet, *but* you can use 
footnotes or further explanations for features that need them, with a 
headline per feature below the table and a link to it from within the table.

Thank you!
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-15  5:55               ` Kai Krakow
@ 2016-09-15  8:05                 ` Martin Steigerwald
  0 siblings, 0 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-15  8:05 UTC (permalink / raw)
  To: Kai Krakow; +Cc: linux-btrfs

Am Donnerstag, 15. September 2016, 07:55:36 CEST schrieb Kai Krakow:
> Am Mon, 12 Sep 2016 08:20:20 -0400
> 
> schrieb "Austin S. Hemmelgarn" <ahferroin7@gmail.com>:
> > On 2016-09-11 09:02, Hugo Mills wrote:
> > > On Sun, Sep 11, 2016 at 02:39:14PM +0200, Waxhead wrote:
> > >> Martin Steigerwald wrote:
> >  [...]
> >  [...]
> >  [...]
> >  [...]
> >  
> > >> That is exactly the same reason I don't edit the wiki myself. I
> > >> could of course get it started and hopefully someone will correct
> > >> what I write, but I feel that if I start this off I don't have deep
> > >> enough knowledge to do a proper start. Perhaps I will change my
> > >> mind about this.
> > >> 
> > >    Given that nobody else has done it yet, what are the odds that
> > > 
> > > someone else will step up to do it now? I would say that you should
> > > at least try. Yes, you don't have as much knowledge as some others,
> > > but if you keep working at it, you'll gain that knowledge. Yes,
> > > you'll probably get it wrong to start with, but you probably won't
> > > get it *very* wrong. You'll probably get it horribly wrong at some
> > > point, but even the more knowledgable people you're deferring to
> > > didn't identify the problems with parity RAID until Zygo and Austin
> > > and Chris (and others) put in the work to pin down the exact
> > > issues.
> > 
> > FWIW, here's a list of what I personally consider stable (as in, I'm
> > willing to bet against reduced uptime to use this stuff on production
> > systems at work and personal systems at home):
> > 1. Single device mode, including DUP data profiles on single device
> > without mixed-bg.
> > 2. Multi-device raid0, raid1, and raid10 profiles with symmetrical
> > devices (all devices are the same size).
> > 3. Multi-device single profiles with asymmetrical devices.
> > 4. Small numbers (max double digit) of snapshots, taken at infrequent
> > intervals (no more than once an hour).  I use single snapshots
> > regularly to get stable images of the filesystem for backups, and I
> > keep hourly ones of my home directory for about 48 hours.
> > 5. Subvolumes used to isolate parts of a filesystem from snapshots.
> > I use this regularly to isolate areas of my filesystems from backups.
> > 6. Non-incremental send/receive (no clone source, no parent's, no
> > deduplication).  I use this regularly for cloning virtual machines.
> > 7. Checksumming and scrubs using any of the profiles I've listed
> > above. 8. Defragmentation, including autodefrag.
> > 9. All of the compat_features, including no-holes and skinny-metadata.
> > 
> > Things I consider stable enough that I'm willing to use them on my
> > personal systems but not systems at work:
> > 1. In-line data compression with compress=lzo.  I use this on my
> > laptop and home server system.  I've never had any issues with it
> > myself, but I know that other people have, and it does seem to make
> > other things more likely to have issues.
> > 2. Batch deduplication.  I only use this on the back-end filesystems
> > for my personal storage cluster, and only because I have multiple
> > copies as a result of GlusterFS on top of BTRFS.  I've not had any
> > significant issues with it, and I don't remember any reports of data
> > loss resulting from it, but it's something that people should not be
> > using if they don't understand all the implications.
> 
> I could at least add one "don't do it":
> 
> Don't use BFQ patches (it's an IO scheduler) if you're using btrfs.
> Some people like to use it especially for running VMs and desktops
> because it provides very good interactivity while maintaining very good
> throughput. But it completely destroyed my btrfs beyond repair at least
> twice, either while actually using a VM (in VirtualBox) or during high
> IO loads. I now stick to the deadline scheduler instead which provides
> very good interactivity for me, too, and the corruptions didn't occur
> again so far.
> 
> The story with BFQ has always been the same: System suddenly freezes
> during moderate to high IO until all processes stop working (no process
> shows D state, tho). Only hard reboot possible. After rebooting, access
> to some (unrelated) files may fail with "errno=-17 Object already
> exists" which cannot be repaired. If it affects files needed during
> boot, you are screwed because file system goes RO.

This could be a further row in the table. And well…

as for CFQ, Jens Axboe is currently working on bandwidth throttling patches 
*exactly* for the reason of providing more interactivity and fairness 
between I/O operations.

Right now, Completely Fair in CFQ is a *huge* exaggeration, at least while you 
have a dd bs=1M thing running.

Thanks,
-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-15  2:14             ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
@ 2016-09-15  9:49               ` Hans van Kranenburg
  2016-09-15 11:54                 ` Austin S. Hemmelgarn
  2016-09-19 15:27               ` stability matrix (was: Is stability a joke?) David Sterba
  1 sibling, 1 reply; 93+ messages in thread
From: Hans van Kranenburg @ 2016-09-15  9:49 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs

On 09/15/2016 04:14 AM, Christoph Anton Mitterer wrote:
> Hey.
> 
> As for the stability matrix...
> 
> In general:
> - I think another column should be added, which tells when and for
>   which kernel version the feature-status of each row was 
>   revised/updated the last time and especially by whom.
>   If a core dev makes a statement on a particular feature, this
>   probably means much more, than if it was made by "just" a list
>   regular.
>   And yes I know, in the beginning it already says "this is for 4.7"...
>   but let's be honest, it's pretty likely when this is bumped to 4.8
>   that not each and every point will be thoroughly checked again.
> - Optionally even one further column could be added, that lists bugs
>   where the specific cases are kept record of (if any).
> - Perhaps a 3rd Status like "eats-your-data" which is worse than
>   critical, e.g. for things where it's known that there is a high
>   chance for still getting data corruption (RAID56?)

About the "for 4.7" issue... The Status page could have an extra column,
which for every OK labeled row lists the first version (kernel.org x.y.0
release) it's OK for.

The bugs make it more complicated.

* Feature A is labeled OK in kernel 5.0
* During development of kernel 8-rc, an eat my data bug is fixed. The OK
for this feature in the table is bumped to 8.0?
* kernel 5 is EOL
* kernel 6 is still supported, and the fix is applied to 6.12
* then there's distros which have their own old kernels, applying fixes
on them whenever they like, for example 5.6-distro4 which is leading its
own life

"Normal" users are using distro kernels. They shouldn't be panicking
about their data if they're running 6.14 or 5.6-distro4, but the OK in
the table is bumped to 8.0 because of the serious bugs.

At least the official kernels should be tracked in the table I think.

Separately, a list of known serious bugs per feature (like the 4 about
compression, http://www.spinics.net/lists/linux-btrfs/msg58674.html )
could be listed on another Bugs! page (lots of work) so a user, or
someone helping the user, can see if the listed commits are or aren't
included in whatever actual kernel the user is running.

This list of serious bugs could also help discussions that now sound like
"yeah, there were issues with compression which some time ago got fixed,
but noone knows what it was and when, so don't use compression".

Many of the commits which fix serious bugs (even if they're only
triggered in an edge case) have some explanation about how to trigger
them, like the excellent commit messages of Filipe in the commits
mentioned above. This helps setting up and maintaining the bug page, and
helps advanced users to decide if they're hitting the edge case or not
with their usage pattern.

I'd like to help create/maintain this bug overview. A good start
would be to just crawl through all stable kernels and some distro
kernels and see which commits show up in fs/btrfs.
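
Roughly what I have in mind as a starting point, in a kernel git checkout
that has the relevant tags (the version ranges are just examples):

  # all btrfs changes that went into one stable point release
  git log --oneline --no-merges v4.4.20..v4.4.21 -- fs/btrfs
  # or everything a mainline cycle brought in
  git log --oneline --no-merges v4.6..v4.7 -- fs/btrfs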

-- 
Hans van Kranenburg

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-15  9:49               ` stability matrix Hans van Kranenburg
@ 2016-09-15 11:54                 ` Austin S. Hemmelgarn
  2016-09-15 14:15                   ` Chris Murphy
                                     ` (2 more replies)
  0 siblings, 3 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-15 11:54 UTC (permalink / raw)
  To: Hans van Kranenburg, Christoph Anton Mitterer, linux-btrfs

On 2016-09-15 05:49, Hans van Kranenburg wrote:
> On 09/15/2016 04:14 AM, Christoph Anton Mitterer wrote:
>> Hey.
>>
>> As for the stability matrix...
>>
>> In general:
>> - I think another column should be added, which tells when and for
>>   which kernel version the feature-status of each row was
>>   revised/updated the last time and especially by whom.
>>   If a core dev makes a statement on a particular feature, this
>>   probably means much more, than if it was made by "just" a list
>>   regular.
>>   And yes I know, in the beginning it already says "this is for 4.7"...
>>   but let's be honest, it's pretty likely when this is bumped to 4.8
>>   that not each and every point will be thoroughly checked again.
>> - Optionally even one further column could be added, that lists bugs
>>   where the specific cases are kept record of (if any).
>> - Perhaps a 3rd Status like "eats-your-data" which is worse than
>>   critical, e.g. for things where it's known that there is a high
>>   chance for still getting data corruption (RAID56?)
>
> About the "for 4.7" issue... The Status page could have an extra column,
> which for every OK labeled row lists the first version (kernel.org x.y.0
> release) it's OK for.
>
> The bugs make it more complicated.
>
> * Feature A is labeled OK in kernel 5.0
> * During development of kernel 8-rc, an eat my data bug is fixed. The OK
> for this feature in the table is bumped to 8.0?
> * kernel 5 is EOL
> * kernel 6 is still supported, and the fix is applied to 6.12
> * then there's distros which have their own old kernels, applying fixes
> on them whenever they like, for example 5.6-distro4 which is leading its
> own life
>
> "Normal" users are using distro kernels. They shouldn't be panicing
> about their data if they're running 6.14 or 5.6-distro4, but the OK in
> the table is bumped to 8.0 because of the serious bugs.
>
> At least the official kernels should be tracked in the table I think.
>
> Separately, a list of known serious bugs per feature (like the 4 about
> compression, http://www.spinics.net/lists/linux-btrfs/msg58674.html )
> could be listed on another Bugs! page (lots of work) so a user, or
> someone helping the user can see if the listed commits are or aren't
> included in the actual whatever kernel a user is using.
>
> This list of serious bugs could also help disussions that now sound like
> "yeah, there were issues with compression which some time ago got fixed,
> but noone knows what it was and when, so don't use compression".
>
> Many of the commits which fix serious bugs (even if they're only
> triggered in an edge case) have some explanation about how to trigger
> them, like the excellent commit messages of Filipe in the commits
> mentioned above. This helps setting up and maintaining the bug page, and
> helps advanced users to decide if they're hitting the edge case or not
> with their usage pattern.
>
> I'd like to help creating/maintaining this bug overview. A good start
> would be to just crawl through all stable kernels and some distro
> kernels and see which commits show up in fs/btrfs.
>
As of right now, we kind of do have such a page:
https://btrfs.wiki.kernel.org/index.php/Gotchas
It's not really well labeled though, and it's easy to overlook.

I specifically do not think we should worry about distro kernels though. 
  If someone is using a specific distro, that distro's documentation 
should cover what they support and what works and what doesn't.  Some 
(like Arch and to a lesser extent Gentoo) use almost upstream kernels, 
so there's very little point in tracking them.  Some (like Ubuntu and 
Debian) use almost upstream LTS kernels, so there's little point 
tracking them either.  Many others though (like CentOS, RHEL, and OEL) 
Use forked kernels that have so many back-ported patches that it's 
impossible to track up-date to up-date what the hell they've got.  A 
rather ridiculous expression regarding herding of cats comes to mind 
with respect to the last group.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-15 11:54                 ` Austin S. Hemmelgarn
@ 2016-09-15 14:15                   ` Chris Murphy
  2016-09-15 14:56                   ` Martin Steigerwald
  2016-09-19 14:38                   ` David Sterba
  2 siblings, 0 replies; 93+ messages in thread
From: Chris Murphy @ 2016-09-15 14:15 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Hans van Kranenburg, Christoph Anton Mitterer, Btrfs BTRFS

On Thu, Sep 15, 2016 at 5:54 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
>
>
> I specifically do not think we should worry about distro kernels though.

It will be essentially impossible to keep such a thing up to date.
It's difficult in the best case scenario to even track upstream's own
backports to longterm kernels, let alone whether those would actually
change anything in the matrix.

I'd say each major version gets its own page, and just dup the page
for each version.

So for starters, the current page is for version 4.7. If when 4.8 is
released there's no significant change in stability that affects the
color (stability status) of any listed feature, then that page could
say 4.7 through current. If it's true that the status page has no
major changes going back to 4.4 through current, label it that way.

As soon as there's a change that affects the color coding of an item
in the grid, duplicate the page. Old page gets a fixed range of
kernels, say 4.4 to 4.7. And now the newest page is 4.8 - current.

I think a column for version will lose the historical perspective of
when something goes from red to yellow, yellow to green.


> If
> someone is using a specific distro, that distro's documentation should cover
> what they support and what works and what doesn't.  Some (like Arch and to a
> lesser extent Gentoo) use almost upstream kernels, so there's very little
> point in tracking them.  Some (like Ubuntu and Debian) use almost upstream
> LTS kernels, so there's little point tracking them either.  Many others
> though (like CentOS, RHEL, and OEL) Use forked kernels that have so many
> back-ported patches that it's impossible to track up-date to up-date what
> the hell they've got.  A rather ridiculous expression regarding herding of
> cats comes to mind with respect to the last group.

Yeah you need the secret decoder ring to sort it out. Forget it, not worth it.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-15 11:54                 ` Austin S. Hemmelgarn
  2016-09-15 14:15                   ` Chris Murphy
@ 2016-09-15 14:56                   ` Martin Steigerwald
  2016-09-19 14:38                   ` David Sterba
  2 siblings, 0 replies; 93+ messages in thread
From: Martin Steigerwald @ 2016-09-15 14:56 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Hans van Kranenburg, Christoph Anton Mitterer, linux-btrfs

Am Donnerstag, 15. September 2016, 07:54:26 CEST schrieb Austin S. Hemmelgarn:
> On 2016-09-15 05:49, Hans van Kranenburg wrote:
> > On 09/15/2016 04:14 AM, Christoph Anton Mitterer wrote:
[…]
> I specifically do not think we should worry about distro kernels though.
>   If someone is using a specific distro, that distro's documentation
> should cover what they support and what works and what doesn't.  Some
> (like Arch and to a lesser extent Gentoo) use almost upstream kernels,
> so there's very little point in tracking them.  Some (like Ubuntu and
> Debian) use almost upstream LTS kernels, so there's little point
> tracking them either.  Many others though (like CentOS, RHEL, and OEL)
> Use forked kernels that have so many back-ported patches that it's
> impossible to track up-date to up-date what the hell they've got.  A
> rather ridiculous expression regarding herding of cats comes to mind
> with respect to the last group.

Yep. I just read through RHEL release notes for a RHEL 7 workshop I will hold 
for a customer… and noted that newer RHEL 7 kernels for example have device 
mapper from kernel 4.1 (while the kernel still says it's a 3.10 one), XFS from 
kernel this.that, including new incompat CRC disk format and the need to also 
upgrade xfsprogs in lockstep, and this and that from kernel this.that and so 
on. Frankenstein as an association comes to my mind, but I bet RHEL kernel 
engineers know what they are doing.

-- 
Martin

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-13 11:35         ` Austin S. Hemmelgarn
@ 2016-09-15 18:01           ` Chris Murphy
  2016-09-15 18:20             ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 93+ messages in thread
From: Chris Murphy @ 2016-09-15 18:01 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Chris Murphy, David Sterba, Waxhead, Btrfs BTRFS

On Tue, Sep 13, 2016 at 5:35 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2016-09-12 16:08, Chris Murphy wrote:
>>
>> - btrfsck status
>> e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists
>> it under dangerous options also;  while that's true, Btrfs can't be
>> considered stable or recommended by default
>> e.g. There's still way too many separate repair tools for Btrfs.
>> Depending on how you count there's at least 4, and more realistically
>> 8 ways, scattered across multiple commands. This excludes btrfs
>> check's -E, -r, and -s flags. And it ignores sequence in the success
>> rate. The permutations are just excessive. It's definitely not easy to
>> know how to fix a Btrfs volume should things go wrong.
>
> I assume you're counting balance and scrub in that, plus check gives 3, what
> are you considering the 4th?

- Self repair at mount time, similar to other fs's with a journal
- fsck, similar to other fs's except the output is really unclear
about what the prognosis is compared to ext4 or xfs
- mount option usebackuproot/recovery
- btrfs rescue zero-log
- btrfs rescue super-recover
- btrfs rescue chunk-recover
- scrub
- balance

check --repair really needed to be fail-safe a long time ago; it's
what everyone has come to expect from fscks, that they don't make
things worse. In particular on Btrfs it seems like its repairs
should be reversible, but the reality is the man page says do not use
it (except under advisement) and calls it dangerous (twice). And a user
got a broken system in the bug that affects 4.7 and 4.7.1, which 4.7.2
apparently can't fix. So... life is hard, file systems are hard. But
it's also hard to see how distros can possibly feel comfortable with
Btrfs by default when the fsck tool is dangerous, even if in theory it
shouldn't often be necessary.
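
Just to illustrate the permutations, the kind of sequence that gets
suggested ad hoc on the list goes roughly like this -- none of this is an
official procedure, and the device/mount point names are made up:

  # 1. try the read-only paths first
  mount -o usebackuproot /dev/sdd1 /mnt   # "-o recovery" on older kernels
  btrfs check /dev/sdd1                   # read-only by default
  # 2. targeted fixes for specific, known problems
  btrfs rescue zero-log /dev/sdd1
  # 3. pull data off before trying anything destructive
  btrfs restore /dev/sdd1 /some/backup/dir
  # 4. only then, and only under advisement, the dangerous one
  btrfs check --repair /dev/sdd1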


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 18:01           ` Chris Murphy
@ 2016-09-15 18:20             ` Austin S. Hemmelgarn
  2016-09-15 19:02               ` Chris Murphy
  2016-09-15 21:23               ` Christoph Anton Mitterer
  0 siblings, 2 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-15 18:20 UTC (permalink / raw)
  To: Chris Murphy; +Cc: David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-15 14:01, Chris Murphy wrote:
> On Tue, Sep 13, 2016 at 5:35 AM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2016-09-12 16:08, Chris Murphy wrote:
>>>
>>> - btrfsck status
>>> e.g. btrfs-progs 4.7.2 still warns against using --repair, and lists
>>> it under dangerous options also;  while that's true, Btrfs can't be
>>> considered stable or recommended by default
>>> e.g. There's still way too many separate repair tools for Btrfs.
>>> Depending on how you count there's at least 4, and more realistically
>>> 8 ways, scattered across multiple commands. This excludes btrfs
>>> check's -E, -r, and -s flags. And it ignores sequence in the success
>>> rate. The permutations are just excessive. It's definitely not easy to
>>> know how to fix a Btrfs volume should things go wrong.
>>
>> I assume you're counting balance and scrub in that, plus check gives 3, what
>> are you considering the 4th?
>
> - Self repair at mount time, similar to other fs's with a journal
> - fsck, similar to other fs's except the output is really unclear
> about what the prognosis is compared to ext4 or xfs
> - mount option usebackuproot/recovery
> - btrfs rescue zero-log
> - btrfs rescue super-recover
> - btrfs rescue chunk-recover
> - scrub
> - balance
>
> check --repair really needed to be fail safe a long time ago, it's
> what everyone's come to expect from fsck's, that they don't make
> things worse; and in particular on Btrfs it seems like its repairs
> should be reversible but the reality is the man page says do not use
> (except under advisement) and that it's dangerous (twice). And a user
> got a broken system in the bug that affects 4.7, 4.7.1, that 4.7.2
> apparently can't fix. So... life is hard, file systems are hard. But
> it's also hard to see how distros can possibly feel comfortable with
> Btrfs by default when the fsck tool is dangerous, even if in theory it
> shouldn't often be necessary.
>
For check specifically, I see four issues:
1. It spits out pretty low-level information about the internals in many 
cases when it returns an error.  xfs_repair does this too, but it's 
needed even less frequently than btrfs check, and it at least uses 
relatively simple jargon by comparison.  I've been using BTRFS for years 
and still can't tell what more than half the error messages check can 
return mean.  In contrast to that, deciphering an error message from 
e2fsck is pretty trivial if you have some basic understanding of VFS 
level filesystem abstractions (stuff like what inodes and dentries are), 
and I never needed to learn low level things about the internals of ext4 
to parse the fsck output (I did anyway, but that's beside the point).

2. We're developing new features without making sure that check can fix 
issues in any associated metadata.  Part of merging a new feature needs 
to be proving that fsck can handle fixing any issues in the metadata for 
that feature short of total data loss or complete corruption.

3. Fsck should be needed only for un-mountable filesystems.  Ideally, we 
should be handling things the way Windows does: perform slightly better 
checking when reading data, and if we see an error, flag the filesystem 
for expensive repair on the next mount.

4. Btrfs check should know itself if it can fix something or not, and 
that should be reported.  I have an otherwise perfectly fine filesystem 
that throws some (apparently harmless) errors in check, and check can't 
repair them.  Despite this, it gives zero indication that it can't 
repair them, zero indication that it didn't repair them, and doesn't 
even seem to give a non-zero exit status for this filesystem.

As far as the other tools:
- Self-repair at mount time: This isn't a repair tool; if the FS mounts, 
it's not broken, it's just a bit messy and the kernel is tidying things up.
- btrfsck/btrfs check: I think I covered the issues here well.
- Mount options: These are mostly just for expensive checks during 
mount, and most people should never need them except in very unusual 
circumstances.
- btrfs rescue *: These are all fixes for very specific issues.  They 
should be folded into check with special aliases, and not be separate 
tools.  The first fixes an issue that's pretty much non-existent in any 
modern kernel, and the other two are for very low-level data recovery of 
horribly broken filesystems.
- scrub: This is a very purpose specific tool which is supposed to be 
part of regular maintenance, and only works to fix things as a side 
effect of what it does.
- balance: This is also a relatively purpose specific tool, and again 
only fixes things as a side effect of what it does.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 18:20             ` Austin S. Hemmelgarn
@ 2016-09-15 19:02               ` Chris Murphy
  2016-09-15 20:16                 ` Hugo Mills
  2016-09-19  4:08                 ` Zygo Blaxell
  2016-09-15 21:23               ` Christoph Anton Mitterer
  1 sibling, 2 replies; 93+ messages in thread
From: Chris Murphy @ 2016-09-15 19:02 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Chris Murphy, David Sterba, Waxhead, Btrfs BTRFS

On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:

> 2. We're developing new features without making sure that check can fix
> issues in any associated metadata.  Part of merging a new feature needs to
> be proving that fsck can handle fixing any issues in the metadata for that
> feature short of total data loss or complete corruption.
>
> 3. Fsck should be needed only for un-mountable filesystems.  Ideally, we
> should be handling things like Windows does.  Preform slightly better
> checking when reading data, and if we see an error, flag the filesystem for
> expensive repair on the next mount.

Right, well I'm vaguely curious why ZFS, as different as it is,
basically takes the position that if the hardware went so batshit that
they can't unwind it on a normal mount, then an fsck probably can't
help either... they still don't have an fsck and don't appear to want
one.

I'm not sure if the btrfsck is really all that helpful to users as much
as it is for developers to better learn about the failure vectors of
the file system.


> 4. Btrfs check should know itself if it can fix something or not, and that
> should be reported.  I have an otherwise perfectly fine filesystem that
> throws some (apparently harmless) errors in check, and check can't repair
> them.  Despite this, it gives zero indication that it can't repair them,
> zero indication that it didn't repair them, and doesn't even seem to give a
> non-zero exit status for this filesystem.

Yeah, it's really not a user tool in my view...



>
> As far as the other tools:
> - Self-repair at mount time: This isn't a repair tool, if the FS mounts,
> it's not broken, it's just a messy and the kernel is tidying things up.
> - btrfsck/btrfs check: I think I covered the issues here well.
> - Mount options: These are mostly just for expensive checks during mount,
> and most people should never need them except in very unusual circumstances.
> - btrfs rescue *: These are all fixes for very specific issues.  They should
> be folded into check with special aliases, and not be separate tools.  The
> first fixes an issue that's pretty much non-existent in any modern kernel,
> and the other two are for very low-level data recovery of horribly broken
> filesystems.
> - scrub: This is a very purpose specific tool which is supposed to be part
> of regular maintainence, and only works to fix things as a side effect of
> what it does.
> - balance: This is also a relatively purpose specific tool, and again only
> fixes things as a side effect of what it does.
>

Yeah I know, it's just that much of this is non-obvious to users
unfamiliar with this file system. And even I'm often just throwing
spaghetti at the wall.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 19:02               ` Chris Murphy
@ 2016-09-15 20:16                 ` Hugo Mills
  2016-09-15 20:26                   ` Chris Murphy
  2016-09-19  4:08                 ` Zygo Blaxell
  1 sibling, 1 reply; 93+ messages in thread
From: Hugo Mills @ 2016-09-15 20:16 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Austin S. Hemmelgarn, David Sterba, Waxhead, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 3615 bytes --]

On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
> On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
> 
> > 2. We're developing new features without making sure that check can fix
> > issues in any associated metadata.  Part of merging a new feature needs to
> > be proving that fsck can handle fixing any issues in the metadata for that
> > feature short of total data loss or complete corruption.
> >
> > 3. Fsck should be needed only for un-mountable filesystems.  Ideally, we
> > should be handling things like Windows does.  Preform slightly better
> > checking when reading data, and if we see an error, flag the filesystem for
> > expensive repair on the next mount.
> 
> Right, well I'm vaguely curious why ZFS, as different as it is,
> basically take the position that if the hardware went so batshit that
> they can't unwind it on a normal mount, then an fsck probably can't
> help either... they still don't have an fsck and don't appear to want
> one.
> 
> I'm not sure if the brfsck is really all that helpful to user as much
> as it is for developers to better learn about the failure vectors of
> the file system.
> 
> 
> > 4. Btrfs check should know itself if it can fix something or not, and that
> > should be reported.  I have an otherwise perfectly fine filesystem that
> > throws some (apparently harmless) errors in check, and check can't repair
> > them.  Despite this, it gives zero indication that it can't repair them,
> > zero indication that it didn't repair them, and doesn't even seem to give a
> > non-zero exit status for this filesystem.
> 
> Yeah, it's really not a user tool in my view...
> 
> 
> 
> >
> > As far as the other tools:
> > - Self-repair at mount time: This isn't a repair tool, if the FS mounts,
> > it's not broken, it's just a messy and the kernel is tidying things up.
> > - btrfsck/btrfs check: I think I covered the issues here well.
> > - Mount options: These are mostly just for expensive checks during mount,
> > and most people should never need them except in very unusual circumstances.
> > - btrfs rescue *: These are all fixes for very specific issues.  They should
> > be folded into check with special aliases, and not be separate tools.  The
> > first fixes an issue that's pretty much non-existent in any modern kernel,
> > and the other two are for very low-level data recovery of horribly broken
> > filesystems.
> > - scrub: This is a very purpose specific tool which is supposed to be part
> > of regular maintainence, and only works to fix things as a side effect of
> > what it does.
> > - balance: This is also a relatively purpose specific tool, and again only
> > fixes things as a side effect of what it does.

   You've forgotten btrfs-zero-log, which seems to have built itself a
reputation on the internet as the tool you run to fix all btrfs ills,
rather than a very finely-targeted tool that was introduced to deal
with approximately one bug somewhere back in the 2.x era (IIRC).

   Hugo.

> 
> Yeah I know, it's just much of this is non-obvious to users unfamiliar
> with this file system. And even I'm often throwing spaghetti on a
> wall.
> 
> 
> -- 
> Chris Murphy
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Hugo Mills             | It's against my programming to impersonate a deity!
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                              C3PO, Return of the Jedi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 20:16                 ` Hugo Mills
@ 2016-09-15 20:26                   ` Chris Murphy
  2016-09-16 12:00                     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 93+ messages in thread
From: Chris Murphy @ 2016-09-15 20:26 UTC (permalink / raw)
  To: Hugo Mills, Chris Murphy, Austin S. Hemmelgarn, David Sterba,
	Waxhead, Btrfs BTRFS

On Thu, Sep 15, 2016 at 2:16 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
>> On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn
>> <ahferroin7@gmail.com> wrote:
>>
>> > 2. We're developing new features without making sure that check can fix
>> > issues in any associated metadata.  Part of merging a new feature needs to
>> > be proving that fsck can handle fixing any issues in the metadata for that
>> > feature short of total data loss or complete corruption.
>> >
>> > 3. Fsck should be needed only for un-mountable filesystems.  Ideally, we
>> > should be handling things like Windows does.  Preform slightly better
>> > checking when reading data, and if we see an error, flag the filesystem for
>> > expensive repair on the next mount.
>>
>> Right, well I'm vaguely curious why ZFS, as different as it is,
>> basically take the position that if the hardware went so batshit that
>> they can't unwind it on a normal mount, then an fsck probably can't
>> help either... they still don't have an fsck and don't appear to want
>> one.
>>
>> I'm not sure if the brfsck is really all that helpful to user as much
>> as it is for developers to better learn about the failure vectors of
>> the file system.
>>
>>
>> > 4. Btrfs check should know itself if it can fix something or not, and that
>> > should be reported.  I have an otherwise perfectly fine filesystem that
>> > throws some (apparently harmless) errors in check, and check can't repair
>> > them.  Despite this, it gives zero indication that it can't repair them,
>> > zero indication that it didn't repair them, and doesn't even seem to give a
>> > non-zero exit status for this filesystem.
>>
>> Yeah, it's really not a user tool in my view...
>>
>>
>>
>> >
>> > As far as the other tools:
>> > - Self-repair at mount time: This isn't a repair tool, if the FS mounts,
>> > it's not broken, it's just a messy and the kernel is tidying things up.
>> > - btrfsck/btrfs check: I think I covered the issues here well.
>> > - Mount options: These are mostly just for expensive checks during mount,
>> > and most people should never need them except in very unusual circumstances.
>> > - btrfs rescue *: These are all fixes for very specific issues.  They should
>> > be folded into check with special aliases, and not be separate tools.  The
>> > first fixes an issue that's pretty much non-existent in any modern kernel,
>> > and the other two are for very low-level data recovery of horribly broken
>> > filesystems.
>> > - scrub: This is a very purpose specific tool which is supposed to be part
>> > of regular maintainence, and only works to fix things as a side effect of
>> > what it does.
>> > - balance: This is also a relatively purpose specific tool, and again only
>> > fixes things as a side effect of what it does.
>
>    You've forgotten btrfs-zero-log, which seems to have built itself a
> reputation on the internet as the tool you run to fix all btrfs ills,
> rather than a very finely-targeted tool that was introduced to deal
> with approximately one bug somewhere back in the 2.x era (IIRC).
>
>    Hugo.

:-) It's in my original list, and it's in Austin's by way of being
lumped into 'btrfs rescue *' along with chunk and super recover. Seems
like super recover should be built into Btrfs check, and would be one
of the first ambiguities to get out of the way but I'm just an ape
that wears pants so what do I know.

Thing is, zero-log has fixed file systems in cases where I never
would have expected it to, and the user had been advised not to use it,
or to use it only as a second-to-last resort. So, pfff... it's like
throwing salt around.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 18:20             ` Austin S. Hemmelgarn
  2016-09-15 19:02               ` Chris Murphy
@ 2016-09-15 21:23               ` Christoph Anton Mitterer
  2016-09-16 12:13                 ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 93+ messages in thread
From: Christoph Anton Mitterer @ 2016-09-15 21:23 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1257 bytes --]

On Thu, 2016-09-15 at 14:20 -0400, Austin S. Hemmelgarn wrote:
> 3. Fsck should be needed only for un-mountable filesystems.  Ideally,
> we 
> should be handling things like Windows does.  Preform slightly
> better 
> checking when reading data, and if we see an error, flag the
> filesystem 
> for expensive repair on the next mount.

That philosophy also has some drawbacks:
- The user doesn't directly notice that anything went wrong. Thus errors
may even continue to accumulate and get much worse, whereas if the fs had
immediately gone ro, the user would have had the chance to manually
intervene (possibly then with help from upstream).

- Any smart auto-magical™ repair may also just fail (and make things
worse, as the current --repair e.g. may). Not performing such auto-
repair gives the user at least the chance to make a bitwise copy of the
whole fs before trying any rescue operations.
This wouldn't be the case if the user never noticed that something
happened and the fs tried to repair things right at mount time.

So I think any such auto-repair should be used with extreme caution and
only in those cases where one is absolutely 100% sure that the action
will help and only do good.



Cheers,
Chris.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5930 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-15  1:05               ` Nicholas D Steeves
  2016-09-15  8:02                 ` Martin Steigerwald
@ 2016-09-16  7:13                 ` Helmut Eller
  1 sibling, 0 replies; 93+ messages in thread
From: Helmut Eller @ 2016-09-16  7:13 UTC (permalink / raw)
  To: Nicholas D Steeves; +Cc: linux-btrfs

On Wed, Sep 14 2016, Nicholas D Steeves wrote:


> Do you think the broader btrfs
> community is interested in citations and curated links to discussions?

I'm definitely interested.  Something I would love to see is a list or
description of the tests that a particular version of btrfs passes or
doesn't pass.  I think that would add a bit of "rationality" to the
issue.  Also interesting would be the results of test-suites that are
used for other filesystems (ext4, xfs).

Helmut

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 20:26                   ` Chris Murphy
@ 2016-09-16 12:00                     ` Austin S. Hemmelgarn
  2016-09-19  2:57                       ` Zygo Blaxell
  0 siblings, 1 reply; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-16 12:00 UTC (permalink / raw)
  To: Chris Murphy, Hugo Mills, David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-15 16:26, Chris Murphy wrote:
> On Thu, Sep 15, 2016 at 2:16 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>> On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
>>> On Thu, Sep 15, 2016 at 12:20 PM, Austin S. Hemmelgarn
>>> <ahferroin7@gmail.com> wrote:
>>>
>>>> 2. We're developing new features without making sure that check can fix
>>>> issues in any associated metadata.  Part of merging a new feature needs to
>>>> be proving that fsck can handle fixing any issues in the metadata for that
>>>> feature short of total data loss or complete corruption.
>>>>
>>>> 3. Fsck should be needed only for un-mountable filesystems.  Ideally, we
>>>> should be handling things like Windows does.  Preform slightly better
>>>> checking when reading data, and if we see an error, flag the filesystem for
>>>> expensive repair on the next mount.
>>>
>>> Right, well I'm vaguely curious why ZFS, as different as it is,
>>> basically take the position that if the hardware went so batshit that
>>> they can't unwind it on a normal mount, then an fsck probably can't
>>> help either... they still don't have an fsck and don't appear to want
>>> one.
>>>
>>> I'm not sure if the brfsck is really all that helpful to user as much
>>> as it is for developers to better learn about the failure vectors of
>>> the file system.
>>>
>>>
>>>> 4. Btrfs check should know itself if it can fix something or not, and that
>>>> should be reported.  I have an otherwise perfectly fine filesystem that
>>>> throws some (apparently harmless) errors in check, and check can't repair
>>>> them.  Despite this, it gives zero indication that it can't repair them,
>>>> zero indication that it didn't repair them, and doesn't even seem to give a
>>>> non-zero exit status for this filesystem.
>>>
>>> Yeah, it's really not a user tool in my view...
>>>
>>>
>>>
>>>>
>>>> As far as the other tools:
>>>> - Self-repair at mount time: This isn't a repair tool, if the FS mounts,
>>>> it's not broken, it's just a messy and the kernel is tidying things up.
>>>> - btrfsck/btrfs check: I think I covered the issues here well.
>>>> - Mount options: These are mostly just for expensive checks during mount,
>>>> and most people should never need them except in very unusual circumstances.
>>>> - btrfs rescue *: These are all fixes for very specific issues.  They should
>>>> be folded into check with special aliases, and not be separate tools.  The
>>>> first fixes an issue that's pretty much non-existent in any modern kernel,
>>>> and the other two are for very low-level data recovery of horribly broken
>>>> filesystems.
>>>> - scrub: This is a very purpose specific tool which is supposed to be part
>>>> of regular maintainence, and only works to fix things as a side effect of
>>>> what it does.
>>>> - balance: This is also a relatively purpose specific tool, and again only
>>>> fixes things as a side effect of what it does.
>>
>>    You've forgotten btrfs-zero-log, which seems to have built itself a
>> reputation on the internet as the tool you run to fix all btrfs ills,
>> rather than a very finely-targeted tool that was introduced to deal
>> with approximately one bug somewhere back in the 2.x era (IIRC).
>>
>>    Hugo.
>
> :-) It's in my original list, and it's in Austin's by way of being
> lumped into 'btrfs rescue *' along with chunk and super recover. Seems
> like super recover should be built into Btrfs check, and would be one
> of the first ambiguities to get out of the way but I'm just an ape
> that wears pants so what do I know.
>
> Thing is?? zero log has fixed file systems in cases where I never
> would have expected it to, and the user was recommended not to use it,
> or use it as a 2nd to last resort. So, pfff....It's like throwing salt
> around.
>
To be entirely honest, both zero-log and super-recover could probably be 
pretty easily integrated into btrfs check such that it detects when they 
need to be run and does so.  zero-log has a very well defined situation 
in which it's absolutely needed (log tree corrupted such that it can't 
be replayed), which is pretty easy to detect (the kernel obviously does 
so, albeit by crashing).  super-recover is also used in a pretty 
specific set of circumstances (first SB corrupted, backups fine), which 
are also pretty easy to detect.  In both cases, I'd like to see some 
switch (--single-fix maybe?) for directly invoking just those functions 
(as well as a few others like dropping the FSC/FST or cancelling a 
paused or crashed balance) that operate at a filesystem level instead of 
a block/inode/extent level like most of the other stuff in check does.
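
As a rough illustration of how cheap that detection can be, here is a
sketch (assuming the standard superblock mirror offsets of 64KiB, 64MiB
and 256GiB and checking only the magic bytes; a real super-recover also
verifies checksums and generations before rewriting anything, none of
which is shown here):

/*
 * Hedged sketch only: the kind of detection a super-recover style
 * repair relies on.  Reads the magic bytes of each superblock copy
 * at its standard mirror offset and reports which copies look intact.
 */
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BTRFS_MAGIC        "_BHRfS_M"
#define BTRFS_MAGIC_OFFSET 64          /* magic sits 64 bytes into the superblock */

static const unsigned long long sb_offsets[] = {
	65536ULL,                      /* primary superblock at 64KiB */
	67108864ULL,                   /* first mirror at 64MiB       */
	274877906944ULL,               /* second mirror at 256GiB     */
};

int main(int argc, char **argv)
{
	char buf[8];
	int fd, i;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <device>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	for (i = 0; i < 3; i++) {
		if (pread(fd, buf, 8, sb_offsets[i] + BTRFS_MAGIC_OFFSET) != 8) {
			printf("superblock copy %d: not present (device too small?)\n", i);
			continue;
		}
		printf("superblock copy %d: magic %s\n", i,
		       memcmp(buf, BTRFS_MAGIC, 8) ? "BAD" : "ok");
	}
	close(fd);
	return 0;
}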

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 21:23               ` Christoph Anton Mitterer
@ 2016-09-16 12:13                 ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-16 12:13 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs

On 2016-09-15 17:23, Christoph Anton Mitterer wrote:
> On Thu, 2016-09-15 at 14:20 -0400, Austin S. Hemmelgarn wrote:
>> 3. Fsck should be needed only for un-mountable filesystems.  Ideally,
>> we
>> should be handling things like Windows does.  Preform slightly
>> better
>> checking when reading data, and if we see an error, flag the
>> filesystem
>> for expensive repair on the next mount.
>
> That philosophy also has some drawbacks:
> - The user doesn't directly that anything went wrong. Thus errors may
> even continue to accumulate and getting much worse if the fs would have
> immediately gone ro and giving the user the chance to manually
> intervene (possibly then with help from upstream).
Except that the fsck implementation in Windows for NTFS actually fixes 
things that are broken.  MS policy is 'if chkdsk can't fix it, you need 
to just reinstall and restore from backups'.  They don't beat around the 
bush trying to figure out what exactly went wrong, because 99% of the 
time on Windows a corrupted filesystem means broken hardware or a virus. 
BTRFS obviously isn't at that point yet, but it has the potential: if 
we were to start focusing on fixing stuff that's broken instead of 
working on shiny new features that will inevitably make everything else 
harder to debug, we could probably get there faster than most other 
Linux filesystems.
>
> - Any smart auto-magical™ repair may also just fail (and make things
> worse, as the current --repair e.g. may). Not performing such auto-
> repair, gives the user at least the possible chance to make a bitwise
> copy of the whole fs, before trying any rescue operations.
> This wouldn't be the case, if the user never noticed that something
> happen, and the fs tries to repair things right at mounting.
People talk about it being dangerous, but I have yet to see it break a 
filesystem that wasn't already in a state that in XFS or ext4 would be 
considered broken beyond repair.  For pretty much all of the common 
cases (orphaned inodes, dangling hardlinks, isize mismatches, etc), it 
does fix things correctly.  Most of that stuff could be optionally 
checked at mount and fixed without causing issues, but it's not 
something that should be done all the time since it's expensive, hence 
my suggestion of checking such things dynamically on-access and flagging 
them for cleanup on the next mount.
>
> So I think any such auto-repair should be used with extreme caution and
> only in those cases where one is absolutely a 100% sure that the action
> will help and just do good.
In general, I agree with this, and I'd say it should be opt-in, not 
mandatory.  I'm not talking about doing things that are all that risky 
though, but things which btrfs check can handle safely.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-16 12:00                     ` Austin S. Hemmelgarn
@ 2016-09-19  2:57                       ` Zygo Blaxell
  2016-09-19 12:37                         ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 93+ messages in thread
From: Zygo Blaxell @ 2016-09-19  2:57 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Chris Murphy, Hugo Mills, David Sterba, Waxhead, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 1159 bytes --]

On Fri, Sep 16, 2016 at 08:00:44AM -0400, Austin S. Hemmelgarn wrote:
> To be entirely honest, both zero-log and super-recover could probably be
> pretty easily integrated into btrfs check such that it detects when they
> need to be run and does so.  zero-log has a very well defined situation in
> which it's absolutely needed (log tree corrupted such that it can't be
> replayed), which is pretty easy to detect (the kernel obviously does so,
> albeit by crashing).  

Check already includes zero-log.  It loses a little data that way, but
that is probably better than the alternative (try to teach btrfs check
how to replay the log tree and keep up with kernel changes).

There have been at least two log-tree bugs (or, more accurately,
bugs triggered while processing the log tree during mount) in the 3.x
and 4.x kernels.  The most recent I've encountered was in one of the
4.7-rc kernels.  zero-log is certainly not obsolete.

For a filesystem where availability is more important than integrity
(e.g. root filesystems) it's really handy to have zero-log as a separate
tool without the huge overhead (and regression risk) of check.


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-12 16:56     ` Austin S. Hemmelgarn
  2016-09-12 17:29       ` Filipe Manana
  2016-09-12 20:08       ` Chris Murphy
@ 2016-09-19  3:47       ` Zygo Blaxell
  2016-09-19 12:32         ` Austin S. Hemmelgarn
  2 siblings, 1 reply; 93+ messages in thread
From: Zygo Blaxell @ 2016-09-19  3:47 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: dsterba, Waxhead, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 2539 bytes --]

On Mon, Sep 12, 2016 at 12:56:03PM -0400, Austin S. Hemmelgarn wrote:
> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
> is healthy.

I've found issues with OOB dedup (clone/extent-same):

1.  Don't dedup data that has not been committed--either call fsync()
on it, or check the generation numbers on each extent before deduping
it, or make sure the data is not being actively modified during dedup;
otherwise, a race condition may lead to the filesystem locking up and
becoming inaccessible until the kernel is rebooted.  This is particularly
important if you are doing bedup-style incremental dedup on a live system.

I've worked around #1 by placing a fsync() call on the src FD immediately
before calling FILE_EXTENT_SAME.  When I do an A/B experiment with and
without the fsync, "with-fsync" runs for weeks at a time without issues,
while "without-fsync" hangs, sometimes in just a matter of hours.  Note
that the fsync() doesn't resolve the underlying race condition, it just
makes the filesystem hang less often.
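
For illustration, a minimal sketch of that call sequence (this is not
the actual dedup tool; the function name, offsets and error handling
are made up, only the fsync-before-FILE_EXTENT_SAME shape matters):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

static int dedupe_one_range(int src_fd, __u64 src_off,
                            int dst_fd, __u64 dst_off, __u64 len)
{
	struct btrfs_ioctl_same_args *args;
	int ret;

	/* Workaround #1: make sure the source data has been committed. */
	if (fsync(src_fd) < 0)
		return -1;

	args = calloc(1, sizeof(*args) + sizeof(args->info[0]));
	if (!args)
		return -1;

	args->logical_offset = src_off;
	args->length = len;
	args->dest_count = 1;
	args->info[0].fd = dst_fd;
	args->info[0].logical_offset = dst_off;

	ret = ioctl(src_fd, BTRFS_IOC_FILE_EXTENT_SAME, args);
	if (ret == 0 && args->info[0].status == 0)
		printf("deduped %llu bytes\n",
		       (unsigned long long)args->info[0].bytes_deduped);
	else if (ret == 0)
		printf("not deduped, status %d\n", args->info[0].status);

	free(args);
	return ret;
}

int main(int argc, char **argv)
{
	int src_fd, dst_fd;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
		return 1;
	}
	src_fd = open(argv[1], O_RDONLY);
	dst_fd = open(argv[2], O_RDWR);
	if (src_fd < 0 || dst_fd < 0) {
		perror("open");
		return 1;
	}
	/* dedupe the first 128KiB of both files, purely as an example */
	return dedupe_one_range(src_fd, 0, dst_fd, 0, 128 * 1024) ? 1 : 0;
}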

2.  There is a practical limit to the number of times a single duplicate
extent can be deduplicated.  As more references to a shared extent
are created, any part of the filesystem that uses backref walking code
gets slower.  This includes dedup itself, balance, device replace/delete,
FIEMAP, LOGICAL_INO, and mmap() (which can be bad news if the duplicate
files are executables).  Several factors (including file size and number
of snapshots) are involved, making it difficult to devise workarounds or
set up test cases.  99.5% of the time, these operations just get slower
by a few ms each time a new reference is created, but the other 0.5% of
the time, write operations will abruptly grow to consume hours of CPU
time or dozens of gigabytes of RAM (in millions of kmalloc-32 slabs)
when they touch one of these over-shared extents.  When this occurs,
it effectively (but not literally) crashes the host machine.

I've worked around #2 by building tables of "toxic" hashes that occur too
frequently in a filesystem to be deduped, and using these tables in dedup
software to ignore any duplicate data matching them.  These tables can
be relatively small as they only need to list hashes that are repeated
more than a few thousand times, and typical filesystems (up to 10TB or
so) have only a few hundred such hashes.
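
A rough sketch of what such a table could look like (the threshold,
table size and hash handling here are made-up illustrative values, not
the real implementation):

#include <stdint.h>
#include <stdio.h>

#define TOXIC_THRESHOLD 4096          /* assumed cut-off, not the actual value */
#define TABLE_SLOTS     (1u << 20)

struct hash_slot {
	uint64_t hash;                /* truncated content hash of a block/extent */
	uint32_t count;               /* how many times we have seen it */
};

static struct hash_slot table[TABLE_SLOTS];

/* Return 1 if this hash has too many duplicates to be worth deduping. */
static int hash_is_toxic(uint64_t hash)
{
	struct hash_slot *slot = &table[hash % TABLE_SLOTS];

	if (slot->hash != hash) {     /* naive: overwrite on collision */
		slot->hash = hash;
		slot->count = 0;
	}
	if (slot->count < UINT32_MAX)
		slot->count++;
	return slot->count > TOXIC_THRESHOLD;
}

int main(void)
{
	uint64_t h = 0xdeadbeefULL;   /* pretend this block appears everywhere */
	long i, skipped = 0;

	for (i = 0; i < 10000; i++)
		if (hash_is_toxic(h))
			skipped++;    /* a dedup engine would ignore these */

	printf("skipped %ld of 10000 occurrences\n", skipped);
	return 0;
}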

I happened to have a couple of machines taken down by these issues this
very weekend, so I can confirm the issues are present in kernels 4.4.21,
4.5.7, and 4.7.4.


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-15 19:02               ` Chris Murphy
  2016-09-15 20:16                 ` Hugo Mills
@ 2016-09-19  4:08                 ` Zygo Blaxell
  2016-09-19 15:27                   ` Sean Greenslade
  2016-09-19 17:38                   ` Austin S. Hemmelgarn
  1 sibling, 2 replies; 93+ messages in thread
From: Zygo Blaxell @ 2016-09-19  4:08 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Austin S. Hemmelgarn, David Sterba, Waxhead, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
> Right, well I'm vaguely curious why ZFS, as different as it is,
> basically take the position that if the hardware went so batshit that
> they can't unwind it on a normal mount, then an fsck probably can't
> help either... they still don't have an fsck and don't appear to want
> one.

ZFS has no automated fsck, but it does have a kind of interactive
debugger that can be used to manually fix things.

ZFS seems to be a lot more robust when it comes to handling bad metadata
(contrast with btrfs-style BUG_ON panics).

When you delete a directory entry that has a missing inode on ZFS,
the dirent goes away.  In the ZFS administrator documentation they give
examples of this as a response in cases where ZFS metadata gets corrupted.

When you delete a file with a missing inode on btrfs, something
(VFS?) wants to check the inode to see if it has attributes that might
affect unlink (e.g. the immutable bit), gets an error reading the
inode, and bombs out of the unlink() before unlink() can get rid of the
dead dirent.  So if you get a dirent with no inode on btrfs on a large
filesystem (too large for btrfs check to handle), you're basically stuck
with it forever.  You can't even rename it.  Hopefully it doesn't happen
in a top-level directory.

ZFS is also infamous for saying "sucks to be you, I'm outta here" when
things go wrong.  People do want ZFS fsck and defrag, but nobody seems
to be bothered much about making those things happen.

At the end of the day I'm not sure fsck really matters.  If the filesystem
is getting corrupted enough that both copies of metadata are broken,
there's something fundamentally wrong with that setup (hardware bugs,
software bugs, bad RAM, etc) and it's just going to keep slowly eating
more data until the underlying problem is fixed, and there's no guarantee
that a repair is going to restore data correctly.  If we exclude broken
hardware, the only thing btrfs check is going to repair is btrfs kernel
bugs...and in that case, why would we expect btrfs check to have fewer
bugs than the filesystem itself?

> I'm not sure if the brfsck is really all that helpful to user as much
> as it is for developers to better learn about the failure vectors of
> the file system.

ReiserFS had no working fsck for all of the 8 years I used it (and still
didn't last year when I tried to use it on an old disk).  "Not working"
here means "much less data is readable from the filesystem after running
fsck than before."  It's not that much of an inconvenience if you have
backups.


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19  3:47       ` Zygo Blaxell
@ 2016-09-19 12:32         ` Austin S. Hemmelgarn
  2016-09-19 15:33           ` Zygo Blaxell
  0 siblings, 1 reply; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-19 12:32 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: dsterba, Waxhead, linux-btrfs

On 2016-09-18 23:47, Zygo Blaxell wrote:
> On Mon, Sep 12, 2016 at 12:56:03PM -0400, Austin S. Hemmelgarn wrote:
>> 4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
>> is healthy.
>
> I've found issues with OOB dedup (clone/extent-same):
>
> 1.  Don't dedup data that has not been committed--either call fsync()
> on it, or check the generation numbers on each extent before deduping
> it, or make sure the data is not being actively modified during dedup;
> otherwise, a race condition may lead to the the filesystem locking up and
> becoming inaccessible until the kernel is rebooted.  This is particularly
> important if you are doing bedup-style incremental dedup on a live system.
>
> I've worked around #1 by placing a fsync() call on the src FD immediately
> before calling FILE_EXTENT_SAME.  When I do an A/B experiment with and
> without the fsync, "with-fsync" runs for weeks at a time without issues,
> while "without-fsync" hangs, sometimes in just a matter of hours.  Note
> that the fsync() doesn't resolve the underlying race condition, it just
> makes the filesystem hang less often.
>
> 2.  There is a practical limit to the number of times a single duplicate
> extent can be deduplicated.  As more references to a shared extent
> are created, any part of the filesystem that uses backref walking code
> gets slower.  This includes dedup itself, balance, device replace/delete,
> FIEMAP, LOGICAL_INO, and mmap() (which can be bad news if the duplicate
> files are executables).  Several factors (including file size and number
> of snapshots) are involved, making it difficult to devise workarounds or
> set up test cases.  99.5% of the time, these operations just get slower
> by a few ms each time a new reference is created, but the other 0.5% of
> the time, write operations will abruptly grow to consume hours of CPU
> time or dozens of gigabytes of RAM (in millions of kmalloc-32 slabs)
> when they touch one of these over-shared extents.  When this occurs,
> it effectively (but not literally) crashes the host machine.
>
> I've worked around #2 by building tables of "toxic" hashes that occur too
> frequently in a filesystem to be deduped, and using these tables in dedup
> software to ignore any duplicate data matching them.  These tables can
> be relatively small as they only need to list hashes that are repeated
> more than a few thousand times, and typical filesystems (up to 10TB or
> so) have only a few hundred such hashes.
>
> I happened to have a couple of machines taken down by these issues this
> very weekend, so I can confirm the issues are present in kernels 4.4.21,
> 4.5.7, and 4.7.4.
OK, that's good to know.  In my case, I'm not operating on a very big 
data set (less than 40GB, but the storage cluster I'm doing this on only 
has about 200GB of total space, so I'm trying to conserve as much as 
possible), and it's mostly static data (less than 100MB worth of changes 
a day except on Sunday when I run backups), so it makes sense that I've 
not seen either of these issues.

The second one sounds like the same performance issue caused by having 
very large numbers of snapshots, and based on what's happening, I don't 
think there's any way we could fix it without rewriting certain core code.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19  2:57                       ` Zygo Blaxell
@ 2016-09-19 12:37                         ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-19 12:37 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: Chris Murphy, Hugo Mills, David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-18 22:57, Zygo Blaxell wrote:
> On Fri, Sep 16, 2016 at 08:00:44AM -0400, Austin S. Hemmelgarn wrote:
>> To be entirely honest, both zero-log and super-recover could probably be
>> pretty easily integrated into btrfs check such that it detects when they
>> need to be run and does so.  zero-log has a very well defined situation in
>> which it's absolutely needed (log tree corrupted such that it can't be
>> replayed), which is pretty easy to detect (the kernel obviously does so,
>> albeit by crashing).
>
> Check already includes zero-log.  It loses a little data that way, but
> that is probably better than the alternative (try to teach btrfs check
> how to replay the log tree and keep up with kernel changes).
Interesting, as I've never seen check try to zero the log (even in cases 
where it would fix things) unless it makes some other change in the FS. 
I won't dispute that it clears the log tree _if_ it makes other changes 
to the FS (it kind of has to for safety reasons), but that's the only 
circumstance that I've seen it do so (even on filesystems where the log 
tree was corrupted, but the rest of the FS was fine).
>
> There have been at least two log-tree bugs (or, more accurately,
> bugs triggered while processing the log tree during mount) in the 3.x
> and 4.x kernels.  The most recent I've encountered was in one of the
> 4.7-rc kernels.  zero-log is certainly not obsolete.
I won't dispute this, as I've had it happen myself (albeit not quite 
that recently), all I was trying to say was that it fixes a very well 
defined problem.
>
> For a filesystem where availablity is more important than integrity
> (e.g. root filesystems) it's really handy to have zero-log as a separate
> tool without the huge overhead (and regression risk) of check.
Agreed, hence my later statement that if it gets fully merged, there 
should be an option to run just that.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-15 11:54                 ` Austin S. Hemmelgarn
  2016-09-15 14:15                   ` Chris Murphy
  2016-09-15 14:56                   ` Martin Steigerwald
@ 2016-09-19 14:38                   ` David Sterba
  2 siblings, 0 replies; 93+ messages in thread
From: David Sterba @ 2016-09-19 14:38 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Hans van Kranenburg, Christoph Anton Mitterer, linux-btrfs

On Thu, Sep 15, 2016 at 07:54:26AM -0400, Austin S. Hemmelgarn wrote:
> > I'd like to help creating/maintaining this bug overview. A good start
> > would be to just crawl through all stable kernels and some distro
> > kernels and see which commits show up in fs/btrfs.
> >
> As of right now, we kind of do have such a page:
> https://btrfs.wiki.kernel.org/index.php/Gotchas
> It's not really well labeled though, ans it's easy to overlook.

The page has been created long time ago, if you'd need to start a new
page with similar content I can add a redirect so the link still works.

A more detailed bug page would be welcomed by users. The changelogs I
write per release are terse as I don't want to spend the day just on
that. All the information should be in the git log and possibly in
recent mails, so this is manual work to present it on the wiki and does
not need devs to assist.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19  4:08                 ` Zygo Blaxell
@ 2016-09-19 15:27                   ` Sean Greenslade
  2016-09-19 17:38                   ` Austin S. Hemmelgarn
  1 sibling, 0 replies; 93+ messages in thread
From: Sean Greenslade @ 2016-09-19 15:27 UTC (permalink / raw)
  To: Btrfs BTRFS

On Mon, Sep 19, 2016 at 12:08:55AM -0400, Zygo Blaxell wrote:
> <snip>
> At the end of the day I'm not sure fsck really matters.  If the filesystem
> is getting corrupted enough that both copies of metadata are broken,
> there's something fundamentally wrong with that setup (hardware bugs,
> software bugs, bad RAM, etc) and it's just going to keep slowly eating
> more data until the underlying problem is fixed, and there's no guarantee
> that a repair is going to restore data correctly.  If we exclude broken
> hardware, the only thing btrfs check is going to repair is btrfs kernel
> bugs...and in that case, why would we expect btrfs check to have fewer
> bugs than the filesystem itself?

I see btrfs check as having a very useful role: fixing known problems
introduced by previous versions of kernel / progs. In my ext conversion
thread, I seem to have discovered a problem introduced by convert,
balance, or defrag. The data and metadata seem to be OK, however the
filesystem cannot be written to without btrfs falling over. If this was
caused by some edge-case data in the btrfs partition, it makes a lot
more sense to have btrfs check repair it than it does to modify the
kernel code to work around this and possibly many other bugs. The upshot
to this is that since (potentially all of) the data is intact, a
functional btrfs check would save me the hassle of restoring from
backup.

--Sean


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix (was: Is stability a joke?)
  2016-09-15  2:14             ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
  2016-09-15  9:49               ` stability matrix Hans van Kranenburg
@ 2016-09-19 15:27               ` David Sterba
  2016-09-19 17:18                 ` stability matrix Austin S. Hemmelgarn
  2016-09-19 19:45                 ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
  1 sibling, 2 replies; 93+ messages in thread
From: David Sterba @ 2016-09-19 15:27 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: linux-btrfs

Hi,

On Thu, Sep 15, 2016 at 04:14:04AM +0200, Christoph Anton Mitterer wrote:
> In general:
> - I think another column should be added, which tells when and for
>   which kernel version the feature-status of each row was 
>   revised/updated the last time and especially by whom.
>   If a core dev makes a statement on a particular feature, this
>   probably means much more, than if it was made by "just" a list
>   regular.

It's going to be revised per release. If there's a bug that affects the
status, the page will be updated. I'm going to do that among other
per-release regular boring tasks.

I'm still not decided if the kernel version will be useful enough, but
if anybody is willing to do the research and fill the table I don't
object.

>   And yes I know, in the beginning it already says "this is for 4.7"...
>   but let's be honest, it's pretty likely when this is bumped to 4.8
>   that not each and every point will be thoroughly checked again.
> - Optionally even one further column could be added, that lists bugs
>   where the specific cases are kept record of (if any).

There's a new section under the table to write anything that would not
fit. Mostly pointers to other documentation (manual pages) or bugzilla.

> - Perhaps a 3rd Status like "eats-your-data" which is worse than
>   critical, e.g. for things were it's known that there is a high
>   chance for still getting data corruption (RAID56?)
> 
> 
> Perhaps there should be another section that lists general caveats
> and pitfalls including:
> - defrag/auto-defrag causes ref-link break up (which in turn causes
>   possible extensive space being eaten up)

Updated accordingly.

> - nodatacow files are not yet[0] checksummed, which in turn means
>   that any errors (especially silent data corruption) will not be
>   noticed AND which in turn also means the data itself cannot be
>   repaired even in case of RAIDs (only the RAIDs are made consistent
>   again)

Added to the table.

> - subvolume UUID attacks discussed in the recent thread
> - fs/device UUID collisions
>   - the accidental corruption that can happen in case colliding
>     fs/device UUIDs appear in a system (and telling the user that
>     this is e.g. the case when dd'ing and image or using lvm
>     snapshots, probably also when having btrfs on MD RAID1 or RAID10)
>   - the attacks that are possible when UUIDs are known to an attacker

That's more like a usecase, that's out of the scope of the tabular
overview. But we have an existing page UseCases that I'd like to
transform into a more structured and complete overview of usecases of
various features, so the UUID collisions would build on top of that with
"and this could happen if ...".

> - in-band dedupe
>   deduped are IIRC not bitwise compared by the kernel before de-duping,
>   as it's the case with offline dedupe.
>   Even if this is considered safe by the community... I think users
>   should be told.

Only features merged are reflected. And the out-of-band dedupe does a full
memory compare. See btrfs_cmp_data() called from btrfs_extent_same().

> - btrfs check --repair (and others?)
>   Telling people that this may often cause more harm than good.

I think userspace tools do not belong to the overview.

> - even mounting a fs ro, may cause it to be changed

This would go to the UseCases

> - DB/VM-image like IO patterns + nodatacow + (!)checksumming
>   + (auto)defrag + snapshots
>   a)
>   People typically may have the impression:
>   btrfs = checksummed => all is guaranteed to be "valid" (or at least
>   noticed)
>   However this isn't the case for nodatacow'ed files, which in turn is
>   kinda "mandatory" for DB/VM-image like IO patterns, cause otherwise
>   these would fragment to heavily (see (b).
>   Unless claimed by some people, none of the major DBs or VM-image
>   formats do general checksumming on their own, most even don't support
>   it, some that do wouldn't do it without app-support and few "just"
>   don't do it per default.
>   Thus one should bump people to this situation and that they may not
>   get this "correctness" guarantee here.
>   b)
>   IIRC, it doesn't even help to simply not use nodatacow on such files
>   and using auto-defrag instead to countermeasure the fragmenting, as
>   that one doesn't perform too well on large files.

Same.

> For specific features:
> - Autodefrag
>   - didn't that also cause reflinks to be broken up?

No and never had.

> that should be
>     mentioned than as well, as it is (more or less) for defrag and
>     people could then assume it's not the case for autodefrag (which I
>     did initially)
>   - wasn't it said that autodefrag performs bad with files > ~1GB?
>     Perhaps that should be mentioned too
> - defrag
>   "extents get unshared" is IMO not an adequate description for the end
>   user,... it should perhaps link to the defrag article and there
>   explain in detail that any ref-linked files will be broken up, which
>   means space usage will increase, and may especially explode in case
>   of snapshots

Added more verbose description.

> - all the RAID56 related points
>   wasn't there recently a thread that discussed a more serious bug,
>   where parity was wrongly re-calculated which in turn caused actual
>   data corruption?
>   I think if that's still an issue "write hole still exists, parity
>   not checksummed" is not enough but one should emphasize that data may
>   easily be corrupted.

There's a separate page for raid56 listing all known problems but I
don't see this one there.

> - RAID*
>   No userland tools for monitoring/etc.

That's a usability bug.

> - Device replace 
>   IIRC, CM told me that this may cause severe troubles on RAID56
>
> Also, the current matrix talks about "auto-repair"... what's that? (=>
> should be IMO explained). 

Added.

> Last but not least, perhaps this article may also be the place to
> document 3rd party things and how far they work stable with btrfs.
> For example:
> - Which grub version supports booting from it? Which features does it
>   [not] support (e.g. which RAIDs, skinny-extents, etc.)?
> - Which forensic tools (e.g. things like testdisk) do work with btrfs?
> - Which are still maintained/working dedupe userland tools (and are
>   they stable?)

This is getting a bit out of scope. If the information on our wiki
is static, i.e. 'grub2 since 2.02~beta18 supports something', then OK, but
we still should point readers to the official wikis or documentation.

Auditing the bootloaders for btrfs support is one of the unclaimed
project ideas.

> [0] Yeah I know, a number of list regulars constantly tried to convince
>     me that this wasn't possible per se, but a recent discussion I had
>     with CM seemed to have revealed (unless I understood it wrong) that
>     it wouldn't be generally impossible at all.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19 12:32         ` Austin S. Hemmelgarn
@ 2016-09-19 15:33           ` Zygo Blaxell
  0 siblings, 0 replies; 93+ messages in thread
From: Zygo Blaxell @ 2016-09-19 15:33 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: dsterba, Waxhead, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3765 bytes --]

On Mon, Sep 19, 2016 at 08:32:14AM -0400, Austin S. Hemmelgarn wrote:
> On 2016-09-18 23:47, Zygo Blaxell wrote:
> >On Mon, Sep 12, 2016 at 12:56:03PM -0400, Austin S. Hemmelgarn wrote:
> >>4. File Range Cloning and Out-of-band Dedupe: Similarly, work fine if the FS
> >>is healthy.
> >
> >I've found issues with OOB dedup (clone/extent-same):
> >
> >1.  Don't dedup data that has not been committed--either call fsync()
> >on it, or check the generation numbers on each extent before deduping
> >it, or make sure the data is not being actively modified during dedup;
> >otherwise, a race condition may lead to the the filesystem locking up and
> >becoming inaccessible until the kernel is rebooted.  This is particularly
> >important if you are doing bedup-style incremental dedup on a live system.
> >
> >I've worked around #1 by placing a fsync() call on the src FD immediately
> >before calling FILE_EXTENT_SAME.  When I do an A/B experiment with and
> >without the fsync, "with-fsync" runs for weeks at a time without issues,
> >while "without-fsync" hangs, sometimes in just a matter of hours.  Note
> >that the fsync() doesn't resolve the underlying race condition, it just
> >makes the filesystem hang less often.
> >
> >2.  There is a practical limit to the number of times a single duplicate
> >extent can be deduplicated.  As more references to a shared extent
> >are created, any part of the filesystem that uses backref walking code
> >gets slower.  This includes dedup itself, balance, device replace/delete,
> >FIEMAP, LOGICAL_INO, and mmap() (which can be bad news if the duplicate
> >files are executables).  Several factors (including file size and number
> >of snapshots) are involved, making it difficult to devise workarounds or
> >set up test cases.  99.5% of the time, these operations just get slower
> >by a few ms each time a new reference is created, but the other 0.5% of
> >the time, write operations will abruptly grow to consume hours of CPU
> >time or dozens of gigabytes of RAM (in millions of kmalloc-32 slabs)
> >when they touch one of these over-shared extents.  When this occurs,
> >it effectively (but not literally) crashes the host machine.
> >
> >I've worked around #2 by building tables of "toxic" hashes that occur too
> >frequently in a filesystem to be deduped, and using these tables in dedup
> >software to ignore any duplicate data matching them.  These tables can
> >be relatively small as they only need to list hashes that are repeated
> >more than a few thousand times, and typical filesystems (up to 10TB or
> >so) have only a few hundred such hashes.
> >
> >I happened to have a couple of machines taken down by these issues this
> >very weekend, so I can confirm the issues are present in kernels 4.4.21,
> >4.5.7, and 4.7.4.
> OK, that's good to know.  In my case, I'm not operating on a very big data
> set (less than 40GB, but the storage cluster I'm doing this on only has
> about 200GB of total space, so I'm trying to conserve as much as possible),
> and it's mostly static data (less than 100MB worth of changes a day except
> on Sunday when I run backups), so it makes sense that I've not seen either
> of these issues.

I ran into issue #2 on an 8GB filesystem last weekend.  The lower limit
on filesystem size could be as low as a few megabytes if the extents
are arranged in *just* the right way.

> The second one sounds like the same performance issue caused by having very
> large numbers of snapshots, and based on what's happening, I don't think
> there's any way we could fix it without rewriting certain core code.

find_parent_nodes is the usual culprit for CPU usage.  Fixing this is
required for in-band dedup as well, so I assume someone has it on their
roadmap and will get it done eventually.


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-12 17:31       ` Austin S. Hemmelgarn
  2016-09-15  1:07         ` Nicholas D Steeves
@ 2016-09-19 15:38         ` David Sterba
  2016-09-19 21:25           ` Hans van Kranenburg
  1 sibling, 1 reply; 93+ messages in thread
From: David Sterba @ 2016-09-19 15:38 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: dsterba, Waxhead, linux-btrfs

On Mon, Sep 12, 2016 at 01:31:42PM -0400, Austin S. Hemmelgarn wrote:
> On 2016-09-12 12:51, David Sterba wrote:
> > On Mon, Sep 12, 2016 at 10:54:40AM -0400, Austin S. Hemmelgarn wrote:
> >>> Somebody has put that table on the wiki, so it's a good starting point.
> >>> I'm not sure we can fit everything into one table, some combinations do
> >>> not bring new information and we'd need n-dimensional matrix to get the
> >>> whole picture.
> >> Agreed, especially because some things are only bad in specific
> >> circumstances (For example, snapshots generally work fine on almost
> >> anything, until you get into the range of more than about 250, then they
> >> start causing issues).
> >
> > The performance aspect could be hard to estimate. Each feature has some
> > cost, we can document what's expected hit but various combinations and
> > actual runtime performance is unpredictable. I'd rather let the tools do
> > what the user asks for, as we might not be able to even detect there are
> > some bad external factors. I think that 250 snapshots would perform
> > better on an ssd than a rotational disk. In the end this leads to the
> > "dos & don'ts".
> >
> In general yes in this case, but performance starts to degrade 
> exponentially beyond a certain point.  The difference between (for 
> example) 10 and 20 snapshots is not as much as between 1000 and 1010. 
> The problem here is that we don't really have a BCP document that anyone 
> ever reads.  A lot of stuff that may seem obvious to us after years of 
> working with BTRFS isn't going to be to a newcomer, and it's a lot more 
> likely that some random person will get things right if we have a good, 
> central BCP document than if it stays as scattered tribal knowledge.

The IRC tribe answers the same newcomer questions over and over, which
is fine for the interaction itself, but if all that had also ended up on
the wiki we'd have had perfect documentation years ago. Also the current
status of features and bugs is kept in the IRC-hive-mind, yet it still
needs some other way to actually make it appear on the wiki. Edit with courage!

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-19 15:27               ` stability matrix (was: Is stability a joke?) David Sterba
@ 2016-09-19 17:18                 ` Austin S. Hemmelgarn
  2016-09-19 19:52                   ` Christoph Anton Mitterer
  2016-09-19 19:45                 ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
  1 sibling, 1 reply; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-19 17:18 UTC (permalink / raw)
  To: linux-btrfs, dave, Christoph Anton Mitterer

On 2016-09-19 11:27, David Sterba wrote:
> Hi,
>
> On Thu, Sep 15, 2016 at 04:14:04AM +0200, Christoph Anton Mitterer wrote:
>> In general:
>> - I think another column should be added, which tells when and for
>>   which kernel version the feature-status of each row was
>>   revised/updated the last time and especially by whom.
>>   If a core dev makes a statement on a particular feature, this
>>   probably means much more, than if it was made by "just" a list
>>   regular.
>
> It's going to be revised per release. If there's a bug that affects the
> status, the page will be updated. I'm going to do that among other
> per-release regular boring tasks.
>
> I'm still not decided if the kernel version will be useful enough, but
> if anybody is willing to do the research and fill the table I don't
> object.
Moving forwards, I think it's worth it, but I don't feel that it's worth 
looking back at anything before 4.4 to list versions.
>
>>   And yes I know, in the beginning it already says "this is for 4.7"...
>>   but let's be honest, it's pretty likely when this is bumped to 4.8
>>   that not each and every point will be thoroughly checked again.
>> - Optionally even one further column could be added, that lists bugs
>>   where the specific cases are kept record of (if any).
>
> There's a new section under the table to write anything that would not
> fit. Mostly pointers to other documentation (manual pages) or bugzilla.
>
>> - Perhaps a 3rd Status like "eats-your-data" which is worse than
>>   critical, e.g. for things were it's known that there is a high
>>   chance for still getting data corruption (RAID56?)
>>
>>
>> Perhaps there should be another section that lists general caveats
>> and pitfalls including:
>> - defrag/auto-defrag causes ref-link break up (which in turn causes
>>   possible extensive space being eaten up)
>
> Updated accordingly.
>
>> - nodatacow files are not yet[0] checksummed, which in turn means
>>   that any errors (especially silent data corruption) will not be
>>   noticed AND which in turn also means the data itself cannot be
>>   repaired even in case of RAIDs (only the RAIDs are made consistent
>>   again)
>
> Added to the table.
>
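As a minimal sketch of what that means in practice (paths are placeholders,
and this assumes /mnt is a btrfs mount point): nodatacow is usually applied
per directory via the C file attribute, and a later scrub then has no
checksums to verify or repair for files created there.

  # new files in this directory will be created nodatacow; the +C
  # attribute only affects files created after it is set
  mkdir /mnt/vmimages
  chattr +C /mnt/vmimages
  lsattr -d /mnt/vmimages      # shows the 'C' attribute
  # scrub verifies and repairs checksummed data, but has nothing to
  # check for the nodatacow files above
  btrfs scrub start -B /mnt
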
>> - subvolume UUID attacks discussed in the recent thread
>> - fs/device UUID collisions
>>   - the accidental corruption that can happen in case colliding
>>     fs/device UUIDs appear in a system (and telling the user that
>>     this is e.g. the case when dd'ing and image or using lvm
>>     snapshots, probably also when having btrfs on MD RAID1 or RAID10)
>>   - the attacks that are possible when UUIDs are known to an attacker
>
> That's more like a use case; that's out of the scope of the tabular
> overview. But we have an existing page UseCases that I'd like to
> transform into a more structured and complete overview of use cases of
> various features, so the UUID collisions would build on top of that with
> "and this could happen if ...".
I don't agree with this being use case specific.  Whether or not someone 
cares could technically be use case specific, but the use cases where 
this actually doesn't matter are pretty much limited to tight embedded 
systems with no way to attach external storage.  This behavior results 
in both a number of severe security holes for anyone without proper 
physical security (read as 'almost all desktop and laptop users, as well 
as many server admins'), and severe potential for data loss when 
performing normal recovery activities that work on every other filesystem.
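As a minimal sketch of how the accidental case usually shows up and one way
to defuse it (device names are placeholders; regenerating the fsid with
btrfstune assumes a reasonably recent btrfs-progs and an unmounted
filesystem):

  # a block-level copy carries the same filesystem UUID as the original
  dd if=/dev/sdb1 of=/dev/sdc1
  blkid -t TYPE=btrfs          # both devices now report the same UUID
  # give the copy a new random fsid before both are visible to one system
  btrfstune -u /dev/sdc1
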
>
>> - in-band dedupe
>>   deduped are IIRC not bitwise compared by the kernel before de-duping,
>>   as it's the case with offline dedupe.
>>   Even if this is considered safe by the community... I think users
>>   should be told.
>
> Only features merged are reflected. And the out-of-band dedupe does full
> memcpy. See btrfs_cmp_data() called from btrfs_extent_same().
>
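For the out-of-band case, a minimal sketch using duperemove (assuming that
tool is installed; the byte-wise compare happens in the kernel inside the
extent-same ioctl before anything is shared):

  # find duplicate extents under /mnt/data and submit them to the kernel,
  # which compares the ranges byte by byte and only then shares them
  duperemove -dr /mnt/data
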
>> - btrfs check --repair (and others?)
>>   Telling people that this may often cause more harm than good.
>
> I think userspace tools do not belong to the overview.
>
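If they do get documented somewhere, a minimal sketch of the usual advice
(device name is a placeholder; the filesystem must be unmounted):

  # read-only check first; this does not modify anything
  btrfs check /dev/sdb1
  # --repair only as a last resort, after an image/backup of the device
  # and ideally after asking on the list
  btrfs check --repair /dev/sdb1
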
>> - even mounting a fs ro, may cause it to be changed
>
> This would go to the UseCases
My same argument about the UUID issues applies here, just without the 
security aspect.  The only difference here is that it's common behavior 
across most filesystems (but not widely known to most people who aren't 
FS developers or sysops experts).
>
>> - DB/VM-image like IO patterns + nodatacow + (!)checksumming
>>   + (auto)defrag + snapshots
>>   a)
>>   People typically may have the impression:
>>   btrfs = checksummed => all is guaranteed to be "valid" (or at least
>>   noticed)
>>   However this isn't the case for nodatacow'ed files, which in turn is
>>   kinda "mandatory" for DB/VM-image like IO patterns, because otherwise
>>   these would fragment too heavily (see (b)).
>>   Contrary to what some people claim, none of the major DBs or VM-image
>>   formats do general checksumming on their own, most even don't support
>>   it, some that do wouldn't do it without app-support and few "just"
>>   don't do it per default.
>>   Thus one should bump people to this situation and that they may not
>>   get this "correctness" guarantee here.
>>   b)
>>   IIRC, it doesn't even help to simply not use nodatacow on such files
>>   and using auto-defrag instead to countermeasure the fragmenting, as
>>   that one doesn't perform too well on large files.
>
> Same.
>
>> For specific features:
>> - Autodefrag
>>   - didn't that also cause reflinks to be broken up?
>
> No and never had.
>
>> that should be
>>     mentioned then as well, as it is (more or less) for defrag and
>>     people could then assume it's not the case for autodefrag (which I
>>     did initially)
>>   - wasn't it said that autodefrag performs bad with files > ~1GB?
>>     Perhaps that should be mentioned too
>> - defrag
>>   "extents get unshared" is IMO not an adequate description for the end
>>   user,... it should perhaps link to the defrag article and there
>>   explain in detail that any ref-linked files will be broken up, which
>>   means space usage will increase, and may especially explode in case
>>   of snapshots
>
> Added more verbose description.
>
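One way to make the effect visible, as a minimal sketch (paths are
placeholders, and 'btrfs filesystem du' assumes a btrfs-progs new enough to
have it): compare shared space before and after defragmenting a subvolume
that has a snapshot.

  # 'Set shared' shows how much data the snapshot still shares
  btrfs filesystem du -s /mnt/subvol /mnt/snap-of-subvol
  # defragment rewrites extents and breaks the shared references
  btrfs filesystem defragment -r /mnt/subvol
  btrfs filesystem du -s /mnt/subvol /mnt/snap-of-subvol
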
>> - all the RAID56 related points
>>   wasn't there recently a thread that discussed a more serious bug,
>>   where parity was wrongly re-calculated which in turn caused actual
>>   data corruption?
>>   I think if that's still an issue "write hole still exists, parity
>>   not checksummed" is not enough but one should emphasize that data may
>>   easily be corrupted.
>
> There's a separate page for raid56 listing all known problems but I
> don't see this one there.
>
>> - RAID*
>>   No userland tools for monitoring/etc.
>
> That's a usability bug.
While it's a usability bug, it's still an important piece of information 
for people who are looking at this for production usage, and due to the 
generally shoddy documentation, is something that's not hard to overlook.
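Until proper tooling exists, a minimal sketch of the kind of ad-hoc check
people end up scripting themselves (mount point is a placeholder):

  # per-device error counters; anything non-zero deserves attention
  btrfs device stats /mnt
  # crude cron-able variant: print only counters that are not zero
  btrfs device stats /mnt | grep -vE ' 0$'
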
>
>> - Device replace
>>   IIRC, CM told me that this may cause severe troubles on RAID56
>>
>> Also, the current matrix talks about "auto-repair"... what's that? (=>
>> should be IMO explained).
>
> Added.
>
>> Last but not least, perhaps this article may also be the place to
>> document 3rd party things and how far they work stable with btrfs.
>> For example:
>> - Which grub version supports booting from it? Which features does it
>>   [not] support (e.g. which RAIDs, skinny-extents, etc.)?
>> - Which forensic tools (e.g. things like testdisk) do work with btrfs?
>> - Which are still maintained/working dedupe userland tools (and are
>>   they stable?)
>
> This is getting a bit out of the scope. If the information on our wiki
> is static, ie 'grub2 since 2.02~beta18 supports something', then ok, but
> we still should point readers to the official wikis or documentation.
>
> Auditing the bootloaders for btrfs support is one of the unclaimed
> project ideas.
>
>> [0] Yeah I know, a number of list regulars constantly tried to convince
>>     me that this wasn't possible per se, but a recent discussion I had
>>     with CM seemed to have revealed (unless I understood it wrong) that
>>     it wouldn't be generally impossible at all.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19  4:08                 ` Zygo Blaxell
  2016-09-19 15:27                   ` Sean Greenslade
@ 2016-09-19 17:38                   ` Austin S. Hemmelgarn
  2016-09-19 18:27                     ` Chris Murphy
  2016-09-19 20:15                     ` Zygo Blaxell
  1 sibling, 2 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-19 17:38 UTC (permalink / raw)
  To: Zygo Blaxell, Chris Murphy; +Cc: David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-19 00:08, Zygo Blaxell wrote:
> On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
>> Right, well I'm vaguely curious why ZFS, as different as it is,
>> basically take the position that if the hardware went so batshit that
>> they can't unwind it on a normal mount, then an fsck probably can't
>> help either... they still don't have an fsck and don't appear to want
>> one.
>
> ZFS has no automated fsck, but it does have a kind of interactive
> debugger that can be used to manually fix things.
>
> ZFS seems to be a lot more robust when it comes to handling bad metadata
> (contrast with btrfs-style BUG_ON panics).
>
> When you delete a directory entry that has a missing inode on ZFS,
> the dirent goes away.  In the ZFS administrator documentation they give
> examples of this as a response in cases where ZFS metadata gets corrupted.
>
> When you delete a file with a missing inode on btrfs, something
> (VFS?) wants to check the inode to see if it has attributes that might
> affect unlink (e.g. the immutable bit), gets an error reading the
> inode, and bombs out of the unlink() before unlink() can get rid of the
> dead dirent.  So if you get a dirent with no inode on btrfs on a large
> filesystem (too large for btrfs check to handle), you're basically stuck
> with it forever.  You can't even rename it.  Hopefully it doesn't happen
> in a top-level directory.
>
> ZFS is also infamous for saying "sucks to be you, I'm outta here" when
> things go wrong.  People do want ZFS fsck and defrag, but nobody seems
> to be bothered much about making those things happen.
>
> At the end of the day I'm not sure fsck really matters.  If the filesystem
> is getting corrupted enough that both copies of metadata are broken,
> there's something fundamentally wrong with that setup (hardware bugs,
> software bugs, bad RAM, etc) and it's just going to keep slowly eating
> more data until the underlying problem is fixed, and there's no guarantee
> that a repair is going to restore data correctly.  If we exclude broken
> hardware, the only thing btrfs check is going to repair is btrfs kernel
> bugs...and in that case, why would we expect btrfs check to have fewer
> bugs than the filesystem itself?
I wouldn't, but I would still expect to have some tool to deal with 
things like orphaned inodes, dentries which are missing inodes, and 
other similar cases that don't make the filesystem unusable, but can't 
easily be fixed in a sane manner on a live filesystem.  The ZFS approach 
is valid, but it can't deal with things like orphaned inodes where 
there's no reference in the directories any more.
>
>> I'm not sure if the btrfsck is really all that helpful to the user as much
>> as it is for developers to better learn about the failure vectors of
>> the file system.
>
> ReiserFS had no working fsck for all of the 8 years I used it (and still
> didn't last year when I tried to use it on an old disk).  "Not working"
> here means "much less data is readable from the filesystem after running
> fsck than before."  It's not that much of an inconvenience if you have
> backups.
For a small array, this may be the case.  Once you start looking into 
double digit TB scale arrays though, restoring backups becomes a very 
expensive operation.  If you had a multi-PB array with a single dentry 
which had no inode, would you rather be spending multiple days restoring 
files and possibly losing recent changes, or spend a few hours to check 
the filesystem and fix it with minimal data loss?

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19 17:38                   ` Austin S. Hemmelgarn
@ 2016-09-19 18:27                     ` Chris Murphy
  2016-09-19 18:34                       ` Austin S. Hemmelgarn
  2016-09-19 20:15                     ` Zygo Blaxell
  1 sibling, 1 reply; 93+ messages in thread
From: Chris Murphy @ 2016-09-19 18:27 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Zygo Blaxell, Chris Murphy, David Sterba, Waxhead, Btrfs BTRFS

On Mon, Sep 19, 2016 at 11:38 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
>> ReiserFS had no working fsck for all of the 8 years I used it (and still
>> didn't last year when I tried to use it on an old disk).  "Not working"
>> here means "much less data is readable from the filesystem after running
>> fsck than before."  It's not that much of an inconvenience if you have
>> backups.
>
> For a small array, this may be the case.  Once you start looking into double
> digit TB scale arrays though, restoring backups becomes a very expensive
> operation.  If you had a multi-PB array with a single dentry which had no
> inode, would you rather be spending multiple days restoring files and
> possibly losing recent changes, or spend a few hours to check the filesystem
> and fix it with minimal data loss?

Yep restoring backups, even fully re-replicating data in a cluster, is
untenable, it's so expensive. But even offline fsck is sufficiently
non-scalable that at a certain volume size it's not tenable. 100TB
takes a long time to fsck offline, and is it even possible to fsck 1PB
Btrfs? Seems to me it's another case where, if it were possible to
isolate what tree limbs are sick, just cut them off and report the
data loss rather than consider the whole fs unusable. That's what we
do with living things.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19 18:27                     ` Chris Murphy
@ 2016-09-19 18:34                       ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-19 18:34 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Zygo Blaxell, David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-19 14:27, Chris Murphy wrote:
> On Mon, Sep 19, 2016 at 11:38 AM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>>> ReiserFS had no working fsck for all of the 8 years I used it (and still
>>> didn't last year when I tried to use it on an old disk).  "Not working"
>>> here means "much less data is readable from the filesystem after running
>>> fsck than before."  It's not that much of an inconvenience if you have
>>> backups.
>>
>> For a small array, this may be the case.  Once you start looking into double
>> digit TB scale arrays though, restoring backups becomes a very expensive
>> operation.  If you had a multi-PB array with a single dentry which had no
>> inode, would you rather be spending multiple days restoring files and
>> possibly losing recent changes, or spend a few hours to check the filesystem
>> and fix it with minimal data loss?
>
> Yep restoring backups, even fully re-replicating data in a cluster, is
> untenable, it's so expensive. But even offline fsck is sufficiently
> non-scalable that at a certain volume size it's not tenable. 100TB
> takes a long time to fsck offline, and is it even possible to fsck 1PB
> Btrfs? Seems to me it's another case where, if it were possible to
> isolate what tree limbs are sick, just cut them off and report the
> data loss rather than consider the whole fs unusable. That's what we
> do with living things.
>
This is part of why I said the ZFS approach is valid.  At the moment 
though, we can't even do that, and to do it properly, we'd need a tool 
to bypass the VFS layer to prune the tree, which is non-trivial in and 
of itself.  It would be nice to have a mode in check where you could say 
'I know this path in the FS has some kind of issue, figure out what's 
wrong and fix it if possible, otherwise optionally prune that branch 
from the appropriate tree'.  On the same note, it would be nice to be 
able to manually restrict it to specific checks (eg, 'check only for 
orphaned inodes', or 'only validate the FSC/FST').  If we were to add 
such functionality, dealing with some minor corruption in a 100TB+ array 
wouldn't be quite as much of an issue.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix (was: Is stability a joke?)
  2016-09-19 15:27               ` stability matrix (was: Is stability a joke?) David Sterba
  2016-09-19 17:18                 ` stability matrix Austin S. Hemmelgarn
@ 2016-09-19 19:45                 ` Christoph Anton Mitterer
  2016-09-20  7:59                   ` Duncan
  2016-09-20  8:34                   ` David Sterba
  1 sibling, 2 replies; 93+ messages in thread
From: Christoph Anton Mitterer @ 2016-09-19 19:45 UTC (permalink / raw)
  To: dave; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 4518 bytes --]

+1 for all your changes with the following comments in addition...


On Mon, 2016-09-19 at 17:27 +0200, David Sterba wrote:
> That's more like a use case; that's out of the scope of the tabular
> overview. But we have an existing page UseCases that I'd like to
> transform into a more structured and complete overview of use cases of
> various features, so the UUID collisions would build on top of that
> with
> "and this could happen if ...".
Well I don't agree here and see it basically like Austin.

It's not that these UUID collisions can only happen in special
circumstances; they happen in plain normal situations that always used to
work with probably literally each and every fs. (So much for the
accidental corruptions).

And an attack is probably never "use case dependent"... it always
depends on the attacker.
And since that seems to be a pretty real attack vector, I'd also say
it's mandatory to quite clearly warn about that deficiency...

TBH, I'm rather surprised that this situation seems to be kinda
"accepted".

I had a chat with CM recently and he implied things might be solved
with encryption.
While this is probably the case for at least some of the described
problems, it rather seems like a workaround:
- why make btrfs encryption mandatory for devices that have partially
  secured access (e.g. where a systemdisk with btrfs is not physically
  accessible but a USB port is)
- what about users who would rather use block device encryption
  instead of fs-level-encryption?


> > - in-band dedupe
> >   deduped are IIRC not bitwise compared by the kernel before de-
> > duping,
> >   as it's the case with offline dedupe.
> >   Even if this is considered safe by the community... I think users
> >   should be told.
> Only features merged are reflected. And the out-of-band dedupe does
> full
> memcpy. See btrfs_cmp_data() called from btrfs_extent_same().
Ah,... I kinda thought it was already merged ... possibly got confused
by the countless patch iterations of it ;)


> > - btrfs check --repair (and others?)
> >   Telling people that this may often cause more harm than good.
> I think userspace tools do not belong to the overview.
Well... I wouldn't mind if there was a btrfs-progs status page... (and
both link each other).
OTOH,... the user probably wants one central point where all relevant
info can be found... and not again having to dig through n websites.


> > - even mounting a fs ro, may cause it to be changed
> 
> This would go to the UseCases
Fine for me.


> 
> > 
> > - DB/VM-image like IO patterns + nodatacow + (!)checksumming
> >   + (auto)defrag + snapshots
> >   a)
> >   People typically may have the impression:
> >   btrfs = checksummed => all is guaranteed to be "valid" (or at
> > least
> >   noticed)
> >   However this isn't the case for nodatacow'ed files, which in turn
> > is
> >   kinda "mandatory" for DB/VM-image like IO patterns, because
> > otherwise
> >   these would fragment too heavily (see (b)).
> >   Contrary to what some people claim, none of the major DBs or VM-image
> >   formats do general checksumming on their own, most even don't
> > support
> >   it, some that do wouldn't do it without app-support and few
> > "just"
> >   don't do it per default.
> >   Thus one should bump people to this situation and that they may
> > not
> >   get this "correctness" guarantee here.
> >   b)
> >   IIRC, it doesn't even help to simply not use nodatacow on such
> > files
> >   and using auto-defrag instead to countermeasure the fragmenting,
> > as
> >   that one doesn't perform too well on large files.
> 
> Same.
Fine for me either... you already said above you would mention the
nodatacow=>no-checksumming=>no-verification-and-no-raid-repair in the
general section... this is enough for that place.


> > For specific features:
> > - Autodefrag
> >   - didn't that also cause reflinks to be broken up?
> 
> No and never had.

Absolutely sure? One year ago, I was told that at first too so I
started using it, but later on some (IIRC) developer said auto-defrag
would also suffer from it.

> > - RAID*
> >   No userland tools for monitoring/etc.
> 
> That's a usability bug.

Well it is and it will probably go away sooner or later... but the
unaware user may not really realise that he actually has to take care
of this by himself for now.
So I thought it would be helpful to have it added.



Best wishes,
Chris.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5930 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-19 17:18                 ` stability matrix Austin S. Hemmelgarn
@ 2016-09-19 19:52                   ` Christoph Anton Mitterer
  2016-09-19 20:07                     ` Chris Mason
  0 siblings, 1 reply; 93+ messages in thread
From: Christoph Anton Mitterer @ 2016-09-19 19:52 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, linux-btrfs, dave

[-- Attachment #1: Type: text/plain, Size: 977 bytes --]

On Mon, 2016-09-19 at 13:18 -0400, Austin S. Hemmelgarn wrote:
> > > - even mounting a fs ro, may cause it to be changed
> > 
> > This would go to the UseCases
> My same argument about the UUID issues applies here, just without
> the 
> security aspect.

I personally could agree to have that "just" in the usecases.

That a fs may be changed even though it's mounted ro is not unique to
btrfs.... and the need for not having that happen goes probably rather
into data-forensics and rescue use cases.

IMO there's rather a general problem, namely that the different
filesystems don't provide a mount option that implies all the other mount
options currently needed to get an actual "hard ro", i.e. one where the
device is never written to.

Qu was about to add such option when nologreplay was added, but IIRC he
got some resistance by linux-fs, who probably didn't care enough
whether the end-user can easily do such "hard ro" mount ;)


Cheers,
Chris.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5930 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-19 19:52                   ` Christoph Anton Mitterer
@ 2016-09-19 20:07                     ` Chris Mason
  2016-09-19 20:36                       ` Christoph Anton Mitterer
  0 siblings, 1 reply; 93+ messages in thread
From: Chris Mason @ 2016-09-19 20:07 UTC (permalink / raw)
  To: Christoph Anton Mitterer, Austin S. Hemmelgarn, linux-btrfs, dave



On 09/19/2016 03:52 PM, Christoph Anton Mitterer wrote:
> On Mon, 2016-09-19 at 13:18 -0400, Austin S. Hemmelgarn wrote:
>>>> - even mounting a fs ro, may cause it to be changed
>>>
>>> This would go to the UseCases
>> My same argument about the UUID issues applies here, just without
>> the
>> security aspect.
>
> I personally could agree to have that "just" in the usecases.
>
> That a fs may be changed even though it's mounted ro is not unique to
> btrfs.... and the need for not having that happen goes probably rather
> into data-forensics and rescue use cases.
>
> IMO there's rather a general problem, namely that the different
> filesystems don't provide a mount option that implies all the other mount
> options currently needed to get an actual "hard ro", i.e. one where the
> device is never written to.
>
> Qu was about to add such option when nologreplay was added, but IIRC he
> got some resistance by linux-fs, who probably didn't care enough
> whether the end-user can easily do such "hard ro" mount ;)
>
>

That's in the blockdev command (blockdev --setro /dev/xxx).

We actually try to maintain the established norms where it doesn't 
conflict with the btrfs use cases.  This is one of them ;)
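
As a minimal sketch of combining the two (device and mount point are
placeholders): set the block device read-only first, then mount with
nologreplay so btrfs does not attempt log replay either.

  # refuse all writes at the block layer
  blockdev --setro /dev/sdb1
  # plain ro may still replay the log; nologreplay avoids that
  mount -o ro,nologreplay /dev/sdb1 /mnt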

-chris


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19 17:38                   ` Austin S. Hemmelgarn
  2016-09-19 18:27                     ` Chris Murphy
@ 2016-09-19 20:15                     ` Zygo Blaxell
  2016-09-20 12:09                       ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 93+ messages in thread
From: Zygo Blaxell @ 2016-09-19 20:15 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Chris Murphy, David Sterba, Waxhead, Btrfs BTRFS

[-- Attachment #1: Type: text/plain, Size: 1702 bytes --]

On Mon, Sep 19, 2016 at 01:38:36PM -0400, Austin S. Hemmelgarn wrote:
> >>I'm not sure if the btrfsck is really all that helpful to the user as much
> >>as it is for developers to better learn about the failure vectors of
> >>the file system.
> >
> >ReiserFS had no working fsck for all of the 8 years I used it (and still
> >didn't last year when I tried to use it on an old disk).  "Not working"
> >here means "much less data is readable from the filesystem after running
> >fsck than before."  It's not that much of an inconvenience if you have
> >backups.
> For a small array, this may be the case.  Once you start looking into double
> digit TB scale arrays though, restoring backups becomes a very expensive
> operation.  If you had a multi-PB array with a single dentry which had no
> inode, would you rather be spending multiple days restoring files and
> possibly losing recent changes, or spend a few hours to check the filesystem
> and fix it with minimal data loss?

I'd really prefer to be able to delete the dead dentry with 'rm' as root,
or failing that, with a ZDB-like tool or ioctl, if it's the only known
instance of such a bad metadata object and I already know where it's
located.

Usually the ultimate failure mode of a btrfs filesystem is a read-only
filesystem from which you can read most or all of your data, but you
can't ever make it writable again because of fsck limitations.

The one thing I do miss about every filesystem that isn't ext2/ext3 is
automated fsck that prioritizes availability, making the filesystem
safely writable even if it can't recover lost data.  On the other
hand, fixing an ext[23] filesystem is utterly trivial compared to any
btree-based filesystem.


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-19 20:07                     ` Chris Mason
@ 2016-09-19 20:36                       ` Christoph Anton Mitterer
  2016-09-19 21:03                         ` Chris Mason
  0 siblings, 1 reply; 93+ messages in thread
From: Christoph Anton Mitterer @ 2016-09-19 20:36 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 407 bytes --]

On Mon, 2016-09-19 at 16:07 -0400, Chris Mason wrote:
> That's in the blockdev command (blockdev --setro /dev/xxx).
Well, I know that ;-) ... but I bet most end-users don't (just as most
end-users assume mount -r is truly ro)...

At least this is nowadays documented in the mount manpage... so in a
way one can of course argue: if the user can't read you can't help him
anyway... :)

Cheers,
Chris.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5930 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix
  2016-09-19 20:36                       ` Christoph Anton Mitterer
@ 2016-09-19 21:03                         ` Chris Mason
  0 siblings, 0 replies; 93+ messages in thread
From: Chris Mason @ 2016-09-19 21:03 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



On 09/19/2016 04:36 PM, Christoph Anton Mitterer wrote:
> On Mon, 2016-09-19 at 16:07 -0400, Chris Mason wrote:
>> That's in the blockdev command (blockdev --setro /dev/xxx).
> Well, I know that ;-) ... but I bet most end-users don't (just as most
> end-users assume mount -r is truly ro)...
>

It's a tradeoff: without the log replay, traditional filesystems wouldn't 
be able to mount at all after a crash.  Since most init systems default 
to ro at the beginning, it would have been awkward to introduce the 
logging filesystems into established systems.

16+ years later, I still feel it's the path of least surprise.

-chris

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-19 15:38         ` Is stability a joke? David Sterba
@ 2016-09-19 21:25           ` Hans van Kranenburg
  0 siblings, 0 replies; 93+ messages in thread
From: Hans van Kranenburg @ 2016-09-19 21:25 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, dsterba, Waxhead, linux-btrfs

On 09/19/2016 05:38 PM, David Sterba wrote:
> On Mon, Sep 12, 2016 at 01:31:42PM -0400, Austin S. Hemmelgarn wrote:
>> [...] A lot of stuff that may seem obvious to us after years of 
>> working with BTRFS isn't going to be to a newcomer, and it's a lot more 
>> likely that some random person will get things right if we have a good, 
>> central BCP document than if it stays as scattered tribal knowledge.
> 
> The IRC tribe answers the same newcomer questions over and over, which
> is fine for the interaction itself, but if all that also ended up in the
> wiki we'd have had perfect documentation years ago.

Yes, it's not the first time I'm thinking "wow, this #btrfs irc log I
have here is a goldmine of very useful information". Transforming it
into concise usable text on a wiki is a lot of work, but there's
certainly a "turnover" point that can be reached quite fast (I guess).

OTOH, the same happens on the mailing list, where I also see lots of
similar things answered over and over again, and a lot of treasures
being buried and forgotten.

> Also the current status
> of features and bugs is kept in the IRC hive-mind, yet it still needs
> some other way to actually make it appear on the wiki. Edit with courage!

Oh, right there at the end, I expected: Join #btrfs on freenode IRC! :-D

-- 
Hans van Kranenburg

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix (was: Is stability a joke?)
  2016-09-19 19:45                 ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
@ 2016-09-20  7:59                   ` Duncan
  2016-09-20  8:19                     ` Hugo Mills
  2016-09-20  8:34                   ` David Sterba
  1 sibling, 1 reply; 93+ messages in thread
From: Duncan @ 2016-09-20  7:59 UTC (permalink / raw)
  To: linux-btrfs

Christoph Anton Mitterer posted on Mon, 19 Sep 2016 21:45:46 +0200 as
excerpted:

> On Mon, 2016-09-19 at 17:27 +0200, David Sterba wrote:
> 
>> > For specific features:
>> > - Autodefrag
>> >   - didn't that also cause reflinks to be broken up?
>> 
>> No and never had.
> 
> Absolutely sure? One year ago, I was told that at first too so I started
> using it, but later on some (IIRC) developer said auto-defrag would also
> suffer from it.

AFAIK it was Hugo that said he looked into that, and that (if I'm 
representing it correctly) autodefrag breaks reflinks and triggers space-
using duplication much as defrag does, but that it does it on a much 
smaller scale, since it (1) only triggers when some parts of a file are 
being rewritten anyway, thus breaking the reflink for those specific 
parts of the file due to COW (COW1 on otherwise NOCOW files) in any case, 
and (2) unlike defrag, doesn't rewrite and thus break the reflinks on 
entire files, just somewhat larger extents than the pure rewrite by 
itself without autodefrag would.

Thus making the reflink-breaking and duplication effect of autodefrag 
there, but relatively quite small compared to on-demand per-file defrag.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix (was: Is stability a joke?)
  2016-09-20  7:59                   ` Duncan
@ 2016-09-20  8:19                     ` Hugo Mills
  0 siblings, 0 replies; 93+ messages in thread
From: Hugo Mills @ 2016-09-20  8:19 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 2180 bytes --]

On Tue, Sep 20, 2016 at 07:59:44AM +0000, Duncan wrote:
> Christoph Anton Mitterer posted on Mon, 19 Sep 2016 21:45:46 +0200 as
> excerpted:
> 
> > On Mon, 2016-09-19 at 17:27 +0200, David Sterba wrote:
> > 
> >> > For specific features:
> >> > - Autodefrag
> >> >   - didn't that also cause reflinks to be broken up?
> >> 
> >> No and never had.
> > 
> > Absolutely sure? One year ago, I was told that at first too so I started
> > using it, but later on some (IIRC) developer said auto-defrag would also
> > suffer from it.
> 
> AFAIK it was Hugo that said he looked into that, and that (if I'm 
> representing it correctly) autodefrag breaks reflinks and triggers space-
> using duplication much as defrag does, but that it does it on a much 
> smaller scale, since it (1) only triggers when some parts of a file are 
> being rewritten anyway, thus breaking the reflink for those specific 
> parts of the file due to COW (COW1 on otherwise NOCOW files) in any case, 
> and (2) unlike defrag, doesn't rewrite and thus break the reflinks on 
> entire files, just somewhat larger extents than the pure rewrite by 
> itself without autodefrag would.
> 
> Thus making the reflink-breaking and duplication effect of autodefrag 
> there, but relatively quite small compared to on-demand per-file defrag.

   I didn't investigate it -- It was my firmly-stated misunderstanding
which caused someone (Filipe, I think) with much more actual knowledge
to correct me, thus making the actual behaviour much clearer. :)

   I think your description is accurate as far as my current
understanding goes.

   Hugo.

> -- 
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Hugo Mills             | There isn't a noun that can't be verbed.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: stability matrix (was: Is stability a joke?)
  2016-09-19 19:45                 ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
  2016-09-20  7:59                   ` Duncan
@ 2016-09-20  8:34                   ` David Sterba
  1 sibling, 0 replies; 93+ messages in thread
From: David Sterba @ 2016-09-20  8:34 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: linux-btrfs

On Mon, Sep 19, 2016 at 09:45:46PM +0200, Christoph Anton Mitterer wrote:
> +1 for all your changes with the following comments in addition...
> 
> 
> On Mon, 2016-09-19 at 17:27 +0200, David Sterba wrote:
> > That's more like a use case; that's out of the scope of the tabular
> > overview. But we have an existing page UseCases that I'd like to
> > transform into a more structured and complete overview of use cases of
> > various features, so the UUID collisions would build on top of that
> > with
> > "and this could happen if ...".
> Well I don't agree here and see it basically like Austin.

So we'd have to make that two separate topics so the "what if" has
better visibility, and possibly marked "with security implications".

> It's not that these UUID collisions can only happen in special
> circumstances; they happen in plain normal situations that always used to
> work with probably literally each and every fs. (So much for the
> accidental corruptions).
> 
> And an attack is probably never "use case dependent"... it always
> depends on the attacker.
> And since that seems to be a pretty real attack vector, I'd also say
> it's mandatory to quite clearly warn about that deficiency...
> 
> TBH, I'm rather surprised that this situation seems to be kinda
> "accepted".
> 
> I had a chat with CM recently and he implied things might be solved
> with encryption.
> While this is probably the case for at least some of the described
> problems, it rather seems like a workaround:
> - why make btrfs encryption mandatory for devices that have partially
>   secured access (e.g. where a systemdisk with btrfs is not physically
>   accessible but a USB port is)
> - what about users who would rather use block device encryption
>   instead of fs-level-encryption?
> 
> 
> > > - in-band dedupe
> > >   deduped are IIRC not bitwise compared by the kernel before de-
> > > duping,
> > >   as it's the case with offline dedupe.
> > >   Even if this is considered safe by the community... I think users
> > >   should be told.
> > Only features merged are reflected. And the out-of-band dedupe does
> > full
> > memcpy. See btrfs_cmp_data() called from btrfs_extent_same().
> Ah,... I kinda thought it was already merged ... possibly got confused
> by the countless patch iterations of it ;)
> 
> 
> > > - btrfs check --repair (and others?)
> > >   Telling people that this may often cause more harm than good.
> > I think userspace tools do not belong to the overview.
> Well... I wouldn't mind if there was a btrfs-progs status page... (and
> both link each other).
> OTOH,... the user probably wants one central point where all relevant
> info can be found... and not again having to dig through n websites.

The Status page should give enough overview about all main topics, so
the progs can be one section there. Any details should go to separate
pages and be linked from there.

> > > - even mounting a fs ro, may cause it to be changed
> > 
> > This would go to the UseCases
> Fine for me.
> 
> 
> > 
> > > 
> > > - DB/VM-image like IO patterns + nodatacow + (!)checksumming
> > >   + (auto)defrag + snapshots
> > >   a)
> > >   People typically may have the impression:
> > >   btrfs = checksummed => all is guaranteed to be "valid" (or at
> > > least
> > >   noticed)
> > >   However this isn't the case for nodatacow'ed files, which in turn
> > > is
> > >   kinda "mandatory" for DB/VM-image like IO patterns, because
> > > otherwise
> > >   these would fragment too heavily (see (b)).
> > >   Contrary to what some people claim, none of the major DBs or VM-image
> > >   formats do general checksumming on their own, most even don't
> > > support
> > >   it, some that do wouldn't do it without app-support and few
> > > "just"
> > >   don't do it per default.
> > >   Thus one should bump people to this situation and that they may
> > > not
> > >   get this "correctness" guarantee here.
> > >   b)
> > >   IIRC, it doesn't even help to simply not use nodatacow on such
> > > files
> > >   and using auto-defrag instead to countermeasure the fragmenting,
> > > as
> > >   that one doesn't perform too well on large files.
> > 
> > Same.
> Fine for me either... you already said above you would mention the
> nodatacow=>no-checksumming=>no-verification-and-no-raid-repair in the
> general section... this is enough for that place.
> 
> 
> > > For specific features:
> > > - Autodefrag
> > >   - didn't that also cause reflinks to be broken up?
> > 
> > No and never had.
> 
> Absolutely sure? One year ago, I was told that at first too so I
> started using it, but later on some (IIRC) developer said auto-defrag
> would also suffer from it.

Reading the subthread, I have to change the statement. Autodefrag can
read surrounding blocks up to 64k and write them to a new location; on
that write the reflinks will get broken. I'll update the page.

> 
> > > - RAID*
> > >   No userland tools for monitoring/etc.
> > 
> > That's a usability bug.
> 
> Well it is and it will probably go away sooner or later... but the
> unaware user may not really realise that he actually has to take care
> of this by himself for now.
> So I thought it would be helpful to have it added.

I agree and I'm not dismissing it; we need a section for such topics
(like system support tools for automatic snapshotting, monitoring etc).

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke? (wiki updated)
  2016-09-19 20:15                     ` Zygo Blaxell
@ 2016-09-20 12:09                       ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 93+ messages in thread
From: Austin S. Hemmelgarn @ 2016-09-20 12:09 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: Chris Murphy, David Sterba, Waxhead, Btrfs BTRFS

On 2016-09-19 16:15, Zygo Blaxell wrote:
> On Mon, Sep 19, 2016 at 01:38:36PM -0400, Austin S. Hemmelgarn wrote:
>>>> I'm not sure if the btrfsck is really all that helpful to the user as much
>>>> as it is for developers to better learn about the failure vectors of
>>>> the file system.
>>>
>>> ReiserFS had no working fsck for all of the 8 years I used it (and still
>>> didn't last year when I tried to use it on an old disk).  "Not working"
>>> here means "much less data is readable from the filesystem after running
>>> fsck than before."  It's not that much of an inconvenience if you have
>>> backups.
>> For a small array, this may be the case.  Once you start looking into double
>> digit TB scale arrays though, restoring backups becomes a very expensive
>> operation.  If you had a multi-PB array with a single dentry which had no
>> inode, would you rather be spending multiple days restoring files and
>> possibly losing recent changes, or spend a few hours to check the filesystem
>> and fix it with minimal data loss?
>
> I'd really prefer to be able to delete the dead dentry with 'rm' as root,
> or failing that, with a ZDB-like tool or ioctl, if it's the only known
> instance of such a bad metadata object and I already know where it's
> located.
I entirely agree on that.  The problem is that because the VFS layer 
chokes on it, it can't be done with plain rm, and it would be non-trivial to implement 
as an ioctl.  It pretty much has to be out-of-band.  I'd love to see 
btrfs check add the ability to process subsets of the filesystem (for 
example 'I know that something is screwed up somehow in 
/path/to/random/directory, check only that path in the filesystem 
(possibly recursively) and tell me what's wrong (and possibly try to fix 
it)').
>
> Usually the ultimate failure mode of a btrfs filesystem is a read-only
> filesystem from which you can read most or all of your data, but you
> can't ever make it writable again because of fsck limitations.
>
> The one thing I do miss about every filesystem that isn't ext2/ext3 is
> automated fsck that prioritizes availability, making the filesystem
> safely writable even if it can't recover lost data.  On the other
> hand, fixing an ext[23] filesystem is utterly trivial compared to any
> btree-based filesystem.
For a data center or corporate entity, dropping broken parts of the FS 
and recovering from backups makes sense.  For a traditional home user 
(that is, the type of person Ubuntu and Windows traditionally target), 
it usually doesn't, as they almost certainly don't have a backup. 
Personally, I'd rather have a tool that gives me the option of whether 
to try and fix a given path or just remove it, instead of assuming that 
it knows how I want to fix it.  That would allow for both use cases.


^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: Is stability a joke?
  2016-09-11 17:46           ` Marc MERLIN
@ 2016-09-20 16:33             ` Chris Murphy
  0 siblings, 0 replies; 93+ messages in thread
From: Chris Murphy @ 2016-09-20 16:33 UTC (permalink / raw)
  To: Btrfs BTRFS; +Cc: Waxhead, Martin Steigerwald, Marc MERLIN

btrfs-convert has been rewritten as of btrfs-progs 4.6, and therefore
the conversion page could use an update:
https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3

Anyone wanting to update the page should note that the code is new, check
the changelog, advise that the latest btrfs-progs version should be used,
and that there still may be edge cases:
https://btrfs.wiki.kernel.org/index.php/Changelog

Also, the status page doesn't mention the convert feature.
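
For anyone updating the page, a minimal sketch of the documented workflow
(device name is a placeholder; the source filesystem must be unmounted and
a backup is still strongly advised):

  # convert an unmounted ext4 filesystem in place
  btrfs-convert /dev/sdb1
  # mount and inspect; the original filesystem is kept as an image in the
  # ext2_saved subvolume until you decide to keep the result
  mount /dev/sdb1 /mnt
  # roll back to the original filesystem if needed (unmount first)
  btrfs-convert -r /dev/sdb1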


Chris Murphy

^ permalink raw reply	[flat|nested] 93+ messages in thread

end of thread, other threads:[~2016-09-20 16:33 UTC | newest]

Thread overview: 93+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-11  8:55 Is stability a joke? Waxhead
2016-09-11  9:56 ` Steven Haigh
2016-09-11 10:23 ` Martin Steigerwald
2016-09-11 11:21   ` Zoiled
2016-09-11 11:43     ` Martin Steigerwald
2016-09-11 12:05       ` Martin Steigerwald
2016-09-11 12:39         ` Waxhead
2016-09-11 13:02           ` Hugo Mills
2016-09-11 14:59             ` Martin Steigerwald
2016-09-11 20:14             ` Chris Murphy
2016-09-12 12:20             ` Austin S. Hemmelgarn
2016-09-12 12:59               ` Michel Bouissou
2016-09-12 13:14                 ` Austin S. Hemmelgarn
2016-09-12 14:04                 ` Lionel Bouton
2016-09-15  1:05               ` Nicholas D Steeves
2016-09-15  8:02                 ` Martin Steigerwald
2016-09-16  7:13                 ` Helmut Eller
2016-09-15  5:55               ` Kai Krakow
2016-09-15  8:05                 ` Martin Steigerwald
2016-09-11 14:54           ` Martin Steigerwald
2016-09-11 15:19             ` Martin Steigerwald
2016-09-11 20:21             ` Chris Murphy
2016-09-11 17:46           ` Marc MERLIN
2016-09-20 16:33             ` Chris Murphy
2016-09-11 17:11         ` Duncan
2016-09-12 12:26           ` Austin S. Hemmelgarn
2016-09-11 12:30       ` Waxhead
2016-09-11 14:36         ` Martin Steigerwald
2016-09-12 12:48   ` Swâmi Petaramesh
2016-09-12 13:53 ` Chris Mason
2016-09-12 17:36   ` Zoiled
2016-09-12 17:44     ` Waxhead
2016-09-15  1:12     ` Nicholas D Steeves
2016-09-12 14:27 ` David Sterba
2016-09-12 14:54   ` Austin S. Hemmelgarn
2016-09-12 16:51     ` David Sterba
2016-09-12 17:31       ` Austin S. Hemmelgarn
2016-09-15  1:07         ` Nicholas D Steeves
2016-09-15  1:13           ` Steven Haigh
2016-09-15  2:14             ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
2016-09-15  9:49               ` stability matrix Hans van Kranenburg
2016-09-15 11:54                 ` Austin S. Hemmelgarn
2016-09-15 14:15                   ` Chris Murphy
2016-09-15 14:56                   ` Martin Steigerwald
2016-09-19 14:38                   ` David Sterba
2016-09-19 15:27               ` stability matrix (was: Is stability a joke?) David Sterba
2016-09-19 17:18                 ` stability matrix Austin S. Hemmelgarn
2016-09-19 19:52                   ` Christoph Anton Mitterer
2016-09-19 20:07                     ` Chris Mason
2016-09-19 20:36                       ` Christoph Anton Mitterer
2016-09-19 21:03                         ` Chris Mason
2016-09-19 19:45                 ` stability matrix (was: Is stability a joke?) Christoph Anton Mitterer
2016-09-20  7:59                   ` Duncan
2016-09-20  8:19                     ` Hugo Mills
2016-09-20  8:34                   ` David Sterba
2016-09-19 15:38         ` Is stability a joke? David Sterba
2016-09-19 21:25           ` Hans van Kranenburg
2016-09-12 16:27   ` Is stability a joke? (wiki updated) David Sterba
2016-09-12 16:56     ` Austin S. Hemmelgarn
2016-09-12 17:29       ` Filipe Manana
2016-09-12 17:42         ` Austin S. Hemmelgarn
2016-09-12 20:08       ` Chris Murphy
2016-09-13 11:35         ` Austin S. Hemmelgarn
2016-09-15 18:01           ` Chris Murphy
2016-09-15 18:20             ` Austin S. Hemmelgarn
2016-09-15 19:02               ` Chris Murphy
2016-09-15 20:16                 ` Hugo Mills
2016-09-15 20:26                   ` Chris Murphy
2016-09-16 12:00                     ` Austin S. Hemmelgarn
2016-09-19  2:57                       ` Zygo Blaxell
2016-09-19 12:37                         ` Austin S. Hemmelgarn
2016-09-19  4:08                 ` Zygo Blaxell
2016-09-19 15:27                   ` Sean Greenslade
2016-09-19 17:38                   ` Austin S. Hemmelgarn
2016-09-19 18:27                     ` Chris Murphy
2016-09-19 18:34                       ` Austin S. Hemmelgarn
2016-09-19 20:15                     ` Zygo Blaxell
2016-09-20 12:09                       ` Austin S. Hemmelgarn
2016-09-15 21:23               ` Christoph Anton Mitterer
2016-09-16 12:13                 ` Austin S. Hemmelgarn
2016-09-19  3:47       ` Zygo Blaxell
2016-09-19 12:32         ` Austin S. Hemmelgarn
2016-09-19 15:33           ` Zygo Blaxell
2016-09-12 19:57     ` Martin Steigerwald
2016-09-12 20:21       ` Pasi Kärkkäinen
2016-09-12 20:35         ` Martin Steigerwald
2016-09-12 20:44           ` Chris Murphy
2016-09-13 11:28             ` Austin S. Hemmelgarn
2016-09-13 11:39               ` Martin Steigerwald
2016-09-14  5:53             ` Marc Haber
2016-09-12 20:48         ` Waxhead
2016-09-13  8:38           ` Timofey Titovets
2016-09-13 11:26             ` Austin S. Hemmelgarn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.