* Re: It's broke, Jim. BTRFS mounted read only after corruption errors
From: Duncan @ 2021-09-01  2:45 UTC (permalink / raw)
  To: Martin Steigerwald, Btrfs BTRFS

Martin Steigerwald posted on Sun, 22 Aug 2021 13:14:39 +0200 as
excerpted:

> This might be a sequel of:
> 
> Corruption errors on Samsung 980 Pro
> 
> https://lore.kernel.org/linux-btrfs/2729231.WZja5ltl65@ananda/

I saw on the previous thread some discussion of trim/discard but lost 
track of whether you're still trying to enable it in the mount options
or not.

I'd suggest *NOT* enabling trim/discard on any samsung SSDs unless you 
are extremely confident that it is well tested and known to work on
your particular model, because...

I have a samsung 850 evo and did some earlier research on trim/discard 
for it.

What I found was that at least for earlier samsung ssds, queued-trim
had been found not to work safely, with a number of bugs filed over
the years, resulting in samsung ssds (and a few others) being
blacklisted for queued-trim in the kernel.  Back when I did my
research at least, the blacklist covered all samsung ssd models, the
problem being that they claimed sata 3.1 compliance, which requires
queued-trim, but didn't actually handle it as a queued command.  When
trim isn't queued, the queued command stream must be flushed before a
discard, with the discard and then another flush issued to ensure
proper write ordering, before the queued command stream can resume.

In theory the black-listing should mean the kernel does the right
thing and trim is simply slower, but slower is still slower, so not
enabling the discard mount option is probably a good idea anyway.
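
To check whether any currently mounted filesystem has an online
discard option enabled, something like this works (a sketch; the
pattern matches plain discard as well as discard=async/sync):

```shell
# Print each mounted filesystem whose mount options include an online
# discard variant (discard, discard=async, discard=sync); field 4 of
# /proc/mounts is the option list.  Empty output means no mounted
# filesystem is issuing online discards.
awk '$4 ~ /(^|,)discard(=|,|$)/ { print $1, "on", $2 }' /proc/mounts
```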

Now it's quite possible that your newer 980 pro model handles
queued-trim properly, but it's also possible that it still doesn't,
while the kernel blacklist might have been updated assuming it does, or
that the blacklist isn't applying for some reason. And given that you're
seeing problems, probably better safe than sorry. I'd leave discard
disabled.

Another consideration for btrfs is the older root-blocks, which
aren't normally immediately overwritten and thus remain available for
repair/recovery should that be necessary.  Because they're technically
no longer in use, the discard mount option clears these along with
other unused blocks, so they're no longer an option for
repair/recovery. =:^(
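
Those backup roots live in the superblock and can be listed with
btrfs-progs; a sketch, with the device path and the existence guard
purely illustrative:

```shell
# dump-super -f prints the superblock's backup_roots[] slots; these
# are what mount -o usebackuproot can fall back on after corruption,
# unless discard has already cleared the blocks they point at.
DEV=${DEV:-/dev/sda2}   # example device, adjust for your system
if command -v btrfs >/dev/null 2>&1 && [ -b "$DEV" ]; then
    btrfs inspect-internal dump-super -f "$DEV" | grep backup_tree_root
fi
```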

The alternative (beyond possibly deliberately leaving some unpartitioned
free-space for the ssd wear-leveling algorithm to work with, in
addition to the unreported space it already reserves for that
purpose) is fstrim.

At least on my systemd-option gentoo, there's a weekly fstrim scheduled 
(see fstrim.service and fstrim.timer, owned by the util-linux package), 
tho I don't recall whether I had to enable it or whether it was enabled 
automatically.

Tho it's worth noting that the default fstrim.service apparently (based 
on my logs) only trims filesystems mounted read-write when it runs.  I 
have several filesystems not mounted by default (backups and /boot 
mostly), and my / is mounted read-only by default, and they don't get 
fstrimmed.  But the backups tend to be mkfs.btrfs, mount, backup, 
unmount, with few or no writes after the backup, and mkfs.btrfs already 
does a trim to clear the partition before it does the mkfs, so there's 
little to trim there.  / and /boot get more writes, but /boot is
sub-GiB and / is only 8 GiB, trivial when I've several hundred GiB of
the 1 TB ssd entirely unpartitioned for the ssd firmware to wear-level
with, so I'm not too worried.  But it's something to be aware of, and
a reason to consider modifying the scheduled commandline if necessary
for your use-case.
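
On systemd systems, one way to widen what the scheduled run covers is
a drop-in override of fstrim.service.  A sketch, writing to a local
demo directory (on a real system the drop-in dir would be
/etc/systemd/system/fstrim.service.d, followed by systemctl
daemon-reload), and assuming util-linux fstrim's --fstab option, which
trims the mounted filesystems listed in /etc/fstab:

```shell
# Demo drop-in dir; real systems: /etc/systemd/system/fstrim.service.d
DROPIN=./fstrim.service.d
mkdir -p "$DROPIN"
# Clear the packaged ExecStart, then run fstrim over everything listed
# in fstab (each filesystem must still be mounted for the FITRIM ioctl
# to reach it, and a read-only mount may still refuse the trim).
printf '[Service]\nExecStart=\nExecStart=/usr/sbin/fstrim --fstab --verbose\n' \
    > "$DROPIN/override.conf"
cat "$DROPIN/override.conf"
```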

Something else I used to wonder about was whether fstrim handled all
devices of a multi-device btrfs, or just the specific device it was
pointed at (the one mount said was mounted, in the case of the
automatic runs).  While the log only lists the one device as
fstrimmed, the reported space trimmed is the free space of the entire
filesystem: most of my btrfs are pair-device raid1, and double the
free space of the one device is reported as trimmed.  So it does
appear to trim the free space on all devices of the filesystem,
despite only listing one in the log.

As for backup root-blocks, fstrim will still clear those too.  But
since it runs just once a week, on any filesystem with routine writes
the window in which there's not at least a couple of backup
root-blocks available is going to be reasonably small, likely worth
the trivial incremental risk for anyone following the sysadmin's rule
that the value of data is defined by the number (and freshness) of
the backups it's considered valuable enough to have.  And filesystems
without a lot of writes face less risk of damage during any potential
crash within that window anyway, so again, worth the trivial
incremental risk, especially compared to that of using the discard
mount option.

-- 
Duncan - No HTML messages please; they are filtered as spam.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: It's broke, Jim. BTRFS mounted read only after corruption errors
From: Martin Steigerwald @ 2021-09-04 13:02 UTC (permalink / raw)
  To: Btrfs BTRFS, Duncan

Hi Duncan.

Duncan - 01.09.21, 04:45:01 CEST:
> Martin Steigerwald posted on Sun, 22 Aug 2021 13:14:39 +0200 as
> 
> excerpted:
> > This might be a sequel of:
> > 
> > Corruption errors on Samsung 980 Pro
> > 
> > https://lore.kernel.org/linux-btrfs/2729231.WZja5ltl65@ananda/
> 
> I saw on the previous thread some discussion of trim/discard but lost
> track of whether you're still trying to enable it in the mount options
> or not.

I have it enabled for the Samsung 980 Pro, but still disabled for the
Samsung 860 SSD, which seems to make sense, considering:

Samsung 860/870 SSDs Continue Causing Problems For Linux Users

https://www.phoronix.com/scan.php?page=news_item&px=Samsung-860-870-More-Quirks

I may just avoid those drives in the future.

> I'd suggest *NOT* enabling trim/discard on any samsung SSDs unless you
> are extremely confident that it is well tested and known to work on
> your particular model, because...

Well, since I stopped hibernating the ThinkPad T14 AMD Gen 1, I have
had no further issues.

> Now it's quite possible that your newer 980 pro model handles
> queued-trim properly, but it's also possible that it still doesn't,
> while the kernel blacklist might have been updated assuming it does,
> or that the blacklist isn't applying for some reason. And given that
> you're seeing problems, probably better safe than sorry. I'd leave
> discard disabled.

Well, you could be right there.  But since I have had no further
issues, queued trim may just work with the Samsung 980 Pro.

> Another consideration for btrfs is the older root-blocks, which
> aren't normally immediately overwritten and thus remain available
> for repair/recovery should that be necessary.  Because they're
> technically no longer in use, the discard mount option clears these
> along with other unused blocks, so they're no longer an option for
> repair/recovery. =:^(

Hmmm, that is an interesting consideration for using fstrim -av from
a cron job and, in case of a corruption, hoping that it did not happen
just when the cron job triggered.

However, I tend toward eventually putting another 42 mm M.2 SSD into
the laptop, and thus skipping the 4G modem.  Then I could protect
critical data with BTRFS RAID1, or with BTRFS send/receive, or hourly
backups.  BTRFS RAID1 actually saved me from data loss on the old
ThinkPad T520, where there were checksum errors on the Crucial m500
mSATA SSD after a sudden power loss.

> The alternative (beyond possibly deliberately leaving some
> unpartitioned free-space for the ssd wear-leveling algorithm to work
> with, in addition to the unreported space it already reserves for
> that purpose) is fstrim.

I always leave some space for that as long as I have some left, and at
the moment I have plenty.  I think I will remove the 300G homedefect
LV soon enough, in case no BTRFS developer wants further debug data
from the defective filesystem.

> At least on my systemd-option gentoo, there's a weekly fstrim
> scheduled (see fstrim.service and fstrim.timer, owned by the
> util-linux package), tho I don't recall whether I had to enable it or
> whether it was enabled automatically.

No systemd here, but I can do a cron job.
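
For the cron route, a weekly entry could look like this (a config
sketch; schedule, fstrim path and logging are examples):

```
# Example root crontab entry: trim all supported mounted filesystems
# early Sunday morning, sending fstrim's output to syslog.
# m  h  dom mon dow  command
30   3  *   *   0    /sbin/fstrim -av 2>&1 | logger -t fstrim
```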

Well, as it works at the moment as long as I avoid hibernation, I
think I will keep it that way.  And with Linux 5.14, the deeper sleep
state (S0ix), labeled as "Windows 10" mode in the firmware settings,
seems to work well enough.  I'd still prefer true hibernation, but it
is just not stable enough on this machine at the moment.

Best,
-- 
Martin



