Bcache is not caching anything. cache state=inconsistent, how to clear?

From: Tobiasz Karoń @ 2021-11-23 14:48 UTC
  To: linux-bcache

Hi!

TL;DR

My cache is inconsistent, and that's probably preventing Bcache from
using it (all I/O goes to the backing device). How can I clear that?

Details:

I've been using Bcache for the past few months on my root Btrfs
filesystem with success.
Then one day out of the blue Bcache failed and took my Btrfs
filesystem with it (details:
https://www.youtube.com/watch?v=Hf3zr6CxvmI, looks similar to this:
https://stackoverflow.com/questions/22820492/how-to-revert-bcache-device-to-regular-device).
That's not the topic of my message though.
I've done a clean Arch Linux installation on Bcache + Btrfs once again
using an SSD partition for cache and an HDD as the backing device.

However, this time it doesn't do anything...
I was unable to find any information online to solve this.

My Bcache device works fine, the system boots off of it. However all
I/O goes straight to the backing HDD, and the SSD is unused. Needless
to say this means the performance is not what I got used to when
Bcache was working fine.

Here's what a 3rd party bcache-status script says (it'd be great if
bcache-tools would provide something like this, BTW):

❯ bcache-status
--- bcache ---
Device                      ? (?)
UUID                        c9cd8259-3cee-42ff-a8ec-e11193c09b7e
Block Size                  0.50KiB
Bucket Size                 512.00KiB
Congested?                  False
Read Congestion             2.0ms
Write Congestion            20.0ms
Total Cache Size            173.97GiB
Total Cache Used            8.70GiB     (5%)
Total Cache Unused          165.27GiB   (95%)
Dirty Data                  0.50KiB     (0%)
Evictable Cache             173.97GiB   (100%)
Replacement Policy          [lru] fifo random
Cache Mode                  (Unknown)
Total Hits                  0
Total Misses                0
Total Bypass Hits           0
Total Bypass Misses         0
Total Bypassed              0B

The Total Cache Used value has not changed since I did my initial
Arch Linux installation. It seems that Bcache "turned off" at that
point.
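
For reference, the same information can be read without a third-party
script, because bcache exposes it as plain files in sysfs. A minimal
Python sketch, assuming the device is named bcache0 (that name and the
helper read_bcache_attrs are assumptions for illustration, not part of
bcache-tools):

```python
from pathlib import Path

# Attributes the kernel's bcache guide documents under
# /sys/block/<bcacheN>/bcache/ for a backing device.
ATTRS = ("state", "cache_mode", "dirty_data")


def read_bcache_attrs(bdev_dir="/sys/block/bcache0/bcache"):
    """Return {attribute: text} for the attributes that exist under bdev_dir.

    bdev_dir defaults to bcache0, which is an assumption; adjust it to
    the device actually present on the system.
    """
    base = Path(bdev_dir)
    result = {}
    for name in ATTRS:
        path = base / name
        if path.is_file():
            result[name] = path.read_text().strip()
    return result


if __name__ == "__main__":
    for name, value in read_bcache_attrs().items():
        print(f"{name}: {value}")
```

According to the kernel documentation, state is one of "no cache",
"clean", "dirty" or "inconsistent", so it should show directly whether
the backing device is attached to a cache set at all.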

Here are the bcache superblocks for the backing device and the cache:

❯ bcache-super-show /dev/sda
sb.magic                ok
sb.first_sector         8 [match]
sb.csum                 4E6EACCA74AB0AE5 [match]
sb.version              1 [backing device]

dev.label               unfa-desktop%20root
dev.uuid                49202fdf-fbe5-48fd-bdd8-df5414da817c
dev.sectors_per_block   8
dev.sectors_per_bucket  1024
dev.data.first_sector   16
dev.data.cache_mode     0 [writethrough]
dev.data.cache_state    3 [inconsistent]

cset.uuid               9572380e-8e6f-4ce4-8323-80b98a85eeed

❯ bcache-super-show /dev/sdd3
sb.magic                ok
sb.first_sector         8 [match]
sb.csum                 259C90FD74B4D4BE [match]
sb.version              3 [cache device]

dev.label               (empty)
dev.uuid                95c6449a-03b5-40f2-a8cc-80b1b61c5ef0
dev.sectors_per_block   1
dev.sectors_per_bucket  1024
dev.cache.first_sector  1024
dev.cache.cache_sectors 364833792
dev.cache.total_sectors 364834816
dev.cache.ordered       yes
dev.cache.discard       no
dev.cache.pos           0
dev.cache.replacement   0 [lru]

cset.uuid               c9cd8259-3cee-42ff-a8ec-e11193c09b7e
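
The two dumps can also be compared mechanically. A small Python sketch,
assuming the "key  value [note]" layout printed above and 512-byte
sectors (the helper names parse_super and sectors_to_gib are
hypothetical):

```python
def parse_super(text):
    """Parse bcache-super-show output into {key: value}, dropping any [note]."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank line, or a key without a value
        key, value = parts
        fields[key] = value.split(" [", 1)[0].strip()
    return fields


def sectors_to_gib(sectors, sector_bytes=512):
    """Convert a sector count to GiB (512-byte sectors assumed by default)."""
    return sectors * sector_bytes / 2**30


# Relevant lines copied from the two dumps above.
backing = parse_super("""\
dev.data.cache_state    3 [inconsistent]
cset.uuid               9572380e-8e6f-4ce4-8323-80b98a85eeed
""")
cache = parse_super("""\
dev.cache.cache_sectors 364833792
cset.uuid               c9cd8259-3cee-42ff-a8ec-e11193c09b7e
""")

print("same cache set:", backing["cset.uuid"] == cache["cset.uuid"])
size = sectors_to_gib(int(cache["dev.cache.cache_sectors"]))
print(f"cache size: {size:.2f} GiB")
```

For the values above the cset.uuid comparison comes out False: the
backing device's superblock refers to cache set 9572380e-..., while the
SSD partition carries c9cd8259-.... The computed size agrees with the
173.97GiB that bcache-status reports. Whether the UUID mismatch is the
actual cause is not something I can confirm, but it looks worth ruling
out.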

BTW - I've now realized I set a label for the backing device but not
for the cache. Maybe this is the reason? I don't think it should work
this way, but I've cleared the label on my backing device just to be
sure.

Hmm. The cache is inconsistent. I had this before I reinstalled my OS.
I have recreated the bcache cache on the SSD and was hoping that would
solve it.
I don't know what I should do with this. Is this the reason why it's
not working?

I was wondering if wiping the partition and recreating the cache
would help, but I don't want to needlessly wear down the SSD if that
won't help.

Needless to say, I would really like to avoid data loss when using
Bcache - it's awesome, and the developer says it's perfectly stable
and safe, but I've had a sudden failure and others have had the same
(without seeing any hardware issues that could be causing it). Maybe
I should quit using Bcache altogether? Maybe it's not
production-ready? I was wondering about maybe using Bcachefs, though
the need to compile a custom kernel for it is quite a deterrent. I
tried it briefly, but the bcachefs-tools stopped working at some point
without a visible reason. I know Btrfs is flawed, though it seems to
be the best so far.

Thank you for your work,
- unfa

-- 
- Tobiasz 'unfa' Karoń

www.youtube.com/unfa000
