* Consistent failure of bcache upgrading from 5.10 to 5.15.2
@ 2021-11-16 10:10 Kai Krakow
  2021-11-16 11:02 ` Coly Li
  0 siblings, 1 reply; 16+ messages in thread
From: Kai Krakow @ 2021-11-16 10:10 UTC (permalink / raw)
  To: linux-bcache, Coly Li

Hello Coly!

I think I can consistently reproduce a failure mode of bcache when
going from 5.10 LTS to 5.15.2 - on one single system (my other systems
do just fine).

In 5.10, bcache is stable, no problems at all. After booting to
5.15.2, btrfs would complain about broken btree generation numbers,
then freeze completely. Going back to 5.10, bcache complains about
being broken and cannot start the cache set.

I was able to reproduce the following behavior after the problem
struck me twice in a row:

1. Boot into SysRescueCD
2. modprobe bcache
3. Manually detach the btrfs disks from bcache, set cache mode to
none, force running
4. Reboot into 5.15.2 (now works)
5. See this error in dmesg:

[   27.334306] bcache: bch_cache_set_error() error on
04af889c-4ccb-401b-b525-fb9613a81b69: empty set at bucket 1213, block
1, 0 keys, disabling caching
[   27.334453] bcache: cache_set_free() Cache set
04af889c-4ccb-401b-b525-fb9613a81b69 unregistered
[   27.334510] bcache: register_cache() error sda3: failed to run cache set
[   27.334512] bcache: register_bcache() error : failed to register device

6. wipefs the failed bcache cache
7. bcache make -C -w 512 /dev/sda3 -l bcache-cdev0 --force
8. re-attach the btrfs disks in writearound mode
9. btrfs immediately fails, freezing the system (with transaction IDs way off)
10. reboot loops back to 5, unable to mount
11. escape the situation by starting again at 1, but without making a new bcache (a command-level sketch of steps 2-8 follows below)
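
For reference, a minimal command-level sketch of steps 2-8 (the device
names, the bcache0 sysfs paths and the cache-set UUID are assumptions; the
UUID would come from bcache-super-show on the newly created cache device):

```
modprobe bcache
# step 3: detach the backing devices from the cache, disable caching,
#         and force the backing devices to run without the cache
echo 1    > /sys/block/bcache0/bcache/detach
echo none > /sys/block/bcache0/bcache/cache_mode
echo 1    > /sys/block/sdb/bcache/running
# steps 6-7: wipe the failed cache device and create a new one
wipefs -a /dev/sda3
bcache make -C -w 512 /dev/sda3 -l bcache-cdev0 --force
# step 8: attach the backing devices to the new cache set, writearound mode
echo <new-cset-uuid> > /sys/block/bcache0/bcache/attach
echo writearound     > /sys/block/bcache0/bcache/cache_mode
```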

Is this a known error? Why does it only hit this machine?

SSD Model: Samsung SSD 850 EVO 250GB

Thanks,
Kai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2021-11-16 10:10 Consistent failure of bcache upgrading from 5.10 to 5.15.2 Kai Krakow
@ 2021-11-16 11:02 ` Coly Li
  2021-11-18 10:27   ` Kai Krakow
  0 siblings, 1 reply; 16+ messages in thread
From: Coly Li @ 2021-11-16 11:02 UTC (permalink / raw)
  To: Kai Krakow; +Cc: linux-bcache

On 11/16/21 6:10 PM, Kai Krakow wrote:
> Hello Coly!
>
> I think I can consistently reproduce a failure mode of bcache when
> going from 5.10 LTS to 5.15.2 - on one single system (my other systems
> do just fine).
>
> In 5.10, bcache is stable, no problems at all. After booting to
> 5.15.2, btrfs would complain about broken btree generation numbers,
> then freeze completely. Going back to 5.10, bcache complains about
> being broken and cannot start the cache set.
>
> I was able to reproduce the following behavior after the problem
> struck me twice in a row:
>
> 1. Boot into SysRescueCD
> 2. modprobe bcache
> 3. Manually detach the btrfs disks from bcache, set cache mode to
> none, force running
> 4. Reboot into 5.15.2 (now works)
> 5. See this error in dmesg:
>
> [   27.334306] bcache: bch_cache_set_error() error on
> 04af889c-4ccb-401b-b525-fb9613a81b69: empty set at bucket 1213, block
> 1, 0 keys, disabling caching
> [   27.334453] bcache: cache_set_free() Cache set
> 04af889c-4ccb-401b-b525-fb9613a81b69 unregistered
> [   27.334510] bcache: register_cache() error sda3: failed to run cache set
> [   27.334512] bcache: register_bcache() error : failed to register device
>
> 6. wipefs the failed bcache cache
> 7. bcache make -C -w 512 /dev/sda3 -l bcache-cdev0 --force
> 8. re-attach the btrfs disks in writearound mode
> 9. btrfs immediately fails, freezing the system (with transactions IDs way off)
> 10. reboot loops to 5, unable to mount
> 11. escape the situation by starting at 1, and not make a new bcache
>
> Is this a known error? Why does it only hit this machine?
>
> SSD Model: Samsung SSD 850 EVO 250GB

This is already known; there are 3 things to fix:

1, Revert commit 2fd3e5efe791946be0957c8e1eed9560b541fe46
2, Revert commit f8b679a070c536600c64a78c83b96aa617f8fa71
3, Do the following change in drivers/md/bcache/super.c (in bcache_device_free()):
@@ -885,9 +885,9 @@ static void bcache_device_free(struct bcache_device *d)

  		bcache_device_detach(d);
  
  	if (disk) {
-		blk_cleanup_disk(disk);
  		ida_simple_remove(&bcache_device_idx,
  				  first_minor_to_idx(disk->first_minor));
+		blk_cleanup_disk(disk);
  	}

Fixes 1) and 3) are on their way to the stable kernel IMHO; fix 2) is only my workaround and I don't see an upstream fix for it yet.
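
For anyone who wants to try this on a local 5.15.2 tree, a rough sketch of
applying 1) and 2) (these are mainline commit IDs; the stable backports
carry different IDs but reference the upstream ID in their commit message,
so "git log --grep=<id>" can locate them if a plain revert does not apply):

```
git revert 2fd3e5efe791946be0957c8e1eed9560b541fe46
git revert f8b679a070c536600c64a78c83b96aa617f8fa71
# then apply the bcache_device_free() reordering from 3) by hand
```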

Just FYI.

Coly Li


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2021-11-16 11:02 ` Coly Li
@ 2021-11-18 10:27   ` Kai Krakow
  2021-11-20  0:06     ` Eric Wheeler
  0 siblings, 1 reply; 16+ messages in thread
From: Kai Krakow @ 2021-11-18 10:27 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-bcache

Hi Coly!

Reading the commit logs, it seems to come from using a non-default
block size, 512 in my case (although I'm pretty sure that *is* the
default on the affected system). I've checked:
```
dev.sectors_per_block   1
dev.sectors_per_bucket  1024
```

The non-affected machines use 4k blocks (sectors per block = 8).
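
(Those values presumably come from bcache-super-show in bcache-tools, run
against the cache device; the device path below is an assumption:)

```
bcache-super-show /dev/sda3 | grep sectors_per
# dev.sectors_per_block   1      <- 1 x 512-byte sector per block
# dev.sectors_per_bucket  1024
```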

Can this value be changed "on the fly"? I think I remember that the
bdev super block must match the cdev super block - although that
doesn't make that much sense to me.

By "on the fly" I mean: Re-create the cdev super block, then just
attach the bdev - in this case, the sectors per block should not
matter because this is a brand new cdev with no existing cache data.
But I think it will refuse to attach the devices because of a
non-matching block size (at least this was the case in the past). I
don't see the point of having a block size in both super blocks at all
if the only block size that matters lives in the cdev super block.

Thanks
Kai

On Tue, 16 Nov 2021 at 12:02, Coly Li <colyli@suse.de> wrote:
>
> On 11/16/21 6:10 PM, Kai Krakow wrote:
> > Hello Coly!
> >
> > I think I can consistently reproduce a failure mode of bcache when
> > going from 5.10 LTS to 5.15.2 - on one single system (my other systems
> > do just fine).
> >
> > In 5.10, bcache is stable, no problems at all. After booting to
> > 5.15.2, btrfs would complain about broken btree generation numbers,
> > then freeze completely. Going back to 5.10, bcache complains about
> > being broken and cannot start the cache set.
> >
> > I was able to reproduce the following behavior after the problem
> > struck me twice in a row:
> >
> > 1. Boot into SysRescueCD
> > 2. modprobe bcache
> > 3. Manually detach the btrfs disks from bcache, set cache mode to
> > none, force running
> > 4. Reboot into 5.15.2 (now works)
> > 5. See this error in dmesg:
> >
> > [   27.334306] bcache: bch_cache_set_error() error on
> > 04af889c-4ccb-401b-b525-fb9613a81b69: empty set at bucket 1213, block
> > 1, 0 keys, disabling caching
> > [   27.334453] bcache: cache_set_free() Cache set
> > 04af889c-4ccb-401b-b525-fb9613a81b69 unregistered
> > [   27.334510] bcache: register_cache() error sda3: failed to run cache set
> > [   27.334512] bcache: register_bcache() error : failed to register device
> >
> > 6. wipefs the failed bcache cache
> > 7. bcache make -C -w 512 /dev/sda3 -l bcache-cdev0 --force
> > 8. re-attach the btrfs disks in writearound mode
> > 9. btrfs immediately fails, freezing the system (with transactions IDs way off)
> > 10. reboot loops to 5, unable to mount
> > 11. escape the situation by starting at 1, and not make a new bcache
> >
> > Is this a known error? Why does it only hit this machine?
> >
> > SSD Model: Samsung SSD 850 EVO 250GB
>
> This is already known, there are 3 locations to fix,
>
> 1, Revert commit 2fd3e5efe791946be0957c8e1eed9560b541fe46
> 2, Revert commit  f8b679a070c536600c64a78c83b96aa617f8fa71
> 3, Do the following change in drivers/md/bcache.c,
> @@ -885,9 +885,9 @@ static void bcache_device_free(struct bcache_device *d)
>
>                 bcache_device_detach(d);
>
>         if (disk) {
> -               blk_cleanup_disk(disk);
>                 ida_simple_remove(&bcache_device_idx,
>                                   first_minor_to_idx(disk->first_minor));
> +               blk_cleanup_disk(disk);
>         }
>
> The fix 1) and 3) are on the way to stable kernel IMHO, and fix 2) is only my workaround and I don't see upstream fix yet.
>
> Just FYI.
>
> Coly Li
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2021-11-18 10:27   ` Kai Krakow
@ 2021-11-20  0:06     ` Eric Wheeler
  2021-11-23  8:54       ` Coly Li
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Wheeler @ 2021-11-20  0:06 UTC (permalink / raw)
  To: Kai Krakow
  Cc: Coly Li, linux-bcache, Frédéric Dumas, Kent Overstreet

(Fixed mail header and resent, ignore possible duplicate message and
reply to this one instead because the From header was broken.)


Hi Coly, Kai, and Kent, I hope you are well!

On Thu, 18 Nov 2021, Kai Krakow wrote:

> Hi Coly!
> 
> Reading the commit logs, it seems to come from using a non-default
> block size, 512 in my case (although I'm pretty sure that *is* the
> default on the affected system). I've checked:
> ```
> dev.sectors_per_block   1
> dev.sectors_per_bucket  1024
> ```
> 
> The non-affected machines use 4k blocks (sectors per block = 8).

If it is the cache device with 4k blocks, then this could be a known issue 
(perhaps) not directly related to the 5.15 release. We've hit it before:
  https://www.spinics.net/lists/linux-bcache/msg05983.html

and I just talked to Frédéric Dumas this week who hit it too (cc'ed).  
His solution was to use manufacturer disk tools to change the cachedev's 
logical block size from 4k to 512-bytes and reformat (see below).

We've not seen issues with the backing device using 4k blocks, but bcache 
doesn't always seem to make 4k-aligned IOs to the cachedev.  It would be 
nice to find a long-term fix; more and more SSDs support 4k blocks, which 
align nicely with x86 pages and may mean less CPU overhead.
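
One way to watch for this directly is to trace the requests that actually
hit the cachedev (the device name is an assumption; blkparse prints sector
offsets and lengths in 512-byte units, so any write whose offset or length
is not a multiple of 8 is not 4k-aligned):

```
blktrace -d /dev/sdX -o - | blkparse -i -
# check the "sector + length" columns of lines whose RWBS field contains W
```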

I think this was the last message on the subject from Kent and Coly:

	> On 2018/5/9 3:59 PM, Kent Overstreet wrote:
	> > Have you checked extent merging?
	> 
	> Hi Kent,
	> 
	> Not yet. Let me look into it.
	> 
	> Thanks for the hint.
	> 
	> Coly Li


Here is a snip of my offline conversation with Frédéric:

11/17/2021 04:03 (America/Los_Angeles) - Frédéric Dumas wrote: 
> > > (3) When I declare the newly created /dev/bcache0 device to LVM, it works but with errors:
> > >
> > > # pvcreate /dev/bcache0
> > >  Error reading device /dev/bcache0 at 7965015146496 length 4.
> > >  bcache_invalidate: block (0, 0) still held
> > >  bcache_abort: block (0, 0) still held
> > >  Error reading device /dev/bcache0 at 7965015248896 length 4.
> > >  Error reading device /dev/bcache0 at 7965015259648 length 24.
> > >  Error reading device /dev/bcache0 at 7965015260160 length 512.
> > >  scan_dev_close /dev/bcache0 no DEV_IN_BCACHE set
> > >  scan_dev_close /dev/bcache0 already closed
> > >  Error reading device /dev/bcache0 at 7965015146496 length 4.
> > >  bcache_invalidate: block (0, 0) still held
> > >  bcache_abort: block (0, 0) still held
> > >  Error reading device /dev/bcache0 at 7965015248896 length 4.
> > >  Error reading device /dev/bcache0 at 7965015259648 length 24.
> > >  Error reading device /dev/bcache0 at 7965015260160 length 512.
> > >  Physical volume "/dev/bcache0" successfully created.
> > >
> > > # vgcreate vms /dev/bcache0
> > >  Error reading device /dev/bcache0 at 7965015146496 length 4.
> > >  bcache_invalidate: block (3, 0) still held
> > >  bcache_abort: block (3, 0) still held
> > >  Error reading device /dev/bcache0 at 7965015248896 length 4.
> > >  Error reading device /dev/bcache0 at 7965015259648 length 24.
> > >  Error reading device /dev/bcache0 at 7965015260160 length 512.
> > >  Error reading device /dev/bcache0 at 7965015146496 length 4.
> > >  bcache_invalidate: block (0, 0) still held
> > >  bcache_abort: block (0, 0) still held
> > >  Error reading device /dev/bcache0 at 7965015248896 length 4.
> > >  Error reading device /dev/bcache0 at 7965015259648 length 24.
> > >  Error reading device /dev/bcache0 at 7965015260160 length 512.
> > >  Volume group "vms" successfully created
> > >
> > > The logs do not give any more clues:
> > >
> > > # journalctl | grep -i bcache
> > > Nov 14 13:00:13 softq-pve-710 kernel: bcache: run_cache_set() invalidating existing data
> > > Nov 14 13:00:13 softq-pve-710 kernel: bcache: register_cache() registered cache device md0
> > > Nov 14 13:00:13 softq-pve-710 kernel: bcache: register_bdev() registered backing device sda4
> > > Nov 14 13:00:13 softq-pve-710 kernel: bcache: bch_cached_dev_attach() Caching sda4 as bcache0 on set a8f159d2-06e6-461f-b66b-22419d2829c0
> > > Nov 14 14:35:49 softq-pve-710 lvm[307524]:   pvscan[307524] PV /dev/bcache0 online, VG vms is complete.
> > >
> > > This error seems to have no effect on the operation of LVM. Do you 
> > > know what is causing it, and whether or not I can overlook it?
> > 

> On 17 Nov 2021 at 22:12, Eric Wheeler <ewheeler@linuxglobal.com> wrote:
> 
> > I am guessing you have a cache device with 4K sectors and that bcache 
> > is trying to index it on 512 byte boundaries. This is the bug I was 
> > talking about above. You can tell because 
> > 7965015260160/4096=1944583803.75. Note the fractional division from 
> > the last sector listed in your logs just above . If this is easily 
> > reproducible, then please open an issue on the mailing list so that 
> > Coly, the maintainer, can work on a fix.
>

11/19/2021 01:00 (America/Los_Angeles) - Frédéric Dumas wrote:  
> As you anticipated, reformatting the two P3700s with 512 byte sectors 
> instead of 4KB made any error message from bcache disappear.
>  
> # intelmas start -intelssd 0 -nvmeformat LBAFormat=0
> # intelmas start -intelssd 1 -nvmeformat LBAFormat=0
>  
> Then,
>  
> # vgcreate vms /dev/bcache0
>  
> no more errors:
>  
> # vgcreate vms /dev/bcache0
>   Physical volume "/dev/bcache0" successfully created.
>   Volume group "vms" successfully created
> # lvcreate -n store -l 100%VG vms
>   Logical volume "datastore" created.


--
Eric Wheeler



> 
> Can this value be changed "on the fly"? I think I remember that the
> bdev super block must match the cdev super block - although that
> doesn't make that much sense to me.
> 
> By "on the fly" I mean: Re-create the cdev super block, then just
> attach the bdev - in this case, the sectors per block should not
> matter because this is a brand new cdev with no existing cache data.
> But I think it will refuse attaching the devices because of
> non-matching block size (at least this was the case in the past). I
> don't see a point in having a block size in both super blocks at all
> if the only block size that matters lives in the cdev superblock.
> 
> Thanks
> Kai
> 
> On Tue, 16 Nov 2021 at 12:02, Coly Li <colyli@suse.de> wrote:
> >
> > On 11/16/21 6:10 PM, Kai Krakow wrote:
> > > Hello Coly!
> > >
> > > I think I can consistently reproduce a failure mode of bcache when
> > > going from 5.10 LTS to 5.15.2 - on one single system (my other systems
> > > do just fine).
> > >
> > > In 5.10, bcache is stable, no problems at all. After booting to
> > > 5.15.2, btrfs would complain about broken btree generation numbers,
> > > then freeze completely. Going back to 5.10, bcache complains about
> > > being broken and cannot start the cache set.
> > >
> > > I was able to reproduce the following behavior after the problem
> > > struck me twice in a row:
> > >
> > > 1. Boot into SysRescueCD
> > > 2. modprobe bcache
> > > 3. Manually detach the btrfs disks from bcache, set cache mode to
> > > none, force running
> > > 4. Reboot into 5.15.2 (now works)
> > > 5. See this error in dmesg:
> > >
> > > [   27.334306] bcache: bch_cache_set_error() error on
> > > 04af889c-4ccb-401b-b525-fb9613a81b69: empty set at bucket 1213, block
> > > 1, 0 keys, disabling caching
> > > [   27.334453] bcache: cache_set_free() Cache set
> > > 04af889c-4ccb-401b-b525-fb9613a81b69 unregistered
> > > [   27.334510] bcache: register_cache() error sda3: failed to run cache set
> > > [   27.334512] bcache: register_bcache() error : failed to register device
> > >
> > > 6. wipefs the failed bcache cache
> > > 7. bcache make -C -w 512 /dev/sda3 -l bcache-cdev0 --force
> > > 8. re-attach the btrfs disks in writearound mode
> > > 9. btrfs immediately fails, freezing the system (with transactions IDs way off)
> > > 10. reboot loops to 5, unable to mount
> > > 11. escape the situation by starting at 1, and not make a new bcache
> > >
> > > Is this a known error? Why does it only hit this machine?
> > >
> > > SSD Model: Samsung SSD 850 EVO 250GB
> >
> > This is already known, there are 3 locations to fix,
> >
> > 1, Revert commit 2fd3e5efe791946be0957c8e1eed9560b541fe46
> > 2, Revert commit  f8b679a070c536600c64a78c83b96aa617f8fa71
> > 3, Do the following change in drivers/md/bcache.c,
> > @@ -885,9 +885,9 @@ static void bcache_device_free(struct bcache_device *d)
> >
> >                 bcache_device_detach(d);
> >
> >         if (disk) {
> > -               blk_cleanup_disk(disk);
> >                 ida_simple_remove(&bcache_device_idx,
> >                                   first_minor_to_idx(disk->first_minor));
> > +               blk_cleanup_disk(disk);
> >         }
> >
> > The fix 1) and 3) are on the way to stable kernel IMHO, and fix 2) is 
> > only my workaround and I don't see upstream fix yet.
> >
> > Just FYI.
> >
> > Coly Li
> >
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2021-11-20  0:06     ` Eric Wheeler
@ 2021-11-23  8:54       ` Coly Li
  2021-11-23  9:30         ` Kai Krakow
  2022-01-06  2:51         ` Eric Wheeler
  0 siblings, 2 replies; 16+ messages in thread
From: Coly Li @ 2021-11-23  8:54 UTC (permalink / raw)
  To: Eric Wheeler, Kai Krakow
  Cc: linux-bcache, Frédéric Dumas, Kent Overstreet

On 11/20/21 8:06 AM, Eric Wheeler wrote:
> (Fixed mail header and resent, ignore possible duplicate message and
> reply to this one instead because the From header was broken.)
>
>
> Hi Coly, Kai, and Kent, I hope you are well!
>
> On Thu, 18 Nov 2021, Kai Krakow wrote:
>
>> Hi Coly!
>>
>> Reading the commit logs, it seems to come from using a non-default
>> block size, 512 in my case (although I'm pretty sure that *is* the
>> default on the affected system). I've checked:
>> ```
>> dev.sectors_per_block   1
>> dev.sectors_per_bucket  1024
>> ```
>>
>> The non-affected machines use 4k blocks (sectors per block = 8).
> If it is the cache device with 4k blocks, then this could be a known issue
> (perhaps) not directly related to the 5.15 release. We've hit a before:
>    https://www.spinics.net/lists/linux-bcache/msg05983.html
>
> and I just talked to Frédéric Dumas this week who hit it too (cc'ed).
> His solution was to use manufacturer disk tools to change the cachedev's
> logical block size from 4k to 512-bytes and reformat (see below).
>
> We've not seen issues with the backing device using 4k blocks, but bcache
> doesn't always seem to make 4k-aligned IOs to the cachedev.  It would be
> nice to find a long-term fix; more and more SSDs support 4k blocks, which
> is a nice x86 page-alignment and may provide for less CPU overhead.
>
> I think this was the last message on the subject from Kent and Coly:
>
> 	> On 2018/5/9 3:59 PM, Kent Overstreet wrote:
> 	> > Have you checked extent merging?
> 	>
> 	> Hi Kent,
> 	>
> 	> Not yet. Let me look into it.
> 	>
> 	> Thanks for the hint.
> 	>
> 	> Coly Li

I tried, and I still remember this. The headache is that I don't have a 4Kn 
SSD to debug and trace on; just looking at the code is hard...

If anybody can send me (to Beijing, China) a 4Kn SSD for debugging and 
testing, maybe I can make some progress. Or can I configure the kernel 
to force a specific non-4Kn SSD to only accept 4K-aligned I/O?

Coly Li



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2021-11-23  8:54       ` Coly Li
@ 2021-11-23  9:30         ` Kai Krakow
  2022-01-06 15:32           ` Coly Li
  2022-01-06  2:51         ` Eric Wheeler
  1 sibling, 1 reply; 16+ messages in thread
From: Kai Krakow @ 2021-11-23  9:30 UTC (permalink / raw)
  To: Coly Li
  Cc: Eric Wheeler, linux-bcache, Frédéric Dumas, Kent Overstreet

On Tue, 23 Nov 2021 at 09:54, Coly Li <colyli@suse.de> wrote:
>
> On 11/20/21 8:06 AM, Eric Wheeler wrote:
> > (Fixed mail header and resent, ignore possible duplicate message and
> > reply to this one instead because the From header was broken.)
> >
> >
> > Hi Coly, Kai, and Kent, I hope you are well!
> >
> > On Thu, 18 Nov 2021, Kai Krakow wrote:
> >
> >> Hi Coly!
> >>
> >> Reading the commit logs, it seems to come from using a non-default
> >> block size, 512 in my case (although I'm pretty sure that *is* the
> >> default on the affected system). I've checked:
> >> ```
> >> dev.sectors_per_block   1
> >> dev.sectors_per_bucket  1024
> >> ```
> >>
> >> The non-affected machines use 4k blocks (sectors per block = 8).
> > If it is the cache device with 4k blocks, then this could be a known issue
> > (perhaps) not directly related to the 5.15 release. We've hit a before:
> >    https://www.spinics.net/lists/linux-bcache/msg05983.html
> >
> > and I just talked to Frédéric Dumas this week who hit it too (cc'ed).
> > His solution was to use manufacturer disk tools to change the cachedev's
> > logical block size from 4k to 512-bytes and reformat (see below).
> >
> > We've not seen issues with the backing device using 4k blocks, but bcache
> > doesn't always seem to make 4k-aligned IOs to the cachedev.  It would be
> > nice to find a long-term fix; more and more SSDs support 4k blocks, which
> > is a nice x86 page-alignment and may provide for less CPU overhead.
> >
> > I think this was the last message on the subject from Kent and Coly:
> >
> >       > On 2018/5/9 3:59 PM, Kent Overstreet wrote:
> >       > > Have you checked extent merging?
> >       >
> >       > Hi Kent,
> >       >
> >       > Not yet. Let me look into it.
> >       >
> >       > Thanks for the hint.
> >       >
> >       > Coly Li
>
> I tried and I still remember this, the headache is, I don't have a 4Kn
> SSD to debug and trace, just looking at the code is hard...
>
> If anybody can send me (in China to Beijing) a 4Kn SSD to debug and
> testing, maybe I can make some progress. Or can I configure the kernel
> to force a specific non-4Kn SSD to only accept 4K aligned I/O ?

I think you can switch at least SOME models to native 4k?

https://unix.stackexchange.com/questions/606072/change-logical-sector-size-to-4k

> Changing a HDD to native 4k sectors works at least with WD Red Plus 14 TB drives but LOSES ALL DATA. The data is not actually wiped but partition tables and filesystems cannot be found after the change because of their now incorrect LBA locations.
>
> hdparm --set-sector-size 4096 --please-destroy-my-drive /dev/sdX
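
The logical/physical sector sizes the drive currently reports can be
checked before and after the change, e.g. (same /dev/sdX placeholder as
above):

```
hdparm -I /dev/sdX | grep -i 'sector size'
cat /sys/block/sdX/queue/logical_block_size
```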

HTH
Kai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2021-11-23  8:54       ` Coly Li
  2021-11-23  9:30         ` Kai Krakow
@ 2022-01-06  2:51         ` Eric Wheeler
  2022-01-06  9:25           ` Frédéric Dumas
  2022-01-06 15:49           ` Coly Li
  1 sibling, 2 replies; 16+ messages in thread
From: Eric Wheeler @ 2022-01-06  2:51 UTC (permalink / raw)
  To: Coly Li
  Cc: Kai Krakow, linux-bcache, Frédéric Dumas, Kent Overstreet

On Tue, 23 Nov 2021, Coly Li wrote:
> On 11/20/21 8:06 AM, Eric Wheeler wrote:
> > Hi Coly, Kai, and Kent, I hope you are well!
> >
> > On Thu, 18 Nov 2021, Kai Krakow wrote:
> >
> >> Hi Coly!
> >>
> >> Reading the commit logs, it seems to come from using a non-default
> >> block size, 512 in my case (although I'm pretty sure that *is* the
> >> default on the affected system). I've checked:
> >> ```
> >> dev.sectors_per_block   1
> >> dev.sectors_per_bucket  1024
> >> ```
> >>
> >> The non-affected machines use 4k blocks (sectors per block = 8).
> > If it is the cache device with 4k blocks, then this could be a known issue
> > (perhaps) not directly related to the 5.15 release. We've hit a before:
> >    https://www.spinics.net/lists/linux-bcache/msg05983.html
> >
> > and I just talked to Frédéric Dumas this week who hit it too (cc'ed).
> > His solution was to use manufacturer disk tools to change the cachedev's
> > logical block size from 4k to 512-bytes and reformat (see below).
> >
> > We've not seen issues with the backing device using 4k blocks, but bcache
> > doesn't always seem to make 4k-aligned IOs to the cachedev.  It would be
> > nice to find a long-term fix; more and more SSDs support 4k blocks, which
> > is a nice x86 page-alignment and may provide for less CPU overhead.
> >
> > I think this was the last message on the subject from Kent and Coly:
> >
> >  > On 2018/5/9 3:59 PM, Kent Overstreet wrote:
> >  > > Have you checked extent merging?
> >  >
> >  > Hi Kent,
> >  >
> >  > Not yet. Let me look into it.
> >  >
> >  > Thanks for the hint.
> >  >
> >  > Coly Li
> 
> I tried and I still remember this, the headache is, I don't have a 4Kn SSD to
> debug and trace, just looking at the code is hard...

The scsi_debug driver can do it:
	modprobe scsi_debug sector_size=4096 dev_size_mb=$((128*1024)) 

That will give you a 128 GB SCSI ram disk with 4k sectors.  If that is 
enough for a cache to test against, then you could run your super-high-IO 
test against it and see what you get.  I would be curious how testing 
bcache on the scsi_debug ramdisk in writeback performs!
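
Building a bcache stack on top of that ramdisk could then look roughly like
this (device names, the spare backing device and the cache-set UUID are
placeholders; the UUID comes from bcache-super-show on the new cache device):

```
lsblk -o NAME,MODEL,LOG-SEC | grep -i scsi_debug   # find the 4Kn ramdisk, e.g. /dev/sdX
make-bcache -C /dev/sdX                            # cache device on the 4Kn ramdisk
make-bcache -B /dev/sdY                            # any spare/scratch backing device
echo <cset-uuid> > /sys/block/bcache0/bcache/attach
echo writeback   > /sys/block/bcache0/bcache/cache_mode
```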

> If anybody can send me (in China to Beijing) a 4Kn SSD to debug and testing,
> maybe I can make some progress. Or can I configure the kernel to force a
> specific non-4Kn SSD to only accept 4K aligned I/O ?

I think the scsi_debug option above might be cheaper ;) 

But seriously, Frédéric who reported this error was using an Intel P3700 
if someone (SUSE?) wants to fund testing on real hardware.  <$150 used on 
eBay: 

I'm not sure how to format it 4k, but this is how Frédéric set it to 512 
bytes and fixed his issue:

# intelmas start -intelssd 0 -nvmeformat LBAFormat=0
# intelmas start -intelssd 1 -nvmeformat LBAFormat=0

-Eric


> 
> Coly Li
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-01-06  2:51         ` Eric Wheeler
@ 2022-01-06  9:25           ` Frédéric Dumas
  2022-01-06 15:55             ` Coly Li
  2022-01-06 15:49           ` Coly Li
  1 sibling, 1 reply; 16+ messages in thread
From: Frédéric Dumas @ 2022-01-06  9:25 UTC (permalink / raw)
  To: linux-bcache; +Cc: Coly Li, Eric Wheeler, Kai Krakow, Kent Overstreet


Hello!

Many thanks to Eric for describing here and in his previous email the bug I experienced using bcache on SSDs formatted as 4k sectors. Thanks also to him for explaining to me that all I had to do was reformat the SSDs into 512-byte sectors to easily get around the bug.


> I'm not sure how to format it 4k, but this is how Frédéric set it to 512 
> bytes and fixed his issue:
> 
> # intelmas start -intelssd 0 -nvmeformat LBAFormat=0


Right.
To format an Intel NVMe P3700 back to 4k sectors, the command is as follows:

# intelmas start -intelssd 0 -nvmeformat LBAFormat=3


> The parameter LBAformat specifies the sector size to set. Valid options are in the range from index 0 to the number of supported LBA formats of the NVMe drive, however the only sector sizes supported in Intel® NVMe drives are 512B and 4096B which corresponds to indexes 0 and 3 respectively.


Source: https://www.intel.com/content/www/us/en/support/articles/000057964/memory-and-storage.html

Oddly enough, the user manual for the intelmas application [1] (formerly isdct) forgets to specify the possible values to be given to the LBAformat argument, which makes it much less useful. :-)
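
After reformatting, the logical sector size the drive reports to Linux can
be double-checked with, for example (the nvme0n1 device name is an
assumption):

```
cat /sys/block/nvme0n1/queue/logical_block_size
blockdev --getss /dev/nvme0n1
```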


Regards,

Frédéric.


[1] https://www.intel.com/content/www/us/en/download/19520/intel-memory-and-storage-tool-cli-command-line-interface.html
--
Frédéric Dumas
f.dumas@ellis.siteparc.fr




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2021-11-23  9:30         ` Kai Krakow
@ 2022-01-06 15:32           ` Coly Li
  0 siblings, 0 replies; 16+ messages in thread
From: Coly Li @ 2022-01-06 15:32 UTC (permalink / raw)
  To: Kai Krakow
  Cc: Eric Wheeler, linux-bcache, Frédéric Dumas, Kent Overstreet

On 11/23/21 5:30 PM, Kai Krakow wrote:
> On Tue, 23 Nov 2021 at 09:54, Coly Li <colyli@suse.de> wrote:
>> On 11/20/21 8:06 AM, Eric Wheeler wrote:
>>> (Fixed mail header and resent, ignore possible duplicate message and
>>> reply to this one instead because the From header was broken.)
>>>
>>>
>>> Hi Coly, Kai, and Kent, I hope you are well!
>>>
>>> On Thu, 18 Nov 2021, Kai Krakow wrote:
>>>
>>>> Hi Coly!
>>>>
>>>> Reading the commit logs, it seems to come from using a non-default
>>>> block size, 512 in my case (although I'm pretty sure that *is* the
>>>> default on the affected system). I've checked:
>>>> ```
>>>> dev.sectors_per_block   1
>>>> dev.sectors_per_bucket  1024
>>>> ```
>>>>
>>>> The non-affected machines use 4k blocks (sectors per block = 8).
>>> If it is the cache device with 4k blocks, then this could be a known issue
>>> (perhaps) not directly related to the 5.15 release. We've hit a before:
>>>     https://www.spinics.net/lists/linux-bcache/msg05983.html
>>>
>>> and I just talked to Frédéric Dumas this week who hit it too (cc'ed).
>>> His solution was to use manufacturer disk tools to change the cachedev's
>>> logical block size from 4k to 512-bytes and reformat (see below).
>>>
>>> We've not seen issues with the backing device using 4k blocks, but bcache
>>> doesn't always seem to make 4k-aligned IOs to the cachedev.  It would be
>>> nice to find a long-term fix; more and more SSDs support 4k blocks, which
>>> is a nice x86 page-alignment and may provide for less CPU overhead.
>>>
>>> I think this was the last message on the subject from Kent and Coly:
>>>
>>>        > On 2018/5/9 3:59 PM, Kent Overstreet wrote:
>>>        > > Have you checked extent merging?
>>>        >
>>>        > Hi Kent,
>>>        >
>>>        > Not yet. Let me look into it.
>>>        >
>>>        > Thanks for the hint.
>>>        >
>>>        > Coly Li
>> I tried and I still remember this, the headache is, I don't have a 4Kn
>> SSD to debug and trace, just looking at the code is hard...
>>
>> If anybody can send me (in China to Beijing) a 4Kn SSD to debug and
>> testing, maybe I can make some progress. Or can I configure the kernel
>> to force a specific non-4Kn SSD to only accept 4K aligned I/O ?
> I think you can switch at least SOME models to native 4k?
>
> https://unix.stackexchange.com/questions/606072/change-logical-sector-size-to-4k
>
>> Changing a HDD to native 4k sectors works at least with WD Red Plus 14 TB drives but LOSES ALL DATA. The data is not actually wiped but partition tables and filesystems cannot be found after the change because of their now incorrect LBA locations.
>>
>> hdparm --set-sector-size 4096 --please-destroy-my-drive /dev/sdX

I didn't reply to this email earlier because I hadn't tested the above 
example on the latest mainline kernel.

I tested the command on a 5.10 kernel with an NVMe and a SATA SSD; neither 
of them worked. I wanted to verify whether this behaves differently on the 
latest mainline kernel, but I haven't found a chance to do that yet.

Thanks for the hint.

Coly Li


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-01-06  2:51         ` Eric Wheeler
  2022-01-06  9:25           ` Frédéric Dumas
@ 2022-01-06 15:49           ` Coly Li
  2022-02-07  6:11             ` Coly Li
  1 sibling, 1 reply; 16+ messages in thread
From: Coly Li @ 2022-01-06 15:49 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Kai Krakow, linux-bcache, Frédéric Dumas, Kent Overstreet

On 1/6/22 10:51 AM, Eric Wheeler wrote:
> On Tue, 23 Nov 2021, Coly Li wrote:
>> On 11/20/21 8:06 AM, Eric Wheeler wrote:
>>> Hi Coly, Kai, and Kent, I hope you are well!
>>>
>>> On Thu, 18 Nov 2021, Kai Krakow wrote:
>>>
>>>> Hi Coly!
>>>>
>>>> Reading the commit logs, it seems to come from using a non-default
>>>> block size, 512 in my case (although I'm pretty sure that *is* the
>>>> default on the affected system). I've checked:
>>>> ```
>>>> dev.sectors_per_block   1
>>>> dev.sectors_per_bucket  1024
>>>> ```
>>>>
>>>> The non-affected machines use 4k blocks (sectors per block = 8).
>>> If it is the cache device with 4k blocks, then this could be a known issue
>>> (perhaps) not directly related to the 5.15 release. We've hit a before:
>>>     https://www.spinics.net/lists/linux-bcache/msg05983.html
>>>
>>> and I just talked to Frédéric Dumas this week who hit it too (cc'ed).
>>> His solution was to use manufacturer disk tools to change the cachedev's
>>> logical block size from 4k to 512-bytes and reformat (see below).
>>>
>>> We've not seen issues with the backing device using 4k blocks, but bcache
>>> doesn't always seem to make 4k-aligned IOs to the cachedev.  It would be
>>> nice to find a long-term fix; more and more SSDs support 4k blocks, which
>>> is a nice x86 page-alignment and may provide for less CPU overhead.
>>>
>>> I think this was the last message on the subject from Kent and Coly:
>>>
>>>   > On 2018/5/9 3:59 PM, Kent Overstreet wrote:
>>>   > > Have you checked extent merging?
>>>   >
>>>   > Hi Kent,
>>>   >
>>>   > Not yet. Let me look into it.
>>>   >
>>>   > Thanks for the hint.
>>>   >
>>>   > Coly Li
>> I tried and I still remember this, the headache is, I don't have a 4Kn SSD to
>> debug and trace, just looking at the code is hard...

Hi Eric,

> The scsi_debug driver can do it:
> 	modprobe scsi_debug sector_size=4096 dev_size_mb=$((128*1024))
>
> That will give you a 128gb SCSI ram disk with 4k sectors.  If that is
> enough for a cache to test against then you could run your super-high-IO
> test against it and see what you get.  I would be curious how testing
> bcache on the scsi_debug ramdisk in writeback performs!

The DRAM is not big enough on my testing server...

>> If anybody can send me (in China to Beijing) a 4Kn SSD to debug and testing,
>> maybe I can make some progress. Or can I configure the kernel to force a
>> specific non-4Kn SSD to only accept 4K aligned I/O ?
> I think the scsi_debug option above might be cheaper ;)
>
> But seriously, Frédéric who reported this error was using an Intel P3700
> if someone (SUSE?) wants to fund testing on real hardware.  <$150 used on
> eBay:

Currently all my testing SSDs are provided by Lenovo and Memblaze. I 
tried the hdparm command which Kai Krakow told me about, and it didn't 
work out.

Thanks for the hint about the Intel P3700; I will try to find some and try 
to reproduce.
>
> I'm not sure how to format it 4k, but this is how Frédéric set it to 512
> bytes and fixed his issue:
>
> # intelmas start -intelssd 0 -nvmeformat LBAFormat=0
> # intelmas start -intelssd 1 -nvmeformat LBAFormat=0

Copied. Let me try to find Intel P3700 firstly.

Coly Li

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-01-06  9:25           ` Frédéric Dumas
@ 2022-01-06 15:55             ` Coly Li
  2022-01-08  6:57               ` Coly Li
  0 siblings, 1 reply; 16+ messages in thread
From: Coly Li @ 2022-01-06 15:55 UTC (permalink / raw)
  To: Frédéric Dumas
  Cc: Eric Wheeler, Kai Krakow, linux-bcache, Kent Overstreet

On 1/6/22 5:25 PM, Frédéric Dumas wrote:
> Hello!
>
> Many thanks to Eric for describing here and in his previous email the bug I experienced using bcache on SSDs formatted as 4k sectors. Thanks also to him for explaining to me that all I had to do was reformat the SSDs into 512-byte sectors to easily get around the bug.
>
>
>> I'm not sure how to format it 4k, but this is how Frédéric set it to 512
>> bytes and fixed his issue:
>>
>> # intelmas start -intelssd 0 -nvmeformat LBAFormat=0
>
> Right.
> To format an Intel NVMe P3700 back to 4k sectors, the command is as follows:
>
> # intelmas start -intelssd 0 -nvmeformat LBAFormat=3
>
>
>> The parameter LBAformat specifies the sector size to set. Valid options are in the range from index 0 to the number of supported LBA formats of the NVMe drive, however the only sector sizes supported in Intel® NVMe drives are 512B and 4096B which corresponds to indexes 0 and 3 respectively.
>
> Source: https://www.intel.com/content/www/us/en/support/articles/000057964/memory-and-storage.html
>
> Oddly enough the user manual for the intelmass application [1] (formerly isdct) forgets to specify the possible values to be given to the LBAformat argument, which makes it much less useful. :-)

Hi Frederic,

Many thanks for the information. BTW, could you please tell me the 
detailed information about your Intel NVMe P3700 SSD? I will try to find 
it on the local market.

Coly Li

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-01-06 15:55             ` Coly Li
@ 2022-01-08  6:57               ` Coly Li
  0 siblings, 0 replies; 16+ messages in thread
From: Coly Li @ 2022-01-08  6:57 UTC (permalink / raw)
  To: Frédéric Dumas
  Cc: Eric Wheeler, Kai Krakow, linux-bcache, Kent Overstreet

On 1/6/22 11:55 PM, Coly Li wrote:
> On 1/6/22 5:25 PM, Frédéric Dumas wrote:
>> Hello!
>>
>> Many thanks to Eric for describing here and in his previous email the 
>> bug I experienced using bcache on SSDs formatted as 4k sectors. 
>> Thanks also to him for explaining to me that all I had to do was 
>> reformat the SSDs into 512-byte sectors to easily get around the bug.
>>
>>
>>> I'm not sure how to format it 4k, but this is how Frédéric set it to 
>>> 512
>>> bytes and fixed his issue:
>>>
>>> # intelmas start -intelssd 0 -nvmeformat LBAFormat=0
>>
>> Right.
>> To format an Intel NVMe P3700 back to 4k sectors, the command is as 
>> follows:
>>
>> # intelmas start -intelssd 0 -nvmeformat LBAFormat=3
>>
>>
>>> The parameter LBAformat specifies the sector size to set. Valid 
>>> options are in the range from index 0 to the number of supported LBA 
>>> formats of the NVMe drive, however the only sector sizes supported 
>>> in Intel® NVMe drives are 512B and 4096B which corresponds to 
>>> indexes 0 and 3 respectively.
>>
>> Source: 
>> https://www.intel.com/content/www/us/en/support/articles/000057964/memory-and-storage.html
>>
>> Oddly enough the user manual for the intelmass application [1] 
>> (formerly isdct) forgets to specify the possible values to be given 
>> to the LBAformat argument, which makes it much less useful. :-)
>
> Hi Frederic,
>
> Many thanks for the information. BTW, could you please tell me the 
> detail information about your Intel NVMe P3700 SSD, I will try to find 
> it in local market.

I am trying to find some PCIe Intel P3700 SSDs with 400G or 800G 
capacity. If I am lucky, they may reach my location within 2 weeks; I hope 
I can reproduce the same operations as you did on these SSDs.

Coly Li

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-01-06 15:49           ` Coly Li
@ 2022-02-07  6:11             ` Coly Li
  2022-02-07  7:37               ` Coly Li
  0 siblings, 1 reply; 16+ messages in thread
From: Coly Li @ 2022-02-07  6:11 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Kai Krakow, linux-bcache, Frédéric Dumas, Kent Overstreet

On 1/6/22 11:49 PM, Coly Li wrote:
> On 1/6/22 10:51 AM, Eric Wheeler wrote:
>
>>
>> I'm not sure how to format it 4k, but this is how Frédéric set it to 512
>> bytes and fixed his issue:
>>
>> # intelmas start -intelssd 0 -nvmeformat LBAFormat=0
>> # intelmas start -intelssd 1 -nvmeformat LBAFormat=0
>
> Copied. Let me try to find Intel P3700 firstly.

Thanks to Lenovo, they lent me a P3700 PCIe SSD for bcache testing and 
debugging. I have now formatted the card to a 4K sector size and can see 
the new 4K sector size in the fdisk output.

I have started running fio with 8 I/O jobs and an I/O depth of 256, doing 
4K random writes. Let me see what happens. If anyone has advice on how to 
reproduce the non-aligned I/O error more easily, please give me a hint.
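
The fio invocation is roughly like the following (the target path and
runtime are assumptions):

```
fio --name=4k-randwrite --filename=/dev/bcache0 --ioengine=libaio --direct=1 \
    --rw=randwrite --bs=4k --numjobs=8 --iodepth=256 \
    --runtime=300 --time_based --group_reporting
```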

Coly Li

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-02-07  6:11             ` Coly Li
@ 2022-02-07  7:37               ` Coly Li
  2022-02-07  8:10                 ` Kai Krakow
  0 siblings, 1 reply; 16+ messages in thread
From: Coly Li @ 2022-02-07  7:37 UTC (permalink / raw)
  To: Eric Wheeler
  Cc: Kai Krakow, linux-bcache, Frédéric Dumas, Kent Overstreet

On 2/7/22 2:11 PM, Coly Li wrote:
> On 1/6/22 11:49 PM, Coly Li wrote:
>> On 1/6/22 10:51 AM, Eric Wheeler wrote:
>>
>>>
>>> I'm not sure how to format it 4k, but this is how Frédéric set it to 
>>> 512
>>> bytes and fixed his issue:
>>>
>>> # intelmas start -intelssd 0 -nvmeformat LBAFormat=0
>>> # intelmas start -intelssd 1 -nvmeformat LBAFormat=0
>>
>> Copied. Let me try to find Intel P3700 firstly.
>
> Thanks to Lenovo, they lent me P3700 PCIe SSD for bcache testing and 
> debug. Now I format the card to 4K sector size and see the new 4k 
> sector size from fdisk output.
>
> I start to run fio with 8 io jobs and 256 io depth, 4K random write. 
> Let me see what may happen. If any one has advice to reproduce the 
> non-aligned I/O error more easily, please hint me.

BTW, just for extra clarification:

The original issue reported by Kai in this thread is not related to the 4Kn 
issue. It is very probably a kernel regression, as I said in my reply to his 
first email.

What I am working on is the problem originally reported by Eric, which 
happened on 4Kn devices. I will post an update on the situation in that 
thread later.

For the problem reported by Kai in this thread, the dmesg

[   27.334306] bcache: bch_cache_set_error() error on
04af889c-4ccb-401b-b525-fb9613a81b69: empty set at bucket 1213, block
1, 0 keys, disabling caching
[   27.334453] bcache: cache_set_free() Cache set
04af889c-4ccb-401b-b525-fb9613a81b69 unregistered
[   27.334510] bcache: register_cache() error sda3: failed to run cache set
[   27.334512] bcache: register_bcache() error : failed to register device

tells us that the metadata is corrupted, probably by an incomplete metadata 
write, which some other people and I encountered too (with some specific 
bcache block sizes on specific devices). Updating to the latest stable 
kernel may solve the issue, but I haven't verified whether the regression 
is fixed or not.

Coly Li


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-02-07  7:37               ` Coly Li
@ 2022-02-07  8:10                 ` Kai Krakow
  2022-02-07  8:13                   ` Coly Li
  0 siblings, 1 reply; 16+ messages in thread
From: Kai Krakow @ 2022-02-07  8:10 UTC (permalink / raw)
  To: Coly Li
  Cc: Eric Wheeler, linux-bcache, Frédéric Dumas, Kent Overstreet

On Mon, 7 Feb 2022 at 08:37, Coly Li <colyli@suse.de> wrote:

> For the problem reported by Kai in this thread, the dmesg
>
> [   27.334306] bcache: bch_cache_set_error() error on
> 04af889c-4ccb-401b-b525-fb9613a81b69: empty set at bucket 1213, block
> 1, 0 keys, disabling caching
> [   27.334453] bcache: cache_set_free() Cache set
> 04af889c-4ccb-401b-b525-fb9613a81b69 unregistered
> [   27.334510] bcache: register_cache() error sda3: failed to run cache set
> [   27.334512] bcache: register_bcache() error : failed to register device
>
> tells that the mate data is corrupted which probably by uncompleted meta data write, which some other people and I countered too (some specific bcache block size on specific device). Update to latest stable kernel may solve the issue, but I don't verify whether the regression is fixed or not.

As far as I can tell, the problem hasn't happened again since. I think
I saw the problem in 5.15.2 (the first 5.15.x I tried), and it was
probably fixed by 'bcache: Revert "bcache: use bvec_virt"' in 5.15.3.
I even tried write-back mode again on multiple systems and it is
stable. OTOH, I must say that I only enabled writeback caching after
using btrfs metadata hinting patches which can move metadata to native
SSD devices - so bcache no longer handles btrfs metadata writes or
reads. Performance-wise, this seems a superior setup, as even bcache
seems to struggle with btrfs metadata access patterns. But I doubt it
has anything to do with whether the 5.15.2 problem triggers or not;
I just wanted to state that for completeness.

Regards,
Kai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Consistent failure of bcache upgrading from 5.10 to 5.15.2
  2022-02-07  8:10                 ` Kai Krakow
@ 2022-02-07  8:13                   ` Coly Li
  0 siblings, 0 replies; 16+ messages in thread
From: Coly Li @ 2022-02-07  8:13 UTC (permalink / raw)
  To: Kai Krakow
  Cc: Eric Wheeler, linux-bcache, Frédéric Dumas, Kent Overstreet

On 2/7/22 4:10 PM, Kai Krakow wrote:
> On Mon, 7 Feb 2022 at 08:37, Coly Li <colyli@suse.de> wrote:
>
>> For the problem reported by Kai in this thread, the dmesg
>>
>> [   27.334306] bcache: bch_cache_set_error() error on
>> 04af889c-4ccb-401b-b525-fb9613a81b69: empty set at bucket 1213, block
>> 1, 0 keys, disabling caching
>> [   27.334453] bcache: cache_set_free() Cache set
>> 04af889c-4ccb-401b-b525-fb9613a81b69 unregistered
>> [   27.334510] bcache: register_cache() error sda3: failed to run cache set
>> [   27.334512] bcache: register_bcache() error : failed to register device
>>
>> tells that the mate data is corrupted which probably by uncompleted meta data write, which some other people and I countered too (some specific bcache block size on specific device). Update to latest stable kernel may solve the issue, but I don't verify whether the regression is fixed or not.
> As far as I can tell, the problem hasn't happened again since. I think
> I saw the problem in 5.15.2 (the first 5.15.x I tried), and it was
> fixed probably by 'bcache: Revert "bcache: use bvec_virt"' in 5.15.3.
> I even tried write-back mode again on multiple systems and it is
> stable. OTOH, I must say that I only enabled writeback caching after
> using btrfs metadata hinting patches which can move metadata to native
> SSD devices - so bcache will no longer handle btrfs metadata writes or
> reads. Performance-wise, this seems a superior setup, even bcache
> seems to struggle with btrfs metadata access patterns. But I doubt it
> has anything to do with whether the 5.15.2 problem triggers or
> doesn't, just wanted to state that for completeness.

Copied. Thank you for the information. And thanks to this report, I was 
prompted to find hardware to debug another issue that has existed for 
years. This is powerful motivation from the community :-)


Coly Li

^ permalink raw reply	[flat|nested] 16+ messages in thread


Thread overview: 16+ messages
2021-11-16 10:10 Consistent failure of bcache upgrading from 5.10 to 5.15.2 Kai Krakow
2021-11-16 11:02 ` Coly Li
2021-11-18 10:27   ` Kai Krakow
2021-11-20  0:06     ` Eric Wheeler
2021-11-23  8:54       ` Coly Li
2021-11-23  9:30         ` Kai Krakow
2022-01-06 15:32           ` Coly Li
2022-01-06  2:51         ` Eric Wheeler
2022-01-06  9:25           ` Frédéric Dumas
2022-01-06 15:55             ` Coly Li
2022-01-08  6:57               ` Coly Li
2022-01-06 15:49           ` Coly Li
2022-02-07  6:11             ` Coly Li
2022-02-07  7:37               ` Coly Li
2022-02-07  8:10                 ` Kai Krakow
2022-02-07  8:13                   ` Coly Li
