From: Arnaldo Montagner <armont@google.com>
To: linux-bcache@vger.kernel.org
Subject: I/O error on cache device can cause user observable errors
Date: Thu, 1 Feb 2024 14:25:40 -0800
Message-ID: <CANF=pgrX7h26TjA9bPUm9umRA-9KvELb9z3-bJsHm+t6SYbE1w@mail.gmail.com>

The bcache documentation says that errors on the cache device are
handled transparently.
I'm seeing a case where the cache device is unregistered in response
to repeated write errors (expected), but that results in a read error
on the bcache device (unexpected).
Here's how I'm reproducing the problem:
1. Create a device with dm-error to simulate I/O errors. The device is
1G in size and it will fail I/Os in a 4M extent starting at offset
128M:
$ dmsetup create cache_disk << EOF
0 262144 linear /dev/sdb 0
262144 8192 error
270336 1826816 linear /dev/sdb 270336
EOF
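
For reference, the sector numbers in the table above follow from the sizes
given in step 1, with 512-byte sectors. This is only a sketch of the
arithmetic; the variable and function names are illustrative and not part of
the original setup:

```shell
#!/bin/sh
# Sketch: derive the dm table from sizes in MiB (512-byte sectors).
# All names below are hypothetical helpers, not part of dmsetup.
SECTOR=512
dev_mib=1024        # total size of cache_disk (1G)
err_start_mib=128   # failing extent starts at 128M
err_len_mib=4       # failing extent is 4M long

to_sectors() { echo $(( $1 * 1024 * 1024 / SECTOR )); }

err_start=$(to_sectors "$err_start_mib")
err_len=$(to_sectors "$err_len_mib")
total=$(to_sectors "$dev_mib")
tail_start=$(( err_start + err_len ))
tail_len=$(( total - tail_start ))

# Print the three-line table: linear, error, linear.
printf '0 %s linear /dev/sdb 0\n' "$err_start"
printf '%s %s error\n' "$err_start" "$err_len"
printf '%s %s linear /dev/sdb %s\n' "$tail_start" "$tail_len" "$tail_start"
```

The printed table can be piped to "dmsetup create cache_disk" in place of the
here-document.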
2. Set up bcache in writethrough mode. The backing device is 1000G in size:
$ make-bcache --cache /dev/mapper/cache_disk --bdev /dev/sdc \
    --wipe-bcache --bucket 256k
$ echo writethrough > /sys/block/bcache0/bcache/cache_mode
$ echo 0 > /sys/block/bcache0/bcache/cache/synchronous
$ lsblk
NAME           MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
...
sdb              8:16   0   10G  0 disk
└─cache_disk   253:0    0    1G  0 dm
  └─bcache0    252:0    0 1000G  0 disk
sdc              8:32   0 1000G  0 disk
└─bcache0      252:0    0 1000G  0 disk
3. Start a random read workload on the bcache device (using fio):
$ fio --name=basic --filename=/dev/bcache0 --size=1000G \
    --rw=randread --blocksize=256k --blockalign=256k
4. After a while I see that the cache device gets unregistered.
However, the application output indicates it saw an I/O error on a
read request:
fio: io_u error on file /dev/bcache0: Input/output error: read offset=592264298496, buflen=262144
I can see in the syslogs that bcache unregistered the cache. The logs
also show that there was an I/O error on the bcache device:
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.176867] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.186494] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.195743] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.204869] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.234722] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.246102] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.274013] bcache: bch_count_io_errors() dm-0: IO error on writing data to cache.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.289128] bcache: bch_cache_set_error() error on 427201f5-5c86-4890-9866-f9860e518041: dm-0: too many IO errors writing data to cache
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.289128] , disabling caching
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.306212] bcache: conditional_stop_bcache_device() stop_when_cache_set_failed of bcache0 is "auto" and cache is clean, keep it alive.
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.306543] Buffer I/O error on dev bcache0, logical block 144595776, async page read
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.316119] bcache: cached_dev_detach_finish() Caching disabled for sdc
Feb 1 19:47:23 armont-bcache-test kernel: [ 3327.316398] bcache: cache_set_free() Cache set 427201f5-5c86-4890-9866-f9860e518041 unregistered
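
As a cross-check, the offset reported by fio and the logical block in the
"Buffer I/O error" line appear to refer to the same location, assuming the
kernel message counts 4096-byte blocks (that block size is an assumption; it
is not stated in the logs). A minimal sketch of the arithmetic, with
illustrative variable names:

```shell
#!/bin/sh
# Sketch: relate the fio byte offset to the kernel's logical block number.
offset=592264298496   # from the fio error message
io_size=262144        # fio --blocksize=256k
block_size=4096       # assumed block size for "logical block"

logical_block=$(( offset / block_size ))
remainder=$(( offset % io_size ))

echo "logical block: $logical_block"
echo "offset modulo 256k: $remainder"
```

Under that assumption the result matches logical block 144595776 from the
log, and the offset is 256k-aligned, so the failed read in fio and the buffer
I/O error look like the same request.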
The steps above reproduce the problem most of the time, but not
always. In a few of the attempts, the cache was unregistered without
resulting in observable I/O errors.
Is this expected?
I'm running Linux kernel version 6.5.0.