From: David Howells <>
To: Max Kellermann <>
Subject: Re: fscache corruption in Linux 5.17?
Date: Tue, 19 Apr 2022 17:17:02 +0100
Message-ID: <>
In-Reply-To: <>

Max Kellermann <> wrote:

> At least one web server is still in this broken state right now.  So
> if you need anything from that server, tell me, and I'll get it.

Can you enlarge the trace buffer and turn on the following cachefiles
tracepoints:

echo 65536 >/sys/kernel/debug/tracing/buffer_size_kb
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_read/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_write/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_trunc/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_io_error/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_vfs_error/enable

Then try and trigger the bug if you can.  The trace can be viewed with:

cat /sys/kernel/debug/tracing/trace | less
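
If it's easier to script, the same setup could be wrapped up roughly as
follows.  This is only a sketch: the function names and the TRACING variable
are made up for illustration, and it assumes the tracing directory has the
layout used in the commands above.

```shell
# Root of the tracing directory.  Override TRACING if tracefs is mounted
# somewhere else (assumption: same layout as the paths used above).
TRACING=${TRACING:-/sys/kernel/debug/tracing}

# Write $1 (1 = on, 0 = off) to each of the five cachefiles events above.
set_cachefiles_events() {
    for ev in read write trunc io_error vfs_error; do
        echo "$1" >"$TRACING/events/cachefiles/cachefiles_$ev/enable" || return 1
    done
}

# Size the buffer as above, then enable the events.
start_cachefiles_trace() {
    echo 65536 >"$TRACING/buffer_size_kb" && set_cachefiles_events 1
}
```

Run "start_cachefiles_trace" as root before reproducing, and
"set_cachefiles_events 0" afterwards to switch the events off again.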

The problem very likely happens on write rather than read.  If you know of a
file that's corrupt, turn on the tracing above and read that file.  Then look
in the trace buffer and you should see the corresponding lines and they should
have the backing inode in them, marked "B=iiii" where "iiii" is the inode
number of the file in hex.  You should be able to examine the backing file by
finding it with something like:

	find /var/cache/fscache -inum $((0xiiii))

and see if you can see the corruption in there.  Note that there may be blocks
of zeroes corresponding to unfetched file blocks.
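
To save converting the hex by hand, something like the following could pull
the inode numbers out of the trace.  It's a sketch that assumes only that
each relevant line carries a "B=" field followed by hex digits, as described
above; the function name is made up.

```shell
# Read trace text on stdin and print each distinct backing inode ("B=iiii")
# as a decimal number suitable for "find -inum".
backing_inodes() {
    grep -o 'B=[0-9a-fA-F]\{1,\}' | sort -u | while IFS='=' read -r tag hex; do
        printf '%d\n' "0x$hex"
    done
}
```

For example:

	backing_inodes </sys/kernel/debug/tracing/trace

and then pass each number printed to "find /var/cache/fscache -inum".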

Also, what filesystem is backing your cachefiles cache?  It could be useful to
dump the extent list of the file.  You should be able to do this with
"filefrag -e".
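
Both pieces of information could be gathered in one go with something like
this.  It's a sketch with made-up function names; it assumes GNU coreutils
"stat" and that "filefrag" (from e2fsprogs) is installed, and reading the
extent map may need root.

```shell
# Print the type of the filesystem that holds path $1 (GNU coreutils stat).
fs_type() {
    stat -f -c %T "$1"
}

# Print the filesystem type and the extent list of backing file $1.
describe_backing_file() {
    echo "filesystem: $(fs_type "$1")"
    filefrag -e "$1"
}
```

Call "describe_backing_file" on the path that the find command above turned
up.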

As to why this happens, a write that's misaligned by 31 bytes should cause DIO
to a disk to fail - so it shouldn't be possible to write that.  However, I'm
doing fallocate and truncate on the file to shape it so that DIO will work on
it, so it's possible that there's a bug there.  The cachefiles_trunc trace
lines may help catch that.
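
For checking offsets and lengths by hand, the alignment test is just a
remainder.  A trivial sketch follows; the function name is made up, and the
512-byte block size in the example is only an assumption, so substitute
whatever the backing device actually requires for DIO.

```shell
# Succeed if byte offset or length $1 is a whole multiple of block size $2.
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}
```

So "is_aligned 4096 512" succeeds, whereas "is_aligned 4127 512" fails
because 4127 is 31 bytes past a block boundary.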

