linux-btrfs.vger.kernel.org archive mirror
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: "Holger Hoffstätte" <holger@applied-asynchrony.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Kernels 4.15..5.5: "WARNING: CPU: 2 PID: 4150 at fs/fs-writeback.c:2363 __writeback_inodes_sb_nr+0xa9/0xc0"
Date: Tue, 4 Feb 2020 11:49:18 -0500
Message-ID: <20200204164918.GC13306@hungrycats.org>
In-Reply-To: <65514978-506f-83fa-2c95-ee9ce3cbf5b4@applied-asynchrony.com>

On Tue, Feb 04, 2020 at 02:58:52PM +0100, Holger Hoffstätte wrote:
> On 2/4/20 6:04 AM, Zygo Blaxell wrote:
> > On Fri, Mar 22, 2019 at 12:17:32AM -0400, Zygo Blaxell wrote:
> > > When filesystems are mounted flushoncommit, I get this warning roughly
> > > every 30 seconds:
> > > 
> > > 	[ 4575.142805] WARNING: CPU: 3 PID: 4150 at fs/fs-writeback.c:2363 __writeback_inodes_sb_nr+0xa9/0xc0
> > > 	[ 4575.145567] Modules linked in: crct10dif_pclmul crc32_pclmul dm_cache_smq crc32c_intel dm_cache snd_pcm ghash_clmulni_intel aesni_intel sr_mod dm_persistent_data ppdev joydev dm_bio_prison aes_x86_64 crypto_simd snd_timer dm_bufio cryptd cdrom snd glue_helper dm_mod parport_pc soundcore sg floppy parport pcspkr psmouse bochs_drm rtc_cmos ide_pci_generic piix input_leds i2c_piix4 ide_core serio_raw evbug qemu_fw_cfg evdev ip_tables x_tables ipv6 crc_ccitt autofs4
> > > 	[ 4575.160021] CPU: 3 PID: 4150 Comm: btrfs-transacti Tainted: G        W         5.0.3-zb64+ #1
> > > 	[ 4575.162484] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > > 	[ 4575.164505] RIP: 0010:__writeback_inodes_sb_nr+0xa9/0xc0
> > > 	[ 4575.165809] Code: 0f b6 d2 e8 b9 f8 ff ff 48 89 ee 48 89 df e8 0e f8 ff ff 48 8b 44 24 48 65 48 33 04 25 28 00 00 00 75 0b 48 83 c4 50 5b 5d c3 <0f> 0b eb cb e8 4e e9 d6 ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00
> > > 	[ 4575.171927] RSP: 0018:ffffa9cac0eabde8 EFLAGS: 00010246
> > > 	[ 4575.173045] RAX: 0000000000000000 RBX: ffff9353e23af000 RCX: 0000000000000000
> > > 	[ 4575.175639] RDX: 0000000000000002 RSI: 0000000000030c67 RDI: ffffa9cac0eabe30
> > > 	[ 4575.177619] RBP: ffffa9cac0eabdec R08: ffffa9cac0eabdf0 R09: ffff9353f12da000
> > > 	[ 4575.179736] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9353e1980000
> > > 	[ 4575.181661] R13: ffff9353e1981430 R14: ffff9353f27e4260 R15: ffff9353e1981518
> > > 	[ 4575.183871] FS:  0000000000000000(0000) GS:ffff9353f6800000(0000) knlGS:0000000000000000
> > > 	[ 4575.185940] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > 	[ 4575.188072] CR2: 00007fb81841fa20 CR3: 00000002218c0006 CR4: 00000000001606e0
> > > 	[ 4575.190094] Call Trace:
> > > 	[ 4575.190828]  btrfs_commit_transaction+0x7a6/0x9e0
> > > 	[ 4575.192115]  ? start_transaction+0x91/0x4d0
> > > 	[ 4575.193197]  transaction_kthread+0x146/0x180
> > > 	[ 4575.194415]  kthread+0x106/0x140
> > > 	[ 4575.195403]  ? btrfs_cleanup_transaction+0x620/0x620
> > > 	[ 4575.196903]  ? kthread_park+0x90/0x90
> > > 	[ 4575.198412]  ret_from_fork+0x3a/0x50
> > > 	[ 4575.199374] irq event stamp: 54922780
> > > 	[ 4575.200218] hardirqs last  enabled at (54922779): [<ffffffffa3d5f2e2>] _raw_spin_unlock_irqrestore+0x32/0x60
> > > 	[ 4575.202753] hardirqs last disabled at (54922780): [<ffffffffa300379f>] trace_hardirqs_off_thunk+0x1a/0x1c
> > > 	[ 4575.205921] softirqs last  enabled at (54922378): [<ffffffffa40003a4>] __do_softirq+0x3a4/0x45f
> > > 	[ 4575.208350] softirqs last disabled at (54922361): [<ffffffffa30a3d44>] irq_exit+0xe4/0xf0
> > > 	[ 4575.210616] ---[ end trace 5309dcf3a1920eca ]---
> > > 
> > > For my own kernel builds I just comment out the line in fs-writeback.c,
> > > but that's not a real solution.
> > 
> > This still happens in 5.5.0.  No changes in behavior or workaround, no
> > apparent harmful effect, almost 2 years running in stress-testing and
> > production.
> > 
> > I, for one, am glad we fixed all those other bugs before doing anything
> > about this one.  It is utterly harmless.
> 
> This triggered my archeology itch. I had to go deeper.

You could start with this thread:

	https://www.spinics.net/lists/linux-btrfs/msg87752.html

> The warning goes all the way back to 2010 (kernel 2.6.x) when everything
> happened at FusionIO.
> 
> Commit [1] introduced it as preparation for [2].
> 
> The only caller of writeback_inodes_sb_nr is btrfs_writeback_inodes_sb_nr in
> (today's) space-info.c, where the mutex trylock was introduced in [3], apparently
> to work around a VFS function that didn't do it for btrfs at the time.
> 
> Flushoncommit was added by Sage Weil for Ceph's btrfs backend in [4], even
> before the WARN_ON, in 2009. We know how that story ended.
> 
> Why has nobody except you noticed this? Probably because the number of people
> actually using it or reporting bugs is.. very small. ¯\_(ツ)_/¯

I'm not the only one to notice, or report, e.g.

	https://www.spinics.net/lists/linux-btrfs/msg74496.html
	https://www.spinics.net/lists/linux-btrfs/msg72483.html
	https://github.com/Zygo/bees/issues/68

plus it comes up every now and then on IRC.  I have heard from other
users of flushoncommit that they also patch their kernels to get rid of
the WARN_ON (or make it WARN_ON_ONCE).
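For anyone who wants to check whether their own logs show the same warning, here is a rough sketch in Python. It only looks for the header line of the trace quoted above and reports the spacing between hits. The function names are made up for this example, and the regex assumes the dmesg timestamp format shown in the trace; the source line number (2363 here) is deliberately not pinned because it moves between kernel versions.

```python
import re

# Header line of the warning as it appears in the trace quoted above, e.g.
# "[ 4575.142805] WARNING: CPU: 3 PID: 4150 at fs/fs-writeback.c:2363 __writeback_inodes_sb_nr+0xa9/0xc0"
# Assumption: dmesg-style "[ seconds.micros]" timestamps are enabled.
WARN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+WARNING: CPU: \d+ PID: \d+ at "
    r"fs/fs-writeback\.c:\d+ __writeback_inodes_sb_nr\+"
)


def warning_timestamps(log_text):
    """Return the timestamp (seconds since boot) of every matching warning."""
    return [float(m.group("ts")) for m in WARN_RE.finditer(log_text)]


def warning_intervals(log_text):
    """Return the gaps in seconds between consecutive matching warnings."""
    ts = warning_timestamps(log_text)
    return [round(later - earlier, 6) for earlier, later in zip(ts, ts[1:])]
```

Feeding it the output of dmesg (or a saved kernel log) gives one interval per pair of consecutive warnings. Gaps of about 30 seconds would match what was reported at the start of this thread.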

The WARN_ON appears in btrfs starting in 4.15 after:

	https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce8ea7cc6eb3139f4c730d647325e69354159b0f

which rearranges some calls to put the fs-writeback.c WARN_ON on a code
path where it doesn't hold the lock.

To answer a question I asked in

	https://www.spinics.net/lists/linux-btrfs/msg87769.html

(and again in another message of this thread): cherry-picking
ce8ea7cc6eb3 into 4.14.107 makes 4.14.107 deadlock immediately, and
reverting the same commit makes kernel 4.15 and later deadlock
immediately.

btrfs crashes _much_ less often now than it did in 4.14.  Mounting with
noflushoncommit is starting to look like an option worth contemplating
for some workloads on 5.4.18+.  On the other hand, one of the reasons
why I use btrfs instead of other filesystems is that other filesystems
don't implement a sane equivalent of flushoncommit, and those use cases
aren't going away any time soon.
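To see which filesystems are affected on a given machine, one option is to look for the flushoncommit option in the mount table. The following is a minimal sketch, assuming the option is listed in the fourth field of /proc/mounts when it is active; the function name is hypothetical and the parsing follows the usual whitespace-separated /proc/mounts layout.

```python
def flushoncommit_mounts(mounts_text):
    """Return mount points of btrfs filesystems mounted with flushoncommit.

    mounts_text is expected to look like /proc/mounts:
    device, mount point, filesystem type, comma-separated options, ...
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        # Compare whole option names so "noflushoncommit" does not match.
        if fstype == "btrfs" and "flushoncommit" in options.split(","):
            result.append(mount_point)
    return result
```

Calling it with the contents of /proc/mounts returns the mount points where the warning described in this thread would be expected on 4.15 and later.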

> Unfortunately I'm still none the wiser why btrfs feels it's necessary to
> "open-code"/circumvent the rwsem check. Maybe this gives you a clue.
> 
> cheers,
> Holger
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/fs-writeback.c?id=cf37e972478ec58a8a54a6b4f951815f0ae28f78
> 
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/fs-writeback.c?id=d19de7edf59cdd586777b009e0e8fbe5412dd35f
> 
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/btrfs/extent-tree.c?id=925a6efb8ff0c2bdbec107ed9890e62650c83306
> 
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dccae99995089641fbac452ebc7f0cab18751ddb

Thread overview: 12+ messages
2019-03-22  4:17 Kernels 4.15..5.0.3: "WARNING: CPU: 2 PID: 4150 at fs/fs-writeback.c:2363 __writeback_inodes_sb_nr+0xa9/0xc0" Zygo Blaxell
2019-03-22  7:32 ` Nikolay Borisov
2019-03-22 15:59   ` David Sterba
2019-03-22 17:26     ` Filipe Manana
2019-03-26 23:13       ` Zygo Blaxell
2019-03-26 23:19         ` Filipe Manana
2019-03-22 18:15     ` Zygo Blaxell
2019-05-18 21:11 ` Kernels 4.15..5.1.3: " Zygo Blaxell
2020-02-04  5:04 ` Kernels 4.15..5.5: " Zygo Blaxell
2020-02-04 13:58   ` Holger Hoffstätte
2020-02-04 16:49     ` Zygo Blaxell [this message]
2020-03-27  5:59       ` Kernel 5.5.8 : " Chris Murphy
