linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, "stable@vger.kernel.org, dsterba@suse.cz,
	Filipe Manana"  <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>,
	Filipe Manana <fdmanana@suse.com>
Subject: [PATCH 5.10 28/29] btrfs: fix crash after non-aligned direct IO write with O_DSYNC
Date: Mon, 22 Feb 2021 13:13:22 +0100	[thread overview]
Message-ID: <20210222121025.628566179@linuxfoundation.org> (raw)
In-Reply-To: <20210222121019.444399883@linuxfoundation.org>

From: Filipe Manana <fdmanana@suse.com>

Whenever we attempt to do a non-aligned direct IO write with O_DSYNC, we
end up triggering an assertion and crashing. Example reproducer:

  $ cat test.sh
  #!/bin/bash

  DEV=/dev/sdj
  MNT=/mnt/sdj

  mkfs.btrfs -f $DEV > /dev/null
  mount $DEV $MNT

  # Do a direct IO write with O_DSYNC into a non-aligned range...
  xfs_io -f -d -s -c "pwrite -S 0xab -b 64K 1111 64K" $MNT/foobar

  umount $MNT

When running the reproducer an assertion fails and produces the following
trace:

  [ 2418.403134] assertion failed: !current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA, in fs/btrfs/space-info.c:1467
  [ 2418.403745] ------------[ cut here ]------------
  [ 2418.404306] kernel BUG at fs/btrfs/ctree.h:3286!
  [ 2418.404862] invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC PTI
  [ 2418.405451] CPU: 1 PID: 64705 Comm: xfs_io Tainted: G      D           5.10.15-btrfs-next-87 #1
  [ 2418.406026] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  [ 2418.407228] RIP: 0010:assertfail.constprop.0+0x18/0x26 [btrfs]
  [ 2418.407835] Code: e6 48 c7 (...)
  [ 2418.409078] RSP: 0018:ffffb06080d13c98 EFLAGS: 00010246
  [ 2418.409696] RAX: 000000000000006c RBX: ffff994c1debbf08 RCX: 0000000000000000
  [ 2418.410302] RDX: 0000000000000000 RSI: 0000000000000027 RDI: 00000000ffffffff
  [ 2418.410904] RBP: ffff994c21770000 R08: 0000000000000000 R09: 0000000000000000
  [ 2418.411504] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000010000
  [ 2418.412111] R13: ffff994c22198400 R14: ffff994c21770000 R15: 0000000000000000
  [ 2418.412713] FS:  00007f54fd7aff00(0000) GS:ffff994d35200000(0000) knlGS:0000000000000000
  [ 2418.413326] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 2418.413933] CR2: 000056549596d000 CR3: 000000010b928003 CR4: 0000000000370ee0
  [ 2418.414528] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [ 2418.415109] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [ 2418.415669] Call Trace:
  [ 2418.416254]  btrfs_reserve_data_bytes.cold+0x22/0x22 [btrfs]
  [ 2418.416812]  btrfs_check_data_free_space+0x4c/0xa0 [btrfs]
  [ 2418.417380]  btrfs_buffered_write+0x1b0/0x7f0 [btrfs]
  [ 2418.418315]  btrfs_file_write_iter+0x2a9/0x770 [btrfs]
  [ 2418.418920]  new_sync_write+0x11f/0x1c0
  [ 2418.419430]  vfs_write+0x2bb/0x3b0
  [ 2418.419972]  __x64_sys_pwrite64+0x90/0xc0
  [ 2418.420486]  do_syscall_64+0x33/0x80
  [ 2418.420979]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [ 2418.421486] RIP: 0033:0x7f54fda0b986
  [ 2418.421981] Code: 48 c7 c0 (...)
  [ 2418.423019] RSP: 002b:00007ffc40569c38 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
  [ 2418.423547] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54fda0b986
  [ 2418.424075] RDX: 0000000000010000 RSI: 000056549595e000 RDI: 0000000000000003
  [ 2418.424596] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000400
  [ 2418.425119] R10: 0000000000000400 R11: 0000000000000246 R12: 00000000ffffffff
  [ 2418.425644] R13: 0000000000000400 R14: 0000000000010000 R15: 0000000000000000
  [ 2418.426148] Modules linked in: btrfs blake2b_generic (...)
  [ 2418.429540] ---[ end trace ef2aeb44dc0afa34 ]---

1) At btrfs_file_write_iter() we set current->journal_info to
   BTRFS_DIO_SYNC_STUB;

2) We then call __btrfs_direct_write(), which calls btrfs_direct_IO();

3) We can't do the direct IO write because it starts at a non-aligned
   offset (1111). So at btrfs_direct_IO() we return -EINVAL (coming from
   check_direct_IO() which does the alignment check), but we leave
   current->journal_info set to BTRFS_DIO_SYNC_STUB - we only clear it
   at btrfs_dio_iomap_begin(), because we assume we always get there;

4) Then at __btrfs_direct_write() we see that the attempt to do the
   direct IO write was not successful, 0 bytes written, so we fallback
   to a buffered write by calling btrfs_buffered_write();

5) There we call btrfs_check_data_free_space() which in turn calls
   btrfs_alloc_data_chunk_ondemand() and that calls
   btrfs_reserve_data_bytes() with flush == BTRFS_RESERVE_FLUSH_DATA;

6) Then at btrfs_reserve_data_bytes() we have current->journal_info set to
   BTRFS_DIO_SYNC_STUB, therefore not NULL, and flush has the value
   BTRFS_RESERVE_FLUSH_DATA, triggering the second assertion:

  int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
                               enum btrfs_reserve_flush_enum flush)
  {
      struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
      int ret;

      ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
             flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE);
      ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
  (...)

So fix that by setting the journal to NULL whenever check_direct_IO()
returns a failure.

This bug only affects 5.10 kernels, and the regression was introduced in
5.10-rc1 by commit 0eb79294dbe328 ("btrfs: dio iomap DSYNC workaround").
The bug does not exist in 5.11 kernels due to commit ecfdc08b8cc65d
("btrfs: remove dio iomap DSYNC workaround"), which depends on a large
patchset that went into the merge window for 5.11. So this is a fix only
for 5.10.x stable kernels, as there are people hitting this bug.

Fixes: 0eb79294dbe328 ("btrfs: dio iomap DSYNC workaround")
CC: stable@vger.kernel.org # 5.10 (and only 5.10)
Acked-by: David Sterba <dsterba@suse.com>
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1181605
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/inode.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8026,8 +8026,12 @@ ssize_t btrfs_direct_IO(struct kiocb *io
 	bool relock = false;
 	ssize_t ret;
 
-	if (check_direct_IO(fs_info, iter, offset))
+	if (check_direct_IO(fs_info, iter, offset)) {
+		ASSERT(current->journal_info == NULL ||
+		       current->journal_info == BTRFS_DIO_SYNC_STUB);
+		current->journal_info = NULL;
 		return 0;
+	}
 
 	count = iov_iter_count(iter);
 	if (iov_iter_rw(iter) == WRITE) {



  parent reply	other threads:[~2021-02-22 12:31 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-22 12:12 [PATCH 5.10 00/29] 5.10.18-rc1 review Greg Kroah-Hartman
2021-02-22 12:12 ` [PATCH 5.10 01/29] vdpa_sim: remove hard-coded virtq count Greg Kroah-Hartman
2021-02-22 12:12 ` [PATCH 5.10 02/29] vdpa_sim: add struct vdpasim_dev_attr for device attributes Greg Kroah-Hartman
2021-02-22 12:12 ` [PATCH 5.10 03/29] vdpa_sim: store parsed MAC address in a buffer Greg Kroah-Hartman
2021-02-22 19:54   ` Pavel Machek
2021-02-23  4:49     ` Greg Kroah-Hartman
2021-02-23  8:06     ` Stefano Garzarella
2021-02-24  8:29       ` Pavel Machek
2021-02-24  8:36         ` Stefano Garzarella
2021-02-22 12:12 ` [PATCH 5.10 04/29] vdpa_sim: make config generic and usable for any device type Greg Kroah-Hartman
2021-02-22 12:12 ` [PATCH 5.10 05/29] vdpa_sim: add get_config callback in vdpasim_dev_attr Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 06/29] IB/isert: add module param to set sg_tablesize for IO cmd Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 07/29] net: qrtr: Fix port ID for control messages Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 08/29] mptcp: skip to next candidate if subflow has unacked data Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 09/29] net/sched: fix miss init the mru in qdisc_skb_cb Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 10/29] mt76: mt7915: fix endian issues Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 11/29] mt76: mt7615: fix rdd mcu cmd endianness Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 12/29] net: sched: incorrect Kconfig dependencies on Netfilter modules Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 13/29] net: openvswitch: fix TTL decrement exception action execution Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 14/29] net: bridge: Fix a warning when del bridge sysfs Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 15/29] net: fix proc_fs init handling in af_packet and tls Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 16/29] Xen/x86: dont bail early from clear_foreign_p2m_mapping() Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 17/29] Xen/x86: also check kernel mapping in set_foreign_p2m_mapping() Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 18/29] Xen/gntdev: correct dev_bus_addr handling in gntdev_map_grant_pages() Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 19/29] Xen/gntdev: correct error checking " Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 20/29] xen/arm: dont ignore return errors from set_phys_to_machine Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 21/29] xen-blkback: dont "handle" error by BUG() Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 22/29] xen-netback: " Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 23/29] xen-scsiback: " Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 24/29] xen-blkback: fix error handling in xen_blkbk_map() Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 25/29] tty: protect tty_write from odd low-level tty disciplines Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 26/29] Bluetooth: btusb: Always fallback to alt 1 for WBS Greg Kroah-Hartman
2021-02-22 12:13 ` [PATCH 5.10 27/29] btrfs: fix backport of 2175bf57dc952 in 5.10.13 Greg Kroah-Hartman
2021-02-22 12:13 ` Greg Kroah-Hartman [this message]
2021-02-22 12:13 ` [PATCH 5.10 29/29] media: pwc: Use correct device for DMA Greg Kroah-Hartman
2021-02-22 17:17 ` [PATCH 5.10 00/29] 5.10.18-rc1 review Florian Fainelli
2021-02-24 18:42   ` Greg Kroah-Hartman
2021-02-22 18:42 ` Pavel Machek
2021-02-24 18:42   ` Greg Kroah-Hartman
2021-02-22 21:28 ` Guenter Roeck
2021-02-22 21:34 ` Igor
2021-02-23  2:49 ` Naresh Kamboju
2021-02-23 21:06 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210222121025.628566179@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).