From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Cc: "Williams, Dan J" <dan.j.williams@intel.com>,
	"hch@lst.de" <hch@lst.de>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>
Subject: 5.3-rc1 regression with XFS log recovery
Date: Fri, 16 Aug 2019 20:59:44 +0000
Message-ID: <e49a6a3a244db055995769eb844c281f93e50ab9.camel@intel.com>

Hi all,

When running the 'ndctl' unit tests against 5.3-rc kernels, I noticed a
frequent failure of the 'mmap.sh' test [1][2].

[1]: https://github.com/pmem/ndctl/blob/master/test/mmap.sh
[2]: https://github.com/pmem/ndctl/blob/master/test/mmap.c

While paring the test down further, I found that I can reproduce the
problem with just:

  mkfs.xfs -f /dev/pmem0
  mount /dev/pmem0 /mnt/mem

Here 'pmem0' is a legacy pmem namespace backed by memory reserved with
the memmap= kernel command line option. (Specifically, I have this:
memmap=3G!6G,3G!9G)
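
For readers unfamiliar with the syntax: each comma-separated memmap=
entry of the form nn!ss marks nn bytes of memory starting at offset ss
as protected (persistent) memory. The sketch below only splits the
string shown above into its entries. The function name decode_memmap is
made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): print one line per
# "size!start" entry of a memmap= value.
decode_memmap() {
    # Turn "a!b,c!d" into one entry per line, then split each on '!'.
    echo "$1" | tr ',' '\n' | while IFS='!' read -r size start; do
        echo "reserve $size starting at offset $start"
    done
}
```

Calling decode_memmap '3G!6G,3G!9G' lists two 3G regions, one starting
at 6G and one starting at 9G.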

The above mkfs/mount steps don't reproduce the problem 100% of the
time, but on my qemu-based setup it happens more than 75% of the time.
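
Since the failure is intermittent, a loop along the following lines
could be used to estimate the hit rate. This is only a sketch, not what
I actually ran: the variable names (DEV, MNT, RUNS, MKFS, MOUNT, UMOUNT)
and the function name are assumptions for illustration, and it has to
run as root against a scratch device because it reformats DEV on every
iteration.

```shell
#!/bin/sh
# Hypothetical repro loop: repeat mkfs + mount and count failed mounts.
# All names below are illustrative defaults, overridable from the environment.
DEV=${DEV:-/dev/pmem0}
MNT=${MNT:-/mnt/mem}
RUNS=${RUNS:-20}
MKFS=${MKFS:-mkfs.xfs}
MOUNT=${MOUNT:-mount}
UMOUNT=${UMOUNT:-umount}

repro_loop() {
    fail=0
    i=0
    while [ "$i" -lt "$RUNS" ]; do
        i=$((i + 1))
        # A failing mkfs is a setup problem, not the bug; bail out.
        $MKFS -f "$DEV" >/dev/null 2>&1 || return 2
        if $MOUNT "$DEV" "$MNT"; then
            # Mount worked; unmount so the next iteration starts clean.
            $UMOUNT "$MNT"
        else
            fail=$((fail + 1))
        fi
    done
    echo "$fail/$RUNS mounts failed"
}
```

After sourcing the file, running repro_loop prints a summary line such
as "N/20 mounts failed", and the kernel log can then be checked for the
xlog_clear_stale_blocks error.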

The kernel log shows the following when the mount fails:

   [Aug16 14:41] XFS (pmem0): Mounting V5 Filesystem
   [  +0.001856] XFS (pmem0): totally zeroed log
   [  +0.402616] XFS (pmem0): Internal error xlog_clear_stale_blocks(2) at line 1715 of file fs/xfs/xfs_log_recover.c.  Caller xlog_find_tail+0x230/0x340 [xfs]
   [  +0.001741] CPU: 7 PID: 1771 Comm: mount Tainted: G           O      5.2.0-rc4+ #112
   [  +0.000976] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
   [  +0.001516] Call Trace:
   [  +0.000351]  dump_stack+0x85/0xc0
   [  +0.000452]  xlog_clear_stale_blocks+0x16d/0x180 [xfs]
   [  +0.000665]  xlog_find_tail+0x230/0x340 [xfs]
   [  +0.000581]  xlog_recover+0x2b/0x160 [xfs]
   [  +0.000554]  xfs_log_mount+0x280/0x2a0 [xfs]
   [  +0.000561]  xfs_mountfs+0x415/0x860 [xfs]
   [  +0.000533]  ? xfs_mru_cache_create+0x18b/0x1f0 [xfs]
   [  +0.000665]  xfs_fs_fill_super+0x4b0/0x700 [xfs]
   [  +0.000638]  ? xfs_test_remount_options+0x60/0x60 [xfs]
   [  +0.000710]  mount_bdev+0x17f/0x1b0
   [  +0.000442]  legacy_get_tree+0x30/0x50
   [  +0.000467]  vfs_get_tree+0x28/0xf0
   [  +0.000436]  do_mount+0x2d4/0xa00
   [  +0.000411]  ? memdup_user+0x3e/0x70
   [  +0.000455]  ksys_mount+0xba/0xd0
   [  +0.000420]  __x64_sys_mount+0x21/0x30
   [  +0.000473]  do_syscall_64+0x60/0x240
   [  +0.000460]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
   [  +0.000655] RIP: 0033:0x7f730fec91be
   [  +0.000506] Code: 48 8b 0d cd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9a 1c 0c 00 f7 d8 64 89 01 48
   [  +0.002305] RSP: 002b:00007ffdadbdb178 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
   [  +0.000922] RAX: ffffffffffffffda RBX: 000055b8f9db8a40 RCX: 00007f730fec91be
   [  +0.000875] RDX: 000055b8f9dbfdb0 RSI: 000055b8f9dbb930 RDI: 000055b8f9db8c20
   [  +0.000917] RBP: 00007f731007f1a4 R08: 0000000000000000 R09: 000055b8f9dc01f0
   [  +0.000942] R10: 00000000c0ed0000 R11: 0000000000000246 R12: 0000000000000000
   [  +0.000878] R13: 00000000c0ed0000 R14: 000055b8f9db8c20 R15: 000055b8f9dbfdb0
   [  +0.000915] XFS (pmem0): failed to locate log tail
   [  +0.000622] XFS (pmem0): log mount/recovery failed: error -117
   [  +0.012560] XFS (pmem0): log mount failed


A bisect pointed to this commit:

commit 6ad5b3255b9e3d6d94154738aacd5119bf9c8f6e (HEAD -> bisect-bad, refs/bisect/bad)
Author: Christoph Hellwig <hch@lst.de>
Date:   Fri Jun 28 19:27:26 2019 -0700

    xfs: use bios directly to read and write the log recovery buffers
    
    The xfs_buf structure is basically used as a glorified container for
    a memory allocation in the log recovery code.  Replace it with a
    call to kmem_alloc_large and a simple abstraction to read into or
    write from it synchronously using chained bios.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Full bisect log follows at the end.

I saw [3], but I can still easily hit the failure after manually
applying that patch on the above commit.

[3]: https://lore.kernel.org/linux-xfs/20190709152352.27465-1-hch@lst.de/

Any thoughts on what might be happening? I'd be happy to test out
theories/patches.

Thanks,
	-Vishal

