All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Pavel Reichl <preichl@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH v7 0/4] xfs: Remove wrappers for some semaphores
Date: Mon, 23 Mar 2020 09:33:42 -0700	[thread overview]
Message-ID: <20200323163342.GD29339@magnolia> (raw)
In-Reply-To: <CAJc7PzXuRHhYztic9vZsspiHiP-vL_0HANd8x76Y+OdRVw6wwg@mail.gmail.com>

On Mon, Mar 23, 2020 at 10:22:02AM +0100, Pavel Reichl wrote:
> Oh, thanks for the heads up...I'll try to investigate.

Ahah, I figured it out.  It took me a while to pin down a solid reproducer,
but here's a stack trace that I see most often:

[  812.102819] XFS: Assertion failed: xfs_isilocked(ip, XFS_ILOCK_EXCL), file: fs/xfs/libxfs/xfs_trans_inode.c, line: 91
[  812.104017] ------------[ cut here ]------------
[  812.104598] WARNING: CPU: 2 PID: 26250 at fs/xfs/xfs_message.c:112 assfail+0x30/0x50 [xfs]
[  812.105505] Modules linked in: xfs libcrc32c ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip_set_hash_ip ip_set_hash_net xt_tcpudp xt_set ip_set_hash_mac ip_set nfnetlink ip6table_filter ip6_tables iptable_filter bfq sch_fq_codel ip_tables x_tables nfsv4 af_packet [last unloaded: xfs]
[  812.108176] CPU: 2 PID: 26250 Comm: kworker/2:1 Tainted: G        W         5.6.0-rc4-djw #rc4
[  812.110742] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
[  812.111724] Workqueue: xfsalloc xfs_btree_split_worker [xfs]
[  812.112404] RIP: 0010:assfail+0x30/0x50 [xfs]
[  812.112905] Code: 41 89 c8 48 89 d1 48 89 f2 48 c7 c6 10 5d 37 a0 e8 c5 f8 ff ff 0f b6 1d fa 5b 11 00 80 fb 01 0f 87 aa 1b 06 00 83 e3 01 75 04 <0f> 0b 5b c3 0f 0b 48 c7 c7 80 a6 40 a0 e8 4d b8 0d e1 0f 1f 40 00
[  812.114912] RSP: 0018:ffffc90000c7fc48 EFLAGS: 00010246
[  812.115508] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  812.116286] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa0364b4d
[  812.117087] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[  812.117875] R10: 000000000000000a R11: f000000000000000 R12: ffff888078695700
[  812.118661] R13: 0000000000000000 R14: ffff8880787c4460 R15: 0000000000000000
[  812.119445] FS:  0000000000000000(0000) GS:ffff88807e000000(0000) knlGS:0000000000000000
[  812.120311] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  812.120969] CR2: 00007fe67192deb8 CR3: 00000000713a1002 CR4: 00000000001606a0
[  812.121755] Call Trace:
[  812.122097]  xfs_trans_log_inode+0x128/0x140 [xfs]
[  812.122685]  xfs_bmbt_alloc_block+0x133/0x230 [xfs]
[  812.123286]  __xfs_btree_split+0x16d/0x980 [xfs]
[  812.123876]  xfs_btree_split_worker+0x61/0x90 [xfs]
[  812.124446]  process_one_work+0x250/0x5c0
[  812.124910]  ? worker_thread+0xcf/0x3a0
[  812.125380]  worker_thread+0x3d/0x3a0
[  812.125807]  ? process_one_work+0x5c0/0x5c0
[  812.126297]  kthread+0x121/0x140
[  812.126684]  ? kthread_park+0x80/0x80
[  812.127111]  ret_from_fork+0x3a/0x50
[  812.127545] irq event stamp: 75298
[  812.127947] hardirqs last  enabled at (75297): [<ffffffff810d7f98>] console_unlock+0x428/0x580
[  812.128895] hardirqs last disabled at (75298): [<ffffffff81001db0>] trace_hardirqs_off_thunk+0x1a/0x1c
[  812.129942] softirqs last  enabled at (74892): [<ffffffffa02c9f2b>] xfs_buf_find+0xa6b/0x1130 [xfs]
[  812.130984] softirqs last disabled at (74890): [<ffffffffa02c98c0>] xfs_buf_find+0x400/0x1130 [xfs]
[  812.131993] ---[ end trace 82fd2c9f1faba927 ]---

This one I could reproduce on my laptop by running generic/324 with a 1k
blocksize, but on my test vm fleet it was xfs/057 with 4k blocks.  YMMV.

Anyway -- I augmented isilocked with some extra trace_printk to see what
was going on, and noticed this:

161011:           <...>-28177 [000]   803.593121: xfs_ilock:            dev 8:80 ino 0x46 flags ILOCK_EXCL caller xfs_alloc_file_space
161022:           <...>-28177 [000]   803.593126: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
161038:           <...>-28177 [000]   803.593132: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
161048:           <...>-28177 [000]   803.593136: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
161064:           <...>-28177 [000]   803.593172: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
161083:           <...>-27081 [000]   803.593559: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 0
<assertion blows up>
162017:           <...>-28177 [001]   803.641200: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162036:           <...>-28177 [001]   803.641215: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162057:           <...>-28177 [001]   803.641224: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162073:           <...>-28177 [001]   803.641243: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162091:           <...>-28177 [001]   803.641260: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162106:           <...>-28177 [001]   803.641288: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162127:           <...>-28177 [001]   803.641318: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162140:           <...>-28177 [001]   803.641397: bprint:               xfs_isilocked: ino 0x46 lockf 0x4:0x1 debug_locks 1 arg 0 locked? 1
162158:           <...>-28177 [001]   803.641416: xfs_iunlock:          dev 8:80 ino 0x46 flags ILOCK_EXCL caller xfs_alloc_file_space

Process 28177 takes the ilock, and tries to do a bmapi_write which
involves a bmap btree split.  The kernel libxfs delegates the split to a
workqueue (apparently to reduce stack usage?), which performs the bmbt
split and logs the inode core.  The kworker is process 27081.

lockdep tracks the rwsem's lock state /and/ which process actually
holds the rwsem.  This ownership doesn't transfer from 28177 to 27081,
so when the kworker asks lockdep if it holds ILOCK_EXCL, lockdep says
no, because 27081 doesn't own the lock, 28177 does.  Kaboom.

The old mrlock_t had that 'int mr_writer' field which didn't care about
lock ownership and so isilocked would return true and so the assert was
happy.

So now comes the fun part -- the old isilocked code had a glaring hole
in which it would return true if *anyone* held the lock, even if the
owner is some other unrelated thread.  That's probably good enough for
most of the fstests because we generally only run one thread at a time,
and developers will probably notice. :)

However, with your series applied, the isilocked function becomes much
more powerful when lockdep is active because now we can test that the
lock is held *by the caller*, which closes that hole.

Unfortunately, it also trips over this bmap split case, so if there's a
way to solve that problem we'd better find it quickly.  Unfortunately, I
don't know of a way to gift a lock to another thread temporarily...

Thoughts?

--D

> On Mon, Mar 23, 2020 at 4:28 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > On Fri, Mar 20, 2020 at 10:03:13PM +0100, Pavel Reichl wrote:
> > > Remove some wrappers that we have in XFS around the read-write semaphore
> > > locks.
> > >
> > > The goal of this cleanup is to remove mrlock_t structure and its mr*()
> > > wrapper functions and replace it with native rw_semaphore type and its
> > > native calls.
> >
> > Hmmm, there's something funny about this patchset that causes my fstests
> > vm to explode with isilocked assertions everywhere... I'll look more
> > tomorrow (it's still the weekend here) but figured I should tell you
> > sooner than later.
> >
> > --D
> >
> > > Pavel Reichl (4):
> > >   xfs: Refactor xfs_isilocked()
> > >   xfs: clean up whitespace in xfs_isilocked() calls
> > >   xfs: xfs_isilocked() can only check a single lock type
> > >   xfs: replace mrlock_t with rw_semaphores
> > >
> > >  fs/xfs/libxfs/xfs_bmap.c |   8 +--
> > >  fs/xfs/mrlock.h          |  78 -----------------------------
> > >  fs/xfs/xfs_file.c        |   3 +-
> > >  fs/xfs/xfs_inode.c       | 104 ++++++++++++++++++++++++++-------------
> > >  fs/xfs/xfs_inode.h       |  25 ++++++----
> > >  fs/xfs/xfs_iops.c        |   4 +-
> > >  fs/xfs/xfs_linux.h       |   2 +-
> > >  fs/xfs/xfs_qm.c          |   2 +-
> > >  fs/xfs/xfs_super.c       |   6 +--
> > >  9 files changed, 98 insertions(+), 134 deletions(-)
> > >  delete mode 100644 fs/xfs/mrlock.h
> > >
> > > --
> > > 2.25.1
> > >
> >
> 

  reply	other threads:[~2020-03-23 16:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-20 21:03 [PATCH v7 0/4] xfs: Remove wrappers for some semaphores Pavel Reichl
2020-03-20 21:03 ` [PATCH v7 1/4] xfs: Refactor xfs_isilocked() Pavel Reichl
2020-03-20 21:03 ` [PATCH v7 2/4] xfs: clean up whitespace in xfs_isilocked() calls Pavel Reichl
2020-03-20 21:03 ` [PATCH v7 3/4] xfs: xfs_isilocked() can only check a single lock type Pavel Reichl
2020-03-20 21:03 ` [PATCH v7 4/4] xfs: replace mrlock_t with rw_semaphores Pavel Reichl
2020-03-23  3:28 ` [PATCH v7 0/4] xfs: Remove wrappers for some semaphores Darrick J. Wong
2020-03-23  9:22   ` Pavel Reichl
2020-03-23 16:33     ` Darrick J. Wong [this message]
2020-03-23 16:49       ` Christoph Hellwig
2020-03-23 21:59       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200323163342.GD29339@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=preichl@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.