All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Soren Hansen <soren@linux2go.dk>
Cc: linux-kernel@vger.kernel.org, Dave Hansen <haveblue@us.ibm.com>
Subject: Re: nbd locking problems
Date: Fri, 28 Jan 2011 21:18:26 +0000	[thread overview]
Message-ID: <20110128211825.GA20707@mail.hallyn.com> (raw)
In-Reply-To: <AANLkTimP=5A-BNHUwV0qhvjw8y4HevMvXtOBpRr6vFYR@mail.gmail.com>

Quoting Soren Hansen (soren@linux2go.dk):
> I'm seeing the following lockup on a current kernel:
> 
> [117360.880311] INFO: task qemu-nbd:16115 blocked for more than 120 seconds.
> [117360.880319] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [117360.880326] qemu-nbd      D 0000000000000001     0 16115  16114 0x00000004
> [117360.880337]  ffff88007d775d98 0000000000000082 ffff88007d775fd8
> ffff88007d774000
> [117360.880348]  0000000000013a80 ffff8800020347e0 ffff88007d775fd8
> 0000000000013a80
> [117360.880359]  ffff880133730000 ffff880002034440 ffffea0004333db8
> ffffffffa071c020
> [117360.880369] Call Trace:
> [117360.880389]  [<ffffffff815b9997>] __mutex_lock_slowpath+0xf7/0x180
> [117360.880400]  [<ffffffff81132f13>] ? handle_mm_fault+0x1c3/0x410
> [117360.880408]  [<ffffffff815b93eb>] mutex_lock+0x2b/0x50
> [117360.880419]  [<ffffffffa071a21c>] nbd_ioctl+0x6c/0x1c0 [nbd]
> [117360.880429]  [<ffffffff812cb970>] blkdev_ioctl+0x230/0x730
> [117360.880438]  [<ffffffff811967a1>] block_ioctl+0x41/0x50
> [117360.880447]  [<ffffffff81175c03>] do_vfs_ioctl+0x93/0x370
> [117360.880455]  [<ffffffff8116fa83>] ? putname+0x33/0x50
> [117360.880463]  [<ffffffff81175f61>] sys_ioctl+0x81/0xa0
> [117360.880473]  [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b
> 
> It seems to be caused by 2a48fc0ab24241755dc93bfd4f01d68efab47f5a which
> replaced lock_kernel() with mutex operations.
> 
> In my attempts to debug this problem, I reverted that commit.
> 
> There's a printk statement that logs each ioctl call right before the
> lock_kernel call. I extended it to log the pid of the caller, and added
> a similar printk statement after unlock_kernel.
> 
> This worked for me, and I saw something like this in my dmesg:
> 
> [122322.650443] nbd9: in: nbd_ioctl pid=(24396) cmd=set-blksize(0xab01) arg=1024
> [122322.650454] nbd9: out: nbd_ioctl pid=(24396)
> cmd=set-blksize(0xab01) arg=1024
> [122322.650462] nbd9: in: nbd_ioctl pid=(24396)
> cmd=set-size-blocks(0xab07) arg=153600
> [122322.650469] nbd9: out: nbd_ioctl pid=(24396)
> cmd=set-size-blocks(0xab07) arg=153600
> [122322.650476] nbd9: in: nbd_ioctl pid=(24396) cmd=clear-sock(0xab04)
> arg=153600
> [122322.650483] nbd9: out: nbd_ioctl pid=(24396)
> cmd=clear-sock(0xab04) arg=153600
> [122322.650490] nbd9: in: nbd_ioctl pid=(24396) cmd=set-sock(0xab00) arg=3
> [122322.650497] nbd9: out: nbd_ioctl pid=(24396) cmd=set-sock(0xab00) arg=3
> [122322.651178] nbd9: in: nbd_ioctl pid=(24396) cmd=do-it(0xab03)
> arg=139925697560832
> [122322.652454]  nbd9: unknown partition table
> [122323.070522] nbd9: in: nbd_ioctl pid=(24427) cmd=clear-que(0xab05)
> arg=140267460190640
> [122323.070527] nbd9: out: nbd_ioctl pid=(24427) cmd=clear-que(0xab05)
> arg=140267460190640
> [122323.070530] nbd9: in: nbd_ioctl pid=(24427) cmd=disconnect(0xab08)
> arg=140267460190640
> [122323.070532] nbd9: NBD_DISCONNECT
> [122323.070555] nbd9: out: nbd_ioctl pid=(24427)
> cmd=disconnect(0xab08) arg=140267460190640
> [122323.070558] nbd9: in: nbd_ioctl pid=(24427) cmd=clear-sock(0xab04)
> arg=140267460190640
> [122323.070561] nbd9: out: nbd_ioctl pid=(24427)
> cmd=clear-sock(0xab04) arg=140267460190640
> [122323.070667] nbd9: Receive control failed (result -32)
> [122323.070710] nbd9: queue cleared
> [122323.071186] nbd9: out: nbd_ioctl pid=(24396) cmd=do-it(0xab03)
> arg=139925697560832
> [122323.071194] nbd9: in: nbd_ioctl pid=(24396) cmd=clear-que(0xab05) arg=32
> [122323.071197] nbd9: out: nbd_ioctl pid=(24396) cmd=clear-que(0xab05) arg=32
> [122323.071200] nbd9: in: nbd_ioctl pid=(24396) cmd=clear-sock(0xab04) arg=32
> [122323.071203] nbd9: out: nbd_ioctl pid=(24396) cmd=clear-sock(0xab04) arg=32
> 
> As I understand it, lock_kernel is implemented using a semaphore which
> allows a single process to invoke lock_kernel more than once without
> locking up. However, as can be seen above (timestamp 122322.651178), pid
> 24396 invokes the NBD_DO_IT ioctl. This calls lock_kernel and starts a
> kernel thread to do the heavy lifting of nbd. Later on, pid 24427 comes
> along, wanting to shut down the connection. It works. I don't understand
> how it makes it past the lock_kernel() call. There's only one instance
> of each of lock_kernel and unlock_kernel in the nbd driver, so I'm
> fairly certain unlock_kernel isn't being called (at least not by nbd
> directly).
> 
> With the mutex code introduced by
> 2a48fc0ab24241755dc93bfd4f01d68efab47f5a but with my extra printk's, this
> is what I get:
> 
> [125474.139952] nbd9: in: nbd_ioctl pid=(26015) cmd=set-blksize(0xab01) arg=1024
> [125474.139963] nbd9: out: nbd_ioctl pid=(26015)
> cmd=set-blksize(0xab01) arg=1024
> [125474.139971] nbd9: in: nbd_ioctl pid=(26015)
> cmd=set-size-blocks(0xab07) arg=153600
> [125474.139979] nbd9: out: nbd_ioctl pid=(26015)
> cmd=set-size-blocks(0xab07) arg=153600
> [125474.139986] nbd9: in: nbd_ioctl pid=(26015) cmd=clear-sock(0xab04)
> arg=153600
> [125474.139994] nbd9: out: nbd_ioctl pid=(26015)
> cmd=clear-sock(0xab04) arg=153600
> [125474.140001] nbd9: in: nbd_ioctl pid=(26015) cmd=set-sock(0xab00) arg=3
> [125474.140036] nbd9: out: nbd_ioctl pid=(26015) cmd=set-sock(0xab00) arg=3
> [125474.140660] nbd9: in: nbd_ioctl pid=(26015) cmd=do-it(0xab03)
> arg=140041726828800
> [125474.142255]  nbd9: unknown partition table
> [125475.604613] nbd9: in: nbd_ioctl pid=(26047) cmd=clear-que(0xab05)
> arg=139640717427120
> 
> ...and then it sits there until I kill -9 26015.  This makes sense to
> me, though. The mutex is held by 26015, so of course 26047 can't grab
> it, too. The mystery is why it works with lock_kernel. It doesn't
> release the kernel explicitly, and it doesn't (explicitly, at least)
> call schedule().

Thanks for looking into this, Soren.

As Dave just explained to me, BKL is released when you sleep :)  I
assume that's how it gets released around the kthread_create().  I
*think* you're right that the new mutex is superfluous, but I'd like to
look through the ioctl code and make sure there's no shared state which
we need to protect.  I don't see how there could be, or rather, if there
were, then it was broken before.

In the meantime, could you please post your patch?  (Is it the
three-liner I'd expect?)

thanks,
-serge

  reply	other threads:[~2011-01-28 21:18 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-26 10:52 nbd locking problems Soren Hansen
2011-01-28 21:18 ` Serge E. Hallyn [this message]
2011-01-28 21:30   ` Arnd Bergmann
2011-01-28 23:11   ` Soren Hansen
2011-03-13 10:28     ` Pavel Machek
2011-01-31 19:35   ` Serge E. Hallyn
2011-01-28 23:23 ` [PATCH] nbd: Remove module-level ioctl mutex Soren Hansen
2011-01-30 15:07 nbd locking problems Paul Clements

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110128211825.GA20707@mail.hallyn.com \
    --to=serge@hallyn.com \
    --cc=haveblue@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=soren@linux2go.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.