From: Sasha Levin <sashal@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm@lists.01.org, peterz@infradead.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH v2 6/7] libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
Date: Wed, 17 Jul 2019 22:04:48 -0400
Message-ID: <20190718020448.GE3079@sasha-vm>
In-Reply-To: <156341210094.292348.2384694131126767789.stgit@dwillia2-desk3.amr.corp.intel.com>

On Wed, Jul 17, 2019 at 06:08:21PM -0700, Dan Williams wrote:
>A multithreaded namespace creation/destruction stress test currently
>deadlocks with the following lockup signature:
>
>    INFO: task ndctl:2924 blocked for more than 122 seconds.
>          Tainted: G           OE     5.2.0-rc4+ #3382
>    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>    ndctl           D    0  2924   1176 0x00000000
>    Call Trace:
>     ? __schedule+0x27e/0x780
>     schedule+0x30/0xb0
>     wait_nvdimm_bus_probe_idle+0x8a/0xd0 [libnvdimm]
>     ? finish_wait+0x80/0x80
>     uuid_store+0xe6/0x2e0 [libnvdimm]
>     kernfs_fop_write+0xf0/0x1a0
>     vfs_write+0xb7/0x1b0
>     ksys_write+0x5c/0xd0
>     do_syscall_64+0x60/0x240
>
>     INFO: task ndctl:2923 blocked for more than 122 seconds.
>           Tainted: G           OE     5.2.0-rc4+ #3382
>     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>     ndctl           D    0  2923   1175 0x00000000
>     Call Trace:
>      ? __schedule+0x27e/0x780
>      ? __mutex_lock+0x489/0x910
>      schedule+0x30/0xb0
>      schedule_preempt_disabled+0x11/0x20
>      __mutex_lock+0x48e/0x910
>      ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
>      ? __lock_acquire+0x23f/0x1710
>      ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
>      nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
>      __dax_pmem_probe+0x5e/0x210 [dax_pmem_core]
>      ? nvdimm_bus_probe+0x1d0/0x2c0 [libnvdimm]
>      dax_pmem_probe+0xc/0x20 [dax_pmem]
>      nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]
>      really_probe+0xef/0x390
>      driver_probe_device+0xb4/0x100
>
>In this sequence an 'nd_dax' device is being probed and trying to take
>the lock on its backing namespace to validate that the 'nd_dax' device
>indeed has exclusive access to the backing namespace. Meanwhile, another
>thread is trying to update the uuid property of that same backing
>namespace. So one thread is in the probe path trying to acquire the
>lock, and the other thread has acquired the lock and tries to flush the
>probe path.
>
>Fix this deadlock by not holding the namespace device_lock over the
>wait_nvdimm_bus_probe_idle() synchronization step. In turn this requires
>the device_lock to be held on entry to wait_nvdimm_bus_probe_idle() and
>subsequently dropped internally to wait_nvdimm_bus_probe_idle().
>
>Cc: <stable@vger.kernel.org>
>Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
>Cc: Vishal Verma <vishal.l.verma@intel.com>
>Tested-by: Jane Chu <jane.chu@oracle.com>
>Signed-off-by: Dan Williams <dan.j.williams@intel.com>
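
As a rough illustration of the fix described in the quoted commit
message, the sketch below models the pattern in userspace C. It is an
analogy under stated assumptions, not the libnvdimm code: pthread
mutexes stand in for device_lock() and the bus lock, and fake_bus,
fake_dev, wait_probe_idle(), probe_thread() and run_demo() are
hypothetical names. The waiter enters with the device lock held, drops
it while in-flight probes drain, and retakes it before returning.

```c
/*
 * Userspace analogy only, not kernel code. pthread mutexes stand in for
 * device_lock() and the bus lock; every name here is hypothetical.
 * Build with: cc -pthread demo.c
 */
#include <assert.h>
#include <pthread.h>
#include <sched.h>

struct fake_bus {
	pthread_mutex_t lock;	/* protects probe_active */
	pthread_cond_t idle;	/* broadcast when probe_active reaches 0 */
	int probe_active;	/* probes currently in flight */
};

struct fake_dev {
	pthread_mutex_t lock;	/* stand-in for the namespace device_lock */
	struct fake_bus *bus;
	int probed;		/* set by the probe path */
	int uuid;		/* set by the store path */
};

/* Entered with dev->lock held; returns with dev->lock held again. */
static void wait_probe_idle(struct fake_dev *dev)
{
	struct fake_bus *bus = dev->bus;

	pthread_mutex_lock(&bus->lock);
	while (bus->probe_active > 0) {
		/* Drop the device lock so a probe blocked on it can finish. */
		pthread_mutex_unlock(&dev->lock);
		while (bus->probe_active > 0)
			pthread_cond_wait(&bus->idle, &bus->lock);
		/* Retake both locks in the usual order: device, then bus. */
		pthread_mutex_unlock(&bus->lock);
		pthread_mutex_lock(&dev->lock);
		pthread_mutex_lock(&bus->lock);
	}
	pthread_mutex_unlock(&bus->lock);
}

/* Probe path: mark a probe in flight, then take the device lock. */
static void *probe_thread(void *arg)
{
	struct fake_dev *dev = arg;
	struct fake_bus *bus = dev->bus;

	pthread_mutex_lock(&bus->lock);
	bus->probe_active++;
	pthread_mutex_unlock(&bus->lock);

	pthread_mutex_lock(&dev->lock);
	dev->probed = 1;
	pthread_mutex_unlock(&dev->lock);

	pthread_mutex_lock(&bus->lock);
	bus->probe_active--;
	pthread_cond_broadcast(&bus->idle);
	pthread_mutex_unlock(&bus->lock);
	return NULL;
}

/* Store path: returns 1 if the probe ran and the update was applied. */
int run_demo(void)
{
	struct fake_bus bus;
	struct fake_dev dev;
	pthread_t probe;
	int active = 0;

	pthread_mutex_init(&bus.lock, NULL);
	pthread_cond_init(&bus.idle, NULL);
	bus.probe_active = 0;
	pthread_mutex_init(&dev.lock, NULL);
	dev.bus = &bus;
	dev.probed = 0;
	dev.uuid = 0;

	/* Take the device lock first, as an attribute store would. */
	pthread_mutex_lock(&dev.lock);
	if (pthread_create(&probe, NULL, probe_thread, &dev) != 0)
		return 0;

	/* Wait until the probe is in flight and about to need dev.lock. */
	while (!active) {
		pthread_mutex_lock(&bus.lock);
		active = bus.probe_active;
		pthread_mutex_unlock(&bus.lock);
		if (!active)
			sched_yield();
	}

	wait_probe_idle(&dev);	/* drops and retakes dev.lock internally */
	dev.uuid = 42;
	pthread_mutex_unlock(&dev.lock);

	pthread_join(probe, NULL);
	assert(bus.probe_active == 0);
	return dev.probed == 1 && dev.uuid == 42;
}
```

Calling run_demo() from a main() should return 1. If wait_probe_idle()
kept dev->lock held while waiting, the probe thread could never take
that lock and the two threads would block on each other, which is the
pattern shown in the two hung-task traces.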

Hi Dan,

The way these patches are split, this patch won't apply when we take it
to stable because it depends on "libnvdimm/bus: Prepare the nd_ioctl()
path to be re-entrant" (patch 4/7 in this series).

If you send another iteration of this patchset, could you please
re-order the patches so they apply cleanly to stable? That would reduce
mail exchanges later on and lower the risk of a mis-merge into stable.

If not, this mail should serve as a reference for us to double-check
the backport later.

--
Thanks,
Sasha

Thread overview: 31+ messages
2019-07-18  1:07 [PATCH v2 0/7] libnvdimm: Fix async operations and locking Dan Williams
2019-07-18  1:07 ` [PATCH v2 1/7] drivers/base: Introduce kill_device() Dan Williams
2019-07-18  2:29   ` Greg Kroah-Hartman
2019-07-19  0:45   ` Sasha Levin
2019-07-18  1:07 ` [PATCH v2 2/7] libnvdimm/bus: Prevent duplicate device_unregister() calls Dan Williams
2019-07-18  1:08 ` [PATCH v2 3/7] libnvdimm/region: Register badblocks before namespaces Dan Williams
2019-07-18 18:16   ` Verma, Vishal L
2019-07-18  1:08 ` [PATCH v2 4/7] libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant Dan Williams
2019-07-18 18:21   ` Verma, Vishal L
2019-07-18  1:08 ` [PATCH v2 5/7] libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl() Dan Williams
2019-07-18  1:08 ` [PATCH v2 6/7] libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock Dan Williams
2019-07-18  2:04   ` Sasha Levin [this message]
2019-07-18  6:39     ` Dan Williams
2019-07-18  1:08 ` [PATCH v2 7/7] driver-core, libnvdimm: Let device subsystems add local lockdep coverage Dan Williams
2019-07-18  2:28   ` Greg Kroah-Hartman
2019-07-18 16:09   ` Ira Weiny
