linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fs/pstore: Perform erase from a worker
@ 2017-03-17  9:52 Chris Wilson
  2017-03-20 17:49 ` Kees Cook
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Wilson @ 2017-03-17  9:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: intel-gfx, Chris Wilson, Tomi Sarvela, Kees Cook,
	Anton Vorontsov, Colin Cross, Tony Luck, Stefan Hajnoczi,
	Namhyung Kim, # v4 . 10+

In order to prevent a cyclic recursion between psi->read_mutex and the
inode_lock, we need to move the pse->erase to a worker.

[  605.374955] ======================================================
[  605.381281] [ INFO: possible circular locking dependency detected ]
[  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
[  605.392826] -------------------------------------------------------
[  605.399196] rm/7298 is trying to acquire lock:
[  605.403720]  (&psinfo->read_mutex){+.+.+.}, at: [<ffffffff813e183f>] pstore_unlink+0x3f/0xa0
[  605.412300]
[  605.412300] but task is already holding lock:
[  605.418237]  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff812157ec>] vfs_unlink+0x4c/0x19
0
[  605.427397]
[  605.427397] which lock already depends on the new lock.
[  605.427397]
[  605.435770]
[  605.435770] the existing dependency chain (in reverse order) is:
[  605.443396]
[  605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++++++}:
[  605.450347]        lock_acquire+0xc9/0x220
[  605.454551]        down_write+0x3f/0x70
[  605.458484]        pstore_mkfile+0x1f4/0x460
[  605.462835]        pstore_get_records+0x17a/0x320
[  605.467664]        pstore_fill_super+0xa4/0xc0
[  605.472205]        mount_single+0x89/0xb0
[  605.476314]        pstore_mount+0x13/0x20
[  605.480411]        mount_fs+0xf/0x90
[  605.484122]        vfs_kern_mount+0x66/0x170
[  605.488464]        do_mount+0x190/0xd50
[  605.492397]        SyS_mount+0x90/0xd0
[  605.496212]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  605.501496]
[  605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}:
[  605.507747]        __lock_acquire+0x1ac0/0x1bb0
[  605.512401]        lock_acquire+0xc9/0x220
[  605.516594]        __mutex_lock+0x6e/0x990
[  605.520755]        mutex_lock_nested+0x16/0x20
[  605.525279]        pstore_unlink+0x3f/0xa0
[  605.529465]        vfs_unlink+0xb5/0x190
[  605.533477]        do_unlinkat+0x24c/0x2a0
[  605.537672]        SyS_unlinkat+0x16/0x30
[  605.541781]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[  605.547067]
[  605.547067] other info that might help us debug this:
[  605.547067]
[  605.555221]  Possible unsafe locking scenario:
[  605.555221]
[  605.561280]        CPU0                    CPU1
[  605.565883]        ----                    ----
[  605.570502]   lock(&sb->s_type->i_mutex_key#14);
[  605.575217]                                lock(&psinfo->read_mutex);
[  605.581803]                                lock(&sb->s_type->i_mutex_key#14);
[  605.589159]   lock(&psinfo->read_mutex);
[  605.593156]
[  605.593156]  *** DEADLOCK ***
[  605.593156]
[  605.599214] 3 locks held by rm/7298:
[  605.602896]  #0:  (sb_writers#11){.+.+..}, at: [<ffffffff8122e85f>] mnt_want_write+0x1f/0x50
[  605.611490]  #1:  (&sb->s_type->i_mutex_key#14/1){+.+...}, at: [<ffffffff8121a7dc>] do_unlinkat+0
x11c/0x2a0
[  605.621417]  #2:  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff812157ec>] vfs_unlink+0x4c
/0x190
[  605.630995]
[  605.630995] stack backtrace:
[  605.635450] CPU: 7 PID: 7298 Comm: rm Not tainted 4.11.0-rc2-CI-CI_DRM_2352+ #1
[  605.642999] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2
017
[  605.652305] Call Trace:
[  605.654814]  dump_stack+0x67/0x92
[  605.658184]  print_circular_bug+0x1e0/0x2e0
[  605.662465]  __lock_acquire+0x1ac0/0x1bb0
[  605.666634]  ? retint_kernel+0x2d/0x2d
[  605.670456]  lock_acquire+0xc9/0x220
[  605.674112]  ? pstore_unlink+0x3f/0xa0
[  605.677970]  ? pstore_unlink+0x3f/0xa0
[  605.681818]  __mutex_lock+0x6e/0x990
[  605.685456]  ? pstore_unlink+0x3f/0xa0
[  605.689791]  ? pstore_unlink+0x3f/0xa0
[  605.694124]  ? vfs_unlink+0x4c/0x190
[  605.698310]  mutex_lock_nested+0x16/0x20
[  605.702859]  pstore_unlink+0x3f/0xa0
[  605.707021]  vfs_unlink+0xb5/0x190
[  605.711024]  do_unlinkat+0x24c/0x2a0
[  605.715194]  SyS_unlinkat+0x16/0x30
[  605.719275]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  605.724543] RIP: 0033:0x7f8b08073ed7
[  605.728676] RSP: 002b:00007ffe70eff628 EFLAGS: 00000206 ORIG_RAX: 0000000000000107
[  605.736929] RAX: ffffffffffffffda RBX: ffffffff8147ea93 RCX: 00007f8b08073ed7
[  605.744711] RDX: 0000000000000000 RSI: 0000000001450000 RDI: 00000000ffffff9c
[  605.752512] RBP: ffffc9000338ff88 R08: 0000000000000003 R09: 0000000000000000
[  605.760276] R10: 000000000000015e R11: 0000000000000206 R12: 0000000000000000
[  605.768040] R13: 00007ffe70eff750 R14: 000000000144ff70 R15: 0000000001451230
[  605.775800]  ? __this_cpu_preempt_check+0x13/0x20

Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Fixes: e9e360b08a44 ("pstore: Protect unlink with read_mutex")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100234
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: <stable@vger.kernel.org> # v4.10+
---
 fs/pstore/inode.c    | 37 ++++++++++++++++++++++++++++++-------
 fs/pstore/platform.c |  2 ++
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 57c0646479f5..dee5d34d9a64 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -186,6 +186,26 @@ static const struct file_operations pstore_file_operations = {
 	.release	= seq_release,
 };
 
+struct pstore_unlink_work {
+	struct work_struct work;
+	struct dentry *dentry;
+};
+
+static void pstore_unlink_work(struct work_struct *work)
+{
+	struct pstore_unlink_work *arg = container_of(work, typeof(*arg), work);
+	struct dentry *dentry = arg->dentry;
+	struct pstore_private *p = d_inode(dentry)->i_private;
+
+	mutex_lock(&p->psi->read_mutex);
+	p->psi->erase(p->type, p->id, p->count,
+		      d_inode(dentry)->i_ctime, p->psi);
+	mutex_unlock(&p->psi->read_mutex);
+
+	dput(dentry);
+	kfree(arg);
+}
+
 /*
  * When a file is unlinked from our file system we call the
  * platform driver to erase the record from persistent store.
@@ -193,20 +213,23 @@ static const struct file_operations pstore_file_operations = {
 static int pstore_unlink(struct inode *dir, struct dentry *dentry)
 {
 	struct pstore_private *p = d_inode(dentry)->i_private;
+	struct pstore_unlink_work *work;
 	int err;
 
 	err = pstore_check_syslog_permissions(p);
 	if (err)
 		return err;
 
-	if (p->psi->erase) {
-		mutex_lock(&p->psi->read_mutex);
-		p->psi->erase(p->type, p->id, p->count,
-			      d_inode(dentry)->i_ctime, p->psi);
-		mutex_unlock(&p->psi->read_mutex);
-	} else {
+	if (!p->psi->erase)
 		return -EPERM;
-	}
+
+	work = kmalloc(sizeof(*work), GFP_KERNEL);
+	if (!work)
+		return -ENOMEM;
+
+	work->dentry = dget(dentry);
+	INIT_WORK(&work->work, pstore_unlink_work);
+	schedule_work(&work->work);
 
 	return simple_unlink(dir, dentry);
 }
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index efab7b64925b..6ffca5ab4782 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -731,6 +731,8 @@ EXPORT_SYMBOL_GPL(pstore_register);
 
 void pstore_unregister(struct pstore_info *psi)
 {
+	flush_scheduled_work();
+
 	if (psi->flags & PSTORE_FLAGS_PMSG)
 		pstore_unregister_pmsg();
 	if (psi->flags & PSTORE_FLAGS_FTRACE)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/pstore: Perform erase from a worker
  2017-03-17  9:52 [PATCH] fs/pstore: Perform erase from a worker Chris Wilson
@ 2017-03-20 17:49 ` Kees Cook
  2017-03-21  5:58   ` Namhyung Kim
  0 siblings, 1 reply; 5+ messages in thread
From: Kees Cook @ 2017-03-20 17:49 UTC (permalink / raw)
  To: Chris Wilson
  Cc: LKML, intel-gfx, Tomi Sarvela, Anton Vorontsov, Colin Cross,
	Tony Luck, Stefan Hajnoczi, Namhyung Kim, # v4 . 10+

On Fri, Mar 17, 2017 at 2:52 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> In order to prevent a cyclic recursion between psi->read_mutex and the
> inode_lock, we need to move the pse->erase to a worker.
>
> [  605.374955] ======================================================
> [  605.381281] [ INFO: possible circular locking dependency detected ]
> [  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
> [  605.392826] -------------------------------------------------------
> [  605.399196] rm/7298 is trying to acquire lock:
> [  605.403720]  (&psinfo->read_mutex){+.+.+.}, at: [<ffffffff813e183f>] pstore_unlink+0x3f/0xa0
> [  605.412300]
> [  605.412300] but task is already holding lock:
> [  605.418237]  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff812157ec>] vfs_unlink+0x4c/0x19
> 0
> [  605.427397]
> [  605.427397] which lock already depends on the new lock.
> [  605.427397]
> [  605.435770]
> [  605.435770] the existing dependency chain (in reverse order) is:
> [  605.443396]
> [  605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++++++}:
> [  605.450347]        lock_acquire+0xc9/0x220
> [  605.454551]        down_write+0x3f/0x70
> [  605.458484]        pstore_mkfile+0x1f4/0x460
> [  605.462835]        pstore_get_records+0x17a/0x320
> [  605.467664]        pstore_fill_super+0xa4/0xc0
> [  605.472205]        mount_single+0x89/0xb0
> [  605.476314]        pstore_mount+0x13/0x20
> [  605.480411]        mount_fs+0xf/0x90
> [  605.484122]        vfs_kern_mount+0x66/0x170
> [  605.488464]        do_mount+0x190/0xd50
> [  605.492397]        SyS_mount+0x90/0xd0
> [  605.496212]        entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  605.501496]
> [  605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}:
> [  605.507747]        __lock_acquire+0x1ac0/0x1bb0
> [  605.512401]        lock_acquire+0xc9/0x220
> [  605.516594]        __mutex_lock+0x6e/0x990
> [  605.520755]        mutex_lock_nested+0x16/0x20
> [  605.525279]        pstore_unlink+0x3f/0xa0
> [  605.529465]        vfs_unlink+0xb5/0x190
> [  605.533477]        do_unlinkat+0x24c/0x2a0
> [  605.537672]        SyS_unlinkat+0x16/0x30
> [  605.541781]        entry_SYSCALL_64_fastpath+0x1c/0xb1

If I'm reading this right it's a race between mount and unlink...
that's quite a corner case. :)

> [  605.547067]
> [  605.547067] other info that might help us debug this:
> [  605.547067]
> [  605.555221]  Possible unsafe locking scenario:
> [  605.555221]
> [  605.561280]        CPU0                    CPU1
> [  605.565883]        ----                    ----
> [  605.570502]   lock(&sb->s_type->i_mutex_key#14);
> [  605.575217]                                lock(&psinfo->read_mutex);
> [  605.581803]                                lock(&sb->s_type->i_mutex_key#14);
> [  605.589159]   lock(&psinfo->read_mutex);

I haven't had time to dig much yet, but I wonder if the locking order
on unlink could just be reversed, and the deadlock would go away?

> [  605.593156]
> [  605.593156]  *** DEADLOCK ***
> [  605.593156]
> [  605.599214] 3 locks held by rm/7298:
> [  605.602896]  #0:  (sb_writers#11){.+.+..}, at: [<ffffffff8122e85f>] mnt_want_write+0x1f/0x50
> [  605.611490]  #1:  (&sb->s_type->i_mutex_key#14/1){+.+...}, at: [<ffffffff8121a7dc>] do_unlinkat+0
> x11c/0x2a0
> [  605.621417]  #2:  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff812157ec>] vfs_unlink+0x4c
> /0x190
> [  605.630995]
> [  605.630995] stack backtrace:
> [  605.635450] CPU: 7 PID: 7298 Comm: rm Not tainted 4.11.0-rc2-CI-CI_DRM_2352+ #1
> [  605.642999] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2
> 017
> [  605.652305] Call Trace:
> [  605.654814]  dump_stack+0x67/0x92
> [  605.658184]  print_circular_bug+0x1e0/0x2e0
> [  605.662465]  __lock_acquire+0x1ac0/0x1bb0
> [  605.666634]  ? retint_kernel+0x2d/0x2d
> [  605.670456]  lock_acquire+0xc9/0x220
> [  605.674112]  ? pstore_unlink+0x3f/0xa0
> [  605.677970]  ? pstore_unlink+0x3f/0xa0
> [  605.681818]  __mutex_lock+0x6e/0x990
> [  605.685456]  ? pstore_unlink+0x3f/0xa0
> [  605.689791]  ? pstore_unlink+0x3f/0xa0
> [  605.694124]  ? vfs_unlink+0x4c/0x190
> [  605.698310]  mutex_lock_nested+0x16/0x20
> [  605.702859]  pstore_unlink+0x3f/0xa0
> [  605.707021]  vfs_unlink+0xb5/0x190
> [  605.711024]  do_unlinkat+0x24c/0x2a0
> [  605.715194]  SyS_unlinkat+0x16/0x30
> [  605.719275]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  605.724543] RIP: 0033:0x7f8b08073ed7
> [  605.728676] RSP: 002b:00007ffe70eff628 EFLAGS: 00000206 ORIG_RAX: 0000000000000107
> [  605.736929] RAX: ffffffffffffffda RBX: ffffffff8147ea93 RCX: 00007f8b08073ed7
> [  605.744711] RDX: 0000000000000000 RSI: 0000000001450000 RDI: 00000000ffffff9c
> [  605.752512] RBP: ffffc9000338ff88 R08: 0000000000000003 R09: 0000000000000000
> [  605.760276] R10: 000000000000015e R11: 0000000000000206 R12: 0000000000000000
> [  605.768040] R13: 00007ffe70eff750 R14: 000000000144ff70 R15: 0000000001451230
> [  605.775800]  ? __this_cpu_preempt_check+0x13/0x20
>
> Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
> Fixes: e9e360b08a44 ("pstore: Protect unlink with read_mutex")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100234
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Anton Vorontsov <anton@enomsg.org>
> Cc: Colin Cross <ccross@android.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: <stable@vger.kernel.org> # v4.10+
> ---
>  fs/pstore/inode.c    | 37 ++++++++++++++++++++++++++++++-------
>  fs/pstore/platform.c |  2 ++
>  2 files changed, 32 insertions(+), 7 deletions(-)
>
> diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
> index 57c0646479f5..dee5d34d9a64 100644
> --- a/fs/pstore/inode.c
> +++ b/fs/pstore/inode.c
> @@ -186,6 +186,26 @@ static const struct file_operations pstore_file_operations = {
>         .release        = seq_release,
>  };
>
> +struct pstore_unlink_work {
> +       struct work_struct work;
> +       struct dentry *dentry;
> +};
> +
> +static void pstore_unlink_work(struct work_struct *work)
> +{
> +       struct pstore_unlink_work *arg = container_of(work, typeof(*arg), work);
> +       struct dentry *dentry = arg->dentry;
> +       struct pstore_private *p = d_inode(dentry)->i_private;
> +
> +       mutex_lock(&p->psi->read_mutex);
> +       p->psi->erase(p->type, p->id, p->count,
> +                     d_inode(dentry)->i_ctime, p->psi);
> +       mutex_unlock(&p->psi->read_mutex);
> +
> +       dput(dentry);
> +       kfree(arg);
> +}
> +
>  /*
>   * When a file is unlinked from our file system we call the
>   * platform driver to erase the record from persistent store.
> @@ -193,20 +213,23 @@ static const struct file_operations pstore_file_operations = {
>  static int pstore_unlink(struct inode *dir, struct dentry *dentry)
>  {
>         struct pstore_private *p = d_inode(dentry)->i_private;
> +       struct pstore_unlink_work *work;
>         int err;
>
>         err = pstore_check_syslog_permissions(p);
>         if (err)
>                 return err;
>
> -       if (p->psi->erase) {
> -               mutex_lock(&p->psi->read_mutex);
> -               p->psi->erase(p->type, p->id, p->count,
> -                             d_inode(dentry)->i_ctime, p->psi);
> -               mutex_unlock(&p->psi->read_mutex);
> -       } else {
> +       if (!p->psi->erase)
>                 return -EPERM;
> -       }
> +
> +       work = kmalloc(sizeof(*work), GFP_KERNEL);
> +       if (!work)
> +               return -ENOMEM;
> +
> +       work->dentry = dget(dentry);
> +       INIT_WORK(&work->work, pstore_unlink_work);
> +       schedule_work(&work->work);
>
>         return simple_unlink(dir, dentry);
>  }
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index efab7b64925b..6ffca5ab4782 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -731,6 +731,8 @@ EXPORT_SYMBOL_GPL(pstore_register);
>
>  void pstore_unregister(struct pstore_info *psi)
>  {
> +       flush_scheduled_work();
> +
>         if (psi->flags & PSTORE_FLAGS_PMSG)
>                 pstore_unregister_pmsg();
>         if (psi->flags & PSTORE_FLAGS_FTRACE)
> --
> 2.11.0
>

If there isn't a shorter solution, then yeah, we can add a worker, but
I'd rather avoid the complexity...

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/pstore: Perform erase from a worker
  2017-03-20 17:49 ` Kees Cook
@ 2017-03-21  5:58   ` Namhyung Kim
  2017-03-21 16:00     ` Chris Wilson
  2017-03-21 18:16     ` Kees Cook
  0 siblings, 2 replies; 5+ messages in thread
From: Namhyung Kim @ 2017-03-21  5:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Chris Wilson, LKML, intel-gfx, Tomi Sarvela, Anton Vorontsov,
	Colin Cross, Tony Luck, Stefan Hajnoczi, # v4 . 10+,
	kernel-team

Hello,

On Mon, Mar 20, 2017 at 10:49:16AM -0700, Kees Cook wrote:
> On Fri, Mar 17, 2017 at 2:52 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > In order to prevent a cyclic recursion between psi->read_mutex and the
> > inode_lock, we need to move the pse->erase to a worker.
> >
> > [  605.374955] ======================================================
> > [  605.381281] [ INFO: possible circular locking dependency detected ]
> > [  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
> > [  605.392826] -------------------------------------------------------
> > [  605.399196] rm/7298 is trying to acquire lock:
> > [  605.403720]  (&psinfo->read_mutex){+.+.+.}, at: [<ffffffff813e183f>] pstore_unlink+0x3f/0xa0
> > [  605.412300]
> > [  605.412300] but task is already holding lock:
> > [  605.418237]  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff812157ec>] vfs_unlink+0x4c/0x19
> > 0
> > [  605.427397]
> > [  605.427397] which lock already depends on the new lock.
> > [  605.427397]
> > [  605.435770]
> > [  605.435770] the existing dependency chain (in reverse order) is:
> > [  605.443396]
> > [  605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++++++}:
> > [  605.450347]        lock_acquire+0xc9/0x220
> > [  605.454551]        down_write+0x3f/0x70
> > [  605.458484]        pstore_mkfile+0x1f4/0x460
> > [  605.462835]        pstore_get_records+0x17a/0x320
> > [  605.467664]        pstore_fill_super+0xa4/0xc0
> > [  605.472205]        mount_single+0x89/0xb0
> > [  605.476314]        pstore_mount+0x13/0x20
> > [  605.480411]        mount_fs+0xf/0x90
> > [  605.484122]        vfs_kern_mount+0x66/0x170
> > [  605.488464]        do_mount+0x190/0xd50
> > [  605.492397]        SyS_mount+0x90/0xd0
> > [  605.496212]        entry_SYSCALL_64_fastpath+0x1c/0xb1
> > [  605.501496]
> > [  605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}:
> > [  605.507747]        __lock_acquire+0x1ac0/0x1bb0
> > [  605.512401]        lock_acquire+0xc9/0x220
> > [  605.516594]        __mutex_lock+0x6e/0x990
> > [  605.520755]        mutex_lock_nested+0x16/0x20
> > [  605.525279]        pstore_unlink+0x3f/0xa0
> > [  605.529465]        vfs_unlink+0xb5/0x190
> > [  605.533477]        do_unlinkat+0x24c/0x2a0
> > [  605.537672]        SyS_unlinkat+0x16/0x30
> > [  605.541781]        entry_SYSCALL_64_fastpath+0x1c/0xb1
> 
> If I'm reading this right it's a race between mount and unlink...
> that's quite a corner case. :)
> 
> > [  605.547067]
> > [  605.547067] other info that might help us debug this:
> > [  605.547067]
> > [  605.555221]  Possible unsafe locking scenario:
> > [  605.555221]
> > [  605.561280]        CPU0                    CPU1
> > [  605.565883]        ----                    ----
> > [  605.570502]   lock(&sb->s_type->i_mutex_key#14);
> > [  605.575217]                                lock(&psinfo->read_mutex);
> > [  605.581803]                                lock(&sb->s_type->i_mutex_key#14);
> > [  605.589159]   lock(&psinfo->read_mutex);
> 
> I haven't had time to dig much yet, but I wonder if the locking order
> on unlink could just be reversed, and the deadlock would go away?

IIUC, the unlink path locks a file in the root directory, while the
mount path locks the root directory.  Maybe we can use a subclass?
(not tested)

Thanks,
Namhyung


diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 06504b69575b..6eea6bcf90c8 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -394,7 +394,7 @@ int pstore_mkfile(struct pstore_record *record)
                break;
        }
 
-       inode_lock(d_inode(root));
+       inode_lock_nested(d_inode(root), I_MUTEX_PARENT);
 
        dentry = d_alloc_name(root, name);
        if (!dentry)

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/pstore: Perform erase from a worker
  2017-03-21  5:58   ` Namhyung Kim
@ 2017-03-21 16:00     ` Chris Wilson
  2017-03-21 18:16     ` Kees Cook
  1 sibling, 0 replies; 5+ messages in thread
From: Chris Wilson @ 2017-03-21 16:00 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Kees Cook, LKML, intel-gfx, Tomi Sarvela, Anton Vorontsov,
	Colin Cross, Tony Luck, Stefan Hajnoczi, # v4 . 10+,
	kernel-team

On Tue, Mar 21, 2017 at 02:58:48PM +0900, Namhyung Kim wrote:
> Hello,
> 
> On Mon, Mar 20, 2017 at 10:49:16AM -0700, Kees Cook wrote:
> > On Fri, Mar 17, 2017 at 2:52 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > In order to prevent a cyclic recursion between psi->read_mutex and the
> > > inode_lock, we need to move the pse->erase to a worker.
> > >
> > > [  605.374955] ======================================================
> > > [  605.381281] [ INFO: possible circular locking dependency detected ]
> > > [  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
> > > [  605.392826] -------------------------------------------------------
> > > [  605.399196] rm/7298 is trying to acquire lock:
> > > [  605.403720]  (&psinfo->read_mutex){+.+.+.}, at: [<ffffffff813e183f>] pstore_unlink+0x3f/0xa0
> > > [  605.412300]
> > > [  605.412300] but task is already holding lock:
> > > [  605.418237]  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff812157ec>] vfs_unlink+0x4c/0x19
> > > 0
> > > [  605.427397]
> > > [  605.427397] which lock already depends on the new lock.
> > > [  605.427397]
> > > [  605.435770]
> > > [  605.435770] the existing dependency chain (in reverse order) is:
> > > [  605.443396]
> > > [  605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++++++}:
> > > [  605.450347]        lock_acquire+0xc9/0x220
> > > [  605.454551]        down_write+0x3f/0x70
> > > [  605.458484]        pstore_mkfile+0x1f4/0x460
> > > [  605.462835]        pstore_get_records+0x17a/0x320
> > > [  605.467664]        pstore_fill_super+0xa4/0xc0
> > > [  605.472205]        mount_single+0x89/0xb0
> > > [  605.476314]        pstore_mount+0x13/0x20
> > > [  605.480411]        mount_fs+0xf/0x90
> > > [  605.484122]        vfs_kern_mount+0x66/0x170
> > > [  605.488464]        do_mount+0x190/0xd50
> > > [  605.492397]        SyS_mount+0x90/0xd0
> > > [  605.496212]        entry_SYSCALL_64_fastpath+0x1c/0xb1
> > > [  605.501496]
> > > [  605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}:
> > > [  605.507747]        __lock_acquire+0x1ac0/0x1bb0
> > > [  605.512401]        lock_acquire+0xc9/0x220
> > > [  605.516594]        __mutex_lock+0x6e/0x990
> > > [  605.520755]        mutex_lock_nested+0x16/0x20
> > > [  605.525279]        pstore_unlink+0x3f/0xa0
> > > [  605.529465]        vfs_unlink+0xb5/0x190
> > > [  605.533477]        do_unlinkat+0x24c/0x2a0
> > > [  605.537672]        SyS_unlinkat+0x16/0x30
> > > [  605.541781]        entry_SYSCALL_64_fastpath+0x1c/0xb1
> > 
> > If I'm reading this right it's a race between mount and unlink...
> > that's quite a corner case. :)
> > 
> > > [  605.547067]
> > > [  605.547067] other info that might help us debug this:
> > > [  605.547067]
> > > [  605.555221]  Possible unsafe locking scenario:
> > > [  605.555221]
> > > [  605.561280]        CPU0                    CPU1
> > > [  605.565883]        ----                    ----
> > > [  605.570502]   lock(&sb->s_type->i_mutex_key#14);
> > > [  605.575217]                                lock(&psinfo->read_mutex);
> > > [  605.581803]                                lock(&sb->s_type->i_mutex_key#14);
> > > [  605.589159]   lock(&psinfo->read_mutex);
> > 
> > I haven't had time to dig much yet, but I wonder if the locking order
> > on unlink could just be reversed, and the deadlock would go away?
> 
> IIUC, the unlink path locks a file in the root directory, while the
> mount path locks the root directory.  Maybe we can use a subclass?
> (not tested)

More puzzling, or just my confusion, reports from our CI farm say that
this patch breaks removing objects from pstote. :|

Will look forward to better suggestions on how to avoid lockdep.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fs/pstore: Perform erase from a worker
  2017-03-21  5:58   ` Namhyung Kim
  2017-03-21 16:00     ` Chris Wilson
@ 2017-03-21 18:16     ` Kees Cook
  1 sibling, 0 replies; 5+ messages in thread
From: Kees Cook @ 2017-03-21 18:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Chris Wilson, LKML, intel-gfx, Tomi Sarvela, Anton Vorontsov,
	Colin Cross, Tony Luck, Stefan Hajnoczi, # v4 . 10+,
	kernel-team

On Mon, Mar 20, 2017 at 10:58 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> Hello,
>
> On Mon, Mar 20, 2017 at 10:49:16AM -0700, Kees Cook wrote:
>> On Fri, Mar 17, 2017 at 2:52 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> > In order to prevent a cyclic recursion between psi->read_mutex and the
>> > inode_lock, we need to move the pse->erase to a worker.
>> >
>> > [  605.374955] ======================================================
>> > [  605.381281] [ INFO: possible circular locking dependency detected ]
>> > [  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
>> > [  605.392826] -------------------------------------------------------
>> > [  605.399196] rm/7298 is trying to acquire lock:
>> > [  605.403720]  (&psinfo->read_mutex){+.+.+.}, at: [<ffffffff813e183f>] pstore_unlink+0x3f/0xa0
>> > [  605.412300]
>> > [  605.412300] but task is already holding lock:
>> > [  605.418237]  (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff812157ec>] vfs_unlink+0x4c/0x19
>> > 0
>> > [  605.427397]
>> > [  605.427397] which lock already depends on the new lock.
>> > [  605.427397]
>> > [  605.435770]
>> > [  605.435770] the existing dependency chain (in reverse order) is:
>> > [  605.443396]
>> > [  605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++++++}:
>> > [  605.450347]        lock_acquire+0xc9/0x220
>> > [  605.454551]        down_write+0x3f/0x70
>> > [  605.458484]        pstore_mkfile+0x1f4/0x460
>> > [  605.462835]        pstore_get_records+0x17a/0x320
>> > [  605.467664]        pstore_fill_super+0xa4/0xc0
>> > [  605.472205]        mount_single+0x89/0xb0
>> > [  605.476314]        pstore_mount+0x13/0x20
>> > [  605.480411]        mount_fs+0xf/0x90
>> > [  605.484122]        vfs_kern_mount+0x66/0x170
>> > [  605.488464]        do_mount+0x190/0xd50
>> > [  605.492397]        SyS_mount+0x90/0xd0
>> > [  605.496212]        entry_SYSCALL_64_fastpath+0x1c/0xb1
>> > [  605.501496]
>> > [  605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}:
>> > [  605.507747]        __lock_acquire+0x1ac0/0x1bb0
>> > [  605.512401]        lock_acquire+0xc9/0x220
>> > [  605.516594]        __mutex_lock+0x6e/0x990
>> > [  605.520755]        mutex_lock_nested+0x16/0x20
>> > [  605.525279]        pstore_unlink+0x3f/0xa0
>> > [  605.529465]        vfs_unlink+0xb5/0x190
>> > [  605.533477]        do_unlinkat+0x24c/0x2a0
>> > [  605.537672]        SyS_unlinkat+0x16/0x30
>> > [  605.541781]        entry_SYSCALL_64_fastpath+0x1c/0xb1
>>
>> If I'm reading this right it's a race between mount and unlink...
>> that's quite a corner case. :)
>>
>> > [  605.547067]
>> > [  605.547067] other info that might help us debug this:
>> > [  605.547067]
>> > [  605.555221]  Possible unsafe locking scenario:
>> > [  605.555221]
>> > [  605.561280]        CPU0                    CPU1
>> > [  605.565883]        ----                    ----
>> > [  605.570502]   lock(&sb->s_type->i_mutex_key#14);
>> > [  605.575217]                                lock(&psinfo->read_mutex);
>> > [  605.581803]                                lock(&sb->s_type->i_mutex_key#14);
>> > [  605.589159]   lock(&psinfo->read_mutex);
>>
>> I haven't had time to dig much yet, but I wonder if the locking order
>> on unlink could just be reversed, and the deadlock would go away?
>
> IIUC, the unlink path locks a file in the root directory, while the
> mount path locks the root directory.  Maybe we can use a subclass?
> (not tested)
>
> Thanks,
> Namhyung
>
>
> diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
> index 06504b69575b..6eea6bcf90c8 100644
> --- a/fs/pstore/inode.c
> +++ b/fs/pstore/inode.c
> @@ -394,7 +394,7 @@ int pstore_mkfile(struct pstore_record *record)
>                 break;
>         }
>
> -       inode_lock(d_inode(root));
> +       inode_lock_nested(d_inode(root), I_MUTEX_PARENT);
>
>         dentry = d_alloc_name(root, name);
>         if (!dentry)

No luck, unfortunately...

In looking at other examples, I don't think the inode_lock is needed
at all? I see other uses of d_alloc_name() and d_add() without an
inode lock (proc, libfs, etc), and the locking documentation doesn't
seem to imply it either? This solves the lockdep, though it's unclear
to me if it is somehow unsafe (apologies for whitespace damage...):


diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 06504b69575b..3d83b13d2338 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -394,11 +394,9 @@ int pstore_mkfile(struct pstore_record *record)
                break;
        }

-       inode_lock(d_inode(root));
-
        dentry = d_alloc_name(root, name);
        if (!dentry)
-               goto fail_lockedalloc;
+               goto fail_private;

        inode->i_size = private->total_size = size;

@@ -413,16 +411,12 @@ int pstore_mkfile(struct pstore_record *record)
        list_add(&private->list, &allpstore);
        spin_unlock_irqrestore(&allpstore_lock, flags);

-       inode_unlock(d_inode(root));
-
        return 0;

-fail_lockedalloc:
-       inode_unlock(d_inode(root));
+fail_private:
        free_pstore_private(private);
 fail_alloc:
        iput(inode);
-
 fail:
        return rc;
 }


-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-03-21 18:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-17  9:52 [PATCH] fs/pstore: Perform erase from a worker Chris Wilson
2017-03-20 17:49 ` Kees Cook
2017-03-21  5:58   ` Namhyung Kim
2017-03-21 16:00     ` Chris Wilson
2017-03-21 18:16     ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).