From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Dave Jones <davej@codemonkey.org.uk>,
Peter Zijlstra <peterz@infradead.org>,
Nick Piggin <npiggin@gmail.com>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Network Development <netdev@vger.kernel.org>
Subject: Re: [4.15-rc9] fs_reclaim lockdep trace
Date: Sun, 28 Jan 2018 14:55:28 +0900
Message-ID: <8f1c776d-b791-e0b9-1e5c-62b03dcd1d74@I-love.SAKURA.ne.jp>
In-Reply-To: <7771dd55-2655-d3a9-80ee-24c9ada7dbbe@I-love.SAKURA.ne.jp>
Dave, would you try the patch below?

From cae2cbf389ae3cdef1b492622722b4aeb07eb284 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sun, 28 Jan 2018 14:17:14 +0900
Subject: [PATCH] lockdep: Fix fs_reclaim warning.
Dave Jones reported fs_reclaim lockdep warnings.
============================================
WARNING: possible recursive locking detected
4.15.0-rc9-backup-debug+ #1 Not tainted
--------------------------------------------
sshd/24800 is trying to acquire lock:
(fs_reclaim){+.+.}, at: [<0000000084f438c2>] fs_reclaim_acquire.part.102+0x5/0x30
but task is already holding lock:
(fs_reclaim){+.+.}, at: [<0000000084f438c2>] fs_reclaim_acquire.part.102+0x5/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(fs_reclaim);
lock(fs_reclaim);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by sshd/24800:
#0: (sk_lock-AF_INET6){+.+.}, at: [<000000001a069652>] tcp_sendmsg+0x19/0x40
#1: (fs_reclaim){+.+.}, at: [<0000000084f438c2>] fs_reclaim_acquire.part.102+0x5/0x30
stack backtrace:
CPU: 3 PID: 24800 Comm: sshd Not tainted 4.15.0-rc9-backup-debug+ #1
Call Trace:
dump_stack+0xbc/0x13f
__lock_acquire+0xa09/0x2040
lock_acquire+0x12e/0x350
fs_reclaim_acquire.part.102+0x29/0x30
kmem_cache_alloc+0x3d/0x2c0
alloc_extent_state+0xa7/0x410
__clear_extent_bit+0x3ea/0x570
try_release_extent_mapping+0x21a/0x260
__btrfs_releasepage+0xb0/0x1c0
btrfs_releasepage+0x161/0x170
try_to_release_page+0x162/0x1c0
shrink_page_list+0x1d5a/0x2fb0
shrink_inactive_list+0x451/0x940
shrink_node_memcg.constprop.88+0x4c9/0x5e0
shrink_node+0x12d/0x260
try_to_free_pages+0x418/0xaf0
__alloc_pages_slowpath+0x976/0x1790
__alloc_pages_nodemask+0x52c/0x5c0
new_slab+0x374/0x3f0
___slab_alloc.constprop.81+0x47e/0x5a0
__slab_alloc.constprop.80+0x32/0x60
__kmalloc_track_caller+0x267/0x310
__kmalloc_reserve.isra.40+0x29/0x80
__alloc_skb+0xee/0x390
sk_stream_alloc_skb+0xb8/0x340
tcp_sendmsg_locked+0x8e6/0x1d30
tcp_sendmsg+0x27/0x40
inet_sendmsg+0xd0/0x310
sock_write_iter+0x17a/0x240
__vfs_write+0x2ab/0x380
vfs_write+0xfb/0x260
SyS_write+0xb6/0x140
do_syscall_64+0x1e5/0xc05
entry_SYSCALL64_slow_path+0x25/0x25
Since no fs locks are held, doing a GFP_KERNEL allocation should be safe
as long as the PF_MEMALLOC safeguard

	/* Avoid recursion of direct reclaim */
	if (p->flags & PF_MEMALLOC)
		goto nopage;

is in place, because it prevents infinite recursion.
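The effect of that safeguard can be illustrated with a small userspace model. This is only a sketch: `struct model_task`, the `model_*` functions and the flag value are hypothetical stand-ins, not kernel code. The model's reclaim path always allocates again, the way the btrfs releasepage callback does in the trace above, and the PF_MEMALLOC test is what stops the nesting after one level.

```c
/* Illustrative flag value only; not the kernel's definition. */
#define PF_MEMALLOC 0x1u

/* Hypothetical stand-in for the few task fields the model needs. */
struct model_task {
	unsigned int flags;
	int depth;      /* current nesting of direct reclaim */
	int max_depth;  /* deepest nesting observed so far */
};

static int model_alloc_slowpath(struct model_task *p);

/* Models direct reclaim whose callbacks allocate memory themselves. */
static void model_direct_reclaim(struct model_task *p)
{
	unsigned int saved = p->flags & PF_MEMALLOC;

	p->flags |= PF_MEMALLOC;             /* mark "inside reclaim" */
	if (++p->depth > p->max_depth)
		p->max_depth = p->depth;
	(void)model_alloc_slowpath(p);       /* nested allocation */
	p->depth--;
	p->flags = (p->flags & ~PF_MEMALLOC) | saved;
}

/* Models the allocator slow path with the recursion safeguard. */
static int model_alloc_slowpath(struct model_task *p)
{
	/* Avoid recursion of direct reclaim */
	if (p->flags & PF_MEMALLOC)
		return -1;                   /* the "nopage" outcome */
	model_direct_reclaim(p);
	return 0;
}

/* Runs one allocation from a fresh task and reports the nesting depth. */
static int model_max_reclaim_depth(void)
{
	struct model_task t = { 0, 0, 0 };

	(void)model_alloc_slowpath(&t);
	return t.max_depth;
}
```

In this model, removing the PF_MEMALLOC test from `model_alloc_slowpath()` would make the two functions call each other without bound, which is the recursion the safeguard exists to prevent.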
This warning seems to be caused by commit d92a8cfcb37ecd13
("locking/lockdep: Rework FS_RECLAIM annotation"), which moved the
location of the

	/* this guy won't enter reclaim */
	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
		return false;

check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
(__GFP_NOFS)"). Since __kmalloc_reserve() from __alloc_skb() adds
__GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() fails
to return false despite the PF_MEMALLOC context, which results in the
lockdep warning.
Since there was no PF_MEMALLOC safeguard as of cf40bd16fdad42c0, checking
__GFP_NOMEMALLOC might have made sense at that time. But since this
safeguard was added by commit 341ce06f69abfafa ("page allocator: calculate
the alloc_flags for allocation only once"), checking __GFP_NOMEMALLOC no
longer makes sense. Thus, let's remove the __GFP_NOMEMALLOC check and allow
__need_fs_reclaim() to return false.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Piggin <npiggin@gmail.com>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 76c9688..7804b0e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3583,7 +3583,7 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
 		return false;

 	/* this guy won't enter reclaim */
-	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
+	if (current->flags & PF_MEMALLOC)
 		return false;

 	/* We're only interested __GFP_FS allocations for now */
--
1.8.3.1
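
For illustration only, the difference between the old and new check can be modeled in userspace. This is a sketch under stated assumptions: the flag values and the `need_fs_reclaim_old()` / `need_fs_reclaim_new()` names are hypothetical, the task flags are passed in as a parameter instead of being read from `current`, and the `current_gfp_context()` and `__GFP_NOLOCKDEP` handling of the real function is left out.

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real definitions differ. */
#define PF_MEMALLOC        0x1u
#define GFP_DIRECT_RECLAIM 0x1u
#define GFP_FS             0x2u
#define GFP_NOMEMALLOC     0x4u

/* Model of the check before the patch. */
static bool need_fs_reclaim_old(unsigned int task_flags, unsigned int gfp_mask)
{
	/* no reclaim without waiting on it */
	if (!(gfp_mask & GFP_DIRECT_RECLAIM))
		return false;

	/* PF_MEMALLOC is honored only when __GFP_NOMEMALLOC is absent */
	if ((task_flags & PF_MEMALLOC) && !(gfp_mask & GFP_NOMEMALLOC))
		return false;

	/* only __GFP_FS allocations are of interest */
	return (gfp_mask & GFP_FS) != 0;
}

/* Model of the check after the patch. */
static bool need_fs_reclaim_new(unsigned int task_flags, unsigned int gfp_mask)
{
	/* no reclaim without waiting on it */
	if (!(gfp_mask & GFP_DIRECT_RECLAIM))
		return false;

	/* a PF_MEMALLOC task never enters reclaim, whatever the mask says */
	if (task_flags & PF_MEMALLOC)
		return false;

	/* only __GFP_FS allocations are of interest */
	return (gfp_mask & GFP_FS) != 0;
}
```

With a mask that carries the model's `GFP_NOMEMALLOC` bit and a task that has `PF_MEMALLOC` set, the old predicate reports that fs reclaim may be entered while the new one does not. For every other combination of these inputs the two predicates agree.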