From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Li Rongqing <lirongqing@baidu.com>,
Zhang Yu <zhangyu31@baidu.com>, Davidlohr Bueso <dbueso@suse.de>,
Manfred Spraul <manfred@colorfullife.com>,
Arnd Bergmann <arnd@arndb.de>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Sasha Levin <sashal@kernel.org>,
netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 5.1 10/60] ipc: prevent lockup on alloc_msg and free_msg
Date: Tue, 4 Jun 2019 19:21:20 -0400
Message-ID: <20190604232212.6753-10-sashal@kernel.org>
In-Reply-To: <20190604232212.6753-1-sashal@kernel.org>
From: Li Rongqing <lirongqing@baidu.com>
[ Upstream commit d6a2946a88f524a47cc9b79279667137899db807 ]
The LTP test msgctl10 triggers the lockup shown below. When CONFIG_KASAN
is enabled on large-memory SMP systems, page initialization can take a
long time; if msgctl10 requests a huge block of memory, the allocation
blocks the RCU scheduler. Release the CPU actively by calling
cond_resched() in the alloc_msg() and free_msg() segment loops.

Once free_msg() can reschedule, it must not be called while holding a
spinlock, so mqueue_evict_inode() adds each msg to a tmp list and frees
the list after dropping the spinlock.
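
The segment loops described above can be sketched in user space. This is
only an analogy under stated assumptions, not the kernel code: sched_yield()
stands in for cond_resched() (which reschedules only when needed), and
SEG_PAYLOAD, struct seg, chain_alloc() and chain_free() are hypothetical
names chosen for illustration.

```c
#include <assert.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical segment size; the kernel's DATALEN_SEG depends on PAGE_SIZE. */
#define SEG_PAYLOAD 4096

struct seg {
	struct seg *next;
	size_t len;
	/* payload bytes follow the header */
};

/* Free a chain, yielding before each segment is released. */
static size_t chain_free(struct seg *head)
{
	size_t freed = 0;

	while (head != NULL) {
		struct seg *tmp = head->next;

		sched_yield();	/* user-space stand-in for cond_resched() */
		free(head);
		head = tmp;
		freed++;
	}
	return freed;
}

/* Allocate enough segments to hold len bytes; NULL on failure or len == 0. */
static struct seg *chain_alloc(size_t len, size_t *nsegs)
{
	struct seg *head = NULL;
	struct seg **pseg = &head;
	size_t count = 0;

	assert(nsegs != NULL);
	while (len > 0) {
		size_t alen = len < SEG_PAYLOAD ? len : SEG_PAYLOAD;
		struct seg *s;

		sched_yield();	/* let other runnable threads make progress */
		s = malloc(sizeof(*s) + alen);
		if (s == NULL) {
			chain_free(head);
			return NULL;
		}
		s->next = NULL;
		s->len = alen;
		*pseg = s;
		pseg = &s->next;
		len -= alen;
		count++;
	}
	*nsegs = count;
	return head;
}
```

Because chain_free() may give up the CPU on every iteration, a caller in
this sketch must not hold a spinning lock around it, which mirrors the
reason the patch moves free_msg() out of the info->lock critical section.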
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-1 rcu_node (CPUs 16-31): P32505
rcu: Tasks blocked on level-1 rcu_node (CPUs 48-63): P34978
rcu: (detected by 11, t=35024 jiffies, g=44237529, q=16542267)
msgctl10 R running task 21608 32505 2794 0x00000082
Call Trace:
preempt_schedule_irq+0x4c/0xb0
retint_kernel+0x1b/0x2d
RIP: 0010:__is_insn_slot_addr+0xfb/0x250
Code: 82 1d 00 48 8b 9b 90 00 00 00 4c 89 f7 49 c1 ee 03 e8 59 83 1d 00 48 b8 00 00 00 00 00 fc ff df 4c 39 eb 48 89 9d 58 ff ff ff <41> c6 04 06 f8 74 66 4c 8d 75 98 4c 89 f1 48 c1 e9 03 48 01 c8 48
RSP: 0018:ffff88bce041f758 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: dffffc0000000000 RBX: ffffffff8471bc50 RCX: ffffffff828a2a57
RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88bce041f780
RBP: ffff88bce041f828 R08: ffffed15f3f4c5b3 R09: ffffed15f3f4c5b3
R10: 0000000000000001 R11: ffffed15f3f4c5b2 R12: 000000318aee9b73
R13: ffffffff8471bc50 R14: 1ffff1179c083ef0 R15: 1ffff1179c083eec
kernel_text_address+0xc1/0x100
__kernel_text_address+0xe/0x30
unwind_get_return_address+0x2f/0x50
__save_stack_trace+0x92/0x100
create_object+0x380/0x650
__kmalloc+0x14c/0x2b0
load_msg+0x38/0x1a0
do_msgsnd+0x19e/0xcf0
do_syscall_64+0x117/0x400
entry_SYSCALL_64_after_hwframe+0x49/0xbe
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-1 rcu_node (CPUs 0-15): P32170
rcu: (detected by 14, t=35016 jiffies, g=44237525, q=12423063)
msgctl10 R running task 21608 32170 32155 0x00000082
Call Trace:
preempt_schedule_irq+0x4c/0xb0
retint_kernel+0x1b/0x2d
RIP: 0010:lock_acquire+0x4d/0x340
Code: 48 81 ec c0 00 00 00 45 89 c6 4d 89 cf 48 8d 6c 24 20 48 89 3c 24 48 8d bb e4 0c 00 00 89 74 24 0c 48 c7 44 24 20 b3 8a b5 41 <48> c1 ed 03 48 c7 44 24 28 b4 25 18 84 48 c7 44 24 30 d0 54 7a 82
RSP: 0018:ffff88af83417738 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
RAX: dffffc0000000000 RBX: ffff88bd335f3080 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88bd335f3d64
RBP: ffff88af83417758 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: ffffed13f3f745b2 R12: 0000000000000000
R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
is_bpf_text_address+0x32/0xe0
kernel_text_address+0xec/0x100
__kernel_text_address+0xe/0x30
unwind_get_return_address+0x2f/0x50
__save_stack_trace+0x92/0x100
save_stack+0x32/0xb0
__kasan_slab_free+0x130/0x180
kfree+0xfa/0x2d0
free_msg+0x24/0x50
do_msgrcv+0x508/0xe60
do_syscall_64+0x117/0x400
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Davidlohr said:
"So after releasing the lock, the msg rbtree/list is empty and new
calls will not see those in the newly populated tmp_msg list, and
therefore they cannot access the delayed msg freeing pointers, which
is good. Also the fact that the node_cache is now freed before the
actual messages seems to be harmless as this is wanted for
msg_insert() avoiding GFP_ATOMIC allocations, and after releasing the
info->lock the thing is freed anyway so it should not change things"
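
The point made in the quote can be illustrated with a small user-space
sketch: once the messages are detached under the lock, the shared queue is
empty and only the local list still references them. The assumptions here
are that a pthread mutex stands in for info->lock and a singly linked list
stands in for the mqueue rbtree; struct queue, queue_push() and
queue_destroy_msgs() are hypothetical names. The patch itself moves
messages one at a time with msg_get() and list_add_tail(), whereas this
sketch detaches the whole list in one step.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct msg {
	struct msg *next;
	int prio;
};

struct queue {
	pthread_mutex_t lock;
	struct msg *head;
};

static void queue_init(struct queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
}

/* Push one message; returns 0 on success, -1 if allocation fails. */
static int queue_push(struct queue *q, int prio)
{
	struct msg *m = malloc(sizeof(*m));

	if (m == NULL)
		return -1;
	m->prio = prio;
	pthread_mutex_lock(&q->lock);
	m->next = q->head;
	q->head = m;
	pthread_mutex_unlock(&q->lock);
	return 0;
}

/* Detach every message under the lock, then free them without it. */
static size_t queue_destroy_msgs(struct queue *q)
{
	struct msg *tmp_list;
	size_t freed = 0;

	assert(q != NULL);
	pthread_mutex_lock(&q->lock);
	tmp_list = q->head;	/* private list, invisible to other threads */
	q->head = NULL;		/* shared queue is now empty */
	pthread_mutex_unlock(&q->lock);

	while (tmp_list != NULL) {
		struct msg *next = tmp_list->next;

		free(tmp_list);	/* may be slow; the lock is no longer held */
		tmp_list = next;
		freed++;
	}
	return freed;
}
```

A thread that takes the lock after the detach sees an empty queue, so it
cannot reach the messages that are still waiting to be freed.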
Link: http://lkml.kernel.org/r/1552029161-4957-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
ipc/mqueue.c | 10 ++++++++--
ipc/msgutil.c | 6 ++++++
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index aea30530c472..127ba1e8950b 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -436,7 +436,8 @@ static void mqueue_evict_inode(struct inode *inode)
 	struct user_struct *user;
 	unsigned long mq_bytes, mq_treesize;
 	struct ipc_namespace *ipc_ns;
-	struct msg_msg *msg;
+	struct msg_msg *msg, *nmsg;
+	LIST_HEAD(tmp_msg);
 
 	clear_inode(inode);
 
@@ -447,10 +448,15 @@ static void mqueue_evict_inode(struct inode *inode)
 	info = MQUEUE_I(inode);
 	spin_lock(&info->lock);
 	while ((msg = msg_get(info)) != NULL)
-		free_msg(msg);
+		list_add_tail(&msg->m_list, &tmp_msg);
 	kfree(info->node_cache);
 	spin_unlock(&info->lock);
 
+	list_for_each_entry_safe(msg, nmsg, &tmp_msg, m_list) {
+		list_del(&msg->m_list);
+		free_msg(msg);
+	}
+
 	/* Total amount of bytes accounted for the mqueue */
 	mq_treesize = info->attr.mq_maxmsg * sizeof(struct msg_msg) +
 		min_t(unsigned int, info->attr.mq_maxmsg, MQ_PRIO_MAX) *
diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index 84598025a6ad..e65593742e2b 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -18,6 +18,7 @@
 #include <linux/utsname.h>
 #include <linux/proc_ns.h>
 #include <linux/uaccess.h>
+#include <linux/sched.h>
 
 #include "util.h"
 
@@ -64,6 +65,9 @@ static struct msg_msg *alloc_msg(size_t len)
 	pseg = &msg->next;
 	while (len > 0) {
 		struct msg_msgseg *seg;
+
+		cond_resched();
+
 		alen = min(len, DATALEN_SEG);
 		seg = kmalloc(sizeof(*seg) + alen, GFP_KERNEL_ACCOUNT);
 		if (seg == NULL)
@@ -176,6 +180,8 @@ void free_msg(struct msg_msg *msg)
 	kfree(msg);
 	while (seg != NULL) {
 		struct msg_msgseg *tmp = seg->next;
+
+		cond_resched();
 		kfree(seg);
 		seg = tmp;
 	}
--
2.20.1