From: Kamal Mostafa <kamal@canonical.com>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	kernel-team@lists.ubuntu.com
Cc: "Tejun Heo" <tj@kernel.org>,
	"Lai Jiangshan" <jiangshanlai@gmail.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Wanpeng Li" <wanpeng.li@hotmail.com>,
	"Kamal Mostafa" <kamal@canonical.com>
Subject: [PATCH 4.2.y-ckt 24/53] workqueue: fix rebind bound workers warning
Date: Tue, 24 May 2016 10:54:54 -0700
Message-ID: <1464112523-3701-25-git-send-email-kamal@canonical.com>
In-Reply-To: <1464112523-3701-1-git-send-email-kamal@canonical.com>

4.2.8-ckt11 -stable review patch.  If anyone has any objections, please let me know.

---8<------------------------------------------------------------

From: Wanpeng Li <wanpeng.li@hotmail.com>

commit f7c17d26f43d5cc1b7a6b896cd2fa24a079739b9 upstream.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 16 at kernel/workqueue.c:4559 rebind_workers+0x1c0/0x1d0
Modules linked in:
CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 4.6.0-rc4+ #31
Hardware name: IBM IBM System x3550 M4 Server -[7914IUW]-/00Y8603, BIOS -[D7E128FUS-1.40]- 07/23/2013
 0000000000000000 ffff881037babb58 ffffffff8139d885 0000000000000010
 0000000000000000 0000000000000000 0000000000000000 ffff881037babba8
 ffffffff8108505d ffff881037ba0000 000011cf3e7d6e60 0000000000000046
Call Trace:
 dump_stack+0x89/0xd4
 __warn+0xfd/0x120
 warn_slowpath_null+0x1d/0x20
 rebind_workers+0x1c0/0x1d0
 workqueue_cpu_up_callback+0xf5/0x1d0
 notifier_call_chain+0x64/0x90
 ? trace_hardirqs_on_caller+0xf2/0x220
 ? notify_prepare+0x80/0x80
 __raw_notifier_call_chain+0xe/0x10
 __cpu_notify+0x35/0x50
 notify_down_prepare+0x5e/0x80
 ? notify_prepare+0x80/0x80
 cpuhp_invoke_callback+0x73/0x330
 ? __schedule+0x33e/0x8a0
 cpuhp_down_callbacks+0x51/0xc0
 cpuhp_thread_fun+0xc1/0xf0
 smpboot_thread_fn+0x159/0x2a0
 ? smpboot_create_threads+0x80/0x80
 kthread+0xef/0x110
 ? wait_for_completion+0xf0/0x120
 ? schedule_tail+0x35/0xf0
 ret_from_fork+0x22/0x50
 ? __init_kthread_worker+0x70/0x70
---[ end trace eb12ae47d2382d8f ]---
notify_down_prepare: attempt to take down CPU 0 failed

This bug can be reproduced with the config below and nohz_full= covering all CPUs:

CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
CONFIG_DEBUG_HOTPLUG_CPU0=y
CONFIG_NO_HZ_FULL=y

As Thomas pointed out:

| If a down prepare callback fails, then DOWN_FAILED is invoked for all
| callbacks which have successfully executed DOWN_PREPARE.
|
| But, workqueue has actually two notifiers. One which handles
| UP/DOWN_FAILED/ONLINE and one which handles DOWN_PREPARE.
|
| Now look at the priorities of those callbacks:
|
| CPU_PRI_WORKQUEUE_UP        = 5
| CPU_PRI_WORKQUEUE_DOWN      = -5
|
| So the call order on DOWN_PREPARE is:
|
| CB 1
| CB ...
| CB workqueue_up() -> Ignores DOWN_PREPARE
| CB ...
| CB X ---> Fails
|
| So we call up to CB X with DOWN_FAILED
|
| CB 1
| CB ...
| CB workqueue_up() -> Handles DOWN_FAILED
| CB ...
| CB X-1
|
| So the problem is that the workqueue stuff handles DOWN_FAILED in the up
| callback, while it should do it in the down callback. Which is not a good idea
| either because it wants to be called early on rollback...
|
| Brilliant stuff, isn't it? The hotplug rework will solve this problem because
| the callbacks become symmetric, but for the existing mess, we need some
| workaround in the workqueue code.
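
The call order described above can be modelled in userspace. What follows
is a minimal sketch, not kernel code: the struct, the callback bodies, the
counters, and the DEMO_MAIN guard are all made up for illustration, and
priority 0 for the failing callback is an assumption that merely places it
between the two workqueue notifiers. Only the 5 / -5 priorities and the
"roll back the callbacks that ran before the failing one" rule come from
the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum action { DOWN_PREPARE, DOWN_FAILED };

struct cb {
	const char *name;
	int priority;			/* higher priority runs first */
	int (*fn)(enum action act);	/* 0 on success, -1 to veto */
};

static int pool_disassociated;	/* set once workers have been unbound */
static int bound_pool_rebinds;	/* rebinds of a pool that was never unbound */

/* Ignores DOWN_PREPARE, but rebinds workers on DOWN_FAILED. */
static int workqueue_up(enum action act)
{
	if (act == DOWN_FAILED) {
		if (!pool_disassociated)
			bound_pool_rebinds++;	/* the case the warning catches */
		pool_disassociated = 0;
	}
	return 0;
}

/* Stands in for "CB X": vetoes the down operation. */
static int failing_cb(enum action act)
{
	return act == DOWN_PREPARE ? -1 : 0;
}

/* Unbinds workers on DOWN_PREPARE; never reached when CB X fails. */
static int workqueue_down(enum action act)
{
	if (act == DOWN_PREPARE)
		pool_disassociated = 1;
	return 0;
}

/* Kept sorted by descending priority, as a notifier chain would be. */
static const struct cb chain[] = {
	{ "workqueue_up",    5, workqueue_up },
	{ "failing_cb",      0, failing_cb },	/* priority is an assumption */
	{ "workqueue_down", -5, workqueue_down },
};
#define CHAIN_LEN (sizeof(chain) / sizeof(chain[0]))

/* Call at most nr_to_call callbacks in order; stop at the first error. */
static int call_chain(enum action act, size_t nr_to_call, size_t *nr_calls)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < CHAIN_LEN && i < nr_to_call; i++) {
		if (nr_calls)
			(*nr_calls)++;
		ret = chain[i].fn(act);
		if (ret)
			break;
	}
	return ret;
}

/* DOWN_PREPARE, then DOWN_FAILED for the callbacks before the failing one. */
static int try_cpu_down(void)
{
	size_t nr_calls = 0;
	int ret = call_chain(DOWN_PREPARE, CHAIN_LEN, &nr_calls);

	if (ret) {
		assert(nr_calls > 0);
		call_chain(DOWN_FAILED, nr_calls - 1, NULL);
	}
	return ret;
}

#ifdef DEMO_MAIN
int main(void)
{
	int ret = try_cpu_down();

	printf("down prepare returned %d, bound-pool rebinds: %d\n",
	       ret, bound_pool_rebinds);
	return 0;
}
#endif
```

Built with -DDEMO_MAIN, the model runs one failed down attempt. In this
sketch workqueue_up() sees DOWN_FAILED although workqueue_down() never saw
DOWN_PREPARE, so the pool is "rebound" while still bound.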

The boot CPU handles housekeeping duty (unbound timers, workqueues,
timekeeping, ...) on behalf of full dynticks CPUs. It must remain
online when nohz full is enabled. Each notifier_block has a priority,
which gives the following call order:

workqueue_cpu_up > tick_nohz_cpu_down > workqueue_cpu_down

So the tick_nohz_cpu_down callback fails when CPU 0 goes through down
prepare, and the notifier_blocks behind tick_nohz_cpu_down are not
called any more, which means the workers are never actually unbound.
The hotplug state machine then falls back to undo and onlines CPU 0
again. Workers are rebound unconditionally even though they were never
unbound, which triggers the warning in the process.

This patch fixes it by checking for !DISASSOCIATED to avoid rebinding
workers that are still bound.
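
The effect of the check can be illustrated with a small userspace model.
This is a sketch under stated assumptions: struct pool_model,
rebind_workers_model(), the nr_rebinds counter, and the DEMO_MAIN guard are
hypothetical names, the flag value is arbitrary, and the pool lock taken by
the real function is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define POOL_DISASSOCIATED (1u << 2)	/* value is arbitrary in this model */

struct pool_model {
	unsigned int flags;
	unsigned int nr_rebinds;	/* how often workers were really rebound */
};

/*
 * Rebind only if the pool was actually disassociated.  Returns true when
 * a rebind happened, false when the call was a no-op (DOWN_FAILED without
 * a preceding DOWN_PREPARE).
 */
static bool rebind_workers_model(struct pool_model *pool)
{
	assert(pool);

	if (!(pool->flags & POOL_DISASSOCIATED))
		return false;

	pool->flags &= ~POOL_DISASSOCIATED;
	pool->nr_rebinds++;
	return true;
}

#ifdef DEMO_MAIN
int main(void)
{
	struct pool_model pool = { 0, 0 };

	/* Pool is still bound: the guarded rebind must do nothing. */
	printf("bound pool rebound: %d\n", rebind_workers_model(&pool));

	/* Pool was unbound by a successful DOWN_PREPARE: rebind proceeds. */
	pool.flags |= POOL_DISASSOCIATED;
	printf("unbound pool rebound: %d\n", rebind_workers_model(&pool));
	return 0;
}
#endif
```

In the model, a call on a pool that is still bound leaves the flags and
the rebind counter untouched, which is the early return the patch adds.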

Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Suggested-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
---
 kernel/workqueue.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a2a7ac1..d2a188f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4457,6 +4457,17 @@ static void rebind_workers(struct worker_pool *pool)
 						  pool->attrs->cpumask) < 0);
 
 	spin_lock_irq(&pool->lock);
+
+	/*
+	 * XXX: CPU hotplug notifiers are weird and can call DOWN_FAILED
+	 * w/o preceding DOWN_PREPARE.  Work around it.  CPU hotplug is
+	 * being reworked and this can go away in time.
+	 */
+	if (!(pool->flags & POOL_DISASSOCIATED)) {
+		spin_unlock_irq(&pool->lock);
+		return;
+	}
+
 	pool->flags &= ~POOL_DISASSOCIATED;
 
 	for_each_pool_worker(worker, pool) {
-- 
2.7.4
