From: Thomas Gleixner <tglx@linutronix.de>
To: Rick Warner <rick@microway.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [bisected] rcu_sched detected stalls - 4.15 or newer kernel with some Xeon skylake CPUs and extended APIC
Date: Wed, 16 May 2018 16:50:07 +0200 (CEST)
Message-ID: <alpine.DEB.2.21.1805161642040.1627@nanos.tec.linutronix.de>
In-Reply-To: <alpine.DEB.2.21.1805152208590.1605@nanos.tec.linutronix.de>

Rick,

On Tue, 15 May 2018, Thomas Gleixner wrote:
> I can't spot an immediate fail with that commit, but I'll have a look
> tomorrow for instrumenting this with tracepoints which can be dumped from
> the stall detector.

Can you please give the patch below a try? I assume you have a serial
console; otherwise this is going to be tedious.

Please add the following to the kernel command line:

  ftrace_dump_on_oops

Note: the patch sets sysctl_panic_on_rcu_stall to 1, so the box will panic
after the first stall and spill the trace buffer out over serial, which
might take a while.

Thanks,

	tglx

8<--------------------

--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -52,20 +52,28 @@ static void
 	if (apic_dest != APIC_DEST_ALLINC)
 		cpumask_clear_cpu(smp_processor_id(), tmpmsk);
 
+	trace_printk("To: %*pbl\n", cpumask_pr_args(tmpmsk));
+
 	/* Collapse cpus in a cluster so a single IPI per cluster is sent */
 	for_each_cpu(cpu, tmpmsk) {
 		struct cluster_mask *cmsk = per_cpu(cluster_masks, cpu);
 
 		dest = 0;
+		trace_printk("CPU: %u cluster: %*pbl\n", cpu,
+			     cpumask_pr_args(&cmsk->mask));
 		for_each_cpu_and(clustercpu, tmpmsk, &cmsk->mask)
 			dest |= per_cpu(x86_cpu_to_logical_apicid, clustercpu);
 
-		if (!dest)
+		if (!dest) {
+			trace_printk("dest = 0!?\n");
 			continue;
+		}
 
 		__x2apic_send_IPI_dest(dest, vector, apic->dest_logical);
 		/* Remove cluster CPUs from tmpmask */
 		cpumask_andnot(tmpmsk, tmpmsk, &cmsk->mask);
+		trace_printk("dest %08x --> tmpmsk %*pbl\n", dest,
+			     cpumask_pr_args(tmpmsk));
 	}
 
 	local_irq_restore(flags);
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -124,7 +124,7 @@ int rcu_num_lvls __read_mostly = RCU_NUM
 int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
 int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
 /* panic() on RCU Stall sysctl. */
-int sysctl_panic_on_rcu_stall __read_mostly;
+int sysctl_panic_on_rcu_stall __read_mostly = 1;
 
 /*
  * The rcu_scheduler_active variable is initialized to the value
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -681,6 +681,7 @@ void on_each_cpu_cond(bool (*cond_func)(
 		for_each_online_cpu(cpu)
 			if (cond_func(cpu, info))
 				cpumask_set_cpu(cpu, cpus);
+		trace_printk("%*pbl\n", cpumask_pr_args(cpus));
 		on_each_cpu_mask(cpus, func, info, wait);
 		preempt_enable();
 		free_cpumask_var(cpus);
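
For reading the resulting trace: the "%*pbl" format used in the trace_printk()
calls above prints a CPU mask as a list of ranges. The following is a minimal
user-space sketch of that notation, not the kernel's implementation; the
function name mask_to_list and the 32-bit mask width are assumptions made
purely for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the set bits of a 32-bit mask as a range
 * list, e.g. 0xb6 -> "1-2,4-5,7". Returns the string length, or -1 if
 * the buffer is too small.
 */
static int mask_to_list(uint32_t mask, char *buf, size_t len)
{
	int pos = 0;
	unsigned int bit = 0;

	if (len)
		buf[0] = '\0';

	while (bit < 32) {
		unsigned int start;
		int n;

		if (!(mask & (1u << bit))) {
			bit++;
			continue;
		}

		/* Extend the run while the next bit is also set. */
		start = bit;
		while (bit + 1 < 32 && (mask & (1u << (bit + 1))))
			bit++;

		if (start == bit)
			n = snprintf(buf + pos, len - pos, "%s%u",
				     pos ? "," : "", start);
		else
			n = snprintf(buf + pos, len - pos, "%s%u-%u",
				     pos ? "," : "", start, bit);

		if (n < 0 || (size_t)n >= len - pos)
			return -1;
		pos += n;
		bit++;
	}
	return pos;
}
```

With this sketch, a mask with CPUs 1, 2, 4, 5 and 7 set (0xb6) is rendered as
"1-2,4-5,7", which is the shape to expect after "To:" and "cluster:" in the
dumped trace.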

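To make it easier to follow what the instrumented loop in x2apic_cluster.c
is expected to log, here is a small user-space model of the cluster collapse.
It is a sketch under stated assumptions, not kernel code: NCPUS,
CPUS_PER_CLUSTER, cluster_mask_of(), logical_apicid_of() and collapse() are
hypothetical stand-ins for the per-CPU data, and the CPU-to-cluster layout is
made up. The only hardware fact relied on is the x2APIC logical ID layout
(cluster ID in bits 31:16, one bit per CPU in bits 15:0).

```c
#include <stdint.h>
#include <stdio.h>

#define NCPUS            8	/* hypothetical CPU count */
#define CPUS_PER_CLUSTER 4	/* hypothetical; chosen to keep the model small */

/* Hypothetical stand-in for per_cpu(cluster_masks, cpu)->mask. */
static uint32_t cluster_mask_of(unsigned int cpu)
{
	unsigned int base = cpu / CPUS_PER_CLUSTER * CPUS_PER_CLUSTER;

	return ((1u << CPUS_PER_CLUSTER) - 1) << base;
}

/* Hypothetical stand-in for per_cpu(x86_cpu_to_logical_apicid, cpu). */
static uint32_t logical_apicid_of(unsigned int cpu)
{
	return ((uint32_t)(cpu / CPUS_PER_CLUSTER) << 16) |
	       (1u << (cpu % CPUS_PER_CLUSTER));
}

/*
 * Model of the loop: one destination per cluster that has a CPU in tmpmsk.
 * dests[] must have room for NCPUS entries. Returns the number of IPIs
 * that would be sent.
 */
static int collapse(uint32_t tmpmsk, uint32_t *dests)
{
	int n = 0;

	for (unsigned int cpu = 0; cpu < NCPUS; cpu++) {
		uint32_t cmsk, dest = 0;

		/* CPUs already covered by an earlier cluster IPI are gone. */
		if (!(tmpmsk & (1u << cpu)))
			continue;

		cmsk = cluster_mask_of(cpu);
		for (unsigned int c = 0; c < NCPUS; c++)
			if (tmpmsk & cmsk & (1u << c))
				dest |= logical_apicid_of(c);

		/* The "dest = 0!?" case: this CPU would get no IPI. */
		if (!dest)
			continue;

		dests[n++] = dest;
		tmpmsk &= ~cmsk;	/* remove cluster CPUs from tmpmsk */
	}
	return n;
}
```

In this model, collapse(0xb6, dests) (CPUs 1, 2, 4, 5, 7) yields two
destinations, 0x00000006 and 0x0001000b. With consistent cluster masks, dest
can never be zero because the iterating CPU is always in its own cluster
mask, so a "dest = 0!?" line in the real trace would point at inconsistent
per-CPU cluster data.
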
Thread overview: 8+ messages
2018-05-01 16:37 stall/hang on 4.15 kernel with some Xeon skylake CPUs and extended APIC Rick Warner
2018-05-15 16:07 ` [bisected] rcu_sched detected stalls - 4.15 or newer " Rick Warner
2018-05-15 20:19   ` Thomas Gleixner
2018-05-16 14:50     ` Thomas Gleixner [this message]
2018-05-16 23:02       ` Rick Warner
2018-05-17 12:36         ` Thomas Gleixner
2018-05-17 15:59           ` Rick Warner
2018-05-17 19:03           ` [tip:x86/urgent] x86/apic/x2apic: Initialize cluster ID properly tip-bot for Thomas Gleixner