From: Peter Zijlstra <peterz@infradead.org>
To: Juergen Gross <jgross@suse.com>
Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	paulmck@kernel.org, mhocko@suse.com,
	Jonathan Corbet <corbet@lwn.net>
Subject: Re: [PATCH v2 3/3] kernel/smp: add more data to CSD lock debugging
Date: Mon, 1 Mar 2021 18:07:24 +0100	[thread overview]
Message-ID: <YD0fTLnQTQ7/M7fx@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20210301101336.7797-4-jgross@suse.com>

On Mon, Mar 01, 2021 at 11:13:36AM +0100, Juergen Gross wrote:
> In order to help identify problems with IPI handling and remote
> function execution, add some more data to the IPI debugging code.
> 
> There have been multiple reports of cpus looping for long times (many
> seconds) in smp_call_function_many(), waiting for another cpu to
> execute a function like tlb flushing. Most of these reports have been
> for cases where the kernel was running as a guest on top of KVM or Xen
> (there are rumours of that happening under VMware, too, and even on
> bare metal).
> 
> The root cause hasn't been found yet, even after more than 2 years of
> different developers chasing this bug.
> 
> Commit 35feb60474bf4f7 ("kernel/smp: Provide CSD lock timeout
> diagnostics") tried to address this by adding some debug code and by
> issuing another IPI when a hang was detected. This helped mitigating
> the problem (the repeated IPI unlocks the hang), but the root cause is
> still unknown.
> 
> Currently available data suggests that either an IPI wasn't sent when it
> should have been, or that the IPI didn't result in the target cpu
> executing the queued function (due to the IPI not reaching the cpu,
> the IPI handler not being called, or the handler not seeing the queued
> request).
> 
> Try to add more diagnostic data by introducing a global atomic counter
> which is incremented on critical operations (before and after queueing
> a new request, when sending an IPI, and when dequeueing a request).
> The counter value is stored in percpu variables which can be printed
> out when a hang is detected.
> 
> The data of the last event (consisting of sequence counter, source
> cpu, target cpu, and event type) is stored in a global variable. When
> a new event is to be traced, the data of the last event is stored in
> the event-related percpu location and the global data is updated with
> the new event's data. This makes it possible to track two events in
> one data location: one by the value of the event data (the event
> before the current one), and one by the location itself (the current
> event).
> 
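As an aside, the scheme described above amounts to roughly the following
shape. This is a minimal sketch only: all names, the per-cpu structure and
the bit layout are made up for illustration and are not the ones used in
the patch; kernel context (<linux/atomic.h>, <linux/percpu.h>) is assumed.

/* Global sequence counter, bumped for every recorded event. */
static atomic_t csd_seq_cnt;

/* Most recent event anywhere, packed as counter | src | dst | type
 * (field widths chosen arbitrarily for the sketch). */
static atomic64_t csd_seq_last;

/* One slot per event type; each cpu remembers the event that came
 * just before its own most recent event of that type. */
struct csd_seq_local {
	u64 queue;
	u64 ipi;
	u64 gotipi;
	u64 dequeue;
};
static DEFINE_PER_CPU(struct csd_seq_local, csd_seq_local);

static void csd_seq_record(u64 *slot, unsigned int src, unsigned int dst,
			   unsigned int type)
{
	u64 ev = ((u64)atomic_inc_return(&csd_seq_cnt) << 32) |
		 (src << 16) | (dst << 8) | type;

	/*
	 * The per-cpu slot gets the previous global event, the global
	 * variable gets the new one: each location then describes two
	 * events, one by its contents and one by which slot it is.
	 */
	*slot = atomic64_xchg(&csd_seq_last, ev);
}

/* e.g. on the sending side, before raising the IPI (hypothetical call):
 * csd_seq_record(&this_cpu_ptr(&csd_seq_local)->ipi, this_cpu, cpu, ...);
 */
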
> A typical printout with a detected hang will look like this:
> 
> csd: Detected non-responsive CSD lock (#1) on CPU#1, waiting 5000000003 ns for CPU#06 scf_handler_1+0x0/0x50(0xffffa2a881bb1410).
> 	csd: CSD lock (#1) handling prior scf_handler_1+0x0/0x50(0xffffa2a8813823c0) request.
>         csd: cnt(00008cc): ffff->0000 dequeue (src cpu 0 == empty)
>         csd: cnt(00008cd): ffff->0006 idle
>         csd: cnt(0003668): 0001->0006 queue
>         csd: cnt(0003669): 0001->0006 ipi
>         csd: cnt(0003e0f): 0007->000a queue
>         csd: cnt(0003e10): 0001->ffff ping
>         csd: cnt(0003e71): 0003->0000 ping
>         csd: cnt(0003e72): ffff->0006 gotipi
>         csd: cnt(0003e73): ffff->0006 handle
>         csd: cnt(0003e74): ffff->0006 dequeue (src cpu 0 == empty)
>         csd: cnt(0003e7f): 0004->0006 ping
>         csd: cnt(0003e80): 0001->ffff pinged
>         csd: cnt(0003eb2): 0005->0001 noipi
>         csd: cnt(0003eb3): 0001->0006 queue
>         csd: cnt(0003eb4): 0001->0006 noipi
>         csd: cnt now: 0003f00
> 
> The idea is to print only relevant entries: all events associated with
> the hang (sender-side events for the source cpu of the hanging request,
> and receiver-side events for the target cpu), plus the related events
> just before those (to add the data needed to identify a possible race).
> Printing all available data would be possible, but on larger
> configurations this would produce a huge amount of output.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Tested-by: Paul E. McKenney <paulmck@kernel.org>
> ---
> V2:
> - add automatic data deciphering and sorting of entries
> - add new trace point for leaving flush_smp_call_function_queue()
> - add information when finding an empty call_single_queue

They do not apply on top of these:

  https://lkml.kernel.org/r/20210220231712.2475218-2-namit@vmware.com

:-/

> @@ -290,6 +476,19 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
>  
>  void __smp_call_single_queue(int cpu, struct llist_node *node)
>  {
> +#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
> +	if (static_branch_unlikely(&csdlock_debug_extended)) {
> +		unsigned int type;
> +
> +		type = CSD_TYPE(container_of(node, call_single_data_t,
> +					     node.llist));
> +		if (type == CSD_TYPE_SYNC || type == CSD_TYPE_ASYNC) {
> +			__smp_call_single_queue_debug(cpu, node);
> +			return;
> +		}
> +	}
> +#endif

This really ought to be in generic_exec_single(), because there we know
the type matches.
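
For reference, the suggestion amounts to roughly the following shape. This
is a rough sketch only, not the actual kernel code: the body of
generic_exec_single() is elided and its exact signature may differ, but at
this point the CSD is known to be SYNC or ASYNC, so the debug path can be
taken without re-deriving the type via container_of()/CSD_TYPE().

static int generic_exec_single(int cpu, struct __call_single_data *csd)
{
	/* ... local-cpu fast path and error checks elided ... */

#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
	if (static_branch_unlikely(&csdlock_debug_extended)) {
		/* Type is already known to match here. */
		__smp_call_single_queue_debug(cpu, &csd->node.llist);
		return 0;
	}
#endif

	__smp_call_single_queue(cpu, &csd->node.llist);
	return 0;
}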




Thread overview: 16+ messages
2021-03-01 10:13 [PATCH v2 0/3] kernel/smp.c: add more CSD lock debugging Juergen Gross
2021-03-01 10:13 ` [PATCH v2 1/3] kernel/smp: add boot parameter for controlling " Juergen Gross
2021-03-01 14:09   ` [tip: locking/core] locking/csd_lock: Add " tip-bot2 for Juergen Gross
2021-03-06 11:54   ` tip-bot2 for Juergen Gross
2021-03-01 10:13 ` [PATCH v2 2/3] kernel/smp: prepare more " Juergen Gross
2021-03-01 14:09   ` [tip: locking/core] locking/csd_lock: Prepare " tip-bot2 for Juergen Gross
2021-03-06 11:54   ` tip-bot2 for Juergen Gross
2021-03-01 10:13 ` [PATCH v2 3/3] kernel/smp: add more data to " Juergen Gross
2021-03-01 14:09   ` [tip: locking/core] locking/csd_lock: Add " tip-bot2 for Juergen Gross
2021-03-01 17:07   ` Peter Zijlstra [this message]
2021-03-01 19:16     ` [PATCH v2 3/3] kernel/smp: add " Jürgen Groß
2021-03-01 20:44       ` Peter Zijlstra
2021-03-02  7:05         ` Ingo Molnar
2021-03-02  7:14           ` Ingo Molnar
2021-03-02  5:57     ` Jürgen Groß
2021-03-06 11:54   ` [tip: locking/core] locking/csd_lock: Add " tip-bot2 for Juergen Gross
