From: Peter Zijlstra <peterz@infradead.org>
To: Michele Ballabio <barra_cuda@katamail.com>
Cc: linux-kernel@vger.kernel.org, toralf.foerster@gmx.de,
	fweisbec@gmail.com, mingo@kernel.org,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: Bisected KVM hang on x86-32 between v3.12 and v3.13
Date: Mon, 7 Apr 2014 17:03:37 +0200
Message-ID: <20140407150337.GO10526@twins.programming.kicks-ass.net>
In-Reply-To: <5341707F.5000406@katamail.com>

On Sun, Apr 06, 2014 at 05:19:27PM +0200, Michele Ballabio wrote:
> Toralf Förster reported this in
>   http://article.gmane.org/gmane.linux.kernel/1662567
>   http://article.gmane.org/gmane.linux.kernel/1658422
>   http://article.gmane.org/gmane.linux.kernel/1657962
> 
>   "The issue happens here at a 32 bit stable Gentoo Linux if
>    I try to start a KVM image. Kernels 3.12.X works fine,
>    kernel >= v3.13 will hang shortly after I started the image
>    with the virtual-manager. The last syslog messages are
>    something like:
>    Feb 28 16:22:00 n22 kernel: INFO: rcu_sched detected stalls
>        on CPUs/tasks: {} (detected by 2, t=60002 jiffies,
>        g=14689, c=14688, q=21051)
>    Feb 28 16:22:00 n22 kernel: INFO: Stall ended before state
>        dump start"
> 
> He correctly pointed out that the bisection blamed the merge
> commit 37bf06375c90a42fe07b9bebdb07bc316ae5a0ce
> "Merge tag 'v3.12-rc4' into sched/core".
> 
> This bug is obviously caused by at least two patches, one
> on each side of the merge, that only when combined together
> (at that merge point) cause the bug in kvm. By rebasing
> the "sched/core" branch on "master" before the merge and
> going on with the bisection, I found commit
> 3e8e42c69bb7d9fc12ebc23ff308e8523a2a59a0
> "sched: Revert need_resched() to look at TIF_NEED_RESCHED"
> as one of the causes. The other patch that contributes to the
> bug is commit ded797547548a5b8e7b92383a41e4c0e6b0ecb7f
> "irq: Force hardirq exit's softirq processing on its own stack".
> 
> Reverting either one of them solves the problem reported with kvm,
> but revert is probably not the correct answer.
> 
> I wonder if the solution is as simple as this:
> 
> --->8---
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 0af5250..f3b985d 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -126,6 +126,7 @@ config X86
>  	select RTC_LIB
>  	select HAVE_DEBUG_STACKOVERFLOW
>  	select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64
> +	select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_32
>  	select HAVE_CC_STACKPROTECTOR

Ohh ahh.. shiney!

So what I suspect at this point is that because i386 and x86_64 have a
difference in current_thread_info() (i386 is stack based), we end up
setting the TIF_NEED_RESCHED bit on the wrong stack.
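
To make that suspicion concrete, here is a minimal user-space sketch of
a stack-based thread_info lookup. It is not the kernel code: THREAD_SIZE,
the bit number, the two static stacks and all helper names are
assumptions picked for illustration. It only shows that when thread_info
is found by masking the stack pointer, a flag set while running on a
separate IRQ stack lands in that stack's thread_info and stays invisible
from the task stack.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u      /* illustrative stack size, must be a power of two */
#define TIF_NEED_RESCHED 3     /* illustrative bit number */

struct thread_info {
	unsigned long flags;
};

/* Each stack is THREAD_SIZE-aligned, with its thread_info at the lowest address. */
union thread_union {
	struct thread_info info;
	unsigned char stack[THREAD_SIZE];
};

static _Alignas(THREAD_SIZE) union thread_union task_stack;
static _Alignas(THREAD_SIZE) union thread_union irq_stack;

/* Stack-based lookup: mask the stack pointer down to the base of its stack. */
static struct thread_info *thread_info_from_sp(uintptr_t sp)
{
	return (struct thread_info *)(sp & ~((uintptr_t)THREAD_SIZE - 1));
}

static void set_need_resched_at(uintptr_t sp)
{
	thread_info_from_sp(sp)->flags |= 1ul << TIF_NEED_RESCHED;
}

static int need_resched_at(uintptr_t sp)
{
	return (thread_info_from_sp(sp)->flags >> TIF_NEED_RESCHED) & 1;
}

/*
 * Returns 1 when a flag set while "running" on the IRQ stack is visible
 * there but not from the task stack, i.e. the bit went to the wrong stack.
 */
static int flag_lost_on_stack_switch(void)
{
	/* Pretend stack pointers somewhere near the top of each stack. */
	uintptr_t task_sp = (uintptr_t)&task_stack.stack[THREAD_SIZE - 64];
	uintptr_t irq_sp  = (uintptr_t)&irq_stack.stack[THREAD_SIZE - 64];

	task_stack.info.flags = 0;
	irq_stack.info.flags = 0;

	/* A wakeup during interrupt handling sets the bit via the current sp. */
	set_need_resched_at(irq_sp);

	/* Back on the task stack, the check looks at a different thread_info. */
	return need_resched_at(irq_sp) && !need_resched_at(task_sp);
}
```

Calling flag_lost_on_stack_switch() returns 1 in this model. Whether
that is what actually happens in the reported hang still has to be
confirmed.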

Now I have some vague memories of propagating the TIF flags on stack
switch, but I cannot remember what arch we did that for. Let me stare at
this a little more.

Also, IFF this is the case, then the fingered patch above (and your
suggested 'fix') aren't the real culprit/cure but simply make it
more/less likely to happen.

Now, Steve had a patch somewhere that would make i386 use per-cpu
variables for current_thread_info(), just like x86_64 already does, I
think. Let me go find that too.
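
For contrast, a rough sketch of the per-cpu idea, again as a user-space
model. This is not Steve's patch, which is not quoted here: the
cpu_current_ti array, NR_CPUS and the helper names are hypothetical
stand-ins for a real per-cpu variable. The point is only that the lookup
no longer depends on which stack the CPU happens to be running on.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4              /* illustrative CPU count */
#define TIF_NEED_RESCHED 3     /* illustrative bit number */

struct thread_info {
	unsigned long flags;
};

/* Stand-in for a per-cpu variable: one slot per CPU, updated at context switch. */
static struct thread_info *cpu_current_ti[NR_CPUS];

/* Per-cpu lookup: the stack pointer is never consulted. */
static struct thread_info *current_thread_info_on(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_current_ti[cpu];
}

/* Returns 1 when a flag set "in interrupt context" is seen by the task afterwards. */
static int flag_survives_stack_switch(void)
{
	struct thread_info task_ti = { 0 };
	int cpu = 0;
	int seen;

	cpu_current_ti[cpu] = &task_ti;   /* what a context switch would record */

	/* The interrupt runs on another stack, yet resolves the same thread_info. */
	current_thread_info_on(cpu)->flags |= 1ul << TIF_NEED_RESCHED;

	/* Back in task context, the bit is visible. */
	seen = (current_thread_info_on(cpu)->flags >> TIF_NEED_RESCHED) & 1;

	cpu_current_ti[cpu] = NULL;       /* do not keep a pointer to a dead local */
	return seen;
}
```

In this model the flag set from the "interrupt" is always seen by the
task, because both contexts resolve the same thread_info.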
