From: Juergen Gross <juergen.gross@ts.fujitsu.com>
To: Andre Przywara <andre.przywara@amd.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Diestelhorst, Stephan" <Stephan.Diestelhorst@amd.com>
Subject: Re: Hypervisor crash(!) on xl cpupool-numa-split
Date: Mon, 21 Feb 2011 15:50:14 +0100
Message-ID: <4D627BA6.4020406@ts.fujitsu.com>
In-Reply-To: <4D627A6F.5070105@amd.com>

On 02/21/11 15:45, Andre Przywara wrote:
> Juergen Gross wrote:
>> On 02/21/11 11:00, Andre Przywara wrote:
>>> George Dunlap wrote:
>>>> Andre (and Juergen), can you try again with the attached patch?
>>> I applied this patch on top of 22931 and it did _not_ work.
>>> The crash occurred almost immediately after I started my script, so the
>>> same behaviour as without the patch.
>>
>> Did you try my patch addressing races in the scheduler when moving cpus
>> between cpupools?
> Sorry, I tried yours first, but it didn't apply cleanly on my particular
> tree (sched_jg_fix ;-). So I tested George's first.
>
>> I've attached it again. For me it works quite well, while George's patch
>> seems not to be enough (machine hanging after some tests with cpupools).
> OK, it now applied after a rebase.
> And yes, I didn't see a crash! At least until the script stopped while
> a lot of these messages appeared:
> (XEN) do_IRQ: 0.89 No irq handler for vector (irq -1)
>
> That is what I reported before and is most probably totally unrelated to
> this issue.
> So I consider this fix working!
> I will try to match my recent theories and debug results with your patch
> to see whether this fits.
>
>> OTOH I can't reproduce an error as fast as you even without any patch :-)
>>
>>> (attached my script for reference, though it will most likely only make
>>> sense on bigger NUMA machines)
>>
>> Yeah, on my 2-node system I need several hundred tries to get an error.
>> But it seems to be more effective than George's script.
> I consider the large over-provisioning the reason. With Dom0 having 48
> VCPUs finally squashed together to 6 pCPUs, my script triggered the crash
> by the second run at the latest.
> With your patch it made 24 iterations before the other bug kicked in.

Okay, I'll prepare an official patch. It might take a few days, as I'm not in
the office until Thursday.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
