From: Jan Beulich <jbeulich@suse.com>
To: Elliott Mitchell <ehem+xen@m5p.com>
Cc: xen-devel@lists.xenproject.org
Subject: Re: HVM/PVH Balloon crash
Date: Tue, 7 Sep 2021 17:57:10 +0200
Message-ID: <935dc03f-74f5-4b49-3a45-71148364fb5a@suse.com>
In-Reply-To: <YTd/SFtvuzejeiik@mattapan.m5p.com>

On 07.09.2021 17:03, Elliott Mitchell wrote:
> On Tue, Sep 07, 2021 at 10:03:51AM +0200, Jan Beulich wrote:
>> On 06.09.2021 22:47, Elliott Mitchell wrote:
>>> On Mon, Sep 06, 2021 at 09:52:17AM +0200, Jan Beulich wrote:
>>>> On 06.09.2021 00:10, Elliott Mitchell wrote:
>>>>> I brought this up a while back, but it still appears to be present and
>>>>> the latest observations appear rather serious.
>>>>>
>>>>> I'm unsure of the entire set of conditions for reproduction.
>>>>>
>>>>> Domain 0 on this machine is PV (I think the BIOS enables the IOMMU, but
>>>>> this is an older AMD IOMMU).
>>>>>
>>>>> This has been confirmed with Xen 4.11 and Xen 4.14.  This includes
>>>>> Debian's patches, but those are mostly backports or environment
>>>>> adjustments.
>>>>>
>>>>> Domain 0 is presently using a 4.19 kernel.
>>>>>
>>>>> The trigger is creating an HVM or PVH domain where memory does not
>>>>> equal maxmem.
>>>>
>>>> I take it you refer to "[PATCH] x86/pod: Do not fragment PoD memory
>>>> allocations" submitted very early this year? There you said the issue
>>>> was with a guest's maxmem exceeding host memory size. Here you seem to
>>>> be talking of PoD in its normal form of use. Personally I use this
>>>> all the time (unless enabling PCI pass-through for a guest, as the two
>>>> are incompatible). I've not observed any badness as severe as you've
>>>> described.
>>>
>>> I've got very little idea what is occurring, as I was expecting to be
>>> doing ARM debugging, not x86 debugging.
>>>
>>> I was starting to wonder whether this was widespread or not.  As such I
>>> was reporting the factors which might be different in my environment.
>>>
>>> The one which sticks out is that the computer has an older AMD
>>> processor (are you a 100% Intel shop?).
>>
>> No, AMD is as relevant to us as is Intel.
>>
>>>  The processor has the AMD NPT feature, but a very
>>> early/limited IOMMU (according to Linux, "AMD IOMMUv2 functionality not
>>> available").
>>>
>>> Xen 4.14 refused to load the Domain 0 kernel as PVH (not enough of an
>>> IOMMU).
>>
>> That sounds odd at first glance - PVH simply requires that there be
>> an (enabled) IOMMU. Hence the only thing I could imagine is that Xen
>> doesn't enable the IOMMU in the first place for some reason.
> 
> Doesn't seem that odd to me.  I don't know the differences between the
> first and second versions of the AMD IOMMU, but it could well be that v1
> was judged not to have enough functionality to bother with.
> 
> What this does make me wonder is: how much testing was done on systems
> with functioning NPT, but a disabled IOMMU?

No idea. During development it may happen (rarely) that one disables
the IOMMU on purpose. Beyond that, I can't tell.
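
(For what it's worth, one way to check from dom0 whether Xen enabled
the IOMMU at all is to search the hypervisor log; the exact message
strings vary by version, so take this as a sketch only:

  xl dmesg | grep -i -e iommu -e amd-vi

On AMD systems the relevant lines typically carry an "AMD-Vi:" prefix.)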

>  Could be this system is in an
> intergenerational hole, and some spot in the PVH/HVM code assumes that
> the presence of NPT guarantees the presence of an operational IOMMU.
> Otherwise, if there was some copy-and-paste while writing the IOMMU
> code, some portion of it might be checking for the presence of NPT
> instead of the presence of an IOMMU.

This is all very speculative; I consider what you suspect not very
likely, but also not entirely impossible, not least because for a long
time we've been running without shared page tables on AMD.

I'm afraid that without technical data and without knowing how to
reproduce the problem, I don't see a way forward here.
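
For reference, the kind of configuration being discussed is simply a
guest whose populated memory is below its maximum at boot, which is
what brings PoD into play for HVM/PVH guests.  A minimal sketch (the
guest name and the sizes are illustrative only):

  # guest.cfg
  type   = "pvh"    # or "hvm"
  name   = "pod-test"
  memory = 1024     # MiB actually populated at start
  maxmem = 4096     # MiB visible to the guest; the gap is PoD-backed

Ballooning the guest afterwards, e.g. "xl mem-set pod-test 4096",
exercises the PoD paths.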

Jan



Thread overview: 16+ messages
2021-09-05 22:10 HVM/PVH Balloon crash Elliott Mitchell
2021-09-06  7:52 ` Jan Beulich
2021-09-06 20:47   ` Elliott Mitchell
2021-09-07  8:03     ` Jan Beulich
2021-09-07 15:03       ` Elliott Mitchell
2021-09-07 15:57         ` Jan Beulich [this message]
2021-09-07 21:40           ` Elliott Mitchell
2021-09-15  2:40           ` Elliott Mitchell
2021-09-15  6:05             ` Jan Beulich
2021-09-26 22:53               ` Elliott Mitchell
2021-09-29 13:32                 ` Jan Beulich
2021-09-29 15:31                   ` Elliott Mitchell
2021-09-30  7:08                     ` Jan Beulich
2021-10-02  2:35                       ` Elliott Mitchell
2021-10-07  7:20                         ` Jan Beulich
2021-09-30  7:43                 ` Jan Beulich
