xen-devel.lists.xenproject.org archive mirror
From: Julien Grall <julien.grall@arm.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>,
	Aaron Cornelius <Aaron.Cornelius@dornerworks.com>,
	Xen-devel <xen-devel@lists.xenproject.org>,
	Stefano Stabellini <sstabellini@kernel.org>
Subject: Re: Xen 4.7 crash
Date: Wed, 1 Jun 2016 23:18:24 +0100
Message-ID: <75b9c560-d7db-c636-251d-8bf36bad5ae2@arm.com>
In-Reply-To: <e3748d9a-3ec0-33e2-4be1-9b0972b69415@citrix.com>

Hi Andrew,

On 01/06/2016 22:24, Andrew Cooper wrote:
> On 01/06/2016 21:45, Aaron Cornelius wrote:
>>>
>>>> However, since I only have 1 domain active at a time, I'm not sure why I
>>>> should run out of VM IDs.
>>>
>>> Sounds like a VMID resource leak.  Check to see whether it is freed properly
>>> in domain_destroy().
>>>
>>> ~Andrew
>> That would be my assumption.  But as far as I can tell, arch_domain_destroy() calls p2m_teardown() which calls p2m_free_vmid(), and none of the functionality related to freeing a VM ID appears to have changed in years.
>
> The VMID handling looks suspect.  It can be called repeatedly during
> domain destruction, and it will repeatedly clear the same bit out of the
> vmid_mask.

Can you explain how p2m_free_vmid can be called multiple times?

We have the following path:
    arch_domain_destroy -> p2m_teardown -> p2m_free_vmid.

And I can find only 3 callers of arch_domain_destroy, which should only 
be called once per domain.

If arch_domain_destroy is called multiple times, p2m_free_vmid will not 
be the only place where Xen will be in trouble.
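
To make the question concrete, here is a minimal standalone model of the
free path. It is a sketch, not the Xen source: the bitmap helpers,
struct model_p2m and all model_* names are hypothetical stand-ins, and
INVALID_VMID being 0 is an assumption. It suggests that a second call for
the same domain only does damage if the VMID has been handed to another
domain in between, unless the field is reset after the first free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VMID     256
#define INVALID_VMID 0   /* assumption: VMID 0 is reserved as "invalid" */

/* One bit per VMID; a set bit means "in use". */
static uint8_t model_vmid_mask[MAX_VMID / 8];

/* Hypothetical stand-in for the per-domain p2m state. */
struct model_p2m { unsigned int vmid; };

static bool model_test_bit(unsigned int nr)
{
    assert(nr < MAX_VMID);
    return (model_vmid_mask[nr / 8] & (1u << (nr % 8))) != 0;
}

static void model_set_bit(unsigned int nr)
{
    assert(nr < MAX_VMID);
    model_vmid_mask[nr / 8] |= (uint8_t)(1u << (nr % 8));
}

static void model_clear_bit(unsigned int nr)
{
    assert(nr < MAX_VMID);
    model_vmid_mask[nr / 8] &= (uint8_t)~(1u << (nr % 8));
}

/* Free without touching the field: the domain keeps a stale VMID. */
static void model_free_vmid_stale(struct model_p2m *p2m)
{
    if ( p2m->vmid != INVALID_VMID )
        model_clear_bit(p2m->vmid);
}

/* Free and reset the field, so a second call becomes a no-op. */
static void model_free_vmid_reset(struct model_p2m *p2m)
{
    if ( p2m->vmid != INVALID_VMID )
    {
        model_clear_bit(p2m->vmid);
        p2m->vmid = INVALID_VMID;
    }
}
```

In this model, calling model_free_vmid_stale() twice in a row is harmless,
but if another domain takes the same VMID between the two calls, the second
call releases a VMID that is still in use. model_free_vmid_reset() does not
have that problem.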

> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 838d004..7adb39a 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1393,7 +1393,10 @@ static void p2m_free_vmid(struct domain *d)
>      struct p2m_domain *p2m = &d->arch.p2m;
>      spin_lock(&vmid_alloc_lock);
>      if ( p2m->vmid != INVALID_VMID )
> -        clear_bit(p2m->vmid, vmid_mask);
> +    {
> +        ASSERT(test_and_clear_bit(p2m->vmid, vmid_mask));
> +        p2m->vmid = INVALID_VMID;
> +    }
>
>      spin_unlock(&vmid_alloc_lock);
>  }
>
> Having said that, I can't explain why that bug would result in the
> symptoms you are seeing.  It is also possible that your issue is memory
> corruption from a separate source.
>
> Can you see about instrumenting p2m_alloc_vmid()/p2m_free_vmid() (with
> vmid_alloc_lock held) to see which vmid is being allocated/freed ?
> After the initial boot of the system, you should see the same vmid being
> allocated and freed for each of your domains.

Looking quickly at the log, the domain is dom1101. However, the maximum 
number of VMIDs supported is 256, so the exhaustion might be caused by 
a race somewhere.
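
As a rough illustration of that reasoning, here is a standalone model of
cycling one domain at a time. It is not Xen code: the sim_* names are
hypothetical, and both the lowest-free-VMID policy and the reservation of
VMID 0 are assumptions of the sketch. If every destroy leaked its VMID, the
pool would run out after 255 domains, long before dom1101, so the free path
presumably works most of the time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_VMID     256
#define INVALID_VMID 0   /* assumption: VMID 0 is reserved */

static bool sim_in_use[MAX_VMID];

/* Return the lowest free VMID, or MAX_VMID if the pool is exhausted. */
static unsigned int sim_alloc_vmid(void)
{
    for ( unsigned int v = 0; v < MAX_VMID; v++ )
        if ( !sim_in_use[v] )
        {
            sim_in_use[v] = true;
            return v;
        }
    return MAX_VMID;
}

/*
 * Create and destroy one domain at a time, 'cycles' times.
 * If 'leak' is true, the VMID is never released on destroy.
 * Returns the number of domains created before allocation failed
 * (equal to 'cycles' when the pool never ran out).
 */
static unsigned int sim_cycle_domains(unsigned int cycles, bool leak)
{
    memset(sim_in_use, 0, sizeof(sim_in_use));
    sim_in_use[INVALID_VMID] = true;      /* reserve the invalid VMID */

    for ( unsigned int i = 0; i < cycles; i++ )
    {
        unsigned int vmid = sim_alloc_vmid();

        if ( vmid == MAX_VMID )
            return i;                     /* pool exhausted */
        assert(vmid != INVALID_VMID);     /* reserved slot is never handed out */
        if ( !leak )
            sim_in_use[vmid] = false;     /* destroy releases the VMID */
    }
    return cycles;
}
```

Under these assumptions, sim_cycle_domains(1200, false) completes all 1200
cycles, while sim_cycle_domains(1200, true) stops after 255 domains.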

I would be interested to get a reproducer. I wrote a script to cycle a 
domain (create/destroy) in a loop, and I have not seen any issue after 
1200 cycles (and counting).

Cheers,

-- 
Julien Grall

