From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tim Deegan <tim@xen.org>
Subject: Re: frequently ballooning results in qemu exit
Date: Thu, 21 Mar 2013 12:15:47 +0000
Message-ID: <20130321121547.GB12338@ocelot.phlegethon.org>
References: <FAB5C136CA8BEA4DBEA2F641E3F536384A8AF9C3@szxeml538-mbx.china.huawei.com>
	<CAFLBxZa1OC73M9oJ2m9NhvH6c+_2L2wyDSRV8Q4MnYDw8s-XCA@mail.gmail.com>
	<5141A8B0.4050305@citrix.com>
	<FAB5C136CA8BEA4DBEA2F641E3F536384A8B089B@szxeml538-mbx.china.huawei.com>
	<20130314143403.GB5174@ocelot.phlegethon.org>
	<FAB5C136CA8BEA4DBEA2F641E3F536384A8B14D2@szxeml538-mbx.china.huawei.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Content-Disposition: inline
In-Reply-To: <FAB5C136CA8BEA4DBEA2F641E3F536384A8B14D2@szxeml538-mbx.china.huawei.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Hanweidong <hanweidong@huawei.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>, Andrew Cooper <Andrew.Cooper3@citrix.com>, Yanqiangjun <yanqiangjun@huawei.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Gonglei (Arei)" <arei.gonglei@huawei.com>, Anthony PERARD <anthony.perard@citrix.com>
List-Id: xen-devel@lists.xenproject.org

At 05:54 +0000 on 15 Mar (1363326854), Hanweidong wrote:
> > > I'm also curious about this. There is a window between memory balloon out
> > > and QEMU invalidate mapcache.
> > 
> > That by itself is OK; I don't think we need to provide any meaningful
> > semantics if the guest is accessing memory that it's ballooned out.
> > 
> > The question is where the SIGBUS comes from: either qemu has a mapping
> > of the old memory, in which case it can write to it safely, or it
> > doesn't, in which case it shouldn't try.
> 
> The error always happened at memcpy in if (is_write) branch in
> address_space_rw.

Sure, but _why_?  Why does this access cause SIGBUS?  Presumably there's
some part of the mapcache code that thinks it has a mapping there when
it doesn't.

> We found that, after the last xen_invalidate_map_cache, the mapcache entry related to the failed address was mapped:
> 	==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
> 	==xen_remap_bucket== begin size=1048576 ,address_index=7a3
> 	==xen_remap_bucket== end entry->paddr_index=7a3,entry->vaddr_base=2a2d9000,size=1048576,address_index=7a3

OK, so that's 0x2a2d9000 -- 0x2a3d8fff.

> 	==address_space_rw== ptr=2a39aec0
> 	==xen_map_cache== phys_addr=7a3c1ec4 size=0 lock=0
> 	==xen_map_cache==first return 2a2d9000+c1ec4=2a39aec4
> 	==address_space_rw== ptr=2a39aec4
> 	==xen_map_cache== phys_addr=7a3c1ec8 size=0 lock=0
> 	==xen_map_cache==first return 2a2d9000+c1ec8=2a39aec8
> 	==address_space_rw== ptr=2a39aec8
> 	==xen_map_cache== phys_addr=7a3c1ecc size=0 lock=0
> 	==xen_map_cache==first return 2a2d9000+c1ecc=2a39aecc
> 	==address_space_rw== ptr=2a39aecc

These are all to page 0x2a39a___.

> 	==xen_map_cache== phys_addr=7a16c108 size=0 lock=0
> 	==xen_map_cache== return 2a407000+6c108=2a473108
> 	==xen_map_cache== phys_addr=7a16c10c size=0 lock=0
> 	==xen_map_cache==first return 2a407000+6c10c=2a47310c
> 	==xen_map_cache== phys_addr=7a16c110 size=0 lock=0
> 	==xen_map_cache==first return 2a407000+6c110=2a473110
> 	==xen_map_cache== phys_addr=7a395000 size=0 lock=0
> 	==xen_map_cache== return 2a2d9000+95000=2a36e000
> 	==address_space_rw== ptr=2a36e000

And this is to page 0x2a36e___, a different page in the same bucket.
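
For anyone who wants to redo the arithmetic above, here is a rough sketch of
it.  The helper names are made up for illustration and are not qemu's; it
assumes 1MB buckets (size=1048576 in the log) and 4k pages.

```c
#include <stdint.h>

#define BUCKET_SHIFT 20  /* 1MB buckets, as in the log (size=1048576) */
#define PAGE_SHIFT   12  /* assumes 4k pages */

/* Which bucket a guest physical address falls in (address_index in the log). */
static inline uint64_t bucket_index(uint64_t phys_addr)
{
    return phys_addr >> BUCKET_SHIFT;
}

/* Offset of a guest physical address inside its bucket. */
static inline uint64_t bucket_offset(uint64_t phys_addr)
{
    return phys_addr & ((UINT64_C(1) << BUCKET_SHIFT) - 1);
}

/* Host address handed back for phys_addr, given the bucket's vaddr_base. */
static inline uint64_t host_addr(uint64_t vaddr_base, uint64_t phys_addr)
{
    return vaddr_base + bucket_offset(phys_addr);
}

/* Last byte covered by a bucket mapped at vaddr_base. */
static inline uint64_t bucket_last_byte(uint64_t vaddr_base)
{
    return vaddr_base + (UINT64_C(1) << BUCKET_SHIFT) - 1;
}

/* Host page number of a host address. */
static inline uint64_t host_page(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}
```

With vaddr_base=2a2d9000, phys_addr=7a3c1ec4 and phys_addr=7a395000 both
land in bucket 7a3, at host pages 2a39a and 2a36e respectively.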

>       here, the SIGBUS error occurred.

So that page isn't mapped.  Which means:
- it was never mapped (and the mapcache code didn't handle the error
  correctly at map time); or
- it was never mapped (and the mapcache hasn't checked its own records
  before using the map); or
- it was mapped (and something unmapped it in the meantime).

Why not add some tests in xen_remap_bucket to check that all the pages
that qemu records as mapped are actually there?
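
Something along these lines, say.  This is an untested, standalone sketch and
not a patch against the mapcache: the struct and function names are invented,
and err[] is assumed to be a per-page error array of the kind a bulk
foreign-mapping call fills in (0 for success, nonzero for failure).  The idea
is just to remember which pages really mapped and to refuse to hand out a
pointer into one that didn't.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4k pages */

/* Hypothetical per-bucket record: one byte per page, 1 = mapped OK. */
typedef struct {
    size_t   nb_pages;
    uint8_t *valid;
} bucket_validity;

/*
 * Record which pages of a bucket were really mapped.  err[i] is assumed
 * to be 0 if page i mapped and nonzero if it failed.
 * Returns the number of failed pages, or -1 if allocation fails.
 */
static int bucket_record(bucket_validity *b, const int *err, size_t nb_pages)
{
    int failed = 0;

    assert(b != NULL && err != NULL);
    b->nb_pages = 0;
    b->valid = calloc(nb_pages ? nb_pages : 1, 1);
    if (b->valid == NULL)
        return -1;
    b->nb_pages = nb_pages;
    for (size_t i = 0; i < nb_pages; i++) {
        if (err[i] == 0)
            b->valid[i] = 1;
        else
            failed++;
    }
    return failed;
}

/*
 * True only if every page touched by [offset, offset + size) is recorded
 * as mapped.  A size of 0 is treated as a check of the single page
 * containing offset.
 */
static bool bucket_range_valid(const bucket_validity *b,
                               size_t offset, size_t size)
{
    size_t first = offset >> SKETCH_PAGE_SHIFT;
    size_t last = (offset + (size ? size - 1 : 0)) >> SKETCH_PAGE_SHIFT;

    if (b->valid == NULL || last >= b->nb_pages)
        return false;
    for (size_t i = first; i <= last; i++) {
        if (!b->valid[i])
            return false;
    }
    return true;
}

/* Release the record. */
static void bucket_free(bucket_validity *b)
{
    free(b->valid);
    b->valid = NULL;
    b->nb_pages = 0;
}
```

Logging the count that bucket_record() returns at remap time, and the
offset whenever bucket_range_valid() says no, ought to show which of the
three cases above you are in.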

Tim.