From: Jui-Hao Chiang
Subject: mem_sharing: summarized problems when domain is dying
Date: Fri, 21 Jan 2011 11:19:03 -0500
To: Tim Deegan
Cc: MaoXiaoyun, xen devel
List-Id: xen-devel@lists.xenproject.org

Hi, Tim:

From tinnycloud's result, here I summarize the current problems and
findings of mem_sharing when a domain is dying.

(1) When the domain is dying, alloc_domheap_page() and
set_shared_p2m_entry() would just fail. So the shr_lock is not enough to
ensure that the domain won't die in the middle of the mem_sharing code.
As tinnycloud's code shows, would it be better to use
rcu_lock_domain_by_id before calling the above two functions?

(2) What's the proper behavior of nominate/share/unshare when the domain
is dying? The following is just my current guess. Please give comments
as well.

(2.1) nominate: return fail; but we need to check blktap2's code to make
sure it understands this and acts properly (should be a minor issue now).

(2.2) share: return success but skip the gfns of the dying domain, i.e.,
we don't remove them from the hash list, and don't update their p2m
entries (set_shared_p2m_entry). We believe that p2m_teardown will clean
them up later.

(2.3) unshare: this is the most problematic part. Because we are not able
to alloc_domheap_page at this moment, the only thing we can do is simply
skip the page and return. But what's the side effect?

(a) If p2m_teardown comes in, there is no problem. Just destroy it and
done.

(b) hap_nested_page_fault: if we return fail, will this cause a problem
for the guest? Or we can simply return success to cheat the guest. But
later the guest will trigger another page fault if it writes to the page
again.

(c) gnttab_map_grant_ref: this function specifies must_succeed to
gfn_to_mfn_unshare(), which would BUG if unshare() fails.

Do we really need (b) and (c) in the last steps of domain dying? If
that's the case, we need to have a special alloc_domheap_page for a
dying domain.

On Thu, Jan 20, 2011 at 4:19 AM, Tim Deegan wrote:
> At 07:19 +0000 on 20 Jan (1295507976), MaoXiaoyun wrote:
>> Hi:
>>
>>     The latest BUG in mem_sharing_alloc_page from mem_sharing_unshare_page.
>>     I printed heap info, which shows plenty memory left.
>>     Could domain be NULL during in unshare, or should it be locked by rcu_lock_domain_by_id ?
>>
>
> 'd' probably isn't NULL; more likely is that the domain is not allowed
> to have any more memory.  You should look at the values of d->max_pages
> and d->tot_pages when the failure happens.
>
> Cheers.
>
> Tim.
>

Bests,
Jui-Hao
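
P.S. The behavior guessed at in (2.1) to (2.3) can be sketched as a toy
model in plain C. This is not Xen code: struct toy_domain, the toy_*
functions, and the TOY_* result codes are all hypothetical names made up
for illustration. The only things borrowed from the discussion are the
is_dying flag and the max_pages/tot_pages limit that Tim mentions, and
the assumption that page allocation fails in either case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain; only the fields discussed above. */
struct toy_domain {
    bool is_dying;
    unsigned int tot_pages;
    unsigned int max_pages;
};

/* Hypothetical result codes, not Xen's error values. */
enum toy_result { TOY_OK = 0, TOY_FAIL = -1, TOY_SKIPPED = 1 };

/* Models the allocation step: assumed to fail for a dying domain or
 * when the domain has already reached max_pages. */
bool toy_alloc_page(struct toy_domain *d)
{
    assert(d != NULL);
    if (d->is_dying || d->tot_pages >= d->max_pages)
        return false;
    d->tot_pages++;
    return true;
}

/* (2.1) nominate: fail outright when the domain is dying. */
enum toy_result toy_nominate(const struct toy_domain *d)
{
    return d->is_dying ? TOY_FAIL : TOY_OK;
}

/* (2.2) share: report success, but leave a dying domain's p2m entry
 * untouched and rely on teardown to clean up later. */
enum toy_result toy_share(const struct toy_domain *d, bool *p2m_updated)
{
    *p2m_updated = !d->is_dying;
    return TOY_OK;
}

/* (2.3) unshare: skip the page for a dying domain; otherwise a private
 * copy is needed, so the allocation must succeed. */
enum toy_result toy_unshare(struct toy_domain *d)
{
    if (d->is_dying)
        return TOY_SKIPPED;
    return toy_alloc_page(d) ? TOY_OK : TOY_FAIL;
}
```

In this model a caller that passes must_succeed, as in (c), would have to
decide what to do with TOY_SKIPPED; the sketch deliberately leaves that
question open, since it is exactly the one asked above.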