* mem_sharing: summarized problems when domain is dying
@ 2011-01-21 16:19 Jui-Hao Chiang
From: Jui-Hao Chiang @ 2011-01-21 16:19 UTC
  To: Tim Deegan; +Cc: MaoXiaoyun, xen devel

Hi, Tim:

From tinnycloud's results, here is a summary of the current mem_sharing
problems and findings when a domain is dying.
(1) When a domain is dying, alloc_domheap_page() and
set_shared_p2m_entry() simply fail. So holding shr_lock is not enough
to ensure that the domain won't start dying in the middle of the
mem_sharing code. As tinnycloud's code shows, would it be better to
call rcu_lock_domain_by_id() before calling the above two functions?
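
To make the failure in (1) concrete, here is a toy model of the two
reasons an allocation on behalf of a domain can be refused. It is only a
sketch, not Xen code: struct toy_domain, toy_alloc_page() and the result
codes are names I made up for illustration, and I am assuming the real
checks have roughly this shape.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the few struct domain fields discussed in this thread. */
struct toy_domain {
    bool is_dying;           /* set once domain destruction has started */
    unsigned int tot_pages;  /* pages currently assigned to the domain */
    unsigned int max_pages;  /* upper limit on assigned pages */
};

enum toy_alloc_result {
    TOY_ALLOC_OK,
    TOY_ALLOC_DYING,       /* refused: the domain is being torn down */
    TOY_ALLOC_OVER_LIMIT   /* refused: tot_pages would exceed max_pages */
};

/*
 * Hypothetical analogue of alloc_domheap_page() for one page.
 * Free heap memory is irrelevant here: both refusals are per-domain.
 */
enum toy_alloc_result toy_alloc_page(struct toy_domain *d)
{
    assert(d != NULL);

    if (d->is_dying)
        return TOY_ALLOC_DYING;

    if (d->tot_pages + 1 > d->max_pages)
        return TOY_ALLOC_OVER_LIMIT;

    d->tot_pages++;
    return TOY_ALLOC_OK;
}
```

In this model both refusals happen even when the heap has plenty of free
memory, so a caller that only checks for "allocation failed" cannot tell
a dying domain from one that has reached its page limit.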

(2) What's the proper behavior of nominate/share/unshare when a domain
is dying? The following is just my current guess. Please comment.

(2.1) nominate: return failure; but we need to check blktap2's code to
make sure it understands this and acts properly (should be a minor
issue for now)

(2.2) share: return success but skip the gfns of the dying domain, i.e.,
we don't remove them from the hash list, and don't update their p2m
entries (set_shared_p2m_entry). We believe that p2m_teardown will
clean them up later.

(2.3) unshare: this is the most problematic part. Because we are not
able to alloc_domheap_page() at this point, the only thing we can do is
skip the page and return. But what are the side effects?
(a) If p2m_teardown comes in, there is no problem. Just destroy it and
we are done.
(b) hap_nested_page_fault: if we return failure, will this cause a
problem for the guest? Or we could simply return success to cheat the
guest, but then the guest will trigger another page fault if it writes
to the page again.
(c) gnttab_map_grant_ref: this function passes must_succeed to
gfn_to_mfn_unshare(), which will BUG if unshare() fails.

Do we really need (b) and (c) in the last stages of a domain dying? If
that's the case, we need to have a special alloc_domheap_page() for
dying domains.
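
The guesses in (2.1) to (2.3) can be written down as a small decision
function. Again this is only a sketch of the proposal, not existing Xen
code: the enum names and toy_dying_policy() are hypothetical, and the
"skip and return failure" choice for unshare is just one of the two
options raised in (b).

```c
#include <assert.h>
#include <stdbool.h>

enum toy_op { TOY_NOMINATE, TOY_SHARE, TOY_UNSHARE };

enum toy_outcome {
    TOY_PROCEED,       /* domain is alive: run the normal path */
    TOY_FAIL,          /* (2.1) report failure to the caller */
    TOY_SKIP_SUCCESS,  /* (2.2) leave hash list and p2m alone, return success */
    TOY_SKIP_FAIL,     /* (2.3) skip the page and return failure */
    TOY_WOULD_BUG      /* (2.3)(c) must_succeed caller cannot tolerate failure */
};

/* Hypothetical policy for a mem_sharing operation on a possibly dying domain. */
enum toy_outcome toy_dying_policy(enum toy_op op, bool is_dying,
                                  bool must_succeed)
{
    if (!is_dying)
        return TOY_PROCEED;

    switch (op) {
    case TOY_NOMINATE:
        return TOY_FAIL;
    case TOY_SHARE:
        return TOY_SKIP_SUCCESS;
    case TOY_UNSHARE:
        /* No page can be allocated, so unshare cannot really complete. */
        return must_succeed ? TOY_WOULD_BUG : TOY_SKIP_FAIL;
    }

    /* Unknown operation: be conservative. */
    return TOY_FAIL;
}
```

The TOY_WOULD_BUG case is the one that matters for (c): as long as a
caller insists on must_succeed, skipping the page is not an option
unless the allocation itself is made to work for dying domains.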


On Thu, Jan 20, 2011 at 4:19 AM, Tim Deegan <Tim.Deegan@citrix.com> wrote:
> At 07:19 +0000 on 20 Jan (1295507976), MaoXiaoyun wrote:
>> Hi:
>>
>>             The latest BUG in mem_sharing_alloc_page from mem_sharing_unshare_page.
>>             I printed heap info, which shows plenty memory left.
>>             Could domain be NULL during in unshare, or should it be locked by rcu_lock_domain_by_id ?
>>
>
> 'd' probably isn't NULL; more likely is that the domain is not allowed
> to have any more memory.  You should look at the values of d->max_pages
> and d->tot_pages when the failure happens.
>
> Cheers.
>
> Tim.
>

Bests,
Jui-Hao
