All of lore.kernel.org
 help / color / mirror / Atom feed
* [memory sharing] bug on get_page_and_type
@ 2011-01-31 10:13 tinnycloud
  2011-01-31 10:49 ` George Dunlap
  0 siblings, 1 reply; 14+ messages in thread
From: tinnycloud @ 2011-01-31 10:13 UTC (permalink / raw)
  To: 'xen devel'; +Cc: george.dunlap, tim.deegan, juihaochiang


[-- Attachment #1.1: Type: text/plain, Size: 5192 bytes --]

Hi:
 

Attached is the whole patch suit for latest xen-4.0-testing,changeset 21443,


It comes from George, Tim and JuiHao, also, has some extra debug info.

 

On most occasion, memory sharing works fine, but still exists a bug.

       I've been tracing this problem for a while.

       There is a bug on get_page_and_type() in mem_sharing_share_pages()

 

--------------------------------------------mem_sharing_share_pages()-------
----------------

789         if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){

790             mem_sharing_debug_gfn(cd, gfn->gfn);

791             mem_sharing_debug_gfn(sd, sgfn->gfn);

792             printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
cd->is_dying, sd->is_dying, spage, se->mfn);

793             printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
owner_id=%d\n",

794                     mfn_x(page_to_mfn(spage)),

795                     spage->count_info, 

 796                     spage->u.inuse.type_info,

797                     page_get_owner(spage)->domain_id);

798             BUG();

799         }

 

 

Below painc log contains the debug info from line 790-798.

We saw that 180000000000000 which is PGC_state_free,

So it looks like a shared page has been freed unexpectly.

 

 

(XEN) teardown 64

(XEN) teardown 66

blktap_sysfs_destroy

blktap_sysfs_create: adding attributes for dev ffff880109d0c200

blktap_sysfs_destroy

__ratelimit: 1 callbacks suppressed

blktap_sysfs_destroy

(XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
ci=8000000000000005, ti=8400000000000001, owner_id=32755

(XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff

(XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895

(XEN) Debug page: MFN=507895 is ci= 180000000000000, ti=8400000000000001,
owner_id=32755

(XEN) Xen BUG at mem_sharing.c:798

(XEN) ----[ Xen-4.0.2-rc2-pre  x86_64  debug=n  Not tainted ]----

(XEN) CPU:    0

(XEN) RIP:    e008:[<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0

(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor

(XEN) rax: 0000000000000000   rbx: ffff83044ef76000   rcx: 0000000000000092

(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c4802237c4

(XEN) rbp: ffff83030d99e310   rsp: ffff82c48035fc48   r8:  0000000000000001

(XEN) r9:  0000000000000001   r10: 00000000fffffff8   r11: 0000000000000005

(XEN) r12: ffff83050b558000   r13: ffff830626ec5740   r14: 0000000000347967

(XEN) r15: ffff83030d99e068   cr0: 0000000080050033   cr4: 00000000000026f0

(XEN) cr3: 000000010e642000   cr2: 00002abdc4809000

(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008

(XEN) Xen stack trace from rsp=ffff82c48035fc48:

(XEN)    000000000001e8dc ffff83030d99e068 0000000000507895 ffff83030d99e050

(XEN)    ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300 ffff830626afbbd0

(XEN)    0000000080372980 ffff82c48035fe38 ffff83023febe000 00000000008f7000

(XEN)    0000000000305000 0000000000000006 0000000000000006 ffff82c4801c44a4

(XEN)    ffff82c480258188 ffff82c48011cb89 000000000001e8dc fffffffffffffff3

(XEN)    ffff82c48035fe28 ffff82c480148503 000003d932971281 0000000000000080

(XEN)    000003d9329611d6 ffff82c48018aedf ffff82c4803903a8 ffff82c48035fd38

(XEN)    ffff82c480390380 ffff82c48035fe28 0000000000000002 0000000000000000

(XEN)    0000000000000002 0000000000000000 ffff83023febe000 ffff83023ff80080

(XEN)    ffff82c48035fe58 0000000000000000 ffff82c48022ca80 0000000000000000

(XEN)    0000000000000002 ffff82c48018a452 0000000000000000 ffff82c48016f6bb

(XEN)    ffff82c48035fe58 ffff82c4801538ea 000000023ab79067 fffffffffffffff3

(XEN)    ffff82c48035fe28 00000000008f7000 0000000000305000 0000000000000006

(XEN)    0000000000000006 ffff82c4801042b3 0000000000000000 ffff82c48010d2a5

(XEN)    ffff830477fb51e8 0000000000000000 00007ff000000200 ffff8300bf76e000

(XEN)    0000000600000039 0000000000000000 00007fc09d744003 0000000000325258

(XEN)    0000000000347967 ffffffffff600429 000000004d463b17 0000000000062fe2

(XEN)    0000000000000000 00007fc09d744070 00007fc09d744000 00007fffcfec0d20

(XEN)    00007fc09d744078 0000000000430fd8 00007fffcfec0d88 00000000019f1ca8

(XEN)    0000f05c00000000 00007fc09f4c2718 0000000000000000 0000000000000246

(XEN) Xen call trace:

(XEN)    [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0

(XEN)    [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130

(XEN)    [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0

(XEN)    [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0

(XEN)    [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0

(XEN)    [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50

(XEN)    [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130

(XEN)    [<ffff82c4801538ea>] ack_edge_ioapic_irq+0x2a/0x70

(XEN)    [<ffff82c4801042b3>] do_domctl+0x163/0xfe0

(XEN)    [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0

(XEN)    [<ffff82c4801e7169>] syscall_enter+0xa9/0xae

(XEN)    

(XEN) 

(XEN) ****************************************

(XEN) Panic on CPU 0:

(XEN) Xen BUG at mem_sharing.c:798

(XEN) ****************************************

(XEN) 

(XEN) Manual reset required ('noreboot' specified)

 


[-- Attachment #1.2: Type: text/html, Size: 20329 bytes --]

[-- Attachment #2: memsharing.patch.patch --]
[-- Type: application/octet-stream, Size: 26020 bytes --]

diff -rubN b/xen/arch/x86/mm/hap/p2m-ept.c a/xen/arch/x86/mm/hap/p2m-ept.c
--- b/xen/arch/x86/mm/hap/p2m-ept.c	2011-01-31 10:51:55.000000000 +0800
+++ a/xen/arch/x86/mm/hap/p2m-ept.c	2011-01-19 16:36:27.000000000 +0800
@@ -38,17 +38,22 @@
                                       p2m_query_t q)
 {
     int r;
+    int do_locking = !p2m_locked_by_me(d->arch.p2m);
+
+    if(do_locking)
     p2m_lock(d->arch.p2m);
 
     /* Check to make sure this is still PoD */
     if ( entry->avail1 != p2m_populate_on_demand )
     {
+        if(do_locking)
         p2m_unlock(d->arch.p2m);
         return 0;
     }
 
     r = p2m_pod_demand_populate(d, gfn, order, q);
 
+    if(do_locking)
     p2m_unlock(d->arch.p2m);
 
     return r;
diff -rubN b/xen/arch/x86/mm/mem_sharing.c a/xen/arch/x86/mm/mem_sharing.c
--- b/xen/arch/x86/mm/mem_sharing.c	2011-01-31 10:51:55.000000000 +0800
+++ a/xen/arch/x86/mm/mem_sharing.c	2011-01-30 19:12:24.000000000 +0800
@@ -57,6 +57,7 @@
 
 static shr_handle_t next_handle = 1;
 static atomic_t nr_saved_mfns = ATOMIC_INIT(0); 
+int mem_sharing_debug_mfn(unsigned long mfn);
 
 typedef struct shr_hash_entry 
 {
@@ -168,6 +169,65 @@
     xfree(gfn);
 }
 
+static shr_hash_entry_t* mem_sharing_hash_lookup_by_gfn(
+                            struct domain *d,
+                            unsigned long gfn 
+        )
+{
+    shr_hash_entry_t *ce;
+    int i;
+ 
+    for(i = 0; i < SHR_HASH_LENGTH; i++){
+        ce = shr_hash[i];
+        while(ce != NULL)
+        {
+            struct list_head *le, *te;
+            struct gfn_info *sgfn;
+            list_for_each_safe(le, te, &ce->gfns)
+            {
+                sgfn = list_entry(le, struct gfn_info, list);
+                if(sgfn->domain == d->domain_id && sgfn->gfn == gfn){
+                    mem_sharing_debug_mfn(mfn_x(ce->mfn));
+                    printk("gfn handle %lu\n", ce->handle);
+                }
+            }
+            ce = ce->next;
+        }
+    }
+
+    return NULL;
+}
+static unsigned long mem_sharing_hash_lookup_by_mfn(
+        unsigned long mfn
+        )
+{
+    shr_hash_entry_t *ce;
+    int i;
+ 
+    for(i = 0; i < SHR_HASH_LENGTH; i++){
+        ce = shr_hash[i];
+        while(ce != NULL )
+        {
+            if(mfn_x(ce->mfn) == mfn){
+                struct list_head *le, *te;
+                struct gfn_info *sgfn = NULL;
+                mem_sharing_debug_mfn(mfn_x(ce->mfn));
+                printk("handle %lu ", ce->handle);
+                list_for_each_safe(le, te, &ce->gfns)
+                {
+                    sgfn = list_entry(le, struct gfn_info, list);
+                    printk("did %u, gfn %lu ", sgfn->domain, sgfn->gfn);
+                }
+                printk("end\n");
+                return sgfn->gfn;
+            }
+            ce = ce->next;
+        }
+    }
+
+    return 0;
+}
+
 static shr_hash_entry_t* mem_sharing_hash_lookup(shr_handle_t handle)
 {
     shr_hash_entry_t *e;
@@ -183,6 +243,73 @@
     return NULL;
 }
 
+void mem_sharing_hash_iter_detail(void)
+{
+    shr_hash_entry_t *ce;
+    int i,j;
+    int count_e= 0;
+
+    shr_lock();
+    j = 0;
+    for(i = 0; i < SHR_HASH_LENGTH; i++){
+
+        ce = shr_hash[i];
+        if(ce && ce->next != NULL)
+            printk("===> %d <==== ", i);
+
+        while(ce != NULL)
+        {
+            struct list_head *le, *te;
+            struct gfn_info *gfn;
+            j = 0;
+            printk("--%lu--", ce->handle);
+            list_for_each_safe(le, te, &ce->gfns)
+            {
+                gfn = list_entry(le, struct gfn_info, list);
+                j++;
+                printk("%d ", gfn->domain);
+            }
+            printk(",(%d)\n", j);
+            count_e++;
+            ce = ce->next;
+        }
+    }
+    printk("===>total handles %d total gfns %d next_handle %lu\n", count_e, j, next_handle);
+    shr_unlock();
+}
+
+
+void mem_sharing_hash_iter(void)
+{
+    shr_hash_entry_t *ce;
+    int i,j;
+    int count_e= 0;
+
+    shr_lock();
+    j = 0;
+    for(i = 0; i < SHR_HASH_LENGTH; i++){
+//      printk("========> %d <\n", i);
+        ce = shr_hash[i];
+        while(ce != NULL)
+        {
+            struct list_head *le, *te;
+            struct gfn_info *gfn;
+//          printk("--%lu--", ce->handle);
+            list_for_each_safe(le, te, &ce->gfns)
+            {
+                gfn = list_entry(le, struct gfn_info, list);
+                j++;
+//              printk("%d ", gfn->domain);
+            }
+//          printk(",(%d)\n", j);
+            count_e++;
+            ce = ce->next;
+        }
+    }
+    printk("===>total handles %d total gfns %d next_handle: %lu\n", count_e, j, next_handle);
+    shr_unlock();
+}
+
 static shr_hash_entry_t* mem_sharing_hash_insert(shr_handle_t handle, mfn_t mfn)
 {
     shr_hash_entry_t *e, **ee;
@@ -293,6 +420,7 @@
 #endif
 
 
+extern void pagealloc_info(unsigned char key);
 static struct page_info* mem_sharing_alloc_page(struct domain *d, 
                                                 unsigned long gfn,
                                                 int must_succeed)
@@ -310,6 +438,7 @@
         /* We do not support 'must_succeed' any more. External operations such
          * as grant table mappings may fail with OOM condition! 
          */
+        pagealloc_info('m');
         BUG();
     }
     else
@@ -402,7 +531,7 @@
 static grant_entry_header_t *
 shared_entry_header(struct grant_table *t, grant_ref_t ref)
 {
-    ASSERT(t->gt_version != 0);
+    BUG_ON(t->gt_version == 0);
     if (t->gt_version == 1)
         return (grant_entry_header_t*)&shared_entry_v1(t, ref);
     else
@@ -451,16 +580,18 @@
     if(list_has_one_entry(&gfn->list))
     {
         d = get_domain_by_id(gfn->domain);
+        if(!d){
+            printk("========>did is %d\n", gfn->domain);
+            return -1;
+        }
         BUG_ON(!d);
         if(sharing) 
             atomic_inc(&d->shr_pages);
         else
             atomic_dec(&d->shr_pages);
         put_domain(d);
-
         return 1;
     }
-    mem_sharing_audit();
 
     return 0;
 }
@@ -505,6 +636,7 @@
 
     *phandle = 0UL;
 
+    shr_lock();
     mfn = gfn_to_mfn(d, gfn, &p2mt);
 
     /* Check if mfn is valid */
@@ -512,29 +644,33 @@
     if (!mfn_valid(mfn))
         goto out;
 
+    /* Return the handle if the page is already shared */
+    page = mfn_to_page(mfn);
+    if (p2m_is_shared(p2mt)) {
+        *phandle = page->shr_handle;
+        ret = 0;
+        goto out;
+    }
+
     /* Check p2m type */
     if (!p2m_is_sharable(p2mt))
         goto out;
 
     /* Try to convert the mfn to the sharable type */
-    page = mfn_to_page(mfn);
     ret = page_make_sharable(d, page, expected_refcnt); 
     if(ret) 
         goto out;
 
     /* Create the handle */
     ret = -ENOMEM;
-    shr_lock(); 
     handle = next_handle++;  
     if((hash_entry = mem_sharing_hash_insert(handle, mfn)) == NULL)
     {
-        shr_unlock();
         goto out;
     }
     if((gfn_info = mem_sharing_gfn_alloc()) == NULL)
     {
         mem_sharing_hash_destroy(hash_entry);
-        shr_unlock();
         goto out;
     }
 
@@ -545,10 +681,14 @@
          * it a few lines above.
          * The mfn needs to revert back to rw type. This should never fail,
          * since no-one knew that the mfn was temporarily sharable */
+    
+        printk("==make share fail");
+        mem_sharing_debug_gfn(d, gfn);
         BUG_ON(page_make_private(d, page) != 0);
         mem_sharing_hash_destroy(hash_entry);
         mem_sharing_gfn_destroy(gfn_info, 0);
-        shr_unlock();
+        printk("==make share fail2");
+        mem_sharing_debug_gfn(d, gfn);
         goto out;
     }
 
@@ -562,11 +702,13 @@
     gfn_info->domain = d->domain_id;
     page->shr_handle = handle;
     *phandle = handle;
-    shr_unlock();
 
     ret = 0;
 
 out:
+    shr_unlock();
+    if(0)
+        mem_sharing_hash_lookup_by_mfn(mfn_x(mfn));
     return ret;
 }
 
@@ -575,9 +717,16 @@
     shr_hash_entry_t *se, *ce;
     struct page_info *spage, *cpage;
     struct list_head *le, *te;
-    struct gfn_info *gfn;
-    struct domain *d;
+    struct gfn_info *gfn = NULL;
+    struct domain *cd = NULL;
+    struct domain *sd = NULL;
+    struct gfn_info *sgfn;
     int ret;
+    int count = 0;
+
+    BUG_ON(ch == sh);
+    BUG_ON(ch > next_handle);
+    BUG_ON(sh > next_handle);
 
     shr_lock();
 
@@ -590,30 +739,97 @@
     spage = mfn_to_page(se->mfn); 
     cpage = mfn_to_page(ce->mfn); 
     /* gfn lists always have at least one entry => save to call list_entry */
-    mem_sharing_gfn_account(gfn_get_info(&ce->gfns), 1);
-    mem_sharing_gfn_account(gfn_get_info(&se->gfns), 1);
+    gfn = gfn_get_info(&ce->gfns);
+    if(!list_has_one_entry(&gfn->list)){
+        list_for_each_safe(le, te, &ce->gfns)
+        gfn = list_entry(le, struct gfn_info, list);
+        printk("gfn %lx did %u\n", gfn->gfn, gfn->domain);
+        BUG();
+    }
+
+    if(mem_sharing_gfn_account(gfn_get_info(&ce->gfns), 1) == -1){
+        printk("=====client not found, server %d client %d\n", gfn_get_info(&se->gfns)->domain, gfn_get_info(&ce->gfns)->domain);
+        ret = XEN_DOMCTL_MEM_SHARING_C_HANDLE_INVALID;
+        goto err_out;
+    }
+
+    sgfn = gfn_get_info(&se->gfns);
+    if(mem_sharing_gfn_account(sgfn, 1)==-1){
+        printk("=====server not found, server %d client %d\n", gfn_get_info(&se->gfns)->domain, gfn_get_info(&ce->gfns)->domain);
+        ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID;
+        goto err_out;
+    }
+
     list_for_each_safe(le, te, &ce->gfns)
     {
         gfn = list_entry(le, struct gfn_info, list);
         /* Get the source page and type, this should never fail 
          * because we are under shr lock, and got non-null se */
-        BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page));
+        cd = rcu_lock_domain_by_id(gfn->domain);
+        sd = rcu_lock_domain_by_id(sgfn->domain);
+        if(!cd || cd->is_dying || !sd || sd->is_dying){
+            if(!cd)
+                printk("cd is NULL %d\n", gfn->domain);
+            else if(cd->is_dying)
+                printk("cd is dying %d %d\n", cd->is_dying, gfn->domain);
+            else if(!sd)
+                printk("sd is NULL %d\n", gfn->domain);
+            else if(sd->is_dying){
+                printk("sd is dying %d %d %d c %d\n", sd->is_dying, sgfn->domain, gfn->domain, count++);
+            }
+
+            /*release lock if we have d*/
+            if(cd)
+                rcu_unlock_domain(cd);
+            if(sd)
+                rcu_unlock_domain(sd);
+            continue;
+        }
+
+        if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
+            mem_sharing_debug_gfn(cd, gfn->gfn);
+            mem_sharing_debug_gfn(sd, sgfn->gfn);
+            printk("c->dying %d s->dying %d spage %p se->mfn %lx\n", cd->is_dying, sd->is_dying, spage, se->mfn);
+            printk("Debug page: MFN=%lx is ci=%lx, ti=%lx, owner_id=%d\n",
+                    mfn_x(page_to_mfn(spage)), 
+                    spage->count_info, 
+                    spage->u.inuse.type_info,
+                    page_get_owner(spage)->domain_id);
+            BUG();
+        }
         /* Move the gfn_info from ce list to se list */
         list_del(&gfn->list);
-        d = get_domain_by_id(gfn->domain);
-        BUG_ON(!d);
-        BUG_ON(set_shared_p2m_entry(d, gfn->gfn, se->mfn) == 0);
-        put_domain(d);
+        //d = get_domain_by_id(gfn->domain);
+        //mem_sharing_debug_gfn(d, gfn->gfn);
+        BUG_ON(set_shared_p2m_entry(cd, gfn->gfn, se->mfn) == 0);
         list_add(&gfn->list, &se->gfns);
         put_page_and_type(cpage);
+        rcu_unlock_domain(cd);
+        rcu_unlock_domain(sd);
+        //mem_sharing_debug_gfn(d, gfn->gfn);
     } 
-    ASSERT(list_empty(&ce->gfns));
+    /* Free the client page when no gfn refer to*/
+    if (list_empty(&ce->gfns) && test_and_clear_bit(_PGC_allocated, &cpage->count_info)){
+        put_page(cpage);
     mem_sharing_hash_delete(ch);
     atomic_inc(&nr_saved_mfns);
-    /* Free the client page */
-    if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
-        put_page(cpage);
     ret = 0;
+    }else{
+        ret = XEN_DOMCTL_MEM_SHARING_S_HANDLE_INVALID;
+    }
+
+/*    mem_sharing_debug_gfn(d, gfn->gfn);
+    printk("Debug page: MFN=%lx is ci=%lx, ti=%lx, owner_id=%d, h=%lu\n",
+           mfn_x(page_to_mfn(cpage)), 
+           cpage->count_info, 
+           cpage->u.inuse.type_info,
+           page_get_owner(cpage)->domain_id, cpage->shr_handle);
+    printk("Debug page: MFN=%lx is ci=%lx, ti=%lx, owner_id=%d, h=%lu\n",
+           mfn_x(page_to_mfn(cpage)), 
+           spage->count_info, 
+           spage->u.inuse.type_info,
+           page_get_owner(spage)->domain_id, spage->shr_handle);
+*/
     
 err_out:
     shr_unlock();
@@ -634,16 +850,35 @@
     struct gfn_info *gfn_info = NULL;
     shr_handle_t handle;
     struct list_head *le;
+    struct domain *ld = NULL;
 
+    mem_sharing_audit();
+    /* Remove the gfn_info from the list */
+    shr_lock();
     mfn = gfn_to_mfn(d, gfn, &p2mt);
+    /* Has someone already unshared it? */
+
+    if (!p2m_is_shared(p2mt)){
+        printk("===>already unshared, flag %u\n", flags);
+        shr_unlock();
+        return 0;
+    }
+
+    if (!mfn_valid(mfn)){
+        mem_sharing_debug_gfn(d, gfn);
+    }
 
     page = mfn_to_page(mfn);
     handle = page->shr_handle;
  
-    mem_sharing_audit();
     /* Remove the gfn_info from the list */
-    shr_lock();
     hash_entry = mem_sharing_hash_lookup(handle); 
+
+    if(!hash_entry){
+        printk("entry handle %lu\n",handle);
+        mem_sharing_debug_gfn(d, gfn);
+        mem_sharing_hash_lookup_by_gfn(d,gfn);
+    }
     list_for_each(le, &hash_entry->gfns)
     {
         gfn_info = list_entry(le, struct gfn_info, list);
@@ -677,18 +912,51 @@
         return 0;
     }
  
+    ld = rcu_lock_domain_by_id(d->domain_id);                                                                                                         
+    BUG_ON(!ld);
+
     ret = page_make_private(d, page);
     BUG_ON(last_gfn & ret);
     if(ret == 0) goto private_page_found;
         
+    if(ld->is_dying )
+    {
+        if(!ld)
+            printk("d is NULL %d\n", d->domain_id);
+        else
+            printk("d is dying %d %d\n", d->is_dying, d->domain_id);
+
+        /*decrease page type count and destory gfn*/
+        put_page_and_type(page);
+        mem_sharing_gfn_destroy(gfn_info, !last_gfn);
+
+        if(last_gfn) 
+            mem_sharing_hash_delete(handle);
+        else 
+            /* Even though we don't allocate a private page, we have to account
+             * for the MFN that originally backed this PFN. */
+            atomic_dec(&nr_saved_mfns);
+
+        /*set mfn invalid*/
+        printk("did %d gfn %lx to invalid\n", d->domain_id, gfn);
+        BUG_ON(set_shared_p2m_entry_invalid(d, gfn)==0);
+        rcu_unlock_domain(ld);
+        shr_unlock();
+        return -EAGAIN;
+    }
+
     old_page = page;
     page = mem_sharing_alloc_page(d, gfn, flags & MEM_SHARING_MUST_SUCCEED);
-    BUG_ON(!page && (flags & MEM_SHARING_MUST_SUCCEED));
     if(!page) 
     {
+        mem_sharing_debug_gfn(d, gfn);
+        printk("---domain is %d, max_pages %u, total_pages %u \n", d->is_dying, d->max_pages, d->tot_pages);
+        BUG_ON(!d);
+        BUG_ON(!page && (flags & MEM_SHARING_MUST_SUCCEED));
         /* We've failed to obtain memory for private page. Need to re-add the
          * gfn_info to relevant list */
         list_add(&gfn_info->list, &hash_entry->gfns);
+        rcu_unlock_domain(ld);
         shr_unlock();
         return -ENOMEM;
     }
@@ -709,7 +977,6 @@
         mem_sharing_hash_delete(handle);
     else
         atomic_dec(&nr_saved_mfns);
-    shr_unlock();
 
     if(p2m_change_type(d, gfn, p2m_ram_shared, p2m_ram_rw) != 
                                                 p2m_ram_shared) 
@@ -719,7 +986,8 @@
     }
     /* Update m2p entry */
     set_gpfn_from_mfn(mfn_x(page_to_mfn(page)), gfn);
-
+    rcu_unlock_domain(ld);
+    shr_unlock();
     return 0;
 }
 
diff -rubN b/xen/arch/x86/mm/p2m.c a/xen/arch/x86/mm/p2m.c
--- b/xen/arch/x86/mm/p2m.c	2011-01-31 10:51:55.000000000 +0800
+++ a/xen/arch/x86/mm/p2m.c	2011-01-30 22:20:53.000000000 +0800
@@ -1720,6 +1720,8 @@
 #endif
 
     p2m_lock(p2m);
+    printk(KERN_INFO "teardown %d\n",d->domain_id);
+    printk("teardown %d\n",d->domain_id);
 
 
 #ifdef __x86_64__
@@ -2426,11 +2428,35 @@
     return rc;
 }
 
+int
+set_shared_p2m_entry_invalid(struct domain *d, unsigned long gfn)
+{
+    int rc = 0;
+    int need_lock = !p2m_locked_by_me(d->arch.p2m);
+
+    if ( !paging_mode_translate(d) )
+        return 0;
+
+    if ( need_lock ) 
+        p2m_lock(d->arch.p2m);
+    
+    rc = set_p2m_entry(d, gfn, _mfn(INVALID_MFN), 0, p2m_invalid);
+
+    if ( need_lock ) 
+        p2m_unlock(d->arch.p2m);
+
+    if ( 0 == rc )
+        gdprintk(XENLOG_ERR,
+            "set_mmio_p2m_entry: set_p2m_entry failed! mfn=%08lx %d\n",
+            gmfn_to_mfn(d, gfn), d->is_dying);
+    return rc;
+}
 #ifdef __x86_64__
 int
 set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
     int rc = 0;
+    int need_lock = !p2m_locked_by_me(d->arch.p2m);
     p2m_type_t ot;
     mfn_t omfn;
 
@@ -2446,7 +2472,15 @@
     set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY);
 
     P2M_DEBUG("set shared %lx %lx\n", gfn, mfn_x(mfn));
+    if ( need_lock ) 
+        p2m_lock(d->arch.p2m);
+    if(!mfn_valid(mfn)){
+        printk("did %d gfn %lx ==\n", d->domain_id, gfn);
+        BUG();
+    }
     rc = set_p2m_entry(d, gfn, mfn, 0, p2m_ram_shared);
+    if ( need_lock ) 
+        p2m_unlock(d->arch.p2m);
     if ( 0 == rc )
         gdprintk(XENLOG_ERR,
             "set_mmio_p2m_entry: set_p2m_entry failed! mfn=%08lx\n",
diff -rubN b/xen/arch/x86/mm.c a/xen/arch/x86/mm.c
--- b/xen/arch/x86/mm.c	2011-01-31 10:51:55.000000000 +0800
+++ a/xen/arch/x86/mm.c	2011-01-31 11:26:21.000000000 +0800
@@ -116,6 +116,7 @@
 #include <asm/setup.h>
 #include <asm/mem_sharing.h>
 
+extern int mem_sharing_debug_mfn(unsigned long mfn);
 /*
  * Mapping of first 2 or 4 megabytes of memory. This is mapped with 4kB
  * mappings to avoid type conflicts with fixed-range MTRRs covering the
@@ -2342,7 +2343,8 @@
 
                 /* No special validation needed for writable pages. */
                 /* Page tables and GDT/LDT need to be scanned for validity. */
-                if ( type == PGT_writable_page )
+                if ( type == PGT_writable_page || type == 
+                    PGT_shared_page )
                     nx |= PGT_validated;
             }
         }
@@ -3167,15 +3169,12 @@
                     else if ( p2m_ram_shared == l1e_p2mt )
                     {
                         /* Unshare the page for RW foreign mappings */
-                        if ( l1e_get_flags(l1e) & _PAGE_RW )
-                        {
                             rc = mem_sharing_unshare_page(pg_owner, 
                                                           l1e_get_pfn(l1e), 
                                                           0);
                             if ( rc )
                                 break; 
                         }
-                    } 
 #endif
 
                     okay = mod_l1_entry(va, l1e, mfn,
@@ -3857,32 +3856,24 @@
                        struct page_info *page, 
                        int expected_refcnt)
 {
-    unsigned long x, nx, y;
-
-    /* Acquire ref first, so that the page doesn't dissapear from us */
-    if(!get_page(page, d))
-        return -EINVAL;
-
     spin_lock(&d->page_alloc_lock);
 
     /* Change page type and count atomically */
-    y = page->u.inuse.type_info;
-    nx = PGT_shared_page | PGT_validated | 1; 
-    do {
-        x = y;
-        /* We can only change the type if count is zero, and 
-           type is PGT_none */
-        if((x & (PGT_type_mask | PGT_count_mask)) != PGT_none)
+    if (!get_page_and_type(page, d, PGT_shared_page) )
         {
-            put_page(page);
             spin_unlock(&d->page_alloc_lock);
-            return -EEXIST;
+        return -EINVAL;
         }
-        y = cmpxchg(&page->u.inuse.type_info, x, nx);
-    } while(x != y);
 
-    /* Check if the ref count is 2. The first from PGT_allocated, and the second
-     * from get_page at the top of this function */
+    /* Check it wasn't already sharable and undo if it was */
+    if ( (page->u.inuse.type_info & PGT_count_mask) != 1 )
+    {
+        put_page_and_type(page);
+        spin_unlock(&d->page_alloc_lock);
+        return -EEXIST;
+    }
+    /* Check if the ref count is 2. The first from PGT_allocated, and
+       +     * the second from get_page_and_type at the top of this function */
     if(page->count_info != (PGC_allocated | (2 + expected_refcnt)))
     {
         /* Return type count back to zero */
@@ -3895,39 +3886,26 @@
     d->tot_pages--;
     page_list_del(page, &d->page_list);
     spin_unlock(&d->page_alloc_lock);
-
-    /* NOTE: We are not putting the page back. In effect this function acquires
-     * one ref and type ref for the caller */
-
     return 0;
 }
 
 int page_make_private(struct domain *d, struct page_info *page)
 {
-    unsigned long x, y;
-
     if(!get_page(page, dom_cow))
         return -EINVAL;
     
     spin_lock(&d->page_alloc_lock);
 
-    /* Change page type and count atomically */
-    y = page->u.inuse.type_info;
-    do {
-        x = y;
         /* We can only change the type if count is one */
-        if((x & (PGT_type_mask | PGT_count_mask)) != 
-                (PGT_shared_page | 1))
+    if ( (page->u.inuse.type_info & (PGT_type_mask | PGT_count_mask))
+            != (PGT_shared_page | 1) )
         {
             put_page(page);
             spin_unlock(&d->page_alloc_lock);
             return -EEXIST;
         }
-        y = cmpxchg(&page->u.inuse.type_info, x, PGT_none);
-    } while(x != y);
-
-    /* We dropped type ref above, drop one ref count too */
-    put_page(page);
+    /* Drop the final typecount */
+    put_page_and_type(page);
 
     /* Change the owner */
     ASSERT(page_get_owner(page) == dom_cow);
diff -rubN b/xen/common/keyhandler.c a/xen/common/keyhandler.c
--- b/xen/common/keyhandler.c	2011-01-31 10:51:55.000000000 +0800
+++ a/xen/common/keyhandler.c	2011-01-31 11:10:54.000000000 +0800
@@ -19,6 +19,8 @@
 
 static struct keyhandler *key_table[256];
 static unsigned char keypress_key;
+extern void mem_sharing_hash_iter(void);
+extern void mem_sharing_hash_iter_detail(void);
 
 char keyhandler_scratch[1024];
 
@@ -63,6 +65,7 @@
         if ( key_table[i] != NULL ) 
             printk(" key '%c' (ascii '%02x') => %s\n", 
                    isprint(i) ? i : ' ', i, key_table[i]->desc);
+    mem_sharing_hash_iter();
 }
 
 static struct keyhandler show_handlers_keyhandler = {
@@ -119,6 +122,9 @@
 {
     struct vcpu *v;
 
+    mem_sharing_hash_iter_detail();
+    return;
+
     if ( dom0 == NULL )
         return;
 
diff -rubN b/xen/common/page_alloc.c a/xen/common/page_alloc.c
--- b/xen/common/page_alloc.c	2011-01-31 10:51:55.000000000 +0800
+++ a/xen/common/page_alloc.c	2011-01-19 16:40:48.000000000 +0800
@@ -69,6 +69,7 @@
 /* Broken page list, protected by heap_lock. */
 PAGE_LIST_HEAD(page_broken_list);
 
+void pagealloc_info(unsigned char key);
 /*************************
  * BOOT-TIME ALLOCATOR
  */
@@ -1260,7 +1261,7 @@
                             -1);
 }
 
-static void pagealloc_info(unsigned char key)
+void pagealloc_info(unsigned char key)
 {
     unsigned int zone = MEMZONE_XEN;
     unsigned long n, total = 0;
diff -rubN b/xen/include/asm-x86/p2m.h a/xen/include/asm-x86/p2m.h
--- b/xen/include/asm-x86/p2m.h	2011-01-31 10:51:55.000000000 +0800
+++ a/xen/include/asm-x86/p2m.h	2011-01-31 11:06:02.000000000 +0800
@@ -316,15 +316,22 @@
                                        int must_succeed)
 {
     mfn_t mfn;
+    int ret;
 
     mfn = gfn_to_mfn(d, gfn, p2mt);
 #ifdef __x86_64__
     if ( p2m_is_shared(*p2mt) )
     {
-        if ( mem_sharing_unshare_page(d, gfn,
-                                      must_succeed 
-                                      ? MEM_SHARING_MUST_SUCCEED : 0) )
+        ret = mem_sharing_unshare_page(d, gfn,
+                must_succeed ? MEM_SHARING_MUST_SUCCEED : 0);
+        if(ret < 0)
         {
+            if ( must_succeed && d->is_dying )
+            {
+                printk("unshare failed dying %d\n", d->is_dying);
+                mfn = _mfn(INVALID_MFN);
+                *p2mt=p2m_invalid;
+            }else
             BUG_ON(must_succeed);
             return mfn;
         }
@@ -437,6 +444,8 @@
 #ifdef __x86_64__
 /* Modify p2m table for shared gfn */
 int set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
+int
+set_shared_p2m_entry_invalid(struct domain *d, unsigned long gfn);
 /* Check if a nominated gfn is valid to be paged out */
 int p2m_mem_paging_nominate(struct domain *d, unsigned long gfn);
 /* Evict a frame */

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-01-31 10:13 [memory sharing] bug on get_page_and_type tinnycloud
@ 2011-01-31 10:49 ` George Dunlap
  2011-01-31 12:30   ` MaoXiaoyun
  2011-02-01 14:07   ` Tim Deegan
  0 siblings, 2 replies; 14+ messages in thread
From: George Dunlap @ 2011-01-31 10:49 UTC (permalink / raw)
  To: tinnycloud; +Cc: xen devel, tim.deegan, juihaochiang

Xiaoyun,

Thanks for all of your work getting page sharing working.  When
submitting patches, please break them down into individual chunks,
each of which does one thing.  Each patch should also include a
comment saying what the patch does and why, and a Signed-off-by line
indicating that you certify that the copyright holder (possibly you)
is placing the code under the GPL.

Using mercurial queues:
http://mercurial.selenic.com/wiki/MqExtension
and the mercurial patchbomb extension:
http://mercurial.selenic.com/wiki/PatchbombExtension
are particularly handy for this process.

Quick comment:  The ept-locking part of the patch is going to be
NACK-ed, as it will cause circular locking dependencies with the
hap_lock.  c/s 22526:7a5ee380 has an explanation of the circular
dependency, and a fix which doesn't introduce a circular dependency.
(It may need to be adapted a bit to apply to 4.0-testing)

 -George

On Mon, Jan 31, 2011 at 10:13 AM, tinnycloud <tinnycloud@hotmail.com> wrote:
> Hi:
>
>
> Attached is the whole patch suit for latest xen-4.0-testing,changeset 21443,
>
> It comes from George, Tim and JuiHao, also, has some extra debug info.
>
>
>
> On most occasion, memory sharing works fine, but still exists a bug.
>
>        I’ve been tracing this problem for a while.
>
>        There is a bug on get_page_and_type() in mem_sharing_share_pages()
>
>
>
> --------------------------------------------mem_sharing_share_pages()-----------------------
>
> 789         if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
>
> 790             mem_sharing_debug_gfn(cd, gfn->gfn);
>
> 791             mem_sharing_debug_gfn(sd, sgfn->gfn);
>
> 792             printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
> cd->is_dying, sd->is_dying, spage, se->mfn);
>
> 793             printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
> owner_id=%d\n",
>
> 794                     mfn_x(page_to_mfn(spage)),
>
> 795                     spage->count_info,
>
>  796                     spage->u.inuse.type_info,
>
> 797                     page_get_owner(spage)->domain_id);
>
> 798             BUG();
>
> 799         }
>
>
>
>
>
> Below painc log contains the debug info from line 790-798.
>
> We saw that 180000000000000 which is PGC_state_free,
>
> So it looks like a shared page has been freed unexpectly.
>
>
>
>
>
> (XEN) teardown 64
>
> (XEN) teardown 66
>
> blktap_sysfs_destroy
>
> blktap_sysfs_create: adding attributes for dev ffff880109d0c200
>
> blktap_sysfs_destroy
>
> __ratelimit: 1 callbacks suppressed
>
> blktap_sysfs_destroy
>
> (XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
> ci=8000000000000005, ti=8400000000000001, owner_id=32755
>
> (XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff
>
> (XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895
>
> (XEN) Debug page: MFN=507895 is ci= 180000000000000, ti=8400000000000001,
> owner_id=32755
>
> (XEN) Xen BUG at mem_sharing.c:798
>
> (XEN) ----[ Xen-4.0.2-rc2-pre  x86_64  debug=n  Not tainted ]----
>
> (XEN) CPU:    0
>
> (XEN) RIP:    e008:[<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
>
> (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
>
> (XEN) rax: 0000000000000000   rbx: ffff83044ef76000   rcx: 0000000000000092
>
> (XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c4802237c4
>
> (XEN) rbp: ffff83030d99e310   rsp: ffff82c48035fc48   r8:  0000000000000001
>
> (XEN) r9:  0000000000000001   r10: 00000000fffffff8   r11: 0000000000000005
>
> (XEN) r12: ffff83050b558000   r13: ffff830626ec5740   r14: 0000000000347967
>
> (XEN) r15: ffff83030d99e068   cr0: 0000000080050033   cr4: 00000000000026f0
>
> (XEN) cr3: 000000010e642000   cr2: 00002abdc4809000
>
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>
> (XEN) Xen stack trace from rsp=ffff82c48035fc48:
>
> (XEN)    000000000001e8dc ffff83030d99e068 0000000000507895 ffff83030d99e050
>
> (XEN)    ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300 ffff830626afbbd0
>
> (XEN)    0000000080372980 ffff82c48035fe38 ffff83023febe000 00000000008f7000
>
> (XEN)    0000000000305000 0000000000000006 0000000000000006 ffff82c4801c44a4
>
> (XEN)    ffff82c480258188 ffff82c48011cb89 000000000001e8dc fffffffffffffff3
>
> (XEN)    ffff82c48035fe28 ffff82c480148503 000003d932971281 0000000000000080
>
> (XEN)    000003d9329611d6 ffff82c48018aedf ffff82c4803903a8 ffff82c48035fd38
>
> (XEN)    ffff82c480390380 ffff82c48035fe28 0000000000000002 0000000000000000
>
> (XEN)    0000000000000002 0000000000000000 ffff83023febe000 ffff83023ff80080
>
> (XEN)    ffff82c48035fe58 0000000000000000 ffff82c48022ca80 0000000000000000
>
> (XEN)    0000000000000002 ffff82c48018a452 0000000000000000 ffff82c48016f6bb
>
> (XEN)    ffff82c48035fe58 ffff82c4801538ea 000000023ab79067 fffffffffffffff3
>
> (XEN)    ffff82c48035fe28 00000000008f7000 0000000000305000 0000000000000006
>
> (XEN)    0000000000000006 ffff82c4801042b3 0000000000000000 ffff82c48010d2a5
>
> (XEN)    ffff830477fb51e8 0000000000000000 00007ff000000200 ffff8300bf76e000
>
> (XEN)    0000000600000039 0000000000000000 00007fc09d744003 0000000000325258
>
> (XEN)    0000000000347967 ffffffffff600429 000000004d463b17 0000000000062fe2
>
> (XEN)    0000000000000000 00007fc09d744070 00007fc09d744000 00007fffcfec0d20
>
> (XEN)    00007fc09d744078 0000000000430fd8 00007fffcfec0d88 00000000019f1ca8
>
> (XEN)    0000f05c00000000 00007fc09f4c2718 0000000000000000 0000000000000246
>
> (XEN) Xen call trace:
>
> (XEN)    [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
>
> (XEN)    [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130
>
> (XEN)    [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0
>
> (XEN)    [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0
>
> (XEN)    [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0
>
> (XEN)    [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50
>
> (XEN)    [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130
>
> (XEN)    [<ffff82c4801538ea>] ack_edge_ioapic_irq+0x2a/0x70
>
> (XEN)    [<ffff82c4801042b3>] do_domctl+0x163/0xfe0
>
> (XEN)    [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0
>
> (XEN)    [<ffff82c4801e7169>] syscall_enter+0xa9/0xae
>
> (XEN)
>
> (XEN)
>
> (XEN) ****************************************
>
> (XEN) Panic on CPU 0:
>
> (XEN) Xen BUG at mem_sharing.c:798
>
> (XEN) ****************************************
>
> (XEN)
>
> (XEN) Manual reset required ('noreboot' specified)
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [memory sharing] bug on get_page_and_type
  2011-01-31 10:49 ` George Dunlap
@ 2011-01-31 12:30   ` MaoXiaoyun
  2011-01-31 12:47     ` George Dunlap
  2011-02-01 14:07   ` Tim Deegan
  1 sibling, 1 reply; 14+ messages in thread
From: MaoXiaoyun @ 2011-01-31 12:30 UTC (permalink / raw)
  To: xen devel; +Cc: george.dunlap, tim.deegan, juihaochiang


[-- Attachment #1.1: Type: text/plain, Size: 10410 bytes --]


Hi Georage:
 
      Thanks for your help, I think I really need to learn hg a bit.
      Well, when looking into memory sharing code, a confusion really brother me a lot.
      I think maybe you can  enlighten me.
 
      There has been *check action code* do like below steps
      1) Use gfn_to_mfn to get p2m_type_t p2mt
      2) check the type of p2mt,
      3) do some stuffes base if check on step 2 is failed
 
      While when *nominate  page*  to be shared, it does these follow steps
      a) check page whether can be shared
      b) call page_make_sharable to make sharable
      c) if 2) is success, call p2m_change_type to update page type p2m_ram_shared
 
      My confusion is if *nominate page* code reach step b, and before it go to step c
       *check action code*  reach its step2, it will find page type is not  p2m_ram_shared
       and go to step3. And later "nominate page"  code go to step (c)
 
Take xen/common/memory.c  for example,
line 179 is much like step 2, it will goto 196 to clear the _PGC_allocated flag
if check failed, so it might indicates, *nominate page* code change the page type
into p2m_ram_shared, but the page actually has alreay been freed.
 
Am I right?
 
 
155 int guest_remove_page(struct domain *d, unsigned long gmfn)
156 {
157     struct page_info *page;
158 #ifdef CONFIG_X86
159     p2m_type_t p2mt;
160 #endif
161     unsigned long mfn;
162 
163 #ifdef CONFIG_X86
164     mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt)); 
165 #else
166     mfn = gmfn_to_mfn(d, gmfn);
167 #endif
168     if ( unlikely(!mfn_valid(mfn)) )
169     {
170         gdprintk(XENLOG_INFO, "Domain %u page number %lx invalid\n",
171                 d->domain_id, gmfn);
172         return 0;
173     }
174             
175     page = mfn_to_page(mfn);
176 #ifdef CONFIG_X86
177     /* If gmfn is shared, just drop the guest reference (which may or may not
178      * free the page) */
179     if(p2m_is_shared(p2mt))
180     {
181         put_page_and_type(page);
182         guest_physmap_remove_page(d, gmfn, mfn, 0);
183         return 1;
184     }
185 
186 #endif /* CONFIG_X86 */
187     if ( unlikely(!get_page(page, d)) )
188     {
189         gdprintk(XENLOG_INFO, "Bad page free for domain %u\n", d->domain_id);
190         return 0; 
189         gdprintk(XENLOG_INFO, "Bad page free for domain %u\n", d->domain_id);                                                                            
190         return 0;
191     }
192 
193     if ( test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
194         put_page_and_type(page);
195 
196     if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
197         put_page(page);
198 
199     guest_physmap_remove_page(d, gmfn, mfn, 0);
200 
201     put_page(page);
202 
203     return 1;
204 }
 
> Date: Mon, 31 Jan 2011 10:49:37 +0000
> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> From: George.Dunlap@eu.citrix.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; tim.deegan@citrix.com; juihaochiang@gmail.com
> 
> Xiaoyun,
> 
> Thanks for all of your work getting page sharing working. When
> submitting patches, please break them down into individual chunks,
> each of which does one thing. Each patch should also include a
> comment saying what the patch does and why, and a Signed-off-by line
> indicating that you certify that the copyright holder (possibly you)
> is placing the code under the GPL.
> 
> Using mercurial queues:
> http://mercurial.selenic.com/wiki/MqExtension
> and the mercurial patchbomb extension:
> http://mercurial.selenic.com/wiki/PatchbombExtension
> are particularly handy for this process.
> 
> Quick comment: The ept-locking part of the patch is going to be
> NACK-ed, as it will cause circular locking dependencies with the
> hap_lock. c/s 22526:7a5ee380 has an explanation of the circular
> dependency, and a fix which doesn't introduce a circular dependency.
> (It may need to be adapted a bit to apply to 4.0-testing)
> 
> -George
> 
> On Mon, Jan 31, 2011 at 10:13 AM, tinnycloud <tinnycloud@hotmail.com> wrote:
> > Hi:
> >
> >
> > Attached is the whole patch suit for latest xen-4.0-testing,changeset 21443,
> >
> > It comes from George, Tim and JuiHao, also, has some extra debug info.
> >
> >
> >
> > On most occasion, memory sharing works fine, but still exists a bug.
> >
> >        I’ve been tracing this problem for a while.
> >
> >        There is a bug on get_page_and_type() in mem_sharing_share_pages()
> >
> >
> >
> > --------------------------------------------mem_sharing_share_pages()-----------------------
> >
> > 789         if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
> >
> > 790             mem_sharing_debug_gfn(cd, gfn->gfn);
> >
> > 791             mem_sharing_debug_gfn(sd, sgfn->gfn);
> >
> > 792             printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
> > cd->is_dying, sd->is_dying, spage, se->mfn);
> >
> > 793             printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
> > owner_id=%d\n",
> >
> > 794                     mfn_x(page_to_mfn(spage)),
> >
> > 795                     spage->count_info,
> >
> >  796                     spage->u.inuse.type_info,
> >
> > 797                     page_get_owner(spage)->domain_id);
> >
> > 798             BUG();
> >
> > 799         }
> >
> >
> >
> >
> >
> > Below painc log contains the debug info from line 790-798.
> >
> > We saw that 180000000000000 which is PGC_state_free,
> >
> > So it looks like a shared page has been freed unexpectly.
> >
> >
> >
> >
> >
> > (XEN) teardown 64
> >
> > (XEN) teardown 66
> >
> > blktap_sysfs_destroy
> >
> > blktap_sysfs_create: adding attributes for dev ffff880109d0c200
> >
> > blktap_sysfs_destroy
> >
> > __ratelimit: 1 callbacks suppressed
> >
> > blktap_sysfs_destroy
> >
> > (XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
> > ci=8000000000000005, ti=8400000000000001, owner_id=32755
> >
> > (XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff
> >
> > (XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895
> >
> > (XEN) Debug page: MFN=507895 is ci= 180000000000000, ti=8400000000000001,
> > owner_id=32755
> >
> > (XEN) Xen BUG at mem_sharing.c:798
> >
> > (XEN) ----[ Xen-4.0.2-rc2-pre  x86_64  debug=n  Not tainted ]----
> >
> > (XEN) CPU:    0
> >
> > (XEN) RIP:    e008:[<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
> >
> > (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
> >
> > (XEN) rax: 0000000000000000   rbx: ffff83044ef76000   rcx: 0000000000000092
> >
> > (XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c4802237c4
> >
> > (XEN) rbp: ffff83030d99e310   rsp: ffff82c48035fc48   r8:  0000000000000001
> >
> > (XEN) r9:  0000000000000001   r10: 00000000fffffff8   r11: 0000000000000005
> >
> > (XEN) r12: ffff83050b558000   r13: ffff830626ec5740   r14: 0000000000347967
> >
> > (XEN) r15: ffff83030d99e068   cr0: 0000000080050033   cr4: 00000000000026f0
> >
> > (XEN) cr3: 000000010e642000   cr2: 00002abdc4809000
> >
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> >
> > (XEN) Xen stack trace from rsp=ffff82c48035fc48:
> >
> > (XEN)    000000000001e8dc ffff83030d99e068 0000000000507895 ffff83030d99e050
> >
> > (XEN)    ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300 ffff830626afbbd0
> >
> > (XEN)    0000000080372980 ffff82c48035fe38 ffff83023febe000 00000000008f7000
> >
> > (XEN)    0000000000305000 0000000000000006 0000000000000006 ffff82c4801c44a4
> >
> > (XEN)    ffff82c480258188 ffff82c48011cb89 000000000001e8dc fffffffffffffff3
> >
> > (XEN)    ffff82c48035fe28 ffff82c480148503 000003d932971281 0000000000000080
> >
> > (XEN)    000003d9329611d6 ffff82c48018aedf ffff82c4803903a8 ffff82c48035fd38
> >
> > (XEN)    ffff82c480390380 ffff82c48035fe28 0000000000000002 0000000000000000
> >
> > (XEN)    0000000000000002 0000000000000000 ffff83023febe000 ffff83023ff80080
> >
> > (XEN)    ffff82c48035fe58 0000000000000000 ffff82c48022ca80 0000000000000000
> >
> > (XEN)    0000000000000002 ffff82c48018a452 0000000000000000 ffff82c48016f6bb
> >
> > (XEN)    ffff82c48035fe58 ffff82c4801538ea 000000023ab79067 fffffffffffffff3
> >
> > (XEN)    ffff82c48035fe28 00000000008f7000 0000000000305000 0000000000000006
> >
> > (XEN)    0000000000000006 ffff82c4801042b3 0000000000000000 ffff82c48010d2a5
> >
> > (XEN)    ffff830477fb51e8 0000000000000000 00007ff000000200 ffff8300bf76e000
> >
> > (XEN)    0000000600000039 0000000000000000 00007fc09d744003 0000000000325258
> >
> > (XEN)    0000000000347967 ffffffffff600429 000000004d463b17 0000000000062fe2
> >
> > (XEN)    0000000000000000 00007fc09d744070 00007fc09d744000 00007fffcfec0d20
> >
> > (XEN)    00007fc09d744078 0000000000430fd8 00007fffcfec0d88 00000000019f1ca8
> >
> > (XEN)    0000f05c00000000 00007fc09f4c2718 0000000000000000 0000000000000246
> >
> > (XEN) Xen call trace:
> >
> > (XEN)    [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
> >
> > (XEN)    [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130
> >
> > (XEN)    [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0
> >
> > (XEN)    [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0
> >
> > (XEN)    [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0
> >
> > (XEN)    [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50
> >
> > (XEN)    [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130
> >
> > (XEN)    [<ffff82c4801538ea>] ack_edge_ioapic_irq+0x2a/0x70
> >
> > (XEN)    [<ffff82c4801042b3>] do_domctl+0x163/0xfe0
> >
> > (XEN)    [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0
> >
> > (XEN)    [<ffff82c4801e7169>] syscall_enter+0xa9/0xae
> >
> > (XEN)
> >
> > (XEN)
> >
> > (XEN) ****************************************
> >
> > (XEN) Panic on CPU 0:
> >
> > (XEN) Xen BUG at mem_sharing.c:798
> >
> > (XEN) ****************************************
> >
> > (XEN)
> >
> > (XEN) Manual reset required ('noreboot' specified)
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel
> >
> >
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 16278 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-01-31 12:30   ` MaoXiaoyun
@ 2011-01-31 12:47     ` George Dunlap
  2011-02-01 14:28       ` Tim Deegan
  0 siblings, 1 reply; 14+ messages in thread
From: George Dunlap @ 2011-01-31 12:47 UTC (permalink / raw)
  To: MaoXiaoyun; +Cc: xen devel, tim.deegan, juihaochiang

At first glance, it does look like a race, of the type that would
normally be done by grabbing the p2m lock.  I'd have to take a closer
look to be sure, but I won't be able to for at least another day or
two.  Maybe Tim can comment.

 -George

2011/1/31 MaoXiaoyun <tinnycloud@hotmail.com>:
> Hi Georage:
>
>       Thanks for your help, I think I really need to learn hg a bit.
>       Well, when looking into memory sharing code, a confusion
> really brother me a lot.
>       I think maybe you can  enlighten me.
>
>       There has been *check action code* do like below steps
>       1) Use gfn_to_mfn to get p2m_type_t p2mt
>       2) check the type of p2mt,
>       3) do some stuffes base if check on step 2 is failed
>
>       While when *nominate  page*  to be shared, it does these follow steps
>       a) check page whether can be shared
>       b) call page_make_sharable to make sharable
>       c) if 2) is success, call p2m_change_type to update page type
> p2m_ram_shared
>
>       My confusion is if *nominate page* code reach step b, and before it go
> to step c
>        *check action code*  reach its step2, it will find page type is
> not  p2m_ram_shared
>        and go to step3. And later "nominate page"  code go to step (c)
>
> Take xen/common/memory.c  for example,
> line 179 is much like step 2, it will goto 196 to clear the _PGC_allocated
> flag
> if check failed, so it might indicates, *nominate page* code change the page
> type
> into p2m_ram_shared, but the page actually has alreay been freed.
>
> Am I right?
>
>
> 155 int guest_remove_page(struct domain *d, unsigned long gmfn)
> 156 {
> 157     struct page_info *page;
> 158 #ifdef CONFIG_X86
> 159     p2m_type_t p2mt;
> 160 #endif
> 161     unsigned long mfn;
> 162
> 163 #ifdef CONFIG_X86
> 164     mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt));
> 165 #else
> 166     mfn = gmfn_to_mfn(d, gmfn);
> 167 #endif
> 168     if ( unlikely(!mfn_valid(mfn)) )
> 169     {
> 170         gdprintk(XENLOG_INFO, "Domain %u page number %lx invalid\n",
> 171                 d->domain_id, gmfn);
> 172         return 0;
> 173     }
> 174
> 175&nbsp ;    page = mfn_to_page(mfn);
> 176 #ifdef CONFIG_X86
> 177     /* If gmfn is shared, just drop the guest reference (which may or
> may not
> 178      * free the page) */
> 179     if(p2m_is_shared(p2mt))
> 180     {
> 181         put_page_and_type(page);
> 182         guest_physmap_remove_page(d, gmfn, mfn, 0);
> 183         return 1;
> 184     }
> 185
> 186 #endif /* CONFIG_X86 */
> 187     if ( unlikely(!get_page(page, d)) )
> 188     {
> 189         gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> d->domain_id);
> 190         return 0;
> 189         gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> d->domain_id);
> 190         return 0;
> 191     }
> 192
> 193     if ( test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
> 194         put_page_and_type(page);
> 195
> 196     if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
> 197          put_page(page);
> 198
> 199     guest_physmap_remove_page(d, gmfn, mfn, 0);
> 200
> 201     put_page(page);
> 202
> 203     return 1;
> 204 }
>
>> Date: Mon, 31 Jan 2011 10:49:37 +0000
>> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
>> From: George.Dunlap@eu.citrix.com
>> To: tinnycloud@hotmail.com
>> CC: xen-devel@lists.xensource.com; tim.deegan@citrix.com;
>> juihaochiang@gmail.com
>>
>> Xiaoyun,
>>
>> Thanks for all of your work getting page sharing working. When
>> submitting patches, please break them down into individual chunks,
>> each of which does one thing. Each patch should also include a
>> comment saying what the patch does and why, and a Signed-off-by line
>> indicating that you certify that the copyright holder (possibly you)
>> is placing the code under the GPL.
>>
>> Using mercurial queues:
>> http://mercurial.selenic.com/wiki/MqExtension
>> and the mercurial patchbomb extension:
>> http://mercurial.selenic.com/wiki/PatchbombExtension
>> are particularly handy for this process.
> &g t;
>> Quick comment: The ept-locking part of the patch is going to be
>> NACK-ed, as it will cause circular locking dependencies with the
>> hap_lock. c/s 22526:7a5ee380 has an explanation of the circular
>> dependency, and a fix which doesn't introduce a circular dependency.
>> (It may need to be adapted a bit to apply to 4.0-testing)
>>
>> -George
>>
>> On Mon, Jan 31, 2011 at 10:13 AM, tinnycloud <tinnycloud@hotmail.com>
>> wrote:
>> > Hi:
>> >
>> >
>> > Attached is the whole patch suit for latest xen-4.0-testing,changeset
>> > 21443,
>> >
>> > It comes from George, Tim and JuiHao, also, has some extra debug info.
>> >
>> >
>> >
>> > On most occasion, memory sharing works fine, but still exists a bug.
>> >
>> >        I’ve been tracing this problem for a while.
>> >
>> >   &nbs p;    There is a bug on get_page_and_type() in
>> > mem_sharing_share_pages()
>> >
>> >
>> >
>> >
>> > --------------------------------------------mem_sharing_share_pages()-----------------------
>> >
>> > 789         if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
>> >
>> > 790             mem_sharing_debug_gfn(cd, gfn->gfn);
>> >
>> > 791             mem_sharing_debug_gfn(sd, sgfn->gfn);
>> >
>> > 792             printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
>> > cd->is_dying, sd->is_dying, spage, se->mfn);
>> >
>> > 793             printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
>> > owner_id=%d\n",
>> >
>> > 794                     mfn_x(page_to_mfn(spage)),
>> >
>> > 795                     spage->count_info,
>> >
>> >  796                     spage->u.inuse.type_info,
>> >
>> > 797                     page_get_owner(spage)->domain_id);
>> >
>> > 798             BUG();
>> >
>> > 799         }
>> >
>> >
>> >
>> >
>> >
>> > Below painc log contains the debug info from line 790-798.
>> >
>> > We saw that 180000000000000 which is PGC_state_free,
>> >
>> > So it looks like a shared page has been freed unexpectly.
>> >
>> >
>> >
>> >
>> >
>> > (XEN) teardown 64
>> >
>> > (XEN) teardown 66
>> >
>> > blktap_sysfs_destroy
>> >
>> > blktap_sysfs_create: adding attributes for dev ffff880109d0c200
>> >
>> > blktap_sysfs_destroy
>> >
>> > __ratelimit: 1 callbacks suppressed
>> >
>> > blktap_sysfs_destroy
>> >
>> > (XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
>> > ci=8000000000000005, ti=8400000000000001, owner_id=32755
>> >
>> > (XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff
>> >
>> > (XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895
>> >
>> > (XEN) Debug page: MFN=507895 is ci= 180000000000000,
>> > ti=8400000000000001,
>> > owner_id=32755
>> >
>> > (XEN) Xen BUG at mem_sharing.c:798
>> >
>> > (XEN) ----[ Xen-4.0.2-rc2-pre  x86_64  debug=n  Not tainted ]----
>> >
>> > (XEN) CPU:    0
>> >
>> > (XEN) RIP:    e008:[<ffff82c4801c3760>]
>> > mem_sharing_share_pages+0x5b0/0x5d0
>> >
>> > (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
>> >
>> > (XEN) rax: 0000000000000000   rbx: ffff83044ef76000   rcx:
>> > 0000000000000092
>> >
>> > (XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi:
>> > ffff82c4802237c4
>> >
>> > (XEN) rbp: ffff83030d99e310   rsp: ffff82c48035fc48    r8:
>> > 0000000000000001
>> >
>> > (XEN) r9:  0000000000000001   r10: 00000000fffffff8   r11:
>> > 0000000000000005
>> >
>> > (XEN) r12: ffff83050b558000   r13: ffff830626ec5740   r14:
>> > 0000000000347967
>> >
>> > (XEN) r15: ffff83030d99e068   cr0: 0000000080050033   cr4:
>> > 00000000000026f0
>> >
>> > (XEN) cr3: 000000010e642000   cr2: 00002abdc4809000
>> >
>> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> >
>> > (XEN) Xen stack trace from rsp=ffff82c48035fc48:
>> >
>> > (XEN)    000000000001e8dc ffff83030d99e068 0000000000507895
>> > ffff83030d99e050
>> >
>> > (XEN)    ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300
>> > ffff830626afbbd0
>> >
>> > (XEN)&nb sp;   0000000080372980 ffff82c48035fe38 ffff83023febe000
>> > 00000000008f7000
>> >
>> > (XEN)    0000000000305000 0000000000000006 0000000000000006
>> > ffff82c4801c44a4
>> >
>> > (XEN)    ffff82c480258188 ffff82c48011cb89 000000000001e8dc
>> > fffffffffffffff3
>> >
>> > (XEN)    ffff82c48035fe28 ffff82c480148503 000003d932971281
>> > 0000000000000080
>> >
>> > (XEN)    000003d9329611d6 ffff82c48018aedf ffff82c4803903a8
>> > ffff82c48035fd38
>> >
>> > (XEN)    ffff82c480390380 ffff82c48035fe28 0000000000000002
>> > 0000000000000000
>> >
>> > (XEN)    0000000000000002 0000000000000000 ffff83023febe000
>> > ffff83023ff80080
>> >
>> > (XEN)    ffff82c48035fe58 0000000000000000 ffff82c48022ca80
>> > 0000000000000000
>> >
>> > (XEN)    0000000000000002 ffff82c48018a 452 0000000000000000
>> > ffff82c48016f6bb
>> >
>> > (XEN)    ffff82c48035fe58 ffff82c4801538ea 000000023ab79067
>> > fffffffffffffff3
>> >
>> > (XEN)    ffff82c48035fe28 00000000008f7000 0000000000305000
>> > 0000000000000006
>> >
>> > (XEN)    0000000000000006 ffff82c4801042b3 0000000000000000
>> > ffff82c48010d2a5
>> >
>> > (XEN)    ffff830477fb51e8 0000000000000000 00007ff000000200
>> > ffff8300bf76e000
>> >
>> > (XEN)    0000000600000039 0000000000000000 00007fc09d744003
>> > 0000000000325258
>> >
>> > (XEN)    0000000000347967 ffffffffff600429 000000004d463b17
>> > 0000000000062fe2
>> >
>> > (XEN)    0000000000000000 00007fc09d744070 00007fc09d744000
>> > 00007fffcfec0d20
>> >
>> > (XEN)    00007fc09d744078 0000000000430fd8 00007fffcfec0d88
>> > 00000000019f1ca8
>> >
>> > (XEN)    0000f05c00000000 00007fc09f4c2718 0000000000000000
>> > 0000000000000246
>> >
>> > (XEN) Xen call trace:
>> >
>> > (XEN)    [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
>> >
>> > (XEN)    [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130
>> >
>> > (XEN)    [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0
>> >
>> > (XEN)    [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0
>> >
>> > (XEN)    [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0
>> >
>> > (XEN)    [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50
>> >
>> > (XEN)    [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130
>> >
>> > (XEN)    [<ffff82c4801538ea> ] ack_edge_ioapic_irq+0x2a/0x70
>> >
>> > (XEN)    [<ffff82c4801042b3>] do_domctl+0x163/0xfe0
>> >
>> > (XEN)    [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0
>> >
>> > (XEN)    [<ffff82c4801e7169>] syscall_enter+0xa9/0xae
>> >
>> > (XEN)
>> >
>> > (XEN)
>> >
>> > (XEN) ****************************************
>> >
>> > (XEN) Panic on CPU 0:
>> >
>> > (XEN) Xen BUG at mem_sharing.c:798
>> >
>> > (XEN) ****************************************
>> >
>> > (XEN)
>> >
>> > (XEN) Manual reset required ('noreboot' specified)
>> >
>> >
>> >
>> > _______________________________________________
>> > Xen-devel mailing list
>> > Xen-devel@lists.xensource.com
>> > http://lists.xensource.com/xen- devel
>> >
>> >
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-01-31 10:49 ` George Dunlap
  2011-01-31 12:30   ` MaoXiaoyun
@ 2011-02-01 14:07   ` Tim Deegan
  1 sibling, 0 replies; 14+ messages in thread
From: Tim Deegan @ 2011-02-01 14:07 UTC (permalink / raw)
  To: George Dunlap; +Cc: tinnycloud, xen devel, juihaochiang

At 10:49 +0000 on 31 Jan (1296470977), George Dunlap wrote:
> Xiaoyun,
> 
> Thanks for all of your work getting page sharing working.  When
> submitting patches, please break them down into individual chunks,
> each of which does one thing.  Each patch should also include a
> comment saying what the patch does and why, and a Signed-off-by line
> indicating that you certify that the copyright holder (possibly you)
> is placing the code under the GPL.

Also: we don't usually take patches against the testing tree; you should
rebase them against xen-unstable and when they have been applied there
you can ask for them to be backported to xen 4.0-testing.  (Otherwise
all the bugs would re-appear in 4.1!)  Some of the changes you sent here
have already been applied to xen-unstable.

I haven't read the patches in detail but they need a bit of tidying up -
e.g. you add some commented-out code.  Please clean them up before
resubmitting.

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-01-31 12:47     ` George Dunlap
@ 2011-02-01 14:28       ` Tim Deegan
  2011-02-02 15:43         ` MaoXiaoyun
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Deegan @ 2011-02-01 14:28 UTC (permalink / raw)
  To: George Dunlap; +Cc: MaoXiaoyun, xen devel, juihaochiang

At 12:47 +0000 on 31 Jan (1296478031), George Dunlap wrote:
> At first glance, it does look like a race, of the type that would
> normally be done by grabbing the p2m lock.  I'd have to take a closer
> look to be sure, but I won't be able to for at least another day or
> two.  Maybe Tim can comment.

Yes, this is a race; it would be fixed by getting rid of the shr_lock
and using the p2m lock to cover page-sahring operations, which I think
is a good idea anyway.  

I think I'll have time to look at the locking properly some time this
month.  Of course, I've been wrong before. :)

Tim.

> 2011/1/31 MaoXiaoyun <tinnycloud@hotmail.com>:
> > Hi Georage:
> >
> >       Thanks for your help, I think I really need to learn hg a bit.
> >       Well, when looking into memory sharing code, a confusion
> > really brother me a lot.
> >       I think maybe you can  enlighten me.
> >
> >       There has been *check action code* do like below steps
> >       1) Use gfn_to_mfn to get p2m_type_t p2mt
> >       2) check the type of p2mt,
> >       3) do some stuffes base if check on step 2 is failed
> >
> >       While when *nominate  page*  to be shared, it does these follow steps
> >       a) check page whether can be shared
> >       b) call page_make_sharable to make sharable
> >       c) if 2) is success, call p2m_change_type to update page type
> > p2m_ram_shared
> >
> >       My confusion is if *nominate page* code reach step b, and before it go
> > to step c
> >        *check action code*  reach its step2, it will find page type is
> > not  p2m_ram_shared
> >        and go to step3. And later "nominate page"  code go to step (c)
> >
> > Take xen/common/memory.c  for example,
> > line 179 is much like step 2, it will goto 196 to clear the _PGC_allocated
> > flag
> > if check failed, so it might indicates, *nominate page* code change the page
> > type
> > into p2m_ram_shared, but the page actually has alreay been freed.
> >
> > Am I right?
> >
> >
> > 155 int guest_remove_page(struct domain *d, unsigned long gmfn)
> > 156 {
> > 157     struct page_info *page;
> > 158 #ifdef CONFIG_X86
> > 159     p2m_type_t p2mt;
> > 160 #endif
> > 161     unsigned long mfn;
> > 162
> > 163 #ifdef CONFIG_X86
> > 164     mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt));
> > 165 #else
> > 166     mfn = gmfn_to_mfn(d, gmfn);
> > 167 #endif
> > 168     if ( unlikely(!mfn_valid(mfn)) )
> > 169     {
> > 170         gdprintk(XENLOG_INFO, "Domain %u page number %lx invalid\n",
> > 171                 d->domain_id, gmfn);
> > 172         return 0;
> > 173     }
> > 174
> > 175&nbsp ;    page = mfn_to_page(mfn);
> > 176 #ifdef CONFIG_X86
> > 177     /* If gmfn is shared, just drop the guest reference (which may or
> > may not
> > 178      * free the page) */
> > 179     if(p2m_is_shared(p2mt))
> > 180     {
> > 181         put_page_and_type(page);
> > 182         guest_physmap_remove_page(d, gmfn, mfn, 0);
> > 183         return 1;
> > 184     }
> > 185
> > 186 #endif /* CONFIG_X86 */
> > 187     if ( unlikely(!get_page(page, d)) )
> > 188     {
> > 189         gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> > d->domain_id);
> > 190         return 0;
> > 189         gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> > d->domain_id);
> > 190         return 0;
> > 191     }
> > 192
> > 193     if ( test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
> > 194         put_page_and_type(page);
> > 195
> > 196     if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
> > 197          put_page(page);
> > 198
> > 199     guest_physmap_remove_page(d, gmfn, mfn, 0);
> > 200
> > 201     put_page(page);
> > 202
> > 203     return 1;
> > 204 }
> >
> >> Date: Mon, 31 Jan 2011 10:49:37 +0000
> >> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> >> From: George.Dunlap@eu.citrix.com
> >> To: tinnycloud@hotmail.com
> >> CC: xen-devel@lists.xensource.com; tim.deegan@citrix.com;
> >> juihaochiang@gmail.com
> >>
> >> Xiaoyun,
> >>
> >> Thanks for all of your work getting page sharing working. When
> >> submitting patches, please break them down into individual chunks,
> >> each of which does one thing. Each patch should also include a
> >> comment saying what the patch does and why, and a Signed-off-by line
> >> indicating that you certify that the copyright holder (possibly you)
> >> is placing the code under the GPL.
> >>
> >> Using mercurial queues:
> >> http://mercurial.selenic.com/wiki/MqExtension
> >> and the mercurial patchbomb extension:
> >> http://mercurial.selenic.com/wiki/PatchbombExtension
> >> are particularly handy for this process.
> > &g t;
> >> Quick comment: The ept-locking part of the patch is going to be
> >> NACK-ed, as it will cause circular locking dependencies with the
> >> hap_lock. c/s 22526:7a5ee380 has an explanation of the circular
> >> dependency, and a fix which doesn't introduce a circular dependency.
> >> (It may need to be adapted a bit to apply to 4.0-testing)
> >>
> >> -George
> >>
> >> On Mon, Jan 31, 2011 at 10:13 AM, tinnycloud <tinnycloud@hotmail.com>
> >> wrote:
> >> > Hi:
> >> >
> >> >
> >> > Attached is the whole patch suit for latest xen-4.0-testing,changeset
> >> > 21443,
> >> >
> >> > It comes from George, Tim and JuiHao, also, has some extra debug info.
> >> >
> >> >
> >> >
> >> > On most occasion, memory sharing works fine, but still exists a bug.
> >> >
> >> >        I?ve been tracing this problem for a while.
> >> >
> >> >   &nbs p;    There is a bug on get_page_and_type() in
> >> > mem_sharing_share_pages()
> >> >
> >> >
> >> >
> >> >
> >> > --------------------------------------------mem_sharing_share_pages()-----------------------
> >> >
> >> > 789         if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
> >> >
> >> > 790             mem_sharing_debug_gfn(cd, gfn->gfn);
> >> >
> >> > 791             mem_sharing_debug_gfn(sd, sgfn->gfn);
> >> >
> >> > 792             printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
> >> > cd->is_dying, sd->is_dying, spage, se->mfn);
> >> >
> >> > 793             printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
> >> > owner_id=%d\n",
> >> >
> >> > 794                     mfn_x(page_to_mfn(spage)),
> >> >
> >> > 795                     spage->count_info,
> >> >
> >> >  796                     spage->u.inuse.type_info,
> >> >
> >> > 797                     page_get_owner(spage)->domain_id);
> >> >
> >> > 798             BUG();
> >> >
> >> > 799         }
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > Below painc log contains the debug info from line 790-798.
> >> >
> >> > We saw that 180000000000000 which is PGC_state_free,
> >> >
> >> > So it looks like a shared page has been freed unexpectly.
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > (XEN) teardown 64
> >> >
> >> > (XEN) teardown 66
> >> >
> >> > blktap_sysfs_destroy
> >> >
> >> > blktap_sysfs_create: adding attributes for dev ffff880109d0c200
> >> >
> >> > blktap_sysfs_destroy
> >> >
> >> > __ratelimit: 1 callbacks suppressed
> >> >
> >> > blktap_sysfs_destroy
> >> >
> >> > (XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
> >> > ci=8000000000000005, ti=8400000000000001, owner_id=32755
> >> >
> >> > (XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff
> >> >
> >> > (XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895
> >> >
> >> > (XEN) Debug page: MFN=507895 is ci= 180000000000000,
> >> > ti=8400000000000001,
> >> > owner_id=32755
> >> >
> >> > (XEN) Xen BUG at mem_sharing.c:798
> >> >
> >> > (XEN) ----[ Xen-4.0.2-rc2-pre  x86_64  debug=n  Not tainted ]----
> >> >
> >> > (XEN) CPU:    0
> >> >
> >> > (XEN) RIP:    e008:[<ffff82c4801c3760>]
> >> > mem_sharing_share_pages+0x5b0/0x5d0
> >> >
> >> > (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
> >> >
> >> > (XEN) rax: 0000000000000000   rbx: ffff83044ef76000   rcx:
> >> > 0000000000000092
> >> >
> >> > (XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi:
> >> > ffff82c4802237c4
> >> >
> >> > (XEN) rbp: ffff83030d99e310   rsp: ffff82c48035fc48    r8:
> >> > 0000000000000001
> >> >
> >> > (XEN) r9:  0000000000000001   r10: 00000000fffffff8   r11:
> >> > 0000000000000005
> >> >
> >> > (XEN) r12: ffff83050b558000   r13: ffff830626ec5740   r14:
> >> > 0000000000347967
> >> >
> >> > (XEN) r15: ffff83030d99e068   cr0: 0000000080050033   cr4:
> >> > 00000000000026f0
> >> >
> >> > (XEN) cr3: 000000010e642000   cr2: 00002abdc4809000
> >> >
> >> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> >> >
> >> > (XEN) Xen stack trace from rsp=ffff82c48035fc48:
> >> >
> >> > (XEN)    000000000001e8dc ffff83030d99e068 0000000000507895
> >> > ffff83030d99e050
> >> >
> >> > (XEN)    ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300
> >> > ffff830626afbbd0
> >> >
> >> > (XEN)&nb sp;   0000000080372980 ffff82c48035fe38 ffff83023febe000
> >> > 00000000008f7000
> >> >
> >> > (XEN)    0000000000305000 0000000000000006 0000000000000006
> >> > ffff82c4801c44a4
> >> >
> >> > (XEN)    ffff82c480258188 ffff82c48011cb89 000000000001e8dc
> >> > fffffffffffffff3
> >> >
> >> > (XEN)    ffff82c48035fe28 ffff82c480148503 000003d932971281
> >> > 0000000000000080
> >> >
> >> > (XEN)    000003d9329611d6 ffff82c48018aedf ffff82c4803903a8
> >> > ffff82c48035fd38
> >> >
> >> > (XEN)    ffff82c480390380 ffff82c48035fe28 0000000000000002
> >> > 0000000000000000
> >> >
> >> > (XEN)    0000000000000002 0000000000000000 ffff83023febe000
> >> > ffff83023ff80080
> >> >
> >> > (XEN)    ffff82c48035fe58 0000000000000000 ffff82c48022ca80
> >> > 0000000000000000
> >> >
> >> > (XEN)    0000000000000002 ffff82c48018a 452 0000000000000000
> >> > ffff82c48016f6bb
> >> >
> >> > (XEN)    ffff82c48035fe58 ffff82c4801538ea 000000023ab79067
> >> > fffffffffffffff3
> >> >
> >> > (XEN)    ffff82c48035fe28 00000000008f7000 0000000000305000
> >> > 0000000000000006
> >> >
> >> > (XEN)    0000000000000006 ffff82c4801042b3 0000000000000000
> >> > ffff82c48010d2a5
> >> >
> >> > (XEN)    ffff830477fb51e8 0000000000000000 00007ff000000200
> >> > ffff8300bf76e000
> >> >
> >> > (XEN)    0000000600000039 0000000000000000 00007fc09d744003
> >> > 0000000000325258
> >> >
> >> > (XEN)    0000000000347967 ffffffffff600429 000000004d463b17
> >> > 0000000000062fe2
> >> >
> >> > (XEN)    0000000000000000 00007fc09d744070 00007fc09d744000
> >> > 00007fffcfec0d20
> >> >
> >> > (XEN)    00007fc09d744078 0000000000430fd8 00007fffcfec0d88
> >> > 00000000019f1ca8
> >> >
> >> > (XEN)    0000f05c00000000 00007fc09f4c2718 0000000000000000
> >> > 0000000000000246
> >> >
> >> > (XEN) Xen call trace:
> >> >
> >> > (XEN)    [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
> >> >
> >> > (XEN)    [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130
> >> >
> >> > (XEN)    [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0
> >> >
> >> > (XEN)    [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0
> >> >
> >> > (XEN)    [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0
> >> >
> >> > (XEN)    [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50
> >> >
> >> > (XEN)    [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130
> >> >
> >> > (XEN)    [<ffff82c4801538ea> ] ack_edge_ioapic_irq+0x2a/0x70
> >> >
> >> > (XEN)    [<ffff82c4801042b3>] do_domctl+0x163/0xfe0
> >> >
> >> > (XEN)    [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0
> >> >
> >> > (XEN)    [<ffff82c4801e7169>] syscall_enter+0xa9/0xae
> >> >
> >> > (XEN)
> >> >
> >> > (XEN)
> >> >
> >> > (XEN) ****************************************
> >> >
> >> > (XEN) Panic on CPU 0:
> >> >
> >> > (XEN) Xen BUG at mem_sharing.c:798
> >> >
> >> > (XEN) ****************************************
> >> >
> >> > (XEN)
> >> >
> >> > (XEN) Manual reset required ('noreboot' specified)
> >> >
> >> >
> >> >
> >> > _______________________________________________
> >> > Xen-devel mailing list
> >> > Xen-devel@lists.xensource.com
> >> > http://lists.xensource.com/xen- devel
> >> >
> >> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel
> >
> >

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [memory sharing] bug on get_page_and_type
  2011-02-01 14:28       ` Tim Deegan
@ 2011-02-02 15:43         ` MaoXiaoyun
  2011-02-02 16:18           ` Tim Deegan
  0 siblings, 1 reply; 14+ messages in thread
From: MaoXiaoyun @ 2011-02-02 15:43 UTC (permalink / raw)
  To: xen devel; +Cc: george.dunlap, tim.deegan, juihaochiang


[-- Attachment #1.1: Type: text/plain, Size: 13490 bytes --]


Hi Tim:
 
         Thanks for  both your advice and quick reply. I will follow.
 
         So at last we should replace shr_lock with p2m_lock.
         But more complicate, it seems both the 
         *check action* code and *nominate page* code need to be locked ,right?
         If so, quite a lot of  *check action* codes need to be locked.
 
         Looking forward to your suggestion on the fix, I think Juihao and I can help to do analyse and test. 
 
> Date: Tue, 1 Feb 2011 14:28:16 +0000
> From: Tim.Deegan@citrix.com
> To: George.Dunlap@eu.citrix.com
> CC: tinnycloud@hotmail.com; xen-devel@lists.xensource.com; juihaochiang@gmail.com
> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> 
> At 12:47 +0000 on 31 Jan (1296478031), George Dunlap wrote:
> > At first glance, it does look like a race, of the type that would
> > normally be done by grabbing the p2m lock. I'd have to take a closer
> > look to be sure, but I won't be able to for at least another day or
> > two. Maybe Tim can comment.
> 
> Yes, this is a race; it would be fixed by getting rid of the shr_lock
> and using the p2m lock to cover page-sahring operations, which I think
> is a good idea anyway. 
> 
> I think I'll have time to look at the locking properly some time this
> month. Of course, I've been wrong before. :)
> 
> Tim.
> 
> > 2011/1/31 MaoXiaoyun <tinnycloud@hotmail.com>:
> > > Hi Georage:
> > >
> > > Thanks for your help, I think I really need to learn hg a bit.
> > > Well, when looking into memory sharing code, a confusion
> > > really brother me a lot.
> > > I think maybe you can enlighten me.
> > >
> > > There has been *check action code* do like below steps
> > > 1) Use gfn_to_mfn to get p2m_type_t p2mt
> > > 2) check the type of p2mt,
> > > 3) do some stuffes base if check on step 2 is failed
> > >
> > > While when *nominate page* to be shared, it does these follow steps
> > > a) check page whether can be shared
> > > b) call page_make_sharable to make sharable
> > > c) if 2) is success, call p2m_change_type to update page type
> > > p2m_ram_shared
> > >
> > > My confusion is if *nominate page* code reach step b, and before it go
> > > to step c
> > > *check action code* reach its step2, it will find page type is
> > > not p2m_ram_shared
> > > and go to step3. And later "nominate page" code go to step (c)
> > >
> > > Take xen/common/memory.c for example,
> > > line 179 is much like step 2, it will goto 196 to clear the _PGC_allocated
> > > flag
> > > if check failed, so it might indicates, *nominate page* code change the page
> > > type
> > > into p2m_ram_shared, but the page actually has alreay been freed.
> > >
> > > Am I right?
> > >
> > >
> > > 155 int guest_remove_page(struct domain *d, unsigned long gmfn)
> > > 156 {
> > > 157 struct page_info *page;
> > > 158 #ifdef CONFIG_X86
> > > 159 p2m_type_t p2mt;
> > > 160 #endif
> > > 161 unsigned long mfn;
> > > 162
> > > 163 #ifdef CONFIG_X86
> > > 164 mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt));
> > > 165 #else
> > > 166 mfn = gmfn_to_mfn(d, gmfn);
> > > 167 #endif
> > > 168 if ( unlikely(!mfn_valid(mfn)) )
> > > 169 {
> > > 170 gdprintk(XENLOG_INFO, "Domain %u page number %lx invalid\n",
> > > 171 d->domain_id, gmfn);
> > > 172 return 0;
> > > 173 }
> > > 174
> > > 175&nbsp ; page = mfn_to_page(mfn);
> > > 176 #ifdef CONFIG_X86
> > > 177 /* If gmfn is shared, just drop the guest reference (which may or
> > > may not
> > > 178 * free the page) */
> > > 179 if(p2m_is_shared(p2mt))
> > > 180 {
> > > 181 put_page_and_type(page);
> > > 182 guest_physmap_remove_page(d, gmfn, mfn, 0);
> > > 183 return 1;
> > > 184 }
> > > 185
> > > 186 #endif /* CONFIG_X86 */
> > > 187 if ( unlikely(!get_page(page, d)) )
> > > 188 {
> > > 189 gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> > > d->domain_id);
> > > 190 return 0;
> > > 189 gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> > > d->domain_id);
> > > 190 return 0;
> > > 191 }
> > > 192
> > > 193 if ( test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
> > > 194 put_page_and_type(page);
> > > 195
> > > 196 if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
> > > 197 put_page(page);
> > > 198
> > > 199 guest_physmap_remove_page(d, gmfn, mfn, 0);
> > > 200
> > > 201 put_page(page);
> > > 202
> > > 203 return 1;
> > > 204 }
> > >
> > >> Date: Mon, 31 Jan 2011 10:49:37 +0000
> > >> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > >> From: George.Dunlap@eu.citrix.com
> > >> To: tinnycloud@hotmail.com
> > >> CC: xen-devel@lists.xensource.com; tim.deegan@citrix.com;
> > >> juihaochiang@gmail.com
> > >>
> > >> Xiaoyun,
> > >>
> > >> Thanks for all of your work getting page sharing working. When
> > >> submitting patches, please break them down into individual chunks,
> > >> each of which does one thing. Each patch should also include a
> > >> comment saying what the patch does and why, and a Signed-off-by line
> > >> indicating that you certify that the copyright holder (possibly you)
> > >> is placing the code under the GPL.
> > >>
> > >> Using mercurial queues:
> > >> http://mercurial.selenic.com/wiki/MqExtension
> > >> and the mercurial patchbomb extension:
> > >> http://mercurial.selenic.com/wiki/PatchbombExtension
> > >> are particularly handy for this process.
> > > &g t;
> > >> Quick comment: The ept-locking part of the patch is going to be
> > >> NACK-ed, as it will cause circular locking dependencies with the
> > >> hap_lock. c/s 22526:7a5ee380 has an explanation of the circular
> > >> dependency, and a fix which doesn't introduce a circular dependency.
> > >> (It may need to be adapted a bit to apply to 4.0-testing)
> > >>
> > >> -George
> > >>
> > >> On Mon, Jan 31, 2011 at 10:13 AM, tinnycloud <tinnycloud@hotmail.com>
> > >> wrote:
> > >> > Hi:
> > >> >
> > >> >
> > >> > Attached is the whole patch suit for latest xen-4.0-testing,changeset
> > >> > 21443,
> > >> >
> > >> > It comes from George, Tim and JuiHao, also, has some extra debug info.
> > >> >
> > >> >
> > >> >
> > >> > On most occasion, memory sharing works fine, but still exists a bug.
> > >> >
> > >> > I?ve been tracing this problem for a while.
> > >> >
> > >> > &nbs p; There is a bug on get_page_and_type() in
> > >> > mem_sharing_share_pages()
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > --------------------------------------------mem_sharing_share_pages()-----------------------
> > >> >
> > >> > 789 if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
> > >> >
> > >> > 790 mem_sharing_debug_gfn(cd, gfn->gfn);
> > >> >
> > >> > 791 mem_sharing_debug_gfn(sd, sgfn->gfn);
> > >> >
> > >> > 792 printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
> > >> > cd->is_dying, sd->is_dying, spage, se->mfn);
> > >> >
> > >> > 793 printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
> > >> > owner_id=%d\n",
> > >> >
> > >> > 794 mfn_x(page_to_mfn(spage)),
> > >> >
> > >> > 795 spage->count_info,
> > >> >
> > >> > 796 spage->u.inuse.type_info,
> > >> >
> > >> > 797 page_get_owner(spage)->domain_id);
> > >> >
> > >> > 798 BUG();
> > >> >
> > >> > 799 }
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > Below painc log contains the debug info from line 790-798.
> > >> >
> > >> > We saw that 180000000000000 which is PGC_state_free,
> > >> >
> > >> > So it looks like a shared page has been freed unexpectly.
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > (XEN) teardown 64
> > >> >
> > >> > (XEN) teardown 66
> > >> >
> > >> > blktap_sysfs_destroy
> > >> >
> > >> > blktap_sysfs_create: adding attributes for dev ffff880109d0c200
> > >> >
> > >> > blktap_sysfs_destroy
> > >> >
> > >> > __ratelimit: 1 callbacks suppressed
> > >> >
> > >> > blktap_sysfs_destroy
> > >> >
> > >> > (XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
> > >> > ci=8000000000000005, ti=8400000000000001, owner_id=32755
> > >> >
> > >> > (XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff
> > >> >
> > >> > (XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895
> > >> >
> > >> > (XEN) Debug page: MFN=507895 is ci= 180000000000000,
> > >> > ti=8400000000000001,
> > >> > owner_id=32755
> > >> >
> > >> > (XEN) Xen BUG at mem_sharing.c:798
> > >> >
> > >> > (XEN) ----[ Xen-4.0.2-rc2-pre x86_64 debug=n Not tainted ]----
> > >> >
> > >> > (XEN) CPU: 0
> > >> >
> > >> > (XEN) RIP: e008:[<ffff82c4801c3760>]
> > >> > mem_sharing_share_pages+0x5b0/0x5d0
> > >> >
> > >> > (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
> > >> >
> > >> > (XEN) rax: 0000000000000000 rbx: ffff83044ef76000 rcx:
> > >> > 0000000000000092
> > >> >
> > >> > (XEN) rdx: 000000000000000a rsi: 000000000000000a rdi:
> > >> > ffff82c4802237c4
> > >> >
> > >> > (XEN) rbp: ffff83030d99e310 rsp: ffff82c48035fc48 r8:
> > >> > 0000000000000001
> > >> >
> > >> > (XEN) r9: 0000000000000001 r10: 00000000fffffff8 r11:
> > >> > 0000000000000005
> > >> >
> > >> > (XEN) r12: ffff83050b558000 r13: ffff830626ec5740 r14:
> > >> > 0000000000347967
> > >> >
> > >> > (XEN) r15: ffff83030d99e068 cr0: 0000000080050033 cr4:
> > >> > 00000000000026f0
> > >> >
> > >> > (XEN) cr3: 000000010e642000 cr2: 00002abdc4809000
> > >> >
> > >> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> > >> >
> > >> > (XEN) Xen stack trace from rsp=ffff82c48035fc48:
> > >> >
> > >> > (XEN) 000000000001e8dc ffff83030d99e068 0000000000507895
> > >> > ffff83030d99e050
> > >> >
> > >> > (XEN) ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300
> > >> > ffff830626afbbd0
> > >> >
> > >> > (XEN)&nb sp; 0000000080372980 ffff82c48035fe38 ffff83023febe000
> > >> > 00000000008f7000
> > >> >
> > >> > (XEN) 0000000000305000 0000000000000006 0000000000000006
> > >> > ffff82c4801c44a4
> > >> >
> > >> > (XEN) ffff82c480258188 ffff82c48011cb89 000000000001e8dc
> > >> > fffffffffffffff3
> > >> >
> > >> > (XEN) ffff82c48035fe28 ffff82c480148503 000003d932971281
> > >> > 0000000000000080
> > >> >
> > >> > (XEN) 000003d9329611d6 ffff82c48018aedf ffff82c4803903a8
> > >> > ffff82c48035fd38
> > >> >
> > >> > (XEN) ffff82c480390380 ffff82c48035fe28 0000000000000002
> > >> > 0000000000000000
> > >> >
> > >> > (XEN) 0000000000000002 0000000000000000 ffff83023febe000
> > >> > ffff83023ff80080
> > >> >
> > >> > (XEN) ffff82c48035fe58 0000000000000000 ffff82c48022ca80
> > >> > 0000000000000000
> > >> >
> > >> > (XEN) 0000000000000002 ffff82c48018a 452 0000000000000000
> > >> > ffff82c48016f6bb
> > >> >
> > >> > (XEN) ffff82c48035fe58 ffff82c4801538ea 000000023ab79067
> > >> > fffffffffffffff3
> > >> >
> > >> > (XEN) ffff82c48035fe28 00000000008f7000 0000000000305000
> > >> > 0000000000000006
> > >> >
> > >> > (XEN) 0000000000000006 ffff82c4801042b3 0000000000000000
> > >> > ffff82c48010d2a5
> > >> >
> > >> > (XEN) ffff830477fb51e8 0000000000000000 00007ff000000200
> > >> > ffff8300bf76e000
> > >> >
> > >> > (XEN) 0000000600000039 0000000000000000 00007fc09d744003
> > >> > 0000000000325258
> > >> >
> > >> > (XEN) 0000000000347967 ffffffffff600429 000000004d463b17
> > >> > 0000000000062fe2
> > >> >
> > >> > (XEN) 0000000000000000 00007fc09d744070 00007fc09d744000
> > >> > 00007fffcfec0d20
> > >> >
> > >> > (XEN) 00007fc09d744078 0000000000430fd8 00007fffcfec0d88
> > >> > 00000000019f1ca8
> > >> >
> > >> > (XEN) 0000f05c00000000 00007fc09f4c2718 0000000000000000
> > >> > 0000000000000246
> > >> >
> > >> > (XEN) Xen call trace:
> > >> >
> > >> > (XEN) [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
> > >> >
> > >> > (XEN) [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130
> > >> >
> > >> > (XEN) [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0
> > >> >
> > >> > (XEN) [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0
> > >> >
> > >> > (XEN) [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0
> > >> >
> > >> > (XEN) [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50
> > >> >
> > >> > (XEN) [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130
> > >> >
> > >> > (XEN) [<ffff82c4801538ea> ] ack_edge_ioapic_irq+0x2a/0x70
> > >> >
> > >> > (XEN) [<ffff82c4801042b3>] do_domctl+0x163/0xfe0
> > >> >
> > >> > (XEN) [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0
> > >> >
> > >> > (XEN) [<ffff82c4801e7169>] syscall_enter+0xa9/0xae
> > >> >
> > >> > (XEN)
> > >> >
> > >> > (XEN)
> > >> >
> > >> > (XEN) ****************************************
> > >> >
> > >> > (XEN) Panic on CPU 0:
> > >> >
> > >> > (XEN) Xen BUG at mem_sharing.c:798
> > >> >
> > >> > (XEN) ****************************************
> > >> >
> > >> > (XEN)
> > >> >
> > >> > (XEN) Manual reset required ('noreboot' specified)
> > >> >
> > >> >
> > >> >
> > >> > _______________________________________________
> > >> > Xen-devel mailing list
> > >> > Xen-devel@lists.xensource.com
> > >> > http://lists.xensource.com/xen- devel
> > >> >
> > >> >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xensource.com
> > > http://lists.xensource.com/xen-devel
> > >
> > >
> 
> -- 
> Tim Deegan <Tim.Deegan@citrix.com>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 19815 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-02-02 15:43         ` MaoXiaoyun
@ 2011-02-02 16:18           ` Tim Deegan
  2011-02-09  2:46             ` MaoXiaoyun
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Deegan @ 2011-02-02 16:18 UTC (permalink / raw)
  To: MaoXiaoyun; +Cc: George Dunlap, xen devel, juihaochiang

At 15:43 +0000 on 02 Feb (1296661396), MaoXiaoyun wrote:
> Hi Tim:
> 
>          Thanks for  both your advice and quick reply. I will follow.
> 
>          So at last we should replace shr_lock with p2m_lock.
>          But more complicate, it seems both the
>          *check action* code and *nominate page* code need to be locked ,right?
>          If so, quite a lot of  *check action* codes need to be locked.

Yes, I think you're right about that.  Unfortunately there are some very
long TOCTOU windows in those kind of p2m lookups, which will get more
important as the p2m gets more dynamic.  I don't want to have the
callers of p2m code touching the p2m lock directly so we may need a new
p2m interface to address it.

Tim.

>          Looking forward to your suggestion on the fix, I think Juihao and I can help to do analyse and test.
> 
> > Date: Tue, 1 Feb 2011 14:28:16 +0000
> > From: Tim.Deegan@citrix.com
> > To: George.Dunlap@eu.citrix.com
> > CC: tinnycloud@hotmail.com; xen-devel@lists.xensource.com; juihaochiang@gmail.com
> > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> >
> > At 12:47 +0000 on 31 Jan (1296478031), George Dunlap wrote:
> > > At first glance, it does look like a race, of the type that would
> > > normally be done by grabbing the p2m lock. I'd have to take a closer
> > > look to be sure, but I won't be able to for at least another day or
> > > two. Maybe Tim can comment.
> >
> > Yes, this is a race; it would be fixed by getting rid of the shr_lock
> > and using the p2m lock to cover page-sahring operations, which I think
> > is a good idea anyway.
> >
> > I think I'll have time to look at the locking properly some time this
> > month. Of course, I've been wrong before. :)
> > < BR>> Tim.
> >
> > > 2011/1/31 MaoXiaoyun <tinnycloud@hotmail.com>:
> > > > Hi Georage:
> > > >
> > > > Thanks for your help, I think I really need to learn hg a bit.
> > > > Well, when looking into memory sharing code, a confusion
> > > > really brother me a lot.
> > > > I think maybe you can enlighten me.
> > > >
> > > > There has been *check action code* do like below steps
> > > > 1) Use gfn_to_mfn to get p2m_type_t p2mt
> > > > 2) check the type of p2mt,
> > > > 3) do some stuffes base if check on step 2 is failed
> > > >
> > > > While when *nominate page* to be shared, it does these follow steps
> > > > a) check page whether can be shared
> > > > b) call page_make_sharable to make sharable
> > > > c) if 2) is success, call p2m_change_type to update page type
> > > > p2m_ram_shared
> > > >
> > > > My confusion is if *nominate page* code reach step b, and before it go
> > > > to step c
> > > > *check action code* reach its step2, it will find page type is
> > > > not p2m_ram_shared
> > > > and go to step3. And later "nominate page" code go to step (c)
> > > >
> > > > Take xen/common/memory.c for example,
> > > > line 179 is much like step 2, it will goto 196 to clear the _PGC_allocated
> > > > flag
> > > > if check failed, so it might indicates, *nominate page* code change the page
> > > > type
> > > > into p2m_ram_shared, but the page actually has alreay been freed.
> > > >
> > > > Am I right?
> > > >
> > > >
> > > > 155 int guest_remove_page(struct domain *d, unsigned long gmfn)
> > > > 156 {
> > > > 157 struct page_info *page;
> > > > 158 #ifdef C ONFIG_X86
> > > > 159 p2m_type_t p2mt;
> > > > 160 #endif
> > > > 161 unsigned long mfn;
> > > > 162
> > > > 163 #ifdef CONFIG_X86
> > > > 164 mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt));
> > > > 165 #else
> > > > 166 mfn = gmfn_to_mfn(d, gmfn);
> > > > 167 #endif
> > > > 168 if ( unlikely(!mfn_valid(mfn)) )
> > > > 169 {
> > > > 170 gdprintk(XENLOG_INFO, "Domain %u page number %lx invalid\n",
> > > > 171 d->domain_id, gmfn);
> > > > 172 return 0;
> > > > 173 }
> > > > 174
> > > > 175&nbsp ; page = mfn_to_page(mfn);
> > > > 176 #ifdef CONFIG_X86
> > > > 177 /* If gmfn is shared, just drop the guest reference (which may or
> > > > may not
> > > > 178 * free the page) */
> > > > 179 if(p2m_is_shared(p2mt))
> > > > 180 {
> > > > 181 put _page_and_type(page);
> > > > 182 guest_physmap_remove_page(d, gmfn, mfn, 0);
> > > > 183 return 1;
> > > > 184 }
> > > > 185
> > > > 186 #endif /* CONFIG_X86 */
> > > > 187 if ( unlikely(!get_page(page, d)) )
> > > > 188 {
> > > > 189 gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> > > > d->domain_id);
> > > > 190 return 0;
> > > > 189 gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> > > > d->domain_id);
> > > > 190 return 0;
> > > > 191 }
> > > > 192
> > > > 193 if ( test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
> > > > 194 put_page_and_type(page);
> > > > 195
> > > > 196 if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
> > > > 197 put_page(page);
> > > > 198
> > > > 199 guest_physmap_remove_page (d, gmfn, mfn, 0);
> > > > 200
> > > > 201 put_page(page);
> > > > 202
> > > > 203 return 1;
> > > > 204 }
> > > >
> > > >> Date: Mon, 31 Jan 2011 10:49:37 +0000
> > > >> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > > >> From: George.Dunlap@eu.citrix.com
> > > >> To: tinnycloud@hotmail.com
> > > >> CC: xen-devel@lists.xensource.com; tim.deegan@citrix.com;
> > > >> juihaochiang@gmail.com
> > > >>
> > > >> Xiaoyun,
> > > >>
> > > >> Thanks for all of your work getting page sharing working. When
> > > >> submitting patches, please break them down into individual chunks,
> > > >> each of which does one thing. Each patch should also include a
> > > >> comment saying what the patch does and why, and a Signed-off-by line
> > > ; >> indicating that you certify that the copyright holder (possibly you)
> > > >> is placing the code under the GPL.
> > > >>
> > > >> Using mercurial queues:
> > > >> http://mercurial.selenic.com/wiki/MqExtension
> > > >> and the mercurial patchbomb extension:
> > > >> http://mercurial.selenic.com/wiki/PatchbombExtension
> > > >> are particularly handy for this process.
> > > > &g t;
> > > >> Quick comment: The ept-locking part of the patch is going to be
> > > >> NACK-ed, as it will cause circular locking dependencies with the
> > > >> hap_lock. c/s 22526:7a5ee380 has an explanation of the circular
> > > >> dependency, and a fix which doesn't introduce a circular dependency.
> > > >> (It may need to be adapted a bit to apply to 4.0-testing)
> > > >>
> > > >> -George
> > > >>
> > > >> On Mon, Jan 31, 2011 at 10:13 AM, tinnycloud <tinnycloud@hotmail.com>
> > > >> wrote:
> > > >> > Hi:
> > > >> >
> > > >> >
> > > >> > Attached is the whole patch suit for latest xen-4.0-testing,changeset
> > > >> > 21443,
> > > >> >
> > > >> > It comes from George, Tim and JuiHao, also, has some extra debug info.
> > > >> >
> > > >> >
> > > >> >
> > > >> > On most occasion, memory sharing works fine, but still exists a bug.
> > > >> >
> > > >> > I?ve been tracing this problem for a while.
> > > >> >
> > > >> > &nbs p; There is a bug on get_page_and_type() in
> > > >> > mem_sharing_share_pages()
> > > >> >
> > > >> >
> > > >&g t; >
> > > >> >
> > > >> > --------------------------------------------mem_sharing_share_pages()-----------------------
> > > >> >
> > > >> > 789 if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
> > > >> >
> > > >> > 790 mem_sharing_debug_gfn(cd, gfn->gfn);
> > > >> >
> > > >> > 791 mem_sharing_debug_gfn(sd, sgfn->gfn);
> > > >> >
> > > >> > 792 printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
> > > >> > cd->is_dying, sd->is_dying, spage, se->mfn);
> > > >> >
> > > >> > 793 printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
> > > >> > owner_id=%d\n",
> > > >> >
> > > >> > 794 mfn_x(page_to_mfn(spage)),
> > > >> >
> > > >> > 795 spage->count_info,
> &g t; > >> >
> > > >> > 796 spage->u.inuse.type_info,
> > > >> >
> > > >> > 797 page_get_owner(spage)->domain_id);
> > > >> >
> > > >> > 798 BUG();
> > > >> >
> > > >> > 799 }
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > Below painc log contains the debug info from line 790-798.
> > > >> >
> > > >> > We saw that 180000000000000 which is PGC_state_free,
> > > >> >
> > > >> > So it looks like a shared page has been freed unexpectly.
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > (XEN) teardown 64
> > > >> >
> > > >> > (XEN) teardow n 66
> > > >> >
> > > >> > blktap_sysfs_destroy
> > > >> >
> > > >> > blktap_sysfs_create: adding attributes for dev ffff880109d0c200
> > > >> >
> > > >> > blktap_sysfs_destroy
> > > >> >
> > > >> > __ratelimit: 1 callbacks suppressed
> > > >> >
> > > >> > blktap_sysfs_destroy
> > > >> >
> > > >> > (XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
> > > >> > ci=8000000000000005, ti=8400000000000001, owner_id=32755
> > > >> >
> > > >> > (XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff
> > > >> >
> > > >> > (XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895
> > > >> >
> > > >> > (XEN) Debug page: MFN=507895 is ci= 1800000000 00000,
> > > >> > ti=8400000000000001,
> > > >> > owner_id=32755
> > > >> >
> > > >> > (XEN) Xen BUG at mem_sharing.c:798
> > > >> >
> > > >> > (XEN) ----[ Xen-4.0.2-rc2-pre x86_64 debug=n Not tainted ]----
> > > >> >
> > > >> > (XEN) CPU: 0
> > > >> >
> > > >> > (XEN) RIP: e008:[<ffff82c4801c3760>]
> > > >> > mem_sharing_share_pages+0x5b0/0x5d0
> > > >> >
> > > >> > (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
> > > >> >
> > > >> > (XEN) rax: 0000000000000000 rbx: ffff83044ef76000 rcx:
> > > >> > 0000000000000092
> > > >> >
> > > >> > (XEN) rdx: 000000000000000a rsi: 000000000000000a rdi:
> > > >> > ffff82c4802237c4
> > > >> >
> > > >> > (XEN) rbp: ffff83030d99e310 rsp: ffff82c48035fc48 r8:
> > > >> > 0000000000000001
> > > >> >
> > > >> > (XEN) r9: 0000000000000001 r10: 00000000fffffff8 r11:
> > > >> > 0000000000000005
> > > >> >
> > > >> > (XEN) r12: ffff83050b558000 r13: ffff830626ec5740 r14:
> > > >> > 0000000000347967
> > > >> >
> > > >> > (XEN) r15: ffff83030d99e068 cr0: 0000000080050033 cr4:
> > > >> > 00000000000026f0
> > > >> >
> > > >> > (XEN) cr3: 000000010e642000 cr2: 00002abdc4809000
> > > >> >
> > > >> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> > > >> >
> > > >> > (XEN) Xen stack trace from rsp=ffff82c48035fc48:
> > > >> >
> > > >> > (XEN) 000000000001e8dc ffff83030d99e068 00000000005078 95
> > > >> > ffff83030d99e050
> > > >> >
> > > >> > (XEN) ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300
> > > >> > ffff830626afbbd0
> > > >> >
> > > >> > (XEN)&nb sp; 0000000080372980 ffff82c48035fe38 ffff83023febe000
> > > >> > 00000000008f7000
> > > >> >
> > > >> > (XEN) 0000000000305000 0000000000000006 0000000000000006
> > > >> > ffff82c4801c44a4
> > > >> >
> > > >> > (XEN) ffff82c480258188 ffff82c48011cb89 000000000001e8dc
> > > >> > fffffffffffffff3
> > > >> >
> > > >> > (XEN) ffff82c48035fe28 ffff82c480148503 000003d932971281
> > > >> > 0000000000000080
> > > >> >
> > > >> > (XEN) 000003d9329611d6 ffff82c48018aedf ffff82c4803903a8
> > > >> > ffff82c48035fd38
> > > >> >
> > > >> > (XEN) ffff82c480390380 ffff82c48035fe28 0000000000000002
> > > >> > 0000000000000000
> > > >> >
> > > >> > (XEN) 0000000000000002 0000000000000000 ffff83023febe000
> > > >> > ffff83023ff80080
> > > >> >
> > > >> > (XEN) ffff82c48035fe58 0000000000000000 ffff82c48022ca80
> > > >> > 0000000000000000
> > > >> >
> > > >> > (XEN) 0000000000000002 ffff82c48018a 452 0000000000000000
> > > >> > ffff82c48016f6bb
> > > >> >
> > > >> > (XEN) ffff82c48035fe58 ffff82c4801538ea 000000023ab79067
> > > >> > fffffffffffffff3
> > > >> >
> > > >> > (XEN) ffff82c48035fe28 00000000008f7000 0000000000305000
> > > >> > 0000000000000006
> > > >> >
> > > >> > (XEN) 0 000000000000006 ffff82c4801042b3 0000000000000000
> > > >> > ffff82c48010d2a5
> > > >> >
> > > >> > (XEN) ffff830477fb51e8 0000000000000000 00007ff000000200
> > > >> > ffff8300bf76e000
> > > >> >
> > > >> > (XEN) 0000000600000039 0000000000000000 00007fc09d744003
> > > >> > 0000000000325258
> > > >> >
> > > >> > (XEN) 0000000000347967 ffffffffff600429 000000004d463b17
> > > >> > 0000000000062fe2
> > > >> >
> > > >> > (XEN) 0000000000000000 00007fc09d744070 00007fc09d744000
> > > >> > 00007fffcfec0d20
> > > >> >
> > > >> > (XEN) 00007fc09d744078 0000000000430fd8 00007fffcfec0d88
> > > >> > 00000000019f1ca8
> > > >> >
> > > >> > (XEN) 0000f05c00000000 00007fc09f4c2718 0000000000000000
> > &g t; >> > 0000000000000246
> > > >> >
> > > >> > (XEN) Xen call trace:
> > > >> >
> > > >> > (XEN) [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
> > > >> >
> > > >> > (XEN) [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130
> > > >> >
> > > >> > (XEN) [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0
> > > >> >
> > > >> > (XEN) [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0
> > > >> >
> > > >> > (XEN) [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0
> > > >> >
> > > >> > (XEN) [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50
> > > >> >
> > > >> > (XEN) [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130
> > > >> >
> > > >> > (XEN) [<ffff82c4801538ea> ] ack_edge_ioapic_irq+0x2a/0x70
> > > >> >
> > > >> > (XEN) [<ffff82c4801042b3>] do_domctl+0x163/0xfe0
> > > >> >
> > > >> > (XEN) [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0
> > > >> >
> > > >> > (XEN) [<ffff82c4801e7169>] syscall_enter+0xa9/0xae
> > > >> >
> > > >> > (XEN)
> > > >> >
> > > >> > (XEN)
> > > >> >
> > > >> > (XEN) ****************************************
> > > >> >
> > > >> > (XEN) Panic on CPU 0:
> > > >> >
> > > >> > (XEN) Xen BUG at mem_sharing.c:798
> > > >> >
> > > >> > (XEN) ****************************************
> > > >> >
> > > >> > (XEN)
> > > >> >
> > > & gt;> > (XEN) Manual reset required ('noreboot' specified)
> > > >> >
> > > >> >
> > > >> >
> > > >> > _______________________________________________
> > > >> > Xen-devel mailing list
> > > >> > Xen-devel@lists.xensource.com
> > > >> > http://lists.xensource.com/xen- devel
> > > >> >
> > > >> >
> > > >
> > > > _______________________________________________
> > > > Xen-devel mailing list
> > > > Xen-devel@lists.xensource.com
> > > > http://lists.xensource.com/xen-devel
> > > >
> > > >
> >
> > --
> > Tim Deegan <Tim.Deegan@citrix.com>
> > Principal Software Engineer, Xen Platform Team
> > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel


-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [memory sharing] bug on get_page_and_type
  2011-02-02 16:18           ` Tim Deegan
@ 2011-02-09  2:46             ` MaoXiaoyun
  2011-02-09  9:57               ` Tim Deegan
  0 siblings, 1 reply; 14+ messages in thread
From: MaoXiaoyun @ 2011-02-09  2:46 UTC (permalink / raw)
  To: xen devel; +Cc: george.dunlap, tim.deegan, juihaochiang


[-- Attachment #1.1: Type: text/plain, Size: 1441 bytes --]


Hi Tim:
 
      I've been looking into the TOCTOU issue quite a while, but
 
     1) There are quite a lot judgements like "p2m_is_shared(p2mt)" or "p2mt == p2m_ram_shared", 
which, for me,   is hard to tell whom are need to be protect by p2m_lock and whom are not
         So is there a rule to distinguish these?
 
     2) Could we improve p2m_lock to sparse lock, which maybe better, right? 
 
     thanks.
 
> Date: Wed, 2 Feb 2011 16:18:37 +0000
> From: Tim.Deegan@citrix.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> 
> At 15:43 +0000 on 02 Feb (1296661396), MaoXiaoyun wrote:
> > Hi Tim:
> > 
> > Thanks for both your advice and quick reply. I will follow.
> > 
> > So at last we should replace shr_lock with p2m_lock.
> > But more complicate, it seems both the
> > *check action* code and *nominate page* code need to be locked ,right?
> > If so, quite a lot of *check action* codes need to be locked.
> 
> Yes, I think you're right about that. Unfortunately there are some very
> long TOCTOU windows in those kind of p2m lookups, which will get more
> important as the p2m gets more dynamic. I don't want to have the
> callers of p2m code touching the p2m lock directly so we may need a new
> p2m interface to address it.
> 
> Tim.
> 
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 2043 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-02-09  2:46             ` MaoXiaoyun
@ 2011-02-09  9:57               ` Tim Deegan
  2011-02-11  7:04                 ` MaoXiaoyun
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Deegan @ 2011-02-09  9:57 UTC (permalink / raw)
  To: MaoXiaoyun; +Cc: George Dunlap, xen devel, juihaochiang

At 02:46 +0000 on 09 Feb (1297219562), MaoXiaoyun wrote:
>       I've been looking into the TOCTOU issue quite a while, but
> 
>      1) There are quite a lot judgements like "p2m_is_shared(p2mt)" or
>         "p2mt == p2m_ram_shared", which, for me, is hard to tell whom
>         are need to be protect by p2m_lock and whom are not So is
>         there a rule to distinguish these?

Not particularly.  I suspect that most of them will need to be
changed, but as I said I hope we can find something nicer than
scattering p2m_lock() around non-p2m code. 

>      2) Could we improve p2m_lock to sparse lock, which maybe better, right?

Maybe, but not necessarily.  Let's get it working properly first and
then we can measure lock contention and see whether fancy locks are
worthwhile. 

Tim.

> 
> > Date: Wed, 2 Feb 2011 16:18:37 +0000
> > From: Tim.Deegan@citrix.com
> > To: tinnycloud@hotmail.com
> > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> >
> > At 15:43 +0000 on 02 Feb (1296661396), MaoXiaoyun wrote:
> > > Hi Tim:
> > >
> > > Thanks for both your advice and quick reply. I will follow.
> > >
> > > So at last we should replace shr_lock with p2m_lock.
> > > But more complicate, it seems both the
> > > *check action* code and *nominate page* code need to be locked ,right?
> > > If so, quite a lot of *check action* codes need to be locked.
> >
> > Yes, I think you're right about that. Unfortunately there are some very
> > long TOCTOU windows in those kind of p2m lookups, which will get more
> > important as the p2m gets more dynamic. I don't want to have the
> > callers of p2m code touching the p2m lock directly so we may need a new
> > p2m interface to address it.
> >
> > Tim.
> >

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [memory sharing] bug on get_page_and_type
  2011-02-09  9:57               ` Tim Deegan
@ 2011-02-11  7:04                 ` MaoXiaoyun
  2011-02-11  9:32                   ` Tim Deegan
  0 siblings, 1 reply; 14+ messages in thread
From: MaoXiaoyun @ 2011-02-11  7:04 UTC (permalink / raw)
  To: xen devel; +Cc: george.dunlap, tim.deegan, juihaochiang


[-- Attachment #1.1: Type: text/plain, Size: 3644 bytes --]


Thanks Tim.
 
After discuss with JuiHao, How about fix in this way?
 
1)  Suppose we have a function,  make_page_unsharable() to substitude 
p2m_is_shared() check, if p2mt is not shared, we increase its type count 
to avoid it turn to shared while using it.
 
 1 int make_page_unsharable(int enable)
  2 {
  3     p2m_type_t p2mt;
  4     unsigned long mfn;
  5 
  6     p2m_lock()
  7     mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt))
  8 
  9     if(p2m_is_shared(p2mt)){
10         p2m_unlock()
11         return 1;
12     }
13                                                                                                                                                          
14     get_page_type() / ***increase page type count to avoid page type turn to shared, since in 
                                                   mem_sharing_nominate_page->page_make_sharable, only type count zero is
                                                    allowed to be shared*/
15     p2m_unlock()
16     
 17     return 0;
18 }   
 
2) If p2mt is not shared, we must decrease it type count after we finish using it
3) To avoid competition, page_make_sharble() and p2m_change_type() in  
mem_sharing_nominate_page() should be protected in same p2m_lock.

comments?
 

> Date: Wed, 9 Feb 2011 09:57:20 +0000
> From: Tim.Deegan@citrix.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> 
> At 02:46 +0000 on 09 Feb (1297219562), MaoXiaoyun wrote:
> > I've been looking into the TOCTOU issue quite a while, but
> > 
> > 1) There are quite a lot judgements like "p2m_is_shared(p2mt)" or
> > "p2mt == p2m_ram_shared", which, for me, is hard to tell whom
> > are need to be protect by p2m_lock and whom are not So is
> > there a rule to distinguish these?
> 
> Not particularly. I suspect that most of them will need to be
> changed, but as I said I hope we can find something nicer than
> scattering p2m_lock() around non-p2m code. 
> 
> > 2) Could we improve p2m_lock to sparse lock, which maybe better, right?
> 
> Maybe, but not necessarily. Let's get it working properly first and
> then we can measure lock contention and see whether fancy locks are
> worthwhile. 
> 
> Tim.
> 
> > 
> > > Date: Wed, 2 Feb 2011 16:18:37 +0000
> > > From: Tim.Deegan@citrix.com
> > > To: tinnycloud@hotmail.com
> > > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > >
> > > At 15:43 +0000 on 02 Feb (1296661396), MaoXiaoyun wrote:
> > > > Hi Tim:
> > > >
> > > > Thanks for both your advice and quick reply. I will follow.
> > > >
> > > > So at last we should replace shr_lock with p2m_lock.
> > > > But more complicate, it seems both the
> > > > *check action* code and *nominate page* code need to be locked ,right?
> > > > If so, quite a lot of *check action* codes need to be locked.
> > >
> > > Yes, I think you're right about that. Unfortunately there are some very
> > > long TOCTOU windows in those kind of p2m lookups, which will get more
> > > important as the p2m gets more dynamic. I don't want to have the
> > > callers of p2m code touching the p2m lock directly so we may need a new
> > > p2m interface to address it.
> > >
> > > Tim.
> > >
> 
> -- 
> Tim Deegan <Tim.Deegan@citrix.com>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 10122 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-02-11  7:04                 ` MaoXiaoyun
@ 2011-02-11  9:32                   ` Tim Deegan
  2011-02-21 12:29                     ` MaoXiaoyun
  0 siblings, 1 reply; 14+ messages in thread
From: Tim Deegan @ 2011-02-11  9:32 UTC (permalink / raw)
  To: MaoXiaoyun; +Cc: George Dunlap, xen devel, juihaochiang

At 07:04 +0000 on 11 Feb (1297407854), MaoXiaoyun wrote:
> Thanks Tim.
> 
> After discuss with JuiHao, How about fix in this way?
> 
> 1)  Suppose we have a function,  make_page_unsharable() to substitude
> p2m_is_shared() check, if p2mt is not shared, we increase its type count
> to avoid it turn to shared while using it.

That's a good idea.  I'd rather not have the name be anything to do with
"sharable", but we could have a function that does a p2m lookup and a
get-page-and-type, all under the p2m lock, and use it instead of the
lookup-check-getref idiom elsewhere.

Then if (as you say) the make-shareable and nominate-page actions were
covered by the same lock (or potentially even just called the same
function themselves) we would eliminate a lot of races. 

That will be too big a patch to take before 4.1.0 but I'd consider it
immediately after the release.

Tim.

>  1 int make_page_unsharable(int enable)
>   2 {
>   3     p2m_type_t p2mt;
>   4     unsigned long mfn;
>   5
>   6     p2m_lock()
>   7     mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt))
>   8
>   9     if(p2m_is_shared(p2mt)){
> 10         p2m_unlock()
> 11         return 1;
> 12     }
> 13
> 14     get_page_type() / ***increase page type count to avoid page type turn to shared, since in
>                                                    mem_sharing_nominate_page->page_make_sharable, only type count zero is
>                                                     allowed to be shared*/
> 15     p2m_unlock()
> 16
>  17     return 0;
> 18 }
> 
> 2) If p2mt is not shared, we must decrease it type count after we finish using it
> 3) To avoid competition, page_make_sharble() and p2m_change_type() in
> mem_sharing_nominate_page() should be protected in same p2m_lock.
> 
> comments?
> 
> 
> > Date: Wed, 9 Feb 2011 09:57:20 +0000
> > From: Tim.Deegan@citrix.com
> > To: tinnycloud@hotmail.com
> > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> >
> > At 02:46 +0000 on 09 Feb (1297219562), MaoXiaoyun wrote:
> > > I've been looking into the TOCTOU issue quite a while, but
> > >
> > > 1) There are quite a lot judgements like "p2m_is_shared(p2mt)" or
> > > "p2mt == p2m_ram_shared", which, for me, is hard to tell whom
> > > are need to be protect by p2m_lock and whom are not So is
> > > there a rule to distinguish these?
> >
> > Not particularly. I suspect that most of them will need to be
> > changed, but as I said I hope we can find something nicer than
> > scattering p2m_lock() around non-p2m code.
> >
> > > 2) Could we improve p2m_lock to sparse lock, which maybe better, right?
> >
> > Maybe, but not necessarily. Let's get it working properly first and
> > then we can measure lock contention and see whether fancy locks are
> > worthwhile.
> >
> > Tim.
> >
> > >
> > > > Date: Wed, 2 Feb 2011 16:18:37 +0000
> > > > From: Tim.Deegan@citrix.com
> > > > To: tinnycloud@hotmail.com
> > > > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > > > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > > >
> > > > At 15:43 +0000 on 02 Feb (1296661396), MaoXiaoyun wrote:
> > > > > Hi Tim:
> > > > >
> > > > > Thanks for both your advice and quick reply. I will follow.
> > > > >
> > > > > So at last we should replace shr_lock with p2m_lock.
> > > > > But more complicate, it seems both the
> > > > > *check action* code and *nominate page* code need to be locked ,right?
> > > > > If so, quite a lot of *check action* codes need to be locked.
> > > >
> > > > Yes, I think you're right about that. Unfortunately there are some very
> > > > long TOCTOU windows in those kind of p2m lookups, which will get more
> > > > important as the p2m gets more dynamic. I don't want to have the
> > > > callers of p2m code touching the p2m lock directly so we may need a new
> > > > p2m interface to address it.
> > > >
> > > > Tim.
> > > >
> >
> > --
> > Tim Deegan <Tim.Deegan@citrix.com>
> > Principal Software Engineer, Xen Platform Team
> > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [memory sharing] bug on get_page_and_type
  2011-02-11  9:32                   ` Tim Deegan
@ 2011-02-21 12:29                     ` MaoXiaoyun
  2011-02-21 14:08                       ` Tim Deegan
  0 siblings, 1 reply; 14+ messages in thread
From: MaoXiaoyun @ 2011-02-21 12:29 UTC (permalink / raw)
  To: xen devel; +Cc: george.dunlap, harshan.dll, tim.deegan, juihaochiang


[-- Attachment #1.1: Type: text/plain, Size: 5639 bytes --]


Hi Tim:
 
         More thoughts on this bug.
         
         First some questions
 
         1) What PGT_writeable_page means to a page?
         2) When a page type will be changed to PGT_writeable_page?
         3) It looks like PGT_writeable_page is not sharable? Since only PGT_none can, right?
         4) Could I use get_page_type(page, PGT_writeable_page) before every is_p2m_shared() check.
               
             Since if get_page_type() success, then the page will has no chance to be shared later
             and if get_page_type() failed, it page mush has type, it is either  PGT_shared_page or other types, 
             if other types, the page still has no chance to be shared.
             if PGT_shared_page, that's ok, just do usual is_p2m_shared return routine.
 
             question is, is it ok if we only get_page_type, and not to put_page_type()?
        

 
> Date: Fri, 11 Feb 2011 09:32:18 +0000
> From: Tim.Deegan@citrix.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> 
> At 07:04 +0000 on 11 Feb (1297407854), MaoXiaoyun wrote:
> > Thanks Tim.
> > 
> > After discuss with JuiHao, How about fix in this way?
> > 
> > 1) Suppose we have a function, make_page_unsharable() to substitude
> > p2m_is_shared() check, if p2mt is not shared, we increase its type count
> > to avoid it turn to shared while using it.
> 
> That's a good idea. I'd rather not have the name be anything to do with
> "sharable", but we could have a function that does a p2m lookup and a
> get-page-and-type, all under the p2m lock, and use it instead of the
> lookup-check-getref idiom elsewhere.
> 
> Then if (as you say) the make-shareable and nominate-page actions were
> covered by the same lock (or potentially even just called the same
> function themselves) we would eliminate a lot of races. 
> 
> That will be too big a patch to take before 4.1.0 but I'd consider it
> immediately after the release.
> 
> Tim.
> 
> > 1 int make_page_unsharable(int enable)
> > 2 {
> > 3 p2m_type_t p2mt;
> > 4 unsigned long mfn;
> > 5
> > 6 p2m_lock()
> > 7 mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt))
> > 8
> > 9 if(p2m_is_shared(p2mt)){
> > 10 p2m_unlock()
> > 11 return 1;
> > 12 }
> > 13
> > 14 get_page_type() / ***increase page type count to avoid page type turn to shared, since in
> > mem_sharing_nominate_page->page_make_sharable, only type count zero is
> > allowed to be shared*/
> > 15 p2m_unlock()
> > 16
> > 17 return 0;
> > 18 }
> > 
> > 2) If p2mt is not shared, we must decrease it type count after we finish using it
> > 3) To avoid competition, page_make_sharble() and p2m_change_type() in
> > mem_sharing_nominate_page() should be protected in same p2m_lock.
> > 
> > comments?
> > 
> > 
> > > Date: Wed, 9 Feb 2011 09:57:20 +0000
> > > From: Tim.Deegan@citrix.com
> > > To: tinnycloud@hotmail.com
> > > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > >
> > > At 02:46 +0000 on 09 Feb (1297219562), MaoXiaoyun wrote:
> > > > I've been looking into the TOCTOU issue quite a while, but
> > > >
> > > > 1) There are quite a lot judgements like "p2m_is_shared(p2mt)" or
> > > > "p2mt == p2m_ram_shared", which, for me, is hard to tell whom
> > > > are need to be protect by p2m_lock and whom are not So is
> > > > there a rule to distinguish these?
> > >
> > > Not particularly. I suspect that most of them will need to be
> > > changed, but as I said I hope we can find something nicer than
> > > scattering p2m_lock() around non-p2m code.
> > >
> > > > 2) Could we improve p2m_lock to sparse lock, which maybe better, right?
> > >
> > > Maybe, but not necessarily. Let's get it working properly first and
> > > then we can measure lock contention and see whether fancy locks are
> > > worthwhile.
> > >
> > > Tim.
> > >
> > > >
> > > > > Date: Wed, 2 Feb 2011 16:18:37 +0000
> > > > > From: Tim.Deegan@citrix.com
> > > > > To: tinnycloud@hotmail.com
> > > > > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > > > > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > > > >
> > > > > At 15:43 +0000 on 02 Feb (1296661396), MaoXiaoyun wrote:
> > > > > > Hi Tim:
> > > > > >
> > > > > > Thanks for both your advice and quick reply. I will follow.
> > > > > >
> > > > > > So at last we should replace shr_lock with p2m_lock.
> > > > > > But more complicate, it seems both the
> > > > > > *check action* code and *nominate page* code need to be locked ,right?
> > > > > > If so, quite a lot of *check action* codes need to be locked.
> > > > >
> > > > > Yes, I think you're right about that. Unfortunately there are some very
> > > > > long TOCTOU windows in those kind of p2m lookups, which will get more
> > > > > important as the p2m gets more dynamic. I don't want to have the
> > > > > callers of p2m code touching the p2m lock directly so we may need a new
> > > > > p2m interface to address it.
> > > > >
> > > > > Tim.
> > > > >
> > >
> > > --
> > > Tim Deegan <Tim.Deegan@citrix.com>
> > > Principal Software Engineer, Xen Platform Team
> > > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
> 
> -- 
> Tim Deegan <Tim.Deegan@citrix.com>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 9770 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [memory sharing] bug on get_page_and_type
  2011-02-21 12:29                     ` MaoXiaoyun
@ 2011-02-21 14:08                       ` Tim Deegan
  0 siblings, 0 replies; 14+ messages in thread
From: Tim Deegan @ 2011-02-21 14:08 UTC (permalink / raw)
  To: MaoXiaoyun; +Cc: George Dunlap, xen devel, harshan.dll, juihaochiang

At 12:29 +0000 on 21 Feb (1298291344), MaoXiaoyun wrote:
> Hi Tim:
> 
>          More thoughts on this bug.
> 
>          First some questions
> 
> 
>          1) What PGT_writeable_page means to a page?

It means there is a writeable mapping of it in a pagetable somewhere,
either in a validated PV guest's pagetable or a shadow pagetable.
(Probably, writeable p2m entries ought to take this kind of typecount
too.)

>          2) When a page type will be changed to PGT_writeable_page?
> 
>          3) It looks like PGT_writeable_page is not sharable? Since only PGT_none can, right?

Yes, a page can only be PGT_writeable_page or PGT_share* because it can
only have one type at a time.  PGT_none is irrelevant.

>          4) Could I use get_page_type(page, PGT_writeable_page) before every is_p2m_shared() check.

No; PGT_writeable page means something, so you can't just take it
randomly.

>              Since if get_page_type() success, then the page will has no chance to be shared later
> 
>              and if get_page_type() failed, it page mush has type, it is either  PGT_shared_page or other types,
> 
>              if other types, the page still has no chance to be shared.
> 
>              if PGT_shared_page, that's ok, just do usual is_p2m_shared return routine.
> 
> 
> 
>              question is, is it ok if we only get_page_type, and not to put_page_type()?

No, that's never OK.  Every reference count and typecount must have a
matching put somewhere or we wouldn't be able to re-use memory for new
domains. 

Cheers,

Tim.

> 
> 
> 
> > Date: Fri, 11 Feb 2011 09:32:18 +0000
> > From: Tim.Deegan@citrix.com
> > To: tinnycloud@hotmail.com
> > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> >
> > At 07:04 +0000 on 11 Feb (1297407854), MaoXiaoyun wrote:
> > > Thanks Tim.
> > >
> > > After discuss with JuiHao, How about fix in this way?
> > >
> > > 1) Suppose we have a function, make_page_unsharable() to substitude
> > > p2m_is_shared() check, if p2mt is not shared, we increase its type count
> > > to avoid it turn to shared while using it.
> >
> > That's a good idea. I'd rather not have the name be anything to do with
> > "sharable", but we could have a function that does a p2m lookup and a
> > get-page-and-type, all under the p2m lock, and use it instead of the
> > lookup-check-getref idiom elsewhere.< BR>>
> > Then if (as you say) the make-shareable and nominate-page actions were
> > covered by the same lock (or potentially even just called the same
> > function themselves) we would eliminate a lot of races.
> >
> > That will be too big a patch to take before 4.1.0 but I'd consider it
> > immediately after the release.
> >
> > Tim.
> >
> > > 1 int make_page_unsharable(int enable)
> > > 2 {
> > > 3 p2m_type_t p2mt;
> > > 4 unsigned long mfn;
> > > 5
> > > 6 p2m_lock()
> > > 7 mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt))
> > > 8
> > > 9 if(p2m_is_shared(p2mt)){
> > > 10 p2m_unlock()
> > > 11 return 1;
> > > 12 }
> > > 13
> > > 14 get_page_type() / ***increase page type count to avoid page type turn to shared, since in
> > > mem_sharing_nominate_page->page_make_sharable, only type count zero is
> > > allowed to be shared */
> > > 15 p2m_unlock()
> > > 16
> > > 17 return 0;
> > > 18 }
> > >
> > > 2) If p2mt is not shared, we must decrease it type count after we finish using it
> > > 3) To avoid competition, page_make_sharble() and p2m_change_type() in
> > > mem_sharing_nominate_page() should be protected in same p2m_lock.
> > >
> > > comments?
> > >
> > >
> > > > Date: Wed, 9 Feb 2011 09:57:20 +0000
> > > > From: Tim.Deegan@citrix.com
> > > > To: tinnycloud@hotmail.com
> > > > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > > > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > > >
> > > > At 02:46 +0000 on 09 Feb (1297219562), MaoXiaoyun wrote:
> > > > > I've been looking into the TOCTOU issue quite a while, but
> > > > >
> > > > > 1) Th ere are quite a lot judgements like "p2m_is_shared(p2mt)" or
> > > > > "p2mt == p2m_ram_shared", which, for me, is hard to tell whom
> > > > > are need to be protect by p2m_lock and whom are not So is
> > > > > there a rule to distinguish these?
> > > >
> > > > Not particularly. I suspect that most of them will need to be
> > > > changed, but as I said I hope we can find something nicer than
> > > > scattering p2m_lock() around non-p2m code.
> > > >
> > > > > 2) Could we improve p2m_lock to sparse lock, which maybe better, right?
> > > >
> > > > Maybe, but not necessarily. Let's get it working properly first and
> > > > then we can measure lock contention and see whether fancy locks are
> > > > worthwhile.
> > > >
> > > > Tim.
> > > >
> > > > >
> > > > > > Date: Wed, 2 Feb 2011 16:18:37 +0000
> > > > > > From: Tim.Deegan@citrix.com
> > > > > > To: tinnycloud@hotmail.com
> > > > > > CC: xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; juihaochiang@gmail.com
> > > > > > Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
> > > > > >
> > > > > > At 15:43 +0000 on 02 Feb (1296661396), MaoXiaoyun wrote:
> > > > > > > Hi Tim:
> > > > > > >
> > > > > > > Thanks for both your advice and quick reply. I will follow.
> > > > > > >
> > > > > > > So at last we should replace shr_lock with p2m_lock.
> > > > > > > But more complicate, it seems both the
> > > > > > > *check action* code and *nominate page* code need to be locked ,right?
> > > > > > > If so, quite a lot of *check action* codes nee d to be locked.
> > > > > >
> > > > > > Yes, I think you're right about that. Unfortunately there are some very
> > > > > > long TOCTOU windows in those kind of p2m lookups, which will get more
> > > > > > important as the p2m gets more dynamic. I don't want to have the
> > > > > > callers of p2m code touching the p2m lock directly so we may need a new
> > > > > > p2m interface to address it.
> > > > > >
> > > > > > Tim.
> > > > > >
> > > >
> > > > --
> > > > Tim Deegan <Tim.Deegan@citrix.com>
> > > > Principal Software Engineer, Xen Platform Team
> > > > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
> >
> > --
> > Tim Deegan <Tim.Deegan@citrix.com>
> > Principal Software Engineer, Xen Platform Team
> > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel


-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2011-02-21 14:08 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-31 10:13 [memory sharing] bug on get_page_and_type tinnycloud
2011-01-31 10:49 ` George Dunlap
2011-01-31 12:30   ` MaoXiaoyun
2011-01-31 12:47     ` George Dunlap
2011-02-01 14:28       ` Tim Deegan
2011-02-02 15:43         ` MaoXiaoyun
2011-02-02 16:18           ` Tim Deegan
2011-02-09  2:46             ` MaoXiaoyun
2011-02-09  9:57               ` Tim Deegan
2011-02-11  7:04                 ` MaoXiaoyun
2011-02-11  9:32                   ` Tim Deegan
2011-02-21 12:29                     ` MaoXiaoyun
2011-02-21 14:08                       ` Tim Deegan
2011-02-01 14:07   ` Tim Deegan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.