From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: Linux Guest Crash on stress test of memory sharing
Date: Tue, 25 Jan 2011 14:23:00 +0800
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0530301679=="
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: george.dunlap@eu.citrix.com, zpfalpc23@gmail.com, tim.deegan@citrix.com, juihaochiang@gmail.com
List-Id: xen-devel@lists.xenproject.org

--===============0530301679==
Content-Type: multipart/alternative; boundary="_6262348f-78d5-4951-9a08-763c50cbf466_"

--_6262348f-78d5-4951-9a08-763c50cbf466_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:

       Most of the core dumps have the same stack as the one submitted before.
       Now we have another stack, shown below. Thanks.

crash> bt -l
PID: 1      TASK: ffff8100011df7a0  CPU: 0   COMMAND: "init"
 #0 [ffff8100011fddf0] xen_panic_event at ffffffff88001d28
 #1 [ffff8100011fde10] notifier_call_chain at ffffffff80066eaa
    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/kernel/sys.c: 146
 #2 [ffff8100011fde30] panic at ffffffff8009094a
    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/kernel/panic.c: 101
 #3 [ffff8100011fdf20] do_exit at ffffffff80015477
    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/kernel/exit.c: 835
 #4 [ffff8100011fdf80] system_call at ffffffff8005d116
    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/arch/x86_64/kernel/entry.S
    RIP: 000000000055a5ff  RSP: 00007fff2b8c2e10  RFLAGS: 00010246
    RAX: 00000000000000e7  RBX: ffffffff8005d116  RCX: 0000000000000047
    RDX: 0000000000000001  RSI: 000000000000003c  RDI: 0000000000000001
    RBP: 0000000000000000   R8: 00000000000000e7   R9: ffffffffffffffb4
    R10: 00000000ffffffff  R11: 0000000000000246  R12: 0000000000000001
    R13: 0000000000604ea8  R14: ffffffff80049281  R15: 0000000000000000
    ORIG_RAX: 00000000000000e7  CS: 0033  SS: 002b
crash>

>From: tinnycloud@hotmail.com
>To: tinnycloud@hotmail.com
>Subject: Linux Guest Crash on stress test of memory sharing
>Date: Tue, 25 Jan 2011 13:07:15 +0800
>
>Hi:
>
>       Following George's suggestion, I am submitting the bug in this new thread.
>
>       Start 24 Linux HVMs on a physical host, each of them rebooted through "xm reboot" every 30 minutes.
>       After several hours, some of the HVMs will crash.
>
>       All of the crashed HVMs stopped during booting.
>       The bug still exists even if I forbid page sharing by tricking tapdisk so that
>       xc_memshr_nominate_gref() returns failure. There is no bug if memory sharing is disabled.
>       (This means only mem_sharing_nominate_page() is called, and in mem_sharing_nominate_page()
>        the page type is set to p2m_shared, so later it needs to be unshared when someone tries to use it.)
>
>       I remember there is a call path in memory sharing,
>       hvm_hap_nested_page_fault()->mem_sharing_unshare_page().
>       Compared with the crash dump, it might indicate some connection.
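For anyone who wants to reproduce the test, here is a minimal sketch of the reboot loop described above. It is not the script actually used: the domain names (hvm01..hvm24) are hypothetical, and rebooting all guests in one pass rather than staggering them is an assumption. The only external command it relies on is "xm reboot <domain>", run from dom0.

```python
import subprocess
import time

# Hypothetical guest names; substitute the names shown by `xm list`.
DOMAINS = [f"hvm{i:02d}" for i in range(1, 25)]


def reboot_all(domains, run=subprocess.run):
    """Issue `xm reboot <domain>` for every domain and return the commands issued."""
    issued = []
    for name in domains:
        cmd = ["xm", "reboot", name]
        # check=False: keep going even if one guest has already crashed.
        run(cmd, check=False)
        issued.append(cmd)
    return issued


def stress(domains, interval_s=30 * 60, rounds=None,
           run=subprocess.run, sleep=time.sleep):
    """Reboot all domains every `interval_s` seconds.

    rounds=None loops forever; an integer stops after that many passes.
    Returns the number of passes completed.
    """
    done = 0
    while rounds is None or done < rounds:
        reboot_all(domains, run=run)
        done += 1
        sleep(interval_s)
    return done
```

Calling stress(DOMAINS) on dom0 would run the loop until interrupted. The run and sleep parameters exist only so the loop can be dry-run without Xen installed.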
>
>DomU kernel is from ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/kernel-2.6.18-164.el5.src.rpm
>Xen version: 4.0.0
>
>crash dump stack :
>
>crash> bt -l
>PID: 2422   TASK: ffff810013b40860  CPU: 1   COMMAND: "setfont"
> #0 [ffff810012cef900] xen_panic_event at ffffffff88001d28
> #1 [ffff810012cef920] notifier_call_chain at ffffffff80066eaa
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/kernel/sys.c: 146
> #2 [ffff810012cef940] panic at ffffffff8009094a
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/kernel/panic.c: 101
> #3 [ffff810012cefa30] oops_end at ffffffff80064fca
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/arch/x86_64/kernel/traps.c: 539
> #4 [ffff810012cefa40] do_page_fault at ffffffff80066dc0
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/arch/x86_64/mm/fault.c: 591
> #5 [ffff810012cefb30] error_exit at ffffffff8005dde9
>    [exception RIP: vgacon_do_font_op+435]
>    RIP: ffffffff8005162d  RSP: ffff810012cefbe8  RFLAGS: 00010287
>    RAX: ffff8100000a6000  RBX: ffffffff804b3740  RCX: ffff8100000a4ae0
>    RDX: ffff810012d16ae1  RSI: ffff810012d14000  RDI: ffffffff803244c4
>    RBP: ffff810012d14000   R8: d0d6999996000000   R9: 0000009090b0b0ff
>    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000004
>    R13: 0000000000000001  R14: 0000000000000001  R15: 000000000000000e
>    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> #6 [ffff810012cefc20] vgacon_font_set at ffffffff8016bec5
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/drivers/video/console/vgacon.c: 1238
> #7 [ffff810012cefc60] con_font_op at ffffffff801aa86b
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/drivers/char/vt.c: 3645
> #8 [ffff810012cefcd0] vt_ioctl at ffffffff801a5af4
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/drivers/char/vt_ioctl.c: 965
> #9 [ffff810012cefd70] tty_ioctl at ffffffff80038a2c
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/drivers/char/tty_io.c: 3340
>#10 [ffff810012cefeb0] do_ioctl at ffffffff800420d9
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/fs/ioctl.c: 39
>#11 [ffff810012cefed0] vfs_ioctl at ffffffff800302ce
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/fs/ioctl.c: 500
>#12 [ffff810012ceff40] sys_ioctl at ffffffff8004c766
>    /usr/src/redhat/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/fs/ioctl.c: 520
>#13 [ffff810012ceff80] tracesys at ffffffff8005d28d (via system_call)
>    RIP: 00000039294cc557  RSP: 00007fff1a57ed98  RFLAGS: 00000246
>    RAX: ffffffffffffffda  RBX: ffffffff8005d28d  RCX: ffffffffffffffff
>    RDX: 00007fff1a57edb0  RSI: 0000000000004b72  RDI: 0000000000000003
>    RBP: 000000001e33dab0   R8: 0000000000000010   R9: 0000000000800000
>    R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000010
>    R13: 0000000000000200  R14: 0000000000000008  R15: 0000000000000008
>    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
--_6262348f-78d5-4951-9a08-763c50cbf466_--

--===============0530301679==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0530301679==--