* Help with Guest creation problem
@ 2009-06-02 22:08 Mick Jordan
  2009-06-04  6:12 ` Simon Horman
  0 siblings, 1 reply; 3+ messages in thread
From: Mick Jordan @ 2009-06-02 22:08 UTC (permalink / raw)
  To: xen-devel

This problem is a bit out of the mainstream as it relates to guests that 
terminate quickly, which the typical OS guest does not do, but my Java 
guest applications sometimes do.

The problem manifests itself as the behavior below. I believe that the 
cause is some race between the guest terminating, console output and the 
xm code. I'm no Python expert but it doesn't look as if the error 
handling in do_console is working correctly. Note that if the guest is 
made to sleep for a second before terminating, this behavior never happens.

I've experienced this occasionally on Solaris xVM and very frequently on 
various flavors of Xen/Linux.

Any insight appreciated.

Mick

# xm create -c xmconfigs/domain_config_generic extra= 
-XX:SemiSpaceGC:Virtual   -cp /guestvm/image/GuestVM/bin 
test.java.lang.Null  name=GuestVM-test.java.lang.Null-mjj
Unexpected error: <type 'exceptions.OSError'>

Please report to xen-devel@lists.xensource.com
Traceback (most recent call last):
  File "/usr/sbin/xm", line 10, in <module>
    main.main(sys.argv)
  File "/usr/lib64/python2.5/site-packages/xen/xm/main.py", line 2500, in main
    _, rc = _run_cmd(cmd, cmd_name, args)
  File "/usr/lib64/python2.5/site-packages/xen/xm/main.py", line 2524, in _run_cmd
    return True, cmd(args)
  File "<string>", line 1, in <lambda>
  File "/usr/lib64/python2.5/site-packages/xen/xm/main.py", line 1302, in xm_importcommand
    cmd.main([command] + args)
  File "/usr/lib64/python2.5/site-packages/xen/xm/create.py", line 1293, in main
    do_console(sxp.child_value(config, 'name', -1))
  File "/usr/lib64/python2.5/site-packages/xen/xm/create.py", line 1318, in do_console
    (p, rv) = os.waitpid(cpid, os.WNOHANG)
OSError: [Errno 10] No child processes
[root@diy-3-15 GuestVMNative]#
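
For reference, Errno 10 in the traceback above is ECHILD on Linux: the
calling process has no child matching the pid passed to waitpid(). A
minimal Python 3 sketch for decoding it (the traceback itself comes from
Python 2.5, so this is only an illustration):

```python
import errno
import os

# Look up the symbolic name and the C library's message for ECHILD.
name = errno.errorcode[errno.ECHILD]
message = os.strerror(errno.ECHILD)
print(name, message)
```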

create.py:

def do_console(domain_name):
    cpid = os.fork()
    if cpid != 0:
        for i in range(10):
            # Catch failure of the create process
            time.sleep(1)
            (p, rv) = os.waitpid(cpid, os.WNOHANG)
            if os.WIFEXITED(rv):
                if os.WEXITSTATUS(rv) != 0:
                    sys.exit(os.WEXITSTATUS(rv))
            try:
                # Acquire the console of the created dom
                if serverType == SERVER_XEN_API:
                    domid = server.xenapi.VM.get_domid(
                               get_single_vm(domain_name))
                else:
                    dom = server.xend.domain(domain_name)
                    domid = int(sxp.child_value(dom, 'domid', '-1'))
                console.execConsole(domid)
            except:
                pass
        print("Could not start console\n");
        sys.exit(0)


* Re: Help with Guest creation problem
  2009-06-02 22:08 Help with Guest creation problem Mick Jordan
@ 2009-06-04  6:12 ` Simon Horman
  2009-06-04  6:22   ` Simon Horman
  0 siblings, 1 reply; 3+ messages in thread
From: Simon Horman @ 2009-06-04  6:12 UTC (permalink / raw)
  To: Mick Jordan; +Cc: xen-devel

Hi Mick,

What I think is happening is that after the domain finishes
the child process that was used to start it is detached from its parent.
I am able to reproduce the error that you see using the following:

----------- start -------------
import os
import time
import sys

def do_console():
   cpid = os.fork()
   if cpid != 0:
       for i in range(10):
           # Catch failure of the create process
           time.sleep(1)
           (p, rv) = os.waitpid(cpid, os.WNOHANG)
           if os.WIFEXITED(rv):
               if os.WEXITSTATUS(rv) != 0:
                   sys.exit(os.WEXITSTATUS(rv))

       print("Could not start console\n");
       sys.exit(0)

do_console()
os.setsid()
----------- end -------------
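
One way to read the reproduction above, offered as an assumption rather
than something this thread confirms: the first os.waitpid() that finds the
exited child reaps it, so a later call on the same pid has nothing left to
wait for and fails with ECHILD. A Python 3 sketch of just that step
(reap_twice is a hypothetical name):

```python
import errno
import os

def reap_twice():
    """Fork a child that exits at once, then call waitpid() on it twice."""
    cpid = os.fork()
    if cpid == 0:
        os._exit(0)  # child: exit cleanly and immediately

    # First call blocks until the child exits, then reaps it.
    pid, status = os.waitpid(cpid, 0)
    first = (pid == cpid, os.WIFEXITED(status), os.WEXITSTATUS(status))

    # Second call: the child has already been reaped.
    try:
        os.waitpid(cpid, os.WNOHANG)
        second = None
    except OSError as exc:
        second = exc.errno
    return first, second

first, second = reap_twice()
print(first, errno.errorcode.get(second))
```

This would also fit the observation that a guest which sleeps for a second
does not trigger the error, since the parent then has time to attach the
console before it polls the child again.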

The following patch should resolve this problem by exiting
cleanly if os.waitpid() throws an OSError exception.

----------------------------------------------------------------------

xm: Don't die when trying to connect the console to short-lived domains

As observed by Mick Jordan, if a short-lived domain exits cleanly
then os.waitpid() will throw the following exception. This appears
to be because the child process that is used to start the domain
has detached from its parent.

OSError: [Errno 10] No child processes

Cc: Mick Jordan <Mick.Jordan@sun.com>
Signed-off-by: Simon Horman <horms@verge.ent.au>

Index: xen-unstable.hg/tools/python/xen/xm/create.py
===================================================================
--- xen-unstable.hg.orig/tools/python/xen/xm/create.py	2009-06-04 15:32:59.000000000 +1000
+++ xen-unstable.hg/tools/python/xen/xm/create.py	2009-06-04 16:10:59.000000000 +1000
@@ -1400,6 +1400,13 @@ def do_console(domain_name):
         for i in range(10):
             # Catch failure of the create process 
             time.sleep(1)
+            try:
+                (p, rv) = os.waitpid(cpid, os.WNOHANG)
+            except OSError:
+                # Domain has started cleanly and then exiting,
+		# the child process used to do this has detached
+                print("Domain has already finished");
+                break
             (p, rv) = os.waitpid(cpid, os.WNOHANG)
             if os.WIFEXITED(rv):
                 if os.WEXITSTATUS(rv) != 0:
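
Note that the hunk as posted keeps the original, unguarded os.waitpid()
call as a context line right after the new try/except, so the call would
run twice per iteration. A sketch of the same loop with a single guarded
call, written for Python 3 (where ECHILD surfaces as ChildProcessError, a
subclass of OSError); poll_child and its parameters are hypothetical and
the console attach is left out:

```python
import os
import time

def poll_child(cpid, attempts=10, interval=0.1):
    """Poll a forked child and describe what happened.

    Hypothetical helper, not part of xm: it only mirrors the shape of the
    loop in do_console() with one guarded waitpid() per iteration.
    """
    for _ in range(attempts):
        time.sleep(interval)
        try:
            pid, status = os.waitpid(cpid, os.WNOHANG)
        except ChildProcessError:
            # ECHILD: the child was already reaped on an earlier pass.
            return "already finished"
        if pid == cpid and os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
            if code != 0:
                return "failed with status %d" % code
            # Clean exit: xm would try to attach the console here.
    return "gave up"

cpid = os.fork()
if cpid == 0:
    os._exit(0)  # child: exit cleanly, like a short-lived create process
result = poll_child(cpid)
print(result)
```

Checking pid == cpid before inspecting the status avoids treating the
"child still running" result of WNOHANG (pid 0, status 0) as a clean exit.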


* Re: Help with Guest creation problem
  2009-06-04  6:12 ` Simon Horman
@ 2009-06-04  6:22   ` Simon Horman
  0 siblings, 0 replies; 3+ messages in thread
From: Simon Horman @ 2009-06-04  6:22 UTC (permalink / raw)
  To: Mick Jordan; +Cc: xen-devel

The same patch again, this time without tabs.

----------------------------------------------------------------------

xm: Don't die when trying to connect the console to short-lived domains

As observed by Mick Jordan, if a short-lived domain exits cleanly
then os.waitpid() will throw the following exception. This appears
to be because the child process that is used to start the domain
has detached from its parent.

OSError: [Errno 10] No child processes

Cc: Mick Jordan <Mick.Jordan@sun.com>
Signed-off-by: Simon Horman <horms@verge.ent.au>

Index: xen-unstable.hg/tools/python/xen/xm/create.py
===================================================================
--- xen-unstable.hg.orig/tools/python/xen/xm/create.py	2009-06-04 15:32:59.000000000 +1000
+++ xen-unstable.hg/tools/python/xen/xm/create.py	2009-06-04 16:21:10.000000000 +1000
@@ -1400,6 +1400,13 @@ def do_console(domain_name):
         for i in range(10):
             # Catch failure of the create process 
             time.sleep(1)
+            try:
+                (p, rv) = os.waitpid(cpid, os.WNOHANG)
+            except OSError:
+                # Domain has started cleanly and then exiting,
+                # the child process used to do this has detached
+                print("Domain has already finished");
+                break
             (p, rv) = os.waitpid(cpid, os.WNOHANG)
             if os.WIFEXITED(rv):
                 if os.WEXITSTATUS(rv) != 0:
