All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ian Jackson <Ian.Jackson@eu.citrix.com>
To: rshriram@cs.ubc.ca
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering [and 1 more messages]
Date: Mon, 14 Oct 2013 17:34:44 +0100	[thread overview]
Message-ID: <21084.7460.602827.494596@mariner.uk.xensource.com> (raw)
In-Reply-To: <CAP8mzPN+Pe9V9J7UHSkC81PZTT7Btx9L4WEK1BDhxe0dXs6T9A@mail.gmail.com>, <CAP8mzPPgsywSJb3mmjmWDcd+1Qzc51UkPecSJ+U0f8GpgdZDQg@mail.gmail.com>

Shriram Rajagopalan writes ("Re: [Xen-devel] [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering"):
> On Wed, Sep 4, 2013 at 12:16 PM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>     > diff -r 79b21d778550 -r 3e8b0aa7cd4a tools/libxl/libxl_netbuffer.c
>     > --- /dev/null Thu Jan 01 00:00:00 1970 +0000
>     > +++ b/tools/libxl/libxl_netbuffer.c   Thu Aug 29 14:36:36 2013 -0700
>     > @@ -0,0 +1,434 @@
>     ...
>     > +static void netbuf_script_child_death_cb(libxl__egc *egc,
>     > +                                         libxl__ev_child *child,
>     > +                                         pid_t pid, int status)
>     > +{
>     > +    /* No-op. hotplug-error will be read in setup/teardown_ifb */
>     > +    return;
>     > +}
> 
>     This can't possibly be right.  You must always wait for your children
>     and you may not complete the ao until the child has finished.  This
>     should all be documented in libxl_internal.h.
>
> Thats true. But this was not meant to be an ao (see explanation at
> end of this email).

It is not possible to fork/exec/wait in libxl without using the ao
machinery.  As the comment in libxl_internal.h says:
 /*
  * For making subprocesses.  This is the only permitted mechanism for
  * code in libxl to do so.
 ...
 _hidden pid_t libxl__ev_child_fork(libxl__gc *gc, ...

(This is due to awkwardness in the POSIX API, combined with libxl's
status as a library which needs to work inside a variety of other
programs and libraries with a variety of ways of handling child
processes.)

>     > +    /* Now wait for the result (XENBUS_PATH/hotplug-status).
>     > +     * It doesnt matter what the result is. If the hotplug-status path
>     > +     * appears, then we are good to go.
>     > +     */
>     > +    rc = libxl__wait_for_offspring(gc, domid, LIBXL_HOTPLUG_TIMEOUT,
>     > +                                   script,
>     > +                                   /* path */
>     > +                                   GCSPRINTF("%s/hotplug-status",
>     > +                                             out_path_base),
>     > +                                   NULL /* state */,
>     > +                                   NULL, script_done_cb, NULL);
> 
>     This wait does not actually block the current execution.
...
> It actually does. Atleast based on the code in libxl_exec.c which uses
> a select() call to wait on xenstore fds.

Oh, I'm wrong.  But, libxl__wait_for_offspring is a special tool for
dealing with the daemonic device model process.  The comment should be
clearer.

>     Does this run a script per netbuf ?  Do they want to run in
>     parallel ?
> 
> A script per netbuf. Running them in parallel is an overkill, since
> the remus setup is a one-time task and the average domain is
> expected to have few interfaces.  Besides, the script has no
> waits. It just issues a bunch of shell commands that don't block

Fair enough.  But you may find it easier to run them in parallel
because perhaps then you can reuse more of the existing machinery for
running multiple ao subtasks.

> See answer above for running in parallel. In terms of running this
> asynchronously, I had a comment in this file explaining why I
> decided to go the synchronous route.  Basically, the setup and
> teardown are part of the remus API call, which is already
> asynchronous.
> 
> The remus external API (libxl_domain_remus_start) returns after
> starting remus.  It goes like this:
>   setup scripts
>   suspend_domain (which basically launches save_domain helper process)
>   return AO_IN_PROGRESS

Right.

> The setup scripts are have no long running operations. They complete in jiffy
> (adding a bunch of tc rules to interfaces)

But in libxl you can't wait for subprocesses (ie, you can't call
waitpid) other than via the ao machinery.

> The AO callback in libxl.c (named "remus_replication_failure_cb" [beginning 
> of this patch]), is called when the backup fails, i.e. when the replication 
> operations fail (such as write() calls).
> 
> The AO callback calls the above teardown function.  I remember reading in the
> comments that an AO op should not be issued from an AO callback 
> (please correct me if I am wrong).

You can't run an ao_how-taking libxl function from within libxl.  But
you may (and should) provide a plain-callback-style version of the
same operation from use within libxl.

To see some examples of how to do subfunctions in the asynchronous
style, see libxl_create.c - for example, its call to
libxl__bootloader_run.

Thanks,
Ian.

  parent reply	other threads:[~2013-10-14 16:34 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-29 22:16 [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
2013-08-29 22:16 ` [PATCH 1 of 5 V2] remus: add libnl3 dependency to autoconf scripts Shriram Rajagopalan
2013-09-04 13:40   ` Ian Campbell
2013-10-09  2:14     ` Shriram Rajagopalan
2013-10-09  8:06       ` Ian Campbell
2013-10-09 16:27         ` Shriram Rajagopalan
2013-10-09 16:36           ` Ian Campbell
2013-08-29 22:16 ` [PATCH 2 of 5 V2] tools/hotplug: Remus network buffering setup scripts Shriram Rajagopalan
2013-09-04 13:50   ` Ian Campbell
2013-08-29 22:16 ` [PATCH 3 of 5 V2] tools/libxl: setup/teardown Remus network buffering Shriram Rajagopalan
2013-09-04 15:17   ` Ian Campbell
2013-10-14 14:23     ` Shriram Rajagopalan
2013-10-15 10:34       ` Ian Campbell
2013-09-04 16:16   ` Ian Jackson
2013-09-04 17:22     ` Shriram Rajagopalan
2013-10-09 16:32       ` Shriram Rajagopalan
2013-10-14 16:30         ` [PATCH] libxl: Deprecate synchronous waiting for the device model Ian Jackson
2013-11-04 16:50           ` Ian Campbell
2013-11-04 17:03             ` Ian Jackson
2013-11-12 16:58               ` [PATCH RESEND] " Ian Jackson
2013-11-12 17:25                 ` Ian Campbell
2013-11-12 17:27                   ` Ian Jackson
2013-10-14 16:34         ` Ian Jackson [this message]
2013-08-29 22:16 ` [PATCH 4 of 5 V2] tools/libxl: Control network buffering in remus callbacks Shriram Rajagopalan
2013-09-04 15:19   ` Ian Campbell
2013-09-05 11:25   ` Ian Jackson
2013-08-29 22:16 ` [PATCH 5 of 5 V2] tools/xl: Remus - Network buffering cmdline switch Shriram Rajagopalan
2013-09-04 15:24   ` Ian Campbell
2013-09-04 13:21 ` [PATCH 0 of 5 V2] Remus/Libxl: Network buffering support Shriram Rajagopalan
2013-09-04 13:27   ` Ian Campbell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=21084.7460.602827.494596@mariner.uk.xensource.com \
    --to=ian.jackson@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=rshriram@cs.ubc.ca \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.