From: Wei Liu <wei.liu2@citrix.com>
To: Venu Busireddy <venu.busireddy@oracle.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	xen-devel@lists.xen.org
Subject: Re: Containing unrecoverable AER errors...
Date: Tue, 20 Jun 2017 12:56:34 +0100
Message-ID: <20170620115634.ci56l4zsicprvo62@citrix.com>
In-Reply-To: <20170607192432.20500-1-venu.busireddy@oracle.com>

On Wed, Jun 07, 2017 at 02:24:32PM -0500, Venu Busireddy wrote:
> 
> Hi,
> 
> I am working on creating a patch to aid in containing the unrecoverable
> AER errors generated by PCI devices assigned to guests in passthrough
> mode.
> 
> The overall approach is as follows:
> 
> 1. Change the BIOS settings such that the AER error handling is delegated
>    to the host.
> 
> 2. Change the xen_pciback driver to store the name (SBDF) of the erring
>    device in xenstore.
> 
> 3. At the time of creating the guest, set up a watcher for such writes to
>    the xenstore.
> 
> 4. When the watcher is kicked off due to errors, *shut down* the guest and
>    mark the erring device unassignable until administrative intervention.
> 
> I got all of this working, but I was advised that shutting down the
> guest is not the correct approach, because the guest may or may not
> respond to the shutdown. The suggestion was to destroy the guest.
> 
> I ran into a problem with that. libxl_domain_destroy() is not
> callable from within libxl. I tried to create a new wrapper to call
> libxl__domain_destroy(), but the callback function never gets called!
> Not surprisingly, because the description in libxl/libxl_internal.h
> about asynchronous operations does prohibit this!
> 
> What is the best way to kill/destroy a guest from within libxl? Could you
> please advise? I am including the patches below for reference (please
> ignore the few debug statements). The problem part is the function
> aer_backend_watch_callback() in tools/libxl/libxl_pci.c.
> 
[...]
> +
> +/* Handler of events for device driver domains */
> +int libxl_reg_aer_events_handler(libxl_ctx *ctx, uint32_t domid)
> +{
> +    int rc;
> +    char *be_path;
> +    GC_INIT(ctx);
> +

You can probably create an AO (asynchronous operation) here, stash it
somewhere, and then use it in your callback to destroy the domain.

See also: libxl_device_events_handler
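
As a rough illustration of that pattern, here is a standalone C sketch.
All names in it (sketch_ao, aer_watch, and the sketch_* functions) are
hypothetical stand-ins, not libxl APIs. Real code would presumably use
libxl's internal AO machinery and the actual domain destroy path
instead of the flags used here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a long-running asynchronous operation.
 * This is NOT the real libxl AO type. */
typedef struct sketch_ao {
    int complete;   /* set once the operation has finished */
    int rc;         /* result reported on completion */
} sketch_ao;

/* Per-guest watch state; the AO is stashed here at registration time. */
typedef struct aer_watch {
    uint32_t domid;
    sketch_ao *ao;
    int destroyed;  /* stand-in for "the domain has been destroyed" */
} aer_watch;

/* Registration: create the operation up front and stash it, so the
 * callback has something to run the destroy under later. */
int sketch_reg_aer_handler(aer_watch *w, uint32_t domid)
{
    w->domid = domid;
    w->destroyed = 0;
    w->ao = calloc(1, sizeof(*w->ao));
    return w->ao ? 0 : -1;
}

/* Completion: record the result and mark the operation finished. */
static void sketch_destroy_done(aer_watch *w, int rc)
{
    w->ao->rc = rc;
    w->ao->complete = 1;
}

/* Watch callback: fires when an erring device is reported. It uses
 * the stashed operation rather than trying to start a new one. */
void sketch_aer_watch_callback(aer_watch *w)
{
    assert(w->ao != NULL);
    if (w->ao->complete)
        return;             /* ignore repeated events */
    w->destroyed = 1;       /* stand-in for the real destroy */
    sketch_destroy_done(w, 0);
}

/* Teardown: release the stashed operation. */
void sketch_aer_watch_free(aer_watch *w)
{
    free(w->ao);
    w->ao = NULL;
}
```

The point is only the lifetime: the operation is created when the
handler is registered and outlives that call, so the callback can
complete it later.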

