All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andres Lagar-Cavilla <andreslc@gridcentric.ca>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: andres.lagarcavilla@gmail.com, xen-devel@lists.xen.org,
	Andres Lagar-Cavilla <andres@lagarcavilla.org>,
	David Vrabel <david.vrabel@citrix.com>
Subject: Re: [PATCH] Xen backend support for paged out grant targets.
Date: Fri, 31 Aug 2012 15:33:17 -0400	[thread overview]
Message-ID: <CAO=PTzoNfS8+DFXUM2+E9FaZSCJwTeRjecCXJwy_+CFyxGYpUQ@mail.gmail.com> (raw)
In-Reply-To: <20120831165454.GE18929@localhost.localdomain>


[-- Attachment #1.1: Type: text/plain, Size: 14485 bytes --]

But msleep will wind up calling schedule(). We definitely cannot afford to
pin down a dom0 vcpu when the pager itself is in dom0.

Iiuc....

Andres
On Aug 31, 2012 12:55 PM, "Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>
wrote:

> On Fri, Aug 31, 2012 at 11:42:32AM -0400, Andres Lagar-Cavilla wrote:
> > Actually acted upon your feedback ipso facto:
> >
> > commit d5fab912caa1f0cf6be0a6773f502d3417a207b6
> > Author: Andres Lagar-Cavilla <andres@lagarcavilla.org>
> > Date:   Sun Aug 26 09:45:57 2012 -0400
> >
> >     Xen backend support for paged out grant targets.
> >
> >     Since Xen-4.2, hvm domains may have portions of their memory paged
> out. When a
> >     foreign domain (such as dom0) attempts to map these frames, the map
> will
> >     initially fail. The hypervisor returns a suitable errno, and kicks an
> >     asynchronous page-in operation carried out by a helper. The foreign
> domain is
> >     expected to retry the mapping operation until it eventually
> succeeds. The
> >     foreign domain is not put to sleep because itself could be the one
> running the
> >     pager assist (typical scenario for dom0).
> >
> >     This patch adds support for this mechanism for backend drivers using
> grant
> >     mapping and copying operations. Specifically, this covers the
> blkback and
> >     gntdev drivers (which map foregin grants), and the netback driver
> (which copies
> >     foreign grants).
> >
> >     * Add GNTST_eagain, already exposed by Xen, to the grant interface.
> >     * Add a retry method for grants that fail with GNTST_eagain (i.e.
> because the
> >       target foregin frame is paged out).
> >     * Insert hooks with appropriate macro decorators in the
> aforementioned drivers.
> >
> >     The retry loop is only invoked if the grant operation status is
> GNTST_eagain.
> >     It guarantees to leave a new status code different from
> GNTST_eagain. Any other
> >     status code results in identical code execution as before.
> >
> >     The retry loop performs 256 attempts with increasing time intervals
> through a
> >     32 second period. It uses msleep to yield while waiting for the next
> retry.
> >
>
>
> Would it make sense to yield to other processes (so call schedule)? Or
> perhaps have this in a workqueue ?
>
> I mean the 'msleep' just looks like a hack. .. 32 seconds of doing
> 'msleep' on 1VCPU dom0 could trigger the watchdog I think?
>
> >     V2 after feedback from David Vrabel:
> >     * Explicit MAX_DELAY instead of wrap-around delay into zero
> >     * Abstract GNTST_eagain check into core grant table code for netback
> module.
> >
> >     Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
> >
> > diff --git a/drivers/net/xen-netback/netback.c
> b/drivers/net/xen-netback/netback.c
> > index 682633b..5610fd8 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -635,9 +635,7 @@ static void xen_netbk_rx_action(struct xen_netbk
> *netbk)
> >               return;
> >
> >       BUG_ON(npo.copy_prod > ARRAY_SIZE(netbk->grant_copy_op));
> > -     ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,
> &netbk->grant_copy_op,
> > -                                     npo.copy_prod);
> > -     BUG_ON(ret != 0);
> > +     gnttab_batch_copy_no_eagain(netbk->grant_copy_op, npo.copy_prod);
> >
> >       while ((skb = __skb_dequeue(&rxq)) != NULL) {
> >               sco = (struct skb_cb_overlay *)skb->cb;
> > @@ -1460,18 +1458,15 @@ static void xen_netbk_tx_submit(struct xen_netbk
> *netbk)
> >  static void xen_netbk_tx_action(struct xen_netbk *netbk)
> >  {
> >       unsigned nr_gops;
> > -     int ret;
> >
> >       nr_gops = xen_netbk_tx_build_gops(netbk);
> >
> >       if (nr_gops == 0)
> >               return;
> > -     ret = HYPERVISOR_grant_table_op(GNTTABOP_copy,
> > -                                     netbk->tx_copy_ops, nr_gops);
> > -     BUG_ON(ret);
> >
> > -     xen_netbk_tx_submit(netbk);
> > +     gnttab_batch_copy_no_eagain(netbk->tx_copy_ops, nr_gops);
> >
> > +     xen_netbk_tx_submit(netbk);
> >  }
> >
> >  static void xen_netbk_idx_release(struct xen_netbk *netbk, u16
> pending_idx)
> > diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> > index eea81cf..96543b2 100644
> > --- a/drivers/xen/grant-table.c
> > +++ b/drivers/xen/grant-table.c
> > @@ -38,6 +38,7 @@
> >  #include <linux/vmalloc.h>
> >  #include <linux/uaccess.h>
> >  #include <linux/io.h>
> > +#include <linux/delay.h>
> >  #include <linux/hardirq.h>
> >
> >  #include <xen/xen.h>
> > @@ -823,6 +824,26 @@ unsigned int gnttab_max_grant_frames(void)
> >  }
> >  EXPORT_SYMBOL_GPL(gnttab_max_grant_frames);
> >
> > +#define MAX_DELAY 256
> > +void
> > +gnttab_retry_eagain_gop(unsigned int cmd, void *gop, int16_t *status,
> > +                                             const char *func)
> > +{
> > +     unsigned delay = 1;
> > +
> > +     do {
> > +             BUG_ON(HYPERVISOR_grant_table_op(cmd, gop, 1));
> > +             if (*status == GNTST_eagain)
> > +                     msleep(delay++);
> > +     } while ((*status == GNTST_eagain) && (delay < MAX_DELAY));
> > +
> > +     if (delay >= MAX_DELAY) {
> > +             printk(KERN_ERR "%s: %s eagain grant\n", func,
> current->comm);
> > +             *status = GNTST_bad_page;
> > +     }
> > +}
> > +EXPORT_SYMBOL_GPL(gnttab_retry_eagain_gop);
> > +
> >  int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
> >                   struct gnttab_map_grant_ref *kmap_ops,
> >                   struct page **pages, unsigned int count)
> > @@ -836,6 +857,11 @@ int gnttab_map_refs(struct gnttab_map_grant_ref
> *map_ops,
> >       if (ret)
> >               return ret;
> >
> > +     /* Retry eagain maps */
> > +     for (i = 0; i < count; i++)
> > +             if (map_ops[i].status == GNTST_eagain)
> > +                     gnttab_retry_eagain_map(map_ops + i);
> > +
> >       if (xen_feature(XENFEAT_auto_translated_physmap))
> >               return ret;
> >
> > diff --git a/drivers/xen/xenbus/xenbus_client.c
> b/drivers/xen/xenbus/xenbus_client.c
> > index b3e146e..749f6a3 100644
> > --- a/drivers/xen/xenbus/xenbus_client.c
> > +++ b/drivers/xen/xenbus/xenbus_client.c
> > @@ -490,8 +490,7 @@ static int xenbus_map_ring_valloc_pv(struct
> xenbus_device *dev,
> >
> >       op.host_addr = arbitrary_virt_to_machine(pte).maddr;
> >
> > -     if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> > -             BUG();
> > +     gnttab_map_grant_no_eagain(&op);
> >
> >       if (op.status != GNTST_okay) {
> >               free_vm_area(area);
> > @@ -572,8 +571,7 @@ int xenbus_map_ring(struct xenbus_device *dev, int
> gnt_ref,
> >       gnttab_set_map_op(&op, (unsigned long)vaddr, GNTMAP_host_map,
> gnt_ref,
> >                         dev->otherend_id);
> >
> > -     if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
> > -             BUG();
> > +     gnttab_map_grant_no_eagain(&op);
> >
> >       if (op.status != GNTST_okay) {
> >               xenbus_dev_fatal(dev, op.status,
> > diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
> > index 11e27c3..2fecfab 100644
> > --- a/include/xen/grant_table.h
> > +++ b/include/xen/grant_table.h
> > @@ -43,6 +43,7 @@
> >  #include <xen/interface/grant_table.h>
> >
> >  #include <asm/xen/hypervisor.h>
> > +#include <asm/xen/hypercall.h>
> >
> >  #include <xen/features.h>
> >
> > @@ -183,6 +184,43 @@ unsigned int gnttab_max_grant_frames(void);
> >
> >  #define gnttab_map_vaddr(map) ((void *)(map.host_virt_addr))
> >
> > +/* Retry a grant map/copy operation when the hypervisor returns
> GNTST_eagain.
> > + * This is typically due to paged out target frames.
> > + * Generic entry-point, use macro decorators below for specific grant
> > + * operations.
> > + * Will retry for 1, 2, ... 255 ms, i.e. 256 times during 32 seconds.
> > + * Return value in *status guaranteed to no longer be GNTST_eagain. */
> > +void gnttab_retry_eagain_gop(unsigned int cmd, void *gop, int16_t
> *status,
> > +                             const char *func);
> > +
> > +#define gnttab_retry_eagain_map(_gop)                       \
> > +    gnttab_retry_eagain_gop(GNTTABOP_map_grant_ref, (_gop), \
> > +                            &(_gop)->status, __func__)
> > +
> > +#define gnttab_retry_eagain_copy(_gop)                  \
> > +    gnttab_retry_eagain_gop(GNTTABOP_copy, (_gop),      \
> > +                            &(_gop)->status, __func__)
> > +
> > +#define gnttab_map_grant_no_eagain(_gop)
>      \
> > +do {
>      \
> > +    if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, (_gop), 1))
>     \
> > +        BUG();
>      \
> > +    if ((_gop)->status == GNTST_eagain)
>     \
> > +        gnttab_retry_eagain_map((_gop));
>      \
> > +} while(0)
> > +
> > +static inline void
> > +gnttab_batch_copy_no_eagain(struct gnttab_copy *batch, unsigned count)
> > +{
> > +    unsigned i;
> > +    struct gnttab_copy *op;
> > +
> > +    BUG_ON(HYPERVISOR_grant_table_op(GNTTABOP_copy, batch, count));
> > +    for (i = 0, op = batch; i < count; i++, op++)
> > +        if (op->status == GNTST_eagain)
> > +            gnttab_retry_eagain_copy(op);
> > +}
> > +
> >  int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
> >                   struct gnttab_map_grant_ref *kmap_ops,
> >                   struct page **pages, unsigned int count);
> > diff --git a/include/xen/interface/grant_table.h
> b/include/xen/interface/grant_table.h
> > index 7da811b..66cb734 100644
> > --- a/include/xen/interface/grant_table.h
> > +++ b/include/xen/interface/grant_table.h
> > @@ -520,6 +520,7 @@ DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_version);
> >  #define GNTST_permission_denied (-8) /* Not enough privilege for
> operation.  */
> >  #define GNTST_bad_page         (-9) /* Specified page was invalid for
> op.    */
> >  #define GNTST_bad_copy_arg    (-10) /* copy arguments cross page
> boundary */
> > +#define GNTST_eagain          (-12) /* Retry.
>      */
> >
> >  #define GNTTABOP_error_msgs {                   \
> >      "okay",                                     \
> > @@ -533,6 +534,7 @@ DEFINE_GUEST_HANDLE_STRUCT(gnttab_get_version);
> >      "permission denied",                        \
> >      "bad page",                                 \
> >      "copy arguments cross page boundary"        \
> > +    "retry"                                     \
> >  }
> >
> >  #endif /* __XEN_PUBLIC_GRANT_TABLE_H__ */
> >
> >
> > On Aug 31, 2012, at 10:45 AM, Andres Lagar-Cavilla wrote:
> >
> > >
> > > On Aug 31, 2012, at 10:32 AM, David Vrabel wrote:
> > >
> > >> On 27/08/12 17:51, andres@lagarcavilla.org wrote:
> > >>> From: Andres Lagar-Cavilla <andres@lagarcavilla.org>
> > >>>
> > >>> Since Xen-4.2, hvm domains may have portions of their memory paged
> out. When a
> > >>> foreign domain (such as dom0) attempts to map these frames, the map
> will
> > >>> initially fail. The hypervisor returns a suitable errno, and kicks an
> > >>> asynchronous page-in operation carried out by a helper. The foreign
> domain is
> > >>> expected to retry the mapping operation until it eventually
> succeeds. The
> > >>> foreign domain is not put to sleep because itself could be the one
> running the
> > >>> pager assist (typical scenario for dom0).
> > >>>
> > >>> This patch adds support for this mechanism for backend drivers using
> grant
> > >>> mapping and copying operations. Specifically, this covers the
> blkback and
> > >>> gntdev drivers (which map foregin grants), and the netback driver
> (which copies
> > >>> foreign grants).
> > >>>
> > >>> * Add GNTST_eagain, already exposed by Xen, to the grant interface.
> > >>> * Add a retry method for grants that fail with GNTST_eagain (i.e.
> because the
> > >>> target foregin frame is paged out).
> > >>> * Insert hooks with appropriate macro decorators in the
> aforementioned drivers.
> > >>
> > >> I think you should implement wrappers around
> HYPERVISOR_grant_table_op()
> > >> have have the wrapper do the retries instead of every backend having
> to
> > >> check for EAGAIN and issue the retries itself. Similar to the
> > >> gnttab_map_grant_no_eagain() function you've already added.
> > >>
> > >> Why do some operations not retry anyway?
> > >
> > > All operations retry. The reason why I could not make it as elegant as
> you suggest is because grant operations are submitted in batches and their
> status(es?) later checked individually elsewhere. This is the case for
> netback. Note that both blkback and gntdev use a more linear structure with
> the gnttab_map_refs helper, which allows me to hide all the retry gore from
> those drivers into grant table code. Likewise for xenbus ring mapping.
> > >
> > > In summary, outside of core grant table code, only the netback driver
> needs to check explicitly for retries, due to its
> batch-copy-delayed-per-slot-check structure.
> > >
> > >>
> > >>> +void
> > >>> +gnttab_retry_eagain_gop(unsigned int cmd, void *gop, int16_t
> *status,
> > >>> +                                         const char *func)
> > >>> +{
> > >>> + u8 delay = 1;
> > >>> +
> > >>> + do {
> > >>> +         BUG_ON(HYPERVISOR_grant_table_op(cmd, gop, 1));
> > >>> +         if (*status == GNTST_eagain)
> > >>> +                 msleep(delay++);
> > >>> + } while ((*status == GNTST_eagain) && delay);
> > >>
> > >> Terminating the loop when delay wraps is a bit subtle.  Why not make
> > >> delay unsigned and check delay <= MAX_DELAY?
> > > Good idea (MAX_DELAY == 256). I'd like to get Konrad's feedback before
> a re-spin.
> > >
> > >>
> > >> Would it be sensible to ramp the delay faster?  Perhaps double each
> > >> iteration with a maximum possible delay of e.g., 256 ms.
> > > Generally speaking we've never seen past three retries. I am open to
> changing the algorithm but there is a significant possibility it won't
> matter at all.
> > >
> > >>
> > >>> +#define gnttab_map_grant_no_eagain(_gop)
>          \
> > >>> +do {
>          \
> > >>> +    if ( HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, (_gop),
> 1))      \
> > >>> +        BUG();
>          \
> > >>> +    if ((_gop)->status == GNTST_eagain)
>         \
> > >>> +        gnttab_retry_eagain_map((_gop));
>          \
> > >>> +} while(0)
> > >>
> > >> Inline functions, please.
> > >
> > > I want to retain the original context for debugging. Eventually we
> print __func__ if things go wrong.
> > >
> > > Thanks, great feedback
> > > Andres
> > >
> > >>
> > >> David
> > >
> >
>

[-- Attachment #1.2: Type: text/html, Size: 17912 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2012-08-31 19:33 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-27 16:51 [PATCH] Xen backend support for paged out grant targets andres
2012-08-31 14:32 ` David Vrabel
2012-08-31 14:45   ` Andres Lagar-Cavilla
2012-08-31 15:42     ` Andres Lagar-Cavilla
2012-08-31 16:10       ` David Vrabel
2012-09-05 16:27         ` Konrad Rzeszutek Wilk
2012-09-05 17:21           ` Andres Lagar-Cavilla
2012-09-12 13:20             ` Andres Lagar-Cavilla
2012-09-12 18:30               ` Konrad Rzeszutek Wilk
2012-08-31 16:54       ` Konrad Rzeszutek Wilk
2012-08-31 19:33         ` Andres Lagar-Cavilla [this message]
2012-08-31 21:14           ` Konrad Rzeszutek Wilk
2012-09-12 19:50 Andres Lagar-Cavilla
     [not found] <1347479153-441-1-git-send-email-andres@lagarcavilla.org>
2012-09-12 20:06 ` Konrad Rzeszutek Wilk
2012-09-12 20:21   ` Andres Lagar-Cavilla
2012-09-13  0:04     ` Andres Lagar-Cavilla
     [not found] ` <1347520482.25803.46.camel@dagon.hellion.org.uk>
2012-09-13 14:24   ` Andres Lagar-Cavilla
2012-09-13 17:28 Andres Lagar-Cavilla
2012-09-13 18:11 ` Ian Campbell
2012-09-13 19:45   ` Andres Lagar-Cavilla
2012-09-14  7:19     ` Ian Campbell
2012-09-14 12:44       ` Konrad Rzeszutek Wilk
2012-09-14 14:27         ` Andres Lagar-Cavilla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAO=PTzoNfS8+DFXUM2+E9FaZSCJwTeRjecCXJwy_+CFyxGYpUQ@mail.gmail.com' \
    --to=andreslc@gridcentric.ca \
    --cc=andres.lagarcavilla@gmail.com \
    --cc=andres@lagarcavilla.org \
    --cc=david.vrabel@citrix.com \
    --cc=konrad.wilk@oracle.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.