* [PATCH 0 of 4] libxl: initial support for xenpaging
@ 2011-11-02 14:45 Olaf Hering
  2011-11-02 14:45 ` [PATCH 1 of 4] xenpaging: use guests tot_pages as working target Olaf Hering
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Olaf Hering @ 2011-11-02 14:45 UTC (permalink / raw)
  To: xen-devel; +Cc: George.Dunlap, Ian.Campbell


The following series adds initial support for xenpaging to libxl.
It depends on two series I sent earlier:

tools/xenpaging fixes for xen-unstable, sent on 2011-10-21
http://lists.xensource.com/archives/html/xen-devel/2011-10/msg01542.html

libxl: make spawn interface more generic, sent on 2011-10-27
http://lists.xensource.com/archives/html/xen-devel/2011-10/msg01912.html


This series reverses the logic of xenpaging.
It now monitors the guest's tot_pages value and works toward that number by
either paging out more pages or writing pages back into the guest.
Target changes are received from the guest's "memory/target-tot_pages" path.
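
A minimal sketch of the per-pass decision described above, modelled on the
main loop in patch 1. The names pager_next_step, struct pager_step and
EVICT_BATCH_MAX are hypothetical and exist only for illustration. Capping the
resume count at num_paged_out is an assumption of this sketch; the patch
passes the uncapped difference to resume_pages().

```c
/* Cap on evictions per loop pass; patch 1 hardcodes 42. */
#define EVICT_BATCH_MAX 42

enum pager_action { PAGER_IDLE, PAGER_EVICT, PAGER_RESUME };

struct pager_step {
    enum pager_action action;
    int num_pages;
};

/*
 * Decide what the pager should do in one loop pass.
 *   tot_pages         pages the guest currently holds
 *   target_tot_pages  requested working target, 0 means "no target set"
 *   num_paged_out     pages currently held in the pagefile
 */
static struct pager_step pager_next_step(int tot_pages, int target_tot_pages,
                                         int num_paged_out)
{
    struct pager_step step = { PAGER_IDLE, 0 };

    if (target_tot_pages == 0) {
        /* No target: bring everything back into the guest. */
        if (num_paged_out > 0) {
            step.action = PAGER_RESUME;
            step.num_pages = num_paged_out;
        }
    } else if (tot_pages > target_tot_pages) {
        /* Guest holds too much: evict, but only a small batch so that
         * page-in requests can still be processed between passes. */
        step.action = PAGER_EVICT;
        step.num_pages = tot_pages - target_tot_pages;
        if (step.num_pages > EVICT_BATCH_MAX)
            step.num_pages = EVICT_BATCH_MAX;
    } else if (tot_pages < target_tot_pages && num_paged_out > 0) {
        /* Guest is below target: page some memory back in.
         * Capped at num_paged_out (assumption of this sketch). */
        step.action = PAGER_RESUME;
        step.num_pages = target_tot_pages - tot_pages;
        if (step.num_pages > num_paged_out)
            step.num_pages = num_paged_out;
    }
    return step;
}
```

Keeping the decision separate from the eviction and resume calls makes the
three cases easy to check without a running guest.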

Three new configuration file options specific for xenpaging were added:
  actmem=<int>
  xenpaging_file=<string> (optional)
  xenpaging_extra=[ 'string', 'string' ] (optional)
xenpaging will only be started if actmem= is set and not zero.


An xl mem-SOMETHING command is not yet part of this series. I will add it once
a suitable name is found.


There has been some discussion regarding the naming of the config option, and
how to drive xenpaging via xl commands. 
http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00110.html

The term "actual memory" was suggested by IanC, that's why the option is now
'actmem=' instead of 'totmem='. So far I couldn't come up with a better name
that follows the current scheme.

George Dunlap suggested the following off-list for the related xl mem-*
commands:
'xl mem-set' should continue to change the balloon target as it does today.
But it should also update "memory/target-tot_pages" with the same value. There
could be some churn while the balloon driver and xenpaging both try to reach
that value. Eventually xenpaging will be faster to free pages, while the balloon
driver still tries to reach its target. In my opinion that's not an issue if
mem-set really means 'release as much memory back to Xen, as fast as
possible'. If the guest is actually using a lot of memory, the balloon driver
(in its role as memory hog) cannot do much to reach its target. But xenpaging
can swap out some parts of the guest to free memory on the host.

Two other 'xl mem-*' commands should be added to tweak just the balloon driver
and xenpaging. 'xl mem-balloon-target' does what 'mem-set' does today, and 'xl
mem-swap-target' will tweak "memory/target-tot_pages".
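
The proposed split between the three commands can be summarised as a small
mapping from command to the xenstore nodes it would update. This is only a
sketch of the proposal above, not code from this series; enum mem_cmd,
mem_cmd_nodes and the NODE_* flags are hypothetical names.

```c
/* Xenstore nodes touched by the commands discussed above. */
#define NODE_BALLOON (1u << 0)  /* "memory/target", read by the balloon driver */
#define NODE_PAGING  (1u << 1)  /* "memory/target-tot_pages", read by xenpaging */

enum mem_cmd { MEM_SET, MEM_BALLOON_TARGET, MEM_SWAP_TARGET };

/* Return the set of nodes a given 'xl mem-*' command would write. */
static unsigned int mem_cmd_nodes(enum mem_cmd cmd)
{
    switch (cmd) {
    case MEM_SET:            /* release memory by any means */
        return NODE_BALLOON | NODE_PAGING;
    case MEM_BALLOON_TARGET: /* what 'mem-set' does today */
        return NODE_BALLOON;
    case MEM_SWAP_TARGET:    /* tweak xenpaging only */
        return NODE_PAGING;
    }
    return 0;
}
```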



Olaf


 tools/libxl/libxl.h          |    1 
 tools/libxl/libxl_create.c   |  126 ++++++++++++++++++++++++++
 tools/libxl/libxl_dom.c      |    8 +
 tools/libxl/libxl_memory.txt |   57 +++++++-----
 tools/libxl/libxl_types.idl  |    3 
 tools/libxl/xl_cmdimpl.c     |   31 ++++++
 tools/xenpaging/xenpaging.c  |  201 +++++++++++++++++++++++++++++++++++--------
 tools/xenpaging/xenpaging.h  |    1 
 8 files changed, 368 insertions(+), 60 deletions(-)


* [PATCH 1 of 4] xenpaging: use guests tot_pages as working target
  2011-11-02 14:45 [PATCH 0 of 4] libxl: initial support for xenpaging Olaf Hering
@ 2011-11-02 14:45 ` Olaf Hering
  2011-11-02 14:45 ` [PATCH 2 of 4] xenpaging: watch the guests memory/target-tot_pages xenstore value Olaf Hering
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Olaf Hering @ 2011-11-02 14:45 UTC (permalink / raw)
  To: xen-devel; +Cc: George.Dunlap, Ian.Campbell

# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1320244382 -3600
# Node ID 0d872bf1203dd36200477f688908797875035b50
# Parent  f057eb06706e2bacaadb41cf80fa45001e786e69
xenpaging: use guests tot_pages as working target

This change reverses the task of xenpaging. Before this change a fixed number
of pages was paged out. With this change the guest will not have access to
more than the given number of pages at the same time.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r f057eb06706e -r 0d872bf1203d tools/xenpaging/policy_default.c
--- a/tools/xenpaging/policy_default.c
+++ b/tools/xenpaging/policy_default.c
@@ -71,7 +71,6 @@ int policy_init(xenpaging_t *paging)
 
     /* Start in the middle to avoid paging during BIOS startup */
     current_gfn = max_pages / 2;
-    current_gfn -= paging->num_pages / 2;
 
     rc = 0;
  out:
diff -r f057eb06706e -r 0d872bf1203d tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -136,6 +136,21 @@ err:
     return rc;
 }
 
+static int xenpaging_get_tot_pages(xenpaging_t *paging)
+{
+    xc_interface *xch = paging->xc_handle;
+    xc_domaininfo_t domain_info;
+    int rc;
+
+    rc = xc_domain_getinfolist(xch, paging->mem_event.domain_id, 1, &domain_info);
+    if ( rc != 1 )
+    {
+        PERROR("Error getting domain info");
+        return -1;
+    }
+    return domain_info.tot_pages;
+}
+
 static void *init_page(void)
 {
     void *buffer;
@@ -161,7 +176,7 @@ static void *init_page(void)
     return NULL;
 }
 
-static xenpaging_t *xenpaging_init(domid_t domain_id, int num_pages)
+static xenpaging_t *xenpaging_init(domid_t domain_id, int target_tot_pages)
 {
     xenpaging_t *paging;
     xc_domaininfo_t domain_info;
@@ -296,12 +311,7 @@ static xenpaging_t *xenpaging_init(domid
     }
     DPRINTF("max_pages = %d\n", paging->max_pages);
 
-    if ( num_pages < 0 || num_pages > paging->max_pages )
-    {
-        num_pages = paging->max_pages;
-        DPRINTF("setting num_pages to %d\n", num_pages);
-    }
-    paging->num_pages = num_pages;
+    paging->target_tot_pages = target_tot_pages;
 
     /* Initialise policy */
     rc = policy_init(paging);
@@ -648,7 +658,9 @@ int main(int argc, char *argv[])
     xenpaging_victim_t *victims;
     mem_event_request_t req;
     mem_event_response_t rsp;
+    int num, prev_num = 0;
     int i;
+    int tot_pages;
     int rc = -1;
     int rc1;
     xc_interface *xch;
@@ -659,7 +671,7 @@ int main(int argc, char *argv[])
 
     if ( argc != 3 )
     {
-        fprintf(stderr, "Usage: %s <domain_id> <num_pages>\n", argv[0]);
+        fprintf(stderr, "Usage: %s <domain_id> <tot_pages>\n", argv[0]);
         return -1;
     }
 
@@ -672,7 +684,7 @@ int main(int argc, char *argv[])
     }
     xch = paging->xc_handle;
 
-    DPRINTF("starting %s %u %d\n", argv[0], paging->mem_event.domain_id, paging->num_pages);
+    DPRINTF("starting %s %u %d\n", argv[0], paging->mem_event.domain_id, paging->target_tot_pages);
 
     /* Open file */
     sprintf(filename, "page_cache_%u", paging->mem_event.domain_id);
@@ -704,9 +716,6 @@ int main(int argc, char *argv[])
     /* listen for page-in events to stop pager */
     create_page_in_thread(paging);
 
-    i = evict_pages(paging, fd, victims, paging->num_pages);
-    DPRINTF("%d pages evicted. Done.\n", i);
-
     /* Swap pages in and out */
     while ( 1 )
     {
@@ -771,12 +780,8 @@ int main(int argc, char *argv[])
                     goto out;
                 }
 
-                /* Evict a new page to replace the one we just paged in,
-                 * or clear this pagefile slot on exit */
-                if ( interrupted )
-                    victims[i].gfn = INVALID_MFN;
-                else
-                    evict_victim(paging, &victims[i], fd, i);
+                /* Clear this pagefile slot */
+                victims[i].gfn = INVALID_MFN;
             }
             else
             {
@@ -823,6 +828,43 @@ int main(int argc, char *argv[])
         if ( interrupted )
             break;
 
+        /* Check if the target has been reached already */
+        tot_pages = xenpaging_get_tot_pages(paging);
+        if ( tot_pages < 0 )
+            goto out;
+
+        /* Resume all pages if paging is disabled or no target was set */
+        if ( paging->target_tot_pages == 0 )
+        {
+            if ( paging->num_paged_out )
+                resume_pages(paging, paging->num_paged_out);
+        }
+        /* Evict more pages if target not reached */
+        else if ( tot_pages > paging->target_tot_pages )
+        {
+            num = tot_pages - paging->target_tot_pages;
+            if ( num != prev_num )
+            {
+                DPRINTF("Need to evict %d pages to reach %d target_tot_pages\n", num, paging->target_tot_pages);
+                prev_num = num;
+            }
+            /* Limit the number of evicts to be able to process page-in requests */
+            if ( num > 42 )
+                num = 42;
+            evict_pages(paging, fd, victims, num);
+        }
+        /* Resume some pages if target not reached */
+        else if ( tot_pages < paging->target_tot_pages && paging->num_paged_out )
+        {
+            num = paging->target_tot_pages - tot_pages;
+            if ( num != prev_num )
+            {
+                DPRINTF("Need to resume %d pages to reach %d target_tot_pages\n", num, paging->target_tot_pages);
+                prev_num = num;
+            }
+            resume_pages(paging, num);
+        }
+
     }
     DPRINTF("xenpaging got signal %d\n", interrupted);
 
diff -r f057eb06706e -r 0d872bf1203d tools/xenpaging/xenpaging.h
--- a/tools/xenpaging/xenpaging.h
+++ b/tools/xenpaging/xenpaging.h
@@ -50,7 +50,7 @@ typedef struct xenpaging {
     /* number of pages for which data structures were allocated */
     int max_pages;
     int num_paged_out;
-    int num_pages;
+    int target_tot_pages;
     int policy_mru_size;
     unsigned long pagein_queue[XENPAGING_PAGEIN_QUEUE_SIZE];
 } xenpaging_t;


* [PATCH 2 of 4] xenpaging: watch the guests memory/target-tot_pages xenstore value
  2011-11-02 14:45 [PATCH 0 of 4] libxl: initial support for xenpaging Olaf Hering
  2011-11-02 14:45 ` [PATCH 1 of 4] xenpaging: use guests tot_pages as working target Olaf Hering
@ 2011-11-02 14:45 ` Olaf Hering
  2011-11-02 14:45 ` [PATCH 3 of 4] xenpaging: add cmdline interface for pager Olaf Hering
  2011-11-02 14:45 ` [PATCH 4 of 4] xenpaging: initial libxl support Olaf Hering
  3 siblings, 0 replies; 24+ messages in thread
From: Olaf Hering @ 2011-11-02 14:45 UTC (permalink / raw)
  To: xen-devel; +Cc: George.Dunlap, Ian.Campbell

# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1320244383 -3600
# Node ID 434f0b4da9148b101e184e0108be6c31f67038f4
# Parent  0d872bf1203dd36200477f688908797875035b50
xenpaging: watch the guests memory/target-tot_pages xenstore value

Subsequent patches will use xenstored to store the number of pages
xenpaging is supposed to page out.
Remove num_pages and use target_pages instead.
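
A minimal sketch of how the watched value is turned into a page count,
following the handler added below: the node holds KiB, pages are assumed to be
4 KiB, and out-of-range values fall back to max_pages. The helper name
target_kib_to_pages is hypothetical. Unlike the patch, the sketch divides by 4
instead of shifting and checks the sign before converting, to avoid
right-shifting a negative int.

```c
#include <stdio.h>

/*
 * Convert the string read from "memory/target-tot_pages" (KiB) into pages.
 * Returns -1 if the string holds no number, otherwise a page count that
 * never exceeds max_pages.
 */
static int target_kib_to_pages(const char *val, int max_pages)
{
    int target_kib, pages;

    if (!val || sscanf(val, "%d", &target_kib) != 1)
        return -1;

    /* Negative targets are treated like "too large", as in the patch. */
    if (target_kib < 0)
        return max_pages;

    /* KiB to 4 KiB pages */
    pages = target_kib / 4;
    if (pages > max_pages)
        pages = max_pages;
    return pages;
}
```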

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r 0d872bf1203d -r 434f0b4da914 tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -19,8 +19,10 @@
  */
 
 #define _XOPEN_SOURCE	600
+#define _GNU_SOURCE
 
 #include <inttypes.h>
+#include <stdio.h>
 #include <stdlib.h>
 #include <stdarg.h>
 #include <time.h>
@@ -35,6 +37,10 @@
 #include "policy.h"
 #include "xenpaging.h"
 
+/* Defines number of mfns a guest should use at a time, in KiB */
+#define WATCH_TARGETPAGES "memory/target-tot_pages"
+static char *watch_target_tot_pages;
+static char *dom_path;
 static char watch_token[16];
 static char filename[80];
 static int interrupted;
@@ -72,7 +78,7 @@ static int xenpaging_wait_for_event_or_t
 {
     xc_interface *xch = paging->xc_handle;
     xc_evtchn *xce = paging->mem_event.xce_handle;
-    char **vec;
+    char **vec, *val;
     unsigned int num;
     struct pollfd fd[2];
     int port;
@@ -111,6 +117,25 @@ static int xenpaging_wait_for_event_or_t
                     rc = 0;
                 }
             }
+            else if ( strcmp(vec[XS_WATCH_PATH], watch_target_tot_pages) == 0 )
+            {
+                int ret, target_tot_pages;
+                val = xs_read(paging->xs_handle, XBT_NULL, vec[XS_WATCH_PATH], NULL);
+                if ( val )
+                {
+                    ret = sscanf(val, "%d", &target_tot_pages);
+                    if ( ret > 0 )
+                    {
+                        /* KiB to pages */
+                        target_tot_pages >>= 2;
+                        if ( target_tot_pages < 0 || target_tot_pages > paging->max_pages )
+                            target_tot_pages = paging->max_pages;
+                        paging->target_tot_pages = target_tot_pages;
+                        DPRINTF("new target_tot_pages %d\n", target_tot_pages);
+                    }
+                    free(val);
+                }
+            }
             free(vec);
         }
     }
@@ -216,6 +241,25 @@ static xenpaging_t *xenpaging_init(domid
         goto err;
     }
 
+    /* Watch xenpagings working target */
+    dom_path = xs_get_domain_path(paging->xs_handle, domain_id);
+    if ( !dom_path )
+    {
+        PERROR("Could not find domain path\n");
+        goto err;
+    }
+    if ( asprintf(&watch_target_tot_pages, "%s/%s", dom_path, WATCH_TARGETPAGES) < 0 )
+    {
+        PERROR("Could not alloc watch path\n");
+        goto err;
+    }
+    DPRINTF("watching '%s'\n", watch_target_tot_pages);
+    if ( xs_watch(paging->xs_handle, watch_target_tot_pages, "") == false )
+    {
+        PERROR("Could not bind to xenpaging watch\n");
+        goto err;
+    }
+
     p = getenv("XENPAGING_POLICY_MRU_SIZE");
     if ( p && *p )
     {
@@ -342,6 +386,8 @@ static xenpaging_t *xenpaging_init(domid
             free(paging->mem_event.ring_page);
         }
 
+        free(dom_path);
+        free(watch_target_tot_pages);
         free(paging->bitmap);
         free(paging);
     }
@@ -357,6 +403,9 @@ static int xenpaging_teardown(xenpaging_
     if ( paging == NULL )
         return 0;
 
+    xs_unwatch(paging->xs_handle, watch_target_tot_pages, "");
+    xs_unwatch(paging->xs_handle, "@releaseDomain", watch_token);
+
     xch = paging->xc_handle;
     paging->xc_handle = NULL;
     /* Tear down domain paging in Xen */


* [PATCH 3 of 4] xenpaging: add cmdline interface for pager
  2011-11-02 14:45 [PATCH 0 of 4] libxl: initial support for xenpaging Olaf Hering
  2011-11-02 14:45 ` [PATCH 1 of 4] xenpaging: use guests tot_pages as working target Olaf Hering
  2011-11-02 14:45 ` [PATCH 2 of 4] xenpaging: watch the guests memory/target-tot_pages xenstore value Olaf Hering
@ 2011-11-02 14:45 ` Olaf Hering
  2011-11-02 14:45 ` [PATCH 4 of 4] xenpaging: initial libxl support Olaf Hering
  3 siblings, 0 replies; 24+ messages in thread
From: Olaf Hering @ 2011-11-02 14:45 UTC (permalink / raw)
  To: xen-devel; +Cc: George.Dunlap, Ian.Campbell

# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1320244384 -3600
# Node ID a51d4fab351d2d1a38b82cbd7ad925f76fce9e9a
# Parent  434f0b4da9148b101e184e0108be6c31f67038f4
xenpaging: add cmdline interface for pager

Introduce cmdline handling for the pager. This simplifies libxl support;
debug and mru_size are no longer passed via the environment.
The new interface looks like this:

xenpaging [options] -f <pagefile> -d <domain_id>
options:
 -d <domid>     --domain=<domid>         numerical domain_id of guest. This option is required.
 -f <file>      --pagefile=<file>        pagefile to use. This option is required.
 -m <max_memkb> --max_memkb=<max_memkb>  maximum amount of memory to handle.
 -r <num>       --mru_size=<num>         number of paged-in pages to keep in memory.
 -v             --verbose                enable debug output.
 -h             --help                   this output.
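
A standalone sketch of parsing this interface with getopt_long(), assuming the
options listed above. struct pager_opts and pager_parse_opts are hypothetical
names, and the sketch stores results in its own struct rather than in
xenpaging_t. It also assumes 4 KiB pages when converting max_memkb.

```c
#include <getopt.h>
#include <stdlib.h>

struct pager_opts {
    int domain_id;
    const char *pagefile;
    int max_pages;
    int mru_size;
    int verbose;
};

/* Returns 0 on success, 1 if help was requested or a required option
 * (-d, -f) is missing or invalid. */
static int pager_parse_opts(struct pager_opts *o, int argc, char *argv[])
{
    static const char sopts[] = "hvd:f:m:r:";
    static const struct option lopts[] = {
        { "help",      no_argument,       NULL, 'h' },
        { "verbose",   no_argument,       NULL, 'v' },
        { "domain",    required_argument, NULL, 'd' },
        { "pagefile",  required_argument, NULL, 'f' },
        { "max_memkb", required_argument, NULL, 'm' },
        { "mru_size",  required_argument, NULL, 'r' },
        { NULL, 0, NULL, 0 }
    };
    int ch;

    optind = 1; /* restart scanning so the function can be called again */
    while ((ch = getopt_long(argc, argv, sopts, lopts, NULL)) != -1) {
        switch (ch) {
        case 'd': o->domain_id = atoi(optarg); break;
        case 'f': o->pagefile = optarg; break;
        case 'm': o->max_pages = atoi(optarg) / 4; break; /* KiB to pages */
        case 'r': o->mru_size = atoi(optarg); break;
        case 'v': o->verbose = 1; break;
        default:  return 1; /* 'h' and unknown options */
        }
    }

    /* Both the pagefile and a non-zero domain id are required. */
    if (!o->pagefile || o->domain_id <= 0)
        return 1;
    return 0;
}
```

Each long option maps to the same character as its short form, so one switch
handles both spellings.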


Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r 434f0b4da914 -r a51d4fab351d tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -31,6 +31,7 @@
 #include <poll.h>
 #include <xc_private.h>
 #include <xs.h>
+#include <getopt.h>
 
 #include "xc_bitops.h"
 #include "file_ops.h"
@@ -42,12 +43,12 @@
 static char *watch_target_tot_pages;
 static char *dom_path;
 static char watch_token[16];
-static char filename[80];
+static char *filename;
 static int interrupted;
 
 static void unlink_pagefile(void)
 {
-    if ( filename[0] )
+    if ( filename && filename[0] )
     {
         unlink(filename);
         filename[0] = '\0';
@@ -201,11 +202,85 @@ static void *init_page(void)
     return NULL;
 }
 
-static xenpaging_t *xenpaging_init(domid_t domain_id, int target_tot_pages)
+static void usage(void)
+{
+    printf("usage:\n\n");
+
+    printf("  xenpaging [options] -f <pagefile> -d <domain_id>\n\n");
+
+    printf("options:\n");
+    printf(" -d <domid>     --domain=<domid>         numerical domain_id of guest. This option is required.\n");
+    printf(" -f <file>      --pagefile=<file>        pagefile to use. This option is required.\n");
+    printf(" -m <max_memkb> --max_memkb=<max_memkb>  maximum amount of memory to handle.\n");
+    printf(" -r <num>       --mru_size=<num>         number of paged-in pages to keep in memory.\n");
+    printf(" -v             --verbose                enable debug output.\n");
+    printf(" -h             --help                   this output.\n");
+}
+
+static int xenpaging_getopts(xenpaging_t *paging, int argc, char *argv[])
+{
+    int ch;
+    static const char sopts[] = "hvd:f:m:r:";
+    static const struct option lopts[] = {
+        {"help", 0, NULL, 'h'},
+        {"verbose", 0, NULL, 'v'},
+        {"domain", 1, NULL, 'd'},
+        {"pagefile", 1, NULL, 'f'},
+        {"mru_size", 1, NULL, 'r'},
+        { }
+    };
+
+    while ((ch = getopt_long(argc, argv, sopts, lopts, NULL)) != -1)
+    {
+        switch(ch) {
+        case 'd':
+            paging->mem_event.domain_id = atoi(optarg);
+            break;
+        case 'f':
+            filename = strdup(optarg);
+            break;
+        case 'm':
+            /* KiB to pages */
+            paging->max_pages = atoi(optarg) >> 2;
+            break;
+        case 'r':
+            paging->policy_mru_size = atoi(optarg);
+            break;
+        case 'v':
+            paging->debug = 1;
+            break;
+        case 'h':
+        case '?':
+            usage();
+            return 1;
+        }
+    }
+
+    argv += optind; argc -= optind;
+    
+    /* Path to pagefile is required */
+    if ( !filename )
+    {
+        printf("Filename for pagefile missing!\n");
+        usage();
+        return 1;
+    }
+
+    /* Set domain id */
+    if ( !paging->mem_event.domain_id )
+    {
+        printf("Numerical <domain_id> missing!\n");
+        return 1;
+    }
+
+    return 0;
+}
+
+static xenpaging_t *xenpaging_init(int argc, char *argv[])
 {
     xenpaging_t *paging;
     xc_domaininfo_t domain_info;
-    xc_interface *xch;
+    xc_interface *xch = NULL;
     xentoollog_logger *dbg = NULL;
     char *p;
     int rc;
@@ -215,7 +290,12 @@ static xenpaging_t *xenpaging_init(domid
     if ( !paging )
         goto err;
 
-    if ( getenv("XENPAGING_DEBUG") )
+    /* Get cmdline options and domain_id */
+    if ( xenpaging_getopts(paging, argc, argv) )
+        goto err;
+
+    /* Enable debug output */
+    if ( paging->debug )
         dbg = (xentoollog_logger *)xtl_createlogger_stdiostream(stderr, XTL_DEBUG, 0);
 
     /* Open connection to xen */
@@ -234,7 +314,7 @@ static xenpaging_t *xenpaging_init(domid
     }
 
     /* write domain ID to watch so we can ignore other domain shutdowns */
-    snprintf(watch_token, sizeof(watch_token), "%u", domain_id);
+    snprintf(watch_token, sizeof(watch_token), "%u", paging->mem_event.domain_id);
     if ( xs_watch(paging->xs_handle, "@releaseDomain", watch_token) == false )
     {
         PERROR("Could not bind to shutdown watch\n");
@@ -242,7 +322,7 @@ static xenpaging_t *xenpaging_init(domid
     }
 
     /* Watch xenpagings working target */
-    dom_path = xs_get_domain_path(paging->xs_handle, domain_id);
+    dom_path = xs_get_domain_path(paging->xs_handle, paging->mem_event.domain_id);
     if ( !dom_path )
     {
         PERROR("Could not find domain path\n");
@@ -260,16 +340,6 @@ static xenpaging_t *xenpaging_init(domid
         goto err;
     }
 
-    p = getenv("XENPAGING_POLICY_MRU_SIZE");
-    if ( p && *p )
-    {
-         paging->policy_mru_size = atoi(p);
-         DPRINTF("Setting policy mru_size to %d\n", paging->policy_mru_size);
-    }
-
-    /* Set domain id */
-    paging->mem_event.domain_id = domain_id;
-
     /* Initialise shared page */
     paging->mem_event.shared_page = init_page();
     if ( paging->mem_event.shared_page == NULL )
@@ -335,17 +405,21 @@ static xenpaging_t *xenpaging_init(domid
 
     paging->mem_event.port = rc;
 
-    rc = xc_domain_getinfolist(xch, paging->mem_event.domain_id, 1,
-                               &domain_info);
-    if ( rc != 1 )
+    /* Get max_pages from guest if not provided via cmdline */
+    if ( !paging->max_pages )
     {
-        PERROR("Error getting domain info");
-        goto err;
+        rc = xc_domain_getinfolist(xch, paging->mem_event.domain_id, 1,
+                                   &domain_info);
+        if ( rc != 1 )
+        {
+            PERROR("Error getting domain info");
+            goto err;
+        }
+
+        /* Record number of max_pages */
+        paging->max_pages = domain_info.max_pages;
     }
 
-    /* Record number of max_pages */
-    paging->max_pages = domain_info.max_pages;
-
     /* Allocate bitmap for tracking pages that have been paged out */
     paging->bitmap = bitmap_alloc(paging->max_pages);
     if ( !paging->bitmap )
@@ -355,8 +429,6 @@ static xenpaging_t *xenpaging_init(domid
     }
     DPRINTF("max_pages = %d\n", paging->max_pages);
 
-    paging->target_tot_pages = target_tot_pages;
-
     /* Initialise policy */
     rc = policy_init(paging);
     if ( rc != 0 )
@@ -718,25 +790,18 @@ int main(int argc, char *argv[])
     mode_t open_mode = S_IRUSR | S_IRGRP | S_IROTH | S_IWUSR | S_IWGRP | S_IWOTH;
     int fd;
 
-    if ( argc != 3 )
-    {
-        fprintf(stderr, "Usage: %s <domain_id> <tot_pages>\n", argv[0]);
-        return -1;
-    }
-
     /* Initialise domain paging */
-    paging = xenpaging_init(atoi(argv[1]), atoi(argv[2]));
+    paging = xenpaging_init(argc, argv);
     if ( paging == NULL )
     {
-        fprintf(stderr, "Error initialising paging");
+        fprintf(stderr, "Error initialising paging\n");
         return 1;
     }
     xch = paging->xc_handle;
 
-    DPRINTF("starting %s %u %d\n", argv[0], paging->mem_event.domain_id, paging->target_tot_pages);
+    DPRINTF("starting %s for domain_id %u with pagefile %s\n", argv[0], paging->mem_event.domain_id, filename);
 
     /* Open file */
-    sprintf(filename, "page_cache_%u", paging->mem_event.domain_id);
     fd = open(filename, open_flags, open_mode);
     if ( fd < 0 )
     {
diff -r 434f0b4da914 -r a51d4fab351d tools/xenpaging/xenpaging.h
--- a/tools/xenpaging/xenpaging.h
+++ b/tools/xenpaging/xenpaging.h
@@ -52,6 +52,7 @@ typedef struct xenpaging {
     int num_paged_out;
     int target_tot_pages;
     int policy_mru_size;
+    int debug;
     unsigned long pagein_queue[XENPAGING_PAGEIN_QUEUE_SIZE];
 } xenpaging_t;


* [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-02 14:45 [PATCH 0 of 4] libxl: initial support for xenpaging Olaf Hering
                   ` (2 preceding siblings ...)
  2011-11-02 14:45 ` [PATCH 3 of 4] xenpaging: add cmdline interface for pager Olaf Hering
@ 2011-11-02 14:45 ` Olaf Hering
  2011-11-07 11:02   ` Stefano Stabellini
  3 siblings, 1 reply; 24+ messages in thread
From: Olaf Hering @ 2011-11-02 14:45 UTC (permalink / raw)
  To: xen-devel; +Cc: George.Dunlap, Ian.Campbell

# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1320244864 -3600
# Node ID ab5406a5b1d01e3828f0dcd833f99b70e4fbad72
# Parent  a51d4fab351d2d1a38b82cbd7ad925f76fce9e9a
xenpaging: initial libxl support

Add initial support to libxl for starting xenpaging.

The patch adds three new config options:
actmem=<int>, the amount of memory in MiB for the guest
xenpaging_file=<string>, pagefile to use (optional)
xenpaging_extra=[ 'string', 'string' ], additional args for xenpaging (optional)

If 'actmem=' is not specified in the config file, xenpaging will not start.
If 'xenpaging_file=' is not specified in the config file,
/var/lib/xen/xenpaging/<domain_name>.<domain_id>.paging is used.
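
A minimal sketch of the pagefile selection described above: use the configured
file if present, otherwise build the default name from the domain name and id.
The helper pagefile_path and the XENPAGING_DIR macro are hypothetical; the
patch itself obtains the directory from libxl_xenpaging_dir_path() and formats
the name with libxl__sprintf().

```c
#include <stdio.h>

/* Default directory as stated in the commit message (assumed fixed here). */
#define XENPAGING_DIR "/var/lib/xen/xenpaging"

/*
 * Write the pagefile path into buf.
 * Returns 0 on success, -1 if the path does not fit into len bytes.
 */
static int pagefile_path(char *buf, size_t len, const char *configured,
                         const char *dom_name, unsigned int domid)
{
    int n;

    if (configured && configured[0])
        n = snprintf(buf, len, "%s", configured);
    else
        n = snprintf(buf, len, "%s/%s.%u.paging",
                     XENPAGING_DIR, dom_name, domid);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```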

Signed-off-by: Olaf Hering <olaf@aepfle.de>

diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl.h
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -261,6 +261,7 @@ int libxl_init_dm_info(libxl_ctx *ctx,
 typedef int (*libxl_console_ready)(libxl_ctx *ctx, uint32_t domid, void *priv);
 int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config, libxl_console_ready cb, void *priv, uint32_t *domid);
 int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config, libxl_console_ready cb, void *priv, uint32_t *domid, int restore_fd);
+int libxl__create_xenpaging(libxl_ctx *ctx, libxl_domain_config *d_config, uint32_t domid, char *path);
 void libxl_domain_config_destroy(libxl_domain_config *d_config);
 int libxl_domain_suspend(libxl_ctx *ctx, libxl_domain_suspend_info *info,
                           uint32_t domid, int fd);
diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl_create.c
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -429,6 +429,122 @@ retry_transaction:
     return rc;
 }
 
+static int create_xenpaging(libxl__gc *gc, char *dom_name, uint32_t domid,
+                            libxl_domain_build_info *b_info)
+{
+    libxl__spawner_starting *buf_starting;
+    libxl_string_list xpe = b_info->u.hvm.xenpaging_extra;
+    int i, rc;
+    char *logfile;
+    int logfile_w, null;
+    char *path, *dom_path, *value;
+    char **args;
+    char *xp;
+    flexarray_t *xp_args;
+    libxl_ctx *ctx = libxl__gc_owner(gc);
+
+    /* Nothing to do */
+    if (!b_info->tot_memkb)
+        return 0;
+
+    /* Check if paging is already enabled */
+    dom_path = libxl__xs_get_dompath(gc, domid);
+    if (!dom_path ) {
+        rc = ERROR_NOMEM;
+        goto out;
+    }
+    path = libxl__sprintf(gc, "%s/xenpaging/state", dom_path);
+    if (!path ) {
+        rc = ERROR_NOMEM;
+        goto out;
+    }
+    value = xs_read(ctx->xsh, XBT_NULL, path, NULL);
+    rc = value && strcmp(value, "running") == 0;
+    free(value);
+    /* Already running, nothing to do */
+    if (rc)
+        return 0;
+
+    /* Check if xenpaging is present */
+    xp = libxl__abs_path(gc, "xenpaging", libxl_libexec_path());
+    if (access(xp, X_OK) < 0) {
+        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "%s is not executable", xp);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    /* Initialise settings for child */
+    buf_starting = calloc(sizeof(*buf_starting), 1);
+    if (!buf_starting) {
+        rc = ERROR_NOMEM;
+        goto out;
+    }
+    buf_starting->domid = domid;
+    buf_starting->dom_path = dom_path;
+    buf_starting->pid_path = "xenpaging/xenpaging-pid";
+    buf_starting->for_spawn = calloc(sizeof(libxl__spawn_starting), 1);
+    if (!buf_starting->for_spawn) {
+        rc = ERROR_NOMEM;
+        goto out;
+    }
+
+    /* Assemble arguments for xenpaging */
+    xp_args = flexarray_make(8, 1);
+    if (!xp_args) {
+        rc = ERROR_NOMEM;
+        goto out;
+    }
+    /* Set executable path */
+    flexarray_append(xp_args, xp);
+
+    /* Append pagefile option */
+    flexarray_append(xp_args, "-f");
+    if (b_info->u.hvm.xenpaging_file)
+        flexarray_append(xp_args, b_info->u.hvm.xenpaging_file);
+    else
+        flexarray_append(xp_args, libxl__sprintf(gc, "%s/%s.%u.paging",
+                         libxl_xenpaging_dir_path(), dom_name, domid));
+
+    /* Set maximum amount of memory xenpaging should handle */
+    flexarray_append(xp_args, "-m");
+    flexarray_append(xp_args, libxl__sprintf(gc, "%d", b_info->max_memkb));
+
+    /* Append extra args for pager */
+    for (i = 0; xpe && xpe[i]; i++)
+        flexarray_append(xp_args, xpe[i]);
+    /* Append domid for pager */
+    flexarray_append(xp_args, "-d");
+    flexarray_append(xp_args, libxl__sprintf(gc, "%u", domid));
+    flexarray_append(xp_args, NULL);
+    args = (char **) flexarray_contents(xp_args);
+
+    /* Initialise logfile */
+    libxl_create_logfile(ctx, libxl__sprintf(gc, "xenpaging-%s", dom_name),
+                         &logfile);
+    logfile_w = open(logfile, O_WRONLY|O_CREAT, 0644);
+    free(logfile);
+    null = open("/dev/null", O_RDONLY);
+
+    /* Spawn the child */
+    rc = libxl__spawn_spawn(gc, buf_starting->for_spawn, "xenpaging",
+                            libxl_spawner_record_pid, buf_starting);
+    if (rc < 0)
+        goto out_close;
+    if (!rc) { /* inner child */
+        setsid();
+        /* Finally run xenpaging */
+        libxl__exec(null, logfile_w, logfile_w, xp, args);
+    }
+    rc = libxl__spawn_confirm_offspring_startup(gc, 5, "xenpaging", path,
+                                                "running", buf_starting);
+out_close:
+    close(null);
+    close(logfile_w);
+    free(args);
+out:
+    return rc;
+}
+
 static int do_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
                             libxl_console_ready cb, void *priv,
                             uint32_t *domid_out, int restore_fd)
@@ -614,6 +730,16 @@ static int do_domain_create(libxl__gc *g
             goto error_out;
     }
 
+    if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM) {
+        ret = create_xenpaging(gc, d_config->dm_info.dom_name, domid,
+                              &d_config->b_info);
+        if (ret) {
+            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
+                      "Failed to start xenpaging.\n");
+            goto error_out;
+	}
+    }
+
     *domid_out = domid;
     return 0;
 
diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl_dom.c
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -108,7 +108,7 @@ int libxl__build_post(libxl__gc *gc, uin
     if (info->cpuid != NULL)
         libxl_cpuid_set(ctx, domid, info->cpuid);
 
-    ents = libxl__calloc(gc, 12 + (info->max_vcpus * 2) + 2, sizeof(char *));
+    ents = libxl__calloc(gc, 14 + (info->max_vcpus * 2) + 2, sizeof(char *));
     ents[0] = "memory/static-max";
     ents[1] = libxl__sprintf(gc, "%d", info->max_memkb);
     ents[2] = "memory/target";
@@ -121,9 +121,11 @@ int libxl__build_post(libxl__gc *gc, uin
     ents[9] = libxl__sprintf(gc, "%"PRIu32, state->store_port);
     ents[10] = "store/ring-ref";
     ents[11] = libxl__sprintf(gc, "%lu", state->store_mfn);
+    ents[12] = "memory/target-tot_pages";
+    ents[13] = libxl__sprintf(gc, "%d", info->tot_memkb);
     for (i = 0; i < info->max_vcpus; i++) {
-        ents[12+(i*2)]   = libxl__sprintf(gc, "cpu/%d/availability", i);
-        ents[12+(i*2)+1] = (i && info->cur_vcpus && !(info->cur_vcpus & (1 << i)))
+        ents[14+(i*2)]   = libxl__sprintf(gc, "cpu/%d/availability", i);
+        ents[14+(i*2)+1] = (i && info->cur_vcpus && !(info->cur_vcpus & (1 << i)))
                             ? "offline" : "online";
     }
 
diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl_memory.txt
--- a/tools/libxl/libxl_memory.txt
+++ b/tools/libxl/libxl_memory.txt
@@ -1,28 +1,28 @@
 /* === Domain memory breakdown: HVM guests ==================================
                            
-             +  +----------+                                     +            
-             |  | shadow   |                                     |            
-             |  +----------+                                     |            
-    overhead |  | extra    |                                     |            
-             |  | external |                                     |            
-             |  +----------+                          +          |            
-             |  | extra    |                          |          |            
-             |  | internal |                          |          |            
-             +  +----------+                +         |          | footprint  
-             |  | video    |                |         |          |            
-             |  +----------+  +    +        |         | xen      |            
-             |  |          |  |    |        | actual  | maximum  |            
-             |  |          |  |    |        | target  |          |            
-             |  | guest    |  |    | build  |         |          |            
-             |  |          |  |    | start  |         |          |            
-      static |  |          |  |    |        |         |          |            
-     maximum |  +----------+  |    +        +         +          +            
-             |  |          |  |                                               
-             |  |          |  |                                               
-             |  | balloon  |  | build                                         
-             |  |          |  | maximum                                       
-             |  |          |  |                                               
-             +  +----------+  +                                               
+             +  +----------+                                                 +            
+             |  | shadow   |                                                 |            
+             |  +----------+                                                 |            
+    overhead |  | extra    |                                                 |            
+             |  | external |                                                 |            
+             |  +----------+                                      +          |            
+             |  | extra    |                                      |          |            
+             |  | internal |                                      |          |            
+             +  +----------+                            +         |          | footprint  
+             |  | video    |                            |         |          |            
+             |  +----------+  +           +    +        |         | xen      |            
+             |  |          |  | guest OS  |    |        | actual  | maximum  |            
+             |  | guest    |  | real RAM  |    |        | target  |          |            
+             |  |          |  |           |    | build  |         |          |            
+             |  +----------+  +           |    | start  +         |          |            
+      static |  | paging   |              |    |                  |          |            
+     maximum |  +----------+              |    +                  +          +            
+             |  |          |              |                                               
+             |  |          |              |                                               
+             |  | balloon  |              | build                                         
+             |  |          |              | maximum                                       
+             |  |          |              |                                               
+             +  +----------+              +                                               
                 
                 
     extra internal = LIBXL_MAXMEM_CONSTANT
@@ -34,6 +34,17 @@
     libxl_domain_setmaxmem -> xen maximum
     libxl_set_memory_target -> actual target
                 
+    build maximum = RAM as seen inside the virtual machine
+                    Guest OS has to configure itself for this amount of memory
+                    Increase/Decrease via memory hotplug of virtual hardware.
+                    xl mem-max
+    build start   = RAM usable by the guest OS
+                    Guest OS sees balloon driver as memory hog
+                    Increase/Decrease via commands to the balloon driver
+                    xl mem-set
+    actual target = RAM allocated for the guest
+                    Increase/Decrease via commands to paging daemon
+                    xl mem-paging_target (?)
                 
  === Domain memory breakdown: PV guests ==================================
                 
diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl_types.idl
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -157,6 +157,7 @@ libxl_domain_build_info = Struct("domain
     ("tsc_mode",        integer),
     ("max_memkb",       uint32),
     ("target_memkb",    uint32),
+    ("tot_memkb",       uint32),
     ("video_memkb",     uint32),
     ("shadow_memkb",    uint32),
     ("disable_migrate", bool),
@@ -174,6 +175,8 @@ libxl_domain_build_info = Struct("domain
                                        ("vpt_align", bool),
                                        ("timer_mode", integer),
                                        ("nested_hvm", bool),
+                                       ("xenpaging_file", string),
+                                       ("xenpaging_extra", libxl_string_list),
                                        ])),
                  ("pv", Struct(None, [("kernel", libxl_file_reference),
                                       ("slack_memkb", uint32),
diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -346,6 +346,7 @@ static void printf_info(int domid,
         printf("\t\t\t(firmware %s)\n", b_info->u.hvm.firmware);
         printf("\t\t\t(video_memkb %d)\n", b_info->video_memkb);
         printf("\t\t\t(shadow_memkb %d)\n", b_info->shadow_memkb);
+        printf("\t\t\t(tot_memkb %d)\n", b_info->tot_memkb);
         printf("\t\t\t(pae %d)\n", b_info->u.hvm.pae);
         printf("\t\t\t(apic %d)\n", b_info->u.hvm.apic);
         printf("\t\t\t(acpi %d)\n", b_info->u.hvm.acpi);
@@ -380,6 +381,7 @@ static void printf_info(int domid,
         printf("\t\t\t(spicedisable_ticketing %d)\n",
                     dm_info->spicedisable_ticketing);
         printf("\t\t\t(spiceagent_mouse %d)\n", dm_info->spiceagent_mouse);
+        printf("\t\t\t(xenpaging_file %s)\n", b_info->u.hvm.xenpaging_file);
         printf("\t\t)\n");
         break;
     case LIBXL_DOMAIN_TYPE_PV:
@@ -515,6 +517,28 @@ static void parse_disk_config(XLU_Config
     parse_disk_config_multistring(config, 1, &spec, disk);
 }
 
+static void parse_xenpaging_extra(const XLU_Config *config, libxl_string_list *xpe)
+{
+    XLU_ConfigList *args;
+    libxl_string_list l;
+    const char *val;
+    int nr_args = 0, i;
+
+    if (xlu_cfg_get_list(config, "xenpaging_extra", &args, &nr_args, 1))
+        return;
+
+    l = xmalloc(sizeof(char*)*(nr_args + 1));
+    if (!l)
+        return;
+
+    l[nr_args] = NULL;
+    for (i = 0; i < nr_args; i++) {
+        val = xlu_cfg_get_listitem(args, i);
+        l[i] = val ? strdup(val) : NULL;
+    }
+    *xpe = l;
+}
+
 static void parse_config_data(const char *configfile_filename_report,
                               const char *configfile_data,
                               int configfile_len,
@@ -620,6 +644,9 @@ static void parse_config_data(const char
     if (!xlu_cfg_get_long (config, "maxmem", &l))
         b_info->max_memkb = l * 1024;
 
+    if (!xlu_cfg_get_long (config, "actmem", &l))
+        b_info->tot_memkb = l * 1024;
+
     if (xlu_cfg_get_string (config, "on_poweroff", &buf))
         buf = "destroy";
     if (!parse_action_on_shutdown(buf, &d_config->on_poweroff)) {
@@ -695,6 +722,10 @@ static void parse_config_data(const char
             b_info->u.hvm.timer_mode = l;
         if (!xlu_cfg_get_long (config, "nestedhvm", &l))
             b_info->u.hvm.nested_hvm = l;
+
+        xlu_cfg_replace_string (config, "xenpaging_file", &b_info->u.hvm.xenpaging_file);
+        parse_xenpaging_extra(config, &b_info->u.hvm.xenpaging_extra);
+
         break;
     case LIBXL_DOMAIN_TYPE_PV:
     {
diff -r a51d4fab351d -r ab5406a5b1d0 tools/xenpaging/xenpaging.c
--- a/tools/xenpaging/xenpaging.c
+++ b/tools/xenpaging/xenpaging.c
@@ -40,6 +40,8 @@
 
 /* Defines number of mfns a guest should use at a time, in KiB */
 #define WATCH_TARGETPAGES "memory/target-tot_pages"
+/* Defines path to startup confirmation */
+#define WATCH_STARTUP "xenpaging/state"
 static char *watch_target_tot_pages;
 static char *dom_path;
 static char watch_token[16];
@@ -772,6 +774,20 @@ static int evict_pages(xenpaging_t *pagi
     return num;
 }
 
+static void xenpaging_confirm_startup(xenpaging_t *paging)
+{
+    xc_interface *xch = paging->xc_handle;
+    char *path;
+    int len;
+
+    len = asprintf(&path, "%s/%s", dom_path, WATCH_STARTUP);
+    if ( len < 0 )
+        return;
+    DPRINTF("confirming startup in %s\n", path);
+    xs_write(paging->xs_handle, XBT_NULL, path, "running", strlen("running"));
+    free(path);
+}
+
 int main(int argc, char *argv[])
 {
     struct sigaction act;
@@ -830,6 +846,9 @@ int main(int argc, char *argv[])
     /* listen for page-in events to stop pager */
     create_page_in_thread(paging);
 
+    /* Confirm startup to caller */
+    xenpaging_confirm_startup(paging);
+
     /* Swap pages in and out */
     while ( 1 )
     {

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-02 14:45 ` [PATCH 4 of 4] xenpaging: initial libxl support Olaf Hering
@ 2011-11-07 11:02   ` Stefano Stabellini
  2011-11-07 12:55     ` Olaf Hering
  0 siblings, 1 reply; 24+ messages in thread
From: Stefano Stabellini @ 2011-11-07 11:02 UTC (permalink / raw)
  To: Olaf Hering; +Cc: Dunlap, xen-devel

On Wed, 2 Nov 2011, Olaf Hering wrote:
> diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl_create.c
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -429,6 +429,122 @@ retry_transaction:
>      return rc;
>  }
> 
> +static int create_xenpaging(libxl__gc *gc, char *dom_name, uint32_t domid,
> +                            libxl_domain_build_info *b_info)
> +{
> +    libxl__spawner_starting *buf_starting;
> +    libxl_string_list xpe = b_info->u.hvm.xenpaging_extra;
> +    int i, rc;
> +    char *logfile;
> +    int logfile_w, null;
> +    char *path, *dom_path, *value;
> +    char **args;
> +    char *xp;
> +    flexarray_t *xp_args;
> +    libxl_ctx *ctx = libxl__gc_owner(gc);
> +
> +    /* Nothing to do */
> +    if (!b_info->tot_memkb)
> +        return 0;

I think that using tot_memkb to store the actual memory target and then
checking whether it is 0 to detect if paging is active/inactive is
confusing.
If tot_memkb is the PoD target of the domain, we should be consistent and
set it equal to target_memkb when paging is inactive.


> @@ -34,6 +34,17 @@
>      libxl_domain_setmaxmem -> xen maximum
>      libxl_set_memory_target -> actual target
> 
> +    build maximum = RAM as seen inside the virtual machine
> +                    Guest OS has to configure itself for this amount of memory
> +                    Increase/Decrease via memory hotplug of virtual hardware.
> +                   xl mem-max
> +    build start   = RAM usable by the guest OS
> +                    Guest OS sees balloon driver as memory hog
> +                    Increase/Decrease via commands to the balloon driver
> +                   xl mem-set
> +    actual target = RAM allocated for the guest
> +                    Increase/Decrease via commands to paging daemon
> +                   xl mem-paging_target (?)

maybe xl mem-paging is specific enough


>   === Domain memory breakdown: PV guests ==================================
> 
> diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl_types.idl
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -157,6 +157,7 @@ libxl_domain_build_info = Struct("domain
>      ("tsc_mode",        integer),
>      ("max_memkb",       uint32),
>      ("target_memkb",    uint32),
> +    ("tot_memkb",       uint32),
>      ("video_memkb",     uint32),
>      ("shadow_memkb",    uint32),
>      ("disable_migrate", bool),

I would like a comment somewhere explaining what tot_memkb is supposed to
represent.

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-07 11:02   ` Stefano Stabellini
@ 2011-11-07 12:55     ` Olaf Hering
  2011-11-07 13:28       ` Stefano Stabellini
  0 siblings, 1 reply; 24+ messages in thread
From: Olaf Hering @ 2011-11-07 12:55 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George Dunlap, xen-devel, Ian Campbell

On Mon, Nov 07, Stefano Stabellini wrote:

> I think that using tot_memkb to store the actual memory target and then
> checking whether is 0 to detect if paging is active/inactive is
> confusing.

tot_memkb is only set when it is specified in the config file, and
perhaps later, once a suitable xl mem-FOO command and a related watch on
the target-tot_pages node are added.

> If tot_memkb is the pod target of the domain, we should be coherent and
> set it equal to target_memkb when paging is inactive.

So far PoD and paging are unrelated and mean different things.
I think the difference between max_memkb and tot_memkb could be the
trigger to start paging.

> >   === Domain memory breakdown: PV guests ==================================
> > 
> > diff -r a51d4fab351d -r ab5406a5b1d0 tools/libxl/libxl_types.idl
> > --- a/tools/libxl/libxl_types.idl
> > +++ b/tools/libxl/libxl_types.idl
> > @@ -157,6 +157,7 @@ libxl_domain_build_info = Struct("domain
> >      ("tsc_mode",        integer),
> >      ("max_memkb",       uint32),
> >      ("target_memkb",    uint32),
> > +    ("tot_memkb",       uint32),
> >      ("video_memkb",     uint32),
> >      ("shadow_memkb",    uint32),
> >      ("disable_migrate", bool),
> 
> I would like a comment somewhere of what tot_memkb is supposed to
> represent.

Yes, sorry, documentation is lacking in that change.

Olaf

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-07 12:55     ` Olaf Hering
@ 2011-11-07 13:28       ` Stefano Stabellini
  2011-11-20 18:29         ` Olaf Hering
  0 siblings, 1 reply; 24+ messages in thread
From: Stefano Stabellini @ 2011-11-07 13:28 UTC (permalink / raw)
  To: Olaf Hering; +Cc: Dunlap, xen-devel

On Mon, 7 Nov 2011, Olaf Hering wrote:
> > If tot_memkb is the pod target of the domain, we should be coherent and
> > set it equal to target_memkb when paging is inactive.
> 
> So far PoD and paging are unrelated and mean different things.
> I think the difference between max_memkb and tot_memkb could be the
> trigger to start paging.

Yes, I think it would be better.

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-07 13:28       ` Stefano Stabellini
@ 2011-11-20 18:29         ` Olaf Hering
  2011-11-21 10:53           ` Stefano Stabellini
  0 siblings, 1 reply; 24+ messages in thread
From: Olaf Hering @ 2011-11-20 18:29 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George Dunlap, xen-devel, Ian Campbell

On Mon, Nov 07, Stefano Stabellini wrote:

> On Mon, 7 Nov 2011, Olaf Hering wrote:
> > > If tot_memkb is the pod target of the domain, we should be coherent and
> > > set it equal to target_memkb when paging is inactive.
> > 
> > So far PoD and paging are unrelated and mean different things.
> > I think the difference between max_memkb and tot_memkb could be the
> > trigger to start paging.
> 
> Yes, I think it would be better.

I have to disagree here.

After looking at the code in parse_config_data(), tot_memkb is only set
if actmem= is listed in the config file. And if actmem= is set, it is the
trigger to run xenpaging and let it work toward the specified number.
So checking for a non-zero tot_memkb in create_xenpaging() looks to me
like the correct way to decide whether xenpaging should be started.
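
As a rough illustration of that rule (not the libxl code itself), the
decision can be sketched in C. The struct below is a hypothetical,
trimmed-down stand-in for libxl_domain_build_info, and set_actmem_mb()
and want_xenpaging() are made-up helper names; only the field names and
the MiB-to-KiB conversion are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the memory fields of libxl_domain_build_info. */
struct build_info_sketch {
    uint32_t max_memkb;     /* from maxmem=, stored in KiB */
    uint32_t target_memkb;  /* from memory=, stored in KiB */
    uint32_t tot_memkb;     /* from actmem=, stored in KiB; 0 if actmem= is absent */
};

/* Like parse_config_data(): the config value is in MiB, the field is in KiB. */
static void set_actmem_mb(struct build_info_sketch *b, long actmem_mb)
{
    b->tot_memkb = (uint32_t)(actmem_mb * 1024);
}

/* Like the early return in create_xenpaging(): zero means "do not start the pager". */
static bool want_xenpaging(const struct build_info_sketch *b)
{
    return b->tot_memkb != 0;
}
```

With this rule, a config without actmem= (or with actmem=0) never starts
the pager, regardless of how memory= and maxmem= relate to each other.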


Olaf

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-20 18:29         ` Olaf Hering
@ 2011-11-21 10:53           ` Stefano Stabellini
  2011-11-21 15:13             ` Olaf Hering
  0 siblings, 1 reply; 24+ messages in thread
From: Stefano Stabellini @ 2011-11-21 10:53 UTC (permalink / raw)
  To: Olaf Hering; +Cc: George Dunlap, xen-devel, Ian Campbell, Stefano Stabellini

On Sun, 20 Nov 2011, Olaf Hering wrote:
> On Mon, Nov 07, Stefano Stabellini wrote:
> 
> > On Mon, 7 Nov 2011, Olaf Hering wrote:
> > > > If tot_memkb is the pod target of the domain, we should be coherent and
> > > > set it equal to target_memkb when paging is inactive.
> > > 
> > > So far PoD and paging are unrelated and mean different things.
> > > I think the difference between max_memkb and tot_memkb could be the
> > > trigger to start paging.
> > 
> > Yes, I think it would be better.
> 
> I have to disagree here.
> 
> After looking at the code in parse_config_data(), tot_memkb is only set
> if actmem= is listed in the configfile. And if actmem= is set, its the
> trigger to run xenpaging and let it work toward the specified number.
> So checking for a non-null tot_memkb in create_xenpaging() looks like
> the correct way to me to decide wether xenpaging should be started.

What if tot_memkb is bigger than target_memkb? Or even bigger than
max_memkb?

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-21 10:53           ` Stefano Stabellini
@ 2011-11-21 15:13             ` Olaf Hering
  2011-11-21 16:40               ` George Dunlap
                                 ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Olaf Hering @ 2011-11-21 15:13 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George Dunlap, xen-devel, Ian Campbell

On Mon, Nov 21, Stefano Stabellini wrote:

> what if tot_memkb is bigger than target_memkb? Or even bigger than
> max_memkb?

tot_memkb is unrelated to target_memkb, and also somewhat unrelated to
max_memkb.

xenpaging will look at the tot_memkb value (at "memory/target-tot_pages",
to be precise) and try to reach that number of domain->tot_pages. If the
tot_memkb number is larger than max_memkb, nothing will happen.


Right now there is not much checking anyway, memory=1024 maxmem=1 in the
config is accepted in my testing.

Olaf

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-21 15:13             ` Olaf Hering
@ 2011-11-21 16:40               ` George Dunlap
  2011-11-22  9:05                 ` Ian Campbell
  2011-11-22 10:58               ` Stefano Stabellini
  2011-11-22 15:48               ` George Dunlap
  2 siblings, 1 reply; 24+ messages in thread
From: George Dunlap @ 2011-11-21 16:40 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

On Mon, Nov 21, 2011 at 3:13 PM, Olaf Hering <olaf@aepfle.de> wrote:
> On Mon, Nov 21, Stefano Stabellini wrote:
>
>> what if tot_memkb is bigger than target_memkb? Or even bigger than
>> max_memkb?
>
> tot_memkb is unrelated to target_memkb, also somewhat unrelated to
> max_memkb.

I'd love to contribute to this discussion, but I don't know what these
different names mean.  I think what we need to talk about is all of
the different memory parameters we need, and then what each of the
individual names means -- what they currently map to, and then what we
want them to map to.  At the very least they should be in a comment
somewhere.

>
> xenpaging will look at tot_memkb value (at "memory/target-tot_pages" to
> be precise) and try to reach that number of domain->tot_pages. If the
> tot_memkb number is larger than max_memkb nothing will happen.
>
>
> Right now there is not much checking anyway, memory=1024 maxmem=1 in the
> config is accepted in my testing.
>
> Olaf

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-21 16:40               ` George Dunlap
@ 2011-11-22  9:05                 ` Ian Campbell
  0 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2011-11-22  9:05 UTC (permalink / raw)
  To: George Dunlap; +Cc: Olaf Hering, xen-devel, Stefano Stabellini

On Mon, 2011-11-21 at 16:40 +0000, George Dunlap wrote:
> On Mon, Nov 21, 2011 at 3:13 PM, Olaf Hering <olaf@aepfle.de> wrote:
> > On Mon, Nov 21, Stefano Stabellini wrote:
> >
> >> what if tot_memkb is bigger than target_memkb? Or even bigger than
> >> max_memkb?
> >
> > tot_memkb is unrelated to target_memkb, also somewhat unrelated to
> > max_memkb.
> 
> I'd love to contribute to this discussion, but I don't know what these
> different names mean.  I think what we need to talk about is all of
> the different memory parameters we need, and then what each of the
> individual names mean -- what they currently map to, and then what we
> want them to map to.  At very least they should be in a comment
> somewhere.

tools/libxl/libxl_memory.txt covers some of that (and Olaf patched it
IIRC) although it is not so clear on the mapping to xl configuration
keys.

Ian.

> 
> >
> > xenpaging will look at tot_memkb value (at "memory/target-tot_pages" to
> > be precise) and try to reach that number of domain->tot_pages. If the
> > tot_memkb number is larger than max_memkb nothing will happen.
> >
> >
> > Right now there is not much checking anyway, memory=1024 maxmem=1 in the
> > config is accepted in my testing.
> >
> > Olaf

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-21 15:13             ` Olaf Hering
  2011-11-21 16:40               ` George Dunlap
@ 2011-11-22 10:58               ` Stefano Stabellini
  2011-11-22 11:22                 ` Olaf Hering
  2011-11-22 15:48               ` George Dunlap
  2 siblings, 1 reply; 24+ messages in thread
From: Stefano Stabellini @ 2011-11-22 10:58 UTC (permalink / raw)
  To: Olaf Hering; +Cc: George Dunlap, xen-devel, Ian Campbell, Stefano Stabellini

On Mon, 21 Nov 2011, Olaf Hering wrote:
> On Mon, Nov 21, Stefano Stabellini wrote:
> 
> > what if tot_memkb is bigger than target_memkb? Or even bigger than
> > max_memkb?
> 
> tot_memkb is unrelated to target_memkb, also somewhat unrelated to
> max_memkb.

At build time ballooning is not active yet and target_memkb represents
the amount of memory available to the VM plus the videoram (see
libxl__build_hvm).
As a consequence I think that tot_memkb cannot be higher than
target_memkb - videoram_memkb (that is build_start in the diagram). 

So, what is going to happen if tot_memkb is higher than target_memkb -
videoram_memkb?
Also, what is going to happen if it is lower?


> xenpaging will look at tot_memkb value (at "memory/target-tot_pages" to
> be precise) and try to reach that number of domain->tot_pages. If the
> tot_memkb number is larger than max_memkb nothing will happen.

How is it going to reach the tot_pages target? Where is it going to take
the memory from? Is it going to automatically page out memory from other
VMs?

 
> Right now there is not much checking anyway, memory=1024 maxmem=1 in the
> config is accepted in my testing.
 
That is a correct configuration: it means that the domain has 1024MB of
RAM but it cannot allocate any more (maximum allocation limit being 1MB).
maxmem doesn't influence the current memory of the VM, only future
allocations.

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-22 10:58               ` Stefano Stabellini
@ 2011-11-22 11:22                 ` Olaf Hering
  0 siblings, 0 replies; 24+ messages in thread
From: Olaf Hering @ 2011-11-22 11:22 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George Dunlap, xen-devel, Ian Campbell

On Tue, Nov 22, Stefano Stabellini wrote:

> On Mon, 21 Nov 2011, Olaf Hering wrote:
> > On Mon, Nov 21, Stefano Stabellini wrote:
> > 
> > > what if tot_memkb is bigger than target_memkb? Or even bigger than
> > > max_memkb?
> > 
> > tot_memkb is unrelated to target_memkb, also somewhat unrelated to
> > max_memkb.
> 
> At build time ballooning is not active yet and target_memkb represents
> the amount of memory available to the VM plus the videoram (see
> libxl__build_hvm).
> As a consequence I think that tot_memkb cannot be higher than
> target_memkb - videoram_memkb (that is build_start in the diagram). 

It can, because with xenpaging target_memkb turns from real memory
into virtual memory, and tot_memkb is the new amount of real memory.

The actual check of whether tot_memkb/target_memkb/max_memkb are sane
can either be done when they are changed with xl mem-XY, as it is done
now, or we add new code to do such checking during config
parsing.
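
Which relations between the three values count as "sane" is still open
in this thread, so the following is only a sketch of what such a check
could report, not a proposal for what to reject. The enum, the flag
names and classify_memkb() are hypothetical; the inputs correspond to
max_memkb, target_memkb and tot_memkb.

```c
#include <stdint.h>

/* Hypothetical flags describing how the three values relate. */
enum mem_relation {
    MEM_TARGET_ABOVE_MAX = 1 << 0,  /* memory= is larger than maxmem= */
    MEM_TOT_ABOVE_MAX    = 1 << 1,  /* actmem= is larger than maxmem= */
    MEM_TOT_ABOVE_TARGET = 1 << 2   /* actmem= is larger than memory= */
};

/* Report the relations that hold; the caller decides which ones are errors. */
static unsigned classify_memkb(uint32_t max_memkb, uint32_t target_memkb,
                               uint32_t tot_memkb)
{
    unsigned flags = 0;

    if (target_memkb > max_memkb)
        flags |= MEM_TARGET_ABOVE_MAX;

    /* tot_memkb == 0 means actmem= was not given: nothing to compare. */
    if (tot_memkb != 0) {
        if (tot_memkb > max_memkb)
            flags |= MEM_TOT_ABOVE_MAX;
        if (tot_memkb > target_memkb)
            flags |= MEM_TOT_ABOVE_TARGET;
    }
    return flags;
}
```

The same function could be called from config parsing and from the
xl mem-XY commands, so both paths would apply one policy.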

> So, what is going to happen if tot_memkb is higher than target_memkb -
> videoram_memkb?

Nothing happens, since xenpaging is the only consumer of that variable
(via xenstore). See below.

> Also, what is going to happen if it is lower?

If it's lower, xenpaging will page out some pages, add them back if the
guest happens to access them, and page out some other pages instead. The
guest still has access to all memory it thinks it has (target_memkb).

> > xenpaging will look at tot_memkb value (at "memory/target-tot_pages" to
> > be precise) and try to reach that number of domain->tot_pages. If the
> > tot_memkb number is larger than max_memkb nothing will happen.
> 
> How is it going to reach the tot_pages target? Where is it going to take
> the memory from? Is it going to automatically page out memory from other
> VMs?

xenpaging does not add new memory. If it has no pages to page in and
the target is still higher than tot_pages, it will do nothing.
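
The behaviour described in these answers can be sketched as a single
decision step. This is an illustration under assumptions, not the
xenpaging source: pager_step(), kib_to_pages() and the 4 KiB page size
are hypothetical, and the real pager works on individual gfns rather
than on plain counts.

```c
#include <stdint.h>

#define PAGE_KIB 4  /* assumption: 4 KiB pages */

/* The xenstore node holds KiB; the pager compares page counts. */
static int64_t kib_to_pages(uint64_t kib)
{
    return (int64_t)(kib / PAGE_KIB);
}

/*
 * Returns > 0: number of pages to page out.
 * Returns < 0: number of pages to page back in (as a negative count).
 * Returns   0: nothing to do.
 */
static int64_t pager_step(int64_t tot_pages, int64_t target_pages,
                          int64_t paged_out)
{
    if (tot_pages > target_pages)
        return tot_pages - target_pages;        /* guest holds too much */

    if (tot_pages < target_pages) {
        /* Only pages evicted earlier can come back; no new memory is added. */
        int64_t want = target_pages - tot_pages;
        int64_t can = want < paged_out ? want : paged_out;
        return -can;
    }
    return 0;
}
```

A target above what the guest can ever hold therefore just leaves the
pager idle once everything it evicted has been paged back in.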

> > Right now there is not much checking anyway, memory=1024 maxmem=1 in the
> > config is accepted in my testing.
>  
> That is a correct configuration: it means that the domain has 1024MB of
> RAM but it cannot allocate any more (maximum allocation limit being 1MB).
> maxmem doesn't influence the current memory of the VM, only future
> allocations.

It causes a stall in the host, perhaps due to an integer overflow (I have
not analyzed it yet).

Olaf

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-21 15:13             ` Olaf Hering
  2011-11-21 16:40               ` George Dunlap
  2011-11-22 10:58               ` Stefano Stabellini
@ 2011-11-22 15:48               ` George Dunlap
  2012-01-09 19:21                 ` Olaf Hering
  2 siblings, 1 reply; 24+ messages in thread
From: George Dunlap @ 2011-11-22 15:48 UTC (permalink / raw)
  To: Olaf Hering; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

On Mon, Nov 21, 2011 at 3:13 PM, Olaf Hering <olaf@aepfle.de> wrote:
> On Mon, Nov 21, Stefano Stabellini wrote:
>
>> what if tot_memkb is bigger than target_memkb? Or even bigger than
>> max_memkb?
>
> tot_memkb is unrelated to target_memkb, also somewhat unrelated to
> max_memkb.

It seems to me the opposite: tot_memkb (as you're describing here) and
target_memkb both mean, "How much Xen memory the administrator wants
allocated to the VM."  Before either paging or PoD, the only way to
modify the amount of memory allocated to a VM was via the balloon
driver.  PoD introduced a mechanism that allows the domain builder to
start a VM with less memory than static_max, and allow the VM to run
until the balloon driver can normalize things.  Paging introduces a
separate mechanism for the administrator to modify the amount of
memory allocated to the VM.

It seems to me like paging and ballooning should both use
target_memkb.  We just need to figure out how to make sure that paging
only comes on when it's needed.  When it might be needed includes:
* For guests that don't have a balloon driver
* For guests whose balloon driver is not meeting target_memkb (either
because it's unresponsive, rebellious, or because it can't get more
memory from the guest OS)
* Potentially, between domain creation and the time the balloon driver
comes up (i.e., replacing PoD).

It seems like having some kind of a flag or setting would be better.
Various factors:
* Do we start the paging daemon?
* Do we use paging during boot?  Only matters if max_memkb !=
target_memkb.  If no, the domain builder uses PoD mode.  If yes, the
domain builder will fill in target_memkb worth of guest memory, and
then fill the rest with swapped-out entries.  (If max_memkb ==
target_memkb, domain builder fills in all entries.)
* When does the paging daemon respond to changes to target_memkb?
This could be:
 - Immediately (assume no balloon driver)
 - PoD mode: Start immediately, but when you notice the balloon driver
reaching the initial target_memkb, turn off, or switch into the next
mode
 - Fallback mode: Pay attention to changes in target_memkb, but don't
act immediately.  Wait for paging_delay secs for the balloon driver to
handle it; if it doesn't respond, then start paging (and perhaps
switch to "Immediately" mode).
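
The three response modes listed above could be modelled roughly as
follows. This is only a sketch of the proposal: the enum, struct
paging_policy and pager_should_act() are hypothetical names,
paging_delay_secs stands for the paging_delay mentioned above, and the
mode switches marked "perhaps" in the text are simply taken as given.

```c
#include <stdbool.h>

enum paging_mode {
    PAGING_IMMEDIATE,  /* assume no balloon driver, act on every change */
    PAGING_POD,        /* act until the balloon driver reaches the target */
    PAGING_FALLBACK    /* give the balloon driver paging_delay secs first */
};

struct paging_policy {
    enum paging_mode mode;
    unsigned paging_delay_secs;
};

/*
 * Decide whether the pager should work toward target_memkb right now.
 * May change p->mode as described in the proposal.
 */
static bool pager_should_act(struct paging_policy *p,
                             bool balloon_reached_target,
                             unsigned secs_since_target_change)
{
    switch (p->mode) {
    case PAGING_IMMEDIATE:
        return true;
    case PAGING_POD:
        if (balloon_reached_target) {
            p->mode = PAGING_FALLBACK;   /* balloon driver is up: step back */
            return false;
        }
        return true;
    case PAGING_FALLBACK:
        if (balloon_reached_target)
            return false;                /* balloon driver handled it */
        if (secs_since_target_change >= p->paging_delay_secs) {
            p->mode = PAGING_IMMEDIATE;  /* balloon driver did not respond */
            return true;
        }
        return false;
    }
    return false;
}
```

Keeping the mode in a small struct like this would let a config option
pick the starting mode while the daemon moves between modes on its own.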

What do you think?

 -George

* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2011-11-22 15:48               ` George Dunlap
@ 2012-01-09 19:21                 ` Olaf Hering
  2012-01-10 12:02                   ` George Dunlap
  0 siblings, 1 reply; 24+ messages in thread
From: Olaf Hering @ 2012-01-09 19:21 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

Let's resume this discussion now to get it sorted out for 4.2.

On Tue, Nov 22, George Dunlap wrote:

> On Mon, Nov 21, 2011 at 3:13 PM, Olaf Hering <olaf@aepfle.de> wrote:
> > On Mon, Nov 21, Stefano Stabellini wrote:
> >
> >> what if tot_memkb is bigger than target_memkb? Or even bigger than
> >> max_memkb?
> >
> > tot_memkb is unrelated to target_memkb, also somewhat unrelated to
> > max_memkb.
> 
> It seems to me the opposite: tot_memkb (as you're describing here) and
> target_memkb both mean, "How much Xen memory the administrator wants
> allocated to the VM."  Before either paging or PoD, the only way to
> modify the amount of memory allocated to a VM was via the balloon
> driver.  PoD introduced a mechanism that allows the domain builder to
> start a VM with less memory than static_max, and allow the VM to run
> until balloon driver can normalize things.    Paging introduces a
> separate mechanism for the administrator to modify the amount of
> memory allocated to the VM.
> 
> It seems to me like paging and ballooning should both use
> target_memkb.  We just need to figure out how to make sure that paging
> only comes on when it's needed.  When it might be needed includes:
> * For guests that don't have a balloon driver
> * For guests whose balloon driver is not meeting target_memkb (either
> because it's unresponsive, rebellious, or because it can't get more
> memory from the guest OS)
> * Potentially, between domain creation and the time the balloon driver
> comes up (i.e., replacing PoD).
> 
> It seems like having some kind of a flag or setting would be better.
> Various factors:
> * Do we start the paging daemon?
> * Do we use paging during boot?  Only matters if max_memkb !=
> target_memkb.  If no, the domain builder uses PoD mode.  If yes, the
> domain builder will fill in target_memkb worth of guest memory, and
> then fill the rest with swapped-out entries.  (If max_memkb ==
> target_memkb, domain builder fills in all entries.)
> * When does the paging daemon respond to changes to target_memkb?
> This could be:
>  - Immediately (assume no balloon driver)
>  - PoD mode: Start immediately, but when you notice the balloon driver
> reaching the initial target_memkb, turn off, or switch into the next
> mode
>  - Fallback mode: Pay attention to changes in target_memkb, but don't
> act immediately.  Wait for paging_delay secs for the balloon driver to
> handle it; if it doesn't respond, then start paging (and perhaps
> switch to "Immediately" mode).
> 
> What do you think?


So there is that maxmem= setting to let the guest OS configure itself
for a given amount of pseudo-physical memory. Then there is a way to cut
down the guest OS memory usage, both with balloon driver in guest and
later with PoD.
Isn't paging a better (or: just different) way to control the memory
usage of a guest OS (it costs disk space in dom0)?
If a guest OS is configured with maxmem=4096, but then restricted with
memory=3072 in the next line, why is maxmem= there in the first place?
Would it be clearer to say: The guest OS has a certain workload which
requires 3072MB. But maybe at some point the guest needs the full
4096MB, then it can access all of it at the cost of some IO due to
swapping in dom0.
I think the balloon driver in the guest is not really needed anymore, it
could just be there and do nothing. If there is physical memory to
release to the host, the pager can do it on behalf of the balloon
driver.

What if the config format is like this:

Do things as they were done until now (PoD + balloon driver):
  memory=3072
  maxmem=4096
  paging=0 (or not specified at all)

Do things with pager instead of balloon driver and/or PoD:
  memory=3072
  maxmem=4096
  paging=1, or xenpaging=1
  xenpaging_extra=[ '-f', '/path/to/pagefile_guestname' ] (optional)

And have mem-set adjust memory/target-tot_pages to tell the pager about the
new target. The builder could create some sort of PoD for a paged guest so
that during startup only the amount of memory= needs to be allocated.
This needs to be implemented; right now a starting guest needs the full
amount of memory until the pager starts to page out pages.

Olaf


* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2012-01-09 19:21                 ` Olaf Hering
@ 2012-01-10 12:02                   ` George Dunlap
  2012-01-11 14:58                     ` Olaf Hering
  0 siblings, 1 reply; 24+ messages in thread
From: George Dunlap @ 2012-01-10 12:02 UTC (permalink / raw)
  To: Olaf Hering; +Cc: George Dunlap, xen-devel, Ian Campbell, Stefano Stabellini

On Mon, 2012-01-09 at 19:21 +0000, Olaf Hering wrote:
> So there is that maxmem= setting to let the guest OS configure itself
> for a given amount of pseudo-physical memory. Then there is a way to cut
> down the guest OS memory usage, both with balloon driver in guest and
> later with PoD.
> Isnt paging a better (or: just different) way to control the memory
> usage of a guest OS (It costs diskspace in dom0)?

On the contrary, hypervisor swapping is definitely *much worse* than
using a balloon driver.  The balloon driver was an innovation developed
specifically to avoid hypervisor swapping if at all possible[1].  We
need hypervisor swapping as a back-stop for situations where the balloon
driver is non-existent, or can't function immediately for some reason
(e.g., we've been using page-sharing to do memory overcommit and
suddenly have a bunch of pages un-shared); but it should always be a
last resort, and would ideally be mitigated by the balloon driver as
soon as possible.

[1] http://www.waldspurger.org/carl/papers/esx-mem-osdi02.pdf

> If a guest OS is configured with maxmem=4096, but then restricted with
> memory=3072 in the next line, why is maxmem= there in the first place?

Because for HVM guests at least, the guest OS will never recognize more
memory than was reported in the e820 map at boot.  So if you boot with
maxmem=3072, the VM will *never* be able to see more than 3072 megabytes
of RAM.  If you want to start a VM with 3072 MiB, but want the
flexibility of allowing the VM to use up to 4096 MiB at some point in
the future, you need to have 4096MiB in the e820 map.

> Would it clearer to say: The guest OS has a certain workload which
> requires 3072MB. But maybe at some point the guest needs the full
> 4096MB, then it can access all of it at the cost of some IO due to
> swapping in dom0.

The very best thing is if the guest does its own swapping.  If its
working set is 4096MiB, but its available memory is only 3072MiB, it's
better to tell the guest it only has 3072MiB to work with, so it can do
the swapping optimally.

> I think the balloon driver in the guest is not really needed anymore, it
> could just be there and do nothing. IF there is physical memory to
> release to the host, the pager can do it on behalf of the balloon
> driver.

Hopefully it's clear that I disagree with this completely.

> What if the config format is like this:
> 
> Do things as they were done until now (PoD + balloon driver):
>   memory=3072
>   maxmem=4096
>   paging=0 (or not specified at all)
> 
> Do things with pager instead of balloon driver and/or PoD:
>   memory=3072
>   maxmem=4096
>   paging=1, or xenpaging=1
>   xenpaging_extra=[ '-f', '/path/to/pagefile_guestname' ] (optional)

Except that this makes paging and ballooning mutually exclusive.  What
we want is to make them work together -- to have paging as a back-up
when ballooning fails (or isn't fast enough).

We'd also like to experiment with having a special-case of paging
replace PoD; in that case, we need to start with this special-case
paging and then transition into ballooning.

It may be that we don't have time to make them work together before the
4.2 release; in that case, we may need to make them mutually exclusive
for that release, to be fixed up in 4.3.  But if we can make them work
together by 4.2, that would be the best; and in any case, we need to
make sure we're planning for them to work together, and minimize the
interface changes when we do.

> The builder could create some sort PoD for a paged guest so
> that during startup only the amount of memory= needs to be allocated.
> This needs to be implemented, right now a starting guest needs the full
> amount of memory until the pager starts to page-out pages.

Yes, the builder needs to be able to start a guest with pages pre-paged
out, for the same reason we introduced PoD: that is, if you page a guest
from 4096MiB down to 3072MiB, and then reboot the guest, you may only
have 3072MiB available.  So if you want maxmem=4096 still, you need to
start with some pages "pre-paged" out.  We need that mechanism for
robustness anyway; we can then experiment with using it to replace PoD.
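
As a rough illustration of the numbers above, the amount of guest memory
the builder would have to start "pre-paged" out can be worked out as
follows. This is only a sketch: the 4 KiB page size is an assumption (the
usual x86 value), and prepaged_pages() is a hypothetical helper, not part
of any existing tool.

```python
PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, as is usual on x86


def prepaged_pages(maxmem_mib, available_mib):
    """Pages that must start paged out so a guest with maxmem_mib can boot
    when only available_mib of host memory can be allocated to it.

    Hypothetical helper for illustration only.
    """
    if available_mib >= maxmem_mib:
        # Everything fits; nothing needs to start paged out.
        return 0
    missing_kib = (maxmem_mib - available_mib) * 1024
    return missing_kib // PAGE_SIZE_KIB


# The case from the text: maxmem=4096, but only 3072 MiB available.
print(prepaged_pages(4096, 3072))  # 262144
```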

 -George


* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2012-01-10 12:02                   ` George Dunlap
@ 2012-01-11 14:58                     ` Olaf Hering
  2012-01-11 16:10                       ` Tim Deegan
  0 siblings, 1 reply; 24+ messages in thread
From: Olaf Hering @ 2012-01-11 14:58 UTC (permalink / raw)
  To: George Dunlap; +Cc: George Dunlap, xen-devel, Ian Campbell, Stefano Stabellini

On Tue, Jan 10, George Dunlap wrote:

> On Mon, 2012-01-09 at 19:21 +0000, Olaf Hering wrote:
> > So there is that maxmem= setting to let the guest OS configure itself
> > for a given amount of pseudo-physical memory. Then there is a way to cut
> > down the guest OS memory usage, both with balloon driver in guest and
> > later with PoD.
> > Isnt paging a better (or: just different) way to control the memory
> > usage of a guest OS (It costs diskspace in dom0)?
> 
> On the contrary, hypervisor swapping is definitely *much worse* than
> using a balloon driver.  The balloon driver was an innovation developed
> specifically to avoid hypervisor swapping if at all possible[1].  We
> need hypervisor swapping as a back-stop for situations where the balloon
> driver is non-existent, or can't function immediately for some reason
> (e.g., we've been using page-sharing to do memory overcommit and
> suddenly have a bunch of pages un-shared); but it should always be a
> last resort, and would ideally be mitigated by the balloon driver as
> soon as possible.

Isn't that up to the host admin to decide where to take the memory from?
So if it's acceptable to swap parts of a VM (independent of what the
guest OS thinks it has), so be it.

We just need the right knobs.

So far we have two knobs:
maxmem=  xl mem-max
memory=  xl mem-set  (and guest OS balloon driver via sysfs)

Another knob for paging is needed. A while ago you proposed two new
commands: mem-balloon_target and mem-swap_target. Perhaps these terms
should be used also in the config file to set the initial memory/target
and memory/target-tot_pages values. If the latter is set, start the
pager. And if the mem-swap_target command is called, start the pager if
it isn't already running.

At some point we will have the code ready so that PoD and paging can
coexist, so that the guest's memory usage can grow on the guest's demand
as it does now. This is just a detail, independent of config options and
commands.


To summarize:

maxmem=  ; xl mem-max
memory=  mem-balloon_target= ; xl mem-balloon_target, xl mem-set
mem-swap_target= ; xl mem-swap_target

The rule could be like this:
mem-swap_target <= mem-balloon_target <= mem-max
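
A minimal sketch of how a toolstack might check this rule before applying
new values. The function name and the convention that swap_target=None
means "no paging requested" are assumptions made for illustration; all
three values are expected to be in the same unit.

```python
def targets_are_consistent(mem_max, balloon_target, swap_target=None):
    """Check the proposed rule: swap_target <= balloon_target <= mem_max.

    swap_target=None is taken to mean that paging is not requested, in
    which case only balloon_target <= mem_max is checked.
    Hypothetical helper for illustration only.
    """
    if balloon_target > mem_max:
        return False
    if swap_target is not None and swap_target > balloon_target:
        return False
    return True


print(targets_are_consistent(4096, 3072, 2048))  # True
print(targets_are_consistent(4096, 3072, 3500))  # False
```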


> > What if the config format is like this:
> > 
> > Do things as they were done until now (PoD + balloon driver):
> >   memory=3072
> >   maxmem=4096
> >   paging=0 (or not specified at all)
> > 
> > Do things with pager instead of balloon driver and/or PoD:
> >   memory=3072
> >   maxmem=4096
> >   paging=1, or xenpaging=1
> >   xenpaging_extra=[ '-f', '/path/to/pagefile_guestname' ] (optional)
> 
> Except that this makes paging and ballooning mutually exclusive.  What
> we want is to make them work together -- to have paging as a back-up
> when ballooning fails (or isn't fast enough).

Ballooning in the guest will still work. For example via sysfs, the
guest driver can release pages any time it wants to. But with the above
knobs the balloon driver can still be tweaked from the host.


Olaf


* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2012-01-11 14:58                     ` Olaf Hering
@ 2012-01-11 16:10                       ` Tim Deegan
  2012-01-11 16:38                         ` Olaf Hering
  0 siblings, 1 reply; 24+ messages in thread
From: Tim Deegan @ 2012-01-11 16:10 UTC (permalink / raw)
  To: Olaf Hering
  Cc: George Dunlap, xen-devel, George Dunlap, Stefano Stabellini,
	Ian Campbell

At 15:58 +0100 on 11 Jan (1326297501), Olaf Hering wrote:
> On Tue, Jan 10, George Dunlap wrote:
> 
> > On Mon, 2012-01-09 at 19:21 +0000, Olaf Hering wrote:
> > > So there is that maxmem= setting to let the guest OS configure itself
> > > for a given amount of pseudo-physical memory. Then there is a way to cut
> > > down the guest OS memory usage, both with balloon driver in guest and
> > > later with PoD.
> > > Isnt paging a better (or: just different) way to control the memory
> > > usage of a guest OS (It costs diskspace in dom0)?
> > 
> > On the contrary, hypervisor swapping is definitely *much worse* than
> > using a balloon driver.  The balloon driver was an innovation developed
> > specifically to avoid hypervisor swapping if at all possible[1].  We
> > need hypervisor swapping as a back-stop for situations where the balloon
> > driver is non-existent, or can't function immediately for some reason
> > (e.g., we've been using page-sharing to do memory overcommit and
> > suddenly have a bunch of pages un-shared); but it should always be a
> > last resort, and would ideally be mitigated by the balloon driver as
> > soon as possible.
> 
> Isnt that up to the host admin to decide where to take the memory from?
> So if its acceptable to swap parts of a VM (independent from what the
> guest OS thinks it has), so be it.

Why?  The _only_ reason I can imagine for wanting to use paging is when
the balloon driver can't or won't do its job.  There's no advantage to
paging except that you can always force it to happen.

I think it makes sense to have two separate targets at the libxl level
(one for the balloon driver and one for the external pager/PoD), but at the
xl level (i.e. in config files and commands) there should be only one
target for memory-actually-in-use-by-the-guest and xl should DTRT to
achieve it.  This interface is already baffling enough. :)

Tim.


* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2012-01-11 16:10                       ` Tim Deegan
@ 2012-01-11 16:38                         ` Olaf Hering
  2012-01-11 16:58                           ` Tim Deegan
  0 siblings, 1 reply; 24+ messages in thread
From: Olaf Hering @ 2012-01-11 16:38 UTC (permalink / raw)
  To: Tim Deegan
  Cc: George Dunlap, xen-devel, George Dunlap, Stefano Stabellini,
	Ian Campbell

On Wed, Jan 11, Tim Deegan wrote:

> > Isnt that up to the host admin to decide where to take the memory from?
> > So if its acceptable to swap parts of a VM (independent from what the
> > guest OS thinks it has), so be it.
> 
> Why?  The _only_ reason I can imagine for wanting to use paging is when
> the balloon driver can't or won't do its job.  There's no advantage to
> paging except that you can always force it to happen.

Isn't that the whole point of paging, to make it happen at will without
the guest (or the application at process level) noticing it?

Olaf


* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2012-01-11 16:38                         ` Olaf Hering
@ 2012-01-11 16:58                           ` Tim Deegan
  2012-01-12 14:12                             ` Olaf Hering
  0 siblings, 1 reply; 24+ messages in thread
From: Tim Deegan @ 2012-01-11 16:58 UTC (permalink / raw)
  To: Olaf Hering
  Cc: George Dunlap, xen-devel, George Dunlap, Stefano Stabellini,
	Ian Campbell

At 17:38 +0100 on 11 Jan (1326303483), Olaf Hering wrote:
> On Wed, Jan 11, Tim Deegan wrote:
> 
> > > Isnt that up to the host admin to decide where to take the memory from?
> > > So if its acceptable to swap parts of a VM (independent from what the
> > > guest OS thinks it has), so be it.
> > 
> > Why?  The _only_ reason I can imagine for wanting to use paging is when
> > the balloon driver can't or won't do its job.  There's no advantage to
> > paging except that you can always force it to happen.
> 
> Isnt that the whole point of paging, to make it happen at will without
> the guest (or the application at process level) noticing it?

Yes, but that's a _bad_ thing. :)  If the guest can co-operate, you'll
get way better eviction choices, better performance, and better
accounting (since the I/O is done by the guest to guest-owned disk).

That's why I think both mechanisms should be visible up to the libxl
layer, but xl itself should just implement the one sensible policy:
try ballooning first, then page if that fails.  

Tim.


* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2012-01-11 16:58                           ` Tim Deegan
@ 2012-01-12 14:12                             ` Olaf Hering
  2012-01-13 11:00                               ` Ian Campbell
  0 siblings, 1 reply; 24+ messages in thread
From: Olaf Hering @ 2012-01-12 14:12 UTC (permalink / raw)
  To: Tim Deegan
  Cc: George Dunlap, xen-devel, George Dunlap, Stefano Stabellini,
	Ian Campbell

On Wed, Jan 11, Tim Deegan wrote:

> At 17:38 +0100 on 11 Jan (1326303483), Olaf Hering wrote:
> > On Wed, Jan 11, Tim Deegan wrote:
> > 
> > > > Isnt that up to the host admin to decide where to take the memory from?
> > > > So if its acceptable to swap parts of a VM (independent from what the
> > > > guest OS thinks it has), so be it.
> > > 
> > > Why?  The _only_ reason I can imagine for wanting to use paging is when
> > > the balloon driver can't or won't do its job.  There's no advantage to
> > > paging except that you can always force it to happen.
> > 
> > Isnt that the whole point of paging, to make it happen at will without
> > the guest (or the application at process level) noticing it?
> 
> Yes, but that's a _bad_ thing. :)  If the guest can co-operate, you'll
> get way better eviction choices, better performance, and better
> accounting (since the I/O is done by the guest to guest-owned disk).

Hmm, I think it's slightly like an 'rm -rf *' accident: bad, but allowed.

> That's why I think both mechanisms should be visible up to the libxl
> layer, but xl itself should just implement the one sensible policy:
> try ballooning first, then page if that fails.  

So you are saying xl should take care of an improved mem-set command?
Perhaps by tweaking memory/target first, monitoring something like
tot_pages, and if memory/target isn't reached after some time, tweak
memory/target-tot_pages so that xenpaging takes care of the rest?

Olaf


* Re: [PATCH 4 of 4] xenpaging: initial libxl support
  2012-01-12 14:12                             ` Olaf Hering
@ 2012-01-13 11:00                               ` Ian Campbell
  0 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2012-01-13 11:00 UTC (permalink / raw)
  To: Olaf Hering; +Cc: George Dunlap, xen-devel, Tim (Xen.org), Stefano Stabellini

On Thu, 2012-01-12 at 14:12 +0000, Olaf Hering wrote:
> On Wed, Jan 11, Tim Deegan wrote:
> 
> > At 17:38 +0100 on 11 Jan (1326303483), Olaf Hering wrote:
> > > On Wed, Jan 11, Tim Deegan wrote:
> > > 
> > > > > Isnt that up to the host admin to decide where to take the memory from?
> > > > > So if its acceptable to swap parts of a VM (independent from what the
> > > > > guest OS thinks it has), so be it.
> > > > 
> > > > Why?  The _only_ reason I can imagine for wanting to use paging is when
> > > > the balloon driver can't or won't do its job.  There's no advantage to
> > > > paging except that you can always force it to happen.
> > > 
> > > Isnt that the whole point of paging, to make it happen at will without
> > > the guest (or the application at process level) noticing it?
> > 
> > Yes, but that's a _bad_ thing. :)  If the guest can co-operate, you'll
> > get way better eviction choices, better performance, and better
> > accounting (since the I/O is done by the guest to guest-owned disk).
> 
> Hmm, I think its slightly like an 'rm -rf *' accident: bad, but allowed.

Your analogy is bogus: the "support" for "rm -rf *" comes legitimately
(even if unfortunately) from the combined semantics of the shell and rm
and just falls out from the normal use cases.

In the case of paging we would have to add explicit support for doing
something which we think has no purpose. I think it is OK for libxl to
offer the flexibility to toolstack authors to do this however they want
but xl should only expose a single "target" value.

> > That's why I think both mechanisms should be visible up to the libxl
> > layer, but xl itself should just implement the one sensible policy:
> > try ballooning first, then page if that fails.  
> 
> So you are saying xl should take care of an improved mem-set command?
> Perhaps by tweaking memory/target first, monitoring something like
> tot_pages, and if memory/target isnt reached after some time, tweak
> memory/target-tot_pages so that xenpaging takes care of the rest?

Yes, although there is no need for monitoring: it can just set
memory/target, wait, then set memory/target-tot_pages. If the balloon driver
has caught up then setting target-tot_pages will be a nop but it still
correctly reflects the desired state of the system and so we should set
it.
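
A minimal sketch of that sequence, for illustration only. The write_key
callback stands in for whatever xenstore write the toolstack uses and is
hypothetical, as are the function name and the default delay. Writing the
same value to both keys follows the earlier suggestion in this thread and
assumes both keys take the same unit.

```python
import time


def mem_set(write_key, target, delay_s=10.0, sleep=time.sleep):
    """Set the balloon target, wait, then set the paging target.

    write_key(path, value) is a hypothetical callback that writes one
    key below the guest's xenstore directory. No monitoring is done: the
    second write happens unconditionally after the delay.
    """
    # 1. Ask the balloon driver in the guest to reach the target.
    write_key("memory/target", str(target))
    # 2. Give the balloon driver some time to respond.
    sleep(delay_s)
    # 3. Tell the pager the same target. If the balloon driver already
    #    caught up, the pager has nothing left to do.
    write_key("memory/target-tot_pages", str(target))


# Usage with stand-ins that just record what would happen:
log = []
mem_set(lambda path, value: log.append((path, value)), 3145728,
        delay_s=5.0, sleep=lambda s: log.append(("sleep", s)))
for entry in log:
    print(entry)
```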

Another alternative would be for the pager to add some hysteresis after
it observes a change in the target before it starts "implementing" it.
This would allow the toolstack to just set things in one shot. I'm not sure
that this is better though -- it makes things a little less flexible for
the toolstack and encodes policy in the pager. 
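
To make the hysteresis idea concrete, here is a rough sketch of a pager
that only starts acting on a new target after it has been pending for a
while. The class, its method names and the injectable clock are all
assumptions for illustration; xenpaging does not work this way today.

```python
import time


class DelayedTarget:
    """Track a target value, adopting changes only after delay_s seconds.

    Hypothetical sketch of hysteresis inside the pager.
    """

    def __init__(self, initial, delay_s, clock=time.monotonic):
        self.effective = initial   # target the pager currently works toward
        self._pending = None       # newly observed target, not yet adopted
        self._seen_at = None       # when the pending target was first seen
        self._delay_s = delay_s
        self._clock = clock

    def observe(self, new_target):
        """Record a target change seen in xenstore."""
        if new_target == self.effective:
            # Changed back before the delay expired: nothing is pending.
            self._pending = None
        elif new_target != self._pending:
            self._pending = new_target
            self._seen_at = self._clock()

    def current(self):
        """Return the target the pager should implement right now."""
        if (self._pending is not None
                and self._clock() - self._seen_at >= self._delay_s):
            self.effective = self._pending
            self._pending = None
        return self.effective
```

The pager would call observe() from its xenstore watch and current() in its
main loop, so the balloon driver gets delay_s seconds to reach the new
target before paging begins.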

Ian.


end of thread, other threads:[~2012-01-13 11:00 UTC | newest]

Thread overview: 24+ messages
2011-11-02 14:45 [PATCH 0 of 4] libxl: initial support for xenpaging Olaf Hering
2011-11-02 14:45 ` [PATCH 1 of 4] xenpaging: use guests tot_pages as working target Olaf Hering
2011-11-02 14:45 ` [PATCH 2 of 4] xenpaging: watch the guests memory/target-tot_pages xenstore value Olaf Hering
2011-11-02 14:45 ` [PATCH 3 of 4] xenpaging: add cmdline interface for pager Olaf Hering
2011-11-02 14:45 ` [PATCH 4 of 4] xenpaging: initial libxl support Olaf Hering
2011-11-07 11:02   ` Stefano Stabellini
2011-11-07 12:55     ` Olaf Hering
2011-11-07 13:28       ` Stefano Stabellini
2011-11-20 18:29         ` Olaf Hering
2011-11-21 10:53           ` Stefano Stabellini
2011-11-21 15:13             ` Olaf Hering
2011-11-21 16:40               ` George Dunlap
2011-11-22  9:05                 ` Ian Campbell
2011-11-22 10:58               ` Stefano Stabellini
2011-11-22 11:22                 ` Olaf Hering
2011-11-22 15:48               ` George Dunlap
2012-01-09 19:21                 ` Olaf Hering
2012-01-10 12:02                   ` George Dunlap
2012-01-11 14:58                     ` Olaf Hering
2012-01-11 16:10                       ` Tim Deegan
2012-01-11 16:38                         ` Olaf Hering
2012-01-11 16:58                           ` Tim Deegan
2012-01-12 14:12                             ` Olaf Hering
2012-01-13 11:00                               ` Ian Campbell
