* [PATCH v2 0/3] Remove tmem
@ 2018-11-28 13:58 Wei Liu
  2018-11-28 13:58 ` [PATCH v2 1/3] tools: remove tmem code and commands Wei Liu
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Wei Liu @ 2018-11-28 13:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Jan Beulich

It is agreed that tmem can be removed from xen.git. See the thread starting
from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.

In this version:

1. Remove some residuals from previous version and fix all build errors
   discovered by Gitlab CI.
2. Swap the order of patches to make sure bisection still works. This
   is verified by calling
      `./automation/scripts/build-test.sh origin/staging HEAD`
3. Make sure Xen still boots and passes all XTF tests after the removal.
4. Keep public/tmem.h.
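The per-commit build check in item 2 can be sketched as a small shell loop. This is a hypothetical simplification; the real `automation/scripts/build-test.sh` in xen.git does more, and `make -s` here stands in for whatever build command the tree actually uses:

```shell
# Hypothetical sketch of a per-commit build test, in the spirit of
# automation/scripts/build-test.sh: build every commit in base..tip so
# that a later "git bisect" never lands on a commit that fails to build.
build_each_commit() {
    base=$1
    tip=$2
    for rev in $(git rev-list --reverse "$base".."$tip"); do
        git checkout -q "$rev" || return 1
        # "make -s" is a placeholder; substitute the tree's real build
        # invocation as needed.
        make -s || { echo "build broken at $rev" >&2; return 1; }
    done
    git checkout -q "$tip"
}
```

Running `build_each_commit origin/staging HEAD` would then mirror the invocation quoted above.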

Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

Wei Liu (3):
  tools: remove tmem code and commands
  xen: remove tmem from hypervisor
  docs: remove tmem related text

 docs/man/xl.conf.pod.5                       |    9 +-
 docs/man/xl.pod.1.in                         |   68 -
 docs/misc/tmem-internals.html                |  789 ----------
 docs/misc/xen-command-line.markdown          |    6 -
 docs/misc/xsm-flask.txt                      |   36 -
 tools/flask/policy/modules/dom0.te           |    4 +-
 tools/flask/policy/modules/guest_features.te |    3 -
 tools/libxc/Makefile                         |    1 -
 tools/libxc/include/xenctrl.h                |   17 -
 tools/libxc/xc_tmem.c                        |  507 -------
 tools/libxl/libxl_tmem.c                     |  119 +-
 tools/misc/Makefile                          |    1 -
 tools/misc/xen-tmem-list-parse.c             |  339 -----
 tools/python/xen/lowlevel/xc/xc.c            |   87 --
 tools/xenstat/libxenstat/src/xenstat.c       |   53 +-
 tools/xenstat/libxenstat/src/xenstat.h       |   15 -
 tools/xenstat/libxenstat/src/xenstat_priv.h  |    8 -
 tools/xenstat/xentop/xentop.c                |   36 +-
 tools/xl/Makefile                            |    2 +-
 tools/xl/xl.h                                |    6 -
 tools/xl/xl_cmdtable.c                       |   40 -
 tools/xl/xl_tmem.c                           |  251 ---
 xen/arch/arm/configs/tiny64.conf             |    1 -
 xen/arch/x86/configs/pvshim_defconfig        |    1 -
 xen/arch/x86/guest/hypercall_page.S          |    1 -
 xen/arch/x86/hvm/hypercall.c                 |    3 -
 xen/arch/x86/hypercall.c                     |    1 -
 xen/arch/x86/pv/hypercall.c                  |    3 -
 xen/arch/x86/setup.c                         |    8 -
 xen/common/Kconfig                           |   13 -
 xen/common/Makefile                          |    4 -
 xen/common/compat/tmem_xen.c                 |   23 -
 xen/common/domain.c                          |    3 -
 xen/common/memory.c                          |    5 +-
 xen/common/page_alloc.c                      |   40 +-
 xen/common/sysctl.c                          |    5 -
 xen/common/tmem.c                            | 2095 --------------------------
 xen/common/tmem_control.c                    |  560 -------
 xen/common/tmem_xen.c                        |  277 ----
 xen/include/Makefile                         |    1 -
 xen/include/public/sysctl.h                  |  108 +-
 xen/include/public/tmem.h                    |   14 +-
 xen/include/xen/hypercall.h                  |    7 -
 xen/include/xen/mm.h                         |    2 +
 xen/include/xen/sched.h                      |    3 -
 xen/include/xen/tmem.h                       |   45 -
 xen/include/xen/tmem_control.h               |   39 -
 xen/include/xen/tmem_xen.h                   |  343 -----
 xen/include/xlat.lst                         |    2 -
 xen/include/xsm/dummy.h                      |    6 -
 xen/include/xsm/xsm.h                        |    6 -
 xen/xsm/dummy.c                              |    1 -
 xen/xsm/flask/hooks.c                        |    9 -
 xen/xsm/flask/policy/access_vectors          |    4 -
 54 files changed, 36 insertions(+), 5994 deletions(-)
 delete mode 100644 docs/misc/tmem-internals.html
 delete mode 100644 tools/libxc/xc_tmem.c
 delete mode 100644 tools/misc/xen-tmem-list-parse.c
 delete mode 100644 tools/xl/xl_tmem.c
 delete mode 100644 xen/common/compat/tmem_xen.c
 delete mode 100644 xen/common/tmem.c
 delete mode 100644 xen/common/tmem_control.c
 delete mode 100644 xen/common/tmem_xen.c
 delete mode 100644 xen/include/xen/tmem.h
 delete mode 100644 xen/include/xen/tmem_control.h
 delete mode 100644 xen/include/xen/tmem_xen.h

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v2 1/3] tools: remove tmem code and commands
  2018-11-28 13:58 [PATCH v2 0/3] Remove tmem Wei Liu
@ 2018-11-28 13:58 ` Wei Liu
  2018-11-30 17:10   ` Ian Jackson
  2018-11-28 13:58 ` [PATCH v2 2/3] xen: remove tmem from hypervisor Wei Liu
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Wei Liu @ 2018-11-28 13:58 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Marek Marczykowski-Górecki

Remove all tmem related code in libxc.

Leave some stubs in libxl in case anyone has linked against those
functions before the removal.

Remove all tmem related commands in xl, as well as all tmem related code
in the other utilities we ship.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/Makefile                        |   1 -
 tools/libxc/include/xenctrl.h               |  17 -
 tools/libxc/xc_tmem.c                       | 507 ----------------------------
 tools/libxl/libxl_tmem.c                    | 119 +------
 tools/misc/Makefile                         |   1 -
 tools/misc/xen-tmem-list-parse.c            | 339 -------------------
 tools/python/xen/lowlevel/xc/xc.c           |  87 -----
 tools/xenstat/libxenstat/src/xenstat.c      |  53 +--
 tools/xenstat/libxenstat/src/xenstat.h      |  15 -
 tools/xenstat/libxenstat/src/xenstat_priv.h |   8 -
 tools/xenstat/xentop/xentop.c               |  36 +-
 tools/xl/Makefile                           |   2 +-
 tools/xl/xl.h                               |   6 -
 tools/xl/xl_cmdtable.c                      |  40 ---
 tools/xl/xl_tmem.c                          | 251 --------------
 15 files changed, 18 insertions(+), 1464 deletions(-)
 delete mode 100644 tools/libxc/xc_tmem.c
 delete mode 100644 tools/misc/xen-tmem-list-parse.c
 delete mode 100644 tools/xl/xl_tmem.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 44d9d09d4e..1546afd168 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -30,7 +30,6 @@ CTRL_SRCS-y       += xc_tbuf.c
 CTRL_SRCS-y       += xc_pm.c
 CTRL_SRCS-y       += xc_cpu_hotplug.c
 CTRL_SRCS-y       += xc_resume.c
-CTRL_SRCS-y       += xc_tmem.c
 CTRL_SRCS-y       += xc_vm_event.c
 CTRL_SRCS-y       += xc_monitor.c
 CTRL_SRCS-y       += xc_mem_paging.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 97ae965be7..8334dc5750 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -44,7 +44,6 @@
 #include <xen/hvm/dm_op.h>
 #include <xen/hvm/params.h>
 #include <xen/xsm/flask_op.h>
-#include <xen/tmem.h>
 #include <xen/kexec.h>
 #include <xen/platform.h>
 
@@ -1907,22 +1906,6 @@ int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value);
 
 int xc_enable_turbo(xc_interface *xch, int cpuid);
 int xc_disable_turbo(xc_interface *xch, int cpuid);
-/**
- * tmem operations
- */
-
-int xc_tmem_control_oid(xc_interface *xch, int32_t pool_id, uint32_t subop,
-                        uint32_t cli_id, uint32_t len, uint32_t arg,
-                        struct xen_tmem_oid oid, void *buf);
-int xc_tmem_control(xc_interface *xch,
-                    int32_t pool_id, uint32_t subop, uint32_t cli_id,
-                    uint32_t len, uint32_t arg, void *buf);
-int xc_tmem_auth(xc_interface *xch, int cli_id, char *uuid_str, int enable);
-int xc_tmem_save(xc_interface *xch, uint32_t domid, int live, int fd, int field_marker);
-int xc_tmem_save_extra(xc_interface *xch, uint32_t domid, int fd, int field_marker);
-void xc_tmem_save_done(xc_interface *xch, uint32_t domid);
-int xc_tmem_restore(xc_interface *xch, uint32_t domid, int fd);
-int xc_tmem_restore_extra(xc_interface *xch, uint32_t domid, int fd);
 
 /**
  * altp2m operations
diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c
deleted file mode 100644
index a365c74388..0000000000
--- a/tools/libxc/xc_tmem.c
+++ /dev/null
@@ -1,507 +0,0 @@
-/******************************************************************************
- * xc_tmem.c
- *
- * Copyright (C) 2008 Oracle Corp.
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation;
- * version 2.1 of the License.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "xc_private.h"
-#include <inttypes.h>
-#include <assert.h>
-#include <xen/tmem.h>
-
-int xc_tmem_control(xc_interface *xch,
-                    int32_t pool_id,
-                    uint32_t cmd,
-                    uint32_t cli_id,
-                    uint32_t len,
-                    uint32_t arg,
-                    void *buf)
-{
-    DECLARE_SYSCTL;
-    DECLARE_HYPERCALL_BOUNCE(buf, len, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
-    int rc;
-
-    sysctl.cmd = XEN_SYSCTL_tmem_op;
-    sysctl.u.tmem_op.pool_id = pool_id;
-    sysctl.u.tmem_op.cmd = cmd;
-    sysctl.u.tmem_op.cli_id = cli_id;
-    sysctl.u.tmem_op.len = len;
-    sysctl.u.tmem_op.arg = arg;
-    sysctl.u.tmem_op.pad = 0;
-    sysctl.u.tmem_op.oid.oid[0] = 0;
-    sysctl.u.tmem_op.oid.oid[1] = 0;
-    sysctl.u.tmem_op.oid.oid[2] = 0;
-
-    if ( cmd == XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO ||
-         cmd == XEN_SYSCTL_TMEM_OP_SET_AUTH )
-        HYPERCALL_BOUNCE_SET_DIR(buf, XC_HYPERCALL_BUFFER_BOUNCE_IN);
-    if ( len )
-    {
-        if ( buf == NULL )
-        {
-            errno = EINVAL;
-            return -1;
-        }
-        if ( xc_hypercall_bounce_pre(xch, buf) )
-        {
-            PERROR("Could not bounce buffer for tmem control hypercall");
-            return -1;
-        }
-    }
-
-    set_xen_guest_handle(sysctl.u.tmem_op.u.buf, buf);
-
-    rc = do_sysctl(xch, &sysctl);
-
-    if ( len )
-        xc_hypercall_bounce_post(xch, buf);
-
-    return rc;
-}
-
-int xc_tmem_control_oid(xc_interface *xch,
-                        int32_t pool_id,
-                        uint32_t cmd,
-                        uint32_t cli_id,
-                        uint32_t len,
-                        uint32_t arg,
-                        struct xen_tmem_oid oid,
-                        void *buf)
-{
-    DECLARE_SYSCTL;
-    DECLARE_HYPERCALL_BOUNCE(buf, len, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
-    int rc;
-
-    sysctl.cmd = XEN_SYSCTL_tmem_op;
-    sysctl.u.tmem_op.pool_id = pool_id;
-    sysctl.u.tmem_op.cmd = cmd;
-    sysctl.u.tmem_op.cli_id = cli_id;
-    sysctl.u.tmem_op.len = len;
-    sysctl.u.tmem_op.arg = arg;
-    sysctl.u.tmem_op.pad = 0;
-    sysctl.u.tmem_op.oid = oid;
-
-    if ( len  )
-    {
-        if ( buf == NULL )
-        {
-            errno = EINVAL;
-            return -1;
-        }
-        if ( xc_hypercall_bounce_pre(xch, buf) )
-        {
-            PERROR("Could not bounce buffer for tmem control (OID) hypercall");
-            return -1;
-        }
-    }
-
-    set_xen_guest_handle(sysctl.u.tmem_op.u.buf, buf);
-
-    rc = do_sysctl(xch, &sysctl);
-
-    if ( len )
-        xc_hypercall_bounce_post(xch, buf);
-
-    return rc;
-}
-
-static int xc_tmem_uuid_parse(char *uuid_str, uint64_t *uuid_lo, uint64_t *uuid_hi)
-{
-    char *p = uuid_str;
-    uint64_t *x = uuid_hi;
-    int i = 0, digit;
-
-    *uuid_lo = 0; *uuid_hi = 0;
-    for ( p = uuid_str, i = 0; i != 36 && *p != '\0'; p++, i++ )
-    {
-        if ( (i == 8 || i == 13 || i == 18 || i == 23) )
-        {
-            if ( *p != '-' )
-                return -1;
-            if ( i == 18 )
-                x = uuid_lo;
-            continue;
-        }
-        else if ( *p >= '0' && *p <= '9' )
-            digit = *p - '0';
-        else if ( *p >= 'A' && *p <= 'F' )
-            digit = *p - 'A' + 10;
-        else if ( *p >= 'a' && *p <= 'f' )
-            digit = *p - 'a' + 10;
-        else
-            return -1;
-        *x = (*x << 4) | digit;
-    }
-    if ( (i != 1 && i != 36) || *p != '\0' )
-        return -1;
-    return 0;
-}
-
-int xc_tmem_auth(xc_interface *xch,
-                 int cli_id,
-                 char *uuid_str,
-                 int enable)
-{
-    xen_tmem_pool_info_t pool = {
-        .flags.u.auth = enable,
-        .id = 0,
-        .n_pages = 0,
-        .uuid[0] = 0,
-        .uuid[1] = 0,
-    };
-    if ( xc_tmem_uuid_parse(uuid_str, &pool.uuid[0],
-                                      &pool.uuid[1]) < 0 )
-    {
-        PERROR("Can't parse uuid, use xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");
-        return -1;
-    }
-    return xc_tmem_control(xch, 0 /* pool_id */,
-                           XEN_SYSCTL_TMEM_OP_SET_AUTH,
-                           cli_id, sizeof(pool),
-                           0 /* arg */, &pool);
-}
-
-/* Save/restore/live migrate */
-
-/*
-   Note that live migration complicates the save/restore format in
-   multiple ways: Though saving/migration can only occur when all
-   tmem pools belonging to the domain-being-saved are frozen and
-   this ensures that new pools can't be created or existing pools
-   grown (in number of pages), it is possible during a live migration
-   that pools may be destroyed and pages invalidated while the migration
-   is in process.  As a result, (1) it is not safe to pre-specify counts
-   for these values precisely, but only as a "max", and (2) a "invalidation"
-   list (of pools, objects, pages) must be appended when the domain is truly
-   suspended.
- */
-
-/* returns 0 if nothing to save, -1 if error saving, 1 if saved successfully */
-int xc_tmem_save(xc_interface *xch,
-                 uint32_t domid, int io_fd, int live, int field_marker)
-{
-    int marker = field_marker;
-    int i, j, rc;
-    uint32_t minusone = -1;
-    struct tmem_handle *h;
-    xen_tmem_client_t info;
-    xen_tmem_pool_info_t *pools;
-    char *buf = NULL;
-
-    rc = xc_tmem_control(xch, 0, XEN_SYSCTL_TMEM_OP_SAVE_BEGIN,
-                         domid, 0 /* len*/ , live, NULL);
-    if ( rc )
-    {
-        /* Nothing to save - no tmem enabled. */
-        if ( errno == ENOENT )
-            return 0;
-
-        return rc;
-    }
-
-    if ( xc_tmem_control(xch, 0 /* pool_id */,
-                         XEN_SYSCTL_TMEM_OP_GET_CLIENT_INFO,
-                         domid /* cli_id */, sizeof(info), 0 /* arg */,
-                         &info) < 0 )
-        return -1;
-
-    /* Nothing to do. */
-    if ( !info.nr_pools )
-        return 0;
-
-    pools = calloc(info.nr_pools, sizeof(*pools));
-    if ( !pools )
-        return -1;
-
-    rc = xc_tmem_control(xch, 0 /* pool_id is ignored. */,
-                         XEN_SYSCTL_TMEM_OP_GET_POOLS,
-                         domid /* cli_id */, sizeof(*pools) * info.nr_pools,
-                         0 /* arg */, pools);
-
-    if ( rc < 0 || (uint32_t)rc > info.nr_pools )
-        goto out_memory;
-
-    /* Update it - as we have less pools between the two hypercalls. */
-    info.nr_pools = (uint32_t)rc;
-
-    if ( write_exact(io_fd, &marker, sizeof(marker)) )
-        goto out_memory;
-
-    if ( write_exact(io_fd, &info, sizeof(info)) )
-        goto out_memory;
-
-    if ( write_exact(io_fd, &minusone, sizeof(minusone)) )
-        goto out_memory;
-
-    for ( i = 0; i < info.nr_pools; i++ )
-    {
-        uint32_t pagesize;
-        int bufsize = 0;
-        int checksum = 0;
-        xen_tmem_pool_info_t *pool = &pools[i];
-
-        if ( pool->flags.raw != -1 )
-        {
-            if ( !pool->flags.u.persist )
-                pool->n_pages = 0;
-
-            if ( write_exact(io_fd, pool, sizeof(*pool)) )
-                goto out_memory;
-
-            if ( !pool->flags.u.persist )
-                continue;
-
-            pagesize = 1 << (pool->flags.u.pagebits + 12);
-            if ( pagesize > bufsize )
-            {
-                bufsize = pagesize + sizeof(struct tmem_handle);
-                if ( (buf = realloc(buf,bufsize)) == NULL )
-                    goto out_memory;
-            }
-            for ( j = pool->n_pages; j > 0; j-- )
-            {
-                int ret;
-                if ( (ret = xc_tmem_control(
-                          xch, pool->id, XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE,
-                          domid, bufsize, 0, buf)) > 0 )
-                {
-                    h = (struct tmem_handle *)buf;
-                    if ( write_exact(io_fd, &h->oid, sizeof(h->oid)) )
-                        goto out_memory;
-
-                    if ( write_exact(io_fd, &h->index, sizeof(h->index)) )
-                        goto out_memory;
-                    h++;
-                    checksum += *(char *)h;
-                    if ( write_exact(io_fd, h, pagesize) )
-                        goto out_memory;
-                } else if ( ret == 0 ) {
-                    continue;
-                } else {
-                    /* page list terminator */
-                    h = (struct tmem_handle *)buf;
-                    h->oid.oid[0] = h->oid.oid[1] = h->oid.oid[2] = -1L;
-                    if ( write_exact(io_fd, &h->oid, sizeof(h->oid)) )
-                    {
- out_memory:
-                        free(pools);
-                        free(buf);
-                        return -1;
-                    }
-                    break;
-                }
-            }
-            DPRINTF("saved %"PRId64" tmem pages for dom=%d pool=%d, checksum=%x\n",
-                    pool->n_pages - j, domid, pool->id, checksum);
-        }
-    }
-    free(pools);
-    free(buf);
-
-    /* pool list terminator */
-    minusone = -1;
-    if ( write_exact(io_fd, &minusone, sizeof(minusone)) )
-        return -1;
-
-    return 1;
-}
-
-/* only called for live migration */
-int xc_tmem_save_extra(xc_interface *xch, uint32_t domid, int io_fd, int field_marker)
-{
-    struct tmem_handle handle;
-    int marker = field_marker;
-    uint32_t minusone;
-    int count = 0, checksum = 0;
-
-    if ( write_exact(io_fd, &marker, sizeof(marker)) )
-        return -1;
-    while ( xc_tmem_control(xch, 0, XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV, domid,
-                            sizeof(handle),0,&handle) > 0 ) {
-        if ( write_exact(io_fd, &handle.pool_id, sizeof(handle.pool_id)) )
-            return -1;
-        if ( write_exact(io_fd, &handle.oid, sizeof(handle.oid)) )
-            return -1;
-        if ( write_exact(io_fd, &handle.index, sizeof(handle.index)) )
-            return -1;
-        count++;
-        checksum += handle.pool_id + handle.oid.oid[0] + handle.oid.oid[1] +
-                    handle.oid.oid[2] + handle.index;
-    }
-    if ( count )
-            DPRINTF("needed %d tmem invalidates, check=%d\n",count,checksum);
-    minusone = -1;
-    if ( write_exact(io_fd, &minusone, sizeof(minusone)) )
-        return -1;
-    return 0;
-}
-
-/* only called for live migration */
-void xc_tmem_save_done(xc_interface *xch, uint32_t domid)
-{
-    xc_tmem_control(xch, 0, XEN_SYSCTL_TMEM_OP_SAVE_END, domid, 0, 0, NULL);
-}
-
-/* restore routines */
-
-static int xc_tmem_restore_new_pool(
-                    xc_interface *xch,
-                    int cli_id,
-                    uint32_t pool_id,
-                    uint32_t flags,
-                    uint64_t uuid_lo,
-                    uint64_t uuid_hi)
-{
-    xen_tmem_pool_info_t pool = {
-        .flags.raw = flags,
-        .id = pool_id,
-        .n_pages = 0,
-        .uuid[0] = uuid_lo,
-        .uuid[1] = uuid_hi,
-    };
-
-    return xc_tmem_control(xch, pool_id,
-                           XEN_SYSCTL_TMEM_OP_SET_POOLS,
-                           cli_id, sizeof(pool),
-                           0 /* arg */, &pool);
-}
-
-int xc_tmem_restore(xc_interface *xch, uint32_t domid, int io_fd)
-{
-    uint32_t minusone;
-    xen_tmem_client_t info;
-    int checksum = 0;
-    unsigned int i;
-    char *buf = NULL;
-
-    if ( read_exact(io_fd, &info, sizeof(info)) )
-        return -1;
-
-    /* We would never save if there weren't any pools! */
-    if ( !info.nr_pools )
-        return -1;
-
-    if ( xc_tmem_control(xch, 0, XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN, domid, 0, 0, NULL) < 0 )
-        return -1;
-
-    if ( xc_tmem_control(xch, 0 /* pool_id */,
-                         XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO,
-                         domid /* cli_id */, sizeof(info), 0 /* arg */,
-                         &info) < 0 )
-        return -1;
-
-    if ( read_exact(io_fd, &minusone, sizeof(minusone)) )
-        return -1;
-
-    for ( i = 0; i < info.nr_pools; i++ )
-    {
-        int bufsize = 0, pagesize;
-        int j;
-        xen_tmem_pool_info_t pool;
-
-        if ( read_exact(io_fd, &pool, sizeof(pool)) )
-            goto out_memory;
-
-        if ( xc_tmem_restore_new_pool(xch, domid, pool.id, pool.flags.raw,
-                                      pool.uuid[0], pool.uuid[1]) < 0 )
-            goto out_memory;
-
-        if ( pool.n_pages <= 0 )
-            continue;
-
-        pagesize = 1 << (pool.flags.u.pagebits + 12);
-        if ( pagesize > bufsize )
-        {
-            bufsize = pagesize;
-            if ( (buf = realloc(buf,bufsize)) == NULL )
-                goto out_memory;
-        }
-        for ( j = pool.n_pages; j > 0; j-- )
-        {
-            struct xen_tmem_oid oid;
-            uint32_t index;
-            int rc;
-
-            if ( read_exact(io_fd, &oid, sizeof(oid)) )
-                goto out_memory;
-
-            if ( oid.oid[0] == -1L && oid.oid[1] == -1L && oid.oid[2] == -1L )
-                break;
-            if ( read_exact(io_fd, &index, sizeof(index)) )
-                goto out_memory;
-
-            if ( read_exact(io_fd, buf, pagesize) )
-                goto out_memory;
-
-            checksum += *buf;
-            if ( (rc = xc_tmem_control_oid(
-                      xch, pool.id, XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE,
-                      domid, bufsize, index, oid, buf)) <= 0 )
-            {
-                DPRINTF("xc_tmem_restore: putting page failed, rc=%d\n",rc);
- out_memory:
-                free(buf);
-                return -1;
-            }
-        }
-        if ( pool.n_pages )
-            DPRINTF("restored %"PRId64" tmem pages for dom=%d pool=%d, check=%x\n",
-                    pool.n_pages - j, domid, pool.id, checksum);
-    }
-    free(buf);
-
-    return 0;
-}
-
-/* only called for live migration, must be called after suspend */
-int xc_tmem_restore_extra(xc_interface *xch, uint32_t domid, int io_fd)
-{
-    uint32_t pool_id;
-    struct xen_tmem_oid oid;
-    uint32_t index;
-    int count = 0;
-    int checksum = 0;
-
-    while ( read_exact(io_fd, &pool_id, sizeof(pool_id)) == 0 && pool_id != -1 )
-    {
-        if ( read_exact(io_fd, &oid, sizeof(oid)) )
-            return -1;
-        if ( read_exact(io_fd, &index, sizeof(index)) )
-            return -1;
-        if ( xc_tmem_control_oid(
-                 xch, pool_id, XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE,
-                 domid, 0, index, oid, NULL) <= 0 )
-            return -1;
-        count++;
-        checksum += pool_id + oid.oid[0] + oid.oid[1] + oid.oid[2] + index;
-    }
-    if ( pool_id != -1 )
-        return -1;
-    if ( count )
-            DPRINTF("invalidated %d tmem pages, check=%d\n",count,checksum);
-
-    return 0;
-}
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/tools/libxl/libxl_tmem.c b/tools/libxl/libxl_tmem.c
index 2bee8d1edf..a553b39738 100644
--- a/tools/libxl/libxl_tmem.c
+++ b/tools/libxl/libxl_tmem.c
@@ -16,146 +16,55 @@
 
 #include "libxl_internal.h"
 
+/* TMEM is gone. Leave some stubs here. */
+
 char *libxl_tmem_list(libxl_ctx *ctx, uint32_t domid, int use_long)
 {
-    int r;
-    char _buf[32768];
     GC_INIT(ctx);
-
-    r = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_LIST, domid, 32768,
-                        use_long, _buf);
-    if (r < 0) {
-        LOGED(ERROR, domid, "Can not get tmem list");
-        GC_FREE;
-        return NULL;
-    }
-
+    LOGED(ERROR, domid, "Can not get tmem list");
     GC_FREE;
-    return strdup(_buf);
+    return NULL;
 }
 
 int libxl_tmem_freeze(libxl_ctx *ctx, uint32_t domid)
 {
-    int r, rc;
     GC_INIT(ctx);
-
-    r = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_FREEZE, domid, 0, 0,
-                        NULL);
-    if (r < 0) {
-        LOGED(ERROR, domid, "Can not freeze tmem pools");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    rc = 0;
-out:
+    LOGED(ERROR, domid, "Can not freeze tmem pools");
     GC_FREE;
-    return rc;
+    return ERROR_FAIL;
 }
 
 int libxl_tmem_thaw(libxl_ctx *ctx, uint32_t domid)
 {
-    int r, rc;
     GC_INIT(ctx);
-
-    r = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_THAW, domid, 0, 0,
-                        NULL);
-    if (r < 0) {
-        LOGED(ERROR, domid, "Can not thaw tmem pools");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    rc = 0;
-out:
+    LOGED(ERROR, domid, "Can not thaw tmem pools");
     GC_FREE;
-    return rc;
-}
-
-static int32_t tmem_setop_from_string(char *set_name, uint32_t val,
-                                      xen_tmem_client_t *info)
-{
-    if (!strcmp(set_name, "weight"))
-        info->weight = val;
-    else if (!strcmp(set_name, "compress"))
-        info->flags.u.compress = val;
-    else
-        return -1;
-
-    return 0;
+    return ERROR_FAIL;
 }
 
 int libxl_tmem_set(libxl_ctx *ctx, uint32_t domid, char* name, uint32_t set)
 {
-    int r, rc;
-    xen_tmem_client_t info;
     GC_INIT(ctx);
-
-    r = xc_tmem_control(ctx->xch, -1 /* pool_id */,
-                        XEN_SYSCTL_TMEM_OP_GET_CLIENT_INFO,
-                        domid, sizeof(info), 0 /* arg */, &info);
-    if (r < 0) {
-        LOGED(ERROR, domid, "Can not get tmem data!");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-    rc = tmem_setop_from_string(name, set, &info);
-    if (rc == -1) {
-        LOGEVD(ERROR, -1, domid, "Invalid set, valid sets are <weight|compress>");
-        rc = ERROR_INVAL;
-        goto out;
-    }
-    r = xc_tmem_control(ctx->xch, -1 /* pool_id */,
-                        XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO,
-                        domid, sizeof(info), 0 /* arg */, &info);
-    if (r < 0) {
-        LOGED(ERROR, domid, "Can not set tmem %s", name);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    rc = 0;
-out:
+    LOGED(ERROR, domid, "Can not set tmem %s", name);
     GC_FREE;
-    return rc;
+    return ERROR_FAIL;
 }
 
 int libxl_tmem_shared_auth(libxl_ctx *ctx, uint32_t domid,
                            char* uuid, int auth)
 {
-    int r, rc;
     GC_INIT(ctx);
-
-    r = xc_tmem_auth(ctx->xch, domid, uuid, auth);
-    if (r < 0) {
-        LOGED(ERROR, domid, "Can not set tmem shared auth");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    rc = 0;
-out:
+    LOGED(ERROR, domid, "Can not set tmem shared auth");
     GC_FREE;
-    return rc;
+    return ERROR_FAIL;
 }
 
 int libxl_tmem_freeable(libxl_ctx *ctx)
 {
-    int r, rc;
     GC_INIT(ctx);
-
-    r = xc_tmem_control(ctx->xch, -1, XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB,
-                        -1, 0, 0, 0);
-    if (r < 0) {
-        LOGE(ERROR, "Can not get tmem freeable memory");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    rc = 0;
-out:
+    LOGE(ERROR, "Can not get tmem freeable memory");
     GC_FREE;
-    return rc;
+    return ERROR_FAIL;
 }
 
 /*
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index eaa28793ef..335b3170d1 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -24,7 +24,6 @@ INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmctx
 INSTALL_SBIN-$(CONFIG_X86)     += xen-lowmemd
 INSTALL_SBIN-$(CONFIG_X86)     += xen-mfndump
 INSTALL_SBIN                   += xen-ringwatch
-INSTALL_SBIN                   += xen-tmem-list-parse
 INSTALL_SBIN                   += xencov
 INSTALL_SBIN                   += xenlockprof
 INSTALL_SBIN                   += xenperf
diff --git a/tools/misc/xen-tmem-list-parse.c b/tools/misc/xen-tmem-list-parse.c
deleted file mode 100644
index f32b107dce..0000000000
--- a/tools/misc/xen-tmem-list-parse.c
+++ /dev/null
@@ -1,339 +0,0 @@
-/*
- * Parse output from tmem-list and reformat to human-readable
- *
- * NOTE: NEVER delete a parse call as this file documents backwards
- * compatibility for older versions of tmem-list and we don't want to
- * accidentally reuse an old tag
- *
- * Copyright (c) 2009, Dan Magenheimer, Oracle Corp.
- */
-
-#include <stdio.h>
-#include <unistd.h>
-#include <string.h>
-
-#define BUFSIZE 4096
-#define PAGE_SIZE 4096
-
-unsigned long long parse(char *s,char *match)
-{
-    char *s1 = strstr(s,match);
-    unsigned long long ret;
-
-    if ( s1 == NULL )
-        return 0LL;
-    s1 += 2;
-    if ( *s1++ != ':' )
-        return 0LL;
-    sscanf(s1,"%llu",&ret);
-    return ret;
-}
-
-unsigned long long parse_hex(char *s,char *match)
-{
-    char *s1 = strstr(s,match);
-    unsigned long long ret;
-
-    if ( s1 == NULL )
-        return 0LL;
-    s1 += 2;
-    if ( *s1++ != ':' )
-        return 0LL;
-    sscanf(s1,"%llx",&ret);
-    return ret;
-}
-
-unsigned long long parse2(char *s,char *match1, char *match2)
-{
-    char match[3];
-    match[0] = *match1;
-    match[1] = *match2;
-    match[2] = '\0';
-    return parse(s,match);
-}
-
-void parse_string(char *s,char *match, char *buf, int len)
-{
-    char *s1 = strstr(s,match);
-    int i;
-
-    if ( s1 == NULL )
-        return;
-    s1 += 2;
-    if ( *s1++ != ':' )
-        return;
-    for ( i = 0; i < len; i++ )
-        *buf++ = *s1++;
-}
-
-void parse_sharers(char *s, char *match, char *buf, int len)
-{
-    char *s1 = strstr(s,match);
-    char *b = buf;
-
-    if ( s1 == NULL )
-        return;
-    while ( s1 )
-    {
-        s1 += 2;
-        if (*s1++ != ':')
-            return;
-        while (*s1 >= '0' && *s1 <= '9')
-            *b++ = *s1++;
-        *b++ = ',';
-        s1 = strstr(s1,match);
-    }
-    if ( b != buf )
-        *--b = '\0';
-}
-
-void parse_global(char *s)
-{
-    unsigned long long total_ops = parse(s,"Tt");
-    unsigned long long errored_ops = parse(s,"Te");
-    unsigned long long failed_copies = parse(s,"Cf");
-    unsigned long long alloc_failed = parse(s,"Af");
-    unsigned long long alloc_page_failed = parse(s,"Pf");
-    unsigned long long avail_pages = parse(s,"Ta");
-    unsigned long long low_on_memory = parse(s,"Lm");
-    unsigned long long evicted_pgs = parse(s,"Et");
-    unsigned long long evict_attempts = parse(s,"Ea");
-    unsigned long long relinq_pgs = parse(s,"Rt");
-    unsigned long long relinq_attempts = parse(s,"Ra");
-    unsigned long long max_evicts_per_relinq = parse(s,"Rx");
-    unsigned long long total_flush_pool = parse(s,"Fp");
-    unsigned long long global_eph_count = parse(s,"Ec");
-    unsigned long long global_eph_max = parse(s,"Em");
-    unsigned long long obj_count = parse(s,"Oc");
-    unsigned long long obj_max = parse(s,"Om");
-    unsigned long long rtree_node_count = parse(s,"Nc");
-    unsigned long long rtree_node_max = parse(s,"Nm");
-    unsigned long long pgp_count = parse(s,"Pc");
-    unsigned long long pgp_max = parse(s,"Pm");
-    unsigned long long page_count = parse(s,"Fc");
-    unsigned long long max_page_count = parse(s,"Fm");
-    unsigned long long pcd_count = parse(s,"Sc");
-    unsigned long long max_pcd_count = parse(s,"Sm");
-    unsigned long long pcd_tot_tze_size = parse(s,"Zt");
-    unsigned long long pcd_tot_csize = parse(s,"Gz");
-    unsigned long long deduped_puts = parse(s,"Gd");
-    unsigned long long tot_good_eph_puts = parse(s,"Ep");
-
-    printf("total tmem ops=%llu (errors=%llu) -- tmem pages avail=%llu\n",
-           total_ops, errored_ops, avail_pages);
-    printf("datastructs: objs=%llu (max=%llu) pgps=%llu (max=%llu) "
-           "nodes=%llu (max=%llu) pages=%llu (max=%llu) ",
-           obj_count, obj_max, pgp_count, pgp_max,
-           rtree_node_count, rtree_node_max,
-           page_count,max_page_count);
-    if (max_pcd_count != 0 && global_eph_count != 0 && tot_good_eph_puts != 0) {
-           printf("pcds=%llu (max=%llu) ",
-               pcd_count,max_pcd_count);
-           printf("deduped: avg=%4.2f%% (curr=%4.2f%%) ",
-                   ((deduped_puts*1.0)/tot_good_eph_puts)*100,
-                   (1.0-(pcd_count*1.0)/global_eph_count)*100);
-    }
-    if (pcd_count != 0)
-    {
-           if (pcd_tot_tze_size && (pcd_tot_tze_size < pcd_count*PAGE_SIZE))
-               printf("tze savings=%4.2f%% ",
-                   (1.0-(pcd_tot_tze_size*1.0)/(pcd_count*PAGE_SIZE))*100);
-           if (pcd_tot_csize && (pcd_tot_csize < pcd_count*PAGE_SIZE))
-               printf("compression savings=%4.2f%% ",
-                   (1.0-(pcd_tot_csize*1.0)/(pcd_count*PAGE_SIZE))*100);
-    }
-    printf("\n");
-    printf("misc: failed_copies=%llu alloc_failed=%llu alloc_page_failed=%llu "
-           "low_mem=%llu evicted=%llu/%llu relinq=%llu/%llu, "
-           "max_evicts_per_relinq=%llu, flush_pools=%llu, "
-           "eph_count=%llu, eph_max=%llu\n",
-           failed_copies, alloc_failed, alloc_page_failed, low_on_memory,
-           evicted_pgs, evict_attempts, relinq_pgs, relinq_attempts,
-           max_evicts_per_relinq, total_flush_pool,
-           global_eph_count, global_eph_max);
-}
-
-#define PARSE_CYC_COUNTER(s,x,prefix) unsigned long long \
-   x##_count = parse2(s,prefix,"n"), \
-   x##_sum_cycles = parse2(s,prefix,"t"), \
-   x##_max_cycles = parse2(s,prefix,"x"), \
-   x##_min_cycles = parse2(s,prefix,"m")
-#define PRINTF_CYC_COUNTER(x,text) \
-  if (x##_count) printf(text" avg=%llu, max=%llu, " \
-  "min=%llu, samples=%llu\n", \
-  x##_sum_cycles ? (x##_sum_cycles/x##_count) : 0, \
-  x##_max_cycles, x##_min_cycles, x##_count)
-
-void parse_time_stats(char *s)
-{
-    PARSE_CYC_COUNTER(s,succ_get,"G");
-    PARSE_CYC_COUNTER(s,succ_put,"P");
-    PARSE_CYC_COUNTER(s,non_succ_get,"g");
-    PARSE_CYC_COUNTER(s,non_succ_put,"p");
-    PARSE_CYC_COUNTER(s,flush,"F");
-    PARSE_CYC_COUNTER(s,flush_obj,"O");
-    PARSE_CYC_COUNTER(s,pg_copy,"C");
-    PARSE_CYC_COUNTER(s,compress,"c");
-    PARSE_CYC_COUNTER(s,decompress,"d");
-
-    PRINTF_CYC_COUNTER(succ_get,"succ get cycles:");
-    PRINTF_CYC_COUNTER(succ_put,"succ put cycles:");
-    PRINTF_CYC_COUNTER(non_succ_get,"failed get cycles:");
-    PRINTF_CYC_COUNTER(non_succ_put,"failed put cycles:");
-    PRINTF_CYC_COUNTER(flush,"flush cycles:");
-    PRINTF_CYC_COUNTER(flush_obj,"flush_obj cycles:");
-    PRINTF_CYC_COUNTER(pg_copy,"page copy cycles:");
-    PRINTF_CYC_COUNTER(compress,"compression cycles:");
-    PRINTF_CYC_COUNTER(decompress,"decompression cycles:");
-}
-
-void parse_client(char *s)
-{
-    unsigned long cli_id = parse(s,"CI");
-    unsigned long weight = parse(s,"ww");
-    unsigned long cap = parse(s,"ca");
-    unsigned long compress = parse(s,"co");
-    unsigned long frozen = parse(s,"fr");
-    unsigned long long eph_count = parse(s,"Ec");
-    unsigned long long max_eph_count = parse(s,"Em");
-    unsigned long long compressed_pages = parse(s,"cp");
-    unsigned long long compressed_sum_size = parse(s,"cb");
-    unsigned long long compress_poor = parse(s,"cn");
-    unsigned long long compress_nomem = parse(s,"cm");
-    unsigned long long total_cycles = parse(s,"Tc");
-    unsigned long long succ_eph_gets = parse(s,"Ge");
-    unsigned long long succ_pers_puts = parse(s,"Pp");
-    unsigned long long succ_pers_gets = parse(s,"Gp");
-
-    printf("domid%lu: weight=%lu,cap=%lu,compress=%d,frozen=%d,"
-           "total_cycles=%llu,succ_eph_gets=%llu,"
-           "succ_pers_puts=%llu,succ_pers_gets=%llu,"
-           "eph_count=%llu,max_eph=%llu,"
-           "compression ratio=%lu%% (samples=%llu,poor=%llu,nomem=%llu)\n",
-           cli_id, weight, cap, compress?1:0, frozen?1:0,
-           total_cycles, succ_eph_gets, succ_pers_puts, succ_pers_gets, 
-           eph_count, max_eph_count,
-           compressed_pages ?  (long)((compressed_sum_size*100LL) /
-                                      (compressed_pages*PAGE_SIZE)) : 0,
-           compressed_pages, compress_poor, compress_nomem);
-
-}
-
-void parse_pool(char *s)
-{
-    char pool_type[3];
-    unsigned long cli_id = parse(s,"CI");
-    unsigned long pool_id = parse(s,"PI");
-    unsigned long long pgp_count = parse(s,"Pc");
-    unsigned long long max_pgp_count = parse(s,"Pm");
-    unsigned long long obj_count = parse(s,"Oc");
-    unsigned long long max_obj_count = parse(s,"Om");
-    unsigned long long objnode_count = parse(s,"Nc");
-    unsigned long long max_objnode_count = parse(s,"Nm");
-    unsigned long long good_puts = parse(s,"ps");
-    unsigned long long puts = parse(s,"pt");
-    unsigned long long no_mem_puts = parse(s,"px");
-    unsigned long long dup_puts_flushed = parse(s,"pd");
-    unsigned long long dup_puts_replaced = parse(s,"pr");
-    unsigned long long found_gets = parse(s,"gs");
-    unsigned long long gets = parse(s,"gt");
-    unsigned long long flushs_found = parse(s,"fs");
-    unsigned long long flushs = parse(s,"ft");
-    unsigned long long flush_objs_found = parse(s,"os");
-    unsigned long long flush_objs = parse(s,"ot");
-
-    parse_string(s,"PT",pool_type,2);
-    pool_type[2] = '\0';
-    if (pool_type[1] == 'S')
-        return; /* no need to repeat print data for shared pools */
-    printf("domid%lu,id%lu[%s]:pgp=%llu(max=%llu) obj=%llu(%llu) "
-           "objnode=%llu(%llu) puts=%llu/%llu/%llu(dup=%llu/%llu) "
-           "gets=%llu/%llu(%llu%%) "
-           "flush=%llu/%llu flobj=%llu/%llu\n",
-           cli_id, pool_id, pool_type,
-           pgp_count, max_pgp_count, obj_count, max_obj_count,
-           objnode_count, max_objnode_count,
-           good_puts, puts, no_mem_puts, 
-           dup_puts_flushed, dup_puts_replaced,
-           found_gets, gets,
-           gets ? (found_gets*100LL)/gets : 0,
-           flushs_found, flushs, flush_objs_found, flush_objs);
-
-}
-
-void parse_shared_pool(char *s)
-{
-    char pool_type[3];
-    char buf[BUFSIZE];
-    unsigned long pool_id = parse(s,"PI");
-    unsigned long long uid0 = parse_hex(s,"U0");
-    unsigned long long uid1 = parse_hex(s,"U1");
-    unsigned long long pgp_count = parse(s,"Pc");
-    unsigned long long max_pgp_count = parse(s,"Pm");
-    unsigned long long obj_count = parse(s,"Oc");
-    unsigned long long max_obj_count = parse(s,"Om");
-    unsigned long long objnode_count = parse(s,"Nc");
-    unsigned long long max_objnode_count = parse(s,"Nm");
-    unsigned long long good_puts = parse(s,"ps");
-    unsigned long long puts = parse(s,"pt");
-    unsigned long long no_mem_puts = parse(s,"px");
-    unsigned long long dup_puts_flushed = parse(s,"pd");
-    unsigned long long dup_puts_replaced = parse(s,"pr");
-    unsigned long long found_gets = parse(s,"gs");
-    unsigned long long gets = parse(s,"gt");
-    unsigned long long flushs_found = parse(s,"fs");
-    unsigned long long flushs = parse(s,"ft");
-    unsigned long long flush_objs_found = parse(s,"os");
-    unsigned long long flush_objs = parse(s,"ot");
-
-    parse_string(s,"PT",pool_type,2);
-    pool_type[2] = '\0';
-    parse_sharers(s,"SC",buf,BUFSIZE);
-    printf("poolid=%lu[%s] uuid=%llx.%llx, shared-by:%s: "
-           "pgp=%llu(max=%llu) obj=%llu(%llu) "
-           "objnode=%llu(%llu) puts=%llu/%llu/%llu(dup=%llu/%llu) "
-           "gets=%llu/%llu(%llu%%) "
-           "flush=%llu/%llu flobj=%llu/%llu\n",
-           pool_id, pool_type, uid0, uid1, buf,
-           pgp_count, max_pgp_count, obj_count, max_obj_count,
-           objnode_count, max_objnode_count,
-           good_puts, puts, no_mem_puts, 
-           dup_puts_flushed, dup_puts_replaced,
-           found_gets, gets,
-           gets ? (found_gets*100LL)/gets : 0,
-           flushs_found, flushs, flush_objs_found, flush_objs);
-}
-
-int main(int ac, char **av)
-{
-    char *p, c;
-    char buf[BUFSIZE];
-
-    while ( (p = fgets(buf,BUFSIZE,stdin)) != NULL )
-    {
-        c = *p++;
-        if ( *p++ != '=' )
-            continue;
-        switch ( c )
-        {
-        case 'G':
-            parse_global(p);
-            break;
-        case 'T':
-            parse_time_stats(p);
-            break;
-        case 'C':
-            parse_client(p);
-            break;
-        case 'P':
-            parse_pool(p);
-            break;
-        case 'S':
-            parse_shared_pool(p);
-            break;
-        default:
-            continue;
-        }
-    }
-    return 0;
-}
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index 484b790c75..1190255ac1 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -17,7 +17,6 @@
 #include <arpa/inet.h>
 
 #include <xen/elfnote.h>
-#include <xen/tmem.h>
 #include "xc_dom.h"
 #include <xen/hvm/hvm_info_table.h>
 #include <xen/hvm/params.h>
@@ -1614,71 +1613,6 @@ static PyObject *dom_op(XcObject *self, PyObject *args,
     return zero;
 }
 
-static PyObject *pyxc_tmem_control(XcObject *self,
-                                   PyObject *args,
-                                   PyObject *kwds)
-{
-    int32_t pool_id;
-    uint32_t subop;
-    uint32_t cli_id;
-    uint32_t len;
-    uint32_t arg;
-    char *buf;
-    char _buffer[32768], *buffer = _buffer;
-    int rc;
-
-    static char *kwd_list[] = { "pool_id", "subop", "cli_id", "arg1", "arg2", "buf", NULL };
-
-    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iiiiis", kwd_list,
-                        &pool_id, &subop, &cli_id, &len, &arg, &buf) )
-        return NULL;
-
-    if ( (subop == XEN_SYSCTL_TMEM_OP_LIST) && (len > 32768) )
-        len = 32768;
-
-    if ( (rc = xc_tmem_control(self->xc_handle, pool_id, subop, cli_id, len, arg, buffer)) < 0 )
-        return Py_BuildValue("i", rc);
-
-    switch (subop) {
-        case XEN_SYSCTL_TMEM_OP_LIST:
-            return Py_BuildValue("s", buffer);
-        case XEN_SYSCTL_TMEM_OP_FLUSH:
-            return Py_BuildValue("i", rc);
-        case XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB:
-            return Py_BuildValue("i", rc);
-        case XEN_SYSCTL_TMEM_OP_THAW:
-        case XEN_SYSCTL_TMEM_OP_FREEZE:
-        case XEN_SYSCTL_TMEM_OP_DESTROY:
-        default:
-            break;
-    }
-
-    Py_INCREF(zero);
-    return zero;
-}
-
-static PyObject *pyxc_tmem_shared_auth(XcObject *self,
-                                   PyObject *args,
-                                   PyObject *kwds)
-{
-    uint32_t cli_id;
-    uint32_t arg1;
-    char *uuid_str;
-    int rc;
-
-    static char *kwd_list[] = { "cli_id", "uuid_str", "arg1", NULL };
-
-    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "isi", kwd_list,
-                                   &cli_id, &uuid_str, &arg1) )
-        return NULL;
-
-    if ( (rc = xc_tmem_auth(self->xc_handle, cli_id, uuid_str, arg1)) < 0 )
-        return Py_BuildValue("i", rc);
-
-    Py_INCREF(zero);
-    return zero;
-}
-
 static PyObject *pyxc_dom_set_memshr(XcObject *self, PyObject *args)
 {
     uint32_t dom;
@@ -2497,27 +2431,6 @@ static PyMethodDef pyxc_methods[] = {
       " dom [int]: Identifier of domain.\n" },
 #endif
 
-    { "tmem_control",
-      (PyCFunction)pyxc_tmem_control,
-      METH_VARARGS | METH_KEYWORDS, "\n"
-      "Do various control on a tmem pool.\n"
-      " pool_id [int]: Identifier of the tmem pool (-1 == all).\n"
-      " subop [int]: Supplementary Operation.\n"
-      " cli_id [int]: Client identifier (-1 == all).\n"
-      " len [int]: Length of 'buf'.\n"
-      " arg [int]: Argument.\n"
-      " buf [str]: Buffer.\n\n"
-      "Returns: [int] 0 or [str] tmem info on success; exception on error.\n" },
-
-    { "tmem_shared_auth",
-      (PyCFunction)pyxc_tmem_shared_auth,
-      METH_VARARGS | METH_KEYWORDS, "\n"
-      "De/authenticate a shared tmem pool.\n"
-      " cli_id [int]: Client identifier (-1 == all).\n"
-      " uuid_str [str]: uuid.\n"
-      " auth [int]: 0|1 .\n"
-      "Returns: [int] 0 on success; exception on error.\n" },
-
     { "dom_set_memshr", 
       (PyCFunction)pyxc_dom_set_memshr,
       METH_VARARGS, "\n"
diff --git a/tools/xenstat/libxenstat/src/xenstat.c b/tools/xenstat/libxenstat/src/xenstat.c
index fbe44f3c56..52e305ace4 100644
--- a/tools/xenstat/libxenstat/src/xenstat.c
+++ b/tools/xenstat/libxenstat/src/xenstat.c
@@ -145,19 +145,6 @@ static inline unsigned long long parse(char *s, char *match)
 	return ret;
 }
 
-void domain_get_tmem_stats(xenstat_handle * handle, xenstat_domain * domain)
-{
-	char buffer[4096];
-
-	if (xc_tmem_control(handle->xc_handle,-1,XEN_SYSCTL_TMEM_OP_LIST,domain->id,
-                        sizeof(buffer),-1,buffer) < 0)
-		return;
-	domain->tmem_stats.curr_eph_pages = parse(buffer,"Ec");
-	domain->tmem_stats.succ_eph_gets = parse(buffer,"Ge");
-	domain->tmem_stats.succ_pers_puts = parse(buffer,"Pp");
-	domain->tmem_stats.succ_pers_gets = parse(buffer,"Gp");
-}
-
 xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 {
 #define DOMAIN_CHUNK_SIZE 256
@@ -166,7 +153,6 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 	xc_domaininfo_t domaininfo[DOMAIN_CHUNK_SIZE];
 	int new_domains;
 	unsigned int i;
-	int rc;
 
 	/* Create the node */
 	node = (xenstat_node *) calloc(1, sizeof(xenstat_node));
@@ -190,9 +176,7 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 	node->free_mem = ((unsigned long long)physinfo.free_pages)
 	    * handle->page_size;
 
-	rc = xc_tmem_control(handle->xc_handle, -1,
-                         XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB, -1, 0, 0, NULL);
-	node->freeable_mb = (rc < 0) ? 0 : rc;
+	node->freeable_mb = 0;
 	/* malloc(0) is not portable, so allocate a single domain.  This will
 	 * be resized below. */
 	node->domains = malloc(sizeof(xenstat_domain));
@@ -260,7 +244,6 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 			domain->networks = NULL;
 			domain->num_vbds = 0;
 			domain->vbds = NULL;
-			domain_get_tmem_stats(handle,domain);
 
 			domain++;
 			node->num_domains++;
@@ -729,40 +712,6 @@ unsigned long long xenstat_vbd_wr_sects(xenstat_vbd * vbd)
 	return vbd->wr_sects;
 }
 
-/*
- * Tmem functions
- */
-
-xenstat_tmem *xenstat_domain_tmem(xenstat_domain * domain)
-{
-	return &domain->tmem_stats;
-}
-
-/* Get the current number of ephemeral pages */
-unsigned long long xenstat_tmem_curr_eph_pages(xenstat_tmem *tmem)
-{
-	return tmem->curr_eph_pages;
-}
-
-/* Get the number of successful ephemeral gets */
-unsigned long long xenstat_tmem_succ_eph_gets(xenstat_tmem *tmem)
-{
-	return tmem->succ_eph_gets;
-}
-
-/* Get the number of successful persistent puts */
-unsigned long long xenstat_tmem_succ_pers_puts(xenstat_tmem *tmem)
-{
-	return tmem->succ_pers_puts;
-}
-
-/* Get the number of successful persistent gets */
-unsigned long long xenstat_tmem_succ_pers_gets(xenstat_tmem *tmem)
-{
-	return tmem->succ_pers_gets;
-}
-
-
 static char *xenstat_get_domain_name(xenstat_handle *handle, unsigned int domain_id)
 {
 	char path[80];
diff --git a/tools/xenstat/libxenstat/src/xenstat.h b/tools/xenstat/libxenstat/src/xenstat.h
index 47ec60e14d..dfc27d4d2c 100644
--- a/tools/xenstat/libxenstat/src/xenstat.h
+++ b/tools/xenstat/libxenstat/src/xenstat.h
@@ -27,7 +27,6 @@ typedef struct xenstat_node xenstat_node;
 typedef struct xenstat_vcpu xenstat_vcpu;
 typedef struct xenstat_network xenstat_network;
 typedef struct xenstat_vbd xenstat_vbd;
-typedef struct xenstat_tmem xenstat_tmem;
 
 /* Initialize the xenstat library.  Returns a handle to be used with
  * subsequent calls to the xenstat library, or NULL if an error occurs. */
@@ -70,9 +69,6 @@ unsigned long long xenstat_node_tot_mem(xenstat_node * node);
 /* Get amount of free memory on a node */
 unsigned long long xenstat_node_free_mem(xenstat_node * node);
 
-/* Get amount of tmem freeable memory (in MiB) on a node */
-long xenstat_node_freeable_mb(xenstat_node * node);
-
 /* Find the number of domains existing on a node */
 unsigned int xenstat_node_num_domains(xenstat_node * node);
 
@@ -133,9 +129,6 @@ unsigned int xenstat_domain_num_vbds(xenstat_domain *);
 xenstat_vbd *xenstat_domain_vbd(xenstat_domain * domain,
 				    unsigned int vbd);
 
-/* Get the tmem information for a given domain */
-xenstat_tmem *xenstat_domain_tmem(xenstat_domain * domain);
-
 /*
  * VCPU functions - extract information from a xenstat_vcpu
  */
@@ -193,12 +186,4 @@ unsigned long long xenstat_vbd_wr_reqs(xenstat_vbd * vbd);
 unsigned long long xenstat_vbd_rd_sects(xenstat_vbd * vbd);
 unsigned long long xenstat_vbd_wr_sects(xenstat_vbd * vbd);
 
-/*
- * Tmem functions - extract tmem information
- */
-unsigned long long xenstat_tmem_curr_eph_pages(xenstat_tmem *tmem);
-unsigned long long xenstat_tmem_succ_eph_gets(xenstat_tmem *tmem);
-unsigned long long xenstat_tmem_succ_pers_puts(xenstat_tmem *tmem);
-unsigned long long xenstat_tmem_succ_pers_gets(xenstat_tmem *tmem);
-
 #endif /* XENSTAT_H */
diff --git a/tools/xenstat/libxenstat/src/xenstat_priv.h b/tools/xenstat/libxenstat/src/xenstat_priv.h
index 74e0774a5e..3a0b9c990b 100644
--- a/tools/xenstat/libxenstat/src/xenstat_priv.h
+++ b/tools/xenstat/libxenstat/src/xenstat_priv.h
@@ -52,13 +52,6 @@ struct xenstat_node {
 	long freeable_mb;
 };
 
-struct xenstat_tmem {
-	unsigned long long curr_eph_pages;
-	unsigned long long succ_eph_gets;
-	unsigned long long succ_pers_puts;
-	unsigned long long succ_pers_gets;
-};
-
 struct xenstat_domain {
 	unsigned int id;
 	char *name;
@@ -73,7 +66,6 @@ struct xenstat_domain {
 	xenstat_network *networks;	/* Array of length num_networks */
 	unsigned int num_vbds;
 	xenstat_vbd *vbds;
-	xenstat_tmem tmem_stats;
 };
 
 struct xenstat_vcpu {
diff --git a/tools/xenstat/xentop/xentop.c b/tools/xenstat/xentop/xentop.c
index c46581062b..f6fcb234fa 100644
--- a/tools/xenstat/xentop/xentop.c
+++ b/tools/xenstat/xentop/xentop.c
@@ -209,7 +209,6 @@ unsigned int iterations = 0;
 int show_vcpus = 0;
 int show_networks = 0;
 int show_vbds = 0;
-int show_tmem = 0;
 int repeat_header = 0;
 int show_full_name = 0;
 #define PROMPT_VAL_LEN 80
@@ -362,9 +361,6 @@ static int handle_key(int ch)
 		case 'b': case 'B':
 			show_vbds ^= 1;
 			break;
-		case 't': case 'T':
-			show_tmem ^= 1;
-			break;
 		case 'r': case 'R':
 			repeat_header ^= 1;
 			break;
@@ -893,8 +889,8 @@ void do_summary(void)
 	      "%u crashed, %u dying, %u shutdown \n",
 	      num_domains, run, block, pause, crash, dying, shutdown);
 
-	used = xenstat_node_tot_mem(cur_node)-xenstat_node_free_mem(cur_node);
-	freeable_mb = xenstat_node_freeable_mb(cur_node);
+	used = xenstat_node_tot_mem(cur_node);
+	freeable_mb = 0;
 
 	/* Dump node memory and cpu information */
 	if ( freeable_mb <= 0 )
@@ -952,12 +948,6 @@ void do_bottom_line(void)
 		attr_addstr(show_vbds ? COLOR_PAIR(1) : 0, "ds");
 		addstr("  ");
 
-		/* tmem */
-		addch(A_REVERSE | 'T');
-		attr_addstr(show_tmem ? COLOR_PAIR(1) : 0, "mem");
-		addstr("  ");
-
-
 		/* vcpus */
 		addch(A_REVERSE | 'V');
 		attr_addstr(show_vcpus ? COLOR_PAIR(1) : 0, "CPUs");
@@ -1086,23 +1076,6 @@ void do_vbd(xenstat_domain *domain)
 	}
 }
 
-/* Output all tmem information */
-void do_tmem(xenstat_domain *domain)
-{
-	xenstat_tmem *tmem = xenstat_domain_tmem(domain);
-	unsigned long long curr_eph_pages = xenstat_tmem_curr_eph_pages(tmem);
-	unsigned long long succ_eph_gets = xenstat_tmem_succ_eph_gets(tmem);
-	unsigned long long succ_pers_puts = xenstat_tmem_succ_pers_puts(tmem);
-	unsigned long long succ_pers_gets = xenstat_tmem_succ_pers_gets(tmem);
-
-	if (curr_eph_pages | succ_eph_gets | succ_pers_puts | succ_pers_gets)
-		print("Tmem:  Curr eph pages: %8llu   Succ eph gets: %8llu   "
-	              "Succ pers puts: %8llu   Succ pers gets: %8llu\n",
-			curr_eph_pages, succ_eph_gets,
-			succ_pers_puts, succ_pers_gets);
-
-}
-
 static void top(void)
 {
 	xenstat_domain **domains;
@@ -1155,8 +1128,6 @@ static void top(void)
 			do_network(domains[i]);
 		if (show_vbds)
 			do_vbd(domains[i]);
-		if (show_tmem)
-			do_tmem(domains[i]);
 	}
 
 	if (!batch)
@@ -1232,9 +1203,6 @@ int main(int argc, char **argv)
 		case 'f':
 			show_full_name = 1;
 			break;
-		case 't':
-			show_tmem = 1;
-			break;
 		}
 	}
 
diff --git a/tools/xl/Makefile b/tools/xl/Makefile
index 2769295515..af4912e67a 100644
--- a/tools/xl/Makefile
+++ b/tools/xl/Makefile
@@ -17,7 +17,7 @@ CFLAGS_XL += -Wshadow
 
 XL_OBJS-$(CONFIG_X86) = xl_psr.o
 XL_OBJS = xl.o xl_cmdtable.o xl_sxp.o xl_utils.o $(XL_OBJS-y)
-XL_OBJS += xl_tmem.o xl_parse.o xl_cpupool.o xl_flask.o
+XL_OBJS += xl_parse.o xl_cpupool.o xl_flask.o
 XL_OBJS += xl_vtpm.o xl_block.o xl_nic.o xl_usb.o
 XL_OBJS += xl_sched.o xl_pci.o xl_vcpu.o xl_cdrom.o xl_mem.o
 XL_OBJS += xl_info.o xl_console.o xl_misc.o
diff --git a/tools/xl/xl.h b/tools/xl/xl.h
index cf4202bc89..60bdad8ffb 100644
--- a/tools/xl/xl.h
+++ b/tools/xl/xl.h
@@ -184,12 +184,6 @@ int main_usbdev_detach(int argc, char **argv);
 int main_usblist(int argc, char **argv);
 int main_uptime(int argc, char **argv);
 int main_claims(int argc, char **argv);
-int main_tmem_list(int argc, char **argv);
-int main_tmem_freeze(int argc, char **argv);
-int main_tmem_thaw(int argc, char **argv);
-int main_tmem_set(int argc, char **argv);
-int main_tmem_shared_auth(int argc, char **argv);
-int main_tmem_freeable(int argc, char **argv);
 int main_network2attach(int argc, char **argv);
 int main_network2list(int argc, char **argv);
 int main_network2detach(int argc, char **argv);
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index 89716badcb..5baa6023aa 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -443,46 +443,6 @@ struct cmd_spec cmd_table[] = {
       "",
       "",
     },
-    { "tmem-list",
-      &main_tmem_list, 0, 0,
-      "List tmem pools",
-      "[-l] [<Domain>|-a]",
-      "  -l                             List tmem stats",
-    },
-    { "tmem-freeze",
-      &main_tmem_freeze, 0, 1,
-      "Freeze tmem pools",
-      "[<Domain>|-a]",
-      "  -a                             Freeze all tmem",
-    },
-    { "tmem-thaw",
-      &main_tmem_thaw, 0, 1,
-      "Thaw tmem pools",
-      "[<Domain>|-a]",
-      "  -a                             Thaw all tmem",
-    },
-    { "tmem-set",
-      &main_tmem_set, 0, 1,
-      "Change tmem settings",
-      "[<Domain>|-a] [-w[=WEIGHT]|-c[=CAP]|-p[=COMPRESS]]",
-      "  -a                             Operate on all tmem\n"
-      "  -w WEIGHT                      Weight (int)\n"
-      "  -p COMPRESS                    Compress (int)",
-    },
-    { "tmem-shared-auth",
-      &main_tmem_shared_auth, 0, 1,
-      "De/authenticate shared tmem pool",
-      "[<Domain>|-a] [-u[=UUID] [-A[=AUTH]",
-      "  -a                             Authenticate for all tmem pools\n"
-      "  -u UUID                        Specify uuid\n"
-      "                                 (abcdef01-2345-6789-1234-567890abcdef)\n"
-      "  -A AUTH                        0=deauth,1=auth",
-    },
-    { "tmem-freeable",
-      &main_tmem_freeable, 0, 0,
-      "Get information about how much freeable memory (MB) is in-use by tmem",
-      "",
-    },
     { "cpupool-create",
       &main_cpupoolcreate, 1, 1,
       "Create a new CPU pool",
diff --git a/tools/xl/xl_tmem.c b/tools/xl/xl_tmem.c
deleted file mode 100644
index 36214321e6..0000000000
--- a/tools/xl/xl_tmem.c
+++ /dev/null
@@ -1,251 +0,0 @@
-/*
- * Copyright 2009-2017 Citrix Ltd and other contributors
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as published
- * by the Free Software Foundation; version 2.1 only. with the special
- * exception on linking described in file LICENSE.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU Lesser General Public License for more details.
- */
-
-#include <stdlib.h>
-#include <unistd.h>
-
-#include <libxl.h>
-
-#include "xl.h"
-#include "xl_utils.h"
-
-int main_tmem_list(int argc, char **argv)
-{
-    uint32_t domid;
-    const char *dom = NULL;
-    char *buf = NULL;
-    int use_long = 0;
-    int all = 0;
-    int opt;
-
-    SWITCH_FOREACH_OPT(opt, "al", NULL, "tmem-list", 0) {
-    case 'l':
-        use_long = 1;
-        break;
-    case 'a':
-        all = 1;
-        break;
-    }
-
-    dom = argv[optind];
-    if (!dom && all == 0) {
-        fprintf(stderr, "You must specify -a or a domain id.\n\n");
-        help("tmem-list");
-        return 1;
-    }
-
-    if (all)
-        domid = INVALID_DOMID;
-    else
-        domid = find_domain(dom);
-
-    buf = libxl_tmem_list(ctx, domid, use_long);
-    if (buf == NULL)
-        return EXIT_FAILURE;
-
-    printf("%s\n", buf);
-    free(buf);
-    return EXIT_SUCCESS;
-}
-
-int main_tmem_freeze(int argc, char **argv)
-{
-    uint32_t domid;
-    const char *dom = NULL;
-    int all = 0;
-    int opt;
-
-    SWITCH_FOREACH_OPT(opt, "a", NULL, "tmem-freeze", 0) {
-    case 'a':
-        all = 1;
-        break;
-    }
-
-    dom = argv[optind];
-    if (!dom && all == 0) {
-        fprintf(stderr, "You must specify -a or a domain id.\n\n");
-        help("tmem-freeze");
-        return EXIT_FAILURE;
-    }
-
-    if (all)
-        domid = INVALID_DOMID;
-    else
-        domid = find_domain(dom);
-
-    if (libxl_tmem_freeze(ctx, domid) < 0)
-        return EXIT_FAILURE;
-
-    return EXIT_SUCCESS;
-}
-
-int main_tmem_thaw(int argc, char **argv)
-{
-    uint32_t domid;
-    const char *dom = NULL;
-    int all = 0;
-    int opt;
-
-    SWITCH_FOREACH_OPT(opt, "a", NULL, "tmem-thaw", 0) {
-    case 'a':
-        all = 1;
-        break;
-    }
-
-    dom = argv[optind];
-    if (!dom && all == 0) {
-        fprintf(stderr, "You must specify -a or a domain id.\n\n");
-        help("tmem-thaw");
-        return EXIT_FAILURE;
-    }
-
-    if (all)
-        domid = INVALID_DOMID;
-    else
-        domid = find_domain(dom);
-
-    if (libxl_tmem_thaw(ctx, domid) < 0)
-        return EXIT_FAILURE;
-
-    return EXIT_SUCCESS;
-}
-
-int main_tmem_set(int argc, char **argv)
-{
-    uint32_t domid;
-    const char *dom = NULL;
-    uint32_t weight = 0, cap = 0, compress = 0;
-    int opt_w = 0, opt_c = 0, opt_p = 0;
-    int all = 0;
-    int opt;
-    int rc = 0;
-
-    SWITCH_FOREACH_OPT(opt, "aw:c:p:", NULL, "tmem-set", 0) {
-    case 'a':
-        all = 1;
-        break;
-    case 'w':
-        weight = strtol(optarg, NULL, 10);
-        opt_w = 1;
-        break;
-    case 'c':
-        cap = strtol(optarg, NULL, 10);
-        opt_c = 1;
-        break;
-    case 'p':
-        compress = strtol(optarg, NULL, 10);
-        opt_p = 1;
-        break;
-    }
-
-    dom = argv[optind];
-    if (!dom && all == 0) {
-        fprintf(stderr, "You must specify -a or a domain id.\n\n");
-        help("tmem-set");
-        return EXIT_FAILURE;
-    }
-
-    if (all)
-        domid = INVALID_DOMID;
-    else
-        domid = find_domain(dom);
-
-    if (!opt_w && !opt_c && !opt_p) {
-        fprintf(stderr, "No set value specified.\n\n");
-        help("tmem-set");
-        return EXIT_FAILURE;
-    }
-
-    if (opt_w)
-        rc = libxl_tmem_set(ctx, domid, "weight", weight);
-    if (opt_c)
-        rc = libxl_tmem_set(ctx, domid, "cap", cap);
-    if (opt_p)
-        rc = libxl_tmem_set(ctx, domid, "compress", compress);
-
-    if (rc < 0)
-        return EXIT_FAILURE;
-
-    return EXIT_SUCCESS;
-}
-
-int main_tmem_shared_auth(int argc, char **argv)
-{
-    uint32_t domid;
-    const char *autharg = NULL;
-    char *endptr = NULL;
-    const char *dom = NULL;
-    char *uuid = NULL;
-    int auth = -1;
-    int all = 0;
-    int opt;
-
-    SWITCH_FOREACH_OPT(opt, "au:A:", NULL, "tmem-shared-auth", 0) {
-    case 'a':
-        all = 1;
-        break;
-    case 'u':
-        uuid = optarg;
-        break;
-    case 'A':
-        autharg = optarg;
-        break;
-    }
-
-    dom = argv[optind];
-    if (!dom && all == 0) {
-        fprintf(stderr, "You must specify -a or a domain id.\n\n");
-        help("tmem-shared-auth");
-        return EXIT_FAILURE;
-    }
-
-    if (all)
-        domid = INVALID_DOMID;
-    else
-        domid = find_domain(dom);
-
-    if (uuid == NULL || autharg == NULL) {
-        fprintf(stderr, "No uuid or auth specified.\n\n");
-        help("tmem-shared-auth");
-        return EXIT_FAILURE;
-    }
-
-    auth = strtol(autharg, &endptr, 10);
-    if (*endptr != '\0') {
-        fprintf(stderr, "Invalid auth, valid auth are <0|1>.\n\n");
-        return EXIT_FAILURE;
-    }
-
-    if (libxl_tmem_shared_auth(ctx, domid, uuid, auth) < 0)
-        return EXIT_FAILURE;
-
-    return EXIT_SUCCESS;
-}
-
-int main_tmem_freeable(int argc, char **argv)
-{
-    int opt;
-    int mb;
-
-    SWITCH_FOREACH_OPT(opt, "", NULL, "tmem-freeable", 0) {
-        /* No options */
-    }
-
-    mb = libxl_tmem_freeable(ctx);
-    if (mb == -1)
-        return EXIT_FAILURE;
-
-    printf("%d\n", mb);
-    return EXIT_SUCCESS;
-}
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v2 2/3] xen: remove tmem from hypervisor
From: Wei Liu @ 2018-11-28 13:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Daniel De Graaf, Roger Pau Monné

Remove all tmem-related code and CONFIG_TMEM from the hypervisor, and
remove the tmem hypercalls from the default XSM policy.

The remaining code is written as if tmem were disabled and the number
of tmem freeable pages were 0.

We will need to keep public/tmem.h around forever to avoid breaking
guests: remove its hypervisor-only part and put the guest-visible part
under a Xen interface version check. Take the chance to remove trailing
whitespace.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
v2:
1. remove some more residuals
2. fix errors discovered by Gitlab CI
3. keep public/tmem.h
---
 tools/flask/policy/modules/dom0.te           |    4 +-
 tools/flask/policy/modules/guest_features.te |    3 -
 xen/arch/arm/configs/tiny64.conf             |    1 -
 xen/arch/x86/configs/pvshim_defconfig        |    1 -
 xen/arch/x86/guest/hypercall_page.S          |    1 -
 xen/arch/x86/hvm/hypercall.c                 |    3 -
 xen/arch/x86/hypercall.c                     |    1 -
 xen/arch/x86/pv/hypercall.c                  |    3 -
 xen/arch/x86/setup.c                         |    8 -
 xen/common/Kconfig                           |   13 -
 xen/common/Makefile                          |    4 -
 xen/common/compat/tmem_xen.c                 |   23 -
 xen/common/domain.c                          |    3 -
 xen/common/memory.c                          |    5 +-
 xen/common/page_alloc.c                      |   40 +-
 xen/common/sysctl.c                          |    5 -
 xen/common/tmem.c                            | 2095 --------------------------
 xen/common/tmem_control.c                    |  560 -------
 xen/common/tmem_xen.c                        |  277 ----
 xen/include/Makefile                         |    1 -
 xen/include/public/sysctl.h                  |  108 +-
 xen/include/public/tmem.h                    |   14 +-
 xen/include/xen/hypercall.h                  |    7 -
 xen/include/xen/mm.h                         |    2 +
 xen/include/xen/sched.h                      |    3 -
 xen/include/xen/tmem.h                       |   45 -
 xen/include/xen/tmem_control.h               |   39 -
 xen/include/xen/tmem_xen.h                   |  343 -----
 xen/include/xlat.lst                         |    2 -
 xen/include/xsm/dummy.h                      |    6 -
 xen/include/xsm/xsm.h                        |    6 -
 xen/xsm/dummy.c                              |    1 -
 xen/xsm/flask/hooks.c                        |    9 -
 xen/xsm/flask/policy/access_vectors          |    4 -
 34 files changed, 16 insertions(+), 3624 deletions(-)
 delete mode 100644 xen/common/compat/tmem_xen.c
 delete mode 100644 xen/common/tmem.c
 delete mode 100644 xen/common/tmem_control.c
 delete mode 100644 xen/common/tmem_xen.c
 delete mode 100644 xen/include/xen/tmem.h
 delete mode 100644 xen/include/xen/tmem_control.h
 delete mode 100644 xen/include/xen/tmem_xen.h

diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te
index a347d664f8..9970f9dc08 100644
--- a/tools/flask/policy/modules/dom0.te
+++ b/tools/flask/policy/modules/dom0.te
@@ -10,8 +10,8 @@ allow dom0_t xen_t:xen {
 	settime tbufcontrol readconsole clearconsole perfcontrol mtrr_add
 	mtrr_del mtrr_read microcode physinfo quirk writeconsole readapic
 	writeapic privprofile nonprivprofile kexec firmware sleep frequency
-	getidle debug getcpuinfo heap pm_op mca_op lockprof cpupool_op tmem_op
-	tmem_control getscheduler setscheduler
+	getidle debug getcpuinfo heap pm_op mca_op lockprof cpupool_op
+	getscheduler setscheduler
 };
 allow dom0_t xen_t:xen2 {
 	resource_op psr_cmt_op psr_alloc pmu_ctrl get_symbol
diff --git a/tools/flask/policy/modules/guest_features.te b/tools/flask/policy/modules/guest_features.te
index 9ac9780ded..1b77832aea 100644
--- a/tools/flask/policy/modules/guest_features.te
+++ b/tools/flask/policy/modules/guest_features.te
@@ -1,6 +1,3 @@
-# Allow all domains to use (unprivileged parts of) the tmem hypercall
-allow domain_type xen_t:xen tmem_op;
-
 # Allow all domains to use PMU (but not to change its settings --- that's what
 # pmu_ctrl is for)
 allow domain_type xen_t:xen2 pmu_use;
diff --git a/xen/arch/arm/configs/tiny64.conf b/xen/arch/arm/configs/tiny64.conf
index aecc55c95f..cc6d93f2f8 100644
--- a/xen/arch/arm/configs/tiny64.conf
+++ b/xen/arch/arm/configs/tiny64.conf
@@ -11,7 +11,6 @@ CONFIG_ARM=y
 #
 # Common Features
 #
-# CONFIG_TMEM is not set
 CONFIG_SCHED_CREDIT=y
 # CONFIG_SCHED_CREDIT2 is not set
 # CONFIG_SCHED_RTDS is not set
diff --git a/xen/arch/x86/configs/pvshim_defconfig b/xen/arch/x86/configs/pvshim_defconfig
index a12e3d0465..9710aa6238 100644
--- a/xen/arch/x86/configs/pvshim_defconfig
+++ b/xen/arch/x86/configs/pvshim_defconfig
@@ -11,7 +11,6 @@ CONFIG_NR_CPUS=32
 # CONFIG_HVM_FEP is not set
 # CONFIG_TBOOT is not set
 # CONFIG_KEXEC is not set
-# CONFIG_TMEM is not set
 # CONFIG_XENOPROF is not set
 # CONFIG_XSM is not set
 # CONFIG_SCHED_CREDIT2 is not set
diff --git a/xen/arch/x86/guest/hypercall_page.S b/xen/arch/x86/guest/hypercall_page.S
index fdd2e72272..ae00898f07 100644
--- a/xen/arch/x86/guest/hypercall_page.S
+++ b/xen/arch/x86/guest/hypercall_page.S
@@ -58,7 +58,6 @@ DECLARE_HYPERCALL(hvm_op)
 DECLARE_HYPERCALL(sysctl)
 DECLARE_HYPERCALL(domctl)
 DECLARE_HYPERCALL(kexec_op)
-DECLARE_HYPERCALL(tmem_op)
 DECLARE_HYPERCALL(xc_reserved_op)
 DECLARE_HYPERCALL(xenpmu_op)
 
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 19d126377a..b52f7b2f09 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -131,9 +131,6 @@ static const hypercall_table_t hvm_hypercall_table[] = {
     HYPERCALL(hvm_op),
     HYPERCALL(sysctl),
     HYPERCALL(domctl),
-#ifdef CONFIG_TMEM
-    HYPERCALL(tmem_op),
-#endif
     COMPAT_CALL(platform_op),
 #ifdef CONFIG_PV
     COMPAT_CALL(mmuext_op),
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 032de8f8f8..f4da9326fd 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -63,7 +63,6 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(sysctl, 1),
     ARGS(domctl, 1),
     ARGS(kexec_op, 2),
-    ARGS(tmem_op, 1),
     ARGS(xenpmu_op, 2),
 #ifdef CONFIG_HVM
     ARGS(hvm_op, 2),
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 5d11911735..3a67b7e64f 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -74,9 +74,6 @@ const hypercall_table_t pv_hypercall_table[] = {
 #ifdef CONFIG_KEXEC
     COMPAT_CALL(kexec_op),
 #endif
-#ifdef CONFIG_TMEM
-    HYPERCALL(tmem_op),
-#endif
     HYPERCALL(xenpmu_op),
 #ifdef CONFIG_HVM
     HYPERCALL(hvm_op),
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 9cbff22fb3..3621f986f9 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -25,7 +25,6 @@
 #include <xen/dmi.h>
 #include <xen/pfn.h>
 #include <xen/nodemask.h>
-#include <xen/tmem_xen.h>
 #include <xen/virtual_region.h>
 #include <xen/watchdog.h>
 #include <public/version.h>
@@ -1478,13 +1477,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                 s = pfn_to_paddr(limit + 1);
             init_domheap_pages(s, e);
         }
-
-        if ( tmem_enabled() )
-        {
-           printk(XENLOG_WARNING
-                  "TMEM physical RAM limit exceeded, disabling TMEM\n");
-           tmem_disable();
-        }
     }
     else
         end_boot_allocator();
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 68132a3a10..fb719ac237 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -77,19 +77,6 @@ config KEXEC
 
 	  If unsure, say Y.
 
-config TMEM
-	def_bool y
-	prompt "Transcendent Memory Support" if EXPERT = "y"
-	---help---
-	  Transcendent memory allows PV-aware guests to collaborate on memory
-	  usage. Guests can 'swap' their memory to the hypervisor or have an
-	  collective pool of memory shared across guests. The end result is
-	  less memory usage by guests allowing higher guest density.
-
-	  You also have to enable it on the Xen commandline by using tmem=1
-
-	  If unsure, say Y.
-
 config XENOPROF
 	def_bool y
 	prompt "Xen Oprofile Support" if EXPERT = "y"
diff --git a/xen/common/Makefile b/xen/common/Makefile
index ffdfb7448d..02763290a9 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -71,10 +71,6 @@ obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4
 
 obj-$(CONFIG_COMPAT) += $(addprefix compat/,domain.o kernel.o memory.o multicall.o xlat.o)
 
-tmem-y := tmem.o tmem_xen.o tmem_control.o
-tmem-$(CONFIG_COMPAT) += compat/tmem_xen.o
-obj-$(CONFIG_TMEM) += $(tmem-y)
-
 extra-y := symbols-dummy.o
 
 subdir-$(CONFIG_COVERAGE) += coverage
diff --git a/xen/common/compat/tmem_xen.c b/xen/common/compat/tmem_xen.c
deleted file mode 100644
index 5111fd8df6..0000000000
--- a/xen/common/compat/tmem_xen.c
+++ /dev/null
@@ -1,23 +0,0 @@
-/******************************************************************************
- * tmem_xen.c
- *
- */
-
-#include <xen/lib.h>
-#include <xen/sched.h>
-#include <xen/domain.h>
-#include <xen/guest_access.h>
-#include <xen/hypercall.h>
-#include <compat/tmem.h>
-
-CHECK_tmem_oid;
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 78cc5249e8..3362ad3ad3 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -40,7 +40,6 @@
 #include <public/vcpu.h>
 #include <xsm/xsm.h>
 #include <xen/trace.h>
-#include <xen/tmem.h>
 #include <asm/setup.h>
 
 #ifdef CONFIG_X86
@@ -719,10 +718,8 @@ int domain_kill(struct domain *d)
         d->is_dying = DOMDYING_dying;
         evtchn_destroy(d);
         gnttab_release_mappings(d);
-        tmem_destroy(d->tmem_client);
         vnuma_destroy(d->vnuma);
         domain_set_outstanding_pages(d, 0);
-        d->tmem_client = NULL;
         /* fallthrough */
     case DOMDYING_dying:
         rc = domain_relinquish_resources(d);
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 175bd62c11..3bd7902cf0 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -7,6 +7,7 @@
  * Copyright (c) 2003-2005, K A Fraser
  */
 
+#include <xen/domain_page.h>
 #include <xen/types.h>
 #include <xen/lib.h>
 #include <xen/mm.h>
@@ -18,8 +19,6 @@
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/errno.h>
-#include <xen/tmem.h>
-#include <xen/tmem_xen.h>
 #include <xen/numa.h>
 #include <xen/mem_access.h>
 #include <xen/trace.h>
@@ -250,7 +249,7 @@ static void populate_physmap(struct memop_args *a)
 
                 if ( unlikely(!page) )
                 {
-                    if ( !tmem_enabled() || a->extent_order )
+                    if ( a->extent_order )
                         gdprintk(XENLOG_INFO,
                                  "Could not allocate order=%u extent: id=%d memflags=%#x (%u of %u)\n",
                                  a->extent_order, d->domain_id, a->memflags,
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index fd3b0aaa83..bb19b026a8 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -135,8 +135,6 @@
 #include <xen/numa.h>
 #include <xen/nodemask.h>
 #include <xen/event.h>
-#include <xen/tmem.h>
-#include <xen/tmem_xen.h>
 #include <public/sysctl.h>
 #include <public/sched.h>
 #include <asm/page.h>
@@ -529,16 +527,6 @@ int domain_set_outstanding_pages(struct domain *d, unsigned long pages)
     /* how much memory is available? */
     avail_pages = total_avail_pages;
 
-    /* Note: The usage of claim means that allocation from a guest *might*
-     * have to come from freeable memory. Using free memory is always better, if
-     * it is available, than using freeable memory.
-     *
-     * But that is OK as once the claim has been made, it still can take minutes
-     * before the claim is fully satisfied. Tmem can make use of the unclaimed
-     * pages during this time (to store ephemeral/freeable pages only,
-     * not persistent pages).
-     */
-    avail_pages += tmem_freeable_pages();
     avail_pages -= outstanding_claims;
 
     /*
@@ -710,8 +698,7 @@ static void __init setup_low_mem_virq(void)
 
 static void check_low_mem_virq(void)
 {
-    unsigned long avail_pages = total_avail_pages +
-        tmem_freeable_pages() - outstanding_claims;
+    unsigned long avail_pages = total_avail_pages - outstanding_claims;
 
     if ( unlikely(avail_pages <= low_mem_virq_th) )
     {
@@ -940,8 +927,7 @@ static struct page_info *alloc_heap_pages(
      * Claimed memory is considered unavailable unless the request
      * is made by a domain with sufficient unclaimed pages.
      */
-    if ( (outstanding_claims + request >
-          total_avail_pages + tmem_freeable_pages()) &&
+    if ( (outstanding_claims + request > total_avail_pages) &&
           ((memflags & MEMF_no_refcount) ||
            !d || d->outstanding_pages < request) )
     {
@@ -949,22 +935,6 @@ static struct page_info *alloc_heap_pages(
         return NULL;
     }
 
-    /*
-     * TMEM: When available memory is scarce due to tmem absorbing it, allow
-     * only mid-size allocations to avoid worst of fragmentation issues.
-     * Others try tmem pools then fail.  This is a workaround until all
-     * post-dom0-creation-multi-page allocations can be eliminated.
-     */
-    if ( ((order == 0) || (order >= 9)) &&
-         (total_avail_pages <= midsize_alloc_zone_pages) &&
-         tmem_freeable_pages() )
-    {
-        /* Try to free memory from tmem. */
-        pg = tmem_relinquish_pages(order, memflags);
-        spin_unlock(&heap_lock);
-        return pg;
-    }
-
     pg = get_free_buddy(zone_lo, zone_hi, order, memflags, d);
     /* Try getting a dirty buddy if we couldn't get a clean one. */
     if ( !pg && !(memflags & MEMF_no_scrub) )
@@ -1444,10 +1414,6 @@ static void free_heap_pages(
     else
         pg->u.free.first_dirty = INVALID_DIRTY_IDX;
 
-    if ( tmem_enabled() )
-        midsize_alloc_zone_pages = max(
-            midsize_alloc_zone_pages, total_avail_pages / MIDSIZE_ALLOC_FRAC);
-
     /* Merge chunks as far as possible. */
     while ( order < MAX_ORDER )
     {
@@ -2265,7 +2231,7 @@ int assign_pages(
     {
         if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages) )
         {
-            if ( !tmem_enabled() || order != 0 || d->tot_pages != d->max_pages )
+            if ( order != 0 || d->tot_pages != d->max_pages )
                 gprintk(XENLOG_INFO, "Over-allocation for domain %u: "
                         "%u > %u\n", d->domain_id,
                         d->tot_pages + (1 << order), d->max_pages);
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index c0aa6bde4e..765effde8d 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -13,7 +13,6 @@
 #include <xen/domain.h>
 #include <xen/event.h>
 #include <xen/domain_page.h>
-#include <xen/tmem.h>
 #include <xen/trace.h>
 #include <xen/console.h>
 #include <xen/iocap.h>
@@ -456,10 +455,6 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
     }
 #endif
 
-    case XEN_SYSCTL_tmem_op:
-        ret = tmem_control(&op->u.tmem_op);
-        break;
-
     case XEN_SYSCTL_livepatch_op:
         ret = livepatch_op(&op->u.livepatch);
         if ( ret != -ENOSYS && ret != -EOPNOTSUPP )
diff --git a/xen/common/tmem.c b/xen/common/tmem.c
deleted file mode 100644
index c077f87e77..0000000000
--- a/xen/common/tmem.c
+++ /dev/null
@@ -1,2095 +0,0 @@
-/******************************************************************************
- * tmem.c
- *
- * Transcendent memory
- *
- * Copyright (c) 2009, Dan Magenheimer, Oracle Corp.
- */
-
-/* TODO list: 090129 (updated 100318)
-   - any better reclamation policy?
-   - use different tlsf pools for each client (maybe each pool)
-   - test shared access more completely (ocfs2)
-   - add feedback-driven compression (not for persistent pools though!)
-   - add data-structure total bytes overhead stats
- */
-
-#ifdef __XEN__
-#include <xen/tmem_xen.h> /* host-specific (eg Xen) code goes here. */
-#endif
-
-#include <public/sysctl.h>
-#include <xen/tmem.h>
-#include <xen/rbtree.h>
-#include <xen/radix-tree.h>
-#include <xen/list.h>
-#include <xen/init.h>
-
-#define TMEM_SPEC_VERSION 1
-
-struct tmem_statistics tmem_stats = {
-    .global_obj_count = ATOMIC_INIT(0),
-    .global_pgp_count = ATOMIC_INIT(0),
-    .global_pcd_count = ATOMIC_INIT(0),
-    .global_page_count = ATOMIC_INIT(0),
-    .global_rtree_node_count = ATOMIC_INIT(0),
-};
-
-/************ CORE DATA STRUCTURES ************************************/
-
-struct tmem_object_root {
-    struct xen_tmem_oid oid;
-    struct rb_node rb_tree_node; /* Protected by pool->pool_rwlock. */
-    unsigned long objnode_count; /* Atomicity depends on obj_spinlock. */
-    long pgp_count; /* Atomicity depends on obj_spinlock. */
-    struct radix_tree_root tree_root; /* Tree of pages within object. */
-    struct tmem_pool *pool;
-    domid_t last_client;
-    spinlock_t obj_spinlock;
-};
-
-struct tmem_object_node {
-    struct tmem_object_root *obj;
-    struct radix_tree_node rtn;
-};
-
-struct tmem_page_descriptor {
-    union {
-        struct list_head global_eph_pages;
-        struct list_head client_inv_pages;
-    };
-    union {
-        struct {
-            union {
-                struct list_head client_eph_pages;
-                struct list_head pool_pers_pages;
-            };
-            struct tmem_object_root *obj;
-        } us;
-        struct xen_tmem_oid inv_oid;  /* Used for invalid list only. */
-    };
-    pagesize_t size; /* 0 == PAGE_SIZE (pfp), -1 == data invalid,
-                    else compressed data (cdata). */
-    uint32_t index;
-    bool eviction_attempted;  /* CHANGE TO lifetimes? (settable). */
-    union {
-        struct page_info *pfp;  /* Page frame pointer. */
-        char *cdata; /* Compressed data. */
-        struct tmem_page_content_descriptor *pcd; /* Page dedup. */
-    };
-    union {
-        uint64_t timestamp;
-        uint32_t pool_id;  /* Used for invalid list only. */
-    };
-};
-
-#define PCD_TZE_MAX_SIZE (PAGE_SIZE - (PAGE_SIZE/64))
-
-struct tmem_page_content_descriptor {
-    union {
-        struct page_info *pfp;  /* Page frame pointer. */
-        char *cdata; /* If compression_enabled. */
-    };
-    pagesize_t size; /* If compression_enabled -> 0<size<PAGE_SIZE (*cdata)
-                     * else if tze, 0<=size<PAGE_SIZE, rounded up to mult of 8
-                     * else PAGE_SIZE -> *pfp. */
-};
-
-static int tmem_initialized = 0;
-
-struct xmem_pool *tmem_mempool = 0;
-unsigned int tmem_mempool_maxalloc = 0;
-
-DEFINE_SPINLOCK(tmem_page_list_lock);
-PAGE_LIST_HEAD(tmem_page_list);
-unsigned long tmem_page_list_pages = 0;
-
-DEFINE_RWLOCK(tmem_rwlock);
-static DEFINE_SPINLOCK(eph_lists_spinlock); /* Protects global AND clients. */
-static DEFINE_SPINLOCK(pers_lists_spinlock);
-
-#define ASSERT_SPINLOCK(_l) ASSERT(spin_is_locked(_l))
-#define ASSERT_WRITELOCK(_l) ASSERT(rw_is_write_locked(_l))
-
-    atomic_t client_weight_total;
-
-struct tmem_global tmem_global = {
-    .ephemeral_page_list = LIST_HEAD_INIT(tmem_global.ephemeral_page_list),
-    .client_list = LIST_HEAD_INIT(tmem_global.client_list),
-    .client_weight_total = ATOMIC_INIT(0),
-};
-
-/*
- * There two types of memory allocation interfaces in tmem.
- * One is based on xmem_pool and the other is used for allocate a whole page.
- * Both of them are based on the lowlevel function __tmem_alloc_page/_thispool().
- * The call trace of alloc path is like below.
- * Persistant pool:
- *     1.tmem_malloc()
- *         > xmem_pool_alloc()
- *             > tmem_persistent_pool_page_get()
- *                 > __tmem_alloc_page_thispool()
- *     2.tmem_alloc_page()
- *         > __tmem_alloc_page_thispool()
- *
- * Ephemeral pool:
- *     1.tmem_malloc()
- *         > xmem_pool_alloc()
- *             > tmem_mempool_page_get()
- *                 > __tmem_alloc_page()
- *     2.tmem_alloc_page()
- *         > __tmem_alloc_page()
- *
- * The free path is done in the same manner.
- */
-static void *tmem_malloc(size_t size, struct tmem_pool *pool)
-{
-    void *v = NULL;
-
-    if ( (pool != NULL) && is_persistent(pool) ) {
-        if ( pool->client->persistent_pool )
-            v = xmem_pool_alloc(size, pool->client->persistent_pool);
-    }
-    else
-    {
-        ASSERT( size < tmem_mempool_maxalloc );
-        ASSERT( tmem_mempool != NULL );
-        v = xmem_pool_alloc(size, tmem_mempool);
-    }
-    if ( v == NULL )
-        tmem_stats.alloc_failed++;
-    return v;
-}
-
-static void tmem_free(void *p, struct tmem_pool *pool)
-{
-    if ( pool == NULL || !is_persistent(pool) )
-    {
-        ASSERT( tmem_mempool != NULL );
-        xmem_pool_free(p, tmem_mempool);
-    }
-    else
-    {
-        ASSERT( pool->client->persistent_pool != NULL );
-        xmem_pool_free(p, pool->client->persistent_pool);
-    }
-}
-
-static struct page_info *tmem_alloc_page(struct tmem_pool *pool)
-{
-    struct page_info *pfp = NULL;
-
-    if ( pool != NULL && is_persistent(pool) )
-        pfp = __tmem_alloc_page_thispool(pool->client->domain);
-    else
-        pfp = __tmem_alloc_page();
-    if ( pfp == NULL )
-        tmem_stats.alloc_page_failed++;
-    else
-        atomic_inc_and_max(global_page_count);
-    return pfp;
-}
-
-static void tmem_free_page(struct tmem_pool *pool, struct page_info *pfp)
-{
-    ASSERT(pfp);
-    if ( pool == NULL || !is_persistent(pool) )
-        __tmem_free_page(pfp);
-    else
-        __tmem_free_page_thispool(pfp);
-    atomic_dec_and_assert(global_page_count);
-}
-
-static void *tmem_mempool_page_get(unsigned long size)
-{
-    struct page_info *pi;
-
-    ASSERT(size == PAGE_SIZE);
-    if ( (pi = __tmem_alloc_page()) == NULL )
-        return NULL;
-    return page_to_virt(pi);
-}
-
-static void tmem_mempool_page_put(void *page_va)
-{
-    ASSERT(IS_PAGE_ALIGNED(page_va));
-    __tmem_free_page(virt_to_page(page_va));
-}
-
-static int __init tmem_mempool_init(void)
-{
-    tmem_mempool = xmem_pool_create("tmem", tmem_mempool_page_get,
-        tmem_mempool_page_put, PAGE_SIZE, 0, PAGE_SIZE);
-    if ( tmem_mempool )
-        tmem_mempool_maxalloc = xmem_pool_maxalloc(tmem_mempool);
-    return tmem_mempool != NULL;
-}
-
-/* Persistent pools are per-domain. */
-static void *tmem_persistent_pool_page_get(unsigned long size)
-{
-    struct page_info *pi;
-    struct domain *d = current->domain;
-
-    ASSERT(size == PAGE_SIZE);
-    if ( (pi = __tmem_alloc_page_thispool(d)) == NULL )
-        return NULL;
-    ASSERT(IS_VALID_PAGE(pi));
-    return page_to_virt(pi);
-}
-
-static void tmem_persistent_pool_page_put(void *page_va)
-{
-    struct page_info *pi;
-
-    ASSERT(IS_PAGE_ALIGNED(page_va));
-    pi = mfn_to_page(_mfn(virt_to_mfn(page_va)));
-    ASSERT(IS_VALID_PAGE(pi));
-    __tmem_free_page_thispool(pi);
-}
-
-/*
- * Page content descriptor manipulation routines.
- */
-#define NOT_SHAREABLE ((uint16_t)-1UL)
-
-/************ PAGE DESCRIPTOR MANIPULATION ROUTINES *******************/
-
-/* Allocate a struct tmem_page_descriptor and associate it with an object. */
-static struct tmem_page_descriptor *pgp_alloc(struct tmem_object_root *obj)
-{
-    struct tmem_page_descriptor *pgp;
-    struct tmem_pool *pool;
-
-    ASSERT(obj != NULL);
-    ASSERT(obj->pool != NULL);
-    pool = obj->pool;
-    if ( (pgp = tmem_malloc(sizeof(struct tmem_page_descriptor), pool)) == NULL )
-        return NULL;
-    pgp->us.obj = obj;
-    INIT_LIST_HEAD(&pgp->global_eph_pages);
-    INIT_LIST_HEAD(&pgp->us.client_eph_pages);
-    pgp->pfp = NULL;
-    pgp->size = -1;
-    pgp->index = -1;
-    pgp->timestamp = get_cycles();
-    atomic_inc_and_max(global_pgp_count);
-    atomic_inc(&pool->pgp_count);
-    if ( _atomic_read(pool->pgp_count) > pool->pgp_count_max )
-        pool->pgp_count_max = _atomic_read(pool->pgp_count);
-    return pgp;
-}
-
-static struct tmem_page_descriptor *pgp_lookup_in_obj(struct tmem_object_root *obj, uint32_t index)
-{
-    ASSERT(obj != NULL);
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    ASSERT(obj->pool != NULL);
-    return radix_tree_lookup(&obj->tree_root, index);
-}
-
-static void pgp_free_data(struct tmem_page_descriptor *pgp, struct tmem_pool *pool)
-{
-    pagesize_t pgp_size = pgp->size;
-
-    if ( pgp->pfp == NULL )
-        return;
-    if ( pgp_size )
-        tmem_free(pgp->cdata, pool);
-    else
-        tmem_free_page(pgp->us.obj->pool,pgp->pfp);
-    if ( pool != NULL && pgp_size )
-    {
-        pool->client->compressed_pages--;
-        pool->client->compressed_sum_size -= pgp_size;
-    }
-    pgp->pfp = NULL;
-    pgp->size = -1;
-}
-
-static void __pgp_free(struct tmem_page_descriptor *pgp, struct tmem_pool *pool)
-{
-    pgp->us.obj = NULL;
-    pgp->index = -1;
-    tmem_free(pgp, pool);
-}
-
-static void pgp_free(struct tmem_page_descriptor *pgp)
-{
-    struct tmem_pool *pool = NULL;
-
-    ASSERT(pgp->us.obj != NULL);
-    ASSERT(pgp->us.obj->pool != NULL);
-    ASSERT(pgp->us.obj->pool->client != NULL);
-
-    pool = pgp->us.obj->pool;
-    if ( !is_persistent(pool) )
-    {
-        ASSERT(list_empty(&pgp->global_eph_pages));
-        ASSERT(list_empty(&pgp->us.client_eph_pages));
-    }
-    pgp_free_data(pgp, pool);
-    atomic_dec_and_assert(global_pgp_count);
-    atomic_dec(&pool->pgp_count);
-    ASSERT(_atomic_read(pool->pgp_count) >= 0);
-    pgp->size = -1;
-    if ( is_persistent(pool) && pool->client->info.flags.u.migrating )
-    {
-        pgp->inv_oid = pgp->us.obj->oid;
-        pgp->pool_id = pool->pool_id;
-        return;
-    }
-    __pgp_free(pgp, pool);
-}
-
-/* Remove pgp from global/pool/client lists and free it. */
-static void pgp_delist_free(struct tmem_page_descriptor *pgp)
-{
-    struct client *client;
-    uint64_t life;
-
-    ASSERT(pgp != NULL);
-    ASSERT(pgp->us.obj != NULL);
-    ASSERT(pgp->us.obj->pool != NULL);
-    client = pgp->us.obj->pool->client;
-    ASSERT(client != NULL);
-
-    /* Delist pgp. */
-    if ( !is_persistent(pgp->us.obj->pool) )
-    {
-        spin_lock(&eph_lists_spinlock);
-        if ( !list_empty(&pgp->us.client_eph_pages) )
-            client->eph_count--;
-        ASSERT(client->eph_count >= 0);
-        list_del_init(&pgp->us.client_eph_pages);
-        if ( !list_empty(&pgp->global_eph_pages) )
-            tmem_global.eph_count--;
-        ASSERT(tmem_global.eph_count >= 0);
-        list_del_init(&pgp->global_eph_pages);
-        spin_unlock(&eph_lists_spinlock);
-    }
-    else
-    {
-        if ( client->info.flags.u.migrating )
-        {
-            spin_lock(&pers_lists_spinlock);
-            list_add_tail(&pgp->client_inv_pages,
-                          &client->persistent_invalidated_list);
-            if ( pgp != pgp->us.obj->pool->cur_pgp )
-                list_del_init(&pgp->us.pool_pers_pages);
-            spin_unlock(&pers_lists_spinlock);
-        }
-        else
-        {
-            spin_lock(&pers_lists_spinlock);
-            list_del_init(&pgp->us.pool_pers_pages);
-            spin_unlock(&pers_lists_spinlock);
-        }
-    }
-    life = get_cycles() - pgp->timestamp;
-    pgp->us.obj->pool->sum_life_cycles += life;
-
-    /* Free pgp. */
-    pgp_free(pgp);
-}
-
-/* Called only indirectly by radix_tree_destroy. */
-static void pgp_destroy(void *v)
-{
-    struct tmem_page_descriptor *pgp = (struct tmem_page_descriptor *)v;
-
-    pgp->us.obj->pgp_count--;
-    pgp_delist_free(pgp);
-}
-
-static int pgp_add_to_obj(struct tmem_object_root *obj, uint32_t index, struct tmem_page_descriptor *pgp)
-{
-    int ret;
-
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    ret = radix_tree_insert(&obj->tree_root, index, pgp);
-    if ( !ret )
-        obj->pgp_count++;
-    return ret;
-}
-
-static struct tmem_page_descriptor *pgp_delete_from_obj(struct tmem_object_root *obj, uint32_t index)
-{
-    struct tmem_page_descriptor *pgp;
-
-    ASSERT(obj != NULL);
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    ASSERT(obj->pool != NULL);
-    pgp = radix_tree_delete(&obj->tree_root, index);
-    if ( pgp != NULL )
-        obj->pgp_count--;
-    ASSERT(obj->pgp_count >= 0);
-
-    return pgp;
-}
-
-/************ RADIX TREE NODE MANIPULATION ROUTINES *******************/
-
-/* Called only indirectly from radix_tree_insert. */
-static struct radix_tree_node *rtn_alloc(void *arg)
-{
-    struct tmem_object_node *objnode;
-    struct tmem_object_root *obj = (struct tmem_object_root *)arg;
-
-    ASSERT(obj->pool != NULL);
-    objnode = tmem_malloc(sizeof(struct tmem_object_node),obj->pool);
-    if (objnode == NULL)
-        return NULL;
-    objnode->obj = obj;
-    memset(&objnode->rtn, 0, sizeof(struct radix_tree_node));
-    if (++obj->pool->objnode_count > obj->pool->objnode_count_max)
-        obj->pool->objnode_count_max = obj->pool->objnode_count;
-    atomic_inc_and_max(global_rtree_node_count);
-    obj->objnode_count++;
-    return &objnode->rtn;
-}
-
-/* Called only indirectly from radix_tree_delete/destroy. */
-static void rtn_free(struct radix_tree_node *rtn, void *arg)
-{
-    struct tmem_pool *pool;
-    struct tmem_object_node *objnode;
-
-    ASSERT(rtn != NULL);
-    objnode = container_of(rtn,struct tmem_object_node,rtn);
-    ASSERT(objnode->obj != NULL);
-    ASSERT_SPINLOCK(&objnode->obj->obj_spinlock);
-    pool = objnode->obj->pool;
-    ASSERT(pool != NULL);
-    pool->objnode_count--;
-    objnode->obj->objnode_count--;
-    objnode->obj = NULL;
-    tmem_free(objnode, pool);
-    atomic_dec_and_assert(global_rtree_node_count);
-}
-
-/************ POOL OBJECT COLLECTION MANIPULATION ROUTINES *******************/
-
-static int oid_compare(struct xen_tmem_oid *left,
-                       struct xen_tmem_oid *right)
-{
-    if ( left->oid[2] == right->oid[2] )
-    {
-        if ( left->oid[1] == right->oid[1] )
-        {
-            if ( left->oid[0] == right->oid[0] )
-                return 0;
-            else if ( left->oid[0] < right->oid[0] )
-                return -1;
-            else
-                return 1;
-        }
-        else if ( left->oid[1] < right->oid[1] )
-            return -1;
-        else
-            return 1;
-    }
-    else if ( left->oid[2] < right->oid[2] )
-        return -1;
-    else
-        return 1;
-}
-
-static void oid_set_invalid(struct xen_tmem_oid *oidp)
-{
-    oidp->oid[0] = oidp->oid[1] = oidp->oid[2] = -1UL;
-}
-
-static unsigned oid_hash(struct xen_tmem_oid *oidp)
-{
-    return (tmem_hash(oidp->oid[0] ^ oidp->oid[1] ^ oidp->oid[2],
-                     BITS_PER_LONG) & OBJ_HASH_BUCKETS_MASK);
-}
-
-/* Searches for object==oid in pool, returns locked object if found. */
-static struct tmem_object_root * obj_find(struct tmem_pool *pool,
-                                          struct xen_tmem_oid *oidp)
-{
-    struct rb_node *node;
-    struct tmem_object_root *obj;
-
-restart_find:
-    read_lock(&pool->pool_rwlock);
-    node = pool->obj_rb_root[oid_hash(oidp)].rb_node;
-    while ( node )
-    {
-        obj = container_of(node, struct tmem_object_root, rb_tree_node);
-        switch ( oid_compare(&obj->oid, oidp) )
-        {
-            case 0: /* Equal. */
-                if ( !spin_trylock(&obj->obj_spinlock) )
-                {
-                    read_unlock(&pool->pool_rwlock);
-                    goto restart_find;
-                }
-                read_unlock(&pool->pool_rwlock);
-                return obj;
-            case -1:
-                node = node->rb_left;
-                break;
-            case 1:
-                node = node->rb_right;
-        }
-    }
-    read_unlock(&pool->pool_rwlock);
-    return NULL;
-}
-
-/* Free an object that has no more pgps in it. */
-static void obj_free(struct tmem_object_root *obj)
-{
-    struct tmem_pool *pool;
-    struct xen_tmem_oid old_oid;
-
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    ASSERT(obj != NULL);
-    ASSERT(obj->pgp_count == 0);
-    pool = obj->pool;
-    ASSERT(pool != NULL);
-    ASSERT(pool->client != NULL);
-    ASSERT_WRITELOCK(&pool->pool_rwlock);
-    if ( obj->tree_root.rnode != NULL ) /* May be a "stump" with no leaves. */
-        radix_tree_destroy(&obj->tree_root, pgp_destroy);
-    ASSERT((long)obj->objnode_count == 0);
-    ASSERT(obj->tree_root.rnode == NULL);
-    pool->obj_count--;
-    ASSERT(pool->obj_count >= 0);
-    obj->pool = NULL;
-    old_oid = obj->oid;
-    oid_set_invalid(&obj->oid);
-    obj->last_client = TMEM_CLI_ID_NULL;
-    atomic_dec_and_assert(global_obj_count);
-    rb_erase(&obj->rb_tree_node, &pool->obj_rb_root[oid_hash(&old_oid)]);
-    spin_unlock(&obj->obj_spinlock);
-    tmem_free(obj, pool);
-}
-
-static int obj_rb_insert(struct rb_root *root, struct tmem_object_root *obj)
-{
-    struct rb_node **new, *parent = NULL;
-    struct tmem_object_root *this;
-
-    ASSERT(obj->pool);
-    ASSERT_WRITELOCK(&obj->pool->pool_rwlock);
-
-    new = &(root->rb_node);
-    while ( *new )
-    {
-        this = container_of(*new, struct tmem_object_root, rb_tree_node);
-        parent = *new;
-        switch ( oid_compare(&this->oid, &obj->oid) )
-        {
-            case 0:
-                return 0;
-            case -1:
-                new = &((*new)->rb_left);
-                break;
-            case 1:
-                new = &((*new)->rb_right);
-                break;
-        }
-    }
-    rb_link_node(&obj->rb_tree_node, parent, new);
-    rb_insert_color(&obj->rb_tree_node, root);
-    return 1;
-}
-
-/*
- * Allocate, initialize, and insert an tmem_object_root
- * (should be called only if find failed).
- */
-static struct tmem_object_root * obj_alloc(struct tmem_pool *pool,
-                                           struct xen_tmem_oid *oidp)
-{
-    struct tmem_object_root *obj;
-
-    ASSERT(pool != NULL);
-    if ( (obj = tmem_malloc(sizeof(struct tmem_object_root), pool)) == NULL )
-        return NULL;
-    pool->obj_count++;
-    if (pool->obj_count > pool->obj_count_max)
-        pool->obj_count_max = pool->obj_count;
-    atomic_inc_and_max(global_obj_count);
-    radix_tree_init(&obj->tree_root);
-    radix_tree_set_alloc_callbacks(&obj->tree_root, rtn_alloc, rtn_free, obj);
-    spin_lock_init(&obj->obj_spinlock);
-    obj->pool = pool;
-    obj->oid = *oidp;
-    obj->objnode_count = 0;
-    obj->pgp_count = 0;
-    obj->last_client = TMEM_CLI_ID_NULL;
-    return obj;
-}
-
-/* Free an object after destroying any pgps in it. */
-static void obj_destroy(struct tmem_object_root *obj)
-{
-    ASSERT_WRITELOCK(&obj->pool->pool_rwlock);
-    radix_tree_destroy(&obj->tree_root, pgp_destroy);
-    obj_free(obj);
-}
-
-/* Destroys all objs in a pool, or only if obj->last_client matches cli_id. */
-static void pool_destroy_objs(struct tmem_pool *pool, domid_t cli_id)
-{
-    struct rb_node *node;
-    struct tmem_object_root *obj;
-    int i;
-
-    write_lock(&pool->pool_rwlock);
-    pool->is_dying = 1;
-    for (i = 0; i < OBJ_HASH_BUCKETS; i++)
-    {
-        node = rb_first(&pool->obj_rb_root[i]);
-        while ( node != NULL )
-        {
-            obj = container_of(node, struct tmem_object_root, rb_tree_node);
-            spin_lock(&obj->obj_spinlock);
-            node = rb_next(node);
-            if ( obj->last_client == cli_id )
-                obj_destroy(obj);
-            else
-                spin_unlock(&obj->obj_spinlock);
-        }
-    }
-    write_unlock(&pool->pool_rwlock);
-}
-
-
-/************ POOL MANIPULATION ROUTINES ******************************/
-
-static struct tmem_pool * pool_alloc(void)
-{
-    struct tmem_pool *pool;
-    int i;
-
-    if ( (pool = xzalloc(struct tmem_pool)) == NULL )
-        return NULL;
-    for (i = 0; i < OBJ_HASH_BUCKETS; i++)
-        pool->obj_rb_root[i] = RB_ROOT;
-    INIT_LIST_HEAD(&pool->persistent_page_list);
-    rwlock_init(&pool->pool_rwlock);
-    return pool;
-}
-
-static void pool_free(struct tmem_pool *pool)
-{
-    pool->client = NULL;
-    xfree(pool);
-}
-
-/*
- * Register new_client as a user of this shared pool and return 0 on succ.
- */
-static int shared_pool_join(struct tmem_pool *pool, struct client *new_client)
-{
-    struct share_list *sl;
-    ASSERT(is_shared(pool));
-
-    if ( (sl = tmem_malloc(sizeof(struct share_list), NULL)) == NULL )
-        return -1;
-    sl->client = new_client;
-    list_add_tail(&sl->share_list, &pool->share_list);
-    if ( new_client->cli_id != pool->client->cli_id )
-        tmem_client_info("adding new %s %d to shared pool owned by %s %d\n",
-                    tmem_client_str, new_client->cli_id, tmem_client_str,
-                    pool->client->cli_id);
-    else if ( pool->shared_count )
-        tmem_client_info("inter-guest sharing of shared pool %s by client %d\n",
-                         tmem_client_str, pool->client->cli_id);
-    ++pool->shared_count;
-    return 0;
-}
-
-/* Reassign "ownership" of the pool to another client that shares this pool. */
-static void shared_pool_reassign(struct tmem_pool *pool)
-{
-    struct share_list *sl;
-    int poolid;
-    struct client *old_client = pool->client, *new_client;
-
-    ASSERT(is_shared(pool));
-    if ( list_empty(&pool->share_list) )
-    {
-        ASSERT(pool->shared_count == 0);
-        return;
-    }
-    old_client->pools[pool->pool_id] = NULL;
-    sl = list_entry(pool->share_list.next, struct share_list, share_list);
-    /*
-     * The sl->client can be old_client if there are multiple shared pools
-     * within an guest.
-     */
-    pool->client = new_client = sl->client;
-    for (poolid = 0; poolid < MAX_POOLS_PER_DOMAIN; poolid++)
-        if (new_client->pools[poolid] == pool)
-            break;
-    ASSERT(poolid != MAX_POOLS_PER_DOMAIN);
-    new_client->eph_count += _atomic_read(pool->pgp_count);
-    old_client->eph_count -= _atomic_read(pool->pgp_count);
-    list_splice_init(&old_client->ephemeral_page_list,
-                     &new_client->ephemeral_page_list);
-    tmem_client_info("reassigned shared pool from %s=%d to %s=%d pool_id=%d\n",
-        tmem_cli_id_str, old_client->cli_id, tmem_cli_id_str, new_client->cli_id, poolid);
-    pool->pool_id = poolid;
-}
-
-/*
- * Destroy all objects with last_client same as passed cli_id,
- * remove pool's cli_id from list of sharers of this pool.
- */
-static int shared_pool_quit(struct tmem_pool *pool, domid_t cli_id)
-{
-    struct share_list *sl;
-    int s_poolid;
-
-    ASSERT(is_shared(pool));
-    ASSERT(pool->client != NULL);
-
-    ASSERT_WRITELOCK(&tmem_rwlock);
-    pool_destroy_objs(pool, cli_id);
-    list_for_each_entry(sl,&pool->share_list, share_list)
-    {
-        if (sl->client->cli_id != cli_id)
-            continue;
-        list_del(&sl->share_list);
-        tmem_free(sl, pool);
-        --pool->shared_count;
-        if (pool->client->cli_id == cli_id)
-            shared_pool_reassign(pool);
-        if (pool->shared_count)
-            return pool->shared_count;
-        for (s_poolid = 0; s_poolid < MAX_GLOBAL_SHARED_POOLS; s_poolid++)
-            if ( (tmem_global.shared_pools[s_poolid]) == pool )
-            {
-                tmem_global.shared_pools[s_poolid] = NULL;
-                break;
-            }
-        return 0;
-    }
-    tmem_client_warn("tmem: no match unsharing pool, %s=%d\n",
-        tmem_cli_id_str,pool->client->cli_id);
-    return -1;
-}
-
-/* Flush all data (owned by cli_id) from a pool and, optionally, free it. */
-static void pool_flush(struct tmem_pool *pool, domid_t cli_id)
-{
-    ASSERT(pool != NULL);
-    if ( (is_shared(pool)) && (shared_pool_quit(pool,cli_id) > 0) )
-    {
-        tmem_client_warn("tmem: %s=%d no longer using shared pool %d owned by %s=%d\n",
-           tmem_cli_id_str, cli_id, pool->pool_id, tmem_cli_id_str,pool->client->cli_id);
-        return;
-    }
-    tmem_client_info("Destroying %s-%s tmem pool %s=%d pool_id=%d\n",
-                    is_persistent(pool) ? "persistent" : "ephemeral" ,
-                    is_shared(pool) ? "shared" : "private",
-                    tmem_cli_id_str, pool->client->cli_id, pool->pool_id);
-    if ( pool->client->info.flags.u.migrating )
-    {
-        tmem_client_warn("can't destroy pool while %s is live-migrating\n",
-                    tmem_client_str);
-        return;
-    }
-    pool_destroy_objs(pool, TMEM_CLI_ID_NULL);
-    pool->client->pools[pool->pool_id] = NULL;
-    pool_free(pool);
-}
-
-/************ CLIENT MANIPULATION OPERATIONS **************************/
-
-struct client *client_create(domid_t cli_id)
-{
-    struct client *client = xzalloc(struct client);
-    int i, shift;
-    char name[5];
-    struct domain *d;
-
-    tmem_client_info("tmem: initializing tmem capability for %s=%d...",
-                    tmem_cli_id_str, cli_id);
-    if ( client == NULL )
-    {
-        tmem_client_err("failed... out of memory\n");
-        goto fail;
-    }
-
-    for (i = 0, shift = 12; i < 4; shift -=4, i++)
-        name[i] = (((unsigned short)cli_id >> shift) & 0xf) + '0';
-    name[4] = '\0';
-    client->persistent_pool = xmem_pool_create(name, tmem_persistent_pool_page_get,
-        tmem_persistent_pool_page_put, PAGE_SIZE, 0, PAGE_SIZE);
-    if ( client->persistent_pool == NULL )
-    {
-        tmem_client_err("failed... can't alloc persistent pool\n");
-        goto fail;
-    }
-
-    d = rcu_lock_domain_by_id(cli_id);
-    if ( d == NULL ) {
-        tmem_client_err("failed... can't set client\n");
-        xmem_pool_destroy(client->persistent_pool);
-        goto fail;
-    }
-    if ( !d->is_dying ) {
-        d->tmem_client = client;
-        client->domain = d;
-    }
-    rcu_unlock_domain(d);
-
-    client->cli_id = cli_id;
-    client->info.version = TMEM_SPEC_VERSION;
-    client->info.maxpools = MAX_POOLS_PER_DOMAIN;
-    client->info.flags.u.compress = tmem_compression_enabled();
-    for ( i = 0; i < MAX_GLOBAL_SHARED_POOLS; i++)
-        client->shared_auth_uuid[i][0] =
-            client->shared_auth_uuid[i][1] = -1L;
-    list_add_tail(&client->client_list, &tmem_global.client_list);
-    INIT_LIST_HEAD(&client->ephemeral_page_list);
-    INIT_LIST_HEAD(&client->persistent_invalidated_list);
-    tmem_client_info("ok\n");
-    return client;
-
- fail:
-    xfree(client);
-    return NULL;
-}
-
-static void client_free(struct client *client)
-{
-    list_del(&client->client_list);
-    xmem_pool_destroy(client->persistent_pool);
-    xfree(client);
-}
-
-/* Flush all data from a client and, optionally, free it. */
-static void client_flush(struct client *client)
-{
-    int i;
-    struct tmem_pool *pool;
-
-    for  (i = 0; i < MAX_POOLS_PER_DOMAIN; i++)
-    {
-        if ( (pool = client->pools[i]) == NULL )
-            continue;
-        pool_flush(pool, client->cli_id);
-        client->pools[i] = NULL;
-        client->info.nr_pools--;
-    }
-    client_free(client);
-}
-
-static bool client_over_quota(const struct client *client)
-{
-    int total = _atomic_read(tmem_global.client_weight_total);
-
-    ASSERT(client != NULL);
-    if ( (total == 0) || (client->info.weight == 0) ||
-          (client->eph_count == 0) )
-        return false;
-
-    return (((tmem_global.eph_count * 100L) / client->eph_count) >
-            ((total * 100L) / client->info.weight));
-}
-
-/************ MEMORY REVOCATION ROUTINES *******************************/
-
-static bool tmem_try_to_evict_pgp(struct tmem_page_descriptor *pgp,
-                                  bool *hold_pool_rwlock)
-{
-    struct tmem_object_root *obj = pgp->us.obj;
-    struct tmem_pool *pool = obj->pool;
-
-    if ( pool->is_dying )
-        return false;
-    if ( spin_trylock(&obj->obj_spinlock) )
-    {
-        if ( obj->pgp_count > 1 )
-            return true;
-        if ( write_trylock(&pool->pool_rwlock) )
-        {
-            *hold_pool_rwlock = 1;
-            return true;
-        }
-        spin_unlock(&obj->obj_spinlock);
-    }
-    return false;
-}
-
-int tmem_evict(void)
-{
-    struct client *client = current->domain->tmem_client;
-    struct tmem_page_descriptor *pgp = NULL, *pgp_del;
-    struct tmem_object_root *obj;
-    struct tmem_pool *pool;
-    int ret = 0;
-    bool hold_pool_rwlock = false;
-
-    tmem_stats.evict_attempts++;
-    spin_lock(&eph_lists_spinlock);
-    if ( (client != NULL) && client_over_quota(client) &&
-         !list_empty(&client->ephemeral_page_list) )
-    {
-        list_for_each_entry(pgp, &client->ephemeral_page_list, us.client_eph_pages)
-            if ( tmem_try_to_evict_pgp(pgp, &hold_pool_rwlock) )
-                goto found;
-    }
-    else if ( !list_empty(&tmem_global.ephemeral_page_list) )
-    {
-        list_for_each_entry(pgp, &tmem_global.ephemeral_page_list, global_eph_pages)
-            if ( tmem_try_to_evict_pgp(pgp, &hold_pool_rwlock) )
-            {
-                client = pgp->us.obj->pool->client;
-                goto found;
-            }
-    }
-     /* Global_ephemeral_page_list is empty, so we bail out. */
-    spin_unlock(&eph_lists_spinlock);
-    goto out;
-
-found:
-    /* Delist. */
-    list_del_init(&pgp->us.client_eph_pages);
-    client->eph_count--;
-    list_del_init(&pgp->global_eph_pages);
-    tmem_global.eph_count--;
-    ASSERT(tmem_global.eph_count >= 0);
-    ASSERT(client->eph_count >= 0);
-    spin_unlock(&eph_lists_spinlock);
-
-    ASSERT(pgp != NULL);
-    obj = pgp->us.obj;
-    ASSERT(obj != NULL);
-    ASSERT(obj->pool != NULL);
-    pool = obj->pool;
-
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    pgp_del = pgp_delete_from_obj(obj, pgp->index);
-    ASSERT(pgp_del == pgp);
-
-    /* pgp already delist, so call pgp_free directly. */
-    pgp_free(pgp);
-    if ( obj->pgp_count == 0 )
-    {
-        ASSERT_WRITELOCK(&pool->pool_rwlock);
-        obj_free(obj);
-    }
-    else
-        spin_unlock(&obj->obj_spinlock);
-    if ( hold_pool_rwlock )
-        write_unlock(&pool->pool_rwlock);
-    tmem_stats.evicted_pgs++;
-    ret = 1;
-out:
-    return ret;
-}
-
-
-/*
- * Under certain conditions (e.g. if each client is putting pages for exactly
- * one object), once locks are held, freeing up memory may
- * result in livelocks and very long "put" times, so we try to ensure there
- * is a minimum amount of memory (1MB) available BEFORE any data structure
- * locks are held.
- */
-static inline bool tmem_ensure_avail_pages(void)
-{
-    int failed_evict = 10;
-    unsigned long free_mem;
-
-    do {
-        free_mem = (tmem_page_list_pages + total_free_pages())
-                        >> (20 - PAGE_SHIFT);
-        if ( free_mem )
-            return true;
-        if ( !tmem_evict() )
-            failed_evict--;
-    } while ( failed_evict > 0 );
-
-    return false;
-}
-
-/************ TMEM CORE OPERATIONS ************************************/
-
-static int do_tmem_put_compress(struct tmem_page_descriptor *pgp, xen_pfn_t cmfn,
-                                         tmem_cli_va_param_t clibuf)
-{
-    void *dst, *p;
-    size_t size;
-    int ret = 0;
-
-    ASSERT(pgp != NULL);
-    ASSERT(pgp->us.obj != NULL);
-    ASSERT_SPINLOCK(&pgp->us.obj->obj_spinlock);
-    ASSERT(pgp->us.obj->pool != NULL);
-    ASSERT(pgp->us.obj->pool->client != NULL);
-
-    if ( pgp->pfp != NULL )
-        pgp_free_data(pgp, pgp->us.obj->pool);
-    ret = tmem_compress_from_client(cmfn, &dst, &size, clibuf);
-    if ( ret <= 0 )
-        goto out;
-    else if ( (size == 0) || (size >= tmem_mempool_maxalloc) ) {
-        ret = 0;
-        goto out;
-    } else if ( (p = tmem_malloc(size,pgp->us.obj->pool)) == NULL ) {
-        ret = -ENOMEM;
-        goto out;
-    } else {
-        memcpy(p,dst,size);
-        pgp->cdata = p;
-    }
-    pgp->size = size;
-    pgp->us.obj->pool->client->compressed_pages++;
-    pgp->us.obj->pool->client->compressed_sum_size += size;
-    ret = 1;
-
-out:
-    return ret;
-}
-
-static int do_tmem_dup_put(struct tmem_page_descriptor *pgp, xen_pfn_t cmfn,
-       tmem_cli_va_param_t clibuf)
-{
-    struct tmem_pool *pool;
-    struct tmem_object_root *obj;
-    struct client *client;
-    struct tmem_page_descriptor *pgpfound = NULL;
-    int ret;
-
-    ASSERT(pgp != NULL);
-    ASSERT(pgp->pfp != NULL);
-    ASSERT(pgp->size != -1);
-    obj = pgp->us.obj;
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    ASSERT(obj != NULL);
-    pool = obj->pool;
-    ASSERT(pool != NULL);
-    client = pool->client;
-    if ( client->info.flags.u.migrating )
-        goto failed_dup; /* No dups allowed when migrating. */
-    /* Can we successfully manipulate pgp to change out the data? */
-    if ( client->info.flags.u.compress && pgp->size != 0 )
-    {
-        ret = do_tmem_put_compress(pgp, cmfn, clibuf);
-        if ( ret == 1 )
-            goto done;
-        else if ( ret == 0 )
-            goto copy_uncompressed;
-        else if ( ret == -ENOMEM )
-            goto failed_dup;
-        else if ( ret == -EFAULT )
-            goto bad_copy;
-    }
-
-copy_uncompressed:
-    if ( pgp->pfp )
-        pgp_free_data(pgp, pool);
-    if ( ( pgp->pfp = tmem_alloc_page(pool) ) == NULL )
-        goto failed_dup;
-    pgp->size = 0;
-    ret = tmem_copy_from_client(pgp->pfp, cmfn, tmem_cli_buf_null);
-    if ( ret < 0 )
-        goto bad_copy;
-
-done:
-    /* Successfully replaced data, clean up and return success. */
-    if ( is_shared(pool) )
-        obj->last_client = client->cli_id;
-    spin_unlock(&obj->obj_spinlock);
-    pool->dup_puts_replaced++;
-    pool->good_puts++;
-    if ( is_persistent(pool) )
-        client->succ_pers_puts++;
-    return 1;
-
-bad_copy:
-    tmem_stats.failed_copies++;
-    goto cleanup;
-
-failed_dup:
-    /*
-     * Couldn't change out the data, flush the old data and return
-     * -ENOSPC instead of -ENOMEM to differentiate failed _dup_ put.
-     */
-    ret = -ENOSPC;
-cleanup:
-    pgpfound = pgp_delete_from_obj(obj, pgp->index);
-    ASSERT(pgpfound == pgp);
-    pgp_delist_free(pgpfound);
-    if ( obj->pgp_count == 0 )
-    {
-        write_lock(&pool->pool_rwlock);
-        obj_free(obj);
-        write_unlock(&pool->pool_rwlock);
-    } else {
-        spin_unlock(&obj->obj_spinlock);
-    }
-    pool->dup_puts_flushed++;
-    return ret;
-}
-
-static int do_tmem_put(struct tmem_pool *pool,
-                       struct xen_tmem_oid *oidp, uint32_t index,
-                       xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
-{
-    struct tmem_object_root *obj = NULL;
-    struct tmem_page_descriptor *pgp = NULL;
-    struct client *client;
-    int ret, newobj = 0;
-
-    ASSERT(pool != NULL);
-    client = pool->client;
-    ASSERT(client != NULL);
-    ret = client->info.flags.u.frozen  ? -EFROZEN : -ENOMEM;
-    pool->puts++;
-
-refind:
-    /* Does page already exist (dup)?  if so, handle specially. */
-    if ( (obj = obj_find(pool, oidp)) != NULL )
-    {
-        if ((pgp = pgp_lookup_in_obj(obj, index)) != NULL)
-        {
-            return do_tmem_dup_put(pgp, cmfn, clibuf);
-        }
-        else
-        {
-            /* No puts allowed into a frozen pool (except dup puts). */
-            if ( client->info.flags.u.frozen )
-                goto unlock_obj;
-        }
-    }
-    else
-    {
-        /* No puts allowed into a frozen pool (except dup puts). */
-        if ( client->info.flags.u.frozen )
-            return ret;
-        if ( (obj = obj_alloc(pool, oidp)) == NULL )
-            return -ENOMEM;
-
-        write_lock(&pool->pool_rwlock);
-        /*
-         * Parallel callers may already allocated obj and inserted to obj_rb_root
-         * before us.
-         */
-        if ( !obj_rb_insert(&pool->obj_rb_root[oid_hash(oidp)], obj) )
-        {
-            tmem_free(obj, pool);
-            write_unlock(&pool->pool_rwlock);
-            goto refind;
-        }
-
-        spin_lock(&obj->obj_spinlock);
-        newobj = 1;
-        write_unlock(&pool->pool_rwlock);
-    }
-
-    /* When arrive here, we have a spinlocked obj for use. */
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    if ( (pgp = pgp_alloc(obj)) == NULL )
-        goto unlock_obj;
-
-    ret = pgp_add_to_obj(obj, index, pgp);
-    if ( ret == -ENOMEM  )
-        /* Warning: may result in partially built radix tree ("stump"). */
-        goto free_pgp;
-
-    pgp->index = index;
-    pgp->size = 0;
-
-    if ( client->info.flags.u.compress )
-    {
-        ASSERT(pgp->pfp == NULL);
-        ret = do_tmem_put_compress(pgp, cmfn, clibuf);
-        if ( ret == 1 )
-            goto insert_page;
-        if ( ret == -ENOMEM )
-        {
-            client->compress_nomem++;
-            goto del_pgp_from_obj;
-        }
-        if ( ret == 0 )
-        {
-            client->compress_poor++;
-            goto copy_uncompressed;
-        }
-        if ( ret == -EFAULT )
-            goto bad_copy;
-    }
-
-copy_uncompressed:
-    if ( ( pgp->pfp = tmem_alloc_page(pool) ) == NULL )
-    {
-        ret = -ENOMEM;
-        goto del_pgp_from_obj;
-    }
-    ret = tmem_copy_from_client(pgp->pfp, cmfn, clibuf);
-    if ( ret < 0 )
-        goto bad_copy;
-
-insert_page:
-    if ( !is_persistent(pool) )
-    {
-        spin_lock(&eph_lists_spinlock);
-        list_add_tail(&pgp->global_eph_pages, &tmem_global.ephemeral_page_list);
-        if (++tmem_global.eph_count > tmem_stats.global_eph_count_max)
-            tmem_stats.global_eph_count_max = tmem_global.eph_count;
-        list_add_tail(&pgp->us.client_eph_pages,
-            &client->ephemeral_page_list);
-        if (++client->eph_count > client->eph_count_max)
-            client->eph_count_max = client->eph_count;
-        spin_unlock(&eph_lists_spinlock);
-    }
-    else
-    { /* is_persistent. */
-        spin_lock(&pers_lists_spinlock);
-        list_add_tail(&pgp->us.pool_pers_pages,
-            &pool->persistent_page_list);
-        spin_unlock(&pers_lists_spinlock);
-    }
-
-    if ( is_shared(pool) )
-        obj->last_client = client->cli_id;
-
-    /* Free the obj spinlock. */
-    spin_unlock(&obj->obj_spinlock);
-    pool->good_puts++;
-
-    if ( is_persistent(pool) )
-        client->succ_pers_puts++;
-    else
-        tmem_stats.tot_good_eph_puts++;
-    return 1;
-
-bad_copy:
-    tmem_stats.failed_copies++;
-
-del_pgp_from_obj:
-    ASSERT((obj != NULL) && (pgp != NULL) && (pgp->index != -1));
-    pgp_delete_from_obj(obj, pgp->index);
-
-free_pgp:
-    pgp_free(pgp);
-unlock_obj:
-    if ( newobj )
-    {
-        write_lock(&pool->pool_rwlock);
-        obj_free(obj);
-        write_unlock(&pool->pool_rwlock);
-    }
-    else
-    {
-        spin_unlock(&obj->obj_spinlock);
-    }
-    pool->no_mem_puts++;
-    return ret;
-}
-
-static int do_tmem_get(struct tmem_pool *pool,
-                       struct xen_tmem_oid *oidp, uint32_t index,
-                       xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
-{
-    struct tmem_object_root *obj;
-    struct tmem_page_descriptor *pgp;
-    struct client *client = pool->client;
-    int rc;
-
-    if ( !_atomic_read(pool->pgp_count) )
-        return -EEMPTY;
-
-    pool->gets++;
-    obj = obj_find(pool,oidp);
-    if ( obj == NULL )
-        return 0;
-
-    ASSERT_SPINLOCK(&obj->obj_spinlock);
-    if (is_shared(pool) || is_persistent(pool) )
-        pgp = pgp_lookup_in_obj(obj, index);
-    else
-        pgp = pgp_delete_from_obj(obj, index);
-    if ( pgp == NULL )
-    {
-        spin_unlock(&obj->obj_spinlock);
-        return 0;
-    }
-    ASSERT(pgp->size != -1);
-    if ( pgp->size != 0 )
-    {
-        rc = tmem_decompress_to_client(cmfn, pgp->cdata, pgp->size, clibuf);
-    }
-    else
-        rc = tmem_copy_to_client(cmfn, pgp->pfp, clibuf);
-    if ( rc <= 0 )
-        goto bad_copy;
-
-    if ( !is_persistent(pool) )
-    {
-        if ( !is_shared(pool) )
-        {
-            pgp_delist_free(pgp);
-            if ( obj->pgp_count == 0 )
-            {
-                write_lock(&pool->pool_rwlock);
-                obj_free(obj);
-                obj = NULL;
-                write_unlock(&pool->pool_rwlock);
-            }
-        } else {
-            spin_lock(&eph_lists_spinlock);
-            list_del(&pgp->global_eph_pages);
-            list_add_tail(&pgp->global_eph_pages,&tmem_global.ephemeral_page_list);
-            list_del(&pgp->us.client_eph_pages);
-            list_add_tail(&pgp->us.client_eph_pages,&client->ephemeral_page_list);
-            spin_unlock(&eph_lists_spinlock);
-            obj->last_client = current->domain->domain_id;
-        }
-    }
-    if ( obj != NULL )
-    {
-        spin_unlock(&obj->obj_spinlock);
-    }
-    pool->found_gets++;
-    if ( is_persistent(pool) )
-        client->succ_pers_gets++;
-    else
-        client->succ_eph_gets++;
-    return 1;
-
-bad_copy:
-    spin_unlock(&obj->obj_spinlock);
-    tmem_stats.failed_copies++;
-    return rc;
-}
-
-static int do_tmem_flush_page(struct tmem_pool *pool,
-                              struct xen_tmem_oid *oidp, uint32_t index)
-{
-    struct tmem_object_root *obj;
-    struct tmem_page_descriptor *pgp;
-
-    pool->flushs++;
-    obj = obj_find(pool,oidp);
-    if ( obj == NULL )
-        goto out;
-    pgp = pgp_delete_from_obj(obj, index);
-    if ( pgp == NULL )
-    {
-        spin_unlock(&obj->obj_spinlock);
-        goto out;
-    }
-    pgp_delist_free(pgp);
-    if ( obj->pgp_count == 0 )
-    {
-        write_lock(&pool->pool_rwlock);
-        obj_free(obj);
-        write_unlock(&pool->pool_rwlock);
-    } else {
-        spin_unlock(&obj->obj_spinlock);
-    }
-    pool->flushs_found++;
-
-out:
-    if ( pool->client->info.flags.u.frozen )
-        return -EFROZEN;
-    else
-        return 1;
-}
-
-static int do_tmem_flush_object(struct tmem_pool *pool,
-                                struct xen_tmem_oid *oidp)
-{
-    struct tmem_object_root *obj;
-
-    pool->flush_objs++;
-    obj = obj_find(pool,oidp);
-    if ( obj == NULL )
-        goto out;
-    write_lock(&pool->pool_rwlock);
-    obj_destroy(obj);
-    pool->flush_objs_found++;
-    write_unlock(&pool->pool_rwlock);
-
-out:
-    if ( pool->client->info.flags.u.frozen )
-        return -EFROZEN;
-    else
-        return 1;
-}
-
-static int do_tmem_destroy_pool(uint32_t pool_id)
-{
-    struct client *client = current->domain->tmem_client;
-    struct tmem_pool *pool;
-
-    if ( pool_id >= MAX_POOLS_PER_DOMAIN )
-        return 0;
-    if ( (pool = client->pools[pool_id]) == NULL )
-        return 0;
-    client->pools[pool_id] = NULL;
-    pool_flush(pool, client->cli_id);
-    client->info.nr_pools--;
-    return 1;
-}
-
-int do_tmem_new_pool(domid_t this_cli_id,
-                     uint32_t d_poolid, uint32_t flags,
-                     uint64_t uuid_lo, uint64_t uuid_hi)
-{
-    struct client *client;
-    domid_t cli_id;
-    int persistent = flags & TMEM_POOL_PERSIST;
-    int shared = flags & TMEM_POOL_SHARED;
-    int pagebits = (flags >> TMEM_POOL_PAGESIZE_SHIFT)
-         & TMEM_POOL_PAGESIZE_MASK;
-    int specversion = (flags >> TMEM_POOL_VERSION_SHIFT)
-         & TMEM_POOL_VERSION_MASK;
-    struct tmem_pool *pool, *shpool;
-    int i, first_unused_s_poolid;
-
-    if ( this_cli_id == TMEM_CLI_ID_NULL )
-        cli_id = current->domain->domain_id;
-    else
-        cli_id = this_cli_id;
-    tmem_client_info("tmem: allocating %s-%s tmem pool for %s=%d...",
-        persistent ? "persistent" : "ephemeral" ,
-        shared ? "shared" : "private", tmem_cli_id_str, cli_id);
-    if ( specversion != TMEM_SPEC_VERSION )
-    {
-        tmem_client_err("failed... unsupported spec version\n");
-        return -EPERM;
-    }
-    if ( shared && persistent )
-    {
-        tmem_client_err("failed... unable to create a shared-persistant pool\n");
-        return -EPERM;
-    }
-    if ( pagebits != (PAGE_SHIFT - 12) )
-    {
-        tmem_client_err("failed... unsupported pagesize %d\n",
-                       1 << (pagebits + 12));
-        return -EPERM;
-    }
-    if ( flags & TMEM_POOL_PRECOMPRESSED )
-    {
-        tmem_client_err("failed... precompression flag set but unsupported\n");
-        return -EPERM;
-    }
-    if ( flags & TMEM_POOL_RESERVED_BITS )
-    {
-        tmem_client_err("failed... reserved bits must be zero\n");
-        return -EPERM;
-    }
-    if ( this_cli_id != TMEM_CLI_ID_NULL )
-    {
-        if ( (client = tmem_client_from_cli_id(this_cli_id)) == NULL
-             || d_poolid >= MAX_POOLS_PER_DOMAIN
-             || client->pools[d_poolid] != NULL )
-            return -EPERM;
-    }
-    else
-    {
-        client = current->domain->tmem_client;
-        ASSERT(client != NULL);
-        for ( d_poolid = 0; d_poolid < MAX_POOLS_PER_DOMAIN; d_poolid++ )
-            if ( client->pools[d_poolid] == NULL )
-                break;
-        if ( d_poolid >= MAX_POOLS_PER_DOMAIN )
-        {
-            tmem_client_err("failed... no more pool slots available for this %s\n",
-                   tmem_client_str);
-            return -EPERM;
-        }
-    }
-
-    if ( (pool = pool_alloc()) == NULL )
-    {
-        tmem_client_err("failed... out of memory\n");
-        return -ENOMEM;
-    }
-    client->pools[d_poolid] = pool;
-    pool->client = client;
-    pool->pool_id = d_poolid;
-    pool->shared = shared;
-    pool->persistent = persistent;
-    pool->uuid[0] = uuid_lo;
-    pool->uuid[1] = uuid_hi;
-
-    /*
-     * Already created a pool when arrived here, but need some special process
-     * for shared pool.
-     */
-    if ( shared )
-    {
-        if ( uuid_lo == -1L && uuid_hi == -1L )
-        {
-            tmem_client_info("Invalid uuid, create non shared pool instead!\n");
-            pool->shared = 0;
-            goto out;
-        }
-        if ( !tmem_global.shared_auth )
-        {
-            for ( i = 0; i < MAX_GLOBAL_SHARED_POOLS; i++)
-                if ( (client->shared_auth_uuid[i][0] == uuid_lo) &&
-                     (client->shared_auth_uuid[i][1] == uuid_hi) )
-                    break;
-            if ( i == MAX_GLOBAL_SHARED_POOLS )
-            {
-                tmem_client_info("Shared auth failed, create non shared pool instead!\n");
-                pool->shared = 0;
-                goto out;
-            }
-        }
-
-        /*
-         * Authorize okay, match a global shared pool or use the newly allocated
-         * one.
-         */
-        first_unused_s_poolid = MAX_GLOBAL_SHARED_POOLS;
-        for ( i = 0; i < MAX_GLOBAL_SHARED_POOLS; i++ )
-        {
-            if ( (shpool = tmem_global.shared_pools[i]) != NULL )
-            {
-                if ( shpool->uuid[0] == uuid_lo && shpool->uuid[1] == uuid_hi )
-                {
-                    /* Succ to match a global shared pool. */
-                    tmem_client_info("(matches shared pool uuid=%"PRIx64".%"PRIx64") pool_id=%d\n",
-                        uuid_hi, uuid_lo, d_poolid);
-                    client->pools[d_poolid] = shpool;
-                    if ( !shared_pool_join(shpool, client) )
-                    {
-                        pool_free(pool);
-                        goto out;
-                    }
-                    else
-                        goto fail;
-                }
-            }
-            else
-            {
-                if ( first_unused_s_poolid == MAX_GLOBAL_SHARED_POOLS )
-                    first_unused_s_poolid = i;
-            }
-        }
-
-        /* Failed to find a global shared pool slot. */
-        if ( first_unused_s_poolid == MAX_GLOBAL_SHARED_POOLS )
-        {
-            tmem_client_warn("tmem: failed... no global shared pool slots available\n");
-            goto fail;
-        }
-        /* Add pool to global shared pool. */
-        else
-        {
-            INIT_LIST_HEAD(&pool->share_list);
-            pool->shared_count = 0;
-            if ( shared_pool_join(pool, client) )
-                goto fail;
-            tmem_global.shared_pools[first_unused_s_poolid] = pool;
-        }
-    }
-
-out:
-    tmem_client_info("pool_id=%d\n", d_poolid);
-    client->info.nr_pools++;
-    return d_poolid;
-
-fail:
-    pool_free(pool);
-    return -EPERM;
-}
-
-/************ TMEM CONTROL OPERATIONS ************************************/
-
-int tmemc_shared_pool_auth(domid_t cli_id, uint64_t uuid_lo,
-                           uint64_t uuid_hi, bool auth)
-{
-    struct client *client;
-    int i, free = -1;
-
-    if ( cli_id == TMEM_CLI_ID_NULL )
-    {
-        tmem_global.shared_auth = auth;
-        return 1;
-    }
-    client = tmem_client_from_cli_id(cli_id);
-    if ( client == NULL )
-        return -EINVAL;
-
-    for ( i = 0; i < MAX_GLOBAL_SHARED_POOLS; i++)
-    {
-        if ( auth == 0 )
-        {
-            if ( (client->shared_auth_uuid[i][0] == uuid_lo) &&
-                    (client->shared_auth_uuid[i][1] == uuid_hi) )
-            {
-                client->shared_auth_uuid[i][0] = -1L;
-                client->shared_auth_uuid[i][1] = -1L;
-                return 1;
-            }
-        }
-        else
-        {
-            if ( (client->shared_auth_uuid[i][0] == -1L) &&
-                    (client->shared_auth_uuid[i][1] == -1L) )
-            {
-                free = i;
-                break;
-            }
-	}
-    }
-    if ( auth == 0 )
-        return 0;
-    else if ( free == -1)
-        return -ENOMEM;
-    else
-    {
-        client->shared_auth_uuid[free][0] = uuid_lo;
-        client->shared_auth_uuid[free][1] = uuid_hi;
-        return 1;
-    }
-}
-
-static int tmemc_save_subop(int cli_id, uint32_t pool_id,
-                        uint32_t subop, tmem_cli_va_param_t buf, uint32_t arg)
-{
-    struct client *client = tmem_client_from_cli_id(cli_id);
-    uint32_t p;
-    struct tmem_page_descriptor *pgp, *pgp2;
-    int rc = -ENOENT;
-
-    switch(subop)
-    {
-    case XEN_SYSCTL_TMEM_OP_SAVE_BEGIN:
-        if ( client == NULL )
-            break;
-        for (p = 0; p < MAX_POOLS_PER_DOMAIN; p++)
-            if ( client->pools[p] != NULL )
-                break;
-
-        if ( p == MAX_POOLS_PER_DOMAIN )
-            break;
-
-        client->was_frozen = client->info.flags.u.frozen;
-        client->info.flags.u.frozen = 1;
-        if ( arg != 0 )
-            client->info.flags.u.migrating = 1;
-        rc = 0;
-        break;
-    case XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN:
-        if ( client == NULL )
-            rc = client_create(cli_id) ? 0 : -ENOMEM;
-        else
-            rc = -EEXIST;
-        break;
-    case XEN_SYSCTL_TMEM_OP_SAVE_END:
-        if ( client == NULL )
-            break;
-        client->info.flags.u.migrating = 0;
-        if ( !list_empty(&client->persistent_invalidated_list) )
-            list_for_each_entry_safe(pgp,pgp2,
-              &client->persistent_invalidated_list, client_inv_pages)
-                __pgp_free(pgp, client->pools[pgp->pool_id]);
-        client->info.flags.u.frozen = client->was_frozen;
-        rc = 0;
-        break;
-    }
-    return rc;
-}
-
-static int tmemc_save_get_next_page(int cli_id, uint32_t pool_id,
-                        tmem_cli_va_param_t buf, uint32_t bufsize)
-{
-    struct client *client = tmem_client_from_cli_id(cli_id);
-    struct tmem_pool *pool = (client == NULL || pool_id >= MAX_POOLS_PER_DOMAIN)
-                   ? NULL : client->pools[pool_id];
-    struct tmem_page_descriptor *pgp;
-    struct xen_tmem_oid *oid;
-    int ret = 0;
-    struct tmem_handle h;
-
-    if ( pool == NULL || !is_persistent(pool) )
-        return -1;
-
-    if ( bufsize < PAGE_SIZE + sizeof(struct tmem_handle) )
-        return -ENOMEM;
-
-    spin_lock(&pers_lists_spinlock);
-    if ( list_empty(&pool->persistent_page_list) )
-    {
-        ret = -1;
-        goto out;
-    }
-    /* Note: pool->cur_pgp is the pgp last returned by get_next_page. */
-    if ( pool->cur_pgp == NULL )
-    {
-        /* Process the first one. */
-        pool->cur_pgp = pgp = list_entry((&pool->persistent_page_list)->next,
-                         struct tmem_page_descriptor,us.pool_pers_pages);
-    } else if ( list_is_last(&pool->cur_pgp->us.pool_pers_pages,
-                             &pool->persistent_page_list) )
-    {
-        /* Already processed the last one in the list. */
-        ret = -1;
-        goto out;
-    }
-    pgp = list_entry((&pool->cur_pgp->us.pool_pers_pages)->next,
-                         struct tmem_page_descriptor,us.pool_pers_pages);
-    pool->cur_pgp = pgp;
-    oid = &pgp->us.obj->oid;
-    h.pool_id = pool_id;
-    BUILD_BUG_ON(sizeof(h.oid) != sizeof(*oid));
-    memcpy(&(h.oid), oid, sizeof(h.oid));
-    h.index = pgp->index;
-    if ( copy_to_guest(guest_handle_cast(buf, void), &h, 1) )
-    {
-        ret = -EFAULT;
-        goto out;
-    }
-    guest_handle_add_offset(buf, sizeof(h));
-    ret = do_tmem_get(pool, oid, pgp->index, 0, buf);
-
-out:
-    spin_unlock(&pers_lists_spinlock);
-    return ret;
-}
-
-static int tmemc_save_get_next_inv(int cli_id, tmem_cli_va_param_t buf,
-                        uint32_t bufsize)
-{
-    struct client *client = tmem_client_from_cli_id(cli_id);
-    struct tmem_page_descriptor *pgp;
-    struct tmem_handle h;
-    int ret = 0;
-
-    if ( client == NULL )
-        return 0;
-    if ( bufsize < sizeof(struct tmem_handle) )
-        return 0;
-    spin_lock(&pers_lists_spinlock);
-    if ( list_empty(&client->persistent_invalidated_list) )
-        goto out;
-    if ( client->cur_pgp == NULL )
-    {
-        pgp = list_entry((&client->persistent_invalidated_list)->next,
-                         struct tmem_page_descriptor,client_inv_pages);
-        client->cur_pgp = pgp;
-    } else if ( list_is_last(&client->cur_pgp->client_inv_pages,
-                             &client->persistent_invalidated_list) )
-    {
-        client->cur_pgp = NULL;
-        ret = 0;
-        goto out;
-    } else {
-        pgp = list_entry((&client->cur_pgp->client_inv_pages)->next,
-                         struct tmem_page_descriptor,client_inv_pages);
-        client->cur_pgp = pgp;
-    }
-    h.pool_id = pgp->pool_id;
-    BUILD_BUG_ON(sizeof(h.oid) != sizeof(pgp->inv_oid));
-    memcpy(&(h.oid), &(pgp->inv_oid), sizeof(h.oid));
-    h.index = pgp->index;
-    ret = 1;
-    if ( copy_to_guest(guest_handle_cast(buf, void), &h, 1) )
-        ret = -EFAULT;
-out:
-    spin_unlock(&pers_lists_spinlock);
-    return ret;
-}
-
-static int tmemc_restore_put_page(int cli_id, uint32_t pool_id,
-                                  struct xen_tmem_oid *oidp,
-                                  uint32_t index, tmem_cli_va_param_t buf,
-                                  uint32_t bufsize)
-{
-    struct client *client = tmem_client_from_cli_id(cli_id);
-    struct tmem_pool *pool = (client == NULL || pool_id >= MAX_POOLS_PER_DOMAIN)
-                   ? NULL : client->pools[pool_id];
-
-    if ( pool == NULL )
-        return -1;
-    if (bufsize != PAGE_SIZE) {
-        tmem_client_err("tmem: %s: invalid parameter bufsize(%d) != (%ld)\n",
-                __func__, bufsize, PAGE_SIZE);
-        return -EINVAL;
-    }
-    return do_tmem_put(pool, oidp, index, 0, buf);
-}
-
-static int tmemc_restore_flush_page(int cli_id, uint32_t pool_id,
-                                    struct xen_tmem_oid *oidp,
-                                    uint32_t index)
-{
-    struct client *client = tmem_client_from_cli_id(cli_id);
-    struct tmem_pool *pool = (client == NULL || pool_id >= MAX_POOLS_PER_DOMAIN)
-                   ? NULL : client->pools[pool_id];
-
-    if ( pool == NULL )
-        return -1;
-    return do_tmem_flush_page(pool,oidp,index);
-}
-
-int do_tmem_control(struct xen_sysctl_tmem_op *op)
-{
-    int ret;
-    uint32_t pool_id = op->pool_id;
-    uint32_t cmd = op->cmd;
-    struct xen_tmem_oid *oidp = &op->oid;
-
-    ASSERT(rw_is_write_locked(&tmem_rwlock));
-
-    switch (cmd)
-    {
-    case XEN_SYSCTL_TMEM_OP_SAVE_BEGIN:
-    case XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN:
-    case XEN_SYSCTL_TMEM_OP_SAVE_END:
-        ret = tmemc_save_subop(op->cli_id, pool_id, cmd,
-                               guest_handle_cast(op->u.buf, char), op->arg);
-        break;
-    case XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE:
-        ret = tmemc_save_get_next_page(op->cli_id, pool_id,
-                                       guest_handle_cast(op->u.buf, char), op->len);
-        break;
-    case XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV:
-        ret = tmemc_save_get_next_inv(op->cli_id,
-                                      guest_handle_cast(op->u.buf, char), op->len);
-        break;
-    case XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE:
-        ret = tmemc_restore_put_page(op->cli_id, pool_id, oidp, op->arg,
-                                     guest_handle_cast(op->u.buf, char), op->len);
-        break;
-    case XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE:
-        ret = tmemc_restore_flush_page(op->cli_id, pool_id, oidp, op->arg);
-        break;
-    default:
-        ret = -1;
-    }
-
-    return ret;
-}
-
-/************ EXPORTed FUNCTIONS **************************************/
-
-long do_tmem_op(tmem_cli_op_t uops)
-{
-    struct tmem_op op;
-    struct client *client = current->domain->tmem_client;
-    struct tmem_pool *pool = NULL;
-    struct xen_tmem_oid *oidp;
-    int rc = 0;
-
-    if ( !tmem_initialized )
-        return -ENODEV;
-
-    if ( xsm_tmem_op(XSM_HOOK) )
-        return -EPERM;
-
-    tmem_stats.total_tmem_ops++;
-
-    if ( client != NULL && client->domain->is_dying )
-    {
-        tmem_stats.errored_tmem_ops++;
-        return -ENODEV;
-    }
-
-    if ( unlikely(tmem_get_tmemop_from_client(&op, uops) != 0) )
-    {
-        tmem_client_err("tmem: can't get tmem struct from %s\n", tmem_client_str);
-        tmem_stats.errored_tmem_ops++;
-        return -EFAULT;
-    }
-
-    /* Acquire write lock for all commands at first. */
-    write_lock(&tmem_rwlock);
-
-    switch ( op.cmd )
-    {
-    case TMEM_CONTROL:
-    case TMEM_RESTORE_NEW:
-    case TMEM_AUTH:
-        rc = -EOPNOTSUPP;
-        break;
-
-    default:
-    /*
-	 * For other commands, create per-client tmem structure dynamically on
-	 * first use by client.
-	 */
-        if ( client == NULL )
-        {
-            if ( (client = client_create(current->domain->domain_id)) == NULL )
-            {
-                tmem_client_err("tmem: can't create tmem structure for %s\n",
-                               tmem_client_str);
-                rc = -ENOMEM;
-                goto out;
-            }
-        }
-
-        if ( op.cmd == TMEM_NEW_POOL || op.cmd == TMEM_DESTROY_POOL )
-        {
-            if ( op.cmd == TMEM_NEW_POOL )
-                rc = do_tmem_new_pool(TMEM_CLI_ID_NULL, 0, op.u.creat.flags,
-                                op.u.creat.uuid[0], op.u.creat.uuid[1]);
-	        else
-                rc = do_tmem_destroy_pool(op.pool_id);
-        }
-        else
-        {
-            if ( ((uint32_t)op.pool_id >= MAX_POOLS_PER_DOMAIN) ||
-                 ((pool = client->pools[op.pool_id]) == NULL) )
-            {
-                tmem_client_err("tmem: operation requested on uncreated pool\n");
-                rc = -ENODEV;
-                goto out;
-            }
-            /* Commands that only need read lock. */
-            write_unlock(&tmem_rwlock);
-            read_lock(&tmem_rwlock);
-
-            oidp = &op.u.gen.oid;
-            switch ( op.cmd )
-            {
-            case TMEM_NEW_POOL:
-            case TMEM_DESTROY_POOL:
-                BUG(); /* Done earlier. */
-                break;
-            case TMEM_PUT_PAGE:
-                if (tmem_ensure_avail_pages())
-                    rc = do_tmem_put(pool, oidp, op.u.gen.index, op.u.gen.cmfn,
-                                tmem_cli_buf_null);
-                else
-                    rc = -ENOMEM;
-                break;
-            case TMEM_GET_PAGE:
-                rc = do_tmem_get(pool, oidp, op.u.gen.index, op.u.gen.cmfn,
-                                tmem_cli_buf_null);
-                break;
-            case TMEM_FLUSH_PAGE:
-                rc = do_tmem_flush_page(pool, oidp, op.u.gen.index);
-                break;
-            case TMEM_FLUSH_OBJECT:
-                rc = do_tmem_flush_object(pool, oidp);
-                break;
-            default:
-                tmem_client_warn("tmem: op %d not implemented\n", op.cmd);
-                rc = -ENOSYS;
-                break;
-            }
-            read_unlock(&tmem_rwlock);
-            if ( rc < 0 )
-                tmem_stats.errored_tmem_ops++;
-            return rc;
-        }
-        break;
-
-    }
-out:
-    write_unlock(&tmem_rwlock);
-    if ( rc < 0 )
-        tmem_stats.errored_tmem_ops++;
-    return rc;
-}
-
-/* This should be called when the host is destroying a client (domain). */
-void tmem_destroy(void *v)
-{
-    struct client *client = (struct client *)v;
-
-    if ( client == NULL )
-        return;
-
-    if ( !client->domain->is_dying )
-    {
-        printk("tmem: tmem_destroy can only destroy dying client\n");
-        return;
-    }
-
-    write_lock(&tmem_rwlock);
-
-    printk("tmem: flushing tmem pools for %s=%d\n",
-           tmem_cli_id_str, client->cli_id);
-    client_flush(client);
-
-    write_unlock(&tmem_rwlock);
-}
-
-#define MAX_EVICTS 10  /* Should be variable or set via XEN_SYSCTL_TMEM_OP_ ?? */
-void *tmem_relinquish_pages(unsigned int order, unsigned int memflags)
-{
-    struct page_info *pfp;
-    unsigned long evicts_per_relinq = 0;
-    int max_evictions = 10;
-
-    if (!tmem_enabled() || !tmem_freeable_pages())
-        return NULL;
-
-    tmem_stats.relinq_attempts++;
-    if ( order > 0 )
-    {
-#ifndef NDEBUG
-        printk("tmem_relinquish_page: failing order=%d\n", order);
-#endif
-        return NULL;
-    }
-
-    while ( (pfp = tmem_page_list_get()) == NULL )
-    {
-        if ( (max_evictions-- <= 0) || !tmem_evict())
-            break;
-        evicts_per_relinq++;
-    }
-    if ( evicts_per_relinq > tmem_stats.max_evicts_per_relinq )
-        tmem_stats.max_evicts_per_relinq = evicts_per_relinq;
-    if ( pfp != NULL )
-    {
-        if ( !(memflags & MEMF_tmem) )
-            scrub_one_page(pfp);
-        tmem_stats.relinq_pgs++;
-    }
-
-    return pfp;
-}
-
-unsigned long tmem_freeable_pages(void)
-{
-    if ( !tmem_enabled() )
-        return 0;
-
-    return tmem_page_list_pages + _atomic_read(freeable_page_count);
-}
-
-/* Called at hypervisor startup. */
-static int __init init_tmem(void)
-{
-    if ( !tmem_enabled() )
-        return 0;
-
-    if ( !tmem_mempool_init() )
-        return 0;
-
-    if ( tmem_init() )
-    {
-        printk("tmem: initialized comp=%d\n", tmem_compression_enabled());
-        tmem_initialized = 1;
-    }
-    else
-        printk("tmem: initialization FAILED\n");
-
-    return 0;
-}
-__initcall(init_tmem);
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/common/tmem_control.c b/xen/common/tmem_control.c
deleted file mode 100644
index 30bf6fb362..0000000000
--- a/xen/common/tmem_control.c
+++ /dev/null
@@ -1,560 +0,0 @@
-/*
- * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
- *
- */
-
-#include <xen/init.h>
-#include <xen/list.h>
-#include <xen/radix-tree.h>
-#include <xen/rbtree.h>
-#include <xen/rwlock.h>
-#include <xen/tmem_control.h>
-#include <xen/tmem.h>
-#include <xen/tmem_xen.h>
-#include <public/sysctl.h>
-
-/************ TMEM CONTROL OPERATIONS ************************************/
-
-/* Freeze/thaw all pools belonging to client cli_id (all domains if -1). */
-static int tmemc_freeze_pools(domid_t cli_id, int arg)
-{
-    struct client *client;
-    bool freeze = arg == XEN_SYSCTL_TMEM_OP_FREEZE;
-    bool destroy = arg == XEN_SYSCTL_TMEM_OP_DESTROY;
-    char *s;
-
-    s = destroy ? "destroyed" : ( freeze ? "frozen" : "thawed" );
-    if ( cli_id == TMEM_CLI_ID_NULL )
-    {
-        list_for_each_entry(client,&tmem_global.client_list,client_list)
-            client->info.flags.u.frozen = freeze;
-        tmem_client_info("tmem: all pools %s for all %ss\n", s, tmem_client_str);
-    }
-    else
-    {
-        if ( (client = tmem_client_from_cli_id(cli_id)) == NULL)
-            return -1;
-        client->info.flags.u.frozen = freeze;
-        tmem_client_info("tmem: all pools %s for %s=%d\n",
-                         s, tmem_cli_id_str, cli_id);
-    }
-    return 0;
-}
-
-static unsigned long tmem_flush_npages(unsigned long n)
-{
-    unsigned long avail_pages = 0;
-
-    while ( (avail_pages = tmem_page_list_pages) < n )
-    {
-        if (  !tmem_evict() )
-            break;
-    }
-    if ( avail_pages )
-    {
-        spin_lock(&tmem_page_list_lock);
-        while ( !page_list_empty(&tmem_page_list) )
-        {
-            struct page_info *pg = page_list_remove_head(&tmem_page_list);
-            scrub_one_page(pg);
-            tmem_page_list_pages--;
-            free_domheap_page(pg);
-        }
-        ASSERT(tmem_page_list_pages == 0);
-        INIT_PAGE_LIST_HEAD(&tmem_page_list);
-        spin_unlock(&tmem_page_list_lock);
-    }
-    return avail_pages;
-}
-
-static int tmemc_flush_mem(domid_t cli_id, uint32_t kb)
-{
-    uint32_t npages, flushed_pages, flushed_kb;
-
-    if ( cli_id != TMEM_CLI_ID_NULL )
-    {
-        tmem_client_warn("tmem: %s-specific flush not supported yet, use --all\n",
-           tmem_client_str);
-        return -1;
-    }
-    /* Convert kb to pages, rounding up if necessary. */
-    npages = (kb + ((1 << (PAGE_SHIFT-10))-1)) >> (PAGE_SHIFT-10);
-    flushed_pages = tmem_flush_npages(npages);
-    flushed_kb = flushed_pages << (PAGE_SHIFT-10);
-    return flushed_kb;
-}
-
-/*
- * These tmemc_list* routines output lots of stats in a format that is
- *  intended to be program-parseable, not human-readable. Further, by
- *  tying each group of stats to a line format indicator (e.g. G= for
- *  global stats) and each individual stat to a two-letter specifier
- *  (e.g. Ec:nnnnn in the G= line says there are nnnnn pages in the
- *  global ephemeral pool), it should allow the stats reported to be
- *  forward and backwards compatible as tmem evolves.
- */
-#define BSIZE 1024
-
-static int tmemc_list_client(struct client *c, tmem_cli_va_param_t buf,
-                             int off, uint32_t len, bool use_long)
-{
-    char info[BSIZE];
-    int i, n = 0, sum = 0;
-    struct tmem_pool *p;
-    bool s;
-
-    n = scnprintf(info,BSIZE,"C=CI:%d,ww:%d,co:%d,fr:%d,"
-        "Tc:%"PRIu64",Ge:%ld,Pp:%ld,Gp:%ld%c",
-        c->cli_id, c->info.weight, c->info.flags.u.compress, c->info.flags.u.frozen,
-        c->total_cycles, c->succ_eph_gets, c->succ_pers_puts, c->succ_pers_gets,
-        use_long ? ',' : '\n');
-    if (use_long)
-        n += scnprintf(info+n,BSIZE-n,
-             "Ec:%ld,Em:%ld,cp:%ld,cb:%"PRId64",cn:%ld,cm:%ld\n",
-             c->eph_count, c->eph_count_max,
-             c->compressed_pages, c->compressed_sum_size,
-             c->compress_poor, c->compress_nomem);
-    if ( !copy_to_guest_offset(buf, off + sum, info, n + 1) )
-        sum += n;
-    for ( i = 0; i < MAX_POOLS_PER_DOMAIN; i++ )
-    {
-        if ( (p = c->pools[i]) == NULL )
-            continue;
-        s = is_shared(p);
-        n = scnprintf(info,BSIZE,"P=CI:%d,PI:%d,"
-                      "PT:%c%c,U0:%"PRIx64",U1:%"PRIx64"%c",
-                      c->cli_id, p->pool_id,
-                      is_persistent(p) ? 'P' : 'E', s ? 'S' : 'P',
-                      (uint64_t)(s ? p->uuid[0] : 0),
-                      (uint64_t)(s ? p->uuid[1] : 0LL),
-                      use_long ? ',' : '\n');
-        if (use_long)
-            n += scnprintf(info+n,BSIZE-n,
-             "Pc:%d,Pm:%d,Oc:%ld,Om:%ld,Nc:%lu,Nm:%lu,"
-             "ps:%lu,pt:%lu,pd:%lu,pr:%lu,px:%lu,gs:%lu,gt:%lu,"
-             "fs:%lu,ft:%lu,os:%lu,ot:%lu\n",
-             _atomic_read(p->pgp_count), p->pgp_count_max,
-             p->obj_count, p->obj_count_max,
-             p->objnode_count, p->objnode_count_max,
-             p->good_puts, p->puts,p->dup_puts_flushed, p->dup_puts_replaced,
-             p->no_mem_puts,
-             p->found_gets, p->gets,
-             p->flushs_found, p->flushs, p->flush_objs_found, p->flush_objs);
-        if ( sum + n >= len )
-            return sum;
-        if ( !copy_to_guest_offset(buf, off + sum, info, n + 1) )
-            sum += n;
-    }
-    return sum;
-}
-
-static int tmemc_list_shared(tmem_cli_va_param_t buf, int off, uint32_t len,
-                             bool use_long)
-{
-    char info[BSIZE];
-    int i, n = 0, sum = 0;
-    struct tmem_pool *p;
-    struct share_list *sl;
-
-    for ( i = 0; i < MAX_GLOBAL_SHARED_POOLS; i++ )
-    {
-        if ( (p = tmem_global.shared_pools[i]) == NULL )
-            continue;
-        n = scnprintf(info+n,BSIZE-n,"S=SI:%d,PT:%c%c,U0:%"PRIx64",U1:%"PRIx64,
-                      i, is_persistent(p) ? 'P' : 'E',
-                      is_shared(p) ? 'S' : 'P',
-                      p->uuid[0], p->uuid[1]);
-        list_for_each_entry(sl,&p->share_list, share_list)
-            n += scnprintf(info+n,BSIZE-n,",SC:%d",sl->client->cli_id);
-        n += scnprintf(info+n,BSIZE-n,"%c", use_long ? ',' : '\n');
-        if (use_long)
-            n += scnprintf(info+n,BSIZE-n,
-             "Pc:%d,Pm:%d,Oc:%ld,Om:%ld,Nc:%lu,Nm:%lu,"
-             "ps:%lu,pt:%lu,pd:%lu,pr:%lu,px:%lu,gs:%lu,gt:%lu,"
-             "fs:%lu,ft:%lu,os:%lu,ot:%lu\n",
-             _atomic_read(p->pgp_count), p->pgp_count_max,
-             p->obj_count, p->obj_count_max,
-             p->objnode_count, p->objnode_count_max,
-             p->good_puts, p->puts,p->dup_puts_flushed, p->dup_puts_replaced,
-             p->no_mem_puts,
-             p->found_gets, p->gets,
-             p->flushs_found, p->flushs, p->flush_objs_found, p->flush_objs);
-        if ( sum + n >= len )
-            return sum;
-        if ( !copy_to_guest_offset(buf, off + sum, info, n + 1) )
-            sum += n;
-    }
-    return sum;
-}
-
-static int tmemc_list_global_perf(tmem_cli_va_param_t buf, int off,
-                                  uint32_t len, bool use_long)
-{
-    char info[BSIZE];
-    int n = 0, sum = 0;
-
-    n = scnprintf(info+n,BSIZE-n,"T=");
-    n--; /* Overwrite trailing comma. */
-    n += scnprintf(info+n,BSIZE-n,"\n");
-    if ( sum + n >= len )
-        return sum;
-    if ( !copy_to_guest_offset(buf, off + sum, info, n + 1) )
-        sum += n;
-    return sum;
-}
-
-static int tmemc_list_global(tmem_cli_va_param_t buf, int off, uint32_t len,
-                             bool use_long)
-{
-    char info[BSIZE];
-    int n = 0, sum = off;
-
-    n += scnprintf(info,BSIZE,"G="
-      "Tt:%lu,Te:%lu,Cf:%lu,Af:%lu,Pf:%lu,Ta:%lu,"
-      "Lm:%lu,Et:%lu,Ea:%lu,Rt:%lu,Ra:%lu,Rx:%lu,Fp:%lu%c",
-      tmem_stats.total_tmem_ops, tmem_stats.errored_tmem_ops, tmem_stats.failed_copies,
-      tmem_stats.alloc_failed, tmem_stats.alloc_page_failed, tmem_page_list_pages,
-      tmem_stats.low_on_memory, tmem_stats.evicted_pgs,
-      tmem_stats.evict_attempts, tmem_stats.relinq_pgs, tmem_stats.relinq_attempts,
-      tmem_stats.max_evicts_per_relinq,
-      tmem_stats.total_flush_pool, use_long ? ',' : '\n');
-    if (use_long)
-        n += scnprintf(info+n,BSIZE-n,
-          "Ec:%ld,Em:%ld,Oc:%d,Om:%d,Nc:%d,Nm:%d,Pc:%d,Pm:%d,"
-          "Fc:%d,Fm:%d,Sc:%d,Sm:%d,Ep:%lu,Gd:%lu,Zt:%lu,Gz:%lu\n",
-          tmem_global.eph_count, tmem_stats.global_eph_count_max,
-          _atomic_read(tmem_stats.global_obj_count), tmem_stats.global_obj_count_max,
-          _atomic_read(tmem_stats.global_rtree_node_count), tmem_stats.global_rtree_node_count_max,
-          _atomic_read(tmem_stats.global_pgp_count), tmem_stats.global_pgp_count_max,
-          _atomic_read(tmem_stats.global_page_count), tmem_stats.global_page_count_max,
-          _atomic_read(tmem_stats.global_pcd_count), tmem_stats.global_pcd_count_max,
-         tmem_stats.tot_good_eph_puts,tmem_stats.deduped_puts,tmem_stats.pcd_tot_tze_size,
-         tmem_stats.pcd_tot_csize);
-    if ( sum + n >= len )
-        return sum;
-    if ( !copy_to_guest_offset(buf, off + sum, info, n + 1) )
-        sum += n;
-    return sum;
-}
-
-static int tmemc_list(domid_t cli_id, tmem_cli_va_param_t buf, uint32_t len,
-                      bool use_long)
-{
-    struct client *client;
-    int off = 0;
-
-    if ( cli_id == TMEM_CLI_ID_NULL ) {
-        off = tmemc_list_global(buf,0,len,use_long);
-        off += tmemc_list_shared(buf,off,len-off,use_long);
-        list_for_each_entry(client,&tmem_global.client_list,client_list)
-            off += tmemc_list_client(client, buf, off, len-off, use_long);
-        off += tmemc_list_global_perf(buf,off,len-off,use_long);
-    }
-    else if ( (client = tmem_client_from_cli_id(cli_id)) == NULL)
-        return -1;
-    else
-        off = tmemc_list_client(client, buf, 0, len, use_long);
-
-    return 0;
-}
-
-static int __tmemc_set_client_info(struct client *client,
-                                   XEN_GUEST_HANDLE(xen_tmem_client_t) buf)
-{
-    domid_t cli_id;
-    uint32_t old_weight;
-    xen_tmem_client_t info = { };
-
-    ASSERT(client);
-
-    if ( copy_from_guest(&info, buf, 1) )
-        return -EFAULT;
-
-    if ( info.version != TMEM_SPEC_VERSION )
-        return -EOPNOTSUPP;
-
-    if ( info.maxpools > MAX_POOLS_PER_DOMAIN )
-        return -ERANGE;
-
-    /* Ignore info.nr_pools. */
-    cli_id = client->cli_id;
-
-    if ( info.weight != client->info.weight )
-    {
-        old_weight = client->info.weight;
-        client->info.weight = info.weight;
-        tmem_client_info("tmem: weight set to %d for %s=%d\n",
-                         info.weight, tmem_cli_id_str, cli_id);
-        atomic_sub(old_weight,&tmem_global.client_weight_total);
-        atomic_add(client->info.weight,&tmem_global.client_weight_total);
-    }
-
-
-    if ( info.flags.u.compress != client->info.flags.u.compress )
-    {
-        client->info.flags.u.compress = info.flags.u.compress;
-        tmem_client_info("tmem: compression %s for %s=%d\n",
-                         info.flags.u.compress ? "enabled" : "disabled",
-                         tmem_cli_id_str,cli_id);
-    }
-    return 0;
-}
-
-static int tmemc_set_client_info(domid_t cli_id,
-                                 XEN_GUEST_HANDLE(xen_tmem_client_t) info)
-{
-    struct client *client;
-    int ret = -ENOENT;
-
-    if ( cli_id == TMEM_CLI_ID_NULL )
-    {
-        list_for_each_entry(client,&tmem_global.client_list,client_list)
-        {
-            ret =  __tmemc_set_client_info(client, info);
-            if (ret)
-                break;
-        }
-    }
-    else
-    {
-        client = tmem_client_from_cli_id(cli_id);
-        if ( client )
-            ret = __tmemc_set_client_info(client, info);
-    }
-    return ret;
-}
-
-static int tmemc_get_client_info(int cli_id,
-                                 XEN_GUEST_HANDLE(xen_tmem_client_t) info)
-{
-    struct client *client = tmem_client_from_cli_id(cli_id);
-
-    if ( client )
-    {
-        if ( copy_to_guest(info, &client->info, 1) )
-            return  -EFAULT;
-    }
-    else
-    {
-        static const xen_tmem_client_t generic = {
-            .version = TMEM_SPEC_VERSION,
-            .maxpools = MAX_POOLS_PER_DOMAIN
-        };
-
-        if ( copy_to_guest(info, &generic, 1) )
-            return -EFAULT;
-    }
-
-    return 0;
-}
-
-static int tmemc_get_pool(int cli_id,
-                          XEN_GUEST_HANDLE(xen_tmem_pool_info_t) pools,
-                          uint32_t len)
-{
-    struct client *client = tmem_client_from_cli_id(cli_id);
-    unsigned int i, idx;
-    int rc = 0;
-    unsigned int nr = len / sizeof(xen_tmem_pool_info_t);
-
-    if ( len % sizeof(xen_tmem_pool_info_t) )
-        return -EINVAL;
-
-    if ( nr > MAX_POOLS_PER_DOMAIN )
-        return -E2BIG;
-
-    if ( !guest_handle_okay(pools, nr) )
-        return -EINVAL;
-
-    if ( !client )
-        return -EINVAL;
-
-    for ( idx = 0, i = 0; i < MAX_POOLS_PER_DOMAIN; i++ )
-    {
-        struct tmem_pool *pool = client->pools[i];
-        xen_tmem_pool_info_t out;
-
-        if ( pool == NULL )
-            continue;
-
-        out.flags.raw = (pool->persistent ? TMEM_POOL_PERSIST : 0) |
-              (pool->shared ? TMEM_POOL_SHARED : 0) |
-              (POOL_PAGESHIFT << TMEM_POOL_PAGESIZE_SHIFT) |
-              (TMEM_SPEC_VERSION << TMEM_POOL_VERSION_SHIFT);
-        out.n_pages = _atomic_read(pool->pgp_count);
-        out.uuid[0] = pool->uuid[0];
-        out.uuid[1] = pool->uuid[1];
-        out.id = i;
-
-        /* N.B. 'idx' != 'i'. */
-        if ( __copy_to_guest_offset(pools, idx, &out, 1) )
-        {
-            rc = -EFAULT;
-            break;
-        }
-        idx++;
-        /* Don't try to put more than what was requested. */
-        if ( idx >= nr )
-            break;
-    }
-
-    /* And how many we have processed. */
-    return rc ? : idx;
-}
-
-static int tmemc_set_pools(int cli_id,
-                           XEN_GUEST_HANDLE(xen_tmem_pool_info_t) pools,
-                           uint32_t len)
-{
-    unsigned int i;
-    int rc = 0;
-    unsigned int nr = len / sizeof(xen_tmem_pool_info_t);
-    struct client *client = tmem_client_from_cli_id(cli_id);
-
-    if ( len % sizeof(xen_tmem_pool_info_t) )
-        return -EINVAL;
-
-    if ( nr > MAX_POOLS_PER_DOMAIN )
-        return -E2BIG;
-
-    if ( !guest_handle_okay(pools, nr) )
-        return -EINVAL;
-
-    if ( !client )
-    {
-        client = client_create(cli_id);
-        if ( !client )
-            return -ENOMEM;
-    }
-    for ( i = 0; i < nr; i++ )
-    {
-        xen_tmem_pool_info_t pool;
-
-        if ( __copy_from_guest_offset(&pool, pools, i, 1 ) )
-            return -EFAULT;
-
-        if ( pool.n_pages )
-            return -EINVAL;
-
-        rc = do_tmem_new_pool(cli_id, pool.id, pool.flags.raw,
-                              pool.uuid[0], pool.uuid[1]);
-        if ( rc < 0 )
-            break;
-
-        pool.id = rc;
-        if ( __copy_to_guest_offset(pools, i, &pool, 1) )
-            return -EFAULT;
-    }
-
-    /* And how many we have processed. */
-    return rc ? : i;
-}
-
-static int tmemc_auth_pools(int cli_id,
-                            XEN_GUEST_HANDLE(xen_tmem_pool_info_t) pools,
-                            uint32_t len)
-{
-    unsigned int i;
-    int rc = 0;
-    unsigned int nr = len / sizeof(xen_tmem_pool_info_t);
-    struct client *client = tmem_client_from_cli_id(cli_id);
-
-    if ( len % sizeof(xen_tmem_pool_info_t) )
-        return -EINVAL;
-
-    if ( nr > MAX_POOLS_PER_DOMAIN )
-        return -E2BIG;
-
-    if ( !guest_handle_okay(pools, nr) )
-        return -EINVAL;
-
-    if ( !client )
-    {
-        client = client_create(cli_id);
-        if ( !client )
-            return -ENOMEM;
-    }
-
-    for ( i = 0; i < nr; i++ )
-    {
-        xen_tmem_pool_info_t pool;
-
-        if ( __copy_from_guest_offset(&pool, pools, i, 1 ) )
-            return -EFAULT;
-
-        if ( pool.n_pages )
-            return -EINVAL;
-
-        rc = tmemc_shared_pool_auth(cli_id, pool.uuid[0], pool.uuid[1],
-                                    pool.flags.u.auth);
-
-        if ( rc < 0 )
-            break;
-
-    }
-
-    /* And how many we have processed. */
-    return rc ? : i;
-}
-
-int tmem_control(struct xen_sysctl_tmem_op *op)
-{
-    int ret;
-    uint32_t cmd = op->cmd;
-
-    if ( op->pad != 0 )
-        return -EINVAL;
-
-    write_lock(&tmem_rwlock);
-
-    switch (cmd)
-    {
-    case XEN_SYSCTL_TMEM_OP_THAW:
-    case XEN_SYSCTL_TMEM_OP_FREEZE:
-    case XEN_SYSCTL_TMEM_OP_DESTROY:
-        ret = tmemc_freeze_pools(op->cli_id, cmd);
-        break;
-    case XEN_SYSCTL_TMEM_OP_FLUSH:
-        ret = tmemc_flush_mem(op->cli_id, op->arg);
-        break;
-    case XEN_SYSCTL_TMEM_OP_LIST:
-        ret = tmemc_list(op->cli_id,
-                         guest_handle_cast(op->u.buf, char), op->len, op->arg);
-        break;
-    case XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO:
-        ret = tmemc_set_client_info(op->cli_id, op->u.client);
-        break;
-    case XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB:
-        ret = tmem_freeable_pages() >> (20 - PAGE_SHIFT);
-        break;
-    case XEN_SYSCTL_TMEM_OP_GET_CLIENT_INFO:
-        ret = tmemc_get_client_info(op->cli_id, op->u.client);
-        break;
-    case XEN_SYSCTL_TMEM_OP_GET_POOLS:
-        ret = tmemc_get_pool(op->cli_id, op->u.pool, op->len);
-        break;
-    case XEN_SYSCTL_TMEM_OP_SET_POOLS: /* TMEM_RESTORE_NEW */
-        ret = tmemc_set_pools(op->cli_id, op->u.pool, op->len);
-        break;
-    case XEN_SYSCTL_TMEM_OP_SET_AUTH: /* TMEM_AUTH */
-        ret = tmemc_auth_pools(op->cli_id, op->u.pool, op->len);
-        break;
-    default:
-        ret = do_tmem_control(op);
-        break;
-    }
-
-    write_unlock(&tmem_rwlock);
-
-    return ret;
-}
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/common/tmem_xen.c b/xen/common/tmem_xen.c
deleted file mode 100644
index bf7b14f79a..0000000000
--- a/xen/common/tmem_xen.c
+++ /dev/null
@@ -1,277 +0,0 @@
-/******************************************************************************
- * tmem-xen.c
- *
- * Xen-specific Transcendent memory
- *
- * Copyright (c) 2009, Dan Magenheimer, Oracle Corp.
- */
-
-#include <xen/tmem.h>
-#include <xen/tmem_xen.h>
-#include <xen/lzo.h> /* compression code */
-#include <xen/paging.h>
-#include <xen/domain_page.h>
-#include <xen/cpu.h>
-#include <xen/init.h>
-
-bool __read_mostly opt_tmem;
-boolean_param("tmem", opt_tmem);
-
-bool __read_mostly opt_tmem_compress;
-boolean_param("tmem_compress", opt_tmem_compress);
-
-atomic_t freeable_page_count = ATOMIC_INIT(0);
-
-/* these are a concurrency bottleneck, could be percpu and dynamically
- * allocated iff opt_tmem_compress */
-#define LZO_WORKMEM_BYTES LZO1X_1_MEM_COMPRESS
-#define LZO_DSTMEM_PAGES 2
-static DEFINE_PER_CPU_READ_MOSTLY(unsigned char *, workmem);
-static DEFINE_PER_CPU_READ_MOSTLY(unsigned char *, dstmem);
-static DEFINE_PER_CPU_READ_MOSTLY(void *, scratch_page);
-
-#if defined(CONFIG_ARM)
-static inline void *cli_get_page(xen_pfn_t cmfn, mfn_t *pcli_mfn,
-                                 struct page_info **pcli_pfp, bool cli_write)
-{
-    ASSERT_UNREACHABLE();
-    return NULL;
-}
-
-static inline void cli_put_page(void *cli_va, struct page_info *cli_pfp,
-                                mfn_t cli_mfn, bool mark_dirty)
-{
-    ASSERT_UNREACHABLE();
-}
-#else
-#include <asm/p2m.h>
-
-static inline void *cli_get_page(xen_pfn_t cmfn, mfn_t *pcli_mfn,
-                                 struct page_info **pcli_pfp, bool cli_write)
-{
-    p2m_type_t t;
-    struct page_info *page;
-
-    page = get_page_from_gfn(current->domain, cmfn, &t, P2M_ALLOC);
-    if ( !page || t != p2m_ram_rw )
-    {
-        if ( page )
-            put_page(page);
-        return NULL;
-    }
-
-    if ( cli_write && !get_page_type(page, PGT_writable_page) )
-    {
-        put_page(page);
-        return NULL;
-    }
-
-    *pcli_mfn = page_to_mfn(page);
-    *pcli_pfp = page;
-
-    return map_domain_page(*pcli_mfn);
-}
-
-static inline void cli_put_page(void *cli_va, struct page_info *cli_pfp,
-                                mfn_t cli_mfn, bool mark_dirty)
-{
-    if ( mark_dirty )
-    {
-        put_page_and_type(cli_pfp);
-        paging_mark_dirty(current->domain, cli_mfn);
-    }
-    else
-        put_page(cli_pfp);
-    unmap_domain_page(cli_va);
-}
-#endif
-
-int tmem_copy_from_client(struct page_info *pfp,
-    xen_pfn_t cmfn, tmem_cli_va_param_t clibuf)
-{
-    mfn_t tmem_mfn, cli_mfn = INVALID_MFN;
-    char *tmem_va, *cli_va = NULL;
-    struct page_info *cli_pfp = NULL;
-    int rc = 1;
-
-    ASSERT(pfp != NULL);
-    tmem_mfn = page_to_mfn(pfp);
-    tmem_va = map_domain_page(tmem_mfn);
-    if ( guest_handle_is_null(clibuf) )
-    {
-        cli_va = cli_get_page(cmfn, &cli_mfn, &cli_pfp, 0);
-        if ( cli_va == NULL )
-        {
-            unmap_domain_page(tmem_va);
-            return -EFAULT;
-        }
-    }
-    smp_mb();
-    if ( cli_va )
-    {
-        memcpy(tmem_va, cli_va, PAGE_SIZE);
-        cli_put_page(cli_va, cli_pfp, cli_mfn, 0);
-    }
-    else
-        rc = -EINVAL;
-    unmap_domain_page(tmem_va);
-    return rc;
-}
-
-int tmem_compress_from_client(xen_pfn_t cmfn,
-    void **out_va, size_t *out_len, tmem_cli_va_param_t clibuf)
-{
-    int ret = 0;
-    unsigned char *dmem = this_cpu(dstmem);
-    unsigned char *wmem = this_cpu(workmem);
-    char *scratch = this_cpu(scratch_page);
-    struct page_info *cli_pfp = NULL;
-    mfn_t cli_mfn = INVALID_MFN;
-    void *cli_va = NULL;
-
-    if ( dmem == NULL || wmem == NULL )
-        return 0;  /* no buffer, so can't compress */
-    if ( guest_handle_is_null(clibuf) )
-    {
-        cli_va = cli_get_page(cmfn, &cli_mfn, &cli_pfp, 0);
-        if ( cli_va == NULL )
-            return -EFAULT;
-    }
-    else if ( !scratch )
-        return 0;
-    else if ( copy_from_guest(scratch, clibuf, PAGE_SIZE) )
-        return -EFAULT;
-    smp_mb();
-    ret = lzo1x_1_compress(cli_va ?: scratch, PAGE_SIZE, dmem, out_len, wmem);
-    ASSERT(ret == LZO_E_OK);
-    *out_va = dmem;
-    if ( cli_va )
-        cli_put_page(cli_va, cli_pfp, cli_mfn, 0);
-    return 1;
-}
-
-int tmem_copy_to_client(xen_pfn_t cmfn, struct page_info *pfp,
-    tmem_cli_va_param_t clibuf)
-{
-    mfn_t tmem_mfn, cli_mfn = INVALID_MFN;
-    char *tmem_va, *cli_va = NULL;
-    struct page_info *cli_pfp = NULL;
-    int rc = 1;
-
-    ASSERT(pfp != NULL);
-    if ( guest_handle_is_null(clibuf) )
-    {
-        cli_va = cli_get_page(cmfn, &cli_mfn, &cli_pfp, 1);
-        if ( cli_va == NULL )
-            return -EFAULT;
-    }
-    tmem_mfn = page_to_mfn(pfp);
-    tmem_va = map_domain_page(tmem_mfn);
-
-    if ( cli_va )
-    {
-        memcpy(cli_va, tmem_va, PAGE_SIZE);
-        cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
-    }
-    else
-        rc = -EINVAL;
-    unmap_domain_page(tmem_va);
-    smp_mb();
-    return rc;
-}
-
-int tmem_decompress_to_client(xen_pfn_t cmfn, void *tmem_va,
-                                    size_t size, tmem_cli_va_param_t clibuf)
-{
-    mfn_t cli_mfn = INVALID_MFN;
-    struct page_info *cli_pfp = NULL;
-    void *cli_va = NULL;
-    char *scratch = this_cpu(scratch_page);
-    size_t out_len = PAGE_SIZE;
-    int ret;
-
-    if ( guest_handle_is_null(clibuf) )
-    {
-        cli_va = cli_get_page(cmfn, &cli_mfn, &cli_pfp, 1);
-        if ( cli_va == NULL )
-            return -EFAULT;
-    }
-    else if ( !scratch )
-        return 0;
-    ret = lzo1x_decompress_safe(tmem_va, size, cli_va ?: scratch, &out_len);
-    ASSERT(ret == LZO_E_OK);
-    ASSERT(out_len == PAGE_SIZE);
-    if ( cli_va )
-        cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
-    else if ( copy_to_guest(clibuf, scratch, PAGE_SIZE) )
-        return -EFAULT;
-    smp_mb();
-    return 1;
-}
-
-/******************  XEN-SPECIFIC HOST INITIALIZATION ********************/
-static int dstmem_order, workmem_order;
-
-static int cpu_callback(
-    struct notifier_block *nfb, unsigned long action, void *hcpu)
-{
-    unsigned int cpu = (unsigned long)hcpu;
-
-    switch ( action )
-    {
-    case CPU_UP_PREPARE: {
-        if ( per_cpu(dstmem, cpu) == NULL )
-            per_cpu(dstmem, cpu) = alloc_xenheap_pages(dstmem_order, 0);
-        if ( per_cpu(workmem, cpu) == NULL )
-            per_cpu(workmem, cpu) = alloc_xenheap_pages(workmem_order, 0);
-        if ( per_cpu(scratch_page, cpu) == NULL )
-            per_cpu(scratch_page, cpu) = alloc_xenheap_page();
-        break;
-    }
-    case CPU_DEAD:
-    case CPU_UP_CANCELED: {
-        if ( per_cpu(dstmem, cpu) != NULL )
-        {
-            free_xenheap_pages(per_cpu(dstmem, cpu), dstmem_order);
-            per_cpu(dstmem, cpu) = NULL;
-        }
-        if ( per_cpu(workmem, cpu) != NULL )
-        {
-            free_xenheap_pages(per_cpu(workmem, cpu), workmem_order);
-            per_cpu(workmem, cpu) = NULL;
-        }
-        if ( per_cpu(scratch_page, cpu) != NULL )
-        {
-            free_xenheap_page(per_cpu(scratch_page, cpu));
-            per_cpu(scratch_page, cpu) = NULL;
-        }
-        break;
-    }
-    default:
-        break;
-    }
-
-    return NOTIFY_DONE;
-}
-
-static struct notifier_block cpu_nfb = {
-    .notifier_call = cpu_callback
-};
-
-int __init tmem_init(void)
-{
-    unsigned int cpu;
-
-    dstmem_order = get_order_from_pages(LZO_DSTMEM_PAGES);
-    workmem_order = get_order_from_bytes(LZO1X_1_MEM_COMPRESS);
-
-    for_each_online_cpu ( cpu )
-    {
-        void *hcpu = (void *)(long)cpu;
-        cpu_callback(&cpu_nfb, CPU_UP_PREPARE, hcpu);
-    }
-
-    register_cpu_notifier(&cpu_nfb);
-
-    return 1;
-}
diff --git a/xen/include/Makefile b/xen/include/Makefile
index f7895e4d4e..325a0b88d9 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -16,7 +16,6 @@ headers-y := \
     compat/physdev.h \
     compat/platform.h \
     compat/sched.h \
-    compat/tmem.h \
     compat/trace.h \
     compat/vcpu.h \
     compat/version.h \
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 1ccf20787a..1b83407fcd 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -34,7 +34,6 @@
 #include "xen.h"
 #include "domctl.h"
 #include "physdev.h"
-#include "tmem.h"
 
 #define XEN_SYSCTL_INTERFACE_VERSION 0x00000012
 
@@ -732,110 +731,6 @@ struct xen_sysctl_psr_alloc {
     } u;
 };
 
-#define XEN_SYSCTL_TMEM_OP_ALL_CLIENTS 0xFFFFU
-
-#define XEN_SYSCTL_TMEM_OP_THAW                   0
-#define XEN_SYSCTL_TMEM_OP_FREEZE                 1
-#define XEN_SYSCTL_TMEM_OP_FLUSH                  2
-#define XEN_SYSCTL_TMEM_OP_DESTROY                3
-#define XEN_SYSCTL_TMEM_OP_LIST                   4
-#define XEN_SYSCTL_TMEM_OP_GET_CLIENT_INFO        5
-#define XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO        6
-#define XEN_SYSCTL_TMEM_OP_GET_POOLS              7
-#define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB      8
-#define XEN_SYSCTL_TMEM_OP_SET_POOLS              9
-#define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN             10
-#define XEN_SYSCTL_TMEM_OP_SET_AUTH               11
-#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_PAGE     19
-#define XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_INV      20
-#define XEN_SYSCTL_TMEM_OP_SAVE_END               21
-#define XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN          30
-#define XEN_SYSCTL_TMEM_OP_RESTORE_PUT_PAGE       32
-#define XEN_SYSCTL_TMEM_OP_RESTORE_FLUSH_PAGE     33
-
-/*
- * XEN_SYSCTL_TMEM_OP_SAVE_GET_NEXT_[PAGE|INV] override the 'buf' in
- * xen_sysctl_tmem_op with this structure - sometimes with an extra
- * page tackled on.
- */
-struct tmem_handle {
-    uint32_t pool_id;
-    uint32_t index;
-    xen_tmem_oid_t oid;
-};
-
-/*
- * XEN_SYSCTL_TMEM_OP_[GET,SAVE]_CLIENT uses the 'client' in
- * xen_tmem_op with this structure, which is mostly used during migration.
- */
-struct xen_tmem_client {
-    uint32_t version;   /* If mismatched we will get XEN_EOPNOTSUPP. */
-    uint32_t maxpools;  /* If greater than what hypervisor supports, will get
-                           XEN_ERANGE. */
-    uint32_t nr_pools;  /* Current amount of pools. Ignored on SET*/
-    union {             /* See TMEM_CLIENT_[COMPRESS,FROZEN] */
-        uint32_t raw;
-        struct {
-            uint8_t frozen:1,
-                    compress:1,
-                    migrating:1;
-        } u;
-    } flags;
-    uint32_t weight;
-};
-typedef struct xen_tmem_client xen_tmem_client_t;
-DEFINE_XEN_GUEST_HANDLE(xen_tmem_client_t);
-
-/*
- * XEN_SYSCTL_TMEM_OP_[GET|SET]_POOLS or XEN_SYSCTL_TMEM_OP_SET_AUTH
- * uses the 'pool' array in * xen_sysctl_tmem_op with this structure.
- * The XEN_SYSCTL_TMEM_OP_GET_POOLS hypercall will
- * return the number of entries in 'pool' or a negative value
- * if an error was encountered.
- * The XEN_SYSCTL_TMEM_OP_SET_[AUTH|POOLS] will return the number of
- * entries in 'pool' processed or a negative value if an error
- * was encountered.
- */
-struct xen_tmem_pool_info {
-    union {
-        uint32_t raw;
-        struct {
-            uint32_t persist:1,    /* See TMEM_POOL_PERSIST. */
-                     shared:1,     /* See TMEM_POOL_SHARED. */
-                     auth:1,       /* See TMEM_POOL_AUTH. */
-                     rsv1:1,
-                     pagebits:8,   /* TMEM_POOL_PAGESIZE_[SHIFT,MASK]. */
-                     rsv2:12,
-                     version:8;    /* TMEM_POOL_VERSION_[SHIFT,MASK]. */
-        } u;
-    } flags;
-    uint32_t id;                  /* Less than tmem_client.maxpools. */
-    uint64_t n_pages;             /* Zero on XEN_SYSCTL_TMEM_OP_SET_[AUTH|POOLS]. */
-    uint64_aligned_t uuid[2];
-};
-typedef struct xen_tmem_pool_info xen_tmem_pool_info_t;
-DEFINE_XEN_GUEST_HANDLE(xen_tmem_pool_info_t);
-
-struct xen_sysctl_tmem_op {
-    uint32_t cmd;       /* IN: XEN_SYSCTL_TMEM_OP_* . */
-    int32_t pool_id;    /* IN: 0 by default unless _SAVE_*, RESTORE_* .*/
-    uint32_t cli_id;    /* IN: client id, 0 for XEN_SYSCTL_TMEM_QUERY_FREEABLE_MB
-                           for all others can be the domain id or
-                           XEN_SYSCTL_TMEM_OP_ALL_CLIENTS for all. */
-    uint32_t len;       /* IN: length of 'buf'. If not applicable to use 0. */
-    uint32_t arg;       /* IN: If not applicable to command use 0. */
-    uint32_t pad;       /* Padding so structure is the same under 32 and 64. */
-    xen_tmem_oid_t oid; /* IN: If not applicable to command use 0s. */
-    union {
-        XEN_GUEST_HANDLE_64(char) buf; /* IN/OUT: Buffer to save/restore */
-        XEN_GUEST_HANDLE_64(xen_tmem_client_t) client; /* IN/OUT for */
-                        /*  XEN_SYSCTL_TMEM_OP_[GET,SAVE]_CLIENT. */
-        XEN_GUEST_HANDLE_64(xen_tmem_pool_info_t) pool; /* OUT for */
-                        /* XEN_SYSCTL_TMEM_OP_GET_POOLS. Must have 'len' */
-                        /* of them. */
-    } u;
-};
-
 /*
  * XEN_SYSCTL_get_cpu_levelling_caps (x86 specific)
  *
@@ -1124,7 +1019,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_psr_cmt_op                    21
 #define XEN_SYSCTL_pcitopoinfo                   22
 #define XEN_SYSCTL_psr_alloc                     23
-#define XEN_SYSCTL_tmem_op                       24
+/* #define XEN_SYSCTL_tmem_op                       24 */
 #define XEN_SYSCTL_get_cpu_levelling_caps        25
 #define XEN_SYSCTL_get_cpu_featureset            26
 #define XEN_SYSCTL_livepatch_op                  27
@@ -1154,7 +1049,6 @@ struct xen_sysctl {
         struct xen_sysctl_coverage_op       coverage_op;
         struct xen_sysctl_psr_cmt_op        psr_cmt_op;
         struct xen_sysctl_psr_alloc         psr_alloc;
-        struct xen_sysctl_tmem_op           tmem_op;
         struct xen_sysctl_cpu_levelling_caps cpu_levelling_caps;
         struct xen_sysctl_cpu_featureset    cpu_featureset;
         struct xen_sysctl_livepatch_op      livepatch;
diff --git a/xen/include/public/tmem.h b/xen/include/public/tmem.h
index aa0aafaa9d..c02be9f704 100644
--- a/xen/include/public/tmem.h
+++ b/xen/include/public/tmem.h
@@ -1,8 +1,8 @@
 /******************************************************************************
  * tmem.h
- * 
+ *
  * Guest OS interface to Xen Transcendent Memory.
- * 
+ *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to
  * deal in the Software without restriction, including without limitation the
@@ -29,15 +29,11 @@
 
 #include "xen.h"
 
+#if __XEN_INTERFACE_VERSION__ < 0x00041200
+
 /* version of ABI */
 #define TMEM_SPEC_VERSION          1
 
-/* Commands to HYPERVISOR_tmem_op() */
-#ifdef __XEN__
-#define TMEM_CONTROL               0 /* Now called XEN_SYSCTL_tmem_op */
-#else
-#undef TMEM_CONTROL
-#endif
 #define TMEM_NEW_POOL              1
 #define TMEM_DESTROY_POOL          2
 #define TMEM_PUT_PAGE              4
@@ -111,6 +107,8 @@ typedef struct tmem_op tmem_op_t;
 DEFINE_XEN_GUEST_HANDLE(tmem_op_t);
 #endif
 
+#endif  /* __XEN_INTERFACE_VERSION__ < 0x00041200 */
+
 #endif /* __XEN_PUBLIC_TMEM_H__ */
 
 /*
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea57d..888775f9a7 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -12,7 +12,6 @@
 #include <public/sysctl.h>
 #include <public/platform.h>
 #include <public/event_channel.h>
-#include <public/tmem.h>
 #include <public/version.h>
 #include <public/pmu.h>
 #include <public/hvm/dm_op.h>
@@ -130,12 +129,6 @@ extern long
 do_xsm_op(
     XEN_GUEST_HANDLE_PARAM(xsm_op_t) u_xsm_op);
 
-#ifdef CONFIG_TMEM
-extern long
-do_tmem_op(
-    XEN_GUEST_HANDLE_PARAM(tmem_op_t) uops);
-#endif
-
 extern long
 do_xenoprof_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg);
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 054d02e6c0..1c9ab306c0 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -248,8 +248,10 @@ struct npfec {
 #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
 #define _MEMF_populate_on_demand 1
 #define  MEMF_populate_on_demand (1U<<_MEMF_populate_on_demand)
+#if 0
 #define _MEMF_tmem        2
 #define  MEMF_tmem        (1U<<_MEMF_tmem)
+#endif
 #define _MEMF_no_dma      3
 #define  MEMF_no_dma      (1U<<_MEMF_no_dma)
 #define _MEMF_exact_node  4
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 0309c1f2a0..c8ca3e6853 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -455,9 +455,6 @@ struct domain
      */
     spinlock_t hypercall_deadlock_mutex;
 
-    /* transcendent memory, auto-allocated on first tmem op by each domain */
-    struct client *tmem_client;
-
     struct lock_profile_qhead profile_head;
 
     /* Various vm_events */
diff --git a/xen/include/xen/tmem.h b/xen/include/xen/tmem.h
deleted file mode 100644
index 414a14d808..0000000000
--- a/xen/include/xen/tmem.h
+++ /dev/null
@@ -1,45 +0,0 @@
-/******************************************************************************
- * tmem.h
- *
- * Transcendent memory
- *
- * Copyright (c) 2008, Dan Magenheimer, Oracle Corp.
- */
-
-#ifndef __XEN_TMEM_H__
-#define __XEN_TMEM_H__
-
-struct xen_sysctl_tmem_op;
-
-#ifdef CONFIG_TMEM
-extern int tmem_control(struct xen_sysctl_tmem_op *op);
-extern void tmem_destroy(void *);
-extern void *tmem_relinquish_pages(unsigned int, unsigned int);
-extern unsigned long tmem_freeable_pages(void);
-#else
-static inline int
-tmem_control(struct xen_sysctl_tmem_op *op)
-{
-    return -ENOSYS;
-}
-
-static inline void
-tmem_destroy(void *p)
-{
-    return;
-}
-
-static inline void *
-tmem_relinquish_pages(unsigned int x, unsigned int y)
-{
-    return NULL;
-}
-
-static inline unsigned long
-tmem_freeable_pages(void)
-{
-    return 0;
-}
-#endif /* CONFIG_TMEM */
-
-#endif /* __XEN_TMEM_H__ */
diff --git a/xen/include/xen/tmem_control.h b/xen/include/xen/tmem_control.h
deleted file mode 100644
index ad04cf707b..0000000000
--- a/xen/include/xen/tmem_control.h
+++ /dev/null
@@ -1,39 +0,0 @@
-/*
- * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
- *
- */
-
-#ifndef __XEN_TMEM_CONTROL_H__
-#define __XEN_TMEM_CONTROL_H__
-
-#ifdef CONFIG_TMEM
-#include <public/sysctl.h>
-/* Variables and functions that tmem_control.c needs from tmem.c */
-
-extern struct tmem_statistics tmem_stats;
-extern struct tmem_global tmem_global;
-
-extern rwlock_t tmem_rwlock;
-
-int tmem_evict(void);
-int do_tmem_control(struct xen_sysctl_tmem_op *op);
-
-struct client *client_create(domid_t cli_id);
-int do_tmem_new_pool(domid_t this_cli_id, uint32_t d_poolid, uint32_t flags,
-                     uint64_t uuid_lo, uint64_t uuid_hi);
-
-int tmemc_shared_pool_auth(domid_t cli_id, uint64_t uuid_lo,
-                           uint64_t uuid_hi, bool auth);
-#endif /* CONFIG_TMEM */
-
-#endif /* __XEN_TMEM_CONTROL_H__ */
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/xen/tmem_xen.h b/xen/include/xen/tmem_xen.h
deleted file mode 100644
index 8516a0b131..0000000000
--- a/xen/include/xen/tmem_xen.h
+++ /dev/null
@@ -1,343 +0,0 @@
-/******************************************************************************
- * tmem_xen.h
- *
- * Xen-specific Transcendent memory
- *
- * Copyright (c) 2009, Dan Magenheimer, Oracle Corp.
- */
-
-#ifndef __XEN_TMEM_XEN_H__
-#define __XEN_TMEM_XEN_H__
-
-#include <xen/mm.h> /* heap alloc/free */
-#include <xen/pfn.h>
-#include <xen/xmalloc.h> /* xmalloc/xfree */
-#include <xen/sched.h>  /* struct domain */
-#include <xen/guest_access.h> /* copy_from_guest */
-#include <xen/hash.h> /* hash_long */
-#include <xen/domain_page.h> /* __map_domain_page */
-#include <xen/rbtree.h> /* struct rb_root */
-#include <xsm/xsm.h> /* xsm_tmem_control */
-#include <public/tmem.h>
-#ifdef CONFIG_COMPAT
-#include <compat/tmem.h>
-#endif
-typedef uint32_t pagesize_t;  /* like size_t, must handle largest PAGE_SIZE */
-
-#define IS_PAGE_ALIGNED(addr) IS_ALIGNED((unsigned long)(addr), PAGE_SIZE)
-#define IS_VALID_PAGE(_pi)    mfn_valid(page_to_mfn(_pi))
-
-extern struct page_list_head tmem_page_list;
-extern spinlock_t tmem_page_list_lock;
-extern unsigned long tmem_page_list_pages;
-extern atomic_t freeable_page_count;
-
-extern int tmem_init(void);
-#define tmem_hash hash_long
-
-extern bool opt_tmem_compress;
-static inline bool tmem_compression_enabled(void)
-{
-    return opt_tmem_compress;
-}
-
-#ifdef CONFIG_TMEM
-extern bool opt_tmem;
-static inline bool tmem_enabled(void)
-{
-    return opt_tmem;
-}
-
-static inline void tmem_disable(void)
-{
-    opt_tmem = false;
-}
-#else
-static inline bool tmem_enabled(void)
-{
-    return false;
-}
-
-static inline void tmem_disable(void)
-{
-}
-#endif /* CONFIG_TMEM */
-
-/*
- * Memory free page list management
- */
-
-static inline struct page_info *tmem_page_list_get(void)
-{
-    struct page_info *pi;
-
-    spin_lock(&tmem_page_list_lock);
-    if ( (pi = page_list_remove_head(&tmem_page_list)) != NULL )
-        tmem_page_list_pages--;
-    spin_unlock(&tmem_page_list_lock);
-    ASSERT((pi == NULL) || IS_VALID_PAGE(pi));
-    return pi;
-}
-
-static inline void tmem_page_list_put(struct page_info *pi)
-{
-    ASSERT(IS_VALID_PAGE(pi));
-    spin_lock(&tmem_page_list_lock);
-    page_list_add(pi, &tmem_page_list);
-    tmem_page_list_pages++;
-    spin_unlock(&tmem_page_list_lock);
-}
-
-/*
- * Memory allocation for persistent data 
- */
-static inline struct page_info *__tmem_alloc_page_thispool(struct domain *d)
-{
-    struct page_info *pi;
-
-    /* note that this tot_pages check is not protected by d->page_alloc_lock,
-     * so may race and periodically fail in donate_page or alloc_domheap_pages
-     * That's OK... neither is a problem, though chatty if log_lvl is set */ 
-    if ( d->tot_pages >= d->max_pages )
-        return NULL;
-
-    if ( tmem_page_list_pages )
-    {
-        if ( (pi = tmem_page_list_get()) != NULL )
-        {
-            if ( donate_page(d,pi,0) == 0 )
-                goto out;
-            else
-                tmem_page_list_put(pi);
-        }
-    }
-
-    pi = alloc_domheap_pages(d,0,MEMF_tmem);
-
-out:
-    ASSERT((pi == NULL) || IS_VALID_PAGE(pi));
-    return pi;
-}
-
-static inline void __tmem_free_page_thispool(struct page_info *pi)
-{
-    struct domain *d = page_get_owner(pi);
-
-    ASSERT(IS_VALID_PAGE(pi));
-    if ( (d == NULL) || steal_page(d,pi,0) == 0 )
-        tmem_page_list_put(pi);
-    else
-    {
-        scrub_one_page(pi);
-        ASSERT((pi->count_info & ~(PGC_allocated | 1)) == 0);
-        free_domheap_pages(pi,0);
-    }
-}
-
-/*
- * Memory allocation for ephemeral (non-persistent) data
- */
-static inline struct page_info *__tmem_alloc_page(void)
-{
-    struct page_info *pi = tmem_page_list_get();
-
-    if ( pi == NULL)
-        pi = alloc_domheap_pages(0,0,MEMF_tmem);
-
-    if ( pi )
-        atomic_inc(&freeable_page_count);
-    ASSERT((pi == NULL) || IS_VALID_PAGE(pi));
-    return pi;
-}
-
-static inline void __tmem_free_page(struct page_info *pi)
-{
-    ASSERT(IS_VALID_PAGE(pi));
-    tmem_page_list_put(pi);
-    atomic_dec(&freeable_page_count);
-}
-
-/*  "Client" (==domain) abstraction */
-static inline struct client *tmem_client_from_cli_id(domid_t cli_id)
-{
-    struct client *c;
-    struct domain *d = rcu_lock_domain_by_id(cli_id);
-    if (d == NULL)
-        return NULL;
-    c = d->tmem_client;
-    rcu_unlock_domain(d);
-    return c;
-}
-
-/* these typedefs are in the public/tmem.h interface
-typedef XEN_GUEST_HANDLE(void) cli_mfn_t;
-typedef XEN_GUEST_HANDLE(char) cli_va_t;
-*/
-typedef XEN_GUEST_HANDLE_PARAM(tmem_op_t) tmem_cli_op_t;
-typedef XEN_GUEST_HANDLE_PARAM(char) tmem_cli_va_param_t;
-
-static inline int tmem_get_tmemop_from_client(tmem_op_t *op, tmem_cli_op_t uops)
-{
-#ifdef CONFIG_COMPAT
-    if ( is_hvm_vcpu(current) ? hvm_guest_x86_mode(current) != 8
-                              : is_pv_32bit_vcpu(current) )
-    {
-        int rc;
-        enum XLAT_tmem_op_u u;
-        tmem_op_compat_t cop;
-
-        rc = copy_from_guest(&cop, guest_handle_cast(uops, void), 1);
-        if ( rc )
-            return rc;
-        switch ( cop.cmd )
-        {
-        case TMEM_NEW_POOL:   u = XLAT_tmem_op_u_creat; break;
-        default:              u = XLAT_tmem_op_u_gen ;  break;
-        }
-        XLAT_tmem_op(op, &cop);
-        return 0;
-    }
-#endif
-    return copy_from_guest(op, uops, 1);
-}
-
-#define tmem_cli_buf_null guest_handle_from_ptr(NULL, char)
-#define TMEM_CLI_ID_NULL ((domid_t)((domid_t)-1L))
-#define tmem_cli_id_str "domid"
-#define tmem_client_str "domain"
-
-int tmem_decompress_to_client(xen_pfn_t, void *, size_t,
-			     tmem_cli_va_param_t);
-int tmem_compress_from_client(xen_pfn_t, void **, size_t *,
-			     tmem_cli_va_param_t);
-
-int tmem_copy_from_client(struct page_info *, xen_pfn_t, tmem_cli_va_param_t);
-int tmem_copy_to_client(xen_pfn_t, struct page_info *, tmem_cli_va_param_t);
-
-#define tmem_client_err(fmt, args...)  printk(XENLOG_G_ERR fmt, ##args)
-#define tmem_client_warn(fmt, args...) printk(XENLOG_G_WARNING fmt, ##args)
-#define tmem_client_info(fmt, args...) printk(XENLOG_G_INFO fmt, ##args)
-
-/* Global statistics (none need to be locked). */
-struct tmem_statistics {
-    unsigned long total_tmem_ops;
-    unsigned long errored_tmem_ops;
-    unsigned long total_flush_pool;
-    unsigned long alloc_failed;
-    unsigned long alloc_page_failed;
-    unsigned long evicted_pgs;
-    unsigned long evict_attempts;
-    unsigned long relinq_pgs;
-    unsigned long relinq_attempts;
-    unsigned long max_evicts_per_relinq;
-    unsigned long low_on_memory;
-    unsigned long deduped_puts;
-    unsigned long tot_good_eph_puts;
-    int global_obj_count_max;
-    int global_pgp_count_max;
-    int global_pcd_count_max;
-    int global_page_count_max;
-    int global_rtree_node_count_max;
-    long global_eph_count_max;
-    unsigned long failed_copies;
-    unsigned long pcd_tot_tze_size;
-    unsigned long pcd_tot_csize;
-    /* Global counters (should use long_atomic_t access). */
-    atomic_t global_obj_count;
-    atomic_t global_pgp_count;
-    atomic_t global_pcd_count;
-    atomic_t global_page_count;
-    atomic_t global_rtree_node_count;
-};
-
-#define atomic_inc_and_max(_c) do { \
-    atomic_inc(&tmem_stats._c); \
-    if ( _atomic_read(tmem_stats._c) > tmem_stats._c##_max ) \
-        tmem_stats._c##_max = _atomic_read(tmem_stats._c); \
-} while (0)
-
-#define atomic_dec_and_assert(_c) do { \
-    atomic_dec(&tmem_stats._c); \
-    ASSERT(_atomic_read(tmem_stats._c) >= 0); \
-} while (0)
-
-#define MAX_GLOBAL_SHARED_POOLS  16
-struct tmem_global {
-    struct list_head ephemeral_page_list;  /* All pages in ephemeral pools. */
-    struct list_head client_list;
-    struct tmem_pool *shared_pools[MAX_GLOBAL_SHARED_POOLS];
-    bool shared_auth;
-    long eph_count;  /* Atomicity depends on eph_lists_spinlock. */
-    atomic_t client_weight_total;
-};
-
-#define MAX_POOLS_PER_DOMAIN 16
-
-struct tmem_pool;
-struct tmem_page_descriptor;
-struct tmem_page_content_descriptor;
-struct client {
-    struct list_head client_list;
-    struct tmem_pool *pools[MAX_POOLS_PER_DOMAIN];
-    struct domain *domain;
-    struct xmem_pool *persistent_pool;
-    struct list_head ephemeral_page_list;
-    long eph_count, eph_count_max;
-    domid_t cli_id;
-    xen_tmem_client_t info;
-    /* For save/restore/migration. */
-    bool was_frozen;
-    struct list_head persistent_invalidated_list;
-    struct tmem_page_descriptor *cur_pgp;
-    /* Statistics collection. */
-    unsigned long compress_poor, compress_nomem;
-    unsigned long compressed_pages;
-    uint64_t compressed_sum_size;
-    uint64_t total_cycles;
-    unsigned long succ_pers_puts, succ_eph_gets, succ_pers_gets;
-    /* Shared pool authentication. */
-    uint64_t shared_auth_uuid[MAX_GLOBAL_SHARED_POOLS][2];
-};
-
-#define POOL_PAGESHIFT (PAGE_SHIFT - 12)
-#define OBJ_HASH_BUCKETS 256 /* Must be power of two. */
-#define OBJ_HASH_BUCKETS_MASK (OBJ_HASH_BUCKETS-1)
-
-#define is_persistent(_p)  (_p->persistent)
-#define is_shared(_p)      (_p->shared)
-
-struct tmem_pool {
-    bool shared;
-    bool persistent;
-    bool is_dying;
-    struct client *client;
-    uint64_t uuid[2]; /* 0 for private, non-zero for shared. */
-    uint32_t pool_id;
-    rwlock_t pool_rwlock;
-    struct rb_root obj_rb_root[OBJ_HASH_BUCKETS]; /* Protected by pool_rwlock. */
-    struct list_head share_list; /* Valid if shared. */
-    int shared_count; /* Valid if shared. */
-    /* For save/restore/migration. */
-    struct list_head persistent_page_list;
-    struct tmem_page_descriptor *cur_pgp;
-    /* Statistics collection. */
-    atomic_t pgp_count;
-    int pgp_count_max;
-    long obj_count;  /* Atomicity depends on pool_rwlock held for write. */
-    long obj_count_max;
-    unsigned long objnode_count, objnode_count_max;
-    uint64_t sum_life_cycles;
-    uint64_t sum_evicted_cycles;
-    unsigned long puts, good_puts, no_mem_puts;
-    unsigned long dup_puts_flushed, dup_puts_replaced;
-    unsigned long gets, found_gets;
-    unsigned long flushs, flushs_found;
-    unsigned long flush_objs, flush_objs_found;
-};
-
-struct share_list {
-    struct list_head share_list;
-    struct client *client;
-};
-
-#endif /* __XEN_TMEM_XEN_H__ */
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 527332054a..2aa238f41f 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -126,8 +126,6 @@
 ?	sched_pin_override		sched.h
 ?	sched_remote_shutdown		sched.h
 ?	sched_shutdown			sched.h
-?	tmem_oid			tmem.h
-!	tmem_op				tmem.h
 ?	t_buf				trace.h
 ?	vcpu_get_physid			vcpu.h
 ?	vcpu_register_vcpu_info		vcpu.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index a29d1efe9b..94af3dfb80 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -433,12 +433,6 @@ static XSM_INLINE int xsm_page_offline(XSM_DEFAULT_ARG uint32_t cmd)
     return xsm_default_action(action, current->domain, NULL);
 }
 
-static XSM_INLINE int xsm_tmem_op(XSM_DEFAULT_VOID)
-{
-    XSM_ASSERT_ACTION(XSM_HOOK);
-    return xsm_default_action(action, current->domain, NULL);
-}
-
 static XSM_INLINE long xsm_do_xsm_op(XEN_GUEST_HANDLE_PARAM(xsm_op_t) op)
 {
     return -ENOSYS;
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 3b192b5c31..ceae80b74b 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -127,7 +127,6 @@ struct xsm_operations {
     int (*resource_setup_misc) (void);
 
     int (*page_offline)(uint32_t cmd);
-    int (*tmem_op)(void);
 
     long (*do_xsm_op) (XEN_GUEST_HANDLE_PARAM(xsm_op_t) op);
 #ifdef CONFIG_COMPAT
@@ -530,11 +529,6 @@ static inline int xsm_page_offline(xsm_default_t def, uint32_t cmd)
     return xsm_ops->page_offline(cmd);
 }
 
-static inline int xsm_tmem_op(xsm_default_t def)
-{
-    return xsm_ops->tmem_op();
-}
-
 static inline long xsm_do_xsm_op (XEN_GUEST_HANDLE_PARAM(xsm_op_t) op)
 {
     return xsm_ops->do_xsm_op(op);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 5701047c06..34f7a305ff 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -103,7 +103,6 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, resource_setup_misc);
 
     set_to_dummy_if_null(ops, page_offline);
-    set_to_dummy_if_null(ops, tmem_op);
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_nested);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 96d31aaf08..8fbbd2e053 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -808,9 +808,6 @@ static int flask_sysctl(int cmd)
         return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
                                     XEN2__PSR_ALLOC, NULL);
 
-    case XEN_SYSCTL_tmem_op:
-        return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
-
     case XEN_SYSCTL_get_cpu_levelling_caps:
         return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
                                     XEN2__GET_CPU_LEVELLING_CAPS, NULL);
@@ -1176,11 +1173,6 @@ static inline int flask_page_offline(uint32_t cmd)
     }
 }
 
-static inline int flask_tmem_op(void)
-{
-    return domain_has_xen(current->domain, XEN__TMEM_OP);
-}
-
 static int flask_add_to_physmap(struct domain *d1, struct domain *d2)
 {
     return domain_has_perm(d1, d2, SECCLASS_MMU, MMU__PHYSMAP);
@@ -1789,7 +1781,6 @@ static struct xsm_operations flask_ops = {
     .resource_setup_misc = flask_resource_setup_misc,
 
     .page_offline = flask_page_offline,
-    .tmem_op = flask_tmem_op,
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_nested = flask_hvm_param_nested,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 6fecfdaa83..843d42d824 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -67,10 +67,6 @@ class xen
     lockprof
 # XEN_SYSCTL_cpupool_op
     cpupool_op
-# tmem hypercall (any access)
-    tmem_op
-# XEN_SYSCTL_tmem_op command of tmem (part of sysctl)
-    tmem_control
 # XEN_SYSCTL_scheduler_op with XEN_DOMCTL_SCHEDOP_getinfo, XEN_SYSCTL_sched_id, XEN_DOMCTL_SCHEDOP_getvcpuinfo
     getscheduler
 # XEN_SYSCTL_scheduler_op with XEN_DOMCTL_SCHEDOP_putinfo, XEN_DOMCTL_SCHEDOP_putvcpuinfo
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v2 3/3] docs: remove tmem related text
  2018-11-28 13:58 [PATCH v2 0/3] Remove tmem Wei Liu
  2018-11-28 13:58 ` [PATCH v2 1/3] tools: remove tmem code and commands Wei Liu
  2018-11-28 13:58 ` [PATCH v2 2/3] xen: remove tmem from hypervisor Wei Liu
@ 2018-11-28 13:58 ` Wei Liu
  2018-11-28 15:49   ` Daniel De Graaf
  2018-11-29  2:50 ` [PATCH v2 0/3] Remove tmem Konrad Rzeszutek Wilk
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Wei Liu @ 2018-11-28 13:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Daniel De Graaf

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 docs/man/xl.conf.pod.5              |   9 +-
 docs/man/xl.pod.1.in                |  68 ----
 docs/misc/tmem-internals.html       | 789 ------------------------------------
 docs/misc/xen-command-line.markdown |   6 -
 docs/misc/xsm-flask.txt             |  36 --
 5 files changed, 2 insertions(+), 906 deletions(-)
 delete mode 100644 docs/misc/tmem-internals.html

diff --git a/docs/man/xl.conf.pod.5 b/docs/man/xl.conf.pod.5
index 37262a7ef8..b1bde7d657 100644
--- a/docs/man/xl.conf.pod.5
+++ b/docs/man/xl.conf.pod.5
@@ -148,10 +148,8 @@ The default choice is "xvda".
 =item B<claim_mode=BOOLEAN>
 
 If this option is enabled then when a guest is created there will be a
-guarantee that there is memory available for the guest. This is an
-particularly acute problem on hosts with memory over-provisioned guests
-that use tmem and have self-balloon enabled (which is the default
-option). The self-balloon mechanism can deflate/inflate the balloon
+guarantee that there is memory available for the guest.
+The self-balloon mechanism can deflate/inflate the balloon
 quickly and the amount of free memory (which C<xl info> can show) is
 stale the moment it is printed. When claim is enabled a reservation for
 the amount of memory (see 'memory' in xl.conf(5)) is set, which is then
@@ -163,9 +161,6 @@ If the reservation cannot be met the guest creation fails immediately
 instead of taking seconds/minutes (depending on the size of the guest)
 while the guest is populated.
 
-Note that to enable tmem type guests, one needs to provide C<tmem> on the
-Xen hypervisor argument and as well on the Linux kernel command line.
-
 Default: C<1>
 
 =over 4
diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
index 18006880d6..7c765dbc3c 100644
--- a/docs/man/xl.pod.1.in
+++ b/docs/man/xl.pod.1.in
@@ -1677,74 +1677,6 @@ Obtain information of USB devices connected as such via the device model
 
 =back
 
-=head1 TRANSCENDENT MEMORY (TMEM)
-
-=over 4
-
-=item B<tmem-list> I<[OPTIONS]> I<domain-id>
-
-List tmem pools.
-
-B<OPTIONS>
-
-=over 4
-
-=item B<-l>
-
-If this parameter is specified, also list tmem stats.
-
-=back
-
-=item B<tmem-freeze> I<domain-id>
-
-Freeze tmem pools.
-
-=item B<tmem-thaw> I<domain-id>
-
-Thaw tmem pools.
-
-=item B<tmem-set> I<domain-id> [I<OPTIONS>]
-
-Change tmem settings.
-
-B<OPTIONS>
-
-=over 4
-
-=item B<-w> I<WEIGHT>
-
-Weight (int)
-
-=item B<-p> I<COMPRESS>
-
-Compress (int)
-
-=back
-
-=item B<tmem-shared-auth> I<domain-id> [I<OPTIONS>]
-
-De/authenticate shared tmem pool.
-
-B<OPTIONS>
-
-=over 4
-
-=item B<-u> I<UUID>
-
-Specify uuid (abcdef01-2345-6789-1234-567890abcdef)
-
-=item B<-a> I<AUTH>
-
-0=auth,1=deauth
-
-=back
-
-=item B<tmem-freeable>
-
-Get information about how much freeable memory (MB) is in-use by tmem.
-
-=back
-
 =head1 FLASK
 
 B<FLASK> is a security framework that defines a mandatory access control policy
diff --git a/docs/misc/tmem-internals.html b/docs/misc/tmem-internals.html
deleted file mode 100644
index 9b7e70e650..0000000000
--- a/docs/misc/tmem-internals.html
+++ /dev/null
@@ -1,789 +0,0 @@
-<h1>Transcendent Memory Internals in Xen</h1>
-<P>
-by Dan Magenheimer, Oracle Corp.</p>
-<P>
-Draft 0.1 -- Updated: 20100324
-<h2>Overview</h2>
-<P>
-This document focuses on the internal implementation of
-Transcendent Memory (tmem) on Xen.  It assumes
-that the reader has a basic knowledge of the terminology, objectives, and
-functionality of tmem and also has access to the Xen source code.
-It corresponds to the Xen 4.0 release, with a
-patch added to support page deduplication (V2).
-<P>
-The primary responsibilities of the tmem implementation are to:
-<ul>
-<li>manage a potentially huge and extremely dynamic
-number of memory pages from a potentially large number of clients (domains)
-with low memory overhead and proper isolation
-<li>provide quick and efficient access to these
-pages with as much concurrency as possible
-<li>enable efficient reclamation and <i>eviction</i> of pages (e.g. when
-memory is fully utilized)
-<li>optionally, increase page density through compression and/or
-deduplication
-<li>where necessary, properly assign and account for
-memory belonging to guests to avoid malicious and/or accidental unfairness
-and/or denial-of-service
-<li>record utilization statistics and make them available to management tools
-</ul>
-<h2>Source Code Organization</h2>
-
-<P>
-The source code in Xen that provides the tmem functionality
-is divided up into four files: tmem.c, tmem.h, tmem_xen.c, and tmem_xen.h.
-The files tmem.c and tmem.h are intended to
-be implementation- (and hypervisor-) independent and the other two files
-provide the Xen-specific code.  This
-division is intended to make it easier to port tmem functionality to other
-hypervisors, though at this time porting to other hypervisors has not been
-attempted.  Together, these four files
-total less than 4000 lines of C code.
-<P>
-Even ignoring the implementation-specific functionality, the
-implementation-independent part of tmem has several dependencies on
-library functionality (Xen source filenames in parentheses):
-<ul>
-<li>
-a good fast general-purpose dynamic memory
-allocator with bounded response time and efficient use of memory for a very
-large number of sub-page allocations.  To
-achieve this in Xen, the bad old memory allocator was replaced with a
-slightly-modified version of TLSF (xmalloc_tlsf.c), first ported to Linux by
-Nitin Gupta for compcache.
-<li>
-good tree data structure libraries, specifically
-<i>red-black</i> trees (rbtree.c) and <i>radix</i> trees (radix-tree.c).
-Code for these was borrowed from Linux and adapted for tmem and Xen.
-<li>
-good locking and list code.  Both of these existed in Xen and required
-little or no change.
-<li>
-optionally, a good fast lossless compression
-library.  The Xen implementation added to
-support tmem uses LZO1X (lzo.c), also ported to Linux by Nitin Gupta.
-</ul>
-<P>
-More information about the specific functionality of these
-libraries can easily be found through a search engine, via wikipedia, or in the
-Xen or Linux source logs so we will not elaborate further here.
-
-<h2>Prefixes/Abbreviations/Glossary</h2>
-
-<P>
-The tmem code uses several prefixes and abbreviations.
-Knowledge of these will improve code readability:
-<ul>
-<li>
-<i>tmh</i> ==
-transcendent memory host.  Functions or
-data structures that are defined by the implementation-specific code, i.e. the
-Xen host code
-<li>
-<i>tmemc</i>
-== transcendent memory control.
-Functions or data structures that provide management tool functionality,
-rather than core tmem operations.
-<li>
-<i>cli </i>or
-<i>client</i> == client.
-The tmem generic term for a domain or a guest OS.
-</ul>
-<P>
-When used in prose, common tmem operations are indicated
-with a different font, such as <big><kbd>put</kbd></big>
-and <big><kbd>get</kbd></big>.
-
-<h2>Key Data Structures</h2>
-
-<P>
-To manage a huge number of pages, efficient data structures
-must be carefully selected.
-<P>
-Recall that a tmem-enabled guest OS may create one or more
-pools with different attributes.  It then
-<big><kbd>put</kbd></big>s and <big><kbd>get</kbd></big>s
-pages to/from this pool, identifying the page
-with a <i>handle</i> that consists of a <i>pool_id</i>, an <i>
-object_id</i>, and a <i>page_id </i>(sometimes
-called an <i>index</i>).
-This suggests a few obvious core data
-structures:
-<ul>
-<li>
-When a guest OS first calls tmem, a <i>client_t</i> is created to contain
-and track all uses of tmem by that guest OS.  Among
-other things, a <i>client_t</i> keeps pointers
-to a fixed number of pools (16 in the current Xen implementation).
-<li>
-When a guest OS requests a new pool, a <i>pool_t</i> is created.
-Some pools are shared and are kept in a
-sharelist (<i>sharelist_t</i>) which points
-to all the clients that are sharing the pool.
-Since an <i>object_id</i> is 64-bits,
-a <i>pool_t</i> must be able to keep track
-of a potentially very large number of objects.
-To do so, it maintains a number of parallel trees (256 in the current
-Xen implementation) and a hash algorithm is applied to the <i>object_id</i>
-to select the correct tree.
-Each tree element points to an object.
-Because an <i>object_id</i> usually represents an <i>inode</i>
-(a unique file number identifier), and <i>inode</i> numbers
-are fairly random, though often &quot;clumpy&quot;, a <i>red-black tree</i>
-is used.
-<li>
-When a guest first
-<big><kbd>put</kbd></big>s a page to a pool with an as-yet-unused <i>object_id</i>, an
-<i>obj_t</i> is created.  Since a
-<i>page_id</i> is usually an index into a file,
-it is often a small number, but may sometimes be very large (up to
-32-bits).  A <i>radix tree</i> is a good data structure to contain items
-with this kind of index distribution.
-<li>
-When a page is
-<big><kbd>put</kbd></big>, a page descriptor, or <i>pgp_t</i>, is created, which
-among other things will point to the storage location where the data is kept.
-In the normal case the pointer is to a <i>pfp_t</i>, which is an
-implementation-specific datatype representing a physical pageframe in memory
-(which in Xen is a &quot;struct page_info&quot;).
-When deduplication is enabled, it points to
-yet another data structure, a <i>pcd_t</i>
-(see below).  When compression is enabled
-(and deduplication is not), the pointer points directly to the compressed data.
-For reasons we will see shortly, each <i>pgp_t</i> that represents
-an <i>ephemeral</i> page (that is, a page placed
-in an <i>ephemeral</i> pool) is also placed
-into two doubly-linked linked lists, one containing all ephemeral pages
-<big><kbd>put</kbd></big> by the same client and one
-containing all ephemeral pages across all clients (&quot;global&quot;).
-<li>
-When deduplication is enabled, multiple <i>pgp_t</i>'s may need to point to
-the same data, so another data structure (and level of indirection) is used
-called a page content descriptor, or <i>pcd_t</i>.
-Multiple page descriptors (<i>pgp_t</i>'s) may point to the same <i>pcd_t</i>.
-The <i>pcd_t</i>, in turn, points to either a <i>pfp_t</i>
-(if a full page of data), directly to a
-location in memory (if the page has been compressed or trailing zeroes have
-been eliminated), or even a NULL pointer (if the page contained all zeroes and
-trailing zero elimination is enabled).
-</ul>
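The tree-selection step described above (hashing a 64-bit <i>object_id</i> into one of the pool's 256 parallel trees) can be sketched in C as follows. This is a simplified, hypothetical illustration: the bucket count matches the 256 trees mentioned in the text, but the folding hash itself is an assumption, not the actual Xen code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: a pool keeps 256 parallel trees and picks one
 * by hashing the 64-bit object_id (names are illustrative, not Xen's). */
#define OBJ_HASH_BUCKETS 256

/* Fold every byte of the 64-bit id so "clumpy" inode numbers still
 * spread across buckets. */
static unsigned int oid_hash(uint64_t object_id)
{
    unsigned int h = 0;
    for (int i = 0; i < 8; i++) {
        h = h * 31 + (unsigned int)(object_id & 0xff);
        object_id >>= 8;
    }
    return h % OBJ_HASH_BUCKETS;
}
```

The selected bucket then holds the red-black tree that is actually searched for the object.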
-<P>
-The most apparent usage of this multi-layer web of data structures
-is &quot;top-down&quot; because, in normal operation, the vast majority of tmem
-operations invoked by a client are
-<big><kbd>put</kbd></big>s and <big><kbd>get</kbd></big>s, which require the various
-data structures to be walked starting with the <i>client_t</i>, then
-a <i>pool_t</i>, then an <i>obj_t</i>, then a <i>pgp_t</i>.
-However, there is another highly frequent tmem operation that is not
-visible from a client: memory reclamation.
-Since tmem attempts to use all spare memory in the system, it must
-frequently free up, or <i>evict</i>,
-pages.  The eviction algorithm will be
-explained in more detail later but, in brief, to free memory, ephemeral pages
-are removed from the tail of one of the doubly-linked lists, which means that
-all of the data structures associated with that page-to-be-removed must be
-updated or eliminated and freed.  As a
-result, each data structure also contains a <i>back-pointer</i>
-to its parent, for example every <i>obj_t</i>
-contains a pointer to its containing <i>pool_t</i>.
-<P>
-This complex web of interconnected data structures is updated constantly and
-thus extremely sensitive to careless code changes which, for example, may
-result in unexpected hypervisor crashes or non-obvious memory leaks.
-On the other hand, the code is fairly well
-modularized so, once understood, it is possible to relatively easily switch out
-one kind of data structure for another.
-To catch problems as quickly as possible when debug is enabled, most of
-the data structures are equipped with <i>sentinels</i> and many inter-function
-assumptions are documented and tested dynamically
-with <i>assertions</i>.
-While these clutter and lengthen the tmem
-code substantially, their presence has proven invaluable on many occasions.
-<P>
-For completeness, we should also describe a key data structure in the Xen
-implementation-dependent code: the <i>tmh_page_list</i>. For security and
-performance reasons, pages that are freed due to tmem operations (such
-as <big><kbd>get</kbd></big>) are not immediately put back into Xen's pool
-of free memory (aka the Xen <i>heap</i>).
-Tmem pages may contain guest-private data that must be <i>scrubbed</i> before
-those memory pages are released for the use of other guests.
-But if a page is immediately re-used inside of tmem itself, the entire
-page is overwritten with new data, so need not be scrubbed.
-Since tmem is usually the most frequent
-customer of the Xen heap allocation code, it would be a waste of time to scrub
-a page, release it to the Xen heap, and then immediately re-allocate it
-again.  So, instead, tmem maintains
-currently-unused pages of memory on its own free list, <i>tmh_page_list</i>,
-and returns the pages to Xen only when non-tmem Xen
-heap allocation requests would otherwise fail.
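The scrub-avoidance idea behind <i>tmh_page_list</i> can be modeled roughly as below. The types and function names are invented for illustration; only the rule itself — pages stay unscrubbed while they remain inside tmem, and are scrubbed only when released to the general heap — comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not the Xen code. */
enum { PAGE = 8 };
struct page { unsigned char data[PAGE]; struct page *next; };

static struct page *tmh_page_list; /* unscrubbed free pages kept by tmem */

static void tmem_free_page(struct page *p)
{
    p->next = tmh_page_list;       /* no scrub: page stays inside tmem */
    tmh_page_list = p;
}

static struct page *release_to_heap(void)
{
    struct page *p = tmh_page_list;
    if (p) {
        tmh_page_list = p->next;
        memset(p->data, 0, PAGE);  /* scrub guest-private data now */
    }
    return p;
}
```

A page re-used inside tmem is fully overwritten by new data anyway, which is why the scrub can be deferred until a non-tmem allocation would otherwise fail.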
-
-<h2>Scalability/Concurrency</h2>
-
-<P>Tmem has been designed to be highly scalable.
-Since tmem access is invoked similarly in
-many ways to asynchronous disk access, a &quot;big SMP&quot; tmem-aware guest
-OS can, and often will, invoke tmem hypercalls simultaneously on many different
-physical CPUs.  And, of course, multiple
-tmem-aware guests may independently and simultaneously invoke tmem
-hypercalls.  While the normal frequency
-of tmem invocations is rarely extremely high, some tmem operations such as data
-compression or lookups in a very large tree may take tens of thousands of
-cycles or more to complete.  Measurements
-have shown that normal workloads spend no more than about 0.2% (2% with
-compression enabled) of CPU time executing tmem operations.
-But those familiar with OS scalability issues
-recognize that even this limited execution time can create concurrency problems
-in large systems and result in poorly-scalable performance.
-<P>
-A good locking strategy is critical to concurrency, but also
-must be designed carefully to avoid deadlock and <i>livelock</i> problems.  For
-debugging purposes, tmem supports a &quot;big kernel lock&quot; which disables
-concurrency altogether (enabled in Xen with &quot;tmem_lock&quot;, but note
-that this functionality is rarely tested and likely has bit-rotted). Infrequent
-but invasive tmem hypercalls, such as pool creation or the control operations,
-are serialized on a single <i>read-write lock</i>, called tmem_rwlock,
-which must be held for writing.  All other tmem operations must hold this lock
-for reading, so frequent operations such as
-<big><kbd>put</kbd></big>, <big><kbd>get</kbd></big>, and <big><kbd>flush</kbd></big> can execute simultaneously
-as long as no invasive operations are occurring.
-<P>
-Once a pool has been selected, there is a per-pool
-read-write lock (<i>pool_rwlock</i>) which
-must be held for writing if any transformative operations might occur within
-that pool, such as when an <i>obj_t</i> is
-created or destroyed.  For the highly
-frequent operation of finding an <i>obj_t</i>
-within a pool, pool_rwlock must be held for reading.
-<P>
-Once an object has been selected, there is a per-object
-spinlock (<i>obj_spinlock)</i>.
-This is a spinlock rather than a read-write
-lock because nearly all of the most frequent tmem operations (e.g.
-<big><kbd>put</kbd></big>, <big><kbd>get</kbd></big>, and <big><kbd>flush</kbd></big>)
-are transformative, in
-that they add or remove a page within the object.
-This lock is generally taken whenever an
-object lookup occurs and released when the tmem operation is complete.
-<P>
-Next, the per-client and global ephemeral lists are
-protected by a single global spinlock (<i>eph_lists_spinlock</i>)
-and the per-client persistent lists are also protected by a single global
-spinlock (<i>pers_list_spinlock</i>).
-And to complete the description of
-implementation-independent locks, if page deduplication is enabled, all pages
-for which the first byte match are contained in one of 256 trees that are
-protected by one of 256 corresponding read-write locks
-(<i>pcd_tree_rwlocks</i>).
-<P>
-In the Xen-specific code (tmem_xen.c), page frames (e.g.  struct page_info)
-that have been released are kept in a list (<i>tmh_page_list</i>) that
-is protected by a spinlock (<i>tmh_page_list_lock</i>).
-There is also an &quot;implied&quot; lock
-associated with compression, which is likely the most time-consuming operation
-in all of tmem (of course, only when compression is enabled): A compression
-buffer is allocated one-per-physical-cpu early in Xen boot and a pointer to
-this buffer is returned to implementation-independent code and used without a
-lock.
-<P>
-The proper method to avoid deadlocks is to take and release
-locks in a very specific predetermined order.
-Unfortunately, since tmem data structures must simultaneously be
-accessed &quot;top-down&quot; (
-<big><kbd>put</kbd></big> and <big><kbd>get</kbd></big>)
-and &quot;bottom-up&quot;
-(memory reclamation), more complex methods must be employed:
-A <i>trylock</i> mechanism is used (cf. <i>tmem_try_to_evict_pgp()</i>),
-which takes the lock if it is available but returns immediately (rather than
-spinning and waiting) if the lock is not available.
-When walking the ephemeral list to identify
-pages to free, any page that belongs to an object that is locked is simply
-skipped.  Further, if the page is the
-last page belonging to an object, and the pool read-write lock for the pool the
-object belongs to is not available (for writing), that object is skipped.
-These constraints modify the LRU algorithm
-somewhat, but avoid the potential for deadlock.
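The skip-on-contention walk just described can be sketched like this. It is a deliberately simplified model (a boolean stands in for the per-object spinlock, and the struct is hypothetical); the point it illustrates is that eviction never waits on a held lock, it just moves on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch of the trylock-based eviction walk. */
struct obj {
    bool locked;          /* stands in for obj_spinlock being held */
    struct obj *next;     /* ephemeral LRU list, oldest first */
};

/* Return the first evictable (unlocked) object, or NULL. */
static struct obj *pick_victim(struct obj *lru_tail)
{
    for (struct obj *o = lru_tail; o != NULL; o = o->next) {
        if (!o->locked)   /* "trylock" succeeded: evict this one */
            return o;
        /* trylock failed: never spin, just skip to the next entry */
    }
    return NULL;          /* nothing evictable right now */
}
```

Skipping locked objects is what perturbs the strict LRU order, in exchange for ruling out deadlock between the top-down and bottom-up paths.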
-<P>
-Unfortunately, a livelock was still discovered in this approach:
-When memory is scarce and each client is
-<big><kbd>put</kbd></big>ting a large number of pages
-for exactly one object (and thus holding the object spinlock for that object),
-memory reclamation takes a very long time to determine that it is unable to
-free any pages, and so the time to do a
-<big><kbd>put</kbd></big> (which eventually fails) becomes linear in the
-number of pages in the object!  To avoid
-this situation, a workaround was added to always ensure a minimum amount of
-memory (1MB) is available before any object lock is taken for the client
-invoking tmem (see <i>tmem_ensure_avail_pages()</i>).
-Other such livelocks (and perhaps deadlocks)
-may be lurking.
-<P>
-A last issue related to concurrency is atomicity of counters.
-Tmem gathers a large number of
-statistics.  Some of these counters are
-informational only, while some are critical to tmem operation and must be
-incremented and decremented atomically to ensure, for example, that the number
-of pages in a tree never goes negative if two concurrent tmem operations access
-the counter exactly simultaneously.  Some
-of the atomic counters are used for debugging (in assertions) and perhaps need
-not be atomic; fixing these may increase performance slightly by reducing
-cache-coherency traffic.  Similarly, some
-of the non-atomic counters may yield strange results to management tools, such
-as showing the total number of successful
-<big><kbd>put</kbd></big>s as being higher than the number of
-<big><kbd>put</kbd></big>s attempted.
-These are left as exercises for future tmem implementors.
-
-<h2>Control and Manageability</h2>
-
-<P>
-Tmem has a control interface to, for example, set various
-parameters and obtain statistics.  All
-tmem control operations funnel through <i>do_tmem_control()</i>
-and other functions supporting tmem control operations are prefixed
-with <i>tmemc_</i>.
-
-<P>
-During normal operation, even if only one tmem-aware guest
-is running, tmem may absorb nearly all free memory in the system for its own
-use.  Then if a management tool wishes to
-create a new guest (or migrate a guest from another system to this one), it may
-notice that there is insufficient &quot;free&quot; memory and fail the creation
-(or migration).  For this reason, tmem
-introduces a new tool-visible class of memory -- <i>freeable</i> memory --
-and provides a control interface to access
-it.  All ephemeral memory and all pages on the <i>tmh_page_list</i>
-are freeable. To properly access freeable
-memory, a management tool must follow a sequence of steps:
-<ul>
-<li>
-<i>freeze</i>
-tmem: When tmem is frozen, all
-<big><kbd>put</kbd></big>s fail, which ensures that no
-additional memory may be absorbed by tmem.
-(See <i>tmemc_freeze_pools()</i>, and
-note that individual clients may be frozen, though this functionality may be
-used only rarely.)
-<li>
-<i>query freeable MB:</i> If all freeable memory were released to the Xen
-heap, this is the amount of memory (in MB) that would be freed.
-See <i>tmh_freeable_pages()</i>.
-<li>
-<i>flush</i>:
-Tmem may be requested to flush, or relinquish, a certain amount of memory, e.g.
-back to the Xen heap.  This amount is
-specified in KB.  See <i>tmemc_flush_mem()</i> and
-<i>tmem_relinquish_npages()</i>.
-<li>
-At this point the management tool may allocate
-the memory, e.g. using Xen's published interfaces.
-<li>
-<i>thaw</i>
-tmem: This terminates the freeze, allowing tmem to accept 
-<big><kbd>put</kbd></big>s again.
-</ul>
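The freeze step's essential guarantee — a frozen client rejects every put, so tmem can absorb no more memory while the tool works — can be captured in a tiny model. The struct and function names below are illustrative only; the real entry point is <i>tmemc_freeze_pools()</i>.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of freeze/thaw semantics, not the Xen code. */
struct client { bool frozen; unsigned long pages; };

static bool tmem_put(struct client *c)
{
    if (c->frozen)
        return false;     /* frozen clients reject all puts */
    c->pages++;
    return true;
}

static void tmem_freeze(struct client *c) { c->frozen = true;  }
static void tmem_thaw(struct client *c)   { c->frozen = false; }
```

With puts blocked, the freeable-memory figure queried in the next step cannot shrink underneath the management tool before the flush and allocation complete.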
-<P>
-Extensive tmem statistics are available through tmem's
-control interface (see <i>tmemc_list </i>and
-the separate source for the &quot;xm tmem-list&quot; command and the
-xen-tmem-list-parse tool).  To maximize
-forward/backward compatibility with future tmem and tools versions, statistical
-information is passed via an ASCII interface where each individual counter is
-identified by an easily parseable two-letter ASCII sequence.
-
-<h2>Save/Restore/Migrate</h2>
-
-<P>
-Another piece of functionality that has a major impact on
-the tmem code is support for save/restore of a tmem client and, highly related,
-live migration of a tmem client.
-Ephemeral pages, by definition, do not need to be saved or
-live-migrated, but persistent pages are part of the state of a running VM and
-so must be properly preserved.
-<P>
-When a save (or live-migrate) of a tmem-enabled VM is initiated, the first step
-is for the tmem client to be frozen (see the manageability section).
-Next, tmem API version information is
-recorded (to avoid possible incompatibility issues as the tmem spec evolves in
-the future).  Then, certain high-level
-tmem structural information specific to the client is recorded, including
-information about the existing pools.
-Finally, the contents of all persistent pages are recorded.
-<P>
-For live-migration, the process is somewhat more complicated.
-Ignoring tmem for a moment, recall that in
-live migration, the vast majority of the VM's memory is transferred while the
-VM is still fully operational.  During
-each phase, memory pages belonging to the VM that are changed are marked and
-then retransmitted during a later phase.
-Eventually only a small amount of memory remains, the VM is paused, the
-remaining memory is transmitted, and the VM is unpaused on the target machine.
-<P>
-The number of persistent tmem pages may be quite large,
-possibly even larger than all the other memory used by the VM; so it is
-unacceptable to transmit persistent tmem pages during the &quot;paused&quot;
-phase of live migration.  But if the VM
-is still operational, it may be making calls to tmem:
-A frozen tmem client will reject any 
-<big><kbd>put</kbd></big> operations, but tmem must
-still correctly process <big><kbd>flush</kbd></big>es
-(page and object), including implicit flushes due to duplicate 
-<big><kbd>put</kbd></big>s.
-Fortunately, these operations can only
-invalidate tmem pages, not overwrite tmem pages or create new pages.
-So, when a live-migrate has been initiated,
-the client is frozen.  Then during the
-&quot;live&quot; phase, tmem transmits all persistent pages, but also records
-the handle of all persistent pages that are invalidated.
-Then, during the &quot;paused&quot; phase,
-only the handles of invalidated persistent pages are transmitted, resulting in
-the invalidation on the target machine of any matching pages that were
-previously transmitted during the &quot;live&quot; phase.
-<P>
-For restore (and on the target machine of a live migration),
-tmem must be capable of reconstructing the internal state of the client from
-the saved/migrated data.  However, it is
-not the client itself that is <big><kbd>put</kbd></big>'ing
-the pages but the management tools conducting the restore/migration.
-This slightly complicates tmem by requiring
-new API calls and new functions in the implementation, but the code is
-structured so that duplication is minimized.
-Once all tmem data structures for the client are reconstructed, all
-persistent pages are recreated and, in the case of live-migration, all
-invalidations have been processed and the client has been thawed, the restored
-client can be resumed.
-<P>
-Finally, tmem's data structures must be cluttered a bit to
-support save/restore/migration.  Notably,
-a per-pool list of persistent pages must be maintained and, during live
-migration, a per-client list of invalidated pages must be logged.
-A reader of the code will note that these
-lists are overlaid into space-sensitive data structures as a union, which may
-be more error-prone but eliminates significant space waste.
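The live-phase bookkeeping described above — a frozen client still processes flushes, and each invalidated persistent handle is logged for replay on the target — can be sketched as follows. The types, field names, and fixed-size log are hypothetical simplifications.

```c
#include <assert.h>

/* Hypothetical sketch of live-migration invalidation logging. */
enum { MAX_LOG = 16 };
struct mig {
    int live;                       /* currently in the "live" phase? */
    unsigned long log[MAX_LOG];     /* handles invalidated while live */
    int nlog;
};

static void flush_page(struct mig *m, unsigned long handle)
{
    /* ... invalidate the persistent page locally ... */
    if (m->live && m->nlog < MAX_LOG)
        m->log[m->nlog++] = handle; /* replayed on the target during
                                       the short "paused" phase */
}
```

Transmitting only handles while paused keeps the pause short even when the persistent pages themselves are numerous.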
-
-<h2>Miscellaneous Tmem Topics</h2>
-
-<P>
-<i><b>Duplicate <big><kbd>puts</kbd></big></b></i>.
-One interesting corner case that
-significantly complicates the tmem source code is the possibility
-of a <i>duplicate</i>
-<big><kbd>put</kbd></big>,
-which occurs when two
-<big><kbd>put</kbd></big>s
-are requested with the same handle but with possibly different data.
-The tmem API addresses
-<i>
-<big><kbd>put</kbd></big>-<big><kbd>put</kbd></big>-<big><kbd>get</kbd></big>
-coherence</i> explicitly: When a duplicate
-<big><kbd>put</kbd></big> occurs, tmem may react one of two ways: (1) The 
-<big><kbd>put</kbd></big> may succeed with the old
-data overwritten by the new data, or (2) the
-<big><kbd>put</kbd></big> may be failed with the original data flushed and
-neither the old nor the new data accessible.
-Tmem may <i>not</i> fail the 
-<big><kbd>put</kbd></big> and leave the old data accessible.
-<P>
-When tmem has been actively working for an extended period,
-system memory may be in short supply and it is possible for a memory allocation
-for a page (or even a data structure such as a <i>pgp_t</i>) to fail. Thus,
-for a duplicate 
-<big><kbd>put</kbd></big>, it may be impossible for tmem to temporarily
-simultaneously maintain data structures and data for both the original 
-<big><kbd>put</kbd></big> and the duplicate 
-<big><kbd>put</kbd></big>.
-When the space required for the data is
-identical, tmem may be able to overwrite <i>in place </i>the old data with
-the new data (option 1).  But in some circumstances, such as when data
-is being compressed, overwriting is not always possible and option 2 must be
-performed.
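The two permitted outcomes of a duplicate put can be condensed into a few lines of C. This is an illustrative model (fixed-size buffers stand in for real tmem storage, and the names are invented); the invariant it encodes is the one stated above: a failing duplicate put must leave neither the old nor the new data accessible.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of put-put-get coherence, not the Xen code. */
enum { PAGE = 16 };
struct slot { int valid; char data[PAGE]; };

/* Returns 1 on success (data stored), 0 on failure (old data flushed). */
static int dup_put(struct slot *s, const char *data, int can_overwrite)
{
    if (s->valid && !can_overwrite) {
        s->valid = 0;                  /* option 2: flush, then fail */
        return 0;
    }
    memcpy(s->data, data, PAGE);       /* option 1: overwrite in place */
    s->valid = 1;
    return 1;
}
```

The `can_overwrite` flag models the case in the text where the compressed sizes differ and overwriting in place is impossible.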
-<P>
-<i><b>Page deduplication and trailing-zero elimination.</b></i>
-When page deduplication is enabled
-(&quot;tmem_dedup&quot; option to Xen), ephemeral pages for which the contents
-are identical -- whether the pages belong
-to the same client or different clients -- utilize the same pageframe of
-memory.  In Xen environments where
-multiple domains have a highly similar workload, this can save a substantial
-amount of memory, allowing a much larger number of ephemeral pages to be
-used.  Tmem page deduplication uses
-methods similar to the KSM implementation in Linux [ref], but differences between
-the two are sufficiently great that tmem does not directly leverage the
-code.  In particular, ephemeral pages in
-tmem are never dirtied, so need never be <i>copied-on-write</i>.
-Like KSM, however, tmem avoids hashing,
-instead employing <i>red-black trees</i>
-that use the entire page contents as the <i>lookup
-key</i>.  There may be better ways to implement this.
-<P>
-Dedup'ed pages may optionally be compressed
-(&quot;tmem_compress&quot; and &quot;tmem_dedup&quot; Xen options specified),
-to save even more space, at the cost of more time.
-Additionally, <i>trailing zero elimination (tze)</i> may be applied to dedup'ed
-pages.  With tze, pages that contain a
-significant number of zeroes at the end of the page are saved without the trailing
-zeroes; an all-zero page requires no data to be saved at all.
-In certain workloads that utilize a large number
-of small files (and for which the last partial page of a file is padded with
-zeroes), a significant space savings can be realized without the high cost of
-compression/decompression.
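The trailing-zero scan described above is simple enough to sketch. The helper below is illustrative only (names and the fixed 4KiB page size are assumptions, not the removed Xen code): it returns how many bytes of a page actually need saving, so an all-zero page costs nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TZE_PAGE_SIZE 4096

/* Number of bytes worth storing: the page length minus the run of zero
 * bytes at the end.  An all-zero page returns 0, so no data at all needs
 * to be saved for it. */
static size_t tze_payload_len(const unsigned char *page)
{
    size_t len = TZE_PAGE_SIZE;

    while ( len > 0 && page[len - 1] == 0 )
        len--;
    return len;
}
```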
-<P>
-Both compression and tze significantly complicate memory
-allocation.  This will be discussed more below.
-<P>
-<b><i>Memory accounting</i>.</b>
-Accounting is boring, but poor accounting may
-result in some interesting problems.  In
-the implementation-independent code of tmem, most data structures, page frames,
-and partial pages (e.g. for compression) are <i>billed</i> to a pool,
-and thus to a client.  Some <i>infrastructure</i> data structures, such as
-pools and clients, are allocated with <i>tmh_alloc_infra()</i>, which does not
-require a pool to be specified.  Two other
-exceptions are page content descriptors (<i>pcd_t</i>)
-and sharelists (<i>sharelist_t</i>) which
-are explicitly not associated with a pool/client by specifying NULL instead of
-a <i>pool_t</i>.
-(Note to self:
-These should probably just use the <i>tmh_alloc_infra()</i> interface too.)
-As we shall see, persistent pool pages and
-data structures may need to be handled a bit differently, so the
-implementation-independent layer calls a different allocation/free routine for
-persistent pages (e.g. <i>tmh_alloc_page_thispool()</i>)
-than for ephemeral pages (e.g. <i>tmh_alloc_page()</i>).
-<P>
-In the Xen-specific layer, we
-disregard the <i>pool_t</i> for ephemeral
-pages, as we use the generic Xen heap for all ephemeral pages and data
-structures.  (Denial-of-service attacks
-can be handled in the implementation-independent layer because ephemeral pages
-are kept in per-client queues each with a counted length.
-See the discussion on weights and caps below.)
-However we explicitly bill persistent pages
-and data structures against the client/domain that is using them.
-(See the calls to the Xen routine <i>alloc_domheap_pages()</i> in tmem_xen.h; if
-the first argument is a domain, the pages allocated are billed by Xen to that
-domain.)  This means that a Xen domain
-cannot allocate even a single tmem persistent page when it is currently utilizing
-its maximum assigned memory allocation!
-This is reasonable for persistent pages because, even though the data is
-not directly accessible by the domain, the data is permanently saved until
-either the domain flushes it or the domain dies.
-<P>
-Note that proper accounting requires (even for ephemeral pools) that the same
-pool is referenced when memory is freed as when it was allocated, even if the
-ownership of a pool has been moved from one client to another (c.f. <i
->shared_pool_reassign()</i>).
-The underlying Xen-specific information may
-not always enforce this for ephemeral pools, but incorrect alloc/free matching
-can cause some difficult-to-find memory leaks and bent pointers.
-<P>
-Page deduplication is not possible for persistent pools for
-accounting reasons: Imagine a page that is created by persistent pool A, which
-belongs to a domain that is currently well under its maximum allocation.
-Then the <i>pcd_t</i> is matched by persistent pool B, which is
-currently at its maximum.
-Then the domain owning pool A is destroyed.
-Is B beyond its maximum?
-(There may be a clever way around this
-problem.  Exercise for the reader!)
-<P>
-<b><i>Memory allocation.</i></b> The implementation-independent layer assumes
-there is a good fast general-purpose dynamic memory allocator with bounded
-response time and efficient use of memory for a very large number of sub-page
-allocations.  The old xmalloc memory
-allocator in Xen was not a good match for this purpose, so was replaced by the
-TLSF allocator.  Note that the TLSF
-allocator is used only for allocations smaller than a page (and, more
-precisely, no larger than <i>tmem_subpage_maxsize()</i>);
-full pages are allocated by Xen's normal heap allocator.
-<P>
-After the TLSF allocator was integrated into Xen, more work
-was required so that each client could allocate memory from a separate
-independent pool. (See the call to <i>xmem_pool_create()</i> in
-<i>tmh_client_init()</i>.) 
-This allows the data structures allocated for the
-purpose of supporting persistent pages to be billed to the same client as the
-pages themselves.  It also allows partial
-(e.g. compressed) pages to be properly billed.
-Further, when partial page allocations cause internal fragmentation,
-this fragmentation can be isolated per-client.
-And, when a domain dies, full pages can be freed, rather than only
-partial pages. One other change was
-required in the TLSF allocator: In the original version, when a TLSF memory
-pool was allocated, the first page of memory was also allocated.
-Since, for a persistent pool, this page would
-be billed to the client, the allocation of the first page failed if the domain
-was started at its maximum memory, and this resulted in a failure to create the
-memory pool.  To avoid this, the code was
-changed to delay the allocation of the first page until first use of the memory
-pool.
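The lazy-first-page change just described is a small but instructive pattern. The sketch below models it with invented names (it is not the TLSF or Xen code): pool creation never allocates, so it cannot fail for a domain already at its memory maximum; only the first allocation from the pool grabs the page and may fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative model of delaying the pool's first page until first use. */
struct mem_pool {
    void *first_page;           /* NULL until the pool is first used */
};

static struct mem_pool *pool_create(void)
{
    /* Deliberately no page allocation here: creation always succeeds
     * (modulo the tiny control structure itself). */
    return calloc(1, sizeof(struct mem_pool));
}

static void *pool_alloc(struct mem_pool *p, size_t bytes)
{
    if ( !p->first_page )
    {
        p->first_page = malloc(4096);   /* stands in for a page-frame alloc */
        if ( !p->first_page )
            return NULL;                /* first use may fail; create cannot */
    }
    /* Real TLSF carving of 'bytes' out of the page is elided. */
    (void)bytes;
    return p->first_page;
}
```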
-<P>
-<b><i>Memory allocation interdependency.</i></b>
-As previously described,
-pages of memory must be moveable back and forth between the Xen heap and the
-tmem ephemeral lists (and page lists).
-When tmem needs a page but doesn't have one, it requests one from the
-Xen heap (either indirectly via xmalloc, or directly via Xen's <i
->alloc_domheap_pages()</i>).
-And when Xen needs a page but doesn't have
-one, it requests one from tmem (via a call to <i
->tmem_relinquish_pages()</i> in Xen's <i
->alloc_heap_pages() </i>in page_alloc.c).
-This leads to a potential infinite loop!
-To break this loop, a new memory flag (<i>MEMF_tmem</i>) was added to Xen
-to flag and disallow the loop.
-See <i>tmh_called_from_tmem()</i>
-in <i>tmem_relinquish_pages()</i>.
-Note that the <i
->tmem_relinquish_pages()</i> interface allows for memory requests of
-order &gt; 0 (multiple contiguous pages), but the tmem implementation disallows
-any requests larger than a single page.
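The loop-breaking flag can be sketched in miniature. The functions below are simplified stand-ins (only the flag's role mirrors MEMF_tmem; nothing else is the real Xen code): an allocation tagged as coming from tmem is refused by the relinquish path, so the heap-to-tmem fallback cannot recurse.

```c
#include <assert.h>
#include <stdbool.h>

#define MEMF_tmem (1u << 2)     /* "this allocation originates in tmem" */

static int relinquish_calls;

static bool tmem_relinquish_page(unsigned int memflags)
{
    if ( memflags & MEMF_tmem )
        return false;           /* called from within tmem: refuse, no loop */
    relinquish_calls++;
    return true;                /* pretend a page was freed up */
}

static bool alloc_heap_page(unsigned int memflags)
{
    bool heap_empty = true;     /* model the out-of-memory case */

    if ( !heap_empty )
        return true;
    /* Fall back to tmem -- unless we are already inside tmem. */
    return tmem_relinquish_page(memflags);
}
```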
-<P>
-<b><i>LRU page reclamation</i></b>.
-Ephemeral pages generally <i>age </i>in
-a queue, and the space associated with the oldest -- or <i
->least-recently-used -- </i>page is reclaimed when tmem needs more
-memory.  But there are a few exceptions
-to strict LRU queuing.  First is when
-removal from a queue is constrained by locks, as previously described above.
-Second, when an ephemeral pool is <i>shared,</i> unlike a private ephemeral
-pool, a
-<big><kbd>get</kbd></big>
-does not imply a
-<big><kbd>flush</kbd></big>.
-Instead, in a shared pool, a
-<big><kbd>get</kbd></big>
-results in the page being promoted to the front of the queue.
-Third, when a page that is deduplicated (i.e.
-is referenced by more than one <i>pgp_t</i>)
-reaches the end of the LRU queue, it is marked as <i
->eviction attempted</i> and promoted to the front of the queue; if it
-reaches the end of the queue a second time, eviction occurs.
-Note that only the <i
->pgp_t</i> is evicted; the actual data is only reclaimed if there is no
-other <i>pgp_t</i> pointing to the data.
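The second-chance twist for deduplicated pages can be modelled in a few lines. This is a simplified sketch (the structure and helper are invented, not the real pgp_t handling): a shared descriptor reaching the LRU tail is promoted once with an "eviction attempted" mark, and only evicted on its second trip to the tail.

```c
#include <assert.h>
#include <stdbool.h>

struct pgp {
    int refs;                   /* descriptors sharing the data (>1 = dedup) */
    bool eviction_attempted;    /* set on the first pass through the tail */
};

/* Return true if the descriptor may be evicted now, false if it should be
 * promoted back to the front of the queue instead. */
static bool may_evict(struct pgp *p)
{
    if ( p->refs > 1 && !p->eviction_attempted )
    {
        p->eviction_attempted = true;   /* first tail visit: second chance */
        return false;
    }
    return true;
}
```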
-<P>
-All of these modified-LRU algorithms deserve to be studied
-carefully against a broad range of workloads.
-<P>
-<b><i>Internal fragmentation</i>.</b>
-When
-compression or tze is enabled, allocations between a half-page and a full-page
-in size are very common and this places a great deal of pressure on even the
-best memory allocator.  Additionally,
-problems may be caused for memory reclamation: When one tmem ephemeral page is
-evicted, only a fragment of a physical page of memory might be reclaimed.
-As a result, when compression or tze is
-enabled, it may take a very large number of eviction attempts to free up a full
-contiguous page of memory and so, to avoid near-infinite loops and livelocks, eviction
-must be assumed to be able to fail.
-While all memory allocation paths in tmem are resilient to failure, very
-complex corner cases may eventually occur.
-As a result, compression and tze are disabled by default and should be
-used with caution until they have been tested with a much broader set of
-workloads.  (Note to self: The
-code needs work.)
-<P>
-<b><i>Weights and caps</i>.</b>
-Because
-of the just-discussed LRU-based eviction algorithms, a client that uses tmem at
-a very high frequency can quickly swamp tmem so that it provides little benefit
-to a client that uses it less frequently.
-To reduce the possibility of this denial-of-service, limits can be
-specified via management tools that are enforced internally by tmem.
-On Xen, the &quot;xm tmem-set&quot; command
-can specify &quot;weight=&lt;weight&gt;&quot; or &quot;cap=&lt;cap&gt;&quot;
-for any client.  If weight is non-zero
-for a client and the current percentage of ephemeral pages in use by the client
-exceeds its share (as measured by the sum of weights of all clients), the next
-page chosen for eviction is selected from the requesting client's ephemeral
-queue, instead of the global ephemeral queue that contains pages from all
-clients.  (See <i>client_over_quota()</i>.)
-Setting a cap for a client is currently a no-op.
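The weight test amounts to comparing a client's share of ephemeral pages against its share of the total weight. The sketch below only mirrors the role of client_over_quota(); the signature and integer cross-multiplication are illustrative assumptions, not the removed Xen code.

```c
#include <assert.h>
#include <stdbool.h>

/* A client is "over quota" when client_pages / total_pages exceeds
 * weight / weight_total.  Cross-multiplied to stay in integer arithmetic. */
static bool client_over_quota(unsigned long client_pages,
                              unsigned long total_pages,
                              unsigned int weight,
                              unsigned int weight_total)
{
    if ( weight == 0 || weight_total == 0 || total_pages == 0 )
        return false;           /* no weight configured: no per-client limit */
    return client_pages * weight_total >
           (unsigned long)weight * total_pages;
}
```

A client over quota has its next eviction taken from its own ephemeral queue rather than the global one, capping how far a high-frequency user can crowd out others.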
-<P>
-<b><i>Shared pools and authentication.</i></b>
-When tmem was first proposed to the Linux kernel mailing list
-(LKML), there was concern expressed about security of shared ephemeral
-pools.  The initial tmem implementation only
-required a client to provide a 128-bit UUID to identify a shared pool, and the
-linux-side tmem implementation obtained this UUID from the superblock of the
-shared filesystem (in ocfs2).  It was
-pointed out on LKML that the UUID was essentially a security key and any
-malicious domain that guessed it would have access to any data from the shared
-filesystem that found its way into tmem.
-Ocfs2 has only very limited security; it is assumed that anyone who can
-access the filesystem bits on the shared disk can mount the filesystem and use
-it.  But in a virtualized data center,
-higher isolation requirements may apply.
-As a result, management tools must explicitly authenticate (or may
-explicitly deny) shared pool access to any client.
-On Xen, this is done with the &quot;xl
-tmem-shared-auth&quot; command.
-<P>
-<b><i>32-bit implementation</i>.</b>
-There was some effort put into getting tmem working on a 32-bit Xen.
-However, the Xen heap is limited in size on
-32-bit Xen so tmem did not work very well.
-There are still 32-bit ifdefs in some places in the code, but things may
-have bit-rotted so using tmem on a 32-bit Xen is not recommended.
-
-<h2>Known Issues</h2>
-
-<p><b><i>Fragmentation.</i></b>  When tmem
-is active, all physical memory becomes <i>fragmented</i>
-into individual pages.  However, the Xen
-memory allocator allows memory to be requested in multi-page contiguous
-quantities, called order&gt;0 allocations.
-(i.e. 2<sup>order</sup> pages, so
-order==4 is sixteen contiguous pages.)
-In some cases, a request for a larger order will fail gracefully if no
-matching contiguous allocation is available from Xen.
-As of Xen 4.0, however, there are several
-critical order&gt;0 allocation requests that do not fail gracefully.
-Notably, when a domain is created, an
-order==4 structure is required or the domain creation will fail.
-And shadow paging requires many order==2
-allocations; if these fail, a PV live-migration may fail.
-There are likely other such issues.
-<P>
-But, fragmentation can occur even without tmem if any domU does
-any extensive ballooning; tmem just accelerates the fragmentation.
-So the fragmentation problem must be solved
-anyway.  The best solution is to disallow
-order&gt;0 allocations altogether in Xen -- or at least ensure that any attempt
-to allocate order&gt;0 can fail gracefully, e.g. by falling back to a sequence
-of single page allocations. However this restriction may require a major rewrite
-in some of Xen's most sensitive code.
-(Note that order&gt;0 allocations during Xen boot and early in domain0
-launch are safe and, if dom0 does not enable tmem, any order&gt;0 allocation by
-dom0 is safe, until the first domU is created.)
-<P>
-Until Xen can be rewritten to be <i>fragmentation-safe</i>, a small hack
-was added in the Xen page
-allocator.  (See the comment &quot;memory
-is scarce&quot; in <i>alloc_heap_pages()</i>.)
-Briefly, a portion of memory is pre-reserved
-for allocations where order&gt;0 and order&lt;9.
-(Domain creation uses 2MB pages, but fails
-gracefully, and there are no other known order==9 allocations or order&gt;9
-allocations currently in Xen.)
-<P>
-<b><i>NUMA</i></b>.  Tmem assumes that
-all memory pages are equal and any RAM page can store a page of data for any
-client.  This has potential performance
-consequences in any NUMA machine where access to <i
->far memory</i> is significantly slower than access to <i
->near memory</i>.
-On nearly all of today's servers, however,
-access times to <i>far memory</i> are still
-much faster than access to disk or network-based storage, and tmem's primary performance
-advantage comes from the fact that paging and swapping are reduced.
-So, the current tmem implementation ignores
-NUMA-ness; future tmem design for NUMA machines is an exercise left for the
-reader.
-
-<h2>Bibliography</h2>
-
-<P>
-(needs work)
-<P><a href="http://oss.oracle.com/projects/tmem">http://oss.oracle.com/projects/tmem</a>
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 9028bcde2e..fe891ef074 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1993,12 +1993,6 @@ pages) must also be specified via the tbuf\_size parameter.
 ### timer\_slop
 > `= <integer>`
 
-### tmem
-> `= <boolean>`
-
-### tmem\_compress
-> `= <boolean>`
-
 ### tsc (x86)
 > `= unstable | skewed | stable:socket`
 
diff --git a/docs/misc/xsm-flask.txt b/docs/misc/xsm-flask.txt
index 62f15dde84..40e5fc845e 100644
--- a/docs/misc/xsm-flask.txt
+++ b/docs/misc/xsm-flask.txt
@@ -81,42 +81,6 @@ __HYPERVISOR_memory_op (xen/include/public/memory.h)
  * XENMEM_get_pod_target
  * XENMEM_claim_pages
 
-__HYPERVISOR_tmem_op (xen/include/public/tmem.h)
-
- The following tmem control ops, that is the sub-subops of
- TMEM_CONTROL, are covered by this statement. 
-
- Note that TMEM is also subject to a similar policy arising from
- XSA-15 http://lists.xen.org/archives/html/xen-announce/2012-09/msg00006.html.
- Due to this existing policy all TMEM Ops are already subject to
- reduced security support.
-
- * TMEMC_THAW
- * TMEMC_FREEZE
- * TMEMC_FLUSH
- * TMEMC_DESTROY
- * TMEMC_LIST
- * TMEMC_SET_WEIGHT
- * TMEMC_SET_CAP
- * TMEMC_SET_COMPRESS
- * TMEMC_QUERY_FREEABLE_MB
- * TMEMC_SAVE_BEGIN
- * TMEMC_SAVE_GET_VERSION
- * TMEMC_SAVE_GET_MAXPOOLS
- * TMEMC_SAVE_GET_CLIENT_WEIGHT
- * TMEMC_SAVE_GET_CLIENT_CAP
- * TMEMC_SAVE_GET_CLIENT_FLAGS
- * TMEMC_SAVE_GET_POOL_FLAGS
- * TMEMC_SAVE_GET_POOL_NPAGES
- * TMEMC_SAVE_GET_POOL_UUID
- * TMEMC_SAVE_GET_NEXT_PAGE
- * TMEMC_SAVE_GET_NEXT_INV
- * TMEMC_SAVE_END
- * TMEMC_RESTORE_BEGIN
- * TMEMC_RESTORE_PUT_PAGE
- * TMEMC_RESTORE_FLUSH_PAGE
-
-
 
 Setting up FLASK
 ----------------
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] xen: remove tmem from hypervisor
  2018-11-28 13:58 ` [PATCH v2 2/3] xen: remove tmem from hypervisor Wei Liu
@ 2018-11-28 14:43   ` Jan Beulich
  2018-11-28 14:47     ` Wei Liu
  2018-11-28 15:49   ` Daniel De Graaf
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2018-11-28 14:43 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall, xen-devel,
	Daniel de Graaf, Roger Pau Monne

>>> On 28.11.18 at 14:58, <wei.liu2@citrix.com> wrote:
> @@ -250,7 +249,7 @@ static void populate_physmap(struct memop_args *a)
>  
>                  if ( unlikely(!page) )
>                  {
> -                    if ( !tmem_enabled() || a->extent_order )
> +                    if ( a->extent_order )
>                          gdprintk(XENLOG_INFO,
>                                   "Could not allocate order=%u extent: id=%d memflags=%#x (%u of %u)\n",
>                                   a->extent_order, d->domain_id, a->memflags,

From an abstract pov without tmem tmem_enabled() should return constant
"false". Which seems to mean that the if() should go away rather than its
condition getting changed.

> @@ -949,22 +935,6 @@ static struct page_info *alloc_heap_pages(
>          return NULL;
>      }
>  
> -    /*
> -     * TMEM: When available memory is scarce due to tmem absorbing it, allow
> -     * only mid-size allocations to avoid worst of fragmentation issues.
> -     * Others try tmem pools then fail.  This is a workaround until all
> -     * post-dom0-creation-multi-page allocations can be eliminated.
> -     */
> -    if ( ((order == 0) || (order >= 9)) &&
> -         (total_avail_pages <= midsize_alloc_zone_pages) &&
> -         tmem_freeable_pages() )
> -    {
> -        /* Try to free memory from tmem. */
> -        pg = tmem_relinquish_pages(order, memflags);
> -        spin_unlock(&heap_lock);
> -        return pg;
> -    }
> -
>      pg = get_free_buddy(zone_lo, zone_hi, order, memflags, d);
>      /* Try getting a dirty buddy if we couldn't get a clean one. */
>      if ( !pg && !(memflags & MEMF_no_scrub) )
> @@ -1444,10 +1414,6 @@ static void free_heap_pages(
>      else
>          pg->u.free.first_dirty = INVALID_DIRTY_IDX;
>  
> -    if ( tmem_enabled() )
> -        midsize_alloc_zone_pages = max(
> -            midsize_alloc_zone_pages, total_avail_pages / MIDSIZE_ALLOC_FRAC);

Seeing these two hunks I think midsize_alloc_zone_pages and
MIDSIZE_ALLOC_FRAC want to go away altogether.

> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -248,8 +248,10 @@ struct npfec {
>  #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
>  #define _MEMF_populate_on_demand 1
>  #define  MEMF_populate_on_demand (1U<<_MEMF_populate_on_demand)
> +#if 0
>  #define _MEMF_tmem        2
>  #define  MEMF_tmem        (1U<<_MEMF_tmem)
> +#endif

Why "#if 0" rather than removing the two lines?

With these suitably taken care of feel free to add
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] xen: remove tmem from hypervisor
  2018-11-28 14:43   ` Jan Beulich
@ 2018-11-28 14:47     ` Wei Liu
  2018-11-28 14:50       ` Wei Liu
  0 siblings, 1 reply; 19+ messages in thread
From: Wei Liu @ 2018-11-28 14:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne

On Wed, Nov 28, 2018 at 07:43:25AM -0700, Jan Beulich wrote:
> >>> On 28.11.18 at 14:58, <wei.liu2@citrix.com> wrote:
> > @@ -250,7 +249,7 @@ static void populate_physmap(struct memop_args *a)
> >  
> >                  if ( unlikely(!page) )
> >                  {
> > -                    if ( !tmem_enabled() || a->extent_order )
> > +                    if ( a->extent_order )
> >                          gdprintk(XENLOG_INFO,
> >                                   "Could not allocate order=%u extent: id=%d memflags=%#x (%u of %u)\n",
> >                                   a->extent_order, d->domain_id, a->memflags,
> 
> From an abstract pov without tmem tmem_enabled() should return constant
> "false". Which seems to mean that the if() should go away rather than its
> condition getting changed.

Ack.

> 
> > @@ -949,22 +935,6 @@ static struct page_info *alloc_heap_pages(
> >          return NULL;
> >      }
> >  
> > -    /*
> > -     * TMEM: When available memory is scarce due to tmem absorbing it, allow
> > -     * only mid-size allocations to avoid worst of fragmentation issues.
> > -     * Others try tmem pools then fail.  This is a workaround until all
> > -     * post-dom0-creation-multi-page allocations can be eliminated.
> > -     */
> > -    if ( ((order == 0) || (order >= 9)) &&
> > -         (total_avail_pages <= midsize_alloc_zone_pages) &&
> > -         tmem_freeable_pages() )
> > -    {
> > -        /* Try to free memory from tmem. */
> > -        pg = tmem_relinquish_pages(order, memflags);
> > -        spin_unlock(&heap_lock);
> > -        return pg;
> > -    }
> > -
> >      pg = get_free_buddy(zone_lo, zone_hi, order, memflags, d);
> >      /* Try getting a dirty buddy if we couldn't get a clean one. */
> >      if ( !pg && !(memflags & MEMF_no_scrub) )
> > @@ -1444,10 +1414,6 @@ static void free_heap_pages(
> >      else
> >          pg->u.free.first_dirty = INVALID_DIRTY_IDX;
> >  
> > -    if ( tmem_enabled() )
> > -        midsize_alloc_zone_pages = max(
> > -            midsize_alloc_zone_pages, total_avail_pages / MIDSIZE_ALLOC_FRAC);
> 
> Seeing these two hunks I think midsize_alloc_zone_pages and
> MIDSIZE_ALLOC_FRAC want to go away altogether.

Ack.

> 
> > --- a/xen/include/xen/mm.h
> > +++ b/xen/include/xen/mm.h
> > @@ -248,8 +248,10 @@ struct npfec {
> >  #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
> >  #define _MEMF_populate_on_demand 1
> >  #define  MEMF_populate_on_demand (1U<<_MEMF_populate_on_demand)
> > +#if 0
> >  #define _MEMF_tmem        2
> >  #define  MEMF_tmem        (1U<<_MEMF_tmem)
> > +#endif
> 
> Why "#if 0" rather than removing the two lines?

I wanted to keep it around so that later when someone reads the code
they won't be asking "why is 2 not used".

But yes I'm fine with just deleting these two lines.

> 
> With these suitably taken care of feel free to add
> Acked-by: Jan Beulich <jbeulich@suse.com>

Thanks.

> 
> Jan
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] xen: remove tmem from hypervisor
  2018-11-28 14:47     ` Wei Liu
@ 2018-11-28 14:50       ` Wei Liu
  0 siblings, 0 replies; 19+ messages in thread
From: Wei Liu @ 2018-11-28 14:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, Daniel de Graaf, Roger Pau Monne

On Wed, Nov 28, 2018 at 02:47:32PM +0000, Wei Liu wrote:
> On Wed, Nov 28, 2018 at 07:43:25AM -0700, Jan Beulich wrote:
> > >>> On 28.11.18 at 14:58, <wei.liu2@citrix.com> wrote:
> > > @@ -250,7 +249,7 @@ static void populate_physmap(struct memop_args *a)
> > >  
> > >                  if ( unlikely(!page) )
> > >                  {
> > > -                    if ( !tmem_enabled() || a->extent_order )
> > > +                    if ( a->extent_order )
> > >                          gdprintk(XENLOG_INFO,
> > >                                   "Could not allocate order=%u extent: id=%d memflags=%#x (%u of %u)\n",
> > >                                   a->extent_order, d->domain_id, a->memflags,
> > 
> > From an abstract pov without tmem tmem_enabled() should return constant
> > "false". Which seems to mean that the if() should go away rather than its
> > condition getting changed.
> 
> Ack.

BTW there is another hunk in this patch which should get the same treatment:

@@ -2265,7 +2231,7 @@ int assign_pages(                                                                                                                       
     {                                                                                                                                                        
         if ( unlikely((d->tot_pages + (1 << order)) > d->max_pages) )                                                                                        
         {                                                                                                                                                    
-            if ( !tmem_enabled() || order != 0 || d->tot_pages != d->max_pages )                                                                             
+            if ( order != 0 || d->tot_pages != d->max_pages )                                                                                                
                 gprintk(XENLOG_INFO, "Over-allocation for domain %u: "                                                                                       
                         "%u > %u\n", d->domain_id,                                                                                                           
                         d->tot_pages + (1 << order), d->max_pages);      

I will fix it as well.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/3] xen: remove tmem from hypervisor
  2018-11-28 13:58 ` [PATCH v2 2/3] xen: remove tmem from hypervisor Wei Liu
  2018-11-28 14:43   ` Jan Beulich
@ 2018-11-28 15:49   ` Daniel De Graaf
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel De Graaf @ 2018-11-28 15:49 UTC (permalink / raw)
  To: Wei Liu, xen-devel
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Roger Pau Monné

On 11/28/18 8:58 AM, Wei Liu wrote:
> This patch removes all tmem related code and CONFIG_TMEM from the
> hypervisor. Also remove tmem hypercalls from the default XSM policy.
> 
> It is written as if tmem is disabled and tmem freeable pages is 0.
> 
> We will need to keep public/tmem.h around forever to avoid breaking
> guests.  Remove the hypervisor only part and put guest visible part
> under a xen version check. Take the chance to remove trailing
> whitespaces.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/3] docs: remove tmem related text
  2018-11-28 13:58 ` [PATCH v2 3/3] docs: remove tmem related text Wei Liu
@ 2018-11-28 15:49   ` Daniel De Graaf
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel De Graaf @ 2018-11-28 15:49 UTC (permalink / raw)
  To: Wei Liu, xen-devel
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich

On 11/28/18 8:58 AM, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/3] Remove tmem
  2018-11-28 13:58 [PATCH v2 0/3] Remove tmem Wei Liu
                   ` (2 preceding siblings ...)
  2018-11-28 13:58 ` [PATCH v2 3/3] docs: remove tmem related text Wei Liu
@ 2018-11-29  2:50 ` Konrad Rzeszutek Wilk
  2018-11-29 11:42   ` Wei Liu
  2018-11-30 17:09 ` Ian Jackson
  2019-03-12 12:21 ` Wei Liu
  5 siblings, 1 reply; 19+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-11-29  2:50 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Jan Beulich, xen-devel

On Wed, Nov 28, 2018 at 01:58:03PM +0000, Wei Liu wrote:
> It is agreed that tmem can be removed from xen.git. See the thread starting                                                                                   
> from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
> 
> In this version:
> 
> 1. Remove some residuals from previous version and fix all build errors
>    discovered by Gitlab CI.
> 2. Swap the order of patches to make sure bisection still works. This
>    is verified by calling
>       `./automation/scripts/build-test.sh origin/staging HEAD`
> 3. Make sure Xen still boots and passes all XTF tests after the removal.
> 4. Keep public/tmem.h.

Please also remove the entry in the MAINTAINERS file.

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/3] Remove tmem
  2018-11-29  2:50 ` [PATCH v2 0/3] Remove tmem Konrad Rzeszutek Wilk
@ 2018-11-29 11:42   ` Wei Liu
  0 siblings, 0 replies; 19+ messages in thread
From: Wei Liu @ 2018-11-29 11:42 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Jan Beulich, xen-devel

On Wed, Nov 28, 2018 at 09:50:33PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Nov 28, 2018 at 01:58:03PM +0000, Wei Liu wrote:
> > It is agreed that tmem can be removed from xen.git. See the thread starting                                                                                   
> > from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
> > 
> > In this version:
> > 
> > 1. Remove some residuals from previous version and fix all build errors
> >    discovered by Gitlab CI.
> > 2. Swap the order of patches to make sure bisection still works. This
> >    is verified by calling
> >       `./automation/scripts/build-test.sh origin/staging HEAD`
> > 3. Make sure Xen still boots and passes all XTF tests after the removal.
> > 4. Keep public/tmem.h.
> 
> Please also remove the entry in the MAINTAINERS file.

Sure. I will fold that into the hypervisor patch.

> 
> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/3] Remove tmem
  2018-11-28 13:58 [PATCH v2 0/3] Remove tmem Wei Liu
                   ` (3 preceding siblings ...)
  2018-11-29  2:50 ` [PATCH v2 0/3] Remove tmem Konrad Rzeszutek Wilk
@ 2018-11-30 17:09 ` Ian Jackson
  2018-11-30 18:01   ` Wei Liu
  2019-03-12 12:21 ` Wei Liu
  5 siblings, 1 reply; 19+ messages in thread
From: Ian Jackson @ 2018-11-30 17:09 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel

Wei Liu writes ("[PATCH v2 0/3] Remove tmem"):
> It is agreed that tmem can be removed from xen.git. See the thread starting                                                                                   
> from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.

Those are notes from some phone call amongst industry stakeholders.
None of the messages have a Subject line mentioning tmem.  There is no
explanation of the basis for the decision; just a confirmation from
the current maintainers that they will ack the removal.

I think this is not really an appropriate way to carry on!  What if
there is someone else who wants to step up to maintain this ?  What
about user communication ?  Going straight from `Supported' to
`Deleted' seems rather vigorous.


In summary I think the claim that "It is agreed" in this cover letter
is false (or, at least, if it is true, the cover letter provides no
references to any basis for thinking that it is true).

If it didn't happen on the mailing list it didn't happen.


Unfortunately, therefore, on process grounds,

Nacked-by: Ian Jackson <ian.jackson@eu.citrix.com>


I dare say the decision to remove it now might be right.

Can we please start this again with a proper explanation of why this
should be summarily deleted, rather than (say) made unmaintained and
deprecated for a release ?  Can someone explain why we don't feel the
need to consult anyone by (say) posting to xen-announce ?  etc.

Then we can actually have an on-list discussion where the decision
would be taken.

Next time I suggest a good first step would be a patch which deletes
the M: line from MAINTAINERS and changes the status to Orphan, since
obviously the current maintainers don't want it.  That patch should be
uncontroversial.  Also in general, depending who we think might be
using a feature, a plan which gives some warning to users (by
deprecating the feature, for example) would often be a good idea.

Ian.


* Re: [PATCH v2 1/3] tools: remove tmem code and commands
  2018-11-28 13:58 ` [PATCH v2 1/3] tools: remove tmem code and commands Wei Liu
@ 2018-11-30 17:10   ` Ian Jackson
  0 siblings, 0 replies; 19+ messages in thread
From: Ian Jackson @ 2018-11-30 17:10 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Marek Marczykowski-Górecki

Wei Liu writes ("[PATCH v2 1/3] tools: remove tmem code and commands"):
> Remove all tmem related code in libxc.
> 
> Leave some stubs in libxl in case anyone has linked to those functions
> before the removal.
> 
> Remove all tmem related commands in xl, all tmem related code in other
> utilities we ship.

Amazingly I see nothing in the libxl domain config about this.  If
there were then we would have to decide what to do if the domain had
tmem-related config.  But AFAICT there could be no such thing.

On a technical level therefore, there is nothing wrong with this
patch.  However, for the process reasons I have explained in my other
message,

Nacked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Sorry,
Ian.


* Re: [PATCH v2 0/3] Remove tmem
  2018-11-30 17:09 ` Ian Jackson
@ 2018-11-30 18:01   ` Wei Liu
  2018-12-03  9:56     ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Wei Liu @ 2018-11-30 18:01 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich, xen-devel

On Fri, Nov 30, 2018 at 05:09:42PM +0000, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v2 0/3] Remove tmem"):
> > It is agreed that tmem can be removed from xen.git. See the thread starting                                                                                   
> > from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
> 
> Those are notes from some phone call amongst industry stakeholders.
> None of the messages have a Subject line mentioning tmem.  There is no
> explanation of the basis for the decision; just a confirmation from
> the current maintainers that they will ack the removal.
> 
> I think this is not really an appropriate way to carry on!  What if
> there is someone else who wants to step up to maintain this ?  What
> about user communication ?  Going straight from `Supported' to
> `Deleted' seems rather vigorous.

"Step up to maintain"? I would rather say "step up to develop".

The status in MAINTAINERS is wrong. According to SUPPORT.md, it is only
experimental. Our definition of "experimental" is:

   Functional completeness: No
   Functional stability: Here be dragons
   Interface stability: Not stable
   Security supported: No

(https://wiki.xenproject.org/wiki/Xen_Project_Release_Features/Definitions)

This means that, without putting in significant effort, no-one would be
able to use TMEM. There is no stability guarantee at all for the TMEM
interface.
Deleting something experimental doesn't seem controversial to me.

I dare say no-one cared because it has got zero development effort in
the years since 4.6. Also, as you already noticed, no-one can possibly
have used TMEM since the switch to xl (that's even earlier than 4.6).

> 
> 
> In summary I think the claim that "It is agreed" in this cover letter
> is false (or, at least, if it is true, the cover letter provides no
> references to any basis for thinking that it is true).
> 
> If it didn't happen on the mailing list it didn't happen.
> 
> 
> Unfortunately, therefore, on process grounds,
> 
> Nacked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 

Yet the removal of the ia64 port wasn't warned about or announced on
xen-announce, so I disagree that this removal is wrong on process
grounds -- there is already a precedent.

If there is a policy file, I would be happy to comply.

> 
> I dare say the decision to remove it now might be right.
> 
> Can we please start this again with a proper explanation of why this
> should be summarily deleted, rather than (say) made unmaintained and
> deprecated for a release ?  Can someone explain why we don't feel the
> need to consult anyone by (say) posting to xen-announce ?  etc.
> 

See above.

> Then we can actually have an on-list discussion where the decision
> would be taken.
> 
> Next time I suggest a good first step would be a patch which deletes
> the M: line from MAINTAINERS and changes the status to Orphan, since
> obviously the current maintainers don't want it.  That patch should be
> uncontroversial.  Also in general, depending who we think might be
> using a feature, a plan which gives some warning to users (by
> deprecating the feature, for example) would often be a good idea.
> 

Can we not invent policy and ask for compliance on the fly?

Wei.

> Ian.


* Re: [PATCH v2 0/3] Remove tmem
  2018-11-30 18:01   ` Wei Liu
@ 2018-12-03  9:56     ` Jan Beulich
  2018-12-31 12:43       ` Andrew Cooper
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2018-12-03  9:56 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Tim Deegan, xen-devel

>>> On 30.11.18 at 19:01, <wei.liu2@citrix.com> wrote:
> On Fri, Nov 30, 2018 at 05:09:42PM +0000, Ian Jackson wrote:
>> Wei Liu writes ("[PATCH v2 0/3] Remove tmem"):
>> > It is agreed that tmem can be removed from xen.git. See the thread starting
>> > from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
>> 
>> Those are notes from some phone call amongst industry stakeholders.
>> None of the messages have a Subject line mentioning tmem.  There is no
>> explanation of the basis for the decision; just a confirmation from
>> the current maintainers that they will ack the removal.
>> 
>> I think this is not really an appropriate way to carry on!  What if
>> there is someone else who wants to step up to maintain this ?  What
>> about user communication ?  Going straight from `Supported' to
>> `Deleted' seems rather vigorous.
> 
> Step up to maintain> I would rather say step up to develop.
> 
> The status in MAINTAINERS is wrong. According to SUPPORT.md, it is only
> experimental. Our definition of "experimental" is:
> 
>    Functional completeness: No
>    Functional stability: Here be dragons
>    Interface stability: Not stable
>    Security supported: No

Exactly. Plus my proposal to remove it was posted to xen-devel
on Aug 30th. I don't think removal of an experimental feature
requires posting to xen-announce. Ian - please reconsider your
nack.

Jan




* Re: [PATCH v2 0/3] Remove tmem
  2018-12-03  9:56     ` Jan Beulich
@ 2018-12-31 12:43       ` Andrew Cooper
  2019-01-02 17:27         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Cooper @ 2018-12-31 12:43 UTC (permalink / raw)
  To: Jan Beulich, Ian Jackson, Wei Liu
  Cc: George Dunlap, xen-devel, Stefano Stabellini, Tim Deegan,
	Konrad Rzeszutek Wilk

On 03/12/2018 09:56, Jan Beulich wrote:
>>>> On 30.11.18 at 19:01, <wei.liu2@citrix.com> wrote:
>> On Fri, Nov 30, 2018 at 05:09:42PM +0000, Ian Jackson wrote:
>>> Wei Liu writes ("[PATCH v2 0/3] Remove tmem"):
>>>> It is agreed that tmem can be removed from xen.git. See the thread starting
>>>> from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
>>> Those are notes from some phone call amongst industry stakeholders.
>>> None of the messages have a Subject line mentioning tmem.  There is no
>>> explanation of the basis for the decision; just a confirmation from
>>> the current maintainers that they will ack the removal.
>>>
>>> I think this is not really an appropriate way to carry on!  What if
>>> there is someone else who wants to step up to maintain this ?  What
>>> about user communication ?  Going straight from `Supported' to
>>> `Deleted' seems rather vigorous.
>> Step up to maintain> I would rather say step up to develop.
>>
>> The status in MAINTAINERS is wrong. According to SUPPORT.md, it is only
>> experimental. Our definition of "experimental" is:
>>
>>    Functional completeness: No
>>    Functional stability: Here be dragons
>>    Interface stability: Not stable
>>    Security supported: No
> Exactly. Plus my proposal to remove it was posted to xen-devel
> on Aug 30th. I don't think removal of an experimental feature
> requires posting to xen-announce. Ian - please reconsider your
> nack.

I concur with Wei and Jan.  TMEM has been off by default due to being
declared "full of security holes - don't use" since XSA-15.  That was in
2012, and TMEM hasn't made its way back into security support in that time.

In addition, it was never fixed to work with Migration v2.  The save
side doesn't query any TMEM state, and convert-legacy-stream raises TODO
on encountering legacy TMEM data.
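The convert-legacy-stream behaviour described above can be pictured roughly as follows. This is purely illustrative: the record type ids, names, and handlers are made up for the sketch and are not the actual libxc migration stream constants.

```python
# Illustrative sketch of a legacy-stream converter with no tmem
# translation.  All type ids and handler names below are hypothetical.
LEGACY_TMEM = 0x0b  # assumed id for legacy tmem records, not the real one

HANDLERS = {
    0x01: lambda payload: ("page_data_v2", payload),
    0x02: lambda payload: ("vcpu_context_v2", payload),
}

def convert_record(rtype, payload):
    """Translate one legacy record into its Migration v2 equivalent."""
    if rtype == LEGACY_TMEM:
        # Mirrors the "raises TODO" behaviour: legacy tmem data simply
        # has no v2 representation to convert into.
        raise NotImplementedError("TODO: cannot convert legacy tmem data")
    try:
        return HANDLERS[rtype](payload)
    except KeyError:
        raise ValueError(f"unrecognised legacy record type {rtype:#x}")
```

In other words, any saved image containing tmem state is unrestorable under Migration v2, which underlines why the feature cannot have working users today.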

I don't know about other distributions, but it has been compiled out of
XenServer for all versions which have Kconfig.

tl;dr It doesn't work, and at this point, it looks very unlikely to
change.  There is a non-zero cost for retaining obsolete functionality,
and the hypervisor maintainers want it gone in 4.12, which we think is
entirely reasonable given the circumstances.

~Andrew


* Re: [PATCH v2 0/3] Remove tmem
  2018-12-31 12:43       ` Andrew Cooper
@ 2019-01-02 17:27         ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 19+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-01-02 17:27 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan,
	Jan Beulich, xen-devel, Ian Jackson

On Mon, Dec 31, 2018 at 12:43:39PM +0000, Andrew Cooper wrote:
> On 03/12/2018 09:56, Jan Beulich wrote:
> >>>> On 30.11.18 at 19:01, <wei.liu2@citrix.com> wrote:
> >> On Fri, Nov 30, 2018 at 05:09:42PM +0000, Ian Jackson wrote:
> >>> Wei Liu writes ("[PATCH v2 0/3] Remove tmem"):
> >>>> It is agreed that tmem can be removed from xen.git. See the thread starting
> >>>> from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
> >>> Those are notes from some phone call amongst industry stakeholders.
> >>> None of the messages have a Subject line mentioning tmem.  There is no
> >>> explanation of the basis for the decision; just a confirmation from
> >>> the current maintainers that they will ack the removal.
> >>>
> >>> I think this is not really an appropriate way to carry on!  What if
> >>> there is someone else who wants to step up to maintain this ?  What
> >>> about user communication ?  Going straight from `Supported' to
> >>> `Deleted' seems rather vigorous.
> >> Step up to maintain> I would rather say step up to develop.
> >>
> >> The status in MAINTAINERS is wrong. According to SUPPORT.md, it is only
> >> experimental. Our definition of "experimental" is:
> >>
> >>    Functional completeness: No
> >>    Functional stability: Here be dragons
> >>    Interface stability: Not stable
> >>    Security supported: No
> > Exactly. Plus my proposal to remove it was posted to xen-devel
> > on Aug 30th. I don't think removal of an experimental feature
> > requires posting to xen-announce. Ian - please reconsider your
> > nack.
> 
> I concur with Wei and Jan.  TMEM has been off by default due to being
> declared "full of security holes - don't use" since XSA-15.  That was in
> 2012, and TMEM hasn't made its way back into security support in that time.
> 
> In addition, it was never fixed to work with Migration v2.  The save
> side doesn't query any TMEM state, and convert-legacy-stream raises TODO
> on encountering legacy TMEM data.
> 
> I don't know about other distributions, but it has been compiled out of
> XenServer for all versions which have Kconfig.
> 
> tl;dr It doesn't work, and at this point, it looks very unlikely to
> change.  There is a non-zero cost for retaining obsolete functionality,
> and the hypervisor maintainers want it gone in 4.12, which we think is
> entirely reasonable given the circumstances.

I agree on all counts. Can we please remove it?
> 
> ~Andrew


* Re: [PATCH v2 0/3] Remove tmem
  2018-11-28 13:58 [PATCH v2 0/3] Remove tmem Wei Liu
                   ` (4 preceding siblings ...)
  2018-11-30 17:09 ` Ian Jackson
@ 2019-03-12 12:21 ` Wei Liu
  2019-03-12 13:04   ` Jan Beulich
  5 siblings, 1 reply; 19+ messages in thread
From: Wei Liu @ 2019-03-12 12:21 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Jan Beulich

On Wed, Nov 28, 2018 at 01:58:03PM +0000, Wei Liu wrote:
> It is agreed that tmem can be removed from xen.git. See the thread starting                                                                                   
> from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
> 
> In this version:
> 
> 1. Remove some residuals from previous version and fix all build errors
>    discovered by Gitlab CI.
> 2. Swap the order of patches to make sure bisection still works. This
>    is verified by calling
>       `./automation/scripts/build-test.sh origin/staging HEAD`
> 3. Make sure Xen still boots and passes all XTF tests after the removal.
> 4. Keep public/tmem.h.

Now that 4.13 is open, what needs to be done regarding this series?

FAOD I still think its support status is such that no prior
announcement of its removal is required.

Wei.


* Re: [PATCH v2 0/3] Remove tmem
  2019-03-12 12:21 ` Wei Liu
@ 2019-03-12 13:04   ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2019-03-12 13:04 UTC (permalink / raw)
  To: Wei Liu, Ian Jackson
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Tim Deegan, xen-devel

>>> On 12.03.19 at 13:21, <wei.liu2@citrix.com> wrote:
> On Wed, Nov 28, 2018 at 01:58:03PM +0000, Wei Liu wrote:
>> It is agreed that tmem can be removed from xen.git. See the thread starting
>> from <D5E866B2-96F4-4E89-941E-73F578DF2F17@citrix.com>.
>> 
>> In this version:
>> 
>> 1. Remove some residuals from previous version and fix all build errors
>>    discovered by Gitlab CI.
>> 2. Swap the order of patches to make sure bisection still works. This
>>    is verified by calling
>>       `./automation/scripts/build-test.sh origin/staging HEAD`
>> 3. Make sure Xen still boots and passes all XTF tests after the removal.
>> 4. Keep public/tmem.h.
> 
> Now that 4.13 is open. What needs to be done regarding this series?
> 
> FAOD I still think its support status requires no prior announcement of
> its removal.

Depending on what exactly "announcement" means, commit a67ce55a3e
("tmem: default to off") was meant to serve as such. But in the end iirc it
was Ian who objected to outright deleting the code, so I think he should
clarify what further steps (if any) he expects to be taken.

Jan




end of thread, other threads:[~2019-03-12 13:04 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-28 13:58 [PATCH v2 0/3] Remove tmem Wei Liu
2018-11-28 13:58 ` [PATCH v2 1/3] tools: remove tmem code and commands Wei Liu
2018-11-30 17:10   ` Ian Jackson
2018-11-28 13:58 ` [PATCH v2 2/3] xen: remove tmem from hypervisor Wei Liu
2018-11-28 14:43   ` Jan Beulich
2018-11-28 14:47     ` Wei Liu
2018-11-28 14:50       ` Wei Liu
2018-11-28 15:49   ` Daniel De Graaf
2018-11-28 13:58 ` [PATCH v2 3/3] docs: remove tmem related text Wei Liu
2018-11-28 15:49   ` Daniel De Graaf
2018-11-29  2:50 ` [PATCH v2 0/3] Remove tmem Konrad Rzeszutek Wilk
2018-11-29 11:42   ` Wei Liu
2018-11-30 17:09 ` Ian Jackson
2018-11-30 18:01   ` Wei Liu
2018-12-03  9:56     ` Jan Beulich
2018-12-31 12:43       ` Andrew Cooper
2019-01-02 17:27         ` Konrad Rzeszutek Wilk
2019-03-12 12:21 ` Wei Liu
2019-03-12 13:04   ` Jan Beulich
