* [Qemu-devel] [PATCH V3 0/6] calculate downtime for postcopy live migration
       [not found] <CGME20170426050631eucas1p1628eb0c6c19ceabf0716cdaa7d6556a0@eucas1p1.samsung.com>
@ 2017-04-26  5:06 ` Alexey Perevalov
       [not found]   ` <CGME20170426050632eucas1p11fb353bd5801ae238a9adf55dcd0ea09@eucas1p1.samsung.com>
                     ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Alexey Perevalov @ 2017-04-26  5:06 UTC (permalink / raw)
  To: dgilbert; +Cc: a.perevalov, i.maximets, f4bug, peterx, qemu-devel

This is the third version of the patch set.
The first version was tagged as RFC; the second had no version tag.

Differences since the previous version (V2 -> V3):
    - The downtime calculation approach was changed, thanks to Peter Xu.
    - Due to the previous point there is no longer a need to keep a GTree or a
bitmap of CPUs, so the glib changes are not included in this patch set; they
could be resent in a separate patch set if a good reason for them arises.
    - No procfs traces in this patch set; if anybody wants them, they can be
taken from the patchwork site to track down page fault initiators.
    - UFFD_FEATURE_THREAD_ID is requested only when the kernel supports it.
    - The downtime is not sent back to the source; it is only traced.

This patch set is based on the master branch of git://git.qemu-project.org/qemu.git;
the base commit is 372b3fe0b2ecdd39ba850e31c0c6686315c507af.

It contains a patch for the kernel header, just for convenience of applying and
testing the current patch set until the kernel headers are synced.


Alexey Perevalov (6):
  userfault: add pid into uffd_msg & update UFFD_FEATURE_*
  migration: pass ptr to MigrationIncomingState into migration
    ufd_version_check & postcopy_ram_supported_by_host
  migration: split ufd_version_check into receive/request features parts
  migration: add postcopy downtime into MigrationIncomingState
  migration: calculate downtime on dst side
  migration: trace postcopy total downtime

 include/migration/migration.h     |  15 +++++
 include/migration/postcopy-ram.h  |   2 +-
 linux-headers/linux/userfaultfd.h |   5 ++
 migration/migration.c             | 138 +++++++++++++++++++++++++++++++++++++-
 migration/postcopy-ram.c          | 107 ++++++++++++++++++++++++++---
 migration/savevm.c                |   2 +-
 migration/trace-events            |   7 +-
 7 files changed, 262 insertions(+), 14 deletions(-)

-- 
1.9.1


* [Qemu-devel] [PATCH V3 1/6] userfault: add pid into uffd_msg & update UFFD_FEATURE_*
       [not found]   ` <CGME20170426050632eucas1p11fb353bd5801ae238a9adf55dcd0ea09@eucas1p1.samsung.com>
@ 2017-04-26  5:06     ` Alexey Perevalov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Perevalov @ 2017-04-26  5:06 UTC (permalink / raw)
  To: dgilbert; +Cc: a.perevalov, i.maximets, f4bug, peterx, qemu-devel

This commit duplicates the header changes of the kernel patch
"userfaultfd: provide pid in userfault msg" into QEMU's copy of the
Linux kernel headers.
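
For illustration only (not part of this patch), here is a minimal sketch of how
a userspace fault handler could consume the new field once
UFFD_FEATURE_THREAD_ID has been negotiated. "ufd" is assumed to be an
already-registered userfaultfd, the usual userfaultfd headers are assumed to be
included, and error handling is omitted:

    struct uffd_msg msg;
    if (read(ufd, &msg, sizeof(msg)) == sizeof(msg) &&
        msg.event == UFFD_EVENT_PAGEFAULT) {
        __u32 faulting_tid = msg.arg.pagefault.feat.ptid;
        /* map faulting_tid to a vCPU, as the later patches in this series do */
    }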

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
---
 linux-headers/linux/userfaultfd.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/linux-headers/linux/userfaultfd.h b/linux-headers/linux/userfaultfd.h
index 2ed5dc3..e7c8898 100644
--- a/linux-headers/linux/userfaultfd.h
+++ b/linux-headers/linux/userfaultfd.h
@@ -77,6 +77,9 @@ struct uffd_msg {
 		struct {
 			__u64	flags;
 			__u64	address;
+			union {
+				__u32   ptid;
+			} feat;
 		} pagefault;
 
 		struct {
@@ -158,6 +161,8 @@ struct uffdio_api {
 #define UFFD_FEATURE_EVENT_MADVDONTNEED		(1<<3)
 #define UFFD_FEATURE_MISSING_HUGETLBFS		(1<<4)
 #define UFFD_FEATURE_MISSING_SHMEM		(1<<5)
+#define UFFD_FEATURE_EVENT_UNMAP		(1<<6)
+#define UFFD_FEATURE_THREAD_ID			(1<<7)
 	__u64 features;
 
 	__u64 ioctls;
-- 
1.9.1


* [Qemu-devel] [PATCH V3 2/6] migration: pass ptr to MigrationIncomingState into migration ufd_version_check & postcopy_ram_supported_by_host
       [not found]   ` <CGME20170426050633eucas1p131e8089df08fbd540d24b2d2a960012c@eucas1p1.samsung.com>
@ 2017-04-26  5:06     ` Alexey Perevalov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Perevalov @ 2017-04-26  5:06 UTC (permalink / raw)
  To: dgilbert; +Cc: a.perevalov, i.maximets, f4bug, peterx, qemu-devel

This tiny refactoring is necessary to be able to set
UFFD_FEATURE_THREAD_ID while requesting features, and then to create
the downtime context when the kernel supports it.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
---
 include/migration/postcopy-ram.h |  2 +-
 migration/migration.c            |  2 +-
 migration/postcopy-ram.c         | 10 +++++-----
 migration/savevm.c               |  2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/migration/postcopy-ram.h b/include/migration/postcopy-ram.h
index 8e036b9..809f6db 100644
--- a/include/migration/postcopy-ram.h
+++ b/include/migration/postcopy-ram.h
@@ -14,7 +14,7 @@
 #define QEMU_POSTCOPY_RAM_H
 
 /* Return true if the host supports everything we need to do postcopy-ram */
-bool postcopy_ram_supported_by_host(void);
+bool postcopy_ram_supported_by_host(MigrationIncomingState *mis);
 
 /*
  * Make all of RAM sensitive to accesses to areas that haven't yet been written
diff --git a/migration/migration.c b/migration/migration.c
index ad4036f..79f6425 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -802,7 +802,7 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
          * special support.
          */
         if (!old_postcopy_cap && runstate_check(RUN_STATE_INMIGRATE) &&
-            !postcopy_ram_supported_by_host()) {
+            !postcopy_ram_supported_by_host(NULL)) {
             /* postcopy_ram_supported_by_host will have emitted a more
              * detailed message
              */
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index dc80dbb..dcf2ed1 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -60,7 +60,7 @@ struct PostcopyDiscardState {
 #include <sys/eventfd.h>
 #include <linux/userfaultfd.h>
 
-static bool ufd_version_check(int ufd)
+static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
 {
     struct uffdio_api api_struct;
     uint64_t ioctl_mask;
@@ -113,7 +113,7 @@ static int test_range_shared(const char *block_name, void *host_addr,
  * normally fine since if the postcopy succeeds it gets turned back on at the
  * end.
  */
-bool postcopy_ram_supported_by_host(void)
+bool postcopy_ram_supported_by_host(MigrationIncomingState *mis)
 {
     long pagesize = getpagesize();
     int ufd = -1;
@@ -136,7 +136,7 @@ bool postcopy_ram_supported_by_host(void)
     }
 
     /* Version and features check */
-    if (!ufd_version_check(ufd)) {
+    if (!ufd_version_check(ufd, mis)) {
         goto out;
     }
 
@@ -515,7 +515,7 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
      * Although the host check already tested the API, we need to
      * do the check again as an ABI handshake on the new fd.
      */
-    if (!ufd_version_check(mis->userfault_fd)) {
+    if (!ufd_version_check(mis->userfault_fd, mis)) {
         return -1;
     }
 
@@ -653,7 +653,7 @@ void *postcopy_get_tmp_page(MigrationIncomingState *mis)
 
 #else
 /* No target OS support, stubs just fail */
-bool postcopy_ram_supported_by_host(void)
+bool postcopy_ram_supported_by_host(MigrationIncomingState *mis)
 {
     error_report("%s: No OS support", __func__);
     return false;
diff --git a/migration/savevm.c b/migration/savevm.c
index 3b19a4a..f01e418 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1360,7 +1360,7 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
         return -1;
     }
 
-    if (!postcopy_ram_supported_by_host()) {
+    if (!postcopy_ram_supported_by_host(mis)) {
         postcopy_state_set(POSTCOPY_INCOMING_NONE);
         return -1;
     }
-- 
1.9.1


* [Qemu-devel] [PATCH V3 3/6] migration: split ufd_version_check into receive/request features parts
       [not found]   ` <CGME20170426050633eucas1p2f0c7756e8762fb70d1e1e045095eb3a2@eucas1p2.samsung.com>
@ 2017-04-26  5:06     ` Alexey Perevalov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Perevalov @ 2017-04-26  5:06 UTC (permalink / raw)
  To: dgilbert; +Cc: a.perevalov, i.maximets, f4bug, peterx, qemu-devel

This modification is necessary for userfault fd features which have to
be requested explicitly from userspace.
UFFD_FEATURE_THREAD_ID is one such "on demand" feature; it will be
introduced in the next patch.

QEMU needs to use a separate userfault file descriptor, because the
userfault context has internal state: after the first UFFD_API ioctl it
moves to UFFD_STATE_RUNNING (on success), while the kernel expects
UFFD_STATE_WAIT_API when handling UFFD_API. So only one UFFD_API ioctl
is possible per ufd.
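
For illustration, a simplified sketch of the pattern this patch follows (not
the actual QEMU code; "real_ufd" and "WANTED_FEATURES" are placeholders and
error handling is omitted): probe the supported features on a throwaway ufd,
then issue the single allowed UFFD_API ioctl on the real ufd with the features
that are actually wanted.

    struct uffdio_api api = { .api = UFFD_API, .features = 0 };
    int probe_ufd = syscall(__NR_userfaultfd, O_CLOEXEC);
    ioctl(probe_ufd, UFFDIO_API, &api);   /* probe fd is now in RUNNING state */
    __u64 supported = api.features;
    close(probe_ufd);                     /* the probe fd is thrown away */

    struct uffdio_api req = { .api = UFFD_API,
                              .features = supported & WANTED_FEATURES };
    ioctl(real_ufd, UFFDIO_API, &req);    /* the one UFFD_API call on real_ufd */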

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
---
 migration/postcopy-ram.c | 68 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 63 insertions(+), 5 deletions(-)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index dcf2ed1..55b5243 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -60,15 +60,51 @@ struct PostcopyDiscardState {
 #include <sys/eventfd.h>
 #include <linux/userfaultfd.h>
 
-static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
+
+/*
+ * Check userfault fd features, to request only supported features in
+ * the future.
+ * __NR_userfaultfd should already have been checked by the caller.
+ * The obtained features are returned via *features.
+ */
+static bool receive_ufd_features(__u64 *features)
 {
-    struct uffdio_api api_struct;
-    uint64_t ioctl_mask;
+    struct uffdio_api api_struct = {0};
+    int ufd;
+    bool ret = true;
 
+    /* if we are here, __NR_userfaultfd should exist */
+    ufd = syscall(__NR_userfaultfd, O_CLOEXEC);
+    if (ufd == -1) {
+        return false;
+    }
+
+    /* ask features */
     api_struct.api = UFFD_API;
     api_struct.features = 0;
     if (ioctl(ufd, UFFDIO_API, &api_struct)) {
-        error_report("postcopy_ram_supported_by_host: UFFDIO_API failed: %s",
+        error_report("receive_ufd_features: UFFDIO_API failed: %s",
+                strerror(errno));
+        ret = false;
+        goto release_ufd;
+    }
+
+    *features = api_struct.features;
+
+release_ufd:
+    close(ufd);
+    return ret;
+}
+
+static bool request_ufd_features(int ufd, __u64 features)
+{
+    struct uffdio_api api_struct = {0};
+    uint64_t ioctl_mask;
+
+    api_struct.api = UFFD_API;
+    api_struct.features = features;
+    if (ioctl(ufd, UFFDIO_API, &api_struct)) {
+        error_report("request_ufd_features: UFFDIO_API failed: %s",
                      strerror(errno));
         return false;
     }
@@ -81,11 +117,33 @@ static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
         return false;
     }
 
+    return true;
+}
+
+static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
+{
+    __u64 new_features = 0;
+
+    /* ask features */
+    __u64 supported_features;
+
+    if (!receive_ufd_features(&supported_features)) {
+        error_report("ufd_version_check failed");
+        return false;
+    }
+
+    /* request features */
+    if (new_features && !request_ufd_features(ufd, new_features)) {
+        error_report("ufd_version_check failed: features %" PRIu64,
+                (uint64_t)new_features);
+        return false;
+    }
+
     if (getpagesize() != ram_pagesize_summary()) {
         bool have_hp = false;
         /* We've got a huge page */
 #ifdef UFFD_FEATURE_MISSING_HUGETLBFS
-        have_hp = api_struct.features & UFFD_FEATURE_MISSING_HUGETLBFS;
+        have_hp = supported_features & UFFD_FEATURE_MISSING_HUGETLBFS;
 #endif
         if (!have_hp) {
             error_report("Userfault on this host does not support huge pages");
-- 
1.9.1


* [Qemu-devel] [PATCH V3 4/6] migration: add postcopy downtime into MigrationIncomingState
       [not found]   ` <CGME20170426050634eucas1p1a1c95d312025c99453a185084b75396a@eucas1p1.samsung.com>
@ 2017-04-26  5:06     ` Alexey Perevalov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Perevalov @ 2017-04-26  5:06 UTC (permalink / raw)
  To: dgilbert; +Cc: a.perevalov, i.maximets, f4bug, peterx, qemu-devel

This patch adds a request to kernel space for UFFD_FEATURE_THREAD_ID,
in case this feature is provided by the kernel.

DowntimeContext is encapsulated inside migration.c.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
---
 include/migration/migration.h | 12 ++++++++++++
 migration/migration.c         | 33 +++++++++++++++++++++++++++++++++
 migration/postcopy-ram.c      |  8 ++++++++
 3 files changed, 53 insertions(+)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 5720c88..b1759f7 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -83,6 +83,8 @@ typedef enum {
     POSTCOPY_INCOMING_END
 } PostcopyState;
 
+struct DowntimeContext;
+
 /* State for the incoming migration */
 struct MigrationIncomingState {
     QEMUFile *from_src_file;
@@ -123,10 +125,20 @@ struct MigrationIncomingState {
 
     /* See savevm.c */
     LoadStateEntry_Head loadvm_handlers;
+
+    /*
+     * DowntimeContext to keep information for postcopy
+     * live migration, to calculate downtime
+     * */
+    struct DowntimeContext *downtime_ctx;
 };
 
 MigrationIncomingState *migration_incoming_get_current(void);
 void migration_incoming_state_destroy(void);
+/*
+ * Functions to work with downtime context
+ */
+struct DowntimeContext *downtime_context_new(void);
 
 /*
  * An outstanding page request, on the source, having been received
diff --git a/migration/migration.c b/migration/migration.c
index 79f6425..0309c2b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -77,6 +77,18 @@ static NotifierList migration_state_notifiers =
 
 static bool deferred_incoming;
 
+typedef struct DowntimeContext {
+    /* time when page fault initiated per vCPU */
+    int64_t *page_fault_vcpu_time;
+    /* page address per vCPU */
+    uint64_t *vcpu_addr;
+    int64_t total_downtime;
+    /* downtime per vCPU */
+    int64_t *vcpu_downtime;
+    /* point in time when last page fault was initiated */
+    int64_t last_begin;
+} DowntimeContext;
+
 /*
  * Current state of incoming postcopy; note this is not part of
  * MigrationIncomingState since it's state is used during cleanup
@@ -117,6 +129,23 @@ MigrationState *migrate_get_current(void)
     return &current_migration;
 }
 
+struct DowntimeContext *downtime_context_new(void)
+{
+    DowntimeContext *ctx = g_new0(DowntimeContext, 1);
+    ctx->page_fault_vcpu_time = g_new0(int64_t, smp_cpus);
+    ctx->vcpu_addr = g_new0(uint64_t, smp_cpus);
+    ctx->vcpu_downtime = g_new0(int64_t, smp_cpus);
+    return ctx;
+}
+
+static void destroy_downtime_context(struct DowntimeContext *ctx)
+{
+    g_free(ctx->page_fault_vcpu_time);
+    g_free(ctx->vcpu_addr);
+    g_free(ctx->vcpu_downtime);
+    g_free(ctx);
+}
+
 MigrationIncomingState *migration_incoming_get_current(void)
 {
     static bool once;
@@ -139,6 +168,10 @@ void migration_incoming_state_destroy(void)
 
     qemu_event_destroy(&mis->main_thread_load_event);
     loadvm_free_handlers(mis);
+    if (mis->downtime_ctx) {
+        destroy_downtime_context(mis->downtime_ctx);
+        mis->downtime_ctx = NULL;
+    }
 }
 
 
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 55b5243..ce1ea5d 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -132,6 +132,14 @@ static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
         return false;
     }
 
+#ifdef UFFD_FEATURE_THREAD_ID
+    if (mis && UFFD_FEATURE_THREAD_ID & supported_features) {
+        /* kernel supports that feature */
+        mis->downtime_ctx = downtime_context_new();
+        new_features |= UFFD_FEATURE_THREAD_ID;
+    }
+#endif
+
     /* request features */
     if (new_features && !request_ufd_features(ufd, new_features)) {
         error_report("ufd_version_check failed: features %" PRIu64,
-- 
1.9.1


* [Qemu-devel] [PATCH V3 5/6] migration: calculate downtime on dst side
       [not found]   ` <CGME20170426050634eucas1p1971853fa140d2832d002ce8bd3c2b24d@eucas1p1.samsung.com>
@ 2017-04-26  5:06     ` Alexey Perevalov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Perevalov @ 2017-04-26  5:06 UTC (permalink / raw)
  To: dgilbert; +Cc: a.perevalov, i.maximets, f4bug, peterx, qemu-devel

This patch provides downtime calculation per vCPU, both as a per-vCPU
summary and as the overlapped value across all vCPUs.

This approach was suggested by Peter Xu as an improvement over the
previous approach, in which QEMU kept a tree with the faulted page
address and a bitmask of CPUs in it. Now QEMU keeps an array with the
faulted page address as the value and the vCPU index as the index. It
helps to find the proper vCPU at UFFD_COPY time. It also keeps the
downtime per vCPU (which can be traced together with the page fault
address).

For more details see the comments for the get_postcopy_total_downtime
implementation and the worked example below.

Downtime will not be calculated if the downtime_ctx field of
MigrationIncomingState wasn't initialized.
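
As a concrete, made-up example with two vCPUs: vCPU0 faults at t=100 ms and
vCPU1 at t=130 ms (last_begin becomes 130). When vCPU0's page is placed at
t=160 ms, both vCPUs were still blocked, so total_downtime += 160 - 130 = 30 ms
and vcpu_downtime[0] += 160 - 100 = 60 ms. When vCPU1's page is placed at
t=180 ms, vCPU0 is already running again, so only
vcpu_downtime[1] += 180 - 130 = 50 ms is added.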

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
---
 include/migration/migration.h |   3 ++
 migration/migration.c         | 103 ++++++++++++++++++++++++++++++++++++++++++
 migration/postcopy-ram.c      |  20 +++++++-
 migration/trace-events        |   6 ++-
 4 files changed, 130 insertions(+), 2 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index b1759f7..137405b 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -139,6 +139,9 @@ void migration_incoming_state_destroy(void);
  * Functions to work with downtime context
  */
 struct DowntimeContext *downtime_context_new(void);
+void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
+void mark_postcopy_downtime_end(uint64_t addr);
+uint64_t get_postcopy_total_downtime(void);
 
 /*
  * An outstanding page request, on the source, having been received
diff --git a/migration/migration.c b/migration/migration.c
index 0309c2b..b559dfe 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2150,3 +2150,106 @@ PostcopyState postcopy_state_set(PostcopyState new_state)
     return atomic_xchg(&incoming_postcopy_state, new_state);
 }
 
+void mark_postcopy_downtime_begin(uint64_t addr, int cpu)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    DowntimeContext *dc;
+    if (!mis->downtime_ctx || cpu < 0) {
+        return;
+    }
+    dc = mis->downtime_ctx;
+    dc->vcpu_addr[cpu] = addr;
+    dc->last_begin = dc->page_fault_vcpu_time[cpu] =
+        qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+    trace_mark_postcopy_downtime_begin(addr, dc, dc->page_fault_vcpu_time[cpu],
+            cpu);
+}
+
+void mark_postcopy_downtime_end(uint64_t addr)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    DowntimeContext *dc;
+    int i;
+    bool all_vcpu_down = true;
+    int64_t now;
+
+    if (!mis->downtime_ctx) {
+        return;
+    }
+    dc = mis->downtime_ctx;
+    now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+    /* check whether all vCPUs are blocked,
+     * QEMU has bitmap.h, but even with bitmap_and
+     * it would still be a loop */
+    for (i = 0; i < smp_cpus; i++) {
+        if (dc->vcpu_addr[i]) {
+            continue;
+        }
+        all_vcpu_down = false;
+        break;
+    }
+
+    if (all_vcpu_down) {
+        dc->total_downtime += now - dc->last_begin;
+    }
+
+    /* lookup cpu, to clear it */
+    for (i = 0; i < smp_cpus; i++) {
+        uint64_t vcpu_downtime;
+
+        if (dc->vcpu_addr[i] != addr) {
+            continue;
+        }
+
+        vcpu_downtime = now - dc->page_fault_vcpu_time[i];
+
+        dc->vcpu_addr[i] = 0;
+        dc->vcpu_downtime[i] += vcpu_downtime;
+    }
+
+    trace_mark_postcopy_downtime_end(addr, dc, dc->total_downtime);
+}
+
+/*
+ * This function just provides the previously calculated per-vCPU downtime and traces it.
+ * Total downtime is calculated in mark_postcopy_downtime_end.
+ *
+ *
+ * Assume we have 3 CPU
+ *
+ *      S1        E1           S1               E1
+ * -----***********------------xxx***************------------------------> CPU1
+ *
+ *             S2                E2
+ * ------------****************xxx---------------------------------------> CPU2
+ *
+ *                         S3            E3
+ * ------------------------****xxx********-------------------------------> CPU3
+ *
+ * We have sequence S1,S2,E1,S3,S1,E2,E3,E1
+ * S2,E1 - doesn't match condition due to sequence S1,S2,E1 doesn't include CPU3
+ * S3,S1,E2 - sequence includes all CPUs, in this case overlap will be S1,E2 -
+ *            it's a part of total downtime.
+ * S1 - here is last_begin
+ * Legend of the picture is following:
+ *              * - means downtime per vCPU
+ *              x - means overlapped downtime (total downtime)
+ */
+uint64_t get_postcopy_total_downtime(void)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    if (!mis->downtime_ctx) {
+        return 0;
+    }
+
+    if (trace_event_get_state(TRACE_DOWNTIME_PER_CPU)) {
+        int i;
+        for (i = 0; i < smp_cpus; i++) {
+            trace_downtime_per_cpu(i, mis->downtime_ctx->vcpu_downtime[i]);
+        }
+    }
+    return mis->downtime_ctx->total_downtime;
+}
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index ce1ea5d..03c2be7 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -23,6 +23,7 @@
 #include "migration/postcopy-ram.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/balloon.h"
+#include <sys/param.h>
 #include "qemu/error-report.h"
 #include "trace.h"
 
@@ -470,6 +471,19 @@ static int ram_block_enable_notify(const char *block_name, void *host_addr,
     return 0;
 }
 
+static int get_mem_fault_cpu_index(uint32_t pid)
+{
+    CPUState *cpu_iter;
+
+    CPU_FOREACH(cpu_iter) {
+        if (cpu_iter->thread_id == pid) {
+            return cpu_iter->cpu_index;
+        }
+    }
+    trace_get_mem_fault_cpu_index(pid);
+    return -1;
+}
+
 /*
  * Handle faults detected by the USERFAULT markings
  */
@@ -547,8 +561,11 @@ static void *postcopy_ram_fault_thread(void *opaque)
         rb_offset &= ~(qemu_ram_pagesize(rb) - 1);
         trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address,
                                                 qemu_ram_get_idstr(rb),
-                                                rb_offset);
+                                                rb_offset,
+                                                msg.arg.pagefault.feat.ptid);
 
+        mark_postcopy_downtime_begin((uintptr_t)(msg.arg.pagefault.address),
+                         get_mem_fault_cpu_index(msg.arg.pagefault.feat.ptid));
         /*
          * Send the request to the source - we want to request one
          * of our host page sizes (which is >= TPS)
@@ -643,6 +660,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
 
         return -e;
     }
+    mark_postcopy_downtime_end((uint64_t)host);
 
     trace_postcopy_place_page(host);
     return 0;
diff --git a/migration/trace-events b/migration/trace-events
index 7372ce2..19e7dc5 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -110,6 +110,9 @@ process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
 process_incoming_migration_co_postcopy_end_main(void) ""
 migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s"
 migration_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname)  "ioc=%p ioctype=%s hostname=%s"
+mark_postcopy_downtime_begin(uint64_t addr, void *dd, int64_t time, int cpu) "addr 0x%" PRIx64 " dd %p time %" PRId64 " cpu %d"
+mark_postcopy_downtime_end(uint64_t addr, void *dd, int64_t time) "addr 0x%" PRIx64 " dd %p time %" PRId64
+downtime_per_cpu(int cpu_index, int64_t downtime) "downtime cpu[%d]=%" PRId64
 
 # migration/rdma.c
 qemu_rdma_accept_incoming_migration(void) ""
@@ -186,7 +189,7 @@ postcopy_ram_enable_notify(void) ""
 postcopy_ram_fault_thread_entry(void) ""
 postcopy_ram_fault_thread_exit(void) ""
 postcopy_ram_fault_thread_quit(void) ""
-postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset) "Request for HVA=%" PRIx64 " rb=%s offset=%zx"
+postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset, uint32_t pid) "Request for HVA=%" PRIx64 " rb=%s offset=%zx %u"
 postcopy_ram_incoming_cleanup_closeuf(void) ""
 postcopy_ram_incoming_cleanup_entry(void) ""
 postcopy_ram_incoming_cleanup_exit(void) ""
@@ -195,6 +198,7 @@ save_xbzrle_page_skipping(void) ""
 save_xbzrle_page_overflow(void) ""
 ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
 ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
+get_mem_fault_cpu_index(uint32_t pid) "pid %u is not vCPU"
 
 # migration/exec.c
 migration_exec_outgoing(const char *cmd) "cmd=%s"
-- 
1.9.1


* [Qemu-devel] [PATCH V3 6/6] migration: trace postcopy total downtime
       [not found]   ` <CGME20170426050635eucas1p2084a21b1deaaa891c278bbb610eb421a@eucas1p2.samsung.com>
@ 2017-04-26  5:06     ` Alexey Perevalov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Perevalov @ 2017-04-26  5:06 UTC (permalink / raw)
  To: dgilbert; +Cc: a.perevalov, i.maximets, f4bug, peterx, qemu-devel

It's not possible to transmit it back to the source host,
because the return path (RP) protocol is not extensible.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
---
 migration/postcopy-ram.c | 2 ++
 migration/trace-events   | 1 +
 2 files changed, 3 insertions(+)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 03c2be7..46a42d4 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -390,6 +390,8 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
     }
 
     postcopy_state_set(POSTCOPY_INCOMING_END);
+    /* sending the downtime back to the source would go here */
+    trace_postcopy_ram_incoming_cleanup_downtime(get_postcopy_total_downtime());
     migrate_send_rp_shut(mis, qemu_file_get_error(mis->from_src_file) != 0);
 
     if (mis->postcopy_tmp_page) {
diff --git a/migration/trace-events b/migration/trace-events
index 19e7dc5..25521d4 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -194,6 +194,7 @@ postcopy_ram_incoming_cleanup_closeuf(void) ""
 postcopy_ram_incoming_cleanup_entry(void) ""
 postcopy_ram_incoming_cleanup_exit(void) ""
 postcopy_ram_incoming_cleanup_join(void) ""
+postcopy_ram_incoming_cleanup_downtime(uint64_t total) "total downtime %" PRIu64
 save_xbzrle_page_skipping(void) ""
 save_xbzrle_page_overflow(void) ""
 ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
-- 
1.9.1

