All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Alexey Perevalov <a.perevalov@samsung.com>
Cc: qemu-devel@nongnu.org, i.maximets@samsung.com, peterx@redhat.com
Subject: Re: [Qemu-devel] [PATCH V5 7/9] migration: calculate vCPU blocktime on dst side
Date: Tue, 16 May 2017 12:34:16 +0100	[thread overview]
Message-ID: <20170516113416.GD2839@work-vm> (raw)
In-Reply-To: <1494595886-30912-8-git-send-email-a.perevalov@samsung.com>

* Alexey Perevalov (a.perevalov@samsung.com) wrote:
> This patch provides blocktime calculation per vCPU,
> as a summary and as a overlapped value for all vCPUs.
> 
> This approach was suggested by Peter Xu, as an improvements of
> previous approch where QEMU kept tree with faulted page address and cpus bitmask
> in it. Now QEMU is keeping array with faulted page address as value and vCPU
> as index. It helps to find proper vCPU at UFFD_COPY time. Also it keeps
> list for blocktime per vCPU (could be traced with page_fault_addr)
> 
> Blocktime will not calculated if postcopy_blocktime field of
> MigrationIncomingState wasn't initialized.
> 
> Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>

I have some multi-threading/ordering worries still.

The fault thread receives faults over the ufd and calls
mark_postcopy_blocktime_being.  That's fine.

The receiving thread receives pages, calls place page, and
calls mark_postcopy_blocktime_end.  That's also fine.

However, remember that we send pages from the source without
them being requested as background transfers; consider:


    Source           receive-thread          fault-thread

  1  Send A
  2                  Receive A
  3                                            Access A
  4                                            Report on UFD
  5                  Place
  6                                            Read UFD entry


 Placing and reading UFD race - and up till now that's been fine;
so we can read off the ufd an address that's already on it's way from
the source, and which we might just be receiving, or that we might
have already placed.

In this code at (6) won't you call mark_postcopy_blocktime_start
even though it's already been placed at (5) ? Then that blocktime
will stay set until the end of the run?

Perhaps that's not a problem; if mark_postcopy_blocktime_end is called
for a different address it wont count the blocktime; and when
mark_postcopy_blocktime_start is called for a different address it'll
remove the addres that was a problem above - so perhaps that's fine?


> ---
>  migration/postcopy-ram.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++-
>  migration/trace-events   |  5 ++-
>  2 files changed, 90 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index a1f1705..e2660ae 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -23,6 +23,7 @@
>  #include "migration/postcopy-ram.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/balloon.h"
> +#include <sys/param.h>
>  #include "qemu/error-report.h"
>  #include "trace.h"
>  
> @@ -542,6 +543,86 @@ static int ram_block_enable_notify(const char *block_name, void *host_addr,
>      return 0;
>  }
>  
> +static int get_mem_fault_cpu_index(uint32_t pid)
> +{
> +    CPUState *cpu_iter;
> +
> +    CPU_FOREACH(cpu_iter) {
> +        if (cpu_iter->thread_id == pid) {
> +            return cpu_iter->cpu_index;
> +        }
> +    }
> +    trace_get_mem_fault_cpu_index(pid);
> +    return -1;
> +}
> +
> +static void mark_postcopy_blocktime_begin(uint64_t addr, int cpu)
> +{
> +    MigrationIncomingState *mis = migration_incoming_get_current();
> +    PostcopyBlocktimeContext *dc;
> +    int64_t now_ms;
> +    if (!mis->blocktime_ctx || cpu < 0) {
> +        return;
> +    }

You might consider:

 PostcopyBlocktimeContext *dc = mis->blocktime_ctx;
 int64_t now_ms;
 if (!dc || cpu < 0) {
     return;
 }

it gets rid of the two reads of mis->blocktime_ctx
(You do something similar in a few places)

> +    now_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +    dc = mis->blocktime_ctx;
> +    if (dc->vcpu_addr[cpu] == 0) {
> +        atomic_inc(&dc->smp_cpus_down);
> +    }
> +
> +    atomic_xchg__nocheck(&dc->vcpu_addr[cpu], addr);
> +    atomic_xchg__nocheck(&dc->last_begin, now_ms);
> +    atomic_xchg__nocheck(&dc->page_fault_vcpu_time[cpu], now_ms);
> +
> +    trace_mark_postcopy_blocktime_begin(addr, dc, dc->page_fault_vcpu_time[cpu],
> +            cpu);
> +}
> +
> +static void mark_postcopy_blocktime_end(uint64_t addr)
> +{
> +    MigrationIncomingState *mis = migration_incoming_get_current();
> +    PostcopyBlocktimeContext *dc;
> +    int i, affected_cpu = 0;
> +    int64_t now_ms;
> +    bool vcpu_total_blocktime = false;
> +
> +    if (!mis->blocktime_ctx) {
> +        return;
> +    }
> +    dc = mis->blocktime_ctx;
> +    now_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +
> +    /* lookup cpu, to clear it,
> +     * that algorithm looks straighforward, but it's not
> +     * optimal, more optimal algorithm is keeping tree or hash
> +     * where key is address value is a list of  */
> +    for (i = 0; i < smp_cpus; i++) {
> +        uint64_t vcpu_blocktime = 0;
> +        if (atomic_fetch_add(&dc->vcpu_addr[i], 0) != addr) {
> +            continue;
> +        }
> +        atomic_xchg__nocheck(&dc->vcpu_addr[i], 0);
> +        vcpu_blocktime = now_ms -
> +            atomic_fetch_add(&dc->page_fault_vcpu_time[i], 0);
> +        affected_cpu += 1;
> +        /* we need to know is that mark_postcopy_end was due to
> +         * faulted page, another possible case it's prefetched
> +         * page and in that case we shouldn't be here */
> +        if (!vcpu_total_blocktime &&
> +            atomic_fetch_add(&dc->smp_cpus_down, 0) == smp_cpus) {
> +            vcpu_total_blocktime = true;
> +        }
> +        /* continue cycle, due to one page could affect several vCPUs */
> +        dc->vcpu_blocktime[i] += vcpu_blocktime;
> +    }
> +
> +    atomic_sub(&dc->smp_cpus_down, affected_cpu);
> +    if (vcpu_total_blocktime) {
> +        dc->total_blocktime += now_ms - atomic_fetch_add(&dc->last_begin, 0);

This total_blocktime calculation is a little odd; the 'last_begin' is
not necessarily related to the same CPU or same block.

Dave

> +    }
> +    trace_mark_postcopy_blocktime_end(addr, dc, dc->total_blocktime);
> +}
> +
>  /*
>   * Handle faults detected by the USERFAULT markings
>   */
> @@ -619,8 +700,11 @@ static void *postcopy_ram_fault_thread(void *opaque)
>          rb_offset &= ~(qemu_ram_pagesize(rb) - 1);
>          trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address,
>                                                  qemu_ram_get_idstr(rb),
> -                                                rb_offset);
> +                                                rb_offset,
> +                                                msg.arg.pagefault.feat.ptid);
>  
> +        mark_postcopy_blocktime_begin((uintptr_t)(msg.arg.pagefault.address),
> +                         get_mem_fault_cpu_index(msg.arg.pagefault.feat.ptid));
>          /*
>           * Send the request to the source - we want to request one
>           * of our host page sizes (which is >= TPS)
> @@ -715,6 +799,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
>  
>          return -e;
>      }
> +    mark_postcopy_blocktime_end((uint64_t)(uintptr_t)host);
>  
>      trace_postcopy_place_page(host);
>      return 0;
> diff --git a/migration/trace-events b/migration/trace-events
> index b8f01a2..9424e3e 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -110,6 +110,8 @@ process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
>  process_incoming_migration_co_postcopy_end_main(void) ""
>  migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s"
>  migration_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname)  "ioc=%p ioctype=%s hostname=%s"
> +mark_postcopy_blocktime_begin(uint64_t addr, void *dd, int64_t time, int cpu) "addr 0x%" PRIx64 " dd %p time %" PRId64 " cpu %d"
> +mark_postcopy_blocktime_end(uint64_t addr, void *dd, int64_t time) "addr 0x%" PRIx64 " dd %p time %" PRId64
>  
>  # migration/rdma.c
>  qemu_rdma_accept_incoming_migration(void) ""
> @@ -186,7 +188,7 @@ postcopy_ram_enable_notify(void) ""
>  postcopy_ram_fault_thread_entry(void) ""
>  postcopy_ram_fault_thread_exit(void) ""
>  postcopy_ram_fault_thread_quit(void) ""
> -postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset) "Request for HVA=%" PRIx64 " rb=%s offset=%zx"
> +postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset, uint32_t pid) "Request for HVA=%" PRIx64 " rb=%s offset=%zx %u"
>  postcopy_ram_incoming_cleanup_closeuf(void) ""
>  postcopy_ram_incoming_cleanup_entry(void) ""
>  postcopy_ram_incoming_cleanup_exit(void) ""
> @@ -195,6 +197,7 @@ save_xbzrle_page_skipping(void) ""
>  save_xbzrle_page_overflow(void) ""
>  ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
>  ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
> +get_mem_fault_cpu_index(uint32_t pid) "pid %u is not vCPU"
>  
>  # migration/exec.c
>  migration_exec_outgoing(const char *cmd) "cmd=%s"
> -- 
> 1.9.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2017-05-16 11:34 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20170512133144eucas1p23502fd953ee73bda5b2afb25e65604f9@eucas1p2.samsung.com>
2017-05-12 13:31 ` [Qemu-devel] [PATCH V5 0/9] calculate blocktime for postcopy live migration Alexey Perevalov
     [not found]   ` <CGME20170512133144eucas1p288cde3bd6faefb16cbb0d3790885783d@eucas1p2.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 1/9] userfault: add pid into uffd_msg & update UFFD_FEATURE_* Alexey Perevalov
     [not found]   ` <CGME20170512133144eucas1p25b4275feb4126a21415242c5085382fd@eucas1p2.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 2/9] migration: pass ptr to MigrationIncomingState into migration ufd_version_check & postcopy_ram_supported_by_host Alexey Perevalov
2017-05-18 14:09       ` Eric Blake
     [not found]   ` <CGME20170512133145eucas1p2cad4c4efe46e6f1b757d97dd9d301dbe@eucas1p2.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 3/9] migration: fix hardcoded function name in error report Alexey Perevalov
2017-05-16  9:46       ` Dr. David Alan Gilbert
     [not found]   ` <CGME20170512133146eucas1p17df48bb6b5fcefe3717e18cd9afd84b7@eucas1p1.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 4/9] migration: split ufd_version_check onto receive/request features part Alexey Perevalov
2017-05-16 10:32       ` Dr. David Alan Gilbert
2017-05-18  6:55         ` Alexey
2017-05-19 18:46           ` Dr. David Alan Gilbert
     [not found]   ` <CGME20170512133146eucas1p2ba4841cabf508b66410fae6784952eaa@eucas1p2.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 5/9] migration: introduce postcopy-blocktime capability Alexey Perevalov
2017-05-16 10:33       ` Dr. David Alan Gilbert
2017-05-22 16:20       ` Eric Blake
2017-05-22 16:42         ` Alexey
2017-05-30 11:26         ` Dr. David Alan Gilbert
     [not found]   ` <CGME20170512133147eucas1p1aca0281fc864bf6f3beb610e7ce2695b@eucas1p1.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 6/9] migration: add postcopy vcpu blocktime context into MigrationIncomingState Alexey Perevalov
     [not found]   ` <CGME20170512133147eucas1p1eaa21aac3a0b9d45be0ef8ea903b6824@eucas1p1.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 7/9] migration: calculate vCPU blocktime on dst side Alexey Perevalov
2017-05-16 11:34       ` Dr. David Alan Gilbert [this message]
2017-05-16 15:19         ` Alexey
2017-05-18  7:18         ` Alexey
2017-05-19 19:05           ` Dr. David Alan Gilbert
2017-05-22  7:43             ` Alexey Perevalov
     [not found]   ` <CGME20170512133148eucas1p2c04111d415b1fbd6fb702cfc2a3ed6f9@eucas1p2.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 8/9] migration: add postcopy total blocktime into query-migrate Alexey Perevalov
2017-05-19 19:23       ` Dr. David Alan Gilbert
2017-05-22 16:15         ` Eric Blake
2017-05-22 16:14       ` Eric Blake
     [not found]   ` <CGME20170512133149eucas1p2b4c448fe763975cf11cf96801857d42e@eucas1p2.samsung.com>
2017-05-12 13:31     ` [Qemu-devel] [PATCH V5 9/9] migration: postcopy_blocktime documentation Alexey Perevalov
2017-05-12 20:09   ` [Qemu-devel] [PATCH V5 0/9] calculate blocktime for postcopy live migration Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170516113416.GD2839@work-vm \
    --to=dgilbert@redhat.com \
    --cc=a.perevalov@samsung.com \
    --cc=i.maximets@samsung.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.