On Wed, Mar 25, 2015 at 06:16:45PM +0000, Dr. David Alan Gilbert wrote: > * David Gibson (david@gibson.dropbear.id.au) wrote: > > On Wed, Feb 25, 2015 at 04:51:55PM +0000, Dr. David Alan Gilbert (git) wrote: > > > From: "Dr. David Alan Gilbert" > > > > > > Add MIG_RP_CMD_REQ_PAGES command on Return path for the postcopy > > > destination to request a page from the source. > > > > > > Signed-off-by: Dr. David Alan Gilbert > > > --- > > > include/migration/migration.h | 4 +++ > > > migration/migration.c | 70 +++++++++++++++++++++++++++++++++++++++++++ > > > trace-events | 1 + > > > 3 files changed, 75 insertions(+) > > > > > > diff --git a/include/migration/migration.h b/include/migration/migration.h > > > index 2c607e7..2c15d63 100644 > > > --- a/include/migration/migration.h > > > +++ b/include/migration/migration.h > > > @@ -46,6 +46,8 @@ enum mig_rpcomm_cmd { > > > MIG_RP_CMD_INVALID = 0, /* Must be 0 */ > > > MIG_RP_CMD_SHUT, /* sibling will not send any more RP messages */ > > > MIG_RP_CMD_PONG, /* Response to a PING; data (seq: be32 ) */ > > > + > > > + MIG_RP_CMD_REQ_PAGES, /* data (start: be64, len: be64) */ > > > }; > > > > > > /* Postcopy page-map-incoming - data about each page on the inbound side */ > > > @@ -253,6 +255,8 @@ void migrate_send_rp_shut(MigrationIncomingState *mis, > > > uint32_t value); > > > void migrate_send_rp_pong(MigrationIncomingState *mis, > > > uint32_t value); > > > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname, > > > + ram_addr_t start, ram_addr_t len); > > > > > > void ram_control_before_iterate(QEMUFile *f, uint64_t flags); > > > void ram_control_after_iterate(QEMUFile *f, uint64_t flags); > > > diff --git a/migration/migration.c b/migration/migration.c > > > index bd066f6..2e9d0dd 100644 > > > --- a/migration/migration.c > > > +++ b/migration/migration.c > > > @@ -138,6 +138,36 @@ void migrate_send_rp_pong(MigrationIncomingState *mis, > > > migrate_send_rp_message(mis, MIG_RP_CMD_PONG, 4, (uint8_t *)&buf); > > > } > > > > > > +/* Request a range of pages from the source VM at the given > > > + * start address. > > > + * rbname: Name of the RAMBlock to request the page in, if NULL it's the same > > > + * as the last request (a name must have been given previously) > > > + * Start: Address offset within the RB > > > + * Len: Length in bytes required - must be a multiple of pagesize > > > + */ > > > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname, > > > + ram_addr_t start, ram_addr_t len) > > > +{ > > > + uint8_t bufc[16+1+255]; /* start (8 byte), len (8 byte), rbname upto 256 */ > > > + uint64_t *buf64 = (uint64_t *)bufc; > > > + size_t msglen = 16; /* start + len */ > > > + > > > + assert(!(len & 1)); > > > + if (rbname) { > > > + int rbname_len = strlen(rbname); > > > + assert(rbname_len < 256); > > > + > > > + len |= 1; /* Flag to say we've got a name */ > > > + bufc[msglen++] = rbname_len; > > > + memcpy(bufc + msglen, rbname, rbname_len); > > > + msglen += rbname_len; > > > + } > > > + > > > + buf64[0] = cpu_to_be64((uint64_t)start); > > > + buf64[1] = cpu_to_be64((uint64_t)len); > > > + migrate_send_rp_message(mis, MIG_RP_CMD_REQ_PAGES, msglen, bufc); > > > > So.. what's the reason we actually need ramblock names on the wire, > > rather than working purely from GPAs? > > > > It occurs to me that referencing ramblock names from the wire protocol > > exposes something that's kind of an internal detail, and may limit our > > options for reworking the memory subsystem in future. > > RAMBlock names are already exposed on the wire in precopy migration in > the forward direction anyway (see save_page_header), however there > are a few reasons: > 1) There's no guarantee that the page you are transmitting is currently > mapped into the guest. The ACPI tables are never mapped. > 2) Aliases in GPA are allowed. > > The only thing that's unique is a reference to the RAMBlock and an offset > within it. > (and yes, it does break when we change RAMBlock names but that's normally > accidental - it did happen sometime around QEMU 1.6 when the PCI strings > were accidentally changed with a knock on effect of renaming RAMBlocks). Ah, right, good point. > > > +} > > > + > > > void qemu_start_incoming_migration(const char *uri, Error **errp) > > > { > > > const char *p; > > > @@ -789,6 +819,17 @@ static void source_return_path_bad(MigrationState *s) > > > } > > > > > > /* > > > + * Process a request for pages received on the return path, > > > + * We're allowed to send more than requested (e.g. to round to our page size) > > > + * and we don't need to send pages that have already been sent. > > > + */ > > > +static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname, > > > + ram_addr_t start, ram_addr_t len) > > > +{ > > > + trace_migrate_handle_rp_req_pages(start, len); > > > +} > > > + > > > +/* > > > * Handles messages sent on the return path towards the source VM > > > * > > > */ > > > @@ -800,6 +841,8 @@ static void *source_return_path_thread(void *opaque) > > > const int max_len = 512; > > > uint8_t buf[max_len]; > > > uint32_t tmp32; > > > + ram_addr_t start, len; > > > + char *tmpstr; > > > int res; > > > > > > trace_source_return_path_thread_entry(); > > > @@ -815,6 +858,11 @@ static void *source_return_path_thread(void *opaque) > > > expected_len = 4; > > > break; > > > > > > + case MIG_RP_CMD_REQ_PAGES: > > > + /* 16 byte start/len _possibly_ plus an id str */ > > > + expected_len = 16 + 256; > > > > Isn't that the maximum length, rather than the minimum or typical length? > > I'm just trying to be fairly paranoid prior to reading the header. > If at this point we're working our way through garbage received off a misaligned > stream then if we fail to spot it in this switch then we end up doing > a qemu_get_buffer on that bad length. Checking it just for a maximum > would make it safe against something malicious, but if the destination > hadn't sent that much data then we'd just block here and never get around > to reporting an error. This way, for most command types we're > being pretty careful. > > > > > + break; > > > + > > > default: > > > error_report("RP: Received invalid cmd 0x%04x length 0x%04x", > > > header_com, header_len); > > > @@ -860,6 +908,28 @@ static void *source_return_path_thread(void *opaque) > > > trace_source_return_path_thread_pong(tmp32); > > > break; > > > > > > + case MIG_RP_CMD_REQ_PAGES: > > > + start = be64_to_cpup((uint64_t *)buf); > > > + len = be64_to_cpup(((uint64_t *)buf)+1); > > > + tmpstr = NULL; > > > + if (len & 1) { > > > + len -= 1; /* Remove the flag */ > > > + /* Now we expect an idstr */ > > > + tmp32 = buf[16]; /* Length of the following idstr */ > > > + tmpstr = (char *)&buf[17]; > > > + buf[17+tmp32] = '\0'; > > > + expected_len = 16+1+tmp32; > > > + } else { > > > + expected_len = 16; > > > > Ah.. so expected_len is changed here. But then what was the point of > > setting it in the earlier switch? > > For the upper bound on the qemu_get_buffer. Ok, I think I'd forgotten exactly how the checks worked from the earlier patch. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson