From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35357) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YaxFS-0006hk-T2 for qemu-devel@nongnu.org; Wed, 25 Mar 2015 22:10:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YaxFP-0003aF-Vv for qemu-devel@nongnu.org; Wed, 25 Mar 2015 22:10:18 -0400 Received: from ozlabs.org ([2401:3900:2:1::2]:50169) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YaxFP-0003Zf-9G for qemu-devel@nongnu.org; Wed, 25 Mar 2015 22:10:15 -0400 Date: Thu, 26 Mar 2015 12:28:39 +1100 From: David Gibson Message-ID: <20150326012839.GD28039@voom.redhat.com> References: <1424883128-9841-1-git-send-email-dgilbert@redhat.com> <1424883128-9841-33-git-send-email-dgilbert@redhat.com> <20150323050050.GQ25043@voom.fritz.box> <20150325181644.GL2313@work-vm> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="NtwzykIc2mflq5ck" Content-Disposition: inline In-Reply-To: <20150325181644.GL2313@work-vm> Subject: Re: [Qemu-devel] [PATCH v5 32/45] Page request: Add MIG_RP_CMD_REQ_PAGES reverse command List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Dr. David Alan Gilbert" Cc: aarcange@redhat.com, yamahata@private.email.ne.jp, quintela@redhat.com, qemu-devel@nongnu.org, amit.shah@redhat.com, pbonzini@redhat.com, yanghy@cn.fujitsu.com --NtwzykIc2mflq5ck Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Mar 25, 2015 at 06:16:45PM +0000, Dr. David Alan Gilbert wrote: > * David Gibson (david@gibson.dropbear.id.au) wrote: > > On Wed, Feb 25, 2015 at 04:51:55PM +0000, Dr. David Alan Gilbert (git) = wrote: > > > From: "Dr. David Alan Gilbert" > > >=20 > > > Add MIG_RP_CMD_REQ_PAGES command on Return path for the postcopy > > > destination to request a page from the source. > > >=20 > > > Signed-off-by: Dr. David Alan Gilbert > > > --- > > > include/migration/migration.h | 4 +++ > > > migration/migration.c | 70 +++++++++++++++++++++++++++++++++= ++++++++++ > > > trace-events | 1 + > > > 3 files changed, 75 insertions(+) > > >=20 > > > diff --git a/include/migration/migration.h b/include/migration/migrat= ion.h > > > index 2c607e7..2c15d63 100644 > > > --- a/include/migration/migration.h > > > +++ b/include/migration/migration.h > > > @@ -46,6 +46,8 @@ enum mig_rpcomm_cmd { > > > MIG_RP_CMD_INVALID =3D 0, /* Must be 0 */ > > > MIG_RP_CMD_SHUT, /* sibling will not send any more RP me= ssages */ > > > MIG_RP_CMD_PONG, /* Response to a PING; data (seq: be32 = ) */ > > > + > > > + MIG_RP_CMD_REQ_PAGES, /* data (start: be64, len: be64) */ > > > }; > > > =20 > > > /* Postcopy page-map-incoming - data about each page on the inbound = side */ > > > @@ -253,6 +255,8 @@ void migrate_send_rp_shut(MigrationIncomingState = *mis, > > > uint32_t value); > > > void migrate_send_rp_pong(MigrationIncomingState *mis, > > > uint32_t value); > > > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const ch= ar* rbname, > > > + ram_addr_t start, ram_addr_t len); > > > =20 > > > void ram_control_before_iterate(QEMUFile *f, uint64_t flags); > > > void ram_control_after_iterate(QEMUFile *f, uint64_t flags); > > > diff --git a/migration/migration.c b/migration/migration.c > > > index bd066f6..2e9d0dd 100644 > > > --- a/migration/migration.c > > > +++ b/migration/migration.c > > > @@ -138,6 +138,36 @@ void migrate_send_rp_pong(MigrationIncomingState= *mis, > > > migrate_send_rp_message(mis, MIG_RP_CMD_PONG, 4, (uint8_t *)&buf= ); > > > } > > > =20 > > > +/* Request a range of pages from the source VM at the given > > > + * start address. > > > + * rbname: Name of the RAMBlock to request the page in, if NULL it= 's the same > > > + * as the last request (a name must have been given previo= usly) > > > + * Start: Address offset within the RB > > > + * Len: Length in bytes required - must be a multiple of pagesize > > > + */ > > > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const ch= ar *rbname, > > > + ram_addr_t start, ram_addr_t len) > > > +{ > > > + uint8_t bufc[16+1+255]; /* start (8 byte), len (8 byte), rbname = upto 256 */ > > > + uint64_t *buf64 =3D (uint64_t *)bufc; > > > + size_t msglen =3D 16; /* start + len */ > > > + > > > + assert(!(len & 1)); > > > + if (rbname) { > > > + int rbname_len =3D strlen(rbname); > > > + assert(rbname_len < 256); > > > + > > > + len |=3D 1; /* Flag to say we've got a name */ > > > + bufc[msglen++] =3D rbname_len; > > > + memcpy(bufc + msglen, rbname, rbname_len); > > > + msglen +=3D rbname_len; > > > + } > > > + > > > + buf64[0] =3D cpu_to_be64((uint64_t)start); > > > + buf64[1] =3D cpu_to_be64((uint64_t)len); > > > + migrate_send_rp_message(mis, MIG_RP_CMD_REQ_PAGES, msglen, bufc); > >=20 > > So.. what's the reason we actually need ramblock names on the wire, > > rather than working purely from GPAs? > >=20 > > It occurs to me that referencing ramblock names from the wire protocol > > exposes something that's kind of an internal detail, and may limit our > > options for reworking the memory subsystem in future. >=20 > RAMBlock names are already exposed on the wire in precopy migration in > the forward direction anyway (see save_page_header), however there > are a few reasons: > 1) There's no guarantee that the page you are transmitting is currently > mapped into the guest. The ACPI tables are never mapped. > 2) Aliases in GPA are allowed. >=20 > The only thing that's unique is a reference to the RAMBlock and an offset > within it. > (and yes, it does break when we change RAMBlock names but that's normally > accidental - it did happen sometime around QEMU 1.6 when the PCI strings > were accidentally changed with a knock on effect of renaming RAMBlocks). Ah, right, good point. > > > +} > > > + > > > void qemu_start_incoming_migration(const char *uri, Error **errp) > > > { > > > const char *p; > > > @@ -789,6 +819,17 @@ static void source_return_path_bad(MigrationStat= e *s) > > > } > > > =20 > > > /* > > > + * Process a request for pages received on the return path, > > > + * We're allowed to send more than requested (e.g. to round to our p= age size) > > > + * and we don't need to send pages that have already been sent. > > > + */ > > > +static void migrate_handle_rp_req_pages(MigrationState *ms, const ch= ar* rbname, > > > + ram_addr_t start, ram_addr_t = len) > > > +{ > > > + trace_migrate_handle_rp_req_pages(start, len); > > > +} > > > + > > > +/* > > > * Handles messages sent on the return path towards the source VM > > > * > > > */ > > > @@ -800,6 +841,8 @@ static void *source_return_path_thread(void *opaq= ue) > > > const int max_len =3D 512; > > > uint8_t buf[max_len]; > > > uint32_t tmp32; > > > + ram_addr_t start, len; > > > + char *tmpstr; > > > int res; > > > =20 > > > trace_source_return_path_thread_entry(); > > > @@ -815,6 +858,11 @@ static void *source_return_path_thread(void *opa= que) > > > expected_len =3D 4; > > > break; > > > =20 > > > + case MIG_RP_CMD_REQ_PAGES: > > > + /* 16 byte start/len _possibly_ plus an id str */ > > > + expected_len =3D 16 + 256; > >=20 > > Isn't that the maximum length, rather than the minimum or typical lengt= h? >=20 > I'm just trying to be fairly paranoid prior to reading the header. > If at this point we're working our way through garbage received off a mis= aligned > stream then if we fail to spot it in this switch then we end up doing > a qemu_get_buffer on that bad length. Checking it just for a maximum > would make it safe against something malicious, but if the destination > hadn't sent that much data then we'd just block here and never get around > to reporting an error. This way, for most command types we're > being pretty careful. >=20 >=20 > > > + break; > > > + > > > default: > > > error_report("RP: Received invalid cmd 0x%04x length 0x%= 04x", > > > header_com, header_len); > > > @@ -860,6 +908,28 @@ static void *source_return_path_thread(void *opa= que) > > > trace_source_return_path_thread_pong(tmp32); > > > break; > > > =20 > > > + case MIG_RP_CMD_REQ_PAGES: > > > + start =3D be64_to_cpup((uint64_t *)buf); > > > + len =3D be64_to_cpup(((uint64_t *)buf)+1); > > > + tmpstr =3D NULL; > > > + if (len & 1) { > > > + len -=3D 1; /* Remove the flag */ > > > + /* Now we expect an idstr */ > > > + tmp32 =3D buf[16]; /* Length of the following idstr = */ > > > + tmpstr =3D (char *)&buf[17]; > > > + buf[17+tmp32] =3D '\0'; > > > + expected_len =3D 16+1+tmp32; > > > + } else { > > > + expected_len =3D 16; > >=20 > > Ah.. so expected_len is changed here. But then what was the point of > > setting it in the earlier switch? >=20 > For the upper bound on the qemu_get_buffer. Ok, I think I'd forgotten exactly how the checks worked from the earlier patch. --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --NtwzykIc2mflq5ck Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJVE2DHAAoJEGw4ysog2bOSEWMP/1XXNavc6xHT/ufY1i4CBLmy qWoXg5128vpw+izOQaZ9zaHTEan56EgPAEnYD5dwYzaix/WgJ9QxqQppOcmUa7tF ombn5Fk3mO6tv9wx2bAot54ID5PIQqRm5KAoAIXzi+gnmDVQGFoWywluHtXce6KH U0kRTuUvRWtU4XqMcTqjgf3Wrmt0V+Zg0Vl0bfSQacxWnkD0yfox/rgDH0k9kxB4 npgvsMdpr38/o5YhXFz8PTsjqD4L484yIHNgzV5nr022ARQDYO6shzzAwar4AMJ5 RubEmmf0dmGWhLJbkL9OYDdJL0962V/PCT1qxXZZKk6skBp0wHul+FFEIpMQvOn4 fWeD0HTsoL94v2wUPt6lq/s5nCb2PG7MuT3+HgB6pQXT/S4VzzWlqcsL8FxquLLB y8NpFjM33sf9Af5OSWsqveVW4H2j+C6FYE72sTxr8uuwaAv3p59p51N02WVZIrHq GRTsoBUEbNlE8+9VvcbC/th3/05OQ3N98RnaP5vYQSZBa2ZCiFcfujtsSUl8JilN yOEcVnWYFpCsLwcLSj3xbb/XBA9E2rD9L0lv3/G5PM0T7rzjYNwEbiDxD9NNWvnw eIjR3kdsYON8XMa2L2tf7USjaaDkTZjgNsva3ddzvEwwCPNhzH4YJ6WaWyJl+YH6 v0YvQ7bjfBg0JW1YxEDW =lFsU -----END PGP SIGNATURE----- --NtwzykIc2mflq5ck--