From mboxrd@z Thu Jan 1 00:00:00 1970 From: George Dunlap Subject: Re: [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server. Date: Thu, 22 Sep 2016 12:32:10 +0100 Message-ID: References: <1472813240-11011-1-git-send-email-yu.c.zhang@linux.intel.com> <1472813240-11011-2-git-send-email-yu.c.zhang@linux.intel.com> <57D24730.2050904@linux.intel.com> <57D24DFF.3070901@linux.intel.com> <57E3A06E.1020000@linux.intel.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="94eb2c0600dc3b80e3053d170390" Return-path: In-Reply-To: <57E3A06E.1020000@linux.intel.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Yu Zhang Cc: Paul Durrant , "Lv, Zhiyuan" , Jan Beulich , "Xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org --94eb2c0600dc3b80e3053d170390 Content-Type: text/plain; charset="UTF-8" On Thu, Sep 22, 2016 at 10:12 AM, Yu Zhang wrote: > > > On 9/21/2016 9:04 PM, George Dunlap wrote: >> >> On Fri, Sep 9, 2016 at 6:51 AM, Yu Zhang >> wrote: >>>> >>>> On 9/2/2016 6:47 PM, Yu Zhang wrote: >>>>> >>>>> A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to >>>>> let one ioreq server claim/disclaim its responsibility for the >>>>> handling of guest pages with p2m type p2m_ioreq_server. Users >>>>> of this HVMOP can specify which kind of operation is supposed >>>>> to be emulated in a parameter named flags. Currently, this HVMOP >>>>> only support the emulation of write operations. And it can be >>>>> further extended to support the emulation of read ones if an >>>>> ioreq server has such requirement in the future. >>>>> >>>>> For now, we only support one ioreq server for this p2m type, so >>>>> once an ioreq server has claimed its ownership, subsequent calls >>>>> of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also >>>>> disclaim the ownership of guest ram pages with p2m_ioreq_server, by >>>>> triggering this new HVMOP, with ioreq server id set to the current >>>>> owner's and flags parameter set to 0. >>>>> >>>>> Note both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server >>>>> are only supported for HVMs with HAP enabled. >>>>> >>>>> Also note that only after one ioreq server claims its ownership >>>>> of p2m_ioreq_server, will the p2m type change to p2m_ioreq_server >>>>> be allowed. >>>>> >>>>> Signed-off-by: Paul Durrant >>>>> Signed-off-by: Yu Zhang >>>>> Acked-by: Tim Deegan >>>>> --- >>>>> Cc: Paul Durrant >>>>> Cc: Jan Beulich >>>>> Cc: Andrew Cooper >>>>> Cc: George Dunlap >>>>> Cc: Jun Nakajima >>>>> Cc: Kevin Tian >>>>> Cc: Tim Deegan >>>>> >>>>> changes in v6: >>>>> - Clarify logic in hvmemul_do_io(). >>>>> - Use recursive lock for ioreq server lock. >>>>> - Remove debug print when mapping ioreq server. >>>>> - Clarify code in ept_p2m_type_to_flags() for consistency. >>>>> - Remove definition of P2M_IOREQ_HANDLE_WRITE_ACCESS. >>>>> - Add comments for HVMMEM_ioreq_server to note only changes >>>>> to/from HVMMEM_ram_rw are permitted. >>>>> - Add domain_pause/unpause() in hvm_map_mem_type_to_ioreq_server() >>>>> to avoid the race condition when a vm exit happens on a write- >>>>> protected page, just to find the ioreq server has been unmapped >>>>> already. >>>>> - Introduce a seperate patch to delay the release of p2m >>>>> lock to avoid the race condition. >>>>> - Introduce a seperate patch to handle the read-modify-write >>>>> operations on a write protected page. >>>>> >>>> Why do we need to do this? Won't the default case just DTRT if it finds >>>> that the ioreq server has been unmapped? >>> >>> >>> Well, patch 4 will either mark the remaining p2m_ioreq_server entries as >>> "recal" or >>> reset to p2m_ram_rw directly. So my understanding is that we do not wish >>> to >>> see a ept violation due to a p2m_ioreq_server access after the ioreq >>> server >>> is unmapped. >>> Yet without this domain_pause/unpause() pair, VM accesses may trigger an >>> ept >>> violation >>> during the hvmop hypercall(hvm_map_mem_type_to_ioreq_server), just to >>> find >>> the ioreq >>> server is NULL. Then we would have to provide handlers which just do the >>> copy to/from >>> actions for the VM. This seems awkward to me. >> >> So the race you're worried about is this: >> >> 1. Guest fault happens >> 2. ioreq server calls map_mem_type_to_ioreq_server, unhooking >> 3. guest finds no ioreq server present >> >> I think in that case the easiest thing to do would be to simply assume >> there was a race and re-execute the instruction. Is that not possible >> for some reason? >> >> -George > > > Thanks for your reply, George. :) > Two reasons I'd like to use the domain_pause/unpause() to avoid the race > condition: > > 1> Like my previous explanation, in the read-modify-write scenario, the > ioreq server will > be NULL for the read emulation. But in such case, hypervisor will not > discard this trap, instead > it is supposed to do the copy work for the read access. So it would be > difficult for hypervisor > to decide if the ioreq server was detached due to a race condition, or if > the ioreq server should > be a NULL because we are emulating a read operation first for a > read-modify-write instruction. Wouldn't a patch like the attached work (applied on top of the whole series)? -George --94eb2c0600dc3b80e3053d170390 Content-Type: text/x-diff; charset="US-ASCII"; name="0001-x86-hvm-Handle-both-ioreq-detach-races-and-read-modi.patch" Content-Disposition: attachment; filename="0001-x86-hvm-Handle-both-ioreq-detach-races-and-read-modi.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_ite9dewz0 RnJvbSA0ZTRiMTQ2NDFjZDk0YzBjOWZlNjQ2MDZiMzI5Y2RiYmY5YzZhOTJiIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBHZW9yZ2UgRHVubGFwIDxnZW9yZ2UuZHVubGFwQGNpdHJpeC5j b20+CkRhdGU6IFRodSwgMjIgU2VwIDIwMTYgMTI6MzA6MjYgKzAxMDAKU3ViamVjdDogW1BBVENI XSB4ODYvaHZtOiBIYW5kbGUgYm90aCBpb3JlcSBkZXRhY2ggcmFjZXMgYW5kIHJlYWQtbW9kaWZ5 LXdyaXRlCiBpbnN0cnVjdGlvbnMKCkNvbXBpbGUtdGVzdGVkIG9ubHkuCgpTaWduZWQtb2ZmLWJ5 OiBHZW9yZ2UgRHVubGFwIDxnZW9yZ2UuZHVubGFwQGNpdHJpeC5jb20+Ci0tLQogeGVuL2FyY2gv eDg2L2h2bS9lbXVsYXRlLmMgfCA2OCArKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysr KysrKy0tLS0tLS0tCiAxIGZpbGUgY2hhbmdlZCwgNTYgaW5zZXJ0aW9ucygrKSwgMTIgZGVsZXRp b25zKC0pCgpkaWZmIC0tZ2l0IGEveGVuL2FyY2gveDg2L2h2bS9lbXVsYXRlLmMgYi94ZW4vYXJj aC94ODYvaHZtL2VtdWxhdGUuYwppbmRleCA1NjRjMTE3Li4xMjBlZjg2IDEwMDY0NAotLS0gYS94 ZW4vYXJjaC94ODYvaHZtL2VtdWxhdGUuYworKysgYi94ZW4vYXJjaC94ODYvaHZtL2VtdWxhdGUu YwpAQCAtMjE0LDQwICsyMTQsODQgQEAgc3RhdGljIGludCBodm1lbXVsX2RvX2lvKAogICAgICAg ICBicmVhazsKICAgICBjYXNlIFg4NkVNVUxfVU5IQU5ETEVBQkxFOgogICAgIHsKKyAgICAgICAg LyogCisgICAgICAgICAqIFhlbiBpc24ndCBlbXVsYXRpbmcgdGhlIGluc3RydWN0aW9uIGludGVy bmFsbHksIHNvIHNlZSBpZgorICAgICAgICAgKiB0aGVyZSdzIGFuIGlvcmVxIHNlcnZlciB0aGF0 IGNhbiBoYW5kbGUgaXQuICBSdWxlczoKKyAgICAgICAgICoKKyAgICAgICAgICogLSBQSU8gYW5k ICJub3JtYWwiIG1taW8gcnVuIHRocm91Z2gKKyAgICAgICAgICogaHZtX3NlbGVjdF9pb3JlcV9z ZXJ2ZXIoKSB0byBjaG9vc2UgdGhlIGlvcmVxIHNlcnZlciBieQorICAgICAgICAgKiByYW5nZS4g IElmIG5vIHNlcnZlciBpcyBmb3VuZCwgdGhlIGFjY2VzcyBpcyBpZ25vcmVkLgorICAgICAgICAg KgorICAgICAgICAgKiAtIHAybV9pb3JlcV9zZXJ2ZXIgYWNjZXNzZXMgYXJlIGhhbmRsZWQgYnkg dGhlIGN1cnJlbnQKKyAgICAgICAgICogaW9yZXFfc2VydmVyIGZvciB0aGUgZG9tYWluLCBidXQg dGhlcmUgYXJlIHNvbWUgY29ybmVyCisgICAgICAgICAqIGNhc2VzOgorICAgICAgICAgKgorICAg ICAgICAgKiAgIC0gSWYgdGhlIGRvbWFpbiBpb3JlcV9zZXJ2ZXIgaXMgTlVMTCwgYXNzdW1lIHRo aXMgaXMgYQorICAgICAgICAgKiAgIHJhY2UgYmV0d2VlbiB1bmxvb2tpbmcgdGhlIGlvcmVxIHNl cnZlciBhbmQKKyAgICAgICAgICogICBwMm1fdHlwZV9jaGFuZ2UgYW5kIHJlLXRyeSB0aGUgaW5z dHJ1Y3Rpb24uCisgICAgICAgICAqCisgICAgICAgICAqICAgLSBJZiB0aGUgSU9SRVFfTUVNX0FD Q0VTU19XUklURSBmbGFnIGlzIG5vdCBzZXQsIHRyZWF0IGl0CisgICAgICAgICAqICAgbGlrZSBh IG5vcm1hbCBQSU8gb3IgTU1JTyB0aGF0IGRvZXNuJ3QgaGF2ZSBhbiBpb3JlcQorICAgICAgICAg KiAgIHNlcnZlciAoaS5lLiwgYnkgaWdub3JpbmcgaXQpLgorICAgICAgICAgKgorICAgICAgICAg KiAgIC0gSWYgdGhlIGFjY2Vzc3MgaXMgYSByZWFkLCB0aGlzIG11c3QgYmUgcGFydCBvZiBhCisg ICAgICAgICAqICAgcmVhZC1tb2RpZnktd3JpdGUgaW5zdHJ1Y3Rpb247IGVtdWxhdGUgdGhlIHJl YWQgc28gdGhhdCB3ZSBoYXZlCisgICAgICAgICAqICAgaXQKKyAgICAgICAgICovCisKICAgICAg ICAgc3RydWN0IGh2bV9pb3JlcV9zZXJ2ZXIgKnMgPSBOVUxMOwogICAgICAgICBwMm1fdHlwZV90 IHAybXQgPSBwMm1faW52YWxpZDsKLQorICAgICAgICAKICAgICAgICAgaWYgKCBpc19tbWlvICkK ICAgICAgICAgewogICAgICAgICAgICAgdW5zaWduZWQgbG9uZyBnbWZuID0gcGFkZHJfdG9fcGZu KGFkZHIpOwogCiAgICAgICAgICAgICAodm9pZCkgZ2V0X2dmbl9xdWVyeV91bmxvY2tlZChjdXJy ZCwgZ21mbiwgJnAybXQpOwogCi0gICAgICAgICAgICBpZiAoIHAybXQgPT0gcDJtX2lvcmVxX3Nl cnZlciAmJiBkaXIgPT0gSU9SRVFfV1JJVEUgKQorICAgICAgICAgICAgaWYgKCBwMm10ID09IHAy bV9pb3JlcV9zZXJ2ZXIgKQogICAgICAgICAgICAgewogICAgICAgICAgICAgICAgIHVuc2lnbmVk IGludCBmbGFnczsKIAogICAgICAgICAgICAgICAgIHMgPSBwMm1fZ2V0X2lvcmVxX3NlcnZlcihj dXJyZCwgJmZsYWdzKTsKKworICAgICAgICAgICAgICAgIC8qIAorICAgICAgICAgICAgICAgICAq IElmIHAybXQgaXMgaW9yZXFfc2VydmVyIGJ1dCBpb3JlcV9zZXJ2ZXIgaXMgTlVMTCwKKyAgICAg ICAgICAgICAgICAgKiB3ZSBwcm9iYWJseSBsb3N0IGEgcmFjZSB3aXRoIHAybV90eXBlX2NoYW5n ZTsganVzdAorICAgICAgICAgICAgICAgICAqIHJldHJ5IHRoZSBhY2Nlc3MuCisgICAgICAgICAg ICAgICAgICovCisgICAgICAgICAgICAgICAgaWYgKCBzID09IE5VTEwgKQorICAgICAgICAgICAg ICAgIHsKKyAgICAgICAgICAgICAgICAgICAgcmMgPSBYODZFTVVMX1JFVFJZOworICAgICAgICAg ICAgICAgICAgICB2aW8tPmlvX3JlcS5zdGF0ZSA9IFNUQVRFX0lPUkVRX05PTkU7CisgICAgICAg ICAgICAgICAgICAgIGJyZWFrOworICAgICAgICAgICAgICAgIH0KKworICAgICAgICAgICAgICAg IC8qIAorICAgICAgICAgICAgICAgICAqIFRoaXMgaXMgcGFydCBvZiBhIHJlYWQtbW9kaWZ5LXdy aXRlIGluc3RydWN0aW9uLgorICAgICAgICAgICAgICAgICAqIEVtdWxhdGUgdGhlIHJlYWQgcGFy dCBzbyB3ZSBoYXZlIHRoZSB2YWx1ZSBjYWNoZWQuCisgICAgICAgICAgICAgICAgICovCisgICAg ICAgICAgICAgICAgaWYgKCBkaXIgPT0gSU9SRVFfUkVBRCApCisgICAgICAgICAgICAgICAgewor ICAgICAgICAgICAgICAgICAgICByYyA9IGh2bV9wcm9jZXNzX2lvX2ludGVyY2VwdCgmbWVtX2hh bmRsZXIsICZwKTsKKyAgICAgICAgICAgICAgICAgICAgdmlvLT5pb19yZXEuc3RhdGUgPSBTVEFU RV9JT1JFUV9OT05FOworICAgICAgICAgICAgICAgICAgICBicmVhazsKKyAgICAgICAgICAgICAg ICB9CisKKyAgICAgICAgICAgICAgICAKICAgICAgICAgICAgICAgICBpZiAoICEoZmxhZ3MgJiBY RU5fSFZNT1BfSU9SRVFfTUVNX0FDQ0VTU19XUklURSkgKQorICAgICAgICAgICAgICAgIHsKICAg ICAgICAgICAgICAgICAgICAgcyA9IE5VTEw7CisgICAgICAgICAgICAgICAgfQogICAgICAgICAg ICAgfQogICAgICAgICB9CiAKICAgICAgICAgaWYgKCAhcyAmJiBwMm10ICE9IHAybV9pb3JlcV9z ZXJ2ZXIgKQogICAgICAgICAgICAgcyA9IGh2bV9zZWxlY3RfaW9yZXFfc2VydmVyKGN1cnJkLCAm cCk7CiAKLSAgICAgICAgLyogSWYgdGhlcmUgaXMgbm8gc3VpdGFibGUgYmFja2luZyBETSwganVz dCBpZ25vcmUgYWNjZXNzZXMgKi8KICAgICAgICAgaWYgKCAhcyApCiAgICAgICAgIHsKLSAgICAg ICAgICAgIC8qCi0gICAgICAgICAgICAgKiBGb3IgcDJtX2lvcmVxX3NlcnZlciBwYWdlcyBhY2Nl c3NlZCB3aXRoIHJlYWQtbW9kaWZ5LXdyaXRlCi0gICAgICAgICAgICAgKiBpbnN0cnVjdGlvbnMs IHdlIHByb3ZpZGUgYSByZWFkIGhhbmRsZXIgdG8gY29weSB0aGUgZGF0YSB0bwotICAgICAgICAg ICAgICogdGhlIGJ1ZmZlci4KLSAgICAgICAgICAgICAqLwotICAgICAgICAgICAgaWYgKCBwMm10 ID09IHAybV9pb3JlcV9zZXJ2ZXIgKQotICAgICAgICAgICAgICAgIHJjID0gaHZtX3Byb2Nlc3Nf aW9faW50ZXJjZXB0KCZtZW1faGFuZGxlciwgJnApOwotICAgICAgICAgICAgZWxzZQotICAgICAg ICAgICAgICAgIHJjID0gaHZtX3Byb2Nlc3NfaW9faW50ZXJjZXB0KCZudWxsX2hhbmRsZXIsICZw KTsKKyAgICAgICAgICAgIC8qIElmIHRoZXJlIGlzIG5vIHN1aXRhYmxlIGJhY2tpbmcgRE0sIGp1 c3QgaWdub3JlIGFjY2Vzc2VzICovCisgICAgICAgICAgICByYyA9IGh2bV9wcm9jZXNzX2lvX2lu dGVyY2VwdCgmbnVsbF9oYW5kbGVyLCAmcCk7CiAgICAgICAgICAgICB2aW8tPmlvX3JlcS5zdGF0 ZSA9IFNUQVRFX0lPUkVRX05PTkU7CiAgICAgICAgIH0KICAgICAgICAgZWxzZQotLSAKMi4xLjQK Cg== --94eb2c0600dc3b80e3053d170390 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --94eb2c0600dc3b80e3053d170390--