From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jan Beulich"
Subject: [PATCH v5 05/47] x86emul: use AVX512 logic for emulating V{, P}MASKMOV*
Date: Mon, 19 Nov 2018 03:15:28 -0700
Message-ID: <5BF28D4002000078001FD429@prv1-mh.provo.novell.com>
References: <5B6BF83602000078001DC548@prv1-mh.provo.novell.com> <5BF289D802000078001FD3DF@prv1-mh.provo.novell.com>
In-Reply-To: <5BF289D802000078001FD3DF@prv1-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
To: xen-devel
Cc: George Dunlap, Andrew Cooper, Wei Liu, Roger Pau Monne

The more generic AVX512 implementation allows quite a bit of
insn-specific code to be dropped/shared.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -439,8 +439,8 @@ static const struct ext0f38_table {
     [0x28 ... 0x29] = { .simd_size = simd_packed_int },
     [0x2a] = { .simd_size = simd_packed_int, .two_op = 1, .d8s = d8s_vl },
     [0x2b] = { .simd_size = simd_packed_int },
-    [0x2c ... 0x2d] = { .simd_size = simd_other },
-    [0x2e ... 0x2f] = { .simd_size = simd_other, .to_mem = 1 },
+    [0x2c ... 0x2d] = { .simd_size = simd_packed_fp },
+    [0x2e ... 0x2f] = { .simd_size = simd_packed_fp, .to_mem = 1 },
     [0x30 ... 0x35] = { .simd_size = simd_other, .two_op = 1 },
     [0x36 ... 0x3f] = { .simd_size = simd_packed_int },
     [0x40] = { .simd_size = simd_packed_int },
@@ -449,8 +449,8 @@ static const struct ext0f38_table {
     [0x58 ... 0x59] = { .simd_size = simd_other, .two_op = 1 },
     [0x5a] = { .simd_size = simd_128, .two_op = 1 },
     [0x78 ... 0x79] = { .simd_size = simd_other, .two_op = 1 },
-    [0x8c] = { .simd_size = simd_other },
-    [0x8e] = { .simd_size = simd_other, .to_mem = 1 },
+    [0x8c] = { .simd_size = simd_packed_int },
+    [0x8e] = { .simd_size = simd_packed_int, .to_mem = 1 },
     [0x90 ... 0x93] = { .simd_size = simd_other, .vsib = 1 },
     [0x96 ... 0x98] = { .simd_size = simd_packed_fp },
     [0x99] = { .simd_size = simd_scalar_vexw },
@@ -7989,6 +7989,8 @@ x86_emulate(
 
         generate_exception_if(ea.type != OP_MEM || vex.w, EXC_UD);
         host_and_vcpu_must_have(avx);
+        elem_bytes = 4 << (b & 1);
+    vmaskmov:
         get_fpu(X86EMUL_FPU_ymm);
 
         /*
@@ -8003,7 +8005,7 @@ x86_emulate(
         opc = init_prefixes(stub);
         pvex = copy_VEX(opc, vex);
         pvex->opcx = vex_0f;
-        if ( !(b & 1) )
+        if ( elem_bytes == 4 )
             pvex->pfx = vex_none;
         opc[0] = 0x50; /* vmovmskp{s,d} */
         /* Use %rax as GPR destination and VEX.vvvv as source. */
@@ -8016,21 +8018,9 @@ x86_emulate(
         invoke_stub("", "", "=a" (ea.val) : [dummy] "i" (0));
         put_stub(stub);
 
-        if ( !ea.val )
-            goto complete_insn;
-
-        op_bytes = 4 << (b & 1);
-        first_byte = __builtin_ctz(ea.val);
-        ea.val >>= first_byte;
-        first_byte *= op_bytes;
-        op_bytes *= 32 - __builtin_clz(ea.val);
-
-        /*
-         * Even for the memory write variant a memory read is needed, unless
-         * all set mask bits are contiguous.
-         */
-        if ( ea.val & (ea.val + 1) )
-            d = (d & ~SrcMask) | SrcMem;
+        evex.opmsk = 1; /* fake */
+        op_mask = ea.val;
+        fault_suppression = true;
 
         opc = init_prefixes(stub);
         opc[0] = b;
@@ -8081,63 +8071,10 @@ x86_emulate(
 
     case X86EMUL_OPC_VEX_66(0x0f38, 0x8c): /* vpmaskmov{d,q} mem,{x,y}mm,{x,y}mm */
     case X86EMUL_OPC_VEX_66(0x0f38, 0x8e): /* vpmaskmov{d,q} {x,y}mm,{x,y}mm,mem */
-    {
-        typeof(vex) *pvex;
-        unsigned int mask = vex.w ? 0x80808080U : 0x88888888U;
-
         generate_exception_if(ea.type != OP_MEM, EXC_UD);
         host_and_vcpu_must_have(avx2);
-        get_fpu(X86EMUL_FPU_ymm);
-
-        /*
-         * While we can't reasonably provide fully correct behavior here
-         * (in particular, for writes, avoiding the memory read in anticipation
-         * of all elements in the range eventually being written), we can (and
-         * should) still limit the memory access to the smallest possible range
-         * (suppressing it altogether if all mask bits are clear), to provide
-         * correct faulting behavior. Read the mask bits via vmovmskp{s,d}
-         * for that purpose.
-         */
-        opc = init_prefixes(stub);
-        pvex = copy_VEX(opc, vex);
-        pvex->opcx = vex_0f;
-        opc[0] = 0xd7; /* vpmovmskb */
-        /* Use %rax as GPR destination and VEX.vvvv as source. */
-        pvex->r = 1;
-        pvex->b = !mode_64bit() || (vex.reg >> 3);
-        opc[1] = 0xc0 | (~vex.reg & 7);
-        pvex->reg = 0xf;
-        opc[2] = 0xc3;
-
-        invoke_stub("", "", "=a" (ea.val) : [dummy] "i" (0));
-        put_stub(stub);
-
-        /* Convert byte granular result to dword/qword granularity. */
-        ea.val &= mask;
-        if ( !ea.val )
-            goto complete_insn;
-
-        first_byte = __builtin_ctz(ea.val) & ~((4 << vex.w) - 1);
-        ea.val >>= first_byte;
-        op_bytes = 32 - __builtin_clz(ea.val);
-
-        /*
-         * Even for the memory write variant a memory read is needed, unless
-         * all set mask bits are contiguous.
-         */
-        if ( ea.val & (ea.val + ~mask + 1) )
-            d = (d & ~SrcMask) | SrcMem;
-
-        opc = init_prefixes(stub);
-        opc[0] = b;
-        /* Convert memory operand to (%rAX). */
-        rex_prefix &= ~REX_B;
-        vex.b = 1;
-        opc[1] = modrm & 0x38;
-        insn_bytes = PFX_BYTES + 2;
-
-        break;
-    }
+        elem_bytes = 4 << vex.w;
+        goto vmaskmov;
 
     case X86EMUL_OPC_VEX_66(0x0f38, 0x90): /* vpgatherd{d,q} {x,y}mm,mem,{x,y}mm */
     case X86EMUL_OPC_VEX_66(0x0f38, 0x91): /* vpgatherq{d,q} {x,y}mm,mem,{x,y}mm */