From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefano Stabellini
Subject: Xen on ARM IRQ latency and scheduler overhead
Date: Thu, 9 Feb 2017 16:54:37 -0800 (PST)
Message-ID:
Mime-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="8323329-1013049813-1486685645=:20549"
Return-path:
Content-ID:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: xen-devel-bounces@lists.xen.org
Sender: "Xen-devel"
To: xen-devel@lists.xen.org
Cc: george.dunlap@eu.citrix.com, edgar.iglesias@xilinx.com, dario.faggioli@citrix.com, sstabellini@kernel.org, julien.grall@arm.com
List-Id: xen-devel@lists.xenproject.org

This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools.

--8323329-1013049813-1486685645=:20549
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-ID:

Hi all,

I have run some IRQ latency measurements on Xen on ARM on a Xilinx ZynqMP board (four Cortex A53 cores, GICv2).

Dom0 has 1 vcpu pinned to cpu0, DomU has 1 vcpu pinned to cpu2. Dom0 is Ubuntu. DomU is an ad-hoc baremetal app to measure interrupt latency: https://github.com/edgarigl/tbm

I modified the app to use the phys_timer instead of the virt_timer. You can build it with:

make CFG=configs/xen-guest-irq-latency.cfg

I modified Xen to export the phys_timer to guests, see the very hacky patch attached. This way, the phys_timer interrupt should behave like any conventional device interrupt assigned to a guest.

These are the results, in nanoseconds:

                            AVG     MIN     MAX  WARM MAX
NODEBUG no WFI             1890    1800    3170      2070
NODEBUG WFI                4850    4810    7030      4980
NODEBUG no WFI credit2     2217    2090    3420      2650
NODEBUG WFI credit2        8080    7890   10320      8300
DEBUG no WFI               2252    2080    3320      2650
DEBUG WFI                  6500    6140    8520      8130
DEBUG WFI, credit2         8050    7870   10680      8450

DEBUG means Xen DEBUG build. WARM MAX is the maximum latency, taking out the first few interrupts to warm the caches.
WFI is the ARM and ARM64 sleeping instruction, trapped and emulated by Xen by calling vcpu_block.

As you can see, depending on whether the guest issues a WFI or not while waiting for interrupts, the results change significantly. Interestingly, credit2 does worse than credit1 in this area.

Trying to figure out where those 3000-4000ns of difference between the WFI and non-WFI cases come from, I wrote a patch to zero the latency introduced by xen/arch/arm/domain.c:schedule_tail. That saves about 1000ns. There are no other arch specific context switch functions worth optimizing.

We are down to 2000-3000ns. Then, I started investigating the scheduler. I measured how long it takes to run "vcpu_unblock": 1050ns, which is significant. I don't know what is causing the remaining 1000-2000ns, but I bet on another scheduler function. Do you have any suggestions on which one?

Assuming that the problem is indeed the scheduler, one workaround that we could introduce today would be to avoid calling vcpu_block on guest WFI and call vcpu_yield instead. This change makes things significantly better:

                                        AVG     MIN     MAX  WARM MAX
DEBUG WFI (yield, no block)            2900    2190    5130      5130
DEBUG WFI (yield, no block) credit2    3514    2280    6180      5430

Is that a reasonable change to make? Would it cause significantly more power consumption in Xen (because xen/arch/arm/domain.c:idle_loop might not be called anymore)?

If we wanted to zero the difference between the WFI and non-WFI cases, would we need a new scheduler? A simple "noop scheduler" that statically assigns vcpus to pcpus, one by one, until they run out, then returns an error? Or do we need more extensive modifications to xen/common/schedule.c? Any other ideas?
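To make the "noop scheduler" idea a bit more concrete, here is a rough standalone sketch in C of static vcpu-to-pcpu assignment. It is not Xen code and it does not use the real scheduler interface in xen/common/schedule.c: struct noop_sched, noop_init, noop_assign_vcpu, noop_pick_next, NR_PCPUS and PCPU_FREE are all made-up names for illustration, and vcpus are reduced to plain integer ids.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_PCPUS  4     /* assumption: four physical cpus, as on the ZynqMP */
#define PCPU_FREE (-1)  /* marker for a pcpu with no vcpu assigned */

/* Hypothetical scheduler state: the vcpu id that owns each pcpu. */
struct noop_sched {
    int owner[NR_PCPUS];
};

/* Mark every pcpu as free. */
static void noop_init(struct noop_sched *s)
{
    assert(s != NULL);
    for (int i = 0; i < NR_PCPUS; i++)
        s->owner[i] = PCPU_FREE;
}

/*
 * Statically assign a vcpu to the first free pcpu, one by one.
 * Returns the pcpu index, or -ENOSPC once the pcpus have run out.
 */
static int noop_assign_vcpu(struct noop_sched *s, int vcpu_id)
{
    assert(s != NULL && vcpu_id >= 0);
    for (int i = 0; i < NR_PCPUS; i++) {
        if (s->owner[i] == PCPU_FREE) {
            s->owner[i] = vcpu_id;
            return i;
        }
    }
    return -ENOSPC;
}

/*
 * The whole "scheduling decision": a pcpu only ever runs its owner.
 * Returns the owning vcpu id, or PCPU_FREE if the pcpu should idle.
 */
static int noop_pick_next(const struct noop_sched *s, int pcpu)
{
    assert(s != NULL && pcpu >= 0 && pcpu < NR_PCPUS);
    return s->owner[pcpu];
}
```

With four pcpus, a fifth call to noop_assign_vcpu fails with -ENOSPC, and picking what to run on a pcpu is a single array lookup. Whether a real implementation along these lines would actually remove the WFI overhead measured above is exactly the open question; this only illustrates the assignment policy.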
Thanks, Stefano --8323329-1013049813-1486685645=:20549 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME=time Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: ATTACHMENT; FILENAME=time ZGlmZiAtLWdpdCBhL3hlbi9hcmNoL2FybS9kb21haW4uYyBiL3hlbi9hcmNo L2FybS9kb21haW4uYw0KaW5kZXggN2U0MzY5MS4uZjVmZjY5YiAxMDA2NDQN Ci0tLSBhL3hlbi9hcmNoL2FybS9kb21haW4uYw0KKysrIGIveGVuL2FyY2gv YXJtL2RvbWFpbi5jDQpAQCAtNjYzLDYgKzY2Myw3IEBAIHZvaWQgYXJjaF9k b21haW5fZGVzdHJveShzdHJ1Y3QgZG9tYWluICpkKQ0KICAgICAvKiBJT01N VSBwYWdlIHRhYmxlIGlzIHNoYXJlZCB3aXRoIFAyTSwgYWx3YXlzIGNhbGwN CiAgICAgICogaW9tbXVfZG9tYWluX2Rlc3Ryb3koKSBiZWZvcmUgcDJtX3Rl YXJkb3duKCkuDQogICAgICAqLw0KKyAgICBXUklURV9TWVNSRUczMigwLCBD TlRQX0NUTF9FTDApOw0KICAgICBpb21tdV9kb21haW5fZGVzdHJveShkKTsN CiAgICAgcDJtX3RlYXJkb3duKGQpOw0KICAgICBkb21haW5fdmdpY19mcmVl KGQpOw0KZGlmZiAtLWdpdCBhL3hlbi9hcmNoL2FybS9naWMuYyBiL3hlbi9h cmNoL2FybS9naWMuYw0KaW5kZXggYTUzNDhmMi4uNWM4YjYyMSAxMDA2NDQN Ci0tLSBhL3hlbi9hcmNoL2FybS9naWMuYw0KKysrIGIveGVuL2FyY2gvYXJt L2dpYy5jDQpAQCAtNDcsNyArNDcsNyBAQCBzdGF0aWMgREVGSU5FX1BFUl9D UFUodWludDY0X3QsIGxyX21hc2spOw0KIA0KIHN0YXRpYyB2b2lkIGdpY191 cGRhdGVfb25lX2xyKHN0cnVjdCB2Y3B1ICp2LCBpbnQgaSk7DQogDQotc3Rh dGljIGNvbnN0IHN0cnVjdCBnaWNfaHdfb3BlcmF0aW9ucyAqZ2ljX2h3X29w czsNCitjb25zdCBzdHJ1Y3QgZ2ljX2h3X29wZXJhdGlvbnMgKmdpY19od19v cHM7DQogDQogdm9pZCByZWdpc3Rlcl9naWNfb3BzKGNvbnN0IHN0cnVjdCBn aWNfaHdfb3BlcmF0aW9ucyAqb3BzKQ0KIHsNCmRpZmYgLS1naXQgYS94ZW4v YXJjaC9hcm0vaXJxLmMgYi94ZW4vYXJjaC9hcm0vaXJxLmMNCmluZGV4IGRk NjJiYTYuLjlhNGU1MGQgMTAwNjQ0DQotLS0gYS94ZW4vYXJjaC9hcm0vaXJx LmMNCisrKyBiL3hlbi9hcmNoL2FybS9pcnEuYw0KQEAgLTE4NCw2ICsxODQs NyBAQCBpbnQgcmVxdWVzdF9pcnEodW5zaWduZWQgaW50IGlycSwgdW5zaWdu ZWQgaW50IGlycWZsYWdzLA0KIH0NCiANCiAvKiBEaXNwYXRjaCBhbiBpbnRl cnJ1cHQgKi8NCitleHRlcm4gY29uc3Qgc3RydWN0IGdpY19od19vcGVyYXRp b25zICpnaWNfaHdfb3BzOw0KIHZvaWQgZG9fSVJRKHN0cnVjdCBjcHVfdXNl cl9yZWdzICpyZWdzLCB1bnNpZ25lZCBpbnQgaXJxLCBpbnQgaXNfZmlxKQ0K IHsNCiAgICAgc3RydWN0IGlycV9kZXNjICpkZXNjID0gaXJxX3RvX2Rlc2Mo 
aXJxKTsNCkBAIC0yMDIsNiArMjAzLDEyIEBAIHZvaWQgZG9fSVJRKHN0cnVj dCBjcHVfdXNlcl9yZWdzICpyZWdzLCB1bnNpZ25lZCBpbnQgaXJxLCBpbnQg aXNfZmlxKQ0KICAgICBpcnFfZW50ZXIoKTsNCiANCiAgICAgc3Bpbl9sb2Nr KCZkZXNjLT5sb2NrKTsNCisNCisgICAgaWYgKGlycSA9PSAzMCkgew0KKyAg ICAgICAgc2V0X2JpdChfSVJRX0dVRVNULCAmZGVzYy0+c3RhdHVzKTsNCisg ICAgICAgIGRlc2MtPmhhbmRsZXIgPSBnaWNfaHdfb3BzLT5naWNfZ3Vlc3Rf aXJxX3R5cGU7DQorICAgIH0NCisNCiAgICAgZGVzYy0+aGFuZGxlci0+YWNr KGRlc2MpOw0KIA0KICAgICBpZiAoICFkZXNjLT5hY3Rpb24gKQ0KQEAgLTIy NCw3ICsyMzEsMjMgQEAgdm9pZCBkb19JUlEoc3RydWN0IGNwdV91c2VyX3Jl Z3MgKnJlZ3MsIHVuc2lnbmVkIGludCBpcnEsIGludCBpc19maXEpDQogICAg ICAgICAgKiBUaGUgaXJxIGNhbm5vdCBiZSBhIFBQSSwgd2Ugb25seSBzdXBw b3J0IGRlbGl2ZXJ5IG9mIFNQSXMgdG8NCiAgICAgICAgICAqIGd1ZXN0cy4N CiAJICovDQotICAgICAgICB2Z2ljX3ZjcHVfaW5qZWN0X3NwaShpbmZvLT5k LCBpbmZvLT52aXJxKTsNCisgICAgICAgIGlmIChpcnEgIT0gMzApDQorICAg ICAgICAgICAgdmdpY192Y3B1X2luamVjdF9zcGkoaW5mby0+ZCwgaW5mby0+ dmlycSk7DQorICAgICAgICBlbHNlIHsNCisgICAgICAgICAgICBzdHJ1Y3Qg ZG9tYWluICpkOw0KKyAgICAgICAgICAgIA0KKyAgICAgICAgICAgIGZvcl9l YWNoX2RvbWFpbiAoIGQgKQ0KKyAgICAgICAgICAgIHsNCisgICAgICAgICAg ICAgICAgc3RydWN0IHBlbmRpbmdfaXJxICpwOw0KKyAgICAgICAgICAgICAg ICANCisgICAgICAgICAgICAgICAgaWYgKGQtPmRvbWFpbl9pZCA9PSAwIHx8 IGlzX2lkbGVfZG9tYWluKGQpKQ0KKyAgICAgICAgICAgICAgICAgICAgY29u dGludWU7DQorICAgICAgICAgICAgICAgIHAgPSBpcnFfdG9fcGVuZGluZyhk LT52Y3B1WzBdLCAzMCk7DQorICAgICAgICAgICAgICAgIHAtPmRlc2MgPSBk ZXNjOw0KKyAgICAgICAgICAgICAgICB2Z2ljX3ZjcHVfaW5qZWN0X2lycShk LT52Y3B1WzBdLCAzMCk7DQorICAgICAgICAgICAgICAgIGJyZWFrOw0KKyAg ICAgICAgICAgIH0NCisgICAgICAgIH0NCiAgICAgICAgIGdvdG8gb3V0X25v X2VuZDsNCiAgICAgfQ0KIA0KZGlmZiAtLWdpdCBhL3hlbi9hcmNoL2FybS90 aW1lLmMgYi94ZW4vYXJjaC9hcm0vdGltZS5jDQppbmRleCA3ZGFlMjhiLi4w MjQ5NjMxIDEwMDY0NA0KLS0tIGEveGVuL2FyY2gvYXJtL3RpbWUuYw0KKysr IGIveGVuL2FyY2gvYXJtL3RpbWUuYw0KQEAgLTI5Nyw5ICszMDAsOSBAQCB2 b2lkIGluaXRfdGltZXJfaW50ZXJydXB0KHZvaWQpDQogICAgIC8qIFNlbnNp YmxlIGRlZmF1bHRzICovDQogICAgIFdSSVRFX1NZU1JFRzY0KDAsIENOVFZP 
RkZfRUwyKTsgICAgIC8qIE5vIFZNLXNwZWNpZmljIG9mZnNldCAqLw0KICAg ICAvKiBEbyBub3QgbGV0IHRoZSBWTXMgcHJvZ3JhbSB0aGUgcGh5c2ljYWwg dGltZXIsIG9ubHkgcmVhZCB0aGUgcGh5c2ljYWwgY291bnRlciAqLw0KLSAg ICBXUklURV9TWVNSRUczMihDTlRIQ1RMX0VMMl9FTDFQQ1RFTiwgQ05USENU TF9FTDIpOw0KICAgICBXUklURV9TWVNSRUczMigwLCBDTlRQX0NUTF9FTDAp OyAgICAvKiBQaHlzaWNhbCB0aW1lciBkaXNhYmxlZCAqLw0KICAgICBXUklU RV9TWVNSRUczMigwLCBDTlRIUF9DVExfRUwyKTsgICAvKiBIeXBlcnZpc29y J3MgdGltZXIgZGlzYWJsZWQgKi8NCisgICAgV1JJVEVfU1lTUkVHMzIoQ05U SENUTF9FTDJfRUwxUENURU58Q05USENUTF9FTDJfRUwxUENFTiwgQ05USENU TF9FTDIpOw0KICAgICBpc2IoKTsNCiANCiAgICAgcmVxdWVzdF9pcnEodGlt ZXJfaXJxW1RJTUVSX0hZUF9QUEldLCAwLCB0aW1lcl9pbnRlcnJ1cHQsDQo= --8323329-1013049813-1486685645=:20549 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --8323329-1013049813-1486685645=:20549--