From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Michael Kelley <mikelley@microsoft.com>, Tianyu Lan <lantianyu1986@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Tianyu Lan <Tianyu.Lan@microsoft.com>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger kernel org" <linux-kernel@vger.kernel.org>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	the arch/x86 maintainers <x86@kernel.org>,
	KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Sasha Levin <sashal@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"ashal@kernel.org" <ashal@kernel.org>
Subject: RE: [PATCH 0/2] clocksource/Hyper-V: Add Hyper-V specific sched clock function
Date: Tue, 13 Aug 2019 10:33:37 +0200
Message-ID: <87sgq5a2hq.fsf@vitty.brq.redhat.com>
In-Reply-To: <DM5PR21MB0137E03AAD8C2EA61EC81ED7D7D30@DM5PR21MB0137.namprd21.prod.outlook.com>

Michael Kelley <mikelley@microsoft.com> writes:

> From: Tianyu Lan <lantianyu1986@gmail.com> Sent: Tuesday, July 30, 2019 6:41 AM
>>
>> On Mon, Jul 29, 2019 at 8:13 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> >
>> > Peter Zijlstra <peterz@infradead.org> writes:
>> >
>> > > On Mon, Jul 29, 2019 at 12:59:26PM +0200, Vitaly Kuznetsov wrote:
>> > >> lantianyu1986@gmail.com writes:
>> > >>
>> > >> > From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>> > >> >
>> > >> > Hyper-V guests use the default native_sched_clock() in pv_ops.time.sched_clock
>> > >> > on x86. But native_sched_clock() directly uses the raw TSC value, which
>> > >> > can be discontinuous in a Hyper-V VM. Add the generic hv_setup_sched_clock()
>> > >> > to set the sched clock function appropriately. On x86, this sets
>> > >> > pv_ops.time.sched_clock to read the Hyper-V reference TSC value that is
>> > >> > scaled and adjusted to be continuous.
>> > >>
>> > >> Hypervisor can, in theory, disable TSC page and then we're forced to use
>> > >> MSR-based clocksource but using it as sched_clock() can be very slow,
>> > >> I'm afraid.
>> > >>
>> > >> On the other hand, what we have now is probably worse: TSC can,
>> > >> actually, jump backwards (e.g. on migration) and we're breaking the
>> > >> requirements for sched_clock().
>> > >
>> > > That (obviously) also breaks the requirements for using TSC as
>> > > clocksource.
>> > >
>> > > IOW, it breaks the entire purpose of having TSC in the first place.
>> >
>> > Currently, we mark raw TSC as unstable when running on Hyper-V (see
>> > 88c9281a9fba6), 'TSC page' (which is TSC * scale + offset) is being used
>> > instead. The problem is that 'TSC page' can be disabled by the
>> > hypervisor and in that case the only remaining clocksource is MSR-based
>> > (slow).
>> >
>>
>> Yes, that will be slow if Hyper-V doesn't expose hv tsc page and
>> kernel uses MSR based clocksource. Each MSR read will trigger one
>> VM-EXIT. This also happens on other hypervisors (e,g, KVM doesn't
>> expose KVM clock). Hypervisor should take this into account and
>> determine which clocksource should be exposed or not.
>>
>
> We've confirmed with the Hyper-V team that the TSC page is always available
> on Hyper-V 2016 and later, and on Hyper-V 2012 R2 when the physical
> hardware presents an InvariantTSC.

Currently we check that TSC page is valid on every read and it seems
this is redundant, right? It is either available on boot or not. I can
only imagine migrating a VM to a non-InvariantTSC host when Hyper-V will
likely disable the page (and we can get reenlightenment notification
then).

> But the Linux Kconfig's are set up so
> the TSC page is not used for 32-bit guests -- all clock reads are synthetic MSR
> reads. For 32-bit, this set of changes will add more overhead because the
> sched clock reads will now be MSR reads.
>
> I would be inclined to fix the problem, even with the perf hit on 32-bit Linux.
> I don’t have any data on 32-bit Linux being used in a Hyper-V guest, but it's not
> supported in Azure so usage is pretty small. The alternative would be to continue
> to use the raw TSC value on 32-bit, even with the risk of a discontinuity in case of
> live migration or similar scenarios.

The issue needs fixing, I agree, however using MSR based clocksource as
sched clock may give us too big of a performance hit (not sure who cares
about 32 bit guest performance nowadays but still). What stops us from
enabling TSC page for 32 bit guests if it is available?

-- 
Vitaly
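For readers following the thread, here is a minimal user-space sketch of the 'TSC page' read discussed above (TSC * scale + offset, with a validity check on every read). It is not the kernel's implementation. The struct layout, the names ref_tsc_page, read_ref_time and fixed_tsc, and the callback used in place of a real TSC read are all assumptions made for illustration. It also assumes that a sequence value of 0 marks the page as invalid, that the scale is a 64.64 fixed-point factor, and that the compiler provides unsigned __int128 (GCC/Clang).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the reference TSC page. */
struct ref_tsc_page {
    volatile uint32_t sequence;  /* assumed: 0 means "page not valid" */
    uint32_t reserved;
    volatile uint64_t scale;     /* assumed: 64.64 fixed-point multiplier */
    volatile int64_t offset;     /* added after scaling */
};

/* High 64 bits of a 64x64-bit product (relies on unsigned __int128). */
static uint64_t mul_hi_u64(uint64_t a, uint64_t b)
{
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* Stand-in for a raw TSC read so the sketch runs anywhere. */
static uint64_t fixed_tsc(void)
{
    return 1000;
}

/*
 * Returns 1 and stores the scaled time in *out when the page is valid.
 * Returns 0 when the page is invalid, in which case a caller would have
 * to fall back to the slow MSR-based read mentioned in the thread.
 */
static int read_ref_time(const struct ref_tsc_page *pg,
                         uint64_t (*read_tsc)(void), uint64_t *out)
{
    for (;;) {
        uint32_t seq = pg->sequence;

        if (seq == 0)
            return 0;   /* the per-read validity check */

        atomic_thread_fence(memory_order_acquire);
        uint64_t scale = pg->scale;
        int64_t offset = pg->offset;
        uint64_t tsc = read_tsc();
        atomic_thread_fence(memory_order_acquire);

        /* Retry if the page changed while we were reading it. */
        if (pg->sequence == seq) {
            *out = mul_hi_u64(tsc, scale) + (uint64_t)offset;
            return 1;
        }
    }
}
```

In this sketch the validity check costs one load and one compare per read, which is the part questioned above as possibly redundant. The sequence re-check after the reads is a separate concern: it guards against the scale and offset changing mid-read.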