From: Michael Kelley <mikelley@microsoft.com>
To: Tianyu Lan <lantianyu1986@gmail.com>, vkuznets <vkuznets@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Tianyu Lan <Tianyu.Lan@microsoft.com>,
"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
"linux-kernel@vger kernel org" <linux-kernel@vger.kernel.org>,
Andy Lutomirski <luto@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
the arch/x86 maintainers <x86@kernel.org>,
KY Srinivasan <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Stephen Hemminger <sthemmin@microsoft.com>,
Sasha Levin <sashal@kernel.org>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Arnd Bergmann <arnd@arndb.de>,
"ashal@kernel.org" <ashal@kernel.org>
Subject: RE: [PATCH 0/2] clocksource/Hyper-V: Add Hyper-V specific sched clock function
Date: Mon, 12 Aug 2019 19:22:25 +0000
Message-ID: <DM5PR21MB0137E03AAD8C2EA61EC81ED7D7D30@DM5PR21MB0137.namprd21.prod.outlook.com>
In-Reply-To: <CAOLK0py6ngy9kAnZcRMBK8U45s2L5Wo4X0NP_qPM0zv7WjeVQQ@mail.gmail.com>
From: Tianyu Lan <lantianyu1986@gmail.com> Sent: Tuesday, July 30, 2019 6:41 AM
>
> On Mon, Jul 29, 2019 at 8:13 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >
> > Peter Zijlstra <peterz@infradead.org> writes:
> >
> > > On Mon, Jul 29, 2019 at 12:59:26PM +0200, Vitaly Kuznetsov wrote:
> > >> lantianyu1986@gmail.com writes:
> > >>
> > >> > From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> > >> >
> > >> > Hyper-V guests use the default native_sched_clock() in pv_ops.time.sched_clock
> > >> > on x86. But native_sched_clock() directly uses the raw TSC value, which
> > >> > can be discontinuous in a Hyper-V VM. Add the generic hv_setup_sched_clock()
> > >> > to set the sched clock function appropriately. On x86, this sets
> > >> > pv_ops.time.sched_clock to read the Hyper-V reference TSC value that is
> > >> > scaled and adjusted to be continuous.
> > >>
> > >> Hypervisor can, in theory, disable TSC page and then we're forced to use
> > >> MSR-based clocksource but using it as sched_clock() can be very slow,
> > >> I'm afraid.
> > >>
> > >> On the other hand, what we have now is probably worse: TSC can,
> > >> actually, jump backwards (e.g. on migration) and we're breaking the
> > >> requirements for sched_clock().
> > >
> > > That (obviously) also breaks the requirements for using TSC as
> > > clocksource.
> > >
> > > IOW, it breaks the entire purpose of having TSC in the first place.
> >
> > Currently, we mark raw TSC as unstable when running on Hyper-V (see
> > 88c9281a9fba6), 'TSC page' (which is TSC * scale + offset) is being used
> > instead. The problem is that 'TSC page' can be disabled by the
> > hypervisor and in that case the only remaining clocksource is MSR-based
> > (slow).
> >
>
> Yes, that will be slow if Hyper-V doesn't expose the hv tsc page and the
> kernel uses the MSR-based clocksource. Each MSR read will trigger one
> VM-EXIT. This also happens on other hypervisors (e.g., when KVM doesn't
> expose KVM clock). The hypervisor should take this into account and
> determine which clocksource should be exposed or not.
>
We've confirmed with the Hyper-V team that the TSC page is always available
on Hyper-V 2016 and later, and on Hyper-V 2012 R2 when the physical
hardware presents an InvariantTSC. But the Linux Kconfig options are set up so
that the TSC page is not used for 32-bit guests -- all clock reads are synthetic
MSR reads. For 32-bit, this set of changes will add more overhead because the
sched clock reads will now be MSR reads.

I would be inclined to fix the problem, even with the perf hit on 32-bit Linux.
I don't have any data on 32-bit Linux being used in a Hyper-V guest, but it's not
supported in Azure so usage is pretty small. The alternative would be to continue
to use the raw TSC value on 32-bit, even with the risk of a discontinuity in case of
live migration or similar scenarios.

Michael
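
For readers following the thread, the "TSC page" computation quoted above
(TSC * scale + offset) can be sketched roughly as follows. This is a
simplified user-space illustration, not the kernel's code: the struct
ref_tsc_page layout and the ref_time_from_tsc() helper are hypothetical
names, the scale is assumed to be a 64.64 fixed-point multiplier (so the
128-bit product is shifted right by 64), and a sequence value of 0 is assumed
to mean the page is unavailable and the caller must fall back to the slower
MSR read. It also relies on the unsigned __int128 extension of GCC/Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the reference TSC page contents. */
struct ref_tsc_page {
	uint32_t sequence;	/* assumed: 0 means the page is not valid */
	uint64_t scale;		/* assumed: 64.64 fixed-point multiplier */
	int64_t offset;		/* added after scaling */
};

/*
 * Convert a raw TSC sample into a scaled, offset-adjusted time value.
 * Returns false if the page is marked invalid, in which case the caller
 * would have to use another source (e.g. the MSR-based clock).
 */
static bool ref_time_from_tsc(const struct ref_tsc_page *pg, uint64_t tsc,
			      uint64_t *out)
{
	if (pg->sequence == 0)
		return false;

	/* (tsc * scale) >> 64, computed with a 128-bit intermediate product */
	uint64_t scaled = (uint64_t)(((unsigned __int128)tsc * pg->scale) >> 64);

	*out = scaled + (uint64_t)pg->offset;
	return true;
}
```

A real reader would also need to re-check the sequence value after sampling
the fields, so that an update by the hypervisor in the middle of a read
(for example across a migration) is detected and the read retried; that
loop is omitted here for brevity.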