From: Michael Kelley <mikelley@microsoft.com>
To: Marc Zyngier <maz@kernel.org>, Mark Rutland <Mark.Rutland@arm.com>
Cc: "will@kernel.org" <will@kernel.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"lorenzo.pieralisi@arm.com" <lorenzo.pieralisi@arm.com>,
	"sudeep.holla@arm.com" <sudeep.holla@arm.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-efi@vger.kernel.org" <linux-efi@vger.kernel.org>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>,
	KY Srinivasan <kys@microsoft.com>
Subject: RE: [PATCH v10 3/7] arm64: hyperv: Add Hyper-V clocksource/clockevent support
Date: Mon, 28 Jun 2021 02:21:28 +0000	[thread overview]
Message-ID: <MWHPR21MB159396AA55E2270FD4BC4EECD7039@MWHPR21MB1593.namprd21.prod.outlook.com> (raw)
In-Reply-To: <87pmwdariz.wl-maz@kernel.org>

From: Marc Zyngier <maz@kernel.org> Sent: Wednesday, June 23, 2021 1:56 AM
> 
> On Tue, 22 Jun 2021 10:54:12 +0100,
> Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > Hi Michael,
> >
> > Thanks for all this; comments inline below. I've added Marc Zyngier, who
> > co-maintains the architected timer code.
> >
> > On Mon, Jun 14, 2021 at 02:42:23AM +0000, Michael Kelley wrote:
> > > From: Mark Rutland <mark.rutland@arm.com> Sent: Thursday, June 10, 2021 9:45 AM
> > > > On Tue, Jun 08, 2021 at 03:36:06PM +0000, Michael Kelley wrote:
> > > > > I've had a couple of rounds of discussions with the Hyper-V team.  For
> > > > > the clocksource we've agreed to table the live migration discussion, and
> > > > > I'll resubmit the code so that arm_arch_timer.c provides the
> > > > > standard arch_sys_counter clocksource.  As noted previously, this just
> > > > > works for a Hyper-V guest.  The live migration discussion may come
> > > > > back later after a deeper investigation by Hyper-V.
> > > >
> > > > Great; thanks for this!
> > > >
> > > > > For clockevents, there's not a near term fix.  It's more than just plumbing
> > > > > an interrupt for Hyper-V to virtualize the ARM64 arch timer in a guest VM.
> > > > > From their perspective there's also benefit in having a timer abstraction
> > > > > that's independent of the architecture, and in the Linux guest, the STIMER
> > > > > code is common across x86/x64 and ARM64.  It follows the standard Linux
> > > > > clockevents model, as it should. The code is already in use in out-of-tree
> > > > > builds in the Linux VMs included in Windows 10 on ARM64 as part of the
> > > > > so-called "Windows Subsystem for Linux".
> > > > >
> > > > > So I'm hoping we can get this core support for ARM64 guests on Hyper-V
> > > > > into upstream using the existing STIMER support.  At some point, Hyper-V
> > > > > will do the virtualization of the ARM64 arch timer, but we don't want to
> > > > > have to stay out-of-tree until after that happens.
> > > >
> > > > My main concern here is making sure that we can rely on architected
> > > > properties, and don't have to special-case architected bits for hyperv
> > > > (or any other hypervisor), since that inevitably causes longer-term
> > > > pain.
> > > >
> > > > While in abstract I'm not as worried about using the timer
> > > > clock_event_device specifically, that same driver provides the
> > > > clocksource and the event stream, and I want those to work as usual,
> > > > without being tied into the hyperv code. IIUC that will require some
> > > > work, since the driver won't register if the GTDT is missing timer
> > > > interrupts (or if there is no GTDT).
> > > >
> > > > I think it really depends on what that looks like.
> > >
> > > Mark,
> > >
> > > Here are the details:
> > >
> > > The existing initialization and registration code in arm_arch_timer.c
> > > works in a Hyper-V guest with no changes.  As previously mentioned,
> > > the GTDT exists and is correctly populated.  Even though it isn't used,
> > > there's a PPI INTID specified for the virtual timer, just so
> > > the "arm_sys_timer" clockevent can be initialized and registered.
> > > The IRQ shows up in the output of "cat /proc/interrupts" with zero counts
> > > for all CPUs since no interrupts are ever generated. The EL1 virtual
> > > timer registers (CNTV_CVAL_EL0, CNTV_TVAL_EL0, and CNTV_CTL_EL0)
> > > are accessible in the VM.  The "arm_sys_timer" clockevent is left in
> > > a shutdown state with CNTV_CTL_EL0.ENABLE set to zero when the
> > > Hyper-V STIMER clockevent is registered with a higher rating.
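[For anyone reproducing the behavior described above, the zero-count PPI and
the active clockevent can be checked from inside the guest. A minimal sketch;
the "arch_timer" label is the usual name for the EL1 virtual timer line in
/proc/interrupts, and the sysfs path is the standard clockevents interface:]

```shell
# Look for the arch timer PPI in /proc/interrupts; in the Hyper-V guest
# described above, every per-CPU count stays at zero because the STIMER
# clockevent (registered with a higher rating) handles all timer interrupts.
grep -i 'arch_timer' /proc/interrupts || echo "arch_timer line not present"

# The currently active per-CPU clockevent, if the sysfs node exists:
CE=/sys/devices/system/clockevents/clockevent0/current_device
{ [ -r "$CE" ] && cat "$CE"; } || echo "clockevents sysfs not present"
```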
> >
> > This concerns me, since we're lying to the kernel, and assuming that it
> > will never try to use this timer. I appreciate that evidently we don't
> > happen to rely on that today if you register a higher priority timer,
> > but that does open us up to future fragility (e.g. if we added sanity
> > checks when registering timers), and IIRC there are ways for userspace
> > to change the clockevent device today.
> 
> Indeed. Userspace can perfectly unbind the clockevent using
> /sys/devices/system/clockevents/clockevent*/unbind_device, and the
> kernel will be happy to switch to the next per-cpu timer, which
> happens to be the arch timer. Oh wait...
> 
> >
> > > Event streams are initialized and the __delay() implementation
> > > for ARM64 inside the kernel works.  However, on the Ampere
> > > eMAG hardware I'm using for testing, the WFE instruction returns
> > > more quickly than it should even though the event stream fields in
> > > CNTKCTL_EL1 are correct.  I have a query in to the Hyper-V team
> > > to see if they are trapping WFE and just returning, vs. perhaps the
> > > eMAG processor takes the easy way out and has WFE just return
> > > immediately.  I'm not knowledgeable about other uses of timer
> > > event streams, so let me know if there are other usage scenarios
> > > I should check.
> >
> > I saw your reply confirming that this is generally working as expected
> > (and that Hyper-V is not trapping WFE) so this sounds fine to me.
> >
> > > Finally, the "arch_sys_counter" clocksource gets initialized and
> > > setup correctly.  If the Hyper-V clocksource is also initialized,
> > > you can flip between the two clocksources at runtime as expected.
> > > If the Hyper-V clocksource is not setup, then Linux in the VM runs
> > > fine with the "arch_sys_counter" clocksource.
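[The runtime flip between the two clocksources mentioned above uses the
standard clocksource sysfs interface. A sketch of that check; the write needs
root and the Hyper-V clocksource name shown in the comment is illustrative,
use whatever available_clocksource actually reports:]

```shell
SYSFS=/sys/devices/system/clocksource/clocksource0
if [ -r "$SYSFS/available_clocksource" ]; then
    # Lists all registered clocksources, e.g. "arch_sys_counter ..."
    cat "$SYSFS/available_clocksource"
    # Switching to the arch counter (root only; name must appear above):
    # echo arch_sys_counter > "$SYSFS/current_clocksource"
    cat "$SYSFS/current_clocksource"
else
    echo "clocksource sysfs not present"
fi
```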
> >
> > Great!
> >
> > As above, my remaining concern here is fragility around the
> > clockevent_device; I'm not keen that we're lying (in the GTDT) that
> > interrupts are wired up when they're not functional, and while you can get
> > away with that today, that relies on kernel implementation details that
> > could change.
> >
> > Ideally, Hyper-V would provide the architectural timer (as it's already
> > claiming to in the GTDT), things would "just work", and the Hyper-V
> > timer would be an optimization rather than a functional necessity.
> >
> > You mentioned above that Hyper-V will virtualize the timer "at some
> > point" -- is that already planned, and when is that likely to be?
> >
> > Marc, do you have any thoughts on this?
> 
> Overall, lying to the kernel is a bad idea. Only implementing half of
> the architecture is another bad idea. I doubt the combination of two
> bad ideas produces a good one.
> 
> If Hyper-V guests need to use another timer (for migration purposes?),
> that's fine. But we rely on both the base architecture to be
> completely implemented *and* on advertised features to be functional.
> I think this has been our position since the first Hyper-V patches
> were posted... 3 years ago?
> 
> What is the hold up for reliably virtualising the arch timer,
> including interrupt delivery?

Marc (and Mark) --

In our early interactions about the Hyper-V clocks and timers, the code
was a bit spread out, and you suggested moving all the clocksource
and clockevent stuff to a driver under drivers/clocksource.  See
https://lore.kernel.org/lkml/e0374a07-809c-cabd-2eb6-e6b5ad84742e@arm.com/.
That was a good change independent of any ARM64 considerations,
but I read (or perhaps overread) your comments to say that it was OK
to use these Hyper-V para-virtualized clocks/timers instead of the ARM64
architectural ones in a Hyper-V VM.  They work and it's what the Hyper-V
guys wanted to do anyway, so having Hyper-V offer the ARM64 arch
counter and timer in a VM hasn't been a priority.  They had other stuff that
didn't work at all on ARM64, so that's where their attention went.

I agree that it would be better to have the ARM64 arch counter/timer
fully implemented in a Hyper-V VM.  But we want to find a practical
way to move forward, and in doing so confine any rough edges to Hyper-V
VMs and the Hyper-V specific code in the kernel tree. We're maintaining
and shipping the code out-of-tree based on Hyper-V ARM64 current behavior
and would like to get this core enablement code upstream. Sure, unbinding
the Hyper-V clockevent doesn't work, but that's not a problem in any use
cases we see from customers.
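
[For reference, the unbind path Marc mentions earlier in the thread is the
clockevents sysfs interface. A sketch of what that interaction looks like;
device names vary by platform, and the unbind write needs root, so it is
left commented out:]

```shell
# Enumerate per-CPU clockevent devices and show which driver backs each one.
for d in /sys/devices/system/clockevents/clockevent*; do
    [ -r "$d/current_device" ] || continue
    printf '%s: ' "$d"
    cat "$d/current_device"
    # Unbinding (root only) forces a switch to the next highest-rated timer:
    # cat "$d/current_device" > "$d/unbind_device"
done
```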

All that said, our discussions with the Hyper-V team are continuing.  We're
still in the process of seeing what's practical to get and when.

Michael

> 
> Thanks,
> 
> 	M.
> 
> --
> Without deviation from the norm, progress is not possible.

  reply	other threads:[~2021-06-28  2:21 UTC|newest]

Thread overview: 52+ messages
2021-05-12 17:37 [PATCH v10 0/7] Enable Linux guests on Hyper-V on ARM64 Michael Kelley
2021-05-12 17:37 ` [PATCH v10 1/7] asm-generic: hyperv: Fix incorrect architecture dependencies Michael Kelley
2021-05-12 17:37 ` [PATCH v10 2/7] arm64: hyperv: Add Hyper-V hypercall and register access utilities Michael Kelley
2021-05-14 12:52   ` Mark Rutland
2021-05-14 15:14     ` Michael Kelley
2021-05-17 11:44       ` Mark Rutland
2021-05-17 16:41         ` Michael Kelley
2021-05-12 17:37 ` [PATCH v10 3/7] arm64: hyperv: Add Hyper-V clocksource/clockevent support Michael Kelley
2021-05-14 12:37   ` Mark Rutland
2021-05-14 15:35     ` Michael Kelley
2021-05-17 13:08       ` Mark Rutland
2021-05-17 17:27         ` Michael Kelley
2021-05-18 17:00           ` Mark Rutland
2021-06-08 15:36             ` Michael Kelley
2021-06-10 16:45               ` Mark Rutland
2021-06-14  2:42                 ` Michael Kelley
2021-06-16 20:17                   ` Michael Kelley
2021-06-22  9:54                   ` Mark Rutland
2021-06-23  8:56                     ` Marc Zyngier
2021-06-28  2:21                       ` Michael Kelley [this message]
2021-05-12 17:37 ` [PATCH v10 4/7] arm64: hyperv: Add kexec and panic handlers Michael Kelley
2021-05-12 17:37 ` [PATCH v10 5/7] arm64: hyperv: Initialize hypervisor on boot Michael Kelley
2021-05-13 15:16   ` Wei Liu
2021-05-12 17:37 ` [PATCH v10 6/7] arm64: efi: Export screen_info Michael Kelley
2021-05-12 17:37 ` [PATCH v10 7/7] Drivers: hv: Enable Hyper-V code to be built on ARM64 Michael Kelley
2021-05-13 13:17 ` [PATCH v10 0/7] Enable Linux guests on Hyper-V " Sudeep Holla
