Re: [PATCH v4 01/12] x86/rtc: drop code related to strict mode

From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wl@xen.org>,
	<xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v4 01/12] x86/rtc: drop code related to strict mode
Date: Mon, 3 May 2021 16:47:55 +0200
Message-ID: <YJANG3LeuA3Ygt/Q@Air-de-Roger>
In-Reply-To: <5b06565e-1f2e-3498-c18f-e7eac0042761@suse.com>

On Mon, May 03, 2021 at 02:26:51PM +0200, Jan Beulich wrote:
> On 03.05.2021 11:28, Roger Pau Monné wrote:
> > On Thu, Apr 29, 2021 at 04:53:07PM +0200, Jan Beulich wrote:
> >> On 20.04.2021 16:07, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/hvm/rtc.c
> >>> +++ b/xen/arch/x86/hvm/rtc.c
> >>> @@ -46,15 +46,6 @@
> >>>  #define epoch_year     1900
> >>>  #define get_year(x)    (x + epoch_year)
> >>>  
> >>> -enum rtc_mode {
> >>> -   rtc_mode_no_ack,
> >>> -   rtc_mode_strict
> >>> -};
> >>> -
> >>> -/* This must be in sync with how hvmloader sets the ACPI WAET flags. */
> >>> -#define mode_is(d, m) ((void)(d), rtc_mode_##m == rtc_mode_no_ack)
> >>> -#define rtc_mode_is(s, m) mode_is(vrtc_domain(s), m)
> >>
> >> Leaving aside my concerns about this removal, I think some form of
> >> reference to hvmloader and its respective behavior should remain
> >> here, presumably in form of a (replacement) comment.
> > 
> > What about adding a comment in rtc_pf_callback:
> > 
> > /*
> >  * The current RTC implementation will inject an interrupt regardless
> >  * of whether REG_C has been read since the last interrupt was
> >  * injected. This is why the ACPI WAET 'RTC good' flag must be
> >  * unconditionally set by hvmloader.
> >  */
> 
> For one I'm unconvinced this is "must"; I think it is "may". We're
> producing excess interrupts for an unaware guest, aiui. Presumably most
> guests can tolerate this, but - second - it may be unnecessary overhead.
> Which in turn may be why nobody has complained so far, as this sort of
> overhead may be hard to notice. I also suspect the RTC may not be used
> very often for generating a periodic interrupt.

I agree that there might be some overhead here, but requiring the
guest to read REG_C in order to receive further interrupts also seems
like quite a lot of overhead because of all the interception involved.
IMO it's best to unconditionally offer the no_ack mode (like Xen has
been doing).

Also strict_mode wasn't really behaving according to the spec either,
as it would inject up to 10 interrupts without the guest having read
REG_C.

> (I've also not seen the
> flag named "RTC good" - the ACPI constant is ACPI_WAET_RTC_NO_ACK, for
> example.)

I'm reading the WAET spec as published by Microsoft:

http://msdn.microsoft.com/en-us/windows/hardware/gg487524.aspx

Where the flag is listed as 'RTC good'. Maybe that's outdated now?
It seems to be the official source for the specification, as linked
from https://uefi.org/acpi.

> >>> @@ -337,8 +336,7 @@ int pt_update_irq(struct vcpu *v)
> >>>      {
> >>>          if ( pt->pending_intr_nr )
> >>>          {
> >>> -            /* RTC code takes care of disabling the timer itself. */
> >>> -            if ( (pt->irq != RTC_IRQ || !pt->priv) && pt_irq_masked(pt) &&
> >>> +            if ( pt_irq_masked(pt) &&
> >>>                   /* Level interrupts should be asserted even if masked. */
> >>>                   !pt->level )
> >>>              {
> >>
> >> I'm struggling to relate this to any other part of the patch. In
> >> particular I can't find the case where a periodic timer would be
> >> registered with RTC_IRQ and a NULL private pointer. The only use
> >> I can find is with a non-NULL pointer, which would mean the "else"
> >> path is always taken at present for the RTC case (which you now
> >> change).
> > 
> > Right, the else case was always taken because as the comment noted RTC
> > would take care of disabling itself (by calling destroy_periodic_time
> > from the callback when using strict_mode). When no_ack mode was
> > implemented this wasn't taken into account AFAICT, and thus the RTC
> > was never removed from the list even when masked.
> > 
> > I think with no_ack mode the RTC shouldn't have this specific handling
> > in pt_update_irq, as it should behave like any other virtual timer.
> > I could try to split this as a separate bugfix, but then I would have
> > to teach pt_update_irq to differentiate between strict_mode and no_ack
> > mode.
> 
> A fair part of my confusion was about "&& !pt->priv".

I think you meant "|| !pt->priv"?

> I've looked back
> at 9607327abbd3 ("x86/HVM: properly handle RTC periodic timer even when
> !RTC_PIE"), where this was added. It was, afaict, to cover for
> hpet_set_timer() passing NULL with RTC_IRQ.

That's tricky, as hpet_set_timer hardcodes 8 instead of using RTC_IRQ,
which makes it really easy to miss.

> Which makes me suspect that
> be07023be115 ("x86/vhpet: add support for level triggered interrupts")
> may have subtly broken things.

Right - as that would have made the RTC IRQ, when generated from the
HPET, no longer be suspended when masked (as pt->priv would no longer
be NULL). Could be fixed with:

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index ca94e8b4538..f2cbd12f400 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -318,7 +318,8 @@ static void hpet_set_timer(HPETState *h, unsigned int tn,
                          hpet_tick_to_ns(h, diff),
                          oneshot ? 0 : hpet_tick_to_ns(h, h->hpet.period[tn]),
                          irq, timer_level(h, tn) ? hpet_timer_fired : NULL,
-                         (void *)(unsigned long)tn, timer_level(h, tn));
+                         timer_level(h, tn) ? (void *)(unsigned long)tn : NULL,
+                         timer_level(h, tn));
 }
 
 static inline uint64_t hpet_fixup_reg(

This again passes NULL as the callback private data for edge triggered
interrupts.
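
To make the interaction easier to follow, here is a stand-alone sketch
of the suspend condition in pt_update_irq before and after this patch.
The struct and the sketch_* names are made up for illustration and are
not the Xen definitions; only the shape of the condition is taken from
the hunk quoted above, and the value 8 is the one hpet_set_timer
hardcodes.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_RTC_IRQ 8 /* same value hpet_set_timer hardcodes */

/* Hypothetical, heavily reduced stand-in for struct periodic_time. */
struct sketch_pt {
    unsigned int irq;  /* interrupt line the timer is bound to */
    void *priv;        /* callback private data, may be NULL */
    bool masked;       /* result of pt_irq_masked() for this timer */
    bool level;        /* level triggered interrupt */
};

/* Condition before the patch: RTC_IRQ with non-NULL priv is never suspended. */
static bool sketch_old_suspends(const struct sketch_pt *pt)
{
    return (pt->irq != SKETCH_RTC_IRQ || !pt->priv) &&
           pt->masked && !pt->level;
}

/* Condition after the patch: the RTC is treated like any other timer. */
static bool sketch_new_suspends(const struct sketch_pt *pt)
{
    return pt->masked && !pt->level;
}
```

With the old condition a masked, edge triggered IRQ 8 timer is only
suspended while priv is NULL, which is why passing a non-NULL pointer
from hpet_set_timer changed the behaviour.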

> > Would you be fine if the following is added to the commit message
> > instead:
> > 
> > "Note that the special handling of the RTC timer done in pt_update_irq
> > is wrong for the no_ack mode, as the RTC timer callback won't disable
> > the timer anymore when it detects the guest is not reading REG_C. As
> > such remove the code as part of the removal of strict_mode, and don't
> > special case the RTC timer anymore in pt_update_irq."
> 
> Not sure yet - as per above I'm still not convinced this part of the
> change is correct.

I believe part of this handling is kind of bogus - for example I'm
unsure Xen should account masked interrupt injections as missed ticks.
A guest might decide to mask its interrupt source for whatever
reason, and then it shouldn't receive a flurry of interrupts when
unmasked. I.e.: missed ticks should only be accounted for interrupts
that should have been delivered but the guest wasn't scheduled. I
think such a model would also simplify some of the logic that we
currently have.

In fact I have a patch on top of this current series, which I haven't
posted yet, that does implement this new mode of not accounting masked
interrupts as missed ticks to be delivered later.
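
As a rough illustration of the model (this is not the unposted patch,
just a sketch with made-up names and a simplified timer structure),
accounting could look something like the following. It assumes the
masked state is sampled at the point the ticks are accounted, which is
a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a periodic timer. */
struct sketch_timer {
    uint64_t period_ns;    /* timer period, must be non-zero */
    uint64_t next_due_ns;  /* absolute time of the next tick */
    unsigned int pending;  /* ticks waiting to be injected */
    bool masked;           /* guest has masked the interrupt source */
};

/*
 * Advance the timer to now_ns. Ticks that elapsed while the source is
 * masked are dropped; ticks that elapsed while unmasked (for example
 * because the vCPU wasn't scheduled) accumulate as pending.
 */
static void sketch_account_ticks(struct sketch_timer *t, uint64_t now_ns)
{
    uint64_t elapsed;

    assert(t->period_ns != 0);

    if ( now_ns < t->next_due_ns )
        return; /* nothing due yet */

    /* Number of periods that have expired, including the one just due. */
    elapsed = (now_ns - t->next_due_ns) / t->period_ns + 1;
    t->next_due_ns += elapsed * t->period_ns;

    if ( !t->masked )
        t->pending += elapsed;
}
```

The point is that the deadline always moves forward, while the pending
count only grows for ticks the guest could actually have received.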

Thanks, Roger.