All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] x86: RTC handling adjustments
@ 2020-07-15 11:54 Jan Beulich
  2020-07-15 11:56 ` [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation Jan Beulich
                   ` (4 more replies)
  0 siblings, 5 replies; 42+ messages in thread
From: Jan Beulich @ 2020-07-15 11:54 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, Roger Pau Monné

The first patch addresses a recent regression and hence ought to be
considered for 4.14, despite it getting late. I noticed the problem
while re-basing the 2nd patch here, which I decided to now re-post
despite the original discussion on v1 not having lead to any clear
result (i.e. the supposed "leave Dom0 handle the RTC" has never
materialized over the past almost 5 years), while I still think
that the changed code is at least a little bit of improvement.

1: restore pv_rtc_handler() invocation
2: detect CMOS aliasing on ports other than 0x70/0x71

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 11:54 [PATCH v3 0/2] x86: RTC handling adjustments Jan Beulich
@ 2020-07-15 11:56 ` Jan Beulich
  2020-07-15 12:13   ` Roger Pau Monné
  2020-07-15 12:31   ` Paul Durrant
  2020-07-15 11:57 ` [PATCH v3 2/2] x86: detect CMOS aliasing on ports other than 0x70/0x71 Jan Beulich
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 42+ messages in thread
From: Jan Beulich @ 2020-07-15 11:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Paul Durrant

This was lost when making the logic accessible to PVH Dom0.

While doing so make the access to the global function pointer safe
against races (as noticed by Roger): The only current user wants to be
invoked just once (but can tolerate to be invoked multiple times),
zapping the pointer at that point.

Fixes: 835d8d69d96a ("x86/rtc: provide mediated access to RTC for PVH dom0")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Latch pointer under lock.
v2: New.

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1148,6 +1148,8 @@ void rtc_guest_write(unsigned int port,
 
     switch ( port )
     {
+        typeof(pv_rtc_handler) hook;
+
     case RTC_PORT(0):
         /*
          * All PV domains (and PVH dom0) are allowed to write to the latched
@@ -1160,6 +1162,14 @@ void rtc_guest_write(unsigned int port,
     case RTC_PORT(1):
         if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
             break;
+
+        spin_lock_irqsave(&rtc_lock, flags);
+        hook = pv_rtc_handler;
+        spin_unlock_irqrestore(&rtc_lock, flags);
+
+        if ( hook )
+            hook(currd->arch.cmos_idx & 0x7f, data);
+
         spin_lock_irqsave(&rtc_lock, flags);
         outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
         outb(data, RTC_PORT(1));



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 2/2] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2020-07-15 11:54 [PATCH v3 0/2] x86: RTC handling adjustments Jan Beulich
  2020-07-15 11:56 ` [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation Jan Beulich
@ 2020-07-15 11:57 ` Jan Beulich
  2020-07-20 11:11   ` Roger Pau Monné
  2023-03-20  8:32 ` [PATCH v4] " Jan Beulich
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2020-07-15 11:57 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Paul Durrant

... in order to also intercept accesses through the alias ports.

Also stop intercepting accesses to the CMOS ports if we won't ourselves
use the CMOS RTC.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Re-base over change to earlier patch.
v2: Re-base.

--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -670,6 +670,80 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
     return ret;
 }
 
+#ifndef COMPAT
+#include <asm/mc146818rtc.h>
+
+unsigned int __read_mostly cmos_alias_mask;
+
+static int __init probe_cmos_alias(void)
+{
+    unsigned int i, offs;
+
+    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
+        return 0;
+
+    for ( offs = 2; offs < 8; offs <<= 1 )
+    {
+        bool read = true;
+
+        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
+        {
+            uint8_t normal, alt;
+            unsigned long flags;
+
+            if ( i == acpi_gbl_FADT.century )
+                continue;
+
+            spin_lock_irqsave(&rtc_lock, flags);
+
+            normal = CMOS_READ(i);
+            if ( inb(RTC_PORT(offs)) != i )
+                read = false;
+
+            alt = inb(RTC_PORT(offs + 1));
+
+            spin_unlock_irqrestore(&rtc_lock, flags);
+
+            if ( normal != alt )
+                break;
+
+            process_pending_softirqs();
+        }
+        if ( i == 0x80 )
+        {
+            cmos_alias_mask |= offs;
+            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
+                   RTC_PORT(offs), read ? "r/w" : "w/o");
+        }
+    }
+
+    return 0;
+}
+__initcall(probe_cmos_alias);
+
+/* Has the administrator granted sufficient permission for this I/O access? */
+bool admin_io_okay(unsigned int port, unsigned int bytes,
+                   const struct domain *d)
+{
+    /*
+     * Port 0xcf8 (CONFIG_ADDRESS) is only visible for DWORD accesses.
+     * We never permit direct access to that register.
+     */
+    if ( (port == 0xcf8) && (bytes == 4) )
+        return false;
+
+    /*
+     * We also never permit direct access to the RTC/CMOS registers
+     * if we may be accessing the RTC ones ourselves.
+     */
+    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
+         ((port & ~(cmos_alias_mask | 1)) == RTC_PORT(0)) )
+        return false;
+
+    return ioports_access_permitted(d, port, port + bytes - 1);
+}
+#endif /* COMPAT */
+
 /*
  * Local variables:
  * mode: C
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -198,24 +198,6 @@ static bool guest_io_okay(unsigned int p
     return false;
 }
 
-/* Has the administrator granted sufficient permission for this I/O access? */
-static bool admin_io_okay(unsigned int port, unsigned int bytes,
-                          const struct domain *d)
-{
-    /*
-     * Port 0xcf8 (CONFIG_ADDRESS) is only visible for DWORD accesses.
-     * We never permit direct access to that register.
-     */
-    if ( (port == 0xcf8) && (bytes == 4) )
-        return false;
-
-    /* We also never permit direct access to the RTC/CMOS registers. */
-    if ( ((port & ~1) == RTC_PORT(0)) )
-        return false;
-
-    return ioports_access_permitted(d, port, port + bytes - 1);
-}
-
 static bool pci_cfg_ok(struct domain *currd, unsigned int start,
                        unsigned int size, uint32_t *write)
 {
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -48,7 +48,7 @@
 #include <xen/cpu.h>
 #include <asm/nmi.h>
 #include <asm/alternative.h>
-#include <asm/mc146818rtc.h>
+#include <asm/iocap.h>
 #include <asm/cpuid.h>
 #include <asm/spec_ctrl.h>
 #include <asm/guest.h>
@@ -2009,37 +2009,33 @@ int __hwdom_init xen_in_range(unsigned l
 static int __hwdom_init io_bitmap_cb(unsigned long s, unsigned long e,
                                      void *ctx)
 {
-    struct domain *d = ctx;
+    const struct domain *d = ctx;
     unsigned int i;
 
     ASSERT(e <= INT_MAX);
     for ( i = s; i <= e; i++ )
-        __clear_bit(i, d->arch.hvm.io_bitmap);
+        if ( admin_io_okay(i, 1, d) )
+            __clear_bit(i, d->arch.hvm.io_bitmap);
 
     return 0;
 }
 
 void __hwdom_init setup_io_bitmap(struct domain *d)
 {
-    int rc;
+    if ( !is_hvm_domain(d) )
+        return;
 
-    if ( is_hvm_domain(d) )
-    {
-        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
-        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
-                                    io_bitmap_cb, d);
-        BUG_ON(rc);
-        /*
-         * NB: we need to trap accesses to 0xcf8 in order to intercept
-         * 4 byte accesses, that need to be handled by Xen in order to
-         * keep consistency.
-         * Access to 1 byte RTC ports also needs to be trapped in order
-         * to keep consistency with PV.
-         */
-        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
-    }
+    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
+    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
+                                io_bitmap_cb, d) )
+        BUG();
+
+    /*
+     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
+     * guest_io_read(), and guest_io_write()), which isn't covered by
+     * the admin_io_okay() check in io_bitmap_cb().
+     */
+    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
 }
 
 /*
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1092,7 +1092,10 @@ static unsigned long get_cmos_time(void)
         if ( seconds < 60 )
         {
             if ( rtc.sec != seconds )
+            {
                 cmos_rtc_probe = false;
+                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
+            }
             break;
         }
 
@@ -1114,7 +1117,7 @@ unsigned int rtc_guest_read(unsigned int
     unsigned long flags;
     unsigned int data = ~0;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
     case RTC_PORT(0):
         /*
@@ -1126,11 +1129,12 @@ unsigned int rtc_guest_read(unsigned int
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        data = inb(RTC_PORT(1));
+        outb(currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1))),
+             port - 1);
+        data = inb(port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 
@@ -1146,9 +1150,10 @@ void rtc_guest_write(unsigned int port,
     struct domain *currd = current->domain;
     unsigned long flags;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
         typeof(pv_rtc_handler) hook;
+        unsigned int idx;
 
     case RTC_PORT(0):
         /*
@@ -1160,19 +1165,21 @@ void rtc_guest_write(unsigned int port,
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
 
         spin_lock_irqsave(&rtc_lock, flags);
         hook = pv_rtc_handler;
         spin_unlock_irqrestore(&rtc_lock, flags);
 
+        idx = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1)));
+
         if ( hook )
-            hook(currd->arch.cmos_idx & 0x7f, data);
+            hook(idx, data);
 
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        outb(data, RTC_PORT(1));
+        outb(idx, port - 1);
+        outb(data, port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 
--- a/xen/include/asm-x86/iocap.h
+++ b/xen/include/asm-x86/iocap.h
@@ -18,4 +18,7 @@
     (!rangeset_is_empty((d)->iomem_caps) ||             \
      !rangeset_is_empty((d)->arch.ioport_caps))
 
+bool admin_io_okay(unsigned int port, unsigned int bytes,
+                   const struct domain *d);
+
 #endif /* __X86_IOCAP_H__ */
--- a/xen/include/asm-x86/mc146818rtc.h
+++ b/xen/include/asm-x86/mc146818rtc.h
@@ -9,6 +9,8 @@
 
 extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
 
+extern unsigned int cmos_alias_mask;
+
 /**********************************************************************
  * register summary
  **********************************************************************/



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 11:56 ` [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation Jan Beulich
@ 2020-07-15 12:13   ` Roger Pau Monné
  2020-07-15 12:36     ` Jan Beulich
  2020-07-15 12:31   ` Paul Durrant
  1 sibling, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2020-07-15 12:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Wed, Jul 15, 2020 at 01:56:47PM +0200, Jan Beulich wrote:
> This was lost when making the logic accessible to PVH Dom0.
> 
> While doing so make the access to the global function pointer safe
> against races (as noticed by Roger): The only current user wants to be
> invoked just once (but can tolerate to be invoked multiple times),
> zapping the pointer at that point.
> 
> Fixes: 835d8d69d96a ("x86/rtc: provide mediated access to RTC for PVH dom0")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Thanks, sorry I have one comment below.

> ---
> v3: Latch pointer under lock.
> v2: New.
> 
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1148,6 +1148,8 @@ void rtc_guest_write(unsigned int port,
>  
>      switch ( port )
>      {
> +        typeof(pv_rtc_handler) hook;

Nit: FWIW, given the current structure of the function I would just have placed
it together with the rest of the top-level local variables.

> +
>      case RTC_PORT(0):
>          /*
>           * All PV domains (and PVH dom0) are allowed to write to the latched
> @@ -1160,6 +1162,14 @@ void rtc_guest_write(unsigned int port,
>      case RTC_PORT(1):
>          if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
>              break;
> +
> +        spin_lock_irqsave(&rtc_lock, flags);
> +        hook = pv_rtc_handler;
> +        spin_unlock_irqrestore(&rtc_lock, flags);

Given that clearing the pv_rtc_handler variable in handle_rtc_once is
not done while holding the rtc_lock, I'm not sure there's much point
in holding the lock here, ie: just doing something like:

hook = pv_rtc_handler;
if ( hook )
    hook(currd->arch.cmos_idx & 0x7f, data);

Should be as safe as what you do.

We also assume that setting pv_rtc_handler to NULL is an atomic
operation.

Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 11:56 ` [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation Jan Beulich
  2020-07-15 12:13   ` Roger Pau Monné
@ 2020-07-15 12:31   ` Paul Durrant
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Durrant @ 2020-07-15 12:31 UTC (permalink / raw)
  To: 'Jan Beulich', xen-devel
  Cc: 'Andrew Cooper', 'Wei Liu',
	'Roger Pau Monné'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 15 July 2020 12:57
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Paul Durrant <paul@xen.org>; Wei Liu <wl@xen.org>;
> Roger Pau Monné <roger.pau@citrix.com>
> Subject: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
> 
> This was lost when making the logic accessible to PVH Dom0.
> 
> While doing so make the access to the global function pointer safe
> against races (as noticed by Roger): The only current user wants to be
> invoked just once (but can tolerate to be invoked multiple times),
> zapping the pointer at that point.
> 
> Fixes: 835d8d69d96a ("x86/rtc: provide mediated access to RTC for PVH dom0")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v3: Latch pointer under lock.
> v2: New.
> 
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1148,6 +1148,8 @@ void rtc_guest_write(unsigned int port,
> 
>      switch ( port )
>      {
> +        typeof(pv_rtc_handler) hook;
> +
>      case RTC_PORT(0):
>          /*
>           * All PV domains (and PVH dom0) are allowed to write to the latched
> @@ -1160,6 +1162,14 @@ void rtc_guest_write(unsigned int port,
>      case RTC_PORT(1):
>          if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
>              break;
> +
> +        spin_lock_irqsave(&rtc_lock, flags);
> +        hook = pv_rtc_handler;
> +        spin_unlock_irqrestore(&rtc_lock, flags);
> +
> +        if ( hook )
> +            hook(currd->arch.cmos_idx & 0x7f, data);
> +
>          spin_lock_irqsave(&rtc_lock, flags);
>          outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
>          outb(data, RTC_PORT(1));

LGTM..

Release-acked-by: Paul Durrant <paul@xen.org>




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 12:13   ` Roger Pau Monné
@ 2020-07-15 12:36     ` Jan Beulich
  2020-07-15 13:32       ` Roger Pau Monné
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2020-07-15 12:36 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 15.07.2020 14:13, Roger Pau Monné wrote:
> On Wed, Jul 15, 2020 at 01:56:47PM +0200, Jan Beulich wrote:
>> @@ -1160,6 +1162,14 @@ void rtc_guest_write(unsigned int port,
>>      case RTC_PORT(1):
>>          if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
>>              break;
>> +
>> +        spin_lock_irqsave(&rtc_lock, flags);
>> +        hook = pv_rtc_handler;
>> +        spin_unlock_irqrestore(&rtc_lock, flags);
> 
> Given that clearing the pv_rtc_handler variable in handle_rtc_once is
> not done while holding the rtc_lock, I'm not sure there's much point
> in holding the lock here, ie: just doing something like:
> 
> hook = pv_rtc_handler;
> if ( hook )
>     hook(currd->arch.cmos_idx & 0x7f, data);
> 
> Should be as safe as what you do.

No, the compiler is free to eliminate the local variable and read
the global one twice (and it may change contents in between) then.
I could use ACCESS_ONCE() or read_atomic() here, but then it would
become quite clear that at the same time ...

> We also assume that setting pv_rtc_handler to NULL is an atomic
> operation.

... this (which isn't different from what we do elsewhere, and we
really can't fix everything at the same time) ought to also become
ACCESS_ONCE() (or write_atomic()).

A compromise might be to use barrier() in place of the locking for
now ...

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 12:36     ` Jan Beulich
@ 2020-07-15 13:32       ` Roger Pau Monné
  2020-07-15 13:51         ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2020-07-15 13:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Wed, Jul 15, 2020 at 02:36:49PM +0200, Jan Beulich wrote:
> On 15.07.2020 14:13, Roger Pau Monné wrote:
> > On Wed, Jul 15, 2020 at 01:56:47PM +0200, Jan Beulich wrote:
> >> @@ -1160,6 +1162,14 @@ void rtc_guest_write(unsigned int port,
> >>      case RTC_PORT(1):
> >>          if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
> >>              break;
> >> +
> >> +        spin_lock_irqsave(&rtc_lock, flags);
> >> +        hook = pv_rtc_handler;
> >> +        spin_unlock_irqrestore(&rtc_lock, flags);
> > 
> > Given that clearing the pv_rtc_handler variable in handle_rtc_once is
> > not done while holding the rtc_lock, I'm not sure there's much point
> > in holding the lock here, ie: just doing something like:
> > 
> > hook = pv_rtc_handler;
> > if ( hook )
> >     hook(currd->arch.cmos_idx & 0x7f, data);
> > 
> > Should be as safe as what you do.
> 
> No, the compiler is free to eliminate the local variable and read
> the global one twice (and it may change contents in between) then.
> I could use ACCESS_ONCE() or read_atomic() here, but then it would
> become quite clear that at the same time ...
> 
> > We also assume that setting pv_rtc_handler to NULL is an atomic
> > operation.
> 
> ... this (which isn't different from what we do elsewhere, and we
> really can't fix everything at the same time) ought to also become
> ACCESS_ONCE() (or write_atomic()).
> 
> A compromise might be to use barrier() in place of the locking for
> now ...

Oh, right. Didn't realize you did it in order to prevent
optimizations. Using the lock seems also quite weird IMO, so I'm not
sure it's much better than just using ACCESS_ONCE (or a barrier).
Anyway, I don't want to delay this any longer, so:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Feel free to change to ACCESS_ONCE or barrier if you think it's
clearer.

Thanks.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 13:32       ` Roger Pau Monné
@ 2020-07-15 13:51         ` Jan Beulich
  2020-07-15 14:51           ` Roger Pau Monné
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2020-07-15 13:51 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 15.07.2020 15:32, Roger Pau Monné wrote:
> On Wed, Jul 15, 2020 at 02:36:49PM +0200, Jan Beulich wrote:
>> On 15.07.2020 14:13, Roger Pau Monné wrote:
>>> On Wed, Jul 15, 2020 at 01:56:47PM +0200, Jan Beulich wrote:
>>>> @@ -1160,6 +1162,14 @@ void rtc_guest_write(unsigned int port,
>>>>      case RTC_PORT(1):
>>>>          if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
>>>>              break;
>>>> +
>>>> +        spin_lock_irqsave(&rtc_lock, flags);
>>>> +        hook = pv_rtc_handler;
>>>> +        spin_unlock_irqrestore(&rtc_lock, flags);
>>>
>>> Given that clearing the pv_rtc_handler variable in handle_rtc_once is
>>> not done while holding the rtc_lock, I'm not sure there's much point
>>> in holding the lock here, ie: just doing something like:
>>>
>>> hook = pv_rtc_handler;
>>> if ( hook )
>>>     hook(currd->arch.cmos_idx & 0x7f, data);
>>>
>>> Should be as safe as what you do.
>>
>> No, the compiler is free to eliminate the local variable and read
>> the global one twice (and it may change contents in between) then.
>> I could use ACCESS_ONCE() or read_atomic() here, but then it would
>> become quite clear that at the same time ...
>>
>>> We also assume that setting pv_rtc_handler to NULL is an atomic
>>> operation.
>>
>> ... this (which isn't different from what we do elsewhere, and we
>> really can't fix everything at the same time) ought to also become
>> ACCESS_ONCE() (or write_atomic()).
>>
>> A compromise might be to use barrier() in place of the locking for
>> now ...
> 
> Oh, right. Didn't realize you did it in order to prevent
> optimizations. Using the lock seems also quite weird IMO, so I'm not
> sure it's much better than just using ACCESS_ONCE (or a barrier).
> Anyway, I don't want to delay this any longer, so:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> Feel free to change to ACCESS_ONCE or barrier if you think it's
> clearer.

I did so (also on the writer side), not the least based on guessing
what Andrew would presumably have preferred.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 13:51         ` Jan Beulich
@ 2020-07-15 14:51           ` Roger Pau Monné
  2020-07-16 10:06             ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2020-07-15 14:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Wed, Jul 15, 2020 at 03:51:17PM +0200, Jan Beulich wrote:
> On 15.07.2020 15:32, Roger Pau Monné wrote:
> > On Wed, Jul 15, 2020 at 02:36:49PM +0200, Jan Beulich wrote:
> >> On 15.07.2020 14:13, Roger Pau Monné wrote:
> >>> On Wed, Jul 15, 2020 at 01:56:47PM +0200, Jan Beulich wrote:
> >>>> @@ -1160,6 +1162,14 @@ void rtc_guest_write(unsigned int port,
> >>>>      case RTC_PORT(1):
> >>>>          if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
> >>>>              break;
> >>>> +
> >>>> +        spin_lock_irqsave(&rtc_lock, flags);
> >>>> +        hook = pv_rtc_handler;
> >>>> +        spin_unlock_irqrestore(&rtc_lock, flags);
> >>>
> >>> Given that clearing the pv_rtc_handler variable in handle_rtc_once is
> >>> not done while holding the rtc_lock, I'm not sure there's much point
> >>> in holding the lock here, ie: just doing something like:
> >>>
> >>> hook = pv_rtc_handler;
> >>> if ( hook )
> >>>     hook(currd->arch.cmos_idx & 0x7f, data);
> >>>
> >>> Should be as safe as what you do.
> >>
> >> No, the compiler is free to eliminate the local variable and read
> >> the global one twice (and it may change contents in between) then.
> >> I could use ACCESS_ONCE() or read_atomic() here, but then it would
> >> become quite clear that at the same time ...
> >>
> >>> We also assume that setting pv_rtc_handler to NULL is an atomic
> >>> operation.
> >>
> >> ... this (which isn't different from what we do elsewhere, and we
> >> really can't fix everything at the same time) ought to also become
> >> ACCESS_ONCE() (or write_atomic()).
> >>
> >> A compromise might be to use barrier() in place of the locking for
> >> now ...
> > 
> > Oh, right. Didn't realize you did it in order to prevent
> > optimizations. Using the lock seems also quite weird IMO, so I'm not
> > sure it's much better than just using ACCESS_ONCE (or a barrier).
> > Anyway, I don't want to delay this any longer, so:
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks.
> 
> > Feel free to change to ACCESS_ONCE or barrier if you think it's
> > clearer.
> 
> I did so (also on the writer side), not the least based on guessing
> what Andrew would presumably have preferred.

Thanks! Sorry I might be pedantic, but is the ACCESS_ONCE on the write
side actually required? I'm not sure I see what ACCESS_ONCE protects
against in handle_rtc_once.

Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-15 14:51           ` Roger Pau Monné
@ 2020-07-16 10:06             ` Jan Beulich
  2020-07-16 10:31               ` Roger Pau Monné
  2020-07-20 15:28               ` Andrew Cooper
  0 siblings, 2 replies; 42+ messages in thread
From: Jan Beulich @ 2020-07-16 10:06 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 15.07.2020 16:51, Roger Pau Monné wrote:
> On Wed, Jul 15, 2020 at 03:51:17PM +0200, Jan Beulich wrote:
>> On 15.07.2020 15:32, Roger Pau Monné wrote:
>>> Feel free to change to ACCESS_ONCE or barrier if you think it's
>>> clearer.
>>
>> I did so (also on the writer side), not the least based on guessing
>> what Andrew would presumably have preferred.
> 
> Thanks! Sorry I might be pedantic, but is the ACCESS_ONCE on the write
> side actually required? I'm not sure I see what ACCESS_ONCE protects
> against in handle_rtc_once.

Well, this is all sort of a mess, I think. We have this mixture of
ACCESS_ONCE() and read_atomic() / write_atomic(), but I don't think
we use them consistently, and I'm not sure either is suitable to
deal with all (theoretical) corner cases.

read_atomic() / write_atomic() guarantee a single insn to be used
to access a piece of data. I'm uncertain whether they also guarantee
single access (i.e. that the compiler won't replicate the asm()-s).
The wording in gcc doc is pretty precise, but not quite enough imo
to be entirely certain.

ACCESS_ONCE() guarantees single access, but doesn't guarantee that
the compiler wouldn't split this single access into multiple insns.
(It's just, like elsewhere, that it would be pretty silly of it if
it did.)

Yesterday, as said, I tried to in particular do what I expect/guess
Andrew would have wanted done. This is despite me not being entirely
convinced this is the right thing to do here, i.e. personally I
would have preferred read_atomic() / write_atomic(), as I think the
intention of what the gcc doc is saying is what we want (taking
into consideration both uses of "volatile" in these helpers).

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-16 10:06             ` Jan Beulich
@ 2020-07-16 10:31               ` Roger Pau Monné
  2020-07-16 10:52                 ` Jan Beulich
  2020-07-20 15:28               ` Andrew Cooper
  1 sibling, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2020-07-16 10:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Thu, Jul 16, 2020 at 12:06:14PM +0200, Jan Beulich wrote:
> On 15.07.2020 16:51, Roger Pau Monné wrote:
> > On Wed, Jul 15, 2020 at 03:51:17PM +0200, Jan Beulich wrote:
> >> On 15.07.2020 15:32, Roger Pau Monné wrote:
> >>> Feel free to change to ACCESS_ONCE or barrier if you think it's
> >>> clearer.
> >>
> >> I did so (also on the writer side), not the least based on guessing
> >> what Andrew would presumably have preferred.
> > 
> > Thanks! Sorry I might be pedantic, but is the ACCESS_ONCE on the write
> > side actually required? I'm not sure I see what ACCESS_ONCE protects
> > against in handle_rtc_once.
> 
> Well, this is all sort of a mess, I think. We have this mixture of
> ACCESS_ONCE() and read_atomic() / write_atomic(), but I don't think
> we use them consistently, and I'm not sure either is suitable to
> deal with all (theoretical) corner cases.
> 
> read_atomic() / write_atomic() guarantee a single insn to be used
> to access a piece of data. I'm uncertain whether they also guarantee
> single access (i.e. that the compiler won't replicate the asm()-s).

Yes, that would be my expectation from my reading of the manual, as
it prevents gcc from: "move it out of loops or omit it on the
assumption that the result from a previous call is still valid".

> The wording in gcc doc is pretty precise, but not quite enough imo
> to be entirely certain.

I agree it's not that precise.

> ACCESS_ONCE() guarantees single access, but doesn't guarantee that
> the compiler wouldn't split this single access into multiple insns.
> (It's just, like elsewhere, that it would be pretty silly of it if
> it did.)
> 
> Yesterday, as said, I tried to in particular do what I expect/guess
> Andrew would have wanted done. This is despite me not being entirely
> convinced this is the right thing to do here, i.e. personally I
> would have preferred read_atomic() / write_atomic(), as I think the
> intention of what the gcc doc is saying is what we want (taking
> into consideration both uses of "volatile" in these helpers).

Well, gcc states:

"Note that the compiler can move even volatile asm instructions
relative to other code, including across jump instructions."

So I think we likely want to use {read/write}_atomic plus a compiler
barrier? I'm not sure anyway how the read of pv_rtc_handler could be
moved, but I guess I'm not that creative :).

AFAICT we require a write_atomic in handle_rtc_once in order to assure
a single instruction is used (no barrier required), and then we
require a read_atomic + a compiler barrier in rtc_guest_write in order
to prevent the compiler from optimizing the accesses to 'hook' in any
way? (that barrier might not be strictly required, as you say it's not
fully clear whether 'asm volatile' doesn't provide the necessary
protection here).

Thanks.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-16 10:31               ` Roger Pau Monné
@ 2020-07-16 10:52                 ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2020-07-16 10:52 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 16.07.2020 12:31, Roger Pau Monné wrote:
> On Thu, Jul 16, 2020 at 12:06:14PM +0200, Jan Beulich wrote:
>> On 15.07.2020 16:51, Roger Pau Monné wrote:
>>> On Wed, Jul 15, 2020 at 03:51:17PM +0200, Jan Beulich wrote:
>>>> On 15.07.2020 15:32, Roger Pau Monné wrote:
>>>>> Feel free to change to ACCESS_ONCE or barrier if you think it's
>>>>> clearer.
>>>>
>>>> I did so (also on the writer side), not the least based on guessing
>>>> what Andrew would presumably have preferred.
>>>
>>> Thanks! Sorry I might be pedantic, but is the ACCESS_ONCE on the write
>>> side actually required? I'm not sure I see what ACCESS_ONCE protects
>>> against in handle_rtc_once.
>>
>> Well, this is all sort of a mess, I think. We have this mixture of
>> ACCESS_ONCE() and read_atomic() / write_atomic(), but I don't think
>> we use them consistently, and I'm not sure either is suitable to
>> deal with all (theoretical) corner cases.
>>
>> read_atomic() / write_atomic() guarantee a single insn to be used
>> to access a piece of data. I'm uncertain whether they also guarantee
>> single access (i.e. that the compiler won't replicate the asm()-s).
> 
> Yes, that would be my expectation from my reading of the manual, as
> it prevents gcc from: "move it out of loops or omit it on the
> assumption that the result from a previous call is still valid".
> 
>> The wording in gcc doc is pretty precise, but not quite enough imo
>> to be entirely certain.
> 
> I agree it's not that precise.
> 
>> ACCESS_ONCE() guarantees single access, but doesn't guarantee that
>> the compiler wouldn't split this single access into multiple insns.
>> (It's just, like elsewhere, that it would be pretty silly of it if
>> it did.)
>>
>> Yesterday, as said, I tried to in particular do what I expect/guess
>> Andrew would have wanted done. This is despite me not being entirely
>> convinced this is the right thing to do here, i.e. personally I
>> would have preferred read_atomic() / write_atomic(), as I think the
>> intention of what the gcc doc is saying is what we want (taking
>> into consideration both uses of "volatile" in these helpers).
> 
> Well, gcc states:
> 
> "Note that the compiler can move even volatile asm instructions
> relative to other code, including across jump instructions."
> 
> So I think we likely want to use {read/write}_atomic plus a compiler
> barrier? I'm not sure anyway how the read of pv_rtc_handler could be
> moved, but I guess I'm not that creative :).
> 
> AFAICT we require a write_atomic in handle_rtc_once in order to assure
> a single instruction is used (no barrier required), and then we
> require a read_atomic + a compiler barrier in rtc_guest_write in order
> to prevent the compiler from optimizing the accesses to 'hook' in any
> way? (that barrier might not be strictly required, as you say it's not
> fully clear whether 'asm volatile' doesn't provide the necessary
> protection here).

Yes, but I'd really like to wait for Andrew's input here before we
make any further adjustments. In any even what has gone in yesterday
is already better than what the original code was that needed
restoring.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/2] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2020-07-15 11:57 ` [PATCH v3 2/2] x86: detect CMOS aliasing on ports other than 0x70/0x71 Jan Beulich
@ 2020-07-20 11:11   ` Roger Pau Monné
  2023-03-17 16:12     ` Roger Pau Monné
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2020-07-20 11:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Wed, Jul 15, 2020 at 01:57:07PM +0200, Jan Beulich wrote:
> ... in order to also intercept accesses through the alias ports.
> 
> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> use the CMOS RTC.

Will wait for v4 with the fixed additions to PVH dom0.

Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-16 10:06             ` Jan Beulich
  2020-07-16 10:31               ` Roger Pau Monné
@ 2020-07-20 15:28               ` Andrew Cooper
  2020-07-20 16:27                 ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: Andrew Cooper @ 2020-07-20 15:28 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné; +Cc: xen-devel, Wei Liu, Paul Durrant

On 16/07/2020 11:06, Jan Beulich wrote:
> ACCESS_ONCE() guarantees single access, but doesn't guarantee that
> the compiler wouldn't split this single access into multiple insns.

ACCESS_ONCE() does guarantee single accesses for any natural integer size.

There is a section about this specifically in Linux's
memory-barriers.txt, and this isn't the first time I've pointed it out...

~Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-20 15:28               ` Andrew Cooper
@ 2020-07-20 16:27                 ` Jan Beulich
  2020-07-21  6:36                   ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2020-07-20 16:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu, Roger Pau Monné

On 20.07.2020 17:28, Andrew Cooper wrote:
> On 16/07/2020 11:06, Jan Beulich wrote:
>> ACCESS_ONCE() guarantees single access, but doesn't guarantee that
>> the compiler wouldn't split this single access into multiple insns.
> 
> ACCESS_ONCE() does guarantee single accesses for any natural integer size.
> 
> There is a section about this specifically in Linux's
> memory-barriers.txt, and this isn't the first time I've pointed it out...

There indeed is text stating this, but I can't find any word on
why they believe this is the case. My understanding of volatile
is that it guarantees no more (and also no less) accesses to
any single storage location than indicated by the source. But
it doesn't prevent "tearing" of accesses. And really, how could
it, considering that volatile can also be applied to types that
aren't basic ones, and hence in particular to ones that can't
possibly be accessed by a single insn?

I'd be glad to see you point out the aspect I'm missing here.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation
  2020-07-20 16:27                 ` Jan Beulich
@ 2020-07-21  6:36                   ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2020-07-21  6:36 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Roger Pau Monné, Wei Liu, Paul Durrant

On 20.07.2020 18:27, Jan Beulich wrote:
> On 20.07.2020 17:28, Andrew Cooper wrote:
>> On 16/07/2020 11:06, Jan Beulich wrote:
>>> ACCESS_ONCE() guarantees single access, but doesn't guarantee that
>>> the compiler wouldn't split this single access into multiple insns.
>>
>> ACCESS_ONCE() does guarantee single accesses for any natural integer size.
>>
>> There is a section about this specifically in Linux's
>> memory-barriers.txt, and this isn't the first time I've pointed it out...
> 
> There indeed is text stating this, but I can't find any word on
> why they believe this is the case. My understanding of volatile
> is that it guarantees no more (and also no less) accesses to
> any single storage location than indicated by the source. But
> it doesn't prevent "tearing" of accesses. And really, how could
> it, considering that volatile can also be applied to types that
> aren't basic ones, and hence in particular to ones that can't
> possibly be accessed by a single insn?

To avoid a possible reference to *_ONCE() only accepting scalar
types - even the more explicit logic in the Linux constructs
permits "long long". Yet (I'm inclined to say of course) the
compiler makes no effort at all to carry out such a 64-bit
access as a single (atomic) insn on a 32-bit arch (i.e. cmpxchg8b
on ix86, if available). If there really was such a guarantee, it
surely would need to, or diagnose that it can't.

Furthermore I've looked at the current implementation of their
macros:

/*
 * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
 * atomicity or dependency ordering guarantees. Note that this may result
 * in tears!
 */
#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))

#define __READ_ONCE_SCALAR(x)						\
({									\
	__unqual_scalar_typeof(x) __x = __READ_ONCE(x);			\
	smp_read_barrier_depends();					\
	(typeof(x))__x;							\
})

#define READ_ONCE(x)							\
({									\
	compiletime_assert_rwonce_type(x);				\
	__READ_ONCE_SCALAR(x);						\
})

The difference between __READ_ONCE() and READ_ONCE() effectively
is merely the smp_read_barrier_depends() afaics. Hence to me the
"tears" in the comment can only refer to "tear drops", not to
"torn accesses". The comment ahead of
compiletime_assert_rwonce_type() is also "interesting":

/*
 * Yes, this permits 64-bit accesses on 32-bit architectures. These will
 * actually be atomic in some cases (namely Armv7 + LPAE), but for others we
 * rely on the access being split into 2x32-bit accesses for a 32-bit quantity
 * (e.g. a virtual address) and a strong prevailing wind.
 */

(I'm struggling to see what extra effects this construct has over
the type enforcement by __unqual_scalar_typeof().)

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/2] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2020-07-20 11:11   ` Roger Pau Monné
@ 2023-03-17 16:12     ` Roger Pau Monné
  0 siblings, 0 replies; 42+ messages in thread
From: Roger Pau Monné @ 2023-03-17 16:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Mon, Jul 20, 2020 at 01:11:18PM +0200, Roger Pau Monné wrote:
> On Wed, Jul 15, 2020 at 01:57:07PM +0200, Jan Beulich wrote:
> > ... in order to also intercept accesses through the alias ports.
> > 
> > Also stop intercepting accesses to the CMOS ports if we won't ourselves
> > use the CMOS RTC.
> 
> Will wait for v4 with the fixed additions to PVH dom0.

I think this is waiting for a v4, let me know if that's not the case.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2020-07-15 11:54 [PATCH v3 0/2] x86: RTC handling adjustments Jan Beulich
  2020-07-15 11:56 ` [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation Jan Beulich
  2020-07-15 11:57 ` [PATCH v3 2/2] x86: detect CMOS aliasing on ports other than 0x70/0x71 Jan Beulich
@ 2023-03-20  8:32 ` Jan Beulich
  2023-03-21 14:12   ` Roger Pau Monné
  2023-03-23 14:49   ` Roger Pau Monné
  2023-03-30 10:40 ` [PATCH v5] " Jan Beulich
  2023-04-18  9:24 ` [PATCH v6] " Jan Beulich
  4 siblings, 2 replies; 42+ messages in thread
From: Jan Beulich @ 2023-03-20  8:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, Roger Pau Monné

... in order to also intercept Dom0 accesses through the alias ports.

Also stop intercepting accesses to the CMOS ports if we won't ourselves
use the CMOS RTC.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: Also conditionally mask top bit for guest index port accesses. Add
    missing adjustments to rtc_init(). Re-work to avoid recursive
    read_lock(). Also adjust guest_io_{read,write}(). Re-base.
v3: Re-base over change to earlier patch.
v2: Re-base.

--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -27,7 +27,7 @@
 #include <asm/hvm/vpt.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/save.h>
-#include <asm/current.h>
+#include <asm/iocap.h>
 #include <xen/trace.h>
 #include <public/hvm/params.h>
 
@@ -836,10 +836,18 @@ void rtc_init(struct domain *d)
 
     if ( !has_vrtc(d) )
     {
-        if ( is_hardware_domain(d) )
-            /* Hardware domain gets mediated access to the physical RTC. */
-            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
-        return;
+        unsigned int port;
+
+        if ( !is_hardware_domain(d) )
+            return;
+
+        /*
+         * Hardware domain gets mediated access to the physical RTC/CMOS
+         * (of course unless we don't use it ourselves).
+         */
+        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
+            if ( is_cmos_port(port, 2, d) )
+                register_portio_handler(d, port, 2, hw_rtc_io);
     }
 
     spin_lock_init(&s->lock);
--- a/xen/arch/x86/include/asm/mc146818rtc.h
+++ b/xen/arch/x86/include/asm/mc146818rtc.h
@@ -9,6 +9,10 @@
 
 extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
 
+struct domain;
+bool is_cmos_port(unsigned int port, unsigned int bytes,
+                  const struct domain *d);
+
 /**********************************************************************
  * register summary
  **********************************************************************/
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -220,7 +220,7 @@ static bool admin_io_okay(unsigned int p
         return false;
 
     /* We also never permit direct access to the RTC/CMOS registers. */
-    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
+    if ( is_cmos_port(port, bytes, d) )
         return false;
 
     return ioports_access_permitted(d, port, port + bytes - 1);
@@ -290,7 +290,7 @@ static uint32_t guest_io_read(unsigned i
         {
             sub_data = pv_pit_handler(port, 0, 0);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             sub_data = rtc_guest_read(port);
         }
@@ -436,7 +436,7 @@ static void guest_io_write(unsigned int
         {
             pv_pit_handler(port, (uint8_t)data, 1);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             rtc_guest_write(port, data);
         }
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
 static int __hwdom_init cf_check io_bitmap_cb(
     unsigned long s, unsigned long e, void *ctx)
 {
-    struct domain *d = ctx;
+    const struct domain *d = ctx;
     unsigned int i;
 
     ASSERT(e <= INT_MAX);
     for ( i = s; i <= e; i++ )
-        __clear_bit(i, d->arch.hvm.io_bitmap);
+        /*
+         * Accesses to RTC ports also need to be trapped in order to keep
+         * consistency with PV.
+         */
+        if ( !is_cmos_port(i, 1, d) )
+            __clear_bit(i, d->arch.hvm.io_bitmap);
 
     return 0;
 }
 
 void __hwdom_init setup_io_bitmap(struct domain *d)
 {
-    int rc;
+    if ( !is_hvm_domain(d) )
+        return;
 
-    if ( is_hvm_domain(d) )
-    {
-        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
-        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
-                                    io_bitmap_cb, d);
-        BUG_ON(rc);
-        /*
-         * NB: we need to trap accesses to 0xcf8 in order to intercept
-         * 4 byte accesses, that need to be handled by Xen in order to
-         * keep consistency.
-         * Access to 1 byte RTC ports also needs to be trapped in order
-         * to keep consistency with PV.
-         */
-        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
-    }
+    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
+    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
+                                io_bitmap_cb, d) )
+        BUG();
+
+    /*
+     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
+     * guest_io_read(), and guest_io_write()).
+     */
+    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
 }
 
 /*
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1234,7 +1234,10 @@ static unsigned long get_cmos_time(void)
         if ( seconds < 60 )
         {
             if ( rtc.sec != seconds )
+            {
                 cmos_rtc_probe = false;
+                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
+            }
             break;
         }
 
@@ -1249,6 +1252,80 @@ static unsigned long get_cmos_time(void)
     return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
 }
 
+static unsigned int __ro_after_init cmos_alias_mask;
+
+static int __init cf_check probe_cmos_alias(void)
+{
+    unsigned int i, offs;
+
+    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
+        return 0;
+
+    for ( offs = 2; offs < 8; offs <<= 1 )
+    {
+        bool read = true;
+
+        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
+        {
+            uint8_t normal, alt;
+            unsigned long flags;
+
+            if ( i == acpi_gbl_FADT.century )
+                continue;
+
+            spin_lock_irqsave(&rtc_lock, flags);
+
+            normal = CMOS_READ(i);
+            if ( inb(RTC_PORT(offs)) != i )
+                read = false;
+
+            alt = inb(RTC_PORT(offs + 1));
+
+            spin_unlock_irqrestore(&rtc_lock, flags);
+
+            if ( normal != alt )
+                break;
+
+            process_pending_softirqs();
+        }
+        if ( i == 0x80 )
+        {
+            cmos_alias_mask |= offs;
+            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
+                   RTC_PORT(offs), read ? "r/w" : "w/o");
+        }
+    }
+
+    return 0;
+}
+__initcall(probe_cmos_alias);
+
+bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
+{
+    if ( !is_hardware_domain(d) )
+        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
+
+    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
+         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
+    {
+        unsigned int cmos = RTC_PORT(0), nr = 2, step;
+
+        while ( cmos_alias_mask & nr )
+            nr <<= 1;
+        for ( step = nr << 1;
+              step < cmos_alias_mask && !(cmos_alias_mask & step); )
+            step <<= 1;
+        do {
+            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
+                 port <= cmos + 1 && port + bytes > cmos )
+                return true;
+            cmos += step;
+        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
+    }
+
+    return false;
+}
+
 /* Helpers for guest accesses to the physical RTC. */
 unsigned int rtc_guest_read(unsigned int port)
 {
@@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
     unsigned long flags;
     unsigned int data = ~0;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
     case RTC_PORT(0):
         /*
@@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
          * of the first RTC port, as there's no access to the physical IO
          * ports.
          */
-        data = currd->arch.cmos_idx;
+        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        data = inb(RTC_PORT(1));
+        outb(currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1))),
+             port - 1);
+        data = inb(port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 
@@ -1288,9 +1366,10 @@ void rtc_guest_write(unsigned int port,
     struct domain *currd = current->domain;
     unsigned long flags;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
         typeof(pv_rtc_handler) hook;
+        unsigned int idx;
 
     case RTC_PORT(0):
         /*
@@ -1298,20 +1377,22 @@ void rtc_guest_write(unsigned int port,
          * value of the first RTC port, as there's no access to the physical IO
          * ports.
          */
-        currd->arch.cmos_idx = data;
+        currd->arch.cmos_idx = data & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
 
+        idx = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1)));
+
         hook = ACCESS_ONCE(pv_rtc_handler);
         if ( hook )
-            hook(currd->arch.cmos_idx & 0x7f, data);
+            hook(idx, data);
 
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        outb(data, RTC_PORT(1));
+        outb(idx, port - 1);
+        outb(data, port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-20  8:32 ` [PATCH v4] " Jan Beulich
@ 2023-03-21 14:12   ` Roger Pau Monné
  2023-03-22  9:55     ` Jan Beulich
  2023-03-27 15:44     ` Jan Beulich
  2023-03-23 14:49   ` Roger Pau Monné
  1 sibling, 2 replies; 42+ messages in thread
From: Roger Pau Monné @ 2023-03-21 14:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> ... in order to also intercept Dom0 accesses through the alias ports.

I'm trying to get some documentation about this aliasing, but so far I
haven't been able to find any.  Do you have any references of where I
might be able to find it?

> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> use the CMOS RTC.

Could this create any concerns with the ability to disable NMIs if we
no longer filter accesses to the RTC?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-21 14:12   ` Roger Pau Monné
@ 2023-03-22  9:55     ` Jan Beulich
  2023-03-23 12:29       ` Roger Pau Monné
  2023-03-27 15:44     ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-03-22  9:55 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 21.03.2023 15:12, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> I'm trying to get some documentation about this aliasing, but so far I
> haven't been able to find any.  Do you have any references of where I
> might be able to find it?

I think several ICH datasheet documents mention this. Right now I'm
looking at the ICH10 one (319973-003), section 13.6.1 ("I/O Register
Address Map" under "Real Time Clock Registers").

But such aliasing (really: lack of decoding) has been present on
various of the low 1024 ports from the very early days of x86. So we
may want to take care of such elsewhere as well, e.g. for the PIC
(where aforementioned doc also explicitly mentions the aliases).

>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>> use the CMOS RTC.
> 
> Could this create any concerns with the ability to disable NMIs if we
> no longer filter accesses to the RTC?

Hmm, that's a valid concern, but I'm not sure in how far we need to
be worried about giving Dom0 this level of control. As long as we
don't use it ourselves of course (I'm unaware of us using this
anywhere). If we're worried, we could continue to intercept port
0x70 alone, just to mask off the top bit for writes.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-22  9:55     ` Jan Beulich
@ 2023-03-23 12:29       ` Roger Pau Monné
  2023-03-23 14:26         ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2023-03-23 12:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Wed, Mar 22, 2023 at 10:55:42AM +0100, Jan Beulich wrote:
> On 21.03.2023 15:12, Roger Pau Monné wrote:
> > On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> >> ... in order to also intercept Dom0 accesses through the alias ports.
> > 
> > I'm trying to get some documentation about this aliasing, but so far I
> > haven't been able to find any.  Do you have any references of where I
> > might be able to find it?
> 
> I think several ICH datasheet documents mention this. Right now I'm
> looking at the ICH10 one (319973-003), section 13.6.1 ("I/O Register
> Address Map" under "Real Time Clock Registers").

Thanks, I had to fetch this from elsewhere as I haven't been able to
find it on the Intel documentation site, maybe it's too old?

> But such aliasing (really: lack of decoding) has been present on
> various of the low 1024 ports from the very early days of x86. So we
> may want to take care of such elsewhere as well, e.g. for the PIC
> (where aforementioned doc also explicitly mentions the aliases).

I wonder how relevant those aliases are for OSes, do we know of any OS
that uses them?

For example we don't seem to provide them to HVM guests at all, and we
seem to get away with it.

> >> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> >> use the CMOS RTC.
> > 
> > Could this create any concerns with the ability to disable NMIs if we
> > no longer filter accesses to the RTC?
> 
> Hmm, that's a valid concern, but I'm not sure in how far we need to
> be worried about giving Dom0 this level of control. As long as we
> don't use it ourselves of course (I'm unaware of us using this
> anywhere). If we're worried, we could continue to intercept port
> 0x70 alone, just to mask off the top bit for writes.

I would be mostly worried about dom0 disabling NMI and thus causing
the Xen watchdog to trigger for example.  I don't think we should
allow dom0 to disable NMIs at all.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-23 12:29       ` Roger Pau Monné
@ 2023-03-23 14:26         ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2023-03-23 14:26 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 23.03.2023 13:29, Roger Pau Monné wrote:
> On Wed, Mar 22, 2023 at 10:55:42AM +0100, Jan Beulich wrote:
>> On 21.03.2023 15:12, Roger Pau Monné wrote:
>>> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>>>> ... in order to also intercept Dom0 accesses through the alias ports.
>>>
>>> I'm trying to get some documentation about this aliasing, but so far I
>>> haven't been able to find any.  Do you have any references of where I
>>> might be able to find it?
>>
>> I think several ICH datasheet documents mention this. Right now I'm
>> looking at the ICH10 one (319973-003), section 13.6.1 ("I/O Register
>> Address Map" under "Real Time Clock Registers").
> 
> Thanks, I had to fetch this from elsewhere as I haven't been able to
> find it on the Intel documentation site, maybe it's too old?
> 
>> But such aliasing (really: lack of decoding) has been present on
>> various of the low 1024 ports from the very early days of x86. So we
>> may want to take care of such elsewhere as well, e.g. for the PIC
>> (where aforementioned doc also explicitly mentions the aliases).
> 
> I wonder how relevant those aliases are for OSes, do we know of any OS
> that uses them?
> 
> For example we don't seem to provide them to HVM guests at all, and we
> seem to get away with it.

There are two aspects here: One is the functionality that becomes available
specifically via using the aliases here (and I'm not 100% certain this isn't
chipset dependent in the first place), allowing access to the full 256 bytes
of CMOS storage (i.e. no parts clipped off for the RTC registers). The other
aspect is simply disallowing access to ports we mean Dom0 to not have access
to. That would be the sole purpose e.g. for the PIC port ranges. Otherwise
there's little point disallowing access to the base ranges, imo.

>>>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>>>> use the CMOS RTC.
>>>
>>> Could this create any concerns with the ability to disable NMIs if we
>>> no longer filter accesses to the RTC?
>>
>> Hmm, that's a valid concern, but I'm not sure in how far we need to
>> be worried about giving Dom0 this level of control. As long as we
>> don't use it ourselves of course (I'm unaware of us using this
>> anywhere). If we're worried, we could continue to intercept port
>> 0x70 alone, just to mask off the top bit for writes.
> 
> I would be mostly worried about dom0 disabling NMI and thus causing
> the Xen watchdog to trigger for example.  I don't think we should
> allow dom0 to disable NMIs at all.

I'll see what I can do, preferably without keeping the intercepts fully
engaged.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-20  8:32 ` [PATCH v4] " Jan Beulich
  2023-03-21 14:12   ` Roger Pau Monné
@ 2023-03-23 14:49   ` Roger Pau Monné
  2023-03-23 16:08     ` Jan Beulich
  2023-03-27 15:44     ` Jan Beulich
  1 sibling, 2 replies; 42+ messages in thread
From: Roger Pau Monné @ 2023-03-23 14:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> use the CMOS RTC.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v4: Also conditionally mask top bit for guest index port accesses. Add
>     missing adjustments to rtc_init(). Re-work to avoid recursive
>     read_lock(). Also adjust guest_io_{read,write}(). Re-base.
> v3: Re-base over change to earlier patch.
> v2: Re-base.
> 
> --- a/xen/arch/x86/hvm/rtc.c
> +++ b/xen/arch/x86/hvm/rtc.c
> @@ -27,7 +27,7 @@
>  #include <asm/hvm/vpt.h>
>  #include <asm/hvm/io.h>
>  #include <asm/hvm/save.h>
> -#include <asm/current.h>
> +#include <asm/iocap.h>
>  #include <xen/trace.h>
>  #include <public/hvm/params.h>
>  
> @@ -836,10 +836,18 @@ void rtc_init(struct domain *d)
>  
>      if ( !has_vrtc(d) )
>      {
> -        if ( is_hardware_domain(d) )
> -            /* Hardware domain gets mediated access to the physical RTC. */
> -            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
> -        return;
> +        unsigned int port;
> +
> +        if ( !is_hardware_domain(d) )
> +            return;
> +
> +        /*
> +         * Hardware domain gets mediated access to the physical RTC/CMOS
> +         * (of course unless we don't use it ourselves).
> +         */
> +        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
> +            if ( is_cmos_port(port, 2, d) )
> +                register_portio_handler(d, port, 2, hw_rtc_io);
>      }
>  
>      spin_lock_init(&s->lock);
> --- a/xen/arch/x86/include/asm/mc146818rtc.h
> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
> @@ -9,6 +9,10 @@
>  
>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
>  
> +struct domain;
> +bool is_cmos_port(unsigned int port, unsigned int bytes,
> +                  const struct domain *d);

We seem to usually name this rtc rather than cmos, any reason to use
cmos for the helper naming rather than rtc?

If not I would rather use is_rtc_port(), so that we can keep it in
sync with the existing RTC_PORT() macros and the handler names
rtc_guest_{read,write}, hw_rtc_io.

> +
>  /**********************************************************************
>   * register summary
>   **********************************************************************/
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -220,7 +220,7 @@ static bool admin_io_okay(unsigned int p
>          return false;
>  
>      /* We also never permit direct access to the RTC/CMOS registers. */
> -    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
> +    if ( is_cmos_port(port, bytes, d) )
>          return false;
>  
>      return ioports_access_permitted(d, port, port + bytes - 1);
> @@ -290,7 +290,7 @@ static uint32_t guest_io_read(unsigned i
>          {
>              sub_data = pv_pit_handler(port, 0, 0);
>          }
> -        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
> +        else if ( is_cmos_port(port, 1, currd) )
>          {
>              sub_data = rtc_guest_read(port);
>          }
> @@ -436,7 +436,7 @@ static void guest_io_write(unsigned int
>          {
>              pv_pit_handler(port, (uint8_t)data, 1);
>          }
> -        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
> +        else if ( is_cmos_port(port, 1, currd) )
>          {
>              rtc_guest_write(port, data);
>          }
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
>  static int __hwdom_init cf_check io_bitmap_cb(
>      unsigned long s, unsigned long e, void *ctx)
>  {
> -    struct domain *d = ctx;
> +    const struct domain *d = ctx;
>      unsigned int i;
>  
>      ASSERT(e <= INT_MAX);
>      for ( i = s; i <= e; i++ )
> -        __clear_bit(i, d->arch.hvm.io_bitmap);
> +        /*
> +         * Accesses to RTC ports also need to be trapped in order to keep
> +         * consistency with PV.
> +         */

More than to keep consistency with PV, don't we need to trap accesses
to that concurrent accesses between dom0 and Xen (when also using the
device) don't overlap, as the RTC/CMOS space uses indirect indexing.

And likely to avoid dom0 from disabling NMIs.

I see that you copied the existing comment, but not sure it's fully
accurate?

> +        if ( !is_cmos_port(i, 1, d) )
> +            __clear_bit(i, d->arch.hvm.io_bitmap);
>  
>      return 0;
>  }
>  
>  void __hwdom_init setup_io_bitmap(struct domain *d)
>  {
> -    int rc;
> +    if ( !is_hvm_domain(d) )
> +        return;
>  
> -    if ( is_hvm_domain(d) )
> -    {
> -        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
> -        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
> -                                    io_bitmap_cb, d);
> -        BUG_ON(rc);
> -        /*
> -         * NB: we need to trap accesses to 0xcf8 in order to intercept
> -         * 4 byte accesses, that need to be handled by Xen in order to
> -         * keep consistency.
> -         * Access to 1 byte RTC ports also needs to be trapped in order
> -         * to keep consistency with PV.
> -         */
> -        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
> -        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
> -        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
> -    }
> +    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
> +    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
> +                                io_bitmap_cb, d) )
> +        BUG();
> +
> +    /*
> +     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
> +     * guest_io_read(), and guest_io_write()).
> +     */
> +    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
>  }
>  
>  /*
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1234,7 +1234,10 @@ static unsigned long get_cmos_time(void)
>          if ( seconds < 60 )
>          {
>              if ( rtc.sec != seconds )
> +            {
>                  cmos_rtc_probe = false;
> +                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
> +            }
>              break;
>          }
>  
> @@ -1249,6 +1252,80 @@ static unsigned long get_cmos_time(void)
>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>  }
>  
> +static unsigned int __ro_after_init cmos_alias_mask;
> +
> +static int __init cf_check probe_cmos_alias(void)
> +{
> +    unsigned int i, offs;
> +
> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
> +        return 0;
> +
> +    for ( offs = 2; offs < 8; offs <<= 1 )
> +    {
> +        bool read = true;

You can limit the scope of i to the inner for loop.

> +
> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
> +        {
> +            uint8_t normal, alt;
> +            unsigned long flags;
> +
> +            if ( i == acpi_gbl_FADT.century )
> +                continue;
> +
> +            spin_lock_irqsave(&rtc_lock, flags);
> +
> +            normal = CMOS_READ(i);
> +            if ( inb(RTC_PORT(offs)) != i )
> +                read = false;
> +
> +            alt = inb(RTC_PORT(offs + 1));
> +
> +            spin_unlock_irqrestore(&rtc_lock, flags);
> +
> +            if ( normal != alt )
> +                break;
> +
> +            process_pending_softirqs();

You adding a call to process pending softirqs for every loop
iteration makes me wonder how long is each of those accesses expected
to take, since we could be performing a lot of them (0x80 * 3).

I don't think so, but there can not be any side effects from reading
from the CMOS RAM I would assume, even for cases where the CMOS ports
are not aliases?

I would assume ports to be either aliased to the CMOS, or otherwise
reserved.  What makes me wonder if it wouldn't be simpler to just
passthough accesses to all the possible CMOS alias ports.

> +        }
> +        if ( i == 0x80 )
> +        {
> +            cmos_alias_mask |= offs;
> +            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
> +                   RTC_PORT(offs), read ? "r/w" : "w/o");
> +        }
> +    }
> +
> +    return 0;
> +}
> +__initcall(probe_cmos_alias);
> +
> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
> +{
> +    if ( !is_hardware_domain(d) )
> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
> +
> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
> +    {
> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
> +
> +        while ( cmos_alias_mask & nr )
> +            nr <<= 1;
> +        for ( step = nr << 1;
> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
> +            step <<= 1;
> +        do {
> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
> +                 port <= cmos + 1 && port + bytes > cmos )
> +                return true;
> +            cmos += step;
> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );

I would use a for loop similar to the one used in probe_cmos_alias()
to check for aliased accesses?

if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
    return true;

for ( offs = 2; offs < 8; offs <<= 1 )
{
    if ( !(offs & cmos_alias_mask) )
        continue;
    if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
        return true;
}

return false;

So that you can also optimize for the more common case RTC_PORT(0) and
RTC_PORT(1) are used?

Or there's something I'm missing?

> +    }
> +
> +    return false;
> +}
> +
>  /* Helpers for guest accesses to the physical RTC. */
>  unsigned int rtc_guest_read(unsigned int port)
>  {
> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
>      unsigned long flags;
>      unsigned int data = ~0;
>  
> -    switch ( port )
> +    switch ( port & ~cmos_alias_mask )
>      {
>      case RTC_PORT(0):
>          /*
> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
>           * of the first RTC port, as there's no access to the physical IO
>           * ports.
>           */
> -        data = currd->arch.cmos_idx;
> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));

We do allow read access to alias ports even when the underling
hardware does do so, which I think is fine, but might be worth a
comment (since we already detect whether the RTC_PORT(0) alias is also
readable.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-23 14:49   ` Roger Pau Monné
@ 2023-03-23 16:08     ` Jan Beulich
  2023-03-23 16:40       ` Roger Pau Monné
  2023-03-27 15:44     ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-03-23 16:08 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 23.03.2023 15:49, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> --- a/xen/arch/x86/include/asm/mc146818rtc.h
>> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
>> @@ -9,6 +9,10 @@
>>  
>>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
>>  
>> +struct domain;
>> +bool is_cmos_port(unsigned int port, unsigned int bytes,
>> +                  const struct domain *d);
> 
> We seem to usually name this rtc rather than cmos, any reason to use
> cmos for the helper naming rather than rtc?
> 
> If not I would rather use is_rtc_port(), so that we can keep it in
> sync with the existing RTC_PORT() macros and the handler names
> rtc_guest_{read,write}, hw_rtc_io.

Already when talking about just ports 70 and 71 there's more CMOS
behind them than RTC. With extended CMOS accesses the ratio further
shifts. So I view using "rtc" here simply as increasingly
inappropriate.

>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
>>  static int __hwdom_init cf_check io_bitmap_cb(
>>      unsigned long s, unsigned long e, void *ctx)
>>  {
>> -    struct domain *d = ctx;
>> +    const struct domain *d = ctx;
>>      unsigned int i;
>>  
>>      ASSERT(e <= INT_MAX);
>>      for ( i = s; i <= e; i++ )
>> -        __clear_bit(i, d->arch.hvm.io_bitmap);
>> +        /*
>> +         * Accesses to RTC ports also need to be trapped in order to keep
>> +         * consistency with PV.
>> +         */
> 
> More than to keep consistency with PV, don't we need to trap accesses
> to that concurrent accesses between dom0 and Xen (when also using the
> device) don't overlap, as the RTC/CMOS space uses indirect indexing.

That's what I read "consistency" to mean.

> And likely to avoid dom0 from disabling NMIs.

I can add that to the comment, but as you say ...

> I see that you copied the existing comment, but not sure it's fully
> accurate?

... I've merely moved it.

>> @@ -1249,6 +1252,80 @@ static unsigned long get_cmos_time(void)
>>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>>  }
>>  
>> +static unsigned int __ro_after_init cmos_alias_mask;
>> +
>> +static int __init cf_check probe_cmos_alias(void)
>> +{
>> +    unsigned int i, offs;
>> +
>> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
>> +        return 0;
>> +
>> +    for ( offs = 2; offs < 8; offs <<= 1 )
>> +    {
>> +        bool read = true;
> 
> You can limit the scope of i to the inner for loop.

Sure. It has been long ago when I wrote this, so I guess I was after it
being one line less of code.

>> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
>> +        {
>> +            uint8_t normal, alt;
>> +            unsigned long flags;
>> +
>> +            if ( i == acpi_gbl_FADT.century )
>> +                continue;
>> +
>> +            spin_lock_irqsave(&rtc_lock, flags);
>> +
>> +            normal = CMOS_READ(i);
>> +            if ( inb(RTC_PORT(offs)) != i )
>> +                read = false;
>> +
>> +            alt = inb(RTC_PORT(offs + 1));
>> +
>> +            spin_unlock_irqrestore(&rtc_lock, flags);
>> +
>> +            if ( normal != alt )
>> +                break;
>> +
>> +            process_pending_softirqs();
> 
> You adding a call to process pending softirqs for every loop
> iteration makes me wonder how long is each of those accesses expected
> to take, since we could be performing a lot of them (0x80 * 3).

It seemed best to me to keep things simple here, at the expense at a
few too many calls.

> I don't think so, but there can not be any side effects from reading
> from the CMOS RAM I would assume, even for cases where the CMOS ports
> are not aliases?

Well, one of the fundamental assumptions is that these read attempts
won't have side effects. Without that assumption we simply can't do
such probing.

> I would assume ports to be either aliased to the CMOS, or otherwise
> reserved.  What makes me wonder if it wouldn't be simpler to just
> passthough accesses to all the possible CMOS alias ports.

But we need to intercept writes to 70 to track the index. IOW we
cannot simply pass through all of them, and we also cannot simply
intercept them all and treat them all the same.

>> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
>> +{
>> +    if ( !is_hardware_domain(d) )
>> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
>> +
>> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
>> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
>> +    {
>> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
>> +
>> +        while ( cmos_alias_mask & nr )
>> +            nr <<= 1;
>> +        for ( step = nr << 1;
>> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
>> +            step <<= 1;
>> +        do {
>> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
>> +                 port <= cmos + 1 && port + bytes > cmos )
>> +                return true;
>> +            cmos += step;
>> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
> 
> I would use a for loop similar to the one used in probe_cmos_alias()
> to check for aliased accesses?
> 
> if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
>     return true;
> 
> for ( offs = 2; offs < 8; offs <<= 1 )
> {
>     if ( !(offs & cmos_alias_mask) )
>         continue;
>     if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
>         return true;
> }
> 
> return false;
> 
> So that you can also optimize for the more common case RTC_PORT(0) and
> RTC_PORT(1) are used?
> 
> Or there's something I'm missing?

I'll have to check carefully, but to be honest I would prefer to not
touch this code again unless there's clearly something wrong with it.

>> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
>>      unsigned long flags;
>>      unsigned int data = ~0;
>>  
>> -    switch ( port )
>> +    switch ( port & ~cmos_alias_mask )
>>      {
>>      case RTC_PORT(0):
>>          /*
>> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
>>           * of the first RTC port, as there's no access to the physical IO
>>           * ports.
>>           */
>> -        data = currd->arch.cmos_idx;
>> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
> 
> We do allow read access to alias ports even when the underling
> hardware does do so,

I'm afraid I don't understand this, so ...

> which I think is fine, but might be worth a
> comment (since we already detect whether the RTC_PORT(0) alias is also
> readable.

... I can't really derive what kind of information you're after to put
in a comment.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-23 16:08     ` Jan Beulich
@ 2023-03-23 16:40       ` Roger Pau Monné
  2023-03-27 15:46         ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2023-03-23 16:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Thu, Mar 23, 2023 at 05:08:43PM +0100, Jan Beulich wrote:
> On 23.03.2023 15:49, Roger Pau Monné wrote:
> > On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
> >> --- a/xen/arch/x86/include/asm/mc146818rtc.h
> >> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
> >> @@ -9,6 +9,10 @@
> >>  
> >>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
> >>  
> >> +struct domain;
> >> +bool is_cmos_port(unsigned int port, unsigned int bytes,
> >> +                  const struct domain *d);
> > 
> > We seem to usually name this rtc rather than cmos, any reason to use
> > cmos for the helper naming rather than rtc?
> > 
> > If not I would rather use is_rtc_port(), so that we can keep it in
> > sync with the existing RTC_PORT() macros and the handler names
> > rtc_guest_{read,write}, hw_rtc_io.
> 
> Already when talking about just ports 70 and 71 there's more CMOS
> behind them than RTC. With extended CMOS accesses the ratio further
> shifts. So I view using "rtc" here simply as increasingly
> inappropriate.

Hm, it's your patch at the end, and such decision would likely fall
under the same bag as other style related questions.

I would prefer to keep the naming consistent, as to not confuse
readers with code dealing with the same underlying IO ports using both
RTC and CMOS, but that's just my taste.

> >> --- a/xen/arch/x86/setup.c
> >> +++ b/xen/arch/x86/setup.c
> >> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
> >>  static int __hwdom_init cf_check io_bitmap_cb(
> >>      unsigned long s, unsigned long e, void *ctx)
> >>  {
> >> -    struct domain *d = ctx;
> >> +    const struct domain *d = ctx;
> >>      unsigned int i;
> >>  
> >>      ASSERT(e <= INT_MAX);
> >>      for ( i = s; i <= e; i++ )
> >> -        __clear_bit(i, d->arch.hvm.io_bitmap);
> >> +        /*
> >> +         * Accesses to RTC ports also need to be trapped in order to keep
> >> +         * consistency with PV.
> >> +         */
> > 
> > More than to keep consistency with PV, don't we need to trap accesses
> > to that concurrent accesses between dom0 and Xen (when also using the
> > device) don't overlap, as the RTC/CMOS space uses indirect indexing.
> 
> That's what I read "consistency" to mean.

But consistency with PV?  We need to keep consistency with concurrent
Xen (hypervisor) accesses I would think.

I would s/PV/hypervisor accesses/ in the comment above while moving
it.

> >> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
> >> +        {
> >> +            uint8_t normal, alt;
> >> +            unsigned long flags;
> >> +
> >> +            if ( i == acpi_gbl_FADT.century )
> >> +                continue;
> >> +
> >> +            spin_lock_irqsave(&rtc_lock, flags);
> >> +
> >> +            normal = CMOS_READ(i);
> >> +            if ( inb(RTC_PORT(offs)) != i )
> >> +                read = false;
> >> +
> >> +            alt = inb(RTC_PORT(offs + 1));
> >> +
> >> +            spin_unlock_irqrestore(&rtc_lock, flags);
> >> +
> >> +            if ( normal != alt )
> >> +                break;
> >> +
> >> +            process_pending_softirqs();
> > 
> > You adding a call to process pending softirqs for every loop
> > iteration makes me wonder how long is each of those accesses expected
> > to take, since we could be performing a lot of them (0x80 * 3).
> 
> It seemed best to me to keep things simple here, at the expense at a
> few too many calls.
> 
> > I don't think so, but there can not be any side effects from reading
> > from the CMOS RAM I would assume, even for cases where the CMOS ports
> > are not aliases?
> 
> Well, one of the fundamental assumptions is that these read attempts
> won't have side effects. Without that assumption we simply can't do
> such probing.
> 
> > I would assume ports to be either aliased to the CMOS, or otherwise
> > reserved.  What makes me wonder if it wouldn't be simpler to just
> > passthough accesses to all the possible CMOS alias ports.
> 
> But we need to intercept writes to 70 to track the index. IOW we
> cannot simply pass through all of them, and we also cannot simply
> intercept them all and treat them all the same.

Why couldn't we intercept all the possible alias port and passthrough
all of them?  As long as there's nothing else there's no risk in doing
so?

> >> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
> >> +{
> >> +    if ( !is_hardware_domain(d) )
> >> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
> >> +
> >> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
> >> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
> >> +    {
> >> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
> >> +
> >> +        while ( cmos_alias_mask & nr )
> >> +            nr <<= 1;
> >> +        for ( step = nr << 1;
> >> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
> >> +            step <<= 1;
> >> +        do {
> >> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
> >> +                 port <= cmos + 1 && port + bytes > cmos )
> >> +                return true;
> >> +            cmos += step;
> >> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
> > 
> > I would use a for loop similar to the one used in probe_cmos_alias()
> > to check for aliased accesses?
> > 
> > if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
> >     return true;
> > 
> > for ( offs = 2; offs < 8; offs <<= 1 )
> > {
> >     if ( !(offs & cmos_alias_mask) )
> >         continue;
> >     if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
> >         return true;
> > }
> > 
> > return false;
> > 
> > So that you can also optimize for the more common case RTC_PORT(0) and
> > RTC_PORT(1) are used?
> > 
> > Or there's something I'm missing?
> 
> I'll have to check carefully, but to be honest I would prefer to not
> touch this code again unless there's clearly something wrong with it.

TBH, I think the proposed code is extremely difficult to follow, there
are 3 loops in a row which gives me a headache when thinking about all
the possible combinations.

I think my proposed alternative is much easier to follow because it
has a single loop, and it's using the same bounds used to fill the
cmos_alias_mask in the first place.  But maybe that's just my taste.

> >> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
> >>      unsigned long flags;
> >>      unsigned int data = ~0;
> >>  
> >> -    switch ( port )
> >> +    switch ( port & ~cmos_alias_mask )
> >>      {
> >>      case RTC_PORT(0):
> >>          /*
> >> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
> >>           * of the first RTC port, as there's no access to the physical IO
> >>           * ports.
> >>           */
> >> -        data = currd->arch.cmos_idx;
> >> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
> > 
> > We do allow read access to alias ports even when the underling
> > hardware does do so,
> 
> I'm afraid I don't understand this, so ...
> 
> > which I think is fine, but might be worth a
> > comment (since we already detect whether the RTC_PORT(0) alias is also
> > readable.
> 
> ... I can't really derive what kind of information you're after to put
> in a comment.

Reading from ports that alias RTC_PORT(0) might not always return the
value written to RTC_PORT(0) (you have a check for that in
probe_cmos_alias()).  Yet in rtc_guest_read() Xen does always return
the cached CMOS index.  Which is likely to be all fine, but needs a
comment to note this behavior might not match what the underlying
hardware would return.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-21 14:12   ` Roger Pau Monné
  2023-03-22  9:55     ` Jan Beulich
@ 2023-03-27 15:44     ` Jan Beulich
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2023-03-27 15:44 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 21.03.2023 15:12, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> I'm trying to get some documentation about this aliasing, but so far I
> haven't been able to find any.  Do you have any references of where I
> might be able to find it?
> 
>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>> use the CMOS RTC.
> 
> Could this create any concerns with the ability to disable NMIs if we
> no longer filter accesses to the RTC?

I've added "..., because of there being none." If there's no RTC, I don't
think it's specified what function (if any) port 70 bit 7 has.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-23 14:49   ` Roger Pau Monné
  2023-03-23 16:08     ` Jan Beulich
@ 2023-03-27 15:44     ` Jan Beulich
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2023-03-27 15:44 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 23.03.2023 15:49, Roger Pau Monné wrote:
> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>> @@ -1249,6 +1252,80 @@ static unsigned long get_cmos_time(void)
>>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>>  }
>>  
>> +static unsigned int __ro_after_init cmos_alias_mask;
>> +
>> +static int __init cf_check probe_cmos_alias(void)
>> +{
>> +    unsigned int i, offs;
>> +
>> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
>> +        return 0;
>> +
>> +    for ( offs = 2; offs < 8; offs <<= 1 )
>> +    {
>> +        bool read = true;
> 
> You can limit the scope of i to the inner for loop.
> 
>> +
>> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
>> +        {
>> +            uint8_t normal, alt;
>> +            unsigned long flags;
>> +
>> +            if ( i == acpi_gbl_FADT.century )
>> +                continue;
>> +
>> +            spin_lock_irqsave(&rtc_lock, flags);
>> +
>> +            normal = CMOS_READ(i);
>> +            if ( inb(RTC_PORT(offs)) != i )
>> +                read = false;
>> +
>> +            alt = inb(RTC_PORT(offs + 1));
>> +
>> +            spin_unlock_irqrestore(&rtc_lock, flags);
>> +
>> +            if ( normal != alt )
>> +                break;
>> +
>> +            process_pending_softirqs();
> 
> You adding a call to process pending softirqs for every loop
> iteration makes me wonder how long is each of those accesses expected
> to take, since we could be performing a lot of them (0x80 * 3).
> 
> I don't think so, but there can not be any side effects from reading
> from the CMOS RAM I would assume, even for cases where the CMOS ports
> are not aliases?
> 
> I would assume ports to be either aliased to the CMOS, or otherwise
> reserved.  What makes me wonder if it wouldn't be simpler to just
> passthough accesses to all the possible CMOS alias ports.

I'm afraid this assumption doesn't hold, as can be seen from the ICH10
datasheet that I did point you at the other day. There ports 72/73
serve a purpose different from ports 70/71 (and their aliases at 74/75).
Unless (as the datasheet calls it) U128E is clear (wherever that bit
lives), in which case 72/73 (and 76/77) will also be aliases of 70/71.
So we won't get away without probing, and if we deem probing too risky,
then all I can do is drop this patch.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-23 16:40       ` Roger Pau Monné
@ 2023-03-27 15:46         ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2023-03-27 15:46 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 23.03.2023 17:40, Roger Pau Monné wrote:
> On Thu, Mar 23, 2023 at 05:08:43PM +0100, Jan Beulich wrote:
>> On 23.03.2023 15:49, Roger Pau Monné wrote:
>>> On Mon, Mar 20, 2023 at 09:32:26AM +0100, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/setup.c
>>>> +++ b/xen/arch/x86/setup.c
>>>> @@ -2072,37 +2072,36 @@ int __hwdom_init xen_in_range(unsigned l
>>>>  static int __hwdom_init cf_check io_bitmap_cb(
>>>>      unsigned long s, unsigned long e, void *ctx)
>>>>  {
>>>> -    struct domain *d = ctx;
>>>> +    const struct domain *d = ctx;
>>>>      unsigned int i;
>>>>  
>>>>      ASSERT(e <= INT_MAX);
>>>>      for ( i = s; i <= e; i++ )
>>>> -        __clear_bit(i, d->arch.hvm.io_bitmap);
>>>> +        /*
>>>> +         * Accesses to RTC ports also need to be trapped in order to keep
>>>> +         * consistency with PV.
>>>> +         */
>>>
>>> More than to keep consistency with PV, don't we need to trap accesses
>>> to that concurrent accesses between dom0 and Xen (when also using the
>>> device) don't overlap, as the RTC/CMOS space uses indirect indexing.
>>
>> That's what I read "consistency" to mean.
> 
> But consistency with PV?  We need to keep consistency with concurrent
> Xen (hypervisor) accesses I would think.
> 
> I would s/PV/hypervisor accesses/ in the comment above while moving
> it.

Hmm, yes, changed. (Still I'm not normally very happy about changing
comments I don't really understand the way they're written.)

>>>> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
>>>> +{
>>>> +    if ( !is_hardware_domain(d) )
>>>> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
>>>> +
>>>> +    if ( !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
>>>> +         port <= RTC_PORT(cmos_alias_mask | 1) && port + bytes > RTC_PORT(0) )
>>>> +    {
>>>> +        unsigned int cmos = RTC_PORT(0), nr = 2, step;
>>>> +
>>>> +        while ( cmos_alias_mask & nr )
>>>> +            nr <<= 1;
>>>> +        for ( step = nr << 1;
>>>> +              step < cmos_alias_mask && !(cmos_alias_mask & step); )
>>>> +            step <<= 1;
>>>> +        do {
>>>> +            if ( !(cmos & ~RTC_PORT(cmos_alias_mask)) &&
>>>> +                 port <= cmos + 1 && port + bytes > cmos )
>>>> +                return true;
>>>> +            cmos += step;
>>>> +        } while ( cmos <= RTC_PORT(cmos_alias_mask) );
>>>
>>> I would use a for loop similar to the one used in probe_cmos_alias()
>>> to check for aliased accesses?
>>>
>>> if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
>>>     return true;
>>>
>>> for ( offs = 2; offs < 8; offs <<= 1 )
>>> {
>>>     if ( !(offs & cmos_alias_mask) )
>>>         continue;
>>>     if ( port <= RTC_PORT(1 + off) && port + bytes > RTC_PORT(off) )
>>>         return true;
>>> }
>>>
>>> return false;
>>>
>>> So that you can also optimize for the more common case RTC_PORT(0) and
>>> RTC_PORT(1) are used?
>>>
>>> Or there's something I'm missing?
>>
>> I'll have to check carefully, but to be honest I would prefer to not
>> touch this code again unless there's clearly something wrong with it.
> 
> TBH, I think the proposed code is extremely difficult to follow, there
> are 3 loops in a row which gives me a headache when thinking about all
> the possible combinations.
> 
> I think my proposed alternative is much easier to follow because it
> has a single loop, and it's using the same bounds used to fill the
> cmos_alias_mask in the first place.  But maybe that's just my taste.

Upon re-consideration I've adjusted the code. Iirc probe_cmos_alias()
originally had something similar, so changing it in the other place as
well is only logical.

>>>> @@ -1256,7 +1333,7 @@ unsigned int rtc_guest_read(unsigned int
>>>>      unsigned long flags;
>>>>      unsigned int data = ~0;
>>>>  
>>>> -    switch ( port )
>>>> +    switch ( port & ~cmos_alias_mask )
>>>>      {
>>>>      case RTC_PORT(0):
>>>>          /*
>>>> @@ -1264,15 +1341,16 @@ unsigned int rtc_guest_read(unsigned int
>>>>           * of the first RTC port, as there's no access to the physical IO
>>>>           * ports.
>>>>           */
>>>> -        data = currd->arch.cmos_idx;
>>>> +        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
>>>
>>> We do allow read access to alias ports even when the underling
>>> hardware does do so,
>>
>> I'm afraid I don't understand this, so ...
>>
>>> which I think is fine, but might be worth a
>>> comment (since we already detect whether the RTC_PORT(0) alias is also
>>> readable.
>>
>> ... I can't really derive what kind of information you're after to put
>> in a comment.
> 
> Reading from ports that alias RTC_PORT(0) might not always return the
> value written to RTC_PORT(0) (you have a check for that in
> probe_cmos_alias()).  Yet in rtc_guest_read() Xen does always return
> the cached CMOS index.  Which is likely to be all fine, but needs a
> comment to note this behavior might not match what the underlying
> hardware would return.

You do realize that this is pre-existing behavior? On many chipsets
you cannot read the index value from port 70. Yet we've always
returned it, presumably on the grounds of the value either being
rubbish (often 0xff) or the index. And rather than trying to figure
out whether to return rubbish (explicitly of via accessing the port)
I guess it was deemed easier and more helpful to always return the
index value. (I've amended the comment there nevertheless.)

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v5] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2020-07-15 11:54 [PATCH v3 0/2] x86: RTC handling adjustments Jan Beulich
                   ` (2 preceding siblings ...)
  2023-03-20  8:32 ` [PATCH v4] " Jan Beulich
@ 2023-03-30 10:40 ` Jan Beulich
  2023-04-03 11:09   ` Roger Pau Monné
  2023-04-18  9:24 ` [PATCH v6] " Jan Beulich
  4 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-03-30 10:40 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, Roger Pau Monné

... in order to also intercept Dom0 accesses through the alias ports.

Also stop intercepting accesses to the CMOS ports if we won't ourselves
use the CMOS RTC, because of there being none.

Note that rtc_init() deliberately uses 16 as the upper loop bound,
despite probe_cmos_alias() using 8: The higher bound is benign now, but
would save us touching the code (or, worse, missing to touch it) in case
the lower one was doubled.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v5: Simplify logic in is_cmos_port(). Limit the scope of a local
    variable. Adjust a comment that's being moved.
v4: Also conditionally mask top bit for guest index port accesses. Add
    missing adjustments to rtc_init(). Re-work to avoid recursive
    read_lock(). Also adjust guest_io_{read,write}(). Re-base.
v3: Re-base over change to earlier patch.
v2: Re-base.

--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -27,7 +27,7 @@
 #include <asm/hvm/vpt.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/save.h>
-#include <asm/current.h>
+#include <asm/iocap.h>
 #include <xen/trace.h>
 #include <public/hvm/params.h>
 
@@ -836,10 +836,18 @@ void rtc_init(struct domain *d)
 
     if ( !has_vrtc(d) )
     {
-        if ( is_hardware_domain(d) )
-            /* Hardware domain gets mediated access to the physical RTC. */
-            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
-        return;
+        unsigned int port;
+
+        if ( !is_hardware_domain(d) )
+            return;
+
+        /*
+         * Hardware domain gets mediated access to the physical RTC/CMOS (of
+         * course unless we don't use it ourselves, for there being none).
+         */
+        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
+            if ( is_cmos_port(port, 2, d) )
+                register_portio_handler(d, port, 2, hw_rtc_io);
     }
 
     spin_lock_init(&s->lock);
--- a/xen/arch/x86/include/asm/mc146818rtc.h
+++ b/xen/arch/x86/include/asm/mc146818rtc.h
@@ -9,6 +9,10 @@
 
 extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
 
+struct domain;
+bool is_cmos_port(unsigned int port, unsigned int bytes,
+                  const struct domain *d);
+
 /**********************************************************************
  * register summary
  **********************************************************************/
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -220,7 +220,7 @@ static bool admin_io_okay(unsigned int p
         return false;
 
     /* We also never permit direct access to the RTC/CMOS registers. */
-    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
+    if ( is_cmos_port(port, bytes, d) )
         return false;
 
     return ioports_access_permitted(d, port, port + bytes - 1);
@@ -290,7 +290,7 @@ static uint32_t guest_io_read(unsigned i
         {
             sub_data = pv_pit_handler(port, 0, 0);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             sub_data = rtc_guest_read(port);
         }
@@ -436,7 +436,7 @@ static void guest_io_write(unsigned int
         {
             pv_pit_handler(port, (uint8_t)data, 1);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             rtc_guest_write(port, data);
         }
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -2131,37 +2131,36 @@ int __hwdom_init xen_in_range(unsigned l
 static int __hwdom_init cf_check io_bitmap_cb(
     unsigned long s, unsigned long e, void *ctx)
 {
-    struct domain *d = ctx;
+    const struct domain *d = ctx;
     unsigned int i;
 
     ASSERT(e <= INT_MAX);
     for ( i = s; i <= e; i++ )
-        __clear_bit(i, d->arch.hvm.io_bitmap);
+        /*
+         * Accesses to RTC ports also need to be trapped in order to keep
+         * consistency with hypervisor accesses.
+         */
+        if ( !is_cmos_port(i, 1, d) )
+            __clear_bit(i, d->arch.hvm.io_bitmap);
 
     return 0;
 }
 
 void __hwdom_init setup_io_bitmap(struct domain *d)
 {
-    int rc;
+    if ( !is_hvm_domain(d) )
+        return;
 
-    if ( is_hvm_domain(d) )
-    {
-        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
-        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
-                                    io_bitmap_cb, d);
-        BUG_ON(rc);
-        /*
-         * NB: we need to trap accesses to 0xcf8 in order to intercept
-         * 4 byte accesses, that need to be handled by Xen in order to
-         * keep consistency.
-         * Access to 1 byte RTC ports also needs to be trapped in order
-         * to keep consistency with PV.
-         */
-        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
-    }
+    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
+    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
+                                io_bitmap_cb, d) )
+        BUG();
+
+    /*
+     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
+     * guest_io_read(), and guest_io_write()).
+     */
+    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
 }
 
 /*
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1234,7 +1234,10 @@ static unsigned long get_cmos_time(void)
         if ( seconds < 60 )
         {
             if ( rtc.sec != seconds )
+            {
                 cmos_rtc_probe = false;
+                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
+            }
             break;
         }
 
@@ -1249,6 +1252,77 @@ static unsigned long get_cmos_time(void)
     return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
 }
 
+static unsigned int __ro_after_init cmos_alias_mask;
+
+static int __init cf_check probe_cmos_alias(void)
+{
+    unsigned int offs;
+
+    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
+        return 0;
+
+    for ( offs = 2; offs < 8; offs <<= 1 )
+    {
+        unsigned int i;
+        bool read = true;
+
+        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
+        {
+            uint8_t normal, alt;
+            unsigned long flags;
+
+            if ( i == acpi_gbl_FADT.century )
+                continue;
+
+            spin_lock_irqsave(&rtc_lock, flags);
+
+            normal = CMOS_READ(i);
+            if ( inb(RTC_PORT(offs)) != i )
+                read = false;
+
+            alt = inb(RTC_PORT(offs + 1));
+
+            spin_unlock_irqrestore(&rtc_lock, flags);
+
+            if ( normal != alt )
+                break;
+
+            process_pending_softirqs();
+        }
+        if ( i == 0x80 )
+        {
+            cmos_alias_mask |= offs;
+            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
+                   RTC_PORT(offs), read ? "r/w" : "w/o");
+        }
+    }
+
+    return 0;
+}
+__initcall(probe_cmos_alias);
+
+bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
+{
+    unsigned int offs;
+
+    if ( !is_hardware_domain(d) ||
+         !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) )
+        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
+
+    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
+        return false;
+
+    for ( offs = 2; offs <= cmos_alias_mask; offs <<= 1 )
+    {
+        if ( !(offs & cmos_alias_mask) )
+            continue;
+        if ( port <= RTC_PORT(offs | 1) && port + bytes > RTC_PORT(offs) )
+            return true;
+    }
+
+    return false;
+}
+
 /* Helpers for guest accesses to the physical RTC. */
 unsigned int rtc_guest_read(unsigned int port)
 {
@@ -1256,23 +1330,25 @@ unsigned int rtc_guest_read(unsigned int
     unsigned long flags;
     unsigned int data = ~0;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
     case RTC_PORT(0):
         /*
          * All PV domains (and PVH dom0) are allowed to read the latched value
          * of the first RTC port, as there's no access to the physical IO
-         * ports.
+         * ports.  Note that we return the index value regardless of whether
+         * underlying hardware would permit doing so.
          */
-        data = currd->arch.cmos_idx;
+        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        data = inb(RTC_PORT(1));
+        outb(currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1))),
+             port - 1);
+        data = inb(port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 
@@ -1288,9 +1364,10 @@ void rtc_guest_write(unsigned int port,
     struct domain *currd = current->domain;
     unsigned long flags;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
         typeof(pv_rtc_handler) hook;
+        unsigned int idx;
 
     case RTC_PORT(0):
         /*
@@ -1298,20 +1375,22 @@ void rtc_guest_write(unsigned int port,
          * value of the first RTC port, as there's no access to the physical IO
          * ports.
          */
-        currd->arch.cmos_idx = data;
+        currd->arch.cmos_idx = data & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
 
+        idx = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1)));
+
         hook = ACCESS_ONCE(pv_rtc_handler);
         if ( hook )
-            hook(currd->arch.cmos_idx & 0x7f, data);
+            hook(idx, data);
 
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        outb(data, RTC_PORT(1));
+        outb(idx, port - 1);
+        outb(data, port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v5] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-03-30 10:40 ` [PATCH v5] " Jan Beulich
@ 2023-04-03 11:09   ` Roger Pau Monné
  2023-04-03 11:26     ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2023-04-03 11:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Thu, Mar 30, 2023 at 12:40:38PM +0200, Jan Beulich wrote:
> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> use the CMOS RTC, because of there being none.

So it's fine for dom0 to switch off NMIs if Xen isn't using the RTC?
Seems like a weird side-effect of Xen not using the RTC (seeing as we
would otherwise mask bit 8 from dom0 RTC accesses).

Also I'm worried that when Xen doesn't intercept RTC ports accesses
from dom0 could be interrupted for example by the vCPU being scheduled
out, so a vCPU might perform a write to the index port, and be
scheduled out, leaving the RTC in an undefined state.

I've read claims online that the RTC is not reset by the firmware, and
since it has a battery the state is kept across reboots, so
interrupting an access like that cold leave the RTC in a broken state
across reboots.

> Note that rtc_init() deliberately uses 16 as the upper loop bound,
> despite probe_cmos_alias() using 8: The higher bound is benign now, but
> would save us touching the code (or, worse, missing to touch it) in case
> the lower one was doubled.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v5: Simplify logic in is_cmos_port(). Limit the scope of a local
>     variable. Adjust a comment that's being moved.
> v4: Also conditionally mask top bit for guest index port accesses. Add
>     missing adjustments to rtc_init(). Re-work to avoid recursive
>     read_lock(). Also adjust guest_io_{read,write}(). Re-base.
> v3: Re-base over change to earlier patch.
> v2: Re-base.
> 
> --- a/xen/arch/x86/hvm/rtc.c
> +++ b/xen/arch/x86/hvm/rtc.c
> @@ -27,7 +27,7 @@
>  #include <asm/hvm/vpt.h>
>  #include <asm/hvm/io.h>
>  #include <asm/hvm/save.h>
> -#include <asm/current.h>
> +#include <asm/iocap.h>
>  #include <xen/trace.h>
>  #include <public/hvm/params.h>
>  
> @@ -836,10 +836,18 @@ void rtc_init(struct domain *d)
>  
>      if ( !has_vrtc(d) )
>      {
> -        if ( is_hardware_domain(d) )
> -            /* Hardware domain gets mediated access to the physical RTC. */
> -            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
> -        return;
> +        unsigned int port;
> +
> +        if ( !is_hardware_domain(d) )
> +            return;
> +
> +        /*
> +         * Hardware domain gets mediated access to the physical RTC/CMOS (of
> +         * course unless we don't use it ourselves, for there being none).
> +         */
> +        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
> +            if ( is_cmos_port(port, 2, d) )
> +                register_portio_handler(d, port, 2, hw_rtc_io);

You seem to have dropped a return from here, as for PVH dom0 the
initialization below shouldn't be done.

>      }
>  
>      spin_lock_init(&s->lock);
> --- a/xen/arch/x86/include/asm/mc146818rtc.h
> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
> @@ -9,6 +9,10 @@
>  
>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
>  
> +struct domain;
> +bool is_cmos_port(unsigned int port, unsigned int bytes,
> +                  const struct domain *d);
> +
>  /**********************************************************************
>   * register summary
>   **********************************************************************/
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -220,7 +220,7 @@ static bool admin_io_okay(unsigned int p
>          return false;
>  
>      /* We also never permit direct access to the RTC/CMOS registers. */
> -    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
> +    if ( is_cmos_port(port, bytes, d) )
>          return false;
>  
>      return ioports_access_permitted(d, port, port + bytes - 1);
> @@ -290,7 +290,7 @@ static uint32_t guest_io_read(unsigned i
>          {
>              sub_data = pv_pit_handler(port, 0, 0);
>          }
> -        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
> +        else if ( is_cmos_port(port, 1, currd) )
>          {
>              sub_data = rtc_guest_read(port);
>          }
> @@ -436,7 +436,7 @@ static void guest_io_write(unsigned int
>          {
>              pv_pit_handler(port, (uint8_t)data, 1);
>          }
> -        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
> +        else if ( is_cmos_port(port, 1, currd) )
>          {
>              rtc_guest_write(port, data);
>          }
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -2131,37 +2131,36 @@ int __hwdom_init xen_in_range(unsigned l
>  static int __hwdom_init cf_check io_bitmap_cb(
>      unsigned long s, unsigned long e, void *ctx)
>  {
> -    struct domain *d = ctx;
> +    const struct domain *d = ctx;
>      unsigned int i;
>  
>      ASSERT(e <= INT_MAX);
>      for ( i = s; i <= e; i++ )
> -        __clear_bit(i, d->arch.hvm.io_bitmap);
> +        /*
> +         * Accesses to RTC ports also need to be trapped in order to keep
> +         * consistency with hypervisor accesses.
> +         */
> +        if ( !is_cmos_port(i, 1, d) )
> +            __clear_bit(i, d->arch.hvm.io_bitmap);
>  
>      return 0;
>  }
>  
>  void __hwdom_init setup_io_bitmap(struct domain *d)
>  {
> -    int rc;
> +    if ( !is_hvm_domain(d) )
> +        return;
>  
> -    if ( is_hvm_domain(d) )
> -    {
> -        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
> -        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
> -                                    io_bitmap_cb, d);
> -        BUG_ON(rc);
> -        /*
> -         * NB: we need to trap accesses to 0xcf8 in order to intercept
> -         * 4 byte accesses, that need to be handled by Xen in order to
> -         * keep consistency.
> -         * Access to 1 byte RTC ports also needs to be trapped in order
> -         * to keep consistency with PV.
> -         */
> -        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
> -        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
> -        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
> -    }
> +    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
> +    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
> +                                io_bitmap_cb, d) )
> +        BUG();
> +
> +    /*
> +     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
> +     * guest_io_read(), and guest_io_write()).
> +     */
> +    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
>  }
>  
>  /*
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1234,7 +1234,10 @@ static unsigned long get_cmos_time(void)
>          if ( seconds < 60 )
>          {
>              if ( rtc.sec != seconds )
> +            {
>                  cmos_rtc_probe = false;
> +                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
> +            }
>              break;
>          }
>  
> @@ -1249,6 +1252,77 @@ static unsigned long get_cmos_time(void)
>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>  }
>  
> +static unsigned int __ro_after_init cmos_alias_mask;
> +
> +static int __init cf_check probe_cmos_alias(void)
> +{
> +    unsigned int offs;
> +
> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
> +        return 0;
> +
> +    for ( offs = 2; offs < 8; offs <<= 1 )
> +    {
> +        unsigned int i;
> +        bool read = true;
> +
> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
> +        {
> +            uint8_t normal, alt;
> +            unsigned long flags;
> +
> +            if ( i == acpi_gbl_FADT.century )
> +                continue;
> +
> +            spin_lock_irqsave(&rtc_lock, flags);
> +
> +            normal = CMOS_READ(i);
> +            if ( inb(RTC_PORT(offs)) != i )
> +                read = false;
> +
> +            alt = inb(RTC_PORT(offs + 1));
> +
> +            spin_unlock_irqrestore(&rtc_lock, flags);
> +
> +            if ( normal != alt )
> +                break;
> +
> +            process_pending_softirqs();
> +        }
> +        if ( i == 0x80 )
> +        {
> +            cmos_alias_mask |= offs;
> +            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
> +                   RTC_PORT(offs), read ? "r/w" : "w/o");

I would consider making this a DEBUG message, not sure it's that
useful for a normal end user, and printing to the console can be slow.

> +        }
> +    }
> +
> +    return 0;
> +}
> +__initcall(probe_cmos_alias);
> +
> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
> +{
> +    unsigned int offs;
> +
> +    if ( !is_hardware_domain(d) ||
> +         !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) )
> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
> +
> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
> +        return false;
> +
> +    for ( offs = 2; offs <= cmos_alias_mask; offs <<= 1 )
> +    {
> +        if ( !(offs & cmos_alias_mask) )
> +            continue;
> +        if ( port <= RTC_PORT(offs | 1) && port + bytes > RTC_PORT(offs) )
> +            return true;
> +    }

Maybe I'm confused, but doesn't this loop start at RTC_PORT(2), and
hence you need to check for the RTC_PORT(0,1) pair outside of the
loop?

> +
> +    return false;
> +}
> +
>  /* Helpers for guest accesses to the physical RTC. */
>  unsigned int rtc_guest_read(unsigned int port)
>  {
> @@ -1256,23 +1330,25 @@ unsigned int rtc_guest_read(unsigned int
>      unsigned long flags;
>      unsigned int data = ~0;
>  
> -    switch ( port )
> +    switch ( port & ~cmos_alias_mask )

Given that the call is gated with is_cmos_port() it would be clearer
to just use RTC_PORT(1) as the mask here IMO.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v5] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-03 11:09   ` Roger Pau Monné
@ 2023-04-03 11:26     ` Jan Beulich
  2023-04-03 11:44       ` Roger Pau Monné
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-04-03 11:26 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 03.04.2023 13:09, Roger Pau Monné wrote:
> On Thu, Mar 30, 2023 at 12:40:38PM +0200, Jan Beulich wrote:
>> ... in order to also intercept Dom0 accesses through the alias ports.
>>
>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>> use the CMOS RTC, because of there being none.
> 
> So it's fine for dom0 to switch off NMIs if Xen isn't using the RTC?
> Seems like a weird side-effect of Xen not using the RTC (seeing as we
> would otherwise mask bit 8 from dom0 RTC accesses).

I haven't been able to find documentation on this single bit in the
absence of RTC / CMOS.

> Also I'm worried that when Xen doesn't intercept RTC ports accesses
> from dom0 could be interrupted for example by the vCPU being scheduled
> out, so a vCPU might perform a write to the index port, and be
> scheduled out, leaving the RTC in an undefined state.

I did specifically add "because of there being none" to the sentence
to clarify in which case we avoid intercepting.

> I've read claims online that the RTC is not reset by the firmware, and
> since it has a battery the state is kept across reboots, so
> interrupting an access like that cold leave the RTC in a broken state
> across reboots.

I can easily imagine such firmware exists.

>> @@ -836,10 +836,18 @@ void rtc_init(struct domain *d)
>>  
>>      if ( !has_vrtc(d) )
>>      {
>> -        if ( is_hardware_domain(d) )
>> -            /* Hardware domain gets mediated access to the physical RTC. */
>> -            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
>> -        return;
>> +        unsigned int port;
>> +
>> +        if ( !is_hardware_domain(d) )
>> +            return;
>> +
>> +        /*
>> +         * Hardware domain gets mediated access to the physical RTC/CMOS (of
>> +         * course unless we don't use it ourselves, for there being none).
>> +         */
>> +        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
>> +            if ( is_cmos_port(port, 2, d) )
>> +                register_portio_handler(d, port, 2, hw_rtc_io);
> 
> You seem to have dropped a return from here, as for PVH dom0 the
> initialization below shouldn't be done.

Oh, indeed, thanks for spotting. (The excess init is benign afaict, but
I still shouldn't have dropped that "return".)

>> @@ -1249,6 +1252,77 @@ static unsigned long get_cmos_time(void)
>>      return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>>  }
>>  
>> +static unsigned int __ro_after_init cmos_alias_mask;
>> +
>> +static int __init cf_check probe_cmos_alias(void)
>> +{
>> +    unsigned int offs;
>> +
>> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
>> +        return 0;
>> +
>> +    for ( offs = 2; offs < 8; offs <<= 1 )
>> +    {
>> +        unsigned int i;
>> +        bool read = true;
>> +
>> +        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
>> +        {
>> +            uint8_t normal, alt;
>> +            unsigned long flags;
>> +
>> +            if ( i == acpi_gbl_FADT.century )
>> +                continue;
>> +
>> +            spin_lock_irqsave(&rtc_lock, flags);
>> +
>> +            normal = CMOS_READ(i);
>> +            if ( inb(RTC_PORT(offs)) != i )
>> +                read = false;
>> +
>> +            alt = inb(RTC_PORT(offs + 1));
>> +
>> +            spin_unlock_irqrestore(&rtc_lock, flags);
>> +
>> +            if ( normal != alt )
>> +                break;
>> +
>> +            process_pending_softirqs();
>> +        }
>> +        if ( i == 0x80 )
>> +        {
>> +            cmos_alias_mask |= offs;
>> +            printk(XENLOG_INFO "CMOS aliased at %02x, index %s\n",
>> +                   RTC_PORT(offs), read ? "r/w" : "w/o");
> 
> I would consider making this a DEBUG message, not sure it's that
> useful for a normal end user, and printing to the console can be slow.

Can do, sure.

>> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
>> +{
>> +    unsigned int offs;
>> +
>> +    if ( !is_hardware_domain(d) ||
>> +         !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) )
>> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
>> +
>> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
>> +        return false;
>> +
>> +    for ( offs = 2; offs <= cmos_alias_mask; offs <<= 1 )
>> +    {
>> +        if ( !(offs & cmos_alias_mask) )
>> +            continue;
>> +        if ( port <= RTC_PORT(offs | 1) && port + bytes > RTC_PORT(offs) )
>> +            return true;
>> +    }
> 
> Maybe I'm confused, but doesn't this loop start at RTC_PORT(2), and
> hence you need to check for the RTC_PORT(0,1) pair outside of the
> loop?

The loop starts at offset 2, yes, but see the initial if() in the
function. Or at least I thought I got that right, but it looks like
I didn't (failed attempt to try to address a specific request of
yours, iirc).

>> @@ -1256,23 +1330,25 @@ unsigned int rtc_guest_read(unsigned int
>>      unsigned long flags;
>>      unsigned int data = ~0;
>>  
>> -    switch ( port )
>> +    switch ( port & ~cmos_alias_mask )
> 
> Given that the call is gated with is_cmos_port() it would be clearer
> to just use RTC_PORT(1) as the mask here IMO.

Hmm, personally I wouldn't consider RTC_PORT(1) to be reasonable to
use as a mask (even if technically it would be okay).

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v5] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-03 11:26     ` Jan Beulich
@ 2023-04-03 11:44       ` Roger Pau Monné
  2023-04-03 12:24         ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2023-04-03 11:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Mon, Apr 03, 2023 at 01:26:41PM +0200, Jan Beulich wrote:
> On 03.04.2023 13:09, Roger Pau Monné wrote:
> > On Thu, Mar 30, 2023 at 12:40:38PM +0200, Jan Beulich wrote:
> >> ... in order to also intercept Dom0 accesses through the alias ports.
> >>
> >> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> >> use the CMOS RTC, because of there being none.
> > 
> > So it's fine for dom0 to switch off NMIs if Xen isn't using the RTC?
> > Seems like a weird side-effect of Xen not using the RTC (seeing as we
> > would otherwise mask bit 8 from dom0 RTC accesses).
> 
> I haven't been able to find documentation on this single bit in the
> absence of RTC / CMOS.
> 
> > Also I'm worried that when Xen doesn't intercept RTC ports accesses
> > from dom0 could be interrupted for example by the vCPU being scheduled
> > out, so a vCPU might perform a write to the index port, and be
> > scheduled out, leaving the RTC in an undefined state.
> 
> I did specifically add "because of there being none" to the sentence
> to clarify in which case we avoid intercepting.

Oh, right, sorry for the fuzz, I didn't parse that last bit of the
sentence.  I'm fine with the current wording then.

> >> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
> >> +{
> >> +    unsigned int offs;
> >> +
> >> +    if ( !is_hardware_domain(d) ||
> >> +         !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) )
> >> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
> >> +
> >> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
> >> +        return false;
> >> +
> >> +    for ( offs = 2; offs <= cmos_alias_mask; offs <<= 1 )
> >> +    {
> >> +        if ( !(offs & cmos_alias_mask) )
> >> +            continue;
> >> +        if ( port <= RTC_PORT(offs | 1) && port + bytes > RTC_PORT(offs) )
> >> +            return true;
> >> +    }
> > 
> > Maybe I'm confused, but doesn't this loop start at RTC_PORT(2), and
> > hence you need to check for the RTC_PORT(0,1) pair outside of the
> > loop?
> 
> The loop starts at offset 2, yes, but see the initial if() in the
> function. Or at least I thought I got that right, but it looks like
> I didn't (failed attempt to try to address a specific request of
> yours, iirc).

Hm, doesn't that first if() cause that on all systems with an RTC only
PORTS(0,1) are allowed?

> >> @@ -1256,23 +1330,25 @@ unsigned int rtc_guest_read(unsigned int
> >>      unsigned long flags;
> >>      unsigned int data = ~0;
> >>  
> >> -    switch ( port )
> >> +    switch ( port & ~cmos_alias_mask )
> > 
> > Given that the call is gated with is_cmos_port() it would be clearer
> > to just use RTC_PORT(1) as the mask here IMO.
> 
> Hmm, personally I wouldn't consider RTC_PORT(1) to be reasonable to
> use as a mask (even if technically it would be okay).

OK, never mind then.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v5] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-03 11:44       ` Roger Pau Monné
@ 2023-04-03 12:24         ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2023-04-03 12:24 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 03.04.2023 13:44, Roger Pau Monné wrote:
> On Mon, Apr 03, 2023 at 01:26:41PM +0200, Jan Beulich wrote:
>> On 03.04.2023 13:09, Roger Pau Monné wrote:
>>> On Thu, Mar 30, 2023 at 12:40:38PM +0200, Jan Beulich wrote:
>>>> +bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
>>>> +{
>>>> +    unsigned int offs;
>>>> +
>>>> +    if ( !is_hardware_domain(d) ||
>>>> +         !(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) )
>>>> +        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
>>>> +
>>>> +    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
>>>> +        return false;
>>>> +
>>>> +    for ( offs = 2; offs <= cmos_alias_mask; offs <<= 1 )
>>>> +    {
>>>> +        if ( !(offs & cmos_alias_mask) )
>>>> +            continue;
>>>> +        if ( port <= RTC_PORT(offs | 1) && port + bytes > RTC_PORT(offs) )
>>>> +            return true;
>>>> +    }
>>>
>>> Maybe I'm confused, but doesn't this loop start at RTC_PORT(2), and
>>> hence you need to check for the RTC_PORT(0,1) pair outside of the
>>> loop?
>>
>> The loop starts at offset 2, yes, but see the initial if() in the
>> function. Or at least I thought I got that right, but it looks like
>> I didn't (failed attempt to try to address a specific request of
>> yours, iirc).
> 
> Hm, doesn't that first if() cause that on all systems with an RTC only
> PORTS(0,1) are allowed?

Indeed, hence why I said "failed attempt". Looking at it now I really
don't know what I was thinking when writing it that way.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2020-07-15 11:54 [PATCH v3 0/2] x86: RTC handling adjustments Jan Beulich
                   ` (3 preceding siblings ...)
  2023-03-30 10:40 ` [PATCH v5] " Jan Beulich
@ 2023-04-18  9:24 ` Jan Beulich
  2023-04-18 11:35   ` Roger Pau Monné
  4 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-04-18  9:24 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, Roger Pau Monné

... in order to also intercept Dom0 accesses through the alias ports.

Also stop intercepting accesses to the CMOS ports if we won't ourselves
use the CMOS RTC, because of there being none.

Note that rtc_init() deliberately uses 16 as the upper loop bound,
despite probe_cmos_alias() using 8: The higher bound is benign now, but
would save us touching the code (or, worse, missing to touch it) in case
the lower one was doubled.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v6: Restore lost "return" in rtc_init(). Convert printk() to dprintk()
    in probe_cmos_alias(). Correct is_cmos_port() for hwdom.
v5: Simplify logic in is_cmos_port(). Limit the scope of a local
    variable. Adjust a comment that's being moved.
v4: Also conditionally mask top bit for guest index port accesses. Add
    missing adjustments to rtc_init(). Re-work to avoid recursive
    read_lock(). Also adjust guest_io_{read,write}(). Re-base.
v3: Re-base over change to earlier patch.
v2: Re-base.

--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -27,7 +27,7 @@
 #include <asm/hvm/vpt.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/save.h>
-#include <asm/current.h>
+#include <asm/iocap.h>
 #include <xen/trace.h>
 #include <public/hvm/params.h>
 
@@ -836,9 +836,19 @@ void rtc_init(struct domain *d)
 
     if ( !has_vrtc(d) )
     {
-        if ( is_hardware_domain(d) )
-            /* Hardware domain gets mediated access to the physical RTC. */
-            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
+        unsigned int port;
+
+        if ( !is_hardware_domain(d) )
+            return;
+
+        /*
+         * Hardware domain gets mediated access to the physical RTC/CMOS (of
+         * course unless we don't use it ourselves, for there being none).
+         */
+        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
+            if ( is_cmos_port(port, 2, d) )
+                register_portio_handler(d, port, 2, hw_rtc_io);
+
         return;
     }
 
--- a/xen/arch/x86/include/asm/mc146818rtc.h
+++ b/xen/arch/x86/include/asm/mc146818rtc.h
@@ -9,6 +9,10 @@
 
 extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
 
+struct domain;
+bool is_cmos_port(unsigned int port, unsigned int bytes,
+                  const struct domain *d);
+
 /**********************************************************************
  * register summary
  **********************************************************************/
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -208,7 +208,7 @@ static bool admin_io_okay(unsigned int p
         return false;
 
     /* We also never permit direct access to the RTC/CMOS registers. */
-    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
+    if ( is_cmos_port(port, bytes, d) )
         return false;
 
     return ioports_access_permitted(d, port, port + bytes - 1);
@@ -278,7 +278,7 @@ static uint32_t guest_io_read(unsigned i
         {
             sub_data = pv_pit_handler(port, 0, 0);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             sub_data = rtc_guest_read(port);
         }
@@ -424,7 +424,7 @@ static void guest_io_write(unsigned int
         {
             pv_pit_handler(port, (uint8_t)data, 1);
         }
-        else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+        else if ( is_cmos_port(port, 1, currd) )
         {
             rtc_guest_write(port, data);
         }
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -2130,37 +2130,36 @@ int __hwdom_init xen_in_range(unsigned l
 static int __hwdom_init cf_check io_bitmap_cb(
     unsigned long s, unsigned long e, void *ctx)
 {
-    struct domain *d = ctx;
+    const struct domain *d = ctx;
     unsigned int i;
 
     ASSERT(e <= INT_MAX);
     for ( i = s; i <= e; i++ )
-        __clear_bit(i, d->arch.hvm.io_bitmap);
+        /*
+         * Accesses to RTC ports also need to be trapped in order to keep
+         * consistency with hypervisor accesses.
+         */
+        if ( !is_cmos_port(i, 1, d) )
+            __clear_bit(i, d->arch.hvm.io_bitmap);
 
     return 0;
 }
 
 void __hwdom_init setup_io_bitmap(struct domain *d)
 {
-    int rc;
+    if ( !is_hvm_domain(d) )
+        return;
 
-    if ( is_hvm_domain(d) )
-    {
-        bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
-        rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
-                                    io_bitmap_cb, d);
-        BUG_ON(rc);
-        /*
-         * NB: we need to trap accesses to 0xcf8 in order to intercept
-         * 4 byte accesses, that need to be handled by Xen in order to
-         * keep consistency.
-         * Access to 1 byte RTC ports also needs to be trapped in order
-         * to keep consistency with PV.
-         */
-        __set_bit(0xcf8, d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
-        __set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
-    }
+    bitmap_fill(d->arch.hvm.io_bitmap, 0x10000);
+    if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x10000,
+                                io_bitmap_cb, d) )
+        BUG();
+
+    /*
+     * We need to trap 4-byte accesses to 0xcf8 (see admin_io_okay(),
+     * guest_io_read(), and guest_io_write()).
+     */
+    __set_bit(0xcf8, d->arch.hvm.io_bitmap);
 }
 
 /*
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1234,7 +1234,10 @@ static unsigned long get_cmos_time(void)
         if ( seconds < 60 )
         {
             if ( rtc.sec != seconds )
+            {
                 cmos_rtc_probe = false;
+                acpi_gbl_FADT.boot_flags &= ~ACPI_FADT_NO_CMOS_RTC;
+            }
             break;
         }
 
@@ -1249,6 +1252,79 @@ static unsigned long get_cmos_time(void)
     return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
 }
 
+static unsigned int __ro_after_init cmos_alias_mask;
+
+static int __init cf_check probe_cmos_alias(void)
+{
+    unsigned int offs;
+
+    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
+        return 0;
+
+    for ( offs = 2; offs < 8; offs <<= 1 )
+    {
+        unsigned int i;
+        bool read = true;
+
+        for ( i = RTC_REG_D + 1; i < 0x80; ++i )
+        {
+            uint8_t normal, alt;
+            unsigned long flags;
+
+            if ( i == acpi_gbl_FADT.century )
+                continue;
+
+            spin_lock_irqsave(&rtc_lock, flags);
+
+            normal = CMOS_READ(i);
+            if ( inb(RTC_PORT(offs)) != i )
+                read = false;
+
+            alt = inb(RTC_PORT(offs + 1));
+
+            spin_unlock_irqrestore(&rtc_lock, flags);
+
+            if ( normal != alt )
+                break;
+
+            process_pending_softirqs();
+        }
+        if ( i == 0x80 )
+        {
+            cmos_alias_mask |= offs;
+            dprintk(XENLOG_INFO, "CMOS aliased at %02x, index %s\n",
+                    RTC_PORT(offs), read ? "r/w" : "w/o");
+        }
+    }
+
+    return 0;
+}
+__initcall(probe_cmos_alias);
+
+bool is_cmos_port(unsigned int port, unsigned int bytes, const struct domain *d)
+{
+    unsigned int offs;
+
+    if ( !is_hardware_domain(d) )
+        return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
+
+    if ( acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC )
+        return false;
+
+    if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
+        return true;
+
+    for ( offs = 2; offs <= cmos_alias_mask; offs <<= 1 )
+    {
+        if ( !(offs & cmos_alias_mask) )
+            continue;
+        if ( port <= RTC_PORT(offs | 1) && port + bytes > RTC_PORT(offs) )
+            return true;
+    }
+
+    return false;
+}
+
 /* Helpers for guest accesses to the physical RTC. */
 unsigned int rtc_guest_read(unsigned int port)
 {
@@ -1256,23 +1332,25 @@ unsigned int rtc_guest_read(unsigned int
     unsigned long flags;
     unsigned int data = ~0;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
     case RTC_PORT(0):
         /*
          * All PV domains (and PVH dom0) are allowed to read the latched value
          * of the first RTC port, as there's no access to the physical IO
-         * ports.
+         * ports.  Note that we return the index value regardless of whether
+         * underlying hardware would permit doing so.
          */
-        data = currd->arch.cmos_idx;
+        data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        data = inb(RTC_PORT(1));
+        outb(currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1))),
+             port - 1);
+        data = inb(port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 
@@ -1288,9 +1366,10 @@ void rtc_guest_write(unsigned int port,
     struct domain *currd = current->domain;
     unsigned long flags;
 
-    switch ( port )
+    switch ( port & ~cmos_alias_mask )
     {
         typeof(pv_rtc_handler) hook;
+        unsigned int idx;
 
     case RTC_PORT(0):
         /*
@@ -1298,20 +1377,22 @@ void rtc_guest_write(unsigned int port,
          * value of the first RTC port, as there's no access to the physical IO
          * ports.
          */
-        currd->arch.cmos_idx = data;
+        currd->arch.cmos_idx = data & (0xff >> (port == RTC_PORT(0)));
         break;
 
     case RTC_PORT(1):
-        if ( !ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        if ( !ioports_access_permitted(currd, port - 1, port) )
             break;
 
+        idx = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(1)));
+
         hook = ACCESS_ONCE(pv_rtc_handler);
         if ( hook )
-            hook(currd->arch.cmos_idx & 0x7f, data);
+            hook(idx, data);
 
         spin_lock_irqsave(&rtc_lock, flags);
-        outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-        outb(data, RTC_PORT(1));
+        outb(idx, port - 1);
+        outb(data, port);
         spin_unlock_irqrestore(&rtc_lock, flags);
         break;
 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-18  9:24 ` [PATCH v6] " Jan Beulich
@ 2023-04-18 11:35   ` Roger Pau Monné
  2023-04-19  7:56     ` Jan Beulich
  2023-04-19 13:58     ` Jan Beulich
  0 siblings, 2 replies; 42+ messages in thread
From: Roger Pau Monné @ 2023-04-18 11:35 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Tue, Apr 18, 2023 at 11:24:19AM +0200, Jan Beulich wrote:
> ... in order to also intercept Dom0 accesses through the alias ports.
> 
> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> use the CMOS RTC, because of there being none.
> 
> Note that rtc_init() deliberately uses 16 as the upper loop bound,
> despite probe_cmos_alias() using 8: The higher bound is benign now, but
> would save us touching the code (or, worse, missing to touch it) in case
> the lower one was doubled.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> v6: Restore lost "return" in rtc_init(). Convert printk() to dprintk()
>     in probe_cmos_alias(). Correct is_cmos_port() for hwdom.
> v5: Simplify logic in is_cmos_port(). Limit the scope of a local
>     variable. Adjust a comment that's being moved.
> v4: Also conditionally mask top bit for guest index port accesses. Add
>     missing adjustments to rtc_init(). Re-work to avoid recursive
>     read_lock(). Also adjust guest_io_{read,write}(). Re-base.
> v3: Re-base over change to earlier patch.
> v2: Re-base.
> 
> --- a/xen/arch/x86/hvm/rtc.c
> +++ b/xen/arch/x86/hvm/rtc.c
> @@ -27,7 +27,7 @@
>  #include <asm/hvm/vpt.h>
>  #include <asm/hvm/io.h>
>  #include <asm/hvm/save.h>
> -#include <asm/current.h>
> +#include <asm/iocap.h>
>  #include <xen/trace.h>
>  #include <public/hvm/params.h>
>  
> @@ -836,9 +836,19 @@ void rtc_init(struct domain *d)
>  
>      if ( !has_vrtc(d) )
>      {
> -        if ( is_hardware_domain(d) )
> -            /* Hardware domain gets mediated access to the physical RTC. */
> -            register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
> +        unsigned int port;
> +
> +        if ( !is_hardware_domain(d) )
> +            return;
> +
> +        /*
> +         * Hardware domain gets mediated access to the physical RTC/CMOS (of
> +         * course unless we don't use it ourselves, for there being none).
> +         */
> +        for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
> +            if ( is_cmos_port(port, 2, d) )
> +                register_portio_handler(d, port, 2, hw_rtc_io);
> +
>          return;
>      }
>  
> --- a/xen/arch/x86/include/asm/mc146818rtc.h
> +++ b/xen/arch/x86/include/asm/mc146818rtc.h
> @@ -9,6 +9,10 @@
>  
>  extern spinlock_t rtc_lock;             /* serialize CMOS RAM access */
>  
> +struct domain;
> +bool is_cmos_port(unsigned int port, unsigned int bytes,
> +                  const struct domain *d);
> +
>  /**********************************************************************
>   * register summary
>   **********************************************************************/
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -208,7 +208,7 @@ static bool admin_io_okay(unsigned int p
>          return false;
>  
>      /* We also never permit direct access to the RTC/CMOS registers. */

Hm, it's unclear to me whether the comment above would need updating:
we don't allow direct access to the RTC/CMOS registers, but we allow
direct access to the RTC/CMOS ports if there's no device behind.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-18 11:35   ` Roger Pau Monné
@ 2023-04-19  7:56     ` Jan Beulich
  2023-04-19 10:45       ` Roger Pau Monné
  2023-04-19 13:58     ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-04-19  7:56 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 18.04.2023 13:35, Roger Pau Monné wrote:
> On Tue, Apr 18, 2023 at 11:24:19AM +0200, Jan Beulich wrote:
>> ... in order to also intercept Dom0 accesses through the alias ports.
>>
>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>> use the CMOS RTC, because of there being none.
>>
>> Note that rtc_init() deliberately uses 16 as the upper loop bound,
>> despite probe_cmos_alias() using 8: The higher bound is benign now, but
>> would save us touching the code (or, worse, missing to touch it) in case
>> the lower one was doubled.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> --- a/xen/arch/x86/pv/emul-priv-op.c
>> +++ b/xen/arch/x86/pv/emul-priv-op.c
>> @@ -208,7 +208,7 @@ static bool admin_io_okay(unsigned int p
>>          return false;
>>  
>>      /* We also never permit direct access to the RTC/CMOS registers. */
> 
> Hm, it's unclear to me whether the comment above would need updating:
> we don't allow direct access to the RTC/CMOS registers, but we allow
> direct access to the RTC/CMOS ports if there's no device behind.

Right, but those ports then don't allow access to said registers. So
I think the comment is fine as is.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-19  7:56     ` Jan Beulich
@ 2023-04-19 10:45       ` Roger Pau Monné
  0 siblings, 0 replies; 42+ messages in thread
From: Roger Pau Monné @ 2023-04-19 10:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Wed, Apr 19, 2023 at 09:56:35AM +0200, Jan Beulich wrote:
> On 18.04.2023 13:35, Roger Pau Monné wrote:
> > On Tue, Apr 18, 2023 at 11:24:19AM +0200, Jan Beulich wrote:
> >> ... in order to also intercept Dom0 accesses through the alias ports.
> >>
> >> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> >> use the CMOS RTC, because of there being none.
> >>
> >> Note that rtc_init() deliberately uses 16 as the upper loop bound,
> >> despite probe_cmos_alias() using 8: The higher bound is benign now, but
> >> would save us touching the code (or, worse, missing to touch it) in case
> >> the lower one was doubled.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks.
> 
> >> --- a/xen/arch/x86/pv/emul-priv-op.c
> >> +++ b/xen/arch/x86/pv/emul-priv-op.c
> >> @@ -208,7 +208,7 @@ static bool admin_io_okay(unsigned int p
> >>          return false;
> >>  
> >>      /* We also never permit direct access to the RTC/CMOS registers. */
> > 
> > Hm, it's unclear to me whether the comment above would need updating:
> > we don't allow direct access to the RTC/CMOS registers, but we allow
> > direct access to the RTC/CMOS ports if there's no device behind.
> 
> Right, but those ports then don't allow access to said registers. So
> I think the comment is fine as is.

Yes, that's why I wasn't really sure whether to comment.  The comment
is formally correct, but it might lead to confusion if one doesn't
carefully read 'RTC/CMOS registers' (vs RTC/CMOS IO ports).

Anyway, sorry for the noise.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-18 11:35   ` Roger Pau Monné
  2023-04-19  7:56     ` Jan Beulich
@ 2023-04-19 13:58     ` Jan Beulich
  2023-04-19 15:55       ` Roger Pau Monné
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-04-19 13:58 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 18.04.2023 13:35, Roger Pau Monné wrote:
> On Tue, Apr 18, 2023 at 11:24:19AM +0200, Jan Beulich wrote:
>> ... in order to also intercept Dom0 accesses through the alias ports.
>>
>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>> use the CMOS RTC, because of there being none.
>>
>> Note that rtc_init() deliberately uses 16 as the upper loop bound,
>> despite probe_cmos_alias() using 8: The higher bound is benign now, but
>> would save us touching the code (or, worse, missing to touch it) in case
>> the lower one was doubled.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Before committing I went back to read through doc and earlier comments,
in particular regarding the NMI disable. As a result I'm now inclined
to follow your earlier request and fold in the change below. Thoughts?

Jan

--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1305,6 +1305,13 @@ bool is_cmos_port(unsigned int port, uns
 {
     unsigned int offs;
 
+    /*
+     * While not really CMOS-related, port 0x70 always needs intercepting
+     * to deal with the NMI disable bit.
+     */
+    if ( port <= RTC_PORT(0) && port + bytes > RTC_PORT(0) )
+        return true;
+
     if ( !is_hardware_domain(d) )
         return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
 
@@ -1342,6 +1349,17 @@ unsigned int rtc_guest_read(unsigned int
          * underlying hardware would permit doing so.
          */
         data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
+
+        /*
+         * When there's (supposedly) no RTC/CMOS, we don't intercept the other
+         * ports. While reading the index register isn't normally possible,
+         * play safe and return back whatever can be read (just in case a value
+         * written through an alias would be attempted to be read back here).
+         */
+        if ( port == RTC_PORT(0) &&
+             (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
+             ioports_access_permitted(currd, port, port) )
+            data = inb(port) & 0x7f;
         break;
 
     case RTC_PORT(1):
@@ -1378,6 +1396,16 @@ void rtc_guest_write(unsigned int port,
          * ports.
          */
         currd->arch.cmos_idx = data & (0xff >> (port == RTC_PORT(0)));
+
+        /*
+         * When there's (supposedly) no RTC/CMOS, we don't intercept the other
+         * ports. Therefore the port write, with the NMI disable bit zapped,
+         * needs carrying out right away.
+         */
+        if ( port == RTC_PORT(0) &&
+             (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
+             ioports_access_permitted(currd, port, port) )
+            outb(data & 0x7f, port);
         break;
 
     case RTC_PORT(1):




^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-19 13:58     ` Jan Beulich
@ 2023-04-19 15:55       ` Roger Pau Monné
  2023-04-20  8:31         ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2023-04-19 15:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Wed, Apr 19, 2023 at 03:58:10PM +0200, Jan Beulich wrote:
> On 18.04.2023 13:35, Roger Pau Monné wrote:
> > On Tue, Apr 18, 2023 at 11:24:19AM +0200, Jan Beulich wrote:
> >> ... in order to also intercept Dom0 accesses through the alias ports.
> >>
> >> Also stop intercepting accesses to the CMOS ports if we won't ourselves
> >> use the CMOS RTC, because of there being none.
> >>
> >> Note that rtc_init() deliberately uses 16 as the upper loop bound,
> >> despite probe_cmos_alias() using 8: The higher bound is benign now, but
> >> would save us touching the code (or, worse, missing to touch it) in case
> >> the lower one was doubled.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Before committing I went back to read through doc and earlier comments,
> in particular regarding the NMI disable. As a result I'm now inclined
> to follow your earlier request and fold in the change below. Thoughts?

It was unclear to me whether port 0x70 also had this NMI disabling
behavior when the RTC/CMOS is not present but it seems that port is
shared between the RTC index and the NMI logic, so lack of RTC doesn't
imply lack of the NMI bit.

> Jan
> 
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1305,6 +1305,13 @@ bool is_cmos_port(unsigned int port, uns
>  {
>      unsigned int offs;
>  
> +    /*
> +     * While not really CMOS-related, port 0x70 always needs intercepting
> +     * to deal with the NMI disable bit.
> +     */
> +    if ( port <= RTC_PORT(0) && port + bytes > RTC_PORT(0) )
> +        return true;

It might make it clearer to move this after the !is_hardware_domain(d)
check, as non-hardware domains don't get access to that port anyway?

> +
>      if ( !is_hardware_domain(d) )
>          return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);
>  
> @@ -1342,6 +1349,17 @@ unsigned int rtc_guest_read(unsigned int
>           * underlying hardware would permit doing so.
>           */
>          data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
> +
> +        /*
> +         * When there's (supposedly) no RTC/CMOS, we don't intercept the other
> +         * ports. While reading the index register isn't normally possible,
> +         * play safe and return back whatever can be read (just in case a value
> +         * written through an alias would be attempted to be read back here).
> +         */
> +        if ( port == RTC_PORT(0) &&
> +             (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
> +             ioports_access_permitted(currd, port, port) )
> +            data = inb(port) & 0x7f;

Do we really need to mask the high bit here?  We don't allow setting
that bit in the first place.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-19 15:55       ` Roger Pau Monné
@ 2023-04-20  8:31         ` Jan Beulich
  2023-04-20 14:31           ` Roger Pau Monné
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2023-04-20  8:31 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 19.04.2023 17:55, Roger Pau Monné wrote:
> On Wed, Apr 19, 2023 at 03:58:10PM +0200, Jan Beulich wrote:
>> On 18.04.2023 13:35, Roger Pau Monné wrote:
>>> On Tue, Apr 18, 2023 at 11:24:19AM +0200, Jan Beulich wrote:
>>>> ... in order to also intercept Dom0 accesses through the alias ports.
>>>>
>>>> Also stop intercepting accesses to the CMOS ports if we won't ourselves
>>>> use the CMOS RTC, because of there being none.
>>>>
>>>> Note that rtc_init() deliberately uses 16 as the upper loop bound,
>>>> despite probe_cmos_alias() using 8: The higher bound is benign now, but
>>>> would save us touching the code (or, worse, missing to touch it) in case
>>>> the lower one was doubled.
>>>>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Before committing I went back to read through doc and earlier comments,
>> in particular regarding the NMI disable. As a result I'm now inclined
>> to follow your earlier request and fold in the change below. Thoughts?
> 
> It was unclear to me whether port 0x70 also had this NMI disabling
> behavior when the RTC/CMOS is not present but it seems that port is
> shared between the RTC index and the NMI logic, so lack of RTC doesn't
> imply lack of the NMI bit.

Right. My earlier oversight was that the datasheet I had pointed you
at actually explicitly mentions the NMI disable bit (really NMI# enable
there, named NMI_EN) in a separate section.

>> --- a/xen/arch/x86/time.c
>> +++ b/xen/arch/x86/time.c
>> @@ -1305,6 +1305,13 @@ bool is_cmos_port(unsigned int port, uns
>>  {
>>      unsigned int offs;
>>  
>> +    /*
>> +     * While not really CMOS-related, port 0x70 always needs intercepting
>> +     * to deal with the NMI disable bit.
>> +     */
>> +    if ( port <= RTC_PORT(0) && port + bytes > RTC_PORT(0) )
>> +        return true;
> 
> It might make it clearer to move this after the !is_hardware_domain(d)
> check, as non-hardware domains don't get access to that port anyway?

I've done this; I had is earlier because when moved ...

>> +
>>      if ( !is_hardware_domain(d) )
>>          return port <= RTC_PORT(1) && port + bytes > RTC_PORT(0);

... below here it becomes yet more odd with the 2nd of following if()s.
But I guess that's still "acceptably odd".

>> @@ -1342,6 +1349,17 @@ unsigned int rtc_guest_read(unsigned int
>>           * underlying hardware would permit doing so.
>>           */
>>          data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
>> +
>> +        /*
>> +         * When there's (supposedly) no RTC/CMOS, we don't intercept the other
>> +         * ports. While reading the index register isn't normally possible,
>> +         * play safe and return back whatever can be read (just in case a value
>> +         * written through an alias would be attempted to be read back here).
>> +         */
>> +        if ( port == RTC_PORT(0) &&
>> +             (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
>> +             ioports_access_permitted(currd, port, port) )
>> +            data = inb(port) & 0x7f;
> 
> Do we really need to mask the high bit here?  We don't allow setting
> that bit in the first place.

I think it's more consistent to mask it off, specifically with the code
visible in context right above the insertion. The doc isn't really clear
about readability of that bit: On one hand in says R/W for port 0x70 in
the NMI_EN section, yet otoh in the RTC section it says "Note that port
70h is not directly readable. The only way to read this register is
through Alt Access mode." (I think the NMI_EN section is more trustworthy,
but still.) Plus if we were to ever make use of the NMI disable, we
wouldn't want Dom0 see the bit set.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-20  8:31         ` Jan Beulich
@ 2023-04-20 14:31           ` Roger Pau Monné
  2023-04-20 14:55             ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Roger Pau Monné @ 2023-04-20 14:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On Thu, Apr 20, 2023 at 10:31:08AM +0200, Jan Beulich wrote:
> On 19.04.2023 17:55, Roger Pau Monné wrote:
> > On Wed, Apr 19, 2023 at 03:58:10PM +0200, Jan Beulich wrote:
> >> @@ -1342,6 +1349,17 @@ unsigned int rtc_guest_read(unsigned int
> >>           * underlying hardware would permit doing so.
> >>           */
> >>          data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
> >> +
> >> +        /*
> >> +         * When there's (supposedly) no RTC/CMOS, we don't intercept the other
> >> +         * ports. While reading the index register isn't normally possible,
> >> +         * play safe and return back whatever can be read (just in case a value
> >> +         * written through an alias would be attempted to be read back here).
> >> +         */
> >> +        if ( port == RTC_PORT(0) &&
> >> +             (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
> >> +             ioports_access_permitted(currd, port, port) )
> >> +            data = inb(port) & 0x7f;
> > 
> > Do we really need to mask the high bit here?  We don't allow setting
> > that bit in the first place.
> 
> I think it's more consistent to mask it off, specifically with the code
> visible in context right above the insertion. The doc isn't really clear
> about readability of that bit: On one hand in says R/W for port 0x70 in
> the NMI_EN section, yet otoh in the RTC section it says "Note that port
> 70h is not directly readable. The only way to read this register is
> through Alt Access mode." (I think the NMI_EN section is more trustworthy,
> but still.) Plus if we were to ever make use of the NMI disable, we
> wouldn't want Dom0 see the bit set.

I guess so, at the end Xen itself doesn't use the bit so far.  Maybe
at some point we would want to expose the value of the bit to dom0 if
Xen starts using it (most than anything for informative purposes if
NMIs are disabled).

Feel free to fold the diff to the existing patch and keep the RB.

I guess you will also add something to the commit message about the
special handling of the NMI enable bit even when the RTC/CMOS is not
present?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v6] x86: detect CMOS aliasing on ports other than 0x70/0x71
  2023-04-20 14:31           ` Roger Pau Monné
@ 2023-04-20 14:55             ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2023-04-20 14:55 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Paul Durrant, Wei Liu

On 20.04.2023 16:31, Roger Pau Monné wrote:
> On Thu, Apr 20, 2023 at 10:31:08AM +0200, Jan Beulich wrote:
>> On 19.04.2023 17:55, Roger Pau Monné wrote:
>>> On Wed, Apr 19, 2023 at 03:58:10PM +0200, Jan Beulich wrote:
>>>> @@ -1342,6 +1349,17 @@ unsigned int rtc_guest_read(unsigned int
>>>>           * underlying hardware would permit doing so.
>>>>           */
>>>>          data = currd->arch.cmos_idx & (0xff >> (port == RTC_PORT(0)));
>>>> +
>>>> +        /*
>>>> +         * When there's (supposedly) no RTC/CMOS, we don't intercept the other
>>>> +         * ports. While reading the index register isn't normally possible,
>>>> +         * play safe and return back whatever can be read (just in case a value
>>>> +         * written through an alias would be attempted to be read back here).
>>>> +         */
>>>> +        if ( port == RTC_PORT(0) &&
>>>> +             (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) &&
>>>> +             ioports_access_permitted(currd, port, port) )
>>>> +            data = inb(port) & 0x7f;
>>>
>>> Do we really need to mask the high bit here?  We don't allow setting
>>> that bit in the first place.
>>
>> I think it's more consistent to mask it off, specifically with the code
>> visible in context right above the insertion. The doc isn't really clear
>> about readability of that bit: On one hand in says R/W for port 0x70 in
>> the NMI_EN section, yet otoh in the RTC section it says "Note that port
>> 70h is not directly readable. The only way to read this register is
>> through Alt Access mode." (I think the NMI_EN section is more trustworthy,
>> but still.) Plus if we were to ever make use of the NMI disable, we
>> wouldn't want Dom0 see the bit set.
> 
> I guess so, at the end Xen itself doesn't use the bit so far.  Maybe
> at some point we would want to expose the value of the bit to dom0 if
> Xen starts using it (most than anything for informative purposes if
> NMIs are disabled).
> 
> Feel free to fold the diff to the existing patch and keep the RB.

Thanks.

> I guess you will also add something to the commit message about the
> special handling of the NMI enable bit even when the RTC/CMOS is not
> present?

Of course, albeit not more than a sentence, as the code comments provide
the details.

Jan


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2023-04-20 14:56 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-15 11:54 [PATCH v3 0/2] x86: RTC handling adjustments Jan Beulich
2020-07-15 11:56 ` [PATCH v3 1/2] x86: restore pv_rtc_handler() invocation Jan Beulich
2020-07-15 12:13   ` Roger Pau Monné
2020-07-15 12:36     ` Jan Beulich
2020-07-15 13:32       ` Roger Pau Monné
2020-07-15 13:51         ` Jan Beulich
2020-07-15 14:51           ` Roger Pau Monné
2020-07-16 10:06             ` Jan Beulich
2020-07-16 10:31               ` Roger Pau Monné
2020-07-16 10:52                 ` Jan Beulich
2020-07-20 15:28               ` Andrew Cooper
2020-07-20 16:27                 ` Jan Beulich
2020-07-21  6:36                   ` Jan Beulich
2020-07-15 12:31   ` Paul Durrant
2020-07-15 11:57 ` [PATCH v3 2/2] x86: detect CMOS aliasing on ports other than 0x70/0x71 Jan Beulich
2020-07-20 11:11   ` Roger Pau Monné
2023-03-17 16:12     ` Roger Pau Monné
2023-03-20  8:32 ` [PATCH v4] " Jan Beulich
2023-03-21 14:12   ` Roger Pau Monné
2023-03-22  9:55     ` Jan Beulich
2023-03-23 12:29       ` Roger Pau Monné
2023-03-23 14:26         ` Jan Beulich
2023-03-27 15:44     ` Jan Beulich
2023-03-23 14:49   ` Roger Pau Monné
2023-03-23 16:08     ` Jan Beulich
2023-03-23 16:40       ` Roger Pau Monné
2023-03-27 15:46         ` Jan Beulich
2023-03-27 15:44     ` Jan Beulich
2023-03-30 10:40 ` [PATCH v5] " Jan Beulich
2023-04-03 11:09   ` Roger Pau Monné
2023-04-03 11:26     ` Jan Beulich
2023-04-03 11:44       ` Roger Pau Monné
2023-04-03 12:24         ` Jan Beulich
2023-04-18  9:24 ` [PATCH v6] " Jan Beulich
2023-04-18 11:35   ` Roger Pau Monné
2023-04-19  7:56     ` Jan Beulich
2023-04-19 10:45       ` Roger Pau Monné
2023-04-19 13:58     ` Jan Beulich
2023-04-19 15:55       ` Roger Pau Monné
2023-04-20  8:31         ` Jan Beulich
2023-04-20 14:31           ` Roger Pau Monné
2023-04-20 14:55             ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.