* [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
@ 2020-05-08 12:43 Jan Beulich
  2020-05-08 12:54 ` Andrew Cooper
  2020-05-08 15:03 ` Roger Pau Monné
  0 siblings, 2 replies; 9+ messages in thread
From: Jan Beulich @ 2020-05-08 12:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, Roger Pau Monné

The op has a register/unregister flag, and hence registration shouldn't
happen unilaterally. Introduce unregister_vpci_mmcfg_handler() to handle
this case.

Fixes: eb3dd90e4089 ("x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0")
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -558,6 +558,47 @@ int register_vpci_mmcfg_handler(struct d
     return 0;
 }
 
+int unregister_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
+                                  unsigned int start_bus, unsigned int end_bus,
+                                  unsigned int seg)
+{
+    struct hvm_mmcfg *mmcfg;
+    int rc = -ENOENT;
+
+    ASSERT(is_hardware_domain(d));
+
+    if ( start_bus > end_bus )
+        return -EINVAL;
+
+    write_lock(&d->arch.hvm.mmcfg_lock);
+
+    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
+        if ( mmcfg->addr == addr + (start_bus << 20) &&
+             mmcfg->segment == seg &&
+             mmcfg->start_bus == start_bus &&
+             mmcfg->size == ((end_bus - start_bus + 1) << 20) )
+        {
+            list_del(&mmcfg->next);
+            if ( !list_empty(&d->arch.hvm.mmcfg_regions) )
+                xfree(mmcfg);
+            else
+            {
+                /*
+                 * Cannot unregister the MMIO handler - leave a fake entry
+                 * on the list.
+                 */
+                memset(mmcfg, 0, sizeof(*mmcfg));
+                list_add(&mmcfg->next, &d->arch.hvm.mmcfg_regions);
+            }
+            rc = 0;
+            break;
+        }
+
+    write_unlock(&d->arch.hvm.mmcfg_lock);
+
+    return rc;
+}
+
 void destroy_vpci_mmcfg(struct domain *d)
 {
     struct list_head *mmcfg_regions = &d->arch.hvm.mmcfg_regions;
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -559,12 +559,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         if ( !ret && has_vpci(currd) )
         {
             /*
-             * For HVM (PVH) domains try to add the newly found MMCFG to the
-             * domain.
+             * For HVM (PVH) domains try to add/remove the reported MMCFG
+             * to/from the domain.
              */
-            ret = register_vpci_mmcfg_handler(currd, info.address,
-                                              info.start_bus, info.end_bus,
-                                              info.segment);
+            if ( info.flags & XEN_PCI_MMCFG_RESERVED )
+                ret = register_vpci_mmcfg_handler(currd, info.address,
+                                                  info.start_bus, info.end_bus,
+                                                  info.segment);
+            else
+                ret = unregister_vpci_mmcfg_handler(currd, info.address,
+                                                    info.start_bus,
+                                                    info.end_bus,
+                                                    info.segment);
         }
 
         break;
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -178,6 +178,9 @@ void register_vpci_portio_handler(struct
 int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
                                 unsigned int start_bus, unsigned int end_bus,
                                 unsigned int seg);
+int unregister_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
+                                  unsigned int start_bus, unsigned int end_bus,
+                                  unsigned int seg);
 /* Destroy tracked MMCFG areas. */
 void destroy_vpci_mmcfg(struct domain *d);
 


^ permalink raw reply	[flat|nested] 9+ messages in thread
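
Editorial note on the match condition in unregister_vpci_mmcfg_handler(): each PCI bus occupies 1 MiB of ECAM space (32 devices x 8 functions x 4 KiB), which is where the "<< 20" shifts come from. The comparison in the patch implies that a tracked region stores the address of its first decoded bus and the byte size of the bus range. The sketch below only reproduces that arithmetic; ecam_window_start(), ecam_window_size() and ECAM_BUS_SHIFT are hypothetical names for illustration, not Xen identifiers.

```c
#include <stdint.h>

/* Each bus covers 32 devices * 8 functions * 4 KiB = 1 MiB of ECAM space. */
#define ECAM_BUS_SHIFT 20

/* Address of the first byte decoded for start_bus, given the ECAM base. */
static uint64_t ecam_window_start(uint64_t base, unsigned int start_bus)
{
    return base + ((uint64_t)start_bus << ECAM_BUS_SHIFT);
}

/* Size in bytes of the window covering buses start_bus..end_bus inclusive. */
static uint64_t ecam_window_size(unsigned int start_bus, unsigned int end_bus)
{
    return ((uint64_t)end_bus - start_bus + 1) << ECAM_BUS_SHIFT;
}
```

For example, a base of 0xe0000000 with buses 0..255 gives a window that starts at the base and spans 256 MiB.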

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-08 12:43 [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region Jan Beulich
@ 2020-05-08 12:54 ` Andrew Cooper
  2020-05-08 13:49   ` Jan Beulich
  2020-05-08 15:03 ` Roger Pau Monné
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Cooper @ 2020-05-08 12:54 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monné

On 08/05/2020 13:43, Jan Beulich wrote:
> The op has a register/unregister flag, and hence registration shouldn't
> happen unilaterally. Introduce unregister_vpci_mmcfg_handler() to handle
> this case.
>
> Fixes: eb3dd90e4089 ("x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I agree in principle that registration shouldn't be unilateral, but why
on earth does the API behave like that to begin with?

There is no provision to move or update MMCFG regions in any spec I'm
aware of, and hardware cannot in practice update memory routing like this.

Under what circumstances should we tolerate an unregister in the first
place?

~Andrew


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-08 12:54 ` Andrew Cooper
@ 2020-05-08 13:49   ` Jan Beulich
  2020-05-08 14:48     ` Roger Pau Monné
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2020-05-08 13:49 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu, Roger Pau Monné

On 08.05.2020 14:54, Andrew Cooper wrote:
> On 08/05/2020 13:43, Jan Beulich wrote:
>> The op has a register/unregister flag, and hence registration shouldn't
>> happen unilaterally. Introduce unregister_vpci_mmcfg_handler() to handle
>> this case.
>>
>> Fixes: eb3dd90e4089 ("x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0")
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I agree in principle that registration shouldn't be unilateral, but why
> on earth does the API behave like that to begin with?
> 
> There is no provision to move or update MMCFG regions in any spec I'm
> aware of, and hardware cannot in practice update memory routing like this.
> 
> Under what circumstances should we tolerate an unregister in the first
> place?

Hot unplug of an entire segment, for example.

Jan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-08 13:49   ` Jan Beulich
@ 2020-05-08 14:48     ` Roger Pau Monné
  0 siblings, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2020-05-08 14:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, xen-devel

On Fri, May 08, 2020 at 03:49:35PM +0200, Jan Beulich wrote:
> On 08.05.2020 14:54, Andrew Cooper wrote:
> > On 08/05/2020 13:43, Jan Beulich wrote:
> >> The op has a register/unregister flag, and hence registration shouldn't
> >> happen unilaterally. Introduce unregister_vpci_mmcfg_handler() to handle
> >> this case.
> >>
> >> Fixes: eb3dd90e4089 ("x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0")
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > I agree in principle that registration shouldn't be unilateral, but why
> > on earth does the API behave like that to begin with?
> > 
> > There is no provision to move or update MMCFG regions in any spec I'm
> > aware of, and hardware cannot in practice update memory routing like this.
> > 
> > Under what circumstances should we tolerate an unregister in the first
> > place?
> 
> Hot unplug of an entire segment, for example.

An OS could also rebalance the resources of a host bridge, according
to the PCI firmware spec, in which case _CBA should be re-evaluated.

I'm not sure whether rebalancing would work anyway, or if there even
are systems that support this and OSes that would attempt to do it,
but since we have the interface for this let's try to do something
sensible.

The other option is simply returning -EOPNOTSUPP. Iff the domain
doesn't try to access devices that would reside on the hot-unplugged
segment, it shouldn't make much of a difference. Rebalancing is the
case where Xen must support add/remove in order to relocate the ECAM
areas.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-08 12:43 [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region Jan Beulich
  2020-05-08 12:54 ` Andrew Cooper
@ 2020-05-08 15:03 ` Roger Pau Monné
  2020-05-08 15:11   ` Jan Beulich
  1 sibling, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2020-05-08 15:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Fri, May 08, 2020 at 02:43:38PM +0200, Jan Beulich wrote:
> The op has a register/unregister flag, and hence registration shouldn't
> happen unilaterally. Introduce unregister_vpci_mmcfg_handler() to handle
> this case.
> 
> Fixes: eb3dd90e4089 ("x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -558,6 +558,47 @@ int register_vpci_mmcfg_handler(struct d
>      return 0;
>  }
>  
> +int unregister_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
> +                                  unsigned int start_bus, unsigned int end_bus,
> +                                  unsigned int seg)
> +{
> +    struct hvm_mmcfg *mmcfg;
> +    int rc = -ENOENT;
> +
> +    ASSERT(is_hardware_domain(d));
> +
> +    if ( start_bus > end_bus )
> +        return -EINVAL;
> +
> +    write_lock(&d->arch.hvm.mmcfg_lock);
> +
> +    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
> +        if ( mmcfg->addr == addr + (start_bus << 20) &&
> +             mmcfg->segment == seg &&
> +             mmcfg->start_bus == start_bus &&
> +             mmcfg->size == ((end_bus - start_bus + 1) << 20) )
> +        {
> +            list_del(&mmcfg->next);
> +            if ( !list_empty(&d->arch.hvm.mmcfg_regions) )
> +                xfree(mmcfg);
> +            else
> +            {
> +                /*
> +                 * Cannot unregister the MMIO handler - leave a fake entry
> +                 * on the list.
> +                 */
> +                memset(mmcfg, 0, sizeof(*mmcfg));
> +                list_add(&mmcfg->next, &d->arch.hvm.mmcfg_regions);

Instead of leaving this zombie entry around maybe we could add a
static bool in register_vpci_mmcfg_handler to signal whether the MMIO
intercept has been registered?

> +            }
> +            rc = 0;
> +            break;
> +        }
> +
> +    write_unlock(&d->arch.hvm.mmcfg_lock);
> +
> +    return rc;
> +}
> +
>  void destroy_vpci_mmcfg(struct domain *d)
>  {
>      struct list_head *mmcfg_regions = &d->arch.hvm.mmcfg_regions;
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -559,12 +559,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>          if ( !ret && has_vpci(currd) )
>          {
>              /*
> -             * For HVM (PVH) domains try to add the newly found MMCFG to the
> -             * domain.
> +             * For HVM (PVH) domains try to add/remove the reported MMCFG
> +             * to/from the domain.
>               */
> -            ret = register_vpci_mmcfg_handler(currd, info.address,
> -                                              info.start_bus, info.end_bus,
> -                                              info.segment);
> +            if ( info.flags & XEN_PCI_MMCFG_RESERVED )

Do you think you could also add a small note in physdev.h regarding
the fact that XEN_PCI_MMCFG_RESERVED is used to register an MMCFG
region, and not setting it would imply an unregister request?

It's not obvious to me from the name of the flag.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-08 15:03 ` Roger Pau Monné
@ 2020-05-08 15:11   ` Jan Beulich
  2020-05-08 16:08     ` Roger Pau Monné
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2020-05-08 15:11 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 08.05.2020 17:03, Roger Pau Monné wrote:
> On Fri, May 08, 2020 at 02:43:38PM +0200, Jan Beulich wrote:
>> --- a/xen/arch/x86/hvm/io.c
>> +++ b/xen/arch/x86/hvm/io.c
>> @@ -558,6 +558,47 @@ int register_vpci_mmcfg_handler(struct d
>>      return 0;
>>  }
>>  
>> +int unregister_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
>> +                                  unsigned int start_bus, unsigned int end_bus,
>> +                                  unsigned int seg)
>> +{
>> +    struct hvm_mmcfg *mmcfg;
>> +    int rc = -ENOENT;
>> +
>> +    ASSERT(is_hardware_domain(d));
>> +
>> +    if ( start_bus > end_bus )
>> +        return -EINVAL;
>> +
>> +    write_lock(&d->arch.hvm.mmcfg_lock);
>> +
>> +    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
>> +        if ( mmcfg->addr == addr + (start_bus << 20) &&
>> +             mmcfg->segment == seg &&
>> +             mmcfg->start_bus == start_bus &&
>> +             mmcfg->size == ((end_bus - start_bus + 1) << 20) )
>> +        {
>> +            list_del(&mmcfg->next);
>> +            if ( !list_empty(&d->arch.hvm.mmcfg_regions) )
>> +                xfree(mmcfg);
>> +            else
>> +            {
>> +                /*
>> +                 * Cannot unregister the MMIO handler - leave a fake entry
>> +                 * on the list.
>> +                 */
>> +                memset(mmcfg, 0, sizeof(*mmcfg));
>> +                list_add(&mmcfg->next, &d->arch.hvm.mmcfg_regions);
> 
> Instead of leaving this zombie entry around maybe we could add a
> static bool in register_vpci_mmcfg_handler to signal whether the MMIO
> intercept has been registered?

That was my initial plan indeed, but registration is per-domain.

>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -559,12 +559,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>>          if ( !ret && has_vpci(currd) )
>>          {
>>              /*
>> -             * For HVM (PVH) domains try to add the newly found MMCFG to the
>> -             * domain.
>> +             * For HVM (PVH) domains try to add/remove the reported MMCFG
>> +             * to/from the domain.
>>               */
>> -            ret = register_vpci_mmcfg_handler(currd, info.address,
>> -                                              info.start_bus, info.end_bus,
>> -                                              info.segment);
>> +            if ( info.flags & XEN_PCI_MMCFG_RESERVED )
> 
> Do you think you could also add a small note in physdev.h regarding
> the fact that XEN_PCI_MMCFG_RESERVED is used to register an MMCFG
> region, and not setting it would imply an unregister request?
> 
> It's not obvious to me from the name of the flag.

The main purpose of the flag is to identify whether a region can be
used (because of having been found marked suitably reserved by
firmware). The flag not set effectively means "region is not marked
reserved". You pointing this out makes me wonder whether instead I
should simply expand the if() in context, without making it behave
like unregistration. Then again we'd have no way to unregister a
region, and hence (ab)using this function for this purpose seems to
make sense (and, afaict, not require any code changes elsewhere).

Jan
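
Editorial note: the two readings of a clear flag that Jan weighs above can be put side by side in a small standalone sketch. The enum and both function names are hypothetical, and the numeric value given to XEN_PCI_MMCFG_RESERVED is an assumption made only so the sketch compiles on its own; the real definition lives in Xen's public physdev.h.

```c
#include <stdint.h>

/* Assumed value, for illustration only; see Xen's public physdev.h. */
#define XEN_PCI_MMCFG_RESERVED 0x1u

/* Hypothetical outcome of a PHYSDEVOP_pci_mmcfg_reserved call for vPCI. */
enum mmcfg_action { MMCFG_NOTHING, MMCFG_REGISTER, MMCFG_UNREGISTER };

/* Reading 1 (the posted patch): a clear flag is an unregister request. */
static enum mmcfg_action action_clear_means_unregister(uint32_t flags)
{
    return (flags & XEN_PCI_MMCFG_RESERVED) ? MMCFG_REGISTER
                                            : MMCFG_UNREGISTER;
}

/* Reading 2 (expanding the if()): register only regions marked reserved. */
static enum mmcfg_action action_clear_means_skip(uint32_t flags)
{
    return (flags & XEN_PCI_MMCFG_RESERVED) ? MMCFG_REGISTER
                                            : MMCFG_NOTHING;
}
```

Both readings agree when the flag is set; they differ only in what a call with the flag clear does to the domain's tracked regions.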


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-08 15:11   ` Jan Beulich
@ 2020-05-08 16:08     ` Roger Pau Monné
  2020-05-11 13:46       ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Pau Monné @ 2020-05-08 16:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Fri, May 08, 2020 at 05:11:35PM +0200, Jan Beulich wrote:
> On 08.05.2020 17:03, Roger Pau Monné wrote:
> > On Fri, May 08, 2020 at 02:43:38PM +0200, Jan Beulich wrote:
> >> --- a/xen/arch/x86/hvm/io.c
> >> +++ b/xen/arch/x86/hvm/io.c
> >> @@ -558,6 +558,47 @@ int register_vpci_mmcfg_handler(struct d
> >>      return 0;
> >>  }
> >>  
> >> +int unregister_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
> >> +                                  unsigned int start_bus, unsigned int end_bus,
> >> +                                  unsigned int seg)
> >> +{
> >> +    struct hvm_mmcfg *mmcfg;
> >> +    int rc = -ENOENT;
> >> +
> >> +    ASSERT(is_hardware_domain(d));
> >> +
> >> +    if ( start_bus > end_bus )
> >> +        return -EINVAL;
> >> +
> >> +    write_lock(&d->arch.hvm.mmcfg_lock);
> >> +
> >> +    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
> >> +        if ( mmcfg->addr == addr + (start_bus << 20) &&
> >> +             mmcfg->segment == seg &&
> >> +             mmcfg->start_bus == start_bus &&
> >> +             mmcfg->size == ((end_bus - start_bus + 1) << 20) )
> >> +        {
> >> +            list_del(&mmcfg->next);
> >> +            if ( !list_empty(&d->arch.hvm.mmcfg_regions) )
> >> +                xfree(mmcfg);
> >> +            else
> >> +            {
> >> +                /*
> >> +                 * Cannot unregister the MMIO handler - leave a fake entry
> >> +                 * on the list.
> >> +                 */
> >> +                memset(mmcfg, 0, sizeof(*mmcfg));
> >> +                list_add(&mmcfg->next, &d->arch.hvm.mmcfg_regions);
> > 
> > Instead of leaving this zombie entry around maybe we could add a
> > static bool in register_vpci_mmcfg_handler to signal whether the MMIO
> > intercept has been registered?
> 
> That was my initial plan indeed, but registration is per-domain.

Indeed, this would work now because it's only used by the hardware
domain, but it's not a good move long term.

What about splitting the registration into a
register_vpci_mmio_handler and call it from hvm_domain_initialise
like it's done for register_vpci_portio_handler?

That might be cleaner long term, sorry if it's more work.

> >> --- a/xen/arch/x86/physdev.c
> >> +++ b/xen/arch/x86/physdev.c
> >> @@ -559,12 +559,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
> >>          if ( !ret && has_vpci(currd) )
> >>          {
> >>              /*
> >> -             * For HVM (PVH) domains try to add the newly found MMCFG to the
> >> -             * domain.
> >> +             * For HVM (PVH) domains try to add/remove the reported MMCFG
> >> +             * to/from the domain.
> >>               */
> >> -            ret = register_vpci_mmcfg_handler(currd, info.address,
> >> -                                              info.start_bus, info.end_bus,
> >> -                                              info.segment);
> >> +            if ( info.flags & XEN_PCI_MMCFG_RESERVED )
> > 
> > Do you think you could also add a small note in physdev.h regarding
> > the fact that XEN_PCI_MMCFG_RESERVED is used to register an MMCFG
> > region, and not setting it would imply an unregister request?
> > 
> > It's not obvious to me from the name of the flag.
> 
> The main purpose of the flag is to identify whether a region can be
> used (because of having been found marked suitably reserved by
> firmware). The flag not set effectively means "region is not marked
> reserved".

Looking at pci_mmcfg_arch_disable, should the region then also be
removed from mmio_ro_ranges? (kind of tangential to this patch)

> You pointing this out makes me wonder whether instead I
> should simply expand the if() in context, without making it behave
> like unregistration. Then again we'd have no way to unregister a
> region, and hence (ab)using this function for this purpose seems to
> make sense (and, afaict, not require any code changes elsewhere).

Right now the only user I know of PHYSDEVOP_pci_mmcfg_reserved is
Linux, and AFAICT it always sets the XEN_PCI_MMCFG_RESERVED flag (at
least upstream).

I don't mind that much what we end up doing, as long as it's
documented in physdev.h. There's no documentation of that physdevop
hypercall at all, so if we provide proper documentation I would be
fine with treating a call with no flags as an unregistration request
(which is kind of what we already do for a classic PV hardware
domain).

Thanks, Roger.
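
Editorial note on the "registration is per-domain" point discussed earlier in this message: a function-local static bool is shared by every domain, so it is only correct while a single domain (the hardware domain) uses the handler. One possible variant, shown purely as a sketch, keeps the flag in per-domain state instead. struct toy_domain and toy_register_intercept_once() are hypothetical and do not correspond to Xen code; the counter merely stands in for the real intercept registration.

```c
#include <stdbool.h>

/* Hypothetical per-domain state; not Xen's struct domain. */
struct toy_domain {
    bool mmcfg_intercept_registered; /* set once the intercept exists */
    unsigned int intercepts;         /* stand-in for real registration */
};

/* Register the MMIO intercept at most once for the given domain. */
static void toy_register_intercept_once(struct toy_domain *d)
{
    if ( d->mmcfg_intercept_registered )
        return;

    d->intercepts++;
    d->mmcfg_intercept_registered = true;
}
```

With the flag stored per domain, registering for one domain neither blocks nor triggers registration for another.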


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-08 16:08     ` Roger Pau Monné
@ 2020-05-11 13:46       ` Jan Beulich
  2020-05-11 14:35         ` Roger Pau Monné
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2020-05-11 13:46 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On 08.05.2020 18:08, Roger Pau Monné wrote:
> On Fri, May 08, 2020 at 05:11:35PM +0200, Jan Beulich wrote:
>> On 08.05.2020 17:03, Roger Pau Monné wrote:
>>> On Fri, May 08, 2020 at 02:43:38PM +0200, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/hvm/io.c
>>>> +++ b/xen/arch/x86/hvm/io.c
>>>> @@ -558,6 +558,47 @@ int register_vpci_mmcfg_handler(struct d
>>>>      return 0;
>>>>  }
>>>>  
>>>> +int unregister_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
>>>> +                                  unsigned int start_bus, unsigned int end_bus,
>>>> +                                  unsigned int seg)
>>>> +{
>>>> +    struct hvm_mmcfg *mmcfg;
>>>> +    int rc = -ENOENT;
>>>> +
>>>> +    ASSERT(is_hardware_domain(d));
>>>> +
>>>> +    if ( start_bus > end_bus )
>>>> +        return -EINVAL;
>>>> +
>>>> +    write_lock(&d->arch.hvm.mmcfg_lock);
>>>> +
>>>> +    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
>>>> +        if ( mmcfg->addr == addr + (start_bus << 20) &&
>>>> +             mmcfg->segment == seg &&
>>>> +             mmcfg->start_bus == start_bus &&
>>>> +             mmcfg->size == ((end_bus - start_bus + 1) << 20) )
>>>> +        {
>>>> +            list_del(&mmcfg->next);
>>>> +            if ( !list_empty(&d->arch.hvm.mmcfg_regions) )
>>>> +                xfree(mmcfg);
>>>> +            else
>>>> +            {
>>>> +                /*
>>>> +                 * Cannot unregister the MMIO handler - leave a fake entry
>>>> +                 * on the list.
>>>> +                 */
>>>> +                memset(mmcfg, 0, sizeof(*mmcfg));
>>>> +                list_add(&mmcfg->next, &d->arch.hvm.mmcfg_regions);
>>>
>>> Instead of leaving this zombie entry around maybe we could add a
>>> static bool in register_vpci_mmcfg_handler to signal whether the MMIO
>>> intercept has been registered?
>>
>> That was my initial plan indeed, but registration is per-domain.
> 
> Indeed, this would work now because it's only used by the hardware
> domain, but it's not a good move long term.
> 
> What about splitting the registration into a
> register_vpci_mmio_handler and call it from hvm_domain_initialise
> like it's done for register_vpci_portio_handler?

No, the goal is to not register unneeded handlers. But see below -
I'll likely ditch the function anyway.

>>>> --- a/xen/arch/x86/physdev.c
>>>> +++ b/xen/arch/x86/physdev.c
>>>> @@ -559,12 +559,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
>>>>          if ( !ret && has_vpci(currd) )
>>>>          {
>>>>              /*
>>>> -             * For HVM (PVH) domains try to add the newly found MMCFG to the
>>>> -             * domain.
>>>> +             * For HVM (PVH) domains try to add/remove the reported MMCFG
>>>> +             * to/from the domain.
>>>>               */
>>>> -            ret = register_vpci_mmcfg_handler(currd, info.address,
>>>> -                                              info.start_bus, info.end_bus,
>>>> -                                              info.segment);
>>>> +            if ( info.flags & XEN_PCI_MMCFG_RESERVED )
>>>
>>> Do you think you could also add a small note in physdev.h regarding
>>> the fact that XEN_PCI_MMCFG_RESERVED is used to register an MMCFG
>>> region, and not setting it would imply an unregister request?
>>>
>>> It's not obvious to me from the name of the flag.
>>
>> The main purpose of the flag is to identify whether a region can be
>> used (because of having been found marked suitably reserved by
>> firmware). The flag not set effectively means "region is not marked
>> reserved".
> 
> Looking at pci_mmcfg_arch_disable, should the region then also be
> removed from mmio_ro_ranges? (kind of tangential to this patch)

If it's truly unregistration - yes. But ...

>> You pointing this out makes me wonder whether instead I
>> should simply expand the if() in context, without making it behave
>> like unregistration. Then again we'd have no way to unregister a
>> region, and hence (ab)using this function for this purpose seems to
>> make sense (and, afaict, not require any code changes elsewhere).
> 
> Right now the only user I know of PHYSDEVOP_pci_mmcfg_reserved is
> Linux, and AFAICT it always sets the XEN_PCI_MMCFG_RESERVED flag (at
> least upstream).

... I've looked at our forward port, where this was first introduced.
There we made the call in all cases, with the flag indicating what is
wanted. Therefore I don't think we want to assign the flag being
clear the meaning of "unregistration". I'll therefore switch to the
simpler change of just expanding the if().

Jan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region
  2020-05-11 13:46       ` Jan Beulich
@ 2020-05-11 14:35         ` Roger Pau Monné
  0 siblings, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2020-05-11 14:35 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Mon, May 11, 2020 at 03:46:38PM +0200, Jan Beulich wrote:
> On 08.05.2020 18:08, Roger Pau Monné wrote:
> > On Fri, May 08, 2020 at 05:11:35PM +0200, Jan Beulich wrote:
> >> On 08.05.2020 17:03, Roger Pau Monné wrote:
> >>> On Fri, May 08, 2020 at 02:43:38PM +0200, Jan Beulich wrote:
> >>>> --- a/xen/arch/x86/hvm/io.c
> >>>> +++ b/xen/arch/x86/hvm/io.c
> >>>> @@ -558,6 +558,47 @@ int register_vpci_mmcfg_handler(struct d
> >>>>      return 0;
> >>>>  }
> >>>>  
> >>>> +int unregister_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
> >>>> +                                  unsigned int start_bus, unsigned int end_bus,
> >>>> +                                  unsigned int seg)
> >>>> +{
> >>>> +    struct hvm_mmcfg *mmcfg;
> >>>> +    int rc = -ENOENT;
> >>>> +
> >>>> +    ASSERT(is_hardware_domain(d));
> >>>> +
> >>>> +    if ( start_bus > end_bus )
> >>>> +        return -EINVAL;
> >>>> +
> >>>> +    write_lock(&d->arch.hvm.mmcfg_lock);
> >>>> +
> >>>> +    list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next )
> >>>> +        if ( mmcfg->addr == addr + (start_bus << 20) &&
> >>>> +             mmcfg->segment == seg &&
> >>>> +             mmcfg->start_bus == start_bus &&
> >>>> +             mmcfg->size == ((end_bus - start_bus + 1) << 20) )
> >>>> +        {
> >>>> +            list_del(&mmcfg->next);
> >>>> +            if ( !list_empty(&d->arch.hvm.mmcfg_regions) )
> >>>> +                xfree(mmcfg);
> >>>> +            else
> >>>> +            {
> >>>> +                /*
> >>>> +                 * Cannot unregister the MMIO handler - leave a fake entry
> >>>> +                 * on the list.
> >>>> +                 */
> >>>> +                memset(mmcfg, 0, sizeof(*mmcfg));
> >>>> +                list_add(&mmcfg->next, &d->arch.hvm.mmcfg_regions);
> >>>
> >>> Instead of leaving this zombie entry around maybe we could add a
> >>> static bool in register_vpci_mmcfg_handler to signal whether the MMIO
> >>> intercept has been registered?
> >>
> >> That was my initial plan indeed, but registration is per-domain.
> > 
> > Indeed, this would work now because it's only used by the hardware
> > domain, but it's not a good move long term.
> > 
> > What about splitting the registration into a
> > register_vpci_mmio_handler and call it from hvm_domain_initialise
> > like it's done for register_vpci_portio_handler?
> 
> No, the goal is to not register unneeded handlers. But see below -
> I'll likely ditch the function anyway.
> 
> >>>> --- a/xen/arch/x86/physdev.c
> >>>> +++ b/xen/arch/x86/physdev.c
> >>>> @@ -559,12 +559,18 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
> >>>>          if ( !ret && has_vpci(currd) )
> >>>>          {
> >>>>              /*
> >>>> -             * For HVM (PVH) domains try to add the newly found MMCFG to the
> >>>> -             * domain.
> >>>> +             * For HVM (PVH) domains try to add/remove the reported MMCFG
> >>>> +             * to/from the domain.
> >>>>               */
> >>>> -            ret = register_vpci_mmcfg_handler(currd, info.address,
> >>>> -                                              info.start_bus, info.end_bus,
> >>>> -                                              info.segment);
> >>>> +            if ( info.flags & XEN_PCI_MMCFG_RESERVED )
> >>>
> >>> Do you think you could also add a small note in physdev.h regarding
> >>> the fact that XEN_PCI_MMCFG_RESERVED is used to register an MMCFG
> >>> region, and not setting it would imply an unregister request?
> >>>
> >>> It's not obvious to me from the name of the flag.
> >>
> >> The main purpose of the flag is to identify whether a region can be
> >> used (because of having been found marked suitably reserved by
> >> firmware). The flag not set effectively means "region is not marked
> >> reserved".
> > 
> > Looking at pci_mmcfg_arch_disable, should the region then also be
> > removed from mmio_ro_ranges? (kind of tangential to this patch)
> 
> If it's truly unregistration - yes. But ...
> 
> >> You pointing this out makes me wonder whether instead I
> >> should simply expand the if() in context, without making it behave
> >> like unregistration. Then again we'd have no way to unregister a
> >> region, and hence (ab)using this function for this purpose seems to
> >> make sense (and, afaict, not require any code changes elsewhere).
> > 
> > Right now the only user I know of PHYSDEVOP_pci_mmcfg_reserved is
> > Linux, and AFAICT it always sets the XEN_PCI_MMCFG_RESERVED flag (at
> > least upstream).
> 
> ... I've looked at our forward port, where this was first introduced.
> There we made the call in all cases, with the flag indicating what is
> wanted. Therefore I don't think we want to assign the flag being
> clear the meaning of "unregistration". I'll therefore switch to the
> simpler change of just expanding the if().

I'm not opposed to this. Leaving the vpci MMIO handlers for disabled
regions is fine, writes will be ignored and reads will return ~0.

This will prevent a PVH hardware domain from accessing those broken
MMCFG regions if it really wants to, but I think it's similar to how a
classic PV dom0 would behave (with the exception that in that case the
domain would be allowed to read from the MMCFG area).

Thanks, Roger.
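
Editorial note: the behaviour Roger describes (writes ignored, reads returning ~0) can be pictured with a toy model in which an access that does not resolve to a usable tracked region reads as all-ones and drops writes. This is a sketch under that assumption, not the vPCI implementation; struct toy_mmcfg and the toy_* helpers are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tracked region; not Xen's struct hvm_mmcfg. */
struct toy_mmcfg {
    uint64_t addr; /* first byte of the window */
    uint64_t size; /* window length in bytes; 0 matches nothing */
};

/* Return the region containing addr, or NULL when none covers it. */
static const struct toy_mmcfg *toy_find(const struct toy_mmcfg *r, size_t n,
                                        uint64_t addr)
{
    for ( size_t i = 0; i < n; i++ )
        if ( addr >= r[i].addr && addr - r[i].addr < r[i].size )
            return &r[i];

    return NULL;
}

/* Read: the device value inside a region, all-ones everywhere else. */
static uint32_t toy_read(const struct toy_mmcfg *r, size_t n, uint64_t addr,
                         uint32_t device_value)
{
    return toy_find(r, n, addr) ? device_value : ~(uint32_t)0;
}

/* Write: returns 1 if forwarded to the device register, 0 if dropped. */
static int toy_write(const struct toy_mmcfg *r, size_t n, uint64_t addr,
                     uint32_t *device_reg, uint32_t val)
{
    if ( !toy_find(r, n, addr) )
        return 0;

    *device_reg = val;
    return 1;
}
```

Note that a zeroed entry, like the placeholder the posted patch leaves on the list, has size 0 and therefore never matches in this model.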


^ permalink raw reply	[flat|nested] 9+ messages in thread

Thread overview: 9+ messages
2020-05-08 12:43 [PATCH] x86/PVH: PHYSDEVOP_pci_mmcfg_reserved should not blindly register a region Jan Beulich
2020-05-08 12:54 ` Andrew Cooper
2020-05-08 13:49   ` Jan Beulich
2020-05-08 14:48     ` Roger Pau Monné
2020-05-08 15:03 ` Roger Pau Monné
2020-05-08 15:11   ` Jan Beulich
2020-05-08 16:08     ` Roger Pau Monné
2020-05-11 13:46       ` Jan Beulich
2020-05-11 14:35         ` Roger Pau Monné
