All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-27 11:19 [PATCH] hw/vfio: let readonly flag take effect for mmaped regions yan.y.zhao
@ 2020-03-27 10:51 ` Philippe Mathieu-Daudé
  2020-03-27 16:17   ` Paolo Bonzini
  2020-03-27 17:25 ` Alex Williamson
  1 sibling, 1 reply; 12+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-03-27 10:51 UTC (permalink / raw)
  To: yan.y.zhao, qemu-devel; +Cc: pbonzini, alex.williamson, Xin Zeng

Hi Yan,

On 3/27/20 12:19 PM, yan.y.zhao@intel.com wrote:
> From: Yan Zhao <yan.y.zhao@intel.com>
> 
> currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> 
> regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> are only read-only in host page table for qemu.
> 
> This patch sets corresponding ept page entries read-only for regions
> with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> 
> accordingly, it ignores guest write when guest writes to the read-only
> regions are trapped.
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> ---
>   hw/vfio/common.c | 4 ++++
>   memory.c         | 3 +++
>   2 files changed, 7 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 0b3593b3c0..e901621ca0 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -971,6 +971,10 @@ int vfio_region_mmap(VFIORegion *region)
>                                             name, region->mmaps[i].size,
>                                             region->mmaps[i].mmap);
>           g_free(name);
> +
> +        if (!(region->flags & VFIO_REGION_INFO_FLAG_WRITE)) {
> +            memory_region_set_readonly(&region->mmaps[i].mem, true);
> +        }
>           memory_region_add_subregion(region->mem, region->mmaps[i].offset,
>                                       &region->mmaps[i].mem);
>   
> diff --git a/memory.c b/memory.c
> index 601b749906..4b1071dc74 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void *opaque, hwaddr addr,
>       MemoryRegion *mr = opaque;
>   
>       trace_memory_region_ram_device_write(get_cpu_index(), mr, addr, data, size);
> +    if (mr->readonly) {
> +        return;
> +    }

Shouldn't this be in memory_region_dispatch_write()?

Please split this patch in 2, this (generic) hunk as first patch, then 
the VFIO more specific change.

>   
>       switch (size) {
>       case 1:
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
@ 2020-03-27 11:19 yan.y.zhao
  2020-03-27 10:51 ` Philippe Mathieu-Daudé
  2020-03-27 17:25 ` Alex Williamson
  0 siblings, 2 replies; 12+ messages in thread
From: yan.y.zhao @ 2020-03-27 11:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, alex.williamson, Yan Zhao, Xin Zeng

From: Yan Zhao <yan.y.zhao@intel.com>

currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.

regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
are only read-only in host page table for qemu.

This patch sets corresponding ept page entries read-only for regions
with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.

accordingly, it ignores guest write when guest writes to the read-only
regions are trapped.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
---
 hw/vfio/common.c | 4 ++++
 memory.c         | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0b3593b3c0..e901621ca0 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -971,6 +971,10 @@ int vfio_region_mmap(VFIORegion *region)
                                           name, region->mmaps[i].size,
                                           region->mmaps[i].mmap);
         g_free(name);
+
+        if (!(region->flags & VFIO_REGION_INFO_FLAG_WRITE)) {
+            memory_region_set_readonly(&region->mmaps[i].mem, true);
+        }
         memory_region_add_subregion(region->mem, region->mmaps[i].offset,
                                     &region->mmaps[i].mem);
 
diff --git a/memory.c b/memory.c
index 601b749906..4b1071dc74 100644
--- a/memory.c
+++ b/memory.c
@@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void *opaque, hwaddr addr,
     MemoryRegion *mr = opaque;
 
     trace_memory_region_ram_device_write(get_cpu_index(), mr, addr, data, size);
+    if (mr->readonly) {
+        return;
+    }
 
     switch (size) {
     case 1:
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-27 10:51 ` Philippe Mathieu-Daudé
@ 2020-03-27 16:17   ` Paolo Bonzini
  2020-03-31  7:59     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2020-03-27 16:17 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, yan.y.zhao, qemu-devel
  Cc: alex.williamson, Xin Zeng

On 27/03/20 11:51, Philippe Mathieu-Daudé wrote:
>>  diff --git a/memory.c b/memory.c
>> index 601b749906..4b1071dc74 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void
>> *opaque, hwaddr addr,
>>       MemoryRegion *mr = opaque;
>>         trace_memory_region_ram_device_write(get_cpu_index(), mr,
>> addr, data, size);
>> +    if (mr->readonly) {
>> +        return;
>> +    }
> 
> Shouldn't this be in memory_region_dispatch_write()?

No, in general you want memory regions to get writes, so that they
become for example a machine-check exception of some sorts.  However,
memory_region_ram_device_write should probably be changed to a
.write_with_attrs operation, so that it can return MEMTX_ERROR.

> Please split this patch in 2, this (generic) hunk as first patch, then
> the VFIO more specific change.
> 
>>         switch (size) {
>>       case 1:
>>
> 

No need, I can just add my Acked-by for Alex to pick up the patch.

Paolo



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-27 11:19 [PATCH] hw/vfio: let readonly flag take effect for mmaped regions yan.y.zhao
  2020-03-27 10:51 ` Philippe Mathieu-Daudé
@ 2020-03-27 17:25 ` Alex Williamson
  2020-03-30  1:35   ` Yan Zhao
  1 sibling, 1 reply; 12+ messages in thread
From: Alex Williamson @ 2020-03-27 17:25 UTC (permalink / raw)
  To: yan.y.zhao; +Cc: pbonzini, Xin Zeng, qemu-devel

On Fri, 27 Mar 2020 11:19:34 +0000
yan.y.zhao@intel.com wrote:

> From: Yan Zhao <yan.y.zhao@intel.com>
> 
> currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> 
> regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> are only read-only in host page table for qemu.
> 
> This patch sets corresponding ept page entries read-only for regions
> with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> 
> accordingly, it ignores guest write when guest writes to the read-only
> regions are trapped.
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> ---

Currently we set the r/w protection on the mmap, do I understand
correctly that the change in the vfio code below results in KVM exiting
to QEMU to handle a write to a read-only region and therefore we need
the memory.c change to drop the write?  This prevents a SIGBUS or
similar?

Meanwhile vfio_region_setup() uses the same vfio_region_ops for all
regions and vfio_region_write() would still allow writes, so if the
device were using x-no-mmap=on, I think we'd still get a write to this
region and expect the vfio device to drop it.  Should we prevent that
write in QEMU as well?

Can you also identify what device and region requires this so that we
can decide whether this is QEMU 5.0 or 5.1 material?  PCI BARs are of
course always R/W and the ROM uses different ops and doesn't support
mmap, so this is a device specific region of some sort.  Thanks,

Alex


>  hw/vfio/common.c | 4 ++++
>  memory.c         | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 0b3593b3c0..e901621ca0 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -971,6 +971,10 @@ int vfio_region_mmap(VFIORegion *region)
>                                            name, region->mmaps[i].size,
>                                            region->mmaps[i].mmap);
>          g_free(name);
> +
> +        if (!(region->flags & VFIO_REGION_INFO_FLAG_WRITE)) {
> +            memory_region_set_readonly(&region->mmaps[i].mem, true);
> +        }
>          memory_region_add_subregion(region->mem, region->mmaps[i].offset,
>                                      &region->mmaps[i].mem);
>  
> diff --git a/memory.c b/memory.c
> index 601b749906..4b1071dc74 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void *opaque, hwaddr addr,
>      MemoryRegion *mr = opaque;
>  
>      trace_memory_region_ram_device_write(get_cpu_index(), mr, addr, data, size);
> +    if (mr->readonly) {
> +        return;
> +    }
>  
>      switch (size) {
>      case 1:



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-27 17:25 ` Alex Williamson
@ 2020-03-30  1:35   ` Yan Zhao
  2020-03-30  6:34     ` Yan Zhao
  0 siblings, 1 reply; 12+ messages in thread
From: Yan Zhao @ 2020-03-30  1:35 UTC (permalink / raw)
  To: Alex Williamson; +Cc: pbonzini, Zeng, Xin, qemu-devel

On Sat, Mar 28, 2020 at 01:25:37AM +0800, Alex Williamson wrote:
> On Fri, 27 Mar 2020 11:19:34 +0000
> yan.y.zhao@intel.com wrote:
> 
> > From: Yan Zhao <yan.y.zhao@intel.com>
> > 
> > currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> > read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> > 
> > regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> > are only read-only in host page table for qemu.
> > 
> > This patch sets corresponding ept page entries read-only for regions
> > with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> > 
> > accordingly, it ignores guest write when guest writes to the read-only
> > regions are trapped.
> > 
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> > ---
> 
> Currently we set the r/w protection on the mmap, do I understand
> correctly that the change in the vfio code below results in KVM exiting
> to QEMU to handle a write to a read-only region and therefore we need
> the memory.c change to drop the write?  This prevents a SIGBUS or
> similar?
yes, correct. the change in memory.c is to prevent a SIGSEGV in host as
it's mmaped to read-only. we think it's better to just drop the writes
from guest rather than corrupt the qemu.

> 
> Meanwhile vfio_region_setup() uses the same vfio_region_ops for all
> regions and vfio_region_write() would still allow writes, so if the
> device were using x-no-mmap=on, I think we'd still get a write to this
> region and expect the vfio device to drop it.  Should we prevent that
> write in QEMU as well?
yes, it expects vfio device to drop it right now.
As the driver sets the flag without VFIO_REGION_INFO_FLAG_WRITE, it should
handle it properly.
both dropping in qemu and dropping in vfio device are fine to us.
we wonder which one is your preference :)


> Can you also identify what device and region requires this so that we
> can decide whether this is QEMU 5.0 or 5.1 material?  PCI BARs are of
> course always R/W and the ROM uses different ops and doesn't support
> mmap, so this is a device specific region of some sort.  Thanks,
> 
It's a virtual mdev device for which we want to emulate a virtual
read-only MMIO BAR.
Is there any consideration that PCI BARs have to be R/W ?
we didn't find it out in PCI specification.

Thanks
Yan


> 
> >  hw/vfio/common.c | 4 ++++
> >  memory.c         | 3 +++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index 0b3593b3c0..e901621ca0 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -971,6 +971,10 @@ int vfio_region_mmap(VFIORegion *region)
> >                                            name, region->mmaps[i].size,
> >                                            region->mmaps[i].mmap);
> >          g_free(name);
> > +
> > +        if (!(region->flags & VFIO_REGION_INFO_FLAG_WRITE)) {
> > +            memory_region_set_readonly(&region->mmaps[i].mem, true);
> > +        }
> >          memory_region_add_subregion(region->mem, region->mmaps[i].offset,
> >                                      &region->mmaps[i].mem);
> >  
> > diff --git a/memory.c b/memory.c
> > index 601b749906..4b1071dc74 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void *opaque, hwaddr addr,
> >      MemoryRegion *mr = opaque;
> >  
> >      trace_memory_region_ram_device_write(get_cpu_index(), mr, addr, data, size);
> > +    if (mr->readonly) {
> > +        return;
> > +    }
> >  
> >      switch (size) {
> >      case 1:
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-30  1:35   ` Yan Zhao
@ 2020-03-30  6:34     ` Yan Zhao
  2020-03-30 14:59       ` Alex Williamson
  0 siblings, 1 reply; 12+ messages in thread
From: Yan Zhao @ 2020-03-30  6:34 UTC (permalink / raw)
  To: Alex Williamson; +Cc: pbonzini, Zeng, Xin, qemu-devel

On Mon, Mar 30, 2020 at 09:35:27AM +0800, Yan Zhao wrote:
> On Sat, Mar 28, 2020 at 01:25:37AM +0800, Alex Williamson wrote:
> > On Fri, 27 Mar 2020 11:19:34 +0000
> > yan.y.zhao@intel.com wrote:
> > 
> > > From: Yan Zhao <yan.y.zhao@intel.com>
> > > 
> > > currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> > > read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> > > 
> > > regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> > > are only read-only in host page table for qemu.
> > > 
> > > This patch sets corresponding ept page entries read-only for regions
> > > with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> > > 
> > > accordingly, it ignores guest write when guest writes to the read-only
> > > regions are trapped.
> > > 
> > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> > > ---
> > 
> > Currently we set the r/w protection on the mmap, do I understand
> > correctly that the change in the vfio code below results in KVM exiting
> > to QEMU to handle a write to a read-only region and therefore we need
> > the memory.c change to drop the write?  This prevents a SIGBUS or
> > similar?
> yes, correct. the change in memory.c is to prevent a SIGSEGV in host as
> it's mmaped to read-only. we think it's better to just drop the writes
> from guest rather than corrupt the qemu.
> 
> > 
> > Meanwhile vfio_region_setup() uses the same vfio_region_ops for all
> > regions and vfio_region_write() would still allow writes, so if the
> > device were using x-no-mmap=on, I think we'd still get a write to this
> > region and expect the vfio device to drop it.  Should we prevent that
> > write in QEMU as well?
> yes, it expects vfio device to drop it right now.
> As the driver sets the flag without VFIO_REGION_INFO_FLAG_WRITE, it should
> handle it properly.
> both dropping in qemu and dropping in vfio device are fine to us.
> we wonder which one is your preference :)
>
> 
> > Can you also identify what device and region requires this so that we
> > can decide whether this is QEMU 5.0 or 5.1 material?  PCI BARs are of
> > course always R/W and the ROM uses different ops and doesn't support
> > mmap, so this is a device specific region of some sort.  Thanks,
> > 
> It's a virtual mdev device for which we want to emulate a virtual
> read-only MMIO BAR.
> Is there any consideration that PCI BARs have to be R/W ?
> we didn't find it out in PCI specification.
> 
>
looks MMIO regions in vfio platform are also possible to be read-only and
mmaped.

Thanks
Yan

> 
> > 
> > >  hw/vfio/common.c | 4 ++++
> > >  memory.c         | 3 +++
> > >  2 files changed, 7 insertions(+)
> > > 
> > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > > index 0b3593b3c0..e901621ca0 100644
> > > --- a/hw/vfio/common.c
> > > +++ b/hw/vfio/common.c
> > > @@ -971,6 +971,10 @@ int vfio_region_mmap(VFIORegion *region)
> > >                                            name, region->mmaps[i].size,
> > >                                            region->mmaps[i].mmap);
> > >          g_free(name);
> > > +
> > > +        if (!(region->flags & VFIO_REGION_INFO_FLAG_WRITE)) {
> > > +            memory_region_set_readonly(&region->mmaps[i].mem, true);
> > > +        }
> > >          memory_region_add_subregion(region->mem, region->mmaps[i].offset,
> > >                                      &region->mmaps[i].mem);
> > >  
> > > diff --git a/memory.c b/memory.c
> > > index 601b749906..4b1071dc74 100644
> > > --- a/memory.c
> > > +++ b/memory.c
> > > @@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void *opaque, hwaddr addr,
> > >      MemoryRegion *mr = opaque;
> > >  
> > >      trace_memory_region_ram_device_write(get_cpu_index(), mr, addr, data, size);
> > > +    if (mr->readonly) {
> > > +        return;
> > > +    }
> > >  
> > >      switch (size) {
> > >      case 1:
> > 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-30  6:34     ` Yan Zhao
@ 2020-03-30 14:59       ` Alex Williamson
  2020-03-31  1:59         ` Yan Zhao
  0 siblings, 1 reply; 12+ messages in thread
From: Alex Williamson @ 2020-03-30 14:59 UTC (permalink / raw)
  To: Yan Zhao; +Cc: pbonzini, Zeng, Xin, qemu-devel

On Mon, 30 Mar 2020 02:34:02 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Mon, Mar 30, 2020 at 09:35:27AM +0800, Yan Zhao wrote:
> > On Sat, Mar 28, 2020 at 01:25:37AM +0800, Alex Williamson wrote:  
> > > On Fri, 27 Mar 2020 11:19:34 +0000
> > > yan.y.zhao@intel.com wrote:
> > >   
> > > > From: Yan Zhao <yan.y.zhao@intel.com>
> > > > 
> > > > currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> > > > read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> > > > 
> > > > regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> > > > are only read-only in host page table for qemu.
> > > > 
> > > > This patch sets corresponding ept page entries read-only for regions
> > > > with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> > > > 
> > > > accordingly, it ignores guest write when guest writes to the read-only
> > > > regions are trapped.
> > > > 
> > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> > > > ---  
> > > 
> > > Currently we set the r/w protection on the mmap, do I understand
> > > correctly that the change in the vfio code below results in KVM exiting
> > > to QEMU to handle a write to a read-only region and therefore we need
> > > the memory.c change to drop the write?  This prevents a SIGBUS or
> > > similar?  
> > yes, correct. the change in memory.c is to prevent a SIGSEGV in host as
> > it's mmaped to read-only. we think it's better to just drop the writes
> > from guest rather than corrupt the qemu.
> >   
> > > 
> > > Meanwhile vfio_region_setup() uses the same vfio_region_ops for all
> > > regions and vfio_region_write() would still allow writes, so if the
> > > device were using x-no-mmap=on, I think we'd still get a write to this
> > > region and expect the vfio device to drop it.  Should we prevent that
> > > write in QEMU as well?  
> > yes, it expects vfio device to drop it right now.
> > As the driver sets the flag without VFIO_REGION_INFO_FLAG_WRITE, it should
> > handle it properly.
> > both dropping in qemu and dropping in vfio device are fine to us.
> > we wonder which one is your preference :)

The kernel and device should always do the right thing, we cannot rely
on the user to honor the mapping, but it's also a reasonable response
from the kernel to kill the process with a SIGSEGV if the user ignores
the protections.  So I don't think it's an either/or, the kernel needs
to do the right thing for itself and in this case QEMU should do the
right thing for itself, which is to drop writes for regions that don't
support it.  So in general, I agree with your patch.
 
> > > Can you also identify what device and region requires this so that we
> > > can decide whether this is QEMU 5.0 or 5.1 material?  PCI BARs are of
> > > course always R/W and the ROM uses different ops and doesn't support
> > > mmap, so this is a device specific region of some sort.  Thanks,
> > >   
> > It's a virtual mdev device for which we want to emulate a virtual
> > read-only MMIO BAR.
> > Is there any consideration that PCI BARs have to be R/W ?
> > we didn't find it out in PCI specification.

What the device chooses to do with writes to a BAR is its own business,
the PCI spec shouldn't try to define that.  There's also no PCI spec
mechanism to declare the access protections for an entire BAR, that's
device specific behavior.  The current QEMU vfio-pci behavior is
therefore somewhat implicit in knowing this for a directly assigned
device.  We can mmap the device and we expect writes to unwritable
registers within that mapping to be dropped.

For an mdev device, we can't rely on the user honoring the access
protections, ie. the user shouldn't be able to exploit the kernel or
device by doing so, but I also agree that QEMU, as a friendly vfio
user, should avoid unsupported operations and protect itself from how
the kernel may handle the fault.

Since this mdev device doesn't exist yet, I'm thinking this is QEMU
v5.1 material though.

> looks MMIO regions in vfio platform are also possible to be read-only and
> mmaped.

Yes.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-30 14:59       ` Alex Williamson
@ 2020-03-31  1:59         ` Yan Zhao
  2020-03-31 19:28           ` Alex Williamson
  0 siblings, 1 reply; 12+ messages in thread
From: Yan Zhao @ 2020-03-31  1:59 UTC (permalink / raw)
  To: Alex Williamson; +Cc: pbonzini, Zeng, Xin, qemu-devel

On Mon, Mar 30, 2020 at 10:59:23PM +0800, Alex Williamson wrote:
> On Mon, 30 Mar 2020 02:34:02 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > On Mon, Mar 30, 2020 at 09:35:27AM +0800, Yan Zhao wrote:
> > > On Sat, Mar 28, 2020 at 01:25:37AM +0800, Alex Williamson wrote:  
> > > > On Fri, 27 Mar 2020 11:19:34 +0000
> > > > yan.y.zhao@intel.com wrote:
> > > >   
> > > > > From: Yan Zhao <yan.y.zhao@intel.com>
> > > > > 
> > > > > currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> > > > > read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> > > > > 
> > > > > regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> > > > > are only read-only in host page table for qemu.
> > > > > 
> > > > > This patch sets corresponding ept page entries read-only for regions
> > > > > with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> > > > > 
> > > > > accordingly, it ignores guest write when guest writes to the read-only
> > > > > regions are trapped.
> > > > > 
> > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> > > > > ---  
> > > > 
> > > > Currently we set the r/w protection on the mmap, do I understand
> > > > correctly that the change in the vfio code below results in KVM exiting
> > > > to QEMU to handle a write to a read-only region and therefore we need
> > > > the memory.c change to drop the write?  This prevents a SIGBUS or
> > > > similar?  
> > > yes, correct. the change in memory.c is to prevent a SIGSEGV in host as
> > > it's mmaped to read-only. we think it's better to just drop the writes
> > > from guest rather than corrupt the qemu.
> > >   
> > > > 
> > > > Meanwhile vfio_region_setup() uses the same vfio_region_ops for all
> > > > regions and vfio_region_write() would still allow writes, so if the
> > > > device were using x-no-mmap=on, I think we'd still get a write to this
> > > > region and expect the vfio device to drop it.  Should we prevent that
> > > > write in QEMU as well?  
> > > yes, it expects vfio device to drop it right now.
> > > As the driver sets the flag without VFIO_REGION_INFO_FLAG_WRITE, it should
> > > handle it properly.
> > > both dropping in qemu and dropping in vfio device are fine to us.
> > > we wonder which one is your preference :)
> 
> The kernel and device should always do the right thing, we cannot rely
> on the user to honor the mapping, but it's also a reasonable response
> from the kernel to kill the process with a SIGSEGV if the user ignores
> the protections.  So I don't think it's an either/or, the kernel needs
> to do the right thing for itself and in this case QEMU should do the
> right thing for itself, which is to drop writes for regions that don't
> support it.  So in general, I agree with your patch.
>
hi Alex
so is there anything I need to do? do I need to add a write dropping in
vfio_region_write() too? if yes, do I need to keep the
trace_vfio_region_write() before dropping ?

Thanks
Yan

> > > > Can you also identify what device and region requires this so that we
> > > > can decide whether this is QEMU 5.0 or 5.1 material?  PCI BARs are of
> > > > course always R/W and the ROM uses different ops and doesn't support
> > > > mmap, so this is a device specific region of some sort.  Thanks,
> > > >   
> > > It's a virtual mdev device for which we want to emulate a virtual
> > > read-only MMIO BAR.
> > > Is there any consideration that PCI BARs have to be R/W ?
> > > we didn't find it out in PCI specification.
> 
> What the device chooses to do with writes to a BAR is its own business,
> the PCI spec shouldn't try to define that.  There's also no PCI spec
> mechanism to declare the access protections for an entire BAR, that's
> device specific behavior.  The current QEMU vfio-pci behavior is
> therefore somewhat implicit in knowing this for a directly assigned
> device.  We can mmap the device and we expect writes to unwritable
> registers within that mapping to be dropped.
> 
> For an mdev device, we can't rely on the user honoring the access
> protections, ie. the user shouldn't be able to exploit the kernel or
> device by doing so, but I also agree that QEMU, as a friendly vfio
> user, should avoid unsupported operations and protect itself from how
> the kernel may handle the fault.
> 
> Since this mdev device doesn't exist yet, I'm thinking this is QEMU
> v5.1 material though.
> 
> > looks MMIO regions in vfio platform are also possible to be read-only and
> > mmaped.
> 
> Yes.  Thanks,
> 
> Alex
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-27 16:17   ` Paolo Bonzini
@ 2020-03-31  7:59     ` Philippe Mathieu-Daudé
  2020-04-01  6:47       ` Yan Zhao
  0 siblings, 1 reply; 12+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-03-31  7:59 UTC (permalink / raw)
  To: Paolo Bonzini, yan.y.zhao, qemu-devel; +Cc: alex.williamson, Xin Zeng

On 3/27/20 5:17 PM, Paolo Bonzini wrote:
> On 27/03/20 11:51, Philippe Mathieu-Daudé wrote:
>>>    diff --git a/memory.c b/memory.c
>>> index 601b749906..4b1071dc74 100644
>>> --- a/memory.c
>>> +++ b/memory.c
>>> @@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void
>>> *opaque, hwaddr addr,
>>>        MemoryRegion *mr = opaque;
>>>          trace_memory_region_ram_device_write(get_cpu_index(), mr,
>>> addr, data, size);
>>> +    if (mr->readonly) {
>>> +        return;
>>> +    }
>>
>> Shouldn't this be in memory_region_dispatch_write()?
> 
> No, in general you want memory regions to get writes, so that they
> become for example a machine-check exception of some sorts.  However,
> memory_region_ram_device_write should probably be changed to a
> .write_with_attrs operation, so that it can return MEMTX_ERROR.
> 
>> Please split this patch in 2, this (generic) hunk as first patch, then
>> the VFIO more specific change.
>>
>>>          switch (size) {
>>>        case 1:
>>>
>>
> 
> No need, I can just add my Acked-by for Alex to pick up the patch.

Having 2 different fix in 2 different patches helps when cherry-picking 
(bisecting, backporting...) and reverting. My 2 cents anyway.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-31  1:59         ` Yan Zhao
@ 2020-03-31 19:28           ` Alex Williamson
  2020-04-01  6:45             ` Yan Zhao
  0 siblings, 1 reply; 12+ messages in thread
From: Alex Williamson @ 2020-03-31 19:28 UTC (permalink / raw)
  To: Yan Zhao; +Cc: pbonzini, Zeng, Xin, qemu-devel

On Mon, 30 Mar 2020 21:59:41 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Mon, Mar 30, 2020 at 10:59:23PM +0800, Alex Williamson wrote:
> > On Mon, 30 Mar 2020 02:34:02 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > On Mon, Mar 30, 2020 at 09:35:27AM +0800, Yan Zhao wrote:  
> > > > On Sat, Mar 28, 2020 at 01:25:37AM +0800, Alex Williamson wrote:    
> > > > > On Fri, 27 Mar 2020 11:19:34 +0000
> > > > > yan.y.zhao@intel.com wrote:
> > > > >     
> > > > > > From: Yan Zhao <yan.y.zhao@intel.com>
> > > > > > 
> > > > > > currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> > > > > > read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> > > > > > 
> > > > > > regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> > > > > > are only read-only in host page table for qemu.
> > > > > > 
> > > > > > This patch sets corresponding ept page entries read-only for regions
> > > > > > with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> > > > > > 
> > > > > > accordingly, it ignores guest write when guest writes to the read-only
> > > > > > regions are trapped.
> > > > > > 
> > > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > > Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> > > > > > ---    
> > > > > 
> > > > > Currently we set the r/w protection on the mmap, do I understand
> > > > > correctly that the change in the vfio code below results in KVM exiting
> > > > > to QEMU to handle a write to a read-only region and therefore we need
> > > > > the memory.c change to drop the write?  This prevents a SIGBUS or
> > > > > similar?    
> > > > yes, correct. the change in memory.c is to prevent a SIGSEGV in host as
> > > > it's mmaped to read-only. we think it's better to just drop the writes
> > > > from guest rather than corrupt the qemu.
> > > >     
> > > > > 
> > > > > Meanwhile vfio_region_setup() uses the same vfio_region_ops for all
> > > > > regions and vfio_region_write() would still allow writes, so if the
> > > > > device were using x-no-mmap=on, I think we'd still get a write to this
> > > > > region and expect the vfio device to drop it.  Should we prevent that
> > > > > write in QEMU as well?    
> > > > yes, it expects vfio device to drop it right now.
> > > > As the driver sets the flag without VFIO_REGION_INFO_FLAG_WRITE, it should
> > > > handle it properly.
> > > > both dropping in qemu and dropping in vfio device are fine to us.
> > > > we wonder which one is your preference :)  
> > 
> > The kernel and device should always do the right thing, we cannot rely
> > on the user to honor the mapping, but it's also a reasonable response
> > from the kernel to kill the process with a SIGSEGV if the user ignores
> > the protections.  So I don't think it's an either/or, the kernel needs
> > to do the right thing for itself and in this case QEMU should do the
> > right thing for itself, which is to drop writes for regions that don't
> > support it.  So in general, I agree with your patch.
> >  
> hi Alex
> so is there anything I need to do? do I need to add a write dropping in
> vfio_region_write() too? if yes, do I need to keep the
> trace_vfio_region_write() before dropping ?

Hi Yan,

Yes to both feels the most consistent, right?  If we want the same
behavior between mmap'd and non-mmap'd regions, QEMU should drop both
writes.  Your change to memory_region_ram_device_write() drops the
write after tracing.  For both cases though it might be worthwhile to
have a separate trace recorded to indicate a dropped write, probably in
place of the normal trace to avoid confusion.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-31 19:28           ` Alex Williamson
@ 2020-04-01  6:45             ` Yan Zhao
  0 siblings, 0 replies; 12+ messages in thread
From: Yan Zhao @ 2020-04-01  6:45 UTC (permalink / raw)
  To: Alex Williamson; +Cc: pbonzini, Zeng, Xin, qemu-devel

On Wed, Apr 01, 2020 at 03:28:27AM +0800, Alex Williamson wrote:
> On Mon, 30 Mar 2020 21:59:41 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > On Mon, Mar 30, 2020 at 10:59:23PM +0800, Alex Williamson wrote:
> > > On Mon, 30 Mar 2020 02:34:02 -0400
> > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > >   
> > > > On Mon, Mar 30, 2020 at 09:35:27AM +0800, Yan Zhao wrote:  
> > > > > On Sat, Mar 28, 2020 at 01:25:37AM +0800, Alex Williamson wrote:    
> > > > > > On Fri, 27 Mar 2020 11:19:34 +0000
> > > > > > yan.y.zhao@intel.com wrote:
> > > > > >     
> > > > > > > From: Yan Zhao <yan.y.zhao@intel.com>
> > > > > > > 
> > > > > > > currently, vfio regions without VFIO_REGION_INFO_FLAG_WRITE are only
> > > > > > > read-only when VFIO_REGION_INFO_FLAG_MMAP is not set.
> > > > > > > 
> > > > > > > regions with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP
> > > > > > > are only read-only in host page table for qemu.
> > > > > > > 
> > > > > > > This patch sets corresponding ept page entries read-only for regions
> > > > > > > with flag VFIO_REGION_INFO_FLAG_READ | VFIO_REGION_INFO_FLAG_MMAP.
> > > > > > > 
> > > > > > > accordingly, it ignores guest write when guest writes to the read-only
> > > > > > > regions are trapped.
> > > > > > > 
> > > > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > > > Signed-off-by: Xin Zeng <xin.zeng@intel.com>
> > > > > > > ---    
> > > > > > 
> > > > > > Currently we set the r/w protection on the mmap, do I understand
> > > > > > correctly that the change in the vfio code below results in KVM exiting
> > > > > > to QEMU to handle a write to a read-only region and therefore we need
> > > > > > the memory.c change to drop the write?  This prevents a SIGBUS or
> > > > > > similar?    
> > > > > yes, correct. the change in memory.c is to prevent a SIGSEGV in host as
> > > > > it's mmaped to read-only. we think it's better to just drop the writes
> > > > > from guest rather than corrupt the qemu.
> > > > >     
> > > > > > 
> > > > > > Meanwhile vfio_region_setup() uses the same vfio_region_ops for all
> > > > > > regions and vfio_region_write() would still allow writes, so if the
> > > > > > device were using x-no-mmap=on, I think we'd still get a write to this
> > > > > > region and expect the vfio device to drop it.  Should we prevent that
> > > > > > write in QEMU as well?    
> > > > > yes, it expects vfio device to drop it right now.
> > > > > As the driver sets the flag without VFIO_REGION_INFO_FLAG_WRITE, it should
> > > > > handle it properly.
> > > > > both dropping in qemu and dropping in vfio device are fine to us.
> > > > > we wonder which one is your preference :)  
> > > 
> > > The kernel and device should always do the right thing, we cannot rely
> > > on the user to honor the mapping, but it's also a reasonable response
> > > from the kernel to kill the process with a SIGSEGV if the user ignores
> > > the protections.  So I don't think it's an either/or, the kernel needs
> > > to do the right thing for itself and in this case QEMU should do the
> > > right thing for itself, which is to drop writes for regions that don't
> > > support it.  So in general, I agree with your patch.
> > >  
> > hi Alex
> > so is there anything I need to do? do I need to add a write dropping in
> > vfio_region_write() too? if yes, do I need to keep the
> > trace_vfio_region_write() before dropping ?
> 
> Hi Yan,
> 
> Yes to both feels the most consistent, right?  If we want the same
> behavior between mmap'd and non-mmap'd regions, QEMU should drop both
> writes.  Your change to memory_region_ram_device_write() drops the
> write after tracing.  For both cases though it might be worthwhile to
> have a separate trace recorded to indicate a dropped write, probably in
> place of the normal trace to avoid confusion.  Thanks,
> 
got it. will do it.

Thanks
Yan


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] hw/vfio: let readonly flag take effect for mmaped regions
  2020-03-31  7:59     ` Philippe Mathieu-Daudé
@ 2020-04-01  6:47       ` Yan Zhao
  0 siblings, 0 replies; 12+ messages in thread
From: Yan Zhao @ 2020-04-01  6:47 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Paolo Bonzini, alex.williamson, Zeng, Xin, qemu-devel

On Tue, Mar 31, 2020 at 03:59:02PM +0800, Philippe Mathieu-Daudé wrote:
> On 3/27/20 5:17 PM, Paolo Bonzini wrote:
> > On 27/03/20 11:51, Philippe Mathieu-Daudé wrote:
> >>>    diff --git a/memory.c b/memory.c
> >>> index 601b749906..4b1071dc74 100644
> >>> --- a/memory.c
> >>> +++ b/memory.c
> >>> @@ -1313,6 +1313,9 @@ static void memory_region_ram_device_write(void
> >>> *opaque, hwaddr addr,
> >>>        MemoryRegion *mr = opaque;
> >>>          trace_memory_region_ram_device_write(get_cpu_index(), mr,
> >>> addr, data, size);
> >>> +    if (mr->readonly) {
> >>> +        return;
> >>> +    }
> >>
> >> Shouldn't this be in memory_region_dispatch_write()?
> > 
> > No, in general you want memory regions to get writes, so that they
> > become for example a machine-check exception of some sorts.  However,
> > memory_region_ram_device_write should probably be changed to a
> > .write_with_attrs operation, so that it can return MEMTX_ERROR.
> > 
> >> Please split this patch in 2, this (generic) hunk as first patch, then
> >> the VFIO more specific change.
> >>
> >>>          switch (size) {
> >>>        case 1:
> >>>
> >>
> > 
> > No need, I can just add my Acked-by for Alex to pick up the patch.
> 
> Having 2 different fix in 2 different patches helps when cherry-picking 
> (bisecting, backporting...) and reverting. My 2 cents anyway.
ok. I can seperate it in patch v2.

Thanks for your input:)



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-04-01  6:57 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-27 11:19 [PATCH] hw/vfio: let readonly flag take effect for mmaped regions yan.y.zhao
2020-03-27 10:51 ` Philippe Mathieu-Daudé
2020-03-27 16:17   ` Paolo Bonzini
2020-03-31  7:59     ` Philippe Mathieu-Daudé
2020-04-01  6:47       ` Yan Zhao
2020-03-27 17:25 ` Alex Williamson
2020-03-30  1:35   ` Yan Zhao
2020-03-30  6:34     ` Yan Zhao
2020-03-30 14:59       ` Alex Williamson
2020-03-31  1:59         ` Yan Zhao
2020-03-31 19:28           ` Alex Williamson
2020-04-01  6:45             ` Yan Zhao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.