All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: weifuqiang@huawei.com, mst@redhat.com, qemu-devel@nongnu.org,
	arei.gonglei@huawei.com, huangzhichao@huawei.com,
	"Longpeng\(Mike\)" <longpeng2@huawei.com>
Subject: Re: [PATCH RESEND 1/3] vfio/pci: fix a null pointer reference in vfio_rom_read
Date: Thu, 12 Mar 2020 08:07:06 -0600	[thread overview]
Message-ID: <20200312080706.3f6f9d8e@w520.home> (raw)
In-Reply-To: <87zhcmrujd.fsf@dusky.pond.sub.org>

On Thu, 12 Mar 2020 06:50:30 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Alex Williamson <alex.williamson@redhat.com> writes:
> 
> > On Wed, 11 Mar 2020 08:04:28 +0100
> > Markus Armbruster <armbru@redhat.com> wrote:
> >  
> >> Alex Williamson <alex.williamson@redhat.com> writes:
> >>   
> >> > On Mon, 24 Feb 2020 14:42:17 +0800
> >> > "Longpeng(Mike)" <longpeng2@huawei.com> wrote:
> >> >    
> >> >> From: Longpeng <longpeng2@huawei.com>
> >> >> 
> >> >> vfio_pci_load_rom() maybe failed and then the vdev->rom is NULL in
> >> >> some situation (though I've not encountered yet), maybe we should
> >> >> avoid the VM abort.    
> >> 
> >> What "VM abort" exactly?  
> >
> > There is none because memcpy() does something sane when size is zero,
> > but to be ISO whatever spec compliant we shouldn't rely on that.
> >  
> >> >> 
> >> >> Signed-off-by: Longpeng <longpeng2@huawei.com>
> >> >> ---
> >> >>  hw/vfio/pci.c | 13 ++++++++-----
> >> >>  1 file changed, 8 insertions(+), 5 deletions(-)
> >> >> 
> >> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> >> index 5e75a95..ed798ae 100644
> >> >> --- a/hw/vfio/pci.c
> >> >> +++ b/hw/vfio/pci.c
> >> >> @@ -768,7 +768,7 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
> >> >>      }
> >> >>  }
> >> >>  
> >> >> -static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
> >> >> +static bool vfio_pci_load_rom(VFIOPCIDevice *vdev)
> >> >>  {
> >> >>      struct vfio_region_info *reg_info;
> >> >>      uint64_t size;
> >> >> @@ -778,7 +778,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
> >> >>      if (vfio_get_region_info(&vdev->vbasedev,
> >> >>                               VFIO_PCI_ROM_REGION_INDEX, &reg_info)) {
> >> >>          error_report("vfio: Error getting ROM info: %m");
> >> >> -        return;
> >> >> +        return false;
> >> >>      }
> >> >>  
> >> >>      trace_vfio_pci_load_rom(vdev->vbasedev.name, (unsigned long)reg_info->size,
> >> >> @@ -797,7 +797,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
> >> >>          error_printf("Device option ROM contents are probably invalid "
> >> >>                      "(check dmesg).\nSkip option ROM probe with rombar=0, "
> >> >>                      "or load from file with romfile=\n");
> >> >> -        return;
> >> >> +        return false;
> >> >>      }
> >> >>  
> >> >>      vdev->rom = g_malloc(size);
> >> >> @@ -849,6 +849,8 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
> >> >>              data[6] = -csum;
> >> >>          }
> >> >>      }
> >> >> +
> >> >> +    return true;
> >> >>  }
> >> >>  
> >> >>  static uint64_t vfio_rom_read(void *opaque, hwaddr addr, unsigned size)
> >> >> @@ -863,8 +865,9 @@ static uint64_t vfio_rom_read(void *opaque, hwaddr addr, unsigned size)    
> >>     {
> >>         VFIOPCIDevice *vdev = opaque;
> >>         union {
> >>             uint8_t byte;
> >>             uint16_t word;
> >>             uint32_t dword;
> >>             uint64_t qword;
> >>         } val;  
> >> >>      uint64_t data = 0;
> >> >>  
> >> >>      /* Load the ROM lazily when the guest tries to read it */
> >> >> -    if (unlikely(!vdev->rom && !vdev->rom_read_failed)) {
> >> >> -        vfio_pci_load_rom(vdev);
> >> >> +    if (unlikely(!vdev->rom && !vdev->rom_read_failed) &&
> >> >> +        !vfio_pci_load_rom(vdev)) {
> >> >> +        return 0;
> >> >>      }
> >> >>  
> >> >>      memcpy(&val, vdev->rom + addr,    
> >> >
> >> > Looks like an obvious bug, until you look at the rest of this memcpy():
> >> >
> >> > memcpy(&val, vdev->rom + addr,
> >> >            (addr < vdev->rom_size) ? MIN(size, vdev->rom_size - addr) : 0);
> >> >
> >> > IOW, we'll do a zero sized memcpy() if rom_size is zero, so there's no
> >> > risk of the concern identified in the commit log.  This patch is
> >> > unnecessary.  Thanks,    
> >> 
> >> I'm blind: why does !vdev->rom imply !vdev->rom_size?  
> >
> > See vfio_pci_load_rom(), rom_size and rom are set and allocated
> > together.  
> 
> What if vfio_pci_load_rom() isn't called, or returns before it sets
> these guys?

vfio_pci_load_rom() is called from this read function if they're not
set and we haven't triggered a read failure.  They're zero allocated.
The intention is that any vfio_pci_load_rom() failure will leave these
unset and trigger a zero sized memcpy, or now skip the mempcy as in the
patch below.  Thanks,

Alex

> >> Moreover, when MIN(size, vdev->rom_size - addr) < size, we seem to read
> >> uninitialized data from @val:  
> >
> > This is fixed in my patch
> > https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg02778.html  
> 
> Yes.
> 
> >> 
> >>         switch (size) {
> >>         case 1:
> >>             data = val.byte;
> >>             break;
> >>         case 2:
> >>             data = le16_to_cpu(val.word);
> >>             break;
> >>         case 4:
> >>             data = le32_to_cpu(val.dword);
> >>             break;
> >>         default:
> >>             hw_error("vfio: unsupported read size, %d bytes\n", size);
> >>             break;
> >>         }
> >> 
> >>         trace_vfio_rom_read(vdev->vbasedev.name, addr, size, data);
> >> 
> >>         return data;
> >>     }
> >> 
> >> Why is that okay?
> >> 
> >> Why do we initialize @data?  
> >
> > Bug.  The switch was only added later (75bd0c7253f3) and we failed to
> > catch it.  Prior to that we were initializing val and the memcpy() only
> > overwrote it as necessary.  In any case, getting back garbage for the
> > rom when there isn't one generally works ok since the chances of
> > generating a proper rom signature are infinitesimal.  Clearly not what
> > was intended though.
> >  
> >> How can we get to the default case?  If we can get there, is hw_error()
> >> really the right thing to do?  It almost never is...  If getting there
> >> is the guest's fault, we need to tell it off the same way physical
> >> hardware does.  If we should not ever get there (i.e. it's a QEMU bug),
> >> then a plain abort() would be clearer.  
> >
> > AFAIK this is relatively standard, if not somewhat paranoid, handling
> > for a MemoryRegion ops callback.  The MemoryRegionOps code only allows
> > certain size accesses, so it would effectively be an internal error to
> > hit the default case, which seems to be not an uncommon use case of
> > hw_error.  Thanks,  
> 
> Using hw_error() for such programming errors is not helpful.  Everything
> it adds to abort() is useless or misleading.
> 
> In fact, most uses of hw_error() are not helpful.
> 
> But you're going with the flow here.  I accept that.



  reply	other threads:[~2020-03-12 14:08 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-24  6:42 [PATCH RESEND 0/3] fix some warnings by static code scan tool Longpeng(Mike)
2020-02-24  6:42 ` [PATCH RESEND 1/3] vfio/pci: fix a null pointer reference in vfio_rom_read Longpeng(Mike)
2020-02-24 16:04   ` Alex Williamson
2020-02-24 23:48     ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-10 16:11       ` Alex Williamson
2020-03-10 23:14         ` Laszlo Ersek
2020-03-11  1:36           ` Alex Williamson
2020-03-11  7:08             ` Markus Armbruster
2020-03-11 10:28               ` Laszlo Ersek
2020-03-11 10:26             ` Laszlo Ersek
2020-03-11 11:54               ` Markus Armbruster
2020-03-11 13:00                 ` Laszlo Ersek
2020-03-11  7:04     ` Markus Armbruster
     [not found]       ` <20200311093939.494bfe27@w520.home>
2020-03-12  5:50         ` Markus Armbruster
2020-03-12 14:07           ` Alex Williamson [this message]
2020-02-24  6:42 ` [PATCH RESEND 2/3] vhost: fix a null pointer reference of vhost_log Longpeng(Mike)
2020-03-10  2:11   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-10  5:57   ` Michael S. Tsirkin
2020-03-10  8:04     ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2020-03-10  8:23       ` Michael S. Tsirkin
2020-03-10 12:02         ` Longpeng (Mike)
2020-03-10 12:18           ` Michael S. Tsirkin
2020-02-24  6:42 ` [PATCH RESEND 3/3] util/pty: fix a null pointer reference in qemu_openpty_raw Longpeng(Mike)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200312080706.3f6f9d8e@w520.home \
    --to=alex.williamson@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=armbru@redhat.com \
    --cc=huangzhichao@huawei.com \
    --cc=longpeng2@huawei.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=weifuqiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.