All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Gerd Hoffmann <kraxel@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>
Cc: "igvt-g@ml01.01.org" <igvt-g@ml01.01.org>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Cao jin <caoj.fnst@cn.fujitsu.com>,
	"vfio-users@redhat.com" <vfio-users@redhat.com>
Subject: Re: [Qemu-devel] [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset	tweaks
Date: Tue, 2 Feb 2016 07:01:46 +0000	[thread overview]
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D15F793FFA@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <1454330962.10168.34.camel@redhat.com>

> From: Gerd Hoffmann
> Sent: Monday, February 01, 2016 8:49 PM
> 
>   Hi,
> 
> > Thanks for the tip that seabios allocated pages automatically become
> > e820 reserved, that simplifies things a bit.
> 
> It's common practice for all firmware.  The e820 table from qemu is just
> a starting point, it is not passed on to the guest os as-is.  All
> permanent allocations (acpi tables, smbios tables, seabios driver data
> such as virtio rings, ...) are taken away from RAM and added to
> RESERVED, and IIRC seabios also takes care to reserve the bios and
> option rom regions in real mode address space.

Agree. It's cleaner to have seabios do allocation otherwise it's prune to
cause conflict if vfio-pci randomly picks up an address (though we can
do special tweaks within seabios to skip that one by reading 0xFC).

> 
> > > Maybe we should define the interface as "guest writes 0xfc to pick
> > > address, qemu takes care to place opregion there".  That gives us the
> > > freedom to change the qemu implementation (either copy host opregion or
> > > map the host opregion) without breaking things.
> >
> > Ok, so seabios allocates two pages, writes the base address of those
> > pages to 0xfc and looks to see whether the signature appears at that
> > address due to qemu mapping.  It verifies the size and does a
> > free/realloc if not the right size.
> 
> I think seabios first needs to reserve something big enough for a
> temporary mapping, to check signature + size, otherwise the opregion
> might scratch data structures beyond opregion in case it happens to be
> larger than 8k.
> 
> How likely is it that the opregion size ever changes?  Should we better
> be prepared to handle it?  Or would it be ok to have a ...
> 
>    if (opregion_size > 8k)
>       panic();
> 
> ... style sanity check?

Above sanity check should be enough. We use 8k in KVMGT too.

> 
> > If the graphics signature does not
> > appear, free those pages and assume no opregion support.
> 
> Yes.
> 
> > If we later
> > decide to use a copy, we'd need to disable the 0xfc automagic mapping
> > and probably pass the data via fw_cfg.  Sound right?
> 
> I'd have qemu copy the data on 0xfc write then, so things continue to
> work without updating seabios.  So, the firmware has to allocate space,
> reserve it etc.,  and programming the 0xfc register.  Qemu has to make
> sure the opregion appears at the address written by the firmware, by
> whatever method it prefers.

Yup. It's Qemu's responsibility to expose opregion content. 

btw, prefer to do copying here. It's pointless to allow write from guest
side. One write example is SWSCI mailbox, thru which gfx driver can
trigger some SCI event to communicate with BIOS (specifically ACPI
methods here), mostly for some monitor operations. However it's 
not a right thing for guest to trigger host SCI and thus kick host 
ACPI methods.

> 
> > > lpc bridge is no problem, only pci id fields are copied over and
> > > unprivileged access is allowed for them.
> > >
> > > Copying the gfx registers of the host bridge is a problem indeed.
> >
> > I would argue that both are really a problem, libvirt wants to put QEMU
> > in a container that prevents access to any host system files other than
> > those explicitly allowed.  Therefore libvirt needs to grant the process
> > access to the lpc sysfs config file even though it only needs user
> > visible register values.
> 
> Yes, correct.  We want svirt be as strict as possible.
> 

That is the most tricky part in IGD pass-thru, which is why Intel decides
to remove those dependencies to make pass-thru easier.

Thanks
Kevin

WARNING: multiple messages have this Message-ID (diff)
From: "Tian, Kevin" <kevin.tian-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: Gerd Hoffmann <kraxel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Alex Williamson
	<alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: "igvt-g-y27Ovi1pjclAfugRpC6u6w@public.gmane.org"
	<igvt-g-y27Ovi1pjclAfugRpC6u6w@public.gmane.org>,
	"xen-devel-GuqFBffKawuULHF6PoxzQEEOCMrvLtNR@public.gmane.org"
	<xen-devel-GuqFBffKawuULHF6PoxzQEEOCMrvLtNR@public.gmane.org>,
	Eduardo Habkost
	<ehabkost-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Stefano Stabellini
	<stefano.stabellini-mvvWK6WmYclDPfheJLI6IQ@public.gmane.org>,
	"qemu-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org"
	<qemu-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org>,
	Cao jin <caoj.fnst-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>,
	"vfio-users-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org"
	<vfio-users-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [iGVT-g] [PATCH v3 00/11] igd passthrough chipset	tweaks
Date: Tue, 2 Feb 2016 07:01:46 +0000	[thread overview]
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D15F793FFA@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <1454330962.10168.34.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

> From: Gerd Hoffmann
> Sent: Monday, February 01, 2016 8:49 PM
> 
>   Hi,
> 
> > Thanks for the tip that seabios allocated pages automatically become
> > e820 reserved, that simplifies things a bit.
> 
> It's common practice for all firmware.  The e820 table from qemu is just
> a starting point, it is not passed on to the guest os as-is.  All
> permanent allocations (acpi tables, smbios tables, seabios driver data
> such as virtio rings, ...) are taken away from RAM and added to
> RESERVED, and IIRC seabios also takes care to reserve the bios and
> option rom regions in real mode address space.

Agree. It's cleaner to have seabios do allocation otherwise it's prune to
cause conflict if vfio-pci randomly picks up an address (though we can
do special tweaks within seabios to skip that one by reading 0xFC).

> 
> > > Maybe we should define the interface as "guest writes 0xfc to pick
> > > address, qemu takes care to place opregion there".  That gives us the
> > > freedom to change the qemu implementation (either copy host opregion or
> > > map the host opregion) without breaking things.
> >
> > Ok, so seabios allocates two pages, writes the base address of those
> > pages to 0xfc and looks to see whether the signature appears at that
> > address due to qemu mapping.  It verifies the size and does a
> > free/realloc if not the right size.
> 
> I think seabios first needs to reserve something big enough for a
> temporary mapping, to check signature + size, otherwise the opregion
> might scratch data structures beyond opregion in case it happens to be
> larger than 8k.
> 
> How likely is it that the opregion size ever changes?  Should we better
> be prepared to handle it?  Or would it be ok to have a ...
> 
>    if (opregion_size > 8k)
>       panic();
> 
> ... style sanity check?

Above sanity check should be enough. We use 8k in KVMGT too.

> 
> > If the graphics signature does not
> > appear, free those pages and assume no opregion support.
> 
> Yes.
> 
> > If we later
> > decide to use a copy, we'd need to disable the 0xfc automagic mapping
> > and probably pass the data via fw_cfg.  Sound right?
> 
> I'd have qemu copy the data on 0xfc write then, so things continue to
> work without updating seabios.  So, the firmware has to allocate space,
> reserve it etc.,  and programming the 0xfc register.  Qemu has to make
> sure the opregion appears at the address written by the firmware, by
> whatever method it prefers.

Yup. It's Qemu's responsibility to expose opregion content. 

btw, prefer to do copying here. It's pointless to allow write from guest
side. One write example is SWSCI mailbox, thru which gfx driver can
trigger some SCI event to communicate with BIOS (specifically ACPI
methods here), mostly for some monitor operations. However it's 
not a right thing for guest to trigger host SCI and thus kick host 
ACPI methods.

> 
> > > lpc bridge is no problem, only pci id fields are copied over and
> > > unprivileged access is allowed for them.
> > >
> > > Copying the gfx registers of the host bridge is a problem indeed.
> >
> > I would argue that both are really a problem, libvirt wants to put QEMU
> > in a container that prevents access to any host system files other than
> > those explicitly allowed.  Therefore libvirt needs to grant the process
> > access to the lpc sysfs config file even though it only needs user
> > visible register values.
> 
> Yes, correct.  We want svirt be as strict as possible.
> 

That is the most tricky part in IGD pass-thru, which is why Intel decides
to remove those dependencies to make pass-thru easier.

Thanks
Kevin

  parent reply	other threads:[~2016-02-02  7:02 UTC|newest]

Thread overview: 132+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-05 11:41 [Qemu-devel] [PATCH v3 00/11] igd passthrough chipset tweaks Gerd Hoffmann
2016-01-05 11:41 ` Gerd Hoffmann
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 01/11] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 02/11] pc: remove has_igd_gfx_passthru global Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-06 14:32   ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
2016-01-06 14:32     ` Stefano Stabellini
2016-01-19 15:09   ` [Qemu-devel] " Eduardo Habkost
2016-01-19 15:09     ` Eduardo Habkost
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 03/11] pc: move igd support code to igd.c Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 04/11] igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-06 14:32   ` [Qemu-devel] " Stefano Stabellini
2016-01-06 14:32     ` Stefano Stabellini
2016-01-23 14:51   ` [Qemu-devel] " Eduardo Habkost
2016-01-23 14:51     ` Eduardo Habkost
2016-01-25  8:59     ` [Qemu-devel] " Gerd Hoffmann
2016-01-25  8:59       ` Gerd Hoffmann
2016-01-25 11:53       ` [Qemu-devel] " Stefano Stabellini
2016-01-25 11:53         ` Stefano Stabellini
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 05/11] igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-06 14:41   ` [Qemu-devel] " Stefano Stabellini
2016-01-06 14:41     ` Stefano Stabellini
2016-01-06 15:45     ` [Qemu-devel] " Gerd Hoffmann
2016-01-06 15:45       ` Gerd Hoffmann
2016-01-19 15:13       ` [Qemu-devel] " Eduardo Habkost
2016-01-19 15:13         ` Eduardo Habkost
2016-01-20  9:10         ` [Qemu-devel] " Gerd Hoffmann
2016-01-20  9:10           ` Gerd Hoffmann
2016-01-23 14:52           ` [Qemu-devel] " Eduardo Habkost
2016-01-23 14:52             ` Eduardo Habkost
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 06/11] igd: use defines for standard pci config space offsets Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-06 14:43   ` [Qemu-devel] " Stefano Stabellini
2016-01-06 14:43     ` Stefano Stabellini
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 07/11] igd: revamp host config read Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-06 15:02   ` [Qemu-devel] " Stefano Stabellini
2016-01-06 15:02     ` Stefano Stabellini
2016-01-06 15:51     ` [Qemu-devel] " Gerd Hoffmann
2016-01-06 15:51       ` Gerd Hoffmann
2016-01-06 16:23       ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
2016-01-06 16:23         ` Stefano Stabellini
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 08/11] igd: add q35 support Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 09/11] igd: move igd-passthrough-isa-bridge to igd.c too Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 10/11] igd: handle igd-passthrough-isa-bridge setup in realize() Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-06 15:29   ` [Qemu-devel] " Stefano Stabellini
2016-01-06 15:29     ` Stefano Stabellini
2016-01-06 15:52     ` [Qemu-devel] " Gerd Hoffmann
2016-01-06 15:52       ` Gerd Hoffmann
2016-01-05 11:41 ` [Qemu-devel] [PATCH v3 11/11] igd: move igd-passthrough-isa-bridge creation to machine init Gerd Hoffmann
2016-01-05 11:41   ` Gerd Hoffmann
2016-01-06 15:36   ` [Qemu-devel] " Stefano Stabellini
2016-01-06 15:36     ` Stefano Stabellini
2016-01-07  7:38     ` [Qemu-devel] " Gerd Hoffmann
2016-01-07  7:38       ` Gerd Hoffmann
2016-01-07 13:10       ` [Qemu-devel] " Stefano Stabellini
2016-01-07 13:10         ` Stefano Stabellini
2016-01-07 15:50         ` [Qemu-devel] " Gerd Hoffmann
2016-01-07 15:50           ` Gerd Hoffmann
2016-01-08 11:20           ` Stefano Stabellini
2016-01-08 11:20             ` Stefano Stabellini
2016-01-08 12:12             ` [Qemu-devel] " Stefano Stabellini
2016-01-08 12:12               ` Stefano Stabellini
2016-01-08 12:32               ` Gerd Hoffmann
2016-01-08 12:32                 ` Gerd Hoffmann
2016-01-08 12:38                 ` [Qemu-devel] " Stefano Stabellini
2016-01-08 12:38                   ` Stefano Stabellini
2016-01-05 13:07 ` [Qemu-devel] [PATCH v3 00/11] igd passthrough chipset tweaks Michael S. Tsirkin
2016-01-05 13:07   ` Michael S. Tsirkin
2016-01-28 19:35 ` [Qemu-devel] [vfio-users] " Alex Williamson
2016-01-28 19:35   ` Alex Williamson
2016-01-29  2:22   ` [Qemu-devel] [iGVT-g] " Kay, Allen M
2016-01-29  2:22     ` Kay, Allen M
2016-01-29  2:54     ` [Qemu-devel] " Alex Williamson
2016-01-29  2:54       ` Alex Williamson
2016-01-29  6:21       ` [Qemu-devel] " Jike Song
2016-01-29  6:21         ` Jike Song
2016-01-29 21:58       ` [Qemu-devel] " Kay, Allen M
2016-01-29 21:58         ` Kay, Allen M
2016-02-02  7:07         ` [Qemu-devel] " Tian, Kevin
2016-02-02  7:07           ` Tian, Kevin
2016-02-02 19:10           ` [Qemu-devel] " Kay, Allen M
2016-02-02 19:10             ` [iGVT-g] " Kay, Allen M
2016-02-02 19:37             ` [Qemu-devel] [iGVT-g] [vfio-users] " Alex Williamson
2016-02-02 19:37               ` Alex Williamson
2016-02-02 23:32               ` [Qemu-devel] " Kay, Allen M
2016-02-02 23:32                 ` Kay, Allen M
2016-01-29  7:09   ` [Qemu-devel] " Gerd Hoffmann
2016-01-29  7:09     ` Gerd Hoffmann
2016-01-29 17:59     ` [Qemu-devel] " Alex Williamson
2016-01-29 17:59       ` Alex Williamson
2016-01-30  1:18       ` [Qemu-devel] [iGVT-g] " Kay, Allen M
2016-01-30  1:18         ` Kay, Allen M
2016-01-31 17:42         ` [Qemu-devel] " Alex Williamson
2016-01-31 17:42           ` Alex Williamson
2016-02-02  0:04           ` [Qemu-devel] " Kay, Allen M
2016-02-02  0:04             ` Kay, Allen M
2016-02-02  6:42             ` [Qemu-devel] [Xen-devel] " Tian, Kevin
2016-02-02  6:42               ` Tian, Kevin
2016-02-02 11:50               ` [Qemu-devel] " David Woodhouse
2016-02-02 11:50                 ` David Woodhouse
2016-02-02 14:54                 ` [Qemu-devel] " Alex Williamson
2016-02-02 14:54                   ` Alex Williamson
2016-02-02 15:06                   ` [Qemu-devel] " David Woodhouse
2016-02-02 15:06                     ` David Woodhouse
2016-02-02 14:38             ` [Qemu-devel] " Alex Williamson
2016-02-02 14:38               ` Alex Williamson
2016-02-01 12:49       ` [Qemu-devel] " Gerd Hoffmann
2016-02-01 12:49         ` Gerd Hoffmann
2016-02-01 22:16         ` [Qemu-devel] " Alex Williamson
2016-02-01 22:16           ` Alex Williamson
2016-02-02  7:43           ` [Qemu-devel] [vfio-users] " Gerd Hoffmann
2016-02-02  7:43             ` Gerd Hoffmann
2016-02-02  7:01         ` Tian, Kevin [this message]
2016-02-02  7:01           ` [iGVT-g] " Tian, Kevin
2016-02-02  8:56           ` [Qemu-devel] [iGVT-g] [vfio-users] " Gerd Hoffmann
2016-02-02  8:56             ` Gerd Hoffmann
2016-02-02 16:31             ` [Qemu-devel] " Kevin O'Connor
2016-02-02 16:31               ` Kevin O'Connor
2016-02-02 16:49               ` [Qemu-devel] " Laszlo Ersek
2016-02-02 16:49                 ` Laszlo Ersek
2016-02-02 20:18               ` [Qemu-devel] " Alex Williamson
2016-02-02 20:18                 ` Alex Williamson
2016-02-03  6:08             ` [Qemu-devel] " Tian, Kevin
2016-02-03  6:08               ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AADFC41AFE54684AB9EE6CBC0274A5D15F793FFA@SHSMSX101.ccr.corp.intel.com \
    --to=kevin.tian@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=caoj.fnst@cn.fujitsu.com \
    --cc=ehabkost@redhat.com \
    --cc=igvt-g@ml01.01.org \
    --cc=kraxel@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=vfio-users@redhat.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.