* pciback: question about the permissive flag
@ 2010-07-06 21:37 Joanna Rutkowska
  2010-07-07  6:32 ` Keir Fraser
  2010-07-07 15:18 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 11+ messages in thread
From: Joanna Rutkowska @ 2010-07-06 21:37 UTC (permalink / raw)
  To: xen-devel



I'm trying to understand the purpose of the permissive flag in the Xen
pciback driver. The comments in the code suggest that setting
permissive=1 is "potentially unsafe", and I've been wondering why?

My thinking goes this way -- we either:

1) have IOMMU/VT-d in the system, and use it to isolate the device
assigned to a DomU, in which case allowing the DomU to fully control the
assigned device's config space should not be a problem because VT-d
should do its job (we hope at least ;),

or

2) we don't have IOMMU/VT-d, in which case assigning a device to
anything other than Dom0 is simply insecure, no matter if we try to
restrict access to config space (but still allow DMA engine to be
programmed by DomU) or not.

So, what am I missing here?

joanna.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


* Re: pciback: question about the permissive flag
  2010-07-06 21:37 pciback: question about the permissive flag Joanna Rutkowska
@ 2010-07-07  6:32 ` Keir Fraser
  2010-07-07 13:30   ` Ian Pratt
  2010-07-07 15:18 ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 11+ messages in thread
From: Keir Fraser @ 2010-07-07  6:32 UTC (permalink / raw)
  To: Joanna Rutkowska, xen-devel

On 06/07/2010 22:37, "Joanna Rutkowska" <joanna@invisiblethingslab.com>
wrote:

> So, what am I missing here?

I think the fear was that there could be class- or device-specific config
registers that we wouldn't know how to handle, and which could have
unexpected effects if they are passed through naively. Concrete examples
were never given, and this was all pre-vtd so as you say pass-through of a
DMA-capable device was insecure anyway. I've always thought the permissive
flag stuff was pretty useless, and I always suggest that people enable the
permissive flag.

 -- Keir


* RE: pciback: question about the permissive flag
  2010-07-07  6:32 ` Keir Fraser
@ 2010-07-07 13:30   ` Ian Pratt
  2010-07-07 14:05     ` Joanna Rutkowska
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Pratt @ 2010-07-07 13:30 UTC (permalink / raw)
  To: Keir Fraser, Joanna Rutkowska, xen-devel; +Cc: Ian Pratt

> I think the fear was that there could be class- or device-specific config
> registers that we wouldn't know how to handle, and which could have
> unexpected effects if they are passed through naively. Concrete examples
> were never given, and this was all pre-vtd so as you say pass-through of a
> DMA-capable device was insecure anyway. I've always thought the permissive
> flag stuff was pretty useless, and I always suggest that people enable the
> permissive flag.

There are some devices (typically integrated ones, e.g. igfx) that use PCI config space in nasty ways, such as to describe additional BARs, or to trigger SMIs. Allowing free access to these seems dangerous. 

Ian


* Re: pciback: question about the permissive flag
  2010-07-07 13:30   ` Ian Pratt
@ 2010-07-07 14:05     ` Joanna Rutkowska
  2010-07-07 15:28       ` Konrad Rzeszutek Wilk
  2010-07-07 15:44       ` Ian Pratt
  0 siblings, 2 replies; 11+ messages in thread
From: Joanna Rutkowska @ 2010-07-07 14:05 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Keir Fraser



On 07/07/10 15:30, Ian Pratt wrote:
>> I think the fear was that there could be class- or device-specific
>> config registers that we wouldn't know how to handle, and which
>> could have unexpected effects if they are passed through naively.
>> Concrete examples were never given, and this was all pre-vtd so as
>> you say pass-through of a DMA-capable device was insecure anyway.
>> I've always thought the permissive flag stuff was pretty useless,
>> and I always suggest that people enable the permissive flag.
> 
> There are some devices (typically integrated ones, e.g. igfx) that
> use PCI config space in nasty ways, such as to describe additional
> BARs, or to trigger SMIs. Allowing free access to these seems
> dangerous.
> 

So, you're saying that, if we have a device that allows us to set some
of its PCI config register (some BAR) to tell where to MMIO-map some of
the device's additional config range, and if we "asked it" to map it
over, say, some physical addresses belonging to the hypervisor, then the
MCH would allow for that? And the CPU would happily redirect access to
those addresses over to the device memory? Why would it? That would
clearly be a CPU/chipset bug, as we normally would have to mark this
memory range as MMIOed in the first place...

And even if we wanted to instruct the device to map its memory over some
already MMIOed memory in a hypervisor, shouldn't VT-d prevent the
read/write transactions going to this device?

As for the SMI generation: that stinks indeed. But, does it offer any
control over the generated #SMI, e.g. what we write into the 0xb2 port,
or something like that? If it does, then surely it's an avenue for
DomU->SMM escalation, which would mean full system compromise.

I'm trying to figure out why so many drivers do not work well when run
in a PV driver domain (specifically net drivers), but work fine when
running in Dom0. Clearly this is not a pfn != mfn problem, as this
inequality also applies to Dom0, while in Dom0 the same drivers work
just fine. So it seems like it could only be caused by either of the
following:
1) restricted access to device config space
2) interrupt routing problem

Or maybe something else?

Thanks,
joanna.



* Re: pciback: question about the permissive flag
  2010-07-06 21:37 pciback: question about the permissive flag Joanna Rutkowska
  2010-07-07  6:32 ` Keir Fraser
@ 2010-07-07 15:18 ` Konrad Rzeszutek Wilk
  2010-07-07 21:23   ` Joanna Rutkowska
  1 sibling, 1 reply; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-07-07 15:18 UTC (permalink / raw)
  To: Joanna Rutkowska; +Cc: xen-devel

On Tue, Jul 06, 2010 at 11:37:27PM +0200, Joanna Rutkowska wrote:
> I'm trying to understand the purpose of the permissive flag in the Xen
> pciback driver. The comments in the code suggest that setting
> permissive=1 is "potentially unsafe", and I've been wondering why?
> 
> My thinking goes this way -- we either:
> 
> 1) have IOMMU/VT-d in the system, and use it to isolate the device
> assigned to a DomU, in which case allowing the DomU to fully control the
> assigned device's config space should not be a problem because VT-d

But that is not the case. The PCI config writes are actually done by
Dom0. The Xen PCI frontend redirects all config space reads/writes to
the Xen PCI backend that does them on the guest's behalf.

There are some backend-backend config space libs that deal with
different regions (power, MSI), and for those that are not present
the permissive flag is used to figure out whether the guest is allowed
to write to that region.
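
The split described above (known regions get a dedicated handler, everything else falls back to the permissive flag) can be sketched roughly as follows. This is a simplified model for illustration only, not the pciback code: the names `Field`, `FIELDS` and `filter_config_write`, the register choices and the mask value are all assumptions, and a real backend also has to deal with writes that only partially overlap a field.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Field:
    """A config-space region the backend knows how to handle (hypothetical)."""
    offset: int
    size: int
    # Handler returning the value that is really written; None means read-only.
    write: Optional[Callable[[int], int]]


# Hypothetical table: vendor ID is read-only, command register is filtered.
FIELDS: List[Field] = [
    Field(offset=0x00, size=2, write=None),
    Field(offset=0x04, size=2, write=lambda v: v & 0x0007),  # mask is an assumption
]


def filter_config_write(fields: List[Field], permissive: bool,
                        offset: int, size: int,
                        value: int) -> Tuple[bool, Optional[int]]:
    """Decide what happens to a guest config write.

    Returns (allowed, value_to_write). Writes that fall inside a known
    field go through that field's handler; writes to anything else are
    passed through only when the permissive flag is set.
    """
    for f in fields:
        if f.offset <= offset and offset + size <= f.offset + f.size:
            if f.write is None:
                return (False, None)       # known but read-only
            return (True, f.write(value))  # known, filtered by handler
    if permissive:
        return (True, value)               # unknown region, passed through
    return (False, None)                   # unknown region, discarded
```

With this model, a write to a device-specific register such as offset 0xE0 is dropped by default and reaches the hardware unchanged only with `permissive=True`, which is the case the "potentially unsafe" comment is concerned with.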


* Re: pciback: question about the permissive flag
  2010-07-07 14:05     ` Joanna Rutkowska
@ 2010-07-07 15:28       ` Konrad Rzeszutek Wilk
  2010-07-07 15:44       ` Ian Pratt
  1 sibling, 0 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-07-07 15:28 UTC (permalink / raw)
  To: Joanna Rutkowska; +Cc: Ian Pratt, xen-devel, Keir Fraser

On Wed, Jul 07, 2010 at 04:05:44PM +0200, Joanna Rutkowska wrote:
> On 07/07/10 15:30, Ian Pratt wrote:
> >> I think the fear was that there could be class- or device-specific
> >> config registers that we wouldn't know how to handle, and which
> >> could have unexpected effects if they are passed through naively.
> >> Concrete examples were never given, and this was all pre-vtd so as
> >> you say pass-through of a DMA-capable device was insecure anyway.
> >> I've always thought the permissive flag stuff was pretty useless,
> >> and I always suggest that people enable the permissive flag.
> > 
> > There are some devices (typically integrated ones, e.g. igfx) that
> > use PCI config space in nasty ways, such as to describe additional
> > BARs, or to trigger SMIs. Allowing free access to these seems
> > dangerous.
> > 
> 
> So, you're saying that, if we have a device that allows us to set some
> of its PCI config register (some BAR) to tell where to MMIO-map some of
> the device's additional config range, and if we "asked it" to map it
> over, say, some physical addresses belonging to the hypervisor, then the
> MCH would allow for that? And the CPU would happily redirect access to
> those addresses over to the device memory? Why would it? That would

I would think the VT-d chipset would throw a fit.
> clearly be a CPU/chipset bug, as we normally would have to mark this
> memory range as MMIOed in the first place...
> 
> And even if we wanted to instruct the device to map its memory over some
> already MMIOed memory in a hypervisor, shouldn't VT-d prevent the
> read/write transactions going to this device?

That is my feeling too.
> 
> As for the SMI generation: that stinks indeed. But, does it offer any
> control over the generated #SMI, e.g. what we write into the 0xb2 port,
> or something like that? If it does, then surely it's an avenue for
> DomU->SMM escalation, which would mean full system compromise.
> 
> I'm trying to figure out why so many drivers do not work well when run
> in a PV driver domain (specifically net drivers), but work fine when
> running in Dom0. Clearly this is not a pfn != mfn problem, as this
> inequality also applies to Dom0, while in Dom0 the same drivers work
> just fine. So it seems like it could only be caused by either of the
> following:
> 1) restricted access to device config space

You can track those easily. Turn on xen-pciback.verbose=1 and you should
see the writes/reads and see if there are any that touch on the
restricted areas.

> 2) interrupt routing problem

Well, that can easily be seen in /proc/interrupts. If the numbers
are increasing, the interrupts are getting there.

Though if this is MSI/MSI-X, make sure you have the latest pv-ops
kernel. There were some bugs I introduced earlier on so that turning on
MSI/MSI-X interrupts would trash the guest. That has been fixed
nowadays.
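
The /proc/interrupts check can be automated with a small script along these lines. It is only a sketch: the function names are made up for this example, and it assumes the usual layout of a CPU header row followed by one "label: per-CPU counts description" row per interrupt.

```python
import time
from typing import Dict


def parse_interrupts(text: str) -> Dict[str, int]:
    """Return {irq label: total count summed over all CPUs}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals: Dict[str, int] = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        count = 0
        # Only the first ncpus tokens can be counters; stop at the
        # first non-numeric token (rows such as ERR have one column).
        for tok in rest.split()[:ncpus]:
            if not tok.isdigit():
                break
            count += int(tok)
        totals[label.strip()] = count
    return totals


def growing_irqs(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return {irq label: increase} for every counter that went up."""
    return {irq: after[irq] - before[irq]
            for irq in after
            if irq in before and after[irq] > before[irq]}


def sample(path: str = "/proc/interrupts", interval: float = 1.0) -> Dict[str, int]:
    """Read the file twice, interval seconds apart, and report the growth."""
    with open(path) as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_interrupts(f.read())
    return growing_irqs(first, second)
```

Calling `sample()` inside the driver domain while traffic is flowing should list the device's IRQ if its interrupts are being delivered; if the IRQ never shows up in the result, interrupt routing is the more likely suspect.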

> 
> Or maybe something else?

If you crank up the debug options something should show up. Especially
if you have the IOMMU turned on.

Are these wireless drivers?


* RE: pciback: question about the permissive flag
  2010-07-07 14:05     ` Joanna Rutkowska
  2010-07-07 15:28       ` Konrad Rzeszutek Wilk
@ 2010-07-07 15:44       ` Ian Pratt
  2010-07-07 21:41         ` Joanna Rutkowska
  1 sibling, 1 reply; 11+ messages in thread
From: Ian Pratt @ 2010-07-07 15:44 UTC (permalink / raw)
  To: Joanna Rutkowska; +Cc: Ian Pratt, xen-devel, Keir Fraser

> So, you're saying that, if we have a device that allows us to set some of
> its PCI config register (some BAR) to tell where to MMIO-map some of the
> device's additional config range, and if we "asked it" to map it over,
> say, some physical addresses belonging to the hypervisor, then the MCH
> would allow for that? And the CPU would happily redirect access to those
> addresses over to the device memory? Why would it? That would clearly be a
> CPU/chipset bug, as we normally would have to mark this memory range as
> MMIOed in the first place...

Mapping it over memory might be prevented by the MCH (would you want to rely on that?), but mapping it over another device is likely going to create system instability if not a vulnerability.
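
As an illustration of the "mapping it over another device" case, the check below flags a proposed MMIO window that would collide with windows already claimed by other devices. It is a hypothetical sketch of the kind of validation a backend or toolstack could apply before honouring a BAR-like write; the function names and the example addresses are assumptions, not something pciback is known to implement.

```python
from typing import Dict, List, Tuple


def ranges_overlap(a_base: int, a_size: int, b_base: int, b_size: int) -> bool:
    """True if the half-open ranges [base, base + size) intersect."""
    return a_base < b_base + b_size and b_base < a_base + a_size


def conflicting_windows(new_base: int, new_size: int,
                        windows: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return names of existing windows that the proposed window overlaps.

    windows maps a device name to (base, size) of an MMIO range that is
    already decoded by that device.
    """
    return [name for name, (base, size) in windows.items()
            if ranges_overlap(new_base, new_size, base, size)]


# Made-up address map for the example.
EXISTING = {
    "nic": (0xF0000000, 0x20000),
    "ahci": (0xF0020000, 0x1000),
}
```

Here `conflicting_windows(0xF001F000, 0x2000, EXISTING)` reports both devices, while a window starting at 0xF0021000 reports none. A check like this only works for BARs the backend knows about; a device-specific register that silently defines an extra BAR is exactly what it cannot see.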

> And even if we wanted to instruct the device to map its memory over some
> already MMIOed memory in a hypervisor, shouldn't VT-d prevent the
> read/write transactions going to this device?

VT-d only deals with DMAs coming from the device, not CPU MMIOs.

> As for the SMI generation: that stinks indeed. But, does it offer any
> control over the generated #SMI, e.g. what we write into the 0xb2 port, or
> something like that? 

No idea. Discarding such config writes just seems like a good default.

Ian


* Re: pciback: question about the permissive flag
  2010-07-07 15:18 ` Konrad Rzeszutek Wilk
@ 2010-07-07 21:23   ` Joanna Rutkowska
  2010-07-09 14:09     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 11+ messages in thread
From: Joanna Rutkowska @ 2010-07-07 21:23 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel



On 07/07/10 17:18, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 06, 2010 at 11:37:27PM +0200, Joanna Rutkowska wrote:
>> I'm trying to understand the purpose of the permissive flag in the Xen
>> pciback driver. The comments in the code suggest that setting
>> permissive=1 is "potentially unsafe", and I've been wondering why?
>>
>> My thinking goes this way -- we either:
>>
>> 1) have IOMMU/VT-d in the system, and use it to isolate the device
>> assigned to a DomU, in which case allowing the DomU to fully control the
>> assigned device's config space should not be a problem because VT-d
> 
> But that is not the case. The PCI config writes are actually done by
> Dom0. The Xen PCI frontend redirects all config space reads/writes to
> the Xen PCI backend that does them on the guest's behalf.
> 

Hmm, not sure if I understand why you wrote "this is not the case"
above? Of course DomU cannot directly change anything in PCI config
space of any device, because its kernel code executes in Ring 3 or 1,
and cannot do IO to 0xcf8/cfc. But I was under the impression that once we
assign a PCI device to the DomU, and once we set permissive=1, then this
would effectively allow DomU to fully control the device config space.
Is this not correct?

> There are some backend-backend config space libs that deal with
> different regions (power, MSI), and for those that are not present
> the permissive flag is used to figure out whether the guest is allowed
> to write to that region.
> 

What do you mean by a "backend-backend" lib?

joanna.



* Re: pciback: question about the permissive flag
  2010-07-07 15:44       ` Ian Pratt
@ 2010-07-07 21:41         ` Joanna Rutkowska
  2010-07-07 22:51           ` Ian Pratt
  0 siblings, 1 reply; 11+ messages in thread
From: Joanna Rutkowska @ 2010-07-07 21:41 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Keir Fraser



On 07/07/10 17:44, Ian Pratt wrote:
>> So, you're saying that, if we have a device that allows us to set
>> some of its PCI config register (some BAR) to tell where to
>> MMIO-map some of the device's additional config range, and if we
>> "asked it" to map it over, say, some physical addresses belonging
>> to the hypervisor, then the MCH would allow for that? And the CPU
>> would happily redirect access to those addresses over to the device
>> memory? Why would it? That would clearly be a CPU/chipset bug, as
>> we normally would have to mark this memory range as MMIOed in the
>> first place...
> 
> Mapping it over memory might be prevented by the MCH (would you want
> to rely on that?),

Well, we need to rely on the CPU and MCH anyway, so why not? :)

> but mapping it over another device is likely going
> to create system instability if not a vulnerability.
>
>> And even if we wanted to instruct the device to map its memory over
>> some already MMIOed memory in a hypervisor, shouldn't VT-d prevent
>> the read/write transactions going to this device?
> 
> VT-d only deals with DMAs coming from the device, not CPU MMIOs.
> 

So, we would have two devices on the PCIe bus that would be willing to
respond to a single PCI read request (for some address where both of the
devices map some of their memory). I guess which device would actually
answer would be implementation/race-condition specific.

Let's assume the "bad" device answers the PCIe read request first, and
will send its data back (this is what the attacker hopes to achieve --
to feed unexpected data into the hypervisor/Dom0). Are you saying that
VT-d would not prevent this answer coming back to the CPU? Can somebody
from Intel comment on this? This is interesting.

>> As for the SMI generation: that stinks indeed. But, does it offer
>> any control over the generated #SMI, e.g. what we write into the
>> 0xb2 port, or something like that?
> 
> No idea. Discarding such config writes just seems like a good
> default.
> 

So far I've been aware that the southbridge can trigger #SMI in response
to certain conditions, e.g. wrong BDF address (which is apparently used
by OEMs to emulate PCI devices from within SMM, how crazy is this?!).
But what would be the reason to let an IGD device trigger #SMI?

Can Interrupt Remapping (apparently present in VT-d2) be used to prevent
a device from triggering an #SMI?

Thanks,
joanna.



* RE: pciback: question about the permissive flag
  2010-07-07 21:41         ` Joanna Rutkowska
@ 2010-07-07 22:51           ` Ian Pratt
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Pratt @ 2010-07-07 22:51 UTC (permalink / raw)
  To: Joanna Rutkowska; +Cc: Ian Pratt, xen-devel, Keir Fraser

> >> And even if we wanted to instruct the device to map its memory over
> >> some already MMIOed memory in a hypervisor, shouldn't VT-d prevent
> >> the read/write transactions going to this device?
> >
> > VT-d only deals with DMAs coming from the device, not CPU MMIOs.
> >
> 
> So, we would have two devices on the PCIe bus that would be willing to
> respond to a single PCI read request (for some address where both of the
> devices map some of their memory). I guess which device would actually
> answer would be implementation/race-condition specific.

On a PCI bus there's definitely opportunity for races. 
On a PCIe bus I'm not entirely sure what would happen as the bridge/IOH presumably has an opinion of which addresses should be routed through which ports. [You also have to be careful of multiple devices behind non-ACS capable bridges where creating a new BAR could cause DMAs to go peer-to-peer.]
 
> Let's assume the "bad" device answers the PCIe read request first, and will
> send its data back (this is what the attacker hopes to achieve -- to feed
> unexpected data into the hypervisor/Dom0). Are you saying that VT-d would
> not prevent this answer coming back to the CPU? Can somebody from Intel
> comment on this? This is interesting.

VT-d only gets involved with transactions initiated by the device (i.e. DMAs). Control/remapping of MMIO transactions initiated by the CPU is handled by the normal CPU MMU.

> >> As for the SMI generation: that stinks indeed. But, does it offer any
> >> control over the generated #SMI, e.g. what we write into the
> >> 0xb2 port, or something like that?
> >
> > No idea. Discarding such config writes just seems like a good default.
> >
> 
> So far I've been aware that the southbridge can trigger #SMI in response
> to certain conditions, e.g. wrong BDF address (which is apparently used by
> OEMs to emulate PCI devices from within SMM, how crazy is this?!).
> But what would be the reason to let an IGD device trigger #SMI?

Probably something like OpRegion doorbells.
 
> Can Interrupt Remapping (apparently present in VT-d2) be used to prevent a
> device from triggering an #SMI?

Er, that's beyond my knowledge...
 

Ian


* Re: pciback: question about the permissive flag
  2010-07-07 21:23   ` Joanna Rutkowska
@ 2010-07-09 14:09     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-07-09 14:09 UTC (permalink / raw)
  To: Joanna Rutkowska; +Cc: xen-devel

On Wed, Jul 07, 2010 at 11:23:38PM +0200, Joanna Rutkowska wrote:
> On 07/07/10 17:18, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 06, 2010 at 11:37:27PM +0200, Joanna Rutkowska wrote:
> >> I'm trying to understand the purpose of the permissive flag in the Xen
> >> pciback driver. The comments in the code suggest that setting
> >> permissive=1 is "potentially unsafe", and I've been wondering why?
> >>
> >> My thinking goes this way -- we either:
> >>
> >> 1) have IOMMU/VT-d in the system, and use it to isolate the device
> >> assigned to a DomU, in which case allowing the DomU to fully control the
> >> assigned device's config space should not be a problem because VT-d
> > 
> > But that is not the case. The PCI config writes are actually done by
> > Dom0. The Xen PCI frontend redirects all config space reads/writes to
> > the Xen PCI backend that does them on the guest's behalf.
> > 
> 
> Hmm, not sure if I understand why you wrote "this is not the case"
> above? Of course DomU cannot directly change anything in PCI config
> space of any device, because its kernel code executes in Ring 3 or 1,
> > and cannot do IO to 0xcf8/cfc. But I was under the impression that once we
> assign a PCI device to the DomU, and once we set permissive=1, then this
> would effectively allow DomU to fully control the device config space.
> Is this not correct?

That is correct.
> 
> > There are some backend-backend config space libs that deal with
> > different regions (power, MSI), and for those that are not present
> > the permissive flag is used to figure out whether the guest is allowed
> > to write to that region.
> > 
> 
> What do you mean by a "backend-backend" lib?

drivers/xen/pciback/conf_space_*
> 
> joanna.
> 
