* Need some advices on how to workaround a hardware bug
From: Chao Gao @ 2018-03-30  5:14 UTC
  To: xen-devel

Hi,

I hit an EPT violation, after which Xen destroyed the guest, when I
assigned a device to that guest. After some investigation, I found that
the cause is that the device is not a standard PCI device: its MSI-X PBA
is located in the same 4-KB page as other CSRs. When the driver in the
guest writes to registers in that page, an EPT violation occurs because
the PBA page is marked read-only by the following lines in
msix_capability_init():
        if ( rangeset_add_range(mmio_ro_ranges, msix->pba.first,
                                msix->pba.last) )
I think Xen marks this page read-only because the PCI spec says:
	Software should never write, and should only read Pending Bits.
	If software writes to Pending Bits, the result is undefined
Xen then goes through all registered MMIO ranges, finds that this address
has not been registered, and thus destroys the guest.
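
To make the failure mode concrete, here is a minimal standalone sketch (not
Xen code) of the arithmetic involved. It assumes the read-only range is
tracked at 4-KB page-frame granularity, so a small PBA makes its whole page
read-only. The helper names and constants are hypothetical; the only spec
fact used is that the PBA holds one pending bit per vector, packed into
64-bit QWORDs.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* PBA size in bytes: one pending bit per vector, rounded up to 64-bit QWORDs. */
static uint64_t pba_size_bytes(unsigned int nr_vectors)
{
    return (((uint64_t)nr_vectors + 63) / 64) * 8;
}

/* Inclusive page-frame range [*first, *last] covered by the PBA. */
static void pba_pfn_range(uint64_t pba_addr, unsigned int nr_vectors,
                          uint64_t *first, uint64_t *last)
{
    *first = pba_addr >> PAGE_SHIFT;
    *last = (pba_addr + pba_size_bytes(nr_vectors) - 1) >> PAGE_SHIFT;
}

/*
 * True if the PBA does not fill whole pages, i.e. the read-only pages
 * leave room for other registers that a driver may need to write.
 */
static bool pba_shares_page(uint64_t pba_addr, unsigned int nr_vectors)
{
    return (pba_addr % PAGE_SIZE) != 0 ||
           (pba_size_bytes(nr_vectors) % PAGE_SIZE) != 0;
}
```

For example, an 8-vector PBA at the made-up address 0xfe003800 occupies only
8 bytes, yet the whole frame 0xfe003 would end up read-only.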

I plan to develop a workaround for this issue so that Xen guests (and
also dom0, if dom0 uses EPT? I am not sure) can efficiently use devices
that violate the PCI spec in this way. Currently I see two options (EPT
SPP might provide a perfect solution):

One is to trap accesses to the page where the PBA is located, ignore
writes to the PBA, and apply writes to the other fields in the same page.
This would incur significant performance degradation if the page is
accessed frequently. To mitigate the performance drop, a patch to trap
the PBA lazily, like what qemu does [1], would be needed.
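
A rough sketch of what the write filter in this first option could look
like, written as standalone C rather than against Xen's real MMIO handler
interface. The `pba_window` structure, the function names, and the use of a
byte array standing in for the device page are all assumptions made for
illustration. Dropping any access that overlaps the PBA entirely (instead of
splitting it) is also a simplification.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Hypothetical description of where the PBA sits inside the trapped page. */
struct pba_window {
    uint32_t start; /* offset of the first PBA byte within the page */
    uint32_t len;   /* PBA length in bytes */
};

/* True if the access [off, off + size) overlaps the PBA window. */
static bool access_hits_pba(const struct pba_window *pba,
                            uint32_t off, uint32_t size)
{
    return off < pba->start + pba->len && pba->start < off + size;
}

/*
 * Emulate one trapped guest write. `page` stands in for the device page.
 * Writes overlapping the PBA are silently ignored; all other in-bounds
 * writes are applied. Returns true if the write was applied.
 */
static bool filtered_write(uint8_t *page, const struct pba_window *pba,
                           uint32_t off, const void *data, uint32_t size)
{
    if ( size == 0 || off >= PAGE_SIZE || size > PAGE_SIZE - off )
        return false;
    if ( access_hits_pba(pba, off, size) )
        return false;
    memcpy(page + off, data, size);
    return true;
}
```

Every write to the page takes this path, which is where the performance
concern above comes from.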

The other is to not trap accesses to the page where the PBA is located.
With this option, all accesses to the page go to the hardware device
without Xen's interception. One concern is whether this option would
open security holes compared with trapping these accesses. In my mind,
the answer is no, because Xen doesn't even read the PBA. A corner case
for this option is a PBA that resides in the same page as the MSI-X table,
which is allowed according to the following description in the PCI spec:
	The MSI-X Table and MSI-X PBA are permitted to co-reside within a
	naturally aligned 4-KB address range, though they must not overlap with
	each other.
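
The corner case can be detected with a simple page-frame comparison. The
following is a standalone sketch with hypothetical function names; it relies
only on the spec-defined sizes (16 bytes per MSI-X table entry, one pending
bit per vector packed into QWORDs) and assumes 4-KB pages.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT      12
#define MSIX_ENTRY_SIZE 16u

/* True if two non-empty byte ranges touch at least one common 4-KB page. */
static bool ranges_share_page(uint64_t a_start, uint64_t a_len,
                              uint64_t b_start, uint64_t b_len)
{
    uint64_t a_first = a_start >> PAGE_SHIFT;
    uint64_t a_last = (a_start + a_len - 1) >> PAGE_SHIFT;
    uint64_t b_first = b_start >> PAGE_SHIFT;
    uint64_t b_last = (b_start + b_len - 1) >> PAGE_SHIFT;

    return a_first <= b_last && b_first <= a_last;
}

/* True if the MSI-X table and the PBA co-reside in some 4-KB page. */
static bool table_and_pba_share_page(uint64_t table_addr, uint64_t pba_addr,
                                     unsigned int nr_vectors)
{
    uint64_t table_len = (uint64_t)nr_vectors * MSIX_ENTRY_SIZE;
    uint64_t pba_len = (((uint64_t)nr_vectors + 63) / 64) * 8;

    return ranges_share_page(table_addr, table_len, pba_addr, pba_len);
}
```

If this returns true, leaving the PBA page untrapped would also leave part
of the MSI-X table untrapped, so such a device would need separate handling.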

Which one do you think is better? Or do you have any other thoughts on
how to work around this case?

[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=95239e162518dc6577164be3d9a789aba7f591a3

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: Need some advices on how to workaround a hardware bug
From: Jan Beulich @ 2018-03-30  8:23 UTC
  To: chao.gao; +Cc: xen-devel

First of all, I don't think the qemu change you point out is an equivalent for
the situation here: We don't emulate PBA, we only control access.

Not trapping write accesses to PBA is okay only under one of two conditions:
For Dom0 (which we trust) or if the host admin gave their consent. This extends
to both variants you suggest - any non-spec compliant behavior bears the risk
of undermining security of the entire system beyond the well known issues with
pass-through. Hence apart from EPT SPP (which we don't have yet), only a
command line or guest config controlled approach of making exceptions from
the base policy is viable imo.
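
The base policy and its two exceptions described above can be written down
as a small decision function. This is only an illustrative sketch: the
structure, enum, and function names are hypothetical and do not correspond
to existing Xen interfaces.

```c
#include <stdbool.h>

enum pba_page_access {
    PBA_PAGE_READ_ONLY, /* base policy: guest writes to the page fault */
    PBA_PAGE_WRITABLE   /* exception: page is passed through writable */
};

/* Hypothetical inputs to the decision. */
struct assign_ctx {
    bool is_dom0;      /* the trusted domain */
    bool admin_opt_in; /* host admin consented via command line or guest config */
};

/* Writable only for Dom0 or with explicit host admin consent. */
static enum pba_page_access choose_pba_page_access(const struct assign_ctx *ctx)
{
    if ( ctx->is_dom0 || ctx->admin_opt_in )
        return PBA_PAGE_WRITABLE;
    return PBA_PAGE_READ_ONLY;
}
```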

Jan


* Re: Need some advices on how to workaround a hardware bug
From: Chao Gao @ 2018-04-02  6:36 UTC
  To: Jan Beulich; +Cc: xen-devel

Got it. Thanks for your kind suggestion.

I will add a command line option, for example "pba_quirk", which specifies
a list of device SBDFs. When a device in this list is assigned to a guest,
reads and writes to the page where the MSI-X PBA resides are allowed.
This option provides a workaround for nonstandard PCI devices whose
MSI-X PBA shares a 4-KB page with other registers. Note that adding an
untrusted device to this option would undermine the security of the
entire system.

Thanks
Chao