[Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing
@ 2019-09-29 21:24 Chao Gao
  2019-09-30  9:09 ` Roger Pau Monné
  2019-09-30  9:18 ` Jan Beulich
  0 siblings, 2 replies; 6+ messages in thread
From: Chao Gao @ 2019-09-29 21:24 UTC (permalink / raw)
  To: xen-devel, chao.gao
  Cc: Andrew Cooper, Jan Beulich, Wei Liu, Roger Pau Monné

Current, Xen isn't aware of device reset (initiated by dom0). Xen may
access the device while device cannot respond to config requests
normally (e.g.  after device reset, device may respond to config
requests with CRS completions to indicate it needs more time to
complete a reset, refer to pci_dev_wait() in linux kernel for more
detail). Here, don't assume msix capability is always visible and
return -EAGAIN to the caller.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
I didn't find a way to trigger the assertion in normal usages.
It is found by an internal test: echo 1 to /sys/bus/pci/<sbdf>/reset
when the device is being used by a guest. Although the test is a
little insane, it is better to avoid crashing Xen even for this case.
---
 xen/arch/x86/msi.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index 76d4034..e2f3c6c 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1265,7 +1265,13 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
         pos = entry ? entry->msi_attrib.pos
                     : pci_find_cap_offset(seg, bus, slot, func,
                                           PCI_CAP_ID_MSIX);
-        ASSERT(pos);
+        if ( unlikely(!pos) )
+        {
+            printk_once(XENLOG_WARNING
+                        "%04x:%02x:%02x.%u MSI-X capability is missing\n",
+                        seg, bus, slot, func);
+            return -EAGAIN;
+        }
 
         if ( reg >= pos && reg < msix_pba_offset_reg(pos) + 4 )
         {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing
  2019-09-29 21:24 [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing Chao Gao
@ 2019-09-30  9:09 ` Roger Pau Monné
  2019-09-30 14:21   ` Chao Gao
  2019-09-30  9:18 ` Jan Beulich
  1 sibling, 1 reply; 6+ messages in thread
From: Roger Pau Monné @ 2019-09-30  9:09 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

On Mon, Sep 30, 2019 at 05:24:31AM +0800, Chao Gao wrote:
> Current, Xen isn't aware of device reset (initiated by dom0). Xen may
> access the device while device cannot respond to config requests
> normally (e.g.  after device reset, device may respond to config
> requests with CRS completions to indicate it needs more time to
> complete a reset, refer to pci_dev_wait() in linux kernel for more
> detail). Here, don't assume msix capability is always visible and
> return -EAGAIN to the caller.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> I didn't find a way to trigger the assertion in normal usages.
> It is found by an internal test: echo 1 to /sys/bus/pci/<sbdf>/reset
> when the device is being used by a guest. Although the test is a
> little insane, it is better to avoid crashing Xen even for this case.

The hardware domain doing such things behind Xen's back is quite
likely to end badly, either hitting an ASSERT somewhere or with a
malfunctioning device. Xen should be signaled of when such reset is
happening, so it can also tear down the internal state of the
device.

Xen could trap accesses to the FLR bit in order to detect device
resets, but that's only a way of performing a device reset, other
methods are likely more complicated to detect, and hence this would
only be a partial solution.

Have you considered whether it's feasible to signal Xen that a device
reset is happening, so it can torn down the internal device state?

> ---
>  xen/arch/x86/msi.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index 76d4034..e2f3c6c 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -1265,7 +1265,13 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
>          pos = entry ? entry->msi_attrib.pos
>                      : pci_find_cap_offset(seg, bus, slot, func,
>                                            PCI_CAP_ID_MSIX);
> -        ASSERT(pos);

I think at least a comment should be added here describing why a
capability might suddenly disappear.

> +        if ( unlikely(!pos) )
> +        {
> +            printk_once(XENLOG_WARNING

I'm not sure if printk_once is the best option, the message would be
printed only once, and for the first device that hits this. Ideally I
think it should be printed at least once for each device that hits
this condition.

Alternatively you can turn this into a gprintk which would be good
enough IMO.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing
  2019-09-29 21:24 [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing Chao Gao
  2019-09-30  9:09 ` Roger Pau Monné
@ 2019-09-30  9:18 ` Jan Beulich
  2019-09-30 14:30   ` Chao Gao
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2019-09-30  9:18 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Roger Pau Monné, Wei Liu, Andrew Cooper

On 29.09.2019 23:24, Chao Gao wrote:
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -1265,7 +1265,13 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
>          pos = entry ? entry->msi_attrib.pos
>                      : pci_find_cap_offset(seg, bus, slot, func,
>                                            PCI_CAP_ID_MSIX);
> -        ASSERT(pos);
> +        if ( unlikely(!pos) )
> +        {
> +            printk_once(XENLOG_WARNING
> +                        "%04x:%02x:%02x.%u MSI-X capability is missing\n",
> +                        seg, bus, slot, func);
> +            return -EAGAIN;
> +        }

Besides agreeing with Roger's comments, whose access do we
intercept here at the time you observe the operation above
producing a zero "pos"? If it's Dom0, then surely there's a bug
in Dom0 doing the access in the first place when a reset hasn't
completed yet? If it's a DomU, then is the reset happening
behind _its_ back as well (which is not going to end well)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing
  2019-09-30  9:09 ` Roger Pau Monné
@ 2019-09-30 14:21   ` Chao Gao
  0 siblings, 0 replies; 6+ messages in thread
From: Chao Gao @ 2019-09-30 14:21 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

On Mon, Sep 30, 2019 at 11:09:58AM +0200, Roger Pau Monné wrote:
>On Mon, Sep 30, 2019 at 05:24:31AM +0800, Chao Gao wrote:
>> Current, Xen isn't aware of device reset (initiated by dom0). Xen may
>> access the device while device cannot respond to config requests
>> normally (e.g.  after device reset, device may respond to config
>> requests with CRS completions to indicate it needs more time to
>> complete a reset, refer to pci_dev_wait() in linux kernel for more
>> detail). Here, don't assume msix capability is always visible and
>> return -EAGAIN to the caller.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> I didn't find a way to trigger the assertion in normal usages.
>> It is found by an internal test: echo 1 to /sys/bus/pci/<sbdf>/reset
>> when the device is being used by a guest. Although the test is a
>> little insane, it is better to avoid crashing Xen even for this case.
>
>The hardware domain doing such things behind Xen's back is quite
>likely to end badly, either hitting an ASSERT somewhere or with a
>malfunctioning device. Xen should be signaled of when such reset is
>happening, so it can also tear down the internal state of the
>device.
>
>Xen could trap accesses to the FLR bit in order to detect device
>resets, but that's only a way of performing a device reset, other
>methods are likely more complicated to detect, and hence this would
>only be a partial solution.
>
>Have you considered whether it's feasible to signal Xen that a device
>reset is happening, so it can torn down the internal device state?

I think it is feasible. But I am not sure whether it is necessary.
As you said to me before, after detaching the device from a domain,
the internal device state in Xen should have be reset. That's why
hardware domain or other domainU can use the device again. So Xen
has provided hypercalls to tear down the internal state. (IMO, the
internal state includes interrupt binding and mapping, MMIO mapping.
But I am not sure if I miss something).

The question then becomes: should Xen tolerate hardware domain's
misbehavior (resetting a device without tearing down internal state)
or just panic?

>
>> ---
>>  xen/arch/x86/msi.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
>> index 76d4034..e2f3c6c 100644
>> --- a/xen/arch/x86/msi.c
>> +++ b/xen/arch/x86/msi.c
>> @@ -1265,7 +1265,13 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
>>          pos = entry ? entry->msi_attrib.pos
>>                      : pci_find_cap_offset(seg, bus, slot, func,
>>                                            PCI_CAP_ID_MSIX);
>> -        ASSERT(pos);
>
>I think at least a comment should be added here describing why a
>capability might suddenly disappear.

Will do.

>
>> +        if ( unlikely(!pos) )
>> +        {
>> +            printk_once(XENLOG_WARNING
>
>I'm not sure if printk_once is the best option, the message would be
>printed only once, and for the first device that hits this. Ideally I
>think it should be printed at least once for each device that hits
>this condition.
>
>Alternatively you can turn this into a gprintk which would be good
>enough IMO.

Will do.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing
  2019-09-30  9:18 ` Jan Beulich
@ 2019-09-30 14:30   ` Chao Gao
  2019-09-30 15:21     ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Chao Gao @ 2019-09-30 14:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné, Wei Liu, Andrew Cooper

On Mon, Sep 30, 2019 at 11:18:05AM +0200, Jan Beulich wrote:
>On 29.09.2019 23:24, Chao Gao wrote:
>> --- a/xen/arch/x86/msi.c
>> +++ b/xen/arch/x86/msi.c
>> @@ -1265,7 +1265,13 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
>>          pos = entry ? entry->msi_attrib.pos
>>                      : pci_find_cap_offset(seg, bus, slot, func,
>>                                            PCI_CAP_ID_MSIX);
>> -        ASSERT(pos);
>> +        if ( unlikely(!pos) )
>> +        {
>> +            printk_once(XENLOG_WARNING
>> +                        "%04x:%02x:%02x.%u MSI-X capability is missing\n",
>> +                        seg, bus, slot, func);
>> +            return -EAGAIN;
>> +        }
>
>Besides agreeing with Roger's comments, whose access do we
>intercept here at the time you observe the operation above
>producing a zero "pos"? If it's Dom0, then surely there's a bug
>in Dom0 doing the access in the first place when a reset hasn't
>completed yet?
>If it's a DomU, then is the reset happening
>behind _its_ back as well (which is not going to end well)?

Looks like it is Dom0. Xen should defend against Dom0 bugs, right?

Here is the call trace:
(XEN) memory_map:remove: dom1 gfn=f0000 mfn=de000 nr=2000
(XEN) memory_map:remove: dom1 gfn=f4051 mfn=e0001 nr=3
(XEN) Assertion 'pos' failed at msi.c:1311
(XEN) ---[ Xen-4.13-unstable  x86_64  debug=y   Tainted:  C   ]---
(XEN) CPU:    38
(XEN) RIP:    e008:[<ffff82d08027ed90>] pci_msi_conf_write_intercept+0xd7/0x216
(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor (d0v1)
(XEN) rax: 0000000000000000   rbx: ffff83087a446c50   rcx: 0000000000000000
(XEN) rdx: ffff830863c57fff   rsi: 0000000000000293   rdi: ffff82d080498ee0
(XEN) rbp: ffff830863c579e0   rsp: ffff830863c579b0   r8:  0000000000000000
(XEN) r9:  ffff830863692ae0   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 00000000000000b2   r13: ffff830863c57a64   r14: 0000000000000000
(XEN) r15: 0000000000000089   cr0: 0000000080050033   cr4: 00000000003426e0
(XEN) cr3: 0000000812052000   cr2: 0000557d51fbc000
(XEN) fsb: 00007f05f2caa400   gsb: ffff888194a40000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d08027ed90> (pci_msi_conf_write_intercept+0xd7/0x216):
(XEN)  00 e8 d0 26 fd ff eb 85 <0f> 0b ba 05 00 00 00 be ff ff ff ff 48 89 df e8
(XEN) Xen stack trace from rsp=ffff830863c579b0:
(XEN)    00000002000057be 0000000000008900 0000000000000000 0000000000000002
(XEN)    ffff830863c57a64 00000000000000b2 ffff830863c57a18 ffff82d080297d99
(XEN)    ffff8308636bb000 ffff830863c57a64 00000000000000b2 0000000000000002
(XEN)    0000000000008900 ffff830863c57a50 ffff82d08037d40b 0000000000000cfe
(XEN)    0000000000000002 0000000000000002 ffff8308636bb000 ffff830863c57a64
(XEN)    ffff830863c57a90 ffff82d08037d5af 00007fff8022854f ffff830863c57e30
(XEN)    0000000000000002 0000000000000cfe ffff83086369c000 ffff8308636bb000
(XEN)    ffff830863c57ad0 ffff82d08037db65 0000000000007fff 0000000000000cfe
(XEN)    ffff830863c57e30 ffff82d0803fb7c0 0000000000000000 0000000000000000
(XEN)    ffff830863c57de8 ffff82d0802bf35d 0000000400000000 0000000000000000
(XEN)    ffff82d080387800 000000ef000000ef ffff830863c57bc0 ffff82d000000007
(XEN)    ffff82d000000000 00000000000000ef ffff8305a473ae70 0000000000000000
(XEN)    ffff830863c57b20 ffff82cffffff000 0000000000000282 ffff830863c57b60
(XEN)    ffff82d08023c27d ffff8305a473ae60 ffff830863c57ba0 ffff82d080248596
(XEN)    0000000200000040 ffff8305a473ae60 0000000000000086 ffff830863c57ba0
(XEN)    ffff82d08023c27d 0000000000000286 ffff830863c57bb8 0000000000000040
(XEN)    ffff830863c57bc8 ffff82d08026c747 aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa
(XEN)    aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa
(XEN)    aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa ffff830863c57da0 0000000000000000
(XEN)    0000000000000003 0000000000000000 0000000000000000 8086000000008086
(XEN) Xen call trace:
(XEN)    [<ffff82d08027ed90>] pci_msi_conf_write_intercept+0xd7/0x216
(XEN)    [<ffff82d080297d99>] pci_conf_write_intercept+0x68/0x72
(XEN)    [<ffff82d08037d40b>] emul-priv-op.c#pci_cfg_ok+0xb5/0x146
(XEN)    [<ffff82d08037d5af>] emul-priv-op.c#guest_io_write+0x113/0x20b
(XEN)    [<ffff82d08037db65>] emul-priv-op.c#write_io+0xda/0xe4
(XEN)    [<ffff82d0802bf35d>] x86_emulate+0x11cf7/0x3169d
(XEN)    [<ffff82d0802e09bd>] x86_emulate_wrapper+0x26/0x5f
(XEN)    [<ffff82d08037f57e>] pv_emulate_privileged_op+0x150/0x271
(XEN)    [<ffff82d0802a80bb>] do_general_protection+0x20b/0x257
(XEN)    [<ffff82d080387a3d>] x86_64/entry.S#handle_exception_saved+0x68/0x94

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing
  2019-09-30 14:30   ` Chao Gao
@ 2019-09-30 15:21     ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2019-09-30 15:21 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, xen-devel, Wei Liu, Roger Pau Monné

On 30.09.2019 16:30, Chao Gao wrote:
> On Mon, Sep 30, 2019 at 11:18:05AM +0200, Jan Beulich wrote:
>> On 29.09.2019 23:24, Chao Gao wrote:
>>> --- a/xen/arch/x86/msi.c
>>> +++ b/xen/arch/x86/msi.c
>>> @@ -1265,7 +1265,13 @@ int pci_msi_conf_write_intercept(struct pci_dev *pdev, unsigned int reg,
>>>          pos = entry ? entry->msi_attrib.pos
>>>                      : pci_find_cap_offset(seg, bus, slot, func,
>>>                                            PCI_CAP_ID_MSIX);
>>> -        ASSERT(pos);
>>> +        if ( unlikely(!pos) )
>>> +        {
>>> +            printk_once(XENLOG_WARNING
>>> +                        "%04x:%02x:%02x.%u MSI-X capability is missing\n",
>>> +                        seg, bus, slot, func);
>>> +            return -EAGAIN;
>>> +        }
>>
>> Besides agreeing with Roger's comments, whose access do we
>> intercept here at the time you observe the operation above
>> producing a zero "pos"? If it's Dom0, then surely there's a bug
>> in Dom0 doing the access in the first place when a reset hasn't
>> completed yet?
>> If it's a DomU, then is the reset happening
>> behind _its_ back as well (which is not going to end well)?
> 
> Looks like it is Dom0. Xen should defend against Dom0 bugs, right?

To a degree, yes. But what you suggest above is (to me) not defense,
but simply papering over an issue. What happens with ...

> (XEN) Xen call trace:
> (XEN)    [<ffff82d08027ed90>] pci_msi_conf_write_intercept+0xd7/0x216
> (XEN)    [<ffff82d080297d99>] pci_conf_write_intercept+0x68/0x72
> (XEN)    [<ffff82d08037d40b>] emul-priv-op.c#pci_cfg_ok+0xb5/0x146
> (XEN)    [<ffff82d08037d5af>] emul-priv-op.c#guest_io_write+0x113/0x20b
> (XEN)    [<ffff82d08037db65>] emul-priv-op.c#write_io+0xda/0xe4
> (XEN)    [<ffff82d0802bf35d>] x86_emulate+0x11cf7/0x3169d
> (XEN)    [<ffff82d0802e09bd>] x86_emulate_wrapper+0x26/0x5f
> (XEN)    [<ffff82d08037f57e>] pv_emulate_privileged_op+0x150/0x271
> (XEN)    [<ffff82d0802a80bb>] do_general_protection+0x20b/0x257
> (XEN)    [<ffff82d080387a3d>] x86_64/entry.S#handle_exception_saved+0x68/0x94

... this call stack is that the request by Dom0 simply gets dropped
on the floor. How do you ensure this isn't going to cause further
problems down the road? IOW I think in this case the fix needed to
be in Dom0, and I don't think Xen necessarily has to make things
appear to have gone smoothly. What I _could_ see Xen do in this
case is actually punish Dom0, e.g. by injecting #GP(0). (Obviously
this won't be a good idea ahead of the issue actually getting
fixed _in_ Dom0.)

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-09-30 15:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-29 21:24 [Xen-devel] [PATCH for Xen 4.13] x86/msi: Don't panic if msix capability is missing Chao Gao
2019-09-30  9:09 ` Roger Pau Monné
2019-09-30 14:21   ` Chao Gao
2019-09-30  9:18 ` Jan Beulich
2019-09-30 14:30   ` Chao Gao
2019-09-30 15:21     ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).