All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jan Beulich" <JBeulich@suse.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Keir Fraser <keir@xen.org>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v2 1/4] x86/MSI-X: be more careful during teardown
Date: Mon, 13 Apr 2015 10:11:52 +0100	[thread overview]
Message-ID: <552BA47802000078000715E5@mail.emea.novell.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1504021718160.7690@kaball.uk.xensource.com>

>>> On 02.04.15 at 18:49, <stefano.stabellini@eu.citrix.com> wrote:
> On Wed, 25 Mar 2015, Jan Beulich wrote:
>> When a device gets detached from a guest, pciback will clear its
>> command register, thus disabling both memory and I/O decoding. The
>> disabled memory decoding, however, has an effect on the MSI-X table
>> accesses the hypervisor does: These won't have the intended effect
>> anymore. Even worse, for PCIe devices (but not SR-IOV virtual
>> functions) such accesses may (will?) be treated as Unsupported
>> Requests, causing respective errors to be surfaced, potentially in the
>> form of NMIs that may be fatal to the hypervisor or Dom0 is different
>> ways. Hence rather than carrying out these accesses, we should avoid
>> them where we can, and use alternative (e.g. PCI config space based)
>> mechanisms to achieve at least the same effect.
> 
> I don't think that it is a good idea for both Xen and Linux to access
> the command register simultaneously.  Working around Linux in Xen
> doesn't sound like an optimal solution.   Maybe we could just fix the
> pciback and that would be enough.

I'm afraid that would just eliminate the specific case, but not the
general issue. While we trust Dom0 to not do outright bad things,
the hypervisor should still avoid doing things that can go wrong
due to the state a device is put (or left) in by Dom0.

> In any case we should make it clear somewhere who is supposed to write
> to the command register (and other PCI reigsters) at any given time,
> otherwise it would be very easy for a new kernel update to break the
> hypervisor and we wouldn't even know why it happened.

We should, but at this point in time this is going to be rather
problematic. Such a separation of responsibilities should have been
done before all the pass-through code got written.

>> @@ -369,24 +401,52 @@ static void msi_set_mask_bit(struct irq_
>>          }
>>          break;
>>      case PCI_CAP_ID_MSIX:
>> -    {
>> -        int offset = PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
>> -        writel(flag, entry->mask_base + offset);
>> -        readl(entry->mask_base + offset);
>> -        break;
>> -    }
>> +        if ( likely(memory_decoded(pdev)) )
>> +        {
>> +            writel(flag, entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
>> +            readl(entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
>> +            break;
>> +        }
>> +        if ( flag )
>> +        {
>> +            u16 control;
>> +            domid_t domid = pdev->domain->domain_id;
>> +
>> +            control = pci_conf_read16(seg, bus, slot, func,
>> +                                      msix_control_reg(entry->msi_attrib.pos));
>> +            if ( control & PCI_MSIX_FLAGS_MASKALL )
>> +                break;
>> +            pci_conf_write16(seg, bus, slot, func,
>> +                             msix_control_reg(entry->msi_attrib.pos),
>> +                             control | PCI_MSIX_FLAGS_MASKALL);
> 
> How is that going to interact with Linux (Dom0) writing to the command
> register? Moreover QEMU writes to the PCI_MSIX_FLAGS_MASKALL bit for
> devices assigned to HVM guests. Could this cause any conflicts?

I don't see the relation to the comment register here - all quoted code
only deals with the MSI-X control register.

> One might argue that QEMU should not touch PCI_MSIX_FLAGS_MASKALL, but
> as a matter of fact QEMU has done that for years and we cannot break
> that behaviour without introducing regressions.  In fact as it stands
> QEMU is the owner of PCI_MSIX_FLAGS_MASKALL for devices assigned to HVM
> guests, not Xen.
> 
> Can we avoid messing with PCI_MSIX_FLAGS_MASKALL in Xen for passed
> through devices to running domains?  I think that might be a good enough
> separation of responsibilities between Xen and QEMU.

No, the bits has to be under hypervisor control (just like any other
hardware bits controlling interrupts). Dealing with this is the subject
of patches I have ready, but didn't post yet.

Jan

  reply	other threads:[~2015-04-13  9:11 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-25 16:34 [PATCH v2 0/4] x86/MSI-X: XSA-120 follow-up Jan Beulich
2015-03-25 16:39 ` [PATCH v2 1/4] x86/MSI-X: be more careful during teardown Jan Beulich
2015-03-30 10:05   ` Andrew Cooper
2015-04-02 16:49   ` Stefano Stabellini
2015-04-13  9:11     ` Jan Beulich [this message]
2015-04-13 10:50       ` Stefano Stabellini
2015-04-13 11:21         ` Jan Beulich
2015-04-13 12:01           ` Stefano Stabellini
2015-04-13 12:47             ` Jan Beulich
2015-04-13 15:09               ` Stefano Stabellini
2015-04-14 13:47       ` Ian Campbell
2015-03-25 16:39 ` [PATCH v2 2/4] x86/MSI-X: access MSI-X table only after having enabled MSI-X Jan Beulich
2015-04-10 20:02   ` Konrad Rzeszutek Wilk
2015-04-13  9:05     ` Jan Beulich
2015-04-15 17:41       ` Konrad Rzeszutek Wilk
2015-04-16  7:43         ` Jan Beulich
2015-04-16 18:21           ` Konrad Rzeszutek Wilk
2015-04-17  7:09             ` Jan Beulich
2015-04-17 14:01         ` Jan Beulich
2015-03-25 16:40 ` [PATCH v2 3/4] x86/MSI-X: reduce fiddling with control register during restore Jan Beulich
2015-03-25 16:40 ` [PATCH v2 4/4] x86/MSI-X: cleanup Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=552BA47802000078000715E5@mail.emea.novell.com \
    --to=jbeulich@suse.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=keir@xen.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.