linux-kernel.vger.kernel.org archive mirror
* [PATCH] genirq/msi: Make sure PCI MSIs are activated early
@ 2016-07-13 16:18 Marc Zyngier
  2016-07-22 22:04 ` Bjorn Helgaas
  2016-08-09  7:28 ` [tip:irq/urgent] " tip-bot for Marc Zyngier
  0 siblings, 2 replies; 11+ messages in thread
From: Marc Zyngier @ 2016-07-13 16:18 UTC (permalink / raw)
  To: Bjorn Helgaas, Thomas Gleixner
  Cc: Bharat Kumar Gogada, linux-pci, linux-kernel

Bharat Kumar Gogada reported issues with the generic MSI code,
where the end-point ended up with garbage in its MSI configuration
(both for the vector and the message).

It turns out that the two MSI paths in the kernel are doing slightly
different things:

generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI

and it turns out that end-points are allowed to latch the content
of the MSI configuration registers as soon as MSIs are enabled.
In Bharat's case, the end-point ends up using whatever was there
already, which is not what you want.

In order to make things converge, we introduce a new MSI domain
flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
PCI/MSI. When set, this flag forces the programming of the end-point
as soon as the MSIs are allocated.

A consequence of this is that we have an extra activate in
irq_startup, but that should be without much consequence.

Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/pci/msi.c   | 2 ++
 include/linux/msi.h | 2 ++
 kernel/irq/msi.c    | 7 +++++++
 3 files changed, 11 insertions(+)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a080f44..565e2a4 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1277,6 +1277,8 @@ struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		pci_msi_domain_update_chip_ops(info);
 
+	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
+
 	domain = msi_create_irq_domain(fwnode, info, parent);
 	if (!domain)
 		return NULL;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 8b425c6..513b7c7 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -270,6 +270,8 @@ enum {
 	MSI_FLAG_MULTI_PCI_MSI		= (1 << 3),
 	/* Support PCI MSIX interrupts */
 	MSI_FLAG_PCI_MSIX		= (1 << 4),
+	/* Needs early activate, required for PCI */
+	MSI_FLAG_ACTIVATE_EARLY		= (1 << 5),
 };
 
 int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 38e89ce..4ed2cca 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -361,6 +361,13 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 		else
 			dev_dbg(dev, "irq [%d-%d] for MSI\n",
 				virq, virq + desc->nvec_used - 1);
+
+		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
+			struct irq_data *irq_data;
+
+			irq_data = irq_domain_get_irq_data(domain, desc->irq);
+			irq_domain_activate_irq(irq_data);
+		}
 	}
 
 	return 0;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread
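
As a rough illustration of the ordering problem described in the patch above, the following user-space C sketch models an end-point that latches its MSI message when MSI is enabled. It uses no kernel API: `mock_endpoint`, `ep_write_msg`, `ep_set_enable`, `enable_then_setup` and `setup_then_enable` are hypothetical names, and the latch-on-enable behaviour is an assumption taken from the report, not something every device does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical end-point model (assumption: it latches on MSI enable). */
struct mock_endpoint {
	uint32_t addr_lo;      /* stands in for PCI_MSI_ADDRESS_LO */
	uint16_t data;         /* stands in for the MSI data register */
	bool     enabled;      /* stands in for PCI_MSI_FLAGS_ENABLE */
	uint32_t latched_addr; /* message the device will actually use */
	uint16_t latched_data;
};

/* Program the MSI message registers ("setup EP" in the patch description). */
static void ep_write_msg(struct mock_endpoint *ep, uint32_t addr, uint16_t data)
{
	assert(ep);
	ep->addr_lo = addr;
	ep->data = data;
}

/* Toggle the enable bit; the model latches on the 0 -> 1 transition. */
static void ep_set_enable(struct mock_endpoint *ep, bool on)
{
	assert(ep);
	if (on && !ep->enabled) {
		ep->latched_addr = ep->addr_lo;
		ep->latched_data = ep->data;
	}
	ep->enabled = on;
}

/* "generic MSI" order: disable -> enable -> setup EP. */
static bool enable_then_setup(struct mock_endpoint *ep, uint32_t addr, uint16_t data)
{
	ep_set_enable(ep, false);
	ep_set_enable(ep, true);
	ep_write_msg(ep, addr, data);
	/* True only if the device latched the message we meant to program. */
	return ep->latched_addr == addr && ep->latched_data == data;
}

/* "PCI MSI" order: disable -> setup EP -> enable. */
static bool setup_then_enable(struct mock_endpoint *ep, uint32_t addr, uint16_t data)
{
	ep_set_enable(ep, false);
	ep_write_msg(ep, addr, data);
	ep_set_enable(ep, true);
	return ep->latched_addr == addr && ep->latched_data == data;
}
```

Starting from an end-point whose registers hold stale values, `enable_then_setup()` leaves the stale message latched, while `setup_then_enable()` latches the intended one. In this model that is the difference the early activation is meant to remove.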

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-13 16:18 [PATCH] genirq/msi: Make sure PCI MSIs are activated early Marc Zyngier
@ 2016-07-22 22:04 ` Bjorn Helgaas
  2016-07-25  7:45   ` Thomas Gleixner
  2016-08-09  7:28 ` [tip:irq/urgent] " tip-bot for Marc Zyngier
  1 sibling, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2016-07-22 22:04 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Bjorn Helgaas, Thomas Gleixner, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Wed, Jul 13, 2016 at 05:18:33PM +0100, Marc Zyngier wrote:
> Bharat Kumar Gogada reported issues with the generic MSI code,
> where the end-point ended up with garbage in its MSI configuration
> (both for the vector and the message).
> 
> It turns out that the two MSI paths in the kernel are doing slightly
> different things:
> 
> generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
> PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI
> 
> and it turns out that end-points are allowed to latch the content
> of the MSI configuration registers as soon as MSIs are enabled.
> In Bharat's case, the end-point ends up using whatever was there
> already, which is not what you want.
> 
> In order to make things converge, we introduce a new MSI domain
> flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
> PCI/MSI. When set, this flag forces the programming of the end-point
> as soon as the MSIs are allocated.
> 
> A consequence of this is that we have an extra activate in
> irq_startup, but that should be without much consequence.
> 
> Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Thomas, let me know if you'd like me to take this.  It looks like the
real smarts here are in kernel/irq, so I assume you'll take it unless
I hear otherwise.

> ---
>  drivers/pci/msi.c   | 2 ++
>  include/linux/msi.h | 2 ++
>  kernel/irq/msi.c    | 7 +++++++
>  3 files changed, 11 insertions(+)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index a080f44..565e2a4 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1277,6 +1277,8 @@ struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
>  	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
>  		pci_msi_domain_update_chip_ops(info);
>  
> +	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
> +
>  	domain = msi_create_irq_domain(fwnode, info, parent);
>  	if (!domain)
>  		return NULL;
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 8b425c6..513b7c7 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -270,6 +270,8 @@ enum {
>  	MSI_FLAG_MULTI_PCI_MSI		= (1 << 3),
>  	/* Support PCI MSIX interrupts */
>  	MSI_FLAG_PCI_MSIX		= (1 << 4),
> +	/* Needs early activate, required for PCI */
> +	MSI_FLAG_ACTIVATE_EARLY		= (1 << 5),
>  };
>  
>  int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> index 38e89ce..4ed2cca 100644
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -361,6 +361,13 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
>  		else
>  			dev_dbg(dev, "irq [%d-%d] for MSI\n",
>  				virq, virq + desc->nvec_used - 1);
> +
> +		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
> +			struct irq_data *irq_data;
> +
> +			irq_data = irq_domain_get_irq_data(domain, desc->irq);
> +			irq_domain_activate_irq(irq_data);
> +		}
>  	}
>  
>  	return 0;
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-22 22:04 ` Bjorn Helgaas
@ 2016-07-25  7:45   ` Thomas Gleixner
  2016-07-25 14:47     ` Bjorn Helgaas
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2016-07-25  7:45 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Marc Zyngier, Bjorn Helgaas, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Fri, 22 Jul 2016, Bjorn Helgaas wrote:
> On Wed, Jul 13, 2016 at 05:18:33PM +0100, Marc Zyngier wrote:
> > and it turns out that end-points are allowed to latch the content
> > of the MSI configuration registers as soon as MSIs are enabled.
> > In Bharat's case, the end-point ends up using whatever was there
> > already, which is not what you want.
> > 
> > In order to make things converge, we introduce a new MSI domain
> > flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
> > PCI/MSI. When set, this flag forces the programming of the end-point
> > as soon as the MSIs are allocated.
> > 
> > A consequence of this is that we have an extra activate in
> > irq_startup, but that should be without much consequence.
> > 
> > Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> Thomas, let me know if you'd like me to take this.  It looks like the
> real smarts here are in kernel/irq, so I assume you'll take it unless
> I hear otherwise.

I'll take it. Though I have second thoughts about the whole issue.

We deliberately made the allocation sequence of interrupts in a way that we
can easily rollback in case of failure.

We achieved that by activating the interrupts only at request time and not
somewhere in the middle of the allocation sequence. That makes the whole
hierarchical allocation more robust and avoids complex rollbacks.

Now that new flag is basically torpedoing that approach.

What I really wonder is why that is only an issue with that particular xilinx
hardware/IP block. I'm aware that up to PCI 2.3 the mask bit for MSI
interrupts is optional or in really old versions not even specified. So only
if that mask bit is missing the above described issue can happen.

If not, then we might have a general issue that we don't mask the entry before
we call pci_msi_set_enable().

Thoughts?

	tglx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-25  7:45   ` Thomas Gleixner
@ 2016-07-25 14:47     ` Bjorn Helgaas
  2016-07-26 11:42       ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2016-07-25 14:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Marc Zyngier, Bjorn Helgaas, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Mon, Jul 25, 2016 at 09:45:13AM +0200, Thomas Gleixner wrote:
> On Fri, 22 Jul 2016, Bjorn Helgaas wrote:
> > On Wed, Jul 13, 2016 at 05:18:33PM +0100, Marc Zyngier wrote:
> > > and it turns out that end-points are allowed to latch the content
> > > of the MSI configuration registers as soon as MSIs are enabled.
> > > In Bharat's case, the end-point ends up using whatever was there
> > > already, which is not what you want.
> > > 
> > > In order to make things converge, we introduce a new MSI domain
> > > flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
> > > PCI/MSI. When set, this flag forces the programming of the end-point
> > > as soon as the MSIs are allocated.
> > > 
> > > A consequence of this is that we have an extra activate in
> > > irq_startup, but that should be without much consequence.
> > > 
> > > Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > > Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > 
> > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > 
> > Thomas, let me know if you'd like me to take this.  It looks like the
> > real smarts here are in kernel/irq, so I assume you'll take it unless
> > I hear otherwise.
> 
> I'll take it. Though I have second thoughts about the whole issue.
> 
> We deliberately made the allocation sequence of interrupts in a way that we
> can easily rollback in case of failure.
> 
> We achieved that by activating the interrupts only at request time and not
> somewhere in the middle of the allocation sequence. That makes the whole
> hierarchical allocation more robust and avoids complex rollbacks.
> 
> Now that new flag is basically torpedoing that approach.
> 
> What I really wonder is why that is only an issue with that particular xilinx
> hardware/IP block. I'm aware that up to PCI 2.3 the mask bit for MSI
> interrupts is optional or in really old versions not even specified. So only
> if that mask bit is missing the above described issue can happen.
> 
> If not, then we might have a general issue that we don't mask the entry before
> we call pci_msi_set_enable().

Good question.  I haven't followed this thread in detail, so my ack
meant "I'm OK with this if you are," not "I've reviewed this and
think it's great."

I thought the original issue [1] was that PCI_MSI_FLAGS_ENABLE was being
written before PCI_MSI_ADDRESS_LO.  That doesn't sound like a good
idea to me.

I don't understand the whole flow.  Here's what I've gleaned so far:

  pci_enable_msi_range
    msi_capability_init
      pci_msi_setup_msi_irqs
        domain = pci_msi_get_domain(dev)
        if (domain)
          # this seems like the problem case
          pci_msi_domain_alloc_irqs(domain, dev, nvec)
            msi_domain_alloc_irqs
              ...
        else
          # this case apparently works fine
          arch_setup_msi_irqs
            for_each_pci_msi_entry(entry, dev)
              arch_setup_msi_irq
                chip->setup_irq
                  xilinx_pcie_msi_setup_irq  # xilinx_pcie_msi_chip.setup_irq
                    pci_write_msi_msg
                      __pci_write_msi_msg
                        pci_write_config_dword(PCI_MSI_ADDRESS_LO)
      pci_msi_set_enable(dev, 1)
        pci_write_config_word(PCI_MSI_FLAGS, PCI_MSI_FLAGS_ENABLE)

I assume the problem is that in the MSI domain case, we don't call the
chip->setup_irq method until later.  I gave up trying to figure out
where that happens.  Is it something like the following?

  request_irq
    request_threaded_irq
      __setup_irq
        ...
          ?? chip->setup_irq ??

That does seem like a problem.  Maybe it would be better to delay
setting PCI_MSI_FLAGS_ENABLE until after the MSI address & data bits
have been set?

[1] http://lkml.kernel.org/r/8520D5D51A55D047800579B094147198258B80DE@XAP-PVEXMBX01.xlnx.xilinx.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-25 14:47     ` Bjorn Helgaas
@ 2016-07-26 11:42       ` Thomas Gleixner
  2016-07-26 13:05         ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2016-07-26 11:42 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Marc Zyngier, Bjorn Helgaas, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Mon, 25 Jul 2016, Bjorn Helgaas wrote:
> On Mon, Jul 25, 2016 at 09:45:13AM +0200, Thomas Gleixner wrote:
> I thought the original issue [1] was that PCI_MSI_FLAGS_ENABLE was being
> written before PCI_MSI_ADDRESS_LO.  That doesn't sound like a good
> idea to me.

Well. That's only a problem if the PCI device does not support masking. But
yes, we missed that case back then.
 
> That does seem like a problem.  Maybe it would be better to delay
> setting PCI_MSI_FLAGS_ENABLE until after the MSI address & data bits
> have been set?

I thought about that, but that gets ugly pretty fast. Here is an alternative
solution.

I think that's the proper place to do it _AFTER_ the hierarchical allocation
took place. On x86 Marc's ACTIVATE_EARLY flag would not work because the
message is not yet ready to be assembled.

Thanks,

	tglx
---
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a080f4496fe2..142341f8331b 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -645,6 +645,15 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 		return ret;
 	}
 
+	/*
+	 * The mask can be ignored and PCI 2.3 does not specify mask bits for
+	 * each MSI interrupt. So in case of hierarchical irqdomains we need
+	 * to make sure that if masking is not available that the msi message
+	 * is written prior to setting the MSI enable bit in the device.
+	 */
+	if (pci_msi_ignore_mask || !entry->msi_attrib.maskbit)
+		irq_domain_activate_irq(irq_get_irq_data(entry->irq));
+
 	/* Set MSI enabled bits	 */
 	pci_intx_for_msi(dev, 0);
 	pci_msi_set_enable(dev, 1);

^ permalink raw reply related	[flat|nested] 11+ messages in thread
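
The condition in the alternative patch above can be read on its own as a small predicate. The sketch below is a user-space approximation only: `mock_msi_attrib` and `must_write_msg_before_enable` are hypothetical names, and the `ignore_mask` parameter merely plays the role that `pci_msi_ignore_mask` plays in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single attribute the check reads. */
struct mock_msi_attrib {
	bool maskbit; /* device implements per-vector masking */
};

/*
 * Returns true when the MSI message has to be written before the enable
 * bit is set, because masking cannot be relied on to hide a vector that
 * has not been programmed yet.
 */
static bool must_write_msg_before_enable(bool ignore_mask,
					 const struct mock_msi_attrib *attrib)
{
	assert(attrib);
	return ignore_mask || !attrib->maskbit;
}
```

Only a device with a usable mask bit, on a system that does not ignore masks, would skip the early write under this reading.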

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-26 11:42       ` Thomas Gleixner
@ 2016-07-26 13:05         ` Thomas Gleixner
  2016-07-26 14:05           ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2016-07-26 13:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Marc Zyngier, Bjorn Helgaas, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> On Mon, 25 Jul 2016, Bjorn Helgaas wrote:
> > On Mon, Jul 25, 2016 at 09:45:13AM +0200, Thomas Gleixner wrote:
> > I thought the original issue [1] was that PCI_MSI_FLAGS_ENABLE was being
> > written before PCI_MSI_ADDRESS_LO.  That doesn't sound like a good
> > idea to me.
> 
> Well. That's only a problem if the PCI device does not support masking. But
> yes, we missed that case back then.
>  
> > That does seem like a problem.  Maybe it would be better to delay
> > setting PCI_MSI_FLAGS_ENABLE until after the MSI address & data bits
> > have been set?
> 
> I thought about that, but that gets ugly pretty fast. Here is an alternative
> solution.
> 
> I think that's the proper place to do it _AFTER_ the hierarchical allocation
> took place. On x86 Marc's ACTIVATE_EARLY flag would not work because the
> message is not yet ready to be assembled.

Actually it works, because the MSI domain is the last one which is running the
allocation function. So everything else is initialized already.

I'll take Marc's patch with some additional commentary as it turned out to be a
workaround for the reported VMware issues with PCI/MSI-X pass through.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-26 13:05         ` Thomas Gleixner
@ 2016-07-26 14:05           ` Thomas Gleixner
  2016-07-28 15:03             ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2016-07-26 14:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Marc Zyngier, Bjorn Helgaas, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> > On Mon, 25 Jul 2016, Bjorn Helgaas wrote:
> > > On Mon, Jul 25, 2016 at 09:45:13AM +0200, Thomas Gleixner wrote:
> > > I thought the original issue [1] was that PCI_MSI_FLAGS_ENABLE was being
> > > written before PCI_MSI_ADDRESS_LO.  That doesn't sound like a good
> > > idea to me.
> > 
> > Well. That's only a problem if the PCI device does not support masking. But
> > yes, we missed that case back then.
> >  
> > > That does seem like a problem.  Maybe it would be better to delay
> > > setting PCI_MSI_FLAGS_ENABLE until after the MSI address & data bits
> > > have been set?
> > 
> > I thought about that, but that gets ugly pretty fast. Here is an alternative
> > solution.
> > 
> > I think that's the proper place to do it _AFTER_ the hierarchical allocation
> > took place. On x86 Marc's ACTIVATE_EARLY flag would not work because the
> > message is not yet ready to be assembled.
> 
> Actually it works, because the MSI domain is the last one which is running the
> allocation function. So everything else is initialized already.
> 
> I'll take Marc's patch with some additional commentary as it turned out to be a
> workaround for the reported VMware issues with PCI/MSI-X pass through.

Now I dug a little bit deeper into all that PCI/MSI maze.

When an interrupt is freed, we write the MSI message to 0, but the
PCI_MSI_FLAGS_ENABLE flag is still set. That makes me wonder ...

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-26 14:05           ` Thomas Gleixner
@ 2016-07-28 15:03             ` Thomas Gleixner
  2016-07-28 16:49               ` Bjorn Helgaas
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2016-07-28 15:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Marc Zyngier, Bjorn Helgaas, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> > On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> > > On Mon, 25 Jul 2016, Bjorn Helgaas wrote:
> > > > On Mon, Jul 25, 2016 at 09:45:13AM +0200, Thomas Gleixner wrote:
> > > > I thought the original issue [1] was that PCI_MSI_FLAGS_ENABLE was being
> > > > written before PCI_MSI_ADDRESS_LO.  That doesn't sound like a good
> > > > idea to me.
> > > 
> > > Well. That's only a problem if the PCI device does not support masking. But
> > > yes, we missed that case back then.
> > >  
> > > > That does seem like a problem.  Maybe it would be better to delay
> > > > setting PCI_MSI_FLAGS_ENABLE until after the MSI address & data bits
> > > > have been set?
> > > 
> > > I thought about that, but that gets ugly pretty fast. Here is an alternative
> > > solution.
> > > 
> > > I think that's the proper place to do it _AFTER_ the hierarchical allocation
> > > took place. On x86 Marc's ACTIVATE_EARLY flag would not work because the
> > > message is not yet ready to be assembled.
> > 
> > Actually it works, because the MSI domain is the last one which is running the
> > allocation function. So everything else is initialized already.
> > 
> > > I'll take Marc's patch with some additional commentary as it turned out to be a
> > workaround for the reported VMware issues with PCI/MSI-X pass through.
> 
> Now I dug a little bit deeper into all that PCI/MSI maze.
> 
> When an interrupt is freed, we write the MSI message to 0, but the
> PCI_MSI_FLAGS_ENABLE flag is still set. That makes me wonder ...

Bjorn, any opinion on that?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-28 15:03             ` Thomas Gleixner
@ 2016-07-28 16:49               ` Bjorn Helgaas
  0 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2016-07-28 16:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Marc Zyngier, Bjorn Helgaas, Bharat Kumar Gogada, linux-pci,
	linux-kernel

On Thu, Jul 28, 2016 at 05:03:30PM +0200, Thomas Gleixner wrote:
> On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> > On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> > > On Tue, 26 Jul 2016, Thomas Gleixner wrote:
> > > > On Mon, 25 Jul 2016, Bjorn Helgaas wrote:
> > > > > On Mon, Jul 25, 2016 at 09:45:13AM +0200, Thomas Gleixner wrote:
> > > > > I thought the original issue [1] was that PCI_MSI_FLAGS_ENABLE was being
> > > > > written before PCI_MSI_ADDRESS_LO.  That doesn't sound like a good
> > > > > idea to me.
> > > > 
> > > > Well. That's only a problem if the PCI device does not support masking. But
> > > > yes, we missed that case back then.
> > > >  
> > > > > That does seem like a problem.  Maybe it would be better to delay
> > > > > setting PCI_MSI_FLAGS_ENABLE until after the MSI address & data bits
> > > > > have been set?
> > > > 
> > > > I thought about that, but that gets ugly pretty fast. Here is an alternative
> > > > solution.
> > > > 
> > > > I think that's the proper place to do it _AFTER_ the hierarchical allocation
> > > > took place. On x86 Marc's ACTIVATE_EARLY flag would not work because the
> > > > message is not yet ready to be assembled.
> > > 
> > > Actually it works, because the MSI domain is the last one which is running the
> > > allocation function. So everything else is initialized already.
> > > 
> > > > I'll take Marc's patch with some additional commentary as it turned out to be a
> > > workaround for the reported VMware issues with PCI/MSI-X pass through.
> > 
> > Now I dug a little bit deeper into all that PCI/MSI maze.
> > 
> > When an interrupt is freed, we write the MSI message to 0, but the
> > PCI_MSI_FLAGS_ENABLE flag is still set. That makes me wonder ...
> 
> Bjorn, any opinion on that?

I assume you mean we write 0 to PCI_MSI_ADDRESS_LO, PCI_MSI_DATA_32,
and similar registers in the MSI Capability structure.

It doesn't sound safe to me to do that while PCI_MSI_FLAGS_ENABLE is
still set.  I don't see anything in the spec that constrains when a
device latches the values from those registers.  It seems legal to do
it on PCI_MSI_FLAGS_ENABLE transitions, but it also seems legal to do
it whenever the device needs to signal an interrupt.

If a device does the latter, it seems like clearing PCI_MSI_ADDRESS_LO
while PCI_MSI_FLAGS_ENABLE is set could lead to stray DMA writes if
the device for some reason signals an interrupt later.

Bjorn

^ permalink raw reply	[flat|nested] 11+ messages in thread
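
The stray-write concern raised above can be sketched with a second toy model, this time of a device that reads its MSI registers whenever it signals an interrupt rather than on the enable transition. Everything here is hypothetical (`mock_endpoint`, `mock_write`, `ep_signal` and the two teardown helpers); it only illustrates the reasoning in the message and is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical end-point model (assumption: registers are read at signal time). */
struct mock_endpoint {
	uint32_t addr_lo; /* stands in for PCI_MSI_ADDRESS_LO */
	uint16_t data;    /* stands in for the MSI data register */
	bool     enabled; /* stands in for PCI_MSI_FLAGS_ENABLE */
};

/* What the device puts on the bus when it tries to signal an interrupt. */
struct mock_write {
	bool     issued;
	uint32_t addr;
	uint16_t data;
};

/* The device signals only while MSI is enabled, using the current registers. */
static struct mock_write ep_signal(const struct mock_endpoint *ep)
{
	struct mock_write w = { false, 0, 0 };

	assert(ep);
	if (ep->enabled) {
		w.issued = true;
		w.addr = ep->addr_lo;
		w.data = ep->data;
	}
	return w;
}

/* Teardown as questioned in the thread: message zeroed, enable bit left set. */
static void teardown_clear_only(struct mock_endpoint *ep)
{
	assert(ep);
	ep->addr_lo = 0;
	ep->data = 0;
}

/* Alternative order: clear the enable bit first, then zero the message. */
static void teardown_disable_then_clear(struct mock_endpoint *ep)
{
	assert(ep);
	ep->enabled = false;
	ep->addr_lo = 0;
	ep->data = 0;
}
```

In this model a late interrupt after `teardown_clear_only()` still issues a write, now aimed at address 0, whereas after `teardown_disable_then_clear()` nothing is issued.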

* [tip:irq/urgent] genirq/msi: Make sure PCI MSIs are activated early
  2016-07-13 16:18 [PATCH] genirq/msi: Make sure PCI MSIs are activated early Marc Zyngier
  2016-07-22 22:04 ` Bjorn Helgaas
@ 2016-08-09  7:28 ` tip-bot for Marc Zyngier
  2016-09-02 14:48   ` Bharat Kumar Gogada
  1 sibling, 1 reply; 11+ messages in thread
From: tip-bot for Marc Zyngier @ 2016-08-09  7:28 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bharat.kumar.gogada, bhelgaas, hpa, linux, marc.zyngier,
	jason.taylor, linux-kernel, forst, mingo, tglx

Commit-ID:  f3b0946d629c8bfbd3e5f038e30cb9c711a35f10
Gitweb:     http://git.kernel.org/tip/f3b0946d629c8bfbd3e5f038e30cb9c711a35f10
Author:     Marc Zyngier <marc.zyngier@arm.com>
AuthorDate: Wed, 13 Jul 2016 17:18:33 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 9 Aug 2016 09:19:32 +0200

genirq/msi: Make sure PCI MSIs are activated early

Bharat Kumar Gogada reported issues with the generic MSI code, where the
end-point ended up with garbage in its MSI configuration (both for the vector
and the message).

It turns out that the two MSI paths in the kernel are doing slightly different
things:

generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI

And it turns out that end-points are allowed to latch the content of the MSI
configuration registers as soon as MSIs are enabled.  In Bharat's case, the
end-point ends up using whatever was there already, which is not what you
want.

In order to make things converge, we introduce a new MSI domain flag
(MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for PCI/MSI. When set,
this flag forces the programming of the end-point as soon as the MSIs are
allocated.

A consequence of this is that we have an extra activate in irq_startup, but
that should be without much consequence.

tglx: 

 - Several people reported a VMWare regression with PCI/MSI-X passthrough. It
   turns out that the patch also cures that issue.

 - We need to have a look at the MSI disable interrupt path, where we write
   the msg to all zeros without disabling MSI in the PCI device. Is that
   correct?

Fixes: 52f518a3a7c2 "x86/MSI: Use hierarchical irqdomains to manage MSI interrupts"
Reported-and-tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Reported-and-tested-by: Foster Snowhill <forst@forstwoof.ru>
Reported-by: Matthias Prager <linux@matthiasprager.de>
Reported-by: Jason Taylor <jason.taylor@simplivity.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1468426713-31431-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 drivers/pci/msi.c   |  2 ++
 include/linux/msi.h |  2 ++
 kernel/irq/msi.c    | 11 +++++++++++
 3 files changed, 15 insertions(+)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a02981e..eafa613 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1411,6 +1411,8 @@ struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		pci_msi_domain_update_chip_ops(info);
 
+	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
+
 	domain = msi_create_irq_domain(fwnode, info, parent);
 	if (!domain)
 		return NULL;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 4f0bfe5..e8c81fb 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -270,6 +270,8 @@ enum {
 	MSI_FLAG_MULTI_PCI_MSI		= (1 << 2),
 	/* Support PCI MSIX interrupts */
 	MSI_FLAG_PCI_MSIX		= (1 << 3),
+	/* Needs early activate, required for PCI */
+	MSI_FLAG_ACTIVATE_EARLY		= (1 << 4),
 };
 
 int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 5499935..19e9dfb 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -359,6 +359,17 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 		else
 			dev_dbg(dev, "irq [%d-%d] for MSI\n",
 				virq, virq + desc->nvec_used - 1);
+		/*
+		 * This flag is set by the PCI layer as we need to activate
+		 * the MSI entries before the PCI layer enables MSI in the
+		 * card. Otherwise the card latches a random msi message.
+		 */
+		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
+			struct irq_data *irq_data;
+
+			irq_data = irq_domain_get_irq_data(domain, desc->irq);
+			irq_domain_activate_irq(irq_data);
+		}
 	}
 
 	return 0;

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: [tip:irq/urgent] genirq/msi: Make sure PCI MSIs are activated early
  2016-08-09  7:28 ` [tip:irq/urgent] " tip-bot for Marc Zyngier
@ 2016-09-02 14:48   ` Bharat Kumar Gogada
  0 siblings, 0 replies; 11+ messages in thread
From: Bharat Kumar Gogada @ 2016-09-02 14:48 UTC (permalink / raw)
  To: tglx, forst, mingo, jason.taylor, linux-kernel, linux,
	marc.zyngier, hpa, Bharat Kumar Gogada, bhelgaas,
	linux-tip-commits

Thanks Marc and Thomas for addressing the issue.

> -----Original Message-----
> From: tip tree robot [mailto:tipbot@zytor.com]
> Sent: Tuesday, August 09, 2016 12:59 PM
> To: linux-tip-commits@vger.kernel.org
> Cc: Bharat Kumar Gogada <bharatku@xilinx.com>; bhelgaas@google.com;
> hpa@zytor.com; linux@matthiasprager.de; marc.zyngier@arm.com;
> jason.taylor@simplivity.com; linux-kernel@vger.kernel.org;
> forst@forstwoof.ru; mingo@kernel.org; tglx@linutronix.de
> Subject: [tip:irq/urgent] genirq/msi: Make sure PCI MSIs are activated early
> 
> Commit-ID:  f3b0946d629c8bfbd3e5f038e30cb9c711a35f10
> Gitweb:     http://git.kernel.org/tip/f3b0946d629c8bfbd3e5f038e30cb9c711a35f10
> Author:     Marc Zyngier <marc.zyngier@arm.com>
> AuthorDate: Wed, 13 Jul 2016 17:18:33 +0100
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Tue, 9 Aug 2016 09:19:32 +0200
> 
> genirq/msi: Make sure PCI MSIs are activated early
> 
> Bharat Kumar Gogada reported issues with the generic MSI code, where the
> end-point ended up with garbage in its MSI configuration (both for the vector
> and the message).
> 
> It turns out that the two MSI paths in the kernel are doing slightly different
> things:
> 
> generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
> PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI
> 
> And it turns out that end-points are allowed to latch the content of the MSI
> configuration registers as soon as MSIs are enabled.  In Bharat's case, the
> end-point ends up using whatever was there already, which is not what you
> want.
> 
> In order to make things converge, we introduce a new MSI domain flag
> (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for PCI/MSI. When set,
> this flag forces the programming of the end-point as soon as the MSIs are
> allocated.
> 
> A consequence of this is that we have an extra activate in irq_startup, but
> that should be without much consequence.
> 
> tglx:
> 
>  - Several people reported a VMWare regression with PCI/MSI-X passthrough. It
>    turns out that the patch also cures that issue.
> 
>  - We need to have a look at the MSI disable interrupt path, where we write
>    the msg to all zeros without disabling MSI in the PCI device. Is that
>    correct?
> 
> Fixes: 52f518a3a7c2 "x86/MSI: Use hierarchical irqdomains to manage MSI interrupts"
> Reported-and-tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> Reported-and-tested-by: Foster Snowhill <forst@forstwoof.ru>
> Reported-by: Matthias Prager <linux@matthiasprager.de>
> Reported-by: Jason Taylor <jason.taylor@simplivity.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> Cc: stable@vger.kernel.org
> Link: http://lkml.kernel.org/r/1468426713-31431-1-git-send-email-marc.zyngier@arm.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> 
> ---
>  drivers/pci/msi.c   |  2 ++
>  include/linux/msi.h |  2 ++
>  kernel/irq/msi.c    | 11 +++++++++++
>  3 files changed, 15 insertions(+)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index a02981e..eafa613 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1411,6 +1411,8 @@ struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
>  	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
>  		pci_msi_domain_update_chip_ops(info);
> 
> +	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
> +
>  	domain = msi_create_irq_domain(fwnode, info, parent);
>  	if (!domain)
>  		return NULL;
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 4f0bfe5..e8c81fb 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -270,6 +270,8 @@ enum {
>  	MSI_FLAG_MULTI_PCI_MSI		= (1 << 2),
>  	/* Support PCI MSIX interrupts */
>  	MSI_FLAG_PCI_MSIX		= (1 << 3),
> +	/* Needs early activate, required for PCI */
> +	MSI_FLAG_ACTIVATE_EARLY		= (1 << 4),
>  };
> 
>  int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> index 5499935..19e9dfb 100644
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -359,6 +359,17 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
>  		else
>  			dev_dbg(dev, "irq [%d-%d] for MSI\n",
>  				virq, virq + desc->nvec_used - 1);
> +		/*
> +		 * This flag is set by the PCI layer as we need to activate
> +		 * the MSI entries before the PCI layer enables MSI in the
> +		 * card. Otherwise the card latches a random msi message.
> +		 */
> +		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
> +			struct irq_data *irq_data;
> +
> +			irq_data = irq_domain_get_irq_data(domain, desc->irq);
> +			irq_domain_activate_irq(irq_data);
> +		}
>  	}
> 
>  	return 0;

^ permalink raw reply	[flat|nested] 11+ messages in thread
