All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kishon Vijay Abraham I <kishon@ti.com>
To: Bjorn Helgaas <helgaas@kernel.org>, Marc Zyngier <marc.zyngier@arm.com>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>,
	"robh@kernel.org" <robh@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"colin.king@canonical.com" <colin.king@canonical.com>,
	Soren Brinkmann <sorenb@xilinx.com>,
	Michal Simek <michals@xilinx.com>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Ravikiran Gummaluri <rgummal@xilinx.com>,
	Ley Foon Tan <lftan@altera.com>,
	Thomas Petazzoni <thomas.petazzoni@free-electrons.com>,
	Murali Karicheri <m-karicheri2@ti.com>
Subject: Re: [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts.
Date: Wed, 14 Sep 2016 15:25:38 +0530	[thread overview]
Message-ID: <57D91E9A.7060402@ti.com> (raw)
In-Reply-To: <20160913153402.GA4138@localhost>

Hi,

On Tuesday 13 September 2016 09:04 PM, Bjorn Helgaas wrote:
> [+cc Ley Foon (altera), Thomas (aardvark), Kishon (dra7xx), Murali (keystone)]
> 
> On Tue, Sep 13, 2016 at 10:05:11AM -0500, Bjorn Helgaas wrote:
>> On Tue, Sep 13, 2016 at 08:41:28AM +0100, Marc Zyngier wrote:
>>> On 12/09/16 23:02, Bjorn Helgaas wrote:
>>>> On Thu, Sep 01, 2016 at 05:19:55AM +0000, Bharat Kumar Gogada wrote:
>>>>>>>>>> Hi Bharat,
>>>>>>>>>>> @@ -561,7 +561,7 @@ static int nwl_pcie_init_irq_domain(struct
>>>>>>>>>>> nwl_pcie
>>>>>>>>>> *pcie)
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>     pcie->legacy_irq_domain = irq_domain_add_linear(legacy_intc_node,
>>>>>>>>>>> -                                                   INTX_NUM,
>>>>>>>>>>> +                                                   INTX_NUM + 1,
>>>>>>>>>>>                                                     &legacy_domain_ops,
>>>>>>>>>>>                                                     pcie);
>>>>>>>>>>
>>>>>>>>>> This feels like the wrong thing to do. You have INTX_NUM irqs, so
>>>>>>>>>> the domain allocation should reflect this. On the other hand, the
>>>>>>>>>> way the driver currently deals with mappings is quite broken
>>>>>>>>>> (consistently adding 1 to
>>>>>>>> the HW interrupt).
>>>>>>>>>>
>>>>>>>>> Hi Marc,
>>>>>>>>>
>>>>>>>>> Without above change I get following crash in kernel while booting.
>>>>>>>>>
>>>>>>>>> [    2.441684] error: hwirq 0x4 is too large for dummy
>>>>>>>>>
>>>>>>>>> [    2.441694] ------------[ cut here ]------------
>>>>>>>>>
>>>>>>>>> [    2.441698] WARNING: at kernel/irq/irqdomain.c:344
>>>>>>>>>
>>>>>>>>> [    2.441702] Modules linked in:
>>>>>>>>>
>>>>>>>>> [    2.441706]
>>>>>>>>>
>>>>>>>>> [    2.441714] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #8
>>>>>>>>>
>>>>>>>>> [    2.441718] Hardware name: xlnx,zynqmp (DT)
>>>>>>>>>
>>>>>>>>> [    2.441723] task: ffffffc071886b80 ti: ffffffc071888000 task.ti:
>>>>>>>> ffffffc071888000
>>>>>>>>>
>>>>>>>>> [    2.441732] PC is at irq_domain_associate+0x138/0x1c0
>>>>>>>>>
>>>>>>>>> [    2.441738] LR is at irq_domain_associate+0x138/0x1c0
>>>>>>>>>
>>>>>>>>> In kernel/irq/irqdomain.c function irq_domain_associate
>>>>>>>>>
>>>>>>>>> if (WARN(hwirq >= domain->hwirq_max,
>>>>>>>>>                  "error: hwirq 0x%x is too large for %s\n", (int)hwirq, domain-
>>>>>>> name))
>>>>>>>>>                 return -EINVAL;
>>>>>>>>>
>>>>>>>>> Here the hwirq and hwirq_max are equal to 4 without the above
>>>>>>>>> condition
>>>>>>>> (INTX_NUM + 1) due to which crash is coming.
>>>>>>>>> This is happening as the legacy interrupts are starting from 1 (INTA).
>>>>>>>>
>>>>>>>> I understood that. I'm still persisting in saying that you have the wrong fix.
>>>>>>>>
>>>>>>>> Your domain should always allocate many interrupts as you have
>>>>>>>> interrupt sources. These interrupts (hwirq) should be numbered from 0 to (n-
>>>>>> 1).
>>>>>>>
>>>>>>> Agreed, but here comes the problem the hwirq for legacy interrupts
>>>>>>> will start at 0x1 to 0x4 (INTA to INTD) and these values are as per
>>>>>>> PCIe specification for legacy interrupts. So these cannot be numbered
>>>>>>> from 0. So when 0x4 (INTD) for a multi-function device comes the crash
>>>>>>> occurs.
>>>>>>
>>>>>> So who provides this hwirq? Who calls irq_domain_associate() with hwirq set to
>>>>>> 4?
>>>>>>
>>>>> PCIe subsystem invokes pcibios_add_device function in arch/arm64/kernel/pci.c for every pci device.
>>>>> The purpose of this function is to assign dev->irq using of_irq_parse_and_map_pci.
>>>>> of_irq_parse_and_map_pci invokes of_irq_parse_pci where it reads PCI_INTERRUPT_PIN from configuration space and saves it
>>>>> in parameter of struct of_phandle_args.
>>>>> This structure is passed to irq_create_of_mapping where it invokes irq_create_fwspec_mapping.
>>>>> irq_create_fwspec_mapping invokes irq_domain_translate and gets hwirq, here the above saved PCI_INTERRUPT_PIN value is assigned 
>>>>> to hwirq (*hwirq = fwspec->param[0]).
>>>>> And then using this hwirq irq_create_mapping -> irq_domain_associate were invoked and mapping is created for virtual irq with this hwirq.
>>>>> So for any end point PCI_INTERRUPT_PIN value starts from 0x1 to 0x4 and so hwirq starts from 0x1 to 0x4.
>>>>>
>>>>> So the values are more generic w.r.t to protocol, that's why hwirq will range from 0x1 to 0x4. 
>>>>> And then if you check pcie-altera.c they are doing this adding one in their handler and while creating legacy domain.
>>>>
>>>> Is this resolved yet?  Marc, are you happy, or should we iterate on this
>>>> again?
>>>
>>> Ah, sorry to have dropped the ball on this patch.
>>
>> No problem, I wasn't making forward progress anyway.
>>
>>> I guess that given that the infrastructure imposes the hwirq range on
>>> the host drivers, Bharat's approach is the only way (and a number of
>>> other host drivers are already slightly broken). I'll try and have a
>>> look at solving this at the generic level. In the meantime:
>>>
>>> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
>>
>> After looking at this myself, I'm not happy with this either.  It feels
>> like there are bugs lurking here and we're just hiding one of them.
>>
>> Here are the callers of irq_domain_add_linear() for legacy INTx in
>> drivers/pci/host:
>>
>>   advk_pcie_init_irq_domain    LEGACY_IRQ_NUM   (4)
>>   dra7xx_pcie_init_irq_domain  4
>>   ks_dw_pcie_host_init         MAX_LEGACY_IRQS  (4)
>>   altera_pcie_init_irq_domain  INTX_NUM + 1     (5)
>>   nwl_pcie_init_irq_domain     INTX_NUM + 1     (5)
>>   xilinx_pcie_init_irq_domain  4
> 
> The altera change corresponding to this was 99496bd2971f ("PCI: altera: Fix
> error when INTx is 4").  I should have noticed this inconsistency back
> then.
> 
> Are aardvark, dra7xx, keystone, and xilinx (non-NWL) broken because they
> only request 4 IRQs and only INTA, INTB, and INTC work?

yeah.. it's broken in dra7xx. I get [1] when I configure the pci endpoint to
use INTD.

Thanks
Kishon

[1] -> http://pastebin.ubuntu.com/23177268/
> 
>> I think all of these use the of_irq_parse_and_map_pci() path you
>> mentioned, so if the problem is in the way that path works, I would
>> think these should *all* be requesting the same number of interrupts
>> in the domain.
>>
>> I agree with Marc that we should request 4 IRQs, because that's what
>> we need.  If we can't do that for some reason, we ought to at least
>> make all these callers the same.
>>
>> Bjorn
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: kishon@ti.com (Kishon Vijay Abraham I)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts.
Date: Wed, 14 Sep 2016 15:25:38 +0530	[thread overview]
Message-ID: <57D91E9A.7060402@ti.com> (raw)
In-Reply-To: <20160913153402.GA4138@localhost>

Hi,

On Tuesday 13 September 2016 09:04 PM, Bjorn Helgaas wrote:
> [+cc Ley Foon (altera), Thomas (aardvark), Kishon (dra7xx), Murali (keystone)]
> 
> On Tue, Sep 13, 2016 at 10:05:11AM -0500, Bjorn Helgaas wrote:
>> On Tue, Sep 13, 2016 at 08:41:28AM +0100, Marc Zyngier wrote:
>>> On 12/09/16 23:02, Bjorn Helgaas wrote:
>>>> On Thu, Sep 01, 2016 at 05:19:55AM +0000, Bharat Kumar Gogada wrote:
>>>>>>>>>> Hi Bharat,
>>>>>>>>>>> @@ -561,7 +561,7 @@ static int nwl_pcie_init_irq_domain(struct
>>>>>>>>>>> nwl_pcie
>>>>>>>>>> *pcie)
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>>     pcie->legacy_irq_domain = irq_domain_add_linear(legacy_intc_node,
>>>>>>>>>>> -                                                   INTX_NUM,
>>>>>>>>>>> +                                                   INTX_NUM + 1,
>>>>>>>>>>>                                                     &legacy_domain_ops,
>>>>>>>>>>>                                                     pcie);
>>>>>>>>>>
>>>>>>>>>> This feels like the wrong thing to do. You have INTX_NUM irqs, so
>>>>>>>>>> the domain allocation should reflect this. On the other hand, the
>>>>>>>>>> way the driver currently deals with mappings is quite broken
>>>>>>>>>> (consistently adding 1 to
>>>>>>>> the HW interrupt).
>>>>>>>>>>
>>>>>>>>> Hi Marc,
>>>>>>>>>
>>>>>>>>> Without above change I get following crash in kernel while booting.
>>>>>>>>>
>>>>>>>>> [    2.441684] error: hwirq 0x4 is too large for dummy
>>>>>>>>>
>>>>>>>>> [    2.441694] ------------[ cut here ]------------
>>>>>>>>>
>>>>>>>>> [    2.441698] WARNING: at kernel/irq/irqdomain.c:344
>>>>>>>>>
>>>>>>>>> [    2.441702] Modules linked in:
>>>>>>>>>
>>>>>>>>> [    2.441706]
>>>>>>>>>
>>>>>>>>> [    2.441714] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #8
>>>>>>>>>
>>>>>>>>> [    2.441718] Hardware name: xlnx,zynqmp (DT)
>>>>>>>>>
>>>>>>>>> [    2.441723] task: ffffffc071886b80 ti: ffffffc071888000 task.ti:
>>>>>>>> ffffffc071888000
>>>>>>>>>
>>>>>>>>> [    2.441732] PC is at irq_domain_associate+0x138/0x1c0
>>>>>>>>>
>>>>>>>>> [    2.441738] LR is at irq_domain_associate+0x138/0x1c0
>>>>>>>>>
>>>>>>>>> In kernel/irq/irqdomain.c function irq_domain_associate
>>>>>>>>>
>>>>>>>>> if (WARN(hwirq >= domain->hwirq_max,
>>>>>>>>>                  "error: hwirq 0x%x is too large for %s\n", (int)hwirq, domain-
>>>>>>> name))
>>>>>>>>>                 return -EINVAL;
>>>>>>>>>
>>>>>>>>> Here the hwirq and hwirq_max are equal to 4 without the above
>>>>>>>>> condition
>>>>>>>> (INTX_NUM + 1) due to which crash is coming.
>>>>>>>>> This is happening as the legacy interrupts are starting from 1 (INTA).
>>>>>>>>
>>>>>>>> I understood that. I'm still persisting in saying that you have the wrong fix.
>>>>>>>>
>>>>>>>> Your domain should always allocate many interrupts as you have
>>>>>>>> interrupt sources. These interrupts (hwirq) should be numbered from 0 to (n-
>>>>>> 1).
>>>>>>>
>>>>>>> Agreed, but here comes the problem the hwirq for legacy interrupts
>>>>>>> will start at 0x1 to 0x4 (INTA to INTD) and these values are as per
>>>>>>> PCIe specification for legacy interrupts. So these cannot be numbered
>>>>>>> from 0. So when 0x4 (INTD) for a multi-function device comes the crash
>>>>>>> occurs.
>>>>>>
>>>>>> So who provides this hwirq? Who calls irq_domain_associate() with hwirq set to
>>>>>> 4?
>>>>>>
>>>>> PCIe subsystem invokes pcibios_add_device function in arch/arm64/kernel/pci.c for every pci device.
>>>>> The purpose of this function is to assign dev->irq using of_irq_parse_and_map_pci.
>>>>> of_irq_parse_and_map_pci invokes of_irq_parse_pci where it reads PCI_INTERRUPT_PIN from configuration space and saves it
>>>>> in parameter of struct of_phandle_args.
>>>>> This structure is passed to irq_create_of_mapping where it invokes irq_create_fwspec_mapping.
>>>>> irq_create_fwspec_mapping invokes irq_domain_translate and gets hwirq, here the above saved PCI_INTERRUPT_PIN value is assigned 
>>>>> to hwirq (*hwirq = fwspec->param[0]).
>>>>> And then using this hwirq irq_create_mapping -> irq_domain_associate were invoked and mapping is created for virtual irq with this hwirq.
>>>>> So for any end point PCI_INTERRUPT_PIN value starts from 0x1 to 0x4 and so hwirq starts from 0x1 to 0x4.
>>>>>
>>>>> So the values are more generic w.r.t to protocol, that's why hwirq will range from 0x1 to 0x4. 
>>>>> And then if you check pcie-altera.c they are doing this adding one in their handler and while creating legacy domain.
>>>>
>>>> Is this resolved yet?  Marc, are you happy, or should we iterate on this
>>>> again?
>>>
>>> Ah, sorry to have dropped the ball on this patch.
>>
>> No problem, I wasn't making forward progress anyway.
>>
>>> I guess that given that the infrastructure imposes the hwirq range on
>>> the host drivers, Bharat's approach is the only way (and a number of
>>> other host drivers are already slightly broken). I'll try and have a
>>> look at solving this at the generic level. In the meantime:
>>>
>>> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
>>
>> After looking at this myself, I'm not happy with this either.  It feels
>> like there are bugs lurking here and we're just hiding one of them.
>>
>> Here are the callers of irq_domain_add_linear() for legacy INTx in
>> drivers/pci/host:
>>
>>   advk_pcie_init_irq_domain    LEGACY_IRQ_NUM   (4)
>>   dra7xx_pcie_init_irq_domain  4
>>   ks_dw_pcie_host_init         MAX_LEGACY_IRQS  (4)
>>   altera_pcie_init_irq_domain  INTX_NUM + 1     (5)
>>   nwl_pcie_init_irq_domain     INTX_NUM + 1     (5)
>>   xilinx_pcie_init_irq_domain  4
> 
> The altera change corresponding to this was 99496bd2971f ("PCI: altera: Fix
> error when INTx is 4").  I should have noticed this inconsistency back
> then.
> 
> Are aardvark, dra7xx, keystone, and xilinx (non-NWL) broken because they
> only request 4 IRQs and only INTA, INTB, and INTC work?

yeah.. it's broken in dra7xx. I get [1] when I configure the pci endpoint to
use INTD.

Thanks
Kishon

[1] -> http://pastebin.ubuntu.com/23177268/
> 
>> I think all of these use the of_irq_parse_and_map_pci() path you
>> mentioned, so if the problem is in the way that path works, I would
>> think these should *all* be requesting the same number of interrupts
>> in the domain.
>>
>> I agree with Marc that we should request 4 IRQs, because that's what
>> we need.  If we can't do that for some reason, we ought to at least
>> make all these callers the same.
>>
>> Bjorn
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo at vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2016-09-14  9:56 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-30 10:39 [PATCH 1/3] PCI: Xilinx NWL PCIe: Expanding PCIe core errors and printing event occurred Bharat Kumar Gogada
2016-08-30 10:39 ` Bharat Kumar Gogada
2016-08-30 10:39 ` [PATCH 2/3] PCI: Xilinx NWL PCIe: Enabling all MSI interrupts using MSI mask Bharat Kumar Gogada
2016-08-30 10:39   ` Bharat Kumar Gogada
2016-08-30 10:39   ` Bharat Kumar Gogada
2016-08-30 10:39 ` [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts Bharat Kumar Gogada
2016-08-30 10:39   ` Bharat Kumar Gogada
2016-08-30 12:17   ` Marc Zyngier
2016-08-30 12:17     ` Marc Zyngier
2016-08-30 14:13     ` Bharat Kumar Gogada
2016-08-30 14:13       ` Bharat Kumar Gogada
2016-08-30 14:13       ` Bharat Kumar Gogada
2016-08-30 15:02       ` Marc Zyngier
2016-08-30 15:02         ` Marc Zyngier
2016-08-30 15:02         ` Marc Zyngier
2016-08-31  9:56         ` Bharat Kumar Gogada
2016-08-31  9:56           ` Bharat Kumar Gogada
2016-08-31  9:56           ` Bharat Kumar Gogada
2016-08-31 10:56           ` Marc Zyngier
2016-08-31 10:56             ` Marc Zyngier
2016-08-31 10:56             ` Marc Zyngier
2016-09-01  5:19             ` Bharat Kumar Gogada
2016-09-01  5:19               ` Bharat Kumar Gogada
2016-09-01  5:19               ` Bharat Kumar Gogada
2016-09-12 22:02               ` Bjorn Helgaas
2016-09-12 22:02                 ` Bjorn Helgaas
2016-09-12 22:02                 ` Bjorn Helgaas
2016-09-13  7:41                 ` Marc Zyngier
2016-09-13  7:41                   ` Marc Zyngier
2016-09-13  7:41                   ` Marc Zyngier
2016-09-13 15:05                   ` Bjorn Helgaas
2016-09-13 15:05                     ` Bjorn Helgaas
2016-09-13 15:05                     ` Bjorn Helgaas
2016-09-13 15:34                     ` Bjorn Helgaas
2016-09-13 15:34                       ` Bjorn Helgaas
2016-09-13 15:34                       ` Bjorn Helgaas
2016-09-13 15:50                       ` Thomas Petazzoni
2016-09-13 15:50                         ` Thomas Petazzoni
2016-09-13 15:50                         ` Thomas Petazzoni
2016-09-14  5:34                       ` Bharat Kumar Gogada
2016-09-14  5:34                         ` Bharat Kumar Gogada
2016-09-14  5:34                         ` Bharat Kumar Gogada
2016-09-14  9:55                       ` Kishon Vijay Abraham I [this message]
2016-09-14  9:55                         ` Kishon Vijay Abraham I
2016-09-14  9:55                         ` Kishon Vijay Abraham I
2017-03-02  8:46                     ` Bharat Kumar Gogada
2017-03-02  8:46                       ` Bharat Kumar Gogada
2017-03-02  8:46                       ` Bharat Kumar Gogada
2016-09-13 14:32 ` [PATCH 1/3] PCI: Xilinx NWL PCIe: Expanding PCIe core errors and printing event occurred Bjorn Helgaas
2016-09-13 14:32   ` Bjorn Helgaas
2016-09-14  5:26   ` Bharat Kumar Gogada
2016-09-14  5:26     ` Bharat Kumar Gogada
2016-09-14  5:26     ` Bharat Kumar Gogada
2016-09-13 15:18 ` Bjorn Helgaas
2016-09-13 15:18   ` Bjorn Helgaas
2016-09-14  5:28   ` Bharat Kumar Gogada
2016-09-14  5:28     ` Bharat Kumar Gogada
2016-09-14  5:28     ` Bharat Kumar Gogada

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57D91E9A.7060402@ti.com \
    --to=kishon@ti.com \
    --cc=arnd@arndb.de \
    --cc=bharat.kumar.gogada@xilinx.com \
    --cc=bhelgaas@google.com \
    --cc=colin.king@canonical.com \
    --cc=helgaas@kernel.org \
    --cc=lftan@altera.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=m-karicheri2@ti.com \
    --cc=marc.zyngier@arm.com \
    --cc=michals@xilinx.com \
    --cc=rgummal@xilinx.com \
    --cc=robh@kernel.org \
    --cc=sorenb@xilinx.com \
    --cc=thomas.petazzoni@free-electrons.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.