All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
To: Marc Zyngier <marc.zyngier@arm.com>,
	"robh@kernel.org" <robh@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"colin.king@canonical.com" <colin.king@canonical.com>,
	Soren Brinkmann <sorenb@xilinx.com>,
	Michal Simek <michals@xilinx.com>,
	"arnd@arndb.de" <arnd@arndb.de>
Cc: "linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Ravikiran Gummaluri <rgummal@xilinx.com>
Subject: RE: [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts.
Date: Wed, 31 Aug 2016 09:56:02 +0000	[thread overview]
Message-ID: <8520D5D51A55D047800579B094147198258D28AF@XAP-PVEXMBX01.xlnx.xilinx.com> (raw)
In-Reply-To: <57C59FE8.30307@arm.com>

 > On 30/08/16 15:13, Bharat Kumar Gogada wrote:
> >> Hi Bharat,
> >>> @@ -561,7 +561,7 @@ static int nwl_pcie_init_irq_domain(struct
> >>> nwl_pcie
> >> *pcie)
> >>>     }
> >>>
> >>>     pcie->legacy_irq_domain = irq_domain_add_linear(legacy_intc_node,
> >>> -                                                   INTX_NUM,
> >>> +                                                   INTX_NUM + 1,
> >>>                                                     &legacy_domain_ops,
> >>>                                                     pcie);
> >>
> >> This feels like the wrong thing to do. You have INTX_NUM irqs, so the
> >> domain allocation should reflect this. On the other hand, the way the
> >> driver currently deals with mappings is quite broken (consistently adding 1 to
> the HW interrupt).
> >>
> > Hi Marc,
> >
> > Without above change I get following crash in kernel while booting.
> >
> > [    2.441684] error: hwirq 0x4 is too large for dummy
> >
> > [    2.441694] ------------[ cut here ]------------
> >
> > [    2.441698] WARNING: at kernel/irq/irqdomain.c:344
> >
> > [    2.441702] Modules linked in:
> >
> > [    2.441706]
> >
> > [    2.441714] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #8
> >
> > [    2.441718] Hardware name: xlnx,zynqmp (DT)
> >
> > [    2.441723] task: ffffffc071886b80 ti: ffffffc071888000 task.ti:
> ffffffc071888000
> >
> > [    2.441732] PC is at irq_domain_associate+0x138/0x1c0
> >
> > [    2.441738] LR is at irq_domain_associate+0x138/0x1c0
> >
> > In kernel/irq/irqdomain.c function irq_domain_associate
> >
> > if (WARN(hwirq >= domain->hwirq_max,
> >                  "error: hwirq 0x%x is too large for %s\n", (int)hwirq, domain->name))
> >                 return -EINVAL;
> >
> > Here the hwirq and hwirq_max are equal to 4 without the above condition
> (INTX_NUM + 1) due to which crash is coming.
> > This is happening as the legacy interrupts are starting from 1 (INTA).
> 
> I understood that. I'm still persisting in saying that you have the wrong fix.
> 
> Your domain should always allocate many interrupts as you have interrupt
> sources. These interrupts (hwirq) should be numbered from 0 to (n-1).

Agreed, but here comes the problem the hwirq for legacy interrupts will start at 0x1 to 0x4 (INTA to INTD) and 
these values are as per PCIe specification for legacy interrupts.
So these cannot be numbered from 0. So when 0x4 (INTD) for a multi-function device comes the crash occurs.
	
> 
> > And I'm  consistently adding 1 to the HW interrupt as in
> > nwl_pcie_leg_handler I get 0th bit set from MSGF_LEG_STATUS if INTA
> > interrupt is raised but my hwirq number being mapped for INTA is 0x1
> > so that's I'm adding 1 to obtain correct virtual irq. Same case in
> > nwl_pcie_free_irq_domain since hwirq starts from one I'm adding 1 to
> > obtain virtual irq and free it.
> 
> I can see that. Nonetheless, this is wrong. Can you please test the patch I
> provided in my reply and report what happens?

Can you be more specific on what is the wrong, I'm adding one since the hwirq starts from 0x1 as mentioned 
above.
I did try your suggestion with Ethernet card, but kernel hangs (it does not show any crash also, just hangs) when I do 
interface up (without bit + 1, using only bit position in handler). This is not working because in the legacy domain virq mapping
starts with hwirq 0x1, there is no mapping for 0x0 in the domain, so EP interrupt is not serviced since virq being returned is zero.

Thanks & Regards,
Bharat



 

WARNING: multiple messages have this Message-ID (diff)
From: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
To: Marc Zyngier <marc.zyngier@arm.com>,
	"robh@kernel.org" <robh@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"colin.king@canonical.com" <colin.king@canonical.com>,
	Soren Brinkmann <sorenb@xilinx.com>,
	Michal Simek <michals@xilinx.com>,
	"arnd@arndb.de" <arnd@arndb.de>
Cc: "linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Ravikiran Gummaluri <rgummal@xilinx.com>
Subject: RE: [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts.
Date: Wed, 31 Aug 2016 09:56:02 +0000	[thread overview]
Message-ID: <8520D5D51A55D047800579B094147198258D28AF@XAP-PVEXMBX01.xlnx.xilinx.com> (raw)
In-Reply-To: <57C59FE8.30307@arm.com>

 > On 30/08/16 15:13, Bharat Kumar Gogada wrote:
> >> Hi Bharat,
> >>> @@ -561,7 +561,7 @@ static int nwl_pcie_init_irq_domain(struct
> >>> nwl_pcie
> >> *pcie)
> >>>     }
> >>>
> >>>     pcie->legacy_irq_domain =3D irq_domain_add_linear(legacy_intc_nod=
e,
> >>> -                                                   INTX_NUM,
> >>> +                                                   INTX_NUM + 1,
> >>>                                                     &legacy_domain_op=
s,
> >>>                                                     pcie);
> >>
> >> This feels like the wrong thing to do. You have INTX_NUM irqs, so the
> >> domain allocation should reflect this. On the other hand, the way the
> >> driver currently deals with mappings is quite broken (consistently add=
ing 1 to
> the HW interrupt).
> >>
> > Hi Marc,
> >
> > Without above change I get following crash in kernel while booting.
> >
> > [    2.441684] error: hwirq 0x4 is too large for dummy
> >
> > [    2.441694] ------------[ cut here ]------------
> >
> > [    2.441698] WARNING: at kernel/irq/irqdomain.c:344
> >
> > [    2.441702] Modules linked in:
> >
> > [    2.441706]
> >
> > [    2.441714] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #8
> >
> > [    2.441718] Hardware name: xlnx,zynqmp (DT)
> >
> > [    2.441723] task: ffffffc071886b80 ti: ffffffc071888000 task.ti:
> ffffffc071888000
> >
> > [    2.441732] PC is at irq_domain_associate+0x138/0x1c0
> >
> > [    2.441738] LR is at irq_domain_associate+0x138/0x1c0
> >
> > In kernel/irq/irqdomain.c function irq_domain_associate
> >
> > if (WARN(hwirq >=3D domain->hwirq_max,
> >                  "error: hwirq 0x%x is too large for %s\n", (int)hwirq,=
 domain->name))
> >                 return -EINVAL;
> >
> > Here the hwirq and hwirq_max are equal to 4 without the above condition
> (INTX_NUM + 1) due to which crash is coming.
> > This is happening as the legacy interrupts are starting from 1 (INTA).
>=20
> I understood that. I'm still persisting in saying that you have the wrong=
 fix.
>=20
> Your domain should always allocate many interrupts as you have interrupt
> sources. These interrupts (hwirq) should be numbered from 0 to (n-1).

Agreed, but here comes the problem the hwirq for legacy interrupts will sta=
rt at 0x1 to 0x4 (INTA to INTD) and=20
these values are as per PCIe specification for legacy interrupts.
So these cannot be numbered from 0. So when 0x4 (INTD) for a multi-function=
 device comes the crash occurs.
=09
>=20
> > And I'm  consistently adding 1 to the HW interrupt as in
> > nwl_pcie_leg_handler I get 0th bit set from MSGF_LEG_STATUS if INTA
> > interrupt is raised but my hwirq number being mapped for INTA is 0x1
> > so that's I'm adding 1 to obtain correct virtual irq. Same case in
> > nwl_pcie_free_irq_domain since hwirq starts from one I'm adding 1 to
> > obtain virtual irq and free it.
>=20
> I can see that. Nonetheless, this is wrong. Can you please test the patch=
 I
> provided in my reply and report what happens?

Can you be more specific on what is the wrong, I'm adding one since the hwi=
rq starts from 0x1 as mentioned=20
above.
I did try your suggestion with Ethernet card, but kernel hangs (it does not=
 show any crash also, just hangs) when I do=20
interface up (without bit + 1, using only bit position in handler). This is=
 not working because in the legacy domain virq mapping
starts with hwirq 0x1, there is no mapping for 0x0 in the domain, so EP int=
errupt is not serviced since virq being returned is zero.

Thanks & Regards,
Bharat



=20

WARNING: multiple messages have this Message-ID (diff)
From: bharat.kumar.gogada@xilinx.com (Bharat Kumar Gogada)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts.
Date: Wed, 31 Aug 2016 09:56:02 +0000	[thread overview]
Message-ID: <8520D5D51A55D047800579B094147198258D28AF@XAP-PVEXMBX01.xlnx.xilinx.com> (raw)
In-Reply-To: <57C59FE8.30307@arm.com>

 > On 30/08/16 15:13, Bharat Kumar Gogada wrote:
> >> Hi Bharat,
> >>> @@ -561,7 +561,7 @@ static int nwl_pcie_init_irq_domain(struct
> >>> nwl_pcie
> >> *pcie)
> >>>     }
> >>>
> >>>     pcie->legacy_irq_domain = irq_domain_add_linear(legacy_intc_node,
> >>> -                                                   INTX_NUM,
> >>> +                                                   INTX_NUM + 1,
> >>>                                                     &legacy_domain_ops,
> >>>                                                     pcie);
> >>
> >> This feels like the wrong thing to do. You have INTX_NUM irqs, so the
> >> domain allocation should reflect this. On the other hand, the way the
> >> driver currently deals with mappings is quite broken (consistently adding 1 to
> the HW interrupt).
> >>
> > Hi Marc,
> >
> > Without above change I get following crash in kernel while booting.
> >
> > [    2.441684] error: hwirq 0x4 is too large for dummy
> >
> > [    2.441694] ------------[ cut here ]------------
> >
> > [    2.441698] WARNING: at kernel/irq/irqdomain.c:344
> >
> > [    2.441702] Modules linked in:
> >
> > [    2.441706]
> >
> > [    2.441714] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #8
> >
> > [    2.441718] Hardware name: xlnx,zynqmp (DT)
> >
> > [    2.441723] task: ffffffc071886b80 ti: ffffffc071888000 task.ti:
> ffffffc071888000
> >
> > [    2.441732] PC is at irq_domain_associate+0x138/0x1c0
> >
> > [    2.441738] LR is at irq_domain_associate+0x138/0x1c0
> >
> > In kernel/irq/irqdomain.c function irq_domain_associate
> >
> > if (WARN(hwirq >= domain->hwirq_max,
> >                  "error: hwirq 0x%x is too large for %s\n", (int)hwirq, domain->name))
> >                 return -EINVAL;
> >
> > Here the hwirq and hwirq_max are equal to 4 without the above condition
> (INTX_NUM + 1) due to which crash is coming.
> > This is happening as the legacy interrupts are starting from 1 (INTA).
> 
> I understood that. I'm still persisting in saying that you have the wrong fix.
> 
> Your domain should always allocate many interrupts as you have interrupt
> sources. These interrupts (hwirq) should be numbered from 0 to (n-1).

Agreed, but here comes the problem the hwirq for legacy interrupts will start at 0x1 to 0x4 (INTA to INTD) and 
these values are as per PCIe specification for legacy interrupts.
So these cannot be numbered from 0. So when 0x4 (INTD) for a multi-function device comes the crash occurs.
	
> 
> > And I'm  consistently adding 1 to the HW interrupt as in
> > nwl_pcie_leg_handler I get 0th bit set from MSGF_LEG_STATUS if INTA
> > interrupt is raised but my hwirq number being mapped for INTA is 0x1
> > so that's I'm adding 1 to obtain correct virtual irq. Same case in
> > nwl_pcie_free_irq_domain since hwirq starts from one I'm adding 1 to
> > obtain virtual irq and free it.
> 
> I can see that. Nonetheless, this is wrong. Can you please test the patch I
> provided in my reply and report what happens?

Can you be more specific on what is the wrong, I'm adding one since the hwirq starts from 0x1 as mentioned 
above.
I did try your suggestion with Ethernet card, but kernel hangs (it does not show any crash also, just hangs) when I do 
interface up (without bit + 1, using only bit position in handler). This is not working because in the legacy domain virq mapping
starts with hwirq 0x1, there is no mapping for 0x0 in the domain, so EP interrupt is not serviced since virq being returned is zero.

Thanks & Regards,
Bharat



 

  reply	other threads:[~2016-08-31  9:56 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-30 10:39 [PATCH 1/3] PCI: Xilinx NWL PCIe: Expanding PCIe core errors and printing event occurred Bharat Kumar Gogada
2016-08-30 10:39 ` Bharat Kumar Gogada
2016-08-30 10:39 ` [PATCH 2/3] PCI: Xilinx NWL PCIe: Enabling all MSI interrupts using MSI mask Bharat Kumar Gogada
2016-08-30 10:39   ` Bharat Kumar Gogada
2016-08-30 10:39   ` Bharat Kumar Gogada
2016-08-30 10:39 ` [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts Bharat Kumar Gogada
2016-08-30 10:39   ` Bharat Kumar Gogada
2016-08-30 12:17   ` Marc Zyngier
2016-08-30 12:17     ` Marc Zyngier
2016-08-30 14:13     ` Bharat Kumar Gogada
2016-08-30 14:13       ` Bharat Kumar Gogada
2016-08-30 14:13       ` Bharat Kumar Gogada
2016-08-30 15:02       ` Marc Zyngier
2016-08-30 15:02         ` Marc Zyngier
2016-08-30 15:02         ` Marc Zyngier
2016-08-31  9:56         ` Bharat Kumar Gogada [this message]
2016-08-31  9:56           ` Bharat Kumar Gogada
2016-08-31  9:56           ` Bharat Kumar Gogada
2016-08-31 10:56           ` Marc Zyngier
2016-08-31 10:56             ` Marc Zyngier
2016-08-31 10:56             ` Marc Zyngier
2016-09-01  5:19             ` Bharat Kumar Gogada
2016-09-01  5:19               ` Bharat Kumar Gogada
2016-09-01  5:19               ` Bharat Kumar Gogada
2016-09-12 22:02               ` Bjorn Helgaas
2016-09-12 22:02                 ` Bjorn Helgaas
2016-09-12 22:02                 ` Bjorn Helgaas
2016-09-13  7:41                 ` Marc Zyngier
2016-09-13  7:41                   ` Marc Zyngier
2016-09-13  7:41                   ` Marc Zyngier
2016-09-13 15:05                   ` Bjorn Helgaas
2016-09-13 15:05                     ` Bjorn Helgaas
2016-09-13 15:05                     ` Bjorn Helgaas
2016-09-13 15:34                     ` Bjorn Helgaas
2016-09-13 15:34                       ` Bjorn Helgaas
2016-09-13 15:34                       ` Bjorn Helgaas
2016-09-13 15:50                       ` Thomas Petazzoni
2016-09-13 15:50                         ` Thomas Petazzoni
2016-09-13 15:50                         ` Thomas Petazzoni
2016-09-14  5:34                       ` Bharat Kumar Gogada
2016-09-14  5:34                         ` Bharat Kumar Gogada
2016-09-14  5:34                         ` Bharat Kumar Gogada
2016-09-14  9:55                       ` Kishon Vijay Abraham I
2016-09-14  9:55                         ` Kishon Vijay Abraham I
2016-09-14  9:55                         ` Kishon Vijay Abraham I
2017-03-02  8:46                     ` Bharat Kumar Gogada
2017-03-02  8:46                       ` Bharat Kumar Gogada
2017-03-02  8:46                       ` Bharat Kumar Gogada
2016-09-13 14:32 ` [PATCH 1/3] PCI: Xilinx NWL PCIe: Expanding PCIe core errors and printing event occurred Bjorn Helgaas
2016-09-13 14:32   ` Bjorn Helgaas
2016-09-14  5:26   ` Bharat Kumar Gogada
2016-09-14  5:26     ` Bharat Kumar Gogada
2016-09-14  5:26     ` Bharat Kumar Gogada
2016-09-13 15:18 ` Bjorn Helgaas
2016-09-13 15:18   ` Bjorn Helgaas
2016-09-14  5:28   ` Bharat Kumar Gogada
2016-09-14  5:28     ` Bharat Kumar Gogada
2016-09-14  5:28     ` Bharat Kumar Gogada

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8520D5D51A55D047800579B094147198258D28AF@XAP-PVEXMBX01.xlnx.xilinx.com \
    --to=bharat.kumar.gogada@xilinx.com \
    --cc=arnd@arndb.de \
    --cc=bhelgaas@google.com \
    --cc=colin.king@canonical.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=marc.zyngier@arm.com \
    --cc=michals@xilinx.com \
    --cc=rgummal@xilinx.com \
    --cc=robh@kernel.org \
    --cc=sorenb@xilinx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.