All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/ioapic: Don't return 0 as valid virq
@ 2023-03-02 19:34 Saurabh Sengar
  2023-03-09 17:14 ` Saurabh Singh Sengar
  2023-03-12 20:40 ` Borislav Petkov
  0 siblings, 2 replies; 11+ messages in thread
From: Saurabh Sengar @ 2023-03-02 19:34 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, x86, hpa, johan+linaro,
	isaku.yamahata, mikelley, linux-kernel

Zero is invalid virq and should't be returned as a valid value for
lower irq bound. If IO-APIC and gsi_top are not initialized return
the 'from' value as virq.

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
---
 arch/x86/kernel/apic/io_apic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index a868b76cd3d4..000cc6159b8b 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2483,9 +2483,11 @@ unsigned int arch_dynirq_lower_bound(unsigned int from)
 	/*
 	 * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
 	 * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
+	 *
+	 * Incase gsi_top is also not initialized return @from.
 	 */
 	if (!ioapic_initialized)
-		return gsi_top;
+		return gsi_top ? : from;
 	/*
 	 * For DT enabled machines ioapic_dynirq_base is irrelevant and not
 	 * updated. So simply return @from if ioapic_dynirq_base == 0.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-02 19:34 [PATCH] x86/ioapic: Don't return 0 as valid virq Saurabh Sengar
@ 2023-03-09 17:14 ` Saurabh Singh Sengar
  2023-03-12 20:40 ` Borislav Petkov
  1 sibling, 0 replies; 11+ messages in thread
From: Saurabh Singh Sengar @ 2023-03-09 17:14 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, x86, hpa, johan+linaro,
	isaku.yamahata, mikelley, linux-kernel

On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> Zero is invalid virq and should't be returned as a valid value for
> lower irq bound. If IO-APIC and gsi_top are not initialized return
> the 'from' value as virq.
> 
> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> ---
>  arch/x86/kernel/apic/io_apic.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index a868b76cd3d4..000cc6159b8b 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2483,9 +2483,11 @@ unsigned int arch_dynirq_lower_bound(unsigned int from)
>  	/*
>  	 * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
>  	 * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
> +	 *
> +	 * Incase gsi_top is also not initialized return @from.
>  	 */
>  	if (!ioapic_initialized)
> -		return gsi_top;
> +		return gsi_top ? : from;
>  	/*
>  	 * For DT enabled machines ioapic_dynirq_base is irrelevant and not
>  	 * updated. So simply return @from if ioapic_dynirq_base == 0.
> -- 
> 2.34.1

Hi Maintainers,

May I get your feedback on this patch.

Regards,
Saurabh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-02 19:34 [PATCH] x86/ioapic: Don't return 0 as valid virq Saurabh Sengar
  2023-03-09 17:14 ` Saurabh Singh Sengar
@ 2023-03-12 20:40 ` Borislav Petkov
  2023-03-13  3:29   ` [EXTERNAL] " Saurabh Singh Sengar
  1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2023-03-12 20:40 UTC (permalink / raw)
  To: Saurabh Sengar
  Cc: tglx, mingo, dave.hansen, x86, hpa, johan+linaro, isaku.yamahata,
	mikelley, linux-kernel

On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> Zero is invalid virq and should't be returned as a valid value for
> lower irq bound. If IO-APIC and gsi_top are not initialized return

Why isn't gsi_top initialized?

What is this fixing?

Don't be afraid to do

git annotate arch/x86/kernel/apic/io_apic.c

and see which commit added this. This one:

3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines")

Now add the folks from this commit to Cc and tell them why in your case
gsi_top is not initialized and what they're breaking by doing that.

The more your commit message explains *why* you're fixing something, the
better it is for the maintainers/reviewers to actually know what to do.

Right now I'm reading this and I'm thinking, random, unjustified change.
Ignore.

Ok?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-12 20:40 ` Borislav Petkov
@ 2023-03-13  3:29   ` Saurabh Singh Sengar
  2023-03-13  3:37     ` Saurabh Singh Sengar
  2023-03-13 11:14     ` [EXTERNAL] " Borislav Petkov
  0 siblings, 2 replies; 11+ messages in thread
From: Saurabh Singh Sengar @ 2023-03-13  3:29 UTC (permalink / raw)
  To: Borislav Petkov, Saurabh Sengar
  Cc: tglx, mingo, dave.hansen, x86, hpa, johan+linaro, isaku.yamahata,
	Michael Kelley (LINUX),
	linux-kernel, rahul.tanwar, andriy.shevchenko

Cc: rahul.tanwar@linux.intel.com, andriy.shevchenko@intel.com

Thanks for you comments, please see my responses below.

> -----Original Message-----
> From: Borislav Petkov <bp@alien8.de>
> Sent: Monday, March 13, 2023 2:10 AM
> To: Saurabh Sengar <ssengar@linux.microsoft.com>
> Cc: tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com;
> x86@kernel.org; hpa@zytor.com; johan+linaro@kernel.org;
> isaku.yamahata@intel.com; Michael Kelley (LINUX)
> <mikelley@microsoft.com>; linux-kernel@vger.kernel.org
> Subject: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
> 
> On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> > Zero is invalid virq and should't be returned as a valid value for
> > lower irq bound. If IO-APIC and gsi_top are not initialized return
> 
> Why isn't gsi_top initialized?
> 
> What is this fixing?

In the absence of a device tree node for IO-APIC,  IO-APIC is not registered,
resulting in uninitialized gsi_top. And in such cases arch_dynirq_lower_bound
will return 0. Returning 0 from this function will allow interrupts to have 0
assigned as valid irq, which is wrong. In case gsi_top is 0, lower bound of irq
should be derived from 'hint' value passed to function as 'from'.

I can add above info in commit message,  please let me know if anything
more to be added.

To be specific in our system which is a guest VM we don't need IO-APIC and hence
there is no device tree node for it. It is observed that we get irq 0 assigned to PCI-MSI.

> 
> Don't be afraid to do
> 
> git annotate arch/x86/kernel/apic/io_apic.c
> 
> and see which commit added this. This one:
> 
> 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled
> machines")
> 
> Now add the folks from this commit to Cc and tell them why in your case
> gsi_top is not initialized and what they're breaking by doing that.

Thanks. I will add "Fixes:" and "Cc:" tag in next version.

> 
> The more your commit message explains *why* you're fixing something, the
> better it is for the maintainers/reviewers to actually know what to do.
> 
> Right now I'm reading this and I'm thinking, random, unjustified change.
> Ignore.
> 
> Ok?
> 
> Thx.
> 
> --
> Regards/Gruss,
>     Boris.
> 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl
> e.kernel.org%2Ftglx%2Fnotes-about-
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C6e1e0e21051c4
> 9c1cfe008db233a0376%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> 7C638142504360574969%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> %7C&sdata=OLdgb1AuLbLvlzucgNFBQEEK6G%2FsFV%2BO2TqT%2FNCujJU%3
> D&reserved=0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-13  3:29   ` [EXTERNAL] " Saurabh Singh Sengar
@ 2023-03-13  3:37     ` Saurabh Singh Sengar
  2023-03-13 11:07       ` Borislav Petkov
  2023-03-13 11:14     ` [EXTERNAL] " Borislav Petkov
  1 sibling, 1 reply; 11+ messages in thread
From: Saurabh Singh Sengar @ 2023-03-13  3:37 UTC (permalink / raw)
  To: Saurabh Singh Sengar, Borislav Petkov, Saurabh Sengar
  Cc: tglx, mingo, dave.hansen, x86, hpa, johan+linaro, isaku.yamahata,
	Michael Kelley (LINUX),
	linux-kernel, andriy.shevchenko

I just see mail to rahul.tanwar@linux.intel.com is undelivered, shall I still add it in "Cc:" ?
Please let me know what we usually do in such cases.

Regards,
Saurabh


> -----Original Message-----
> From: Saurabh Singh Sengar <ssengar@microsoft.com>
> Sent: Monday, March 13, 2023 9:00 AM
> To: Borislav Petkov <bp@alien8.de>; Saurabh Sengar
> <ssengar@linux.microsoft.com>
> Cc: tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com;
> x86@kernel.org; hpa@zytor.com; johan+linaro@kernel.org;
> isaku.yamahata@intel.com; Michael Kelley (LINUX)
> <mikelley@microsoft.com>; linux-kernel@vger.kernel.org;
> rahul.tanwar@linux.intel.com; andriy.shevchenko@intel.com
> Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
> 
> Cc: rahul.tanwar@linux.intel.com, andriy.shevchenko@intel.com
> 
> Thanks for you comments, please see my responses below.
> 
> > -----Original Message-----
> > From: Borislav Petkov <bp@alien8.de>
> > Sent: Monday, March 13, 2023 2:10 AM
> > To: Saurabh Sengar <ssengar@linux.microsoft.com>
> > Cc: tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com;
> > x86@kernel.org; hpa@zytor.com; johan+linaro@kernel.org;
> > isaku.yamahata@intel.com; Michael Kelley (LINUX)
> > <mikelley@microsoft.com>; linux-kernel@vger.kernel.org
> > Subject: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid
> > virq
> >
> > On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> > > Zero is invalid virq and should't be returned as a valid value for
> > > lower irq bound. If IO-APIC and gsi_top are not initialized return
> >
> > Why isn't gsi_top initialized?
> >
> > What is this fixing?
> 
> In the absence of a device tree node for IO-APIC,  IO-APIC is not registered,
> resulting in uninitialized gsi_top. And in such cases arch_dynirq_lower_bound
> will return 0. Returning 0 from this function will allow interrupts to have 0
> assigned as valid irq, which is wrong. In case gsi_top is 0, lower bound of irq
> should be derived from 'hint' value passed to function as 'from'.
> 
> I can add above info in commit message,  please let me know if anything
> more to be added.
> 
> To be specific in our system which is a guest VM we don't need IO-APIC and
> hence there is no device tree node for it. It is observed that we get irq 0
> assigned to PCI-MSI.
> 
> >
> > Don't be afraid to do
> >
> > git annotate arch/x86/kernel/apic/io_apic.c
> >
> > and see which commit added this. This one:
> >
> > 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT
> > enabled
> > machines")
> >
> > Now add the folks from this commit to Cc and tell them why in your
> > case gsi_top is not initialized and what they're breaking by doing that.
> 
> Thanks. I will add "Fixes:" and "Cc:" tag in next version.
> 
> >
> > The more your commit message explains *why* you're fixing something,
> > the better it is for the maintainers/reviewers to actually know what to do.
> >
> > Right now I'm reading this and I'm thinking, random, unjustified change.
> > Ignore.
> >
> > Ok?
> >
> > Thx.
> >
> > --
> > Regards/Gruss,
> >     Boris.
> >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeop
> >
> l%2F&data=05%7C01%7Cssengar%40microsoft.com%7C0595a41023f849c5ee
> b308db
> >
> 23732cd0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638142749
> 8888650
> >
> 08%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzI
> iLCJBTiI
> >
> 6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=qpOM5MMYpUof
> VOaNsp8HxTmv
> > %2B80iVn5rFfzNQTlTwLw%3D&reserved=0
> > e.kernel.org%2Ftglx%2Fnotes-about-
> >
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C6e1e0e21051c4
> >
> 9c1cfe008db233a0376%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> >
> 7C638142504360574969%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> >
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> >
> %7C&sdata=OLdgb1AuLbLvlzucgNFBQEEK6G%2FsFV%2BO2TqT%2FNCujJU%3
> > D&reserved=0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-13  3:37     ` Saurabh Singh Sengar
@ 2023-03-13 11:07       ` Borislav Petkov
  0 siblings, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2023-03-13 11:07 UTC (permalink / raw)
  To: Saurabh Singh Sengar
  Cc: Saurabh Sengar, tglx, mingo, dave.hansen, x86, hpa, johan+linaro,
	isaku.yamahata, Michael Kelley (LINUX),
	linux-kernel, andriy.shevchenko

On Mon, Mar 13, 2023 at 03:37:44AM +0000, Saurabh Singh Sengar wrote:
> I just see mail to rahul.tanwar@linux.intel.com is undelivered, shall I still add it in "Cc:" ?
> Please let me know what we usually do in such cases.

You were posting just fine inline, why do you have to top-post now?

If it is undelivered, then the address is probably invalid now so drop
it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-13  3:29   ` [EXTERNAL] " Saurabh Singh Sengar
  2023-03-13  3:37     ` Saurabh Singh Sengar
@ 2023-03-13 11:14     ` Borislav Petkov
  2023-03-14 10:23       ` Saurabh Singh Sengar
  1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2023-03-13 11:14 UTC (permalink / raw)
  To: Saurabh Singh Sengar
  Cc: Saurabh Sengar, tglx, mingo, dave.hansen, x86, hpa, johan+linaro,
	isaku.yamahata, Michael Kelley (LINUX),
	linux-kernel, rahul.tanwar, andriy.shevchenko

On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote:
> To be specific in our system which is a guest VM we don't need IO-APIC and hence
> there is no device tree node for it. It is observed that we get irq 0 assigned to PCI-MSI.

This should be added to your commit message: what guest VM is that and
why should the kernel support it.

Why doesn't it need an IO-APIC and why does the current code need to be
changed just for your guest VM?

What else needs to be changed so that your VM works?

Where is that VM's documentation and why can't that VM be fixed *not* to
need kernel changes? IOW, why can't that VM emulate an IO-APIC like the
others do...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-13 11:14     ` [EXTERNAL] " Borislav Petkov
@ 2023-03-14 10:23       ` Saurabh Singh Sengar
  2023-03-24  7:09         ` Saurabh Singh Sengar
  2023-03-24 15:39         ` [EXTERNAL] " Thomas Gleixner
  0 siblings, 2 replies; 11+ messages in thread
From: Saurabh Singh Sengar @ 2023-03-14 10:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Saurabh Sengar, tglx, mingo, dave.hansen, x86, hpa, johan+linaro,
	isaku.yamahata, Michael Kelley (LINUX),
	linux-kernel, rahul.tanwar, andriy.shevchenko



> -----Original Message-----
> From: Borislav Petkov <bp@alien8.de>
> Sent: Monday, March 13, 2023 4:44 PM
> To: Saurabh Singh Sengar <ssengar@microsoft.com>
> Cc: Saurabh Sengar <ssengar@linux.microsoft.com>; tglx@linutronix.de;
> mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org;
> hpa@zytor.com; johan+linaro@kernel.org; isaku.yamahata@intel.com;
> Michael Kelley (LINUX) <mikelley@microsoft.com>; linux-
> kernel@vger.kernel.org; rahul.tanwar@linux.intel.com;
> andriy.shevchenko@intel.com
> Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
> 
> On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote:
> > To be specific in our system which is a guest VM we don't need IO-APIC
> > and hence there is no device tree node for it. It is observed that we get irq 0
> assigned to PCI-MSI.
> 
> This should be added to your commit message: what guest VM is that and
> why should the kernel support it.

Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
documentation is in Documentation/virt/hyperv/.

In commit I wanted to mention that any system which is not registering IO-APIC
will have this issue. But I am fine to mention specifically about the issue I am facing.
As part of your next comment, I have explained the issue in detail if that is good, I
can put that as commit message.

> 
> Why doesn't it need an IO-APIC and why does the current code need to be
> changed just for your guest VM?

For Hyper-V Virtual Machines, few platforms don't have any devices to be
hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
assigns interrupts to PCIe devices. In such platforms IO-APIC is not
registered which causes gsi_top value to remain at 0 and not get properly
assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
function in io_apic.c decides the lower bound of irq numbers based on gsi_top.

Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
virq number because gsi_top is still 0. 0 being invalid virq is ignored by
MSI irq domain and results allocation of the same PCIe MSI twice.

		CPU0		CPU1
0:		2			0		Hyper-V PCIe MSI 1073741824-edge
1:		69			0		Hyper-V PCIe MSI 1073741824-edge      nvme0q0

To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
This will also be identical to the behaviour of weak arch_dynirq_lower_bound
function defined in kernel/softirq.c.

> 
> What else needs to be changed so that your VM works?

This is the only change required.

> 
> Where is that VM's documentation and why can't that VM be fixed *not* to
> need kernel changes? IOW, why can't that VM emulate an IO-APIC like the
> others do...

Documentation is mentioned above. As there is no need of IO-APIC there is
no need emulating it.

Please let me know if there is any further clarification required.

> 
> --
> Regards/Gruss,
>     Boris.
> 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl
> e.kernel.org%2Ftglx%2Fnotes-about-
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C817c78e7bb324
> 8cd73b708db23b41c2a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> 7C638143028755917117%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> %7C&sdata=3N5Mkl2gjMPHKOJGykZ3LvM6h%2FfD86dXLTQo3VH0Svc%3D&re
> served=0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-14 10:23       ` Saurabh Singh Sengar
@ 2023-03-24  7:09         ` Saurabh Singh Sengar
  2023-03-24 15:39         ` [EXTERNAL] " Thomas Gleixner
  1 sibling, 0 replies; 11+ messages in thread
From: Saurabh Singh Sengar @ 2023-03-24  7:09 UTC (permalink / raw)
  To: Saurabh Singh Sengar, Borislav Petkov
  Cc: Saurabh Sengar, tglx, mingo, dave.hansen, x86, hpa, johan+linaro,
	isaku.yamahata, Michael Kelley (LINUX),
	linux-kernel, andriy.shevchenko



> -----Original Message-----
> From: Saurabh Singh Sengar <ssengar@microsoft.com>
> Sent: Tuesday, March 14, 2023 3:54 PM
> To: Borislav Petkov <bp@alien8.de>
> Cc: Saurabh Sengar <ssengar@linux.microsoft.com>; tglx@linutronix.de;
> mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org;
> hpa@zytor.com; johan+linaro@kernel.org; isaku.yamahata@intel.com;
> Michael Kelley (LINUX) <mikelley@microsoft.com>; linux-
> kernel@vger.kernel.org; rahul.tanwar@linux.intel.com;
> andriy.shevchenko@intel.com
> Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
> 
> 
> 
> > -----Original Message-----
> > From: Borislav Petkov <bp@alien8.de>
> > Sent: Monday, March 13, 2023 4:44 PM
> > To: Saurabh Singh Sengar <ssengar@microsoft.com>
> > Cc: Saurabh Sengar <ssengar@linux.microsoft.com>; tglx@linutronix.de;
> > mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org;
> > hpa@zytor.com; johan+linaro@kernel.org; isaku.yamahata@intel.com;
> > Michael Kelley (LINUX) <mikelley@microsoft.com>; linux-
> > kernel@vger.kernel.org; rahul.tanwar@linux.intel.com;
> > andriy.shevchenko@intel.com
> > Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
> >
> > On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote:
> > > To be specific in our system which is a guest VM we don't need IO-APIC
> > > and hence there is no device tree node for it. It is observed that we get irq
> 0
> > assigned to PCI-MSI.
> >
> > This should be added to your commit message: what guest VM is that and
> > why should the kernel support it.
> 
> Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
> documentation is in Documentation/virt/hyperv/.
> 
> In commit I wanted to mention that any system which is not registering IO-
> APIC
> will have this issue. But I am fine to mention specifically about the issue I am
> facing.
> As part of your next comment, I have explained the issue in detail if that is
> good, I
> can put that as commit message.
> 
> >
> > Why doesn't it need an IO-APIC and why does the current code need to be
> > changed just for your guest VM?
> 
> For Hyper-V Virtual Machines, few platforms don't have any devices to be
> hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
> assigns interrupts to PCIe devices. In such platforms IO-APIC is not
> registered which causes gsi_top value to remain at 0 and not get properly
> assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
> flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
> function in io_apic.c decides the lower bound of irq numbers based on
> gsi_top.
> 
> Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
> virq number because gsi_top is still 0. 0 being invalid virq is ignored by
> MSI irq domain and results allocation of the same PCIe MSI twice.
> 
> 		CPU0		CPU1
> 0:		2			0		Hyper-V PCIe MSI
> 1073741824-edge
> 1:		69			0		Hyper-V PCIe MSI
> 1073741824-edge      nvme0q0
> 
> To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
> hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
> This will also be identical to the behaviour of weak arch_dynirq_lower_bound
> function defined in kernel/softirq.c.

Hi Borislav,

Have you had an opportunity to review the above commit message ?
Kindly provide me with your opinion on whether it meets your expectations.

Regards,
Saurabh

> 
> >
> > What else needs to be changed so that your VM works?
> 
> This is the only change required.
> 
> >
> > Where is that VM's documentation and why can't that VM be fixed *not* to
> > need kernel changes? IOW, why can't that VM emulate an IO-APIC like the
> > others do...
> 
> Documentation is mentioned above. As there is no need of IO-APIC there is
> no need emulating it.
> 
> Please let me know if there is any further clarification required.
> 
> >
> > --
> > Regards/Gruss,
> >     Boris.
> >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl
> %2F&data=05%7C01%7Cssengar%40microsoft.com%7C84362c605bf04e6d56
> 6a08db247630d0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63
> 8143862345546032%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMD
> AiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C
> &sdata=7iCmUXeDyD%2Fqv%2BO63N6HG%2FPrS9HWP3yaGClT7X2RB0c%3D
> &reserved=0
> > e.kernel.org%2Ftglx%2Fnotes-about-
> >
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C817c78e7bb324
> >
> 8cd73b708db23b41c2a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> >
> 7C638143028755917117%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> >
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> >
> %7C&sdata=3N5Mkl2gjMPHKOJGykZ3LvM6h%2FfD86dXLTQo3VH0Svc%3D&re
> > served=0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-14 10:23       ` Saurabh Singh Sengar
  2023-03-24  7:09         ` Saurabh Singh Sengar
@ 2023-03-24 15:39         ` Thomas Gleixner
  2023-03-24 16:24           ` Saurabh Singh Sengar
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2023-03-24 15:39 UTC (permalink / raw)
  To: Saurabh Singh Sengar, Borislav Petkov
  Cc: Saurabh Sengar, mingo, dave.hansen, x86, hpa, johan+linaro,
	isaku.yamahata, Michael Kelley (LINUX),
	linux-kernel, rahul.tanwar, andriy.shevchenko

On Tue, Mar 14 2023 at 10:23, Saurabh Singh Sengar wrote:
>> This should be added to your commit message: what guest VM is that and
>> why should the kernel support it.
>
> Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
> documentation is in Documentation/virt/hyperv/.
>
> In commit I wanted to mention that any system which is not registering
> IO-APIC will have this issue. But I am fine to mention specifically
> about the issue I am facing.  As part of your next comment, I have
> explained the issue in detail if that is good, I can put that as
> commit message.
>> 
>> Why doesn't it need an IO-APIC and why does the current code need to be
>> changed just for your guest VM?
>
> For Hyper-V Virtual Machines, few platforms don't have any devices to be
> hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
> assigns interrupts to PCIe devices. In such platforms IO-APIC is not
> registered which causes gsi_top value to remain at 0 and not get properly
> assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
> flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
> function in io_apic.c decides the lower bound of irq numbers based on gsi_top.
>
> Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
> virq number because gsi_top is still 0. 0 being invalid virq is ignored by
> MSI irq domain and results allocation of the same PCIe MSI twice.
>
> 		CPU0		CPU1
> 0:		2			0		Hyper-V PCIe MSI 1073741824-edge
> 1:		69			0		Hyper-V PCIe MSI 1073741824-edge      nvme0q0
>
> To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
> hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
> This will also be identical to the behaviour of weak arch_dynirq_lower_bound
> function defined in kernel/softirq.c.

I find this mightly confusing. Something like this perhaps:

  Subject: x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

  arch_dynirq_lower_bound() is invoked by the core interrupt code to
  retrieve the lowest possible Linux interrupt number for dynamically
  allocated interrupts like MSI.

  The x86 implementation uses this to exclude the IO/APIC GSI space.
  This works correctly as long as there is an IO/APIC registered, but
  returns 0 if not. This has been observed in VMs where the BIOS does
  not advertise an IO/APIC.  

  0 is an invalid interrupt number except for the legacy timer interrupt
  on x86. The return value is unchecked in the core code, so it ends up
  to allocate interrupt number 0 which is subsequently considered to be
  invalid by the caller, e.g. the MSI allocation code.

  The function has already a check for 0 in the case that an IO/APIC is
  registered, but ioapic_dynirq_base is 0 in case of device tree setups.

  Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
  which is used in the case that no IO/APIC is registered.

And then make the code to look like the below, which makes it very
clear what this is about.

Thanks,

        tglx
---
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2477,17 +2477,22 @@ static int io_apic_get_redir_entries(int
 
 unsigned int arch_dynirq_lower_bound(unsigned int from)
 {
+	unsigned int ret;
+
 	/*
 	 * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
 	 * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
 	 */
-	if (!ioapic_initialized)
-		return gsi_top;
+	ret = ioapic_dynirq_base ? : gsi_top;
+
 	/*
-	 * For DT enabled machines ioapic_dynirq_base is irrelevant and not
-	 * updated. So simply return @from if ioapic_dynirq_base == 0.
+	 * For DT enabled machines ioapic_dynirq_base is irrelevant and
+	 * always 0. gsi_top can be 0 if there is no IO/APIC registered.
+	 *
+	 * 0 is an invalid interrupt number for dynamic allocations. Return
+	 * @from instead.
 	 */
-	return ioapic_dynirq_base ? : from;
+	return ret ? : from;
 }
 
 #ifdef CONFIG_X86_32



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
  2023-03-24 15:39         ` [EXTERNAL] " Thomas Gleixner
@ 2023-03-24 16:24           ` Saurabh Singh Sengar
  0 siblings, 0 replies; 11+ messages in thread
From: Saurabh Singh Sengar @ 2023-03-24 16:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Saurabh Singh Sengar, Borislav Petkov, mingo, dave.hansen, x86,
	hpa, johan+linaro, isaku.yamahata, Michael Kelley (LINUX),
	linux-kernel, rahul.tanwar, andriy.shevchenko

On Fri, Mar 24, 2023 at 04:39:07PM +0100, Thomas Gleixner wrote:
> On Tue, Mar 14 2023 at 10:23, Saurabh Singh Sengar wrote:
> >> This should be added to your commit message: what guest VM is that and
> >> why should the kernel support it.
> >
> > Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
> > documentation is in Documentation/virt/hyperv/.
> >
> > In commit I wanted to mention that any system which is not registering
> > IO-APIC will have this issue. But I am fine to mention specifically
> > about the issue I am facing.  As part of your next comment, I have
> > explained the issue in detail if that is good, I can put that as
> > commit message.
> >> 
> >> Why doesn't it need an IO-APIC and why does the current code need to be
> >> changed just for your guest VM?
> >
> > For Hyper-V Virtual Machines, few platforms don't have any devices to be
> > hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
> > assigns interrupts to PCIe devices. In such platforms IO-APIC is not
> > registered which causes gsi_top value to remain at 0 and not get properly
> > assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
> > flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
> > function in io_apic.c decides the lower bound of irq numbers based on gsi_top.
> >
> > Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
> > virq number because gsi_top is still 0. 0 being invalid virq is ignored by
> > MSI irq domain and results allocation of the same PCIe MSI twice.
> >
> > 		CPU0		CPU1
> > 0:		2			0		Hyper-V PCIe MSI 1073741824-edge
> > 1:		69			0		Hyper-V PCIe MSI 1073741824-edge      nvme0q0
> >
> > To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
> > hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
> > This will also be identical to the behaviour of weak arch_dynirq_lower_bound
> > function defined in kernel/softirq.c.
> 
> I find this mightly confusing. Something like this perhaps:
> 
>   Subject: x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()
> 
>   arch_dynirq_lower_bound() is invoked by the core interrupt code to
>   retrieve the lowest possible Linux interrupt number for dynamically
>   allocated interrupts like MSI.
> 
>   The x86 implementation uses this to exclude the IO/APIC GSI space.
>   This works correctly as long as there is an IO/APIC registered, but
>   returns 0 if not. This has been observed in VMs where the BIOS does
>   not advertise an IO/APIC.  
> 
>   0 is an invalid interrupt number except for the legacy timer interrupt
>   on x86. The return value is unchecked in the core code, so it ends up
>   to allocate interrupt number 0 which is subsequently considered to be
>   invalid by the caller, e.g. the MSI allocation code.
> 
>   The function has already a check for 0 in the case that an IO/APIC is
>   registered, but ioapic_dynirq_base is 0 in case of device tree setups.
> 
>   Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
>   which is used in the case that no IO/APIC is registered.
> 
> And then make the code to look like the below, which makes it very
> clear what this is about.
> 
> Thanks,
> 
>         tglx
> ---
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2477,17 +2477,22 @@ static int io_apic_get_redir_entries(int
>  
>  unsigned int arch_dynirq_lower_bound(unsigned int from)
>  {
> +	unsigned int ret;
> +
>  	/*
>  	 * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
>  	 * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
>  	 */
> -	if (!ioapic_initialized)
> -		return gsi_top;
> +	ret = ioapic_dynirq_base ? : gsi_top;
> +
>  	/*
> -	 * For DT enabled machines ioapic_dynirq_base is irrelevant and not
> -	 * updated. So simply return @from if ioapic_dynirq_base == 0.
> +	 * For DT enabled machines ioapic_dynirq_base is irrelevant and
> +	 * always 0. gsi_top can be 0 if there is no IO/APIC registered.
> +	 *
> +	 * 0 is an invalid interrupt number for dynamic allocations. Return
> +	 * @from instead.
>  	 */
> -	return ioapic_dynirq_base ? : from;
> +	return ret ? : from;
>  }
>  
>  #ifdef CONFIG_X86_32
>

Thanks you for your valuable suggestions. Commit message and code looks
much better now. I will send the V2 with your "Co-Developed-by" tag, I
hope its fine with you.

Regards,
Saurabh
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-03-24 16:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-02 19:34 [PATCH] x86/ioapic: Don't return 0 as valid virq Saurabh Sengar
2023-03-09 17:14 ` Saurabh Singh Sengar
2023-03-12 20:40 ` Borislav Petkov
2023-03-13  3:29   ` [EXTERNAL] " Saurabh Singh Sengar
2023-03-13  3:37     ` Saurabh Singh Sengar
2023-03-13 11:07       ` Borislav Petkov
2023-03-13 11:14     ` [EXTERNAL] " Borislav Petkov
2023-03-14 10:23       ` Saurabh Singh Sengar
2023-03-24  7:09         ` Saurabh Singh Sengar
2023-03-24 15:39         ` [EXTERNAL] " Thomas Gleixner
2023-03-24 16:24           ` Saurabh Singh Sengar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.