linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>, x86@kernel.org
Cc: Tony Luck <tony.luck@intel.com>, Andi Kleen <ak@linux.intel.com>,
	Stephane Eranian <eranian@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	Ricardo Neri <ricardo.neri@intel.com>,
	iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org,
	Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Subject: Re: [PATCH v6 05/29] x86/apic/vector: Do not allocate vectors for NMIs
Date: Fri, 06 May 2022 23:12:20 +0200	[thread overview]
Message-ID: <87zgjufjrf.ffs@tglx> (raw)
In-Reply-To: <20220506000008.30892-6-ricardo.neri-calderon@linux.intel.com>

On Thu, May 05 2022 at 16:59, Ricardo Neri wrote:
> Vectors are meaningless when allocating IRQs with NMI as the delivery
> mode.

Vectors are not meaningless. NMI has a fixed vector.

The point is that for a fixed vector there is no vector management
required.

Can you spot the difference?

> In such case, skip the reservation of IRQ vectors. Do it in the lowest-
> level functions where the actual IRQ reservation takes place.
>
> Since NMIs target specific CPUs, keep the functionality to find the best
> CPU.

Again. What for?
  
> +	if (apicd->hw_irq_cfg.delivery_mode == APIC_DELIVERY_MODE_NMI) {
> +		cpu = irq_matrix_find_best_cpu(vector_matrix, dest);
> +		apicd->cpu = cpu;
> +		vector = 0;
> +		goto no_vector;
> +	}

Why would a NMI ever end up in this code? There is no vector management
required and this find cpu exercise is pointless.

>  	vector = irq_matrix_alloc(vector_matrix, dest, resvd, &cpu);
>  	trace_vector_alloc(irqd->irq, vector, resvd, vector);
>  	if (vector < 0)
>  		return vector;
>  	apic_update_vector(irqd, vector, cpu);
> +
> +no_vector:
>  	apic_update_irq_cfg(irqd, vector, cpu);
>  
>  	return 0;
> @@ -321,12 +330,22 @@ assign_managed_vector(struct irq_data *irqd, const struct cpumask *dest)
>  	/* set_affinity might call here for nothing */
>  	if (apicd->vector && cpumask_test_cpu(apicd->cpu, vector_searchmask))
>  		return 0;
> +
> +	if (apicd->hw_irq_cfg.delivery_mode == APIC_DELIVERY_MODE_NMI) {
> +		cpu = irq_matrix_find_best_cpu_managed(vector_matrix, dest);
> +		apicd->cpu = cpu;
> +		vector = 0;
> +		goto no_vector;
> +	}

This code can never be reached for a NMI delivery. If so, then it's a
bug.

This all is special purpose for that particular HPET NMI watchdog use
case and we are not exposing this to anything else at all.

So why are you sprinkling this NMI nonsense all over the place? Just
because? There are well defined entry points to all of this where this
can be fenced off.

If at some day the hardware people get their act together and provide a
proper vector space for NMIs then we have to revisit that, but then
there will be a separate NMI vector management too.

What you want is the below which also covers the next patch. Please keep
an eye on the comments I added/modified.

Thanks,

        tglx
---
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -42,6 +42,7 @@ EXPORT_SYMBOL_GPL(x86_vector_domain);
 static DEFINE_RAW_SPINLOCK(vector_lock);
 static cpumask_var_t vector_searchmask;
 static struct irq_chip lapic_controller;
+static struct irq_chip lapic_nmi_controller;
 static struct irq_matrix *vector_matrix;
 #ifdef CONFIG_SMP
 static DEFINE_PER_CPU(struct hlist_head, cleanup_list);
@@ -451,6 +452,10 @@ static int x86_vector_activate(struct ir
 	trace_vector_activate(irqd->irq, apicd->is_managed,
 			      apicd->can_reserve, reserve);
 
+	/* NMI has a fixed vector. No vector management required */
+	if (apicd->hw_irq_cfg.delivery_mode == APIC_DELIVERY_MODE_NMI)
+		return 0;
+
 	raw_spin_lock_irqsave(&vector_lock, flags);
 	if (!apicd->can_reserve && !apicd->is_managed)
 		assign_irq_vector_any_locked(irqd);
@@ -472,6 +477,10 @@ static void vector_free_reserved_and_man
 	trace_vector_teardown(irqd->irq, apicd->is_managed,
 			      apicd->has_reserved);
 
+	/* NMI has a fixed vector. No vector management required */
+	if (apicd->hw_irq_cfg.delivery_mode == APIC_DELIVERY_MODE_NMI)
+		return;
+
 	if (apicd->has_reserved)
 		irq_matrix_remove_reserved(vector_matrix);
 	if (apicd->is_managed)
@@ -568,6 +577,24 @@ static int x86_vector_alloc_irqs(struct
 		irqd->hwirq = virq + i;
 		irqd_set_single_target(irqd);
 
+		if (info->flags & X86_IRQ_ALLOC_AS_NMI) {
+			/*
+			 * NMIs have a fixed vector and need their own
+			 * interrupt chip so nothing can end up in the
+			 * regular local APIC management code except the
+			 * MSI message composing callback.
+			 */
+			irqd->chip = &lapic_nmi_controller;
+			/*
+			 * Don't allow affinity setting attempts for NMIs.
+			 * This cannot work with the regular affinity
+			 * mechanisms and for the intended HPET NMI
+			 * watchdog use case it's not required.
+			 */
+			irqd_set_no_balance(irqd);
+			continue;
+		}
+
 		/*
 		 * Prevent that any of these interrupts is invoked in
 		 * non interrupt context via e.g. generic_handle_irq()
@@ -921,6 +948,11 @@ static struct irq_chip lapic_controller
 	.irq_retrigger		= apic_retrigger_irq,
 };
 
+static struct irq_chip lapic_nmi_controller = {
+	.name			= "APIC-NMI",
+	.irq_compose_msi_msg	= x86_vector_msi_compose_msg,
+};
+
 #ifdef CONFIG_SMP
 
 static void free_moved_vector(struct apic_chip_data *apicd)
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -261,6 +261,11 @@ static inline bool irqd_is_per_cpu(struc
 	return __irqd_to_state(d) & IRQD_PER_CPU;
 }
 
+static inline void irqd_set_no_balance(struct irq_data *d)
+{
+	__irqd_to_state(d) |= IRQD_NO_BALANCING;
+}
+
 static inline bool irqd_can_balance(struct irq_data *d)
 {
 	return !(__irqd_to_state(d) & (IRQD_PER_CPU | IRQD_NO_BALANCING));

  reply	other threads:[~2022-05-06 21:12 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-05 23:59 [PATCH v6 00/29] x86: Implement an HPET-based hardlockup detector Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 01/29] irq/matrix: Expose functions to allocate the best CPU for new vectors Ricardo Neri
2022-05-06 19:48   ` Thomas Gleixner
2022-05-12  0:09     ` Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 02/29] x86/apic: Add irq_cfg::delivery_mode Ricardo Neri
2022-05-06 19:53   ` Thomas Gleixner
2022-05-12  0:26     ` Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 03/29] x86/apic/msi: Set the delivery mode individually for each IRQ Ricardo Neri
2022-05-06 20:05   ` Thomas Gleixner
2022-05-12  0:38     ` Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 04/29] x86/apic: Add the X86_IRQ_ALLOC_AS_NMI irq allocation flag Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 05/29] x86/apic/vector: Do not allocate vectors for NMIs Ricardo Neri
2022-05-06 21:12   ` Thomas Gleixner [this message]
2022-05-13 18:03     ` Ricardo Neri
2022-05-13 20:50       ` Thomas Gleixner
2022-05-13 23:45         ` Ricardo Neri
2022-05-14  8:15           ` Thomas Gleixner
2022-05-05 23:59 ` [PATCH v6 06/29] x86/apic/vector: Implement support for NMI delivery mode Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 07/29] iommu/vt-d: Clear the redirection hint when the destination mode is physical Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 08/29] iommu/vt-d: Rework prepare_irte() to support per-IRQ delivery mode Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 09/29] iommu/vt-d: Set the IRTE delivery mode individually for each IRQ Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 10/29] iommu/vt-d: Implement minor tweaks for NMI irqs Ricardo Neri
2022-05-06 21:23   ` Thomas Gleixner
2022-05-13 18:07     ` Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 11/29] iommu/amd: Expose [set|get]_dev_entry_bit() Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 12/29] iommu/amd: Enable NMIPass when allocating an NMI irq Ricardo Neri
2022-05-06 21:26   ` Thomas Gleixner
2022-05-13 19:01     ` Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 13/29] iommu/amd: Compose MSI messages for NMI irqs in non-IR format Ricardo Neri
2022-05-06 21:31   ` Thomas Gleixner
2022-05-13 19:03     ` Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 14/29] x86/hpet: Expose hpet_writel() in header Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 15/29] x86/hpet: Add helper function hpet_set_comparator_periodic() Ricardo Neri
2022-05-06 21:41   ` Thomas Gleixner
2022-05-06 21:51     ` Thomas Gleixner
2022-05-13 21:29       ` Ricardo Neri
2022-05-13 21:19     ` Ricardo Neri
2022-05-14  8:17       ` Thomas Gleixner
2022-05-17 22:54         ` Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 16/29] x86/hpet: Prepare IRQ assignments to use the X86_ALLOC_AS_NMI flag Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 17/29] x86/hpet: Reserve an HPET channel for the hardlockup detector Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 18/29] watchdog/hardlockup: Define a generic function to detect hardlockups Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 19/29] watchdog/hardlockup: Decouple the hardlockup detector from perf Ricardo Neri
2022-05-05 23:59 ` [PATCH v6 20/29] init/main: Delay initialization of the lockup detector after smp_init() Ricardo Neri
2022-05-10 10:38   ` Nicholas Piggin
2022-05-13 23:16     ` Ricardo Neri
2022-05-20  0:25       ` Nicholas Piggin
2022-05-06  0:00 ` [PATCH v6 21/29] x86/nmi: Add an NMI_WATCHDOG NMI handler category Ricardo Neri
2022-05-09 13:59   ` Thomas Gleixner
2022-05-17 18:41     ` Ricardo Neri
2022-05-06  0:00 ` [PATCH v6 22/29] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector Ricardo Neri
2022-05-09 14:03   ` Thomas Gleixner
2022-05-13 22:16     ` Ricardo Neri
2022-05-14 14:04       ` Thomas Gleixner
2022-05-06  0:00 ` [PATCH v6 23/29] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI Ricardo Neri
2022-05-06  0:00 ` [PATCH v6 24/29] watchdog/hardlockup: Use parse_option_str() to handle "nmi_watchdog" Ricardo Neri
2022-05-10 10:46   ` Nicholas Piggin
2022-05-13 23:17     ` Ricardo Neri
2022-05-06  0:00 ` [PATCH v6 25/29] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter Ricardo Neri
2022-05-06  0:00 ` [PATCH v6 26/29] x86/watchdog: Add a shim hardlockup detector Ricardo Neri
2022-05-06  0:00 ` [PATCH v6 27/29] watchdog: Expose lockup_detector_reconfigure() Ricardo Neri
2022-05-06  0:00 ` [PATCH v6 28/29] x86/tsc: Restart NMI watchdog after refining tsc_khz Ricardo Neri
2022-05-10 11:16   ` Nicholas Piggin
2022-05-10 11:44     ` Thomas Gleixner
2022-05-17 22:53       ` Ricardo Neri
2022-05-17 22:08     ` Ricardo Neri
2022-05-06  0:00 ` [PATCH v6 29/29] x86/tsc: Switch to perf-based hardlockup detector if TSC become unstable Ricardo Neri
2022-05-10 12:14   ` Nicholas Piggin
2022-05-17  3:09     ` Ricardo Neri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zgjufjrf.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=Suravee.Suthikulpanit@amd.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=baolu.lu@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=eranian@google.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=npiggin@gmail.com \
    --cc=ravi.v.shankar@intel.com \
    --cc=ricardo.neri-calderon@linux.intel.com \
    --cc=ricardo.neri@intel.com \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).