All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"devel@linuxdriverproject.org" <devel@linuxdriverproject.org>,
	"olaf@aepfle.de" <olaf@aepfle.de>,
	"apw@canonical.com" <apw@canonical.com>,
	"vkuznets@redhat.com" <vkuznets@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"leann.ogasawara@canonical.com" <leann.ogasawara@canonical.com>,
	"marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"KY Srinivasan" <kys@microsoft.com>
Subject: RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Implement Direct Mode for stimer0
Date: Thu, 30 Nov 2017 18:08:42 +0000	[thread overview]
Message-ID: <DM5PR21MB0747E37FBE0FAA402C5C5D61DC380@DM5PR21MB0747.namprd21.prod.outlook.com> (raw)
In-Reply-To: <alpine.DEB.2.20.1711011517370.1942@nanos>

Thomas Gleixner <tglx@linutronix.de> writes:

> On Tue, 31 Oct 2017, mikelley@exchange.microsoft.com wrote:
> > diff --git a/arch/x86/include/uapi/asm/hyperv.h
> > b/arch/x86/include/uapi/asm/hyperv.h
> > index f65d125..408cf3e 100644
> > --- a/arch/x86/include/uapi/asm/hyperv.h
> > +++ b/arch/x86/include/uapi/asm/hyperv.h
> > @@ -112,6 +112,22 @@
> >  #define HV_X64_GUEST_IDLE_STATE_AVAILABLE		(1 << 5)
> >  /* Guest crash data handler available */
> >  #define HV_X64_GUEST_CRASH_MSR_AVAILABLE		(1 << 10)
> > +/* Debug MSRs available */
> > +#define HV_X64_DEBUG_MSR_AVAILABLE			(1 << 11)
> > +/* Support for Non-Privileged Instruction Execution Prevention is available */
> > +#define HV_X64_NPIEP_AVAILABLE				(1 << 12)
> > +/* Support for DisableHypervisor is available */
> > +#define HV_X64_DISABLE_HYPERVISOR_AVAILABLE		(1 << 13)
> > +/* Extended GVA Ranges for Flush Virtual Address list is available */
> > +#define HV_X64_EXTENDED_GVA_RANGE_AVAILABLE		(1 << 14)
> > +/* Return Hypercall output via XMM registers is available */
> > +#define HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE		(1 << 15)
> > +/* SINT polling mode available */
> > +#define HV_X64_SINT_POLLING_MODE_AVAILABLE		(1 << 17)
> > +/* Hypercall MSR lock is available */
> > +#define HV_X64_HYPERCALL_MSR_LOCK_AVAILABLE		(1 << 18)
> > +/* stimer direct mode is available */
> > +#define HV_X64_STIMER_DIRECT_MODE_AVAILABLE		(1 << 19)
> 
> How are all these defines (except the last one related to that patch?

Will move to a separate patch.

> 
> > +/* Hardware IRQ number to use for stimer0 in Direct Mode.  This IRQ
> > +is a fake
> > + * because stimer's in Direct Mode simply interrupt on the specified
> > +vector,
> > + * without using a particular IOAPIC pin. But we use the IRQ
> > +allocation
> > + * machinery, so we need a hardware IRQ #.  This value is somewhat
> > +arbitrary,
> > + * but it should not be a legacy IRQ (0 to 15), and should fit within
> > +the
> > + * single IOAPIC (0 to 23) that Hyper-V provides to a guest VM. So
> > +any value
> > + * between 16 and 23 should be good.
> > + */
> > +#define HV_STIMER0_IRQNR		18
> 
> Why would you want abuse an IOAPIC interrupt if all you need is a vector?

Allocating a vector up-front like the existing HYPERVISOR_CALLBACK_VECTOR would certainly be more straightforward, and in fact, my original internal testing of stimer Direct Mode used that approach.   Vectors are a limited resource, so I wanted to find a way to allocate one on-the-fly rather than use fixed pre-allocation, even under CONFIG_HYPERV.   But I've got no problem with allocating a vector up-front and skipping all the irq/APIC manipulation and related issues.

Any guidance on which vector to use?

> 
> > +/* Routines to do per-architecture handling of stimer0 when in Direct
> > +Mode */
> > +
> > +void hv_ack_stimer0_interrupt(struct irq_desc *desc) {
> > +	ack_APIC_irq();
> > +}
> > +
> > +static void allonline_vector_allocation_domain(int cpu, struct cpumask *retmask,
> > +				const struct cpumask *mask)
> > +{
> > +	cpumask_copy(retmask, cpu_online_mask); }
> > +
> > +int hv_allocate_stimer0_irq(int *irq, int *vector) {
> > +	int		localirq;
> > +	int		result;
> > +	struct irq_data *irq_data;
> > +
> > +	/* The normal APIC vector allocation domain allows allocation of
> > +vectors
> 
> Please fix you comment style. Multi line comments are:
> 
>        /*
>         * Bla....
> 	* foo...
> 	*/
> 

Will do.

> > +	 * only for the calling CPU.  So we change the allocation domain to one
> > +	 * that allows vectors to be allocated in all online CPUs.  This
> > +	 * change is fine in a Hyper-V VM because VMs don't have the usual
> > +	 * complement of interrupting devices.
> > +	 */
> > +	apic->vector_allocation_domain = allonline_vector_allocation_domain;
> 
> This does not work anymore. vector_allocation_domain is gone as of 4.15. Please check the
> vector rework in
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip x86/apic
> 
> Aside of that what guarantees that all cpus are online at the point where you allocate that
> interrupt? Nothing, so the vector will be not reserved or allocated on offline CPUs. Now guess
> what happens if you bring the offline CPUs online later, it will drown in spurious interrupts.
> Worse, the vector can also be reused for a device interrupt. Great plan.
> 
> > +	localirq = acpi_register_gsi(NULL, HV_STIMER0_IRQNR,
> > +				ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
> > +	if (localirq < 0) {
> > +		pr_err("Cannot register stimer0 gsi. Error %d", localirq);
> > +		return -1;
> > +	}
> > +
> > +	/* We pass in a dummy IRQ handler because architecture independent code
> > +	 * will later override the IRQ domain interrupt handler and set it to a
> > +	 * Hyper-V specific handler.
> > +	 */
> > +	result = request_irq(localirq, (irq_handler_t)(-1), 0,
> 					"Hyper-V stimer0", NULL);
> 
> That's a crude hack. Really.
> 
> > +	if (result) {
> > +		pr_err("Cannot request stimer0 irq. Error %d", result);
> > +		acpi_unregister_gsi(localirq);
> > +		return -1;
> > +	}
> > +	irq_data = irq_domain_get_irq_data(x86_vector_domain, localirq);
> > +	*vector = irqd_cfg(irq_data)->vector;
> > +	*irq = localirq;
> 
> Uurgh, no. This is even more of a layering violation. Grab random data from wherever it
> comes and then expect that it works. This will simply fall apart the moment someone changes
> the affinity of this interrupt. It will move to some random other vector and the system drowns
> in spurious interrupts on the old vector.
> 
> > +/* ISR for when stimer0 is operating in Direct Mode.  Direct Mode
> > +does
> > + * not use VMBus or any VMBus messages, so process here and not in
> > +the
> > + * VMBus driver code.
> > + */
> > +
> > +static void hv_stimer0_isr(struct irq_desc *desc) {
> > +	struct hv_per_cpu_context *hv_cpu;
> > +
> > +	__this_cpu_inc(*desc->kstat_irqs);
> > +	__this_cpu_inc(kstat.irqs_sum);
> > +	hv_ack_stimer0_interrupt(desc);
> > +	hv_cpu = this_cpu_ptr(hv_context.cpu_context);
> > +	hv_cpu->clk_evt->event_handler(hv_cpu->clk_evt);
> > +	add_interrupt_randomness(desc->irq_data.irq, 0); }
> > +
> >  static int hv_ce_set_next_event(unsigned long delta,
> >  				struct clock_event_device *evt)
> >  {
> > @@ -108,6 +149,8 @@ static int hv_ce_shutdown(struct
> > clock_event_device *evt)  {
> >  	hv_init_timer(HV_X64_MSR_STIMER0_COUNT, 0);
> >  	hv_init_timer_config(HV_X64_MSR_STIMER0_CONFIG, 0);
> > +	if (stimer_direct_mode)
> > +		hv_disable_stimer0_percpu_irq(stimer0_irq);
> 
> Whats the point of that? It's an empty inline:
> 
> > +#if IS_ENABLED(CONFIG_HYPERV)
> > +static inline void hv_enable_stimer0_percpu_irq(int irq) { } static
> > +inline void hv_disable_stimer0_percpu_irq(int irq) { }

We've got code in the works for Hyper-V on ARM64.  This is architecture independent code that caters to the ARM64 side where stimer Direct Mode will use percpu IRQs, and hooks are needed to call enable_percpu_irq() and disable_percpu_irq().

> 
> > +	if (stimer_direct_mode) {
> > +
> > +		/* When it expires, the timer will directly interrupt
> > +		 * on the specific hardware vector.
> > +		 */
> > +		timer_cfg.direct_mode = 1;
> > +		timer_cfg.apic_vector = stimer0_vector;
> > +		hv_enable_stimer0_percpu_irq(stimer0_irq);
> 
> Ditto.
> 
> > +	if (stimer_direct_mode) {
> > +		if (hv_allocate_stimer0_irq(&stimer0_irq, &stimer0_vector))
> > +			goto err;
> > +		irq_set_handler(stimer0_irq, hv_stimer0_isr);
> 
> What you really want to do here is to allocate a fixed vector like we do for the other percpu
> interrupts (local apic timer, IPIs etc.) and use that. This must be done at boot time, when you
> detect that the kernel runs on HyperV. This makes sure, that all CPUs will have the vector
> populated, even those which come online late.

Will submit a v2 with that approach.

> 
> Though you can't do that from a module. You have to do that setup during early boot from
> ms_hyperv_init_platform(). That's the only sane way to deal with that.
> 
> Thanks,
> 
> 	tglx

  parent reply	other threads:[~2017-11-30 18:08 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-31 22:18 [PATCH char-misc 1/1] Drivers: hv: vmbus: Implement Direct Mode for stimer0 mikelley
2017-11-01 12:00 ` Vitaly Kuznetsov
2017-11-01 13:41   ` Vitaly Kuznetsov
2017-11-30 18:08     ` Michael Kelley (EOSG)
2017-11-01 14:51 ` Thomas Gleixner
2017-11-01 16:19   ` Thomas Gleixner
2017-11-01 16:21     ` Michael Kelley (EOSG)
2017-11-30 18:08   ` Michael Kelley (EOSG) [this message]
2017-11-03 16:01 ` kbuild test robot
2017-11-03 22:55 ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM5PR21MB0747E37FBE0FAA402C5C5D61DC380@DM5PR21MB0747.namprd21.prod.outlook.com \
    --to=michael.h.kelley@microsoft.com \
    --cc=apw@canonical.com \
    --cc=devel@linuxdriverproject.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jasowang@redhat.com \
    --cc=kys@microsoft.com \
    --cc=leann.ogasawara@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marcelo.cerri@canonical.com \
    --cc=olaf@aepfle.de \
    --cc=sthemmin@microsoft.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.