From: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"devel@linuxdriverproject.org" <devel@linuxdriverproject.org>,
"olaf@aepfle.de" <olaf@aepfle.de>,
"apw@canonical.com" <apw@canonical.com>,
"vkuznets@redhat.com" <vkuznets@redhat.com>,
"jasowang@redhat.com" <jasowang@redhat.com>,
"leann.ogasawara@canonical.com" <leann.ogasawara@canonical.com>,
"marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com>,
Stephen Hemminger <sthemmin@microsoft.com>,
"KY Srinivasan" <kys@microsoft.com>
Subject: RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Implement Direct Mode for stimer0
Date: Thu, 30 Nov 2017 18:08:42 +0000 [thread overview]
Message-ID: <DM5PR21MB0747E37FBE0FAA402C5C5D61DC380@DM5PR21MB0747.namprd21.prod.outlook.com> (raw)
In-Reply-To: <alpine.DEB.2.20.1711011517370.1942@nanos>
Thomas Gleixner <tglx@linutronix.de> writes:
> On Tue, 31 Oct 2017, mikelley@exchange.microsoft.com wrote:
> > diff --git a/arch/x86/include/uapi/asm/hyperv.h
> > b/arch/x86/include/uapi/asm/hyperv.h
> > index f65d125..408cf3e 100644
> > --- a/arch/x86/include/uapi/asm/hyperv.h
> > +++ b/arch/x86/include/uapi/asm/hyperv.h
> > @@ -112,6 +112,22 @@
> > #define HV_X64_GUEST_IDLE_STATE_AVAILABLE (1 << 5)
> > /* Guest crash data handler available */
> > #define HV_X64_GUEST_CRASH_MSR_AVAILABLE (1 << 10)
> > +/* Debug MSRs available */
> > +#define HV_X64_DEBUG_MSR_AVAILABLE (1 << 11)
> > +/* Support for Non-Privileged Instruction Execution Prevention is available */
> > +#define HV_X64_NPIEP_AVAILABLE (1 << 12)
> > +/* Support for DisableHypervisor is available */
> > +#define HV_X64_DISABLE_HYPERVISOR_AVAILABLE (1 << 13)
> > +/* Extended GVA Ranges for Flush Virtual Address list is available */
> > +#define HV_X64_EXTENDED_GVA_RANGE_AVAILABLE (1 << 14)
> > +/* Return Hypercall output via XMM registers is available */
> > +#define HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE (1 << 15)
> > +/* SINT polling mode available */
> > +#define HV_X64_SINT_POLLING_MODE_AVAILABLE (1 << 17)
> > +/* Hypercall MSR lock is available */
> > +#define HV_X64_HYPERCALL_MSR_LOCK_AVAILABLE (1 << 18)
> > +/* stimer direct mode is available */
> > +#define HV_X64_STIMER_DIRECT_MODE_AVAILABLE (1 << 19)
>
> How are all these defines (except the last one related to that patch?
Will move to a separate patch.
>
> > +/* Hardware IRQ number to use for stimer0 in Direct Mode. This IRQ
> > +is a fake
> > + * because stimer's in Direct Mode simply interrupt on the specified
> > +vector,
> > + * without using a particular IOAPIC pin. But we use the IRQ
> > +allocation
> > + * machinery, so we need a hardware IRQ #. This value is somewhat
> > +arbitrary,
> > + * but it should not be a legacy IRQ (0 to 15), and should fit within
> > +the
> > + * single IOAPIC (0 to 23) that Hyper-V provides to a guest VM. So
> > +any value
> > + * between 16 and 23 should be good.
> > + */
> > +#define HV_STIMER0_IRQNR 18
>
> Why would you want abuse an IOAPIC interrupt if all you need is a vector?
Allocating a vector up-front like the existing HYPERVISOR_CALLBACK_VECTOR would certainly be more straightforward, and in fact, my original internal testing of stimer Direct Mode used that approach. Vectors are a limited resource, so I wanted to find a way to allocate one on-the-fly rather than use fixed pre-allocation, even under CONFIG_HYPERV. But I've got no problem with allocating a vector up-front and skipping all the irq/APIC manipulation and related issues.
Any guidance on which vector to use?
>
> > +/* Routines to do per-architecture handling of stimer0 when in Direct
> > +Mode */
> > +
> > +void hv_ack_stimer0_interrupt(struct irq_desc *desc) {
> > + ack_APIC_irq();
> > +}
> > +
> > +static void allonline_vector_allocation_domain(int cpu, struct cpumask *retmask,
> > + const struct cpumask *mask)
> > +{
> > + cpumask_copy(retmask, cpu_online_mask); }
> > +
> > +int hv_allocate_stimer0_irq(int *irq, int *vector) {
> > + int localirq;
> > + int result;
> > + struct irq_data *irq_data;
> > +
> > + /* The normal APIC vector allocation domain allows allocation of
> > +vectors
>
> Please fix you comment style. Multi line comments are:
>
> /*
> * Bla....
> * foo...
> */
>
Will do.
> > + * only for the calling CPU. So we change the allocation domain to one
> > + * that allows vectors to be allocated in all online CPUs. This
> > + * change is fine in a Hyper-V VM because VMs don't have the usual
> > + * complement of interrupting devices.
> > + */
> > + apic->vector_allocation_domain = allonline_vector_allocation_domain;
>
> This does not work anymore. vector_allocation_domain is gone as of 4.15. Please check the
> vector rework in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip x86/apic
>
> Aside of that what guarantees that all cpus are online at the point where you allocate that
> interrupt? Nothing, so the vector will be not reserved or allocated on offline CPUs. Now guess
> what happens if you bring the offline CPUs online later, it will drown in spurious interrupts.
> Worse, the vector can also be reused for a device interrupt. Great plan.
>
> > + localirq = acpi_register_gsi(NULL, HV_STIMER0_IRQNR,
> > + ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
> > + if (localirq < 0) {
> > + pr_err("Cannot register stimer0 gsi. Error %d", localirq);
> > + return -1;
> > + }
> > +
> > + /* We pass in a dummy IRQ handler because architecture independent code
> > + * will later override the IRQ domain interrupt handler and set it to a
> > + * Hyper-V specific handler.
> > + */
> > + result = request_irq(localirq, (irq_handler_t)(-1), 0,
> "Hyper-V stimer0", NULL);
>
> That's a crude hack. Really.
>
> > + if (result) {
> > + pr_err("Cannot request stimer0 irq. Error %d", result);
> > + acpi_unregister_gsi(localirq);
> > + return -1;
> > + }
> > + irq_data = irq_domain_get_irq_data(x86_vector_domain, localirq);
> > + *vector = irqd_cfg(irq_data)->vector;
> > + *irq = localirq;
>
> Uurgh, no. This is even more of a layering violation. Grab random data from wherever it
> comes and then expect that it works. This will simply fall apart the moment someone changes
> the affinity of this interrupt. It will move to some random other vector and the system drowns
> in spurious interrupts on the old vector.
>
> > +/* ISR for when stimer0 is operating in Direct Mode. Direct Mode
> > +does
> > + * not use VMBus or any VMBus messages, so process here and not in
> > +the
> > + * VMBus driver code.
> > + */
> > +
> > +static void hv_stimer0_isr(struct irq_desc *desc) {
> > + struct hv_per_cpu_context *hv_cpu;
> > +
> > + __this_cpu_inc(*desc->kstat_irqs);
> > + __this_cpu_inc(kstat.irqs_sum);
> > + hv_ack_stimer0_interrupt(desc);
> > + hv_cpu = this_cpu_ptr(hv_context.cpu_context);
> > + hv_cpu->clk_evt->event_handler(hv_cpu->clk_evt);
> > + add_interrupt_randomness(desc->irq_data.irq, 0); }
> > +
> > static int hv_ce_set_next_event(unsigned long delta,
> > struct clock_event_device *evt)
> > {
> > @@ -108,6 +149,8 @@ static int hv_ce_shutdown(struct
> > clock_event_device *evt) {
> > hv_init_timer(HV_X64_MSR_STIMER0_COUNT, 0);
> > hv_init_timer_config(HV_X64_MSR_STIMER0_CONFIG, 0);
> > + if (stimer_direct_mode)
> > + hv_disable_stimer0_percpu_irq(stimer0_irq);
>
> Whats the point of that? It's an empty inline:
>
> > +#if IS_ENABLED(CONFIG_HYPERV)
> > +static inline void hv_enable_stimer0_percpu_irq(int irq) { } static
> > +inline void hv_disable_stimer0_percpu_irq(int irq) { }
We've got code in the works for Hyper-V on ARM64. This is architecture independent code that caters to the ARM64 side where stimer Direct Mode will use percpu IRQs, and hooks are needed to call enable_percpu_irq() and disable_percpu_irq().
>
> > + if (stimer_direct_mode) {
> > +
> > + /* When it expires, the timer will directly interrupt
> > + * on the specific hardware vector.
> > + */
> > + timer_cfg.direct_mode = 1;
> > + timer_cfg.apic_vector = stimer0_vector;
> > + hv_enable_stimer0_percpu_irq(stimer0_irq);
>
> Ditto.
>
> > + if (stimer_direct_mode) {
> > + if (hv_allocate_stimer0_irq(&stimer0_irq, &stimer0_vector))
> > + goto err;
> > + irq_set_handler(stimer0_irq, hv_stimer0_isr);
>
> What you really want to do here is to allocate a fixed vector like we do for the other percpu
> interrupts (local apic timer, IPIs etc.) and use that. This must be done at boot time, when you
> detect that the kernel runs on HyperV. This makes sure, that all CPUs will have the vector
> populated, even those which come online late.
Will submit a v2 with that approach.
>
> Though you can't do that from a module. You have to do that setup during early boot from
> ms_hyperv_init_platform(). That's the only sane way to deal with that.
>
> Thanks,
>
> tglx
next prev parent reply other threads:[~2017-11-30 18:08 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-31 22:18 [PATCH char-misc 1/1] Drivers: hv: vmbus: Implement Direct Mode for stimer0 mikelley
2017-11-01 12:00 ` Vitaly Kuznetsov
2017-11-01 13:41 ` Vitaly Kuznetsov
2017-11-30 18:08 ` Michael Kelley (EOSG)
2017-11-01 14:51 ` Thomas Gleixner
2017-11-01 16:19 ` Thomas Gleixner
2017-11-01 16:21 ` Michael Kelley (EOSG)
2017-11-30 18:08 ` Michael Kelley (EOSG) [this message]
2017-11-03 16:01 ` kbuild test robot
2017-11-03 22:55 ` kbuild test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DM5PR21MB0747E37FBE0FAA402C5C5D61DC380@DM5PR21MB0747.namprd21.prod.outlook.com \
--to=michael.h.kelley@microsoft.com \
--cc=apw@canonical.com \
--cc=devel@linuxdriverproject.org \
--cc=gregkh@linuxfoundation.org \
--cc=jasowang@redhat.com \
--cc=kys@microsoft.com \
--cc=leann.ogasawara@canonical.com \
--cc=linux-kernel@vger.kernel.org \
--cc=marcelo.cerri@canonical.com \
--cc=olaf@aepfle.de \
--cc=sthemmin@microsoft.com \
--cc=tglx@linutronix.de \
--cc=vkuznets@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).