From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Michael Kelley (EOSG)"
To: Thomas Gleixner
CC: "gregkh@linuxfoundation.org", "linux-kernel@vger.kernel.org",
	"devel@linuxdriverproject.org", "olaf@aepfle.de", "apw@canonical.com",
	"vkuznets@redhat.com", "jasowang@redhat.com",
	"leann.ogasawara@canonical.com", "marcelo.cerri@canonical.com",
	Stephen Hemminger, "KY Srinivasan"
Subject: RE: [PATCH char-misc 1/1] Drivers: hv: vmbus: Implement Direct Mode for stimer0
Date: Thu, 30 Nov 2017 18:08:42 +0000
References: <20171031221849.12117-1-mikelley@exchange.microsoft.com>
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit

Thomas Gleixner writes:

> On Tue, 31 Oct 2017, mikelley@exchange.microsoft.com wrote:
> > diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
> > index f65d125..408cf3e 100644
> > --- a/arch/x86/include/uapi/asm/hyperv.h
> > +++ b/arch/x86/include/uapi/asm/hyperv.h
> > @@ -112,6 +112,22 @@
> >  #define HV_X64_GUEST_IDLE_STATE_AVAILABLE	(1 << 5)
> >  /* Guest
> >  crash data handler available */
> >  #define HV_X64_GUEST_CRASH_MSR_AVAILABLE	(1 << 10)
> > +/* Debug MSRs available */
> > +#define HV_X64_DEBUG_MSR_AVAILABLE		(1 << 11)
> > +/* Support for Non-Privileged Instruction Execution Prevention is available */
> > +#define HV_X64_NPIEP_AVAILABLE			(1 << 12)
> > +/* Support for DisableHypervisor is available */
> > +#define HV_X64_DISABLE_HYPERVISOR_AVAILABLE	(1 << 13)
> > +/* Extended GVA Ranges for Flush Virtual Address list is available */
> > +#define HV_X64_EXTENDED_GVA_RANGE_AVAILABLE	(1 << 14)
> > +/* Return Hypercall output via XMM registers is available */
> > +#define HV_X64_HYPERCALL_XMM_OUTPUT_AVAILABLE	(1 << 15)
> > +/* SINT polling mode available */
> > +#define HV_X64_SINT_POLLING_MODE_AVAILABLE	(1 << 17)
> > +/* Hypercall MSR lock is available */
> > +#define HV_X64_HYPERCALL_MSR_LOCK_AVAILABLE	(1 << 18)
> > +/* stimer direct mode is available */
> > +#define HV_X64_STIMER_DIRECT_MODE_AVAILABLE	(1 << 19)
>
> How are all these defines (except the last one) related to that patch?

Will move to a separate patch.

> > +/* Hardware IRQ number to use for stimer0 in Direct Mode. This IRQ is a fake
> > + * because stimers in Direct Mode simply interrupt on the specified vector,
> > + * without using a particular IOAPIC pin. But we use the IRQ allocation
> > + * machinery, so we need a hardware IRQ #. This value is somewhat arbitrary,
> > + * but it should not be a legacy IRQ (0 to 15), and should fit within the
> > + * single IOAPIC (0 to 23) that Hyper-V provides to a guest VM. So any value
> > + * between 16 and 23 should be good.
> > + */
> > +#define HV_STIMER0_IRQNR 18
>
> Why would you want to abuse an IOAPIC interrupt if all you need is a vector?

Allocating a vector up-front like the existing HYPERVISOR_CALLBACK_VECTOR
would certainly be more straightforward, and in fact, my original internal
testing of stimer Direct Mode used that approach.
Vectors are a limited resource, so I wanted to find a way to allocate one
on-the-fly rather than use fixed pre-allocation, even under CONFIG_HYPERV.
But I've got no problem with allocating a vector up-front and skipping all
the irq/APIC manipulation and related issues. Any guidance on which vector
to use?

> > +/* Routines to do per-architecture handling of stimer0 when in Direct Mode */
> > +
> > +void hv_ack_stimer0_interrupt(struct irq_desc *desc)
> > +{
> > +	ack_APIC_irq();
> > +}
> > +
> > +static void allonline_vector_allocation_domain(int cpu, struct cpumask *retmask,
> > +			const struct cpumask *mask)
> > +{
> > +	cpumask_copy(retmask, cpu_online_mask);
> > +}
> > +
> > +int hv_allocate_stimer0_irq(int *irq, int *vector)
> > +{
> > +	int localirq;
> > +	int result;
> > +	struct irq_data *irq_data;
> > +
> > +	/* The normal APIC vector allocation domain allows allocation of vectors
>
> Please fix your comment style. Multi line comments are:
>
>	/*
>	 * Bla....
>	 * foo...
>	 */

Will do.

> > +	 * only for the calling CPU. So we change the allocation domain to one
> > +	 * that allows vectors to be allocated on all online CPUs. This
> > +	 * change is fine in a Hyper-V VM because VMs don't have the usual
> > +	 * complement of interrupting devices.
> > +	 */
> > +	apic->vector_allocation_domain = allonline_vector_allocation_domain;
>
> This does not work anymore. vector_allocation_domain is gone as of 4.15.
> Please check the vector rework in
>
>	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip x86/apic
>
> Aside of that, what guarantees that all CPUs are online at the point where
> you allocate that interrupt? Nothing, so the vector will not be reserved
> or allocated on offline CPUs. Now guess what happens if you bring the
> offline CPUs online later: they will drown in spurious interrupts. Worse,
> the vector can also be reused for a device interrupt. Great plan.
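
To make the v2 direction concrete, here is roughly what I have in mind,
modeled on how HYPERVISOR_CALLBACK_VECTOR is wired up. This is a sketch
only, untested; the vector value and the stub name
hv_stimer0_callback_vector are placeholders, not final:

```c
/*
 * Sketch, untested. In arch/x86/include/asm/irq_vectors.h: reserve a
 * fixed system vector at build time, alongside
 * HYPERVISOR_CALLBACK_VECTOR. 0xed is a placeholder value.
 */
#define HYPERV_STIMER0_VECTOR		0xed

/*
 * In ms_hyperv_init_platform() (arch/x86/kernel/cpu/mshyperv.c):
 * populate the IDT gate during early boot, so every CPU, including
 * those that come online late, sees the vector.
 * hv_stimer0_callback_vector would be an asm entry stub that calls
 * into the stimer0 ISR.
 */
alloc_intr_gate(HYPERV_STIMER0_VECTOR, hv_stimer0_callback_vector);
```

Since the vector is fixed and identical on all CPUs, that would avoid the
allocation-domain and affinity problems entirely.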
>
> > +	localirq = acpi_register_gsi(NULL, HV_STIMER0_IRQNR,
> > +			ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
> > +	if (localirq < 0) {
> > +		pr_err("Cannot register stimer0 gsi. Error %d", localirq);
> > +		return -1;
> > +	}
> > +
> > +	/* We pass in a dummy IRQ handler because architecture independent code
> > +	 * will later override the IRQ domain interrupt handler and set it to a
> > +	 * Hyper-V specific handler.
> > +	 */
> > +	result = request_irq(localirq, (irq_handler_t)(-1), 0,
> > +			"Hyper-V stimer0", NULL);
>
> That's a crude hack. Really.
>
> > +	if (result) {
> > +		pr_err("Cannot request stimer0 irq. Error %d", result);
> > +		acpi_unregister_gsi(localirq);
> > +		return -1;
> > +	}
> > +	irq_data = irq_domain_get_irq_data(x86_vector_domain, localirq);
> > +	*vector = irqd_cfg(irq_data)->vector;
> > +	*irq = localirq;
>
> Uurgh, no. This is even more of a layering violation. Grab random data
> from wherever it comes and then expect that it works. This will simply
> fall apart the moment someone changes the affinity of this interrupt. It
> will move to some random other vector and the system drowns in spurious
> interrupts on the old vector.
>
> > +/* ISR for when stimer0 is operating in Direct Mode. Direct Mode does
> > + * not use VMBus or any VMBus messages, so process here and not in the
> > + * VMBus driver code.
> > + */
> > +
> > +static void hv_stimer0_isr(struct irq_desc *desc)
> > +{
> > +	struct hv_per_cpu_context *hv_cpu;
> > +
> > +	__this_cpu_inc(*desc->kstat_irqs);
> > +	__this_cpu_inc(kstat.irqs_sum);
> > +	hv_ack_stimer0_interrupt(desc);
> > +	hv_cpu = this_cpu_ptr(hv_context.cpu_context);
> > +	hv_cpu->clk_evt->event_handler(hv_cpu->clk_evt);
> > +	add_interrupt_randomness(desc->irq_data.irq, 0);
> > +}
> > +
> >  static int hv_ce_set_next_event(unsigned long delta,
> >  			struct clock_event_device *evt)
> >  {
> > @@ -108,6 +149,8 @@ static int hv_ce_shutdown(struct clock_event_device *evt)
> >  {
> >  	hv_init_timer(HV_X64_MSR_STIMER0_COUNT, 0);
> >  	hv_init_timer_config(HV_X64_MSR_STIMER0_CONFIG, 0);
> > +	if (stimer_direct_mode)
> > +		hv_disable_stimer0_percpu_irq(stimer0_irq);
>
> What's the point of that? It's an empty inline:
>
> > +#if IS_ENABLED(CONFIG_HYPERV)
> > +static inline void hv_enable_stimer0_percpu_irq(int irq) { }
> > +static inline void hv_disable_stimer0_percpu_irq(int irq) { }

We've got code in the works for Hyper-V on ARM64. This is architecture
independent code that caters to the ARM64 side, where stimer Direct Mode
will use percpu IRQs, and hooks are needed to call enable_percpu_irq()
and disable_percpu_irq().

> > +	if (stimer_direct_mode) {
> > +
> > +		/* When it expires, the timer will directly interrupt
> > +		 * on the specific hardware vector.
> > +		 */
> > +		timer_cfg.direct_mode = 1;
> > +		timer_cfg.apic_vector = stimer0_vector;
> > +		hv_enable_stimer0_percpu_irq(stimer0_irq);
>
> Ditto.
>
> > +	if (stimer_direct_mode) {
> > +		if (hv_allocate_stimer0_irq(&stimer0_irq, &stimer0_vector))
> > +			goto err;
> > +		irq_set_handler(stimer0_irq, hv_stimer0_isr);
>
> What you really want to do here is to allocate a fixed vector like we do
> for the other percpu interrupts (local apic timer, IPIs etc.) and use
> that. This must be done at boot time, when you detect that the kernel
> runs on HyperV.
> This makes sure that all CPUs will have the vector populated, even those
> which come online late.

Will submit a v2 with that approach.

> Though you can't do that from a module. You have to do that setup during
> early boot from ms_hyperv_init_platform(). That's the only sane way to
> deal with that.
>
> Thanks,
>
>	tglx