From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751276AbdFBPKM (ORCPT <rfc822;w@1wt.eu>);
        Fri, 2 Jun 2017 11:10:12 -0400
Received: from mx2.suse.de ([195.135.220.15]:54570 "EHLO mx1.suse.de"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1750813AbdFBPKL (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 2 Jun 2017 11:10:11 -0400
Subject: Re: [PATCH] xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next
 online VCPU
To: Anoob Soman <anoob.soman@citrix.com>, xen-devel@lists.xenproject.org,
        linux-kernel@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
References: <1496414988-12878-1-git-send-email-anoob.soman@citrix.com>
From: Juergen Gross <jgross@suse.com>
Message-ID: <a75010da-0e62-5153-2dd2-2069a3c5f54f@suse.com>
Date: Fri, 2 Jun 2017 17:10:08 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <1496414988-12878-1-git-send-email-anoob.soman@citrix.com>
Content-Type: text/plain; charset=utf-8
Content-Language: de-DE
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/06/17 16:49, Anoob Soman wrote:
> An HVM domain booting generates around 200K (evtchn:qemu-dm xen-dyn)
> interrupts in a short period of time. All these evtchn:qemu-dm are bound
> to VCPU 0 until irqbalance sees these IRQs and moves them to a different
> VCPU. In one configuration, irqbalance runs every 10 seconds, which means
> irqbalance doesn't get to see this burst of interrupts and doesn't
> re-balance interrupts most of the time, so all evtchn:qemu-dm interrupts
> are processed by VCPU0. This causes VCPU0 to spend most of its time
> processing hardirqs and very little time on softirqs. Moreover, if dom0
> kernel PREEMPTION is disabled, VCPU0 never runs the watchdog (process
> context), triggering the softlockup detection code to panic.
> 
> Binding evtchn:qemu-dm to the next online VCPU will spread hardirq
> processing evenly across different CPUs. Later, irqbalance will try to
> balance evtchn:qemu-dm, if required.
> 
> Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
> ---
>  drivers/xen/events/events_base.c |  9 +++++++--
>  drivers/xen/evtchn.c             | 36 +++++++++++++++++++++++++++++++++++-
>  include/xen/events.h             |  1 +
>  3 files changed, 43 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
> index b52852f..8224ec1 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -1303,10 +1303,9 @@ void rebind_evtchn_irq(int evtchn, int irq)
>  }
>  
>  /* Rebind an evtchn so that it gets delivered to a specific cpu */
> -static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
> +int xen_rebind_evtchn_to_cpu(int evtchn, unsigned tcpu)
>  {
>  	struct evtchn_bind_vcpu bind_vcpu;
> -	int evtchn = evtchn_from_irq(irq);
>  	int masked;
>  
>  	if (!VALID_EVTCHN(evtchn))
> @@ -1338,6 +1337,12 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
>  
>  	return 0;
>  }
> +EXPORT_SYMBOL_GPL(xen_rebind_evtchn_to_cpu);
> +
> +static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
> +{
> +	return xen_rebind_evtchn_to_cpu(evtchn_from_irq(irq), tcpu);
> +}
>  
>  static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest,
>  			    bool force)
> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
> index 10f1ef5..1192f24 100644
> --- a/drivers/xen/evtchn.c
> +++ b/drivers/xen/evtchn.c
> @@ -58,6 +58,8 @@
>  #include <xen/xen-ops.h>
>  #include <asm/xen/hypervisor.h>
>  
> +static DEFINE_PER_CPU(int, bind_last_selected_cpu);
> +
>  struct per_user_data {
>  	struct mutex bind_mutex; /* serialize bind/unbind operations */
>  	struct rb_root evtchns;
> @@ -421,6 +423,36 @@ static void evtchn_unbind_from_user(struct per_user_data *u,
>  	del_evtchn(u, evtchn);
>  }
>  
> +static void evtchn_bind_interdom_next_vcpu(int evtchn)
> +{
> +	unsigned int selected_cpu, irq;
> +	struct irq_desc *desc = NULL;
> +	unsigned long flags;
> +
> +	irq = irq_from_evtchn(evtchn);
> +	desc = irq_to_desc(irq);
> +
> +	if (!desc)
> +		return;
> +
> +	raw_spin_lock_irqsave(&desc->lock, flags);
> +	selected_cpu = this_cpu_read(bind_last_selected_cpu);
> +	selected_cpu = cpumask_next_and(selected_cpu,
> +			desc->irq_common_data.affinity, cpu_online_mask);
> +
> +	if (unlikely(selected_cpu >= nr_cpu_ids))
> +		selected_cpu = cpumask_first_and(desc->irq_common_data.affinity,
> +				cpu_online_mask);
> +
> +	raw_spin_unlock_irqrestore(&desc->lock, flags);
> +	this_cpu_write(bind_last_selected_cpu, selected_cpu);
> +
> +	local_irq_disable();
> +	/* unmask expects irqs to be disabled */
> +	xen_rebind_evtchn_to_cpu(evtchn, selected_cpu);
> +	local_irq_enable();

I'd prefer to have irqs disabled from taking the lock until here.
This will avoid problems due to preemption and will be faster as it
avoids one irq on/off cycle. So:

local_irq_disable();
raw_spin_lock();
...
raw_spin_unlock();
this_cpu_write();
xen_rebind_evtchn_to_cpu();
local_irq_enable();
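
As an aside on the selection step in the quoted hunk: the
cpumask_next_and()/cpumask_first_and() pair amounts to a round-robin walk
over the CPUs that are both in the IRQ's affinity mask and online. Below is
a minimal userspace sketch of that logic only, assuming all CPUs fit into
one unsigned int bitmask. MODEL_NR_CPUS, model_next_and(),
model_first_and() and model_select_cpu() are hypothetical names made up for
illustration; they are not kernel APIs and do not model locking or irq
state.

```c
#include <assert.h>

/* Hypothetical model: bit n set in a mask means CPU n is in that mask. */
#define MODEL_NR_CPUS 8u

/* First CPU after 'prev' present in both masks, or MODEL_NR_CPUS if none. */
static unsigned int model_next_and(unsigned int prev, unsigned int affinity,
				   unsigned int online)
{
	unsigned int mask = affinity & online;
	unsigned int cpu;

	for (cpu = prev + 1; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;

	return MODEL_NR_CPUS;
}

/* First CPU present in both masks, or MODEL_NR_CPUS if none. */
static unsigned int model_first_and(unsigned int affinity, unsigned int online)
{
	unsigned int mask = affinity & online;
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;

	return MODEL_NR_CPUS;
}

/* Round-robin step: next eligible CPU after 'last', wrapping to the first. */
static unsigned int model_select_cpu(unsigned int last, unsigned int affinity,
				     unsigned int online)
{
	unsigned int cpu;

	assert(last < MODEL_NR_CPUS);

	cpu = model_next_and(last, affinity, online);
	if (cpu >= MODEL_NR_CPUS)
		cpu = model_first_and(affinity, online);

	return cpu;
}
```

With affinity 0xff and CPUs 0, 1 and 3 online (0x0b), successive calls
starting from last = 0 return 1, 3, 0, 1 and so on. If no CPU is in both
masks, the model returns MODEL_NR_CPUS.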


Juergen