All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ben Serebrin <serebrin@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	willemb@google.com, venkateshs@google.com, jonolson@google.com,
	willemdebruijn.kernel@gmail.com, rick.jones2@hpe.com,
	jmattson@google.com, linux-s390 <linux-s390@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [PATCH v2 net-next] virtio: Fix affinity for #VCPUs != #queue pairs
Date: Wed, 8 Feb 2017 21:37:41 +0200	[thread overview]
Message-ID: <20170208205353-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20170207181506.36668-1-serebrin@google.com>

On Tue, Feb 07, 2017 at 10:15:06AM -0800, Ben Serebrin wrote:
> From: Benjamin Serebrin <serebrin@google.com>
> 
> If the number of virtio queue pairs is not equal to the
> number of VCPUs, the virtio guest driver doesn't assign
> any CPU affinity for the queue interrupts or the xps
> aggregation interrupt.  (In contrast, the driver does assign
> both if the counts of VCPUs and queues are equal, which is a good
> default behavior.)
> 
> Google Compute Engine currently provides 1 queue pair for
> every VCPU, but limits that at a maximum of 32 queue pairs.
> 
> This code extends the driver's default interrupt affinity
> and transmit affinity settings for the case where there
> are mismatching queue and VCPU counts.  Userspace affinity
> adjustment may always be needed to tune for a given workload.

IIRC irqbalance will bail out and avoid touching affinity
if you set affinity from driver.  Breaking that's not nice.
Pls correct me if I'm wrong.

Generally, I wonder - we aren't the only device with a limited number of
queues.

> Tested:
> 
> (on a 64-VCPU VM with debian 8, jessie-backports 4.9.2)
> 
> Without the fix we see all queues affinitized to all CPUs:
> 
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
> 0-63
> [...]
> 0-63
> 
> and we see all TX queues' xps_cpus affinitzed to no cores:
> 
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
> 00000000,00000000
> [...]
> 00000000,00000000
> 
> With the fix, we see each queue assigned to the a single core,
> and xps affinity set to 1 unique cpu per TX queue.
> 
> 64 VCPU:
> 
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
> 
> 0-63
> 0
> 0
> 1
> 1
> 2
> 2
> 3
> 3
> 4
> 4
> 5
> 5
> 6
> 6
> 7
> 7
> 8
> 8
> 9
> 9
> 10
> 10
> 11
> 11
> 12
> 12
> 13
> 13
> 14
> 14
> 15
> 15
> 16
> 16
> 17
> 17
> 18
> 18
> 19
> 19
> 20
> 20
> 21
> 21
> 22
> 22
> 23
> 23
> 24
> 24
> 25
> 25
> 26
> 26
> 27
> 27
> 28
> 28
> 29
> 29
> 30
> 30
> 31
> 31
> 0-63
> 0-63
> 0-63
> 0-63
> 
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus;  done
> 
> 00000001,00000001
> 00000002,00000002
> 00000004,00000004
> 00000008,00000008
> 00000010,00000010
> 00000020,00000020
> 00000040,00000040
> 00000080,00000080
> 00000100,00000100
> 00000200,00000200
> 00000400,00000400
> 00000800,00000800
> 00001000,00001000
> 00002000,00002000
> 00004000,00004000
> 00008000,00008000
> 00010000,00010000
> 00020000,00020000
> 00040000,00040000
> 00080000,00080000
> 00100000,00100000
> 00200000,00200000
> 00400000,00400000
> 00800000,00800000
> 01000000,01000000
> 02000000,02000000
> 04000000,04000000
> 08000000,08000000
> 10000000,10000000
> 20000000,20000000
> 40000000,40000000
> 80000000,80000000
> 
> 48 VCPU:
> 
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
> 0-47
> 0
> 0
> 1
> 1
> 2
> 2
> 3
> 3
> 4
> 4
> 5
> 5
> 6
> 6
> 7
> 7
> 8
> 8
> 9
> 9
> 10
> 10
> 11
> 11
> 12
> 12
> 13
> 13
> 14
> 14
> 15
> 15
> 16
> 16
> 17
> 17
> 18
> 18
> 19
> 19
> 20
> 20
> 21
> 21
> 22
> 22
> 23
> 23
> 24
> 24
> 25
> 25
> 26
> 26
> 27
> 27
> 28
> 28
> 29
> 29
> 30
> 30
> 31
> 31
> 0-47
> 0-47
> 0-47
> 0-47
> 
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus;  done
> 
> 0001,00000001
> 0002,00000002
> 0004,00000004
> 0008,00000008
> 0010,00000010
> 0020,00000020
> 0040,00000040
> 0080,00000080
> 0100,00000100
> 0200,00000200
> 0400,00000400
> 0800,00000800
> 1000,00001000
> 2000,00002000
> 4000,00004000
> 8000,00008000
> 0000,00010000
> 0000,00020000
> 0000,00040000
> 0000,00080000
> 0000,00100000
> 0000,00200000
> 0000,00400000
> 0000,00800000
> 0000,01000000
> 0000,02000000
> 0000,04000000
> 0000,08000000
> 0000,10000000
> 0000,20000000
> 0000,40000000
> 0000,80000000
> 
> Acked-by: Willem de Bruijn <willemb@google.com>
> Acked-by: Jim Mattson <jmattson@google.com>
> Acked-by: Venkatesh Srinivas <venkateshs@google.com>
> 
> Signed-off-by: Ben Serebrin <serebrin@google.com>


What happens if you have more than 1 virtio net device?

> ---
>  drivers/net/virtio_net.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 765c2d6358da..0dc3a102bfc4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1502,20 +1502,44 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
>  	 * queue pairs, we let the queue pairs to be private to one cpu by
>  	 * setting the affinity hint to eliminate the contention.
>  	 */
> -	if (vi->curr_queue_pairs == 1 ||
> -	    vi->max_queue_pairs != num_online_cpus()) {
> +	if (vi->curr_queue_pairs == 1) {
>  		virtnet_clean_affinity(vi, -1);
>  		return;
>  	}
>  
> +	/* If there are more cpus than queues, then assign the queues'
> +	 * interrupts to the first cpus until we run out.
> +	 */
>  	i = 0;
>  	for_each_online_cpu(cpu) {
> +		if (i == vi->max_queue_pairs)
> +			break;
>  		virtqueue_set_affinity(vi->rq[i].vq, cpu);
>  		virtqueue_set_affinity(vi->sq[i].vq, cpu);
> -		netif_set_xps_queue(vi->dev, cpumask_of(cpu), i);
>  		i++;
>  	}
>  
> +	/* Stripe the XPS affinities across the online CPUs.
> +	 * Hyperthread pairs are typically assigned such that Linux's
> +	 * CPU X and X + (numcpus / 2) are hyperthread twins, so we cause
> +	 * hyperthread twins to share TX queues, in the case where there are
> +	 * more cpus than queues.

Couldn't you add some kind of API so that we don't need to make
assumptions like this? E.g. "give me a new CPU core to use for an
interrupt"?
Would address multiple device thing too.


> +	 */
> +	for (i = 0; i < vi->max_queue_pairs; i++) {
> +		struct cpumask mask;
> +		int skip = i;
> +
> +		cpumask_clear(&mask);
> +		for_each_online_cpu(cpu) {
> +			while (skip--)
> +				cpu = cpumask_next(cpu, cpu_online_mask);
> +			if (cpu < num_possible_cpus())
> +				cpumask_set_cpu(cpu, &mask);
> +			skip = vi->max_queue_pairs - 1;
> +		}
> +		netif_set_xps_queue(vi->dev, &mask, i);
> +	}
> +

Doesn't look like this will handle the case of num cpus < num queues well.

>  	vi->affinity_hint_set = true;
>  }
>  

  parent reply	other threads:[~2017-02-08 19:50 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-07 18:15 [PATCH v2 net-next] virtio: Fix affinity for #VCPUs != #queue pairs Ben Serebrin
2017-02-08 18:37 ` David Miller
2017-02-08 19:37 ` Michael S. Tsirkin [this message]
2017-02-14 19:17   ` Benjamin Serebrin
2017-02-14 21:05     ` Michael S. Tsirkin
2017-02-15 16:50       ` Willem de Bruijn
2017-02-15 17:42         ` Michael S. Tsirkin
2017-02-15 18:27           ` Benjamin Serebrin
2017-02-15 19:17             ` Michael S. Tsirkin
2017-02-15 21:38               ` Benjamin Serebrin
2017-02-15 21:49                 ` Michael S. Tsirkin
2017-02-15 22:13                   ` Benjamin Serebrin
2017-02-15 15:23     ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170208205353-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=davem@davemloft.net \
    --cc=jasowang@redhat.com \
    --cc=jmattson@google.com \
    --cc=jonolson@google.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=rick.jones2@hpe.com \
    --cc=serebrin@google.com \
    --cc=venkateshs@google.com \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.