From: Andrei Vagin <avagin@virtuozzo.com>
To: "Nambiar, Amritha" <amritha.nambiar@intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: netdev@vger.kernel.org, davem@davemloft.net,
	alexander.h.duyck@intel.com, willemdebruijn.kernel@gmail.com,
	sridhar.samudrala@intel.com, alexander.duyck@gmail.com,
	edumazet@google.com, hannes@stressinduktion.org,
	tom@herbertland.com, tom@quantonium.net, jasowang@redhat.com,
	gaowanlong@cn.fujitsu.com
Subject: Re: [net-next, v6, 6/7] net-sysfs: Add interface for Rx queue(s) map per Tx queue
Date: Wed, 18 Jul 2018 11:22:36 -0700
Message-ID: <20180718182235.GA28548@outlook.office365.com>
In-Reply-To: <4b7f5d42-1b81-f095-f313-f43e41cf8601@intel.com>

On Tue, Jul 10, 2018 at 07:28:49PM -0700, Nambiar, Amritha wrote:
> On 7/4/2018 12:20 AM, Andrei Vagin wrote:
> > Hello Amritha,
> > 
> > I see the following warning on 4.18.0-rc3-next-20180703.
> > It looks like the problem is in this series.
> > 
> > [    1.084722] ============================================
> > [    1.084797] WARNING: possible recursive locking detected
> > [    1.084872] 4.18.0-rc3-next-20180703+ #1 Not tainted
> > [    1.084949] --------------------------------------------
> > [    1.085024] swapper/0/1 is trying to acquire lock:
> > [    1.085100] 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: static_key_slow_inc+0xe/0x20
> > [    1.085189] 
> > [    1.085189] but task is already holding lock:
> > [    1.085271] 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
> > [    1.085357] 
> > [    1.085357] other info that might help us debug this:
> > [    1.085450]  Possible unsafe locking scenario:
> > [    1.085450] 
> > [    1.085531]        CPU0
> > [    1.085605]        ----
> > [    1.085679]   lock(cpu_hotplug_lock.rw_sem);
> > [    1.085753]   lock(cpu_hotplug_lock.rw_sem);
> > [    1.085828] 
> > [    1.085828]  *** DEADLOCK ***
> > [    1.085828] 
> > [    1.085916]  May be due to missing lock nesting notation
> > [    1.085916] 
> > [    1.085998] 3 locks held by swapper/0/1:
> > [    1.086074]  #0: 00000000244bc7da (&dev->mutex){....}, at: __driver_attach+0x5a/0x110
> > [    1.086164]  #1: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
> > [    1.086248]  #2: 000000005cd8463f (xps_map_mutex){+.+.}, at: __netif_set_xps_queue+0x8d/0xc60
> > [    1.086336] 
> > [    1.086336] stack backtrace:
> > [    1.086419] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc3-next-20180703+ #1
> > [    1.086504] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > [    1.086587] Call Trace:
> > [    1.086667]  dump_stack+0x85/0xcb
> > [    1.086744]  __lock_acquire+0x68a/0x1330
> > [    1.086821]  ? lock_acquire+0x9f/0x200
> > [    1.086900]  ? find_held_lock+0x2d/0x90
> > [    1.086976]  ? lock_acquire+0x9f/0x200
> > [    1.087051]  lock_acquire+0x9f/0x200
> > [    1.087126]  ? static_key_slow_inc+0xe/0x20
> > [    1.087205]  cpus_read_lock+0x3e/0x80
> > [    1.087280]  ? static_key_slow_inc+0xe/0x20
> > [    1.087355]  static_key_slow_inc+0xe/0x20
> > [    1.087435]  __netif_set_xps_queue+0x216/0xc60
> > [    1.087512]  virtnet_set_affinity+0xf0/0x130
> > [    1.087589]  init_vqs+0x51b/0x5a0
> > [    1.087665]  virtnet_probe+0x39f/0x870
> > [    1.087742]  virtio_dev_probe+0x170/0x220
> > [    1.087819]  driver_probe_device+0x30b/0x480
> > [    1.087897]  ? set_debug_rodata+0x11/0x11
> > [    1.087972]  __driver_attach+0xe0/0x110
> > [    1.088064]  ? driver_probe_device+0x480/0x480
> > [    1.088141]  bus_for_each_dev+0x79/0xc0
> > [    1.088221]  bus_add_driver+0x164/0x260
> > [    1.088302]  ? veth_init+0x11/0x11
> > [    1.088379]  driver_register+0x5b/0xe0
> > [    1.088402]  ? veth_init+0x11/0x11
> > [    1.088402]  virtio_net_driver_init+0x6d/0x90
> > [    1.088402]  do_one_initcall+0x5d/0x34c
> > [    1.088402]  ? set_debug_rodata+0x11/0x11
> > [    1.088402]  ? rcu_read_lock_sched_held+0x6b/0x80
> > [    1.088402]  kernel_init_freeable+0x1ea/0x27b
> > [    1.088402]  ? rest_init+0xd0/0xd0
> > [    1.088402]  kernel_init+0xa/0x110
> > [    1.088402]  ret_from_fork+0x3a/0x50
> > [    1.094190] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
> > 
> > 
> > https://travis-ci.org/avagin/linux/jobs/399867744
> > 
> 
> With this patch series, I introduced a static_key for XPS maps
> (xps_needed), so static_key_slow_inc() is used to switch branches.
> static_key_slow_inc() takes cpus_read_lock() internally. In the
> virtio_net driver, XPS queues are initialized after setting the
> queue:cpu affinity in virtnet_set_affinity(), which is already called
> with cpus_read_lock held. Hence the warning: the code tries to acquire
> cpus_read_lock while it is already held.
> 
> A quick fix for this would be to move netif_set_xps_queue() out of the
> locked region by wrapping it in a put_online_cpus()/get_online_cpus()
> pair (unlock right before the call and re-take the lock right after).
> But this may not be a clean solution. It would help to get suggestions
> on a clean way to fix this without extensively changing the code in
> virtio_net. Is it mandatory to protect the affinitization with the
> read lock? I don't see a similar lock in other drivers while setting
> the affinity.

> I understand this warning should go away, but isn't it safe to
> have multiple readers?

Peter and Ingo, maybe you could explain why it isn't safe to take one
reader lock twice?
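
For what it's worth, here is my rough understanding of why a nested read
acquire can be a problem, as a toy model in plain C. It assumes a
writer-fair lock, where new readers have to wait once a writer is queued.
The toy_* names and nested_read_blocks() are made up for illustration;
this is not the kernel's cpu_hotplug_lock implementation.

```c
#include <stdbool.h>

/* Toy model of a writer-fair reader/writer lock (illustration only). */
struct toy_rwsem {
	int readers;		/* tasks currently holding the read side */
	bool writer_waiting;	/* a writer is queued behind those readers */
};

/* A new reader gets in only while no writer is queued (fairness). */
static bool toy_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer_waiting)
		return false;
	sem->readers++;
	return true;
}

/* A writer queues up; it can proceed only once readers drops to zero. */
static void toy_write_queue(struct toy_rwsem *sem)
{
	sem->writer_waiting = true;
}

/*
 * One task takes the read lock twice. Returns true if the second
 * acquire would block. If it blocks, the task never releases its first
 * read lock, so the queued writer never runs either: a deadlock.
 */
static bool nested_read_blocks(bool writer_arrives_between)
{
	struct toy_rwsem sem = { 0, false };

	toy_read_trylock(&sem);		/* outer read lock, always succeeds */
	if (writer_arrives_between)
		toy_write_queue(&sem);	/* writer now waits for readers == 0 */
	return !toy_read_trylock(&sem);	/* nested read lock */
}
```

In this model nested_read_blocks(false) is false, so the nested read is
harmless as long as no writer shows up, while nested_read_blocks(true) is
true. If the real lock behaves this way, that would explain why lockdep
complains even though both acquisitions are read-side.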

Thanks,
Andrei

> 
> > On Fri, Jun 29, 2018 at 09:27:07PM -0700, Amritha Nambiar wrote:
> >> Extend transmit queue sysfs attribute to configure Rx queue(s) map
> >> per Tx queue. By default no receive queues are configured for the
> >> Tx queue.
> >>
> >> - /sys/class/net/eth0/queues/tx-*/xps_rxqs
> >>
> >> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> >> ---
> >>  net/core/net-sysfs.c |   83 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 83 insertions(+)
> >>
> >> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> >> index b39987c..f25ac5f 100644
> >> --- a/net/core/net-sysfs.c
> >> +++ b/net/core/net-sysfs.c
> >> @@ -1283,6 +1283,88 @@ static ssize_t xps_cpus_store(struct netdev_queue *queue,
> >>  
> >>  static struct netdev_queue_attribute xps_cpus_attribute __ro_after_init
> >>  	= __ATTR_RW(xps_cpus);
> >> +
> >> +static ssize_t xps_rxqs_show(struct netdev_queue *queue, char *buf)
> >> +{
> >> +	struct net_device *dev = queue->dev;
> >> +	struct xps_dev_maps *dev_maps;
> >> +	unsigned long *mask, index;
> >> +	int j, len, num_tc = 1, tc = 0;
> >> +
> >> +	index = get_netdev_queue_index(queue);
> >> +
> >> +	if (dev->num_tc) {
> >> +		num_tc = dev->num_tc;
> >> +		tc = netdev_txq_to_tc(dev, index);
> >> +		if (tc < 0)
> >> +			return -EINVAL;
> >> +	}
> >> +	mask = kcalloc(BITS_TO_LONGS(dev->num_rx_queues), sizeof(long),
> >> +		       GFP_KERNEL);
> >> +	if (!mask)
> >> +		return -ENOMEM;
> >> +
> >> +	rcu_read_lock();
> >> +	dev_maps = rcu_dereference(dev->xps_rxqs_map);
> >> +	if (!dev_maps)
> >> +		goto out_no_maps;
> >> +
> >> +	for (j = -1; j = netif_attrmask_next(j, NULL, dev->num_rx_queues),
> >> +	     j < dev->num_rx_queues;) {
> >> +		int i, tci = j * num_tc + tc;
> >> +		struct xps_map *map;
> >> +
> >> +		map = rcu_dereference(dev_maps->attr_map[tci]);
> >> +		if (!map)
> >> +			continue;
> >> +
> >> +		for (i = map->len; i--;) {
> >> +			if (map->queues[i] == index) {
> >> +				set_bit(j, mask);
> >> +				break;
> >> +			}
> >> +		}
> >> +	}
> >> +out_no_maps:
> >> +	rcu_read_unlock();
> >> +
> >> +	len = bitmap_print_to_pagebuf(false, buf, mask, dev->num_rx_queues);
> >> +	kfree(mask);
> >> +
> >> +	return len < PAGE_SIZE ? len : -EINVAL;
> >> +}
> >> +
> >> +static ssize_t xps_rxqs_store(struct netdev_queue *queue, const char *buf,
> >> +			      size_t len)
> >> +{
> >> +	struct net_device *dev = queue->dev;
> >> +	struct net *net = dev_net(dev);
> >> +	unsigned long *mask, index;
> >> +	int err;
> >> +
> >> +	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
> >> +		return -EPERM;
> >> +
> >> +	mask = kcalloc(BITS_TO_LONGS(dev->num_rx_queues), sizeof(long),
> >> +		       GFP_KERNEL);
> >> +	if (!mask)
> >> +		return -ENOMEM;
> >> +
> >> +	index = get_netdev_queue_index(queue);
> >> +
> >> +	err = bitmap_parse(buf, len, mask, dev->num_rx_queues);
> >> +	if (err) {
> >> +		kfree(mask);
> >> +		return err;
> >> +	}
> >> +
> >> +	err = __netif_set_xps_queue(dev, mask, index, true);
> >> +	kfree(mask);
> >> +	return err ? : len;
> >> +}
> >> +
> >> +static struct netdev_queue_attribute xps_rxqs_attribute __ro_after_init
> >> +	= __ATTR_RW(xps_rxqs);
> >>  #endif /* CONFIG_XPS */
> >>  
> >>  static struct attribute *netdev_queue_default_attrs[] __ro_after_init = {
> >> @@ -1290,6 +1372,7 @@ static struct attribute *netdev_queue_default_attrs[] __ro_after_init = {
> >>  	&queue_traffic_class.attr,
> >>  #ifdef CONFIG_XPS
> >>  	&xps_cpus_attribute.attr,
> >> +	&xps_rxqs_attribute.attr,
> >>  	&queue_tx_maxrate.attr,
> >>  #endif
> >>  	NULL
