ath10k.lists.infradead.org archive mirror
From: Hillf Danton <hdanton@sina.com>
To: David Laight <David.Laight@ACULAB.COM>,
	Andrew Lunn <andrew@lunn.ch>,
	Rakesh Pillai <pillair@codeaurora.org>
Cc: Hillf Danton <hdanton@sina.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"ath10k@lists.infradead.org" <ath10k@lists.infradead.org>,
	"dianders@chromium.org" <dianders@chromium.org>,
	Markus Elfring <Markus.Elfring@web.de>,
	"evgreen@chromium.org" <evgreen@chromium.org>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"johannes@sipsolutions.net" <johannes@sipsolutions.net>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kvalo@codeaurora.org" <kvalo@codeaurora.org>
Subject: Re: [RFC 0/7] Add support to process rx packets in thread
Date: Sat, 25 Jul 2020 16:16:33 +0800	[thread overview]
Message-ID: <20200725081633.7432-1-hdanton@sina.com> (raw)
In-Reply-To: <9fb3d3bd8d944a649cbe828fddca1bc1@AcuMS.aculab.com>


On Wed, 22 Jul 2020 09:12:42 +0000 David Laight wrote:
> > On 21 July 2020 18:25 Andrew Lunn wrote:
> > 
> > On Tue, Jul 21, 2020 at 10:44:19PM +0530, Rakesh Pillai wrote:
> > > NAPI gets scheduled on the CPU core which got the
> > > interrupt. The Linux scheduler cannot move it to a
> > > different core, even if the CPU on which NAPI is running
> > > is heavily loaded. This can lead to degraded wifi
> > > performance when running traffic at peak data rates.
> > >
> > > A thread on the other hand can be moved to different
> > > CPU cores, if the one on which it's running is heavily
> > > loaded. During high incoming data traffic, this gives
> > > better performance, since the thread can be moved to a
> > > less loaded or sometimes even a more powerful CPU core
> > > to account for the required CPU performance in order
> > > to process the incoming packets.
> > >
> > > This patch series adds the support to use a high priority
> > > thread to process the incoming packets, as opposed to
> > > everything being done in NAPI context.
> > 
> > I don't see why this problem is limited to the ath10k driver. I expect
> > it applies to all drivers using NAPI. So shouldn't you be solving this
> > in the NAPI core? Allow a driver to request the NAPI core uses a
> > thread?
> 
> It's not just NAPI; the problem is with the softint processing.
> I suspect a lot of systems would work better if it ran as
> a (highish priority) kernel thread.

Hi folks

Below is a minimum PoC implementation I can imagine, built on top of
workqueues, to make napi threaded. Thoughts are appreciated.

> I've had to remove the main locks from a multi-threaded application
> and replace them with atomic counters.
> Consider what happens when the threads remove items from a shared
> work list.
> The code looks like:
> 	mutex_enter();
> 	remove_item_from_list();
> 	mutex_exit().
> The mutex is only held for a few instructions, so while you'd expect
> the cache line to be 'hot' you wouldn't get real contention.
> However the following scenarios happen:
> 1) An ethernet interrupt happens while the mutex is held.
>    This stops the other threads until all the softint processing
>    has finished.
> 2) An ethernet interrupt (and softint) runs on a thread that is
>    waiting for the mutex.
>    (Or on the cpu that the thread's processor affinity ties it to.)
>    In this case the 'fair' (ticket) mutex code won't let any other
>    thread acquire the mutex.
>    So again everything stops until the softints all complete.
> 
> The second one is also a problem when trying to wake up all
> the threads (eg after adding a lot of items to the list).
> The ticket locks force them to wake in order, but
> sometimes the 'thundering herd' would work better.
> 
> IIRC this is actually worse for processes running under the RT
> scheduler (without CONFIG_PREEMPT) because they are almost
> always scheduled on the same cpu they ran on last.
> If it is busy, but cannot be pre-empted, they are not moved
> to an idle cpu.
>    
> To confound things there is a very broken workaround for broken
> hardware in the driver for the e1000 interface on (at least)
> Ivy Bridge CPUs that can cause the driver to spin for a very
> long time (IIRC milliseconds) whenever it has to write to a
> MAC register (ie on every transmit setup).
> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
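
(A side note on the shared work list above: one way to drop the mutex around
the removal is the kernel's llist API, so that a softirq storm on one CPU can
no longer stall every other thread waiting on the lock. A minimal sketch,
purely illustrative and unrelated to the patch below; the item layout and
names are made up.)

#include <linux/llist.h>

/* made-up item layout, for the sketch only */
struct item {
	struct llist_node node;
	int payload;
};

static LLIST_HEAD(item_list);

/* producer, safe from process, irq or softirq context: one atomic op, no lock */
static void item_add(struct item *it)
{
	llist_add(&it->node, &item_list);
}

/* consumer: detach the whole list with a single atomic xchg, then walk it
 * without holding any lock
 */
static void item_drain(void)
{
	struct llist_node *first = llist_del_all(&item_list);
	struct item *it, *tmp;

	llist_for_each_entry_safe(it, tmp, first, node) {
		/* process it; nothing here can block other CPUs on a mutex */
	}
}

The trade-off is that the consumer takes the whole list in one go rather than
popping one item at a time, which fits a "drain everything, then work" model.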

To make napi threaded, in case both an irq thread and a softirq thread are
ruled out, add napi::work and queue it on a high-priority workqueue. The
workqueue is an unbound one, so the scheduler is free to spread napi load
across idle CPU cores. All that users of threaded napi need to do
is s/netif_napi_add/netif_threaded_napi_add/ and nothing more.
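
For illustration, a driver-side conversion could look like the sketch below.
The foo_* names and the private struct are made up for this example; only the
netif_napi_add() call changes, while the interrupt handler and the poll
callback keep using the usual napi_schedule()/napi_complete_done() calls.

#include <linux/netdevice.h>
#include <linux/interrupt.h>

/* made-up private data, for the sketch only */
struct foo_priv {
	struct net_device *ndev;
	struct napi_struct napi;
};

static int foo_poll(struct napi_struct *napi, int budget)
{
	int done = 0;

	/* ... process up to budget rx packets, counting them in done ... */

	if (done < budget) {
		napi_complete_done(napi, done);
		/* re-enable rx interrupts here */
	}
	return done;
}

static irqreturn_t foo_isr(int irq, void *data)
{
	struct foo_priv *priv = data;

	/* unchanged: with CONFIG_THREADED_NAPI this ends up queueing
	 * napi->work on the unbound highpri workqueue instead of raising
	 * NET_RX_SOFTIRQ
	 */
	if (napi_schedule_prep(&priv->napi))
		__napi_schedule(&priv->napi);

	return IRQ_HANDLED;
}

static void foo_setup_napi(struct foo_priv *priv)
{
	/* the one-line change: netif_napi_add -> netif_threaded_napi_add.
	 * Without CONFIG_THREADED_NAPI this falls back to netif_napi_add().
	 */
	netif_threaded_napi_add(priv->ndev, &priv->napi, foo_poll,
				NAPI_POLL_WEIGHT);
	napi_enable(&priv->napi);
}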

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -338,6 +338,9 @@ struct napi_struct {
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
 	unsigned int		napi_id;
+#ifdef CONFIG_THREADED_NAPI
+	struct work_struct	work;
+#endif
 };
 
 enum {
@@ -2234,6 +2237,19 @@ static inline void *netdev_priv(const st
 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 		    int (*poll)(struct napi_struct *, int), int weight);
 
+#ifdef CONFIG_THREADED_NAPI
+void netif_threaded_napi_add(struct net_device *dev, struct napi_struct *napi,
+		    int (*poll)(struct napi_struct *, int), int weight);
+#else
+static inline void netif_threaded_napi_add(struct net_device *dev,
+					struct napi_struct *napi,
+					int (*poll)(struct napi_struct *, int),
+					int weight)
+{
+	netif_napi_add(dev, napi, poll, weight);
+}
+#endif
+
 /**
  *	netif_tx_napi_add - initialize a NAPI context
  *	@dev:  network device
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6277,6 +6277,61 @@ static int process_backlog(struct napi_s
 	return work;
 }
 
+#ifdef CONFIG_THREADED_NAPI
+/* unbound highpri workqueue for threaded napi */
+static struct workqueue_struct *napi_workq;
+
+static void napi_workfn(struct work_struct *work)
+{
+	struct napi_struct *n = container_of(work, struct napi_struct, work);
+
+	for (;;) {
+		if (!test_bit(NAPI_STATE_SCHED, &n->state))
+			return;
+
+		if (n->poll(n, n->weight) < n->weight)
+			return;
+
+		if (need_resched()) {
+			/*
+			 * yield the CPU: requeue the remaining poll work and
+			 * pay one task-switch latency, as napi is still scheduled
+			 */
+			if (test_bit(NAPI_STATE_SCHED, &n->state))
+				queue_work(napi_workq, work);
+			return;
+		}
+	}
+}
+
+void netif_threaded_napi_add(struct net_device *dev,
+				struct napi_struct *napi,
+				int (*poll)(struct napi_struct *, int),
+				int weight)
+{
+	netif_napi_add(dev, napi, poll, weight);
+	INIT_WORK(&napi->work, napi_workfn);
+}
+
+static inline bool is_threaded_napi(struct napi_struct *n)
+{
+	return n->work.func == napi_workfn;
+}
+
+static inline void threaded_napi_sched(struct napi_struct *n)
+{
+	if (is_threaded_napi(n))
+		queue_work(napi_workq, &n->work);
+	else
+		____napi_schedule(this_cpu_ptr(&softnet_data), n);
+}
+#else
+static inline void threaded_napi_sched(struct napi_struct *n)
+{
+	____napi_schedule(this_cpu_ptr(&softnet_data), n);
+}
+#endif
+
 /**
  * __napi_schedule - schedule for receive
  * @n: entry to schedule
@@ -6289,7 +6344,7 @@ void __napi_schedule(struct napi_struct
 	unsigned long flags;
 
 	local_irq_save(flags);
-	____napi_schedule(this_cpu_ptr(&softnet_data), n);
+	threaded_napi_sched(n);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL(__napi_schedule);
@@ -6335,7 +6390,7 @@ EXPORT_SYMBOL(napi_schedule_prep);
  */
 void __napi_schedule_irqoff(struct napi_struct *n)
 {
-	____napi_schedule(this_cpu_ptr(&softnet_data), n);
+	threaded_napi_sched(n);
 }
 EXPORT_SYMBOL(__napi_schedule_irqoff);
 
@@ -10685,6 +10740,10 @@ static int __init net_dev_init(void)
 		sd->backlog.weight = weight_p;
 	}
 
+#ifdef CONFIG_THREADED_NAPI
+	napi_workq = alloc_workqueue("napi_workq", WQ_UNBOUND | WQ_HIGHPRI,
+					    WQ_UNBOUND_MAX_ACTIVE);
+#endif
 	dev_boot_phase = 0;
 
 	/* The loopback device is special if any other network devices


_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

Thread overview: 51+ messages
2020-07-21 17:14 [RFC 0/7] Add support to process rx packets in thread Rakesh Pillai
2020-07-21 17:14 ` [RFC 1/7] mac80211: Add check for napi handle before WARN_ON Rakesh Pillai
2020-07-22 12:56   ` Johannes Berg
2020-07-23 18:26     ` Rakesh Pillai
2020-07-23 20:06       ` Johannes Berg
2020-07-24  6:21         ` Rakesh Pillai
2020-07-26 16:19         ` Rakesh Pillai
2020-07-30 12:40           ` Johannes Berg
2020-07-21 17:14 ` [RFC 2/7] ath10k: Add support to process rx packet in thread Rakesh Pillai
2020-07-21 21:53   ` Rajkumar Manoharan
2020-07-22 12:27     ` Felix Fietkau
2020-07-22 12:55       ` Johannes Berg
2020-07-22 13:00         ` Felix Fietkau
2020-07-23  6:09           ` Rajkumar Manoharan
2021-03-22 23:57           ` Ben Greear
2021-03-23  1:20             ` Brian Norris
2021-03-23  3:01               ` Ben Greear
2021-03-23  7:45                 ` Felix Fietkau
2021-03-25  9:45                   ` Rakesh Pillai
2021-03-25 10:33                     ` Felix Fietkau
2020-07-23 18:25     ` Rakesh Pillai
2020-07-24 23:11       ` Jacob Keller
2020-07-21 17:14 ` [RFC 3/7] ath10k: Add module param to enable rx thread Rakesh Pillai
2020-07-21 17:14 ` [RFC 4/7] ath10k: Do not exhaust budget on process tx completion Rakesh Pillai
2020-07-21 17:14 ` [RFC 5/7] ath10k: Handle the rx packet processing in thread Rakesh Pillai
2020-07-21 17:14 ` [RFC 6/7] ath10k: Add deliver to stack from thread context Rakesh Pillai
2020-07-21 17:14 ` [RFC 7/7] ath10k: Handle rx thread suspend and resume Rakesh Pillai
2020-07-23 23:06   ` Sebastian Gottschall
2020-07-24  6:19     ` Rakesh Pillai
2020-07-21 17:25 ` [RFC 0/7] Add support to process rx packets in thread Andrew Lunn
2020-07-21 18:05   ` Florian Fainelli
2020-07-23 18:21     ` Rakesh Pillai
2020-07-23 19:02       ` Florian Fainelli
2020-07-24  6:20         ` Rakesh Pillai
2020-07-24 22:28           ` Florian Fainelli
2020-07-22  9:12   ` David Laight
2020-07-25  8:16     ` Hillf Danton [this message]
2020-07-25 10:38       ` Sebastian Gottschall
2020-07-25 12:25         ` Hillf Danton
2020-07-25 14:08         ` Sebastian Gottschall
2020-07-25 14:57           ` Hillf Danton
2020-07-25 15:41             ` Sebastian Gottschall
2020-07-26 11:16               ` David Laight
2020-07-28 16:59                 ` Rakesh Pillai
2020-07-29  1:34                   ` Hillf Danton
2020-07-25 17:57       ` Felix Fietkau
2020-07-26  1:22         ` Hillf Danton
2020-07-26  8:10           ` Felix Fietkau
2020-07-26  8:32             ` Hillf Danton
2020-07-26  8:59               ` Felix Fietkau
2020-07-22 16:20   ` Jakub Kicinski
