From: Hillf Danton
Subject: Re: [RFC 0/7] Add support to process rx packets in thread
Date: Sat, 25 Jul 2020 20:25:01 +0800
Message-Id: <20200725122501.15000-1-hdanton@sina.com>
In-Reply-To: <8359a849-2b8a-c842-a501-c6cb6966e345@dd-wrt.com>
References: <1595351666-28193-1-git-send-email-pillair@codeaurora.org>
 <20200721172514.GT1339445@lunn.ch>
 <20200725081633.7432-1-hdanton@sina.com>
To: Sebastian Gottschall
Cc: Andrew Lunn, Hillf Danton, netdev@vger.kernel.org,
 linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org,
 ath10k@lists.infradead.org, dianders@chromium.org, David Laight,
 Rakesh Pillai, evgreen@chromium.org, kuba@kernel.org, Markus Elfring,
 johannes@sipsolutions.net, Paolo Abeni, davem@davemloft.net,
 kvalo@codeaurora.org

On Sat, 25 Jul 2020 12:38:00 +0200 Sebastian Gottschall wrote:
> you may consider this
>
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1142611.html
>
Thanks very much for your link.

> years ago someone already wanted to bring this feature upstream, but it
> was denied. I already tested this patch over the last two days and it
> worked so far (with some small modifications), so such a solution
> existed already and may be considered here.

I don't see an outstanding difference in principle from Paolo's work in
2016, except for the direct use of kthread_create() and friends; kworker
was already built on top of them before 2016. This is a simpler one, as
the diffstat in his cover letter shows. Paolo, feel free to correct me
if I misread anything.

Finally, I don't see the need to add a sysfs attribute, given
CONFIG_THREADED_NAPI in this work.

BTW, let us know if anyone has plans to pick up the 2016 RFC.

Hillf

Paolo Abeni (2):
  net: implement threaded-able napi poll loop support
  net: add sysfs attribute to control napi threaded mode

 include/linux/netdevice.h |   4 ++
 net/core/dev.c            | 113 ++++++++++++++++++++++++++++++++++++++++++++++
 net/core/net-sysfs.c      | 102 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 219 insertions(+)

>
> Sebastian
>
> On 25.07.2020 10:16, Hillf Danton wrote:
> > On Wed, 22 Jul 2020 09:12:42 +0000 David Laight wrote:
> >>> On 21 July 2020 18:25 Andrew Lunn wrote:
> >>>
> >>> On Tue, Jul 21, 2020 at 10:44:19PM +0530, Rakesh Pillai wrote:
> >>>> NAPI gets scheduled on the CPU core which got the
> >>>> interrupt. The Linux scheduler cannot move it to a
> >>>> different core, even if the CPU on which NAPI is running
> >>>> is heavily loaded. This can lead to degraded wifi
> >>>> performance when running traffic at peak data rates.
> >>>>
> >>>> A thread, on the other hand, can be moved to different
> >>>> CPU cores if the one on which it's running is heavily
> >>>> loaded. During high incoming data traffic this gives
> >>>> better performance, since the thread can be moved to a
> >>>> less loaded, or sometimes even a more powerful, CPU core
> >>>> to account for the CPU performance required
> >>>> to process the incoming packets.
> >>>>
> >>>> This patch series adds the support to use a high-priority
> >>>> thread to process the incoming packets, as opposed to
> >>>> everything being done in NAPI context.
> >>>
> >>> I don't see why this problem is limited to the ath10k driver. I expect
> >>> it applies to all drivers using NAPI. So shouldn't you be solving this
> >>> in the NAPI core? Allow a driver to request that the NAPI core use a
> >>> thread?
> >>
> >> It's not just NAPI; the problem is with the softint processing.
> >> I suspect a lot of systems would work better if it ran as
> >> a (highish priority) kernel thread.
> > Hi folks
> >
> > Below is a minimum PoC implementation I can imagine on top of workqueue
> > to make napi threaded. Thoughts are appreciated.
> >
> >> I've had to remove the main locks from a multi-threaded application
> >> and replace them with atomic counters.
> >> Consider what happens when the threads remove items from a shared
> >> work list.
> >> The code looks like:
> >> 	mutex_enter();
> >> 	remove_item_from_list();
> >> 	mutex_exit();
> >> The mutex is only held for a few instructions, so while you'd expect
> >> the cache line to be 'hot' you wouldn't get real contention.
> >> However, the following scenarios happen:
> >> 1) An ethernet interrupt happens while the mutex is held.
> >>    This stops the other threads until all the softint processing
> >>    has finished.
> >> 2) An ethernet interrupt (and softint) runs on a thread that is
> >>    waiting for the mutex.
> >>    (Or on the cpu that the thread's processor affinity ties it to.)
> >>    In this case the 'fair' (ticket) mutex code won't let any other
> >>    thread acquire the mutex.
> >>    So again everything stops until the softints all complete.
> >>
> >> The second one is also a problem when trying to wake up all
> >> the threads (eg after adding a lot of items to the list).
> >> The ticket locks force them to wake in order, but
> >> sometimes the 'thundering herd' would work better.
> >>
> >> IIRC this is actually worse for processes running under the RT
> >> scheduler (without CONFIG_PREEMPT) because they are almost
> >> always scheduled on the same cpu they ran on last.
> >> If it is busy, but cannot be pre-empted, they are not moved
> >> to an idle cpu.
> >>
> >> To confound things there is a very broken workaround for broken
> >> hardware in the driver for the e1000 interface on (at least)
> >> Ivy Bridge cpus that can cause the driver to spin for a very
> >> long time (IIRC milliseconds) whenever it has to write to a
> >> MAC register (ie on every transmit setup).
> >>
> >> 	David
> >>
> > To make napi threaded, if running it in an irq thread or a softirq
> > thread is entirely ruled out, add napi::work and queue it on a highpri
> > workqueue. The workqueue is an unbound one, to let the scheduler spread
> > napi loads onto idle CPU cores. All users need to do to go threaded is
> > s/netif_napi_add/netif_threaded_napi_add/ and no more, as the sketch
> > below shows.
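> >
> > For illustration, the conversion in a driver is a single line; the
> > foo/bar names below are made up and not taken from any in-tree driver.
> > A driver that currently does
> >
> > 	netif_napi_add(ndev, &bar->napi, foo_poll, 64);
> >
> > instead calls
> >
> > 	netif_threaded_napi_add(ndev, &bar->napi, foo_poll, 64);
> >
> > while its poll callback keeps the usual NAPI contract, e.g.
> >
> > 	/* Return the amount of rx work done; complete napi when under budget. */
> > 	static int foo_poll(struct napi_struct *napi, int budget)
> > 	{
> > 		struct foo_priv *bar = container_of(napi, struct foo_priv, napi);
> > 		int done = foo_rx_poll(bar, budget);	/* driver-specific rx processing */
> >
> > 		if (done < budget)
> > 			napi_complete_done(napi, done);
> > 		return done;
> > 	}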
> >
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -338,6 +338,9 @@ struct napi_struct {
> >  	struct list_head	dev_list;
> >  	struct hlist_node	napi_hash_node;
> >  	unsigned int		napi_id;
> > +#ifdef CONFIG_THREADED_NAPI
> > +	struct work_struct	work;
> > +#endif
> >  };
> >  
> >  enum {
> > @@ -2234,6 +2237,19 @@ static inline void *netdev_priv(const st
> >  void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
> >  		    int (*poll)(struct napi_struct *, int), int weight);
> >  
> > +#ifdef CONFIG_THREADED_NAPI
> > +void netif_threaded_napi_add(struct net_device *dev, struct napi_struct *napi,
> > +			     int (*poll)(struct napi_struct *, int), int weight);
> > +#else
> > +static inline void netif_threaded_napi_add(struct net_device *dev,
> > +					    struct napi_struct *napi,
> > +					    int (*poll)(struct napi_struct *, int),
> > +					    int weight)
> > +{
> > +	netif_napi_add(dev, napi, poll, weight);
> > +}
> > +#endif
> > +
> >  /**
> >   *	netif_tx_napi_add - initialize a NAPI context
> >   *	@dev:  network device
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6277,6 +6277,61 @@ static int process_backlog(struct napi_s
> >  	return work;
> >  }
> >  
> > +#ifdef CONFIG_THREADED_NAPI
> > +/* unbound highpri workqueue for threaded napi */
> > +static struct workqueue_struct *napi_workq;
> > +
> > +static void napi_workfn(struct work_struct *work)
> > +{
> > +	struct napi_struct *n = container_of(work, struct napi_struct, work);
> > +
> > +	for (;;) {
> > +		if (!test_bit(NAPI_STATE_SCHED, &n->state))
> > +			return;
> > +
> > +		if (n->poll(n, n->weight) < n->weight)
> > +			return;
> > +
> > +		if (need_resched()) {
> > +			/*
> > +			 * have to pay for the latency of task switch even if
> > +			 * napi is scheduled
> > +			 */
> > +			if (test_bit(NAPI_STATE_SCHED, &n->state))
> > +				queue_work(napi_workq, work);
> > +			return;
> > +		}
> > +	}
> > +}
> > +
> > +void netif_threaded_napi_add(struct net_device *dev,
> > +			     struct napi_struct *napi,
> > +			     int (*poll)(struct napi_struct *, int),
> > +			     int weight)
> > +{
> > +	netif_napi_add(dev, napi, poll, weight);
> > +	INIT_WORK(&napi->work, napi_workfn);
> > +}
> > +
> > +static inline bool is_threaded_napi(struct napi_struct *n)
> > +{
> > +	return n->work.func == napi_workfn;
> > +}
> > +
> > +static inline void threaded_napi_sched(struct napi_struct *n)
> > +{
> > +	if (is_threaded_napi(n))
> > +		queue_work(napi_workq, &n->work);
> > +	else
> > +		____napi_schedule(this_cpu_ptr(&softnet_data), n);
> > +}
> > +#else
> > +static inline void threaded_napi_sched(struct napi_struct *n)
> > +{
> > +	____napi_schedule(this_cpu_ptr(&softnet_data), n);
> > +}
> > +#endif
> > +
> >  /**
> >   * __napi_schedule - schedule for receive
> >   * @n: entry to schedule
> > @@ -6289,7 +6344,7 @@ void __napi_schedule(struct napi_struct
> >  	unsigned long flags;
> >  
> >  	local_irq_save(flags);
> > -	____napi_schedule(this_cpu_ptr(&softnet_data), n);
> > +	threaded_napi_sched(n);
> >  	local_irq_restore(flags);
> >  }
> >  EXPORT_SYMBOL(__napi_schedule);
> > @@ -6335,7 +6390,7 @@ EXPORT_SYMBOL(napi_schedule_prep);
> >   */
> >  void __napi_schedule_irqoff(struct napi_struct *n)
> >  {
> > -	____napi_schedule(this_cpu_ptr(&softnet_data), n);
> > +	threaded_napi_sched(n);
> >  }
> >  EXPORT_SYMBOL(__napi_schedule_irqoff);
> >  
> > @@ -10685,6 +10740,10 @@ static int __init net_dev_init(void)
> >  		sd->backlog.weight = weight_p;
> >  	}
> >  
> > +#ifdef CONFIG_THREADED_NAPI
> > +	napi_workq = alloc_workqueue("napi_workq", WQ_UNBOUND | WQ_HIGHPRI,
> > +				     WQ_UNBOUND_MAX_ACTIVE);
> > +#endif
> >  
> >  	dev_boot_phase = 0;
> >  
> >  	/* The loopback device is special if any other network devices
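> >
> > The Kconfig knob for THREADED_NAPI is not part of the diff above; a rough
> > sketch of what would be needed, assuming it lands in net/Kconfig:
> >
> > config THREADED_NAPI
> > 	bool "Thread NAPI poll loops via an unbound workqueue"
> > 	help
> > 	  Queue NAPI poll loops on an unbound highpri workqueue instead of
> > 	  running them in softirq context, so that the scheduler can move
> > 	  rx processing to an idle or more capable CPU core.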