All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2] net: try to avoid unneeded backlog flush
@ 2020-09-10 21:33 Paolo Abeni
  2020-09-14 21:39 ` David Miller
  0 siblings, 1 reply; 2+ messages in thread
From: Paolo Abeni @ 2020-09-10 21:33 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Marcelo Tosatti, Eric Dumazet

flush_all_backlogs() may cause deadlock on systems
running processes with FIFO scheduling policy.

The above is critical in -RT scenarios, where user-space
specifically ensure no network activity is scheduled on
the CPU running the mentioned FIFO process, but still get
stuck.

This commit tries to address the problem checking the
backlog status on the remote CPUs before scheduling the
flush operation. If the backlog is empty, we can skip it.

v1 -> v2:
 - explicitly clear flushed cpu mask - Eric

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/core/dev.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 47 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 152ad3b578de..cf7bb8d8b803 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5621,17 +5621,60 @@ static void flush_backlog(struct work_struct *work)
 	local_bh_enable();
 }
 
+static bool flush_required(int cpu)
+{
+#if IS_ENABLED(CONFIG_RPS)
+	struct softnet_data *sd = &per_cpu(softnet_data, cpu);
+	bool do_flush;
+
+	local_irq_disable();
+	rps_lock(sd);
+
+	/* as insertion into process_queue happens with the rps lock held,
+	 * process_queue access may race only with dequeue
+	 */
+	do_flush = !skb_queue_empty(&sd->input_pkt_queue) ||
+		   !skb_queue_empty_lockless(&sd->process_queue);
+	rps_unlock(sd);
+	local_irq_enable();
+
+	return do_flush;
+#endif
+	/* without RPS we can't safely check input_pkt_queue: during a
+	 * concurrent remote skb_queue_splice() we can detect as empty both
+	 * input_pkt_queue and process_queue even if the latter could end-up
+	 * containing a lot of packets.
+	 */
+	return true;
+}
+
 static void flush_all_backlogs(void)
 {
+	static cpumask_t flush_cpus;
 	unsigned int cpu;
 
+	/* since we are under rtnl lock protection we can use static data
+	 * for the cpumask and avoid allocating on stack the possibly
+	 * large mask
+	 */
+	ASSERT_RTNL();
+
 	get_online_cpus();
 
-	for_each_online_cpu(cpu)
-		queue_work_on(cpu, system_highpri_wq,
-			      per_cpu_ptr(&flush_works, cpu));
+	cpumask_clear(&flush_cpus);
+	for_each_online_cpu(cpu) {
+		if (flush_required(cpu)) {
+			queue_work_on(cpu, system_highpri_wq,
+				      per_cpu_ptr(&flush_works, cpu));
+			cpumask_set_cpu(cpu, &flush_cpus);
+		}
+	}
 
-	for_each_online_cpu(cpu)
+	/* we can have in flight packet[s] on the cpus we are not flushing,
+	 * synchronize_net() in rollback_registered_many() will take care of
+	 * them
+	 */
+	for_each_cpu(cpu, &flush_cpus)
 		flush_work(per_cpu_ptr(&flush_works, cpu));
 
 	put_online_cpus();
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH net-next v2] net: try to avoid unneeded backlog flush
  2020-09-10 21:33 [PATCH net-next v2] net: try to avoid unneeded backlog flush Paolo Abeni
@ 2020-09-14 21:39 ` David Miller
  0 siblings, 0 replies; 2+ messages in thread
From: David Miller @ 2020-09-14 21:39 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, mtosatti, edumazet

From: Paolo Abeni <pabeni@redhat.com>
Date: Thu, 10 Sep 2020 23:33:18 +0200

> flush_all_backlogs() may cause deadlock on systems
> running processes with FIFO scheduling policy.
> 
> The above is critical in -RT scenarios, where user-space
> specifically ensure no network activity is scheduled on
> the CPU running the mentioned FIFO process, but still get
> stuck.
> 
> This commit tries to address the problem checking the
> backlog status on the remote CPUs before scheduling the
> flush operation. If the backlog is empty, we can skip it.
> 
> v1 -> v2:
>  - explicitly clear flushed cpu mask - Eric
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Applied, thank you.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-09-14 21:39 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-10 21:33 [PATCH net-next v2] net: try to avoid unneeded backlog flush Paolo Abeni
2020-09-14 21:39 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.