* [PATCH net v2] net: sched: add barrier to ensure correct ordering for lockless qdisc
@ 2021-06-17  1:04 Yunsheng Lin
  2021-06-19  0:30 ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Yunsheng Lin @ 2021-06-17  1:04 UTC (permalink / raw)
  To: davem, kuba
  Cc: olteanv, ast, daniel, andriin, edumazet, weiwan, cong.wang,
	ap420073, netdev, linux-kernel, linuxarm, mkl, linux-can, jhs,
	xiyou.wangcong, jiri, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, bpf, jonas.bonn, pabeni, mzhivich,
	johunt, albcamus, kehuan.feng, a.fatoum, atenart,
	alexander.duyck, hdanton, jgross, JKosina, mkubecek, bjorn,
	alobakin

The spin_trylock() was assumed to contain the implicit
barrier needed to ensure the correct ordering between
STATE_MISSED setting/clearing and STATE_MISSED checking
in commit a90c57f2cedd ("net: sched: fix packet stuck
problem for lockless qdisc").

But it turns out that spin_trylock() only has load-acquire
semantics. On a strongly-ordered system (like x86), the compiler
barrier implicitly contained in spin_trylock() seems enough
to ensure the correct ordering. But on a weakly-ordered system
(like arm64), store-release semantics are needed to ensure
the correct ordering, as set_bit() and clear_bit() are store
operations; see queued_spin_lock().

So add the explicit barrier to ensure the correct ordering
for the above case.

Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
V2: add the missing Fixes tag.

The above ordering issue can easily cause out-of-order packets
when testing the lockless qdisc bypass patchset [1] with two
iperf threads and one netdev queue on an arm64 system.

1. https://lkml.org/lkml/2021/6/2/1417
---
 include/net/sch_generic.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 1e62551..5771030 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -163,6 +163,12 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 		if (spin_trylock(&qdisc->seqlock))
 			goto nolock_empty;
 
+		/* Paired with smp_mb__after_atomic() to make sure
+		 * STATE_MISSED checking is synchronized with clearing
+		 * in pfifo_fast_dequeue().
+		 */
+		smp_mb__before_atomic();
+
 		/* If the MISSED flag is set, it means other thread has
 		 * set the MISSED flag before second spin_trylock(), so
 		 * we can return false here to avoid multi cpus doing
@@ -180,6 +186,12 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 		 */
 		set_bit(__QDISC_STATE_MISSED, &qdisc->state);
 
+		/* spin_trylock() only has load-acquire semantic, so use
+		 * smp_mb__after_atomic() to ensure STATE_MISSED is set
+		 * before doing the second spin_trylock().
+		 */
+		smp_mb__after_atomic();
+
 		/* Retry again in case other CPU may not see the new flag
 		 * after it releases the lock at the end of qdisc_run_end().
 		 */
-- 
2.7.4
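
For readers who want to experiment outside the kernel, the control flow
touched by this patch can be modelled in userspace C11. This is only a
sketch under assumptions, not the kernel implementation: model_qdisc,
model_trylock(), model_unlock() and model_run_begin() are hypothetical
names, the seqlock is reduced to an atomic exchange with acquire
ordering, and both smp_mb__before_atomic() and smp_mb__after_atomic()
are approximated by a sequentially consistent fence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define STATE_MISSED 1UL /* stands in for __QDISC_STATE_MISSED */

/* Hypothetical userspace stand-in for the relevant struct Qdisc fields. */
struct model_qdisc {
    atomic_int seqlock;   /* 0 = unlocked, 1 = locked */
    atomic_ulong state;   /* bit 0 models STATE_MISSED */
};

static void model_qdisc_init(struct model_qdisc *q)
{
    assert(q);
    atomic_init(&q->seqlock, 0);
    atomic_init(&q->state, 0UL);
}

/* Models spin_trylock(): acquire ordering only, no full barrier. */
static bool model_trylock(struct model_qdisc *q)
{
    return atomic_exchange_explicit(&q->seqlock, 1,
                                    memory_order_acquire) == 0;
}

/* Models spin_unlock(): release ordering. */
static void model_unlock(struct model_qdisc *q)
{
    atomic_store_explicit(&q->seqlock, 0, memory_order_release);
}

/* Models the TCQ_F_NOLOCK path of qdisc_run_begin() after the patch. */
static bool model_run_begin(struct model_qdisc *q)
{
    if (model_trylock(q))
        return true;

    /* Assumed equivalent of smp_mb__before_atomic(). */
    atomic_thread_fence(memory_order_seq_cst);

    /* Another thread already flagged a missed run: back off. */
    if (atomic_load_explicit(&q->state, memory_order_relaxed) & STATE_MISSED)
        return false;

    /* Models set_bit(): an atomic RMW with no ordering of its own. */
    atomic_fetch_or_explicit(&q->state, STATE_MISSED, memory_order_relaxed);

    /* Assumed equivalent of smp_mb__after_atomic(): the flag must be
     * visible before the lock is checked a second time.
     */
    atomic_thread_fence(memory_order_seq_cst);

    return model_trylock(q);
}
```

In this model, a caller that finds the lock held sets STATE_MISSED and
fails; a later caller that sees STATE_MISSED already set fails without
retrying the lock. The fences matter only when several threads run
concurrently; a single-threaded walk-through exercises the control flow
but not the ordering.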



* Re: [PATCH net v2] net: sched: add barrier to ensure correct ordering for lockless qdisc
  2021-06-17  1:04 [PATCH net v2] net: sched: add barrier to ensure correct ordering for lockless qdisc Yunsheng Lin
@ 2021-06-19  0:30 ` Jakub Kicinski
  2021-06-19  0:38   ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2021-06-19  0:30 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, olteanv, ast, daniel, andriin, edumazet, weiwan,
	cong.wang, ap420073, netdev, linux-kernel, linuxarm, mkl,
	linux-can, jhs, xiyou.wangcong, jiri, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, bpf, jonas.bonn,
	pabeni, mzhivich, johunt, albcamus, kehuan.feng, a.fatoum,
	atenart, alexander.duyck, hdanton, jgross, JKosina, mkubecek,
	bjorn, alobakin

On Thu, 17 Jun 2021 09:04:14 +0800 Yunsheng Lin wrote:
> The spin_trylock() was assumed to contain the implicit
> barrier needed to ensure the correct ordering between
> STATE_MISSED setting/clearing and STATE_MISSED checking
> in commit a90c57f2cedd ("net: sched: fix packet stuck
> problem for lockless qdisc").
> 
> But it turns out that spin_trylock() only has load-acquire
> semantics. On a strongly-ordered system (like x86), the compiler
> barrier implicitly contained in spin_trylock() seems enough
> to ensure the correct ordering. But on a weakly-ordered system
> (like arm64), store-release semantics are needed to ensure
> the correct ordering, as set_bit() and clear_bit() are store
> operations; see queued_spin_lock().
> 
> So add the explicit barrier to ensure the correct ordering
> for the above case.
> 
> Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc")
> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>

Acked-by: Jakub Kicinski <kuba@kernel.org>


* Re: [PATCH net v2] net: sched: add barrier to ensure correct ordering for lockless qdisc
  2021-06-19  0:30 ` Jakub Kicinski
@ 2021-06-19  0:38   ` Jakub Kicinski
  2021-06-19 10:30     ` Yunsheng Lin
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2021-06-19  0:38 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: davem, olteanv, ast, daniel, andriin, edumazet, weiwan,
	cong.wang, ap420073, netdev, linux-kernel, linuxarm, mkl,
	linux-can, jhs, xiyou.wangcong, jiri, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, bpf, jonas.bonn,
	pabeni, mzhivich, johunt, albcamus, kehuan.feng, a.fatoum,
	atenart, alexander.duyck, hdanton, jgross, JKosina, mkubecek,
	bjorn, alobakin

On Fri, 18 Jun 2021 17:30:47 -0700 Jakub Kicinski wrote:
> On Thu, 17 Jun 2021 09:04:14 +0800 Yunsheng Lin wrote:
> > The spin_trylock() was assumed to contain the implicit
> > barrier needed to ensure the correct ordering between
> > STATE_MISSED setting/clearing and STATE_MISSED checking
> > in commit a90c57f2cedd ("net: sched: fix packet stuck
> > problem for lockless qdisc").
> > 
> > But it turns out that spin_trylock() only has load-acquire
> > semantics. On a strongly-ordered system (like x86), the compiler
> > barrier implicitly contained in spin_trylock() seems enough
> > to ensure the correct ordering. But on a weakly-ordered system
> > (like arm64), store-release semantics are needed to ensure
> > the correct ordering, as set_bit() and clear_bit() are store
> > operations; see queued_spin_lock().
> > 
> > So add the explicit barrier to ensure the correct ordering
> > for the above case.
> > 
> > Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc")
> > Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>  
> 
> Acked-by: Jakub Kicinski <kuba@kernel.org>

Actually.. do we really need the _before_atomic() barrier?
I'd think we only need to make sure we re-check the lock 
after we set the bit, ordering of the first check doesn't 
matter.


* Re: [PATCH net v2] net: sched: add barrier to ensure correct ordering for lockless qdisc
  2021-06-19  0:38   ` Jakub Kicinski
@ 2021-06-19 10:30     ` Yunsheng Lin
  2021-06-21 23:29       ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Yunsheng Lin @ 2021-06-19 10:30 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Yunsheng Lin, davem, olteanv, ast, daniel, andriin, edumazet,
	weiwan, cong.wang, ap420073, netdev, linux-kernel, linuxarm, mkl,
	linux-can, jhs, xiyou.wangcong, jiri, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, bpf, jonas.bonn,
	pabeni, mzhivich, johunt, albcamus, kehuan.feng, a.fatoum,
	atenart, alexander.duyck, hdanton, jgross, JKosina, mkubecek,
	bjorn, alobakin

On Fri, Jun 18, 2021 at 05:38:37PM -0700, Jakub Kicinski wrote:
> On Fri, 18 Jun 2021 17:30:47 -0700 Jakub Kicinski wrote:
> > On Thu, 17 Jun 2021 09:04:14 +0800 Yunsheng Lin wrote:
> > > The spin_trylock() was assumed to contain the implicit
> > > barrier needed to ensure the correct ordering between
> > > STATE_MISSED setting/clearing and STATE_MISSED checking
> > > in commit a90c57f2cedd ("net: sched: fix packet stuck
> > > problem for lockless qdisc").
> > > 
> > > But it turns out that spin_trylock() only has load-acquire
> > > semantics. On a strongly-ordered system (like x86), the compiler
> > > barrier implicitly contained in spin_trylock() seems enough
> > > to ensure the correct ordering. But on a weakly-ordered system
> > > (like arm64), store-release semantics are needed to ensure
> > > the correct ordering, as set_bit() and clear_bit() are store
> > > operations; see queued_spin_lock().
> > > 
> > > So add the explicit barrier to ensure the correct ordering
> > > for the above case.
> > > 
> > > Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc")
> > > Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>  
> > 
> > Acked-by: Jakub Kicinski <kuba@kernel.org>
> 
> Actually.. do we really need the _before_atomic() barrier?
> I'd think we only need to make sure we re-check the lock 
> after we set the bit, ordering of the first check doesn't 
> matter.

When debugging pointed to the misordering between STATE_MISSED
setting/clearing and STATE_MISSED checking, only _after_atomic()
was added at first, and it did not fix the misordering problem.
When both _before_atomic() and _after_atomic() were added, the
misordering problem disappeared.

I suppose _before_atomic() matters because the STATE_MISSED
setting and the lock rechecking are only done when the first check
of STATE_MISSED returns false. _before_atomic() is used to make
sure the first check returns the correct result; if it does not,
then we may have a misordering problem too.

     cpu0                        cpu1
                              clear MISSED
                             _after_atomic()
                                dequeue
    enqueue
 first trylock() #false
  MISSED check #*true* ?

As above, even though cpu1 has an _after_atomic() between clearing
STATE_MISSED and dequeuing, we might still need a barrier to
prevent cpu0 from speculatively checking MISSED before cpu1
clears it?

And the implicit load-acquire barrier contained in the first
trylock() does not seem to prevent the above case either.

And there is no load-acquire barrier in pfifo_fast_dequeue()
either, which possibly makes the above case more likely to happen.
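
The cpu0/cpu1 interleaving above has the shape of a store-buffering
pattern: each side writes one location and then reads the other. A
userspace C11 sketch of the two sides is given here as an illustration
only. All names (model_fifo, model_try_pop(),
model_clear_missed_then_dequeue(), model_enqueue_then_check_missed())
are hypothetical, the queue is reduced to a packet counter, the failed
first trylock() is omitted, and the kernel barriers are approximated by
sequentially consistent fences.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_STATE_MISSED 1UL /* stands in for __QDISC_STATE_MISSED */

/* Hypothetical model: the state word plus a counter of queued packets. */
struct model_fifo {
    atomic_ulong state;
    atomic_int backlog;
};

static void model_fifo_init(struct model_fifo *q, unsigned long state,
                            int backlog)
{
    assert(q && backlog >= 0);
    atomic_init(&q->state, state);
    atomic_init(&q->backlog, backlog);
}

/* Take one packet if any is queued; returns true on success. */
static bool model_try_pop(struct model_fifo *q)
{
    int n = atomic_load_explicit(&q->backlog, memory_order_relaxed);

    while (n > 0) {
        /* On failure, n is reloaded with the current backlog. */
        if (atomic_compare_exchange_weak_explicit(&q->backlog, &n, n - 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

/* cpu1 side: clear MISSED, full barrier, then dequeue. */
static bool model_clear_missed_then_dequeue(struct model_fifo *q)
{
    /* Models clear_bit(): an atomic RMW with no ordering of its own. */
    atomic_fetch_and_explicit(&q->state, ~MODEL_STATE_MISSED,
                              memory_order_relaxed);
    /* Assumed equivalent of smp_mb__after_atomic(). */
    atomic_thread_fence(memory_order_seq_cst);
    return model_try_pop(q);
}

/* cpu0 side: enqueue, full barrier, then check MISSED. */
static bool model_enqueue_then_check_missed(struct model_fifo *q)
{
    atomic_fetch_add_explicit(&q->backlog, 1, memory_order_relaxed);
    /* Assumed equivalent of smp_mb__before_atomic(). */
    atomic_thread_fence(memory_order_seq_cst);
    return (atomic_load_explicit(&q->state, memory_order_relaxed) &
            MODEL_STATE_MISSED) != 0;
}
```

In the C11 model, with a full fence on both sides, the outcome where
cpu1 finds the queue empty and cpu0 still reads MISSED as set is
forbidden: at least one side observes the other's write. Dropping
either fence allows that outcome again, which is consistent with the
observation that _after_atomic() alone did not fix the problem.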


* Re: [PATCH net v2] net: sched: add barrier to ensure correct ordering for lockless qdisc
  2021-06-19 10:30     ` Yunsheng Lin
@ 2021-06-21 23:29       ` Jakub Kicinski
  0 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2021-06-21 23:29 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Yunsheng Lin, davem, olteanv, ast, daniel, andriin, edumazet,
	weiwan, cong.wang, ap420073, netdev, linux-kernel, linuxarm, mkl,
	linux-can, jhs, xiyou.wangcong, jiri, andrii, kafai,
	songliubraving, yhs, john.fastabend, kpsingh, bpf, jonas.bonn,
	pabeni, mzhivich, johunt, albcamus, kehuan.feng, a.fatoum,
	atenart, alexander.duyck, hdanton, jgross, JKosina, mkubecek,
	bjorn, alobakin

On Sat, 19 Jun 2021 10:30:09 +0000 Yunsheng Lin wrote:
> When debugging pointed to the misordering between STATE_MISSED
> setting/clearing and STATE_MISSED checking, only _after_atomic()
> was added at first, and it did not fix the misordering problem.
> When both _before_atomic() and _after_atomic() were added, the
> misordering problem disappeared.
> 
> I suppose _before_atomic() matters because the STATE_MISSED
> setting and the lock rechecking are only done when the first check
> of STATE_MISSED returns false. _before_atomic() is used to make
> sure the first check returns the correct result; if it does not,
> then we may have a misordering problem too.
> 
>      cpu0                        cpu1
>                               clear MISSED
>                              _after_atomic()
>                                 dequeue
>     enqueue
>  first trylock() #false
>   MISSED check #*true* ?
> 
> As above, even though cpu1 has an _after_atomic() between clearing
> STATE_MISSED and dequeuing, we might still need a barrier to
> prevent cpu0 from speculatively checking MISSED before cpu1
> clears it?
> 
> And the implicit load-acquire barrier contained in the first
> trylock() does not seem to prevent the above case either.
> 
> And there is no load-acquire barrier in pfifo_fast_dequeue()
> either, which possibly makes the above case more likely to happen.

Ah, you're right. The test_bit() was not in the patch context, 
I forgot it's there... Both barriers are indeed needed.
