netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Queue with wait-free enqueue, blocking dequeue, splice
       [not found] <1311316954.11157.1413631325000.JavaMail.zimbra@efficios.com>
@ 2014-10-18 11:48 ` Mathieu Desnoyers
  2014-10-20 14:02   ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2014-10-18 11:48 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Paul E. McKenney, netdev

Hi Jesper,

(re-send after getting the right netdev ML address)

Following our LPC discussion on lock-free queue algorithms
for qdisc, here is some info on the wfcqueue implementation
found in Userspace RCU. See http://urcu.so for info and
git repository.

Here is the wfcqueue ported to the Linux kernel I sent last
year as RFC:
https://lkml.org/lkml/2013/3/14/289

I'm very interested to learn if it fits well for your
use-case,

Feedback is welcome,

Thanks!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Queue with wait-free enqueue, blocking dequeue, splice
  2014-10-18 11:48 ` Queue with wait-free enqueue, blocking dequeue, splice Mathieu Desnoyers
@ 2014-10-20 14:02   ` Jesper Dangaard Brouer
  2014-10-21  0:04     ` Mathieu Desnoyers
  0 siblings, 1 reply; 5+ messages in thread
From: Jesper Dangaard Brouer @ 2014-10-20 14:02 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: brouer, Paul E. McKenney, netdev, Jamal Hadi Salim


On Sat, 18 Oct 2014 11:48:12 +0000 (UTC) Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> Following our LPC discussion on lock-free queue algorithms
> for qdisc, here is some info on the wfcqueue implementation
> found in Userspace RCU. See http://urcu.so for info and
> git repository.

Thank for following up on our very interesting discussions.

I've started with the more simple variant "urcu/static/wfqueue.h" to
understand the concepts.  And I'm now reading wfcqueue.h, which I guess
it replacing wfqueue.h.

 
> Here is the wfcqueue ported to the Linux kernel I sent last
> year as RFC:
> https://lkml.org/lkml/2013/3/14/289
> 
> I'm very interested to learn if it fits well for your
> use-case,

Does this wfcqueue API support bulk dequeue?  (A feature needed for the
lock-less qdisc implementation, else it cannot compete with our new
bulk dequeue strategy).

AFAIK your queue implementation is a CAS-based, Wait-Free on enqueue,
but Lock-Free on dequeue with the potential for waiting/blocking on
a enqueue processes.
 I'm not 100% sure, that we want this behavior for the qdisc system.

I can certainly use the wfcq_empty() check, but I guess I need to
maintain a separate counter to maintain the qdisc limit, right?
(I would use the approximate/split counter API percpu_counter to keep
this scalable, and wfcq_empty() would provide an accurate empty check)


I think, we/I should start micro benchmarking the different approaches.
As our time budget is only 67.2ns 
 http://netoptimizer.blogspot.dk/2014/05/the-calculations-10gbits-wirespeed.html
(or bulking tricks artificially "increase" this budget)


The motivation behind this lockless qdisc is, the current qdisc locking
cost 48ns, see slide 9/16 titled "Qdisc locking is nasty":
 http://people.netfilter.org/hawk/presentations/LinuxPlumbers2014/performance_tx_qdisc_bulk_LPC2014.pdf

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Queue with wait-free enqueue, blocking dequeue, splice
  2014-10-20 14:02   ` Jesper Dangaard Brouer
@ 2014-10-21  0:04     ` Mathieu Desnoyers
  2014-10-21 11:48       ` Jesper Dangaard Brouer
  2014-10-21 12:02       ` Mathieu Desnoyers
  0 siblings, 2 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2014-10-21  0:04 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Paul E. McKenney, netdev, Jamal Hadi Salim

----- Original Message -----
> From: "Jesper Dangaard Brouer" <brouer@redhat.com>
> To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> Cc: brouer@redhat.com, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>, netdev@vger.kernel.org, "Jamal Hadi Salim"
> <jhs@mojatatu.com>
> Sent: Monday, October 20, 2014 10:02:37 AM
> Subject: Re: Queue with wait-free enqueue, blocking dequeue, splice
> 
> 
> On Sat, 18 Oct 2014 11:48:12 +0000 (UTC) Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> 
> > Following our LPC discussion on lock-free queue algorithms
> > for qdisc, here is some info on the wfcqueue implementation
> > found in Userspace RCU. See http://urcu.so for info and
> > git repository.
> 
> Thank for following up on our very interesting discussions.
> 
> I've started with the more simple variant "urcu/static/wfqueue.h" to
> understand the concepts.  And I'm now reading wfcqueue.h, which I guess
> it replacing wfqueue.h.

Yep, this is correct.

> 
>  
> > Here is the wfcqueue ported to the Linux kernel I sent last
> > year as RFC:
> > https://lkml.org/lkml/2013/3/14/289
> > 
> > I'm very interested to learn if it fits well for your
> > use-case,
> 
> Does this wfcqueue API support bulk dequeue?  (A feature needed for the
> lock-less qdisc implementation, else it cannot compete with our new
> bulk dequeue strategy).

Yes, it does. See the "__wfcq_splice" API.

> 
> AFAIK your queue implementation is a CAS-based, Wait-Free on enqueue,
> but Lock-Free on dequeue with the potential for waiting/blocking on
> a enqueue processes.
>  I'm not 100% sure, that we want this behavior for the qdisc system.

It's actually xchg-based (not CAS-based). It is indeed wait-free
in the strictest sense of the term on enqueue (at least on x86,
some other arch implement xchg using ll/sc, which is not strictly
wait-free).

On dequeue, it can busy-wait for a short while that the enqueue
completes. Note that in kernel, since we disable preemption during
enqueue, the dequeue does not have to ever block, just busy-looping
is OK, since the longest thing that could nest over the enqueue
is possibly an interrupt and softirq. So yes, I guess the dequeue
would qualify as lock-free.

> 
> I can certainly use the wfcq_empty() check,

Not sure why you would want to use it, considering that the dequeue
operation implies it. If there is nothing to dequeue, we just return
immediately. Dequeue operation does not block on empty queue. It
just busy waits if it happen to see the queue in an intermediate
state of the enqueue operation (which is very short, few instructions
at most, with preemption disabled).

> but I guess I need to
> maintain a separate counter to maintain the qdisc limit, right?
> (I would use the approximate/split counter API percpu_counter to keep
> this scalable, and wfcq_empty() would provide an accurate empty check)

Yes for split counters, not sure why you need the empty check explicitly
in your use-case though.

> 
> 
> I think, we/I should start micro benchmarking the different approaches.
> As our time budget is only 67.2ns
>  http://netoptimizer.blogspot.dk/2014/05/the-calculations-10gbits-wirespeed.html
> (or bulking tricks artificially "increase" this budget)
> 
> 
> The motivation behind this lockless qdisc is, the current qdisc locking
> cost 48ns, see slide 9/16 titled "Qdisc locking is nasty":
>  http://people.netfilter.org/hawk/presentations/LinuxPlumbers2014/performance_tx_qdisc_bulk_LPC2014.pdf
> 

Chances are that the scheme:

__wfcq_enqueue() on enqueue

__wfcq_splice() for bulk dequeue

__wfcq_first()
__wfcq_next()  for iteration on the splice dest queue

Would be must faster than the ton of locks you use currently.
The nice part about using just enqueue and splice is that you
don't need locking on the queue at all. Iteration on the
destination queue can be done by a single thread, so no need
for explicit locking there neither.

By the way, you should look at the kernel wfcq implementation
I posted earlier. It's more compact that the one in userspace RCU
because we don't need to support blocking/nonblocking modes,
because we have the luxury of disabling preemption in the kernel.

I look forward to see the numbers,

Thanks!

Mathieu

> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Sr. Network Kernel Developer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Queue with wait-free enqueue, blocking dequeue, splice
  2014-10-21  0:04     ` Mathieu Desnoyers
@ 2014-10-21 11:48       ` Jesper Dangaard Brouer
  2014-10-21 12:02       ` Mathieu Desnoyers
  1 sibling, 0 replies; 5+ messages in thread
From: Jesper Dangaard Brouer @ 2014-10-21 11:48 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Paul E. McKenney, netdev, Jamal Hadi Salim, brouer

On Tue, 21 Oct 2014 00:04:10 +0000 (UTC) Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> > From: "Jesper Dangaard Brouer" <brouer@redhat.com>
> > 
[...]
> > I can certainly use the wfcq_empty() check,
> 
> Not sure why you would want to use it, considering that the dequeue
> operation implies it. If there is nothing to dequeue, we just return
> immediately. Dequeue operation does not block on empty queue. It
> just busy waits if it happen to see the queue in an intermediate
> state of the enqueue operation (which is very short, few instructions
> at most, with preemption disabled).
> 
> > but I guess I need to
> > maintain a separate counter to maintain the qdisc limit, right?
> > (I would use the approximate/split counter API percpu_counter to keep
> > this scalable, and wfcq_empty() would provide an accurate empty check)
> 
> Yes for split counters, not sure why you need the empty check explicitly
> in your use-case though.

In case the qdisc is empty, we avoid/bypass the enqueue + dequeue phase
and instead transmit the packet directly.

Iif the flag TCQ_F_CAN_BYPASS is set. 
 https://github.com/torvalds/linux/blob/master/net/core/dev.c#L2799
But I'm not 100% sure that we can set this flag on a lock-less qdisc.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Queue with wait-free enqueue, blocking dequeue, splice
  2014-10-21  0:04     ` Mathieu Desnoyers
  2014-10-21 11:48       ` Jesper Dangaard Brouer
@ 2014-10-21 12:02       ` Mathieu Desnoyers
  1 sibling, 0 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2014-10-21 12:02 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Paul E. McKenney, netdev, Jamal Hadi Salim



----- Original Message -----
> From: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> To: "Jesper Dangaard Brouer" <brouer@redhat.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>, netdev@vger.kernel.org, "Jamal Hadi Salim" <jhs@mojatatu.com>
> Sent: Monday, October 20, 2014 8:04:10 PM
> Subject: Re: Queue with wait-free enqueue, blocking dequeue, splice
> 
> ----- Original Message -----
> > From: "Jesper Dangaard Brouer" <brouer@redhat.com>
> > To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> > Cc: brouer@redhat.com, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
> > netdev@vger.kernel.org, "Jamal Hadi Salim"
> > <jhs@mojatatu.com>
> > Sent: Monday, October 20, 2014 10:02:37 AM
> > Subject: Re: Queue with wait-free enqueue, blocking dequeue, splice
> > 

[...]

> > 
> > AFAIK your queue implementation is a CAS-based, Wait-Free on enqueue,
> > but Lock-Free on dequeue with the potential for waiting/blocking on
> > a enqueue processes.
> >  I'm not 100% sure, that we want this behavior for the qdisc system.
> 
> It's actually xchg-based (not CAS-based). It is indeed wait-free
> in the strictest sense of the term on enqueue (at least on x86,
> some other arch implement xchg using ll/sc, which is not strictly
> wait-free).
> 
> On dequeue, it can busy-wait for a short while that the enqueue
> completes. Note that in kernel, since we disable preemption during
> enqueue, the dequeue does not have to ever block, just busy-looping
> is OK, since the longest thing that could nest over the enqueue
> is possibly an interrupt and softirq. So yes, I guess the dequeue
> would qualify as lock-free.

Scratch this last part about dequeue lock-freedom: dequeue can
busy-wait endlessly long if an enqueue is, for instance, interrupted
by a dequeue operation that would sit within an interrupt handler.
Dequeue is therefore not "lock-free". It is not even obstruction-free.

This just means that you need to be aware of this if you use dequeue
in an execution context that can interrupt enqueue.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-10-21 12:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1311316954.11157.1413631325000.JavaMail.zimbra@efficios.com>
2014-10-18 11:48 ` Queue with wait-free enqueue, blocking dequeue, splice Mathieu Desnoyers
2014-10-20 14:02   ` Jesper Dangaard Brouer
2014-10-21  0:04     ` Mathieu Desnoyers
2014-10-21 11:48       ` Jesper Dangaard Brouer
2014-10-21 12:02       ` Mathieu Desnoyers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).