* SPMC Queue in URCU
       [not found] <1733955808.5920250.1465238884157.JavaMail.yahoo.ref@mail.yahoo.com>
@ 2016-06-06 18:48 ` Mark E. Dawson, Jr.
       [not found] ` <1733955808.5920250.1465238884157.JavaMail.yahoo@mail.yahoo.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Mark E. Dawson, Jr. @ 2016-06-06 18:48 UTC (permalink / raw)
  To: lttng-dev


All,
Is there a data structure offered by Userspace RCU that provides a bounded single-producer multiple-consumer queue? Or are we better off just using the linked-list URCU data structure despite the CPU cache implications?


_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


* Re: SPMC Queue in URCU
       [not found] ` <1733955808.5920250.1465238884157.JavaMail.yahoo@mail.yahoo.com>
@ 2016-06-06 20:08   ` Mathieu Desnoyers
       [not found]   ` <188943617.29620.1465243726034.JavaMail.zimbra@efficios.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Mathieu Desnoyers @ 2016-06-06 20:08 UTC (permalink / raw)
  To: Mark E. Dawson, Jr.; +Cc: lttng-dev



----- On Jun 6, 2016, at 2:48 PM, Mark E. Dawson, Jr. <medawsonjr@yahoo.com> wrote: 

> All,

> Is there a data structure offered by Userspace RCU that provides a bounded
> single producer multiple consumer queue? Or are we better off just using the
> linked list URCU data structure despite the cpu cache implications?

Hi, 

What we have in URCU is an unbounded SPMC queue (wfcqueue.h). 
You could theoretically use it with a counter to make it bounded, but 
it might not be as efficient as a bounded SPMC queue, since we use 
a linked list rather than a circular buffer. It would be interesting to 
compare the performance of the two approaches. 

Note that wfcqueue is wait-free on the enqueue, blocking on the 
dequeue. It is used internally by URCU to implement the call_rcu 
scheme. 
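The counter idea above can be sketched with plain C11 atomics standing in for the wfcqueue internals (illustrative names only, not the liburcu API; single-consumer dequeue for brevity). The enqueue path mirrors wfcqueue's xchg-on-tail, dummy-node design, with an atomic credit counter enforcing the bound:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CAPACITY 4

struct node {
    struct node *_Atomic next;
    int value;
};

struct bounded_queue {
    struct node *_Atomic tail;  /* producer side */
    struct node *head;          /* consumer side (single consumer here) */
    struct node stub;           /* dummy node, as in wfcqueue */
    atomic_int count;           /* enforces the bound */
};

static void bq_init(struct bounded_queue *q)
{
    atomic_init(&q->stub.next, NULL);
    atomic_init(&q->tail, &q->stub);
    q->head = &q->stub;
    atomic_init(&q->count, 0);
}

/* Wait-free enqueue (single xchg on tail), refused when full. */
static bool bq_enqueue(struct bounded_queue *q, struct node *n)
{
    if (atomic_fetch_add(&q->count, 1) >= CAPACITY) {
        atomic_fetch_sub(&q->count, 1);  /* over capacity: back out */
        return false;
    }
    atomic_store(&n->next, NULL);
    struct node *prev = atomic_exchange(&q->tail, n);
    atomic_store(&prev->next, n);
    return true;
}

/* Dequeue; returns NULL when empty.  The returned node becomes the new
 * dummy, so it must stay allocated until the next dequeue passes it --
 * a known cost of the dummy-node scheme. */
static struct node *bq_dequeue(struct bounded_queue *q)
{
    struct node *next = atomic_load(&q->head->next);
    if (!next)
        return NULL;
    q->head = next;
    atomic_fetch_sub(&q->count, 1);
    return next;
}
```

The counter adds one contended cache line shared by producer and consumers, which is part of why a purpose-built circular buffer could win the comparison mentioned above.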

Thanks, 

Mathieu 


-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 



* Re: SPMC Queue in URCU
       [not found]   ` <188943617.29620.1465243726034.JavaMail.zimbra@efficios.com>
@ 2016-06-06 20:19     ` Mark E. Dawson, Jr.
       [not found]     ` <897997834.5701164.1465244342476.JavaMail.yahoo@mail.yahoo.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Mark E. Dawson, Jr. @ 2016-06-06 20:19 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: lttng-dev



Ah, so even the wfcqueue is a linked list under the covers? No ring buffer or other cache-friendly data structure types available?


* Re: SPMC Queue in URCU
       [not found]     ` <897997834.5701164.1465244342476.JavaMail.yahoo@mail.yahoo.com>
@ 2016-06-06 21:20       ` Eric Wong
  2016-07-21 20:47       ` Practical recommendation for max update rate with QSBR Mark E. Dawson, Jr.
       [not found]       ` <208245300.3358603.1469134061370.JavaMail.yahoo@mail.yahoo.com>
  2 siblings, 0 replies; 7+ messages in thread
From: Eric Wong @ 2016-06-06 21:20 UTC (permalink / raw)
  To: Mark E. Dawson, Jr.; +Cc: lttng-dev

"Mark E. Dawson, Jr." <medawsonjr@yahoo.com> wrote:
> Ah, so even the wfcqueue is a linked list under the covers? No
> ring buffer or other cache-friendly data structure types
> available?

Yes, it's a linked list, but wfcqueue does no allocations under
the covers; rather, the node structure is embedded into whatever
struct you give it (similar to the well-known Linux kernel
linked list, using the container_of macro).

In other words, you may use a cache-friendly allocator for all
the structs you might place into it.  But you already get some
cache friendliness by avoiding the extra pointer to the struct
you're actually storing.
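The intrusive-node pattern described above can be sketched in the style of the Linux kernel's container_of (names here are illustrative; "queue_node" merely stands in for struct cds_wfcq_node):

```c
#include <stddef.h>

/* Recover the address of an enclosing struct from a pointer to one of
 * its members, via offsetof -- the container_of technique. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct queue_node {               /* stand-in for struct cds_wfcq_node */
    struct queue_node *next;
};

/* The node lives inside the user's struct, so the queue never
 * allocates and needs no separate pointer to the payload. */
struct message {
    int id;
    struct queue_node node;
};

/* Map an embedded node back to its enclosing message. */
static struct message *message_of(struct queue_node *n)
{
    return container_of(n, struct message, node);
}
```

Dequeuing hands back a `struct queue_node *`; one container_of recovers the full `struct message` with no extra indirection, which is the cache-friendliness Eric describes.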


* Practical recommendation for max update rate with QSBR
       [not found]     ` <897997834.5701164.1465244342476.JavaMail.yahoo@mail.yahoo.com>
  2016-06-06 21:20       ` Eric Wong
@ 2016-07-21 20:47       ` Mark E. Dawson, Jr.
       [not found]       ` <208245300.3358603.1469134061370.JavaMail.yahoo@mail.yahoo.com>
  2 siblings, 0 replies; 7+ messages in thread
From: Mark E. Dawson, Jr. @ 2016-07-21 20:47 UTC (permalink / raw)
  To: lttng-dev



All,
I've read the documentation regarding reader throughput drop-offs at high update rates due to the pointer exchanging between readers and writers, and the general admonition to use URCU only for mostly-read workloads with relatively infrequent updates.
However, is there a general rule of thumb for the highest update rate sustainable with good performance when using URCU (in particular, the QSBR flavor) in highly threaded applications deployed on high-core-count (Intel) machines? The example case would be a single updater and 20-30 reader threads.



* Re: Practical recommendation for max update rate with QSBR
       [not found]       ` <208245300.3358603.1469134061370.JavaMail.yahoo@mail.yahoo.com>
@ 2016-07-21 21:58         ` Mathieu Desnoyers
       [not found]         ` <1787099143.77815.1469138309989.JavaMail.zimbra@efficios.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Mathieu Desnoyers @ 2016-07-21 21:58 UTC (permalink / raw)
  To: Mark E. Dawson; +Cc: lttng-dev



----- On Jul 21, 2016, at 4:47 PM, Mark E. Dawson <medawsonjr@yahoo.com> wrote: 

> All,

> I've read the documentation regarding reader throughput drop-offs with high
> update rates due to the pointer exchanging between readers-writers, and the
> general admonition of using URCU only for mostly-read workloads with relatively
> infrequent updates.

> However, is there a general rule-of-thumb suggestion for highest recommended
> update rate sustainable for optimal performance with URCU (in particular, the
> QSBR flavor) for highly threaded applications deployed on high core count
> machines (Intel)? The example case would be a single updater and 20 - 30 reader
> threads.

Hi Mark, 

We have some relevant measurements in the supplementary material of: 

Desnoyers, Mathieu, McKenney, Paul E., Stern, Alan S., Dagenais, Michel R. and Walpole, Jonathan, User-Level Implementations of Read-Copy Update. IEEE Transactions on Parallel and Distributed Systems, 23(2): 375-382 (2012).

A copy can be found at http://www.efficios.com/publications 

See p. 10 of supplementary material, Fig. 14: 

"Comparison of Pointer Exchange and Ideal RCU Update Overhead, 8-core Intel Xeon, Logarithmic Scale" 

Since you are doing pointer exchange, you will want to look at the "QSBR" 
line, which stays at 1000M reads/s up to 100000 updates/s, and then starts 
dropping. 

On the 64-core POWER5+ (Fig. 15), read speed starts dropping near 100000 
updates/s too. 

Thanks, 

Mathieu 


-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 



* Re: Practical recommendation for max update rate with QSBR
       [not found]         ` <1787099143.77815.1469138309989.JavaMail.zimbra@efficios.com>
@ 2016-07-21 23:14           ` Mark E. Dawson, Jr.
  0 siblings, 0 replies; 7+ messages in thread
From: Mark E. Dawson, Jr. @ 2016-07-21 23:14 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: lttng-dev



Thanks!


end of thread, other threads:[~2016-07-21 23:21 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1733955808.5920250.1465238884157.JavaMail.yahoo.ref@mail.yahoo.com>
2016-06-06 18:48 ` SPMC Queue in URCU Mark E. Dawson, Jr.
     [not found] ` <1733955808.5920250.1465238884157.JavaMail.yahoo@mail.yahoo.com>
2016-06-06 20:08   ` Mathieu Desnoyers
     [not found]   ` <188943617.29620.1465243726034.JavaMail.zimbra@efficios.com>
2016-06-06 20:19     ` Mark E. Dawson, Jr.
     [not found]     ` <897997834.5701164.1465244342476.JavaMail.yahoo@mail.yahoo.com>
2016-06-06 21:20       ` Eric Wong
2016-07-21 20:47       ` Practical recommendation for max update rate with QSBR Mark E. Dawson, Jr.
     [not found]       ` <208245300.3358603.1469134061370.JavaMail.yahoo@mail.yahoo.com>
2016-07-21 21:58         ` Mathieu Desnoyers
     [not found]         ` <1787099143.77815.1469138309989.JavaMail.zimbra@efficios.com>
2016-07-21 23:14           ` Mark E. Dawson, Jr.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.