All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Document Linux's circular buffering capabilities
@ 2010-03-11 17:20 David Howells
  2010-03-11 18:10 ` Randy Dunlap
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: David Howells @ 2010-03-11 17:20 UTC (permalink / raw)
  To: torvalds, akpm
  Cc: sgruszka, davem, linux-kernel, David Howells, Paul E. McKenney

Document the circular buffering capabilities available in Linux.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 Documentation/circular-buffers.txt |  231 ++++++++++++++++++++++++++++++++++++
 Documentation/memory-barriers.txt  |   20 +++
 include/linux/circ_buf.h           |    4 +
 3 files changed, 255 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/circular-buffers.txt


diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt
new file mode 100644
index 0000000..cde764c
--- /dev/null
+++ b/Documentation/circular-buffers.txt
@@ -0,0 +1,231 @@
+			       ================
+			       CIRCULAR BUFFERS
+			       ================
+
+By: David Howells <dhowells@redhat.com>
+    Paul E. McKenney <paulmck@linux.vnet.ibm.com>
+
+
+Linux provides a number of features that can be used to implement circular
+buffering.  There are two sets of such features:
+
+ (1) Convenience functions for determining information about power-of-2 sized
+     buffers.
+
+ (2) Memory barriers for when the producer and the consumer of objects in the
+     buffer don't want to share a lock.
+
+To use these facilities, as discussed below, there needs to be just one
+producer and just one consumer.  It is possible to handle multiple producers by
+serialising them, and to handle multiple consumers by serialising them.
+
+
+Contents:
+
+ (*) What is a circular buffer?
+
+ (*) Measuring power-of-2 buffers.
+
+ (*) Using memory barriers with circular buffers.
+     - The producer.
+     - The consumer.
+
+
+==========================
+WHAT IS A CIRCULAR BUFFER?
+==========================
+
+First of all, what is a circular buffer?  A circular buffer is a buffer of
+fixed, finite size into which there are two indices:
+
+ (1) A 'head' index - the point at which the producer inserts items into the
+     buffer.
+
+ (2) A 'tail' index - the point at which the consumer finds the next item in
+     the buffer.
+
+Typically when the tail pointer is equal to the head pointer, the buffer is
+empty; and the buffer is full when the head pointer is one less than the tail
+pointer.
+
+The head index is incremented when items are added, and the tail index when
+items are removed.  The tail index should never jump the head index, and both
+indices should be wrapped to 0 when they reach the end of the buffer, thus
+allowing an infinite amount of data to flow through the buffer.
+
+Typically, items will all be of the same unit size, but this isn't strictly
+required to use the techniques below.  The indices can be increased by more
+than 1 if multiple items or variable-sized items are to be included in the
+buffer, provided that neither index overtakes the other.  The implementer must
+be careful, however, as a region more than one unit in size may wrap the end of
+the buffer and be broken into two segments.
+
+
+============================
+MEASURING POWER-OF-2 BUFFERS
+============================
+
+Circular buffers that are of a size that is an exact power of two can have
+their item count and buffer space assessed really quickly through the use of a
+bitwise-AND instruction.  Non-power-of-2 sized buffers must use a modulus
+(divide) instruction instead which is likely to be very slow.
+
+There are a set of macros to do this in Linux, that can be made use of by:
+
+	#include <linux/circ_buf.h>
+
+These are:
+
+ (*) Measure the remaining capacity of a buffer:
+
+	CIRC_SPACE(head_index, tail_index, buffer_size);
+
+     This returns the amount of space left in the buffer[1] into which items can
+     be inserted.
+
+
+ (*) Measure the maximum consecutive immediate space in a buffer:
+
+	CIRC_SPACE_TO_END(head_index, tail_index, buffer_size);
+
+     This returns the amount of consecutive space left in the buffer[1] into which
+     items can be immediately inserted without having to wrap back to the
+     beginning of the buffer.
+
+
+ (*) Measure the occupancy of a buffer:
+
+	CIRC_CNT(head_index, tail_index, buffer_size);
+
+     This returns the number of items currently occupying a buffer.
+
+
+ (*) Measure the non-wrapping occupancy of a buffer:
+
+	CIRC_CNT_TO_END(head_index, tail_index, buffer_size);
+
+     This returns the number of consecutive items that can be extracted from
+     the buffer without having to wrap back to the beginning of the buffer.
+
+
+Each of these macros will nominally return a value between 0 and buffer_size-1,
+however:
+
+ [1] CIRC_SPACE*() are intended to be used in the producer.  To the producer
+     they will return a lower bound as the producer controls the head index,
+     but the consumer may still be depleting the buffer on another CPU and
+     moving the tail index.
+
+     To the consumer it will show an upper bound as the producer may be busy
+     depleting the space.
+
+ [2] CIRC_CNT*() are intended to be used in the consumer.  To the consumer they
+     will return a lower bound as the consumer controls the tail index, but the
+     producer may still be filling the buffer on andother CPU and moving the
+     head index.
+
+     To the producer it will show an upper bound as the consumer may be busy
+     emptying the buffer.
+
+ [3] To a third party, the order in which the writes to the indices by the
+     producer and consumer become visible cannot be guaranteed as they are
+     independent and may be made on different CPUs - so the result in such a
+     situation will merely be a guess, and may even be negative.
+
+
+===========================================
+USING MEMORY BARRIERS WITH CIRCULAR BUFFERS
+===========================================
+
+By using memory barriers in conjunction with circular buffers, you can avoid
+the need to:
+
+ (1) use a single lock to govern access to both ends of the buffer, thus
+     allowing the buffer to be filled and emptied at the same time; and
+
+ (2) use atomic counter operations.
+
+There are two sides to this: the producer that fills the buffer, and the
+consumer that empties it.  Only one thing should be filling a buffer at any one
+time, and only one thing should be emptying a buffer at any one time, but the
+two sides can operate simultaneously.
+
+
+THE PRODUCER
+------------
+
+The producer will look something like this:
+
+	spin_lock(&producer_lock);
+
+	unsigned long head = buffer->head;
+	unsigned long tail = ACCESS_ONCE(buffer->tail);
+
+	if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
+		/* insert one item into the buffer */
+		struct item *item = buffer[head];
+
+		produce_item(item);
+
+		smp_wmb(); /* commit the item before incrementing the head */
+
+		buffer->head = (head + 1) & (buffer->size - 1);
+
+		/* wake_up() will make sure that the head is committed before
+		 * waking anyone up */
+		wake_up(consumer);
+	}
+
+	spin_unlock(&producer_lock);
+
+This will instruct the CPU that the contents of the new item must be written
+before the head index makes it available to the consumer and then instructs the
+CPU that the revised head index must be written before the consumer is woken.
+
+Note that wake_up() doesn't have to be the exact mechanism used, but whatever
+is used must guarantee a (write) memory barrier between the update of the head
+index and the change of state of the consumer, if a change of state occurs.
+
+
+THE CONSUMER
+------------
+
+The consumer will look something like this:
+
+	spin_lock(&consumer_lock);
+
+	unsigned long head = ACCESS_ONCE(buffer->head);
+	unsigned long tail = buffer->tail;
+
+	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
+		/* read index before reading contents at that index */
+		smp_read_barrier_depends();
+
+		/* extract one item from the buffer */
+		struct item *item = buffer[tail];
+
+		consume_item(item);
+
+		smp_mb(); /* finish reading descriptor before incrementing tail */
+
+		buffer->tail = (tail + 1) & (buffer->size - 1);
+	}
+
+	spin_unlock(&consumer_lock);
+
+This will instruct the CPU to make sure the index is up to date before reading
+the new item, and then it shall make sure the CPU has finished reading the item
+before it writes the new tail pointer, which will erase the item.
+
+
+Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
+This prevents the compiler from discarding and reloading its cached value -
+which some compilers will do, even after an implied compiler barrier.
+
+
+===============
+FURTHER READING
+===============
+
+See also Documentation/memory-barriers.txt for a description of Linux's memory
+barrier facilities.
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 7f5809e..631ad2f 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -3,6 +3,7 @@
 			 ============================
 
 By: David Howells <dhowells@redhat.com>
+    Paul E. McKenney <paulmck@linux.vnet.ibm.com>
 
 Contents:
 
@@ -60,6 +61,10 @@ Contents:
 
      - And then there's the Alpha.
 
+ (*) Example uses.
+
+     - Circular buffers.
+
  (*) References.
 
 
@@ -2226,6 +2231,21 @@ The Alpha defines the Linux kernel's memory barrier model.
 See the subsection on "Cache Coherency" above.
 
 
+============
+EXAMPLE USES
+============
+
+CIRCULAR BUFFERS
+----------------
+
+Memory barriers can be used to implement circular buffering without the need
+of a lock to serialise the producer with the consumer.  See:
+
+	Documentation/circular-buffers.txt
+
+for details.
+
+
 ==========
 REFERENCES
 ==========
diff --git a/include/linux/circ_buf.h b/include/linux/circ_buf.h
index a2ed059..90f2471 100644
--- a/include/linux/circ_buf.h
+++ b/include/linux/circ_buf.h
@@ -1,3 +1,7 @@
+/*
+ * See Documentation/circular-buffers.txt for more information.
+ */
+
 #ifndef _LINUX_CIRC_BUF_H
 #define _LINUX_CIRC_BUF_H 1
 


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-11 17:20 [PATCH] Document Linux's circular buffering capabilities David Howells
@ 2010-03-11 18:10 ` Randy Dunlap
  2010-03-11 19:20 ` David Howells
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Randy Dunlap @ 2010-03-11 18:10 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, sgruszka, davem, linux-kernel, Paul E. McKenney

On 03/11/10 09:20, David Howells wrote:
> Document the circular buffering capabilities available in Linux.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
> 
>  Documentation/circular-buffers.txt |  231 ++++++++++++++++++++++++++++++++++++
>  Documentation/memory-barriers.txt  |   20 +++
>  include/linux/circ_buf.h           |    4 +
>  3 files changed, 255 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/circular-buffers.txt
> 
> 
> diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt
> new file mode 100644
> index 0000000..cde764c
> --- /dev/null
> +++ b/Documentation/circular-buffers.txt
> @@ -0,0 +1,231 @@
> +			       ================
> +			       CIRCULAR BUFFERS
> +			       ================
> +
> +By: David Howells <dhowells@redhat.com>
> +    Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> +
> +
> +Linux provides a number of features that can be used to implement circular
> +buffering.  There are two sets of such features:
> +
> + (1) Convenience functions for determining information about power-of-2 sized
> +     buffers.
> +
> + (2) Memory barriers for when the producer and the consumer of objects in the
> +     buffer don't want to share a lock.
> +
> +To use these facilities, as discussed below, there needs to be just one
> +producer and just one consumer.  It is possible to handle multiple producers by
> +serialising them, and to handle multiple consumers by serialising them.
> +
> +
> +Contents:
> +
> + (*) What is a circular buffer?
> +
> + (*) Measuring power-of-2 buffers.
> +
> + (*) Using memory barriers with circular buffers.
> +     - The producer.
> +     - The consumer.
> +
> +
> +==========================
> +WHAT IS A CIRCULAR BUFFER?
> +==========================
> +
> +First of all, what is a circular buffer?  A circular buffer is a buffer of
> +fixed, finite size into which there are two indices:
> +
> + (1) A 'head' index - the point at which the producer inserts items into the
> +     buffer.
> +
> + (2) A 'tail' index - the point at which the consumer finds the next item in
> +     the buffer.
> +
> +Typically when the tail pointer is equal to the head pointer, the buffer is
> +empty; and the buffer is full when the head pointer is one less than the tail
> +pointer.
> +
> +The head index is incremented when items are added, and the tail index when
> +items are removed.  The tail index should never jump the head index, and both
> +indices should be wrapped to 0 when they reach the end of the buffer, thus
> +allowing an infinite amount of data to flow through the buffer.
> +
> +Typically, items will all be of the same unit size, but this isn't strictly
> +required to use the techniques below.  The indices can be increased by more
> +than 1 if multiple items or variable-sized items are to be included in the
> +buffer, provided that neither index overtakes the other.  The implementer must
> +be careful, however, as a region more than one unit in size may wrap the end of
> +the buffer and be broken into two segments.
> +
> +
> +============================
> +MEASURING POWER-OF-2 BUFFERS
> +============================
> +
> +Circular buffers that are of a size that is an exact power of two can have
> +their item count and buffer space assessed really quickly through the use of a
> +bitwise-AND instruction.  Non-power-of-2 sized buffers must use a modulus
> +(divide) instruction instead which is likely to be very slow.
> +
> +There are a set of macros to do this in Linux, that can be made use of by:
> +
> +	#include <linux/circ_buf.h>
> +
> +These are:
> +
> + (*) Measure the remaining capacity of a buffer:
> +
> +	CIRC_SPACE(head_index, tail_index, buffer_size);
> +
> +     This returns the amount of space left in the buffer[1] into which items can
> +     be inserted.
> +
> +
> + (*) Measure the maximum consecutive immediate space in a buffer:
> +
> +	CIRC_SPACE_TO_END(head_index, tail_index, buffer_size);
> +
> +     This returns the amount of consecutive space left in the buffer[1] into which

Do the "[1]" here and in the previous section refer to note [1] below?
If so, then the CIRC_CNT*() paragraphs below could use "[2]" for consistency.

> +     items can be immediately inserted without having to wrap back to the
> +     beginning of the buffer.
> +
> +
> + (*) Measure the occupancy of a buffer:
> +
> +	CIRC_CNT(head_index, tail_index, buffer_size);
> +
> +     This returns the number of items currently occupying a buffer.
> +
> +
> + (*) Measure the non-wrapping occupancy of a buffer:
> +
> +	CIRC_CNT_TO_END(head_index, tail_index, buffer_size);
> +
> +     This returns the number of consecutive items that can be extracted from
> +     the buffer without having to wrap back to the beginning of the buffer.
> +
> +
> +Each of these macros will nominally return a value between 0 and buffer_size-1,
> +however:
> +
> + [1] CIRC_SPACE*() are intended to be used in the producer.  To the producer
> +     they will return a lower bound as the producer controls the head index,
> +     but the consumer may still be depleting the buffer on another CPU and
> +     moving the tail index.
> +
> +     To the consumer it will show an upper bound as the producer may be busy
> +     depleting the space.
> +
> + [2] CIRC_CNT*() are intended to be used in the consumer.  To the consumer they
> +     will return a lower bound as the consumer controls the tail index, but the
> +     producer may still be filling the buffer on andother CPU and moving the
> +     head index.
> +
> +     To the producer it will show an upper bound as the consumer may be busy
> +     emptying the buffer.
> +
> + [3] To a third party, the order in which the writes to the indices by the
> +     producer and consumer become visible cannot be guaranteed as they are
> +     independent and may be made on different CPUs - so the result in such a
> +     situation will merely be a guess, and may even be negative.
> +
> +
> +===========================================
> +USING MEMORY BARRIERS WITH CIRCULAR BUFFERS
> +===========================================
> +
> +By using memory barriers in conjunction with circular buffers, you can avoid
> +the need to:
> +
> + (1) use a single lock to govern access to both ends of the buffer, thus
> +     allowing the buffer to be filled and emptied at the same time; and
> +
> + (2) use atomic counter operations.
> +
> +There are two sides to this: the producer that fills the buffer, and the
> +consumer that empties it.  Only one thing should be filling a buffer at any one
> +time, and only one thing should be emptying a buffer at any one time, but the
> +two sides can operate simultaneously.
> +
> +
> +THE PRODUCER
> +------------
> +
> +The producer will look something like this:

   (requires [or assumes] power-of-2 sized buffers -- same for consumer)

> +
> +	spin_lock(&producer_lock);
> +
> +	unsigned long head = buffer->head;
> +	unsigned long tail = ACCESS_ONCE(buffer->tail);
> +
> +	if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
> +		/* insert one item into the buffer */
> +		struct item *item = buffer[head];
> +
> +		produce_item(item);
> +
> +		smp_wmb(); /* commit the item before incrementing the head */
> +
> +		buffer->head = (head + 1) & (buffer->size - 1);
> +
> +		/* wake_up() will make sure that the head is committed before
> +		 * waking anyone up */
> +		wake_up(consumer);
> +	}
> +
> +	spin_unlock(&producer_lock);
> +
> +This will instruct the CPU that the contents of the new item must be written
> +before the head index makes it available to the consumer and then instructs the
> +CPU that the revised head index must be written before the consumer is woken.
> +
> +Note that wake_up() doesn't have to be the exact mechanism used, but whatever
> +is used must guarantee a (write) memory barrier between the update of the head
> +index and the change of state of the consumer, if a change of state occurs.
> +
> +
> +THE CONSUMER
> +------------
> +
> +The consumer will look something like this:
> +
> +	spin_lock(&consumer_lock);
> +
> +	unsigned long head = ACCESS_ONCE(buffer->head);
> +	unsigned long tail = buffer->tail;
> +
> +	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
> +		/* read index before reading contents at that index */
> +		smp_read_barrier_depends();
> +
> +		/* extract one item from the buffer */
> +		struct item *item = buffer[tail];
> +
> +		consume_item(item);
> +
> +		smp_mb(); /* finish reading descriptor before incrementing tail */
> +
> +		buffer->tail = (tail + 1) & (buffer->size - 1);
> +	}
> +
> +	spin_unlock(&consumer_lock);
> +
> +This will instruct the CPU to make sure the index is up to date before reading
> +the new item, and then it shall make sure the CPU has finished reading the item
> +before it writes the new tail pointer, which will erase the item.
> +
> +
> +Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
> +This prevents the compiler from discarding and reloading its cached value -
> +which some compilers will do, even after an implied compiler barrier.
> +
> +
> +===============
> +FURTHER READING
> +===============
> +
> +See also Documentation/memory-barriers.txt for a description of Linux's memory
> +barrier facilities.


-- 
~Randy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-11 17:20 [PATCH] Document Linux's circular buffering capabilities David Howells
  2010-03-11 18:10 ` Randy Dunlap
@ 2010-03-11 19:20 ` David Howells
  2010-03-12 15:39 ` Stefan Richter
  2010-03-12 23:41 ` David Howells
  3 siblings, 0 replies; 11+ messages in thread
From: David Howells @ 2010-03-11 19:20 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: dhowells, torvalds, akpm, sgruszka, davem, linux-kernel,
	Paul E. McKenney

Randy Dunlap <rdunlap@xenotime.net> wrote:

> Do the "[1]" here and in the previous section refer to note [1] below?
> If so, then the CIRC_CNT*() paragraphs below could use "[2]" for consistency.

Good point.

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-11 17:20 [PATCH] Document Linux's circular buffering capabilities David Howells
  2010-03-11 18:10 ` Randy Dunlap
  2010-03-11 19:20 ` David Howells
@ 2010-03-12 15:39 ` Stefan Richter
  2010-03-12 23:41 ` David Howells
  3 siblings, 0 replies; 11+ messages in thread
From: Stefan Richter @ 2010-03-12 15:39 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, sgruszka, davem, linux-kernel, Paul E. McKenney,
	Randy Dunlap

David Howells wrote:
> +============================
> +MEASURING POWER-OF-2 BUFFERS
> +============================
> +
> +Circular buffers that are of a size that is an exact power of two can have
> +their item count and buffer space assessed really quickly through the use of a
> +bitwise-AND instruction.  Non-power-of-2 sized buffers must use a modulus
> +(divide) instruction instead which is likely to be very slow.
> +
> +There are a set of macros to do this in Linux, that can be made use of by:

"...do this" could be misunderstood as "use a modulus instruction",
although the heading says that this section is about 2^n sized buffers.
How about reversing the leading paragraph?

    Calculating the occupied or free space of a circular buffer involves
    a somewhat slow modulus operation.  But if the buffer size is an
    exact power of 2, a quick bitwise AND can be used instead.

    There is a set of macros which do the latter, that can be made use
    of by [...]

[...]
> + [2] CIRC_CNT*() are intended to be used in the consumer.  To the consumer they
> +     will return a lower bound as the consumer controls the tail index, but the
> +     producer may still be filling the buffer on andother CPU and moving the
> +     head index.
                                                    ^^^^^^^^
                                                    another

[...]
> +The producer will look something like this:
> +
> +	spin_lock(&producer_lock);
> +
> +	unsigned long head = buffer->head;
> +	unsigned long tail = ACCESS_ONCE(buffer->tail);
> +
> +	if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
> +		/* insert one item into the buffer */
> +		struct item *item = buffer[head];
> +
> +		produce_item(item);
> +
> +		smp_wmb(); /* commit the item before incrementing the head */
> +
> +		buffer->head = (head + 1) & (buffer->size - 1);
> +
> +		/* wake_up() will make sure that the head is committed before
> +		 * waking anyone up */
> +		wake_up(consumer);
> +	}
> +
> +	spin_unlock(&producer_lock);
> +
[...]
> +The consumer will look something like this:
> +
> +	spin_lock(&consumer_lock);
> +
> +	unsigned long head = ACCESS_ONCE(buffer->head);
> +	unsigned long tail = buffer->tail;
> +
> +	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
> +		/* read index before reading contents at that index */
> +		smp_read_barrier_depends();
> +
> +		/* extract one item from the buffer */
> +		struct item *item = buffer[tail];
> +
> +		consume_item(item);
> +
> +		smp_mb(); /* finish reading descriptor before incrementing tail */
> +
> +		buffer->tail = (tail + 1) & (buffer->size - 1);
> +	}
> +
> +	spin_unlock(&consumer_lock);
> +
[...]
> +Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
> +This prevents the compiler from discarding and reloading its cached value -
> +which some compilers will do, even after an implied compiler barrier.

I don't understand why ACCESS_ONCE is needed here.  The CIRC_SPACE and
CIRC_CNT macros do not look at head and tail more than once.

CIRC_CNT_TO_END and CIRC_SPACE_TO_END might do that.  (In
CIRC_CNT_TO_END, tail is in danger to be loaded more than once.  In
CIRC_SPACE_TO_END, head is in danger to be loaded more than once.)
-- 
Stefan Richter
-=====-==-=- --== -==--
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-11 17:20 [PATCH] Document Linux's circular buffering capabilities David Howells
                   ` (2 preceding siblings ...)
  2010-03-12 15:39 ` Stefan Richter
@ 2010-03-12 23:41 ` David Howells
  2010-03-13  9:26   ` Stefan Richter
  2010-03-15 12:36   ` David Howells
  3 siblings, 2 replies; 11+ messages in thread
From: David Howells @ 2010-03-12 23:41 UTC (permalink / raw)
  To: Stefan Richter
  Cc: dhowells, torvalds, akpm, sgruszka, davem, linux-kernel,
	Paul E. McKenney, Randy Dunlap

Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> "...do this" could be misunderstood as "use a modulus instruction",
> although the heading says that this section is about 2^n sized buffers.
> How about reversing the leading paragraph?
> 
>     Calculating the occupied or free space of a circular buffer involves
>     a somewhat slow modulus operation.  But if the buffer size is an
>     exact power of 2, a quick bitwise AND can be used instead.
> 
>     There is a set of macros which do the latter, that can be made use
>     of by [...]

How about:

	Calculation of the occupancy or the remaining capacity of an
	arbitrarily sized circular buffer would normally be a slow operation,
	requiring the use of a modulus (divide) instruction.  However, if the
	buffer is of a power-of-2 size, then a much quicker bitwise-AND
	instruction can be used instead.

	Linux provides a set of macros for handling power-of-2 circular
	buffers.  These can be made use of by:
	...

> > +Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
> > +This prevents the compiler from discarding and reloading its cached value -
> > +which some compilers will do, even after an implied compiler barrier.
> 
> I don't understand why ACCESS_ONCE is needed here.  The CIRC_SPACE and
> CIRC_CNT macros do not look at head and tail more than once.

In this example they don't, but say someone wants to read several elements
from the buffer, they might end up accessing their copy of head several times.

If you only access it once anyway, ACCESS_ONCE() shouldn't hurt.

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-12 23:41 ` David Howells
@ 2010-03-13  9:26   ` Stefan Richter
  2010-03-15 12:36   ` David Howells
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Richter @ 2010-03-13  9:26 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, sgruszka, davem, linux-kernel, Paul E. McKenney,
	Randy Dunlap

David Howells wrote:
> Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:
>> "...do this" could be misunderstood as "use a modulus instruction",
...
> How about:
> 
> 	Calculation of the occupancy or the remaining capacity of an
> 	arbitrarily sized circular buffer would normally be a slow operation,
> 	requiring the use of a modulus (divide) instruction.  However, if the
> 	buffer is of a power-of-2 size,
[...]

Yep.

>> I don't understand why ACCESS_ONCE is needed here.  The CIRC_SPACE and
>> CIRC_CNT macros do not look at head and tail more than once.
> 
> In this example they don't, but say someone wants to read several elements
> from the buffer, they might end up accessing their copy of head several times.

Would you agree to add a quick note that these examples are simple
enough to not strictly require ACCESS_ONCE but are meant to show what
more general code would have to do?  Else a reader might be left puzzled
why he can't see in the example code the circumstances which require
ACCESS_ONCE and may remain unsure about where to use it in his own works...

(BTW, good that I came across your documentation posting.  This twist
with possibly multiple loads was not apparent to me too.  I am going to
have to have another look at some driver code with this in mind...)
-- 
Stefan Richter
-=====-==-=- --== -==-=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-12 23:41 ` David Howells
  2010-03-13  9:26   ` Stefan Richter
@ 2010-03-15 12:36   ` David Howells
  2010-03-15 13:43     ` Stefan Richter
  2010-03-15 13:55     ` David Howells
  1 sibling, 2 replies; 11+ messages in thread
From: David Howells @ 2010-03-15 12:36 UTC (permalink / raw)
  To: Stefan Richter
  Cc: dhowells, torvalds, akpm, sgruszka, davem, linux-kernel,
	Paul E. McKenney, Randy Dunlap

Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> Would you agree to add a quick note that these examples are simple
> enough to not strictly require ACCESS_ONCE but are meant to show what
> more general code would have to do?  Else a reader might be left puzzled
> why he can't see in the example code the circumstances which require
> ACCESS_ONCE and may remain unsure about where to use it in his own works...

How about adding a bit:

 Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
 This prevents the compiler from discarding and reloading its cached value -
-which some compilers will do across smp_read_barrier_depends().
+which some compilers will do across smp_read_barrier_depends().  This isn't
+strictly needed if you can be sure that the opposition index will _only_ be
+used the once.

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-15 12:36   ` David Howells
@ 2010-03-15 13:43     ` Stefan Richter
  2010-03-19 17:53       ` Mark Rustad
  2010-03-15 13:55     ` David Howells
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Richter @ 2010-03-15 13:43 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, sgruszka, davem, linux-kernel, Paul E. McKenney,
	Randy Dunlap

David Howells wrote:
> How about adding a bit:
> 
>  Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
>  This prevents the compiler from discarding and reloading its cached value -
> -which some compilers will do across smp_read_barrier_depends().
> +which some compilers will do across smp_read_barrier_depends().  This isn't
> +strictly needed if you can be sure that the opposition index will _only_ be
> +used the once.

Yes, thanks.  (The last "the" is superfluous.)
-- 
Stefan Richter
-=====-==-=- --== -====
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-15 12:36   ` David Howells
  2010-03-15 13:43     ` Stefan Richter
@ 2010-03-15 13:55     ` David Howells
  2010-03-15 14:16       ` Stefan Richter
  1 sibling, 1 reply; 11+ messages in thread
From: David Howells @ 2010-03-15 13:55 UTC (permalink / raw)
  To: Stefan Richter
  Cc: dhowells, torvalds, akpm, sgruszka, davem, linux-kernel,
	Paul E. McKenney, Randy Dunlap

Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> Yes, thanks.  (The last "the" is superfluous.)

Only sort of.  Think of it as being emphasis.

Can I add your Reviewed-by?

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-15 13:55     ` David Howells
@ 2010-03-15 14:16       ` Stefan Richter
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Richter @ 2010-03-15 14:16 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, sgruszka, davem, linux-kernel, Paul E. McKenney,
	Randy Dunlap

David Howells wrote:
> Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:
> 
>> Yes, thanks.  (The last "the" is superfluous.)
> 
> Only sort of.  Think of it as being emphasis.

(OK, I mistook for a typo what was just unfamiliar to me.)

> Can I add your Reviewed-by?

Now that you mention it,
Reviewed-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-- 
Stefan Richter
-=====-==-=- --== -====
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Document Linux's circular buffering capabilities
  2010-03-15 13:43     ` Stefan Richter
@ 2010-03-19 17:53       ` Mark Rustad
  0 siblings, 0 replies; 11+ messages in thread
From: Mark Rustad @ 2010-03-19 17:53 UTC (permalink / raw)
  To: Stefan Richter
  Cc: David Howells, torvalds, akpm, sgruszka, davem, linux-kernel,
	Paul E. McKenney, Randy Dunlap

On Mar 15, 2010, at 8:43 AM, Stefan Richter wrote:

> David Howells wrote:
>> How about adding a bit:
>> 
>> Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
>> This prevents the compiler from discarding and reloading its cached value -
>> -which some compilers will do across smp_read_barrier_depends().
>> +which some compilers will do across smp_read_barrier_depends().  This isn't
>> +strictly needed if you can be sure that the opposition index will _only_ be
>> +used the once.
> 
> Yes, thanks.  (The last "the" is superfluous.)

Maybe it would be clearer to change "the once" to "the one time" or "that one time". "the once" reads too much like an error which may lead to it being "corrected" some day, most likely then losing the intended meaning.

-- 
Mark Rustad, MRustad@gmail.com


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2010-03-19 17:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-11 17:20 [PATCH] Document Linux's circular buffering capabilities David Howells
2010-03-11 18:10 ` Randy Dunlap
2010-03-11 19:20 ` David Howells
2010-03-12 15:39 ` Stefan Richter
2010-03-12 23:41 ` David Howells
2010-03-13  9:26   ` Stefan Richter
2010-03-15 12:36   ` David Howells
2010-03-15 13:43     ` Stefan Richter
2010-03-19 17:53       ` Mark Rustad
2010-03-15 13:55     ` David Howells
2010-03-15 14:16       ` Stefan Richter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.