* [RFC local_t removal V1 0/4] Remove local_t
@ 2010-01-05 22:04 Christoph Lameter
  2010-01-05 22:04 ` [RFC local_t removal V1 1/4] Add add_local() and add_local_return() Christoph Lameter
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-05 22:04 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

Current -next has only the trace subsystem left as a user of local_t.

Tracing uses local_t for per-cpu-safe atomic operations, in the form of
cmpxchg and additions. We already have cmpxchg_local() but no "local"
form of addition.

The patchset introduces a matching local primitive, add_local(), and then
uses cmpxchg_local() and add_local() to remove the local_t usage from the
trace subsystem.

The last patch removes local_t support from the kernel tree.

The support for add_local() is still pretty basic. Fancier inc/dec variants
and further optimizations can be added in the next revision of the
patchset.
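
For illustration, the conversion maps the local_t operations onto a plain
long roughly like this (a sketch only; the real call sites are in patch 2/4):

	/* before: local_t based */
	local_t count;
	local_inc(&count);
	local_add(5, &count);
	old = local_cmpxchg(&count, old, new);

	/* after: plain long, updated with the cpu-local primitives */
	long count;
	add_local(&count, 1);
	add_local(&count, 5);
	old = cmpxchg_local(&count, old, new);

Reads of the counter become plain reads instead of local_read().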



* [RFC local_t removal V1 1/4] Add add_local() and add_local_return()
  2010-01-05 22:04 [RFC local_t removal V1 0/4] Remove local_t Christoph Lameter
@ 2010-01-05 22:04 ` Christoph Lameter
  2010-01-05 22:17   ` Mike Frysinger
  2010-01-05 22:49   ` Mathieu Desnoyers
  2010-01-05 22:04 ` [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem Christoph Lameter
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-05 22:04 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

[-- Attachment #1: remove_local_t_add_local --]
[-- Type: text/plain, Size: 10233 bytes --]

We already have cmpxchg_local(), which can work on an arbitrary type. The
cmpxchg is only safe against concurrent access from the local cpu.

Add another operation that, in a similar fashion, allows "local"-safe adds to
a variable.
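
The intended semantics are as follows (a sketch; the generic fallback below
simply disables interrupts around the add):

	long v = 0, r;

	add_local(&v, 3);		/* v == 3, no return value */
	r = add_return_local(&v, 2);	/* v == 5, r == 5 (the new value) */

Like cmpxchg_local(), both are only safe against concurrent access from the
local cpu.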

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 arch/alpha/include/asm/add-local.h      |    2 +
 arch/arm/include/asm/add-local.h        |    2 +
 arch/avr32/include/asm/add-local.h      |    2 +
 arch/blackfin/include/asm/add-local.h   |    2 +
 arch/cris/include/asm/add-local.h       |    2 +
 arch/frv/include/asm/add-local.h        |    2 +
 arch/h8300/include/asm/add-local.h      |    2 +
 arch/ia64/include/asm/add-local.h       |    2 +
 arch/m32r/include/asm/add-local.h       |    2 +
 arch/m68k/include/asm/add-local.h       |    2 +
 arch/microblaze/include/asm/add-local.h |    2 +
 arch/mips/include/asm/add-local.h       |    2 +
 arch/mn10300/include/asm/add-local.h    |    2 +
 arch/parisc/include/asm/add-local.h     |    2 +
 arch/powerpc/include/asm/add-local.h    |    2 +
 arch/s390/include/asm/add-local.h       |    2 +
 arch/score/include/asm/add-local.h      |    2 +
 arch/sh/include/asm/add-local.h         |    2 +
 arch/sparc/include/asm/add-local.h      |    2 +
 arch/um/include/asm/add-local.h         |    2 +
 arch/x86/include/asm/add-local.h        |    2 +
 arch/xtensa/include/asm/add-local.h     |    2 +
 include/asm-generic/add-local-generic.h |   40 ++++++++++++++++++++++++++++++++
 include/asm-generic/add-local.h         |   13 ++++++++++
 24 files changed, 97 insertions(+)

Index: linux-2.6/include/asm-generic/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/include/asm-generic/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,13 @@
+#ifndef __ASM_GENERIC_ADD_LOCAL_H
+#define __ASM_GENERIC_ADD_LOCAL_H
+
+#include <asm-generic/add-local-generic.h>
+
+#define add_return_local(ptr, v)				  	\
+	((__typeof__(*(ptr)))__add_return_local_generic((ptr), 		\
+			(unsigned long)(v), sizeof(*(ptr))))
+
+#define add_local(ptr, v) (void)__add_return_local_generic((ptr),	\
+			(unsigned long)(v), sizeof(*(ptr)))
+
+#endif
Index: linux-2.6/arch/alpha/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/alpha/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/arm/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/arm/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/avr32/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/avr32/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/blackfin/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/blackfin/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/cris/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/cris/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/frv/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/frv/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/h8300/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/h8300/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/ia64/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/ia64/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/m32r/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/m32r/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/m68k/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/m68k/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/microblaze/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/microblaze/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/mips/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/mips/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/mn10300/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/mn10300/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/parisc/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/parisc/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/powerpc/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/powerpc/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/s390/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/s390/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/score/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/score/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/sh/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/sh/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/sparc/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/sparc/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/um/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/um/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/x86/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/x86/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/arch/xtensa/include/asm/add-local.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/arch/xtensa/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,2 @@
+#include <asm-generic/add-local.h>
+
Index: linux-2.6/include/asm-generic/add-local-generic.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/include/asm-generic/add-local-generic.h	2010-01-05 15:36:02.000000000 -0600
@@ -0,0 +1,40 @@
+#ifndef __ASM_GENERIC_ADD_LOCAL_GENERIC_H
+#define __ASM_GENERIC_ADD_LOCAL_GENERIC_H
+
+#include <linux/types.h>
+
+extern unsigned long wrong_size_add_local(volatile void *ptr);
+
+/*
+ * Generic version of __add_return_local (disables interrupts). Takes an
+ * unsigned long parameter, supporting various types of architectures.
+ */
+static inline unsigned long __add_return_local_generic(volatile void *ptr,
+		unsigned long value, int size)
+{
+	unsigned long flags, r;
+
+	/*
+	 * Sanity checking, compile-time.
+	 */
+	if (size == 8 && sizeof(unsigned long) != 8)
+		wrong_size_add_local(ptr);
+
+	local_irq_save(flags);
+	switch (size) {
+	case 1: r = (*((u8 *)ptr) += value);
+		break;
+	case 2: r = (*((u16 *)ptr) += value);
+		break;
+	case 4: r = (*((u32 *)ptr) += value);
+		break;
+	case 8: r = (*((u64 *)ptr) += value);
+		break;
+	default:
+		wrong_size_add_local(ptr);
+	}
+	local_irq_restore(flags);
+	return r;
+}
+
+#endif

-- 


* [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem
  2010-01-05 22:04 [RFC local_t removal V1 0/4] Remove local_t Christoph Lameter
  2010-01-05 22:04 ` [RFC local_t removal V1 1/4] Add add_local() and add_local_return() Christoph Lameter
@ 2010-01-05 22:04 ` Christoph Lameter
  2010-01-05 22:57   ` Mathieu Desnoyers
  2010-01-14  2:56   ` Steven Rostedt
  2010-01-05 22:04 ` [RFC local_t removal V1 3/4] Optimized add_local() Christoph Lameter
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-05 22:04 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

[-- Attachment #1: remove_local_t_ringbuffer_convert --]
[-- Type: text/plain, Size: 16796 bytes --]

Replace the local_t usage in the trace subsystem with plain longs. The longs
are then updated with the required level of concurrency protection through
cmpxchg_local() and add_local().
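
As a concrete example of the pattern used throughout (a sketch), the write
index of a buffer page changes from

	local_t write;
	w = local_add_return(length, &page->write);

to

	long write;
	w = add_return_local(&page->write, length);

Note the swapped argument order: the local_* operations take (value, pointer)
while the new primitives take (pointer, value). Plain reads such as
local_read(&page->write) become direct reads of the long.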

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 kernel/trace/ring_buffer.c           |  150 +++++++++++++++++------------------
 kernel/trace/ring_buffer_benchmark.c |    5 -
 kernel/trace/trace_branch.c          |    1 
 3 files changed, 77 insertions(+), 79 deletions(-)

Index: linux-2.6/kernel/trace/ring_buffer.c
===================================================================
--- linux-2.6.orig/kernel/trace/ring_buffer.c	2010-01-05 15:35:35.000000000 -0600
+++ linux-2.6/kernel/trace/ring_buffer.c	2010-01-05 15:36:07.000000000 -0600
@@ -20,7 +20,7 @@
 #include <linux/cpu.h>
 #include <linux/fs.h>
 
-#include <asm/local.h>
+#include <asm/add-local.h>
 #include "trace.h"
 
 /*
@@ -312,7 +312,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data
 
 struct buffer_data_page {
 	u64		 time_stamp;	/* page time stamp */
-	local_t		 commit;	/* write committed index */
+	long		 commit;	/* write committed index */
 	unsigned char	 data[];	/* data of buffer page */
 };
 
@@ -326,9 +326,9 @@ struct buffer_data_page {
  */
 struct buffer_page {
 	struct list_head list;		/* list of buffer pages */
-	local_t		 write;		/* index for next write */
+	long		 write;		/* index for next write */
 	unsigned	 read;		/* index for next read */
-	local_t		 entries;	/* entries on this page */
+	long		 entries;	/* entries on this page */
 	struct buffer_data_page *page;	/* Actual data page */
 };
 
@@ -349,7 +349,7 @@ struct buffer_page {
 
 static void rb_init_page(struct buffer_data_page *bpage)
 {
-	local_set(&bpage->commit, 0);
+	bpage->commit = 0;
 }
 
 /**
@@ -360,7 +360,7 @@ static void rb_init_page(struct buffer_d
  */
 size_t ring_buffer_page_len(void *page)
 {
-	return local_read(&((struct buffer_data_page *)page)->commit)
+	return ((struct buffer_data_page *)page)->commit
 		+ BUF_PAGE_HDR_SIZE;
 }
 
@@ -431,11 +431,11 @@ struct ring_buffer_per_cpu {
 	struct buffer_page		*tail_page;	/* write to tail */
 	struct buffer_page		*commit_page;	/* committed pages */
 	struct buffer_page		*reader_page;
-	local_t				commit_overrun;
-	local_t				overrun;
-	local_t				entries;
-	local_t				committing;
-	local_t				commits;
+	long				commit_overrun;
+	long				overrun;
+	long				entries;
+	long				committing;
+	long				commits;
 	unsigned long			read;
 	u64				write_stamp;
 	u64				read_stamp;
@@ -824,8 +824,8 @@ static int rb_tail_page_update(struct ri
 	 *
 	 * We add a counter to the write field to denote this.
 	 */
-	old_write = local_add_return(RB_WRITE_INTCNT, &next_page->write);
-	old_entries = local_add_return(RB_WRITE_INTCNT, &next_page->entries);
+	old_write = add_return_local(&next_page->write, RB_WRITE_INTCNT);
+	old_entries = add_return_local(&next_page->entries, RB_WRITE_INTCNT);
 
 	/*
 	 * Just make sure we have seen our old_write and synchronize
@@ -853,15 +853,15 @@ static int rb_tail_page_update(struct ri
 		 * cmpxchg to only update if an interrupt did not already
 		 * do it for us. If the cmpxchg fails, we don't care.
 		 */
-		(void)local_cmpxchg(&next_page->write, old_write, val);
-		(void)local_cmpxchg(&next_page->entries, old_entries, eval);
+		(void)cmpxchg_local(&next_page->write, old_write, val);
+		(void)cmpxchg_local(&next_page->entries, old_entries, eval);
 
 		/*
 		 * No need to worry about races with clearing out the commit.
 		 * it only can increment when a commit takes place. But that
 		 * only happens in the outer most nested commit.
 		 */
-		local_set(&next_page->page->commit, 0);
+		next_page->page->commit = 0;
 
 		old_tail = cmpxchg(&cpu_buffer->tail_page,
 				   tail_page, next_page);
@@ -1394,17 +1394,17 @@ rb_iter_head_event(struct ring_buffer_it
 
 static inline unsigned long rb_page_write(struct buffer_page *bpage)
 {
-	return local_read(&bpage->write) & RB_WRITE_MASK;
+	return bpage->write & RB_WRITE_MASK;
 }
 
 static inline unsigned rb_page_commit(struct buffer_page *bpage)
 {
-	return local_read(&bpage->page->commit);
+	return bpage->page->commit;
 }
 
 static inline unsigned long rb_page_entries(struct buffer_page *bpage)
 {
-	return local_read(&bpage->entries) & RB_WRITE_MASK;
+	return bpage->entries & RB_WRITE_MASK;
 }
 
 /* Size is determined by what has been commited */
@@ -1463,8 +1463,8 @@ rb_set_commit_to_write(struct ring_buffe
 		if (RB_WARN_ON(cpu_buffer,
 			       rb_is_reader_page(cpu_buffer->tail_page)))
 			return;
-		local_set(&cpu_buffer->commit_page->page->commit,
-			  rb_page_write(cpu_buffer->commit_page));
+		cpu_buffer->commit_page->page->commit =
+			  rb_page_write(cpu_buffer->commit_page);
 		rb_inc_page(cpu_buffer, &cpu_buffer->commit_page);
 		cpu_buffer->write_stamp =
 			cpu_buffer->commit_page->page->time_stamp;
@@ -1474,10 +1474,10 @@ rb_set_commit_to_write(struct ring_buffe
 	while (rb_commit_index(cpu_buffer) !=
 	       rb_page_write(cpu_buffer->commit_page)) {
 
-		local_set(&cpu_buffer->commit_page->page->commit,
-			  rb_page_write(cpu_buffer->commit_page));
+		cpu_buffer->commit_page->page->commit =
+			  rb_page_write(cpu_buffer->commit_page);
 		RB_WARN_ON(cpu_buffer,
-			   local_read(&cpu_buffer->commit_page->page->commit) &
+			   cpu_buffer->commit_page->page->commit &
 			   ~RB_WRITE_MASK);
 		barrier();
 	}
@@ -1600,7 +1600,7 @@ rb_handle_head_page(struct ring_buffer_p
 		 * it is our responsibility to update
 		 * the counters.
 		 */
-		local_add(entries, &cpu_buffer->overrun);
+		add_local(&cpu_buffer->overrun, entries);
 
 		/*
 		 * The entries will be zeroed out when we move the
@@ -1741,7 +1741,7 @@ rb_reset_tail(struct ring_buffer_per_cpu
 	 * must fill the old tail_page with padding.
 	 */
 	if (tail >= BUF_PAGE_SIZE) {
-		local_sub(length, &tail_page->write);
+		add_local(&tail_page->write, -length);
 		return;
 	}
 
@@ -1766,7 +1766,7 @@ rb_reset_tail(struct ring_buffer_per_cpu
 		rb_event_set_padding(event);
 
 		/* Set the write back to the previous setting */
-		local_sub(length, &tail_page->write);
+		add_local(&tail_page->write, -length);
 		return;
 	}
 
@@ -1778,7 +1778,7 @@ rb_reset_tail(struct ring_buffer_per_cpu
 
 	/* Set write to end of buffer */
 	length = (tail + length) - BUF_PAGE_SIZE;
-	local_sub(length, &tail_page->write);
+	add_local(&tail_page->write, -length);
 }
 
 static struct ring_buffer_event *
@@ -1801,7 +1801,7 @@ rb_move_tail(struct ring_buffer_per_cpu 
 	 * about it.
 	 */
 	if (unlikely(next_page == commit_page)) {
-		local_inc(&cpu_buffer->commit_overrun);
+		add_local(&cpu_buffer->commit_overrun, 1);
 		goto out_reset;
 	}
 
@@ -1855,7 +1855,7 @@ rb_move_tail(struct ring_buffer_per_cpu 
 				      cpu_buffer->tail_page) &&
 				     (cpu_buffer->commit_page ==
 				      cpu_buffer->reader_page))) {
-				local_inc(&cpu_buffer->commit_overrun);
+				add_local(&cpu_buffer->commit_overrun, 1);
 				goto out_reset;
 			}
 		}
@@ -1894,7 +1894,7 @@ __rb_reserve_next(struct ring_buffer_per
 	unsigned long tail, write;
 
 	tail_page = cpu_buffer->tail_page;
-	write = local_add_return(length, &tail_page->write);
+	write = add_return_local(&tail_page->write, length);
 
 	/* set write to only the index of the write */
 	write &= RB_WRITE_MASK;
@@ -1913,7 +1913,7 @@ __rb_reserve_next(struct ring_buffer_per
 
 	/* The passed in type is zero for DATA */
 	if (likely(!type))
-		local_inc(&tail_page->entries);
+		add_local(&tail_page->entries, 1);
 
 	/*
 	 * If this is the first commit on the page, then update
@@ -1943,7 +1943,7 @@ rb_try_to_discard(struct ring_buffer_per
 
 	if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
 		unsigned long write_mask =
-			local_read(&bpage->write) & ~RB_WRITE_MASK;
+			bpage->write & ~RB_WRITE_MASK;
 		/*
 		 * This is on the tail page. It is possible that
 		 * a write could come in and move the tail page
@@ -1952,7 +1952,7 @@ rb_try_to_discard(struct ring_buffer_per
 		 */
 		old_index += write_mask;
 		new_index += write_mask;
-		index = local_cmpxchg(&bpage->write, old_index, new_index);
+		index = cmpxchg_local(&bpage->write, old_index, new_index);
 		if (index == old_index)
 			return 1;
 	}
@@ -2030,8 +2030,8 @@ rb_add_time_stamp(struct ring_buffer_per
 
 static void rb_start_commit(struct ring_buffer_per_cpu *cpu_buffer)
 {
-	local_inc(&cpu_buffer->committing);
-	local_inc(&cpu_buffer->commits);
+	add_local(&cpu_buffer->committing, 1);
+	add_local(&cpu_buffer->commits, 1);
 }
 
 static void rb_end_commit(struct ring_buffer_per_cpu *cpu_buffer)
@@ -2039,17 +2039,17 @@ static void rb_end_commit(struct ring_bu
 	unsigned long commits;
 
 	if (RB_WARN_ON(cpu_buffer,
-		       !local_read(&cpu_buffer->committing)))
+		       !cpu_buffer->committing))
 		return;
 
  again:
-	commits = local_read(&cpu_buffer->commits);
+	commits = cpu_buffer->commits;
 	/* synchronize with interrupts */
 	barrier();
-	if (local_read(&cpu_buffer->committing) == 1)
+	if (cpu_buffer->committing == 1)
 		rb_set_commit_to_write(cpu_buffer);
 
-	local_dec(&cpu_buffer->committing);
+	add_local(&cpu_buffer->committing, -1);
 
 	/* synchronize with interrupts */
 	barrier();
@@ -2059,9 +2059,9 @@ static void rb_end_commit(struct ring_bu
 	 * updating of the commit page and the clearing of the
 	 * committing counter.
 	 */
-	if (unlikely(local_read(&cpu_buffer->commits) != commits) &&
-	    !local_read(&cpu_buffer->committing)) {
-		local_inc(&cpu_buffer->committing);
+	if (unlikely(cpu_buffer->commits != commits) &&
+	    !cpu_buffer->committing) {
+		add_local(&cpu_buffer->committing, 1);
 		goto again;
 	}
 }
@@ -2087,8 +2087,8 @@ rb_reserve_next_event(struct ring_buffer
 	 */
 	barrier();
 	if (unlikely(ACCESS_ONCE(cpu_buffer->buffer) != buffer)) {
-		local_dec(&cpu_buffer->committing);
-		local_dec(&cpu_buffer->commits);
+		add_local(&cpu_buffer->committing, -1);
+		add_local(&cpu_buffer->commits, -1);
 		return NULL;
 	}
 #endif
@@ -2291,7 +2291,7 @@ rb_update_write_stamp(struct ring_buffer
 static void rb_commit(struct ring_buffer_per_cpu *cpu_buffer,
 		      struct ring_buffer_event *event)
 {
-	local_inc(&cpu_buffer->entries);
+	add_local(&cpu_buffer->entries, 1);
 	rb_update_write_stamp(cpu_buffer, event);
 	rb_end_commit(cpu_buffer);
 }
@@ -2357,7 +2357,7 @@ rb_decrement_entry(struct ring_buffer_pe
 
 	/* Do the likely case first */
 	if (likely(bpage->page == (void *)addr)) {
-		local_dec(&bpage->entries);
+		add_local(&bpage->entries, -1);
 		return;
 	}
 
@@ -2369,7 +2369,7 @@ rb_decrement_entry(struct ring_buffer_pe
 	start = bpage;
 	do {
 		if (bpage->page == (void *)addr) {
-			local_dec(&bpage->entries);
+			add_local(&bpage->entries, -1);
 			return;
 		}
 		rb_inc_page(cpu_buffer, &bpage);
@@ -2415,7 +2415,7 @@ void ring_buffer_discard_commit(struct r
 	 * committed yet. Thus we can assume that preemption
 	 * is still disabled.
 	 */
-	RB_WARN_ON(buffer, !local_read(&cpu_buffer->committing));
+	RB_WARN_ON(buffer, !cpu_buffer->committing);
 
 	rb_decrement_entry(cpu_buffer, event);
 	if (rb_try_to_discard(cpu_buffer, event))
@@ -2604,7 +2604,7 @@ unsigned long ring_buffer_entries_cpu(st
 		return 0;
 
 	cpu_buffer = buffer->buffers[cpu];
-	ret = (local_read(&cpu_buffer->entries) - local_read(&cpu_buffer->overrun))
+	ret = cpu_buffer->entries - cpu_buffer->overrun
 		- cpu_buffer->read;
 
 	return ret;
@@ -2625,7 +2625,7 @@ unsigned long ring_buffer_overrun_cpu(st
 		return 0;
 
 	cpu_buffer = buffer->buffers[cpu];
-	ret = local_read(&cpu_buffer->overrun);
+	ret = cpu_buffer->overrun;
 
 	return ret;
 }
@@ -2646,7 +2646,7 @@ ring_buffer_commit_overrun_cpu(struct ri
 		return 0;
 
 	cpu_buffer = buffer->buffers[cpu];
-	ret = local_read(&cpu_buffer->commit_overrun);
+	ret = cpu_buffer->commit_overrun;
 
 	return ret;
 }
@@ -2668,8 +2668,8 @@ unsigned long ring_buffer_entries(struct
 	/* if you care about this being correct, lock the buffer */
 	for_each_buffer_cpu(buffer, cpu) {
 		cpu_buffer = buffer->buffers[cpu];
-		entries += (local_read(&cpu_buffer->entries) -
-			    local_read(&cpu_buffer->overrun)) - cpu_buffer->read;
+		entries += cpu_buffer->entries -
+			    cpu_buffer->overrun - cpu_buffer->read;
 	}
 
 	return entries;
@@ -2692,7 +2692,7 @@ unsigned long ring_buffer_overruns(struc
 	/* if you care about this being correct, lock the buffer */
 	for_each_buffer_cpu(buffer, cpu) {
 		cpu_buffer = buffer->buffers[cpu];
-		overruns += local_read(&cpu_buffer->overrun);
+		overruns += cpu_buffer->overrun;
 	}
 
 	return overruns;
@@ -2861,9 +2861,9 @@ rb_get_reader_page(struct ring_buffer_pe
 	/*
 	 * Reset the reader page to size zero.
 	 */
-	local_set(&cpu_buffer->reader_page->write, 0);
-	local_set(&cpu_buffer->reader_page->entries, 0);
-	local_set(&cpu_buffer->reader_page->page->commit, 0);
+	cpu_buffer->reader_page->write = 0;
+	cpu_buffer->reader_page->entries = 0;
+	cpu_buffer->reader_page->page->commit = 0;
 
  spin:
 	/*
@@ -3354,9 +3354,9 @@ rb_reset_cpu(struct ring_buffer_per_cpu 
 
 	cpu_buffer->head_page
 		= list_entry(cpu_buffer->pages, struct buffer_page, list);
-	local_set(&cpu_buffer->head_page->write, 0);
-	local_set(&cpu_buffer->head_page->entries, 0);
-	local_set(&cpu_buffer->head_page->page->commit, 0);
+	cpu_buffer->head_page->write = 0;
+	cpu_buffer->head_page->entries = 0;
+	cpu_buffer->head_page->page->commit = 0;
 
 	cpu_buffer->head_page->read = 0;
 
@@ -3364,16 +3364,16 @@ rb_reset_cpu(struct ring_buffer_per_cpu 
 	cpu_buffer->commit_page = cpu_buffer->head_page;
 
 	INIT_LIST_HEAD(&cpu_buffer->reader_page->list);
-	local_set(&cpu_buffer->reader_page->write, 0);
-	local_set(&cpu_buffer->reader_page->entries, 0);
-	local_set(&cpu_buffer->reader_page->page->commit, 0);
+	cpu_buffer->reader_page->write = 0;
+	cpu_buffer->reader_page->entries = 0;
+	cpu_buffer->reader_page->page->commit = 0;
 	cpu_buffer->reader_page->read = 0;
 
-	local_set(&cpu_buffer->commit_overrun, 0);
-	local_set(&cpu_buffer->overrun, 0);
-	local_set(&cpu_buffer->entries, 0);
-	local_set(&cpu_buffer->committing, 0);
-	local_set(&cpu_buffer->commits, 0);
+	cpu_buffer->commit_overrun = 0;
+	cpu_buffer->overrun = 0;
+	cpu_buffer->entries = 0;
+	cpu_buffer->committing = 0;
+	cpu_buffer->commits = 0;
 	cpu_buffer->read = 0;
 
 	cpu_buffer->write_stamp = 0;
@@ -3399,7 +3399,7 @@ void ring_buffer_reset_cpu(struct ring_b
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
 
-	if (RB_WARN_ON(cpu_buffer, local_read(&cpu_buffer->committing)))
+	if (RB_WARN_ON(cpu_buffer, cpu_buffer->committing))
 		goto out;
 
 	arch_spin_lock(&cpu_buffer->lock);
@@ -3547,9 +3547,9 @@ int ring_buffer_swap_cpu(struct ring_buf
 	atomic_inc(&cpu_buffer_b->record_disabled);
 
 	ret = -EBUSY;
-	if (local_read(&cpu_buffer_a->committing))
+	if (cpu_buffer_a->committing)
 		goto out_dec;
-	if (local_read(&cpu_buffer_b->committing))
+	if (cpu_buffer_b->committing)
 		goto out_dec;
 
 	buffer_a->buffers[cpu] = cpu_buffer_b;
@@ -3733,7 +3733,7 @@ int ring_buffer_read_page(struct ring_bu
 		} while (len > size);
 
 		/* update bpage */
-		local_set(&bpage->commit, pos);
+		bpage->commit = pos;
 		bpage->time_stamp = save_timestamp;
 
 		/* we copied everything to the beginning */
@@ -3746,8 +3746,8 @@ int ring_buffer_read_page(struct ring_bu
 		rb_init_page(bpage);
 		bpage = reader->page;
 		reader->page = *data_page;
-		local_set(&reader->write, 0);
-		local_set(&reader->entries, 0);
+		reader->write = 0;
+		reader->entries = 0;
 		reader->read = 0;
 		*data_page = bpage;
 	}
Index: linux-2.6/kernel/trace/ring_buffer_benchmark.c
===================================================================
--- linux-2.6.orig/kernel/trace/ring_buffer_benchmark.c	2010-01-05 15:35:35.000000000 -0600
+++ linux-2.6/kernel/trace/ring_buffer_benchmark.c	2010-01-05 15:38:23.000000000 -0600
@@ -8,11 +8,10 @@
 #include <linux/kthread.h>
 #include <linux/module.h>
 #include <linux/time.h>
-#include <asm/local.h>
 
 struct rb_page {
 	u64		ts;
-	local_t		commit;
+	long		commit;
 	char		data[4080];
 };
 
@@ -113,7 +112,7 @@ static enum event_status read_page(int c
 	ret = ring_buffer_read_page(buffer, &bpage, PAGE_SIZE, cpu, 1);
 	if (ret >= 0) {
 		rpage = bpage;
-		commit = local_read(&rpage->commit);
+		commit = rpage->commit;
 		for (i = 0; i < commit && !kill_test; i += inc) {
 
 			if (i >= (PAGE_SIZE - offsetof(struct rb_page, data))) {
Index: linux-2.6/kernel/trace/trace_branch.c
===================================================================
--- linux-2.6.orig/kernel/trace/trace_branch.c	2010-01-05 15:38:39.000000000 -0600
+++ linux-2.6/kernel/trace/trace_branch.c	2010-01-05 15:38:45.000000000 -0600
@@ -13,7 +13,6 @@
 #include <linux/ftrace.h>
 #include <linux/hash.h>
 #include <linux/fs.h>
-#include <asm/local.h>
 
 #include "trace.h"
 #include "trace_stat.h"

-- 


* [RFC local_t removal V1 3/4] Optimized add_local()
  2010-01-05 22:04 [RFC local_t removal V1 0/4] Remove local_t Christoph Lameter
  2010-01-05 22:04 ` [RFC local_t removal V1 1/4] Add add_local() and add_local_return() Christoph Lameter
  2010-01-05 22:04 ` [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem Christoph Lameter
@ 2010-01-05 22:04 ` Christoph Lameter
  2010-01-05 22:59   ` Mathieu Desnoyers
  2010-01-05 22:04 ` [RFC local_t removal V1 4/4] Remove local_t support Christoph Lameter
  2010-01-05 22:23 ` [RFC local_t removal V1 0/4] Remove local_t Mathieu Desnoyers
  4 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2010-01-05 22:04 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

[-- Attachment #1: remove_local_t_add_local_x86 --]
[-- Type: text/plain, Size: 1869 bytes --]

Use XADD to implement add_local().
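
For reference (my reading of the code below, not part of the patch): xadd
exchanges the register operand with the memory operand and stores the sum,
so the old value comes back in r and the helper returns r + value, i.e. the
new value. Since it is a single read-modify-write instruction it cannot be
torn by an interrupt on this cpu, which is all the "local" semantics
require, so no lock prefix is needed. Roughly, for the 4-byte case:

	/* what xaddl amounts to for size == 4 (sketch, not the actual code) */
	u32 old = *(u32 *)ptr;		/* old value returned in the register */
	*(u32 *)ptr = old + value;	/* sum stored back, in the same insn */
	return old + value;		/* add_return_local() yields the new value */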

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 arch/x86/include/asm/add-local.h |   56 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/include/asm/add-local.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/add-local.h	2010-01-05 15:29:11.000000000 -0600
+++ linux-2.6/arch/x86/include/asm/add-local.h	2010-01-05 15:33:59.000000000 -0600
@@ -1,2 +1,56 @@
-#include <asm-generic/add-local.h>
+#ifndef __ASM_X86_ADD_LOCAL_H
+#define __ASM_X86_ADD_LOCAL_H
+
+#include <linux/types.h>
+#include <asm-generic/add-local-generic.h>
+
+static inline unsigned long __add_return_local(volatile void *ptr,
+		unsigned long value, int size)
+{
+	unsigned long r;
+
+#ifdef CONFIG_M386
+	if (unlikely(boot_cpu_data.x86 <= 3))
+		return __add_return_local_generic(ptr, value, size);
+#endif
+
+	/*
+	 * Sanity checking, compile-time.
+	 */
+	if (size == 8 && sizeof(unsigned long) != 8)
+		wrong_size_add_local(ptr);
+
+	r = value;
+	switch (size) {
+	case 1:
+		asm volatile("xaddb %0, %1;": "+r" (r), "+m" (*((u8 *)ptr)):
+			 : "memory");
+		break;
+	case 2:
+		asm volatile("xaddw %0, %1;": "+r" (r), "+m" (*((u16 *)ptr)):
+			 : "memory");
+		break;
+	case 4:
+		asm volatile("xaddl %0, %1;": "+r" (r), "+m" (*((u32 *)ptr)):
+			 : "memory");
+		break;
+	case 8:
+		asm volatile("xaddq %0, %1;": "+r" (r), "+m" (*((u64 *)ptr)):
+			 : "memory");
+		break;
+	default:
+		wrong_size_add_local(ptr);
+	}
+	return r + value;
+}
+
+#define add_return_local(ptr, v)				  	\
+	((__typeof__(*(ptr)))__add_return_local((ptr), (unsigned long)(v), \
+			sizeof(*(ptr))))
+
+#define add_local(ptr, v) (void)__add_return_local((ptr), (unsigned long)(v), \
+			sizeof(*(ptr)))
+
+
+#endif
 

-- 


* [RFC local_t removal V1 4/4] Remove local_t support
  2010-01-05 22:04 [RFC local_t removal V1 0/4] Remove local_t Christoph Lameter
                   ` (2 preceding siblings ...)
  2010-01-05 22:04 ` [RFC local_t removal V1 3/4] Optimized add_local() Christoph Lameter
@ 2010-01-05 22:04 ` Christoph Lameter
  2010-01-05 22:23 ` [RFC local_t removal V1 0/4] Remove local_t Mathieu Desnoyers
  4 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-05 22:04 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

[-- Attachment #1: remove_local_t_actual_remove --]
[-- Type: text/plain, Size: 46853 bytes --]

No user of local_t remains.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 Documentation/DocBook/kernel-locking.tmpl |    8 
 Documentation/atomic_ops.txt              |    7 
 Documentation/local_ops.txt               |  186 ----------------
 arch/alpha/include/asm/local.h            |  101 --------
 arch/arm/include/asm/local.h              |    1 
 arch/avr32/include/asm/local.h            |    6 
 arch/blackfin/include/asm/local.h         |    1 
 arch/cris/include/asm/local.h             |    1 
 arch/frv/include/asm/local.h              |    6 
 arch/frv/kernel/local.h                   |   59 -----
 arch/h8300/include/asm/local.h            |    6 
 arch/ia64/include/asm/local.h             |    1 
 arch/m32r/include/asm/local.h             |  341 ------------------------------
 arch/m68k/include/asm/local.h             |    6 
 arch/microblaze/include/asm/local.h       |    1 
 arch/mips/include/asm/local.h             |  196 -----------------
 arch/mn10300/include/asm/local.h          |    1 
 arch/parisc/include/asm/local.h           |    1 
 arch/powerpc/include/asm/local.h          |  175 ---------------
 arch/s390/include/asm/local.h             |    1 
 arch/score/include/asm/local.h            |    6 
 arch/sh/include/asm/local.h               |    7 
 arch/sparc/include/asm/local.h            |    6 
 arch/x86/include/asm/local.h              |  198 -----------------
 arch/xtensa/include/asm/local.h           |   16 -
 include/asm-generic/local.h               |   55 ----
 26 files changed, 4 insertions(+), 1389 deletions(-)

Index: linux-2.6/arch/frv/kernel/local.h
===================================================================
--- linux-2.6.orig/arch/frv/kernel/local.h	2009-11-13 09:33:01.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,59 +0,0 @@
-/* local.h: local definitions
- *
- * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-
-#ifndef _FRV_LOCAL_H
-#define _FRV_LOCAL_H
-
-#include <asm/sections.h>
-
-#ifndef __ASSEMBLY__
-
-/* dma.c */
-extern unsigned long frv_dma_inprogress;
-
-extern void frv_dma_pause_all(void);
-extern void frv_dma_resume_all(void);
-
-/* sleep.S */
-extern asmlinkage void frv_cpu_suspend(unsigned long);
-extern asmlinkage void frv_cpu_core_sleep(void);
-
-/* setup.c */
-extern unsigned long __nongprelbss pdm_suspend_mode;
-extern void determine_clocks(int verbose);
-extern int __nongprelbss clock_p0_current;
-extern int __nongprelbss clock_cm_current;
-extern int __nongprelbss clock_cmode_current;
-
-#ifdef CONFIG_PM
-extern int __nongprelbss clock_cmodes_permitted;
-extern unsigned long __nongprelbss clock_bits_settable;
-#define CLOCK_BIT_CM		0x0000000f
-#define CLOCK_BIT_CM_H		0x00000001	/* CLKC.CM can be set to 0 */
-#define CLOCK_BIT_CM_M		0x00000002	/* CLKC.CM can be set to 1 */
-#define CLOCK_BIT_CM_L		0x00000004	/* CLKC.CM can be set to 2 */
-#define CLOCK_BIT_P0		0x00000010	/* CLKC.P0 can be changed */
-#define CLOCK_BIT_CMODE		0x00000020	/* CLKC.CMODE can be changed */
-
-extern void (*__power_switch_wake_setup)(void);
-extern int  (*__power_switch_wake_check)(void);
-extern void (*__power_switch_wake_cleanup)(void);
-#endif
-
-/* time.c */
-extern void time_divisor_init(void);
-
-/* cmode.S */
-extern asmlinkage void frv_change_cmode(int);
-
-
-#endif /* __ASSEMBLY__ */
-#endif /* _FRV_LOCAL_H */
Index: linux-2.6/arch/alpha/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/alpha/include/asm/local.h	2010-01-05 15:35:34.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,101 +0,0 @@
-#ifndef _ALPHA_LOCAL_H
-#define _ALPHA_LOCAL_H
-
-#include <linux/percpu.h>
-#include <asm/atomic.h>
-
-typedef struct
-{
-	atomic_long_t a;
-} local_t;
-
-#define LOCAL_INIT(i)	{ ATOMIC_LONG_INIT(i) }
-#define local_read(l)	atomic_long_read(&(l)->a)
-#define local_set(l,i)	atomic_long_set(&(l)->a, (i))
-#define local_inc(l)	atomic_long_inc(&(l)->a)
-#define local_dec(l)	atomic_long_dec(&(l)->a)
-#define local_add(i,l)	atomic_long_add((i),(&(l)->a))
-#define local_sub(i,l)	atomic_long_sub((i),(&(l)->a))
-
-static __inline__ long local_add_return(long i, local_t * l)
-{
-	long temp, result;
-	__asm__ __volatile__(
-	"1:	ldq_l %0,%1\n"
-	"	addq %0,%3,%2\n"
-	"	addq %0,%3,%0\n"
-	"	stq_c %0,%1\n"
-	"	beq %0,2f\n"
-	".subsection 2\n"
-	"2:	br 1b\n"
-	".previous"
-	:"=&r" (temp), "=m" (l->a.counter), "=&r" (result)
-	:"Ir" (i), "m" (l->a.counter) : "memory");
-	return result;
-}
-
-static __inline__ long local_sub_return(long i, local_t * l)
-{
-	long temp, result;
-	__asm__ __volatile__(
-	"1:	ldq_l %0,%1\n"
-	"	subq %0,%3,%2\n"
-	"	subq %0,%3,%0\n"
-	"	stq_c %0,%1\n"
-	"	beq %0,2f\n"
-	".subsection 2\n"
-	"2:	br 1b\n"
-	".previous"
-	:"=&r" (temp), "=m" (l->a.counter), "=&r" (result)
-	:"Ir" (i), "m" (l->a.counter) : "memory");
-	return result;
-}
-
-#define local_cmpxchg(l, o, n) \
-	(cmpxchg_local(&((l)->a.counter), (o), (n)))
-#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n)))
-
-/**
- * local_add_unless - add unless the number is a given value
- * @l: pointer of type local_t
- * @a: the amount to add to l...
- * @u: ...unless l is equal to u.
- *
- * Atomically adds @a to @l, so long as it was not @u.
- * Returns non-zero if @l was not @u, and zero otherwise.
- */
-#define local_add_unless(l, a, u)				\
-({								\
-	long c, old;						\
-	c = local_read(l);					\
-	for (;;) {						\
-		if (unlikely(c == (u)))				\
-			break;					\
-		old = local_cmpxchg((l), c, c + (a));	\
-		if (likely(old == c))				\
-			break;					\
-		c = old;					\
-	}							\
-	c != (u);						\
-})
-#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
-
-#define local_add_negative(a, l) (local_add_return((a), (l)) < 0)
-
-#define local_dec_return(l) local_sub_return(1,(l))
-
-#define local_inc_return(l) local_add_return(1,(l))
-
-#define local_sub_and_test(i,l) (local_sub_return((i), (l)) == 0)
-
-#define local_inc_and_test(l) (local_add_return(1, (l)) == 0)
-
-#define local_dec_and_test(l) (local_sub_return(1, (l)) == 0)
-
-/* Verify if faster than atomic ops */
-#define __local_inc(l)		((l)->a.counter++)
-#define __local_dec(l)		((l)->a.counter++)
-#define __local_add(i,l)	((l)->a.counter+=(i))
-#define __local_sub(i,l)	((l)->a.counter-=(i))
-
-#endif /* _ALPHA_LOCAL_H */
Index: linux-2.6/arch/arm/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/arm/include/asm/local.h	2009-11-13 09:32:55.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/avr32/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/avr32/include/asm/local.h	2009-11-13 09:32:59.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,6 +0,0 @@
-#ifndef __ASM_AVR32_LOCAL_H
-#define __ASM_AVR32_LOCAL_H
-
-#include <asm-generic/local.h>
-
-#endif /* __ASM_AVR32_LOCAL_H */
Index: linux-2.6/arch/blackfin/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/blackfin/include/asm/local.h	2009-11-13 09:32:59.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/cris/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/cris/include/asm/local.h	2009-11-13 09:33:01.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/frv/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/frv/include/asm/local.h	2009-11-13 09:33:01.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,6 +0,0 @@
-#ifndef _ASM_LOCAL_H
-#define _ASM_LOCAL_H
-
-#include <asm-generic/local.h>
-
-#endif /* _ASM_LOCAL_H */
Index: linux-2.6/arch/h8300/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/h8300/include/asm/local.h	2009-11-13 09:33:01.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,6 +0,0 @@
-#ifndef _H8300_LOCAL_H_
-#define _H8300_LOCAL_H_
-
-#include <asm-generic/local.h>
-
-#endif
Index: linux-2.6/arch/ia64/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/ia64/include/asm/local.h	2009-11-13 09:33:01.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/m32r/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/m32r/include/asm/local.h	2010-01-05 15:35:34.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,341 +0,0 @@
-#ifndef __M32R_LOCAL_H
-#define __M32R_LOCAL_H
-
-/*
- *  linux/include/asm-m32r/local.h
- *
- *  M32R version:
- *    Copyright (C) 2001, 2002  Hitoshi Yamamoto
- *    Copyright (C) 2004  Hirokazu Takata <takata at linux-m32r.org>
- *    Copyright (C) 2007  Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
- */
-
-#include <linux/percpu.h>
-#include <asm/assembler.h>
-#include <asm/system.h>
-#include <asm/local.h>
-
-/*
- * Atomic operations that C can't guarantee us.  Useful for
- * resource counting etc..
- */
-
-/*
- * Make sure gcc doesn't try to be clever and move things around
- * on us. We need to use _exactly_ the address the user gave us,
- * not some alias that contains the same information.
- */
-typedef struct { volatile int counter; } local_t;
-
-#define LOCAL_INIT(i)	{ (i) }
-
-/**
- * local_read - read local variable
- * @l: pointer of type local_t
- *
- * Atomically reads the value of @l.
- */
-#define local_read(l)	((l)->counter)
-
-/**
- * local_set - set local variable
- * @l: pointer of type local_t
- * @i: required value
- *
- * Atomically sets the value of @l to @i.
- */
-#define local_set(l, i)	(((l)->counter) = (i))
-
-/**
- * local_add_return - add long to local variable and return it
- * @i: long value to add
- * @l: pointer of type local_t
- *
- * Atomically adds @i to @l and return (@i + @l).
- */
-static inline long local_add_return(long i, local_t *l)
-{
-	unsigned long flags;
-	long result;
-
-	local_irq_save(flags);
-	__asm__ __volatile__ (
-		"# local_add_return		\n\t"
-		DCACHE_CLEAR("%0", "r4", "%1")
-		"ld %0, @%1;			\n\t"
-		"add	%0, %2;			\n\t"
-		"st %0, @%1;			\n\t"
-		: "=&r" (result)
-		: "r" (&l->counter), "r" (i)
-		: "memory"
-#ifdef CONFIG_CHIP_M32700_TS1
-		, "r4"
-#endif	/* CONFIG_CHIP_M32700_TS1 */
-	);
-	local_irq_restore(flags);
-
-	return result;
-}
-
-/**
- * local_sub_return - subtract long from local variable and return it
- * @i: long value to subtract
- * @l: pointer of type local_t
- *
- * Atomically subtracts @i from @l and return (@l - @i).
- */
-static inline long local_sub_return(long i, local_t *l)
-{
-	unsigned long flags;
-	long result;
-
-	local_irq_save(flags);
-	__asm__ __volatile__ (
-		"# local_sub_return		\n\t"
-		DCACHE_CLEAR("%0", "r4", "%1")
-		"ld %0, @%1;			\n\t"
-		"sub	%0, %2;			\n\t"
-		"st %0, @%1;			\n\t"
-		: "=&r" (result)
-		: "r" (&l->counter), "r" (i)
-		: "memory"
-#ifdef CONFIG_CHIP_M32700_TS1
-		, "r4"
-#endif	/* CONFIG_CHIP_M32700_TS1 */
-	);
-	local_irq_restore(flags);
-
-	return result;
-}
-
-/**
- * local_add - add long to local variable
- * @i: long value to add
- * @l: pointer of type local_t
- *
- * Atomically adds @i to @l.
- */
-#define local_add(i, l) ((void) local_add_return((i), (l)))
-
-/**
- * local_sub - subtract the local variable
- * @i: long value to subtract
- * @l: pointer of type local_t
- *
- * Atomically subtracts @i from @l.
- */
-#define local_sub(i, l) ((void) local_sub_return((i), (l)))
-
-/**
- * local_sub_and_test - subtract value from variable and test result
- * @i: integer value to subtract
- * @l: pointer of type local_t
- *
- * Atomically subtracts @i from @l and returns
- * true if the result is zero, or false for all
- * other cases.
- */
-#define local_sub_and_test(i, l) (local_sub_return((i), (l)) == 0)
-
-/**
- * local_inc_return - increment local variable and return it
- * @l: pointer of type local_t
- *
- * Atomically increments @l by 1 and returns the result.
- */
-static inline long local_inc_return(local_t *l)
-{
-	unsigned long flags;
-	long result;
-
-	local_irq_save(flags);
-	__asm__ __volatile__ (
-		"# local_inc_return		\n\t"
-		DCACHE_CLEAR("%0", "r4", "%1")
-		"ld %0, @%1;			\n\t"
-		"addi	%0, #1;			\n\t"
-		"st %0, @%1;			\n\t"
-		: "=&r" (result)
-		: "r" (&l->counter)
-		: "memory"
-#ifdef CONFIG_CHIP_M32700_TS1
-		, "r4"
-#endif	/* CONFIG_CHIP_M32700_TS1 */
-	);
-	local_irq_restore(flags);
-
-	return result;
-}
-
-/**
- * local_dec_return - decrement local variable and return it
- * @l: pointer of type local_t
- *
- * Atomically decrements @l by 1 and returns the result.
- */
-static inline long local_dec_return(local_t *l)
-{
-	unsigned long flags;
-	long result;
-
-	local_irq_save(flags);
-	__asm__ __volatile__ (
-		"# local_dec_return		\n\t"
-		DCACHE_CLEAR("%0", "r4", "%1")
-		"ld %0, @%1;			\n\t"
-		"addi	%0, #-1;		\n\t"
-		"st %0, @%1;			\n\t"
-		: "=&r" (result)
-		: "r" (&l->counter)
-		: "memory"
-#ifdef CONFIG_CHIP_M32700_TS1
-		, "r4"
-#endif	/* CONFIG_CHIP_M32700_TS1 */
-	);
-	local_irq_restore(flags);
-
-	return result;
-}
-
-/**
- * local_inc - increment local variable
- * @l: pointer of type local_t
- *
- * Atomically increments @l by 1.
- */
-#define local_inc(l) ((void)local_inc_return(l))
-
-/**
- * local_dec - decrement local variable
- * @l: pointer of type local_t
- *
- * Atomically decrements @l by 1.
- */
-#define local_dec(l) ((void)local_dec_return(l))
-
-/**
- * local_inc_and_test - increment and test
- * @l: pointer of type local_t
- *
- * Atomically increments @l by 1
- * and returns true if the result is zero, or false for all
- * other cases.
- */
-#define local_inc_and_test(l) (local_inc_return(l) == 0)
-
-/**
- * local_dec_and_test - decrement and test
- * @l: pointer of type local_t
- *
- * Atomically decrements @l by 1 and
- * returns true if the result is 0, or false for all
- * other cases.
- */
-#define local_dec_and_test(l) (local_dec_return(l) == 0)
-
-/**
- * local_add_negative - add and test if negative
- * @l: pointer of type local_t
- * @i: integer value to add
- *
- * Atomically adds @i to @l and returns true
- * if the result is negative, or false when
- * result is greater than or equal to zero.
- */
-#define local_add_negative(i, l) (local_add_return((i), (l)) < 0)
-
-#define local_cmpxchg(l, o, n) (cmpxchg_local(&((l)->counter), (o), (n)))
-#define local_xchg(v, new) (xchg_local(&((l)->counter), new))
-
-/**
- * local_add_unless - add unless the number is a given value
- * @l: pointer of type local_t
- * @a: the amount to add to l...
- * @u: ...unless l is equal to u.
- *
- * Atomically adds @a to @l, so long as it was not @u.
- * Returns non-zero if @l was not @u, and zero otherwise.
- */
-static inline int local_add_unless(local_t *l, long a, long u)
-{
-	long c, old;
-	c = local_read(l);
-	for (;;) {
-		if (unlikely(c == (u)))
-			break;
-		old = local_cmpxchg((l), c, c + (a));
-		if (likely(old == c))
-			break;
-		c = old;
-	}
-	return c != (u);
-}
-
-#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
-
-static inline void local_clear_mask(unsigned long  mask, local_t *addr)
-{
-	unsigned long flags;
-	unsigned long tmp;
-
-	local_irq_save(flags);
-	__asm__ __volatile__ (
-		"# local_clear_mask		\n\t"
-		DCACHE_CLEAR("%0", "r5", "%1")
-		"ld %0, @%1;			\n\t"
-		"and	%0, %2;			\n\t"
-		"st %0, @%1;			\n\t"
-		: "=&r" (tmp)
-		: "r" (addr), "r" (~mask)
-		: "memory"
-#ifdef CONFIG_CHIP_M32700_TS1
-		, "r5"
-#endif	/* CONFIG_CHIP_M32700_TS1 */
-	);
-	local_irq_restore(flags);
-}
-
-static inline void local_set_mask(unsigned long  mask, local_t *addr)
-{
-	unsigned long flags;
-	unsigned long tmp;
-
-	local_irq_save(flags);
-	__asm__ __volatile__ (
-		"# local_set_mask		\n\t"
-		DCACHE_CLEAR("%0", "r5", "%1")
-		"ld %0, @%1;			\n\t"
-		"or	%0, %2;			\n\t"
-		"st %0, @%1;			\n\t"
-		: "=&r" (tmp)
-		: "r" (addr), "r" (mask)
-		: "memory"
-#ifdef CONFIG_CHIP_M32700_TS1
-		, "r5"
-#endif	/* CONFIG_CHIP_M32700_TS1 */
-	);
-	local_irq_restore(flags);
-}
-
-/* Atomic operations are already serializing on m32r */
-#define smp_mb__before_local_dec()	barrier()
-#define smp_mb__after_local_dec()	barrier()
-#define smp_mb__before_local_inc()	barrier()
-#define smp_mb__after_local_inc()	barrier()
-
-/* Use these for per-cpu local_t variables: on some archs they are
- * much more efficient than these naive implementations.  Note they take
- * a variable, not an address.
- */
-
-#define __local_inc(l)		((l)->a.counter++)
-#define __local_dec(l)		((l)->a.counter++)
-#define __local_add(i, l)	((l)->a.counter += (i))
-#define __local_sub(i, l)	((l)->a.counter -= (i))
-
-/* Use these for per-cpu local_t variables: on some archs they are
- * much more efficient than these naive implementations.  Note they take
- * a variable, not an address.
- */
-
-#endif /* __M32R_LOCAL_H */
Index: linux-2.6/arch/m68k/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/m68k/include/asm/local.h	2009-11-13 09:33:02.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,6 +0,0 @@
-#ifndef _ASM_M68K_LOCAL_H
-#define _ASM_M68K_LOCAL_H
-
-#include <asm-generic/local.h>
-
-#endif /* _ASM_M68K_LOCAL_H */
Index: linux-2.6/arch/microblaze/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/microblaze/include/asm/local.h	2009-11-13 09:33:03.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/mips/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/mips/include/asm/local.h	2010-01-05 15:35:34.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,196 +0,0 @@
-#ifndef _ARCH_MIPS_LOCAL_H
-#define _ARCH_MIPS_LOCAL_H
-
-#include <linux/percpu.h>
-#include <linux/bitops.h>
-#include <asm/atomic.h>
-#include <asm/cmpxchg.h>
-#include <asm/war.h>
-
-typedef struct
-{
-	atomic_long_t a;
-} local_t;
-
-#define LOCAL_INIT(i)	{ ATOMIC_LONG_INIT(i) }
-
-#define local_read(l)	atomic_long_read(&(l)->a)
-#define local_set(l, i)	atomic_long_set(&(l)->a, (i))
-
-#define local_add(i, l)	atomic_long_add((i), (&(l)->a))
-#define local_sub(i, l)	atomic_long_sub((i), (&(l)->a))
-#define local_inc(l)	atomic_long_inc(&(l)->a)
-#define local_dec(l)	atomic_long_dec(&(l)->a)
-
-/*
- * Same as above, but return the result value
- */
-static __inline__ long local_add_return(long i, local_t * l)
-{
-	unsigned long result;
-
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {
-		unsigned long temp;
-
-		__asm__ __volatile__(
-		"	.set	mips3					\n"
-		"1:"	__LL	"%1, %2		# local_add_return	\n"
-		"	addu	%0, %1, %3				\n"
-			__SC	"%0, %2					\n"
-		"	beqzl	%0, 1b					\n"
-		"	addu	%0, %1, %3				\n"
-		"	.set	mips0					\n"
-		: "=&r" (result), "=&r" (temp), "=m" (l->a.counter)
-		: "Ir" (i), "m" (l->a.counter)
-		: "memory");
-	} else if (kernel_uses_llsc) {
-		unsigned long temp;
-
-		__asm__ __volatile__(
-		"	.set	mips3					\n"
-		"1:"	__LL	"%1, %2		# local_add_return	\n"
-		"	addu	%0, %1, %3				\n"
-			__SC	"%0, %2					\n"
-		"	beqz	%0, 1b					\n"
-		"	addu	%0, %1, %3				\n"
-		"	.set	mips0					\n"
-		: "=&r" (result), "=&r" (temp), "=m" (l->a.counter)
-		: "Ir" (i), "m" (l->a.counter)
-		: "memory");
-	} else {
-		unsigned long flags;
-
-		local_irq_save(flags);
-		result = l->a.counter;
-		result += i;
-		l->a.counter = result;
-		local_irq_restore(flags);
-	}
-
-	return result;
-}
-
-static __inline__ long local_sub_return(long i, local_t * l)
-{
-	unsigned long result;
-
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {
-		unsigned long temp;
-
-		__asm__ __volatile__(
-		"	.set	mips3					\n"
-		"1:"	__LL	"%1, %2		# local_sub_return	\n"
-		"	subu	%0, %1, %3				\n"
-			__SC	"%0, %2					\n"
-		"	beqzl	%0, 1b					\n"
-		"	subu	%0, %1, %3				\n"
-		"	.set	mips0					\n"
-		: "=&r" (result), "=&r" (temp), "=m" (l->a.counter)
-		: "Ir" (i), "m" (l->a.counter)
-		: "memory");
-	} else if (kernel_uses_llsc) {
-		unsigned long temp;
-
-		__asm__ __volatile__(
-		"	.set	mips3					\n"
-		"1:"	__LL	"%1, %2		# local_sub_return	\n"
-		"	subu	%0, %1, %3				\n"
-			__SC	"%0, %2					\n"
-		"	beqz	%0, 1b					\n"
-		"	subu	%0, %1, %3				\n"
-		"	.set	mips0					\n"
-		: "=&r" (result), "=&r" (temp), "=m" (l->a.counter)
-		: "Ir" (i), "m" (l->a.counter)
-		: "memory");
-	} else {
-		unsigned long flags;
-
-		local_irq_save(flags);
-		result = l->a.counter;
-		result -= i;
-		l->a.counter = result;
-		local_irq_restore(flags);
-	}
-
-	return result;
-}
-
-#define local_cmpxchg(l, o, n) \
-	((long)cmpxchg_local(&((l)->a.counter), (o), (n)))
-#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n)))
-
-/**
- * local_add_unless - add unless the number is a given value
- * @l: pointer of type local_t
- * @a: the amount to add to l...
- * @u: ...unless l is equal to u.
- *
- * Atomically adds @a to @l, so long as it was not @u.
- * Returns non-zero if @l was not @u, and zero otherwise.
- */
-#define local_add_unless(l, a, u)				\
-({								\
-	long c, old;						\
-	c = local_read(l);					\
-	while (c != (u) && (old = local_cmpxchg((l), c, c + (a))) != c) \
-		c = old;					\
-	c != (u);						\
-})
-#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
-
-#define local_dec_return(l) local_sub_return(1, (l))
-#define local_inc_return(l) local_add_return(1, (l))
-
-/*
- * local_sub_and_test - subtract value from variable and test result
- * @i: integer value to subtract
- * @l: pointer of type local_t
- *
- * Atomically subtracts @i from @l and returns
- * true if the result is zero, or false for all
- * other cases.
- */
-#define local_sub_and_test(i, l) (local_sub_return((i), (l)) == 0)
-
-/*
- * local_inc_and_test - increment and test
- * @l: pointer of type local_t
- *
- * Atomically increments @l by 1
- * and returns true if the result is zero, or false for all
- * other cases.
- */
-#define local_inc_and_test(l) (local_inc_return(l) == 0)
-
-/*
- * local_dec_and_test - decrement by 1 and test
- * @l: pointer of type local_t
- *
- * Atomically decrements @l by 1 and
- * returns true if the result is 0, or false for all other
- * cases.
- */
-#define local_dec_and_test(l) (local_sub_return(1, (l)) == 0)
-
-/*
- * local_add_negative - add and test if negative
- * @l: pointer of type local_t
- * @i: integer value to add
- *
- * Atomically adds @i to @l and returns true
- * if the result is negative, or false when
- * result is greater than or equal to zero.
- */
-#define local_add_negative(i, l) (local_add_return(i, (l)) < 0)
-
-/* Use these for per-cpu local_t variables: on some archs they are
- * much more efficient than these naive implementations.  Note they take
- * a variable, not an address.
- */
-
-#define __local_inc(l)		((l)->a.counter++)
-#define __local_dec(l)		((l)->a.counter++)
-#define __local_add(i, l)	((l)->a.counter+=(i))
-#define __local_sub(i, l)	((l)->a.counter-=(i))
-
-#endif /* _ARCH_MIPS_LOCAL_H */
Index: linux-2.6/arch/mn10300/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/mn10300/include/asm/local.h	2009-11-13 09:33:04.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/parisc/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/parisc/include/asm/local.h	2009-11-13 09:33:04.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/powerpc/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/powerpc/include/asm/local.h	2010-01-05 15:35:34.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,175 +0,0 @@
-#ifndef _ARCH_POWERPC_LOCAL_H
-#define _ARCH_POWERPC_LOCAL_H
-
-#include <linux/percpu.h>
-#include <asm/atomic.h>
-
-typedef struct
-{
-	atomic_long_t a;
-} local_t;
-
-#define LOCAL_INIT(i)	{ ATOMIC_LONG_INIT(i) }
-
-#define local_read(l)	atomic_long_read(&(l)->a)
-#define local_set(l,i)	atomic_long_set(&(l)->a, (i))
-
-#define local_add(i,l)	atomic_long_add((i),(&(l)->a))
-#define local_sub(i,l)	atomic_long_sub((i),(&(l)->a))
-#define local_inc(l)	atomic_long_inc(&(l)->a)
-#define local_dec(l)	atomic_long_dec(&(l)->a)
-
-static __inline__ long local_add_return(long a, local_t *l)
-{
-	long t;
-
-	__asm__ __volatile__(
-"1:"	PPC_LLARX	"%0,0,%2		# local_add_return\n\
-	add	%0,%1,%0\n"
-	PPC405_ERR77(0,%2)
-	PPC_STLCX	"%0,0,%2 \n\
-	bne-	1b"
-	: "=&r" (t)
-	: "r" (a), "r" (&(l->a.counter))
-	: "cc", "memory");
-
-	return t;
-}
-
-#define local_add_negative(a, l)	(local_add_return((a), (l)) < 0)
-
-static __inline__ long local_sub_return(long a, local_t *l)
-{
-	long t;
-
-	__asm__ __volatile__(
-"1:"	PPC_LLARX	"%0,0,%2		# local_sub_return\n\
-	subf	%0,%1,%0\n"
-	PPC405_ERR77(0,%2)
-	PPC_STLCX	"%0,0,%2 \n\
-	bne-	1b"
-	: "=&r" (t)
-	: "r" (a), "r" (&(l->a.counter))
-	: "cc", "memory");
-
-	return t;
-}
-
-static __inline__ long local_inc_return(local_t *l)
-{
-	long t;
-
-	__asm__ __volatile__(
-"1:"	PPC_LLARX	"%0,0,%1		# local_inc_return\n\
-	addic	%0,%0,1\n"
-	PPC405_ERR77(0,%1)
-	PPC_STLCX	"%0,0,%1 \n\
-	bne-	1b"
-	: "=&r" (t)
-	: "r" (&(l->a.counter))
-	: "cc", "xer", "memory");
-
-	return t;
-}
-
-/*
- * local_inc_and_test - increment and test
- * @l: pointer of type local_t
- *
- * Atomically increments @l by 1
- * and returns true if the result is zero, or false for all
- * other cases.
- */
-#define local_inc_and_test(l) (local_inc_return(l) == 0)
-
-static __inline__ long local_dec_return(local_t *l)
-{
-	long t;
-
-	__asm__ __volatile__(
-"1:"	PPC_LLARX	"%0,0,%1		# local_dec_return\n\
-	addic	%0,%0,-1\n"
-	PPC405_ERR77(0,%1)
-	PPC_STLCX	"%0,0,%1\n\
-	bne-	1b"
-	: "=&r" (t)
-	: "r" (&(l->a.counter))
-	: "cc", "xer", "memory");
-
-	return t;
-}
-
-#define local_cmpxchg(l, o, n) \
-	(cmpxchg_local(&((l)->a.counter), (o), (n)))
-#define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n)))
-
-/**
- * local_add_unless - add unless the number is a given value
- * @l: pointer of type local_t
- * @a: the amount to add to v...
- * @u: ...unless v is equal to u.
- *
- * Atomically adds @a to @l, so long as it was not @u.
- * Returns non-zero if @l was not @u, and zero otherwise.
- */
-static __inline__ int local_add_unless(local_t *l, long a, long u)
-{
-	long t;
-
-	__asm__ __volatile__ (
-"1:"	PPC_LLARX	"%0,0,%1		# local_add_unless\n\
-	cmpw	0,%0,%3 \n\
-	beq-	2f \n\
-	add	%0,%2,%0 \n"
-	PPC405_ERR77(0,%2)
-	PPC_STLCX	"%0,0,%1 \n\
-	bne-	1b \n"
-"	subf	%0,%2,%0 \n\
-2:"
-	: "=&r" (t)
-	: "r" (&(l->a.counter)), "r" (a), "r" (u)
-	: "cc", "memory");
-
-	return t != u;
-}
-
-#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
-
-#define local_sub_and_test(a, l)	(local_sub_return((a), (l)) == 0)
-#define local_dec_and_test(l)		(local_dec_return((l)) == 0)
-
-/*
- * Atomically test *l and decrement if it is greater than 0.
- * The function returns the old value of *l minus 1.
- */
-static __inline__ long local_dec_if_positive(local_t *l)
-{
-	long t;
-
-	__asm__ __volatile__(
-"1:"	PPC_LLARX	"%0,0,%1		# local_dec_if_positive\n\
-	cmpwi	%0,1\n\
-	addi	%0,%0,-1\n\
-	blt-	2f\n"
-	PPC405_ERR77(0,%1)
-	PPC_STLCX	"%0,0,%1\n\
-	bne-	1b"
-	"\n\
-2:"	: "=&b" (t)
-	: "r" (&(l->a.counter))
-	: "cc", "memory");
-
-	return t;
-}
-
-/* Use these for per-cpu local_t variables: on some archs they are
- * much more efficient than these naive implementations.  Note they take
- * a variable, not an address.
- */
-
-#define __local_inc(l)		((l)->a.counter++)
-#define __local_dec(l)		((l)->a.counter++)
-#define __local_add(i,l)	((l)->a.counter+=(i))
-#define __local_sub(i,l)	((l)->a.counter-=(i))
-
-#endif /* _ARCH_POWERPC_LOCAL_H */
Index: linux-2.6/arch/s390/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/s390/include/asm/local.h	2009-11-13 09:33:06.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1 +0,0 @@
-#include <asm-generic/local.h>
Index: linux-2.6/arch/score/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/score/include/asm/local.h	2009-11-13 09:33:06.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,6 +0,0 @@
-#ifndef _ASM_SCORE_LOCAL_H
-#define _ASM_SCORE_LOCAL_H
-
-#include <asm-generic/local.h>
-
-#endif /* _ASM_SCORE_LOCAL_H */
Index: linux-2.6/arch/sh/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/sh/include/asm/local.h	2009-11-13 09:33:07.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,7 +0,0 @@
-#ifndef __ASM_SH_LOCAL_H
-#define __ASM_SH_LOCAL_H
-
-#include <asm-generic/local.h>
-
-#endif /* __ASM_SH_LOCAL_H */
-
Index: linux-2.6/arch/sparc/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/sparc/include/asm/local.h	2009-11-13 09:33:07.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,6 +0,0 @@
-#ifndef _SPARC_LOCAL_H
-#define _SPARC_LOCAL_H
-
-#include <asm-generic/local.h>
-
-#endif
Index: linux-2.6/arch/x86/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/local.h	2010-01-05 15:35:34.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,198 +0,0 @@
-#ifndef _ASM_X86_LOCAL_H
-#define _ASM_X86_LOCAL_H
-
-#include <linux/percpu.h>
-
-#include <asm/system.h>
-#include <asm/atomic.h>
-#include <asm/asm.h>
-
-typedef struct {
-	atomic_long_t a;
-} local_t;
-
-#define LOCAL_INIT(i)	{ ATOMIC_LONG_INIT(i) }
-
-#define local_read(l)	atomic_long_read(&(l)->a)
-#define local_set(l, i)	atomic_long_set(&(l)->a, (i))
-
-static inline void local_inc(local_t *l)
-{
-	asm volatile(_ASM_INC "%0"
-		     : "+m" (l->a.counter));
-}
-
-static inline void local_dec(local_t *l)
-{
-	asm volatile(_ASM_DEC "%0"
-		     : "+m" (l->a.counter));
-}
-
-static inline void local_add(long i, local_t *l)
-{
-	asm volatile(_ASM_ADD "%1,%0"
-		     : "+m" (l->a.counter)
-		     : "ir" (i));
-}
-
-static inline void local_sub(long i, local_t *l)
-{
-	asm volatile(_ASM_SUB "%1,%0"
-		     : "+m" (l->a.counter)
-		     : "ir" (i));
-}
-
-/**
- * local_sub_and_test - subtract value from variable and test result
- * @i: integer value to subtract
- * @l: pointer to type local_t
- *
- * Atomically subtracts @i from @l and returns
- * true if the result is zero, or false for all
- * other cases.
- */
-static inline int local_sub_and_test(long i, local_t *l)
-{
-	unsigned char c;
-
-	asm volatile(_ASM_SUB "%2,%0; sete %1"
-		     : "+m" (l->a.counter), "=qm" (c)
-		     : "ir" (i) : "memory");
-	return c;
-}
-
-/**
- * local_dec_and_test - decrement and test
- * @l: pointer to type local_t
- *
- * Atomically decrements @l by 1 and
- * returns true if the result is 0, or false for all other
- * cases.
- */
-static inline int local_dec_and_test(local_t *l)
-{
-	unsigned char c;
-
-	asm volatile(_ASM_DEC "%0; sete %1"
-		     : "+m" (l->a.counter), "=qm" (c)
-		     : : "memory");
-	return c != 0;
-}
-
-/**
- * local_inc_and_test - increment and test
- * @l: pointer to type local_t
- *
- * Atomically increments @l by 1
- * and returns true if the result is zero, or false for all
- * other cases.
- */
-static inline int local_inc_and_test(local_t *l)
-{
-	unsigned char c;
-
-	asm volatile(_ASM_INC "%0; sete %1"
-		     : "+m" (l->a.counter), "=qm" (c)
-		     : : "memory");
-	return c != 0;
-}
-
-/**
- * local_add_negative - add and test if negative
- * @i: integer value to add
- * @l: pointer to type local_t
- *
- * Atomically adds @i to @l and returns true
- * if the result is negative, or false when
- * result is greater than or equal to zero.
- */
-static inline int local_add_negative(long i, local_t *l)
-{
-	unsigned char c;
-
-	asm volatile(_ASM_ADD "%2,%0; sets %1"
-		     : "+m" (l->a.counter), "=qm" (c)
-		     : "ir" (i) : "memory");
-	return c;
-}
-
-/**
- * local_add_return - add and return
- * @i: integer value to add
- * @l: pointer to type local_t
- *
- * Atomically adds @i to @l and returns @i + @l
- */
-static inline long local_add_return(long i, local_t *l)
-{
-	long __i;
-#ifdef CONFIG_M386
-	unsigned long flags;
-	if (unlikely(boot_cpu_data.x86 <= 3))
-		goto no_xadd;
-#endif
-	/* Modern 486+ processor */
-	__i = i;
-	asm volatile(_ASM_XADD "%0, %1;"
-		     : "+r" (i), "+m" (l->a.counter)
-		     : : "memory");
-	return i + __i;
-
-#ifdef CONFIG_M386
-no_xadd: /* Legacy 386 processor */
-	local_irq_save(flags);
-	__i = local_read(l);
-	local_set(l, i + __i);
-	local_irq_restore(flags);
-	return i + __i;
-#endif
-}
-
-static inline long local_sub_return(long i, local_t *l)
-{
-	return local_add_return(-i, l);
-}
-
-#define local_inc_return(l)  (local_add_return(1, l))
-#define local_dec_return(l)  (local_sub_return(1, l))
-
-#define local_cmpxchg(l, o, n) \
-	(cmpxchg_local(&((l)->a.counter), (o), (n)))
-/* Always has a lock prefix */
-#define local_xchg(l, n) (xchg(&((l)->a.counter), (n)))
-
-/**
- * local_add_unless - add unless the number is a given value
- * @l: pointer of type local_t
- * @a: the amount to add to l...
- * @u: ...unless l is equal to u.
- *
- * Atomically adds @a to @l, so long as it was not @u.
- * Returns non-zero if @l was not @u, and zero otherwise.
- */
-#define local_add_unless(l, a, u)				\
-({								\
-	long c, old;						\
-	c = local_read((l));					\
-	for (;;) {						\
-		if (unlikely(c == (u)))				\
-			break;					\
-		old = local_cmpxchg((l), c, c + (a));		\
-		if (likely(old == c))				\
-			break;					\
-		c = old;					\
-	}							\
-	c != (u);						\
-})
-#define local_inc_not_zero(l) local_add_unless((l), 1, 0)
-
-/* On x86_32, these are no better than the atomic variants.
- * On x86-64 these are better than the atomic variants on SMP kernels
- * because they dont use a lock prefix.
- */
-#define __local_inc(l)		local_inc(l)
-#define __local_dec(l)		local_dec(l)
-#define __local_add(i, l)	local_add((i), (l))
-#define __local_sub(i, l)	local_sub((i), (l))
-
-#endif /* _ASM_X86_LOCAL_H */
Index: linux-2.6/arch/xtensa/include/asm/local.h
===================================================================
--- linux-2.6.orig/arch/xtensa/include/asm/local.h	2009-11-13 09:33:18.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,16 +0,0 @@
-/*
- * include/asm-xtensa/local.h
- *
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2001 - 2005 Tensilica Inc.
- */
-
-#ifndef _XTENSA_LOCAL_H
-#define _XTENSA_LOCAL_H
-
-#include <asm-generic/local.h>
-
-#endif /* _XTENSA_LOCAL_H */
Index: linux-2.6/Documentation/atomic_ops.txt
===================================================================
--- linux-2.6.orig/Documentation/atomic_ops.txt	2009-11-13 09:32:52.000000000 -0600
+++ linux-2.6/Documentation/atomic_ops.txt	2010-01-05 15:41:03.000000000 -0600
@@ -17,9 +17,10 @@ suffice:
 Historically, counter has been declared volatile.  This is now discouraged.
 See Documentation/volatile-considered-harmful.txt for the complete rationale.
 
-local_t is very similar to atomic_t. If the counter is per CPU and only
-updated by one CPU, local_t is probably more appropriate. Please see
-Documentation/local_ops.txt for the semantics of local_t.
+this_cpu operations are very similar to atomic_t. If the counter is per CPU
+and only updated by one CPU, then the use of the per cpu allocator and
+this cpu operations is more appropriate. Please see include/linux/percpu.h
+for details.
 
 The first operations to implement for atomic_t's are the initializers and
 plain reads.
Index: linux-2.6/Documentation/local_ops.txt
===================================================================
--- linux-2.6.orig/Documentation/local_ops.txt	2009-11-13 09:32:52.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,186 +0,0 @@
-	     Semantics and Behavior of Local Atomic Operations
-
-			    Mathieu Desnoyers
-
-
-	This document explains the purpose of the local atomic operations, how
-to implement them for any given architecture and shows how they can be used
-properly. It also stresses on the precautions that must be taken when reading
-those local variables across CPUs when the order of memory writes matters.
-
-
-
-* Purpose of local atomic operations
-
-Local atomic operations are meant to provide fast and highly reentrant per CPU
-counters. They minimize the performance cost of standard atomic operations by
-removing the LOCK prefix and memory barriers normally required to synchronize
-across CPUs.
-
-Having fast per CPU atomic counters is interesting in many cases : it does not
-require disabling interrupts to protect from interrupt handlers and it permits
-coherent counters in NMI handlers. It is especially useful for tracing purposes
-and for various performance monitoring counters.
-
-Local atomic operations only guarantee variable modification atomicity wrt the
-CPU which owns the data. Therefore, care must taken to make sure that only one
-CPU writes to the local_t data. This is done by using per cpu data and making
-sure that we modify it from within a preemption safe context. It is however
-permitted to read local_t data from any CPU : it will then appear to be written
-out of order wrt other memory writes by the owner CPU.
-
-
-* Implementation for a given architecture
-
-It can be done by slightly modifying the standard atomic operations : only
-their UP variant must be kept. It typically means removing LOCK prefix (on
-i386 and x86_64) and any SMP synchronization barrier. If the architecture does
-not have a different behavior between SMP and UP, including asm-generic/local.h
-in your architecture's local.h is sufficient.
-
-The local_t type is defined as an opaque signed long by embedding an
-atomic_long_t inside a structure. This is made so a cast from this type to a
-long fails. The definition looks like :
-
-typedef struct { atomic_long_t a; } local_t;
-
-
-* Rules to follow when using local atomic operations
-
-- Variables touched by local ops must be per cpu variables.
-- _Only_ the CPU owner of these variables must write to them.
-- This CPU can use local ops from any context (process, irq, softirq, nmi, ...)
-  to update its local_t variables.
-- Preemption (or interrupts) must be disabled when using local ops in
-  process context to   make sure the process won't be migrated to a
-  different CPU between getting the per-cpu variable and doing the
-  actual local op.
-- When using local ops in interrupt context, no special care must be
-  taken on a mainline kernel, since they will run on the local CPU with
-  preemption already disabled. I suggest, however, to explicitly
-  disable preemption anyway to make sure it will still work correctly on
-  -rt kernels.
-- Reading the local cpu variable will provide the current copy of the
-  variable.
-- Reads of these variables can be done from any CPU, because updates to
-  "long", aligned, variables are always atomic. Since no memory
-  synchronization is done by the writer CPU, an outdated copy of the
-  variable can be read when reading some _other_ cpu's variables.
-
-
-* How to use local atomic operations
-
-#include <linux/percpu.h>
-#include <asm/local.h>
-
-static DEFINE_PER_CPU(local_t, counters) = LOCAL_INIT(0);
-
-
-* Counting
-
-Counting is done on all the bits of a signed long.
-
-In preemptible context, use get_cpu_var() and put_cpu_var() around local atomic
-operations : it makes sure that preemption is disabled around write access to
-the per cpu variable. For instance :
-
-	local_inc(&get_cpu_var(counters));
-	put_cpu_var(counters);
-
-If you are already in a preemption-safe context, you can directly use
-__get_cpu_var() instead.
-
-	local_inc(&__get_cpu_var(counters));
-
-
-
-* Reading the counters
-
-Those local counters can be read from foreign CPUs to sum the count. Note that
-the data seen by local_read across CPUs must be considered to be out of order
-relatively to other memory writes happening on the CPU that owns the data.
-
-	long sum = 0;
-	for_each_online_cpu(cpu)
-		sum += local_read(&per_cpu(counters, cpu));
-
-If you want to use a remote local_read to synchronize access to a resource
-between CPUs, explicit smp_wmb() and smp_rmb() memory barriers must be used
-respectively on the writer and the reader CPUs. It would be the case if you use
-the local_t variable as a counter of bytes written in a buffer : there should
-be a smp_wmb() between the buffer write and the counter increment and also a
-smp_rmb() between the counter read and the buffer read.
-
-
-Here is a sample module which implements a basic per cpu counter using local.h.
-
---- BEGIN ---
-/* test-local.c
- *
- * Sample module for local.h usage.
- */
-
-
-#include <asm/local.h>
-#include <linux/module.h>
-#include <linux/timer.h>
-
-static DEFINE_PER_CPU(local_t, counters) = LOCAL_INIT(0);
-
-static struct timer_list test_timer;
-
-/* IPI called on each CPU. */
-static void test_each(void *info)
-{
-	/* Increment the counter from a non preemptible context */
-	printk("Increment on cpu %d\n", smp_processor_id());
-	local_inc(&__get_cpu_var(counters));
-
-	/* This is what incrementing the variable would look like within a
-	 * preemptible context (it disables preemption) :
-	 *
-	 * local_inc(&get_cpu_var(counters));
-	 * put_cpu_var(counters);
-	 */
-}
-
-static void do_test_timer(unsigned long data)
-{
-	int cpu;
-
-	/* Increment the counters */
-	on_each_cpu(test_each, NULL, 1);
-	/* Read all the counters */
-	printk("Counters read from CPU %d\n", smp_processor_id());
-	for_each_online_cpu(cpu) {
-		printk("Read : CPU %d, count %ld\n", cpu,
-			local_read(&per_cpu(counters, cpu)));
-	}
-	del_timer(&test_timer);
-	test_timer.expires = jiffies + 1000;
-	add_timer(&test_timer);
-}
-
-static int __init test_init(void)
-{
-	/* initialize the timer that will increment the counter */
-	init_timer(&test_timer);
-	test_timer.function = do_test_timer;
-	test_timer.expires = jiffies + 1;
-	add_timer(&test_timer);
-
-	return 0;
-}
-
-static void __exit test_exit(void)
-{
-	del_timer_sync(&test_timer);
-}
-
-module_init(test_init);
-module_exit(test_exit);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Mathieu Desnoyers");
-MODULE_DESCRIPTION("Local Atomic Ops");
---- END ---
Index: linux-2.6/Documentation/DocBook/kernel-locking.tmpl
===================================================================
--- linux-2.6.orig/Documentation/DocBook/kernel-locking.tmpl	2009-11-13 09:32:52.000000000 -0600
+++ linux-2.6/Documentation/DocBook/kernel-locking.tmpl	2010-01-05 15:41:03.000000000 -0600
@@ -1810,14 +1810,6 @@ machines due to caching.
     </para>
 
     <para>
-      Of particular use for simple per-cpu counters is the
-      <type>local_t</type> type, and the
-      <function>cpu_local_inc()</function> and related functions,
-      which are more efficient than simple code on some architectures
-      (<filename class="headerfile">include/asm/local.h</filename>).
-    </para>
-
-    <para>
       Note that there is no simple, reliable way of getting an exact
       value of such a counter, without introducing more locks.  This
       is not a problem for some uses.
Index: linux-2.6/include/asm-generic/local.h
===================================================================
--- linux-2.6.orig/include/asm-generic/local.h	2010-01-05 15:35:34.000000000 -0600
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,55 +0,0 @@
-#ifndef _ASM_GENERIC_LOCAL_H
-#define _ASM_GENERIC_LOCAL_H
-
-#include <linux/percpu.h>
-#include <asm/atomic.h>
-#include <asm/types.h>
-
-/*
- * A signed long type for operations which are atomic for a single CPU.
- * Usually used in combination with per-cpu variables.
- *
- * This is the default implementation, which uses atomic_long_t.  Which is
- * rather pointless.  The whole point behind local_t is that some processors
- * can perform atomic adds and subtracts in a manner which is atomic wrt IRQs
- * running on this CPU.  local_t allows exploitation of such capabilities.
- */
-
-/* Implement in terms of atomics. */
-
-/* Don't use typedef: don't want them to be mixed with atomic_t's. */
-typedef struct
-{
-	atomic_long_t a;
-} local_t;
-
-#define LOCAL_INIT(i)	{ ATOMIC_LONG_INIT(i) }
-
-#define local_read(l)	atomic_long_read(&(l)->a)
-#define local_set(l,i)	atomic_long_set((&(l)->a),(i))
-#define local_inc(l)	atomic_long_inc(&(l)->a)
-#define local_dec(l)	atomic_long_dec(&(l)->a)
-#define local_add(i,l)	atomic_long_add((i),(&(l)->a))
-#define local_sub(i,l)	atomic_long_sub((i),(&(l)->a))
-
-#define local_sub_and_test(i, l) atomic_long_sub_and_test((i), (&(l)->a))
-#define local_dec_and_test(l) atomic_long_dec_and_test(&(l)->a)
-#define local_inc_and_test(l) atomic_long_inc_and_test(&(l)->a)
-#define local_add_negative(i, l) atomic_long_add_negative((i), (&(l)->a))
-#define local_add_return(i, l) atomic_long_add_return((i), (&(l)->a))
-#define local_sub_return(i, l) atomic_long_sub_return((i), (&(l)->a))
-#define local_inc_return(l) atomic_long_inc_return(&(l)->a)
-
-#define local_cmpxchg(l, o, n) atomic_long_cmpxchg((&(l)->a), (o), (n))
-#define local_xchg(l, n) atomic_long_xchg((&(l)->a), (n))
-#define local_add_unless(l, _a, u) atomic_long_add_unless((&(l)->a), (_a), (u))
-#define local_inc_not_zero(l) atomic_long_inc_not_zero(&(l)->a)
-
-/* Non-atomic variants, ie. preemption disabled and won't be touched
- * in interrupt, etc.  Some archs can optimize this case well. */
-#define __local_inc(l)		local_set((l), local_read(l) + 1)
-#define __local_dec(l)		local_set((l), local_read(l) - 1)
-#define __local_add(i,l)	local_set((l), local_read(l) + (i))
-#define __local_sub(i,l)	local_set((l), local_read(l) - (i))
-
-#endif /* _ASM_GENERIC_LOCAL_H */

-- 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and  add_local_return()
  2010-01-05 22:04 ` [RFC local_t removal V1 1/4] Add add_local() and add_local_return() Christoph Lameter
@ 2010-01-05 22:17   ` Mike Frysinger
  2010-01-05 22:29     ` Christoph Lameter
  2010-01-05 22:49   ` Mathieu Desnoyers
  1 sibling, 1 reply; 24+ messages in thread
From: Mike Frysinger @ 2010-01-05 22:17 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Mathieu Desnoyers, Tejun Heo, linux-kernel

On Tue, Jan 5, 2010 at 17:04, Christoph Lameter wrote:
> --- /dev/null   1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/include/asm-generic/add-local.h   2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,13 @@
> +#ifndef __ASM_GENERIC_ADD_LOCAL_H
> +#define __ASM_GENERIC_ADD_LOCAL_H

needs comment header (blurb/copyright/license)

> +#define add_return_local(ptr, v)                                       \
> +       ((__typeof__(*(ptr)))__add_return_local_generic((ptr),          \
> +                       (unsigned long)(v), sizeof(*(ptr))))
> +
> +#define add_local(ptr, v) (void)__add_return_local_generic((ptr),      \
> +                       (unsigned long)(v), sizeof(*(ptr)))

the only difference here is the cast ... perhaps define an
_add_local() macro to avoid the duplication
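
One way that could look, as a rough sketch (the __add_local() helper
name is made up here, it is not part of the posted patch):

#define __add_local(ptr, v)						\
	__add_return_local_generic((ptr), (unsigned long)(v),		\
			sizeof(*(ptr)))

#define add_return_local(ptr, v)					\
	((__typeof__(*(ptr)))__add_local((ptr), (v)))

#define add_local(ptr, v)	((void)__add_local((ptr), (v)))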

> --- /dev/null   1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/alpha/include/asm/add-local.h        2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +

these arch stubs all have an extra trailing newline

> --- /dev/null   1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/include/asm-generic/add-local-generic.h   2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,40 @@
> +#ifndef __ASM_GENERIC_ADD_LOCAL_GENERIC_H
> +#define __ASM_GENERIC_ADD_LOCAL_GENERIC_H

needs comment header (blurb/copyright/license)

> +/*
> + * Generic version of __add_return_local (disables interrupts). Takes an
> + * unsigned long parameter, supporting various types of architectures.
> + */
> +static inline unsigned long __add_return_local_generic(volatile void *ptr,
> +               unsigned long value, int size)

size_t for size ?


> +extern unsigned long wrong_size_add_local(volatile void *ptr);
> ...
> +       /*
> +        * Sanity checking, compile-time.
> +        */
> +       if (size == 8 && sizeof(unsigned long) != 8)
> +               wrong_size_add_local(ptr);
> ...
> +       default:
> +               wrong_size_add_local(ptr);
> +       }

should be BUILD_BUG_ON() ?
-mike

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 0/4] Remove local_t
  2010-01-05 22:04 [RFC local_t removal V1 0/4] Remove local_t Christoph Lameter
                   ` (3 preceding siblings ...)
  2010-01-05 22:04 ` [RFC local_t removal V1 4/4] Remove local_t support Christoph Lameter
@ 2010-01-05 22:23 ` Mathieu Desnoyers
  2010-01-05 22:34   ` Christoph Lameter
  4 siblings, 1 reply; 24+ messages in thread
From: Mathieu Desnoyers @ 2010-01-05 22:23 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tejun Heo, linux-kernel

* Christoph Lameter (cl@linux-foundation.org) wrote:
> Current -next has only the trace subsystem left as a user of local_t
> 
> Tracing uses local_t for per cpu safe atomic operations in the form
> of cmpxchg and additions. We already have a cmpxchg_local but no "local"
> form of addition.
> 
> The patchset introduces a similar local primitive add_local() and then
> uses cmpxchg_local() and add_local() to remove local_t use from the
> trace subsystem.
> 
> The last patch removes local_t support from the kernel tree.
> 
> The support for add_local() is pretty basic. We can add more
> fancy inc/dec variants and more optimization in the next
> revision of the patchset.
> 

Hi Christoph,

Yes, removing the local_t type could make sense. However, local_t maps
to a volatile long, not just "long". Secondly, I am concerned that the
patch you propose:

- does not create the primitives I use in lttng
- only deals with x86

In lttng (which is out of tree, but widely used), I need the equivalent
of:

local_read
local_set
local_add
local_cmpxchg
local_add_return
local_inc

The approach of just doing the x86 implementation and leaving all the
other architectures "for later" with a slow/non-atomic generic fallback
is, imho, a no-go, given that some people (myself, actually) already
took the time to go through all the kernel architectures to create the
optimized local.h headers. Basically, you are destroying all that work,
asking for it to be done all over again.

I therefore argue that we should keep local.h as-is as long as the
replacement lacks the wide architecture support and primitive variety
found in local.h.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and  add_local_return()
  2010-01-05 22:17   ` Mike Frysinger
@ 2010-01-05 22:29     ` Christoph Lameter
  2010-01-06 19:23       ` Mike Frysinger
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2010-01-05 22:29 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Mathieu Desnoyers, Tejun Heo, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2146 bytes --]

On Tue, 5 Jan 2010, Mike Frysinger wrote:

> On Tue, Jan 5, 2010 at 17:04, Christoph Lameter wrote:
> > --- /dev/null   1970-01-01 00:00:00.000000000 +0000
> > +++ linux-2.6/include/asm-generic/add-local.h   2010-01-05 15:36:02.000000000 -0600
> > @@ -0,0 +1,13 @@
> > +#ifndef __ASM_GENERIC_ADD_LOCAL_H
> > +#define __ASM_GENERIC_ADD_LOCAL_H
>
> needs comment header (blurb/copyright/license)

A simple small include file? Really?

> > +#define add_return_local(ptr, v)                                       \
> > +       ((__typeof__(*(ptr)))__add_return_local_generic((ptr),          \
> > +                       (unsigned long)(v), sizeof(*(ptr))))
> > +
> > +#define add_local(ptr, v) (void)__add_return_local_generic((ptr),      \
> > +                       (unsigned long)(v), sizeof(*(ptr)))
>
> the only difference here is the cast ... perhaps define an
> _add_local() macro to avoid the duplication

Right. This is a shortcut to define the add_local() operation without too
much effort. We should be using add/inc/dec there at some point.

> > --- /dev/null   1970-01-01 00:00:00.000000000 +0000
> > +++ linux-2.6/arch/alpha/include/asm/add-local.h        2010-01-05 15:36:02.000000000 -0600
> > @@ -0,0 +1,2 @@
> > +#include <asm-generic/add-local.h>
> > +
>
> these arch stubs all have an extra new line

That's bad?

> > +/*
> > + * Generic version of __add_return_local (disables interrupts). Takes an
> > + * unsigned long parameter, supporting various types of architectures.
> > + */
> > +static inline unsigned long __add_return_local_generic(volatile void *ptr,
> > +               unsigned long value, int size)
>
> size_t for size ?

No. It can be anything.

> > +extern unsigned long wrong_size_add_local(volatile void *ptr);
> > ...
> > +       /*
> > +        * Sanity checking, compile-time.
> > +        */
> > +       if (size == 8 && sizeof(unsigned long) != 8)
> > +               wrong_size_add_local(ptr);
> > ...
> > +       default:
> > +               wrong_size_add_local(ptr);
> > +       }
>
> should be BUILD_BUG_ON() ?

Does not work there.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 0/4] Remove local_t
  2010-01-05 22:23 ` [RFC local_t removal V1 0/4] Remove local_t Mathieu Desnoyers
@ 2010-01-05 22:34   ` Christoph Lameter
  2010-01-05 22:45     ` Mathieu Desnoyers
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2010-01-05 22:34 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

On Tue, 5 Jan 2010, Mathieu Desnoyers wrote:

> Yes, removing the local_t type could make sense. However, local_t maps
> to a volatile long, not just "long". Secondly, I am concerned about the
> fact that the patch you propose:

Volatile is discouraged as far as I can tell.

> - does not create the primitives I use in lttng
> - only deals with x86

As I said, it's an RFC. This provides all the functionality you need,
though. The rest is sugar coating.

> In lttng (which is out of tree, but widely used), I need the equivalent
> of:
>
> local_read
> local_set
> local_add
> local_cmpxchg
> local_add_return
> local_inc

Please read the patch! This is all provided. add_local_return in the RFC
provides what is needed to replace local_add, local_inc. We can add these
explicitly.

local_cmpxchg replacement is already in there in the form of
cmpxchg_local().

> The approach of just doing the x86 implementation and leaving all the
> other architectures "for later" with a slow/non atomic generic fallback
> is, imho, a no-go, given that some people (myself, actually) already
> took the time to go through all the kernel architectures to create the
> optimized local.h headers. Basically, you are destroying all that work,
> asking for it to be done all over again.

As I said, this is an RFC. I can easily generate all these things from the
existing local.hs for the architectures.

> I therefore argue that we should keep local.h as-is as long as the
> replacement lacks the wide architecture support and primitive variety
> found in local.h.

Cool down and please review the patch.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 0/4] Remove local_t
  2010-01-05 22:34   ` Christoph Lameter
@ 2010-01-05 22:45     ` Mathieu Desnoyers
  2010-01-07 17:05       ` Christoph Lameter
  0 siblings, 1 reply; 24+ messages in thread
From: Mathieu Desnoyers @ 2010-01-05 22:45 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tejun Heo, linux-kernel

* Christoph Lameter (cl@linux-foundation.org) wrote:
> On Tue, 5 Jan 2010, Mathieu Desnoyers wrote:
> 
> > Yes, removing the local_t type could make sense. However, local_t maps
> > to a volatile long, not just "long". Secondly, I am concerned about the
> > fact that the patch you propose:
> 
> Volatile is discouraged as far as I can tell.

If you want to ensure that a simple variable assignment or read
(local_set, local_read) is not performed piecewise by the compiler,
which can cause odd effects when the value is shared with interrupt
handlers, volatile will still be necessary.
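
A minimal sketch of what such encapsulated accessors could look like
without the local_t wrapper (the read_local()/set_local() names are
hypothetical; the volatile cast is the same idea as the kernel's
ACCESS_ONCE()):

#define read_local(ptr)		(*(volatile typeof(*(ptr)) *)(ptr))
#define set_local(ptr, v)	(*(volatile typeof(*(ptr)) *)(ptr) = (v))

This is meant to keep loads and stores of an aligned long as single
accesses while leaving the arithmetic to add_local()/cmpxchg_local().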


> 
> > - does not create the primitives I use in lttng
> > - only deals with x86
> 
> As I said its an RFC. This provides all the functionality you need
> through. The rest is sugar coating.

OK

> 
> > In lttng (which is out of tree, but widely used), I need the equivalent
> > of:
> >
> > local_read
> > local_set
> > local_add
> > local_cmpxchg
> > local_add_return
> > local_inc
> 
> Please read the patch! This is all provided. add_local_return in the RFC
> provides what is needed to replace local_add, local_inc. We can add these
> explicitly.
> 
> local_cmpxchg replacement is already in there in the form of
> cmpxchg_local().
> 
> > The approach of just doing the x86 implementation and leaving all the
> > other architectures "for later" with a slow/non atomic generic fallback
> > is, imho, a no-go, given that some people (myself, actually) already
> > took the time to go through all the kernel architectures to create the
> > optimized local.h headers. Basically, you are destroying all that work,
> > asking for it to be done all over again.
> 
> AS I said this is an RFC. I can easily generate all these things from the
> existing local.hs for the architectures.
> 
> > I therefore argue that we should keep local.h as-is as long as the
> > replacement lacks the wide architecture support and primitive variety
> > found in local.h.
> 
> Cool down and please review the patch.

OK :)

Mathieu
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and add_local_return()
  2010-01-05 22:04 ` [RFC local_t removal V1 1/4] Add add_local() and add_local_return() Christoph Lameter
  2010-01-05 22:17   ` Mike Frysinger
@ 2010-01-05 22:49   ` Mathieu Desnoyers
  2010-01-07 13:45     ` Arnd Bergmann
  2010-01-07 17:07     ` Christoph Lameter
  1 sibling, 2 replies; 24+ messages in thread
From: Mathieu Desnoyers @ 2010-01-05 22:49 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tejun Heo, linux-kernel

* Christoph Lameter (cl@linux-foundation.org) wrote:
> We already have cmpxchg_local that can work on an arbitrary type. The cmpxchg
> is only safe from concurrent access on the local cpu.
> 
> Add another operation that in the similar fasion allow "local" safe adds to
> a variable.
> 
> Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
> 
> ---
>  arch/alpha/include/asm/add-local.h      |    2 +
>  arch/arm/include/asm/add-local.h        |    2 +
>  arch/avr32/include/asm/add-local.h      |    2 +
>  arch/blackfin/include/asm/add-local.h   |    2 +
>  arch/cris/include/asm/add-local.h       |    2 +
>  arch/frv/include/asm/add-local.h        |    2 +
>  arch/h8300/include/asm/add-local.h      |    2 +
>  arch/ia64/include/asm/add-local.h       |    2 +
>  arch/m32r/include/asm/add-local.h       |    2 +
>  arch/m68k/include/asm/add-local.h       |    2 +
>  arch/microblaze/include/asm/add-local.h |    2 +
>  arch/mips/include/asm/add-local.h       |    2 +
>  arch/mn10300/include/asm/add-local.h    |    2 +
>  arch/parisc/include/asm/add-local.h     |    2 +
>  arch/powerpc/include/asm/add-local.h    |    2 +
>  arch/s390/include/asm/add-local.h       |    2 +
>  arch/score/include/asm/add-local.h      |    2 +
>  arch/sh/include/asm/add-local.h         |    2 +
>  arch/sparc/include/asm/add-local.h      |    2 +
>  arch/um/include/asm/add-local.h         |    2 +
>  arch/x86/include/asm/add-local.h        |    2 +
>  arch/xtensa/include/asm/add-local.h     |    2 +
>  include/asm-generic/add-local-generic.h |   40 ++++++++++++++++++++++++++++++++
>  include/asm-generic/add-local.h         |   13 ++++++++++
>  24 files changed, 97 insertions(+)

The problem I see here is that with ~5-6 operations, we will end up
having 20*5 = 100 headers only for this. Can we combine these in a
single header file instead ? local.h wasn't bad in this respect.

Also, separating all these in sub-files will make it a bit difficult to
pinpoint errors that would arise from using a bad combination of, e.g.
generic add with a non-protected read or set.

Thanks,

Mathieu

> 
> Index: linux-2.6/include/asm-generic/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/include/asm-generic/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,13 @@
> +#ifndef __ASM_GENERIC_ADD_LOCAL_H
> +#define __ASM_GENERIC_ADD_LOCAL_H
> +
> +#include <asm-generic/add-local-generic.h>
> +
> +#define add_return_local(ptr, v)				  	\
> +	((__typeof__(*(ptr)))__add_return_local_generic((ptr), 		\
> +			(unsigned long)(v), sizeof(*(ptr))))
> +
> +#define add_local(ptr, v) (void)__add_return_local_generic((ptr),	\
> +			(unsigned long)(v), sizeof(*(ptr)))
> +
> +#endif
> Index: linux-2.6/arch/alpha/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/alpha/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/arm/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/arm/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/avr32/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/avr32/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/blackfin/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/blackfin/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/cris/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/cris/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/frv/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/frv/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/h8300/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/h8300/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/ia64/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/ia64/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/m32r/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/m32r/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/m68k/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/m68k/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/microblaze/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/microblaze/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/mips/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/mips/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/mn10300/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/mn10300/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/parisc/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/parisc/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/powerpc/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/powerpc/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/s390/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/s390/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/score/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/score/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/sh/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/sh/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/sparc/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/sparc/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/um/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/um/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/x86/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/x86/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/arch/xtensa/include/asm/add-local.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/arch/xtensa/include/asm/add-local.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,2 @@
> +#include <asm-generic/add-local.h>
> +
> Index: linux-2.6/include/asm-generic/add-local-generic.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6/include/asm-generic/add-local-generic.h	2010-01-05 15:36:02.000000000 -0600
> @@ -0,0 +1,40 @@
> +#ifndef __ASM_GENERIC_ADD_LOCAL_GENERIC_H
> +#define __ASM_GENERIC_ADD_LOCAL_GENERIC_H
> +
> +#include <linux/types.h>
> +
> +extern unsigned long wrong_size_add_local(volatile void *ptr);
> +
> +/*
> + * Generic version of __add_return_local (disables interrupts). Takes an
> + * unsigned long parameter, supporting various types of architectures.
> + */
> +static inline unsigned long __add_return_local_generic(volatile void *ptr,
> +		unsigned long value, int size)
> +{
> +	unsigned long flags, r;
> +
> +	/*
> +	 * Sanity checking, compile-time.
> +	 */
> +	if (size == 8 && sizeof(unsigned long) != 8)
> +		wrong_size_add_local(ptr);
> +
> +	local_irq_save(flags);
> +	switch (size) {
> +	case 1: r = (*((u8 *)ptr) += value);
> +		break;
> +	case 2: r = (*((u16 *)ptr) += value);
> +		break;
> +	case 4: r = (*((u32 *)ptr) += value);
> +		break;
> +	case 8: r = (*((u64 *)ptr) += value);
> +		break;
> +	default:
> +		wrong_size_add_local(ptr);
> +	}
> +	local_irq_restore(flags);
> +	return r;
> +}
> +
> +#endif
> 
> -- 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem
  2010-01-05 22:04 ` [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem Christoph Lameter
@ 2010-01-05 22:57   ` Mathieu Desnoyers
  2010-01-07 17:15     ` Christoph Lameter
  2010-01-14  2:56   ` Steven Rostedt
  1 sibling, 1 reply; 24+ messages in thread
From: Mathieu Desnoyers @ 2010-01-05 22:57 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tejun Heo, linux-kernel

* Christoph Lameter (cl@linux-foundation.org) wrote:
> Replace the local_t use with longs in the trace subsystem. The longs can then be
> updated with the required level of concurrency protection through cmpxchg_local()
> and add_local().
> 
> Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
> 
> ---
>  kernel/trace/ring_buffer.c           |  150 +++++++++++++++++------------------
>  kernel/trace/ring_buffer_benchmark.c |    5 -
>  kernel/trace/trace_branch.c          |    1 
>  3 files changed, 77 insertions(+), 79 deletions(-)
> 
> Index: linux-2.6/kernel/trace/ring_buffer.c
> ===================================================================
> --- linux-2.6.orig/kernel/trace/ring_buffer.c	2010-01-05 15:35:35.000000000 -0600
> +++ linux-2.6/kernel/trace/ring_buffer.c	2010-01-05 15:36:07.000000000 -0600
> @@ -20,7 +20,7 @@
>  #include <linux/cpu.h>
>  #include <linux/fs.h>
>  
> -#include <asm/local.h>
> +#include <asm/add-local.h>
>  #include "trace.h"
>  
>  /*
> @@ -312,7 +312,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data
>  
>  struct buffer_data_page {
>  	u64		 time_stamp;	/* page time stamp */
> -	local_t		 commit;	/* write committed index */
> +	long		 commit;	/* write committed index */
>  	unsigned char	 data[];	/* data of buffer page */
>  };
>  
> @@ -326,9 +326,9 @@ struct buffer_data_page {
>   */
>  struct buffer_page {
>  	struct list_head list;		/* list of buffer pages */
> -	local_t		 write;		/* index for next write */
> +	long		 write;		/* index for next write */
>  	unsigned	 read;		/* index for next read */
> -	local_t		 entries;	/* entries on this page */
> +	long		 entries;	/* entries on this page */
>  	struct buffer_data_page *page;	/* Actual data page */
>  };
>  
> @@ -349,7 +349,7 @@ struct buffer_page {
>  
>  static void rb_init_page(struct buffer_data_page *bpage)
>  {
> -	local_set(&bpage->commit, 0);
> +	bpage->commit = 0;

This is incorrect. You are turning a "volatile" write into a
non-volatile write, which can be turned into multiple writes by the
compiler and can therefore expose inconsistent state to interrupt handlers.
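
One way the single-store guarantee could be kept on this line without
local_t would be a volatile store, for instance via ACCESS_ONCE() from
<linux/compiler.h> (a hedged suggestion; the posted patch does not do
this):

	ACCESS_ONCE(bpage->commit) = 0;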

>  }
>  
>  /**
> @@ -360,7 +360,7 @@ static void rb_init_page(struct buffer_d
>   */
>  size_t ring_buffer_page_len(void *page)
>  {
> -	return local_read(&((struct buffer_data_page *)page)->commit)
> +	return ((struct buffer_data_page *)page)->commit
>  		+ BUF_PAGE_HDR_SIZE;
>  }
>  
> @@ -431,11 +431,11 @@ struct ring_buffer_per_cpu {
>  	struct buffer_page		*tail_page;	/* write to tail */
>  	struct buffer_page		*commit_page;	/* committed pages */
>  	struct buffer_page		*reader_page;
> -	local_t				commit_overrun;
> -	local_t				overrun;
> -	local_t				entries;
> -	local_t				committing;
> -	local_t				commits;
> +	long				commit_overrun;
> +	long				overrun;
> +	long				entries;
> +	long				committing;
> +	long				commits;
>  	unsigned long			read;
>  	u64				write_stamp;
>  	u64				read_stamp;
> @@ -824,8 +824,8 @@ static int rb_tail_page_update(struct ri
>  	 *
>  	 * We add a counter to the write field to denote this.
>  	 */
> -	old_write = local_add_return(RB_WRITE_INTCNT, &next_page->write);
> -	old_entries = local_add_return(RB_WRITE_INTCNT, &next_page->entries);
> +	old_write = add_return_local(&next_page->write, RB_WRITE_INTCNT);
> +	old_entries = add_return_local(&next_page->entries, RB_WRITE_INTCNT);
>  
>  	/*
>  	 * Just make sure we have seen our old_write and synchronize
> @@ -853,15 +853,15 @@ static int rb_tail_page_update(struct ri
>  		 * cmpxchg to only update if an interrupt did not already
>  		 * do it for us. If the cmpxchg fails, we don't care.
>  		 */
> -		(void)local_cmpxchg(&next_page->write, old_write, val);
> -		(void)local_cmpxchg(&next_page->entries, old_entries, eval);
> +		(void)cmpxchg_local(&next_page->write, old_write, val);
> +		(void)cmpxchg_local(&next_page->entries, old_entries, eval);
>  
>  		/*
>  		 * No need to worry about races with clearing out the commit.
>  		 * it only can increment when a commit takes place. But that
>  		 * only happens in the outer most nested commit.
>  		 */
> -		local_set(&next_page->page->commit, 0);
> +		next_page->page->commit = 0;
>  
>  		old_tail = cmpxchg(&cpu_buffer->tail_page,
>  				   tail_page, next_page);
> @@ -1394,17 +1394,17 @@ rb_iter_head_event(struct ring_buffer_it
>  
>  static inline unsigned long rb_page_write(struct buffer_page *bpage)
>  {
> -	return local_read(&bpage->write) & RB_WRITE_MASK;
> +	return bpage->write & RB_WRITE_MASK;

Same problem here: missing volatile for the read. The same applies
throughout the patch.

>  }
>  
>  static inline unsigned rb_page_commit(struct buffer_page *bpage)
>  {
> -	return local_read(&bpage->page->commit);
> +	return bpage->page->commit;
>  }
>  
>  static inline unsigned long rb_page_entries(struct buffer_page *bpage)
>  {
> -	return local_read(&bpage->entries) & RB_WRITE_MASK;
> +	return bpage->entries & RB_WRITE_MASK;
>  }
>  
>  /* Size is determined by what has been commited */
> @@ -1463,8 +1463,8 @@ rb_set_commit_to_write(struct ring_buffe
>  		if (RB_WARN_ON(cpu_buffer,
>  			       rb_is_reader_page(cpu_buffer->tail_page)))
>  			return;
> -		local_set(&cpu_buffer->commit_page->page->commit,
> -			  rb_page_write(cpu_buffer->commit_page));
> +		cpu_buffer->commit_page->page->commit =
> +			  rb_page_write(cpu_buffer->commit_page);
>  		rb_inc_page(cpu_buffer, &cpu_buffer->commit_page);
>  		cpu_buffer->write_stamp =
>  			cpu_buffer->commit_page->page->time_stamp;
> @@ -1474,10 +1474,10 @@ rb_set_commit_to_write(struct ring_buffe
>  	while (rb_commit_index(cpu_buffer) !=
>  	       rb_page_write(cpu_buffer->commit_page)) {
>  
> -		local_set(&cpu_buffer->commit_page->page->commit,
> -			  rb_page_write(cpu_buffer->commit_page));
> +		cpu_buffer->commit_page->page->commit =
> +			  rb_page_write(cpu_buffer->commit_page);
>  		RB_WARN_ON(cpu_buffer,
> -			   local_read(&cpu_buffer->commit_page->page->commit) &
> +			   cpu_buffer->commit_page->page->commit &
>  			   ~RB_WRITE_MASK);
>  		barrier();
>  	}
> @@ -1600,7 +1600,7 @@ rb_handle_head_page(struct ring_buffer_p
>  		 * it is our responsibility to update
>  		 * the counters.
>  		 */
> -		local_add(entries, &cpu_buffer->overrun);
> +		add_local(&cpu_buffer->overrun, entries);
>  
>  		/*
>  		 * The entries will be zeroed out when we move the
> @@ -1741,7 +1741,7 @@ rb_reset_tail(struct ring_buffer_per_cpu
>  	 * must fill the old tail_page with padding.
>  	 */
>  	if (tail >= BUF_PAGE_SIZE) {
> -		local_sub(length, &tail_page->write);
> +		add_local(&tail_page->write, -length);

[...]

If we can have inc/dec/sub already, that would be good, rather than
going with add -val. This would ensure that we don't do too much
ping-pong with the code using these primitives.
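
Those variants are cheap to layer on top of the primitives from patch
1/4; a possible sketch (names hypothetical, following the add_local()
naming):

#define inc_local(ptr)			add_local((ptr), 1)
#define dec_local(ptr)			add_local((ptr), -1)
#define sub_local(ptr, v)		add_local((ptr), -(v))
#define sub_return_local(ptr, v)	add_return_local((ptr), -(v))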

In the end, the fact that the missing volatile access bug crept up as
part of this patch makes me think that keeping local_t was doing a fine
encapsulation job. However, if we really want to go down the path of
removing this encapsulation, then we should:

- make sure that _all_ variable accesses are encapsulated, even
  read_local and set_local.
- put all this API into a single header per architecture, easy for
  people to find and understand, rather than multiple headers sprinkled
  all over the place.
- document that accessing the variables without the API violates the
  consistency guarantees.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 3/4] Optimized add_local()
  2010-01-05 22:04 ` [RFC local_t removal V1 3/4] Optimized add_local() Christoph Lameter
@ 2010-01-05 22:59   ` Mathieu Desnoyers
  2010-01-07 17:16     ` Christoph Lameter
  0 siblings, 1 reply; 24+ messages in thread
From: Mathieu Desnoyers @ 2010-01-05 22:59 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tejun Heo, linux-kernel

* Christoph Lameter (cl@linux-foundation.org) wrote:
> Use XADD to implement add_local().

xadd should only be used to implement add_local_return, not add_local.

add_local can be implemented with the "add" instruction, which is
significantly faster if my memory serves me correctly.
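
For comparison, a minimal sketch of an add-based add_local() on x86,
mirroring the local_add() that patch 4/4 deletes from
arch/x86/include/asm/local.h. The size switch and constraints are
illustrative only and assume <linux/types.h> plus the
wrong_size_add_local() declaration from patch 1/4; this is not code
from the posted series:

static inline void __add_local(volatile void *ptr, unsigned long value,
		int size)
{
	/* same compile-time sanity check as in patch 3/4 */
	if (size == 8 && sizeof(unsigned long) != 8)
		wrong_size_add_local(ptr);

	switch (size) {
	case 1:
		asm volatile("addb %1, %0" : "+m" (*(volatile u8 *)ptr)
			     : "iq" ((u8)value));
		break;
	case 2:
		asm volatile("addw %1, %0" : "+m" (*(volatile u16 *)ptr)
			     : "ir" ((u16)value));
		break;
	case 4:
		asm volatile("addl %1, %0" : "+m" (*(volatile u32 *)ptr)
			     : "ir" ((u32)value));
		break;
	case 8:
		asm volatile("addq %1, %0" : "+m" (*(volatile u64 *)ptr)
			     : "er" ((u64)value));
		break;
	default:
		wrong_size_add_local(ptr);
	}
}

Since nothing has to be returned, the add can be done directly on the
memory operand without the register round trip that xadd implies.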

Thanks,

Mathieu

> 
> Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
> 
> ---
>  arch/x86/include/asm/add-local.h |   56 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 55 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6/arch/x86/include/asm/add-local.h
> ===================================================================
> --- linux-2.6.orig/arch/x86/include/asm/add-local.h	2010-01-05 15:29:11.000000000 -0600
> +++ linux-2.6/arch/x86/include/asm/add-local.h	2010-01-05 15:33:59.000000000 -0600
> @@ -1,2 +1,56 @@
> -#include <asm-generic/add-local.h>
> +#ifndef __ASM_X86_ADD_LOCAL_H
> +#define __ASM_X86_ADD_LOCAL_H
> +
> +#include <linux/types.h>
> +#include <asm-generic/add-local-generic.h>
> +
> +static inline unsigned long __add_return_local(volatile void *ptr,
> +		unsigned long value, int size)
> +{
> +	unsigned long r;
> +
> +#ifdef CONFIG_M386
> +	if (unlikely(boot_cpu_data.x86 <= 3))
> +                return __add_return_local_generic(ptr, value, size);
> +#endif
> +
> +	/*
> +	 * Sanity checking, compile-time.
> +	 */
> +	if (size == 8 && sizeof(unsigned long) != 8)
> +		wrong_size_add_local(ptr);
> +
> +	r = value;
> +	switch (size) {
> +	case 1:
> +		asm volatile("xaddb %0, %1;": "+r" (r), "+m" (*((u8 *)ptr)):
> +			 : "memory");
> +		break;
> +	case 2:
> +		asm volatile("xaddw %0, %1;": "+r" (r), "+m" (*((u16 *)ptr)):
> +			 : "memory");
> +		break;
> +	case 4:
> +		asm volatile("xaddl %0, %1;": "+r" (r), "+m" (*((u32 *)ptr)):
> +			 : "memory");
> +		break;
> +	case 8:
> +		asm volatile("xaddq %0, %1;": "+r" (r), "+m" (*((u64 *)ptr)):
> +			 : "memory");
> +		break;
> +	default:
> +		wrong_size_add_local(ptr);
> +	}
> +	return r + value;
> +}
> +
> +#define add_return_local(ptr, v)				  	\
> +	((__typeof__(*(ptr)))__add_return_local((ptr), (unsigned long)(v), \
> +			sizeof(*(ptr))))
> +
> +#define add_local(ptr, v) (void)__add_return_local((ptr), (unsigned long)(v), \
> +			sizeof(*(ptr)))
> +
> +
> +#endif
>  
> 
> -- 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and  add_local_return()
  2010-01-05 22:29     ` Christoph Lameter
@ 2010-01-06 19:23       ` Mike Frysinger
  2010-01-07 17:03         ` Christoph Lameter
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Frysinger @ 2010-01-06 19:23 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Mathieu Desnoyers, Tejun Heo, linux-kernel

On Tue, Jan 5, 2010 at 17:29, Christoph Lameter wrote:
> On Tue, 5 Jan 2010, Mike Frysinger wrote:
>> On Tue, Jan 5, 2010 at 17:04, Christoph Lameter wrote:
>> > --- /dev/null   1970-01-01 00:00:00.000000000 +0000
>> > +++ linux-2.6/include/asm-generic/add-local.h   2010-01-05 15:36:02.000000000 -0600
>> > @@ -0,0 +1,13 @@
>> > +#ifndef __ASM_GENERIC_ADD_LOCAL_H
>> > +#define __ASM_GENERIC_ADD_LOCAL_H
>>
>> needs comment header (blurb/copyright/license)
>
> A simple small include file? Really?

if there's enough content to warrant protection against multiple
inclusion, then generally yes

>> > --- /dev/null   1970-01-01 00:00:00.000000000 +0000
>> > +++ linux-2.6/arch/alpha/include/asm/add-local.h        2010-01-05 15:36:02.000000000 -0600
>> > @@ -0,0 +1,2 @@
>> > +#include <asm-generic/add-local.h>
>> > +
>>
>> these arch stubs all have an extra new line
>
> That's bad?

files shouldn't have trailing newlines

>> > +/*
>> > + * Generic version of __add_return_local (disables interrupts). Takes an
>> > + * unsigned long parameter, supporting various types of architectures.
>> > + */
>> > +static inline unsigned long __add_return_local_generic(volatile void *ptr,
>> > +               unsigned long value, int size)
>>
>> size_t for size ?
>
> No. It can be anything.

you're passing it sizeof() which returns a size_t

>> > +extern unsigned long wrong_size_add_local(volatile void *ptr);
>> > ...
>> > +       /*
>> > +        * Sanity checking, compile-time.
>> > +        */
>> > +       if (size == 8 && sizeof(unsigned long) != 8)
>> > +               wrong_size_add_local(ptr);
>> > ...
>> > +       default:
>> > +               wrong_size_add_local(ptr);
>> > +       }
>>
>> should be BUILD_BUG_ON() ?
>
> Does not work there.

why not?  BUILD_BUG_ON() should work on any compile-time constant,
which sizeof() is ... unless you plan on letting people call this
function with arbitrary sizes?
-mike

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and add_local_return()
  2010-01-05 22:49   ` Mathieu Desnoyers
@ 2010-01-07 13:45     ` Arnd Bergmann
  2010-01-07 13:57       ` Mathieu Desnoyers
  2010-01-07 17:07     ` Christoph Lameter
  1 sibling, 1 reply; 24+ messages in thread
From: Arnd Bergmann @ 2010-01-07 13:45 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Christoph Lameter, Tejun Heo, linux-kernel

On Tuesday 05 January 2010, Mathieu Desnoyers wrote:
> 
> The problem I see here is that with ~5-6 operations, we will end up
> having 20*5 = 100 headers only for this. Can we combine these in a
> single header file instead ? local.h wasn't bad in this respect.

I have an old patch that I was planning to dig out for 2.6.34,
which autogenerates arch/*/include/foo.h files that only contain
"#include <asm-generic/foo.h>".

I guess this would be sufficient to avoid the overload with all
these header files.

> Also, separating all these in sub-files will make it a bit difficult to
> pinpoint errors that would arise from using a bad combination of, e.g.
> generic add with a non-protected read or set.

Yes.

> > --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> > +++ linux-2.6/include/asm-generic/add-local-generic.h	2010-01-05 15:36:02.000000000 -0600
> > @@ -0,0 +1,40 @@
> > +#ifndef __ASM_GENERIC_ADD_LOCAL_GENERIC_H
> > +#define __ASM_GENERIC_ADD_LOCAL_GENERIC_H

Why split the file between asm-generic/add-local.h and add-local-generic.h?
I don't see how any architecture could use one but not the other.

> > +#include <linux/types.h>
> > +
> > +extern unsigned long wrong_size_add_local(volatile void *ptr);
> > +
> > +/*
> > + * Generic version of __add_return_local (disables interrupts). Takes an
> > + * unsigned long parameter, supporting various types of architectures.
> > + */
> > +static inline unsigned long __add_return_local_generic(volatile void *ptr,
> > +		unsigned long value, int size)

You could probably lose the 'volatile' here, if you want to discourage
marking data as volatile in the code.

> > +{
> > +	unsigned long flags, r;
> > +
> > +	/*
> > +	 * Sanity checking, compile-time.
> > +	 */
> > +	if (size == 8 && sizeof(unsigned long) != 8)
> > +		wrong_size_add_local(ptr);

It can be BUILD_BUG_ON if you move it to the outer macro.
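
For instance (sketch only, assuming the outer wrapper looks like the
add_return_local() macro used elsewhere in the series):

#define add_return_local(ptr, v)					\
({									\
	BUILD_BUG_ON(sizeof(*(ptr)) == 8 &&				\
		     sizeof(unsigned long) != 8);			\
	(__typeof__(*(ptr)))__add_return_local((ptr),			\
			(unsigned long)(v), sizeof(*(ptr)));		\
})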

> > +	local_irq_save(flags);
> > +	switch (size) {
> > +	case 1: r = (*((u8 *)ptr) += value);
> > +		break;
> > +	case 2: r = (*((u16 *)ptr) += value);
> > +		break;
> > +	case 4: r = (*((u32 *)ptr) += value);
> > +		break;
> > +	case 8: r = (*((u64 *)ptr) += value);
> > +		break;

But I think here you actually need to add the volatile in order
to make these atomic assignments.

	Arnd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and add_local_return()
  2010-01-07 13:45     ` Arnd Bergmann
@ 2010-01-07 13:57       ` Mathieu Desnoyers
  2010-01-07 14:22         ` Arnd Bergmann
  0 siblings, 1 reply; 24+ messages in thread
From: Mathieu Desnoyers @ 2010-01-07 13:57 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Christoph Lameter, Tejun Heo, linux-kernel

* Arnd Bergmann (arnd@arndb.de) wrote:
> On Tuesday 05 January 2010, Mathieu Desnoyers wrote:
> > 
> > The problem I see here is that with ~5-6 operations, we will end up
> > having 20*5 = 100 headers only for this. Can we combine these in a
> > single header file instead ? local.h wasn't bad in this respect.
> 
> I have an old patch that I was planning to dig out for 2.6.34,
> which autogenerates arch/*/include/foo.h files that only contain
> "#include <asm-generic/foo.h>".
> 
> I guess this would be sufficient to avoid the overload with all
> these header files.

Well, given we already have local.h, I am not completely sure what this
whole exercise is giving us.

[...]

> > > +#include <linux/types.h>
> > > +
> > > +extern unsigned long wrong_size_add_local(volatile void *ptr);
> > > +
> > > +/*
> > > + * Generic version of __add_return_local (disables interrupts). Takes an
> > > + * unsigned long parameter, supporting various types of architectures.
> > > + */
> > > +static inline unsigned long __add_return_local_generic(volatile void *ptr,
> > > +		unsigned long value, int size)
> 
> You could probably lose the 'volatile' here, if you want to discourage
> marking data as volatile in the code.
> 
> > > +{
> > > +	unsigned long flags, r;
> > > +
> > > +	/*
> > > +	 * Sanity checking, compile-time.
> > > +	 */
> > > +	if (size == 8 && sizeof(unsigned long) != 8)
> > > +		wrong_size_add_local(ptr);
> 
> It can be BUILD_BUG_ON if you move it to the outer macro.
> 
> > > +	local_irq_save(flags);
> > > +	switch (size) {
> > > +	case 1: r = (*((u8 *)ptr) += value);
> > > +		break;
> > > +	case 2: r = (*((u16 *)ptr) += value);
> > > +		break;
> > > +	case 4: r = (*((u32 *)ptr) += value);
> > > +		break;
> > > +	case 8: r = (*((u64 *)ptr) += value);
> > > +		break;
> 
> But I think here you actually need to add the volatile in order
> to make these atomic assignments.

Yes, you are right. If we ever try to access these variables from a
remote CPU with a load (but not with any concurrent store operation, as
this would be semantically invalid), then the volatile is important.

Mathieu

> 
> 	Arnd

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and add_local_return()
  2010-01-07 13:57       ` Mathieu Desnoyers
@ 2010-01-07 14:22         ` Arnd Bergmann
  0 siblings, 0 replies; 24+ messages in thread
From: Arnd Bergmann @ 2010-01-07 14:22 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Christoph Lameter, Tejun Heo, linux-kernel

On Thursday 07 January 2010, Mathieu Desnoyers wrote:
> * Arnd Bergmann (arnd@arndb.de) wrote:
> > > > + local_irq_save(flags);
> > > > + switch (size) {
> > > > + case 1: r = (*((u8 *)ptr) += value);
> > > > +         break;
> > > > + case 2: r = (*((u16 *)ptr) += value);
> > > > +         break;
> > > > + case 4: r = (*((u32 *)ptr) += value);
> > > > +         break;
> > > > + case 8: r = (*((u64 *)ptr) += value);
> > > > +         break;
> > 
> > But I think here you actually need to add the volatile in order
> > to make these atomic assignments.
> 
> Yes, you are right. If we ever try to access these variables from a
> remote CPU with a load (but not with any concurrent store operation, as
> this would be semantically invalid), then the volatile is important.

Just to make sure everyone has the same understanding: we need the volatile
in the cast on these lines, not the one in the function prototype, which
only serves to avoid warnings and has no impact on the object code once
we cast the pointer to a non-volatile type for the assignment.
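
In other words, something along these lines for the generic fallback
(sketch only, 4-byte case shown as an illustration):

static inline unsigned long __add_return_local_4(void *ptr,
						 unsigned long value)
{
	unsigned long flags, r;

	local_irq_save(flags);
	/* the volatile cast keeps the compiler from splitting the access */
	r = (*(volatile u32 *)ptr += value);
	local_irq_restore(flags);
	return r;
}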

	Arnd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and  add_local_return()
  2010-01-06 19:23       ` Mike Frysinger
@ 2010-01-07 17:03         ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-07 17:03 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Mathieu Desnoyers, Tejun Heo, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 359 bytes --]

On Wed, 6 Jan 2010, Mike Frysinger wrote:

> >> > +static inline unsigned long __add_return_local_generic(volatile void *ptr,
> >> > +               unsigned long value, int size)
> >>
> >> size_t for size ?
> >
> > No. It can be anything.
>
> you're passing it sizeof() which returns a size_t

Ahh, not value but the size parameter... OK.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 0/4] Remove local_t
  2010-01-05 22:45     ` Mathieu Desnoyers
@ 2010-01-07 17:05       ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-07 17:05 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

On Tue, 5 Jan 2010, Mathieu Desnoyers wrote:

> > Volatile is discouraged as far as I can tell.
>
> If you want to ensure that a simple variable assignment or read
> (local_set, local_read) are not performed piecewise by the compiler
> which can cause odd effects when shared with interrupt handlers, this
> will however be necessary.

Piecewise? Assignment of a scalar or a pointer is an atomic operation by
default. Lots of things will break if that is not true.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 1/4] Add add_local() and add_local_return()
  2010-01-05 22:49   ` Mathieu Desnoyers
  2010-01-07 13:45     ` Arnd Bergmann
@ 2010-01-07 17:07     ` Christoph Lameter
  1 sibling, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-07 17:07 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

On Tue, 5 Jan 2010, Mathieu Desnoyers wrote:

> The problem I see here is that with ~5-6 operations, we will end up
> having 20*5 = 100 headers only for this. Can we combine these in a
> single header file instead ? local.h wasn't bad in this respect.

We could actually keep local.h and just thin it out a bit. Get rid of the
local_t type and use long instead? Then make the local_inc/local_add work
on an arbitrary scalar like cmpxchg_local?
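
Roughly like this (sketch only, reusing __add_return_local() from this
series for the actual work):

#define local_add(ptr, v)						\
	((void)__add_return_local((ptr), (unsigned long)(v),		\
				  sizeof(*(ptr))))
#define local_inc(ptr)		local_add((ptr), 1)
#define local_dec(ptr)		local_add((ptr), -1)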

> Also, separating all these in sub-files will make it a bit difficult to
> pinpoint errors that would arise from using a bad combination of, e.g.
> generic add with a non-protected read or set.

Yes surely I dont want many files. I thought about adding

#define inc_local(x) add_local(x, 1)

etc to add-local.h


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem
  2010-01-05 22:57   ` Mathieu Desnoyers
@ 2010-01-07 17:15     ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-07 17:15 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

On Tue, 5 Jan 2010, Mathieu Desnoyers wrote:

> >  static void rb_init_page(struct buffer_data_page *bpage)
> >  {
> > -	local_set(&bpage->commit, 0);
> > +	bpage->commit = 0;
>
> This is incorrect. You are turning a "volatile" write into a
> non-volatile write, which can be turned into multiple writes by the
> compiler and therefore expose inconsistent state to interrupt handlers.

The structure is being set up and therefore not reachable by anyone?

Even if that is not the case: the assignment of a scalar is atomic.

> >  static inline unsigned long rb_page_write(struct buffer_page *bpage)
> >  {
> > -	return local_read(&bpage->write) & RB_WRITE_MASK;
> > +	return bpage->write & RB_WRITE_MASK;
>
> Same problem here: missing volatile for read. Same applies thorough the
> patch.

Reading of a scalar is atomic.

> >  	if (tail >= BUF_PAGE_SIZE) {
> > -		local_sub(length, &tail_page->write);
> > +		add_local(&tail_page->write, -length);
>
> [...]
>
> If we can have inc/dec/sub already, that would be good, rather than
> going with add -val. This would ensure that we don't do too much
> ping-pong with the code using these primitives.

ok.

> In the end, the fact that the missing volatile access bug crept up as
> part of this patch makes me think that keeping local_t was doing a fine
> encapsulation job. However, if we really want to go down the path of
> removing this encapsulation, then we should:

I am not sure that there is anything to be gained by volatile.

> - make sure that _all_ variable accesses are encapsulated, even
>   read_local and set_local.
> - put all this API into a single header per architecture, easy for
>   people to find and understand, rather than multiple headers sprinkled
>   all over the place.
> - document that accessing the variables without the API violates the
>   consistency guarantees.

Then we had better leave things as they are. local.h would then become a
private set of operations for the ring buffer?

I'd like to see local operations that are also generically usable in
other subsystems: local operations that work generically on scalars
(pointer, int, long, etc.), like cmpxchg_local.

The only new operation needed for ring_buffer.c is add_local().
Sugarcoating with inc/dec/sub can easily be added, and add_local() can be
modified to generate inc/dec if it sees a constant 1 or -1 being passed
to it.
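
On x86-64 that could look roughly like this (sketch only; assumes a
long-sized operand and falls back to the xadd-based helper from patch
3/4 otherwise):

#define add_local(ptr, v)						\
do {									\
	if (__builtin_constant_p(v) && (v) == 1)			\
		asm volatile("incq %0" : "+m" (*(ptr)));		\
	else if (__builtin_constant_p(v) && (v) == -1)			\
		asm volatile("decq %0" : "+m" (*(ptr)));		\
	else								\
		(void)__add_return_local((ptr), (unsigned long)(v),	\
					 sizeof(*(ptr)));		\
} while (0)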




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 3/4] Optimized add_local()
  2010-01-05 22:59   ` Mathieu Desnoyers
@ 2010-01-07 17:16     ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-07 17:16 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Tejun Heo, linux-kernel

On Tue, 5 Jan 2010, Mathieu Desnoyers wrote:

> add_local can be implemented with the "add" instruction, which is
> significantly faster if my memory serves me correctly.

Yes, a full patchset would do that. Let's first see if we can come to an
agreement to go down this road before I put the effort in.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem
  2010-01-05 22:04 ` [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem Christoph Lameter
  2010-01-05 22:57   ` Mathieu Desnoyers
@ 2010-01-14  2:56   ` Steven Rostedt
  2010-01-14 14:49     ` Christoph Lameter
  1 sibling, 1 reply; 24+ messages in thread
From: Steven Rostedt @ 2010-01-14  2:56 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Mathieu Desnoyers, Tejun Heo, linux-kernel

Please Cc me on changes to the ring buffer.

On Tue, 2010-01-05 at 16:04 -0600, Christoph Lameter wrote:
> plain text document attachment (remove_local_t_ringbuffer_convert)
> Replace the local_t use with longs in the trace subsystem. The longs can then be
> updated with the required level of concurrency protection through cmpxchg_local()
> and add_local().
> 
> Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
> 


> @@ -1741,7 +1741,7 @@ rb_reset_tail(struct ring_buffer_per_cpu
>  	 * must fill the old tail_page with padding.
>  	 */
>  	if (tail >= BUF_PAGE_SIZE) {
> -		local_sub(length, &tail_page->write);
> +		add_local(&tail_page->write, -length);

Can't you have a helper function or macro that does:

#define sub_local(local, val)	add_local(local, -(val))

>  		return;
>  	}
>  


>  static struct ring_buffer_event *
> @@ -1801,7 +1801,7 @@ rb_move_tail(struct ring_buffer_per_cpu 
>  	 * about it.
>  	 */
>  	if (unlikely(next_page == commit_page)) {
> -		local_inc(&cpu_buffer->commit_overrun);
> +		add_local(&cpu_buffer->commit_overrun, 1);

As well as a:

#define inc_local(local)	add_local(local, 1)

>  		goto out_reset;
>  	}
>  


>   again:
> -	commits = local_read(&cpu_buffer->commits);
> +	commits = cpu_buffer->commits;
>  	/* synchronize with interrupts */
>  	barrier();
> -	if (local_read(&cpu_buffer->committing) == 1)
> +	if (cpu_buffer->committing == 1)
>  		rb_set_commit_to_write(cpu_buffer);
>  
> -	local_dec(&cpu_buffer->committing);
> +	add_local(&cpu_buffer->committing, -1);

As well as a:

#define dec_local(local)	add_local(local, -1)


>  
>  	/* synchronize with interrupts */
>  	barrier();
> @@ -2059,9 +2059,9 @@ static void rb_end_commit(struct ring_bu
>  	 * updating of the commit page and the clearing of the
>  	 * committing counter.
>  	 */

The reason I ask for the above is that it stands out much better. I need
to match up the subtract and inc/dec pairs, so I usually end up scanning
for a matching "dec" for an "inc", or a matching "sub" for an "add";
having a "-1" to match a "1" is not going to stand out, and I envision a
lot of stupid bugs happening because of this.

-- Steve



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem
  2010-01-14  2:56   ` Steven Rostedt
@ 2010-01-14 14:49     ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2010-01-14 14:49 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Mathieu Desnoyers, Tejun Heo, linux-kernel

Please have a look at V2 which has these changes.


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-01-14 14:54 UTC | newest]

Thread overview: 24+ messages
2010-01-05 22:04 [RFC local_t removal V1 0/4] Remove local_t Christoph Lameter
2010-01-05 22:04 ` [RFC local_t removal V1 1/4] Add add_local() and add_local_return() Christoph Lameter
2010-01-05 22:17   ` Mike Frysinger
2010-01-05 22:29     ` Christoph Lameter
2010-01-06 19:23       ` Mike Frysinger
2010-01-07 17:03         ` Christoph Lameter
2010-01-05 22:49   ` Mathieu Desnoyers
2010-01-07 13:45     ` Arnd Bergmann
2010-01-07 13:57       ` Mathieu Desnoyers
2010-01-07 14:22         ` Arnd Bergmann
2010-01-07 17:07     ` Christoph Lameter
2010-01-05 22:04 ` [RFC local_t removal V1 2/4] Replace local_t use in trace subsystem Christoph Lameter
2010-01-05 22:57   ` Mathieu Desnoyers
2010-01-07 17:15     ` Christoph Lameter
2010-01-14  2:56   ` Steven Rostedt
2010-01-14 14:49     ` Christoph Lameter
2010-01-05 22:04 ` [RFC local_t removal V1 3/4] Optimized add_local() Christoph Lameter
2010-01-05 22:59   ` Mathieu Desnoyers
2010-01-07 17:16     ` Christoph Lameter
2010-01-05 22:04 ` [RFC local_t removal V1 4/4] Remove local_t support Christoph Lameter
2010-01-05 22:23 ` [RFC local_t removal V1 0/4] Remove local_t Mathieu Desnoyers
2010-01-05 22:34   ` Christoph Lameter
2010-01-05 22:45     ` Mathieu Desnoyers
2010-01-07 17:05       ` Christoph Lameter
