linux-arm-kernel.lists.infradead.org archive mirror
* [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr()
@ 2015-03-16 21:31 Steven Rostedt
  2015-03-17  5:56 ` Christoph Lameter
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2015-03-16 21:31 UTC (permalink / raw)
  To: linux-arm-kernel

It has come to my attention that this_cpu_read/write are horrible on
architectures other than x86. Worse yet, they actually disable
preemption or interrupts! This caused some unexpected tracing results
on ARM.

   101.356868: preempt_count_add <-ring_buffer_lock_reserve
   101.356870: preempt_count_sub <-ring_buffer_lock_reserve

The ring_buffer_lock_reserve has recursion protection that requires
accessing a per-cpu variable. But since preempt_disable() is itself
traced, it too got traced while accessing the very variable that is
supposed to prevent recursion like this.

The generic version of this_cpu_read() and write() are:

#define _this_cpu_generic_read(pcp)					\
({	typeof(pcp) ret__;						\
	preempt_disable();						\
	ret__ = *this_cpu_ptr(&(pcp));					\
	preempt_enable();						\
	ret__;								\
})

#define _this_cpu_generic_to_op(pcp, val, op)				\
do {									\
	unsigned long flags;						\
	raw_local_irq_save(flags);					\
	*__this_cpu_ptr(&(pcp)) op val;					\
	raw_local_irq_restore(flags);					\
} while (0)


Which is unacceptable for locations that know they are within preempt
disabled or interrupt disabled locations.

I may go and remove all this_cpu_read,write() calls from my code
because of this.

Cc: stable@vger.kernel.org
Cc: Christoph Lameter <cl@linux.com>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 5040d44fe5a3..be33c6093ca5 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2679,7 +2679,11 @@ static DEFINE_PER_CPU(unsigned int, current_context);
 
 static __always_inline int trace_recursive_lock(void)
 {
-	unsigned int val = this_cpu_read(current_context);
+	/*
+	 * We can not use this_cpu_read() and this_cpu_write() because
+	 * the generic versions call preempt_disable()
+	 */
+	unsigned int val = *this_cpu_ptr(&current_context);
 	int bit;
 
 	if (in_interrupt()) {
@@ -2696,18 +2700,18 @@ static __always_inline int trace_recursive_lock(void)
 		return 1;
 
 	val |= (1 << bit);
-	this_cpu_write(current_context, val);
+	*this_cpu_ptr(&current_context) = val;
 
 	return 0;
 }
 
 static __always_inline void trace_recursive_unlock(void)
 {
-	unsigned int val = this_cpu_read(current_context);
+	unsigned int val = *this_cpu_ptr(&current_context);
 
 	val--;
 	val &= this_cpu_read(current_context);
-	this_cpu_write(current_context, val);
+	*this_cpu_ptr(&current_context) = val;
 }
 
 #else


* [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr()
  2015-03-16 21:31 [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr() Steven Rostedt
@ 2015-03-17  5:56 ` Christoph Lameter
  2015-03-17 12:13   ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Lameter @ 2015-03-17  5:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 16 Mar 2015, Steven Rostedt wrote:

> It has come to my attention that this_cpu_read/write are horrible on
> architectures other than x86. Worse yet, they actually disable
> preemption or interrupts! This caused some unexpected tracing results
> on ARM.

Well it's just been 7 years or so. Took a long time, it seems.

These would need to be implemented on the architectures to
have comparable performance.

> I may go and remove all this_cpu_read,write() calls from my code
> because of this.

You could do that with __this_cpu_* but not this_cpu_*(). Doing
it to this_cpu_* would make the operations no longer per-cpu atomic. If
they do not need per-cpu atomicity then you could have used __this_cpu_*
instead. And __this_cpu_* do not disable preemption or interrupts.

So please do not send patches based on gut reactions.

NAK


* [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr()
  2015-03-17  5:56 ` Christoph Lameter
@ 2015-03-17 12:13   ` Steven Rostedt
  2015-03-17 14:11     ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2015-03-17 12:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 17 Mar 2015 00:56:51 -0500 (CDT)
Christoph Lameter <cl@linux.com> wrote:

> On Mon, 16 Mar 2015, Steven Rostedt wrote:
> 
> > It has come to my attention that this_cpu_read/write are horrible on
> > architectures other than x86. Worse yet, they actually disable
> > preemption or interrupts! This caused some unexpected tracing results
> > on ARM.
> 
> Well it's just been 7 years or so. Took a long time, it seems.

The code that I added was not 7 years old. And not all people send me
reports like this.

> 
> These would need to be implemented on the architectures to
> have comparable performance.
> 
> > I may go and remove all this_cpu_read,write() calls from my code
> > because of this.
> 
> You could do that with __this_cpu_* but not this_cpu_*(). Doing
> it to this_cpu_* would make the operations no longer per cpu atomic. If
> they do not need per cpu atomicity then you could have used __this_cpu_*
> instead. And  __this_cpu_* do not disable preemption or interrupts.

I do not need it to be atomic.

> 
> So please do not send patches based on gut reactions.

What else would you like me to do? It was an RFC, and it worked.

> 
> NAK

For this particular patch, I may override the NAK as I do not see a
downside for it. Why should x86 get an advantage at the expense of ARM?

-- Steve


* [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr()
  2015-03-17 12:13   ` Steven Rostedt
@ 2015-03-17 14:11     ` Steven Rostedt
  2015-03-17 14:40       ` [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*() Steven Rostedt
  2015-03-19 16:19       ` [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr() Christoph Lameter
  0 siblings, 2 replies; 13+ messages in thread
From: Steven Rostedt @ 2015-03-17 14:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 17 Mar 2015 08:13:41 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > > I may go and remove all this_cpu_read,write() calls from my code
> > > because of this.
> > 
> > You could do that with __this_cpu_* but not this_cpu_*(). Doing
> > it to this_cpu_* would make the operations no longer per cpu atomic. If
> > they do not need per cpu atomicity then you could have used __this_cpu_*
> > instead. And  __this_cpu_* do not disable preemption or interrupts.
> 
> I do not need it to be atomic.

I'll test this out with the __this_cpu_* versions and see if that
solves it too. If it does, I'll use that version instead.

Thanks,

-- Steve


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-17 14:11     ` Steven Rostedt
@ 2015-03-17 14:40       ` Steven Rostedt
  2015-03-17 14:47         ` Uwe Kleine-König
  2015-03-19 16:19       ` [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr() Christoph Lameter
  1 sibling, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2015-03-17 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

It has come to my attention that this_cpu_read/write are horrible on
architectures other than x86. Worse yet, they actually disable
preemption or interrupts! This caused some unexpected tracing results
on ARM.

   101.356868: preempt_count_add <-ring_buffer_lock_reserve
   101.356870: preempt_count_sub <-ring_buffer_lock_reserve

The ring_buffer_lock_reserve has recursion protection that requires
accessing a per-cpu variable. But since preempt_disable() is itself
traced, it too got traced while accessing the very variable that is
supposed to prevent recursion like this.

The generic version of this_cpu_read() and write() are:

#define _this_cpu_generic_read(pcp)					\
({	typeof(pcp) ret__;						\
	preempt_disable();						\
	ret__ = *this_cpu_ptr(&(pcp));					\
	preempt_enable();						\
	ret__;								\
})

#define _this_cpu_generic_to_op(pcp, val, op)				\
do {									\
	unsigned long flags;						\
	raw_local_irq_save(flags);					\
	*__this_cpu_ptr(&(pcp)) op val;					\
	raw_local_irq_restore(flags);					\
} while (0)


Which is unacceptable for locations that know they are within preempt
disabled or interrupt disabled locations.

Paul McKenney stated that __this_cpu_() versions produce much better code on
other architectures than this_cpu_() does, if we know that the call is done in
a preempt disabled location.

I also changed the recursive_unlock() to use two local variables instead
of accessing the per_cpu variable twice.

Link: http://lkml.kernel.org/r/20150317114411.GE3589@linux.vnet.ibm.com

Cc: stable@vger.kernel.org
Cc: Christoph Lameter <cl@linux.com>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
Changes since v1:
  Use __this_cpu_*() instead of this_cpu_ptr()

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 5040d44fe5a3..363b9ec58aae 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2679,7 +2679,7 @@ static DEFINE_PER_CPU(unsigned int, current_context);
 
 static __always_inline int trace_recursive_lock(void)
 {
-	unsigned int val = this_cpu_read(current_context);
+	unsigned int val = __this_cpu_read(current_context);
 	int bit;
 
 	if (in_interrupt()) {
@@ -2696,18 +2696,19 @@ static __always_inline int trace_recursive_lock(void)
 		return 1;
 
 	val |= (1 << bit);
-	this_cpu_write(current_context, val);
+	__this_cpu_write(current_context, val);
 
 	return 0;
 }
 
 static __always_inline void trace_recursive_unlock(void)
 {
-	unsigned int val = this_cpu_read(current_context);
+	unsigned int val = __this_cpu_read(current_context);
+	unsigned int val2;
 
-	val--;
-	val &= this_cpu_read(current_context);
-	this_cpu_write(current_context, val);
+	val2 = val - 1;
+	val &= val2;
+	__this_cpu_write(current_context, val);
 }
 
 #else


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-17 14:40       ` [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*() Steven Rostedt
@ 2015-03-17 14:47         ` Uwe Kleine-König
  2015-03-17 15:07           ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: Uwe Kleine-König @ 2015-03-17 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

Good morning Steven,

On Tue, Mar 17, 2015 at 10:40:38AM -0400, Steven Rostedt wrote:
>  static __always_inline void trace_recursive_unlock(void)
>  {
> -	unsigned int val = this_cpu_read(current_context);
> +	unsigned int val = __this_cpu_read(current_context);
> +	unsigned int val2;
>  
> -	val--;
> -	val &= this_cpu_read(current_context);
> -	this_cpu_write(current_context, val);
> +	val2 = val - 1;
> +	val &= val2;
> +	__this_cpu_write(current_context, val);
You could use:

	unsigned int val = __this_cpu_read(current_context);

	val = val & (val - 1);

	__this_cpu_write(current_context, val);

and save a few lines and still make it more readable (IMHO).

BTW, this patch makes the additional lines in the trace disappear, so if
you think that makes a Tested-by applicable, feel free to add it.

Best regards
Uwe
-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-17 14:47         ` Uwe Kleine-König
@ 2015-03-17 15:07           ` Steven Rostedt
  2015-03-19 16:20             ` Christoph Lameter
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2015-03-17 15:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 17 Mar 2015 15:47:01 +0100
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:

> Good morning Steven,
> 
> On Tue, Mar 17, 2015 at 10:40:38AM -0400, Steven Rostedt wrote:
> >  static __always_inline void trace_recursive_unlock(void)
> >  {
> > -	unsigned int val = this_cpu_read(current_context);
> > +	unsigned int val = __this_cpu_read(current_context);
> > +	unsigned int val2;
> >  
> > -	val--;
> > -	val &= this_cpu_read(current_context);
> > -	this_cpu_write(current_context, val);
> > +	val2 = val - 1;
> > +	val &= val2;
> > +	__this_cpu_write(current_context, val);
> You could use:
> 
> 	unsigned int val = __this_cpu_read(current_context);
> 
> 	val = val & (val - 1);
> 
> 	__this_cpu_write(current_context, val);
> 
> and save a few lines and still make it more readable (IMHO).

Me too. My version came from looking at too much assembly, and val2
just happened to be another register in my mind.

> 
> BTW, this patch makes the additional lines in the trace disappear, so if
> you think that makes a Tested-by applicable, feel free to add it.

OK, will do. Thanks.

Christoph, you happy with this version?

-- Steve


* [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr()
  2015-03-17 14:11     ` Steven Rostedt
  2015-03-17 14:40       ` [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*() Steven Rostedt
@ 2015-03-19 16:19       ` Christoph Lameter
  1 sibling, 0 replies; 13+ messages in thread
From: Christoph Lameter @ 2015-03-19 16:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 17 Mar 2015, Steven Rostedt wrote:

> I'll test this out with the __this_cpu_* versions and see if that
> solves it too. If it does, I'll use that version instead.

Yeah that may be best. Sorry for the delay. Stuck at yet another
conference.


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-17 15:07           ` Steven Rostedt
@ 2015-03-19 16:20             ` Christoph Lameter
  2015-03-19 16:31               ` Steven Rostedt
  2015-03-19 16:33               ` Christoph Lameter
  0 siblings, 2 replies; 13+ messages in thread
From: Christoph Lameter @ 2015-03-19 16:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 17 Mar 2015, Steven Rostedt wrote:

> Christoph, you happy with this version?

Yes looks good.

Acked-by: Christoph Lameter <cl@linux.com>


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-19 16:20             ` Christoph Lameter
@ 2015-03-19 16:31               ` Steven Rostedt
  2015-03-19 16:33               ` Christoph Lameter
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2015-03-19 16:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 19 Mar 2015 11:20:53 -0500 (CDT)
Christoph Lameter <cl@linux.com> wrote:

> On Tue, 17 Mar 2015, Steven Rostedt wrote:
> 
> > Christoph, you happy with this version?
> 
> Yes looks good.
> 
> Acked-by: Christoph Lameter <cl@linux.com>

Thanks, will add your ack.

-- Steve


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-19 16:20             ` Christoph Lameter
  2015-03-19 16:31               ` Steven Rostedt
@ 2015-03-19 16:33               ` Christoph Lameter
  2015-03-19 16:40                 ` Steven Rostedt
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Lameter @ 2015-03-19 16:33 UTC (permalink / raw)
  To: linux-arm-kernel

If you are redoing it then please get the comments cleared up a bit. The
heaviness of the fallback version of this_cpu_read/write can usually
easily be remedied by arch-specific definitions. The per-cpu
offset is somewhere in a register, and one needs to define a macro that
creates an instruction that does a fetch from that register plus
the current offset into the area that is needed. This is similarly easy
for the write path. But then it's often easier to just use the __this_cpu
instructions, since preemption is often off in these code paths.

I have had code for IA64 in the past that does this.


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-19 16:33               ` Christoph Lameter
@ 2015-03-19 16:40                 ` Steven Rostedt
  2015-03-24 18:49                   ` Christoph Lameter
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2015-03-19 16:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 19 Mar 2015 11:33:30 -0500 (CDT)
Christoph Lameter <cl@linux.com> wrote:

> If you are redoing it then please get the comments a bit cleared up. The

What comments should I clear up? This version did not have a comment.
It just switched this_cpu_* to __this_cpu_*, and also updated a
variable algorithm.

-- Steve


> heaviness of the fallback version of this_cpu_read/write can usually
> easily be remedied by arch specific definitions. The per cpu
> offset is somewhere in a register and one needs to define a macro that
> creates an instruction that does a fetch from that register plus
> the current offset into the area that is needed. This is similarly easy
> for the write path. But then it's often easier to just use the __this_cpu
> instructions since preemption is often off in these code paths.
> 
> I have had code for IA64 in the past that does this.


* [PATCH v2] ring-buffer: Replace this_cpu_*() with __this_cpu_*()
  2015-03-19 16:40                 ` Steven Rostedt
@ 2015-03-24 18:49                   ` Christoph Lameter
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Lameter @ 2015-03-24 18:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 19 Mar 2015, Steven Rostedt wrote:

> On Thu, 19 Mar 2015 11:33:30 -0500 (CDT)
> Christoph Lameter <cl@linux.com> wrote:
>
> > If you are redoing it then please get the comments a bit cleared up. The
>
> What comments should I clear up? This version did not have a comment.
> It just switched this_cpu_* to __this_cpu_*, and also updated a
> variable algorithm.

Ok fine if there is no comment.


