All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] output the cpu number  when printking.
@ 2012-12-24  2:31 he, bo
  2012-12-24  3:37 ` Randy Dunlap
  0 siblings, 1 reply; 8+ messages in thread
From: he, bo @ 2012-12-24  2:31 UTC (permalink / raw)
  To: akpm, mingo, a.p.zijlstra, linux-kernel; +Cc: yanmin_zhang

From: "he, bo" <bo.he@intel.com>

We often hit kernel panic issues on SMP machines because processes race
on multiple cpu. By adding a new parameter printk.cpu, kernel prints
cpu number at printk information line. It’s useful to debug what cpus
are racing.

Signed-off-by: he, bo <bo.he@intel.com>
---
 Documentation/kernel-parameters.txt |    4 ++++
 kernel/printk.c                     |   18 ++++++++++++++++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index ddd84d6..ccd5266 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2378,6 +2378,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	printk.time=	Show timing data prefixed to each printk message line
 			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
 
+	printk.cpu=	Show cpu data prefixed to each printk message line
+			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
+	printk.cpu takes effect only when printk.time=y.
+
 	processor.max_cstate=	[HW,ACPI]
 			Limit processor to maximum C-state
 			max_cstate=9 overrides any DMI blacklist limit.
diff --git a/kernel/printk.c b/kernel/printk.c
index 19c0d7b..873a226 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -863,9 +863,14 @@ static bool printk_time;
 #endif
 module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
 
+static bool printk_cpu = 1;
+module_param_named(cpu, printk_cpu, bool, S_IRUGO | S_IWUSR);
+
 static size_t print_time(u64 ts, char *buf)
 {
 	unsigned long rem_nsec;
+	size_t len = 0;
+	int this_cpu;
 
 	if (!printk_time)
 		return 0;
@@ -874,8 +879,17 @@ static size_t print_time(u64 ts, char *buf)
 		return 15;
 
 	rem_nsec = do_div(ts, 1000000000);
-	return sprintf(buf, "[%5lu.%06lu] ",
-		       (unsigned long)ts, rem_nsec / 1000);
+
+	if (printk_cpu) {
+		this_cpu = raw_smp_processor_id();
+		len = sprintf(buf, "[%5lu.%06lu,%u] ",
+			       (unsigned long)ts, rem_nsec / 1000, this_cpu);
+	} else {
+		len = sprintf(buf, "[%5lu.%06lu] ",
+			       (unsigned long)ts, rem_nsec / 1000);
+	}
+
+	return len;
 }
 
 static size_t print_prefix(const struct log *msg, bool syslog, char *buf)
-- 
1.7.6




^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] output the cpu number  when printking.
  2012-12-24  2:31 [PATCH] output the cpu number when printking he, bo
@ 2012-12-24  3:37 ` Randy Dunlap
  2012-12-24  3:54   ` he, bo
  0 siblings, 1 reply; 8+ messages in thread
From: Randy Dunlap @ 2012-12-24  3:37 UTC (permalink / raw)
  To: he, bo; +Cc: akpm, mingo, a.p.zijlstra, linux-kernel, yanmin_zhang

On 12/23/12 18:31, he, bo wrote:
> From: "he, bo" <bo.he@intel.com>
> 
> We often hit kernel panic issues on SMP machines because processes race
> on multiple cpu. By adding a new parameter printk.cpu, kernel prints
> cpu number at printk information line. It’s useful to debug what cpus
> are racing.
> 
> Signed-off-by: he, bo <bo.he@intel.com>
> ---
>  Documentation/kernel-parameters.txt |    4 ++++
>  kernel/printk.c                     |   18 ++++++++++++++++--
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index ddd84d6..ccd5266 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2378,6 +2378,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  	printk.time=	Show timing data prefixed to each printk message line
>  			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
>  
> +	printk.cpu=	Show cpu data prefixed to each printk message line
> +			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
> +	printk.cpu takes effect only when printk.time=y.

Above line needs to be indented more (beginning under "Format:").

> +
>  	processor.max_cstate=	[HW,ACPI]
>  			Limit processor to maximum C-state
>  			max_cstate=9 overrides any DMI blacklist limit.
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 19c0d7b..873a226 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -863,9 +863,14 @@ static bool printk_time;
>  #endif
>  module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
>  
> +static bool printk_cpu = 1;
> +module_param_named(cpu, printk_cpu, bool, S_IRUGO | S_IWUSR);
> +
>  static size_t print_time(u64 ts, char *buf)
>  {
>  	unsigned long rem_nsec;
> +	size_t len = 0;
> +	int this_cpu;
>  
>  	if (!printk_time)
>  		return 0;
> @@ -874,8 +879,17 @@ static size_t print_time(u64 ts, char *buf)
>  		return 15;
>  
>  	rem_nsec = do_div(ts, 1000000000);
> -	return sprintf(buf, "[%5lu.%06lu] ",
> -		       (unsigned long)ts, rem_nsec / 1000);
> +
> +	if (printk_cpu) {
> +		this_cpu = raw_smp_processor_id();
> +		len = sprintf(buf, "[%5lu.%06lu,%u] ",
> +			       (unsigned long)ts, rem_nsec / 1000, this_cpu);
> +	} else {
> +		len = sprintf(buf, "[%5lu.%06lu] ",
> +			       (unsigned long)ts, rem_nsec / 1000);
> +	}
> +
> +	return len;
>  }
>  
>  static size_t print_prefix(const struct log *msg, bool syslog, char *buf)
> 


-- 
~Randy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] output the cpu number  when printking.
  2012-12-24  3:37 ` Randy Dunlap
@ 2012-12-24  3:54   ` he, bo
  2012-12-24  5:01     ` [PATCH V2] " he, bo
  0 siblings, 1 reply; 8+ messages in thread
From: he, bo @ 2012-12-24  3:54 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: akpm, mingo, a.p.zijlstra, linux-kernel, yanmin_zhang

Thanks randy's comments, I will modifiy it.

On Sun, 2012-12-23 at 19:37 -0800, Randy Dunlap wrote:
> On 12/23/12 18:31, he, bo wrote:
> > From: "he, bo" <bo.he@intel.com>
> > 
> > We often hit kernel panic issues on SMP machines because processes race
> > on multiple cpu. By adding a new parameter printk.cpu, kernel prints
> > cpu number at printk information line. It’s useful to debug what cpus
> > are racing.
> > 
> > Signed-off-by: he, bo <bo.he@intel.com>
> > ---
> >  Documentation/kernel-parameters.txt |    4 ++++
> >  kernel/printk.c                     |   18 ++++++++++++++++--
> >  2 files changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> > index ddd84d6..ccd5266 100644
> > --- a/Documentation/kernel-parameters.txt
> > +++ b/Documentation/kernel-parameters.txt
> > @@ -2378,6 +2378,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> >  	printk.time=	Show timing data prefixed to each printk message line
> >  			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
> >  
> > +	printk.cpu=	Show cpu data prefixed to each printk message line
> > +			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
> > +	printk.cpu takes effect only when printk.time=y.
> 
> Above line needs to be indented more (beginning under "Format:").
> 
> > +
> >  	processor.max_cstate=	[HW,ACPI]
> >  			Limit processor to maximum C-state
> >  			max_cstate=9 overrides any DMI blacklist limit.
> > diff --git a/kernel/printk.c b/kernel/printk.c
> > index 19c0d7b..873a226 100644
> > --- a/kernel/printk.c
> > +++ b/kernel/printk.c
> > @@ -863,9 +863,14 @@ static bool printk_time;
> >  #endif
> >  module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
> >  
> > +static bool printk_cpu = 1;
> > +module_param_named(cpu, printk_cpu, bool, S_IRUGO | S_IWUSR);
> > +
> >  static size_t print_time(u64 ts, char *buf)
> >  {
> >  	unsigned long rem_nsec;
> > +	size_t len = 0;
> > +	int this_cpu;
> >  
> >  	if (!printk_time)
> >  		return 0;
> > @@ -874,8 +879,17 @@ static size_t print_time(u64 ts, char *buf)
> >  		return 15;
> >  
> >  	rem_nsec = do_div(ts, 1000000000);
> > -	return sprintf(buf, "[%5lu.%06lu] ",
> > -		       (unsigned long)ts, rem_nsec / 1000);
> > +
> > +	if (printk_cpu) {
> > +		this_cpu = raw_smp_processor_id();
> > +		len = sprintf(buf, "[%5lu.%06lu,%u] ",
> > +			       (unsigned long)ts, rem_nsec / 1000, this_cpu);
> > +	} else {
> > +		len = sprintf(buf, "[%5lu.%06lu] ",
> > +			       (unsigned long)ts, rem_nsec / 1000);
> > +	}
> > +
> > +	return len;
> >  }
> >  
> >  static size_t print_prefix(const struct log *msg, bool syslog, char *buf)
> > 
> 
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH V2] output the cpu number  when printking.
  2012-12-24  3:54   ` he, bo
@ 2012-12-24  5:01     ` he, bo
  2012-12-24 17:55       ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: he, bo @ 2012-12-24  5:01 UTC (permalink / raw)
  To: Randy Dunlap, akpm, mingo, linux-kernel, a.p.zijlstra; +Cc: yanmin_zhang

From: "he, bo" <bo.he@intel.com>

We often hit kernel panic issues on SMP machines because processes race
on multiple cpu. By adding a new parameter printk.cpu, kernel prints
cpu number at printk information line. It’s useful to debug what cpus
are racing.

Signed-off-by: he, bo <bo.he@intel.com>
---
 Documentation/kernel-parameters.txt |    4 ++++
 kernel/printk.c                     |   18 ++++++++++++++++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index ddd84d6..3387a7f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2378,6 +2378,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	printk.time=	Show timing data prefixed to each printk message line
 			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
 
+	printk.cpu=	Show cpu data prefixed to each printk message line
+			Format: <bool>  (1/Y/y=enable, 0/N/n=disable)
+			printk.cpu takes effect only when printk.time=y.
+
 	processor.max_cstate=	[HW,ACPI]
 			Limit processor to maximum C-state
 			max_cstate=9 overrides any DMI blacklist limit.
diff --git a/kernel/printk.c b/kernel/printk.c
index 19c0d7b..873a226 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -863,9 +863,14 @@ static bool printk_time;
 #endif
 module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
 
+static bool printk_cpu = 1;
+module_param_named(cpu, printk_cpu, bool, S_IRUGO | S_IWUSR);
+
 static size_t print_time(u64 ts, char *buf)
 {
 	unsigned long rem_nsec;
+	size_t len = 0;
+	int this_cpu;
 
 	if (!printk_time)
 		return 0;
@@ -874,8 +879,17 @@ static size_t print_time(u64 ts, char *buf)
 		return 15;
 
 	rem_nsec = do_div(ts, 1000000000);
-	return sprintf(buf, "[%5lu.%06lu] ",
-		       (unsigned long)ts, rem_nsec / 1000);
+
+	if (printk_cpu) {
+		this_cpu = raw_smp_processor_id();
+		len = sprintf(buf, "[%5lu.%06lu,%u] ",
+			       (unsigned long)ts, rem_nsec / 1000, this_cpu);
+	} else {
+		len = sprintf(buf, "[%5lu.%06lu] ",
+			       (unsigned long)ts, rem_nsec / 1000);
+	}
+
+	return len;
 }
 
 static size_t print_prefix(const struct log *msg, bool syslog, char *buf)
-- 
1.7.6




^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH V2] output the cpu number  when printking.
  2012-12-24  5:01     ` [PATCH V2] " he, bo
@ 2012-12-24 17:55       ` Greg KH
  2012-12-25  1:09         ` Yanmin Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2012-12-24 17:55 UTC (permalink / raw)
  To: he, bo
  Cc: Randy Dunlap, akpm, mingo, linux-kernel, a.p.zijlstra, yanmin_zhang

On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote:
> From: "he, bo" <bo.he@intel.com>
> 
> We often hit kernel panic issues on SMP machines because processes race
> on multiple cpu. By adding a new parameter printk.cpu, kernel prints
> cpu number at printk information line. It’s useful to debug what cpus
> are racing.

How useful is this really for normal developers?  We complained when
this option was proposed by the ARM developers who were, for the first
time, handling more than one processor and the issues involved with
that.  You are enabling this as a default option, for all developers,
and almost no one will ever need it.

So, if you really want this, don't enable this by default.  Also, go
back and read the old thread about this option and why it was rejeted
previously.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V2] output the cpu number  when printking.
  2012-12-24 17:55       ` Greg KH
@ 2012-12-25  1:09         ` Yanmin Zhang
  2012-12-26 17:50           ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: Yanmin Zhang @ 2012-12-25  1:09 UTC (permalink / raw)
  To: Greg KH; +Cc: he, bo, Randy Dunlap, akpm, mingo, linux-kernel, a.p.zijlstra

On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote:
> On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote:
> > From: "he, bo" <bo.he@intel.com>
> > 
> > We often hit kernel panic issues on SMP machines because processes race
> > on multiple cpu. By adding a new parameter printk.cpu, kernel prints
> > cpu number at printk information line. It’s useful to debug what cpus
> > are racing.
> 
> How useful is this really for normal developers?
It's very useful to debug race conditions under SMP environment.
We applied the patch to our Android build image on our smartphones.

>   We complained when
> this option was proposed by the ARM developers who were, for the first
> time, handling more than one processor and the issues involved with
> that.  You are enabling this as a default option, for all developers,
> and almost no one will ever need it.
At least, I am a developer to use it every day. :). I need debug various
weird issues and tried many debug tools.

Anyway, we would change the default to _disable_.

> 
> So, if you really want this, don't enable this by default.  Also, go
> back and read the old thread about this option and why it was rejeted
> previously.
Thanks for your kind reminder. I googled it. Is the link
like https://lkml.org/lkml/2012/8/3/121?
Or http://lkml.indiana.edu/hypermail/linux/kernel/1208.0/02097.html?

Above links raise a good question that we might be able to use
ftrace (trace_printk) instead of to extend printk.
1) There are indeed lots of debug methods/tools in kernel because of many
excellent developers' contributions. However, the most frequently-used tool
is still printk.
2) ftrace (I use it often) is good when system mostly doesn't crash.
If system crashes, we couldn't get ftrace info sometimes.
When we debug some hard issues on a multi-core device, the system often
crashes when Graphics driver accesses registers incorrectly. There is no chance
for kernel to dump ftrace or panic info. With printk, we find that's because
2 cpu are racing. One cpu turns off Graphics while another cpu is accessing it.
Of course, we also applied the Android persistent ram patch, so after rebooting,
we still could see the printk buffer of previous booting.
3) Current drivers usually have many dev_dbg statements. We could enable
dynamic debug control to get such info quickly. With the printk cpu number
patch, we quickly enhance them to dump info about potential races among cpu.
With ftrace, we need change all these dev_dbg to trace_printk and that's not
the original objective of driver developers.

Mostly, the patch is to reuse all current printk statements in kernel with
a little overhead.

Thanks for your kind comments.

Yanmin



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH V2] output the cpu number  when printking.
  2012-12-25  1:09         ` Yanmin Zhang
@ 2012-12-26 17:50           ` Greg KH
  2012-12-27  6:49             ` He, Bo
  0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2012-12-26 17:50 UTC (permalink / raw)
  To: Yanmin Zhang
  Cc: he, bo, Randy Dunlap, akpm, mingo, linux-kernel, a.p.zijlstra

On Tue, Dec 25, 2012 at 09:09:05AM +0800, Yanmin Zhang wrote:
> On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote:
> > On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote:
> > > From: "he, bo" <bo.he@intel.com>
> > > 
> > > We often hit kernel panic issues on SMP machines because processes race
> > > on multiple cpu. By adding a new parameter printk.cpu, kernel prints
> > > cpu number at printk information line. It’s useful to debug what cpus
> > > are racing.
> > 
> > How useful is this really for normal developers?
> It's very useful to debug race conditions under SMP environment.
> We applied the patch to our Android build image on our smartphones.

That's fine for your application, and seemed to be useful to others
with their first interactions with SMP systems.  However, once you start
to get to "real" numbers of CPUs, this information turns pretty
pointless, which is why the patch was originally rejected.

sorry,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH V2] output the cpu number  when printking.
  2012-12-26 17:50           ` Greg KH
@ 2012-12-27  6:49             ` He, Bo
  0 siblings, 0 replies; 8+ messages in thread
From: He, Bo @ 2012-12-27  6:49 UTC (permalink / raw)
  To: Greg KH, Yanmin Zhang
  Cc: Randy Dunlap, akpm, mingo, linux-kernel, a.p.zijlstra

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1559 bytes --]

Thanks, Greg.
I did use this patch to fix many races on SMP. But to respect maintainer, I stop pushing the patch to upstream.


-----Original Message-----
From: Greg KH [mailto:gregkh@linuxfoundation.org] 
Sent: Thursday, December 27, 2012 1:50 AM
To: Yanmin Zhang
Cc: He, Bo; Randy Dunlap; akpm@linux-foundation.org; mingo@elte.hu; linux-kernel@vger.kernel.org; a.p.zijlstra@chello.nl
Subject: Re: [PATCH V2] output the cpu number when printking.

On Tue, Dec 25, 2012 at 09:09:05AM +0800, Yanmin Zhang wrote:
> On Mon, 2012-12-24 at 09:55 -0800, Greg KH wrote:
> > On Mon, Dec 24, 2012 at 01:01:55PM +0800, he, bo wrote:
> > > From: "he, bo" <bo.he@intel.com>
> > > 
> > > We often hit kernel panic issues on SMP machines because processes 
> > > race on multiple cpu. By adding a new parameter printk.cpu, kernel 
> > > prints cpu number at printk information line. It’s useful to debug 
> > > what cpus are racing.
> > 
> > How useful is this really for normal developers?
> It's very useful to debug race conditions under SMP environment.
> We applied the patch to our Android build image on our smartphones.

That's fine for your application, and seemed to be useful to others with their first interactions with SMP systems.  However, once you start to get to "real" numbers of CPUs, this information turns pretty pointless, which is why the patch was originally rejected.

sorry,

greg k-h
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-12-27  6:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-24  2:31 [PATCH] output the cpu number when printking he, bo
2012-12-24  3:37 ` Randy Dunlap
2012-12-24  3:54   ` he, bo
2012-12-24  5:01     ` [PATCH V2] " he, bo
2012-12-24 17:55       ` Greg KH
2012-12-25  1:09         ` Yanmin Zhang
2012-12-26 17:50           ` Greg KH
2012-12-27  6:49             ` He, Bo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.