* [PATCH 1/8] SGI x86_64 UV: Add limit console output function
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
@ 2009-10-23 23:37 ` Mike Travis
2009-10-24 1:09 ` Frederic Weisbecker
2009-10-26 7:02 ` Andi Kleen
2009-10-23 23:37 ` [PATCH 2/8] SGI x86_64 UV: " Mike Travis
` (6 subsequent siblings)
7 siblings, 2 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
[-- Attachment #1: limit_console_output --]
[-- Type: text/plain, Size: 4171 bytes --]
With a large number of processors in a system, an excessive number of
messages is sent to the system console. It's estimated that with 4096
processors in a system, and the console baudrate set to 56K, the startup
messages will take about 84 minutes to clear the serial port.
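As a rough sanity check of that estimate, the arithmetic can be sketched as below. It assumes 8N1 serial framing (10 bits on the wire per character) and reads "56K" as 56000 baud; neither is stated above, and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop bit per character. */
#define BITS_PER_CHAR 10u

/* Characters per second a serial console can drain at the given baud rate. */
static uint64_t console_chars_per_sec(uint64_t baud)
{
	return baud / BITS_PER_CHAR;
}

/* Total characters drained in the given number of minutes. */
static uint64_t console_chars_in_minutes(uint64_t baud, uint64_t minutes)
{
	return console_chars_per_sec(baud) * minutes * 60u;
}

/* Seconds needed to drain nbytes characters, rounded up. */
static uint64_t console_drain_seconds(uint64_t baud, uint64_t nbytes)
{
	uint64_t cps = console_chars_per_sec(baud);

	return (nbytes + cps - 1) / cps;
}
```

Under these assumptions, 84 minutes at 56000 baud corresponds to 28,224,000 characters of console output, or about 6,890 characters per processor on a 4096-processor system.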
This patch adds (for SGI UV only) a kernel start option
"limit_console_output" (or "lco" for short), which when set provides the
ability to temporarily reduce the console loglevel during system startup.
This allows informative messages to still be seen on the console without
producing excessive amounts of repetitious messages.
Note that all the messages are still available in the kernel log buffer.
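The behaviour described above can be modelled in user space roughly as follows. This is a sketch and not the kernel code: the two function names follow the patch, while the standalone globals, the default loglevel value of 7 and the reaches_console() helper are assumptions made for illustration.

```c
#include <stdbool.h>

/* Assumed values: default console loglevel 7, suspended level one below it. */
#define DEFAULT_CONSOLE_LOGLEVEL   7
#define SUSPENDED_CONSOLE_LOGLEVEL (DEFAULT_CONSOLE_LOGLEVEL - 1)

static bool console_output_limited;     /* would be set by the boot option */
static int console_loglevel = DEFAULT_CONSOLE_LOGLEVEL;
static int saved_console_loglevel = -1; /* -1 means "not suspended" */

/* Report whether limiting is on; optionally lower the console loglevel. */
static bool limit_console_output(bool suspend)
{
	if (!console_output_limited)
		return false;
	if (suspend && saved_console_loglevel == -1) {
		saved_console_loglevel = console_loglevel;
		console_loglevel = SUSPENDED_CONSOLE_LOGLEVEL;
	}
	return true;
}

/* Restore the loglevel saved by limit_console_output(true), if any. */
static void end_limit_console_output(void)
{
	if (console_output_limited && saved_console_loglevel != -1) {
		console_loglevel = saved_console_loglevel;
		saved_console_loglevel = -1;
	}
}

/* Illustrative helper: a message is shown if its level is below the console loglevel. */
static bool reaches_console(int msg_level)
{
	return msg_level < console_loglevel;
}
```

In this model, a level-6 (info) message stops reaching the console while the suspension is active, a level-5 (notice) message still does, and everything would still be recorded in the log buffer, which the model does not represent.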
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Robin Getz <rgetz@analog.com>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
Documentation/kernel-parameters.txt | 6 ++++
include/linux/kernel.h | 23 +++++++++++++++++
kernel/printk.c | 47 ++++++++++++++++++++++++++++++++++++
3 files changed, 76 insertions(+)
--- linux.orig/Documentation/kernel-parameters.txt
+++ linux/Documentation/kernel-parameters.txt
@@ -103,6 +103,7 @@
UMS USB Mass Storage support is enabled.
USB USB support is enabled.
USBHID USB Human Interface Device support is enabled.
+ UV SGI Ultraviolet
V4L Video For Linux support is enabled.
VGA The VGA console has been enabled.
VT Virtual terminal support is enabled.
@@ -1218,6 +1219,11 @@
If there are multiple matching configurations changing
the same attribute, the last one is used.
+ limit_console_output [UV]
+ lco [UV]
+ Limits repetitious messages when the number of cpus in
+ a system is large.
+
lmb=debug [KNL] Enable lmb debug messages.
load_ramdisk= [RAM] List of ramdisks to load from floppy
--- linux.orig/include/linux/kernel.h
+++ linux/include/linux/kernel.h
@@ -372,6 +372,29 @@
}
#endif
+#ifdef CONFIG_X86_UV
+bool _limit_console_output(bool suspend);
+#define limit_console_output(suspend) \
+({ \
+ bool limit = _limit_console_output(suspend); \
+ if (limit) \
+ printk_once(KERN_NOTICE \
+ "printk: further related messages suppressed\n");\
+ limit; \
+})
+
+void end_limit_console_output(void);
+#else
+static inline bool limit_console_output(bool suspend)
+{
+ return false;
+}
+
+static inline void end_limit_console_output(void)
+{
+}
+#endif
+
extern int printk_needs_cpu(int cpu);
extern void printk_tick(void);
--- linux.orig/kernel/printk.c
+++ linux/kernel/printk.c
@@ -1417,3 +1417,50 @@
}
EXPORT_SYMBOL(printk_timed_ratelimit);
#endif
+
+#ifdef CONFIG_X86_UV
+/*
+ * Support to suppress the zillions of extra messages being sent to
+ * the console when a server has a large number of cpus.
+ */
+static bool __read_mostly console_output_limited;
+static int __init limit_console_output_setup(char *str)
+{
+ console_output_limited = true;
+ printk(KERN_NOTICE "printk: console messages will be limited.\n");
+ return 0;
+}
+early_param("limit_console_output", limit_console_output_setup);
+
+static int __init lco(char *str)
+{
+ return limit_console_output_setup(str);
+}
+early_param("lco", lco);
+
+#define SUSPENDED_CONSOLE_LOGLEVEL (DEFAULT_CONSOLE_LOGLEVEL-1)
+
+/* check if "limiting" console output and suspend further msgs if requested. */
+bool _limit_console_output(bool suspend)
+{
+ if (console_output_limited) {
+ if (suspend && saved_console_loglevel == -1) {
+ saved_console_loglevel = console_loglevel;
+ console_loglevel = SUSPENDED_CONSOLE_LOGLEVEL;
+ }
+ return true;
+ }
+ return false;
+}
+EXPORT_SYMBOL(_limit_console_output);
+
+/* remove suspension of console msgs. */
+void end_limit_console_output(void)
+{
+ if (console_output_limited && saved_console_loglevel != -1) {
+ console_loglevel = saved_console_loglevel;
+ saved_console_loglevel = -1;
+ }
+}
+EXPORT_SYMBOL(end_limit_console_output);
+#endif
--
^ permalink raw reply [flat|nested] 109+ messages in thread
* [PATCH 2/8] SGI x86_64 UV: Limit the number of processor bootup messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
2009-10-23 23:37 ` [PATCH 1/8] SGI x86_64 UV: Add limit console output function Mike Travis
@ 2009-10-23 23:37 ` Mike Travis
2009-10-26 7:26 ` Andi Kleen
2009-10-23 23:37 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Mike Travis
` (5 subsequent siblings)
7 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, H. Peter Anvin, x86, Rusty Russell, Yinghai Lu,
Tejun Heo, linux-kernel
[-- Attachment #1: limit_boot_cpu --]
[-- Type: text/plain, Size: 2046 bytes --]
Limit the number of processor bootup messages when
system_state == SYSTEM_BOOTING. Limit the number of
offline messages when system is shutting down.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
Example output...
Before:
[ 36.305264] Booting processor 1/2 ip 6000
..
[ 36.420549] Booting processor 2/4 ip 6000
..
[ 101.352770] Booting processor 383/759 ip 6000
..
[ 101.524209] Brought up 384 CPUs
[ 101.528277] Total of 384 processors activated (1741075.97 BogoMIPS).
After:
[ 36.189152] Booting processor 1/2 ip 6000
..
[ 36.304541] Booting processor 2/4 ip 6000
..
[ 36.464533] printk: further related messages suppressed
[ 76.536185] Brought up 384 CPUs
[ 76.539894] Total of 384 processors activated (1741015.43 BogoMIPS).
---
arch/x86/kernel/smpboot.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
--- linux.orig/arch/x86/kernel/smpboot.c
+++ linux/arch/x86/kernel/smpboot.c
@@ -741,6 +741,9 @@
/* start_ip had better be page-aligned! */
start_ip = setup_trampoline();
+ if (cpu > 2 && system_state == SYSTEM_BOOTING)
+ limit_console_output(true);
+
/* So we see what's up */
printk(KERN_INFO "Booting processor %d APIC 0x%x ip 0x%lx\n",
cpu, apicid, start_ip);
@@ -838,6 +841,9 @@
smpboot_restore_warm_reset_vector();
}
+ if (cpu > 2 && system_state == SYSTEM_BOOTING)
+ end_limit_console_output();
+
return boot_error;
}
@@ -1308,7 +1314,10 @@
for (i = 0; i < 10; i++) {
/* They ack this in play_dead by setting CPU_DEAD */
if (per_cpu(cpu_state, cpu) == CPU_DEAD) {
- printk(KERN_INFO "CPU %d is now offline\n", cpu);
+ if (cpu < 4 || system_state == SYSTEM_RUNNING ||
+ !limit_console_output(false))
+ printk(KERN_INFO
+ "CPU %d is now offline\n", cpu);
if (1 == num_online_cpus())
alternatives_smp_switch(0);
return;
--
* [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
2009-10-23 23:37 ` [PATCH 1/8] SGI x86_64 UV: Add limit console output function Mike Travis
2009-10-23 23:37 ` [PATCH 2/8] SGI x86_64 UV: " Mike Travis
@ 2009-10-23 23:37 ` Mike Travis
2009-10-26 7:04 ` Andi Kleen
2009-10-23 23:37 ` [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages Mike Travis
` (4 subsequent siblings)
7 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, H. Peter Anvin, x86, David Rientjes, Yinghai Lu,
Mel Gorman, linux-kernel
[-- Attachment #1: limit_srat --]
[-- Type: text/plain, Size: 1271 bytes --]
Limit number of SRAT messages of the form:
[ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/mm/srat_64.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- linux.orig/arch/x86/mm/srat_64.c
+++ linux/arch/x86/mm/srat_64.c
@@ -136,8 +136,9 @@
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+ if (node < 2 || !limit_console_output(false))
+ printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
+ pxm, apic_id, node);
}
/* Callback for Proximity Domain -> LAPIC mapping */
@@ -170,8 +171,9 @@
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+ if (node < 2 || !limit_console_output(false))
+ printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
+ pxm, apic_id, node);
}
#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
--
* [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
` (2 preceding siblings ...)
2009-10-23 23:37 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Mike Travis
@ 2009-10-23 23:37 ` Mike Travis
2009-10-24 3:29 ` Bjorn Helgaas
2009-10-23 23:37 ` [PATCH 5/8] SGI x86_64 UV: Limit the number of firmware messages Mike Travis
` (3 subsequent siblings)
7 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, Zhang Rui, Len Brown, Thomas Renninger,
Bjorn Helgaas, Alexey Dobriyan, Myron Stowe, Feng Tang,
Suresh Siddha, Yinghai Lu, linux-acpi, linux-kernel
[-- Attachment #1: limit_acpi --]
[-- Type: text/plain, Size: 2858 bytes --]
Limit number of ACPI messages of the form:
[ 0.000000] ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00] lsapic_eid[0x00] enabled)
[ 99.638655] processor ACPI0007:00: registered as cooling_device0
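The gating pattern used for these messages (always print the first few, then print the rest only when limiting is off) can be modelled in user space as below. This is a sketch under assumptions: limit_console_output_model() stands in for limit_console_output(false), and report_cooling_device() is a hypothetical wrapper, not a function from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

static bool console_output_limited; /* stand-in for the boot option state */

/* Stand-in for limit_console_output(false): only reports the option state. */
static bool limit_console_output_model(void)
{
	return console_output_limited;
}

/*
 * Print the first four registrations unconditionally and later ones only
 * when limiting is off. Returns true if the message was printed.
 */
static bool report_cooling_device(int id)
{
	static int msgcnt;

	if (msgcnt < 4 || !limit_console_output_model()) {
		printf("registered as cooling_device%d\n", id);
		msgcnt++;
		return true;
	}
	return false;
}
```

Note that the counter only advances when a message is printed, so with limiting enabled exactly four messages appear regardless of how many devices register.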
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Myron Stowe <myron.stowe@hp.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
drivers/acpi/fan.c | 7 ++++++-
drivers/acpi/processor_core.c | 8 ++++++--
drivers/acpi/tables.c | 15 ++++++++++-----
3 files changed, 22 insertions(+), 8 deletions(-)
--- linux.orig/drivers/acpi/fan.c
+++ linux/drivers/acpi/fan.c
@@ -243,6 +243,7 @@
int result = 0;
int state = 0;
struct thermal_cooling_device *cdev;
+ static int msgcnt;
if (!device)
return -EINVAL;
@@ -267,7 +268,11 @@
goto end;
}
- dev_info(&device->dev, "registered as cooling_device%d\n", cdev->id);
+ if (msgcnt < 4 || !limit_console_output(false)) {
+ dev_info(&device->dev,
+ "registered as cooling_device%d\n", cdev->id);
+ msgcnt++;
+ }
device->driver_data = cdev;
result = sysfs_create_link(&device->dev.kobj,
--- linux.orig/drivers/acpi/processor_core.c
+++ linux/drivers/acpi/processor_core.c
@@ -775,6 +775,7 @@
struct acpi_processor *pr = NULL;
int result = 0;
struct sys_device *sysdev;
+ static int msgcnt;
pr = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
if (!pr)
@@ -845,8 +846,11 @@
goto err_power_exit;
}
- dev_info(&device->dev, "registered as cooling_device%d\n",
- pr->cdev->id);
+ if (msgcnt < 4 || !limit_console_output(false)) {
+ dev_info(&device->dev, "registered as cooling_device%d\n",
+ pr->cdev->id);
+ msgcnt++;
+ }
result = sysfs_create_link(&device->dev.kobj,
&pr->cdev->device.kobj,
--- linux.orig/drivers/acpi/tables.c
+++ linux/drivers/acpi/tables.c
@@ -170,11 +170,16 @@
case ACPI_MADT_TYPE_LOCAL_SAPIC:
{
struct acpi_madt_local_sapic *p =
- (struct acpi_madt_local_sapic *)header;
- printk(KERN_INFO PREFIX
- "LSAPIC (acpi_id[0x%02x] lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
- p->processor_id, p->id, p->eid,
- (p->lapic_flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
+ (struct acpi_madt_local_sapic *)header;
+
+ if (p->eid < 8 || !limit_console_output(false))
+ printk(KERN_INFO PREFIX
+ "LSAPIC (acpi_id[0x%02x] "
+ "lsapic_id[0x%02x] "
+ "lsapic_eid[0x%02x] %s)\n",
+ p->processor_id, p->id, p->eid,
+ (p->lapic_flags & ACPI_MADT_ENABLED) ?
+ "enabled" : "disabled");
}
break;
--
* [PATCH 5/8] SGI x86_64 UV: Limit the number of firmware messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
` (3 preceding siblings ...)
2009-10-23 23:37 ` [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages Mike Travis
@ 2009-10-23 23:37 ` Mike Travis
2009-10-23 23:37 ` [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages Mike Travis
` (2 subsequent siblings)
7 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, Greg Kroah-Hartman, Ming Lei, Catalin Marinas,
David Woodhouse, linux-kernel
[-- Attachment #1: limit_firmware --]
[-- Type: text/plain, Size: 969 bytes --]
Limit number of firmware messages of the form:
[ 170.643130] firmware: requesting intel-ucode/06-2e-0
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
drivers/base/firmware_class.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- linux.orig/drivers/base/firmware_class.c
+++ linux/drivers/base/firmware_class.c
@@ -471,6 +471,7 @@
struct firmware *firmware;
struct builtin_fw *builtin;
int retval;
+ static int msgcnt;
if (!firmware_p)
return -EINVAL;
@@ -494,8 +495,10 @@
return 0;
}
- if (uevent)
+ if (uevent && (msgcnt < 4 || !limit_console_output(false))) {
dev_info(device, "firmware: requesting %s\n", name);
+ msgcnt++;
+ }
retval = fw_setup_device(firmware, &f_dev, name, device, uevent);
if (retval)
--
* [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
` (4 preceding siblings ...)
2009-10-23 23:37 ` [PATCH 5/8] SGI x86_64 UV: Limit the number of firmware messages Mike Travis
@ 2009-10-23 23:37 ` Mike Travis
2009-10-24 20:09 ` Dmitry Adamushko
2009-10-26 7:05 ` Andi Kleen
2009-10-23 23:37 ` [PATCH 7/8] SGI x86_64 UV: Limit the number of scheduler debug messages Mike Travis
2009-10-23 23:37 ` [PATCH 8/8] SGI x86_64 UV: Limit the number of cpu is down messages Mike Travis
7 siblings, 2 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, Tigran Aivazian, H. Peter Anvin, x86,
Dmitry Adamushko, Andreas Mohr, Hugh Dickins, Hannes Eder,
linux-kernel
[-- Attachment #1: limit_microcode --]
[-- Type: text/plain, Size: 1040 bytes --]
Limit number of microcode messages of the form:
[ 50.887135] microcode: CPU0 sig=0x206e5, pf=0x4, revision=0xffff001
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: Andreas Mohr <andi@lisas.de>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Hannes Eder <hannes@hanneseder.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/microcode_intel.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- linux.orig/arch/x86/kernel/microcode_intel.c
+++ linux/arch/x86/kernel/microcode_intel.c
@@ -165,7 +165,9 @@
/* get the current revision from MSR 0x8B */
rdmsr(MSR_IA32_UCODE_REV, val[0], csig->rev);
- printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
+ if (cpu_num < 4 || !limit_console_output(false))
+ printk(KERN_INFO
+ "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
cpu_num, csig->sig, csig->pf, csig->rev);
return 0;
--
* [PATCH 7/8] SGI x86_64 UV: Limit the number of scheduler debug messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
` (5 preceding siblings ...)
2009-10-23 23:37 ` [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages Mike Travis
@ 2009-10-23 23:37 ` Mike Travis
2009-10-23 23:37 ` [PATCH 8/8] SGI x86_64 UV: Limit the number of cpu is down messages Mike Travis
7 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, Peter Zijlstra, linux-kernel
[-- Attachment #1: limit_sched --]
[-- Type: text/plain, Size: 585 bytes --]
Limit number of sched debug messages while system is BOOTING.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
kernel/sched.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -8003,7 +8003,9 @@
sd->child = NULL;
}
- sched_domain_debug(sd, cpu);
+ if (cpu < 2 || system_state != SYSTEM_BOOTING ||
+ !limit_console_output(false))
+ sched_domain_debug(sd, cpu);
rq_attach_root(rq, rd);
rcu_assign_pointer(rq->sd, sd);
--
* [PATCH 8/8] SGI x86_64 UV: Limit the number of cpu is down messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
` (6 preceding siblings ...)
2009-10-23 23:37 ` [PATCH 7/8] SGI x86_64 UV: Limit the number of scheduler debug messages Mike Travis
@ 2009-10-23 23:37 ` Mike Travis
7 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-23 23:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, Andrew Morton
Cc: Jack Steiner, Rusty Russell, H. Peter Anvin, Heiko Carstens,
Shane Wang, linux-kernel
[-- Attachment #1: limit_cpu_offline --]
[-- Type: text/plain, Size: 805 bytes --]
Limit number of "CPUx is down" messages when system is shutting down.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Shane Wang <shane.wang@intel.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
kernel/cpu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- linux.orig/kernel/cpu.c
+++ linux/kernel/cpu.c
@@ -394,7 +394,9 @@
error = _cpu_down(cpu, 1);
if (!error) {
cpumask_set_cpu(cpu, frozen_cpus);
- printk("CPU%d is down\n", cpu);
+ if (cpu < 4 || system_state == SYSTEM_RUNNING ||
+ !limit_console_output(false))
+ printk(KERN_INFO "CPU%d is down\n", cpu);
} else {
printk(KERN_ERR "Error taking CPU%d down: %d\n",
cpu, error);
--
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-23 23:37 ` [PATCH 1/8] SGI x86_64 UV: Add limit console output function Mike Travis
@ 2009-10-24 1:09 ` Frederic Weisbecker
2009-10-26 17:55 ` Mike Travis
2009-10-26 7:02 ` Andi Kleen
1 sibling, 1 reply; 109+ messages in thread
From: Frederic Weisbecker @ 2009-10-24 1:09 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman, Heiko Carstens,
Robin Getz, Dave Young, linux-kernel, linux-doc
On Fri, Oct 23, 2009 at 06:37:44PM -0500, Mike Travis wrote:
> With a large number of processors in a system there is an excessive amount
> of messages sent to the system console. It's estimated that with 4096
> processors in a system, and the console baudrate set to 56K, the startup
> messages will take about 84 minutes to clear the serial port.
>
> This patch adds (for SGI UV only) a kernel start option "limit_console_
> output" (or 'lco' for short), which when set provides the ability to
> temporarily reduce the console loglevel during system startup. This allows
> informative messages to still be seen on the console without producing
> excessive amounts of repetious messages.
>
> Note that all the messages are still available in the kernel log buffer.
Well, this problem does not only concern SGI UV but all boxes with a large
number of cpus.
Also, instead of adding the same conditionals in multiple places to solve
the same problem (and that may even expand if we go beyond the SGI UV case,
for example with other archs' cpu up/down events), maybe you can centralize
and institutionalize this by using the existing printk mechanisms.
I mean, maybe that could be addressed by adding a new printk
level flag, and then associating the desired filters with it.
KERN_CPU could be a name, since this is targeting cpu events.
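One way to picture the centralized approach is a single filter that every message passes through, keyed on a message class. The sketch below is purely hypothetical: no KERN_CPU flag exists in the patch series, and the enum, the struct and console_allows() are invented names that show only one possible filtering policy (a simple budget).

```c
#include <stdbool.h>

/* Hypothetical message classes; MSG_CPU plays the role of a KERN_CPU tag. */
enum msg_class { MSG_GENERAL, MSG_CPU };

/* Hypothetical central policy for CPU-class messages. */
struct cpu_msg_filter {
	bool enabled;        /* would be set by a boot option */
	unsigned int budget; /* CPU-class messages still allowed on the console */
};

/* Decide in one place whether a message should go to the console. */
static bool console_allows(struct cpu_msg_filter *f, enum msg_class cls)
{
	if (cls != MSG_CPU || !f->enabled)
		return true;
	if (f->budget == 0)
		return false;
	f->budget--;
	return true;
}
```

With a filter like this, call sites would only tag their messages, and the decision about how many to show would live in one function instead of in each subsystem.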
* Re: [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages
2009-10-23 23:37 ` [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages Mike Travis
@ 2009-10-24 3:29 ` Bjorn Helgaas
2009-10-26 18:15 ` Mike Travis
` (2 more replies)
0 siblings, 3 replies; 109+ messages in thread
From: Bjorn Helgaas @ 2009-10-24 3:29 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Zhang Rui, Len Brown, Thomas Renninger, Alexey Dobriyan,
Myron Stowe, Feng Tang, Suresh Siddha, Yinghai Lu, linux-acpi,
linux-kernel
On Fri, 2009-10-23 at 18:37 -0500, Mike Travis wrote:
> plain text document attachment (limit_acpi)
> Limit number of ACPI messages of the form:
>
> [ 0.000000] ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00] lsapic_eid[0x00] enabled)
>
> [ 99.638655] processor ACPI0007:00: registered as cooling_device0
>
> Cc: Zhang Rui <rui.zhang@intel.com>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Thomas Renninger <trenn@suse.de>
> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Myron Stowe <myron.stowe@hp.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: Yinghai Lu <yhlu.kernel@gmail.com>
> Cc: linux-acpi@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Mike Travis <travis@sgi.com>
> ---
> drivers/acpi/fan.c | 7 ++++++-
> drivers/acpi/processor_core.c | 8 ++++++--
> drivers/acpi/tables.c | 15 ++++++++++-----
> 3 files changed, 22 insertions(+), 8 deletions(-)
>
> --- linux.orig/drivers/acpi/fan.c
> +++ linux/drivers/acpi/fan.c
> @@ -243,6 +243,7 @@
> int result = 0;
> int state = 0;
> struct thermal_cooling_device *cdev;
> + static int msgcnt;
>
> if (!device)
> return -EINVAL;
> @@ -267,7 +268,11 @@
> goto end;
> }
>
> - dev_info(&device->dev, "registered as cooling_device%d\n", cdev->id);
> + if (msgcnt < 4 || !limit_console_output(false)) {
> + dev_info(&device->dev,
> + "registered as cooling_device%d\n", cdev->id);
> + msgcnt++;
> + }
I'm personally not in favor of printing some, but not all, of these
messages. That leads to questions when analyzing a dmesg log, such as
"Hmm, I see I have 64 CPUs, but only 0-3 are registered as cooling
devices. Does that mean something is wrong?"
But I would be glad to see this particular message removed completely.
> device->driver_data = cdev;
> result = sysfs_create_link(&device->dev.kobj,
> --- linux.orig/drivers/acpi/processor_core.c
> +++ linux/drivers/acpi/processor_core.c
> @@ -775,6 +775,7 @@
> struct acpi_processor *pr = NULL;
> int result = 0;
> struct sys_device *sysdev;
> + static int msgcnt;
>
> pr = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
> if (!pr)
> @@ -845,8 +846,11 @@
> goto err_power_exit;
> }
>
> - dev_info(&device->dev, "registered as cooling_device%d\n",
> - pr->cdev->id);
> + if (msgcnt < 4 || !limit_console_output(false)) {
> + dev_info(&device->dev, "registered as cooling_device%d\n",
> + pr->cdev->id);
> + msgcnt++;
> + }
>
> result = sysfs_create_link(&device->dev.kobj,
> &pr->cdev->device.kobj,
> --- linux.orig/drivers/acpi/tables.c
> +++ linux/drivers/acpi/tables.c
> @@ -170,11 +170,16 @@
> case ACPI_MADT_TYPE_LOCAL_SAPIC:
> {
> struct acpi_madt_local_sapic *p =
> - (struct acpi_madt_local_sapic *)header;
> - printk(KERN_INFO PREFIX
> - "LSAPIC (acpi_id[0x%02x] lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
> - p->processor_id, p->id, p->eid,
> - (p->lapic_flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
> + (struct acpi_madt_local_sapic *)header;
> +
> + if (p->eid < 8 || !limit_console_output(false))
> + printk(KERN_INFO PREFIX
> + "LSAPIC (acpi_id[0x%02x] "
> + "lsapic_id[0x%02x] "
> + "lsapic_eid[0x%02x] %s)\n",
> + p->processor_id, p->id, p->eid,
> + (p->lapic_flags & ACPI_MADT_ENABLED) ?
> + "enabled" : "disabled");
I know we print way too much stuff for every processor, but again, I'd
rather see all CPUs or none. I think there's a little more value in
this one than the cooling device one (probably because I do a lot of
platform bringup), but it could certainly be made KERN_DEBUG and/or
combined with another processor discovery line.
Bjorn
> }
> break;
>
>
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-23 23:37 ` [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages Mike Travis
@ 2009-10-24 20:09 ` Dmitry Adamushko
2009-10-24 21:09 ` Tigran Aivazian
2009-10-26 18:18 ` Mike Travis
2009-10-26 7:05 ` Andi Kleen
1 sibling, 2 replies; 109+ messages in thread
From: Dmitry Adamushko @ 2009-10-24 20:09 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Tigran Aivazian, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
2009/10/24 Mike Travis <travis@sgi.com>:
> Limit number of microcode messages of the form:
>
> [ 50.887135] microcode: CPU0 sig=0x206e5, pf=0x4, revision=0xffff001
>
> [ ... ]
>
> --- linux.orig/arch/x86/kernel/microcode_intel.c
> +++ linux/arch/x86/kernel/microcode_intel.c
> @@ -165,7 +165,9 @@
> /* get the current revision from MSR 0x8B */
> rdmsr(MSR_IA32_UCODE_REV, val[0], csig->rev);
>
> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
> + if (cpu_num < 4 || !limit_console_output(false))
> + printk(KERN_INFO
> + "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
> cpu_num, csig->sig, csig->pf, csig->rev);
>
Hmm, I guess we wouldn't lose a lot by simply removing those messages
completely. Per-cpu pf/revision is available via /sys anyway.
Alternatively, we might move the output into
microcode_core.c::collect_cpu_info() (or even microcode_init_cpu()) so
that the same logic is also applied for amd, and do something like the
following:
don't print if a cpu's info is equal to the info of CPU#0. I guess any
non-0 cpu would be even better, as the microcode for cpu#0 can be
loaded by BIOS, if I'm not mistaken. But then we can only be sure
about the presence of cpu#0.
Anyway, it's not worth any additional complexity so I'd say let's
just remove the output :-)
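For reference, the alternative mentioned above (print CPU#0, then only CPUs whose info differs from it) could look roughly like this in user space. It is a sketch under assumptions: struct cpu_sig is a simplified stand-in for the driver's per-CPU signature data, and report_microcode() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for the per-CPU microcode signature fields. */
struct cpu_sig {
	unsigned int sig;
	unsigned int pf;
	unsigned int rev;
};

static bool cpu_sig_equal(const struct cpu_sig *a, const struct cpu_sig *b)
{
	return a->sig == b->sig && a->pf == b->pf && a->rev == b->rev;
}

/*
 * Print CPU 0 unconditionally and any other CPU only when it differs from
 * CPU 0. Returns the number of lines printed.
 */
static int report_microcode(const struct cpu_sig *sigs, int ncpus)
{
	int printed = 0;

	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (cpu != 0 && cpu_sig_equal(&sigs[cpu], &sigs[0]))
			continue;
		printf("microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
		       cpu, sigs[cpu].sig, sigs[cpu].pf, sigs[cpu].rev);
		printed++;
	}
	return printed;
}
```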
-- Dmitry
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-24 20:09 ` Dmitry Adamushko
@ 2009-10-24 21:09 ` Tigran Aivazian
2009-10-24 22:45 ` Dmitry Adamushko
2009-10-26 18:24 ` [PATCH 6/8] SGI x86_64 UV: " Mike Travis
2009-10-26 18:18 ` Mike Travis
1 sibling, 2 replies; 109+ messages in thread
From: Tigran Aivazian @ 2009-10-24 21:09 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 676 bytes --]
On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>> + if (cpu_num < 4 || !limit_console_output(false))
>> + printk(KERN_INFO
>> + "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>> cpu_num, csig->sig, csig->pf, csig->rev);
>>
>
> Hmm, I guess we wouldn't lose a lot by simply removing those messages
> completely. Per-cpu pf/revision is available via /sys anyway.
The reason for printing them is that the pf (possibly others?) can be
changed by the update, and so the log has this info handy.
Kind regards
Tigran
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-24 21:09 ` Tigran Aivazian
@ 2009-10-24 22:45 ` Dmitry Adamushko
2009-10-25 16:37 ` Ingo Molnar
` (2 more replies)
2009-10-26 18:24 ` [PATCH 6/8] SGI x86_64 UV: " Mike Travis
1 sibling, 3 replies; 109+ messages in thread
From: Dmitry Adamushko @ 2009-10-24 22:45 UTC (permalink / raw)
To: Tigran Aivazian
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
2009/10/24 Tigran Aivazian <tigran@aivazian.fsnet.co.uk>:
> On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
>>>
>>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x,
>>> revision=0x%x\n",
>>> + if (cpu_num < 4 || !limit_console_output(false))
>>> + printk(KERN_INFO
>>> + "microcode: CPU%d sig=0x%x, pf=0x%x,
>>> revision=0x%x\n",
>>> cpu_num, csig->sig, csig->pf, csig->rev);
>>>
>>
>> Hmm, I guess we wouldn't lose a lot by simply removing those messages
>> completely. Per-cpu pf/revision is available via /sys anyway.
>
> The reason for printing them is that the pf (possibly others?) can change by the update and so the log has this info handy.
We might store the old sig/pf/revision set as well, export them via
/sys and/or print them at update-to-new-microcode time.
If it's really so useful to have this info in the log and, at the same
time, to avoid the flood of messages (which, I guess, are the same on
the majority of systems) at startup time, we might delay the printout
until the end of microcode_init(). Then do something like this:
microcode cpu0: up to date version sig, pf, rev // let's say, it was updated by BIOS
microcode cpus [1 ... 16] : update from sig, pf, rev to sig, pf2, rev2.
Anyway, my humble opinion is that (at the very least) the current
patch should be accompanied by a similar version for amd.
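The delayed, grouped printout could be sketched in user space as below: collect the per-CPU info first, then print one line per run of consecutive CPUs with identical values. The struct and function names are hypothetical, and the output format only approximates the example above.

```c
#include <stdio.h>

/* Simplified stand-in for the per-CPU microcode info collected at init. */
struct ucode_info {
	unsigned int sig;
	unsigned int pf;
	unsigned int rev;
};

static int same_info(const struct ucode_info *a, const struct ucode_info *b)
{
	return a->sig == b->sig && a->pf == b->pf && a->rev == b->rev;
}

/*
 * Print one line per run of consecutive CPUs sharing the same info.
 * Returns the number of lines printed.
 */
static int report_microcode_ranges(const struct ucode_info *info, int ncpus)
{
	int lines = 0;
	int start = 0;

	while (start < ncpus) {
		int end = start;

		/* Extend the run while the next CPU matches the first of the run. */
		while (end + 1 < ncpus && same_info(&info[end + 1], &info[start]))
			end++;
		if (start == end)
			printf("microcode: cpu%d: ", start);
		else
			printf("microcode: cpus [%d ... %d]: ", start, end);
		printf("sig=0x%x, pf=0x%x, revision=0x%x\n",
		       info[start].sig, info[start].pf, info[start].rev);
		lines++;
		start = end + 1;
	}
	return lines;
}
```

On a machine where all CPUs but the boot CPU carry the same revision, this prints two lines no matter how many CPUs there are.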
>
> Kind regards
> Tigran
-- Dmitry
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-24 22:45 ` Dmitry Adamushko
@ 2009-10-25 16:37 ` Ingo Molnar
2009-10-25 17:11 ` Arjan van de Ven
2009-10-26 18:29 ` Mike Travis
2009-10-26 18:25 ` Mike Travis
2009-10-30 19:40 ` [PATCH] x86_64: " Mike Travis
2 siblings, 2 replies; 109+ messages in thread
From: Ingo Molnar @ 2009-10-25 16:37 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Tigran Aivazian, Mike Travis, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
* Dmitry Adamushko <dmitry.adamushko@gmail.com> wrote:
> 2009/10/24 Tigran Aivazian <tigran@aivazian.fsnet.co.uk>:
> > On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
> >>>
> >>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x,
> >>> revision=0x%x\n",
> >>> + if (cpu_num < 4 || !limit_console_output(false))
> >>> + printk(KERN_INFO
> >>> + "microcode: CPU%d sig=0x%x, pf=0x%x,
> >>> revision=0x%x\n",
> >>> cpu_num, csig->sig, csig->pf, csig->rev);
> >>>
> >>
> >> Hmm, I guess we wouldn't lose a lot by simply removing those messages
> >> completely. Per-cpu pf/revision is available via /sys anyway.
> >
> > The reason for printing them is that the pf (possibly others?) can change by the update and so the log has this info handy.
>
> We might store the old sig/pf/revision set as well, export them via
> /sys or/and print them at update-to-new-microcode time.
>
> If it's really so useful to have this info in the log and, at the same
> time, to avoid the flood of messages (which, I guess for the majority
> of systems, are the same) at startup time, we might delay the printout
> until the end of microcode_init(). Then do something like this:
>
> microcode cpu0: up to date version sig, pf, rev // let's say,
> it was updated by BIOS
> microcode cpus [1 ... 16] : update from sig, pf, rev to sig, pf2, rev2.
>
> Anyway, my humble opinion, is that (at the very least) the current
> patch should be accompanied by a similar version for amd.
yeah. Since we load new microcode on all cpus it's enough to print it
for the boot CPU or so.
Having the precise microcode version printed (or exposed somewhere in
/sys) is useful - sometimes when there's a weird crash in some prototype
CPU one of the first questions from hw vendors is 'which precise
microcode version was that?'.
Ingo
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-25 16:37 ` Ingo Molnar
@ 2009-10-25 17:11 ` Arjan van de Ven
2009-10-25 17:27 ` Ingo Molnar
2009-10-26 18:29 ` Mike Travis
2009-10-26 18:29 ` Mike Travis
1 sibling, 2 replies; 109+ messages in thread
From: Arjan van de Ven @ 2009-10-25 17:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Dmitry Adamushko, Tigran Aivazian, Mike Travis, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Andreas Mohr,
Hugh Dickins, Hannes Eder, linux-kernel
On Sun, 25 Oct 2009 17:37:04 +0100
Ingo Molnar <mingo@elte.hu> wrote:
>
> Having the precise microcode version printed (or exposed somewhere in
> /sys) is useful - sometimes when there's a weird crash in some
> prototype CPU one of the first questions from hw vendors is 'which
> precise microcode version was that?'.
something like
/sys/devices/system/cpu/cpu0/microcode/version ?
(yes that is there today ;-)
--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-25 17:11 ` Arjan van de Ven
@ 2009-10-25 17:27 ` Ingo Molnar
2009-10-26 18:33 ` Mike Travis
2009-10-26 18:29 ` Mike Travis
1 sibling, 1 reply; 109+ messages in thread
From: Ingo Molnar @ 2009-10-25 17:27 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Dmitry Adamushko, Tigran Aivazian, Mike Travis, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Andreas Mohr,
Hugh Dickins, Hannes Eder, linux-kernel
* Arjan van de Ven <arjan@infradead.org> wrote:
> On Sun, 25 Oct 2009 17:37:04 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
>
> >
> > Having the precise microcode version printed (or exposed somewhere in
> > /sys) is useful - sometimes when there's a weird crash in some
> > prototype CPU one of the first questions from hw vendors is 'which
> > precise microcode version was that?'.
>
> something like /sys/devices/system/cpu/cpu0/microcode/version ?
>
> (yes that is there today ;-)
yeah, i used that for a bug recently.
Nevertheless it makes sense to print the boot CPU message too - for bugs
that crash before we can read out
/sys/devices/system/cpu/cpu0/microcode/version.
Ingo
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-23 23:37 ` [PATCH 1/8] SGI x86_64 UV: Add limit console output function Mike Travis
2009-10-24 1:09 ` Frederic Weisbecker
@ 2009-10-26 7:02 ` Andi Kleen
2009-10-26 16:10 ` Steven Rostedt
2009-10-26 18:03 ` Mike Travis
1 sibling, 2 replies; 109+ messages in thread
From: Andi Kleen @ 2009-10-26 7:02 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
Mike Travis <travis@sgi.com> writes:
> With a large number of processors in a system there is an excessive amount
> of messages sent to the system console. It's estimated that with 4096
> processors in a system, and the console baudrate set to 56K, the startup
> messages will take about 84 minutes to clear the serial port.
>
> This patch adds (for SGI UV only) a kernel start option "limit_console_
> output" (or 'lco' for short), which when set provides the ability to
> temporarily reduce the console loglevel during system startup. This allows
> informative messages to still be seen on the console without producing
> excessive amounts of repetitious messages.
>
> Note that all the messages are still available in the kernel log buffer.
I've run into the same problem (the kernel log being flooded on systems with a large
number of CPU threads). It's definitely not a UV-only problem. Making such an option UV-only
is definitely not the right approach; if anything, it needs to be for everyone.
Frankly a lot of these messages made sense for debugging at some point,
but really don't anymore and should just be removed.
Also I don't like the default of on. It would be better to evaluate if
these various messages are really useful and if they are not just remove them.
For example do we really need the scheduler debug messages by default?
Or do we really need to print the caches for each CPU at boot? The information
is in sysfs anyways and rarely changes (I added this originally on 64bit,
but in hindsight it was a bad idea)
I don't think it makes much sense to print more than 2-3 lines for each CPU boot
for example.
Also more work could be done to make CPU boot up less verbose without
sacrificing debuggability if something goes wrong.
So please:
- Simply remove messages that don't make sense, no flag.
- Make the default non verbose.
- Minimize output in general, with just a few standard checkpoints so
that if there is a hang the developer still has some clue what went wrong.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-23 23:37 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Mike Travis
@ 2009-10-26 7:04 ` Andi Kleen
2009-10-26 18:08 ` Mike Travis
2009-10-27 15:24 ` Mike Travis
0 siblings, 2 replies; 109+ messages in thread
From: Andi Kleen @ 2009-10-26 7:04 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, David Rientjes, Yinghai Lu, Mel Gorman,
linux-kernel
Mike Travis <travis@sgi.com> writes:
> Limit number of SRAT messages of the form:
> [ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
While I generally agree on the concept of limiting per CPU information
(see other mail) I don't think removing this message by default
is a good idea. I regularly needed it for debugging some NUMA related
problems and they still happen moderately often even today.
I think the right approach here, to limit output, would be to figure out
a more compact output format, perhaps using a matrix in a table
or simply printing multiple pairs per line.
-Andi
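
The "multiple pairs per line" idea could look roughly like the sketch below. It is a userspace illustration only: struct srat_entry, print_srat_compact() and the "PXM->APIC->Node" column header are hypothetical, not the kernel's SRAT parsing code or an agreed output format.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record for one "PXM -> APIC -> Node" mapping. */
struct srat_entry {
	int pxm;
	int apic;
	int node;
};

/*
 * Print the mappings with up to 'per_line' entries on each line.
 * Returns the number of lines written.
 */
int print_srat_compact(FILE *out, const struct srat_entry *e, size_t n,
		       size_t per_line)
{
	int lines = 0;

	if (per_line == 0)
		per_line = 1;	/* avoid division by zero below */

	for (size_t i = 0; i < n; i++) {
		/* Start each line with a header naming the three columns. */
		if (i % per_line == 0)
			fputs("SRAT: PXM->APIC->Node", out);

		fprintf(out, " %d->%d->%d", e[i].pxm, e[i].apic, e[i].node);

		/* End the line when it is full or when the data runs out. */
		if ((i + 1) % per_line == 0 || i + 1 == n) {
			fputc('\n', out);
			lines++;
		}
	}
	return lines;
}
```

With eight entries per line, 4096 mappings would take 512 lines rather than 4096, while every mapping stays visible.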
>
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-23 23:37 ` [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages Mike Travis
2009-10-24 20:09 ` Dmitry Adamushko
@ 2009-10-26 7:05 ` Andi Kleen
2009-10-26 18:34 ` Mike Travis
1 sibling, 1 reply; 109+ messages in thread
From: Andi Kleen @ 2009-10-26 7:05 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Tigran Aivazian, H. Peter Anvin, x86, Dmitry Adamushko,
Hugh Dickins, Hannes Eder, linux-kernel
Mike Travis <travis@sgi.com> writes:
> Limit number of microcode messages of the form:
>
> [ 50.887135] microcode: CPU0 sig=0x206e5, pf=0x4, revision=0xffff001
Having a summary message that tells how many CPUs got updated at the
end would seem like the right approach here.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 2/8] SGI x86_64 UV: Limit the number of processor bootup messages
2009-10-23 23:37 ` [PATCH 2/8] SGI x86_64 UV: " Mike Travis
@ 2009-10-26 7:26 ` Andi Kleen
0 siblings, 0 replies; 109+ messages in thread
From: Andi Kleen @ 2009-10-26 7:26 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Rusty Russell, Yinghai Lu, Tejun Heo,
linux-kernel
Mike Travis <travis@sgi.com> writes:
> Limit the number of processor bootup messages when
> system_state == SYSTEM_BOOTING. Limit the number of
> offline messages when system is shutting down.
I think we should still have at least one line per CPU on bootup,
simply so that you have any clue what's going on when something hangs.
I agree the current amount of output is excessive.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-26 7:02 ` Andi Kleen
@ 2009-10-26 16:10 ` Steven Rostedt
2009-10-26 18:05 ` Mike Travis
2009-10-26 18:03 ` Mike Travis
1 sibling, 1 reply; 109+ messages in thread
From: Steven Rostedt @ 2009-10-26 16:10 UTC (permalink / raw)
To: Andi Kleen
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, Randy Dunlap, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
On Mon, 2009-10-26 at 08:02 +0100, Andi Kleen wrote:
> Also more work could be done to make CPU boot up less verbose without
> sacrificing debuggability if something goes wrong.
What about moving printks over to trace_printk or something. And that
way we can have a "boot up" ring buffer that can later be retrieved if
something goes wrong. It already dumps to the console on panic/oops.
The trace_printk will be hidden and is very fast.
-- Steve
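
To illustrate the "boot up" ring buffer idea in isolation, here is a small userspace sketch that keeps only the most recent messages and dumps them on request. It is not how trace_printk or the tracing ring buffer is implemented; struct boot_ring, boot_ring_record() and boot_ring_dump() are hypothetical names, and the slot count and message length are arbitrary.

```c
#include <stdio.h>

#define BOOT_RING_SLOTS  4	/* arbitrary capacity for the example */
#define BOOT_RING_MSGLEN 64	/* arbitrary maximum message length */

struct boot_ring {
	char msg[BOOT_RING_SLOTS][BOOT_RING_MSGLEN];
	unsigned long head;	/* total number of messages ever recorded */
};

/* Store a message, overwriting the oldest one once the ring is full. */
void boot_ring_record(struct boot_ring *r, const char *text)
{
	char *slot = r->msg[r->head % BOOT_RING_SLOTS];

	snprintf(slot, BOOT_RING_MSGLEN, "%s", text);
	r->head++;
}

/* Write the surviving messages, oldest first. Returns how many were written. */
int boot_ring_dump(const struct boot_ring *r, FILE *out)
{
	unsigned long count = r->head < BOOT_RING_SLOTS ? r->head : BOOT_RING_SLOTS;
	unsigned long first = r->head - count;

	for (unsigned long i = first; i < r->head; i++)
		fprintf(out, "%s\n", r->msg[i % BOOT_RING_SLOTS]);

	return (int)count;
}
```

Recording costs one formatted copy into memory and nothing reaches the console until a dump is requested, for example from a panic path.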
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-24 1:09 ` Frederic Weisbecker
@ 2009-10-26 17:55 ` Mike Travis
2009-11-02 14:15 ` Frederic Weisbecker
0 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-26 17:55 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman, Heiko Carstens,
Robin Getz, Dave Young, linux-kernel, linux-doc
Frederic Weisbecker wrote:
> On Fri, Oct 23, 2009 at 06:37:44PM -0500, Mike Travis wrote:
>> With a large number of processors in a system there is an excessive amount
>> of messages sent to the system console. It's estimated that with 4096
>> processors in a system, and the console baudrate set to 56K, the startup
>> messages will take about 84 minutes to clear the serial port.
>>
>> This patch adds (for SGI UV only) a kernel start option "limit_console_
>> output" (or 'lco' for short), which when set provides the ability to
>> temporarily reduce the console loglevel during system startup. This allows
>> informative messages to still be seen on the console without producing
>> excessive amounts of repetitious messages.
>>
>> Note that all the messages are still available in the kernel log buffer.
>
>
>
> Well, this problem does not only concern SGI UV but all boxes with a large
> number of cpus.
>
> Also, instead of adding the same conditionals in multiple places to solve
> the same problem (and that may even expand if we go beyond the SGI UV case,
> for example with other archs' cpu up/down events), maybe you can centralize
> and institutionalize this issue by using the existing printk mechanisms.
>
> I mean, maybe that could be addressed by adding a new printk
> level flag, and then associating the desired filters with it.
>
> KERN_CPU could be a name, since this is targeting cpu events.
>
I did try out something like this but the changes quickly became very intrusive,
and I was hoping for a "lighter" touch. The other potential fallout of adding
another printk level might affect user programs that sift through the dmesg
log for "interesting" info.
Also, I could use some other config option to enable this, it's just that the
existing X86_UV was too convenient. ;-) I believe most systems would want this
turned off so the code size shrinks. And until you get the number of cpus into
the hundreds and thousands, the messages usually just fly by - particularly if
you're on a desktop system which has almost an infinite baud rate to the screen,
and usually hides the messages behind a splash screen anyways.
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-26 7:02 ` Andi Kleen
2009-10-26 16:10 ` Steven Rostedt
@ 2009-10-26 18:03 ` Mike Travis
2009-10-26 21:55 ` Andi Kleen
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:03 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
Andi Kleen wrote:
> Mike Travis <travis@sgi.com> writes:
>
>> With a large number of processors in a system there is an excessive amount
>> of messages sent to the system console. It's estimated that with 4096
>> processors in a system, and the console baudrate set to 56K, the startup
>> messages will take about 84 minutes to clear the serial port.
>>
>> This patch adds (for SGI UV only) a kernel start option "limit_console_
>> output" (or 'lco' for short), which when set provides the ability to
>> temporarily reduce the console loglevel during system startup. This allows
>> informative messages to still be seen on the console without producing
>> excessive amounts of repetitious messages.
>>
>> Note that all the messages are still available in the kernel log buffer.
>
> I've run into the same problem (kernel log being flooded on large number of CPU thread
> systems). It's definitely not a UV only problem. Making such a option UV only
> is definitely not the right approach, if anything it needs to be for everyone.
I could use something like the MAXSMP config option to enable it...?
>
> Frankly a lot of these messages made sense for debugging at some point,
> but really don't anymore and should just be removed.
That they still go to the kernel log buffer means the messages are still
available for debugging system problems. KDB has a kernel print option if
you end up there before being able to use 'dmesg'.
>
> Also I don't like the defaults of on. It would be better to evaluate if
> these various messages are really useful and if they are not just remove them.
I believe most distros already do that by setting the loglevel argument
(but I could be wrong since I haven't looked at too many of them.)
>
> For example do we really need the scheduler debug messages by default?
This was the most painful message at NASA (which has a 2k cpu system). It took
well over an hour for these scheduler messages to print, just because we wanted
to get some other DEBUG prints.
>
> Or do we really need to print the caches for each CPU at boot? The information
> is in sysfs anyways and rarely changes (I added this originally on 64bit,
> but in hindsight it was a bad idea)
I was attempting not to decide whether each message was pertinent, only if it
was redundant.
>
> I don't think it makes much sense to print more than 2-3 lines for each CPU boot
> for example.
That would still be 4 to 12 thousand lines of information which, as you say, is
available by other means.
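
As a rough worked estimate of what such line counts cost on a serial console, the sketch below converts a number of lines into drain time. The 80 characters per line and the 10 bits on the wire per character (8N1 framing) are assumptions made for illustration, not figures from this thread, and console_drain_seconds() is a hypothetical helper.

```c
/*
 * Estimate how long a serial console needs to drain 'lines' lines.
 * Assumes 10 bits on the wire per character (1 start + 8 data + 1 stop).
 * Returns whole seconds, rounded up; returns 0 if 'baud' is 0.
 */
unsigned long long console_drain_seconds(unsigned long long lines,
					 unsigned long long chars_per_line,
					 unsigned long long baud)
{
	unsigned long long bits;

	if (baud == 0)
		return 0;

	bits = lines * chars_per_line * 10ULL;
	return (bits + baud - 1) / baud;	/* round up to a full second */
}
```

Under these assumptions, 3 lines per CPU on 4096 CPUs (12288 lines) at 56000 baud come to about 176 seconds, and roughly 86 such lines per CPU would reproduce the 84-minute estimate quoted above.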
>
> Also more work could be done to make CPU boot up less verbose without
> sacrificing debuggability if something goes wrong.
>
> So please:
> - Simply remove messages that don't make sense, no flag.
> - Make the default non verbose.
> - Minimize output in general, with just a few standard checkpoints so
> that if there is a hang the developer still has some clue what went wrong.
loglevel=4 does this quite nicely. ;-)
Thanks,
Mike
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-26 16:10 ` Steven Rostedt
@ 2009-10-26 18:05 ` Mike Travis
2009-10-26 18:51 ` Steven Rostedt
0 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:05 UTC (permalink / raw)
To: rostedt
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, Randy Dunlap, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
Steven Rostedt wrote:
> On Mon, 2009-10-26 at 08:02 +0100, Andi Kleen wrote:
>
>> Also more work could be done to make CPU boot up less verbose without
>> sacrificing debuggability if something goes wrong.
>
> What about moving printks over to trace_printk or something. And that
> way we can have a "boot up" ring buffer that can later be retrieved if
> something goes wrong. It already dumps to the console on panic/oops.
>
> The trace_printk will be hidden and is very fast.
>
> -- Steve
>
I haven't heard of "trace_printk" but how does this differ from the
existing kernel log buffer you get with 'dmesg'?
Thanks,
Mike
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-26 7:04 ` Andi Kleen
@ 2009-10-26 18:08 ` Mike Travis
2009-10-27 15:24 ` Mike Travis
1 sibling, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:08 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, David Rientjes, Yinghai Lu, Mel Gorman,
linux-kernel
Andi Kleen wrote:
> Mike Travis <travis@sgi.com> writes:
>
>> Limit number of SRAT messages of the form:
>> [ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
>
> While I generally agree on the concept of limiting per CPU information
> (see other mail) I don't think removing this message by default
> is a good idea. I regularly needed it for debugging some NUMA related
> problems and they still happen moderately often even today.
>
> I think the right approach here, to limit output, would be to figure out
> a more compact output format, perhaps using a matrix in a table
> or simply printing multiple pairs per line.
>
> -Andi
>
On our UV systems, this really is redundant information and adds noise
to the console printout. If you need to examine it, dmesg will provide
it (or don't use the limit_console_output flag)?
I had thought of some reduction techniques to reduce console output, but
it didn't seem worth the complexity. Perhaps I was wrong?
Thanks,
Mike
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages
2009-10-24 3:29 ` Bjorn Helgaas
@ 2009-10-26 18:15 ` Mike Travis
2009-10-26 22:47 ` Thomas Renninger
2009-10-27 15:27 ` Mike Travis
2 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:15 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Zhang Rui, Len Brown, Thomas Renninger, Alexey Dobriyan,
Myron Stowe, Feng Tang, Suresh Siddha, Yinghai Lu, linux-acpi,
linux-kernel
Bjorn Helgaas wrote:
> On Fri, 2009-10-23 at 18:37 -0500, Mike Travis wrote:
>> plain text document attachment (limit_acpi)
>> Limit number of ACPI messages of the form:
>>
>> [ 0.000000] ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00] lsapic_eid[0x00] enabled)
>>
>> [ 99.638655] processor ACPI0007:00: registered as cooling_device0
>>
>> Cc: Zhang Rui <rui.zhang@intel.com>
>> Cc: Len Brown <lenb@kernel.org>
>> Cc: Thomas Renninger <trenn@suse.de>
>> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
>> Cc: Alexey Dobriyan <adobriyan@gmail.com>
>> Cc: Myron Stowe <myron.stowe@hp.com>
>> Cc: Feng Tang <feng.tang@intel.com>
>> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
>> Cc: Yinghai Lu <yhlu.kernel@gmail.com>
>> Cc: linux-acpi@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> drivers/acpi/fan.c | 7 ++++++-
>> drivers/acpi/processor_core.c | 8 ++++++--
>> drivers/acpi/tables.c | 15 ++++++++++-----
>> 3 files changed, 22 insertions(+), 8 deletions(-)
>>
>> --- linux.orig/drivers/acpi/fan.c
>> +++ linux/drivers/acpi/fan.c
>> @@ -243,6 +243,7 @@
>> int result = 0;
>> int state = 0;
>> struct thermal_cooling_device *cdev;
>> + static int msgcnt;
>>
>> if (!device)
>> return -EINVAL;
>> @@ -267,7 +268,11 @@
>> goto end;
>> }
>>
>> - dev_info(&device->dev, "registered as cooling_device%d\n", cdev->id);
>> + if (msgcnt < 4 || !limit_console_output(false)) {
>> + dev_info(&device->dev,
>> + "registered as cooling_device%d\n", cdev->id);
>> + msgcnt++;
>> + }
>
> I'm personally not in favor of printing some, but not all, of these
> messages. That leads to questions when analyzing a dmesg log, such as
> "Hmm, I see I have 64 CPUs, but only 0-3 are registered as cooling
> devices. Does that mean something is wrong?"
>
> But I would be glad to see this particular message removed completely.
I didn't want to make the decision to remove messages as the original
authors might have very good reasons for including them.
Note that the dmesg log (kernel log buffer) still does have every one of
the messages, only the prints to the console output (which usually is a
serial connection [or IPMI] on servers) are limited.
>
>> device->driver_data = cdev;
>> result = sysfs_create_link(&device->dev.kobj,
>> --- linux.orig/drivers/acpi/processor_core.c
>> +++ linux/drivers/acpi/processor_core.c
>> @@ -775,6 +775,7 @@
>> struct acpi_processor *pr = NULL;
>> int result = 0;
>> struct sys_device *sysdev;
>> + static int msgcnt;
>>
>> pr = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
>> if (!pr)
>> @@ -845,8 +846,11 @@
>> goto err_power_exit;
>> }
>>
>> - dev_info(&device->dev, "registered as cooling_device%d\n",
>> - pr->cdev->id);
>> + if (msgcnt < 4 || !limit_console_output(false)) {
>> + dev_info(&device->dev, "registered as cooling_device%d\n",
>> + pr->cdev->id);
>> + msgcnt++;
>> + }
>>
>> result = sysfs_create_link(&device->dev.kobj,
>> &pr->cdev->device.kobj,
>> --- linux.orig/drivers/acpi/tables.c
>> +++ linux/drivers/acpi/tables.c
>> @@ -170,11 +170,16 @@
>> case ACPI_MADT_TYPE_LOCAL_SAPIC:
>> {
>> struct acpi_madt_local_sapic *p =
>> - (struct acpi_madt_local_sapic *)header;
>> - printk(KERN_INFO PREFIX
>> - "LSAPIC (acpi_id[0x%02x] lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
>> - p->processor_id, p->id, p->eid,
>> - (p->lapic_flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
>> + (struct acpi_madt_local_sapic *)header;
>> +
>> + if (p->eid < 8 || !limit_console_output(false))
>> + printk(KERN_INFO PREFIX
>> + "LSAPIC (acpi_id[0x%02x] "
>> + "lsapic_id[0x%02x] "
>> + "lsapic_eid[0x%02x] %s)\n",
>> + p->processor_id, p->id, p->eid,
>> + (p->lapic_flags & ACPI_MADT_ENABLED) ?
>> + "enabled" : "disabled");
>
> I know we print way too much stuff for every processor, but again, I'd
> rather see all CPUs or none. I think there's a little more value in
> this one than the cooling device one (probably because I do a lot of
> platform bringup), but it could certainly be made KERN_DEBUG and/or
> combined with another processor discovery line.
This was the major reason why I left the default as it currently is, and
made it a startup option that a site can choose to use or not.
The intent of printing a few messages was to give context to the last line
in this sequence:
[ 99.638655] processor ACPI0007:00: registered as cooling_device0
[ 99.648277] processor ACPI0007:01: registered as cooling_device1
[ 99.657976] processor ACPI0007:02: registered as cooling_device2
[ 99.667229] processor ACPI0007:03: registered as cooling_device3
[ 99.676517] printk: further related messages suppressed
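
The behaviour shown in the sequence above (a few messages, then a single notice) can be sketched in userspace as follows. This is an illustration under assumptions, not the patch's limit_console_output() implementation, which is not reproduced here: struct msg_limit and msg_limit_allow() are hypothetical, and the 'limiting' argument stands in for whatever the real function decides.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-call-site state. */
struct msg_limit {
	int printed;	/* messages let through so far */
	int max;	/* messages to let through while limiting is active */
	bool noted;	/* has the suppression notice been printed yet? */
};

/*
 * Returns true if the caller should print its message. While limiting
 * is active, the first 'max' messages pass, then one notice is written
 * to 'out' and every later message is dropped.
 */
bool msg_limit_allow(struct msg_limit *ml, bool limiting, FILE *out)
{
	if (!limiting || ml->printed < ml->max) {
		ml->printed++;
		return true;
	}

	if (!ml->noted) {
		fprintf(out, "printk: further related messages suppressed\n");
		ml->noted = true;
	}
	return false;
}
```

A call site would wrap its existing print in `if (msg_limit_allow(&ml, limiting, stdout))`, with one struct msg_limit per message type so that the notice appears once for each.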
Thanks,
Mike
>
> Bjorn
>
>> }
>> break;
>>
>>
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-24 20:09 ` Dmitry Adamushko
2009-10-24 21:09 ` Tigran Aivazian
@ 2009-10-26 18:18 ` Mike Travis
1 sibling, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:18 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Tigran Aivazian, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
Dmitry Adamushko wrote:
> 2009/10/24 Mike Travis <travis@sgi.com>:
>> Limit number of microcode messages of the form:
>>
>> [ 50.887135] microcode: CPU0 sig=0x206e5, pf=0x4, revision=0xffff001
>>
>> [ ... ]
>>
>> --- linux.orig/arch/x86/kernel/microcode_intel.c
>> +++ linux/arch/x86/kernel/microcode_intel.c
>> @@ -165,7 +165,9 @@
>> /* get the current revision from MSR 0x8B */
>> rdmsr(MSR_IA32_UCODE_REV, val[0], csig->rev);
>>
>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>> + if (cpu_num < 4 || !limit_console_output(false))
>> + printk(KERN_INFO
>> + "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>> cpu_num, csig->sig, csig->pf, csig->rev);
>>
>
> Hmm, I guess we wouldn't lose a lot by simply removing those messages
> completely. Per-cpu pf/revision is available via /sys anyway.
>
> Alternatively, we might move the output into
> microcode_core.c::collect_cpu_info() (or even microcode_init_cpu()) so
> that the same logic is also applied for amd and do something as
> following:
>
> don't print if a cpu info is equal to the info of CPU#0. I guess, any
> non-0 cpu would be even better as the microcode for cpu#0 can be
> loaded by BIOS, if I'm not mistaken. But then we can only be sure
> about the presence of cpu#0.
>
> Anyway, it's not worthy of any additional complexity so I'd say let's
> just remove the output :-)
>
>
> -- Dmitry
I would be more than happy to remove messages but I didn't want to override
the original authors' intent on why they chose to add these messages in the
first place.
Plus if you have a 64 or 128 cpu system, it might give you pleasure in seeing
all those cpu messages. ;-) It's only when it hits around 256 and up that it
really starts getting annoying.
Thanks,
Mike
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-24 21:09 ` Tigran Aivazian
2009-10-24 22:45 ` Dmitry Adamushko
@ 2009-10-26 18:24 ` Mike Travis
1 sibling, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:24 UTC (permalink / raw)
To: Tigran Aivazian
Cc: Dmitry Adamushko, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
Tigran Aivazian wrote:
> On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
>>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x,
>>> revision=0x%x\n",
>>> + if (cpu_num < 4 || !limit_console_output(false))
>>> + printk(KERN_INFO
>>> + "microcode: CPU%d sig=0x%x, pf=0x%x,
>>> revision=0x%x\n",
>>> cpu_num, csig->sig, csig->pf, csig->rev);
>>>
>>
>> Hmm, I guess we wouldn't lose a lot by simply removing those messages
>> completely. Per-cpu pf/revision is available via /sys anyway.
>
> The reason for printing them is that the pf (possibly others?) can
> change by the update and so the log has this info handy.
>
> Kind regards
> Tigran
Is there any reason to need this on the console before being able to
look at them with dmesg? (Or use some filter program to hunt through
the system log?)
And if all the cpus are the same, would the printing of each one give
you any more information? I could add something that attempts to
print the new line if it's different than the previous, but this would
add complexity, maybe unnecessarily? And I was going for an approach
that optimizes to zero code when not enabled.
Thanks,
Mike
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-24 22:45 ` Dmitry Adamushko
2009-10-25 16:37 ` Ingo Molnar
@ 2009-10-26 18:25 ` Mike Travis
2009-10-26 19:27 ` Borislav Petkov
2009-10-30 19:40 ` [PATCH] x86_64: " Mike Travis
2 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:25 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Tigran Aivazian, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
Dmitry Adamushko wrote:
> 2009/10/24 Tigran Aivazian <tigran@aivazian.fsnet.co.uk>:
>> On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
>>>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x,
>>>> revision=0x%x\n",
>>>> + if (cpu_num < 4 || !limit_console_output(false))
>>>> + printk(KERN_INFO
>>>> + "microcode: CPU%d sig=0x%x, pf=0x%x,
>>>> revision=0x%x\n",
>>>> cpu_num, csig->sig, csig->pf, csig->rev);
>>>>
>>> Hmm, I guess we wouldn't lose a lot by simply removing those messages
>>> completely. Per-cpu pf/revision is available via /sys anyway.
>> The reason for printing them is that the pf (possibly others?) can change by the update and so the log has this info handy.
>
> We might store the old sig/pf/revision set as well, export them via
> /sys or/and print them at update-to-new-microcode time.
>
> If it's really so useful to have this info in the log and, at the same
> time, to avoid the flood of messages (which, I guess for the majority
> of systems, are the same) at startup time, we might delay the printout
> until the end of microcode_init(). Then do something like this:
>
> microcode cpu0: up to date version sig, pf, rev // let's say,
> it was updated by BIOS
> microcode cpus [1 ... 16] : update from sig, pf, rev to sig, pf2, rev2.
>
> Anyway, my humble opinion, is that (at the very least) the current
> patch should be accompanied by a similar version for amd.
I could add it for AMD but I can't test it, and I'm always reluctant to
change things I can't verify.
Thanks,
Mike
>
>
>> Kind regards
>> Tigran
>
>
> -- Dmitry
^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-25 16:37 ` Ingo Molnar
2009-10-25 17:11 ` Arjan van de Ven
@ 2009-10-26 18:29 ` Mike Travis
2009-10-26 20:11 ` Dmitry Adamushko
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: Dmitry Adamushko, Tigran Aivazian, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Andreas Mohr,
Hugh Dickins, Hannes Eder, linux-kernel
Ingo Molnar wrote:
> * Dmitry Adamushko <dmitry.adamushko@gmail.com> wrote:
>
>> 2009/10/24 Tigran Aivazian <tigran@aivazian.fsnet.co.uk>:
>>> On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
>>>>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x,
>>>>> revision=0x%x\n",
>>>>> + if (cpu_num < 4 || !limit_console_output(false))
>>>>> + printk(KERN_INFO
>>>>> + "microcode: CPU%d sig=0x%x, pf=0x%x,
>>>>> revision=0x%x\n",
>>>>> cpu_num, csig->sig, csig->pf, csig->rev);
>>>>>
>>>> Hmm, I guess we wouldn't lose a lot by simply removing those messages
>>>> completely. Per-cpu pf/revision is available via /sys anyway.
>>> The reason for printing them is that the pf (possibly others?) can change by the update and so the log has this info handy.
>> We might store the old sig/pf/revision set as well, export them via
>> /sys or/and print them at update-to-new-microcode time.
>>
>> If it's really so useful to have this info in the log and, at the same
>> time, to avoid the flood of messages (which, I guess for the majority
>> of systems, are the same) at startup time, we might delay the printout
>> until the end of microcode_init(). Then do something like this:
>>
>> microcode cpu0: up to date version sig, pf, rev // let's say,
>> it was updated by BIOS
>> microcode cpus [1 ... 16] : update from sig, pf, rev to sig, pf2, rev2.
>>
>> Anyway, my humble opinion, is that (at the very least) the current
>> patch should be accompanied by a similar version for amd.
>
> yeah. Since we load new microcode on all cpus it's enough to print it
> for the boot CPU or so.
>
> Having the precise microcode version printed (or exposed somewhere in
> /sys) is useful - sometimes when there's a weird crash in some prototype
> CPU one of the first questions from hw vendors is 'which precise
> microcode version was that?'.
>
> Ingo
I would agree, especially in the case where not all the cpus are exactly
the same. But so far, in an SSI, I've only seen variations in the speed of
the cpus, not in their generic type. So the version of the microcode was
identical in all cases.
Thanks,
Mike
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-25 17:11 ` Arjan van de Ven
2009-10-25 17:27 ` Ingo Molnar
@ 2009-10-26 18:29 ` Mike Travis
1 sibling, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:29 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Ingo Molnar, Dmitry Adamushko, Tigran Aivazian, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Andreas Mohr,
Hugh Dickins, Hannes Eder, linux-kernel
Arjan van de Ven wrote:
> On Sun, 25 Oct 2009 17:37:04 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
>
>> Having the precise microcode version printed (or exposed somewhere in
>> /sys) is useful - sometimes when there's a weird crash in some
>> prototype CPU one of the first questions from hw vendors is 'which
>> precise microcode version was that?'.
>
> something like
> /sys/devices/system/cpu/cpu0/microcode/version ?
>
> (yes that is there today ;-)
>
>
Thanks. That's good information.
Mike
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-25 17:27 ` Ingo Molnar
@ 2009-10-26 18:33 ` Mike Travis
0 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:33 UTC (permalink / raw)
To: Ingo Molnar
Cc: Arjan van de Ven, Dmitry Adamushko, Tigran Aivazian,
Thomas Gleixner, Andrew Morton, Jack Steiner, H. Peter Anvin,
x86, Andreas Mohr, Hugh Dickins, Hannes Eder, linux-kernel
Ingo Molnar wrote:
> * Arjan van de Ven <arjan@infradead.org> wrote:
>
>> On Sun, 25 Oct 2009 17:37:04 +0100
>> Ingo Molnar <mingo@elte.hu> wrote:
>>
>>> Having the precise microcode version printed (or exposed somewhere in
>>> /sys) is useful - sometimes when there's a weird crash in some
>>> prototype CPU one of the first questions from hw vendors is 'which
>>> precise microcode version was that?'.
>> something like /sys/devices/system/cpu/cpu0/microcode/version ?
>>
>> (yes that is there today ;-)
>
> yeah, i used that for a bug recently.
>
> Nevertheless it makes sense to print the boot CPU message too - for bugs
> that crash before we can read out
> /sys/devices/system/cpu/cpu0/microcode/version.
>
> Ingo
I added the printout of the first few cpus. I had thought about printing
the cpu info for each new socket discovered, perhaps reducing that to each
new blade or chassis as the number grew, but again it quickly became more
complex than I thought was necessary...(?)
If you get the first cpu started ok, then using a kernel debugger like
KDB, you can usually get the remaining information.
Thanks,
Mike
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-26 7:05 ` Andi Kleen
@ 2009-10-26 18:34 ` Mike Travis
0 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 18:34 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Tigran Aivazian, H. Peter Anvin, x86, Dmitry Adamushko,
Hugh Dickins, Hannes Eder, linux-kernel
Andi Kleen wrote:
> Mike Travis <travis@sgi.com> writes:
>
>> Limit number of microcode messages of the form:
>>
>> [ 50.887135] microcode: CPU0 sig=0x206e5, pf=0x4, revision=0xffff001
>
> Having a summary message that tells how many CPUs got updated at the
> end would seem like the right approach here.
>
> -Andi
>
I can do that if you think it's necessary. I believe it still does
print the error information if one or more fails to update.
Thanks,
Mike
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-26 18:05 ` Mike Travis
@ 2009-10-26 18:51 ` Steven Rostedt
0 siblings, 0 replies; 109+ messages in thread
From: Steven Rostedt @ 2009-10-26 18:51 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, Randy Dunlap, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
On Mon, 2009-10-26 at 11:05 -0700, Mike Travis wrote:
> I haven't heard of "trace_printk" but how does this differ from the
> existing kernel log buffer you get with 'dmesg'?
>
It goes into the ftrace ring buffer and does not print to console. Nor
does it print to dmesg. Currently you need to read ftrace (via the
debugfs) to get to it. Or a sysrq-z will print it out to the console.
-- Steve
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-26 18:25 ` Mike Travis
@ 2009-10-26 19:27 ` Borislav Petkov
0 siblings, 0 replies; 109+ messages in thread
From: Borislav Petkov @ 2009-10-26 19:27 UTC (permalink / raw)
To: Mike Travis
Cc: Dmitry Adamushko, Tigran Aivazian, Ingo Molnar, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Andreas Mohr,
Hugh Dickins, Hannes Eder, linux-kernel
On Mon, Oct 26, 2009 at 11:25:36AM -0700, Mike Travis wrote:
> I could add it for AMD but I can't test it, and I'm always reluctant to
> change things I can't verify.
Just send it to me, I'll test it for ya :)
--
Regards/Gruss,
Boris.
Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-26 18:29 ` Mike Travis
@ 2009-10-26 20:11 ` Dmitry Adamushko
2009-10-27 15:21 ` Mike Travis
0 siblings, 1 reply; 109+ messages in thread
From: Dmitry Adamushko @ 2009-10-26 20:11 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Tigran Aivazian, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
2009/10/26 Mike Travis <travis@sgi.com>:
>
>
> Ingo Molnar wrote:
>>
>> * Dmitry Adamushko <dmitry.adamushko@gmail.com> wrote:
>>
>>> 2009/10/24 Tigran Aivazian <tigran@aivazian.fsnet.co.uk>:
>>>>
>>>> On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
>>>>>>
>>>>>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>>>>>> + if (cpu_num < 4 || !limit_console_output(false))
>>>>>> + printk(KERN_INFO
>>>>>> + "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>>>>>> cpu_num, csig->sig, csig->pf, csig->rev);
>>>>>>
>>>>> Hmm, I guess we wouldn't lose a lot by simply removing those messages
>>>>> completely. Per-cpu pf/revision is available via /sys anyway.
>>>>
>>>> The reason for printing them is that the pf (possibly others?) can
>>>> change by the update and so the log has this info handy.
>>>
>>> We might store the old sig/pf/revision set as well, export them via
>>> /sys or/and print them at update-to-new-microcode time.
>>>
>>> If it's really so useful to have this info in the log and, at the same
>>> time, to avoid the flood of messages (which, I guess for the majority
>>> of systems, are the same) at startup time, we might delay the printout
>>> until the end of microcode_init(). Then do something like this:
>>>
>>> microcode cpu0: up to date version sig, pf, rev // let's say,
>>> it was updated by BIOS
>>> microcode cpus [1 ... 16] : update from sig, pf, rev to sig, pf2, rev2.
>>>
>>> Anyway, my humble opinion, is that (at the very least) the current
>>> patch should be accompanied by a similar version for amd.
>>
>> yeah. Since we load new microcode on all cpus it's enough to print it for
>> the boot CPU or so.
>>
>> Having the precise microcode version printed (or exposed somewhere in
>> /sys) is useful - sometimes when there's a weird crash in some prototype CPU
>> one of the first questions from hw vendors is 'which precise microcode
>> version was that?'.
>>
>> Ingo
>
> I would agree, especially in the case where not all the cpus are exactly
> the same. But so far, in an SSI, I've only seen variations in the speed of
> the cpus, not in their generic type. So the version of the microcode was
> identical in all cases.
I guess that (at least) a bootup cpu can be updated by BIOS so that it
may appear to be different.
Perhaps cases where some 'broken' cpus have been replaced with others
that have a different "revision" (but are otherwise compatible) might be
rare but possible (say, big machines with hot-pluggable cpus)?
btw., I was thinking of having something like this:
microcode: cpus [K...L] platform-specific-format (e.g. for Intel :
sig, pf, rev)
microcode: updating...
microcode: cpus [K...L] platform-specific-format (e.g. for Intel :
sig, pf, rev)
or even just,
microcode: cpus [ K...L] updated from platform-specific-format-1 to
platform-specific-format-2
>
> Thanks,
> Mike
>
-- Dmitry
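The range-style summary proposed in this message could be produced roughly
as follows. This is a user-space sketch only, not the microcode driver's
code: struct cpu_sig, same_sig() and summarize_cpu_sigs() are hypothetical
names, and the output line format is an assumption modelled on the existing
per-CPU message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-CPU record holding the sig/pf/rev triple. */
struct cpu_sig {
	unsigned int sig;
	unsigned int pf;
	unsigned int rev;
};

static int same_sig(const struct cpu_sig *a, const struct cpu_sig *b)
{
	return a->sig == b->sig && a->pf == b->pf && a->rev == b->rev;
}

/*
 * Write one line per run of consecutive CPUs that share the same triple.
 * Returns the number of runs found; the text is truncated if buf is too
 * small.
 */
static int summarize_cpu_sigs(const struct cpu_sig *sigs, int ncpus,
			      char *buf, size_t len)
{
	size_t used = 0;
	int lines = 0;
	int start = 0;

	if (len > 0)
		buf[0] = '\0';

	while (start < ncpus) {
		int end = start;
		int n;

		/* Extend the run while the next CPU reports the same triple. */
		while (end + 1 < ncpus && same_sig(&sigs[start], &sigs[end + 1]))
			end++;

		n = snprintf(buf + used, len - used,
			     "microcode: cpus [%d...%d] sig=0x%x, pf=0x%x, revision=0x%x\n",
			     start, end, sigs[start].sig, sigs[start].pf,
			     sigs[start].rev);
		if (n > 0) {
			used += (size_t)n;
			if (used > len)
				used = len;	/* output was truncated */
		}

		lines++;
		start = end + 1;
	}

	return lines;
}
```

On a machine where every CPU reports the same triple this collapses to a
single line, and a boot CPU that was already updated by the BIOS shows up as
its own one-CPU range.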
* Re: [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages
2009-10-26 22:47 ` Thomas Renninger
@ 2009-10-26 21:25 ` Mike Travis
0 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 21:25 UTC (permalink / raw)
To: Thomas Renninger
Cc: Bjorn Helgaas, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, Zhang Rui, Len Brown, Alexey Dobriyan, Myron Stowe,
Feng Tang, Suresh Siddha, Yinghai Lu, linux-acpi, linux-kernel
Thomas Renninger wrote:
> On Saturday 24 October 2009 05:29:47 am Bjorn Helgaas wrote:
>> On Fri, 2009-10-23 at 18:37 -0500, Mike Travis wrote:
>>> plain text document attachment (limit_acpi)
>>> Limit number of ACPI messages of the form:
>>>
>>> [ 0.000000] ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00]
>>> lsapic_eid[0x00] enabled)
>>>
>>> [ 99.638655] processor ACPI0007:00: registered as cooling_device0
>>>
>>> Cc: Zhang Rui <rui.zhang@intel.com>
>>> Cc: Len Brown <lenb@kernel.org>
>>> Cc: Thomas Renninger <trenn@suse.de>
>>> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
>>> Cc: Alexey Dobriyan <adobriyan@gmail.com>
>>> Cc: Myron Stowe <myron.stowe@hp.com>
>>> Cc: Feng Tang <feng.tang@intel.com>
>>> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
>>> Cc: Yinghai Lu <yhlu.kernel@gmail.com>
>>> Cc: linux-acpi@vger.kernel.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Signed-off-by: Mike Travis <travis@sgi.com>
>>> ---
>>> drivers/acpi/fan.c | 7 ++++++-
>>> drivers/acpi/processor_core.c | 8 ++++++--
>>> drivers/acpi/tables.c | 15 ++++++++++-----
>>> 3 files changed, 22 insertions(+), 8 deletions(-)
>>>
>>> --- linux.orig/drivers/acpi/fan.c
>>> +++ linux/drivers/acpi/fan.c
>>> @@ -243,6 +243,7 @@
>>> int result = 0;
>>> int state = 0;
>>> struct thermal_cooling_device *cdev;
>>> + static int msgcnt;
>>>
>>> if (!device)
>>> return -EINVAL;
>>> @@ -267,7 +268,11 @@
>>> goto end;
>>> }
>>>
>>> - dev_info(&device->dev, "registered as cooling_device%d\n", cdev->id);
>>> + if (msgcnt < 4 || !limit_console_output(false)) {
>>> + dev_info(&device->dev,
>>> + "registered as cooling_device%d\n", cdev->id);
>>> + msgcnt++;
>>> + }
>> I'm personally not in favor of printing some, but not all, of these
>> messages. That leads to questions when analyzing a dmesg log, such as
>> "Hmm, I see I have 64 CPUs, but only 0-3 are registered as cooling
>> devices. Does that mean something is wrong?"
>>
>> But I would be glad to see this particular message removed completely.
>>
>>> device->driver_data = cdev;
>>> result = sysfs_create_link(&device->dev.kobj,
>>> --- linux.orig/drivers/acpi/processor_core.c
>>> +++ linux/drivers/acpi/processor_core.c
>>> @@ -775,6 +775,7 @@
>>> struct acpi_processor *pr = NULL;
>>> int result = 0;
>>> struct sys_device *sysdev;
>>> + static int msgcnt;
>>>
>>> pr = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
>>> if (!pr)
>>> @@ -845,8 +846,11 @@
>>> goto err_power_exit;
>>> }
>>>
>>> - dev_info(&device->dev, "registered as cooling_device%d\n",
>>> - pr->cdev->id);
>>> + if (msgcnt < 4 || !limit_console_output(false)) {
>>> + dev_info(&device->dev, "registered as cooling_device%d\n",
>>> + pr->cdev->id);
>>> + msgcnt++;
>>> + }
> If Zhang Rui does not complain you can change these:
> ..registered as cooling_device..
> into dev_dbg() without any condition.
> This isn't critical.
>
> Or why not use the more fine-grained
> ACPI debug facility and change it into:
> ACPI_DEBUG_PRINT((ACPI_DB_INFO, "..."));
> (compare with Documentation/acpi/debug.txt and other
> occurrences in the same file)
> You have to pass:
> acpi_dbg_layer=0x20000000
> to see it then.
Ok.
>>> result = sysfs_create_link(&device->dev.kobj,
>>> &pr->cdev->device.kobj,
>>> --- linux.orig/drivers/acpi/tables.c
>>> +++ linux/drivers/acpi/tables.c
>>> @@ -170,11 +170,16 @@
>>> case ACPI_MADT_TYPE_LOCAL_SAPIC:
>>> {
>>> struct acpi_madt_local_sapic *p =
>>> - (struct acpi_madt_local_sapic *)header;
>>> - printk(KERN_INFO PREFIX
>>> - "LSAPIC (acpi_id[0x%02x] lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
>>> - p->processor_id, p->id, p->eid,
>>> - (p->lapic_flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
>>> + (struct acpi_madt_local_sapic *)header;
>>> +
>>> + if (p->eid < 8 || !limit_console_output(false))
> I can't find limit_console_output(). I expect it was introduced by another
> patch of your series that was not sent to the acpi list?
> Still shouldn't this be:
> limit_console_output(true)
> instead of:
> !limit_console_output(false)
>
> Thomas
Sorry, I used a semi-auto method of calling get_maintainer which filled each patch
with specific Cc's. I did send the first one to everyone in hopes that that would
help find the others.
See http://marc.info/?l=linux-kernel&m=125634109621411&w=4 (the argument specifies
whether to reduce the console loglevel. It's currently only used to suppress the
cpu bootup messages.)
Thanks,
Mike
>
>>> + printk(KERN_INFO PREFIX
>>> + "LSAPIC (acpi_id[0x%02x] "
>>> + "lsapic_id[0x%02x] "
>>> + "lsapic_eid[0x%02x] %s)\n",
>>> + p->processor_id, p->id, p->eid,
>>> + (p->lapic_flags & ACPI_MADT_ENABLED) ?
>>> + "enabled" : "disabled");
>> I know we print way too much stuff for every processor, but again, I'd
>> rather see all CPUs or none. I think there's a little more value in
>> this one than the cooling device one (probably because I do a lot of
>> platform bringup), but it could certainly be made KERN_DEBUG and/or
>> combined with another processor discovery line.
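The guard used throughout this series, "msgcnt < 4 ||
!limit_console_output(false)", can be tried out in user space. The sketch
below is only an illustration: the real limit_console_output() comes from
patch 1/8 and is not shown in this message, so the version here is a
hypothetical stand-in that simply reports whether limiting is enabled and
ignores its loglevel argument. lco_enabled and register_cooling_devices()
are likewise invented names.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the "limit_console_output" boot option (assumption). */
static bool lco_enabled;

/*
 * Hypothetical stand-in: returns whether limiting is active. The argument
 * would select whether to also reduce the console loglevel; it is ignored
 * in this sketch.
 */
static bool limit_console_output(bool reduce_loglevel)
{
	(void)reduce_loglevel;
	return lco_enabled;
}

/* Returns the number of messages actually printed for ndev devices. */
static int register_cooling_devices(int ndev)
{
	int msgcnt = 0;
	int id;

	for (id = 0; id < ndev; id++) {
		/* Always print the first four; print the rest only if unlimited. */
		if (msgcnt < 4 || !limit_console_output(false)) {
			printf("registered as cooling_device%d\n", id);
			msgcnt++;
		}
	}

	return msgcnt;
}
```

With limiting enabled and 64 devices only the first four lines are printed,
which is the partial output Bjorn objects to; with limiting off all 64
appear.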
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-26 18:03 ` Mike Travis
@ 2009-10-26 21:55 ` Andi Kleen
2009-10-26 22:07 ` Mike Travis
2009-10-30 19:25 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
0 siblings, 2 replies; 109+ messages in thread
From: Andi Kleen @ 2009-10-26 21:55 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
On Mon, Oct 26, 2009 at 11:03:59AM -0700, Mike Travis wrote:
>
>
> Andi Kleen wrote:
>> Mike Travis <travis@sgi.com> writes:
>>
>>> With a large number of processors in a system there is an excessive amount
>>> of messages sent to the system console. It's estimated that with 4096
>>> processors in a system, and the console baudrate set to 56K, the startup
>>> messages will take about 84 minutes to clear the serial port.
>>>
>>> This patch adds (for SGI UV only) a kernel start option
>>> "limit_console_output" (or 'lco' for short), which when set provides the
>>> ability to temporarily reduce the console loglevel during system startup.
>>> This allows informative messages to still be seen on the console without
>>> producing excessive amounts of repetitious messages.
>>>
>>> Note that all the messages are still available in the kernel log buffer.
>>
>> I've run into the same problem (kernel log being flooded on large number of CPU thread
>>> systems). It's definitely not a UV only problem. Making such an option UV only
>> is definitely not the right approach, if anything it needs to be for everyone.
>
> I could use something like the MAXSMP config option to enable it...?
No, it's a problem long before MAXSMP sizes.
>>
>> Frankly a lot of these messages made sense for debugging at some point,
>> but really don't anymore and should just be removed.
>
> That they still go to the kernel log buffer means the messages are still
> available for debugging system problems. KDB has a kernel print option if
> you end up there before being able to use 'dmesg'.
Again they should be just reevaluated and pr_debug()ed or completely
removed.
>
>>
>> Also I don't like the defaults of on. It would be better to evaluate if
>> these various messages are really useful and if they are not just remove them.
>
> I believe most distros already do that by setting the loglevel argument
> (but I could be wrong since I haven't looked at too many of them.)
Even spamming dmesg is a problem. loglevel doesn't fix that.
>
>>
>> For example do we really need the scheduler debug messages by default?
>
> This was the most painful message at Nasa (which has a 2k cpu system). It took
> well over an hour for these scheduler messages to print, just because we wanted
> to get some other DEBUG prints.
They should be just removed.
>>
>> Or do we really need to print the caches for each CPU at boot? The information
>> is in sysfs anyways and rarely changes (I added this originally on 64bit,
>> but in hindsight it was a bad idea)
>
> I was attempting not to decide whether each message was pertinent, only if it
> was redundant.
You should decide or at least ask whoever added it
("How many bugs did you fix with that message last year?" If the answer
is < 10 or so, remove it)
>
>>
>> I don't think it makes much sense to print more than 2-3 lines for each CPU boot
>> for example.
>
> That would still be 4 to 12 thousand lines of information which, as you say is
> available by other means.
A simple checkpoint for debugging is not available by other means.
The cache, mce etc. information is.
For the checkpoint problem on CPU boot it might be reasonable
to write them into a special buffer and only print it when the other
CPU does not come up (BP detects a time out)
With that a single line of per CPU output should be feasible without
losing any debuggability.
In fact debuggability could be improved by putting the output
at better strategic points instead of the ad-hoc way it is currently.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
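The deferred-checkpoint idea described in this message could look roughly
like the user-space sketch below. It is an illustration under stated
assumptions, not kernel code: struct boot_ckpt, ckpt_record() and
ckpt_finish() are hypothetical names, the buffer sizes are arbitrary, and
the "came_up" flag stands in for the boot processor detecting a timeout.

```c
#include <stdio.h>

#define CKPT_MAX 8	/* arbitrary: checkpoints kept per CPU */
#define CKPT_LEN 48	/* arbitrary: bytes per checkpoint message */

/* Hypothetical per-CPU buffer holding boot checkpoints until they are needed. */
struct boot_ckpt {
	int count;
	char msg[CKPT_MAX][CKPT_LEN];
};

/* Store a checkpoint instead of printing it; extra entries are dropped. */
static void ckpt_record(struct boot_ckpt *c, const char *msg)
{
	if (c->count >= CKPT_MAX)
		return;
	snprintf(c->msg[c->count], CKPT_LEN, "%s", msg);
	c->count++;
}

/*
 * Called once the outcome for a CPU is known. On success a single line is
 * printed; on timeout the buffered checkpoints are flushed as well.
 * Returns the number of lines printed and resets the buffer.
 */
static int ckpt_finish(struct boot_ckpt *c, int cpu, int came_up, FILE *out)
{
	int printed = 1;
	int i;

	if (came_up) {
		fprintf(out, "CPU%d: booted\n", cpu);
	} else {
		fprintf(out, "CPU%d: timed out after %d checkpoint(s):\n",
			cpu, c->count);
		for (i = 0; i < c->count; i++) {
			fprintf(out, "CPU%d:   %s\n", cpu, c->msg[i]);
			printed++;
		}
	}

	c->count = 0;
	return printed;
}
```

The normal case costs one line per CPU, while a CPU that fails to come up
still leaves its full checkpoint trail on the console.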
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-26 21:55 ` Andi Kleen
@ 2009-10-26 22:07 ` Mike Travis
2009-10-30 19:25 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
1 sibling, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-26 22:07 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman,
Frederic Weisbecker, Heiko Carstens, Robin Getz, Dave Young,
linux-kernel, linux-doc
Andi Kleen wrote:
> On Mon, Oct 26, 2009 at 11:03:59AM -0700, Mike Travis wrote:
>>
>> Andi Kleen wrote:
>>> Mike Travis <travis@sgi.com> writes:
>>>
>>>> With a large number of processors in a system there is an excessive amount
>>>> of messages sent to the system console. It's estimated that with 4096
>>>> processors in a system, and the console baudrate set to 56K, the startup
>>>> messages will take about 84 minutes to clear the serial port.
>>>>
>>>> This patch adds (for SGI UV only) a kernel start option
>>>> "limit_console_output" (or 'lco' for short), which when set provides the
>>>> ability to temporarily reduce the console loglevel during system startup.
>>>> This allows informative messages to still be seen on the console without
>>>> producing excessive amounts of repetitious messages.
>>>>
>>>> Note that all the messages are still available in the kernel log buffer.
>>> I've run into the same problem (kernel log being flooded on large number of CPU thread
>>> systems). It's definitely not a UV only problem. Making such an option UV only
>>> is definitely not the right approach, if anything it needs to be for everyone.
>> I could use something like the MAXSMP config option to enable it...?
>
> No, it's a problem long before MAXSMP sizes.
>
>>> Frankly a lot of these messages made sense for debugging at some point,
>>> but really don't anymore and should just be removed.
>> That they still go to the kernel log buffer means the messages are still
>> available for debugging system problems. KDB has a kernel print option if
>> you end up there before being able to use 'dmesg'.
>
> Again they should be just reevaluated and pr_debug()ed or completely
> removed.
>
>>> Also I don't like the defaults of on. It would be better to evaluate if
>>> these various messages are really useful and if they are not just remove them.
>> I believe most distros already do that by setting the loglevel argument
>> (but I could be wrong since I haven't looked at too many of them.)
>
> Even spamming dmesg is a problem. loglevel doesn't fix that.
>
>>> For example do we really need the scheduler debug messages by default?
>> This was the most painful message at Nasa (which has a 2k cpu system). It took
>> well over an hour for these scheduler messages to print, just because we wanted
>> to get some other DEBUG prints.
>
> They should be just removed.
I had changed this to CONFIG_DEBUG_SCHED at one time. Perhaps this would be
acceptable?
>
>>> Or do we really need to print the caches for each CPU at boot? The information
>>> is in sysfs anyways and rarely changes (I added this originally on 64bit,
>>> but in hindsight it was a bad idea)
>> I was attempting not to decide whether each message was pertinent, only if it
>> was redundant.
>
> You should decide or at least ask whoever added it
>
> ("How many bugs did you fix with that message last year?" If the answer
> is < 10 or so, remove it)
Ok.
>>> I don't think it makes much sense to print more than 2-3 lines for each CPU boot
>>> for example.
>> That would still be 4 to 12 thousand lines of information which, as you say is
>> available by other means.
>
> A simple checkpoint for debugging is not available by other means.
>
> The cache, mce etc. information is.
>
> For the checkpoint problem on CPU boot it might be reasonable
> to write them into a special buffer and only print it when the other
> CPU does not come up (BP detects a time out)
>
> With that a single line of per CPU output should be feasible without
> losing any debuggability.
>
> In fact debuggability could be improved by putting the output
> at better strategic points instead of the ad-hoc way it is currently.
>
> -Andi
>
Ok, thanks for the feedback. I'll see about reducing the output more
intelligently for CPUs (as per Ingo's suggestions as well).
Mike
* Re: [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages
2009-10-24 3:29 ` Bjorn Helgaas
2009-10-26 18:15 ` Mike Travis
@ 2009-10-26 22:47 ` Thomas Renninger
2009-10-26 21:25 ` Mike Travis
2009-10-27 15:27 ` Mike Travis
2 siblings, 1 reply; 109+ messages in thread
From: Thomas Renninger @ 2009-10-26 22:47 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, Zhang Rui, Len Brown, Alexey Dobriyan, Myron Stowe,
Feng Tang, Suresh Siddha, Yinghai Lu, linux-acpi, linux-kernel
On Saturday 24 October 2009 05:29:47 am Bjorn Helgaas wrote:
> On Fri, 2009-10-23 at 18:37 -0500, Mike Travis wrote:
> > plain text document attachment (limit_acpi)
> > Limit number of ACPI messages of the form:
> >
> > [ 0.000000] ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00]
> > lsapic_eid[0x00] enabled)
> >
> > [ 99.638655] processor ACPI0007:00: registered as cooling_device0
> >
> > Cc: Zhang Rui <rui.zhang@intel.com>
> > Cc: Len Brown <lenb@kernel.org>
> > Cc: Thomas Renninger <trenn@suse.de>
> > Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
> > Cc: Alexey Dobriyan <adobriyan@gmail.com>
> > Cc: Myron Stowe <myron.stowe@hp.com>
> > Cc: Feng Tang <feng.tang@intel.com>
> > Cc: Suresh Siddha <suresh.b.siddha@intel.com>
> > Cc: Yinghai Lu <yhlu.kernel@gmail.com>
> > Cc: linux-acpi@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Mike Travis <travis@sgi.com>
> > ---
> > drivers/acpi/fan.c | 7 ++++++-
> > drivers/acpi/processor_core.c | 8 ++++++--
> > drivers/acpi/tables.c | 15 ++++++++++-----
> > 3 files changed, 22 insertions(+), 8 deletions(-)
> >
> > --- linux.orig/drivers/acpi/fan.c
> > +++ linux/drivers/acpi/fan.c
> > @@ -243,6 +243,7 @@
> > int result = 0;
> > int state = 0;
> > struct thermal_cooling_device *cdev;
> > + static int msgcnt;
> >
> > if (!device)
> > return -EINVAL;
> > @@ -267,7 +268,11 @@
> > goto end;
> > }
> >
> > - dev_info(&device->dev, "registered as cooling_device%d\n", cdev->id);
> > + if (msgcnt < 4 || !limit_console_output(false)) {
> > + dev_info(&device->dev,
> > + "registered as cooling_device%d\n", cdev->id);
> > + msgcnt++;
> > + }
>
> I'm personally not in favor of printing some, but not all, of these
> messages. That leads to questions when analyzing a dmesg log, such as
> "Hmm, I see I have 64 CPUs, but only 0-3 are registered as cooling
> devices. Does that mean something is wrong?"
>
> But I would be glad to see this particular message removed completely.
>
> > device->driver_data = cdev;
> > result = sysfs_create_link(&device->dev.kobj,
> > --- linux.orig/drivers/acpi/processor_core.c
> > +++ linux/drivers/acpi/processor_core.c
> > @@ -775,6 +775,7 @@
> > struct acpi_processor *pr = NULL;
> > int result = 0;
> > struct sys_device *sysdev;
> > + static int msgcnt;
> >
> > pr = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
> > if (!pr)
> > @@ -845,8 +846,11 @@
> > goto err_power_exit;
> > }
> >
> > - dev_info(&device->dev, "registered as cooling_device%d\n",
> > - pr->cdev->id);
> > + if (msgcnt < 4 || !limit_console_output(false)) {
> > + dev_info(&device->dev, "registered as cooling_device%d\n",
> > + pr->cdev->id);
> > + msgcnt++;
> > + }
If Zhang Rui does not complain you can change these:
..registered as cooling_device..
into dev_dbg() without any condition.
This isn't critical.
Or why not use the more fine-grained
ACPI debug facility and change it into:
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "..."));
(compare with Documentation/acpi/debug.txt and other
occurrences in the same file)
You have to pass:
acpi_dbg_layer=0x20000000
to see it then.
> >
> > result = sysfs_create_link(&device->dev.kobj,
> > &pr->cdev->device.kobj,
> > --- linux.orig/drivers/acpi/tables.c
> > +++ linux/drivers/acpi/tables.c
> > @@ -170,11 +170,16 @@
> > case ACPI_MADT_TYPE_LOCAL_SAPIC:
> > {
> > struct acpi_madt_local_sapic *p =
> > - (struct acpi_madt_local_sapic *)header;
> > - printk(KERN_INFO PREFIX
> > - "LSAPIC (acpi_id[0x%02x] lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
> > - p->processor_id, p->id, p->eid,
> > - (p->lapic_flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
> > + (struct acpi_madt_local_sapic *)header;
> > +
> > + if (p->eid < 8 || !limit_console_output(false))
I can't find limit_console_output(). I expect it was introduced by another
patch of your series that was not sent to the acpi list?
Still shouldn't this be:
limit_console_output(true)
instead of:
!limit_console_output(false)
Thomas
> > + printk(KERN_INFO PREFIX
> > + "LSAPIC (acpi_id[0x%02x] "
> > + "lsapic_id[0x%02x] "
> > + "lsapic_eid[0x%02x] %s)\n",
> > + p->processor_id, p->id, p->eid,
> > + (p->lapic_flags & ACPI_MADT_ENABLED) ?
> > + "enabled" : "disabled");
>
> I know we print way too much stuff for every processor, but again, I'd
> rather see all CPUs or none. I think there's a little more value in
> this one than the cooling device one (probably because I do a lot of
> platform bringup), but it could certainly be made KERN_DEBUG and/or
> combined with another processor discovery line.
* Re: [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages
2009-10-26 20:11 ` Dmitry Adamushko
@ 2009-10-27 15:21 ` Mike Travis
0 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-27 15:21 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Ingo Molnar, Tigran Aivazian, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel
[Another approach is shown below.]
Dmitry Adamushko wrote:
> 2009/10/26 Mike Travis <travis@sgi.com>:
>>
>> Ingo Molnar wrote:
>>> * Dmitry Adamushko <dmitry.adamushko@gmail.com> wrote:
>>>
>>>> 2009/10/24 Tigran Aivazian <tigran@aivazian.fsnet.co.uk>:
>>>>> On Sat, 24 Oct 2009, Dmitry Adamushko wrote:
>>>>>>> - printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>>>>>>> + if (cpu_num < 4 || !limit_console_output(false))
>>>>>>> + printk(KERN_INFO
>>>>>>> + "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
>>>>>>> cpu_num, csig->sig, csig->pf, csig->rev);
>>>>>>>
>>>>>> Hmm, I guess we wouldn't lose a lot by simply removing those messages
>>>>>> completely. Per-cpu pf/revision is available via /sys anyway.
>>>>> The reason for printing them is that the pf (possibly others?) can
>>>>> change by the update and so the log has this info handy.
>>>> We might store the old sig/pf/revision set as well, export them via
>>>> /sys or/and print them at update-to-new-microcode time.
>>>>
>>>> If it's really so useful to have this info in the log and, at the same
>>>> time, to avoid the flood of messages (which, I guess for the majority
>>>> of systems, are the same) at startup time, we might delay the printout
>>>> until the end of microcode_init(). Then do something like this:
>>>>
>>>> microcode cpu0: up to date version sig, pf, rev // let's say,
>>>> it was updated by BIOS
>>>> microcode cpus [1 ... 16] : update from sig, pf, rev to sig, pf2, rev2.
>>>>
>>>> Anyway, my humble opinion, is that (at the very least) the current
>>>> patch should be accompanied by a similar version for amd.
>>> yeah. Since we load new microcode on all cpus it's enough to print it for
>>> the boot CPU or so.
>>>
>>> Having the precise microcode version printed (or exposed somewhere in
>>> /sys) is useful - sometimes when there's a weird crash in some prototype CPU
>>> one of the first questions from hw vendors is 'which precise microcode
>>> version was that?'.
>>>
>>> Ingo
>> I would agree, especially in the case where not all the cpus are exactly
>> the same. But so far, in an SSI, I've only seen variations in the speed of
>> the cpus, not in their generic type. So the version of the microcode was
>> identical in all cases.
>
> I guess that (at least) a bootup cpu can be updated by BIOS so that it
> may appear to be different.
> Perhaps, cases where some 'broken' cpus have been replaced for others
> with a different "revision" (but still compatible otherwise) might be
> rare but possible (say, big machines with hot-pluggable cpus) ?
>
> btw., I was thinking of having something like this:
>
> microcode: cpus [K...L] platform-specific-format (e.g. for Intel :
> sig, pf, rev)
> microcode: updating...
> microcode: cpus [K...L] platform-specific-format (e.g. for Intel :
> sig, pf, rev)
>
> or even just,
>
> microcode: cpus [ K...L] updated from platform-specific-format-1 to
> platform-specific-format-2
>
>
>> Thanks,
>> Mike
>>
>
>
> -- Dmitry
Here's another approach...
I wasn't sure how to trigger the printing of the summary, so I winged it.
Looking closer it would appear that perhaps adding a new "summarize"
function to the microcode_ops struct could trigger it? Note this version
builds but I haven't yet tested it on a live system.
Thanks,
Mike
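For readers following the idea rather than the diff: the summary boils down to
grouping CPUs whose (sig, pf, rev) tuples are identical and printing one line
per group. Below is a minimal userspace sketch of that grouping step only. It
is not the patch itself; struct sig_sketch, same_sig() and group_cpus() are
hypothetical names invented for this illustration, and the quadratic scan is
chosen for clarity, not speed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU microcode signature tuple. */
struct sig_sketch {
	unsigned int sig;
	unsigned int pf;
	unsigned int rev;
};

/* Two CPUs belong to the same summary line if all three fields match. */
static int same_sig(const struct sig_sketch *a, const struct sig_sketch *b)
{
	return a->sig == b->sig && a->pf == b->pf && a->rev == b->rev;
}

/*
 * For each CPU i, store in group[i] the index of the first CPU that has
 * the same tuple (so a group leader has group[i] == i).
 * Returns the number of distinct tuples, i.e. the number of summary
 * lines that would be printed.
 */
static size_t group_cpus(const struct sig_sketch *sigs, size_t n,
			 size_t *group)
{
	size_t i, j, ngroups = 0;

	assert(sigs != NULL && group != NULL);

	for (i = 0; i < n; i++) {
		/* look for an earlier CPU with identical data */
		for (j = 0; j < i; j++)
			if (same_sig(&sigs[i], &sigs[j]))
				break;

		group[i] = j;		/* j == i when nothing matched */
		if (j == i)
			ngroups++;
	}
	return ngroups;
}
```

On a machine where every CPU reports the same tuple, group_cpus() returns 1,
which corresponds to a single summary line instead of one line per CPU.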
SGI x86_64 UV: Limit the number of microcode messages
Limit number of microcode messages of the form:
[ 50.887135] microcode: CPU0 sig=0x206e5, pf=0x4, revision=0xffff001
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: Andreas Mohr <andi@lisas.de>
Cc: Hannes Eder <hannes@hanneseder.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/microcode_intel.c | 73 +++++++++++++++++++++++++++++++++++++-
1 file changed, 72 insertions(+), 1 deletion(-)
--- linux.orig/arch/x86/kernel/microcode_intel.c
+++ linux/arch/x86/kernel/microcode_intel.c
@@ -137,6 +137,52 @@
#define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)
+static struct cpu_signature *cpusigs;
+static cpumask_var_t cpusigslist;
+static int cpusigs_error;
+
+static void summarize_cpu_info(void)
+{
+ char buf[128];
+ int cpu;
+ cpumask_var_t cpulist;
+
+ if (cpusigs_error || !alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
+ printk(KERN_INFO "Can't print microcode summary\n");
+ return;
+ }
+
+ while ((cpu = cpumask_first(cpusigslist)) < nr_cpu_ids) {
+ struct cpu_signature *csig = &cpusigs[cpu];
+ int ncpu = cpu;
+
+ cpumask_clear(cpulist);
+ cpumask_set_cpu(cpu, cpulist);
+
+ /* gather all cpu info with same data */
+ while ((ncpu = cpumask_next(ncpu, cpusigslist)) < nr_cpu_ids)
+ if (csig->sig == cpusigs[ncpu].sig &&
+ csig->pf == cpusigs[ncpu].pf &&
+ csig->rev == cpusigs[ncpu].rev)
+ cpumask_set_cpu(ncpu, cpulist);
+
+ cpulist_scnprintf(buf, sizeof(buf), cpulist);
+
+ printk(KERN_INFO
+ "microcode: CPU%s: sig=0x%x, pf=0x%x, revision=0x%x\n",
+ buf, csig->sig, csig->pf, csig->rev);
+
+ /* clear bits we just processed */
+ cpumask_xor(cpusigslist, cpusigslist, cpulist);
+ }
+
+ /* cleanup */
+ free_cpumask_var(cpulist);
+ free_cpumask_var(cpusigslist);
+ vfree(cpusigs);
+ cpusigs_error = 0;
+}
+
static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
{
struct cpuinfo_x86 *c = &cpu_data(cpu_num);
@@ -165,9 +211,34 @@
/* get the current revision from MSR 0x8B */
rdmsr(MSR_IA32_UCODE_REV, val[0], csig->rev);
- printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
+ if (!cpusigs && !cpusigs_error) {
+ if (!alloc_cpumask_var(&cpusigslist, GFP_KERNEL))
+ cpusigs_error = 1;
+ else {
+ cpusigs = vmalloc(sizeof(*cpusigs) * nr_cpu_ids);
+ if (!cpusigs) {
+ free_cpumask_var(cpusigslist);
+ cpusigs_error = 1;
+ }
+ }
+ }
+
+ if (cpusigs_error || cpu_num == 0)
+ printk(KERN_INFO
+ "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
cpu_num, csig->sig, csig->pf, csig->rev);
+ else if (!cpusigs_error) {
+ cpusigs[cpu_num].sig = csig->sig;
+ cpusigs[cpu_num].pf = csig->pf;
+ cpusigs[cpu_num].rev = csig->rev;
+ cpumask_set_cpu(cpu_num, cpusigslist);
+
+ /* (XXX Need better method for when to print summary) */
+ if (cpu_num == (num_present_cpus() - 1))
+ summarize_cpu_info();
+ }
+
return 0;
}
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-26 7:04 ` Andi Kleen
2009-10-26 18:08 ` Mike Travis
@ 2009-10-27 15:24 ` Mike Travis
2009-10-27 19:45 ` David Rientjes
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-27 15:24 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, David Rientjes, Yinghai Lu, Mel Gorman,
linux-kernel
Andi Kleen wrote:
> Mike Travis <travis@sgi.com> writes:
>
>> Limit number of SRAT messages of the form:
>> [ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
>
> While I generally agree on the concept of limiting per CPU information
> (see other mail) I don't think removing this message by default
> is a good idea. I regularly needed it for debugging some NUMA related
> problems and they still happen moderately often even today.
>
> I think the right approach here, to limit output, would be to figure out
> a more compact output format, perhaps using a matrix in a table
> or simply printing multiple pair per line.
>
> -Andi
>
In this approach, I only print the first SRAT of each node... Is
this closer to what you're looking for?
Thanks,
Mike
SGI x86_64 UV: Limit the number of SRAT messages
Limit number of SRAT messages of the form:
[ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/mm/srat_64.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
--- linux.orig/arch/x86/mm/srat_64.c
+++ linux/arch/x86/mm/srat_64.c
@@ -115,6 +115,7 @@
{
int pxm, node;
int apic_id;
+ static int last_node = -1;
if (srat_disabled())
return;
@@ -136,8 +137,11 @@
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+ if (node > last_node) {
+ printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
+ pxm, apic_id, node);
+ last_node = node;
+ }
}
/* Callback for Proximity Domain -> LAPIC mapping */
@@ -146,6 +150,7 @@
{
int pxm, node;
int apic_id;
+ static int last_node = -1;
if (srat_disabled())
return;
@@ -170,8 +175,11 @@
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+ if (node > last_node) {
+ printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
+ pxm, apic_id, node);
+ last_node = node;
+ }
}
#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
* Re: [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages
2009-10-24 3:29 ` Bjorn Helgaas
2009-10-26 18:15 ` Mike Travis
2009-10-26 22:47 ` Thomas Renninger
@ 2009-10-27 15:27 ` Mike Travis
2009-10-27 15:51 ` Bjorn Helgaas
2 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-27 15:27 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Zhang Rui, Len Brown, Thomas Renninger, Alexey Dobriyan,
Myron Stowe, Feng Tang, Suresh Siddha, Yinghai Lu, linux-acpi,
linux-kernel
Bjorn Helgaas wrote:
...
>
> I know we print way too much stuff for every processor, but again, I'd
> rather see all CPUs or none. I think there's a little more value in
> this one than the cooling device one (probably because I do a lot of
> platform bringup), but it could certainly be made KERN_DEBUG and/or
> combined with another processor discovery line.
Is this more acceptable?
Thanks,
Mike
---
SGI x86_64 UV: Limit the number of ACPI messages
Limit number of ACPI messages of the form:
[ 0.000000] ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00] lsapic_eid[0x00] enabled)
[ 99.638655] processor ACPI0007:00: registered as cooling_device0
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Myron Stowe <myron.stowe@hp.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
drivers/acpi/fan.c | 2 +-
drivers/acpi/processor_core.c | 3 +--
drivers/acpi/tables.c | 13 ++++++++-----
3 files changed, 10 insertions(+), 8 deletions(-)
--- linux.orig/drivers/acpi/fan.c
+++ linux/drivers/acpi/fan.c
@@ -267,7 +267,7 @@
goto end;
}
- dev_info(&device->dev, "registered as cooling_device%d\n", cdev->id);
+ dev_dbg(&device->dev, "registered as cooling_device%d\n", cdev->id);
device->driver_data = cdev;
result = sysfs_create_link(&device->dev.kobj,
--- linux.orig/drivers/acpi/processor_core.c
+++ linux/drivers/acpi/processor_core.c
@@ -845,8 +845,7 @@
goto err_power_exit;
}
- dev_info(&device->dev, "registered as cooling_device%d\n",
- pr->cdev->id);
+ dev_dbg(&device->dev, "registered as cooling_device%d\n", pr->cdev->id);
result = sysfs_create_link(&device->dev.kobj,
&pr->cdev->device.kobj,
--- linux.orig/drivers/acpi/tables.c
+++ linux/drivers/acpi/tables.c
@@ -170,11 +170,14 @@
case ACPI_MADT_TYPE_LOCAL_SAPIC:
{
struct acpi_madt_local_sapic *p =
- (struct acpi_madt_local_sapic *)header;
- printk(KERN_INFO PREFIX
- "LSAPIC (acpi_id[0x%02x] lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
- p->processor_id, p->id, p->eid,
- (p->lapic_flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
+ (struct acpi_madt_local_sapic *)header;
+
+ printk(KERN_DEBUG PREFIX
+ "LSAPIC (acpi_id[0x%02x] "
+ "lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
+ p->processor_id, p->id, p->eid,
+ (p->lapic_flags & ACPI_MADT_ENABLED) ?
+ "enabled" : "disabled");
}
break;
* Re: [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages
2009-10-27 15:27 ` Mike Travis
@ 2009-10-27 15:51 ` Bjorn Helgaas
0 siblings, 0 replies; 109+ messages in thread
From: Bjorn Helgaas @ 2009-10-27 15:51 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Zhang Rui, Len Brown, Thomas Renninger, Alexey Dobriyan,
Myron Stowe, Feng Tang, Suresh Siddha, Yinghai Lu, linux-acpi,
linux-kernel
On Tuesday 27 October 2009 09:27:35 am Mike Travis wrote:
> Bjorn Helgaas wrote:
> ...
> >
> > I know we print way too much stuff for every processor, but again, I'd
> > rather see all CPUs or none. I think there's a little more value in
> > this one than the cooling device one (probably because I do a lot of
> > platform bringup), but it could certainly be made KERN_DEBUG and/or
> > combined with another processor discovery line.
>
> Is this more acceptable?
>
> Thanks,
> Mike
> ---
>
> SGI x86_64 UV: Limit the number of ACPI messages
>
> Limit number of ACPI messages of the form:
>
> [ 0.000000] ACPI: LSAPIC (acpi_id[0x00] lsapic_id[0x00] lsapic_eid[0x00] enabled)
>
> [ 99.638655] processor ACPI0007:00: registered as cooling_device0
>
> Cc: Zhang Rui <rui.zhang@intel.com>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Thomas Renninger <trenn@suse.de>
> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Myron Stowe <myron.stowe@hp.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: Yinghai Lu <yhlu.kernel@gmail.com>
> Cc: linux-acpi@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Mike Travis <travis@sgi.com>
> ---
> drivers/acpi/fan.c | 2 +-
> drivers/acpi/processor_core.c | 3 +--
> drivers/acpi/tables.c | 13 ++++++++-----
> 3 files changed, 10 insertions(+), 8 deletions(-)
>
> --- linux.orig/drivers/acpi/fan.c
> +++ linux/drivers/acpi/fan.c
> @@ -267,7 +267,7 @@
> goto end;
> }
>
> - dev_info(&device->dev, "registered as cooling_device%d\n", cdev->id);
> + dev_dbg(&device->dev, "registered as cooling_device%d\n", cdev->id);
>
> device->driver_data = cdev;
> result = sysfs_create_link(&device->dev.kobj,
> --- linux.orig/drivers/acpi/processor_core.c
> +++ linux/drivers/acpi/processor_core.c
> @@ -845,8 +845,7 @@
> goto err_power_exit;
> }
>
> - dev_info(&device->dev, "registered as cooling_device%d\n",
> - pr->cdev->id);
> + dev_dbg(&device->dev, "registered as cooling_device%d\n", pr->cdev->id);
I still think you should just remove these messages completely.
If you do keep them, note that dev_dbg() is not the same as
dev_printk(KERN_DEBUG) -- dev_dbg() compiles to nothing at all
unless "DEBUG" is defined.
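A minimal userspace sketch of that difference, under stated assumptions:
log_line(), log_dbg(), demo_debug_levels() and the "<7>" tag are made-up
stand-ins for illustration, not the kernel's definitions. The point is only
that one call is always compiled in and merely tagged with a level, while the
other disappears at compile time when DEBUG is not defined.

```c
#include <assert.h>
#include <stdio.h>

static int emitted;	/* counts lines that actually reached the log */

/* Always compiled in; the level string only tags the message. */
static void log_line(const char *level, const char *msg)
{
	assert(level != NULL && msg != NULL);
	emitted++;
	printf("%s%s\n", level, msg);
}

/* Present only when built with -DDEBUG; otherwise an empty statement. */
#ifdef DEBUG
#define log_dbg(msg)	log_line("<7>", msg)
#else
#define log_dbg(msg)	do { } while (0)
#endif

/* Returns how many of the two calls produced output in this build. */
static int demo_debug_levels(void)
{
	emitted = 0;
	log_line("<7>", "registered as cooling_device0");	/* tagged debug, still emitted */
	log_dbg("registered as cooling_device0");		/* gone unless DEBUG is defined */
	return emitted;
}
```

Built without -DDEBUG, demo_debug_levels() reports one emitted line; built
with -DDEBUG it reports two. So converting dev_info() to the dev_dbg() style
removes the message from the log buffer as well as the console in a default
build, which is a stronger change than just lowering its level.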
> result = sysfs_create_link(&device->dev.kobj,
> &pr->cdev->device.kobj,
> --- linux.orig/drivers/acpi/tables.c
> +++ linux/drivers/acpi/tables.c
> @@ -170,11 +170,14 @@
> case ACPI_MADT_TYPE_LOCAL_SAPIC:
> {
> struct acpi_madt_local_sapic *p =
> - (struct acpi_madt_local_sapic *)header;
> - printk(KERN_INFO PREFIX
> - "LSAPIC (acpi_id[0x%02x] lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
> - p->processor_id, p->id, p->eid,
> - (p->lapic_flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");
> + (struct acpi_madt_local_sapic *)header;
> +
> + printk(KERN_DEBUG PREFIX
> + "LSAPIC (acpi_id[0x%02x] "
> + "lsapic_id[0x%02x] lsapic_eid[0x%02x] %s)\n",
> + p->processor_id, p->id, p->eid,
> + (p->lapic_flags & ACPI_MADT_ENABLED) ?
> + "enabled" : "disabled");
I don't object to this.
I do think it'd be much better to combine this with the other
per-processor startup messages, of which we have an absolute over-
abundance:
CPU0: Intel(R) Xeon(TM) CPU 2.80GHz stepping 09
Booting processor 1 APIC 0x6 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5597.14 BogoMIPS (lpj=2798571)
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
But that's a bigger project.
Bjorn
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-27 15:24 ` Mike Travis
@ 2009-10-27 19:45 ` David Rientjes
2009-10-27 20:00 ` Mike Travis
2009-10-27 20:16 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Cyrill Gorcunov
0 siblings, 2 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-27 19:45 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Tue, 27 Oct 2009, Mike Travis wrote:
> --- linux.orig/arch/x86/mm/srat_64.c
> +++ linux/arch/x86/mm/srat_64.c
> @@ -115,6 +115,7 @@
> {
> int pxm, node;
> int apic_id;
> + static int last_node = -1;
>
> if (srat_disabled())
> return;
> @@ -136,8 +137,11 @@
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> + if (node > last_node) {
> + printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> + pxm, apic_id, node);
> + last_node = node;
> + }
> }
>
> /* Callback for Proximity Domain -> LAPIC mapping */
> @@ -146,6 +150,7 @@
> {
> int pxm, node;
> int apic_id;
> + static int last_node = -1;
>
> if (srat_disabled())
> return;
> @@ -170,8 +175,11 @@
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> + if (node > last_node) {
> + printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> + pxm, apic_id, node);
> + last_node = node;
> + }
> }
>
> #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
>
So on my Opteron I'll be getting this:
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
instead of this:
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
Do I need to infer what apic 1 or 3 map to with your patch (or whether
they are even valid)?
It would seem much better to print this information once the SRAT parsing
in acpi_numa_init() is complete and apicid_to_node[] is populated. This
leads to the ideal behavior, which is:
SRAT: PXM 0 -> APIC {0-1} -> Node 0
SRAT: PXM 1 -> APIC {2-3} -> Node 1
Something like the following patch? (Regardless, we need to cc
linux-acpi@vger.kernel.org. I've added it.)
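The {0-1} notation above is just a run-length view of the set of APIC ids that
map to one node. As a rough userspace sketch of that formatting step (the
patch below relies on the kernel's own bitmap list printing instead),
format_id_ranges() is a hypothetical helper written for this illustration; it
takes a plain byte-per-id array rather than a kernel bitmap.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: write every index i with
 * present[i] != 0 into buf as a comma-separated list of ranges,
 * e.g. "0-1,4". Returns the number of characters written, not
 * counting the trailing NUL.
 */
static size_t format_id_ranges(const unsigned char *present, size_t n,
			       char *buf, size_t buflen)
{
	size_t len = 0;
	size_t i = 0;

	assert(present != NULL && buf != NULL && buflen > 0);
	buf[0] = '\0';

	while (i < n) {
		size_t start, end;
		int w;

		if (!present[i]) {
			i++;
			continue;
		}

		/* extend the run [start, end] over consecutive set entries */
		start = end = i;
		while (end + 1 < n && present[end + 1])
			end++;

		if (start == end)
			w = snprintf(buf + len, buflen - len, "%s%zu",
				     len ? "," : "", start);
		else
			w = snprintf(buf + len, buflen - len, "%s%zu-%zu",
				     len ? "," : "", start, end);

		/* on truncation, drop the partial entry and stop */
		if (w < 0 || (size_t)w >= buflen - len) {
			buf[len] = '\0';
			break;
		}
		len += (size_t)w;
		i = end + 1;
	}
	return len;
}
```

For ids 0, 1, 4, 6 and 7 this produces "0-1,4,6-7", so holes in the APIC id
space stay visible in the summary line instead of having to be inferred.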
x86: reduce srat verbosity in the kernel log
It's possible to reduce the number of SRAT messages emitted to the kernel
log by printing each valid pxm once and then creating bitmaps to represent
the apicids that map to the same node.
This reduces lines such as
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
to
SRAT: PXM 0 -> APIC {0-1} -> Node 0
SRAT: PXM 1 -> APIC {2-3} -> Node 1
Signed-off-by: David Rientjes <rientjes@google.com>
---
arch/x86/mm/srat_64.c | 31 +++++++++++++++++++++++++++----
drivers/acpi/numa.c | 5 +++++
include/linux/acpi.h | 3 ++-
3 files changed, 34 insertions(+), 5 deletions(-)
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -136,8 +136,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
}
/* Callback for Proximity Domain -> LAPIC mapping */
@@ -170,8 +168,33 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+}
+
+void __init acpi_numa_print_srat_mapping(void)
+{
+ DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC);
+ char apicid_list[MAX_LOCAL_APIC];
+ int i, j;
+
+ for (i = 0; i < MAX_PXM_DOMAINS; i++) {
+ int nid;
+
+ nid = pxm_to_node(i);
+ if (nid == NUMA_NO_NODE)
+ continue;
+
+ bitmap_zero(apicid_map, MAX_LOCAL_APIC);
+ for (j = 0; j < MAX_LOCAL_APIC; j++)
+ if (apicid_to_node[j] == nid)
+ set_bit(j, apicid_map);
+
+ if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
+ continue;
+ bitmap_scnlistprintf(apicid_list, MAX_LOCAL_APIC,
+ apicid_map, MAX_LOCAL_APIC);
+ pr_info("SRAT: PXM %u -> APIC {%s} -> Node %u\n",
+ i, apicid_list, nid);
+ }
}
#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -281,6 +281,10 @@ acpi_table_parse_srat(enum acpi_srat_type id,
handler, max_entries);
}
+void __init __attribute__((weak)) acpi_numa_print_srat_mapping(void)
+{
+}
+
int __init acpi_numa_init(void)
{
/* SRAT: Static Resource Affinity Table */
@@ -292,6 +296,7 @@ int __init acpi_numa_init(void)
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
acpi_parse_memory_affinity,
NR_NODE_MEMBLKS);
+ acpi_numa_print_srat_mapping();
}
/* SLIT: System Locality Information Table */
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -92,12 +92,13 @@ int acpi_table_parse_madt (enum acpi_madt_type id, acpi_table_entry_handler hand
int acpi_parse_mcfg (struct acpi_table_header *header);
void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
-/* the following four functions are architecture-dependent */
+/* the following six functions are architecture-dependent */
void acpi_numa_slit_init (struct acpi_table_slit *slit);
void acpi_numa_processor_affinity_init (struct acpi_srat_cpu_affinity *pa);
void acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa);
void acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
void acpi_numa_arch_fixup(void);
+void acpi_numa_print_srat_mapping(void);
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-27 19:45 ` David Rientjes
@ 2009-10-27 20:00 ` Mike Travis
2009-10-27 20:25 ` [patch] x86: reduce srat verbosity in the kernel log David Rientjes
2009-10-27 20:16 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Cyrill Gorcunov
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-27 20:00 UTC (permalink / raw)
To: David Rientjes
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
Hi David,
Very Cool, I'll try it out and let you know how it goes.
Note that it would be better to declare the BITMAP in the
static initdata section so it doesn't grow the stack by 4k
bytes. (And it's thrown away after the kernel starts.)
Thanks,
Mike
David Rientjes wrote:
> On Tue, 27 Oct 2009, Mike Travis wrote:
>
>> --- linux.orig/arch/x86/mm/srat_64.c
>> +++ linux/arch/x86/mm/srat_64.c
>> @@ -115,6 +115,7 @@
>> {
>> int pxm, node;
>> int apic_id;
>> + static int last_node = -1;
>>
>> if (srat_disabled())
>> return;
>> @@ -136,8 +137,11 @@
>> apicid_to_node[apic_id] = node;
>> node_set(node, cpu_nodes_parsed);
>> acpi_numa = 1;
>> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
>> - pxm, apic_id, node);
>> + if (node > last_node) {
>> + printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
>> + pxm, apic_id, node);
>> + last_node = node;
>> + }
>> }
>>
>> /* Callback for Proximity Domain -> LAPIC mapping */
>> @@ -146,6 +150,7 @@
>> {
>> int pxm, node;
>> int apic_id;
>> + static int last_node = -1;
>>
>> if (srat_disabled())
>> return;
>> @@ -170,8 +175,11 @@
>> apicid_to_node[apic_id] = node;
>> node_set(node, cpu_nodes_parsed);
>> acpi_numa = 1;
>> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
>> - pxm, apic_id, node);
>> + if (node > last_node) {
>> + printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
>> + pxm, apic_id, node);
>> + last_node = node;
>> + }
>> }
>>
>> #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
>>
>
> So on my Opteron I'll be getting this:
>
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
>
> instead of this:
>
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
>
> Do I need to infer what apic 1 or 3 map to with your patch (or whether
> they are even valid)?
>
> It would seem much better to print this information once the SRAT parsing
> in acpi_numa_init() is complete and apicid_to_node[] is populated. This
> leads to the ideal behavior, which is:
>
> SRAT: PXM 0 -> APIC {0-1} -> Node 0
> SRAT: PXM 1 -> APIC {2-3} -> Node 1
>
> Something like the following patch? (Regardless, we need to cc
> linux-acpi@vger.kernel.org. I've added it.)
>
>
>
> x86: reduce srat verbosity in the kernel log
>
> It's possible to reduce the number of SRAT messages emitted to the kernel
> log by printing each valid pxm once and then creating bitmaps to represent
> the apicids that map to the same node.
>
> This reduces lines such as
>
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
>
> to
>
> SRAT: PXM 0 -> APIC {0-1} -> Node 0
> SRAT: PXM 1 -> APIC {2-3} -> Node 1
>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> arch/x86/mm/srat_64.c | 31 +++++++++++++++++++++++++++----
> drivers/acpi/numa.c | 5 +++++
> include/linux/acpi.h | 3 ++-
> 3 files changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -136,8 +136,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> }
>
> /* Callback for Proximity Domain -> LAPIC mapping */
> @@ -170,8 +168,33 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> +}
> +
> +void __init acpi_numa_print_srat_mapping(void)
> +{
> + DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC);
> + char apicid_list[MAX_LOCAL_APIC];
> + int i, j;
> +
> + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
> + int nid;
> +
> + nid = pxm_to_node(i);
> + if (nid == NUMA_NO_NODE)
> + continue;
> +
> + bitmap_zero(apicid_map, MAX_LOCAL_APIC);
> + for (j = 0; j < MAX_LOCAL_APIC; j++)
> + if (apicid_to_node[j] == nid)
> + set_bit(j, apicid_map);
> +
> + if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
> + continue;
> + bitmap_scnlistprintf(apicid_list, MAX_LOCAL_APIC,
> + apicid_map, MAX_LOCAL_APIC);
> + pr_info("SRAT: PXM %u -> APIC {%s} -> Node %u\n",
> + i, apicid_list, nid);
> + }
> }
>
> #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
> --- a/drivers/acpi/numa.c
> +++ b/drivers/acpi/numa.c
> @@ -281,6 +281,10 @@ acpi_table_parse_srat(enum acpi_srat_type id,
> handler, max_entries);
> }
>
> +void __init __attribute__((weak)) acpi_numa_print_srat_mapping(void)
> +{
> +}
> +
> int __init acpi_numa_init(void)
> {
> /* SRAT: Static Resource Affinity Table */
> @@ -292,6 +296,7 @@ int __init acpi_numa_init(void)
> acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
> acpi_parse_memory_affinity,
> NR_NODE_MEMBLKS);
> + acpi_numa_print_srat_mapping();
> }
>
> /* SLIT: System Locality Information Table */
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -92,12 +92,13 @@ int acpi_table_parse_madt (enum acpi_madt_type id, acpi_table_entry_handler hand
> int acpi_parse_mcfg (struct acpi_table_header *header);
> void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
>
> -/* the following four functions are architecture-dependent */
> +/* the following six functions are architecture-dependent */
> void acpi_numa_slit_init (struct acpi_table_slit *slit);
> void acpi_numa_processor_affinity_init (struct acpi_srat_cpu_affinity *pa);
> void acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa);
> void acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
> void acpi_numa_arch_fixup(void);
> +void acpi_numa_print_srat_mapping(void);
>
> #ifdef CONFIG_ACPI_HOTPLUG_CPU
> /* Arch dependent functions for cpu hotplug support */
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-27 19:45 ` David Rientjes
2009-10-27 20:00 ` Mike Travis
@ 2009-10-27 20:16 ` Cyrill Gorcunov
2009-10-27 20:23 ` Mike Travis
1 sibling, 1 reply; 109+ messages in thread
From: Cyrill Gorcunov @ 2009-10-27 20:16 UTC (permalink / raw)
To: David Rientjes
Cc: Mike Travis, Andi Kleen, Ingo Molnar, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
[David Rientjes - Tue, Oct 27, 2009 at 12:45:00PM -0700]
...
| On Tue, 27 Oct 2009, Mike Travis wrote:
| +
| +void __init acpi_numa_print_srat_mapping(void)
| +{
| + DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC);
| + char apicid_list[MAX_LOCAL_APIC];
Hi David, I suppose 32K on stack is too much :)
(perhaps gcc will move it out of stack?)
...
-- Cyrill
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-27 20:16 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Cyrill Gorcunov
@ 2009-10-27 20:23 ` Mike Travis
2009-10-27 20:33 ` Cyrill Gorcunov
0 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-27 20:23 UTC (permalink / raw)
To: Cyrill Gorcunov
Cc: David Rientjes, Andi Kleen, Ingo Molnar, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
Cyrill Gorcunov wrote:
> [David Rientjes - Tue, Oct 27, 2009 at 12:45:00PM -0700]
> ...
> | On Tue, 27 Oct 2009, Mike Travis wrote:
> | +
> | +void __init acpi_numa_print_srat_mapping(void)
> | +{
> | + DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC);
> | + char apicid_list[MAX_LOCAL_APIC];
>
> Hi David, I suppose 32K on stack is too much :)
> (perhaps gcc will move it out of stack?)
>
> ...
>
> -- Cyrill
Yeah, I missed that too on my first review. (4k seems piddling
compared to 32k on the stack! ;-)
I moved them both to static and will be testing it shortly:
static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
static char apicid_list[MAX_LOCAL_APIC] __initdata;
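For reference, the 4k and 32k figures can be checked with a little arithmetic.
The sketch below assumes MAX_LOCAL_APIC is 32768, which is inferred from the
sizes quoted in this thread rather than taken from the headers; the macro and
function names are made up for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: inferred from the 4k/32k figures quoted in this thread. */
#define ASSUMED_MAX_LOCAL_APIC	32768

#define SKETCH_BITS_PER_LONG	(sizeof(long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n)	\
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Bytes used by a bitmap of nbits bits stored as an array of longs. */
static size_t bitmap_bytes(size_t nbits)
{
	return SKETCH_BITS_TO_LONGS(nbits) * sizeof(long);
}

/* Bytes used by a char buffer of nchars entries. */
static size_t list_buffer_bytes(size_t nchars)
{
	return nchars * sizeof(char);
}

/* Total that the two on-stack objects would have needed. */
static size_t srat_print_stack_bytes(void)
{
	size_t total = bitmap_bytes(ASSUMED_MAX_LOCAL_APIC) +
		       list_buffer_bytes(ASSUMED_MAX_LOCAL_APIC);

	assert(total > 0);
	return total;
}
```

Under that assumption the bitmap is 4096 bytes and the list buffer 32768
bytes, 36864 bytes in total, which is why both belong in __initdata rather
than on the stack.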
Thanks,
Mike
* [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:00 ` Mike Travis
@ 2009-10-27 20:25 ` David Rientjes
2009-10-27 20:42 ` Mike Travis
` (2 more replies)
0 siblings, 3 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-27 20:25 UTC (permalink / raw)
To: Ingo Molnar, Mike Travis
Cc: Andi Kleen, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi
On Tue, 27 Oct 2009, Mike Travis wrote:
> Hi David,
>
> Very Cool, I'll try it out and let you know how it goes.
>
> Note that it would be better to declare the BITMAP in the
> static initdata section so it doesn't grow the stack by 4k
> bytes. (And it's thrown away after the kernel starts.)
>
Right, I was thinking of MAX_PXM_DOMAINS being 256 instead of
MAX_LOCAL_APIC :)
Here's an updated version. apicid_map and apicid_list don't need to be
synchronized because there are no concurrency issues here on init.
x86: reduce srat verbosity in the kernel log
It's possible to reduce the number of SRAT messages emitted to the kernel
log by printing each valid pxm once and then creating bitmaps to represent
the apicids that map to the same node.
This reduces lines such as
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
to
SRAT: PXM 0 -> APIC {0-1} -> Node 0
SRAT: PXM 1 -> APIC {2-3} -> Node 1
Signed-off-by: David Rientjes <rientjes@google.com>
---
arch/x86/mm/srat_64.c | 32 ++++++++++++++++++++++++++++----
drivers/acpi/numa.c | 5 +++++
include/linux/acpi.h | 3 ++-
3 files changed, 35 insertions(+), 5 deletions(-)
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -36,6 +36,9 @@ static int num_node_memblks __initdata;
static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;
+static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
+static char apicid_list[MAX_LOCAL_APIC] __initdata;
+
static __init int setup_node(int pxm)
{
return acpi_map_pxm_to_node(pxm);
@@ -136,8 +139,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
}
/* Callback for Proximity Domain -> LAPIC mapping */
@@ -170,8 +171,31 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+}
+
+void __init acpi_numa_print_srat_mapping(void)
+{
+ int i, j;
+
+ for (i = 0; i < MAX_PXM_DOMAINS; i++) {
+ int nid;
+
+ nid = pxm_to_node(i);
+ if (nid == NUMA_NO_NODE)
+ continue;
+
+ bitmap_zero(apicid_map, MAX_LOCAL_APIC);
+ for (j = 0; j < MAX_LOCAL_APIC; j++)
+ if (apicid_to_node[j] == nid)
+ set_bit(j, apicid_map);
+
+ if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
+ continue;
+ bitmap_scnlistprintf(apicid_list, MAX_LOCAL_APIC,
+ apicid_map, MAX_LOCAL_APIC);
+ pr_info("SRAT: PXM %u -> APIC {%s} -> Node %u\n",
+ i, apicid_list, nid);
+ }
}
#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -281,6 +281,10 @@ acpi_table_parse_srat(enum acpi_srat_type id,
handler, max_entries);
}
+void __init __attribute__((weak)) acpi_numa_print_srat_mapping(void)
+{
+}
+
int __init acpi_numa_init(void)
{
/* SRAT: Static Resource Affinity Table */
@@ -292,6 +296,7 @@ int __init acpi_numa_init(void)
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
acpi_parse_memory_affinity,
NR_NODE_MEMBLKS);
+ acpi_numa_print_srat_mapping();
}
/* SLIT: System Locality Information Table */
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -92,12 +92,13 @@ int acpi_table_parse_madt (enum acpi_madt_type id, acpi_table_entry_handler hand
int acpi_parse_mcfg (struct acpi_table_header *header);
void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
-/* the following four functions are architecture-dependent */
+/* the following six functions are architecture-dependent */
void acpi_numa_slit_init (struct acpi_table_slit *slit);
void acpi_numa_processor_affinity_init (struct acpi_srat_cpu_affinity *pa);
void acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa);
void acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
void acpi_numa_arch_fixup(void);
+void acpi_numa_print_srat_mapping(void);
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
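
The weak default added to drivers/acpi/numa.c above can be illustrated with a small userspace sketch. It assumes a GCC or Clang toolchain (the weak attribute is a compiler extension), and the names sketch_srat_mapping_printer() and sketch_numa_init() are hypothetical stand-ins, not kernel symbols.

```c
/*
 * Hypothetical stand-in for acpi_numa_print_srat_mapping(). The weak
 * attribute marks this as the fallback definition: if another object
 * file linked into the program defines a non-weak function with the
 * same name, the linker picks that one instead.
 *
 * Returns 0 to signal "generic stub, nothing printed".
 */
__attribute__((weak)) int sketch_srat_mapping_printer(void)
{
	return 0;
}

/* Generic caller: it does not need to know which definition was linked. */
static int sketch_numa_init(void)
{
	return sketch_srat_mapping_printer();
}
```

Built on its own, sketch_numa_init() reaches the stub. An architecture that wants real output would supply a non-weak definition in a separate file, which is the role srat_64.c plays in the patch.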
* Re: [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages
2009-10-27 20:23 ` Mike Travis
@ 2009-10-27 20:33 ` Cyrill Gorcunov
0 siblings, 0 replies; 109+ messages in thread
From: Cyrill Gorcunov @ 2009-10-27 20:33 UTC (permalink / raw)
To: Mike Travis
Cc: David Rientjes, Andi Kleen, Ingo Molnar, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
[Mike Travis - Tue, Oct 27, 2009 at 01:23:42PM -0700]
>
> I moved them both to static and will be testing it shortly:
>
> static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
> static char apicid_list[MAX_LOCAL_APIC] __initdata;
>
> Thanks,
> Mike
>
Great! David has just updated the patch too.
-- Cyrill
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:25 ` [patch] x86: reduce srat verbosity in the kernel log David Rientjes
@ 2009-10-27 20:42 ` Mike Travis
2009-10-27 20:48 ` David Rientjes
2009-10-27 20:55 ` Cyrill Gorcunov
2009-10-28 3:32 ` Andi Kleen
2 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-27 20:42 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
I applied your previous patch with the change to use static and
here's the console output from a live system:
[ 0.000000] SRAT: PXM 0 -> APIC {0-7,16-23} -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC {32-39,48-55} -> Node 1
[ 0.000000] SRAT: PXM 2 -> APIC {64-71,80-87} -> Node 2
[ 0.000000] SRAT: PXM 3 -> APIC {96-103,112-119} -> Node 3
[ 0.000000] SRAT: PXM 4 -> APIC {128-135,144-151} -> Node 4
[ 0.000000] SRAT: PXM 5 -> APIC {160-167,176-183} -> Node 5
[ 0.000000] SRAT: PXM 6 -> APIC {192-199,208-215} -> Node 6
[ 0.000000] SRAT: PXM 7 -> APIC {224-231,240-247} -> Node 7
[ 0.000000] SRAT: PXM 8 -> APIC {256-263,272-279} -> Node 8
[ 0.000000] SRAT: PXM 9 -> APIC {288-295,304-311} -> Node 9
[ 0.000000] SRAT: PXM 10 -> APIC {320-327,336-343} -> Node 10
[ 0.000000] SRAT: PXM 11 -> APIC {352-359,368-375} -> Node 11
[ 0.000000] SRAT: PXM 12 -> APIC {384-391,400-407} -> Node 12
[ 0.000000] SRAT: PXM 13 -> APIC {416-423,432-439} -> Node 13
[ 0.000000] SRAT: PXM 14 -> APIC {448-455,464-471} -> Node 14
[ 0.000000] SRAT: PXM 15 -> APIC {480-487,496-503} -> Node 15
[ 0.000000] SRAT: PXM 16 -> APIC {512-519,528-535} -> Node 16
[ 0.000000] SRAT: PXM 17 -> APIC {544-551,560-567} -> Node 17
[ 0.000000] SRAT: PXM 18 -> APIC {576-583,592-599} -> Node 18
[ 0.000000] SRAT: PXM 19 -> APIC {608-615,624-631} -> Node 19
[ 0.000000] SRAT: PXM 20 -> APIC {640-647,656-663} -> Node 20
[ 0.000000] SRAT: PXM 21 -> APIC {672-679,688-695} -> Node 21
[ 0.000000] SRAT: PXM 22 -> APIC {704-711,720-727} -> Node 22
[ 0.000000] SRAT: PXM 23 -> APIC {736-743,752-759} -> Node 23
David Rientjes wrote:
> On Tue, 27 Oct 2009, Mike Travis wrote:
>
>> Hi David,
>>
>> Very Cool, I'll try it out and let you know how it goes.
>>
>> Note that it would be better to declare the BITMAP in the
>> static initdata section so it doesn't grow the stack by 4k
>> bytes. (And it's thrown away after the kernel starts.)
>>
>
> Right. I was thinking of MAX_PXM_DOMAINS being 256 instead of
> MAX_LOCAL_APIC :)
>
> Here's an updated version. apicid_map and apicid_list don't need to be
> synchronized because there are no concurrency issues here during init.
>
>
>
> x86: reduce srat verbosity in the kernel log
>
> It's possible to reduce the number of SRAT messages emitted to the kernel
> log by printing each valid pxm once and then creating bitmaps to represent
> the apicids that map to the same node.
>
> This reduces lines such as
>
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
>
> to
>
> SRAT: PXM 0 -> APIC {0-1} -> Node 0
> SRAT: PXM 1 -> APIC {2-3} -> Node 1
>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> arch/x86/mm/srat_64.c | 32 ++++++++++++++++++++++++++++----
> drivers/acpi/numa.c | 5 +++++
> include/linux/acpi.h | 3 ++-
> 3 files changed, 35 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
> --- a/arch/x86/mm/srat_64.c
> +++ b/arch/x86/mm/srat_64.c
> @@ -36,6 +36,9 @@ static int num_node_memblks __initdata;
> static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
> static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;
>
> +static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
> +static char apicid_list[MAX_LOCAL_APIC] __initdata;
> +
> static __init int setup_node(int pxm)
> {
> return acpi_map_pxm_to_node(pxm);
> @@ -136,8 +139,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> }
>
> /* Callback for Proximity Domain -> LAPIC mapping */
> @@ -170,8 +171,31 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
> apicid_to_node[apic_id] = node;
> node_set(node, cpu_nodes_parsed);
> acpi_numa = 1;
> - printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
> - pxm, apic_id, node);
> +}
> +
> +void __init acpi_numa_print_srat_mapping(void)
> +{
> + int i, j;
> +
> + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
> + int nid;
> +
> + nid = pxm_to_node(i);
> + if (nid == NUMA_NO_NODE)
> + continue;
> +
> + bitmap_zero(apicid_map, MAX_LOCAL_APIC);
> + for (j = 0; j < MAX_LOCAL_APIC; j++)
> + if (apicid_to_node[j] == nid)
> + set_bit(j, apicid_map);
> +
> + if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
> + continue;
> + bitmap_scnlistprintf(apicid_list, MAX_LOCAL_APIC,
> + apicid_map, MAX_LOCAL_APIC);
> + pr_info("SRAT: PXM %u -> APIC {%s} -> Node %u\n",
> + i, apicid_list, nid);
> + }
> }
>
> #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
> --- a/drivers/acpi/numa.c
> +++ b/drivers/acpi/numa.c
> @@ -281,6 +281,10 @@ acpi_table_parse_srat(enum acpi_srat_type id,
> handler, max_entries);
> }
>
> +void __init __attribute__((weak)) acpi_numa_print_srat_mapping(void)
> +{
> +}
> +
> int __init acpi_numa_init(void)
> {
> /* SRAT: Static Resource Affinity Table */
> @@ -292,6 +296,7 @@ int __init acpi_numa_init(void)
> acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
> acpi_parse_memory_affinity,
> NR_NODE_MEMBLKS);
> + acpi_numa_print_srat_mapping();
> }
>
> /* SLIT: System Locality Information Table */
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -92,12 +92,13 @@ int acpi_table_parse_madt (enum acpi_madt_type id, acpi_table_entry_handler hand
> int acpi_parse_mcfg (struct acpi_table_header *header);
> void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
>
> -/* the following four functions are architecture-dependent */
> +/* the following six functions are architecture-dependent */
> void acpi_numa_slit_init (struct acpi_table_slit *slit);
> void acpi_numa_processor_affinity_init (struct acpi_srat_cpu_affinity *pa);
> void acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa);
> void acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
> void acpi_numa_arch_fixup(void);
> +void acpi_numa_print_srat_mapping(void);
>
> #ifdef CONFIG_ACPI_HOTPLUG_CPU
> /* Arch dependent functions for cpu hotplug support */
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:42 ` Mike Travis
@ 2009-10-27 20:48 ` David Rientjes
2009-10-27 23:02 ` Mike Travis
2009-10-28 3:53 ` Yinghai Lu
0 siblings, 2 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-27 20:48 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Tue, 27 Oct 2009, Mike Travis wrote:
> I applied your previous patch with the change to use static and
> here's the console output from a live system:
>
>
> [ 0.000000] SRAT: PXM 0 -> APIC {0-7,16-23} -> Node 0
> [ 0.000000] SRAT: PXM 1 -> APIC {32-39,48-55} -> Node 1
> [ 0.000000] SRAT: PXM 2 -> APIC {64-71,80-87} -> Node 2
> [ 0.000000] SRAT: PXM 3 -> APIC {96-103,112-119} -> Node 3
> [ 0.000000] SRAT: PXM 4 -> APIC {128-135,144-151} -> Node 4
> [ 0.000000] SRAT: PXM 5 -> APIC {160-167,176-183} -> Node 5
> [ 0.000000] SRAT: PXM 6 -> APIC {192-199,208-215} -> Node 6
> [ 0.000000] SRAT: PXM 7 -> APIC {224-231,240-247} -> Node 7
> [ 0.000000] SRAT: PXM 8 -> APIC {256-263,272-279} -> Node 8
> [ 0.000000] SRAT: PXM 9 -> APIC {288-295,304-311} -> Node 9
> [ 0.000000] SRAT: PXM 10 -> APIC {320-327,336-343} -> Node 10
> [ 0.000000] SRAT: PXM 11 -> APIC {352-359,368-375} -> Node 11
> [ 0.000000] SRAT: PXM 12 -> APIC {384-391,400-407} -> Node 12
> [ 0.000000] SRAT: PXM 13 -> APIC {416-423,432-439} -> Node 13
> [ 0.000000] SRAT: PXM 14 -> APIC {448-455,464-471} -> Node 14
> [ 0.000000] SRAT: PXM 15 -> APIC {480-487,496-503} -> Node 15
> [ 0.000000] SRAT: PXM 16 -> APIC {512-519,528-535} -> Node 16
> [ 0.000000] SRAT: PXM 17 -> APIC {544-551,560-567} -> Node 17
> [ 0.000000] SRAT: PXM 18 -> APIC {576-583,592-599} -> Node 18
> [ 0.000000] SRAT: PXM 19 -> APIC {608-615,624-631} -> Node 19
> [ 0.000000] SRAT: PXM 20 -> APIC {640-647,656-663} -> Node 20
> [ 0.000000] SRAT: PXM 21 -> APIC {672-679,688-695} -> Node 21
> [ 0.000000] SRAT: PXM 22 -> APIC {704-711,720-727} -> Node 22
> [ 0.000000] SRAT: PXM 23 -> APIC {736-743,752-759} -> Node 23
>
Quite the system you have there :) What was once 760 lines has been
reduced to 24 without removing any information.
This seems to be the most we can reduce this particular output since we
don't support mapping multiple pxms to a single node.
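
For readers who want to see how 760 lines collapse into 24, the range collapsing can be sketched in userspace C. This is an illustration only: format_range_list() is a hypothetical helper, it takes a plain bool array rather than a kernel bitmap, and it is not the kernel's bitmap_scnlistprintf().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the set entries of set[0..nbits) as a comma-separated list of
 * ranges, e.g. "0-7,16-23". Returns the number of characters stored in
 * buf, excluding the terminating NUL. Output that does not fit is
 * truncated, in which case the return value is buflen - 1.
 */
static size_t format_range_list(char *buf, size_t buflen,
				const bool *set, size_t nbits)
{
	size_t len = 0;
	size_t i = 0;

	if (buflen == 0)
		return 0;
	buf[0] = '\0';

	while (i < nbits) {
		size_t first, last;
		int n;

		if (!set[i]) {
			i++;
			continue;
		}

		/* Extend the run while the next entry is also set. */
		first = i;
		while (i + 1 < nbits && set[i + 1])
			i++;
		last = i++;

		if (first == last)
			n = snprintf(buf + len, buflen - len, "%s%zu",
				     len ? "," : "", first);
		else
			n = snprintf(buf + len, buflen - len, "%s%zu-%zu",
				     len ? "," : "", first, last);
		if (n < 0)
			break;
		if ((size_t)n >= buflen - len) {
			len = buflen - 1;	/* ran out of room */
			break;
		}
		len += (size_t)n;
	}
	return len;
}
```

With entries 0-7 and 16-23 set and a large enough buffer, the result is the string "0-7,16-23", matching the first line of Mike's output.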
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:25 ` [patch] x86: reduce srat verbosity in the kernel log David Rientjes
2009-10-27 20:42 ` Mike Travis
@ 2009-10-27 20:55 ` Cyrill Gorcunov
2009-10-27 21:06 ` David Rientjes
2009-10-28 3:32 ` Andi Kleen
2 siblings, 1 reply; 109+ messages in thread
From: Cyrill Gorcunov @ 2009-10-27 20:55 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Mike Travis, Andi Kleen, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
[David Rientjes - Tue, Oct 27, 2009 at 01:25:51PM -0700]
| On Tue, 27 Oct 2009, Mike Travis wrote:
|
...
| +
| +void __init acpi_numa_print_srat_mapping(void)
| +{
| + int i, j;
| +
| + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
| + int nid;
| +
| + nid = pxm_to_node(i);
| + if (nid == NUMA_NO_NODE)
Btw, David, while you're at it, I'm just curious: shouldn't we test it
against NID_INVAL (which is what pxm_to_node_map is initially set to)?
Not a big deal at all (since they are both -1), but for the record.
Or perhaps I'm missing something?
| + continue;
...
-- Cyrill
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:55 ` Cyrill Gorcunov
@ 2009-10-27 21:06 ` David Rientjes
2009-10-27 21:10 ` Cyrill Gorcunov
0 siblings, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-10-27 21:06 UTC (permalink / raw)
To: Cyrill Gorcunov
Cc: Ingo Molnar, Mike Travis, Andi Kleen, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
On Tue, 27 Oct 2009, Cyrill Gorcunov wrote:
> | +void __init acpi_numa_print_srat_mapping(void)
> | +{
> | + int i, j;
> | +
> | + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
> | + int nid;
> | +
> | + nid = pxm_to_node(i);
> | + if (nid == NUMA_NO_NODE)
>
> Btw, David, while you're at it, I'm just curious: shouldn't we test it
> against NID_INVAL (which is what pxm_to_node_map is initially set to)?
> Not a big deal at all (since they are both -1), but for the record.
> Or perhaps I'm missing something?
>
I don't think we need to address that: NID_INVAL is going away and will
be replaced by NUMA_NO_NODE now that Lee has exposed it globally in his
mempolicy patchset, and as you mention they are the same anyway.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 21:06 ` David Rientjes
@ 2009-10-27 21:10 ` Cyrill Gorcunov
0 siblings, 0 replies; 109+ messages in thread
From: Cyrill Gorcunov @ 2009-10-27 21:10 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Mike Travis, Andi Kleen, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
[David Rientjes - Tue, Oct 27, 2009 at 02:06:21PM -0700]
| On Tue, 27 Oct 2009, Cyrill Gorcunov wrote:
|
| > | +void __init acpi_numa_print_srat_mapping(void)
| > | +{
| > | + int i, j;
| > | +
| > | + for (i = 0; i < MAX_PXM_DOMAINS; i++) {
| > | + int nid;
| > | +
| > | + nid = pxm_to_node(i);
| > | + if (nid == NUMA_NO_NODE)
| >
| > Btw, David, while you're at it, I'm just curious: shouldn't we test it
| > against NID_INVAL (which is what pxm_to_node_map is initially set to)?
| > Not a big deal at all (since they are both -1), but for the record.
| > Or perhaps I'm missing something?
| >
|
| I don't think we need to address that: NID_INVAL is going away and will
| be replaced by NUMA_NO_NODE now that Lee has exposed it globally in his
| mempolicy patchset, and as you mention they are the same anyway.
|
I see. Thanks!
-- Cyrill
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:48 ` David Rientjes
@ 2009-10-27 23:02 ` Mike Travis
2009-10-28 3:29 ` Andi Kleen
2009-10-28 3:53 ` Yinghai Lu
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-27 23:02 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
David Rientjes wrote:
> On Tue, 27 Oct 2009, Mike Travis wrote:
>
>> I applied your previous patch with the change to use static and
>> here's the console output from a live system:
>>
>>
>> [ 0.000000] SRAT: PXM 0 -> APIC {0-7,16-23} -> Node 0
>> [ 0.000000] SRAT: PXM 1 -> APIC {32-39,48-55} -> Node 1
>> [ 0.000000] SRAT: PXM 2 -> APIC {64-71,80-87} -> Node 2
>> [ 0.000000] SRAT: PXM 3 -> APIC {96-103,112-119} -> Node 3
>> [ 0.000000] SRAT: PXM 4 -> APIC {128-135,144-151} -> Node 4
>> [ 0.000000] SRAT: PXM 5 -> APIC {160-167,176-183} -> Node 5
>> [ 0.000000] SRAT: PXM 6 -> APIC {192-199,208-215} -> Node 6
>> [ 0.000000] SRAT: PXM 7 -> APIC {224-231,240-247} -> Node 7
>> [ 0.000000] SRAT: PXM 8 -> APIC {256-263,272-279} -> Node 8
>> [ 0.000000] SRAT: PXM 9 -> APIC {288-295,304-311} -> Node 9
>> [ 0.000000] SRAT: PXM 10 -> APIC {320-327,336-343} -> Node 10
>> [ 0.000000] SRAT: PXM 11 -> APIC {352-359,368-375} -> Node 11
>> [ 0.000000] SRAT: PXM 12 -> APIC {384-391,400-407} -> Node 12
>> [ 0.000000] SRAT: PXM 13 -> APIC {416-423,432-439} -> Node 13
>> [ 0.000000] SRAT: PXM 14 -> APIC {448-455,464-471} -> Node 14
>> [ 0.000000] SRAT: PXM 15 -> APIC {480-487,496-503} -> Node 15
>> [ 0.000000] SRAT: PXM 16 -> APIC {512-519,528-535} -> Node 16
>> [ 0.000000] SRAT: PXM 17 -> APIC {544-551,560-567} -> Node 17
>> [ 0.000000] SRAT: PXM 18 -> APIC {576-583,592-599} -> Node 18
>> [ 0.000000] SRAT: PXM 19 -> APIC {608-615,624-631} -> Node 19
>> [ 0.000000] SRAT: PXM 20 -> APIC {640-647,656-663} -> Node 20
>> [ 0.000000] SRAT: PXM 21 -> APIC {672-679,688-695} -> Node 21
>> [ 0.000000] SRAT: PXM 22 -> APIC {704-711,720-727} -> Node 22
>> [ 0.000000] SRAT: PXM 23 -> APIC {736-743,752-759} -> Node 23
>>
>
> Quite the system you have there :) What was once 760 lines has been
> reduced to 24 without removing any information.
>
> This seems to be the most we can reduce this particular output since we
> don't support mapping multiple pxms to a single node.
Yes, thanks very much for the optimization.
(And you can add my Acked-by or whatever you need.)
Tomorrow I will have more time on the system and will try out all the
new changes together, mostly with summarizing the Processor stats.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 23:02 ` Mike Travis
@ 2009-10-28 3:29 ` Andi Kleen
2009-10-28 4:08 ` David Rientjes
0 siblings, 1 reply; 109+ messages in thread
From: Andi Kleen @ 2009-10-28 3:29 UTC (permalink / raw)
To: Mike Travis
Cc: David Rientjes, Ingo Molnar, Andi Kleen, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
>> Quite the system you have there :) What was once 760 lines has been
>> reduced to 24 without removing any information.
>>
>> This seems to be the most we can reduce this particular output since we
>> don't support mapping multiple pxms to a single node.
>
> Yes, thanks very much for the optimization.
>
> (And you can add my Acked-by or whatever you need.)
Looks good to me too, thanks. Also Acked-by.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:25 ` [patch] x86: reduce srat verbosity in the kernel log David Rientjes
2009-10-27 20:42 ` Mike Travis
2009-10-27 20:55 ` Cyrill Gorcunov
@ 2009-10-28 3:32 ` Andi Kleen
2009-10-28 4:08 ` David Rientjes
2 siblings, 1 reply; 109+ messages in thread
From: Andi Kleen @ 2009-10-28 3:32 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Mike Travis, Andi Kleen, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
> +static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
> +static char apicid_list[MAX_LOCAL_APIC] __initdata;
Is MAX_LOCAL_APIC really big enough to print them all in ASCII?
It would be better to not use that large a buffer, but print
in smaller pieces (I realize this would enlarge your patch,
but then it would also save a lot of BSS)
-Andi
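
A rough sizing of the two static objects, as a userspace sketch. It assumes MAX_LOCAL_APIC is 32768, which is the value implied by the "4k bytes" bitmap estimate earlier in the thread; check the definition in the tree you build. The helper names are hypothetical, and the alternating-id layout is only one fragmented pattern chosen for illustration, not a realistic machine and not necessarily the longest possible list.

```c
#include <stddef.h>

/* Assumed value, see the note above. */
#define SKETCH_MAX_LOCAL_APIC 32768

/* Bytes needed to hold a bitmap of nbits bits. */
static size_t bitmap_bytes(size_t nbits)
{
	return (nbits + 7) / 8;
}

/* Number of decimal digits needed to print v. */
static size_t decimal_digits(size_t v)
{
	size_t n = 1;

	while (v >= 10) {
		v /= 10;
		n++;
	}
	return n;
}

/*
 * Length of the list "0,2,4,..." when every other id below nbits is
 * set, so that no two ids merge into a range. Each entry costs its
 * digits plus one comma, except the first, which has no comma.
 */
static size_t alternating_list_len(size_t nbits)
{
	size_t len = 0;

	for (size_t id = 0; id < nbits; id += 2)
		len += decimal_digits(id) + (id ? 1 : 0);
	return len;
}
```

Under that assumption the bitmap takes bitmap_bytes(32768) = 4096 bytes and apicid_list takes 32768 bytes, while the alternating pattern would need 92748 characters. So the buffer is both much larger than typical output needs and still too small for a sufficiently fragmented mapping.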
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-27 20:48 ` David Rientjes
2009-10-27 23:02 ` Mike Travis
@ 2009-10-28 3:53 ` Yinghai Lu
2009-10-28 4:08 ` David Rientjes
1 sibling, 1 reply; 109+ messages in thread
From: Yinghai Lu @ 2009-10-28 3:53 UTC (permalink / raw)
To: David Rientjes
Cc: Mike Travis, Ingo Molnar, Andi Kleen, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Mel Gorman,
linux-kernel, linux-acpi
David Rientjes wrote:
> On Tue, 27 Oct 2009, Mike Travis wrote:
>
>> I applied your previous patch with the change to use static and
>> here's the console output from a live system:
>>
>>
>> [ 0.000000] SRAT: PXM 0 -> APIC {0-7,16-23} -> Node 0
>> [ 0.000000] SRAT: PXM 1 -> APIC {32-39,48-55} -> Node 1
>> [ 0.000000] SRAT: PXM 2 -> APIC {64-71,80-87} -> Node 2
>> [ 0.000000] SRAT: PXM 3 -> APIC {96-103,112-119} -> Node 3
>> [ 0.000000] SRAT: PXM 4 -> APIC {128-135,144-151} -> Node 4
>> [ 0.000000] SRAT: PXM 5 -> APIC {160-167,176-183} -> Node 5
>> [ 0.000000] SRAT: PXM 6 -> APIC {192-199,208-215} -> Node 6
>> [ 0.000000] SRAT: PXM 7 -> APIC {224-231,240-247} -> Node 7
>> [ 0.000000] SRAT: PXM 8 -> APIC {256-263,272-279} -> Node 8
>> [ 0.000000] SRAT: PXM 9 -> APIC {288-295,304-311} -> Node 9
>> [ 0.000000] SRAT: PXM 10 -> APIC {320-327,336-343} -> Node 10
>> [ 0.000000] SRAT: PXM 11 -> APIC {352-359,368-375} -> Node 11
>> [ 0.000000] SRAT: PXM 12 -> APIC {384-391,400-407} -> Node 12
>> [ 0.000000] SRAT: PXM 13 -> APIC {416-423,432-439} -> Node 13
>> [ 0.000000] SRAT: PXM 14 -> APIC {448-455,464-471} -> Node 14
>> [ 0.000000] SRAT: PXM 15 -> APIC {480-487,496-503} -> Node 15
>> [ 0.000000] SRAT: PXM 16 -> APIC {512-519,528-535} -> Node 16
>> [ 0.000000] SRAT: PXM 17 -> APIC {544-551,560-567} -> Node 17
>> [ 0.000000] SRAT: PXM 18 -> APIC {576-583,592-599} -> Node 18
>> [ 0.000000] SRAT: PXM 19 -> APIC {608-615,624-631} -> Node 19
>> [ 0.000000] SRAT: PXM 20 -> APIC {640-647,656-663} -> Node 20
>> [ 0.000000] SRAT: PXM 21 -> APIC {672-679,688-695} -> Node 21
>> [ 0.000000] SRAT: PXM 22 -> APIC {704-711,720-727} -> Node 22
>> [ 0.000000] SRAT: PXM 23 -> APIC {736-743,752-759} -> Node 23
>>
>
> Quite the system you have there :) What was once 760 lines has been
> reduced to 24 without removing any information.
>
Can you change the APIC ids to be printed in hex?
YH
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 3:32 ` Andi Kleen
@ 2009-10-28 4:08 ` David Rientjes
2009-10-28 4:11 ` Andi Kleen
0 siblings, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-10-28 4:08 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Mike Travis, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Wed, 28 Oct 2009, Andi Kleen wrote:
> > +static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
> > +static char apicid_list[MAX_LOCAL_APIC] __initdata;
>
> Is MAX_LOCAL_APIC really big enough to print them all in ASCII?
>
> It would be better to not use that large a buffer, but print
> in smaller pieces (I realize this would enlarge your patch,
> but then it would also save a lot of BSS)
>
MAX_LOCAL_APIC was definitely an arbitrary choice here and has very little
relevance. scnlistprintf will protect against overflow, but we still need
to decide upon a constant that will emit the most information possible
without overly polluting the printk output, while saving on bss as you
mentioned. I suspect we could agree on a value as small as 128 and it
would work for the overwhelming majority (all?) of users.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 3:29 ` Andi Kleen
@ 2009-10-28 4:08 ` David Rientjes
0 siblings, 0 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-28 4:08 UTC (permalink / raw)
To: Andi Kleen
Cc: Mike Travis, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Wed, 28 Oct 2009, Andi Kleen wrote:
> >> Quite the system you have there :) What was once 760 lines has been
> >> reduced to 24 without removing any information.
> >>
> >> This seems to be the most we can reduce this particular output since we
> >> don't support mapping multiple pxms to a single node.
> >
> > Yes, thanks very much for the optimization.
> >
> > (And you can add my Acked-by or whatever you need.)
>
> Looks good to me too, thanks. Also Acked-by.
>
Thanks Andi. I'm hoping Ingo will pick this up and not have a problem
with the use of NUMA_NO_NODE vs. NID_INVAL, since there's a patch pending
in -mm that removes NID_INVAL and this saves Linus a build error when he
pushes for 2.6.33 (and they are both defined the same anyway).
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 3:53 ` Yinghai Lu
@ 2009-10-28 4:08 ` David Rientjes
0 siblings, 0 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-28 4:08 UTC (permalink / raw)
To: Yinghai Lu
Cc: Mike Travis, Ingo Molnar, Andi Kleen, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Mel Gorman,
linux-kernel, linux-acpi
On Tue, 27 Oct 2009, Yinghai Lu wrote:
> Can you change the APIC ids to be printed in hex?
>
That would be an extension made on top of my patch (which may be difficult
without adding an additional library function to be used in this case
since it relies on bitmap_scnlistprintf()). It's been printed as an
unsigned int for well over four years so I don't see any specific urgency,
anyway.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 4:08 ` David Rientjes
@ 2009-10-28 4:11 ` Andi Kleen
2009-10-28 4:53 ` [patch v2] " David Rientjes
2009-10-28 17:02 ` [patch] " Mike Travis
0 siblings, 2 replies; 109+ messages in thread
From: Andi Kleen @ 2009-10-28 4:11 UTC (permalink / raw)
To: David Rientjes
Cc: Andi Kleen, Ingo Molnar, Mike Travis, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
> >
>
> MAX_LOCAL_APIC was definitely an arbitrary choice here and has very little
> relevance. scnlistprintf will protect against overflow, but we still need
> to decide upon a constant that will emit the most information possible
> without overly polluting the printk output, while saving on bss as you
> mentioned. I suspect we could agree on a value as small as 128 and it
> would work for the overwhelming majority (all?) of users.
For now at least it seems reasonable to limit it to 128 or so, yes (and
go back to the stack). If we ever have sparse apic ids for nodes then
that might change, but in that case one could still just run acpidump
or teach the printer to be more clever and support strides.
It would just be good to have some indication in the output
if there was an overflow.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* [patch v2] x86: reduce srat verbosity in the kernel log
2009-10-28 4:11 ` Andi Kleen
@ 2009-10-28 4:53 ` David Rientjes
2009-10-28 5:19 ` Andi Kleen
2009-11-10 21:08 ` David Rientjes
2009-10-28 17:02 ` [patch] " Mike Travis
1 sibling, 2 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-28 4:53 UTC (permalink / raw)
To: Andi Kleen, Ingo Molnar
Cc: Mike Travis, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi
On Wed, 28 Oct 2009, Andi Kleen wrote:
> For now at least it seems reasonable to limit it to 128 or so, yes (and
> go back to the stack). If we ever have sparse apic ids for nodes then
> that might change, but in that case one could still just run acpidump
> or teach the printer to be more clever and support strides.
>
It'll support sparse apicids, which was shown in Mike's example, although
it only becomes a problem when they cannot be represented in a 128
character buffer. I doubt there are many machines where that happens
given the way they are formed.
> It would just be good to have some indication in the output
> if there was an overflow.
>
Agreed, a trailing "..." would be appropriate.
x86: reduce srat verbosity in the kernel log
It's possible to reduce the number of SRAT messages emitted to the kernel
log by printing each valid pxm once and then creating bitmaps to represent
the apic ids that map to the same node.
This reduces lines such as
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
to
SRAT: PXM 0 -> APIC {0-1} -> Node 0
SRAT: PXM 1 -> APIC {2-3} -> Node 1
The buffer used to store the apic id list is 128 characters in length.
If that is too small to represent all the apic id ranges that are bound
to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
manually increased for such configurations.
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
arch/x86/mm/srat_64.c | 41 +++++++++++++++++++++++++++++++++++++----
drivers/acpi/numa.c | 5 +++++
include/linux/acpi.h | 3 ++-
3 files changed, 44 insertions(+), 5 deletions(-)
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -36,6 +36,9 @@ static int num_node_memblks __initdata;
static struct bootnode node_memblk_range[NR_NODE_MEMBLKS] __initdata;
static int memblk_nodeid[NR_NODE_MEMBLKS] __initdata;
+static DECLARE_BITMAP(apicid_map, MAX_LOCAL_APIC) __initdata;
+#define APICID_LIST_LEN (128)
+
static __init int setup_node(int pxm)
{
return acpi_map_pxm_to_node(pxm);
@@ -136,8 +139,6 @@ acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
}
/* Callback for Proximity Domain -> LAPIC mapping */
@@ -170,8 +171,40 @@ acpi_numa_processor_affinity_init(struct acpi_srat_cpu_affinity *pa)
apicid_to_node[apic_id] = node;
node_set(node, cpu_nodes_parsed);
acpi_numa = 1;
- printk(KERN_INFO "SRAT: PXM %u -> APIC %u -> Node %u\n",
- pxm, apic_id, node);
+}
+
+void __init acpi_numa_print_srat_mapping(void)
+{
+ char apicid_list[APICID_LIST_LEN];
+ int i, j;
+
+ for (i = 0; i < MAX_PXM_DOMAINS; i++) {
+ int len;
+ int nid;
+
+ nid = pxm_to_node(i);
+ if (nid == NUMA_NO_NODE)
+ continue;
+
+ bitmap_zero(apicid_map, MAX_LOCAL_APIC);
+ for (j = 0; j < MAX_LOCAL_APIC; j++)
+ if (apicid_to_node[j] == nid)
+ set_bit(j, apicid_map);
+
+ if (bitmap_empty(apicid_map, MAX_LOCAL_APIC))
+ continue;
+
+ /*
+ * If the bitmap cannot be listed in a buffer of length
+ * APICID_LIST_LEN, then it is suffixed with "...".
+ */
+ len = bitmap_scnlistprintf(apicid_list, APICID_LIST_LEN,
+ apicid_map, MAX_LOCAL_APIC);
+ pr_info("SRAT: PXM %u -> APIC {%s%s} -> Node %u\n",
+ i, apicid_list,
+ (len == APICID_LIST_LEN - 1) ? "..." : "",
+ nid);
+ }
}
#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -281,6 +281,10 @@ acpi_table_parse_srat(enum acpi_srat_type id,
handler, max_entries);
}
+void __init __attribute__((weak)) acpi_numa_print_srat_mapping(void)
+{
+}
+
int __init acpi_numa_init(void)
{
/* SRAT: Static Resource Affinity Table */
@@ -292,6 +296,7 @@ int __init acpi_numa_init(void)
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
acpi_parse_memory_affinity,
NR_NODE_MEMBLKS);
+ acpi_numa_print_srat_mapping();
}
/* SLIT: System Locality Information Table */
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -92,12 +92,13 @@ int acpi_table_parse_madt (enum acpi_madt_type id, acpi_table_entry_handler hand
int acpi_parse_mcfg (struct acpi_table_header *header);
void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
-/* the following four functions are architecture-dependent */
+/* the following six functions are architecture-dependent */
void acpi_numa_slit_init (struct acpi_table_slit *slit);
void acpi_numa_processor_affinity_init (struct acpi_srat_cpu_affinity *pa);
void acpi_numa_x2apic_affinity_init(struct acpi_srat_x2apic_cpu_affinity *pa);
void acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
void acpi_numa_arch_fixup(void);
+void acpi_numa_print_srat_mapping(void);
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-10-28 4:53 ` [patch v2] " David Rientjes
@ 2009-10-28 5:19 ` Andi Kleen
2009-10-28 5:24 ` David Rientjes
2009-11-10 21:08 ` David Rientjes
1 sibling, 1 reply; 109+ messages in thread
From: Andi Kleen @ 2009-10-28 5:19 UTC (permalink / raw)
To: David Rientjes
Cc: Andi Kleen, Ingo Molnar, Mike Travis, Thomas Gleixner,
Andrew Morton, Jack Steiner, H. Peter Anvin, x86, Yinghai Lu,
Mel Gorman, linux-kernel, linux-acpi
> + /*
> + * If the bitmap cannot be listed in a buffer of length
> + * APICID_LIST_LEN, then it is suffixed with "...".
> + */
> + len = bitmap_scnlistprintf(apicid_list, APICID_LIST_LEN,
> + apicid_map, MAX_LOCAL_APIC);
> + pr_info("SRAT: PXM %u -> APIC {%s%s} -> Node %u\n",
> + i, apicid_list,
> + (len == APICID_LIST_LEN - 1) ? "..." : "",
Is the - 1 really correct? If scnlistprintf follows snprintf semantics then it would not
be and my understanding is it is supposed to.
Other than that it looks good.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-10-28 5:19 ` Andi Kleen
@ 2009-10-28 5:24 ` David Rientjes
0 siblings, 0 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-28 5:24 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Mike Travis, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Wed, 28 Oct 2009, Andi Kleen wrote:
> > + /*
> > + * If the bitmap cannot be listed in a buffer of length
> > + * APICID_LIST_LEN, then it is suffixed with "...".
> > + */
> > + len = bitmap_scnlistprintf(apicid_list, APICID_LIST_LEN,
> > + apicid_map, MAX_LOCAL_APIC);
> > + pr_info("SRAT: PXM %u -> APIC {%s%s} -> Node %u\n",
> > + i, apicid_list,
> > + (len == APICID_LIST_LEN - 1) ? "..." : "",
>
> Is the - 1 really correct? If scnlistprintf follows snprintf semantics then it would not
> be and my understanding is it is supposed to.
>
It is: bitmap_scnlistprintf() returns the number of characters actually
written to the buffer, not counting the trailing '\0', which is different
from snprintf(). A return value of APICID_LIST_LEN-1 then identifies when
the buffer was maxed out. It still adds a trailing "..." if the list
exactly fills the buffer without being truncated, but this isn't addressed
to avoid added complexity.
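
To illustrate the return-value difference discussed above, here is a minimal
user-space sketch. scn_printf() is a hypothetical stand-in that mimics the
kernel helper's "characters actually stored" semantics on top of vsnprintf();
LIST_LEN and looks_truncated() are likewise made up for the example and are
not part of the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define LIST_LEN 16	/* hypothetical; stands in for APICID_LIST_LEN */

/*
 * User-space stand-in for the kernel's scnprintf(): returns the number of
 * characters actually stored in buf, not counting the trailing '\0'.
 */
int scn_printf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && fmt != NULL);

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);	/* would-be length, as snprintf() */
	va_end(ap);

	if (n < 0 || size == 0)
		return 0;
	if ((size_t)n < size)
		return n;
	return (int)(size - 1);			/* output was cut off */
}

/* Same test as the patch: a completely full buffer is treated as truncated. */
int looks_truncated(const char *list)
{
	char buf[LIST_LEN];

	return scn_printf(buf, sizeof(buf), "%s", list) == LIST_LEN - 1;
}
```

With a 16-byte buffer, a 15-character list such as "0-7,16-23,32-39" fits
without loss yet is still flagged, which is the corner case left unaddressed.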
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 4:11 ` Andi Kleen
2009-10-28 4:53 ` [patch v2] " David Rientjes
@ 2009-10-28 17:02 ` Mike Travis
2009-10-28 20:52 ` David Rientjes
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-28 17:02 UTC (permalink / raw)
To: Andi Kleen
Cc: David Rientjes, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
Andi Kleen wrote:
>> MAX_LOCAL_APIC was definitely an arbitrary choice here and has very little
>> relevance. scnlistprintf will protect against overflow, but we still need
>> to decide upon a constant that will emit the most information possible
>> while not overly polluting the printk and saving on bss, as you mentioned.
>> I suspect we could agree on a value as little as 128 and it would work for
>> the overwhelming majority (all?) of users.
>
> For now at least it seems reasonable to limit to 128 or so, yes (and go
> back to the stack). If we ever have sparse apic ids for nodes
> then that might change; but in this case we could still just do
> an acpidump or teach the printer to be more clever and support
> strides.
>
> It would just be good to have some indication in the output
> if there was an overflow.
>
> -Andi
>
I don't understand the importance of this when the memory is given back
after the system starts up anyway...?
Thanks,
Mike
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 17:02 ` [patch] " Mike Travis
@ 2009-10-28 20:52 ` David Rientjes
2009-10-28 21:03 ` Mike Travis
2009-10-28 21:35 ` Mike Travis
0 siblings, 2 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-28 20:52 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Wed, 28 Oct 2009, Mike Travis wrote:
> I don't understand the importance of this when the memory is given back
> after the system starts up anyway...?
>
Printing a list of apic ids longer than 128 characters would pollute the
kernel log and this upper bound will probably never be reached based on
the way apic ids are created for physical and logical processors: they are
normally reduced to ranges instead of comma-separated entities.
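
As a rough illustration of why contiguous ids stay short in list form, here
is a user-space sketch of a range-list formatter. It is not the kernel's
bitmap_scnlistprintf(); format_range_list(), two_runs() and MAX_IDS are
hypothetical names, and the map uses one byte per id for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_IDS 2048	/* hypothetical upper bound on APIC ids */

/*
 * Write the set entries of map[0..nbits) as a range list such as
 * "0-7,16-23". Output is truncated to fit; returns the number of
 * characters stored, not counting the trailing '\0'.
 */
int format_range_list(char *buf, size_t size, const unsigned char *map,
		      int nbits)
{
	size_t len = 0;
	int i = 0;

	assert(size > 0);
	buf[0] = '\0';

	while (i < nbits) {
		int start, n;

		if (!map[i]) {
			i++;
			continue;
		}
		start = i;
		while (i + 1 < nbits && map[i + 1])
			i++;	/* extend the run while ids stay contiguous */

		if (start == i)
			n = snprintf(buf + len, size - len, "%s%d",
				     len ? "," : "", start);
		else
			n = snprintf(buf + len, size - len, "%s%d-%d",
				     len ? "," : "", start, i);
		i++;

		if (n < 0)
			break;
		if ((size_t)n >= size - len) {
			len = size - 1;	/* ran out of room */
			break;
		}
		len += (size_t)n;
	}
	return (int)len;
}

/* Build a map with two runs of ids and format it into a static buffer. */
const char *two_runs(int lo1, int hi1, int lo2, int hi2)
{
	static unsigned char map[MAX_IDS];
	static char buf[128];

	assert(0 <= lo1 && lo1 <= hi1 && hi1 < MAX_IDS);
	assert(0 <= lo2 && lo2 <= hi2 && hi2 < MAX_IDS);

	memset(map, 0, sizeof(map));
	memset(map + lo1, 1, (size_t)(hi1 - lo1 + 1));
	memset(map + lo2, 1, (size_t)(hi2 - lo2 + 1));
	format_range_list(buf, sizeof(buf), map, MAX_IDS);
	return buf;
}
```

Sixteen ids in two runs cost only nine characters here; the cost grows with
the number of distinct runs, not with the number of ids.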
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 20:52 ` David Rientjes
@ 2009-10-28 21:03 ` Mike Travis
2009-10-28 21:06 ` David Rientjes
2009-10-28 21:35 ` Mike Travis
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-28 21:03 UTC (permalink / raw)
To: David Rientjes
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
David Rientjes wrote:
> On Wed, 28 Oct 2009, Mike Travis wrote:
>
>> I don't understand the importance of this when the memory is given back
>> after the system starts up anyway...?
>>
>
> Printing a list of apic ids longer than 128 characters would pollute the
> kernel log and this upper bound will probably never be reached based on
> the way apic ids are created for physical and logical processors: they are
> normally reduced to ranges instead of comma-separated entities.
Your latest patch tested:
[ 0.000000] SRAT: PXM 0 -> APIC {0-7,16-23} -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC {32-39,48-55} -> Node 1
[ 0.000000] SRAT: PXM 2 -> APIC {64-71,80-87} -> Node 2
[ 0.000000] SRAT: PXM 3 -> APIC {96-103,112-119} -> Node 3
[ 0.000000] SRAT: PXM 4 -> APIC {128-135,144-151} -> Node 4
[ 0.000000] SRAT: PXM 5 -> APIC {160-167,176-183} -> Node 5
[ 0.000000] SRAT: PXM 6 -> APIC {192-199,208-215} -> Node 6
[ 0.000000] SRAT: PXM 7 -> APIC {224-231,240-247} -> Node 7
[ 0.000000] SRAT: PXM 8 -> APIC {256-263,272-279} -> Node 8
[ 0.000000] SRAT: PXM 9 -> APIC {288-295,304-311} -> Node 9
[ 0.000000] SRAT: PXM 10 -> APIC {320-327,336-343} -> Node 10
[ 0.000000] SRAT: PXM 11 -> APIC {352-359,368-375} -> Node 11
[ 0.000000] SRAT: PXM 12 -> APIC {384-391,400-407} -> Node 12
[ 0.000000] SRAT: PXM 13 -> APIC {416-423,432-439} -> Node 13
[ 0.000000] SRAT: PXM 14 -> APIC {448-455,464-471} -> Node 14
[ 0.000000] SRAT: PXM 15 -> APIC {480-487,496-503} -> Node 15
[ 0.000000] SRAT: PXM 16 -> APIC {512-519,528-535} -> Node 16
[ 0.000000] SRAT: PXM 17 -> APIC {544-551,560-567} -> Node 17
[ 0.000000] SRAT: PXM 18 -> APIC {576-583,592-599} -> Node 18
[ 0.000000] SRAT: PXM 19 -> APIC {608-615,624-631} -> Node 19
[ 0.000000] SRAT: PXM 20 -> APIC {640-647,656-663} -> Node 20
[ 0.000000] SRAT: PXM 21 -> APIC {672-679,688-695} -> Node 21
[ 0.000000] SRAT: PXM 22 -> APIC {704-711,720-727} -> Node 22
[ 0.000000] SRAT: PXM 23 -> APIC {736-743,752-759} -> Node 23
[ 0.000000] SRAT: PXM 24 -> APIC {768-775,784-791} -> Node 24
[ 0.000000] SRAT: PXM 25 -> APIC {800-807,816-823} -> Node 25
[ 0.000000] SRAT: PXM 26 -> APIC {832-839,848-855} -> Node 26
[ 0.000000] SRAT: PXM 27 -> APIC {864-871,880-887} -> Node 27
[ 0.000000] SRAT: PXM 28 -> APIC {896-903,912-919} -> Node 28
[ 0.000000] SRAT: PXM 29 -> APIC {928-935,944-951} -> Node 29
[ 0.000000] SRAT: PXM 30 -> APIC {960-967,976-983} -> Node 30
[ 0.000000] SRAT: PXM 31 -> APIC {992-999,1008-1015} -> Node 31
[ 0.000000] SRAT: PXM 32 -> APIC {1024-1031,1040-1047} -> Node 32
[ 0.000000] SRAT: PXM 33 -> APIC {1056-1063,1072-1079} -> Node 33
[ 0.000000] SRAT: PXM 34 -> APIC {1088-1095,1104-1111} -> Node 34
[ 0.000000] SRAT: PXM 35 -> APIC {1120-1127,1136-1143} -> Node 35
[ 0.000000] SRAT: PXM 36 -> APIC {1152-1159,1168-1175} -> Node 36
[ 0.000000] SRAT: PXM 37 -> APIC {1184-1191,1200-1207} -> Node 37
[ 0.000000] SRAT: PXM 38 -> APIC {1216-1223,1232-1239} -> Node 38
[ 0.000000] SRAT: PXM 39 -> APIC {1248-1255,1264-1271} -> Node 39
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 21:03 ` Mike Travis
@ 2009-10-28 21:06 ` David Rientjes
0 siblings, 0 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-28 21:06 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Wed, 28 Oct 2009, Mike Travis wrote:
> Your latest patch tested:
>
> [ 0.000000] SRAT: PXM 0 -> APIC {0-7,16-23} -> Node 0
> [ 0.000000] SRAT: PXM 1 -> APIC {32-39,48-55} -> Node 1
> [ 0.000000] SRAT: PXM 2 -> APIC {64-71,80-87} -> Node 2
> [ 0.000000] SRAT: PXM 3 -> APIC {96-103,112-119} -> Node 3
> [ 0.000000] SRAT: PXM 4 -> APIC {128-135,144-151} -> Node 4
> [ 0.000000] SRAT: PXM 5 -> APIC {160-167,176-183} -> Node 5
> [ 0.000000] SRAT: PXM 6 -> APIC {192-199,208-215} -> Node 6
> [ 0.000000] SRAT: PXM 7 -> APIC {224-231,240-247} -> Node 7
> [ 0.000000] SRAT: PXM 8 -> APIC {256-263,272-279} -> Node 8
> [ 0.000000] SRAT: PXM 9 -> APIC {288-295,304-311} -> Node 9
> [ 0.000000] SRAT: PXM 10 -> APIC {320-327,336-343} -> Node 10
> [ 0.000000] SRAT: PXM 11 -> APIC {352-359,368-375} -> Node 11
> [ 0.000000] SRAT: PXM 12 -> APIC {384-391,400-407} -> Node 12
> [ 0.000000] SRAT: PXM 13 -> APIC {416-423,432-439} -> Node 13
> [ 0.000000] SRAT: PXM 14 -> APIC {448-455,464-471} -> Node 14
> [ 0.000000] SRAT: PXM 15 -> APIC {480-487,496-503} -> Node 15
> [ 0.000000] SRAT: PXM 16 -> APIC {512-519,528-535} -> Node 16
> [ 0.000000] SRAT: PXM 17 -> APIC {544-551,560-567} -> Node 17
> [ 0.000000] SRAT: PXM 18 -> APIC {576-583,592-599} -> Node 18
> [ 0.000000] SRAT: PXM 19 -> APIC {608-615,624-631} -> Node 19
> [ 0.000000] SRAT: PXM 20 -> APIC {640-647,656-663} -> Node 20
> [ 0.000000] SRAT: PXM 21 -> APIC {672-679,688-695} -> Node 21
> [ 0.000000] SRAT: PXM 22 -> APIC {704-711,720-727} -> Node 22
> [ 0.000000] SRAT: PXM 23 -> APIC {736-743,752-759} -> Node 23
> [ 0.000000] SRAT: PXM 24 -> APIC {768-775,784-791} -> Node 24
> [ 0.000000] SRAT: PXM 25 -> APIC {800-807,816-823} -> Node 25
> [ 0.000000] SRAT: PXM 26 -> APIC {832-839,848-855} -> Node 26
> [ 0.000000] SRAT: PXM 27 -> APIC {864-871,880-887} -> Node 27
> [ 0.000000] SRAT: PXM 28 -> APIC {896-903,912-919} -> Node 28
> [ 0.000000] SRAT: PXM 29 -> APIC {928-935,944-951} -> Node 29
> [ 0.000000] SRAT: PXM 30 -> APIC {960-967,976-983} -> Node 30
> [ 0.000000] SRAT: PXM 31 -> APIC {992-999,1008-1015} -> Node 31
> [ 0.000000] SRAT: PXM 32 -> APIC {1024-1031,1040-1047} -> Node 32
> [ 0.000000] SRAT: PXM 33 -> APIC {1056-1063,1072-1079} -> Node 33
> [ 0.000000] SRAT: PXM 34 -> APIC {1088-1095,1104-1111} -> Node 34
> [ 0.000000] SRAT: PXM 35 -> APIC {1120-1127,1136-1143} -> Node 35
> [ 0.000000] SRAT: PXM 36 -> APIC {1152-1159,1168-1175} -> Node 36
> [ 0.000000] SRAT: PXM 37 -> APIC {1184-1191,1200-1207} -> Node 37
> [ 0.000000] SRAT: PXM 38 -> APIC {1216-1223,1232-1239} -> Node 38
> [ 0.000000] SRAT: PXM 39 -> APIC {1248-1255,1264-1271} -> Node 39
Looks good, 1272 lines reduced to 40.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 20:52 ` David Rientjes
2009-10-28 21:03 ` Mike Travis
@ 2009-10-28 21:35 ` Mike Travis
2009-10-28 21:46 ` David Rientjes
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-28 21:35 UTC (permalink / raw)
To: David Rientjes
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
David Rientjes wrote:
> On Wed, 28 Oct 2009, Mike Travis wrote:
>
>> I don't understand the importance of this when the memory is given back
>> after the system starts up anyway...?
>>
>
> Printing a list of apic ids longer than 128 characters would pollute the
> kernel log and this upper bound will probably never be reached based on
> the way apic ids are created for physical and logical processors: they are
> normally reduced to ranges instead of comma-separated entities.
Ahh, ok, thanks.
Does that mean this 10,649 character line full of periods is illegal?
[ 102.551570] Completing Region/Field/Buffer/Package initialization:
............... [long time later] .........
<4>Clocksource tsc unstable (delta = 4396383657849 ns)
I'm having trouble finding it. Does it look familiar to anyone?
Thanks,
Mike
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 21:35 ` Mike Travis
@ 2009-10-28 21:46 ` David Rientjes
2009-10-28 22:36 ` Mike Travis
0 siblings, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-10-28 21:46 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Wed, 28 Oct 2009, Mike Travis wrote:
> > Printing a list of apic ids longer than 128 characters would pollute the
> > kernel log and this upper bound will probably never be reached based on the
> > way apic ids are created for physical and logical processors: they are
> > normally reduced to ranges instead of comma-separated entities.
>
> Ahh, ok, thanks.
>
> Does that mean this 10,649 character line full of periods is illegal?
>
I'm not saying it would be illegal, merely that it would harm
readability. Based on how apic id's are formed from processor ids,
though, I think we're really talking about an upper limit (128) that will
never be reached.
> [ 102.551570] Completing Region/Field/Buffer/Package initialization:
> ............... [long time later] .........
> <4>Clocksource tsc unstable (delta = 4396383657849 ns)
>
> I'm having trouble finding it. Does it look familiar to anyone?
>
It's debugging output from acpi_ns_initialize_objects() and each period is
from acpi_ns_init_one_device(). You can suppress it by disabling
CONFIG_ACPI_DEBUG.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 21:46 ` David Rientjes
@ 2009-10-28 22:36 ` Mike Travis
2009-10-29 8:21 ` David Rientjes
0 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-28 22:36 UTC (permalink / raw)
To: David Rientjes
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
David Rientjes wrote:
> On Wed, 28 Oct 2009, Mike Travis wrote:
>
>>> Printing a list of apic ids longer than 128 characters would pollute the
>>> kernel log and this upper bound will probably never be reached based on the
>>> way apic ids are created for physical and logical processors: they are
>>> normally reduced to ranges instead of comma-separated entities.
>> Ahh, ok, thanks.
>>
>> Does that mean this 10,649 character line full of periods is illegal?
>>
>
> I'm not saying it would be illegal, merely that it would harm
> readability. Based on how apic id's are formed from processor ids,
> though, I think we're really talking about an upper limit (128) that will
> never be reached.
We actually have many, many more than that by adding on some extra bits
to the CPU's apicid. These select which blade in the system to target.
>
>> [ 102.551570] Completing Region/Field/Buffer/Package initialization:
>> ............... [long time later] .........
>> <4>Clocksource tsc unstable (delta = 4396383657849 ns)
>>
>> I'm having trouble finding it. Does it look familiar to anyone?
>>
>
> It's debugging output from acpi_ns_initialize_objects() and each period is
> from acpi_ns_init_one_device(). You can suppress it by disabling
> CONFIG_ACPI_DEBUG.
Ahh, didn't know that was set in the (our) default config. Is it normally
set by distros?
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-28 22:36 ` Mike Travis
@ 2009-10-29 8:21 ` David Rientjes
2009-10-29 16:34 ` Mike Travis
0 siblings, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-10-29 8:21 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Wed, 28 Oct 2009, Mike Travis wrote:
> > I'm not saying it would be illegal, merely that it would harm
> > readability. Based on how apic id's are formed from processor ids, though,
> > I think we're really talking about an upper limit (128) that will never be
> > reached.
>
> We actually have many, many more than that by adding on some extra bits
> to the CPU's apicid. These select which blade in the system to target.
>
Maybe I've been vague in my rationale for why this limit will probably
never be reached. The way apic ids are constructed, with physical and
logical processor ids, it tends to lend itself to ranges where
bitmap_scnlistprintf() can specify a large number of apic ids with
relatively few ASCII characters because logical processors typically do
not have differing pxms. For us to reach the 128 character upper bound,
scnlistprintf() would need to have many, many distinct ranges; your
example showed two ranges per pxm (many more machines would have only a
single range). In other words, we don't expect to have
"1-2,4-6,8-9,11-13,15-17," etc, that we often have with nodemasks.
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-29 8:21 ` David Rientjes
@ 2009-10-29 16:34 ` Mike Travis
2009-10-29 19:06 ` David Rientjes
0 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-29 16:34 UTC (permalink / raw)
To: David Rientjes
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
David Rientjes wrote:
> On Wed, 28 Oct 2009, Mike Travis wrote:
>
>>> I'm not saying it would be illegal, merely that it would harm
>>> readability. Based on how apic id's are formed from processor ids, though,
>>> I think we're really talking about an upper limit (128) that will never be
>>> reached.
>> We actually have many, many more than that by adding on some extra bits
>> to the CPU's apicid. These select which blade in the system to target.
>>
>
> Maybe I've been vague in my rationale for why this limit will probably
> never be reached. The way apic ids are constructed, with physical and
> logical processor ids, it tends to lend itself to ranges where
> bitmap_scnlistprintf() can specify a large number of apic ids with
> relatively few ASCII characters because logical processors typically do
> not have differing pxms. For us to reach the 128 character upper bound,
> scnlistprintf() would need to have many, many distinct ranges; your
> example showed two ranges per pxm (many more machines would have only a
> single range). In other words, we don't expect to have
> "1-2,4-6,8-9,11-13,15-17," etc, that we often have with nodemasks.
Yes, you are correct. (I was confused... ;-)
I believe the disjointed ranges came from the hyperthread cpus..? Which if
true means there'll probably be as many distinct ranges as there are threads
per core?
* Re: [patch] x86: reduce srat verbosity in the kernel log
2009-10-29 16:34 ` Mike Travis
@ 2009-10-29 19:06 ` David Rientjes
0 siblings, 0 replies; 109+ messages in thread
From: David Rientjes @ 2009-10-29 19:06 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Yinghai Lu, Mel Gorman,
linux-kernel, linux-acpi
On Thu, 29 Oct 2009, Mike Travis wrote:
> I believe the disjointed ranges came from the hyperthread cpus..? Which if
> true means there'll probably be as many distinct ranges as there are threads
> per core?
>
Not necessarily, look at the first few lines of your new output:
[ 0.000000] SRAT: PXM 0 -> APIC {0-7,16-23} -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC {32-39,48-55} -> Node 1
[ 0.000000] SRAT: PXM 2 -> APIC {64-71,80-87} -> Node 2
...
If those values are converted to hex, you have these apic id ranges:
0x00-0x07, 0x10-0x17
0x20-0x27, 0x30-0x37
0x40-0x47, 0x50-0x57
...
So it's most likely that each of the physical processors has eight logical
processors (represented by the least significant three bits) and there are
two physical processors (the more significant four bits) per node.
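
The decomposition can be sketched as below. The field positions are only
inferred from the sample lines quoted in this message (ids 8-15 never appear,
and each node spans 32 ids); they are an assumption about this one machine,
not a general rule for APIC ids. The struct, the macros and decode_apicid()
are hypothetical names.

```c
#include <assert.h>

struct apic_fields {
	unsigned int thread;	/* logical processor within a package */
	unsigned int package;	/* physical processor within a node */
	unsigned int node;
};

/* Field layout inferred from the sample output only; an assumption. */
#define THREAD_BITS	3	/* ids 0-7 within a package */
#define PACKAGE_SHIFT	4	/* bit 3 is unused in the sample (no ids 8-15) */
#define NODE_SHIFT	5	/* each node spans 32 ids */

struct apic_fields decode_apicid(unsigned int apicid)
{
	struct apic_fields f;

	f.thread = apicid & ((1u << THREAD_BITS) - 1);
	f.package = (apicid >> PACKAGE_SHIFT) & 1u;
	f.node = apicid >> NODE_SHIFT;
	return f;
}

/* Spot checks against the three sample lines quoted in this message. */
void check_against_log(void)
{
	assert(decode_apicid(23).node == 0);	/* PXM 0: {0-7,16-23} */
	assert(decode_apicid(23).package == 1);
	assert(decode_apicid(32).node == 1);	/* PXM 1: {32-39,48-55} */
	assert(decode_apicid(87).node == 2);	/* PXM 2: {64-71,80-87} */
	assert(decode_apicid(87).thread == 7);
}
```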
* [PATCH] x86_64: Limit the number of processor bootup messages
2009-10-26 21:55 ` Andi Kleen
2009-10-26 22:07 ` Mike Travis
@ 2009-10-30 19:25 ` Mike Travis
2009-10-30 19:54 ` David Rientjes
2009-11-02 11:11 ` Andi Kleen
1 sibling, 2 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-30 19:25 UTC (permalink / raw)
To: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton
Cc: Heiko Carstens, Roland Dreier, Randy Dunlap, Tejun Heo,
Greg Kroah-Hartman, Yinghai Lu, H. Peter Anvin, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
x86_64: Limit the number of processor bootup messages
With a large number of processors in a system there is an excessive amount
of messages sent to the system console. It's estimated that with 4096
processors in a system, and the console baudrate set to 56K, the startup
messages will take about 84 minutes to clear the serial port.
This set of patches limits the number of repetitious messages which contain
no additional information. Much of this information is obtainable from
/proc and /sys. Most of the messages are still sent to the kernel log
buffer as KERN_DEBUG messages, so the log can be used to examine more
closely any details specific to a processor.
The list of message transformations....
For system_state == SYSTEM_BOOTING:
[ 25.388280] Booting Processors 1-7,320-327, Node 0
[ 26.064742] Booting Processors 8-15,328-335, Node 1
[ 26.837006] Booting Processors 16-31,336-351, Nodes 2-3
[ 28.440427] Booting Processors 32-63,352-383, Nodes 4-7
[ 31.640450] Booting Processors 64-127,384-447, Nodes 8-15
[ 38.041430] Booting Processors 128-255,448-575, Nodes 16-31
[ 50.917504] Booting Processors 256-319,576-639, Nodes 32-39
[ 90.964169] Brought up 640 CPUs
The range of processors increases as a power of 2, so 4096 CPUs should
only take 12 lines.
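
A small user-space sketch of the grouping rule, for estimating the number of
"Booting Processors" lines. It mirrors the next_node/node_shift bookkeeping of
print_summary_bootmsg() in this patch, under the assumption that nodes are
first seen in ascending order; count_boot_lines() is a hypothetical helper.
Note that the count follows the number of nodes rather than the number of
CPUs.

```c
#include <assert.h>

/*
 * Count how many summary lines would be printed for nr_nodes nodes,
 * assuming each node is first encountered in ascending order.
 */
int count_boot_lines(int nr_nodes)
{
	int next_node = 0, node_shift = 0, lines = 0;
	int node;

	assert(nr_nodes >= 0);

	for (node = 0; node < nr_nodes; node++) {
		if (node >= next_node) {
			/* start a new group: [0], [1], [2-3], [4-7], ... */
			next_node = 1 << node_shift;
			node_shift++;
			lines++;
		}
	}
	return lines;
}
```

For the 40-node system in the sample output this gives 7 lines, matching the
seven "Booting Processors" lines shown.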
(QUESTION: print_summary_bootmsg() is in the __init section and is called
from a __cpuinit function, but only when system is booting. Is there a
special flag to handle this case?)
For Processor Information printout:
[ 90.968381] Summary Processor Information for CPUS: 0-639
[ 90.972033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
[ 90.981402] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 90.985888] CPU: L2 cache: 256K
[ 90.988032] CPU: L3 cache: 24576K
[ 90.992032] MIN 4266.68 BogoMIPS (lpj=8533371)
[ 91.000033] MAX 4267.89 BogoMIPS (lpj=8535789)
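
For reference, the MIN/MAX BogoMIPS figures follow from lpj with the same
integer arithmetic the patch uses. The sketch below assumes HZ=250, which is
only inferred from the sample numbers above (it is the value that reproduces
them); the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define HZ 250	/* assumption: inferred from the sample output */

/* Integer part of the BogoMIPS value for a given loops_per_jiffy. */
unsigned long bogomips_int(unsigned long lpj)
{
	return lpj / (500000 / HZ);
}

/* Two-digit fractional part of the BogoMIPS value. */
unsigned long bogomips_frac(unsigned long lpj)
{
	return (lpj / (5000 / HZ)) % 100;
}

/* Format the value the way the summary lines do. */
int format_bogomips(char *buf, size_t size, unsigned long lpj)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "%lu.%02lu BogoMIPS (lpj=%lu)",
			bogomips_int(lpj), bogomips_frac(lpj), lpj);
}
```

With lpj=8533371 this yields 4266.68 and with lpj=8535789 it yields 4267.89,
the MIN and MAX values in the sample.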
These lines have been moved to loglevel KERN_DEBUG:
CPU: Physical Processor ID:
CPU: Processor Core ID:
CPU %d/0x%x -> Node %d
<cache line sizes per cpu>
CPUx is down
This message has been changed to loglevel KERN_DEBUG if system is booting
and KERN_INFO otherwise:
CPU %d is now offline
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/include/asm/processor.h | 4
arch/x86/kernel/cpu/addon_cpuid_features.c | 4
arch/x86/kernel/cpu/amd.c | 2
arch/x86/kernel/cpu/common.c | 23 +++-
arch/x86/kernel/cpu/intel.c | 2
arch/x86/kernel/cpu/intel_cacheinfo.c | 22 +---
arch/x86/kernel/smpboot.c | 154 ++++++++++++++++++++++++++++-
kernel/cpu.c | 2
8 files changed, 187 insertions(+), 26 deletions(-)
--- linux.orig/arch/x86/include/asm/processor.h
+++ linux/arch/x86/include/asm/processor.h
@@ -111,6 +111,9 @@
u16 cpu_core_id;
/* Index into per_cpu list: */
u16 cpu_index;
+ /* Interior Cache Sizes: */
+ u16 l1i, l1d, l2;
+ u32 l3;
#endif
unsigned int x86_hyper_vendor;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));
@@ -169,6 +172,7 @@
extern void identify_boot_cpu(void);
extern void identify_secondary_cpu(struct cpuinfo_x86 *);
extern void print_cpu_info(struct cpuinfo_x86 *);
+extern void print_cache_info(struct cpuinfo_x86 *, char *msglvl);
extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
extern unsigned short num_cache_leaves;
--- linux.orig/arch/x86/kernel/cpu/addon_cpuid_features.c
+++ linux/arch/x86/kernel/cpu/addon_cpuid_features.c
@@ -128,10 +128,10 @@
c->x86_max_cores = (core_level_siblings / smp_num_siblings);
- printk(KERN_INFO "CPU: Physical Processor ID: %d\n",
+ printk(KERN_DEBUG "CPU: Physical Processor ID: %d\n",
c->phys_proc_id);
if (c->x86_max_cores > 1)
- printk(KERN_INFO "CPU: Processor Core ID: %d\n",
+ printk(KERN_DEBUG "CPU: Processor Core ID: %d\n",
c->cpu_core_id);
return;
#endif
--- linux.orig/arch/x86/kernel/cpu/amd.c
+++ linux/arch/x86/kernel/cpu/amd.c
@@ -376,7 +376,7 @@
}
numa_set_node(cpu, node);
- printk(KERN_INFO "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
+ printk(KERN_DEBUG "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
#endif
}
--- linux.orig/arch/x86/kernel/cpu/common.c
+++ linux/arch/x86/kernel/cpu/common.c
@@ -475,9 +475,9 @@
out:
if ((c->x86_max_cores * smp_num_siblings) > 1) {
- printk(KERN_INFO "CPU: Physical Processor ID: %d\n",
+ printk(KERN_DEBUG "CPU: Physical Processor ID: %d\n",
c->phys_proc_id);
- printk(KERN_INFO "CPU: Processor Core ID: %d\n",
+ printk(KERN_DEBUG "CPU: Processor Core ID: %d\n",
c->cpu_core_id);
}
#endif
@@ -967,6 +967,23 @@
#endif
}
+void __cpuinit print_cache_info(struct cpuinfo_x86 *c, char *lvl)
+{
+ if (c->l1i)
+ printk("%sCPU: L1 I cache: %dK", lvl, c->l1i);
+
+ if (c->l1d)
+ printk(KERN_CONT ", L1 D cache: %dK\n", c->l1d);
+ else
+ printk(KERN_CONT "\n");
+
+ if (c->l2)
+ printk("%sCPU: L2 cache: %dK\n", lvl, c->l2);
+
+ if (c->l3)
+ printk("%sCPU: L3 cache: %dK\n", lvl, c->l3);
+}
+
static __init int setup_disablecpuid(char *arg)
{
int bit;
@@ -1115,7 +1132,7 @@
if (cpumask_test_and_set_cpu(cpu, cpu_initialized_mask))
panic("CPU#%d already initialized!\n", cpu);
- printk(KERN_INFO "Initializing CPU#%d\n", cpu);
+ printk(KERN_DEBUG "Initializing CPU#%d\n", cpu);
clear_in_cr4(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE);
--- linux.orig/arch/x86/kernel/cpu/intel.c
+++ linux/arch/x86/kernel/cpu/intel.c
@@ -267,7 +267,7 @@
node = first_node(node_online_map);
numa_set_node(cpu, node);
- printk(KERN_INFO "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
+ printk(KERN_DEBUG "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
#endif
}
--- linux.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ linux/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -489,23 +489,17 @@
}
if (trace)
- printk(KERN_INFO "CPU: Trace cache: %dK uops", trace);
- else if (l1i)
- printk(KERN_INFO "CPU: L1 I cache: %dK", l1i);
-
- if (l1d)
- printk(KERN_CONT ", L1 D cache: %dK\n", l1d);
- else
- printk(KERN_CONT "\n");
-
- if (l2)
- printk(KERN_INFO "CPU: L2 cache: %dK\n", l2);
-
- if (l3)
- printk(KERN_INFO "CPU: L3 cache: %dK\n", l3);
+ printk(KERN_DEBUG "CPU: Trace cache: %dK uops", trace);
+ c->l1i = l1i;
+ c->l1d = l1d;
+ c->l2 = l2;
+ c->l3 = l3;
c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
+ print_cache_info(c,
+ system_state == SYSTEM_BOOTING? KERN_DEBUG : KERN_INFO);
+
return l2;
}
--- linux.orig/arch/x86/kernel/smpboot.c
+++ linux/arch/x86/kernel/smpboot.c
@@ -442,6 +442,94 @@
return c->llc_shared_map;
}
+/* Summarize Processor Information */
+static void __init summarize_cpu_info(void)
+{
+ cpumask_var_t cpulist, cpusdone;
+ int cpu;
+ int err = 0;
+
+ if (!alloc_cpumask_var(&cpulist, GFP_KERNEL))
+ err = 1;
+
+ else if (!alloc_cpumask_var(&cpusdone, GFP_KERNEL)) {
+ free_cpumask_var(cpulist);
+ err = 1;
+ }
+
+ if (err) {
+ printk(KERN_INFO "Can't print processor summaries\n");
+ return;
+ }
+
+ cpumask_clear(cpusdone);
+ for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
+ struct cpuinfo_x86 *c;
+ int l1i, l1d, l2, l3;
+ int x86, x86_vendor, x86_model, x86_mask;
+ char buf[64];
+ int ncpu;
+ unsigned long minlpj, maxlpj;
+
+ /* skip if cpu has already been displayed */
+ if (cpumask_test_cpu(cpu, cpusdone))
+ continue;
+
+ c = &cpu_data(cpu);
+ l1i = c->l1i;
+ l1d = c->l1d;
+ l2 = c->l2;
+ l3 = c->l3;
+ x86 = c->x86;
+ x86_vendor = c->x86_vendor;
+ x86_model = c->x86_model;
+ x86_mask = c->x86_mask;
+ minlpj = ULONG_MAX;
+ maxlpj = 0;
+
+ cpumask_clear(cpulist);
+
+ /* collate all cpus with same specifics */
+ for (ncpu = cpu; ncpu < nr_cpu_ids; ncpu++) {
+ if (l1i != cpu_data(ncpu).l1i ||
+ l1d != cpu_data(ncpu).l1d ||
+ l2 != cpu_data(ncpu).l2 ||
+ l3 != cpu_data(ncpu).l3 ||
+ x86 != cpu_data(ncpu).x86 ||
+ x86_vendor != cpu_data(ncpu).x86_vendor ||
+ x86_model != cpu_data(ncpu).x86_model ||
+ x86_mask != cpu_data(ncpu).x86_mask)
+ continue;
+
+ cpumask_set_cpu(ncpu, cpulist);
+ cpumask_set_cpu(ncpu, cpusdone);
+
+ if (cpu_data(ncpu).loops_per_jiffy < minlpj)
+ minlpj = cpu_data(ncpu).loops_per_jiffy;
+
+ if (cpu_data(ncpu).loops_per_jiffy > maxlpj)
+ maxlpj = cpu_data(ncpu).loops_per_jiffy;
+ }
+
+ cpulist_scnprintf(buf, sizeof(buf), cpulist);
+ printk(KERN_INFO
+ "Summary Processor Information for CPUS: %s\n", buf);
+
+ printk(KERN_INFO);
+ print_cpu_info(c);
+ print_cache_info(c, KERN_INFO);
+
+ printk(KERN_INFO "MIN %lu.%02lu BogoMIPS (lpj=%lu)\n",
+ minlpj/(500000/HZ), (minlpj/(5000/HZ)) % 100, minlpj);
+
+ printk(KERN_INFO "MAX %lu.%02lu BogoMIPS (lpj=%lu)\n",
+ maxlpj/(500000/HZ), (maxlpj/(5000/HZ)) % 100, maxlpj);
+ }
+
+ free_cpumask_var(cpusdone);
+ free_cpumask_var(cpulist);
+}
+
static void impress_friends(void)
{
int cpu;
@@ -671,6 +759,50 @@
complete(&c_idle->done);
}
+/* Summarize the "Booting processor ..." startup messages */
+static void __init print_summary_bootmsg(int cpu)
+{
+ static int next_node, node_shift;
+ int node = cpu_to_node(cpu);
+
+ if (node >= next_node) {
+ cpumask_var_t cpulist;
+
+ node = next_node;
+ next_node = 1 << node_shift;
+ node_shift++;
+
+ if (alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
+ int i, tmp, last_node = node;
+ char buf[32];
+
+ cpumask_clear(cpulist);
+ for_each_present_cpu(i) {
+ if (i == 0) /* boot cpu */
+ continue;
+
+ tmp = cpu_to_node(i);
+ if (node <= tmp && tmp < next_node) {
+ cpumask_set_cpu(i, cpulist);
+ if (last_node < tmp)
+ last_node = tmp;
+ }
+ }
+ if (cpumask_weight(cpulist)) {
+ cpulist_scnprintf(buf, sizeof(buf), cpulist);
+ printk(KERN_INFO "Booting Processors %s,", buf);
+
+ if (node == last_node)
+ printk(KERN_CONT " Node %d\n", node);
+ else
+ printk(KERN_CONT " Nodes %d-%d\n",
+ node, last_node);
+ }
+ free_cpumask_var(cpulist);
+ }
+ }
+}
+
/*
* NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
* (ie clustered apic addressing mode), this is a LOGICAL apic ID.
@@ -737,8 +869,14 @@
start_ip = setup_trampoline();
/* So we see what's up */
- printk(KERN_INFO "Booting processor %d APIC 0x%x ip 0x%lx\n",
- cpu, apicid, start_ip);
+ if (system_state == SYSTEM_BOOTING) {
+ print_summary_bootmsg(cpu);
+ printk(KERN_DEBUG);
+ } else
+ printk(KERN_INFO);
+
+ printk(KERN_CONT "Booting processor %d APIC 0x%x ip 0x%lx\n",
+ cpu, apicid, start_ip);
/*
* This grunge runs the startup process for
@@ -790,7 +928,7 @@
if (cpumask_test_cpu(cpu, cpu_callin_mask)) {
/* number CPUs logically, starting from 1 (BSP is 0) */
pr_debug("OK.\n");
- printk(KERN_INFO "CPU%d: ", cpu);
+ printk(KERN_DEBUG "CPU%d: ", cpu);
print_cpu_info(&cpu_data(cpu));
pr_debug("CPU has booted.\n");
} else {
@@ -1147,6 +1285,9 @@
{
pr_debug("Boot done.\n");
+ /* print processor data summaries */
+ summarize_cpu_info();
+
impress_friends();
#ifdef CONFIG_X86_IO_APIC
setup_ioapic_dest();
@@ -1300,7 +1441,12 @@
for (i = 0; i < 10; i++) {
/* They ack this in play_dead by setting CPU_DEAD */
if (per_cpu(cpu_state, cpu) == CPU_DEAD) {
- printk(KERN_INFO "CPU %d is now offline\n", cpu);
+ if (system_state == SYSTEM_RUNNING)
+ printk(KERN_INFO);
+ else
+ printk(KERN_DEBUG);
+
+ printk(KERN_CONT "CPU %d is now offline\n", cpu);
if (1 == num_online_cpus())
alternatives_smp_switch(0);
return;
--- linux.orig/kernel/cpu.c
+++ linux/kernel/cpu.c
@@ -394,7 +394,7 @@
error = _cpu_down(cpu, 1);
if (!error) {
cpumask_set_cpu(cpu, frozen_cpus);
- printk("CPU%d is down\n", cpu);
+ printk(KERN_DEBUG "CPU%d is down\n", cpu);
} else {
printk(KERN_ERR "Error taking CPU%d down: %d\n",
cpu, error);
* [PATCH] x86_64: Limit the number of microcode messages
2009-10-24 22:45 ` Dmitry Adamushko
2009-10-25 16:37 ` Ingo Molnar
2009-10-26 18:25 ` Mike Travis
@ 2009-10-30 19:40 ` Mike Travis
2 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-30 19:40 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Tigran Aivazian, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Andreas Mohr, Hugh Dickins,
Hannes Eder, linux-kernel, Roland Dreier, Hidetoshi Seto
x86_64: Limit the number of microcode messages
Presented as an example of summarizing startup information,
but as others have pointed out this particular message can be
removed completely from the console log.
Summarize microcode messages to the form:
[ 8.961953] microcode: CPU0-23: sig=0x206e5, pf=0x4, revision=0xffff0016
[ 8.969494] Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
(Note: I need a better method for triggering the summary message.)
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: Andreas Mohr <andi@lisas.de>
Cc: Hannes Eder <hannes@hanneseder.net>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/microcode_intel.c | 75 +++++++++++++++++++++++++++++++++++++-
1 file changed, 74 insertions(+), 1 deletion(-)
--- linux.orig/arch/x86/kernel/microcode_intel.c
+++ linux/arch/x86/kernel/microcode_intel.c
@@ -137,6 +137,52 @@
#define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)
+static struct cpu_signature *cpusigs;
+static cpumask_var_t cpusigslist;
+static int cpusigs_error;
+
+static void summarize_cpu_info(void)
+{
+ char buf[128];
+ int cpu;
+ cpumask_var_t cpulist;
+
+ if (cpusigs_error || !alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
+ printk(KERN_INFO "Can't print microcode summary\n");
+ return;
+ }
+
+ while ((cpu = cpumask_first(cpusigslist)) < nr_cpu_ids) {
+ struct cpu_signature *csig = &cpusigs[cpu];
+ int ncpu = cpu;
+
+ cpumask_clear(cpulist);
+ cpumask_set_cpu(cpu, cpulist);
+
+ /* gather all cpu info with same data */
+ while ((ncpu = cpumask_next(ncpu, cpusigslist)) < nr_cpu_ids)
+ if (csig->sig == cpusigs[ncpu].sig &&
+ csig->pf == cpusigs[ncpu].pf &&
+ csig->rev == cpusigs[ncpu].rev)
+ cpumask_set_cpu(ncpu, cpulist);
+
+ cpulist_scnprintf(buf, sizeof(buf), cpulist);
+
+ printk(KERN_INFO
+ "microcode: CPU%s: sig=0x%x, pf=0x%x, revision=0x%x\n",
+ buf, csig->sig, csig->pf, csig->rev);
+
+ /* clear bits we just processed */
+ cpumask_xor(cpusigslist, cpusigslist, cpulist);
+ }
+
+ /* cleanup */
+ free_cpumask_var(cpulist);
+ free_cpumask_var(cpusigslist);
+ kfree(cpusigs);
+ cpusigs_error = 0;
+}
+
static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
{
struct cpuinfo_x86 *c = &cpu_data(cpu_num);
@@ -165,9 +211,36 @@
/* get the current revision from MSR 0x8B */
rdmsr(MSR_IA32_UCODE_REV, val[0], csig->rev);
- printk(KERN_INFO "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
+ if (!cpusigs && !cpusigs_error) {
+ if (!alloc_cpumask_var(&cpusigslist, GFP_KERNEL))
+ cpusigs_error = 1;
+ else {
+ int size = sizeof(*cpusigs) * nr_cpu_ids;
+ cpusigs = kmalloc(size, GFP_KERNEL);
+ if (!cpusigs) {
+ free_cpumask_var(cpusigslist);
+ cpusigs_error = 1;
+ }
+ }
+ }
+
+ /* will only print microcode revision of first cpu if cannot do all */
+ if (cpusigs_error && cpu_num == 0)
+ printk(KERN_INFO
+ "microcode: CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
cpu_num, csig->sig, csig->pf, csig->rev);
+ else if (!cpusigs_error) {
+ cpusigs[cpu_num].sig = csig->sig;
+ cpusigs[cpu_num].pf = csig->pf;
+ cpusigs[cpu_num].rev = csig->rev;
+ cpumask_set_cpu(cpu_num, cpusigslist);
+
+ /* (XXX Need better method for when to print summary) */
+ if (cpu_num == (num_present_cpus() - 1))
+ summarize_cpu_info();
+ }
+
return 0;
}
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-10-30 19:25 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
@ 2009-10-30 19:54 ` David Rientjes
2009-10-30 20:39 ` Mike Travis
2009-11-02 11:11 ` Andi Kleen
1 sibling, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-10-30 19:54 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton,
Heiko Carstens, Roland Dreier, Randy Dunlap, Tejun Heo,
Greg Kroah-Hartman, Yinghai Lu, H. Peter Anvin, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
On Fri, 30 Oct 2009, Mike Travis wrote:
> x86_64: Limit the number of processor bootup messages
>
> With a large number of processors in a system there is an excessive amount
> of messages sent to the system console. It's estimated that with 4096
> processors in a system, and the console baudrate set to 56K, the startup
> messages will take about 84 minutes to clear the serial port.
>
> This set of patches limits the number of repetitious messages which contain
> no additional information. Much of this information is obtainable from the
> /proc and /sysfs. Most of the messages are also sent to the kernel log
> buffer as KERN_DEBUG messages so it can be used to examine more closely any
> details specific to a processor.
>
> The list of message transformations....
>
> For system_state == SYSTEM_BOOTING:
>
> [ 25.388280] Booting Processors 1-7,320-327, Node 0
> [ 26.064742] Booting Processors 8-15,328-335, Node 1
> [ 26.837006] Booting Processors 16-31,336-351, Nodes 2-3
> [ 28.440427] Booting Processors 32-63,352-383, Nodes 4-7
> [ 31.640450] Booting Processors 64-127,384-447, Nodes 8-15
> [ 38.041430] Booting Processors 128-255,448-575, Nodes 16-31
> [ 50.917504] Booting Processors 256-319,576-639, Nodes 32-39
> [ 90.964169] Brought up 640 CPUs
>
> The range of processors increases as a power of 2, so 4096 CPU's should
> only take 12 lines.
>
> (QUESTION: print_summary_bootmsg() is in the __init section and is called
> from a __cpuinit function, but only when system is booting. Is there a
> special flag to handle this case?)
>
That's fine, init.text will still be valid as long as cpuinit.text is and
there will be no hotplug considerations.
> For Processor Information printout:
>
> [ 90.968381] Summary Processor Information for CPUS: 0-639
> [ 90.972033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
> [ 90.981402] CPU: L1 I cache: 32K, L1 D cache: 32K
> [ 90.985888] CPU: L2 cache: 256K
> [ 90.988032] CPU: L3 cache: 24576K
> [ 90.992032] MIN 4266.68 BogoMIPS (lpj=8533371)
> [ 91.000033] MAX 4267.89 BogoMIPS (lpj=8535789)
>
> These lines have been moved to loglevel KERN_DEBUG:
>
> CPU: Physical Processor ID:
> CPU: Processor Core ID:
> CPU %d/0x%x -> Node %d
> <cache line sizes per cpu>
> CPUx is down
>
> This message has been changed to loglevel KERN_DEBUG if system is booting
> and KERN_INFO otherwise:
>
> CPU %d is now offline
>
> Signed-off-by: Mike Travis <travis@sgi.com>
> ---
> arch/x86/include/asm/processor.h | 4
> arch/x86/kernel/cpu/addon_cpuid_features.c | 4
> arch/x86/kernel/cpu/amd.c | 2
> arch/x86/kernel/cpu/common.c | 23 +++-
> arch/x86/kernel/cpu/intel.c | 2
> arch/x86/kernel/cpu/intel_cacheinfo.c | 22 +---
> arch/x86/kernel/smpboot.c | 154 ++++++++++++++++++++++++++++-
> kernel/cpu.c | 2
> 8 files changed, 187 insertions(+), 26 deletions(-)
>
> --- linux.orig/arch/x86/include/asm/processor.h
> +++ linux/arch/x86/include/asm/processor.h
> @@ -111,6 +111,9 @@
> u16 cpu_core_id;
> /* Index into per_cpu list: */
> u16 cpu_index;
> + /* Interior Cache Sizes: */
> + u16 l1i, l1d, l2;
> + u32 l3;
> #endif
> unsigned int x86_hyper_vendor;
> } __attribute__((__aligned__(SMP_CACHE_BYTES)));
> @@ -169,6 +172,7 @@
> extern void identify_boot_cpu(void);
> extern void identify_secondary_cpu(struct cpuinfo_x86 *);
> extern void print_cpu_info(struct cpuinfo_x86 *);
> +extern void print_cache_info(struct cpuinfo_x86 *, char *msglvl);
> extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
> extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
> extern unsigned short num_cache_leaves;
> --- linux.orig/arch/x86/kernel/cpu/addon_cpuid_features.c
> +++ linux/arch/x86/kernel/cpu/addon_cpuid_features.c
> @@ -128,10 +128,10 @@
> c->x86_max_cores = (core_level_siblings / smp_num_siblings);
>
>
> - printk(KERN_INFO "CPU: Physical Processor ID: %d\n",
> + printk(KERN_DEBUG "CPU: Physical Processor ID: %d\n",
> c->phys_proc_id);
> if (c->x86_max_cores > 1)
> - printk(KERN_INFO "CPU: Processor Core ID: %d\n",
> + printk(KERN_DEBUG "CPU: Processor Core ID: %d\n",
> c->cpu_core_id);
> return;
> #endif
Perhaps an opportunity to move these to pr_debug() instead?
> --- linux.orig/arch/x86/kernel/cpu/amd.c
> +++ linux/arch/x86/kernel/cpu/amd.c
> @@ -376,7 +376,7 @@
> }
> numa_set_node(cpu, node);
>
> - printk(KERN_INFO "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
> + printk(KERN_DEBUG "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
> #endif
> }
>
> --- linux.orig/arch/x86/kernel/cpu/common.c
> +++ linux/arch/x86/kernel/cpu/common.c
> @@ -475,9 +475,9 @@
>
> out:
> if ((c->x86_max_cores * smp_num_siblings) > 1) {
> - printk(KERN_INFO "CPU: Physical Processor ID: %d\n",
> + printk(KERN_DEBUG "CPU: Physical Processor ID: %d\n",
> c->phys_proc_id);
> - printk(KERN_INFO "CPU: Processor Core ID: %d\n",
> + printk(KERN_DEBUG "CPU: Processor Core ID: %d\n",
> c->cpu_core_id);
> }
> #endif
> @@ -967,6 +967,23 @@
> #endif
> }
>
> +void __cpuinit print_cache_info(struct cpuinfo_x86 *c, char *lvl)
> +{
> + if (c->l1i)
> + printk("%sCPU: L1 I cache: %dK", lvl, c->l1i);
> +
> + if (c->l1d)
> + printk(KERN_CONT ", L1 D cache: %dK\n", c->l1d);
> + else
> + printk(KERN_CONT "\n");
> +
> + if (c->l2)
> + printk("%sCPU: L2 cache: %dK\n", lvl, c->l2);
> +
> + if (c->l3)
> + printk("%sCPU: L3 cache: %dK\n", lvl, c->l3);
> +}
> +
> static __init int setup_disablecpuid(char *arg)
> {
> int bit;
> @@ -1115,7 +1132,7 @@
> if (cpumask_test_and_set_cpu(cpu, cpu_initialized_mask))
> panic("CPU#%d already initialized!\n", cpu);
>
> - printk(KERN_INFO "Initializing CPU#%d\n", cpu);
> + printk(KERN_DEBUG "Initializing CPU#%d\n", cpu);
>
> clear_in_cr4(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE);
>
> --- linux.orig/arch/x86/kernel/cpu/intel.c
> +++ linux/arch/x86/kernel/cpu/intel.c
> @@ -267,7 +267,7 @@
> node = first_node(node_online_map);
> numa_set_node(cpu, node);
>
> - printk(KERN_INFO "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
> + printk(KERN_DEBUG "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
> #endif
> }
>
> --- linux.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
> +++ linux/arch/x86/kernel/cpu/intel_cacheinfo.c
> @@ -489,23 +489,17 @@
> }
>
> if (trace)
> - printk(KERN_INFO "CPU: Trace cache: %dK uops", trace);
> - else if (l1i)
> - printk(KERN_INFO "CPU: L1 I cache: %dK", l1i);
> -
> - if (l1d)
> - printk(KERN_CONT ", L1 D cache: %dK\n", l1d);
> - else
> - printk(KERN_CONT "\n");
> -
> - if (l2)
> - printk(KERN_INFO "CPU: L2 cache: %dK\n", l2);
> -
> - if (l3)
> - printk(KERN_INFO "CPU: L3 cache: %dK\n", l3);
> + printk(KERN_DEBUG "CPU: Trace cache: %dK uops", trace);
>
> + c->l1i = l1i;
> + c->l1d = l1d;
> + c->l2 = l2;
> + c->l3 = l3;
> c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
>
> + print_cache_info(c,
> + system_state == SYSTEM_BOOTING? KERN_DEBUG : KERN_INFO);
> +
> return l2;
> }
>
> --- linux.orig/arch/x86/kernel/smpboot.c
> +++ linux/arch/x86/kernel/smpboot.c
> @@ -442,6 +442,94 @@
> return c->llc_shared_map;
> }
>
> +/* Summarize Processor Information */
> +static void __init summarize_cpu_info(void)
> +{
> + cpumask_var_t cpulist, cpusdone;
> + int cpu;
> + int err = 0;
> +
> + if (!alloc_cpumask_var(&cpulist, GFP_KERNEL))
> + err = 1;
> +
> + else if (!alloc_cpumask_var(&cpusdone, GFP_KERNEL)) {
> + free_cpumask_var(cpulist);
> + err = 1;
> + }
> +
> + if (err) {
> + printk(KERN_INFO "Can't print processor summaries\n");
> + return;
> + }
> +
> + cpumask_clear(cpusdone);
> + for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
> + struct cpuinfo_x86 *c;
> + int l1i, l1d, l2, l3;
> + int x86, x86_vendor, x86_model, x86_mask;
> + char buf[64];
> + int ncpu;
> + unsigned long minlpj, maxlpj;
> +
> + /* skip if cpu has already been displayed */
> + if (cpumask_test_cpu(cpu, cpusdone))
> + continue;
> +
> + c = &cpu_data(cpu);
> + l1i = c->l1i;
> + l1d = c->l1d;
> + l2 = c->l2;
> + l3 = c->l3;
> + x86 = c->x86;
> + x86_vendor = c->x86_vendor;
> + x86_model = c->x86_model;
> + x86_mask = c->x86_mask;
> + minlpj = ULONG_MAX;
> + maxlpj = 0;
> +
> + cpumask_clear(cpulist);
> +
> + /* collate all cpus with same specifics */
> + for (ncpu = cpu; ncpu < nr_cpu_ids; ncpu++) {
> + if (l1i != cpu_data(ncpu).l1i ||
> + l1d != cpu_data(ncpu).l1d ||
> + l2 != cpu_data(ncpu).l2 ||
> + l3 != cpu_data(ncpu).l3 ||
> + x86 != cpu_data(ncpu).x86 ||
> + x86_vendor != cpu_data(ncpu).x86_vendor ||
> + x86_model != cpu_data(ncpu).x86_model ||
> + x86_mask != cpu_data(ncpu).x86_mask)
> + continue;
> +
> + cpumask_set_cpu(ncpu, cpulist);
> + cpumask_set_cpu(ncpu, cpusdone);
> +
> + if (cpu_data(ncpu).loops_per_jiffy < minlpj)
> + minlpj = cpu_data(ncpu).loops_per_jiffy;
> +
> + if (cpu_data(ncpu).loops_per_jiffy > maxlpj)
> + maxlpj = cpu_data(ncpu).loops_per_jiffy;
> + }
> +
> + cpulist_scnprintf(buf, sizeof(buf), cpulist);
> + printk(KERN_INFO
> + "Summary Processor Information for CPUS: %s\n", buf);
> +
> + printk(KERN_INFO);
> + print_cpu_info(c);
> + print_cache_info(c, KERN_INFO);
> +
> + printk(KERN_INFO "MIN %lu.%02lu BogoMIPS (lpj=%lu)\n",
> + minlpj/(500000/HZ), (minlpj/(5000/HZ)) % 100, minlpj);
> +
> + printk(KERN_INFO "MAX %lu.%02lu BogoMIPS (lpj=%lu)\n",
> + maxlpj/(500000/HZ), (maxlpj/(5000/HZ)) % 100, maxlpj);
> + }
> +
> + free_cpumask_var(cpusdone);
> + free_cpumask_var(cpulist);
> +}
> +
> static void impress_friends(void)
> {
> int cpu;
> @@ -671,6 +759,50 @@
> complete(&c_idle->done);
> }
>
> +/* Summarize the "Booting processor ..." startup messages */
> +static void __init print_summary_bootmsg(int cpu)
> +{
> + static int next_node, node_shift;
> + int node = cpu_to_node(cpu);
> +
> + if (node >= next_node) {
> + cpumask_var_t cpulist;
> +
> + node = next_node;
> + next_node = 1 << node_shift;
> + node_shift++;
> +
> + if (alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
> + int i, tmp, last_node = node;
> + char buf[32];
> +
> + cpumask_clear(cpulist);
> + for_each_present_cpu(i) {
> + if (i == 0) /* boot cpu */
> + continue;
> +
> + tmp = cpu_to_node(i);
> + if (node <= tmp && tmp < next_node) {
> + cpumask_set_cpu(i, cpulist);
> + if (last_node < tmp)
> + last_node = tmp;
> + }
> + }
> + if (cpumask_weight(cpulist)) {
> + cpulist_scnprintf(buf, sizeof(buf), cpulist);
> + printk(KERN_INFO "Booting Processors %s,", buf);
> +
> + if (node == last_node)
> + printk(KERN_CONT " Node %d\n", node);
> + else
> + printk(KERN_CONT " Nodes %d-%d\n",
> + node, last_node);
> + }
> + free_cpumask_var(cpulist);
> + }
> + }
> +}
> +
> /*
> * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
> * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
Why isn't cpumask_of_node() available yet?
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-10-30 19:54 ` David Rientjes
@ 2009-10-30 20:39 ` Mike Travis
2009-10-30 23:30 ` David Rientjes
0 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-10-30 20:39 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton,
Heiko Carstens, Roland Dreier, Randy Dunlap, Tejun Heo,
Greg Kroah-Hartman, Yinghai Lu, H. Peter Anvin, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
David Rientjes wrote:
> On Fri, 30 Oct 2009, Mike Travis wrote:
>
>> x86_64: Limit the number of processor bootup messages
>>
>> With a large number of processors in a system there is an excessive amount
>> of messages sent to the system console. It's estimated that with 4096
>> processors in a system, and the console baudrate set to 56K, the startup
>> messages will take about 84 minutes to clear the serial port.
>>
>> This set of patches limits the number of repetitious messages which contain
>> no additional information. Much of this information is obtainable from the
>> /proc and /sysfs. Most of the messages are also sent to the kernel log
>> buffer as KERN_DEBUG messages so it can be used to examine more closely any
>> details specific to a processor.
>>
>> The list of message transformations....
>>
>> For system_state == SYSTEM_BOOTING:
>>
>> [ 25.388280] Booting Processors 1-7,320-327, Node 0
>> [ 26.064742] Booting Processors 8-15,328-335, Node 1
>> [ 26.837006] Booting Processors 16-31,336-351, Nodes 2-3
>> [ 28.440427] Booting Processors 32-63,352-383, Nodes 4-7
>> [ 31.640450] Booting Processors 64-127,384-447, Nodes 8-15
>> [ 38.041430] Booting Processors 128-255,448-575, Nodes 16-31
>> [ 50.917504] Booting Processors 256-319,576-639, Nodes 32-39
>> [ 90.964169] Brought up 640 CPUs
>>
>> The range of processors increases as a power of 2, so 4096 CPU's should
>> only take 12 lines.
>>
>> (QUESTION: print_summary_bootmsg() is in the __init section and is called
>> from a __cpuinit function, but only when system is booting. Is there a
>> special flag to handle this case?)
>>
>
> That's fine, init.text will still be valid as long as cpuinit.text is and
> there will be no hotplug considerations.
Ok, thanks.
>
>> For Processor Information printout:
>>
>> [ 90.968381] Summary Processor Information for CPUS: 0-639
>> [ 90.972033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
>> [ 90.981402] CPU: L1 I cache: 32K, L1 D cache: 32K
>> [ 90.985888] CPU: L2 cache: 256K
>> [ 90.988032] CPU: L3 cache: 24576K
>> [ 90.992032] MIN 4266.68 BogoMIPS (lpj=8533371)
>> [ 91.000033] MAX 4267.89 BogoMIPS (lpj=8535789)
>>
>> These lines have been moved to loglevel KERN_DEBUG:
>>
>> CPU: Physical Processor ID:
>> CPU: Processor Core ID:
>> CPU %d/0x%x -> Node %d
>> <cache line sizes per cpu>
>> CPUx is down
>>
>> This message has been changed to loglevel KERN_DEBUG if system is booting
>> and KERN_INFO otherwise:
>>
>> CPU %d is now offline
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> arch/x86/include/asm/processor.h | 4
>> arch/x86/kernel/cpu/addon_cpuid_features.c | 4
>> arch/x86/kernel/cpu/amd.c | 2
>> arch/x86/kernel/cpu/common.c | 23 +++-
>> arch/x86/kernel/cpu/intel.c | 2
>> arch/x86/kernel/cpu/intel_cacheinfo.c | 22 +---
>> arch/x86/kernel/smpboot.c | 154 ++++++++++++++++++++++++++++-
>> kernel/cpu.c | 2
>> 8 files changed, 187 insertions(+), 26 deletions(-)
>>
>> --- linux.orig/arch/x86/include/asm/processor.h
>> +++ linux/arch/x86/include/asm/processor.h
>> @@ -111,6 +111,9 @@
>> u16 cpu_core_id;
>> /* Index into per_cpu list: */
>> u16 cpu_index;
>> + /* Interior Cache Sizes: */
>> + u16 l1i, l1d, l2;
>> + u32 l3;
>> #endif
>> unsigned int x86_hyper_vendor;
>> } __attribute__((__aligned__(SMP_CACHE_BYTES)));
>> @@ -169,6 +172,7 @@
>> extern void identify_boot_cpu(void);
>> extern void identify_secondary_cpu(struct cpuinfo_x86 *);
>> extern void print_cpu_info(struct cpuinfo_x86 *);
>> +extern void print_cache_info(struct cpuinfo_x86 *, char *msglvl);
>> extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
>> extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
>> extern unsigned short num_cache_leaves;
>> --- linux.orig/arch/x86/kernel/cpu/addon_cpuid_features.c
>> +++ linux/arch/x86/kernel/cpu/addon_cpuid_features.c
>> @@ -128,10 +128,10 @@
>> c->x86_max_cores = (core_level_siblings / smp_num_siblings);
>>
>>
>> - printk(KERN_INFO "CPU: Physical Processor ID: %d\n",
>> + printk(KERN_DEBUG "CPU: Physical Processor ID: %d\n",
>> c->phys_proc_id);
>> if (c->x86_max_cores > 1)
>> - printk(KERN_INFO "CPU: Processor Core ID: %d\n",
>> + printk(KERN_DEBUG "CPU: Processor Core ID: %d\n",
>> c->cpu_core_id);
>> return;
>> #endif
>
> Perhaps an opportunity to move these to pr_debug() instead?
>
Hmm, good idea.
>> --- linux.orig/arch/x86/kernel/cpu/amd.c
>> +++ linux/arch/x86/kernel/cpu/amd.c
>> @@ -376,7 +376,7 @@
>> }
>> numa_set_node(cpu, node);
>>
>> - printk(KERN_INFO "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
>> + printk(KERN_DEBUG "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
>> #endif
>> }
>>
>> --- linux.orig/arch/x86/kernel/cpu/common.c
>> +++ linux/arch/x86/kernel/cpu/common.c
>> @@ -475,9 +475,9 @@
>>
>> out:
>> if ((c->x86_max_cores * smp_num_siblings) > 1) {
>> - printk(KERN_INFO "CPU: Physical Processor ID: %d\n",
>> + printk(KERN_DEBUG "CPU: Physical Processor ID: %d\n",
>> c->phys_proc_id);
>> - printk(KERN_INFO "CPU: Processor Core ID: %d\n",
>> + printk(KERN_DEBUG "CPU: Processor Core ID: %d\n",
>> c->cpu_core_id);
>> }
>> #endif
>> @@ -967,6 +967,23 @@
>> #endif
>> }
>>
>> +void __cpuinit print_cache_info(struct cpuinfo_x86 *c, char *lvl)
>> +{
>> + if (c->l1i)
>> + printk("%sCPU: L1 I cache: %dK", lvl, c->l1i);
>> +
>> + if (c->l1d)
>> + printk(KERN_CONT ", L1 D cache: %dK\n", c->l1d);
>> + else
>> + printk(KERN_CONT "\n");
>> +
>> + if (c->l2)
>> + printk("%sCPU: L2 cache: %dK\n", lvl, c->l2);
>> +
>> + if (c->l3)
>> + printk("%sCPU: L3 cache: %dK\n", lvl, c->l3);
>> +}
>> +
>> static __init int setup_disablecpuid(char *arg)
>> {
>> int bit;
>> @@ -1115,7 +1132,7 @@
>> if (cpumask_test_and_set_cpu(cpu, cpu_initialized_mask))
>> panic("CPU#%d already initialized!\n", cpu);
>>
>> - printk(KERN_INFO "Initializing CPU#%d\n", cpu);
>> + printk(KERN_DEBUG "Initializing CPU#%d\n", cpu);
>>
>> clear_in_cr4(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE);
>>
>> --- linux.orig/arch/x86/kernel/cpu/intel.c
>> +++ linux/arch/x86/kernel/cpu/intel.c
>> @@ -267,7 +267,7 @@
>> node = first_node(node_online_map);
>> numa_set_node(cpu, node);
>>
>> - printk(KERN_INFO "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
>> + printk(KERN_DEBUG "CPU %d/0x%x -> Node %d\n", cpu, apicid, node);
>> #endif
>> }
>>
>> --- linux.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
>> +++ linux/arch/x86/kernel/cpu/intel_cacheinfo.c
>> @@ -489,23 +489,17 @@
>> }
>>
>> if (trace)
>> - printk(KERN_INFO "CPU: Trace cache: %dK uops", trace);
>> - else if (l1i)
>> - printk(KERN_INFO "CPU: L1 I cache: %dK", l1i);
>> -
>> - if (l1d)
>> - printk(KERN_CONT ", L1 D cache: %dK\n", l1d);
>> - else
>> - printk(KERN_CONT "\n");
>> -
>> - if (l2)
>> - printk(KERN_INFO "CPU: L2 cache: %dK\n", l2);
>> -
>> - if (l3)
>> - printk(KERN_INFO "CPU: L3 cache: %dK\n", l3);
>> + printk(KERN_DEBUG "CPU: Trace cache: %dK uops", trace);
>>
>> + c->l1i = l1i;
>> + c->l1d = l1d;
>> + c->l2 = l2;
>> + c->l3 = l3;
>> c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
>>
>> + print_cache_info(c,
>> + system_state == SYSTEM_BOOTING? KERN_DEBUG : KERN_INFO);
>> +
>> return l2;
>> }
>>
>> --- linux.orig/arch/x86/kernel/smpboot.c
>> +++ linux/arch/x86/kernel/smpboot.c
>> @@ -442,6 +442,94 @@
>> return c->llc_shared_map;
>> }
>>
>> +/* Summarize Processor Information */
>> +static void __init summarize_cpu_info(void)
>> +{
>> + cpumask_var_t cpulist, cpusdone;
>> + int cpu;
>> + int err = 0;
>> +
>> + if (!alloc_cpumask_var(&cpulist, GFP_KERNEL))
>> + err = 1;
>> +
>> + else if (!alloc_cpumask_var(&cpusdone, GFP_KERNEL)) {
>> + free_cpumask_var(cpulist);
>> + err = 1;
>> + }
>> +
>> + if (err) {
>> + printk(KERN_INFO "Can't print processor summaries\n");
>> + return;
>> + }
>> +
>> + cpumask_clear(cpusdone);
>> + for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
>> + struct cpuinfo_x86 *c;
>> + int l1i, l1d, l2, l3;
>> + int x86, x86_vendor, x86_model, x86_mask;
>> + char buf[64];
>> + int ncpu;
>> + unsigned long minlpj, maxlpj;
>> +
>> + /* skip if cpu has already been displayed */
>> + if (cpumask_test_cpu(cpu, cpusdone))
>> + continue;
>> +
>> + c = &cpu_data(cpu);
>> + l1i = c->l1i;
>> + l1d = c->l1d;
>> + l2 = c->l2;
>> + l3 = c->l3;
>> + x86 = c->x86;
>> + x86_vendor = c->x86_vendor;
>> + x86_model = c->x86_model;
>> + x86_mask = c->x86_mask;
>> + minlpj = ULONG_MAX;
>> + maxlpj = 0;
>> +
>> + cpumask_clear(cpulist);
>> +
>> + /* collate all cpus with same specifics */
>> + for (ncpu = cpu; ncpu < nr_cpu_ids; ncpu++) {
>> + if (l1i != cpu_data(ncpu).l1i ||
>> + l1d != cpu_data(ncpu).l1d ||
>> + l2 != cpu_data(ncpu).l2 ||
>> + l3 != cpu_data(ncpu).l3 ||
>> + x86 != cpu_data(ncpu).x86 ||
>> + x86_vendor != cpu_data(ncpu).x86_vendor ||
>> + x86_model != cpu_data(ncpu).x86_model ||
>> + x86_mask != cpu_data(ncpu).x86_mask)
>> + continue;
>> +
>> + cpumask_set_cpu(ncpu, cpulist);
>> + cpumask_set_cpu(ncpu, cpusdone);
>> +
>> + if (cpu_data(ncpu).loops_per_jiffy < minlpj)
>> + minlpj = cpu_data(ncpu).loops_per_jiffy;
>> +
>> + if (cpu_data(ncpu).loops_per_jiffy > maxlpj)
>> + maxlpj = cpu_data(ncpu).loops_per_jiffy;
>> + }
>> +
>> + cpulist_scnprintf(buf, sizeof(buf), cpulist);
>> + printk(KERN_INFO
>> + "Summary Processor Information for CPUS: %s\n", buf);
>> +
>> + printk(KERN_INFO);
>> + print_cpu_info(c);
>> + print_cache_info(c, KERN_INFO);
>> +
>> + printk(KERN_INFO "MIN %lu.%02lu BogoMIPS (lpj=%lu)\n",
>> + minlpj/(500000/HZ), (minlpj/(5000/HZ)) % 100, minlpj);
>> +
>> + printk(KERN_INFO "MAX %lu.%02lu BogoMIPS (lpj=%lu)\n",
>> + maxlpj/(500000/HZ), (maxlpj/(5000/HZ)) % 100, maxlpj);
>> + }
>> +
>> + free_cpumask_var(cpusdone);
>> + free_cpumask_var(cpulist);
>> +}
>> +
>> static void impress_friends(void)
>> {
>> int cpu;
>> @@ -671,6 +759,50 @@
>> complete(&c_idle->done);
>> }
>>
>> +/* Summarize the "Booting processor ..." startup messages */
>> +static void __init print_summary_bootmsg(int cpu)
>> +{
>> + static int next_node, node_shift;
>> + int node = cpu_to_node(cpu);
>> +
>> + if (node >= next_node) {
>> + cpumask_var_t cpulist;
>> +
>> + node = next_node;
>> + next_node = 1 << node_shift;
>> + node_shift++;
>> +
>> + if (alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
>> + int i, tmp, last_node = node;
>> + char buf[32];
>> +
>> + cpumask_clear(cpulist);
>> + for_each_present_cpu(i) {
>> + if (i == 0) /* boot cpu */
>> + continue;
>> +
>> + tmp = cpu_to_node(i);
>> + if (node <= tmp && tmp < next_node) {
>> + cpumask_set_cpu(i, cpulist);
>> + if (last_node < tmp)
>> + last_node = tmp;
>> + }
>> + }
>> + if (cpumask_weight(cpulist)) {
>> + cpulist_scnprintf(buf, sizeof(buf), cpulist);
>> + printk(KERN_INFO "Booting Processors %s,", buf);
>> +
>> + if (node == last_node)
>> + printk(KERN_CONT " Node %d\n", node);
>> + else
>> + printk(KERN_CONT " Nodes %d-%d\n",
>> + node, last_node);
>> + }
>> + free_cpumask_var(cpulist);
>> + }
>> + }
>> +}
>> +
>> /*
>> * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
>> * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
>
> Why isn't cpumask_of_node() available yet?
I'll try that. It gets a bit tricky in specifying the actual last node that
is being booted.
Thanks,
Mike
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-10-30 20:39 ` Mike Travis
@ 2009-10-30 23:30 ` David Rientjes
2009-10-31 0:27 ` Mike Travis
0 siblings, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-10-30 23:30 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton,
Heiko Carstens, Roland Dreier, Randy Dunlap, Tejun Heo,
Greg Kroah-Hartman, Yinghai Lu, H. Peter Anvin, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
On Fri, 30 Oct 2009, Mike Travis wrote:
> > > x86_64: Limit the number of processor bootup messages
> > >
Is this really only limited to 64 bit?
> > > With a large number of processors in a system there is an excessive amount
> > > of messages sent to the system console. It's estimated that with 4096
> > > processors in a system, and the console baudrate set to 56K, the startup
> > > messages will take about 84 minutes to clear the serial port.
> > >
> > > This set of patches limits the number of repetitious messages which contain
> > > no additional information. Much of this information is obtainable from the
> > > /proc and /sysfs. Most of the messages are also sent to the kernel log
> > > buffer as KERN_DEBUG messages so it can be used to examine more closely any
> > > details specific to a processor.
> > >
> > > The list of message transformations....
> > >
> > > For system_state == SYSTEM_BOOTING:
> > >
> > > [ 25.388280] Booting Processors 1-7,320-327, Node 0
> > > [ 26.064742] Booting Processors 8-15,328-335, Node 1
> > > [ 26.837006] Booting Processors 16-31,336-351, Nodes 2-3
> > > [ 28.440427] Booting Processors 32-63,352-383, Nodes 4-7
> > > [ 31.640450] Booting Processors 64-127,384-447, Nodes 8-15
> > > [ 38.041430] Booting Processors 128-255,448-575, Nodes 16-31
> > > [ 50.917504] Booting Processors 256-319,576-639, Nodes 32-39
> > > [ 90.964169] Brought up 640 CPUs
> > >
> > > The range of processors increases as a power of 2, so 4096 CPU's should
> > > only take 12 lines.
> > >
On your particular machine, yes, but there's no x86 restriction on the
number of cpus per node.
> > > @@ -671,6 +759,50 @@
> > > complete(&c_idle->done);
> > > }
> > >
> > > +/* Summarize the "Booting processor ..." startup messages */
> > > +static void __init print_summary_bootmsg(int cpu)
> > > +{
> > > + static int next_node, node_shift;
> > > + int node = cpu_to_node(cpu);
> > > +
> > > + if (node >= next_node) {
> > > + cpumask_var_t cpulist;
> > > +
> > > + node = next_node;
> > > + next_node = 1 << node_shift;
> > > + node_shift++;
> > > +
> > > + if (alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
> > > + int i, tmp, last_node = node;
> > > + char buf[32];
> > > +
> > > + cpumask_clear(cpulist);
> > > + for_each_present_cpu(i) {
> > > + if (i == 0) /* boot cpu */
> > > + continue;
> > > +
> > > + tmp = cpu_to_node(i);
> > > + if (node <= tmp && tmp < next_node) {
> > > + cpumask_set_cpu(i, cpulist);
> > > + if (last_node < tmp)
> > > + last_node = tmp;
> > > + }
> > > + }
> > > + if (cpumask_weight(cpulist)) {
> > > + cpulist_scnprintf(buf, sizeof(buf), cpulist);
> > > + printk(KERN_INFO "Booting Processors %s,", buf);
> > > +
> > > + if (node == last_node)
> > > + printk(KERN_CONT " Node %d\n", node);
> > > + else
> > > + printk(KERN_CONT " Nodes %d-%d\n",
> > > + node, last_node);
> > > + }
> > > + free_cpumask_var(cpulist);
> > > + }
> > > + }
> > > +}
> > > +
> > > /*
> > > * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
> > > * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
> >
> > Why isn't cpumask_of_node() available yet?
>
> I'll try that. It gets a bit tricky in specifying the actual last node that
> is being booted.
>
Why do you need to call print_summary_bootmsg() for each cpu? It seems
like you'd be able to move this out to a single call to a new function:
void __init print_summary_bootmsg(void)
{
	char buf[128];
	int nid;

	for_each_online_node(nid) {
		const struct cpumask *mask = cpumask_of_node(nid);

		if (cpumask_empty(mask))
			continue;
		cpulist_scnprintf(buf, sizeof(buf), cpumask_of_node(nid));
		pr_info("Booting Processors %s, Node %d\n", buf, nid);
	}
}
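
For reference, cpulist_scnprintf() produces the comma-separated range
lists seen in the sample output (for example "1-7,320-327"). Below is a
minimal userspace sketch of that formatting. It is not the kernel
implementation: the name format_cpulist and the bool-array
representation of the mask are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace sketch (hypothetical helper, not kernel code): format the set
 * entries of present[0..n-1] as a range list such as "1-7,320-327".
 * Returns the number of characters stored in buf, excluding the NUL.
 */
static size_t format_cpulist(char *buf, size_t size,
			     const bool *present, size_t n)
{
	size_t len = 0, i = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	while (i < n) {
		size_t start, end;
		int w;

		if (!present[i]) {
			i++;
			continue;
		}

		/* Extend the run while the next entry is also set. */
		start = i;
		while (i + 1 < n && present[i + 1])
			i++;
		end = i++;

		if (start == end)
			w = snprintf(buf + len, size - len, "%s%zu",
				     len ? "," : "", start);
		else
			w = snprintf(buf + len, size - len, "%s%zu-%zu",
				     len ? "," : "", start, end);

		if (w < 0) {
			buf[len] = '\0';	/* drop the failed fragment */
			break;
		}
		if ((size_t)w >= size - len)
			return size - 1;	/* truncated, still NUL-terminated */
		len += (size_t)w;
	}
	return len;
}
```

The sketch truncates its output when buf is too small, so a caller needs
a buffer sized for the most fragmented list it expects to print.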
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-10-30 23:30 ` David Rientjes
@ 2009-10-31 0:27 ` Mike Travis
0 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-10-31 0:27 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Andi Kleen, Thomas Gleixner, Andrew Morton,
Heiko Carstens, Roland Dreier, Randy Dunlap, Tejun Heo,
Greg Kroah-Hartman, Yinghai Lu, H. Peter Anvin, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
David Rientjes wrote:
> On Fri, 30 Oct 2009, Mike Travis wrote:
>
>>>> x86_64: Limit the number of processor bootup messages
>>>>
>
> Is this really only limited to 64 bit?
[That was a quick edit to change it from SGI X86_64 UV and it didn't
occur to me to remove the _64. :-)]
>
>>>> With a large number of processors in a system there is an excessive amount
>>>> of messages sent to the system console. It's estimated that with 4096
>>>> processors in a system, and the console baudrate set to 56K, the startup
>>>> messages will take about 84 minutes to clear the serial port.
>>>>
>>>> This set of patches limits the number of repetitious messages which
>>>> contain
>>>> no additional information. Much of this information is obtainable from
>>>> the
>>>> /proc and /sysfs. Most of the messages are also sent to the kernel log
>>>> buffer as KERN_DEBUG messages so it can be used to examine more closely
>>>> any
>>>> details specific to a processor.
>>>>
>>>> The list of message transformations....
>>>>
>>>> For system_state == SYSTEM_BOOTING:
>>>>
>>>> [ 25.388280] Booting Processors 1-7,320-327, Node 0
>>>> [ 26.064742] Booting Processors 8-15,328-335, Node 1
>>>> [ 26.837006] Booting Processors 16-31,336-351, Nodes 2-3
>>>> [ 28.440427] Booting Processors 32-63,352-383, Nodes 4-7
>>>> [ 31.640450] Booting Processors 64-127,384-447, Nodes 8-15
>>>> [ 38.041430] Booting Processors 128-255,448-575, Nodes 16-31
>>>> [ 50.917504] Booting Processors 256-319,576-639, Nodes 32-39
>>>> [ 90.964169] Brought up 640 CPUs
>>>>
>>>> The range of processors increases as a power of 2, so 4096 CPU's should
>>>> only take 12 lines.
>>>>
>
> On your particular machine, yes, but there's no x86 restriction on the
> number of cpus per node.
Yes, my comment is wrong. The limit would be 10 lines for the current kernel
limit of 512 nodes.
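
A quick way to check that count is the minimal userspace sketch below.
It replays the next_node/node_shift stepping from
print_summary_bootmsg() and counts how many summary lines would be
emitted, assuming every group contains at least one present cpu. The
function name summary_lines is hypothetical.

```c
/*
 * Count the "Booting Processors ..." lines the power-of-two grouping
 * would print for nr_nodes nodes (groups: 0, 1, 2-3, 4-7, 8-15, ...).
 * Assumes each group has at least one present cpu.
 */
static int summary_lines(int nr_nodes)
{
	int next_node = 0, node_shift = 0, lines = 0;

	while (next_node < nr_nodes) {
		/* Same stepping as the patch: the group ends before 1 << node_shift. */
		next_node = 1 << node_shift;
		node_shift++;
		lines++;
	}
	return lines;
}
```

summary_lines(512) gives 10, and summary_lines(40) gives 7, matching the
seven "Booting Processors" lines in the 40-node example quoted above.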
>
>>>> @@ -671,6 +759,50 @@
>>>> complete(&c_idle->done);
>>>> }
>>>>
>>>> +/* Summarize the "Booting processor ..." startup messages */
>>>> +static void __init print_summary_bootmsg(int cpu)
>>>> +{
>>>> +	static int next_node, node_shift;
>>>> +	int node = cpu_to_node(cpu);
>>>> +
>>>> +	if (node >= next_node) {
>>>> +		cpumask_var_t cpulist;
>>>> +
>>>> +		node = next_node;
>>>> +		next_node = 1 << node_shift;
>>>> +		node_shift++;
>>>> +
>>>> +		if (alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
>>>> +			int i, tmp, last_node = node;
>>>> +			char buf[32];
>>>> +
>>>> +			cpumask_clear(cpulist);
>>>> +			for_each_present_cpu(i) {
>>>> +				if (i == 0)	/* boot cpu */
>>>> +					continue;
>>>> +
>>>> +				tmp = cpu_to_node(i);
>>>> +				if (node <= tmp && tmp < next_node) {
>>>> +					cpumask_set_cpu(i, cpulist);
>>>> +					if (last_node < tmp)
>>>> +						last_node = tmp;
>>>> +				}
>>>> +			}
>>>> +			if (cpumask_weight(cpulist)) {
>>>> +				cpulist_scnprintf(buf, sizeof(buf), cpulist);
>>>> +				printk(KERN_INFO "Booting Processors %s,", buf);
>>>> +
>>>> +				if (node == last_node)
>>>> +					printk(KERN_CONT " Node %d\n", node);
>>>> +				else
>>>> +					printk(KERN_CONT " Nodes %d-%d\n",
>>>> +						node, last_node);
>>>> +			}
>>>> +			free_cpumask_var(cpulist);
>>>> +		}
>>>> +	}
>>>> +}
>>>> +
>>>> /*
>>>> * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
>>>> * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
>>> Why isn't cpumask_of_node() available yet?
>> I'll try that. It gets a bit tricky in specifying the actual last node that
>> is being booted.
>>
>
> Why do you need to call print_summary_bootmsg() for each cpu? It seems
> like you'd be able to move this out to a single call to a new function:
>
> void __init print_summary_bootmsg(void)
> {
> 	char buf[128];
> 	int nid;
>
> 	for_each_online_node(nid) {
> 		const struct cpumask *mask = cpumask_of_node(nid);
>
> 		if (cpumask_empty(mask))
> 			continue;
> 		cpulist_scnprintf(buf, sizeof(buf), cpumask_of_node(nid));
> 		pr_info("Booting Processors %s, Node %d\n", buf, nid);
> 	}
> }
Well, one thing I did find out: cpumask_of_node() (or more specifically
node_to_cpumask_map[]) is filled in while the CPUs are booting, not
before.

Also, the above could potentially print 512 lines of boot messages before
booting cpu 1. The printk times also would not be accurate for each group
of cpus. And there's something to be said about actually doing what it
is you say you are doing. ;-)

    Booting Processors 0-15 Node 0
    Booting Processors 16-31 Node 1
    <Here you expect cpus 0-15 to have already been booted.>

Why not just say:

    cpulist_scnprintf(buf, sizeof(buf), cpu_present_mask);
    pr_info("Booting Processors %s\n", buf);

since the node -> cpu map can be printed much more efficiently some other way?
For example:

    Nodes 0-7: 0-7,512-519 8-15,520-527 ...

would shrink it to 64 lines max.

(Note, it's important to include the "cpu_present_mask" because cpus can
be powered on disabled, and be booted later on, to decrease the initial
system startup time.)
A request was made (by AK?) that getting a general sense of progress is
a "good thing". I wanted to avoid something more mundane like dots or
sequential numbers. The one thing that Andi mentioned that I haven't
figured out is how to "delay print" specific cpu info in the case of a
boot error. I suppose one way would be to save the current position in
the kernel log buffer at the start of each cpu boot, and print that to
the console in case of an error?
Thanks,
Mike
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-10-30 19:25 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
2009-10-30 19:54 ` David Rientjes
@ 2009-11-02 11:11 ` Andi Kleen
2009-11-02 19:21 ` Mike Travis
1 sibling, 1 reply; 109+ messages in thread
From: Andi Kleen @ 2009-11-02 11:11 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Heiko Carstens,
Roland Dreier, Randy Dunlap, Tejun Heo, Greg Kroah-Hartman,
Yinghai Lu, H. Peter Anvin, David Rientjes, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
Mike Travis wrote:
>
> This set of patches limits the number of repetitious messages which contain
> no additional information. Much of this information is obtainable from the
> /proc and /sysfs. Most of the messages are also sent to the kernel log
> buffer as KERN_DEBUG messages so it can be used to examine more closely any
> details specific to a processor.
What would be good is to put the information from the booting CPUs
into some buffer and print it visibly if there's a timeout detected on the BP.
Also power of two summaries are a bit odd, but ok.
> For Processor Information printout:
>
> [ 90.968381] Summary Processor Information for CPUS: 0-639
> [ 90.972033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
It would be good to print family/model in this line
> [ 90.981402] CPU: L1 I cache: 32K, L1 D cache: 32K
> [ 90.985888] CPU: L2 cache: 256K
> [ 90.988032] CPU: L3 cache: 24576K
I would recommend to drop the cache information; this can be easily
gotten at runtime and is often implied in the CPU name anyways
(and especially L1 and increasingly L2 too change only very rarely)
> [ 90.992032] MIN 4266.68 BogoMIPS (lpj=8533371)
> [ 91.000033] MAX 4267.89 BogoMIPS (lpj=8535789)
Perhaps an average too? You could put all that on one line.
> These lines have been moved to loglevel KERN_DEBUG:
>
> CPU: Physical Processor ID:
> CPU: Processor Core ID:
> CPU %d/0x%x -> Node %d
> <cache line sizes per cpu>
I think you can just remove them.
> CPUx is down
This should be still printed if there's a timeout, or rather print
a "CPUx is not down" message. Right now there's no timeout detection on shutdown, but
I guess that wouldn't be too hard to add.
-Andi
* Re: [PATCH 1/8] SGI x86_64 UV: Add limit console output function
2009-10-26 17:55 ` Mike Travis
@ 2009-11-02 14:15 ` Frederic Weisbecker
0 siblings, 0 replies; 109+ messages in thread
From: Frederic Weisbecker @ 2009-11-02 14:15 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
Randy Dunlap, Steven Rostedt, Greg Kroah-Hartman, Heiko Carstens,
Robin Getz, Dave Young, linux-kernel, linux-doc
On Mon, Oct 26, 2009 at 10:55:31AM -0700, Mike Travis wrote:
>
>
> Frederic Weisbecker wrote:
>> On Fri, Oct 23, 2009 at 06:37:44PM -0500, Mike Travis wrote:
>>> With a large number of processors in a system there is an excessive amount
>>> of messages sent to the system console. It's estimated that with 4096
>>> processors in a system, and the console baudrate set to 56K, the startup
>>> messages will take about 84 minutes to clear the serial port.
>>>
>>> This patch adds (for SGI UV only) a kernel start option "limit_console_
>>> output" (or 'lco' for short), which when set provides the ability to
>>> temporarily reduce the console loglevel during system startup. This allows
>>> informative messages to still be seen on the console without producing
>>> excessive amounts of repetitious messages.
>>>
>>> Note that all the messages are still available in the kernel log buffer.
>>
>>
>>
>> Well, this problem does not only concerns SGI UV but all boxes with a large
>> number of cpus.
>>
>> Also, instead of adding the same conditionals in multiple places to solve
>> the same problem (and that may even expand if we go further the SGI UV case,
>> for example with other archs cpu up/down events), may be can you centralize,
>> institutionalize this issue by using the existing printk mechanisms.
>>
>> I mean, may be that could be addressed by adding a new printk
>> level flag, and then associate the desired filters against it.
>>
>> KERN_CPU could be a name, since this is targetting cpu events.
>>
>
> I did try out something like this but the changes quickly became very intrusive,
> and I was hoping for a "lighter" touch. The other potential fallout of adding
> another printk level might affect user programs that sift through the dmesg
> log for "interesting" info.
>
> Also, I could use some other config option to enable this, it's just that the
> existing X86_UV was too convenient. ;-) I believe most systems would want this
> turned off so the code size shrinks. And until you get the number of cpus into
> the hundreds and thousands, the messages usually just fly by - particularly if
> you're on a desktop system which has almost an infinite baud rate to the screen,
> and usually hides the messages behind a splash screen anyways.
Ok :)
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-02 11:11 ` Andi Kleen
@ 2009-11-02 19:21 ` Mike Travis
2009-11-02 19:34 ` Ingo Molnar
2009-11-12 22:22 ` Dave Jones
0 siblings, 2 replies; 109+ messages in thread
From: Mike Travis @ 2009-11-02 19:21 UTC (permalink / raw)
To: Andi Kleen
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Heiko Carstens,
Roland Dreier, Randy Dunlap, Tejun Heo, Greg Kroah-Hartman,
Yinghai Lu, H. Peter Anvin, David Rientjes, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
Andi Kleen wrote:
> Mike Travis wrote:
>>
>> This set of patches limits the number of repetitious messages which
>> contain
>> no additional information. Much of this information is obtainable
>> from the
>> /proc and /sysfs. Most of the messages are also sent to the kernel log
>> buffer as KERN_DEBUG messages so it can be used to examine more
>> closely any
>> details specific to a processor.
>
> What would be good is to put the information from the booting CPUs
> into some buffer and print it visibly if there's a timeout detected on
> the BP.
What do you think of this idea.... Add a "mark kernel log buffer" function,
and then if any KERN_NOTICE or above happens, it sends the marked info from
the kernel log buffer to the console before the current message. Set the
marker to '0' to clear.
And I was thinking that you might want to print the history of the previous
cpu that booted ok, before printing the info for the cpu that didn't. That
way you'd have some data to compare it with?
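
To make the idea concrete, here is a minimal userspace sketch of the
"mark and replay" behaviour. Everything in it is an assumption for
illustration: the names log_set_mark, log_msg and console_write, the
fixed-size array standing in for the kernel log buffer (no wraparound),
and the captured "console" string. It is not a proposed kernel
interface.

```c
#include <stdio.h>
#include <string.h>

/* Numeric values match the kernel's printk levels; lower is more severe. */
enum { LVL_ERR = 3, LVL_NOTICE = 5, LVL_INFO = 6, LVL_DEBUG = 7 };

#define LOG_MAX 64
#define MSG_MAX 80

struct log_entry {
	int level;
	char text[MSG_MAX];
};

static struct log_entry log_buf[LOG_MAX];	/* stand-in for the log buffer */
static int log_len;
static int log_mark = -1;			/* -1 means "no mark set" */
static int console_level = LVL_NOTICE;		/* console shows levels <= this */
static char console[1024];			/* captured console output */

static void console_write(const char *text)
{
	size_t used = strlen(console);

	snprintf(console + used, sizeof(console) - used, "%s\n", text);
}

/* Remember the current end of the log, e.g. at the start of a cpu boot. */
static void log_set_mark(void)
{
	log_mark = log_len;
}

static void log_msg(int level, const char *text)
{
	int i;

	if (log_len == LOG_MAX)
		return;		/* sketch only: no wraparound handling */

	/* A NOTICE-or-worse message first replays what was suppressed. */
	if (level <= LVL_NOTICE && log_mark >= 0) {
		for (i = log_mark; i < log_len; i++)
			if (log_buf[i].level > console_level)
				console_write(log_buf[i].text);
		log_mark = -1;
	}

	log_buf[log_len].level = level;
	snprintf(log_buf[log_len].text, MSG_MAX, "%s", text);
	log_len++;

	if (level <= console_level)
		console_write(text);
}
```

Calling log_set_mark(), then logging a LVL_DEBUG line followed by a
LVL_ERR line, puts both lines on the console in order, while a boot with
no error prints neither debug line. The sketch clears the marker with -1
rather than 0 only so that index 0 stays usable.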
>
> Also power of two summaries are a bit odd, but ok.
>
>> For Processor Information printout:
>>
>> [ 90.968381] Summary Processor Information for CPUS: 0-639
>> [ 90.972033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
>
> It would be good to print family/model in this line
Is there more info that should be printed? I'm just calling the current
print_cpu_info using the cpuinfo_x86 for the first cpu in the list. And
it appears that it is printing the x86_model_id. Is there some other info
in that struct that should be printed?
>
>> [ 90.981402] CPU: L1 I cache: 32K, L1 D cache: 32K
>> [ 90.985888] CPU: L2 cache: 256K
>> [ 90.988032] CPU: L3 cache: 24576K
>
> I would recommend to drop the cache information; this can be easily
> gotten at runtime and is often implied in the CPU name anyways
> (and especially L1 and increasingly L2 too change only very rarely)
Ok, though because of future system upgrades to a UV system, you can
end up with slightly different processors (of the same family). The
only difference I've detected so far in testing is that the stepping
has changed.
>
>> [ 90.992032] MIN 4266.68 BogoMIPS (lpj=8533371)
>> [ 91.000033] MAX 4267.89 BogoMIPS (lpj=8535789)
>
> Perhaps an average too? You could put all that on one line.
Sure thing.
>
>
>> These lines have been moved to loglevel KERN_DEBUG:
>>
>> CPU: Physical Processor ID:
>> CPU: Processor Core ID:
>> CPU %d/0x%x -> Node %d
>> <cache line sizes per cpu>
>
> I think you can just remove them.
I left them in in case we get to the point of printing KERN_DEBUG
messages in case of a failure. But you think they will not be
necessary in that case? (I also left them as KERN_DEBUG instead of
pr_debug, as the latter optimizes out the print if kernel DEBUG
is not defined... which it won't be in 99% of the kernels our
customers run with. And generally, it's better to get as much
good information as early as possible after a failure, instead
of attempting to recreate the failure with a "debug" kernel
[scheduling time on the system can sometimes be a real pain].)
>
>> CPUx is down
>
> This should be still printed if there's a timeout, or rather print
> a "CPUx is not down" message. Right now there's no timeout detection on
> shutdown, but
> I guess that wouldn't be too hard to add.
That seems a bit outside the scope of this patch...?
>
> -Andi
Thanks!
Mike
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-02 19:21 ` Mike Travis
@ 2009-11-02 19:34 ` Ingo Molnar
2009-11-02 20:32 ` Mike Travis
2009-11-12 22:22 ` Dave Jones
1 sibling, 1 reply; 109+ messages in thread
From: Ingo Molnar @ 2009-11-02 19:34 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Thomas Gleixner, Andrew Morton, Heiko Carstens,
Roland Dreier, Randy Dunlap, Tejun Heo, Greg Kroah-Hartman,
Yinghai Lu, H. Peter Anvin, David Rientjes, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
* Mike Travis <travis@sgi.com> wrote:
>
>
> Andi Kleen wrote:
>> Mike Travis wrote:
>>>
>>> This set of patches limits the number of repetitious messages which
>>> contain
>>> no additional information. Much of this information is obtainable
>>> from the
>>> /proc and /sysfs. Most of the messages are also sent to the kernel log
>>> buffer as KERN_DEBUG messages so it can be used to examine more
>>> closely any
>>> details specific to a processor.
>>
>> What would be good is to put the information from the booting CPUs
>> into some buffer and print it visibly if there's a timeout detected on
>> the BP.
>
> What do you think of this idea.... Add a "mark kernel log buffer"
> function, and then if any KERN_NOTICE or above happens, it sends the
> marked info from the kernel log buffer to the console before the
> current message. Set the marker to '0' to clear.
That's _way_ too complex really, for little benefit. (If there's a boot
hang people will re-try anyway (and this time with a serial console
attached or so), and they can add various boot options to increase
verbosity - depending on which phase the bootup hung.)
So please go with the simple solution i suggested days ago: print stuff
on the boot CPU but after that only a single line per AP CPU.
Ingo
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-02 19:34 ` Ingo Molnar
@ 2009-11-02 20:32 ` Mike Travis
2009-11-04 0:22 ` Mike Travis
2009-11-04 10:31 ` Ingo Molnar
0 siblings, 2 replies; 109+ messages in thread
From: Mike Travis @ 2009-11-02 20:32 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andi Kleen, Thomas Gleixner, Andrew Morton, Heiko Carstens,
Roland Dreier, Randy Dunlap, Tejun Heo, Greg Kroah-Hartman,
Yinghai Lu, H. Peter Anvin, David Rientjes, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>>
>> Andi Kleen wrote:
>>> Mike Travis wrote:
>>>> This set of patches limits the number of repetitious messages which
>>>> contain
>>>> no additional information. Much of this information is obtainable
>>>> from the
>>>> /proc and /sysfs. Most of the messages are also sent to the kernel log
>>>> buffer as KERN_DEBUG messages so it can be used to examine more
>>>> closely any
>>>> details specific to a processor.
>>> What would be good is to put the information from the booting CPUs
>>> into some buffer and print it visibly if there's a timeout detected on
>>> the BP.
>> What do you think of this idea.... Add a "mark kernel log buffer"
>> function, and then if any KERN_NOTICE or above happens, it sends the
>> marked info from the kernel log buffer to the console before the
>> current message. Set the marker to '0' to clear.
>
> That's _way_ too complex really, for little benefit. (If there's a boot
> hang people will re-try anyway (and this time with a serial console
> attached or so), and they can add various boot options to increase
> verbosity - depending on which phase the bootup hung.)
I'm ok with this, though generally speaking large server systems have
serial consoles attached, and save the output into admin logs. One
problem with just setting the loglevel high enough to output debug
messages, is you get literally 100's of thousands of lines of meaningless
information. We waited over 8 hours for a system with 2k cpus to boot
in debug mode, and it never made it all the way up.
My intention for the above was to attempt to print debug information
that pertains to the failure, and not everything else.
>
> So please go with the simple solution i suggested days ago: print stuff
> on the boot CPU but after that only a single line per AP CPU.
>
> Ingo
So you think printing 4096 lines provides meaningful additional
information? I would think at least compress it so you only print
when each new processor socket boots, and not for the 16 threads
each of them has?
I should have timing information soon for 512 cores/1024 threads and
printing a single line for each of those will significantly increase
the time it takes to boot.
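
As a rough illustration of why each extra console line matters on a
serial console, here is a minimal sketch in C. It assumes 8N1 framing
(10 bits on the wire per byte) and ignores flow control and driver
overhead; the function name and the line-length figure used below are
hypothetical, not measurements.

```c
#include <stddef.h>

/*
 * Seconds needed to push nbytes through a serial console at the given
 * baud rate, assuming 10 bits on the wire per byte (8N1 framing).
 */
static double console_drain_seconds(size_t nbytes, unsigned int baud)
{
	return (double)nbytes * 10.0 / (double)baud;
}
```

Under those assumptions, 1023 lines of about 50 characters each at 56000
baud come to console_drain_seconds(1023 * 50, 56000), roughly 9 seconds
of pure transmit time.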
Thanks,
Mike
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-02 20:32 ` Mike Travis
@ 2009-11-04 0:22 ` Mike Travis
2009-11-04 10:24 ` Ingo Molnar
2009-11-04 10:31 ` Ingo Molnar
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-11-04 0:22 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andi Kleen, Thomas Gleixner, Andrew Morton, Heiko Carstens,
Roland Dreier, Randy Dunlap, Tejun Heo, Greg Kroah-Hartman,
Yinghai Lu, H. Peter Anvin, David Rientjes, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
...
>> So please go with the simple solution i suggested days ago: print
>> stuff on the boot CPU but after that only a single line per AP CPU.
>>
>> Ingo
>
Hi Ingo,
Here is some timing info I collected... Would you accept the first
format (line per node) as a compromise?
(Note there will be 256 node systems w/4096 threads as well as
512 node systems w/HT disabled thus still 4096 threads.)
Btw, I think we should keep the processor summary as it does show some
useful information. Agreed?
(... I'll clean up the BogoMIPS line a bit:
BogoMIPS(lpj): MIN xxx (yyy) AVG xxx (yyy) MAX xxx (yyy)
...)
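
A minimal userspace sketch of how such a one-line summary could be
produced from per-cpu loops_per_jiffy values follows. The integer
arithmetic mirrors the way the kernel prints BogoMIPS from lpj; HZ=250
is an assumption, chosen because it makes lpj=8533371 come out as the
4266.68 shown earlier in this thread. The function names are
hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

#define HZ 250	/* assumption: makes lpj=8533371 print as 4266.68 */

/* Integer and two-digit fractional part of BogoMIPS for one lpj value. */
static unsigned long bogo_int(unsigned long lpj)
{
	return lpj / (500000 / HZ);
}

static unsigned long bogo_frac(unsigned long lpj)
{
	return (lpj / (5000 / HZ)) % 100;
}

/*
 * Format "BogoMIPS(lpj): MIN x (y) AVG x (y) MAX x (y)" for n lpj values.
 * Returns snprintf()'s result, or 0 when there is nothing to summarize.
 */
static int format_bogomips_summary(char *buf, size_t size,
				   const unsigned long *lpj, size_t n)
{
	unsigned long min, max, avg;
	unsigned long long sum = 0;
	size_t i;

	if (n == 0 || size == 0)
		return 0;

	min = max = lpj[0];
	for (i = 0; i < n; i++) {
		if (lpj[i] < min)
			min = lpj[i];
		if (lpj[i] > max)
			max = lpj[i];
		sum += lpj[i];
	}
	avg = (unsigned long)(sum / n);

	return snprintf(buf, size,
			"BogoMIPS(lpj): MIN %lu.%02lu (%lu) AVG %lu.%02lu (%lu) MAX %lu.%02lu (%lu)",
			bogo_int(min), bogo_frac(min), min,
			bogo_int(avg), bogo_frac(avg), avg,
			bogo_int(max), bogo_frac(max), max);
}
```

Averaging the raw lpj values and converting once avoids accumulating
rounding error from the truncated two-digit BogoMIPS figures.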
Thanks,
Mike
64 Nodes/512 cores/1024 threads...
By Node:
1 [ 27.998414] Booting Node 0, Processors 1-7,512-519
2 [ 28.645066] Booting Node 1, Processors 8-15,520-527
3 [ 29.389359] Booting Node 2, Processors 16-23,528-535
4 [ 30.160646] Booting Node 3, Processors 24-31,536-543
...
62 [ 75.013459] Booting Node 61, Processors 488-495,1000-1007
63 [ 75.789663] Booting Node 62, Processors 496-503,1008-1015
64 [ 76.565430] Booting Node 63, Processors 504-511,1016-1023
65 [ 126.860204] Brought up 1024 CPUs
66 [ 126.865392] Summary Processor Information for CPUS: 0-143,256-383,448-655,768-895,960-1023
67 [ 126.876033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
68 [ 126.881404] BogoMIPS: MIN 3980.53 MAX 4268.28 AVG 4265.85
69 [ 126.888032] Loops/Jiffy: MIN 7961074 MAX 8536570 AVG 8531701
70 [ 126.896875] Summary Processor Information for CPUS: 144-239,384-447,656-751,896-959
71 [ 126.904032] Intel(R) Xeon(R) CPU X7560 @ 2.27GHz stepping 05
72 [ 126.913404] BogoMIPS: MIN 4528.51 MAX 4762.75 AVG 4535.89
73 [ 126.920032] Loops/Jiffy: MIN 9057030 MAX 9525505 AVG 9071785
74 [ 126.924217] Summary Processor Information for CPUS: 240-255,752-767
75 [ 126.932032] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 03
76 [ 126.937404] BogoMIPS: MIN 4267.31 MAX 4268.24 AVG 4267.66
77 [ 126.944032] Loops/Jiffy: MIN 8534632 MAX 8536490 AVG 8535338
78 [ 126.952425] Total of 1024 processors activated (4454702.72 BogoMIPS).
By CPU:
1 [ 28.010255] Booting Processor 1 APIC 0x2 ip 0x6000
2 [ 28.106191] Booting Processor 2 APIC 0x4 ip 0x6000
3 [ 28.204705] Booting Processor 3 APIC 0x6 ip 0x6000
4 [ 28.300709] Booting Processor 4 APIC 0x10 ip 0x6000
5 [ 28.400707] Booting Processor 5 APIC 0x12 ip 0x6000
...
1020 [ 131.440341] Booting Processor 1020 APIC 0x7f1 ip 0x6000
1021 [ 131.544366] Booting Processor 1021 APIC 0x7f3 ip 0x6000
1022 [ 131.648354] Booting Processor 1022 APIC 0x7f5 ip 0x6000
1023 [ 131.752350] Booting Processor 1023 APIC 0x7f7 ip 0x6000
1024 [ 131.852202] Brought up 1024 CPUs
1025 [ 131.857394] Summary Processor Information for CPUS: 0-143,256-383,448-655,768-895,960-1023
1026 [ 131.868033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
1027 [ 131.873404] BogoMIPS: MIN 3996.11 MAX 4555.27 AVG 4266.18
1028 [ 131.880033] Loops/Jiffy: MIN 7992230 MAX 9110559 AVG 8532371
1029 [ 131.888871] Summary Processor Information for CPUS: 144-239,384-447,656-751,896-959
1030 [ 131.896032] Intel(R) Xeon(R) CPU X7560 @ 2.27GHz stepping 05
1031 [ 131.905405] BogoMIPS: MIN 4252.28 MAX 4819.44 AVG 4535.46
1032 [ 131.912032] Loops/Jiffy: MIN 8504574 MAX 9638886 AVG 9070920
1033 [ 131.916218] Summary Processor Information for CPUS: 240-255,752-767
1034 [ 131.924032] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 03
1035 [ 131.929404] BogoMIPS: MIN 4267.15 MAX 4268.35 AVG 4267.67
1036 [ 131.936032] Loops/Jiffy: MIN 8534307 MAX 8536711 AVG 8535357
1037 [ 131.944424] Total of 1024 processors activated (4454789.68 BogoMIPS).
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-04 0:22 ` Mike Travis
@ 2009-11-04 10:24 ` Ingo Molnar
0 siblings, 0 replies; 109+ messages in thread
From: Ingo Molnar @ 2009-11-04 10:24 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Thomas Gleixner, Andrew Morton, Heiko Carstens,
Roland Dreier, Randy Dunlap, Tejun Heo, Greg Kroah-Hartman,
Yinghai Lu, H. Peter Anvin, David Rientjes, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
* Mike Travis <travis@sgi.com> wrote:
>
> ...
>>> So please go with the simple solution i suggested days ago: print
>>> stuff on the boot CPU but after that only a single line per AP CPU.
>>>
>>> Ingo
>>
>
> Hi Ingo,
>
> Here is some timing info I collected... Would you accept the first
> format (line per node) as a compromise?
>
> (Note there will be 256 node systems w/4096 threads as well as
> 512 node systems w/HT disabled thus still 4096 threads.)
>
> Btw, I think we should keep the processor summary as it does show some
> useful information. Agreed?
>
> (... I'll clean up the BogoMIPS line a bit:
>
> BogoMIPS(lpj): MIN xxx (yyy) AVG xxx (yyy) MAX xxx (yyy)
> ...)
>
> Thanks,
> Mike
>
> 64 Nodes/512 cores/1024 threads...
>
> By Node:
> 1 [ 27.998414] Booting Node 0, Processors 1-7,512-519
> 2 [ 28.645066] Booting Node 1, Processors 8-15,520-527
> 3 [ 29.389359] Booting Node 2, Processors 16-23,528-535
> 4 [ 30.160646] Booting Node 3, Processors 24-31,536-543
> ...
> 62 [ 75.013459] Booting Node 61, Processors 488-495,1000-1007
> 63 [ 75.789663] Booting Node 62, Processors 496-503,1008-1015
> 64 [ 76.565430] Booting Node 63, Processors 504-511,1016-1023
> 65 [ 126.860204] Brought up 1024 CPUs
Yeah, this portion certainly looks good. The important thing is to make
this the default - i.e. we dont want some weird switch (or other hw
dependent flaggery) to turn on two styles of bootup output.
We have trouble keeping a single variant sane already, we definitely
dont want to have multiple variants.
> 66 [ 126.865392] Summary Processor Information for CPUS: 0-143,256-383,448-655,768-895,960-1023
> 67 [ 126.876033] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 04
> 68 [ 126.881404] BogoMIPS: MIN 3980.53 MAX 4268.28 AVG 4265.85
> 69 [ 126.888032] Loops/Jiffy: MIN 7961074 MAX 8536570 AVG 8531701
> 70 [ 126.896875] Summary Processor Information for CPUS: 144-239,384-447,656-751,896-959
> 71 [ 126.904032] Intel(R) Xeon(R) CPU X7560 @ 2.27GHz stepping 05
> 72 [ 126.913404] BogoMIPS: MIN 4528.51 MAX 4762.75 AVG 4535.89
> 73 [ 126.920032] Loops/Jiffy: MIN 9057030 MAX 9525505 AVG 9071785
> 74 [ 126.924217] Summary Processor Information for CPUS: 240-255,752-767
> 75 [ 126.932032] Genuine Intel(R) CPU 0000 @ 2.13GHz stepping 03
> 76 [ 126.937404] BogoMIPS: MIN 4267.31 MAX 4268.24 AVG 4267.66
> 77 [ 126.944032] Loops/Jiffy: MIN 8534632 MAX 8536490 AVG 8535338
> 78 [ 126.952425] Total of 1024 processors activated (4454702.72 BogoMIPS).
4.4 million bogomips. Nice ;-)
>
> By CPU:
> 1 [ 28.010255] Booting Processor 1 APIC 0x2 ip 0x6000
> 2 [ 28.106191] Booting Processor 2 APIC 0x4 ip 0x6000
> 3 [ 28.204705] Booting Processor 3 APIC 0x6 ip 0x6000
> 4 [ 28.300709] Booting Processor 4 APIC 0x10 ip 0x6000
> 5 [ 28.400707] Booting Processor 5 APIC 0x12 ip 0x6000
> ...
> 1020 [ 131.440341] Booting Processor 1020 APIC 0x7f1 ip 0x6000
> 1021 [ 131.544366] Booting Processor 1021 APIC 0x7f3 ip 0x6000
> 1022 [ 131.648354] Booting Processor 1022 APIC 0x7f5 ip 0x6000
> 1023 [ 131.752350] Booting Processor 1023 APIC 0x7f7 ip 0x6000
> 1024 [ 131.852202] Brought up 1024 CPUs
i'd suggest some cleanups to this single line output.
The 'ip' printout is very lame and comes from ancient times, from our
first attempts to boot Linux in SMP mode on dual 100 MHz pentia ;-)
Something like:
Booting CPU #1020 ... ok.
Would be more than enough. Any other info can be made a debug boot flag,
dependent on apic=verbose.
Please send a new iteration of the arch/x86 (and scheduler/timer)
patches in a separate series instead of mixed into this thread.
Thanks,
Ingo
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-02 20:32 ` Mike Travis
2009-11-04 0:22 ` Mike Travis
@ 2009-11-04 10:31 ` Ingo Molnar
1 sibling, 0 replies; 109+ messages in thread
From: Ingo Molnar @ 2009-11-04 10:31 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Thomas Gleixner, Andrew Morton, Heiko Carstens,
Roland Dreier, Randy Dunlap, Tejun Heo, Greg Kroah-Hartman,
Yinghai Lu, H. Peter Anvin, David Rientjes, Steven Rostedt,
Rusty Russell, Hidetoshi Seto, Jack Steiner, Frederic Weisbecker,
x86, Linux Kernel
* Mike Travis <travis@sgi.com> wrote:
>
>
> Ingo Molnar wrote:
>> * Mike Travis <travis@sgi.com> wrote:
>>
>>>
>>> Andi Kleen wrote:
>>>> Mike Travis wrote:
>>>>> This set of patches limits the number of repetitious messages
>>>>> which contain
>>>>> no additional information. Much of this information is
>>>>> obtainable from the
>>>>> /proc and /sysfs. Most of the messages are also sent to the kernel log
>>>>> buffer as KERN_DEBUG messages so it can be used to examine more
>>>>> closely any
>>>>> details specific to a processor.
>>>> What would be good is to put the information from the booting CPUs
>>>> into some buffer and print it visibly if there's a timeout detected
>>>> on the BP.
>>> What do you think of this idea.... Add a "mark kernel log buffer"
>>> function, and then if any KERN_NOTICE or above happens, it sends the
>>> marked info from the kernel log buffer to the console before the
>>> current message. Set the marker to '0' to clear.
>>
>> That's _way_ too complex really, for little benefit. (If there's a boot
>> hang people will re-try anyway (and this time with a serial console
>> attached or so), and they can add various boot options to increase
>> verbosity - depending on which phase the bootup hung.)
>
> I'm ok with this, though generally speaking large server systems have
> serial consoles attached, and save the output into admin logs. [...]
Typically yes, but not necessarily during basic system bringup, which is
when most of the hangs/problems are found.
> [...] One problem with just setting the loglevel high enough to
> output debug messages, is you get literally 100's of thousands of
> lines of meaningless information. We waited over 8 hours for a system
> with 2k cpus to boot in debug mode, and it never made it all the way
> up.
>
> My intention for the above was to attempt to print debug information
> that pertains to the failure, and not everything else.
We want a noise-free default bootup, and printks (on the boot cpu) in
case of failures.
_that_ abnormal-event printout can then be sufficiently verbose.
>> So please go with the simple solution i suggested days ago: print
>> stuff on the boot CPU but after that only a single line per AP CPU.
>
> So you think printing 4096 lines provides meaningful additional
> information? I would think at least compress it so you only print
> when each new processor socket boots, and not for the 16 threads
> each of them has?
>
> I should have timing information soon for 512 cores/1024 threads and
> printing a single line for each of those will significantly increase
> the time it takes to boot.
Feel free to compress it further. What i was objecting to was the
increased complexity of 'buffering' messages somehow and printing them
conditionally.
Ingo
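The per-socket compression discussed above can be sketched in userspace C
as follows. This is only an illustration, not code from any posted patch:
the function name print_boot_summary(), the socket_of_cpu[] array and the
message text are all hypothetical. With 16 threads per socket, 4096 threads
collapse to 4096 / 16 = 256 lines.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Emit one line per socket instead of one line per thread: a new line is
 * started only when the socket id changes between consecutive CPUs.
 * socket_of_cpu[] (hypothetical) maps a CPU number to its socket id.
 * Returns the number of lines written.
 */
int print_boot_summary(FILE *out, const int *socket_of_cpu, int ncpus)
{
	int lines = 0;
	int cpu;

	assert(socket_of_cpu != NULL || ncpus == 0);

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (cpu == 0 || socket_of_cpu[cpu] != socket_of_cpu[cpu - 1]) {
			if (cpu)
				fputc('\n', out);	/* finish the previous socket */
			fprintf(out, "Booting socket %d, threads:",
				socket_of_cpu[cpu]);
			lines++;
		}
		fprintf(out, " #%d", cpu);
	}
	if (ncpus > 0)
		fputc('\n', out);

	return lines;
}
```

Nothing is buffered for conditional printing later: each thread appends to
the current line as it comes up, which keeps this close to the "simple
solution" asked for in the thread.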
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-10-28 4:53 ` [patch v2] " David Rientjes
2009-10-28 5:19 ` Andi Kleen
@ 2009-11-10 21:08 ` David Rientjes
2009-11-10 21:33 ` Ingo Molnar
1 sibling, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-11-10 21:08 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mike Travis, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
On Tue, 27 Oct 2009, David Rientjes wrote:
> x86: reduce srat verbosity in the kernel log
>
> It's possible to reduce the number of SRAT messages emitted to the kernel
> log by printing each valid pxm once and then creating bitmaps to represent
> the apic ids that map to the same node.
>
> This reduces lines such as
>
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
>
> to
>
> SRAT: PXM 0 -> APIC {0-1} -> Node 0
> SRAT: PXM 1 -> APIC {2-3} -> Node 1
>
> The buffer used to store the apic id list is 128 characters in length.
> If that is too small to represent all the apic id ranges that are bound
> to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
> manually increased for such configurations.
>
> Acked-by: Mike Travis <travis@sgi.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
Ingo, have you had a chance to look at merging this yet?
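For illustration, the range compaction described in the quoted changelog
can be sketched in userspace C. This is not the code from the patch: the
function name format_id_ranges() and the byte-per-id input array are
assumptions made for the example, and only APICID_LIST_LEN and the trailing
"..." behaviour come from the changelog. The sketch produces the text that
would go between the braces, e.g. "0-1" for "APIC {0-1}".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define APICID_LIST_LEN 128	/* buffer size named in the changelog */

/*
 * Format the ids marked in ids[0..n-1] (non-zero means "this apic id maps
 * to the pxm") as a comma-separated list of ranges, e.g. "0-1,4".
 * If the list does not fit in buf, it is cut short and ends in "...".
 * The check is conservative: 3 bytes are always held back for the "...".
 */
void format_id_ranges(const unsigned char *ids, int n, char *buf, size_t len)
{
	size_t used = 0;
	int i = 0;

	assert(len >= 4);	/* room for "..." plus the terminator */
	buf[0] = '\0';

	while (i < n) {
		char chunk[32];
		int start, end, w;

		if (!ids[i]) {
			i++;
			continue;
		}

		/* Extend the run while the next id is also set. */
		start = i;
		while (i + 1 < n && ids[i + 1])
			i++;
		end = i++;

		if (start == end)
			w = snprintf(chunk, sizeof(chunk), "%s%d",
				     used ? "," : "", start);
		else
			w = snprintf(chunk, sizeof(chunk), "%s%d-%d",
				     used ? "," : "", start, end);

		if (used + (size_t)w + 3 >= len) {
			strcpy(buf + used, "...");
			return;
		}
		memcpy(buf + used, chunk, (size_t)w + 1);
		used += (size_t)w;
	}
}
```

A caller would pass a buffer of APICID_LIST_LEN bytes and print the result
once per pxm, so the number of lines depends on the number of pxms rather
than the number of apic ids.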
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-11-10 21:08 ` David Rientjes
@ 2009-11-10 21:33 ` Ingo Molnar
2009-11-10 21:42 ` Yinghai Lu
` (2 more replies)
0 siblings, 3 replies; 109+ messages in thread
From: Ingo Molnar @ 2009-11-10 21:33 UTC (permalink / raw)
To: David Rientjes
Cc: Mike Travis, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
* David Rientjes <rientjes@google.com> wrote:
> On Tue, 27 Oct 2009, David Rientjes wrote:
>
> > x86: reduce srat verbosity in the kernel log
> >
> > It's possible to reduce the number of SRAT messages emitted to the kernel
> > log by printing each valid pxm once and then creating bitmaps to represent
> > the apic ids that map to the same node.
> >
> > This reduces lines such as
> >
> > SRAT: PXM 0 -> APIC 0 -> Node 0
> > SRAT: PXM 0 -> APIC 1 -> Node 0
> > SRAT: PXM 1 -> APIC 2 -> Node 1
> > SRAT: PXM 1 -> APIC 3 -> Node 1
> >
> > to
> >
> > SRAT: PXM 0 -> APIC {0-1} -> Node 0
> > SRAT: PXM 1 -> APIC {2-3} -> Node 1
> >
> > The buffer used to store the apic id list is 128 characters in length.
> > If that is too small to represent all the apic id ranges that are bound
> > to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
> > manually increased for such configurations.
> >
> > Acked-by: Mike Travis <travis@sgi.com>
> > Signed-off-by: David Rientjes <rientjes@google.com>
>
> Ingo, have you had a chance to look at merging this yet?
I'm waiting for Mike to test them (and other patches) and send a new
series out with bits to pick up.
But i really don't like this type of buffering - in the past it has tended
to be problematic. Why print this info at all in the default bootup?
It's not needed on a correctly functioning system.
For failure analysis make it available as an opt-in via a boot parameter (if
it's needed for bootup analysis) - but otherwise just don't print it.
Ingo
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-11-10 21:33 ` Ingo Molnar
@ 2009-11-10 21:42 ` Yinghai Lu
2009-11-10 21:57 ` Ingo Molnar
2009-11-10 23:09 ` Mike Travis
2009-11-12 20:56 ` David Rientjes
2 siblings, 1 reply; 109+ messages in thread
From: Yinghai Lu @ 2009-11-10 21:42 UTC (permalink / raw)
To: Ingo Molnar
Cc: David Rientjes, Mike Travis, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
Ingo Molnar wrote:
> * David Rientjes <rientjes@google.com> wrote:
>
>> On Tue, 27 Oct 2009, David Rientjes wrote:
>>
>>> x86: reduce srat verbosity in the kernel log
>>>
>>> It's possible to reduce the number of SRAT messages emitted to the kernel
>>> log by printing each valid pxm once and then creating bitmaps to represent
>>> the apic ids that map to the same node.
>>>
>>> This reduces lines such as
>>>
>>> SRAT: PXM 0 -> APIC 0 -> Node 0
>>> SRAT: PXM 0 -> APIC 1 -> Node 0
>>> SRAT: PXM 1 -> APIC 2 -> Node 1
>>> SRAT: PXM 1 -> APIC 3 -> Node 1
>>>
>>> to
>>>
>>> SRAT: PXM 0 -> APIC {0-1} -> Node 0
>>> SRAT: PXM 1 -> APIC {2-3} -> Node 1
>>>
>>> The buffer used to store the apic id list is 128 characters in length.
>>> If that is too small to represent all the apic id ranges that are bound
>>> to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
>>> manually increased for such configurations.
>>>
>>> Acked-by: Mike Travis <travis@sgi.com>
>>> Signed-off-by: David Rientjes <rientjes@google.com>
>> Ingo, have you had a chance to look at merging this yet?
>
> I'm waiting for Mike to test them (and other patches) and send a new
> series out with bits to pick up.
>
> But i really don't like this type of buffering - in the past it has tended
> to be problematic. Why print this info at all in the default bootup?
> It's not needed on a correctly functioning system.
>
> For failure analysis make it available as an opt-in via a boot parameter (if
> it's needed for bootup analysis) - but otherwise just don't print it.
>
make them depend on apic=debug or apic=verbose?
YH
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-11-10 21:42 ` Yinghai Lu
@ 2009-11-10 21:57 ` Ingo Molnar
0 siblings, 0 replies; 109+ messages in thread
From: Ingo Molnar @ 2009-11-10 21:57 UTC (permalink / raw)
To: Yinghai Lu
Cc: David Rientjes, Mike Travis, Thomas Gleixner, Andrew Morton,
Jack Steiner, H. Peter Anvin, x86, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
* Yinghai Lu <yinghai@kernel.org> wrote:
> Ingo Molnar wrote:
> > * David Rientjes <rientjes@google.com> wrote:
> >
> >> On Tue, 27 Oct 2009, David Rientjes wrote:
> >>
> >>> x86: reduce srat verbosity in the kernel log
> >>>
> >>> It's possible to reduce the number of SRAT messages emitted to the kernel
> >>> log by printing each valid pxm once and then creating bitmaps to represent
> >>> the apic ids that map to the same node.
> >>>
> >>> This reduces lines such as
> >>>
> >>> SRAT: PXM 0 -> APIC 0 -> Node 0
> >>> SRAT: PXM 0 -> APIC 1 -> Node 0
> >>> SRAT: PXM 1 -> APIC 2 -> Node 1
> >>> SRAT: PXM 1 -> APIC 3 -> Node 1
> >>>
> >>> to
> >>>
> >>> SRAT: PXM 0 -> APIC {0-1} -> Node 0
> >>> SRAT: PXM 1 -> APIC {2-3} -> Node 1
> >>>
> >>> The buffer used to store the apic id list is 128 characters in length.
> >>> If that is too small to represent all the apic id ranges that are bound
> >>> to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
> >>> manually increased for such configurations.
> >>>
> >>> Acked-by: Mike Travis <travis@sgi.com>
> >>> Signed-off-by: David Rientjes <rientjes@google.com>
> >> Ingo, have you had a chance to look at merging this yet?
> >
> > I'm waiting for Mike to test them (and other patches) and send a new
> > series out with bits to pick up.
> >
> > But i really don't like this type of buffering - in the past it has tended
> > to be problematic. Why print this info at all in the default bootup?
> > It's not needed on a correctly functioning system.
> >
> > For failure analysis make it available as an opt-in via a boot parameter (if
> > it's needed for bootup analysis) - but otherwise just don't print it.
> >
> make them depend on apic=debug or apic=verbose?
Yeah - i'd definitely suggest not splintering the boot flag space too
much - users won't know what to enable in case of trouble.
But that is a detail (and it can be improved later on as well).
Ingo
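As a rough sketch of the opt-in approach agreed on above, messages can be
tagged with a verbosity level and printed only when a boot option has
raised the level. The names below (boot_verbosity, parse_verbosity(),
boot_printf() and the VERB_* levels) are hypothetical userspace stand-ins
for illustration; they are not taken from any patch in this thread.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum verbosity { VERB_QUIET = 0, VERB_VERBOSE = 1, VERB_DEBUG = 2 };

/* Default: quiet bootup, only unconditional messages are printed. */
static enum verbosity boot_verbosity = VERB_QUIET;

/* Parse the value of an "apic=debug" / "apic=verbose" style option. */
int parse_verbosity(const char *arg)
{
	if (strcmp(arg, "debug") == 0)
		boot_verbosity = VERB_DEBUG;
	else if (strcmp(arg, "verbose") == 0)
		boot_verbosity = VERB_VERBOSE;
	else
		return -1;	/* unknown value: leave the level unchanged */
	return 0;
}

/*
 * Print the message only if its level has been enabled.
 * Returns the number of characters printed, or 0 if it was suppressed.
 */
int boot_printf(enum verbosity level, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);

	if (level > boot_verbosity)
		return 0;

	va_start(ap, fmt);
	n = vprintf(fmt, ap);
	va_end(ap);
	return n;
}
```

Reusing one existing option for all such messages, rather than adding a new
flag per subsystem, matches the concern about splintering the boot flag
space.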
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-11-10 21:33 ` Ingo Molnar
2009-11-10 21:42 ` Yinghai Lu
@ 2009-11-10 23:09 ` Mike Travis
2009-11-12 20:56 ` David Rientjes
2 siblings, 0 replies; 109+ messages in thread
From: Mike Travis @ 2009-11-10 23:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: David Rientjes, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
Ingo Molnar wrote:
> * David Rientjes <rientjes@google.com> wrote:
>
>> On Tue, 27 Oct 2009, David Rientjes wrote:
>>
>>> x86: reduce srat verbosity in the kernel log
>>>
>>> It's possible to reduce the number of SRAT messages emitted to the kernel
>>> log by printing each valid pxm once and then creating bitmaps to represent
>>> the apic ids that map to the same node.
>>>
>>> This reduces lines such as
>>>
>>> SRAT: PXM 0 -> APIC 0 -> Node 0
>>> SRAT: PXM 0 -> APIC 1 -> Node 0
>>> SRAT: PXM 1 -> APIC 2 -> Node 1
>>> SRAT: PXM 1 -> APIC 3 -> Node 1
>>>
>>> to
>>>
>>> SRAT: PXM 0 -> APIC {0-1} -> Node 0
>>> SRAT: PXM 1 -> APIC {2-3} -> Node 1
>>>
>>> The buffer used to store the apic id list is 128 characters in length.
>>> If that is too small to represent all the apic id ranges that are bound
>>> to a single pxm, a trailing "..." is added. APICID_LIST_LEN should be
>>> manually increased for such configurations.
>>>
>>> Acked-by: Mike Travis <travis@sgi.com>
>>> Signed-off-by: David Rientjes <rientjes@google.com>
>> Ingo, have you had a chance to look at merging this yet?
>
> I'm waiting for Mike to test them (and other patches) and send a new
> series out with bits to pick up.
>
> But i really don't like this type of buffering - in the past it has tended
> to be problematic. Why print this info at all in the default bootup?
> It's not needed on a correctly functioning system.
>
> For failure analysis make it available as an opt-in via a boot parameter (if
> it's needed for bootup analysis) - but otherwise just don't print it.
>
> Ingo
Hi,
Sorry, it's been time consuming getting this checked out as our test
systems are much more in demand right now (SC09 is here.)
I'm very close to submitting another version, just picking up
everyone's comments now. One more test run this afternoon and
I should be able to submit the patches. I believe I've got a
good compromise between informative messages and compactness,
without any additional overhead.
I've also tested David's patch in every run and it hasn't shown any
problems at all. (In fact, it still works flawlessly after a recent merge
of ACPI 4.0 code.)
Thanks,
Mike
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-11-10 21:33 ` Ingo Molnar
2009-11-10 21:42 ` Yinghai Lu
2009-11-10 23:09 ` Mike Travis
@ 2009-11-12 20:56 ` David Rientjes
2009-11-12 21:14 ` Mike Travis
2 siblings, 1 reply; 109+ messages in thread
From: David Rientjes @ 2009-11-12 20:56 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mike Travis, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
On Tue, 10 Nov 2009, Ingo Molnar wrote:
> I'm waiting for Mike to test them (and other patches) and send a new
> series out with bits to pick up.
>
Mike posted his series today without including my patch, so I've replied
to it.
> But i really don't like this type of buffering - in the past it has tended
> to be problematic.
I'm not sure that I'd call it buffering when iterating through all apic
ids and setting appropriate bits in a bitmap when they map to a node id.
It's apparently not been problematic on my machines, on Mike's
machines, or in his merge with ACPI 4.0 code. I think the code is pretty
straightforward.
> Why print this info at all in the default bootup?
> It's not needed on a correctly functioning system.
>
We have no other export of the apic id to node mappings in the kernel.
We already show each pxm's address range, each node's address range, and
the pxm to node map. The only other way to map apic ids to nodes is by
looking for the lines "CPU 0/0 -> Node 0," which I believe are being
removed.
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-11-12 20:56 ` David Rientjes
@ 2009-11-12 21:14 ` Mike Travis
2009-11-12 21:20 ` David Rientjes
0 siblings, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-11-12 21:14 UTC (permalink / raw)
To: David Rientjes
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
David Rientjes wrote:
> On Tue, 10 Nov 2009, Ingo Molnar wrote:
>
>> I'm waiting for Mike to test them (and other patches) and send a new
>> series out with bits to pick up.
>>
>
> Mike posted his series today without including my patch, so I've replied
> to it.
Sorry, I wasn't aware I should have.
>
>> But i really don't like this type of buffering - in the past it has tended
>> to be problematic.
>
> I'm not sure that I'd call it buffering when iterating through all apic
> ids and setting appropriate bits in a bitmap when they map to a node id.
> It's apparently not been problematic on my machines, on Mike's
> machines, or in his merge with ACPI 4.0 code. I think the code is pretty
> straightforward.
>
>> Why print this info at all in the default bootup?
>> It's not needed on a correctly functioning system.
>>
>
> We have no other export of the apic id to node mappings in the kernel.
> We already show each pxm's address range, each node's address range, and
> the pxm to node map. The only other way to map apic ids to nodes is by
> looking for the lines "CPU 0/0 -> Node 0," which I believe are being
> removed.
The bootup messages in my patch 1/7 list nodes and their processors as each
boots. And this is easily found under /sys.
Also, I think in general that all the apic messages, unless they represent
"system boot progress" should be displayed only when asked for, like with
apic=debug or verbose? Something more like:
BIOS-provided physical RAM map processed.
EFI: memory allocated.
SRAT: table interpreted.
Bootmem setups complete.
ACPI: APIC's enabled.
PM: Registered all nosave memory.
Removing the above tables removes about 3400 lines of console output on a 1k
thread machine. There are 20,000+ lines of output before you get to the
login prompt (even with the removal of cpu bootup messages).
But you are right, the apic info should be available via sysfs or procfs.
The next BIG output is from devices. Listing all the pci busses available
is overkill as that info is also readily available when the system is
running.
* Re: [patch v2] x86: reduce srat verbosity in the kernel log
2009-11-12 21:14 ` Mike Travis
@ 2009-11-12 21:20 ` David Rientjes
0 siblings, 0 replies; 109+ messages in thread
From: David Rientjes @ 2009-11-12 21:20 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, Jack Steiner,
H. Peter Anvin, x86, Yinghai Lu, Mel Gorman, linux-kernel,
linux-acpi, Andi Kleen
On Thu, 12 Nov 2009, Mike Travis wrote:
> Also, I think in general that all the apic messages, unless they represent
> "system boot progress" should be displayed only when asked for, like with
> apic=debug or verbose? Something more like:
>
That's outside the scope of my patch. My patch does what the title says,
it reduces srat verbosity in the kernel log. If an additional change
would like to suppress that output with a kernel parameter, that's fine,
but it's an additional change and not what I was addressing.
When posting a patchset like this where all patches are related for a
common goal and one patch (mine) was proposed during the development of
the set, it's normal to include that patch in future postings with proper
attribution given by indicating an author other than yourself in the very
first line of the email:
From: David Rientjes <rientjes@google.com>
and retaining your acked-by line, my signed-off-by line, and then adding
your own signed-off-by line.
If a subsequent patch were to suppress this for kernels not using a
certain parameter, I certainly wouldn't object to it.
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-02 19:21 ` Mike Travis
2009-11-02 19:34 ` Ingo Molnar
@ 2009-11-12 22:22 ` Dave Jones
2009-11-12 22:57 ` H. Peter Anvin
1 sibling, 1 reply; 109+ messages in thread
From: Dave Jones @ 2009-11-12 22:22 UTC (permalink / raw)
To: Mike Travis
Cc: Andi Kleen, Ingo Molnar, Thomas Gleixner, Andrew Morton,
Heiko Carstens, Roland Dreier, Randy Dunlap, Tejun Heo,
Greg Kroah-Hartman, Yinghai Lu, H. Peter Anvin, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
On Mon, Nov 02, 2009 at 11:21:39AM -0800, Mike Travis wrote:
> >> [ 90.981402] CPU: L1 I cache: 32K, L1 D cache: 32K
> >> [ 90.985888] CPU: L2 cache: 256K
> >> [ 90.988032] CPU: L3 cache: 24576K
> >
> > I would recommend to drop the cache information; this can be easily
> > gotten at runtime and is often implied in the CPU name anyways
> > (and especially L1 and increasingly L2 too change only very rarely)
>
> Ok, though because of future system upgrades to a UV system, you can
> end up with slightly different processors (of the same family). The
> only difference I've detected so far in testing is that the stepping has
> changed.
I happened to be annoyed by dozens of these three printk's earlier,
and hacked up the following (currently untested) patch.
But I don't disagree with Andi either, that it's not particularly useful,
and we can get all this from userspace in /proc/cpuinfo, or x86info.
If someone still finds it valuable to have the kernel keep printing it
though, perhaps something like the following ?
Dave
On processors with a large number of cores, we print dozens of lines of information about
the CPU cache topology, most of which is unnecessary.
This patch reduces the spew a lot (down to a single line unless someone uses a mix of processors
with different cache sizes).
- Check if the total cache size on APs is equal to the boot processor's cache size.
Print nothing if equal.
- The three printk's will fit on one line.
Signed-off-by: Dave Jones <davej@redhat.com>
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 804c40e..cc4f44d 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -358,9 +362,9 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
#ifdef CONFIG_X86_HT
unsigned int cpu = c->cpu_index;
#endif
+ static int is_initialized;
if (c->cpuid_level > 3) {
- static int is_initialized;
if (is_initialized == 0) {
/* Init num_cache_leaves from boot CPU */
@@ -488,6 +492,21 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
#endif
}
+ c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
+
+ /*
+ * cache topology on all AP's is likely equal to that of the BP
+ * if this is the case, don't bother printing anything out for the AP's.
+ */
+ if (is_initialized != 0) {
+ if (c->x86_cache_size == boot_cpu_data.x86_cache_size)
+ return l2;
+ else
+ printk(KERN_INFO "CPU: AP has different cache size (%d) to BP (%d)\n",
+ c->x86_cache_size,
+ boot_cpu_data.x86_cache_size);
+ }
+
if (trace)
printk(KERN_INFO "CPU: Trace cache: %dK uops", trace);
else if (l1i)
@@ -495,16 +512,12 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
if (l1d)
printk(KERN_CONT ", L1 D cache: %dK\n", l1d);
- else
- printk(KERN_CONT "\n");
if (l2)
- printk(KERN_INFO "CPU: L2 cache: %dK\n", l2);
+ printk(KERN_CONT ", L2 cache: %dK\n", l2);
if (l3)
- printk(KERN_INFO "CPU: L3 cache: %dK\n", l3);
-
- c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
+ printk(KERN_CONT ", L3 cache: %dK\n", l3);
return l2;
}
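The idea behind the patch above (stay silent for any AP whose cache size
matches the boot processor, and put the sizes on one line otherwise) can be
sketched in userspace C. This is an illustration under assumptions, not the
patch itself: struct cache_sizes and format_cache_line() are hypothetical
names, and only the x86_cache_size selection expression is taken from the
diff.

```c
#include <assert.h>
#include <stdio.h>

/* Cache sizes in KB, as detected for one CPU (hypothetical container). */
struct cache_sizes {
	unsigned int l1i, l1d, l2, l3;
};

/* Same selection as in the diff: the outermost cache level present. */
unsigned int summary_cache_size(const struct cache_sizes *c)
{
	return c->l3 ? c->l3 : (c->l2 ? c->l2 : (c->l1i + c->l1d));
}

/*
 * Write a one-line cache summary for the boot CPU, or for an AP whose
 * summary size differs from the boot CPU's (bp_size).
 * Returns 1 if a line was written to out, 0 if the CPU was skipped.
 */
int format_cache_line(const struct cache_sizes *c, int is_boot_cpu,
		      unsigned int bp_size, char *out, size_t len)
{
	assert(out != NULL && len > 0);

	if (!is_boot_cpu && summary_cache_size(c) == bp_size)
		return 0;	/* AP matches the BP: print nothing */

	snprintf(out, len,
		 "CPU: L1 I cache: %uK, L1 D cache: %uK, L2 cache: %uK, L3 cache: %uK",
		 c->l1i, c->l1d, c->l2, c->l3);
	return 1;
}
```

Because the comparison uses only the summary value, two processors with the
same L3 size but different L1 or L2 sizes would be treated as identical;
that is a property of the selection expression, not something the sketch
adds.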
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-12 22:22 ` Dave Jones
@ 2009-11-12 22:57 ` H. Peter Anvin
2009-11-12 23:15 ` Dave Jones
2009-11-13 16:10 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
0 siblings, 2 replies; 109+ messages in thread
From: H. Peter Anvin @ 2009-11-12 22:57 UTC (permalink / raw)
To: Dave Jones, Mike Travis, Andi Kleen, Ingo Molnar,
Thomas Gleixner, Andrew Morton, Heiko Carstens, Roland Dreier,
Randy Dunlap, Tejun Heo, Greg Kroah-Hartman, Yinghai Lu,
David Rientjes, Steven Rostedt, Rusty Russell, Hidetoshi Seto,
Jack Steiner, Frederic Weisbecker, x86, Linux Kernel
On 11/12/2009 02:22 PM, Dave Jones wrote:
>
> But I don't disagree with Andi either, that it's not particularly useful,
> and we can get all this from userspace in /proc/cpuinfo, or x86info.
>
I personally don't think it's useful at all. It gives information about
the processor which can be obtained from other sources. What we want is
enough information that the CPU can be unambiguously identified, so that
when someone posts dmesg we can tell what machine they came from.
-hpa
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-12 22:57 ` H. Peter Anvin
@ 2009-11-12 23:15 ` Dave Jones
2009-11-13 8:03 ` Ingo Molnar
` (2 more replies)
2009-11-13 16:10 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
1 sibling, 3 replies; 109+ messages in thread
From: Dave Jones @ 2009-11-12 23:15 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Mike Travis, Andi Kleen, Ingo Molnar, Thomas Gleixner,
Andrew Morton, Heiko Carstens, Roland Dreier, Randy Dunlap,
Tejun Heo, Greg Kroah-Hartman, Yinghai Lu, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
On Thu, Nov 12, 2009 at 02:57:33PM -0800, H. Peter Anvin wrote:
> On 11/12/2009 02:22 PM, Dave Jones wrote:
> >
> > But I don't disagree with Andi either, that it's not particularly useful,
> > and we can get all this from userspace in /proc/cpuinfo, or x86info.
> >
>
> I personally don't think it's useful at all. It gives information about
> the processor which can be obtained from other sources. What we want is
> enough information that the CPU can be unambiguously identified, so that
> when someone posts dmesg we can tell what machine they came from.
In which case..
Dave
---
Remove the CPU cache size printk's.
They aren't useful, and pollute the dmesg output a lot (especially on machines with many cores).
Also the same information can be trivially found out from userspace.
Signed-off-by: Dave Jones <davej@redhat.com>
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 804c40e..868fcdd 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -488,22 +493,6 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
#endif
}
- if (trace)
- printk(KERN_INFO "CPU: Trace cache: %dK uops", trace);
- else if (l1i)
- printk(KERN_INFO "CPU: L1 I cache: %dK", l1i);
-
- if (l1d)
- printk(KERN_CONT ", L1 D cache: %dK\n", l1d);
- else
- printk(KERN_CONT "\n");
-
- if (l2)
- printk(KERN_INFO "CPU: L2 cache: %dK\n", l2);
-
- if (l3)
- printk(KERN_INFO "CPU: L3 cache: %dK\n", l3);
-
c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
return l2;
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-12 23:15 ` Dave Jones
@ 2009-11-13 8:03 ` Ingo Molnar
2009-11-13 8:11 ` H. Peter Anvin
2009-11-13 8:18 ` [tip:x86/debug] x86: Remove the CPU cache size printk's tip-bot for Dave Jones
2009-11-13 22:38 ` [PATCH] x86: Remove CPU cache size output for non-Intel too Roland Dreier
2 siblings, 1 reply; 109+ messages in thread
From: Ingo Molnar @ 2009-11-13 8:03 UTC (permalink / raw)
To: Dave Jones, H. Peter Anvin, Mike Travis, Andi Kleen,
Thomas Gleixner, Andrew Morton, Heiko Carstens, Roland Dreier,
Randy Dunlap, Tejun Heo, Greg Kroah-Hartman, Yinghai Lu,
David Rientjes, Steven Rostedt, Rusty Russell, Hidetoshi Seto,
Jack Steiner, Frederic Weisbecker, x86, Linux Kernel
* Dave Jones <davej@redhat.com> wrote:
> In which case..
>
> Dave
>
> ---
>
> Remove the CPU cache size printk's.
>
> They aren't useful, and pollute the dmesg output a lot (especially on
> machines with many cores). Also the same information can be trivially
> found out from userspace.
>
> Signed-off-by: Dave Jones <davej@redhat.com>
Precisely - these are the kind of simple patches i'd like to see.
Ingo
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-13 8:03 ` Ingo Molnar
@ 2009-11-13 8:11 ` H. Peter Anvin
0 siblings, 0 replies; 109+ messages in thread
From: H. Peter Anvin @ 2009-11-13 8:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: Dave Jones, Mike Travis, Andi Kleen, Thomas Gleixner,
Andrew Morton, Heiko Carstens, Roland Dreier, Randy Dunlap,
Tejun Heo, Greg Kroah-Hartman, Yinghai Lu, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
On 11/13/2009 12:03 AM, Ingo Molnar wrote:
>
> * Dave Jones <davej@redhat.com> wrote:
>
>> In which case..
>>
>> Dave
>>
>> ---
>>
>> Remove the CPU cache size printk's.
>>
>> They aren't useful, and pollute the dmesg output a lot (especially on
>> machines with many cores). Also the same information can be trivially
>> found out from userspace.
>>
>> Signed-off-by: Dave Jones <davej@redhat.com>
>
> Precisely - these are the kind of simple patches i'd like to see.
>
Indeed.
Acked-by: H. Peter Anvin <hpa@zytor.com>
I'll apply it tomorrow if you don't beat me to it.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* [tip:x86/debug] x86: Remove the CPU cache size printk's
2009-11-12 23:15 ` Dave Jones
2009-11-13 8:03 ` Ingo Molnar
@ 2009-11-13 8:18 ` tip-bot for Dave Jones
2009-11-13 22:38 ` [PATCH] x86: Remove CPU cache size output for non-Intel too Roland Dreier
2 siblings, 0 replies; 109+ messages in thread
From: tip-bot for Dave Jones @ 2009-11-13 8:18 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, rusty, fweisbec, rostedt, gregkh, ak, heiko.carstens,
tglx, rientjes, linux-kernel, hpa, yinghai, travis,
seto.hidetoshi, davej, rdunlap, steiner, rdreier, tj, mingo
Commit-ID: 15cd8812ab2ce62a2f779e93a8398bdad752291a
Gitweb: http://git.kernel.org/tip/15cd8812ab2ce62a2f779e93a8398bdad752291a
Author: Dave Jones <davej@redhat.com>
AuthorDate: Thu, 12 Nov 2009 18:15:43 -0500
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 13 Nov 2009 09:14:55 +0100
x86: Remove the CPU cache size printk's
They aren't really useful, and they pollute the dmesg output a lot
(especially on machines with many cores).
Also the same information can be trivially found out from
userspace.
Reported-by: Mike Travis <travis@sgi.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091112231542.GA7129@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/cpu/intel_cacheinfo.c | 16 ----------------
1 files changed, 0 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 804c40e..0df4c2b 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -488,22 +488,6 @@ unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
#endif
}
- if (trace)
- printk(KERN_INFO "CPU: Trace cache: %dK uops", trace);
- else if (l1i)
- printk(KERN_INFO "CPU: L1 I cache: %dK", l1i);
-
- if (l1d)
- printk(KERN_CONT ", L1 D cache: %dK\n", l1d);
- else
- printk(KERN_CONT "\n");
-
- if (l2)
- printk(KERN_INFO "CPU: L2 cache: %dK\n", l2);
-
- if (l3)
- printk(KERN_INFO "CPU: L3 cache: %dK\n", l3);
-
c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
return l2;
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-12 22:57 ` H. Peter Anvin
2009-11-12 23:15 ` Dave Jones
@ 2009-11-13 16:10 ` Mike Travis
2009-11-14 0:53 ` Ingo Molnar
1 sibling, 1 reply; 109+ messages in thread
From: Mike Travis @ 2009-11-13 16:10 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Dave Jones, Andi Kleen, Ingo Molnar, Thomas Gleixner,
Andrew Morton, Heiko Carstens, Roland Dreier, Randy Dunlap,
Tejun Heo, Greg Kroah-Hartman, Yinghai Lu, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
H. Peter Anvin wrote:
> On 11/12/2009 02:22 PM, Dave Jones wrote:
>> But I don't disagree with Andi either, that it's not particularly useful,
>> and we can get all this from userspace in /proc/cpuinfo, or x86info.
>>
>
> I personally don't think it's useful at all. It gives information about
> the processor which can be obtained from other sources. What we want is
> enough information that the CPU can be unambiguously identified, so that
> when someone posts dmesg we can tell what machine they came from.
>
> -hpa
Can we say the same thing about the sched debug messages? It's even more
painful because the number of lines output is exponential to the number
of cpus.
Thanks,
Mike
* [PATCH] x86: Remove CPU cache size output for non-Intel too
2009-11-12 23:15 ` Dave Jones
2009-11-13 8:03 ` Ingo Molnar
2009-11-13 8:18 ` [tip:x86/debug] x86: Remove the CPU cache size printk's tip-bot for Dave Jones
@ 2009-11-13 22:38 ` Roland Dreier
2009-11-13 22:52 ` Dave Jones
2009-11-14 0:54 ` [tip:x86/debug] " tip-bot for Roland Dreier
2 siblings, 2 replies; 109+ messages in thread
From: Roland Dreier @ 2009-11-13 22:38 UTC (permalink / raw)
To: Dave Jones
Cc: H. Peter Anvin, Mike Travis, Andi Kleen, Ingo Molnar,
Thomas Gleixner, Andrew Morton, Heiko Carstens, Randy Dunlap,
Tejun Heo, Greg Kroah-Hartman, Yinghai Lu, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
As Dave Jones said about the output in intel_cacheinfo.c: "They aren't
useful, and pollute the dmesg output a lot (especially on machines with
many cores). Also the same information can be trivially found out from
userspace." Give the generic display_cacheinfo() function the same
treatment.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
---
arch/x86/kernel/cpu/common.c | 5 -----
1 files changed, 0 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cc25c2b..5f8f420 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -391,8 +391,6 @@ void __cpuinit display_cacheinfo(struct cpuinfo_x86 *c)
if (n >= 0x80000005) {
cpuid(0x80000005, &dummy, &ebx, &ecx, &edx);
- printk(KERN_INFO "CPU: L1 I Cache: %dK (%d bytes/line), D cache %dK (%d bytes/line)\n",
- edx>>24, edx&0xFF, ecx>>24, ecx&0xFF);
c->x86_cache_size = (ecx>>24) + (edx>>24);
#ifdef CONFIG_X86_64
/* On K8 L1 TLB is inclusive, so don't count it */
@@ -422,9 +420,6 @@ void __cpuinit display_cacheinfo(struct cpuinfo_x86 *c)
#endif
c->x86_cache_size = l2size;
-
- printk(KERN_INFO "CPU: L2 Cache: %dK (%d bytes/line)\n",
- l2size, ecx & 0xFF);
}
void __cpuinit detect_ht(struct cpuinfo_x86 *c)
* Re: [PATCH] x86: Remove CPU cache size output for non-Intel too
2009-11-13 22:38 ` [PATCH] x86: Remove CPU cache size output for non-Intel too Roland Dreier
@ 2009-11-13 22:52 ` Dave Jones
2009-11-14 0:54 ` [tip:x86/debug] " tip-bot for Roland Dreier
1 sibling, 0 replies; 109+ messages in thread
From: Dave Jones @ 2009-11-13 22:52 UTC (permalink / raw)
To: Roland Dreier
Cc: H. Peter Anvin, Mike Travis, Andi Kleen, Ingo Molnar,
Thomas Gleixner, Andrew Morton, Heiko Carstens, Randy Dunlap,
Tejun Heo, Greg Kroah-Hartman, Yinghai Lu, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
On Fri, Nov 13, 2009 at 02:38:26PM -0800, Roland Dreier wrote:
> As Dave Jones said about the output in intel_cacheinfo.c: "They aren't
> useful, and pollute the dmesg output a lot (especially on machines with
> many cores). Also the same information can be trivially found out from
> userspace." Give the generic display_cacheinfo() function the same
> treatment.
>
> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Dave Jones <davej@redhat.com>
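For illustration, the "trivially found out from userspace" point quoted above can be sketched in C. This is a minimal sketch and not part of the patch: it assumes the x86 /proc/cpuinfo field of the form "cache size : 6144 KB", and the function names cpuinfo_cache_kb() and first_cache_kb() are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of a cpuinfo-style file. Returns the cache size in KB
 * if the line is a "cache size" field, or -1 for any other line.
 * (Hypothetical helper, assumed field format: "cache size\t: 6144 KB".)
 */
long cpuinfo_cache_kb(const char *line)
{
	long kb;

	if (strncmp(line, "cache size", 10) != 0)
		return -1;
	line = strchr(line, ':');
	if (!line || sscanf(line + 1, "%ld", &kb) != 1)
		return -1;
	return kb;
}

/*
 * Scan a cpuinfo-style file and return the first cache size found,
 * or -1 if the file cannot be opened or has no such field.
 */
long first_cache_kb(const char *path)
{
	char buf[256];
	long kb = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (kb < 0 && fgets(buf, sizeof(buf), f))
		kb = cpuinfo_cache_kb(buf);
	fclose(f);
	return kb;
}
```

Calling first_cache_kb("/proc/cpuinfo") on an x86 machine would be expected to return the value that c->x86_cache_size feeds into that file, which the patch leaves untouched; only the boot-time printk's go away.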
* Re: [PATCH] x86_64: Limit the number of processor bootup messages
2009-11-13 16:10 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
@ 2009-11-14 0:53 ` Ingo Molnar
0 siblings, 0 replies; 109+ messages in thread
From: Ingo Molnar @ 2009-11-14 0:53 UTC (permalink / raw)
To: Mike Travis
Cc: H. Peter Anvin, Dave Jones, Andi Kleen, Thomas Gleixner,
Andrew Morton, Heiko Carstens, Roland Dreier, Randy Dunlap,
Tejun Heo, Greg Kroah-Hartman, Yinghai Lu, David Rientjes,
Steven Rostedt, Rusty Russell, Hidetoshi Seto, Jack Steiner,
Frederic Weisbecker, x86, Linux Kernel
* Mike Travis <travis@sgi.com> wrote:
> H. Peter Anvin wrote:
> >On 11/12/2009 02:22 PM, Dave Jones wrote:
> >>But I don't disagree with Andi either, that it's not particularly useful,
> >>and we can get all this from userspace in /proc/cpuinfo, or x86info.
> >>
> >
> > I personally don't think it's useful at all. It gives information
> > about the processor which can be obtained from other sources. What
> > we want is enough information that the CPU can be unambiguously
> > identified, so that when someone posts dmesg we can tell what
> > machine they came from.
>
> Can we say the same thing about the sched debug messages? It's even
> more painful because the number of lines output is exponential to the
> number of cpus.
Yeah - but instead of getting rid of them please tie them to a (default
off) debug switch. There's been the occasional crash where they were
useful.
Ingo
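A default-off debug switch of the kind suggested above might look roughly like the following. This is a userspace sketch only, under stated assumptions: the flag name "sched_debug" and all function names here are hypothetical, and real kernel code would register a boot parameter and use printk rather than parse a string and call vprintf.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Debug output is off unless the flag is seen on the command line. */
static int sched_debug_enabled;

/* Return 1 if "name" appears as a whole space-separated word in cmdline. */
int cmdline_has_flag(const char *cmdline, const char *name)
{
	size_t len = strlen(name);
	const char *p = cmdline;

	if (len == 0)
		return 0;
	while ((p = strstr(p, name)) != NULL) {
		int starts = (p == cmdline || p[-1] == ' ');
		int ends = (p[len] == '\0' || p[len] == ' ');

		if (starts && ends)
			return 1;
		p += len;
	}
	return 0;
}

/* Enable or disable the switch based on a boot-style command line. */
void set_debug_from_cmdline(const char *cmdline)
{
	sched_debug_enabled = cmdline_has_flag(cmdline, "sched_debug");
}

/*
 * Print only when the switch is on. Returns the number of characters
 * written, or 0 when the message is suppressed.
 */
int sched_debug_printf(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!sched_debug_enabled)
		return 0;
	va_start(ap, fmt);
	n = vprintf(fmt, ap);
	va_end(ap);
	return n;
}
```

With this shape the messages stay in the code for the occasional crash where they help, but cost nothing on the console unless someone asks for them.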
* [tip:x86/debug] x86: Remove CPU cache size output for non-Intel too
2009-11-13 22:38 ` [PATCH] x86: Remove CPU cache size output for non-Intel too Roland Dreier
2009-11-13 22:52 ` Dave Jones
@ 2009-11-14 0:54 ` tip-bot for Roland Dreier
1 sibling, 0 replies; 109+ messages in thread
From: tip-bot for Roland Dreier @ 2009-11-14 0:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, rusty, fweisbec, rostedt, gregkh, ak, heiko.carstens,
tglx, rientjes, linux-kernel, hpa, yinghai, seto.hidetoshi,
travis, davej, rdunlap, steiner, rdreier, tj, rolandd, mingo
Commit-ID: b01c845f0f2e3f9e54e6a78d5d56895f5b95e818
Gitweb: http://git.kernel.org/tip/b01c845f0f2e3f9e54e6a78d5d56895f5b95e818
Author: Roland Dreier <rdreier@cisco.com>
AuthorDate: Fri, 13 Nov 2009 14:38:26 -0800
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sat, 14 Nov 2009 01:51:18 +0100
x86: Remove CPU cache size output for non-Intel too
As Dave Jones said about the output in intel_cacheinfo.c: "They
aren't useful, and pollute the dmesg output a lot (especially on
machines with many cores). Also the same information can be
trivially found out from userspace."
Give the generic display_cacheinfo() function the same treatment.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Dave Jones <davej@redhat.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <adaocn6dp99.fsf_-_@roland-alpha.cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/cpu/common.c | 5 -----
1 files changed, 0 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 617a29f..9db1e24 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -391,8 +391,6 @@ void __cpuinit display_cacheinfo(struct cpuinfo_x86 *c)
if (n >= 0x80000005) {
cpuid(0x80000005, &dummy, &ebx, &ecx, &edx);
- printk(KERN_INFO "CPU: L1 I Cache: %dK (%d bytes/line), D cache %dK (%d bytes/line)\n",
- edx>>24, edx&0xFF, ecx>>24, ecx&0xFF);
c->x86_cache_size = (ecx>>24) + (edx>>24);
#ifdef CONFIG_X86_64
/* On K8 L1 TLB is inclusive, so don't count it */
@@ -422,9 +420,6 @@ void __cpuinit display_cacheinfo(struct cpuinfo_x86 *c)
#endif
c->x86_cache_size = l2size;
-
- printk(KERN_INFO "CPU: L2 Cache: %dK (%d bytes/line)\n",
- l2size, ecx & 0xFF);
}
void __cpuinit detect_ht(struct cpuinfo_x86 *c)
Thread overview: 109+ messages
[not found] <20091023233743.439628000@alcatraz.americas.sgi.com>
2009-10-23 23:37 ` [PATCH 1/8] SGI x86_64 UV: Add limit console output function Mike Travis
2009-10-24 1:09 ` Frederic Weisbecker
2009-10-26 17:55 ` Mike Travis
2009-11-02 14:15 ` Frederic Weisbecker
2009-10-26 7:02 ` Andi Kleen
2009-10-26 16:10 ` Steven Rostedt
2009-10-26 18:05 ` Mike Travis
2009-10-26 18:51 ` Steven Rostedt
2009-10-26 18:03 ` Mike Travis
2009-10-26 21:55 ` Andi Kleen
2009-10-26 22:07 ` Mike Travis
2009-10-30 19:25 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
2009-10-30 19:54 ` David Rientjes
2009-10-30 20:39 ` Mike Travis
2009-10-30 23:30 ` David Rientjes
2009-10-31 0:27 ` Mike Travis
2009-11-02 11:11 ` Andi Kleen
2009-11-02 19:21 ` Mike Travis
2009-11-02 19:34 ` Ingo Molnar
2009-11-02 20:32 ` Mike Travis
2009-11-04 0:22 ` Mike Travis
2009-11-04 10:24 ` Ingo Molnar
2009-11-04 10:31 ` Ingo Molnar
2009-11-12 22:22 ` Dave Jones
2009-11-12 22:57 ` H. Peter Anvin
2009-11-12 23:15 ` Dave Jones
2009-11-13 8:03 ` Ingo Molnar
2009-11-13 8:11 ` H. Peter Anvin
2009-11-13 8:18 ` [tip:x86/debug] x86: Remove the CPU cache size printk's tip-bot for Dave Jones
2009-11-13 22:38 ` [PATCH] x86: Remove CPU cache size output for non-Intel too Roland Dreier
2009-11-13 22:52 ` Dave Jones
2009-11-14 0:54 ` [tip:x86/debug] " tip-bot for Roland Dreier
2009-11-13 16:10 ` [PATCH] x86_64: Limit the number of processor bootup messages Mike Travis
2009-11-14 0:53 ` Ingo Molnar
2009-10-23 23:37 ` [PATCH 2/8] SGI x86_64 UV: " Mike Travis
2009-10-26 7:26 ` Andi Kleen
2009-10-23 23:37 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Mike Travis
2009-10-26 7:04 ` Andi Kleen
2009-10-26 18:08 ` Mike Travis
2009-10-27 15:24 ` Mike Travis
2009-10-27 19:45 ` David Rientjes
2009-10-27 20:00 ` Mike Travis
2009-10-27 20:25 ` [patch] x86: reduce srat verbosity in the kernel log David Rientjes
2009-10-27 20:42 ` Mike Travis
2009-10-27 20:48 ` David Rientjes
2009-10-27 23:02 ` Mike Travis
2009-10-28 3:29 ` Andi Kleen
2009-10-28 4:08 ` David Rientjes
2009-10-28 3:53 ` Yinghai Lu
2009-10-28 4:08 ` David Rientjes
2009-10-27 20:55 ` Cyrill Gorcunov
2009-10-27 21:06 ` David Rientjes
2009-10-27 21:10 ` Cyrill Gorcunov
2009-10-28 3:32 ` Andi Kleen
2009-10-28 4:08 ` David Rientjes
2009-10-28 4:11 ` Andi Kleen
2009-10-28 4:53 ` [patch v2] " David Rientjes
2009-10-28 5:19 ` Andi Kleen
2009-10-28 5:24 ` David Rientjes
2009-11-10 21:08 ` David Rientjes
2009-11-10 21:33 ` Ingo Molnar
2009-11-10 21:42 ` Yinghai Lu
2009-11-10 21:57 ` Ingo Molnar
2009-11-10 23:09 ` Mike Travis
2009-11-12 20:56 ` David Rientjes
2009-11-12 21:14 ` Mike Travis
2009-11-12 21:20 ` David Rientjes
2009-10-28 17:02 ` [patch] " Mike Travis
2009-10-28 20:52 ` David Rientjes
2009-10-28 21:03 ` Mike Travis
2009-10-28 21:06 ` David Rientjes
2009-10-28 21:35 ` Mike Travis
2009-10-28 21:46 ` David Rientjes
2009-10-28 22:36 ` Mike Travis
2009-10-29 8:21 ` David Rientjes
2009-10-29 16:34 ` Mike Travis
2009-10-29 19:06 ` David Rientjes
2009-10-27 20:16 ` [PATCH 3/8] SGI x86_64 UV: Limit the number of number of SRAT messages Cyrill Gorcunov
2009-10-27 20:23 ` Mike Travis
2009-10-27 20:33 ` Cyrill Gorcunov
2009-10-23 23:37 ` [PATCH 4/8] SGI x86_64 UV: Limit the number of ACPI messages Mike Travis
2009-10-24 3:29 ` Bjorn Helgaas
2009-10-26 18:15 ` Mike Travis
2009-10-26 22:47 ` Thomas Renninger
2009-10-26 21:25 ` Mike Travis
2009-10-27 15:27 ` Mike Travis
2009-10-27 15:51 ` Bjorn Helgaas
2009-10-23 23:37 ` [PATCH 5/8] SGI x86_64 UV: Limit the number of firmware messages Mike Travis
2009-10-23 23:37 ` [PATCH 6/8] SGI x86_64 UV: Limit the number of microcode messages Mike Travis
2009-10-24 20:09 ` Dmitry Adamushko
2009-10-24 21:09 ` Tigran Aivazian
2009-10-24 22:45 ` Dmitry Adamushko
2009-10-25 16:37 ` Ingo Molnar
2009-10-25 17:11 ` Arjan van de Ven
2009-10-25 17:27 ` Ingo Molnar
2009-10-26 18:33 ` Mike Travis
2009-10-26 18:29 ` Mike Travis
2009-10-26 18:29 ` Mike Travis
2009-10-26 20:11 ` Dmitry Adamushko
2009-10-27 15:21 ` Mike Travis
2009-10-26 18:25 ` Mike Travis
2009-10-26 19:27 ` Borislav Petkov
2009-10-30 19:40 ` [PATCH] x86_64: " Mike Travis
2009-10-26 18:24 ` [PATCH 6/8] SGI x86_64 UV: " Mike Travis
2009-10-26 18:18 ` Mike Travis
2009-10-26 7:05 ` Andi Kleen
2009-10-26 18:34 ` Mike Travis
2009-10-23 23:37 ` [PATCH 7/8] SGI x86_64 UV: Limit the number of scheduler debug messages Mike Travis
2009-10-23 23:37 ` [PATCH 8/8] SGI x86_64 UV: Limit the number of cpu is down messages Mike Travis