linux-kernel.vger.kernel.org archive mirror
* [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section
@ 2007-02-12  7:37 Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [2/39] x86_64: Break init() in two parts to avoid MODPOST warnings Andi Kleen
                   ` (37 more replies)
  0 siblings, 38 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Vivek Goyal, patches, linux-kernel


From: Vivek Goyal <vgoyal@in.ibm.com>

o The startup_32 entry point was in the .text section but also accessed
  some init data, which prompted MODPOST to emit section-mismatch warnings
  at build time:

WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
.text between '_text' (at offset 0xc0100029) and 'startup_32_smp'
WARNING: vmlinux - Section mismatch: reference to .init.data:boot_params from
.text between '_text' (at offset 0xc0100037) and 'startup_32_smp'
WARNING: vmlinux - Section mismatch: reference to
.init.data:init_pg_tables_end from .text between '_text' (at offset
0xc0100099) and 'startup_32_smp'

o startup_32 can't move to .init.text because this entry point has to be at
  the start of bzImage. Hence startup_32 is moved to a new section,
  .text.head, and MODPOST is instructed not to generate warnings when init
  data is accessed from the .text.head section. This code has been audited.

o The SMP boot-up code (startup_32_smp) can go into .init.text if CPU
  hotplug is not supported. Otherwise it generates more warnings:

WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
.text between 'checkCPUtype' (at offset 0xc0100126) and 'is486'
WARNING: vmlinux - Section mismatch: reference to .init.data:new_cpu_data from
.text between 'checkCPUtype' (at offset 0xc0100130) and 'is486'
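
The whitelist rule described above can be sketched as standalone C. This is
only an illustration of the section-name comparison: the function name
demo_secref_whitelisted() is hypothetical, and the real check is part of
secref_whitelist() in scripts/mod/modpost.c, which handles further cases.

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified sketch of the vmlinux-only whitelist rule.
 * Returns 1 if a reference from section fromsec to section tosec
 * should not produce a section-mismatch warning, 0 otherwise.
 */
int demo_secref_whitelisted(const char *fromsec, const char *tosec)
{
	assert(fromsec != NULL && tosec != NULL);

	/* Boot entry code may touch init data and init text. */
	if (strcmp(fromsec, ".text.head") == 0 &&
	    (strcmp(tosec, ".init.data") == 0 ||
	     strcmp(tosec, ".init.text") == 0))
		return 1;

	return 0;
}
```

With this rule a reference from .text.head to .init.data is accepted, while
the same reference from plain .text is still reported.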

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 arch/i386/kernel/head.S        |   17 ++++++++++++++---
 arch/i386/kernel/vmlinux.lds.S |    7 ++++++-
 scripts/mod/modpost.c          |   10 +++++++++-
 3 files changed, 29 insertions(+), 5 deletions(-)

Index: linux/arch/i386/kernel/head.S
===================================================================
--- linux.orig/arch/i386/kernel/head.S
+++ linux/arch/i386/kernel/head.S
@@ -53,6 +53,7 @@
  * any particular GDT layout, because we load our own as soon as we
  * can.
  */
+.section .text.head,"ax",@progbits
 ENTRY(startup_32)
 
 #ifdef CONFIG_PARAVIRT
@@ -141,16 +142,25 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
 	jb 10b
 	movl %edi,(init_pg_tables_end - __PAGE_OFFSET)
 
-#ifdef CONFIG_SMP
 	xorl %ebx,%ebx				/* This is the boot CPU (BSP) */
 	jmp 3f
-
 /*
  * Non-boot CPU entry point; entered from trampoline.S
  * We can't lgdt here, because lgdt itself uses a data segment, but
  * we know the trampoline has already loaded the boot_gdt_table GDT
  * for us.
+ *
+ * If cpu hotplug is not supported then this code can go in init section
+ * which will be freed later
  */
+
+#ifdef CONFIG_HOTPLUG_CPU
+.section .text,"ax",@progbits
+#else
+.section .init.text,"ax",@progbits
+#endif
+
+#ifdef CONFIG_SMP
 ENTRY(startup_32_smp)
 	cld
 	movl $(__BOOT_DS),%eax
@@ -208,8 +218,8 @@ ENTRY(startup_32_smp)
 	xorl %ebx,%ebx
 	incl %ebx
 
-3:
 #endif /* CONFIG_SMP */
+3:
 
 /*
  * Enable paging
@@ -492,6 +502,7 @@ ignore_int:
 #endif
 	iret
 
+.section .text
 #ifdef CONFIG_PARAVIRT
 startup_paravirt:
 	cld
Index: linux/arch/i386/kernel/vmlinux.lds.S
===================================================================
--- linux.orig/arch/i386/kernel/vmlinux.lds.S
+++ linux/arch/i386/kernel/vmlinux.lds.S
@@ -37,9 +37,14 @@ SECTIONS
 {
   . = LOAD_OFFSET + LOAD_PHYSICAL_ADDR;
   phys_startup_32 = startup_32 - LOAD_OFFSET;
+
+  .text.head : AT(ADDR(.text.head) - LOAD_OFFSET) {
+  	_text = .;			/* Text and read-only data */
+	*(.text.head)
+  } :text = 0x9090
+
   /* read-only */
   .text : AT(ADDR(.text) - LOAD_OFFSET) {
-  	_text = .;			/* Text and read-only data */
 	*(.text)
 	SCHED_TEXT
 	LOCK_TEXT
Index: linux/scripts/mod/modpost.c
===================================================================
--- linux.orig/scripts/mod/modpost.c
+++ linux/scripts/mod/modpost.c
@@ -641,12 +641,20 @@ static int secref_whitelist(const char *
 	if (f1 && f2)
 		return 1;
 
-	/* Whitelist all references from .pci_fixup section if vmlinux */
+	/* Whitelist all references from .pci_fixup section if vmlinux
+	 * Whitelist all references from .text.head to .init.data if vmlinux
+	 * Whitelist all references from .text.head to .init.text if vmlinux
+	 */
 	if (is_vmlinux(modname)) {
 		if ((strcmp(fromsec, ".pci_fixup") == 0) &&
 		    (strcmp(tosec, ".init.text") == 0))
 		return 1;
 
+		if ((strcmp(fromsec, ".text.head") == 0) &&
+			((strcmp(tosec, ".init.data") == 0) ||
+			(strcmp(tosec, ".init.text") == 0)))
+		return 1;
+
 		/* Check for pattern 3 */
 		for (s = pat3refsym; *s; s++)
 			if (strcmp(refsymname, *s) == 0)


* [PATCH x86 for review II] [2/39] x86_64: Break init() in two parts to avoid MODPOST warnings
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [3/39] i386: arch/i386/kernel/cpu/mcheck/mce.c should #include <asm/mce.h> Andi Kleen
                   ` (36 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Vivek Goyal, patches, linux-kernel


From: Vivek Goyal <vgoyal@in.ibm.com>

o init() is a non-__init function in the .text section, but it calls many
  functions that live in the .init.text section. Hence MODPOST generates
  lots of cross-reference warnings on i386 when the kernel is compiled with
  CONFIG_RELOCATABLE=y:

WARNING: vmlinux - Section mismatch: reference to .init.text:smp_prepare_cpus from .text between 'init' (at offset 0xc0101049) and 'rest_init'
WARNING: vmlinux - Section mismatch: reference to .init.text:migration_init from .text between 'init' (at offset 0xc010104e) and 'rest_init'
WARNING: vmlinux - Section mismatch: reference to .init.text:spawn_ksoftirqd from .text between 'init' (at offset 0xc0101053) and 'rest_init'

o This patch splits init() into two parts: one that can go in the
  .init.text section and be freed, and one that has to be non-__init
  (init_post()). init() now calls init_post(), and init_post() does not
  call any functions present in .init sections, which gets rid of the
  warnings.
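
The shape of the split can be sketched in user-space C. This is an
illustration only: all demo_* names are hypothetical stand-ins, no section
attributes are applied, and the real init_post() ends in exec or panic
rather than returning.

```c
#include <assert.h>

#if defined(__GNUC__)
#define demo_noinline __attribute__((noinline))
#else
#define demo_noinline
#endif

/* Counts the steps run so far, so the call order can be observed. */
int demo_steps;

/* Stand-in for an __init helper such as smp_prepare_cpus(). */
static void demo_boot_only_setup(void)
{
	demo_steps++;
}

/*
 * Stand-in for init_post(): it must stay out of the discardable init
 * section, so it is kept out of line and calls no boot-only helpers.
 */
static demo_noinline int demo_init_post(void)
{
	/* The boot-only part has already run by the time we get here. */
	assert(demo_steps >= 1);
	demo_steps++;
	return 0;
}

/* Stand-in for init(): boot-only work first, then hand over. */
int demo_init(void)
{
	demo_boot_only_setup();
	return demo_init_post();
}
```

The noinline marker matters for the same reason as in the patch: if the
compiler folded the second function into the first, its code would end up
wherever the first function is placed.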

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 init/main.c |   81 +++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 45 insertions(+), 36 deletions(-)

Index: linux/init/main.c
===================================================================
--- linux.orig/init/main.c
+++ linux/init/main.c
@@ -713,7 +713,49 @@ static void run_init_process(char *init_
 	kernel_execve(init_filename, argv_init, envp_init);
 }
 
-static int init(void * unused)
+/* This is a non __init function. Force it to be noinline otherwise gcc
+ * makes it inline to init() and it becomes part of init.text section
+ */
+static int noinline init_post(void)
+{
+	free_initmem();
+	unlock_kernel();
+	mark_rodata_ro();
+	system_state = SYSTEM_RUNNING;
+	numa_default_policy();
+
+	if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
+		printk(KERN_WARNING "Warning: unable to open an initial console.\n");
+
+	(void) sys_dup(0);
+	(void) sys_dup(0);
+
+	if (ramdisk_execute_command) {
+		run_init_process(ramdisk_execute_command);
+		printk(KERN_WARNING "Failed to execute %s\n",
+				ramdisk_execute_command);
+	}
+
+	/*
+	 * We try each of these until one succeeds.
+	 *
+	 * The Bourne shell can be used instead of init if we are
+	 * trying to recover a really broken machine.
+	 */
+	if (execute_command) {
+		run_init_process(execute_command);
+		printk(KERN_WARNING "Failed to execute %s.  Attempting "
+					"defaults...\n", execute_command);
+	}
+	run_init_process("/sbin/init");
+	run_init_process("/etc/init");
+	run_init_process("/bin/init");
+	run_init_process("/bin/sh");
+
+	panic("No init found.  Try passing init= option to kernel.");
+}
+
+static int __init init(void * unused)
 {
 	lock_kernel();
 	/*
@@ -761,39 +803,6 @@ static int init(void * unused)
 	 * we're essentially up and running. Get rid of the
 	 * initmem segments and start the user-mode stuff..
 	 */
-	free_initmem();
-	unlock_kernel();
-	mark_rodata_ro();
-	system_state = SYSTEM_RUNNING;
-	numa_default_policy();
-
-	if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
-		printk(KERN_WARNING "Warning: unable to open an initial console.\n");
-
-	(void) sys_dup(0);
-	(void) sys_dup(0);
-
-	if (ramdisk_execute_command) {
-		run_init_process(ramdisk_execute_command);
-		printk(KERN_WARNING "Failed to execute %s\n",
-				ramdisk_execute_command);
-	}
-
-	/*
-	 * We try each of these until one succeeds.
-	 *
-	 * The Bourne shell can be used instead of init if we are 
-	 * trying to recover a really broken machine.
-	 */
-	if (execute_command) {
-		run_init_process(execute_command);
-		printk(KERN_WARNING "Failed to execute %s.  Attempting "
-					"defaults...\n", execute_command);
-	}
-	run_init_process("/sbin/init");
-	run_init_process("/etc/init");
-	run_init_process("/bin/init");
-	run_init_process("/bin/sh");
-
-	panic("No init found.  Try passing init= option to kernel.");
+	init_post();
+	return 0;
 }


* [PATCH x86 for review II] [3/39] i386: arch/i386/kernel/cpu/mcheck/mce.c should #include <asm/mce.h>
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [2/39] x86_64: Break init() in two parts to avoid MODPOST warnings Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [4/39] i386: add idle notifier Andi Kleen
                   ` (35 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Adrian Bunk, patches, linux-kernel


From: Adrian Bunk <bunk@stusta.de>

Every file should include the headers containing the prototypes for
its global functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 arch/i386/kernel/cpu/mcheck/mce.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux/arch/i386/kernel/cpu/mcheck/mce.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mcheck/mce.c
+++ linux/arch/i386/kernel/cpu/mcheck/mce.c
@@ -12,6 +12,7 @@
 
 #include <asm/processor.h> 
 #include <asm/system.h>
+#include <asm/mce.h>
 
 #include "mce.h"
 


* [PATCH x86 for review II] [4/39] i386: add idle notifier
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [2/39] x86_64: Break init() in two parts to avoid MODPOST warnings Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [3/39] i386: arch/i386/kernel/cpu/mcheck/mce.c should #include <asm/mce.h> Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [5/39] i386: improve sched_clock() on i686 Andi Kleen
                   ` (34 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Stephane Eranian, patches, linux-kernel


From: Stephane Eranian <eranian@hpl.hp.com>

Add a notifier mechanism to the low level idle loop.  You can register a
callback function which gets invoked on entry and exit from the low level idle
loop.  The low level idle loop is defined as the polling loop, low-power call,
or the mwait instruction.  Interrupts processed by the idle thread are not
considered part of the low level loop.

The notifier can be used to measure precisely how much time is spent in
useless execution (or low power mode).  The perfmon subsystem uses it to
turn monitoring on and off.
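
The enter/exit pairing can be sketched in user-space C. This is a
simplified model, not the kernel code: the demo_* names are hypothetical, a
plain int stands in for the per-CPU idle_state bit, and a fixed array stands
in for the atomic notifier chain. The point it illustrates is that the end
event fires at most once per start event, even if several interrupt paths
call the exit hook.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_IDLE_START 1
#define DEMO_IDLE_END   2
#define DEMO_MAX_CALLBACKS 4

typedef void (*demo_idle_cb)(unsigned long event);

static demo_idle_cb demo_callbacks[DEMO_MAX_CALLBACKS];
static size_t demo_ncallbacks;
static int demo_in_idle;	/* stands in for the per-CPU idle_state bit */

/* Example callback: counts how often each event was delivered. */
int demo_starts, demo_ends;

void demo_count_events(unsigned long event)
{
	if (event == DEMO_IDLE_START)
		demo_starts++;
	else if (event == DEMO_IDLE_END)
		demo_ends++;
}

/* Returns 0 on success, -1 if the callback table is full. */
int demo_idle_register(demo_idle_cb cb)
{
	assert(cb != NULL);
	if (demo_ncallbacks == DEMO_MAX_CALLBACKS)
		return -1;
	demo_callbacks[demo_ncallbacks++] = cb;
	return 0;
}

static void demo_notify(unsigned long event)
{
	size_t i;

	for (i = 0; i < demo_ncallbacks; i++)
		demo_callbacks[i](event);
}

void demo_enter_idle(void)
{
	demo_in_idle = 1;
	demo_notify(DEMO_IDLE_START);
}

/* Test-and-clear: only the first exit after an enter notifies. */
void demo_exit_idle(void)
{
	if (!demo_in_idle)
		return;
	demo_in_idle = 0;
	demo_notify(DEMO_IDLE_END);
}
```

Registering demo_count_events(), then calling demo_enter_idle() once and
demo_exit_idle() twice, delivers one start and one end event.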

Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 arch/i386/kernel/apic.c          |    4 ++
 arch/i386/kernel/cpu/mcheck/p4.c |    2 +
 arch/i386/kernel/irq.c           |    3 ++
 arch/i386/kernel/process.c       |   53 ++++++++++++++++++++++++++++++++++++++-
 arch/i386/kernel/smp.c           |    2 +
 include/asm-i386/idle.h          |   14 ++++++++++
 include/asm-i386/processor.h     |    8 +++++
 7 files changed, 85 insertions(+), 1 deletion(-)

Index: linux/arch/i386/kernel/apic.c
===================================================================
--- linux.orig/arch/i386/kernel/apic.c
+++ linux/arch/i386/kernel/apic.c
@@ -36,6 +36,7 @@
 #include <asm/hpet.h>
 #include <asm/i8253.h>
 #include <asm/nmi.h>
+#include <asm/idle.h>
 
 #include <mach_apic.h>
 #include <mach_apicdef.h>
@@ -1255,6 +1256,7 @@ fastcall void smp_apic_timer_interrupt(s
 	 * Besides, if we don't timer interrupts ignore the global
 	 * interrupt lock, which is the WrongThing (tm) to do.
 	 */
+	exit_idle();
 	irq_enter();
 	smp_local_timer_interrupt();
 	irq_exit();
@@ -1305,6 +1307,7 @@ fastcall void smp_spurious_interrupt(str
 {
 	unsigned long v;
 
+	exit_idle();
 	irq_enter();
 	/*
 	 * Check if this really is a spurious interrupt and ACK it
@@ -1329,6 +1332,7 @@ fastcall void smp_error_interrupt(struct
 {
 	unsigned long v, v1;
 
+	exit_idle();
 	irq_enter();
 	/* First tickle the hardware, only then report what went on. -- REW */
 	v = apic_read(APIC_ESR);
Index: linux/arch/i386/kernel/cpu/mcheck/p4.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mcheck/p4.c
+++ linux/arch/i386/kernel/cpu/mcheck/p4.c
@@ -12,6 +12,7 @@
 #include <asm/system.h>
 #include <asm/msr.h>
 #include <asm/apic.h>
+#include <asm/idle.h>
 
 #include <asm/therm_throt.h>
 
@@ -59,6 +60,7 @@ static void (*vendor_thermal_interrupt)(
 
 fastcall void smp_thermal_interrupt(struct pt_regs *regs)
 {
+	exit_idle();
 	irq_enter();
 	vendor_thermal_interrupt(regs);
 	irq_exit();
Index: linux/arch/i386/kernel/irq.c
===================================================================
--- linux.orig/arch/i386/kernel/irq.c
+++ linux/arch/i386/kernel/irq.c
@@ -19,6 +19,8 @@
 #include <linux/cpu.h>
 #include <linux/delay.h>
 
+#include <asm/idle.h>
+
 DEFINE_PER_CPU(irq_cpustat_t, irq_stat) ____cacheline_internodealigned_in_smp;
 EXPORT_PER_CPU_SYMBOL(irq_stat);
 
@@ -61,6 +63,7 @@ fastcall unsigned int do_IRQ(struct pt_r
 	union irq_ctx *curctx, *irqctx;
 	u32 *isp;
 #endif
+	exit_idle();
 
 	if (unlikely((unsigned)irq >= NR_IRQS)) {
 		printk(KERN_EMERG "%s: cannot handle IRQ %d\n",
Index: linux/arch/i386/kernel/process.c
===================================================================
--- linux.orig/arch/i386/kernel/process.c
+++ linux/arch/i386/kernel/process.c
@@ -48,6 +48,7 @@
 #include <asm/i387.h>
 #include <asm/desc.h>
 #include <asm/vm86.h>
+#include <asm/idle.h>
 #ifdef CONFIG_MATH_EMULATION
 #include <asm/math_emu.h>
 #endif
@@ -80,6 +81,42 @@ void (*pm_idle)(void);
 EXPORT_SYMBOL(pm_idle);
 static DEFINE_PER_CPU(unsigned int, cpu_idle_state);
 
+static ATOMIC_NOTIFIER_HEAD(idle_notifier);
+
+void idle_notifier_register(struct notifier_block *n)
+{
+	atomic_notifier_chain_register(&idle_notifier, n);
+}
+
+void idle_notifier_unregister(struct notifier_block *n)
+{
+	atomic_notifier_chain_unregister(&idle_notifier, n);
+}
+
+static DEFINE_PER_CPU(volatile unsigned long, idle_state);
+
+void enter_idle(void)
+{
+	/* needs to be atomic w.r.t. interrupts, not against other CPUs */
+	__set_bit(0, &__get_cpu_var(idle_state));
+	atomic_notifier_call_chain(&idle_notifier, IDLE_START, NULL);
+}
+
+static void __exit_idle(void)
+{
+	/* needs to be atomic w.r.t. interrupts, not against other CPUs */
+	if (__test_and_clear_bit(0, &__get_cpu_var(idle_state)) == 0)
+		return;
+	atomic_notifier_call_chain(&idle_notifier, IDLE_END, NULL);
+}
+
+void exit_idle(void)
+{
+	if (current->pid)
+		return;
+	__exit_idle();
+}
+
 void disable_hlt(void)
 {
 	hlt_counter++;
@@ -130,6 +167,7 @@ EXPORT_SYMBOL(default_idle);
  */
 static void poll_idle (void)
 {
+	local_irq_enable();
 	cpu_relax();
 }
 
@@ -189,7 +227,16 @@ void cpu_idle(void)
 				play_dead();
 
 			__get_cpu_var(irq_stat).idle_timestamp = jiffies;
+
+			/*
+			 * Idle routines should keep interrupts disabled
+			 * from here on, until they go to idle.
+			 * Otherwise, idle callbacks can misfire.
+			 */
+			local_irq_disable();
+			enter_idle();
 			idle();
+			__exit_idle();
 		}
 		preempt_enable_no_resched();
 		schedule();
@@ -243,7 +290,11 @@ void mwait_idle_with_hints(unsigned long
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
 		smp_mb();
 		if (!need_resched())
-			__mwait(eax, ecx);
+			__sti_mwait(eax, ecx);
+		else
+			local_irq_enable();
+	} else {
+		local_irq_enable();
 	}
 }
 
Index: linux/arch/i386/kernel/smp.c
===================================================================
--- linux.orig/arch/i386/kernel/smp.c
+++ linux/arch/i386/kernel/smp.c
@@ -23,6 +23,7 @@
 
 #include <asm/mtrr.h>
 #include <asm/tlbflush.h>
+#include <asm/idle.h>
 #include <mach_apic.h>
 
 /*
@@ -624,6 +625,7 @@ fastcall void smp_call_function_interrup
 	/*
 	 * At this point the info structure may be out of scope unless wait==1
 	 */
+	exit_idle();
 	irq_enter();
 	(*func)(info);
 	irq_exit();
Index: linux/include/asm-i386/idle.h
===================================================================
--- /dev/null
+++ linux/include/asm-i386/idle.h
@@ -0,0 +1,14 @@
+#ifndef _ASM_I386_IDLE_H
+#define _ASM_I386_IDLE_H 1
+
+#define IDLE_START 1
+#define IDLE_END 2
+
+struct notifier_block;
+void idle_notifier_register(struct notifier_block *n);
+void idle_notifier_unregister(struct notifier_block *n);
+
+void exit_idle(void);
+void enter_idle(void);
+
+#endif
Index: linux/include/asm-i386/processor.h
===================================================================
--- linux.orig/include/asm-i386/processor.h
+++ linux/include/asm-i386/processor.h
@@ -257,6 +257,14 @@ static inline void __mwait(unsigned long
 		: :"a" (eax), "c" (ecx));
 }
 
+static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
+{
+	/* "mwait %eax,%ecx;" */
+	asm volatile(
+		"sti; .byte 0x0f,0x01,0xc9;"
+		: :"a" (eax), "c" (ecx));
+}
+
 extern void mwait_idle_with_hints(unsigned long eax, unsigned long ecx);
 
 /* from system description table in BIOS.  Mostly for MCA use, but


* [PATCH x86 for review II] [5/39] i386: improve sched_clock() on i686
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (2 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [4/39] i386: add idle notifier Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [6/39] i386: romsignature/checksum cleanup Andi Kleen
                   ` (33 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Ingo Molnar, patches, linux-kernel


From: Ingo Molnar <mingo@elte.hu>

Clean up sched_clock() on i686: it will use the TSC if available and fall
back to jiffies only if the user asked for the TSC to be disabled via notsc
or the CPU calibration code didn't figure out the right cpu_khz.

This generally makes the scheduler timestamps more fine-grained on all
hardware.  (The current scheduler is pretty resistant to asynchronous
sched_clock() values on different CPUs; it will allow at most a jiffy of
jitter.)

Also simplify sched_clock()'s check for TSC availability: propagate the
desire and ability to use the TSC into the tsc_disable flag.  Previously
this flag only indicated whether the notsc option was passed.  This makes
the rare low-res sched_clock() codepath a single branch off a read-mostly
flag.
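
The arithmetic of the low-res fallback path can be sketched as a standalone
function. This is an illustration only: demo_jiffies_to_ns() is a
hypothetical name, and the tick rate and initial jiffies value are passed as
parameters here, whereas the kernel uses the HZ and INITIAL_JIFFIES
constants.

```c
#include <assert.h>

/*
 * Nanoseconds elapsed since boot, derived from a jiffies counter.
 * Mirrors (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ).
 * The per-tick length is an integer division, so a tick rate that does
 * not divide 10^9 evenly is truncated (hz = 300 gives 3333333 ns).
 */
unsigned long long demo_jiffies_to_ns(unsigned long long jiffies_now,
				      unsigned long long initial_jiffies,
				      unsigned int hz)
{
	assert(hz > 0 && hz <= 1000000000u);
	return (jiffies_now - initial_jiffies) * (1000000000ULL / hz);
}
```

At hz = 250 each jiffy is 4 ms, so the resolution of this path is far
coarser than a TSC read, which is why it is kept as the rare case.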

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 arch/i386/kernel/tsc.c  |   22 ++++++++++++++--------
 include/asm-i386/bugs.h |    2 +-
 2 files changed, 15 insertions(+), 9 deletions(-)

Index: linux/arch/i386/kernel/tsc.c
===================================================================
--- linux.orig/arch/i386/kernel/tsc.c
+++ linux/arch/i386/kernel/tsc.c
@@ -112,13 +112,10 @@ unsigned long long sched_clock(void)
 		return (*custom_sched_clock)();
 
 	/*
-	 * in the NUMA case we dont use the TSC as they are not
-	 * synchronized across all CPUs.
+	 * Fall back to jiffies if there's no TSC available:
 	 */
-#ifndef CONFIG_NUMA
-	if (!cpu_khz || check_tsc_unstable())
-#endif
-		/* no locking but a rare wrong value is not a big deal */
+	if (unlikely(tsc_disable))
+		/* No locking but a rare wrong value is not a big deal: */
 		return (jiffies_64 - INITIAL_JIFFIES) * (1000000000 / HZ);
 
 	/* read the Time Stamp Counter: */
@@ -198,13 +195,13 @@ EXPORT_SYMBOL(recalibrate_cpu_khz);
 void __init tsc_init(void)
 {
 	if (!cpu_has_tsc || tsc_disable)
-		return;
+		goto out_no_tsc;
 
 	cpu_khz = calculate_cpu_khz();
 	tsc_khz = cpu_khz;
 
 	if (!cpu_khz)
-		return;
+		goto out_no_tsc;
 
 	printk("Detected %lu.%03lu MHz processor.\n",
 				(unsigned long)cpu_khz / 1000,
@@ -212,6 +209,15 @@ void __init tsc_init(void)
 
 	set_cyc2ns_scale(cpu_khz);
 	use_tsc_delay();
+	return;
+
+out_no_tsc:
+	/*
+	 * Set the tsc_disable flag if there's no TSC support, this
+	 * makes it a fast flag for the kernel to see whether it
+	 * should be using the TSC.
+	 */
+	tsc_disable = 1;
 }
 
 #ifdef CONFIG_CPU_FREQ
Index: linux/include/asm-i386/bugs.h
===================================================================
--- linux.orig/include/asm-i386/bugs.h
+++ linux/include/asm-i386/bugs.h
@@ -160,7 +160,7 @@ static void __init check_config(void)
  * If we configured ourselves for a TSC, we'd better have one!
  */
 #ifdef CONFIG_X86_TSC
-	if (!cpu_has_tsc)
+	if (!cpu_has_tsc && !tsc_disable)
 		panic("Kernel compiled for Pentium+, requires TSC feature!");
 #endif
 


* [PATCH x86 for review II] [6/39] i386: romsignature/checksum cleanup
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (3 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [5/39] i386: improve sched_clock() on i686 Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [7/39] x86_64: Fix fake numa for x86_64 machines with big IO hole Andi Kleen
                   ` (32 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Rene Herman, Andi Kleen, patches, linux-kernel


From: Rene Herman <rene.herman@gmail.com>

Add __init to romsignature() (it's only called from probe_roms(), which is
itself __init) and use that as an excuse to submit a pedantic cleanup.
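
The two checks can be sketched in user-space C. This is an illustration
under assumptions: the demo_* names and the sample buffer are hypothetical,
and the signature is read directly from the buffer instead of through
probe_kernel_address(), which the kernel uses to survive unreadable
addresses.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ROMSIGNATURE 0xaa55

/* Hypothetical 4-byte ROM image: signature bytes plus padding to sum 0. */
const unsigned char demo_rom[4] = { 0x55, 0xaa, 0x01, 0x00 };

/* Returns 1 if the first two bytes form 0xaa55 in little-endian order. */
int demo_romsignature(const unsigned char *rom)
{
	unsigned int sig;

	assert(rom != NULL);
	sig = (unsigned int)rom[0] | ((unsigned int)rom[1] << 8);
	return sig == DEMO_ROMSIGNATURE;
}

/* Returns 1 if all bytes sum to zero modulo 256. */
int demo_romchecksum(const unsigned char *rom, size_t length)
{
	unsigned char sum;

	for (sum = 0; length; length--)
		sum += *rom++;
	return sum == 0;
}
```

For the sample buffer, 0x55 + 0xaa + 0x01 + 0x00 is 0x100, which wraps to 0
in an unsigned char, so both checks pass.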

Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 arch/i386/kernel/e820.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Index: linux/arch/i386/kernel/e820.c
===================================================================
--- linux.orig/arch/i386/kernel/e820.c
+++ linux/arch/i386/kernel/e820.c
@@ -157,21 +157,22 @@ static struct resource standard_io_resou
 	.flags	= IORESOURCE_BUSY | IORESOURCE_IO
 } };
 
-static int romsignature(const unsigned char *x)
+#define ROMSIGNATURE 0xaa55
+
+static int __init romsignature(const unsigned char *rom)
 {
 	unsigned short sig;
-	int ret = 0;
-	if (probe_kernel_address((const unsigned short *)x, sig) == 0)
-		ret = (sig == 0xaa55);
-	return ret;
+
+	return probe_kernel_address((const unsigned short *)rom, sig) == 0 &&
+	       sig == ROMSIGNATURE;
 }
 
 static int __init romchecksum(unsigned char *rom, unsigned long length)
 {
-	unsigned char *p, sum = 0;
+	unsigned char sum;
 
-	for (p = rom; p < rom + length; p++)
-		sum += *p;
+	for (sum = 0; length; length--)
+		sum += *rom++;
 	return sum == 0;
 }
 


* [PATCH x86 for review II] [7/39] x86_64: Fix fake numa for x86_64 machines with big IO hole
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (4 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [6/39] i386: romsignature/checksum cleanup Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [8/39] x86_64: Remove fastcall references in x86_64 code Andi Kleen
                   ` (31 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Rohit Seth, Andi Kleen, patches, linux-kernel


From: Rohit Seth <rohitseth@google.com>

This patch resolves the issue of running with numa=fake=X on the kernel
command line on x86_64 machines that have a big IO hole.  When calculating
the size of each node we now look at the total hole size in that range.

Previously there were nodes that contained only IO holes, causing kernel
boot problems.  We now use FAKE_NODE_MIN_SIZE (64MB) as the minimum amount
of memory that any node must have.  We reduce the number of allocated nodes
if the number of nodes specified on the kernel command line would leave any
node with less memory than FAKE_NODE_MIN_SIZE.

This change allows the extra memory to be handed out in FAKE_NODE_MIN_SIZE
granules and distributed uniformly among as many nodes (called big nodes)
as possible.
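
The sizing arithmetic above can be sketched in user-space C. This is a
simplified model with hypothetical demo_* names: the memory map is a plain
array of non-overlapping RAM ranges rather than the e820 table, page
rounding is omitted, and the distribution of extra granules to big nodes is
left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NODE_MIN_SIZE (64ULL * 1024 * 1024)

struct demo_range {
	unsigned long long addr;
	unsigned long long size;
};

/* Hypothetical map: 3GB of RAM, a 1GB IO hole, then 1GB of RAM. */
const struct demo_range demo_map[2] = {
	{ 0, 3ULL << 30 },
	{ 4ULL << 30, 1ULL << 30 },
};

/* Bytes in [start, end) that are not covered by any RAM range. */
unsigned long long demo_hole_size(const struct demo_range *ram, size_t n,
				  unsigned long long start,
				  unsigned long long end)
{
	unsigned long long covered = 0;
	size_t i;

	assert(ram != NULL && start <= end);
	for (i = 0; i < n; i++) {
		unsigned long long lo = ram[i].addr;
		unsigned long long hi = ram[i].addr + ram[i].size;

		if (hi <= start || lo >= end)
			continue;
		if (lo < start)
			lo = start;
		if (hi > end)
			hi = end;
		covered += hi - lo;
	}
	return (end - start) - covered;
}

/* Node count after reducing it so every node can hold 64MB. */
int demo_clamp_nodes(unsigned long long usable, int nodes)
{
	assert(nodes > 0);
	if (((usable / nodes) & ~(DEMO_NODE_MIN_SIZE - 1)) == 0)
		return (int)(usable / DEMO_NODE_MIN_SIZE);
	return nodes;
}

/* Per-node size: usable RAM split evenly, rounded down to 64MB. */
unsigned long long demo_node_size(unsigned long long usable, int nodes)
{
	unsigned long long sz;

	assert(nodes > 0);
	sz = (usable / nodes) & ~(DEMO_NODE_MIN_SIZE - 1);
	return sz ? sz : DEMO_NODE_MIN_SIZE;
}
```

For the sample map the hole over the first 5GB is 1GB, leaving 4GB of
usable RAM. Split three ways and rounded down, each node gets 21 granules
of 64MB. Asking for four nodes out of 128MB is reduced to two nodes.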

[akpm@osdl.org: build fix]
Signed-off-by: David Rientjes <reintjes@google.com>
Signed-off-by: Paul Menage <menage@google.com>
Signed-off-by: Rohit Seth <rohitseth@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 arch/x86_64/kernel/e820.c   |   31 ++++++++++++
 arch/x86_64/mm/numa.c       |  110 ++++++++++++++++++++++++++++++++++++++------
 include/asm-x86_64/e820.h   |    1 
 include/asm-x86_64/mmzone.h |    5 ++
 4 files changed, 133 insertions(+), 14 deletions(-)

Index: linux/arch/x86_64/kernel/e820.c
===================================================================
--- linux.orig/arch/x86_64/kernel/e820.c
+++ linux/arch/x86_64/kernel/e820.c
@@ -191,6 +191,37 @@ unsigned long __init e820_end_of_ram(voi
 }
 
 /*
+ * Find the hole size in the range.
+ */
+unsigned long __init e820_hole_size(unsigned long start, unsigned long end)
+{
+	unsigned long ram = 0;
+	int i;
+
+	for (i = 0; i < e820.nr_map; i++) {
+		struct e820entry *ei = &e820.map[i];
+		unsigned long last, addr;
+
+		if (ei->type != E820_RAM ||
+		    ei->addr+ei->size <= start ||
+		    ei->addr >= end)
+			continue;
+
+		addr = round_up(ei->addr, PAGE_SIZE);
+		if (addr < start)
+			addr = start;
+
+		last = round_down(ei->addr + ei->size, PAGE_SIZE);
+		if (last >= end)
+			last = end;
+
+		if (last > addr)
+			ram += last - addr;
+	}
+	return ((end - start) - ram);
+}
+
+/*
  * Mark e820 reserved areas as busy for the resource manager.
  */
 void __init e820_reserve_resources(void)
Index: linux/arch/x86_64/mm/numa.c
===================================================================
--- linux.orig/arch/x86_64/mm/numa.c
+++ linux/arch/x86_64/mm/numa.c
@@ -272,31 +272,113 @@ void __init numa_init_array(void)
 }
 
 #ifdef CONFIG_NUMA_EMU
+/* Numa emulation */
 int numa_fake __initdata = 0;
 
-/* Numa emulation */
+/*
+ * This function is used to find out if the start and end correspond to
+ * different zones.
+ */
+int zone_cross_over(unsigned long start, unsigned long end)
+{
+	if ((start < (MAX_DMA32_PFN << PAGE_SHIFT)) &&
+			(end >= (MAX_DMA32_PFN << PAGE_SHIFT)))
+		return 1;
+	return 0;
+}
+
 static int __init numa_emulation(unsigned long start_pfn, unsigned long end_pfn)
 {
- 	int i;
+ 	int i, big;
  	struct bootnode nodes[MAX_NUMNODES];
- 	unsigned long sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
+ 	unsigned long sz, old_sz;
+	unsigned long hole_size;
+	unsigned long start, end;
+	unsigned long max_addr = (end_pfn << PAGE_SHIFT);
+
+	start = (start_pfn << PAGE_SHIFT);
+	hole_size = e820_hole_size(start, max_addr);
+	sz = (max_addr - start - hole_size) / numa_fake;
 
  	/* Kludge needed for the hash function */
- 	if (hweight64(sz) > 1) {
- 		unsigned long x = 1;
- 		while ((x << 1) < sz)
- 			x <<= 1;
- 		if (x < sz/2)
- 			printk(KERN_ERR "Numa emulation unbalanced. Complain to maintainer\n");
- 		sz = x;
- 	}
 
+	old_sz = sz;
+	/*
+	 * Round down to the nearest FAKE_NODE_MIN_SIZE.
+	 */
+	sz &= FAKE_NODE_MIN_HASH_MASK;
+
+	/*
+	 * We ensure that each node is at least 64MB big.  Smaller than this
+	 * size can cause VM hiccups.
+	 */
+	if (sz == 0) {
+		printk(KERN_INFO "Not enough memory for %d nodes.  Reducing "
+				"the number of nodes\n", numa_fake);
+		numa_fake = (max_addr - start - hole_size) / FAKE_NODE_MIN_SIZE;
+		printk(KERN_INFO "Number of fake nodes will be = %d\n",
+				numa_fake);
+		sz = FAKE_NODE_MIN_SIZE;
+	}
+	/*
+	 * Find out how many nodes can get an extra NODE_MIN_SIZE granule.
+	 * This logic ensures the extra memory gets distributed among as many
+	 * nodes as possible (as compared to one single node getting all that
+	 * extra memory).
+	 */
+	big = ((old_sz - sz) * numa_fake) / FAKE_NODE_MIN_SIZE;
+	printk(KERN_INFO "Fake node Size: %luMB hole_size: %luMB big nodes: "
+			"%d\n",
+			(sz >> 20), (hole_size >> 20), big);
  	memset(&nodes,0,sizeof(nodes));
+	end = start;
  	for (i = 0; i < numa_fake; i++) {
- 		nodes[i].start = (start_pfn<<PAGE_SHIFT) + i*sz;
+		/*
+		 * In case we are not able to allocate enough memory for all
+		 * the nodes, we reduce the number of fake nodes.
+		 */
+		if (end >= max_addr) {
+			numa_fake = i - 1;
+			break;
+		}
+ 		start = nodes[i].start = end;
+		/*
+		 * Final node can have all the remaining memory.
+		 */
  		if (i == numa_fake-1)
- 			sz = (end_pfn<<PAGE_SHIFT) - nodes[i].start;
- 		nodes[i].end = nodes[i].start + sz;
+ 			sz = max_addr - start;
+ 		end = nodes[i].start + sz;
+		/*
+		 * The first "big" nodes get an extra granule.
+		 */
+		if (i < big)
+			end += FAKE_NODE_MIN_SIZE;
+		/*
+		 * Iterate over the range to ensure that this node gets at
+		 * least sz amount of RAM (excluding holes)
+		 */
+		while ((end - start - e820_hole_size(start, end)) < sz) {
+			end += FAKE_NODE_MIN_SIZE;
+			if (end >= max_addr)
+				break;
+		}
+		/*
+		 * Look at the next node to make sure there is some real memory
+		 * to map.  Bad things happen when the only memory present
+		 * in a zone on a fake node is IO hole.
+		 */
+		while (e820_hole_size(end, end + FAKE_NODE_MIN_SIZE) > 0) {
+			if (zone_cross_over(start, end + sz)) {
+				end = (MAX_DMA32_PFN << PAGE_SHIFT);
+				break;
+			}
+			if (end >= max_addr)
+				break;
+			end += FAKE_NODE_MIN_SIZE;
+		}
+		if (end > max_addr)
+			end = max_addr;
+		nodes[i].end = end;
  		printk(KERN_INFO "Faking node %d at %016Lx-%016Lx (%LuMB)\n",
  		       i,
  		       nodes[i].start, nodes[i].end,
Index: linux/include/asm-x86_64/e820.h
===================================================================
--- linux.orig/include/asm-x86_64/e820.h
+++ linux/include/asm-x86_64/e820.h
@@ -46,6 +46,7 @@ extern void e820_mark_nosave_regions(voi
 extern void e820_print_map(char *who);
 extern int e820_any_mapped(unsigned long start, unsigned long end, unsigned type);
 extern int e820_all_mapped(unsigned long start, unsigned long end, unsigned type);
+extern unsigned long e820_hole_size(unsigned long start, unsigned long end);
 
 extern void e820_setup_gap(void);
 extern void e820_register_active_regions(int nid,
Index: linux/include/asm-x86_64/mmzone.h
===================================================================
--- linux.orig/include/asm-x86_64/mmzone.h
+++ linux/include/asm-x86_64/mmzone.h
@@ -47,5 +47,10 @@ static inline __attribute__((pure)) int 
 extern int pfn_valid(unsigned long pfn);
 #endif
 
+#ifdef CONFIG_NUMA_EMU
+#define FAKE_NODE_MIN_SIZE	(64*1024*1024)
+#define FAKE_NODE_MIN_HASH_MASK	(~(FAKE_NODE_MIN_SIZE - 1ul))
+#endif
+
 #endif
 #endif

* [PATCH x86 for review II] [8/39] x86_64: Remove fastcall references in x86_64 code
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (5 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [7/39] x86_64: Fix fake numa for x86_64 machines with big IO hole Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [9/39] x86_64: Use constant instead of raw number in x86_64 ioperm.c Andi Kleen
                   ` (30 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Glauber de Oliveira Costa, Andi Kleen, patches, linux-kernel


From: Glauber de Oliveira Costa <gcosta@redhat.com>

Unlike x86, x86_64 already passes arguments in registers.  The use of
regparm attribute makes no difference in produced code, and the use of
fastcall just bloats the code.
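
As a rough illustration of why dropping the annotation is harmless, the
sketch below assumes that FASTCALL() and fastcall expand to nothing when
the architecture does not define a register-passing convention of its own
(the macro definitions and the function name demo_copy_routine are
stand-ins for illustration, not copied from the tree). Both declaration
styles then name exactly the same function type:

```c
/*
 * Assumed fallback definitions: on an architecture where arguments are
 * already passed in registers, the annotations expand to nothing.
 */
#define FASTCALL(x) x
#define fastcall

/* Old style, as removed by the patch (hypothetical function name). */
extern unsigned long FASTCALL(demo_copy_routine(unsigned long));

/* New style, as added by the patch; a compatible redeclaration. */
extern unsigned long demo_copy_routine(unsigned long);

/* A trivial body so that the example links; it just echoes its argument. */
unsigned long demo_copy_routine(unsigned long addr)
{
	return addr;
}

/* Function-pointer types with and without the annotation are identical. */
typedef fastcall unsigned long (*annotated_fn_t)(unsigned long);
typedef unsigned long (*plain_fn_t)(unsigned long);

static plain_fn_t pick_routine(void)
{
	annotated_fn_t fn = demo_copy_routine;	/* no cast needed */
	return fn;
}
```

Because the two spellings are the same type under this assumption, the
typecheck_fn() users in mutex.h can drop the annotation without any
change in generated code.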

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 arch/x86_64/kernel/acpi/sleep.c  |    2 +-
 arch/x86_64/kernel/x8664_ksyms.c |    4 ++--
 include/asm-x86_64/hw_irq.h      |    2 +-
 include/asm-x86_64/mutex.h       |    6 +++---
 4 files changed, 7 insertions(+), 7 deletions(-)

Index: linux/arch/x86_64/kernel/acpi/sleep.c
===================================================================
--- linux.orig/arch/x86_64/kernel/acpi/sleep.c
+++ linux/arch/x86_64/kernel/acpi/sleep.c
@@ -58,7 +58,7 @@ unsigned long acpi_wakeup_address = 0;
 unsigned long acpi_video_flags;
 extern char wakeup_start, wakeup_end;
 
-extern unsigned long FASTCALL(acpi_copy_wakeup_routine(unsigned long));
+extern unsigned long acpi_copy_wakeup_routine(unsigned long);
 
 static pgd_t low_ptr;
 
Index: linux/arch/x86_64/kernel/x8664_ksyms.c
===================================================================
--- linux.orig/arch/x86_64/kernel/x8664_ksyms.c
+++ linux/arch/x86_64/kernel/x8664_ksyms.c
@@ -36,8 +36,8 @@ EXPORT_SYMBOL(copy_page);
 EXPORT_SYMBOL(clear_page);
 
 #ifdef CONFIG_SMP
-extern void FASTCALL( __write_lock_failed(rwlock_t *rw));
-extern void FASTCALL( __read_lock_failed(rwlock_t *rw));
+extern void  __write_lock_failed(rwlock_t *rw);
+extern void  __read_lock_failed(rwlock_t *rw);
 EXPORT_SYMBOL(__write_lock_failed);
 EXPORT_SYMBOL(__read_lock_failed);
 #endif
Index: linux/include/asm-x86_64/hw_irq.h
===================================================================
--- linux.orig/include/asm-x86_64/hw_irq.h
+++ linux/include/asm-x86_64/hw_irq.h
@@ -91,7 +91,7 @@ extern void enable_8259A_irq(unsigned in
 extern int i8259A_irq_pending(unsigned int irq);
 extern void make_8259A_irq(unsigned int irq);
 extern void init_8259A(int aeoi);
-extern void FASTCALL(send_IPI_self(int vector));
+extern void send_IPI_self(int vector);
 extern void init_VISWS_APIC_irqs(void);
 extern void setup_IO_APIC(void);
 extern void disable_IO_APIC(void);
Index: linux/include/asm-x86_64/mutex.h
===================================================================
--- linux.orig/include/asm-x86_64/mutex.h
+++ linux/include/asm-x86_64/mutex.h
@@ -21,7 +21,7 @@ do {									\
 	unsigned long dummy;						\
 									\
 	typecheck(atomic_t *, v);					\
-	typecheck_fn(fastcall void (*)(atomic_t *), fail_fn);		\
+	typecheck_fn(void (*)(atomic_t *), fail_fn);			\
 									\
 	__asm__ __volatile__(						\
 		LOCK_PREFIX "   decl (%%rdi)	\n"			\
@@ -47,7 +47,7 @@ do {									\
  */
 static inline int
 __mutex_fastpath_lock_retval(atomic_t *count,
-			     int fastcall (*fail_fn)(atomic_t *))
+			     int (*fail_fn)(atomic_t *))
 {
 	if (unlikely(atomic_dec_return(count) < 0))
 		return fail_fn(count);
@@ -67,7 +67,7 @@ do {									\
 	unsigned long dummy;						\
 									\
 	typecheck(atomic_t *, v);					\
-	typecheck_fn(fastcall void (*)(atomic_t *), fail_fn);		\
+	typecheck_fn(void (*)(atomic_t *), fail_fn);			\
 									\
 	__asm__ __volatile__(						\
 		LOCK_PREFIX "   incl (%%rdi)	\n"			\

* [PATCH x86 for review II] [9/39] x86_64: Use constant instead of raw number in x86_64 ioperm.c
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (6 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [8/39] x86_64: Remove fastcall references in x86_64 code Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [10/39] x86_64: Handle 32 bit PerfMon Counter writes cleanly in x86_64 nmi_watchdog Andi Kleen
                   ` (29 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Glauber de Oliveira Costa, Andi Kleen, patches, linux-kernel


From: Glauber de Oliveira Costa <gcosta@redhat.com>

This is a tiny cleanup to increase readability: sys_iopl() now masks
eflags with X86_EFLAGS_IOPL instead of the raw number 0x3000UL.
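
For readers unfamiliar with the field, here is a small user-space sketch
of the same mask-and-shift. The helper name set_iopl() is made up for
illustration; X86_EFLAGS_IOPL is defined locally with the value 0x3000
(EFLAGS bits 12 and 13), matching the raw number the patch replaces:

```c
/* IOPL occupies EFLAGS bits 12-13; same value as the old raw constant. */
#define X86_EFLAGS_IOPL 0x00003000UL

/*
 * Hypothetical helper mirroring the assignment in sys_iopl(): clear the
 * old I/O privilege level and install the new one (0..3).
 */
static unsigned long set_iopl(unsigned long eflags, unsigned int level)
{
	return (eflags & ~X86_EFLAGS_IOPL) | ((unsigned long)level << 12);
}
```

For example, set_iopl(0x202, 3) yields 0x3202, and set_iopl(0x3202, 0)
yields 0x202 again; all other flag bits are left alone.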

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 arch/x86_64/kernel/ioport.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/arch/x86_64/kernel/ioport.c
===================================================================
--- linux.orig/arch/x86_64/kernel/ioport.c
+++ linux/arch/x86_64/kernel/ioport.c
@@ -114,6 +114,6 @@ asmlinkage long sys_iopl(unsigned int le
 		if (!capable(CAP_SYS_RAWIO))
 			return -EPERM;
 	}
-	regs->eflags = (regs->eflags &~ 0x3000UL) | (level << 12);
+	regs->eflags = (regs->eflags &~ X86_EFLAGS_IOPL) | (level << 12);
 	return 0;
 }

* [PATCH x86 for review II] [10/39] x86_64: Handle 32 bit PerfMon Counter writes cleanly in x86_64 nmi_watchdog
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (7 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [9/39] x86_64: Use constant instead of raw number in x86_64 ioperm.c Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [11/39] i386: Handle 32 bit PerfMon Counter writes cleanly in i386 nmi_watchdog Andi Kleen
                   ` (28 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Venkatesh Pallipadi, patches, linux-kernel


From: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>


P6 CPUs and Core/Core 2 CPUs which have the 'architectural perf mon'
feature only support writes of the low 32 bits of the Performance
Monitoring Counters. Bits 32..39 are sign extended based on bit 31, and
bits 40..63 are reserved and should be zero.

This patch:

Change x86_64 nmi handler to handle this case cleanly.
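
The arithmetic behind the new helper can be tried out in user space. The
sketch below is only an approximation of adjust_for_32bit_ctr(): cpu_khz
is passed in as a parameter rather than read from the kernel global, and
the function names are invented for illustration. The point is that the
counter period must fit in 31 bits, so that bit 31 of the negated value
is set and the hardware sign-extends it into bits 32..39:

```c
#include <stdint.h>

/*
 * Raise hz until cpu_hz / hz fits in 31 bits. Mirrors the logic of the
 * patch's adjust_for_32bit_ctr(), with cpu_khz as an explicit argument.
 */
static unsigned int adjust_hz_for_32bit_ctr(unsigned int cpu_khz,
					    unsigned int hz)
{
	uint64_t cycles_per_sec = (uint64_t)cpu_khz * 1000;

	if (cycles_per_sec / hz > 0x7fffffffULL)
		hz = (unsigned int)(cycles_per_sec / 0x7fffffffULL) + 1;
	return hz;
}

/* Low 32 bits that would be written to the counter: the negated period. */
static uint32_t perfctr_low_word(unsigned int cpu_khz, unsigned int hz)
{
	uint64_t period = ((uint64_t)cpu_khz * 1000) / hz;

	return (uint32_t)(0 - period);
}
```

On a 3 GHz CPU (cpu_khz = 3000000) a requested rate of 1 Hz would need a
period of 3000000000 cycles, which does not fit in 31 bits, so the helper
returns 2. The period then becomes 1500000000 and the low word written is
0xa697d100, which has bit 31 set as required.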

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/x86_64/kernel/nmi.c |   46 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 32 insertions(+), 14 deletions(-)

Index: linux/arch/x86_64/kernel/nmi.c
===================================================================
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -214,6 +214,23 @@ static __init void nmi_cpu_busy(void *da
 }
 #endif
 
+static unsigned int adjust_for_32bit_ctr(unsigned int hz)
+{
+	unsigned int retval = hz;
+
+	/*
+	 * On Intel CPUs with ARCH_PERFMON only 32 bits in the counter
+	 * are writable, with higher bits sign extending from bit 31.
+	 * So, we can only program the counter with 31 bit values and
+	 * 32nd bit should be 1, for 33.. to be 1.
+	 * Find the appropriate nmi_hz
+	 */
+ 	if ((((u64)cpu_khz * 1000) / retval) > 0x7fffffffULL) {
+		retval = ((u64)cpu_khz * 1000) / 0x7fffffffUL + 1;
+	}
+	return retval;
+}
+
 int __init check_nmi_watchdog (void)
 {
 	int *counts;
@@ -268,17 +285,8 @@ int __init check_nmi_watchdog (void)
 		struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
 
 		nmi_hz = 1;
-		/*
-		 * On Intel CPUs with ARCH_PERFMON only 32 bits in the counter
-		 * are writable, with higher bits sign extending from bit 31.
-		 * So, we can only program the counter with 31 bit values and
-		 * 32nd bit should be 1, for 33.. to be 1.
-		 * Find the appropriate nmi_hz
-		 */
-	 	if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0 &&
-			((u64)cpu_khz * 1000) > 0x7fffffffULL) {
-			nmi_hz = ((u64)cpu_khz * 1000) / 0x7fffffffUL + 1;
-		}
+	 	if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0)
+			nmi_hz = adjust_for_32bit_ctr(nmi_hz);
 	}
 
 	kfree(counts);
@@ -634,7 +642,9 @@ static int setup_intel_arch_watchdog(voi
 
 	/* setup the timer */
 	wrmsr(evntsel_msr, evntsel, 0);
-	wrmsrl(perfctr_msr, -((u64)cpu_khz * 1000 / nmi_hz));
+
+	nmi_hz = adjust_for_32bit_ctr(nmi_hz);
+	wrmsr(perfctr_msr, (u32)(-((u64)cpu_khz * 1000 / nmi_hz)), 0);
 
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= ARCH_PERFMON_EVENTSEL0_ENABLE;
@@ -855,15 +865,23 @@ int __kprobes nmi_watchdog_tick(struct p
 				dummy &= ~P4_CCCR_OVF;
 	 			wrmsrl(wd->cccr_msr, dummy);
 	 			apic_write(APIC_LVTPC, APIC_DM_NMI);
+				/* start the cycle over again */
+				wrmsrl(wd->perfctr_msr,
+				       -((u64)cpu_khz * 1000 / nmi_hz));
 	 		} else if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0) {
 				/*
 				 * ArchPerfom/Core Duo needs to re-unmask
 				 * the apic vector
 				 */
 				apic_write(APIC_LVTPC, APIC_DM_NMI);
+				/* ARCH_PERFMON has 32 bit counter writes */
+				wrmsr(wd->perfctr_msr,
+				     (u32)(-((u64)cpu_khz * 1000 / nmi_hz)), 0);
+			} else {
+				/* start the cycle over again */
+				wrmsrl(wd->perfctr_msr,
+				       -((u64)cpu_khz * 1000 / nmi_hz));
 			}
-			/* start the cycle over again */
-			wrmsrl(wd->perfctr_msr, -((u64)cpu_khz * 1000 / nmi_hz));
 			rc = 1;
 		} else 	if (nmi_watchdog == NMI_IO_APIC) {
 			/* don't know how to accurately check for this.

* [PATCH x86 for review II] [11/39] i386: Handle 32 bit PerfMon Counter writes cleanly in i386 nmi_watchdog
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (8 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [10/39] x86_64: Handle 32 bit PerfMon Counter writes cleanly in x86_64 nmi_watchdog Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:37 ` [PATCH x86 for review II] [12/39] i386: Handle 32 bit PerfMon Counter writes cleanly in oprofile Andi Kleen
                   ` (27 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Venkatesh Pallipadi, patches, linux-kernel


From: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>

Change i386 nmi handler to handle 32 bit perfmon counter MSR writes cleanly.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/i386/kernel/nmi.c |   64 ++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 48 insertions(+), 16 deletions(-)

Index: linux/arch/i386/kernel/nmi.c
===================================================================
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -216,6 +216,28 @@ static __init void nmi_cpu_busy(void *da
 }
 #endif
 
+static unsigned int adjust_for_32bit_ctr(unsigned int hz)
+{
+	u64 counter_val;
+	unsigned int retval = hz;
+
+	/*
+	 * On Intel CPUs with P6/ARCH_PERFMON only 32 bits in the counter
+	 * are writable, with higher bits sign extending from bit 31.
+	 * So, we can only program the counter with 31 bit values and
+	 * 32nd bit should be 1, for 33.. to be 1.
+	 * Find the appropriate nmi_hz
+	 */
+	counter_val = (u64)cpu_khz * 1000;
+	do_div(counter_val, retval);
+ 	if (counter_val > 0x7fffffffULL) {
+		u64 count = (u64)cpu_khz * 1000;
+		do_div(count, 0x7fffffffUL);
+		retval = count + 1;
+	}
+	return retval;
+}
+
 static int __init check_nmi_watchdog(void)
 {
 	unsigned int *prev_nmi_count;
@@ -281,18 +303,10 @@ static int __init check_nmi_watchdog(voi
 		struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
 
 		nmi_hz = 1;
-		/*
-		 * On Intel CPUs with ARCH_PERFMON only 32 bits in the counter
-		 * are writable, with higher bits sign extending from bit 31.
-		 * So, we can only program the counter with 31 bit values and
-		 * 32nd bit should be 1, for 33.. to be 1.
-		 * Find the appropriate nmi_hz
-		 */
-	 	if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0 &&
-			((u64)cpu_khz * 1000) > 0x7fffffffULL) {
-			u64 count = (u64)cpu_khz * 1000;
-			do_div(count, 0x7fffffffUL);
-			nmi_hz = count + 1;
+
+		if (wd->perfctr_msr == MSR_P6_PERFCTR0 ||
+		    wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0) {
+			nmi_hz = adjust_for_32bit_ctr(nmi_hz);
 		}
 	}
 
@@ -442,6 +456,17 @@ static void write_watchdog_counter(unsig
 	wrmsrl(perfctr_msr, 0 - count);
 }
 
+static void write_watchdog_counter32(unsigned int perfctr_msr,
+		const char *descr)
+{
+	u64 count = (u64)cpu_khz * 1000;
+
+	do_div(count, nmi_hz);
+	if(descr)
+		Dprintk("setting %s to -0x%08Lx\n", descr, count);
+	wrmsr(perfctr_msr, (u32)(-count), 0);
+}
+
 /* Note that these events don't tick when the CPU idles. This means
    the frequency varies with CPU load. */
 
@@ -531,7 +556,8 @@ static int setup_p6_watchdog(void)
 
 	/* setup the timer */
 	wrmsr(evntsel_msr, evntsel, 0);
-	write_watchdog_counter(perfctr_msr, "P6_PERFCTR0");
+	nmi_hz = adjust_for_32bit_ctr(nmi_hz);
+	write_watchdog_counter32(perfctr_msr, "P6_PERFCTR0");
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= P6_EVNTSEL0_ENABLE;
 	wrmsr(evntsel_msr, evntsel, 0);
@@ -704,7 +730,8 @@ static int setup_intel_arch_watchdog(voi
 
 	/* setup the timer */
 	wrmsr(evntsel_msr, evntsel, 0);
-	write_watchdog_counter(perfctr_msr, "INTEL_ARCH_PERFCTR0");
+	nmi_hz = adjust_for_32bit_ctr(nmi_hz);
+	write_watchdog_counter32(perfctr_msr, "INTEL_ARCH_PERFCTR0");
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	evntsel |= ARCH_PERFMON_EVENTSEL0_ENABLE;
 	wrmsr(evntsel_msr, evntsel, 0);
@@ -956,6 +983,8 @@ __kprobes int nmi_watchdog_tick(struct p
 				dummy &= ~P4_CCCR_OVF;
 	 			wrmsrl(wd->cccr_msr, dummy);
 	 			apic_write(APIC_LVTPC, APIC_DM_NMI);
+				/* start the cycle over again */
+				write_watchdog_counter(wd->perfctr_msr, NULL);
 	 		}
 			else if (wd->perfctr_msr == MSR_P6_PERFCTR0 ||
 				 wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0) {
@@ -964,9 +993,12 @@ __kprobes int nmi_watchdog_tick(struct p
 				 * other P6 variant.
 				 * ArchPerfom/Core Duo also needs this */
 				apic_write(APIC_LVTPC, APIC_DM_NMI);
+				/* P6/ARCH_PERFMON has 32 bit counter write */
+				write_watchdog_counter32(wd->perfctr_msr, NULL);
+			} else {
+				/* start the cycle over again */
+				write_watchdog_counter(wd->perfctr_msr, NULL);
 			}
-			/* start the cycle over again */
-			write_watchdog_counter(wd->perfctr_msr, NULL);
 			rc = 1;
 		} else if (nmi_watchdog == NMI_IO_APIC) {
 			/* don't know how to accurately check for this.

* [PATCH x86 for review II] [12/39] i386: Handle 32 bit PerfMon Counter writes cleanly in oprofile
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (9 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [11/39] i386: Handle 32 bit PerfMon Counter writes cleanly in i386 nmi_watchdog Andi Kleen
@ 2007-02-12  7:37 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [13/39] i386: CONFIG_PHYSICAL_ALIGN limited to 4M? Andi Kleen
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:37 UTC (permalink / raw)
  To: Venkatesh Pallipadi, patches, linux-kernel


From: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>

Handle 32 bit perfmon counter MSR writes cleanly in oprofile: only the
low 32 bits are writable, so write 0 rather than -1 to the high word.
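
A user-space model of the write and the overflow check may help here. It
is a sketch under the assumption, taken from the description of the
nmi_watchdog patches in this series, that the counter is 40 bits wide
with bits 32..39 sign extended from bit 31. The function names are
invented; CTR_OVERFLOWED is the macro from op_model_ppro.c:

```c
#include <stdint.h>

#define CTR_OVERFLOWED(n) (!((n) & (1U << 31)))

/* Low word that CTR_32BIT_WRITE(count, ...) passes to wrmsr(). */
static uint32_t ctr_low_word(uint32_t count)
{
	return (uint32_t)(0u - count);
}

/*
 * Modelled hardware behaviour (an assumption): the value actually held
 * in the 40-bit counter after a 32 bit write, with bits 32..39 copied
 * from bit 31.
 */
static uint64_t ctr_value_after_write(uint32_t low)
{
	uint64_t value = low;

	if (low & (1U << 31))
		value |= 0xffULL << 32;
	return value;
}

/* Advance the modelled counter by a number of events, wrapping at 40 bits. */
static uint64_t ctr_advance(uint64_t value, uint64_t events)
{
	return (value + events) & ((1ULL << 40) - 1);
}
```

With count = 100000 the low word is 0xfffe7960, the modelled counter
holds 0xfffffe7960, and CTR_OVERFLOWED() is false. After 100000 events
the counter wraps to 0, bit 31 is clear, and CTR_OVERFLOWED() reports
the overflow.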

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/i386/oprofile/op_model_ppro.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Index: linux/arch/i386/oprofile/op_model_ppro.c
===================================================================
--- linux.orig/arch/i386/oprofile/op_model_ppro.c
+++ linux/arch/i386/oprofile/op_model_ppro.c
@@ -24,7 +24,8 @@
 
 #define CTR_IS_RESERVED(msrs,c) (msrs->counters[(c)].addr ? 1 : 0)
 #define CTR_READ(l,h,msrs,c) do {rdmsr(msrs->counters[(c)].addr, (l), (h));} while (0)
-#define CTR_WRITE(l,msrs,c) do {wrmsr(msrs->counters[(c)].addr, -(u32)(l), -1);} while (0)
+#define CTR_32BIT_WRITE(l,msrs,c)	\
+	do {wrmsr(msrs->counters[(c)].addr, -(u32)(l), 0);} while (0)
 #define CTR_OVERFLOWED(n) (!((n) & (1U<<31)))
 
 #define CTRL_IS_RESERVED(msrs,c) (msrs->controls[(c)].addr ? 1 : 0)
@@ -79,7 +80,7 @@ static void ppro_setup_ctrs(struct op_ms
 	for (i = 0; i < NUM_COUNTERS; ++i) {
 		if (unlikely(!CTR_IS_RESERVED(msrs,i)))
 			continue;
-		CTR_WRITE(1, msrs, i);
+		CTR_32BIT_WRITE(1, msrs, i);
 	}
 
 	/* enable active counters */
@@ -87,7 +88,7 @@ static void ppro_setup_ctrs(struct op_ms
 		if ((counter_config[i].enabled) && (CTR_IS_RESERVED(msrs,i))) {
 			reset_value[i] = counter_config[i].count;
 
-			CTR_WRITE(counter_config[i].count, msrs, i);
+			CTR_32BIT_WRITE(counter_config[i].count, msrs, i);
 
 			CTRL_READ(low, high, msrs, i);
 			CTRL_CLEAR(low);
@@ -116,7 +117,7 @@ static int ppro_check_ctrs(struct pt_reg
 		CTR_READ(low, high, msrs, i);
 		if (CTR_OVERFLOWED(low)) {
 			oprofile_add_sample(regs, i);
-			CTR_WRITE(reset_value[i], msrs, i);
+			CTR_32BIT_WRITE(reset_value[i], msrs, i);
 		}
 	}
 

* [PATCH x86 for review II] [13/39] i386: CONFIG_PHYSICAL_ALIGN limited to 4M?
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (10 preceding siblings ...)
  2007-02-12  7:37 ` [PATCH x86 for review II] [12/39] i386: Handle 32 bit PerfMon Counter writes cleanly in oprofile Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-13  6:36   ` Rene Herman
  2007-02-12  7:38 ` [PATCH x86 for review II] [14/39] x86_64: cleanup Doc/x86_64/ files Andi Kleen
                   ` (25 subsequent siblings)
  37 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Rene Herman, patches, linux-kernel


From: Rene Herman <rene.herman@gmail.com>

A while ago it was remarked on list here that keeping the kernel 4M
aligned physically might be a performance win if the added 1M (it
normally loads at 1M) meant it would fit on one 4M aligned hugepage
instead of 2. Since that time I've been running my kernels that way.

In fact, while I was at it, I ran the kernel at 16M; while admittedly a 
bit of a non-issue, having never experienced ZONE_DMA shortage, I am an 
ISA user on a >16M machine so this seemed to make sense -- no kernel 
eating up "precious" ISA-DMAable memory.

Recently CONFIG_PHYSICAL_START was replaced by CONFIG_PHYSICAL_ALIGN 
(commit e69f202d0a1419219198566e1c22218a5c71a9a6) and while 4M alignment 
is still possible, that's also the strictest alignment allowed meaning I 
can't load my (non-relocatable) kernel at 16M anymore.

If I just apply the following and set it to 16M, things seem to be 
working for me. Was there an important reason to limit the alignment to 
4M, and if so, even on non relocatable kernels?
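
To make the numbers concrete, here is a small sketch of the rounding
involved. It assumes the load address is simply the default 1M start
rounded up to the configured alignment; the helper name align_up() and
that assumption are mine, not taken from the tree:

```c
/*
 * Round addr up to the next multiple of align (align must be a power of
 * two). Hypothetical helper for illustration only.
 */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

Under that assumption, align_up(0x100000, 0x400000) gives 0x400000 (4M),
the strictest placement the old range allowed, while
align_up(0x100000, 0x1000000) gives 0x1000000 (16M), which is only
reachable once the upper bound of the range is raised.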

Rene.

Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/i386/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/arch/i386/Kconfig
===================================================================
--- linux.orig/arch/i386/Kconfig
+++ linux/arch/i386/Kconfig
@@ -843,7 +843,7 @@ config RELOCATABLE
 config PHYSICAL_ALIGN
 	hex "Alignment value to which kernel should be aligned"
 	default "0x100000"
-	range 0x2000 0x400000
+	range 0x2000 0x1000000
 	help
 	  This value puts the alignment restrictions on physical address
  	  where kernel is loaded and run from. Kernel is compiled for an

* [PATCH x86 for review II] [14/39] x86_64: cleanup Doc/x86_64/ files
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (11 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [13/39] i386: CONFIG_PHYSICAL_ALIGN limited to 4M? Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [15/39] x86_64: list x86_64 quilt tree Andi Kleen
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Randy Dunlap, patches, linux-kernel


From: Randy Dunlap <randy.dunlap@oracle.com>

Fix typos.
Lots of whitespace changes for readability and consistency.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 Documentation/x86_64/boot-options.txt |   27 ++++++++++-----------------
 Documentation/x86_64/cpu-hotplug-spec |    2 +-
 Documentation/x86_64/kernel-stacks    |   26 +++++++++++++-------------
 Documentation/x86_64/mm.txt           |   22 +++++++++++-----------
 4 files changed, 35 insertions(+), 42 deletions(-)

Index: linux/Documentation/x86_64/cpu-hotplug-spec
===================================================================
--- linux.orig/Documentation/x86_64/cpu-hotplug-spec
+++ linux/Documentation/x86_64/cpu-hotplug-spec
@@ -2,7 +2,7 @@ Firmware support for CPU hotplug under L
 ---------------------------------------------------
 
 Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to
-know in advance boot time the maximum number of CPUs that could be plugged
+know in advance of boot time the maximum number of CPUs that could be plugged
 into the system. ACPI 3.0 currently has no official way to supply
 this information from the firmware to the operating system.
 
Index: linux/Documentation/x86_64/kernel-stacks
===================================================================
--- linux.orig/Documentation/x86_64/kernel-stacks
+++ linux/Documentation/x86_64/kernel-stacks
@@ -9,9 +9,9 @@ zombie. While the thread is in user spac
 except for the thread_info structure at the bottom.
 
 In addition to the per thread stacks, there are specialized stacks
-associated with each cpu.  These stacks are only used while the kernel
-is in control on that cpu, when a cpu returns to user space the
-specialized stacks contain no useful data.  The main cpu stacks is
+associated with each CPU.  These stacks are only used while the kernel
+is in control on that CPU; when a CPU returns to user space the
+specialized stacks contain no useful data.  The main CPU stacks are:
 
 * Interrupt stack.  IRQSTACKSIZE
 
@@ -32,17 +32,17 @@ x86_64 also has a feature which is not a
 to automatically switch to a new stack for designated events such as
 double fault or NMI, which makes it easier to handle these unusual
 events on x86_64.  This feature is called the Interrupt Stack Table
-(IST).  There can be up to 7 IST entries per cpu. The IST code is an
-index into the Task State Segment (TSS), the IST entries in the TSS
-point to dedicated stacks, each stack can be a different size.
+(IST).  There can be up to 7 IST entries per CPU. The IST code is an
+index into the Task State Segment (TSS). The IST entries in the TSS
+point to dedicated stacks; each stack can be a different size.
 
-An IST is selected by an non-zero value in the IST field of an
+An IST is selected by a non-zero value in the IST field of an
 interrupt-gate descriptor.  When an interrupt occurs and the hardware
 loads such a descriptor, the hardware automatically sets the new stack
 pointer based on the IST value, then invokes the interrupt handler.  If
 software wants to allow nested IST interrupts then the handler must
 adjust the IST values on entry to and exit from the interrupt handler.
-(this is occasionally done, e.g. for debug exceptions)
+(This is occasionally done, e.g. for debug exceptions.)
 
 Events with different IST codes (i.e. with different stacks) can be
 nested.  For example, a debug interrupt can safely be interrupted by an
@@ -58,17 +58,17 @@ The currently assigned IST stacks are :-
 
   Used for interrupt 12 - Stack Fault Exception (#SS).
 
-  This allows to recover from invalid stack segments. Rarely
+  This allows the CPU to recover from invalid stack segments. Rarely
   happens.
 
 * DOUBLEFAULT_STACK.  EXCEPTION_STKSZ (PAGE_SIZE).
 
   Used for interrupt 8 - Double Fault Exception (#DF).
 
-  Invoked when handling a exception causes another exception. Happens
-  when the kernel is very confused (e.g. kernel stack pointer corrupt)
-  Using a separate stack allows to recover from it well enough in many
-  cases to still output an oops.
+  Invoked when handling one exception causes another exception. Happens
+  when the kernel is very confused (e.g. kernel stack pointer corrupt).
+  Using a separate stack allows the kernel to recover from it well enough
+  in many cases to still output an oops.
 
 * NMI_STACK.  EXCEPTION_STKSZ (PAGE_SIZE).
 
Index: linux/Documentation/x86_64/mm.txt
===================================================================
--- linux.orig/Documentation/x86_64/mm.txt
+++ linux/Documentation/x86_64/mm.txt
@@ -3,26 +3,26 @@
 
 Virtual memory map with 4 level page tables:
 
-0000000000000000 - 00007fffffffffff (=47bits) user space, different per mm
+0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
 hole caused by [48:63] sign extension
-ffff800000000000 - ffff80ffffffffff (=40bits) guard hole
-ffff810000000000 - ffffc0ffffffffff (=46bits) direct mapping of all phys. memory
-ffffc10000000000 - ffffc1ffffffffff (=40bits) hole
-ffffc20000000000 - ffffe1ffffffffff (=45bits) vmalloc/ioremap space
+ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
+ffff810000000000 - ffffc0ffffffffff (=46 bits) direct mapping of all phys. memory
+ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole
+ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space
 ... unused hole ...
-ffffffff80000000 - ffffffff82800000 (=40MB)   kernel text mapping, from phys 0
+ffffffff80000000 - ffffffff82800000 (=40 MB)   kernel text mapping, from phys 0
 ... unused hole ...
-ffffffff88000000 - fffffffffff00000 (=1919MB) module mapping space
+ffffffff88000000 - fffffffffff00000 (=1919 MB) module mapping space
 
-The direct mapping covers all memory in the system upto the highest
+The direct mapping covers all memory in the system up to the highest
 memory address (this means in some cases it can also include PCI memory
-holes)
+holes).
 
 vmalloc space is lazily synchronized into the different PML4 pages of
 the processes using the page fault handler, with init_level4_pgt as
 reference.
 
-Current X86-64 implementations only support 40 bit of address space,
-but we support upto 46bits. This expands into MBZ space in the page tables.
+Current X86-64 implementations only support 40 bits of address space,
+but we support up to 46 bits. This expands into MBZ space in the page tables.
 
 -Andi Kleen, Jul 2004
Index: linux/Documentation/x86_64/boot-options.txt
===================================================================
--- linux.orig/Documentation/x86_64/boot-options.txt
+++ linux/Documentation/x86_64/boot-options.txt
@@ -226,9 +226,9 @@ IOMMU (input/output memory management un
                        is 20.
     memaper[=<order>]  Allocate an own aperture over RAM with size 32MB<<order.
                        (default: order=1, i.e. 64MB)
-    merge              Do scather-gather (SG) merging. Implies "force"
+    merge              Do scatter-gather (SG) merging. Implies "force"
                        (experimental).
-    nomerge            Don't do scather-gather (SG) merging.
+    nomerge            Don't do scatter-gather (SG) merging.
     noaperture         Ask the IOMMU not to touch the aperture for AGP.
     forcesac           Force single-address cycle (SAC) mode for masks <40bits
                        (experimental).
@@ -275,14 +275,14 @@ IOMMU (input/output memory management un
 
 Debugging
 
-  oops=panic Always panic on oopses. Default is to just kill the process,
-	     but there is a small probability of deadlocking the machine.
-	     This will also cause panics on machine check exceptions.
-	     Useful together with panic=30 to trigger a reboot.
+  oops=panic	Always panic on oopses. Default is to just kill the process,
+		but there is a small probability of deadlocking the machine.
+		This will also cause panics on machine check exceptions.
+		Useful together with panic=30 to trigger a reboot.
 
-  kstack=N   Print that many words from the kernel stack in oops dumps.
+  kstack=N	Print N words from the kernel stack in oops dumps.
 
-  pagefaulttrace Dump all page faults. Only useful for extreme debugging
+  pagefaulttrace  Dump all page faults. Only useful for extreme debugging
 		and will create a lot of output.
 
   call_trace=[old|both|newfallback|new]
@@ -292,15 +292,8 @@ Debugging
 		newfallback: use new unwinder but fall back to old if it gets
 			stuck (default)
 
-  call_trace=[old|both|newfallback|new]
-		old: use old inexact backtracer
-		new: use new exact dwarf2 unwinder
- 		both: print entries from both
-		newfallback: use new unwinder but fall back to old if it gets
-			stuck (default)
-
-Misc
+Miscellaneous
 
   noreplacement  Don't replace instructions with more appropriate ones
 		 for the CPU. This may be useful on asymmetric MP systems
-		 where some CPU have less capabilities than the others.
+		 where some CPUs have less capabilities than others.

* [PATCH x86 for review II] [15/39] x86_64: list x86_64 quilt tree
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (12 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [14/39] x86_64: cleanup Doc/x86_64/ files Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [16/39] x86: simplify notify_page_fault() Andi Kleen
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Randy Dunlap, patches, linux-kernel


From: Randy Dunlap <rdunlap@xenotime.net>

List x86_64 quilt tree in MAINTAINERS.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 MAINTAINERS |    1 +
 1 file changed, 1 insertion(+)

Index: linux/MAINTAINERS
===================================================================
--- linux.orig/MAINTAINERS
+++ linux/MAINTAINERS
@@ -3735,6 +3735,7 @@ P:	Andi Kleen
 M:	ak@suse.de
 L:	discuss@x86-64.org
 W:	http://www.x86-64.org
+T:	quilt ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current
 S:	Maintained
 
 YAM DRIVER FOR AX.25

* [PATCH x86 for review II] [16/39] x86: simplify notify_page_fault()
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (13 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [15/39] x86_64: list x86_64 quilt tree Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [17/39] x86_64: Tighten mce_amd driver MSR reads Andi Kleen
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Jan Beulich, patches, linux-kernel


From: "Jan Beulich" <jbeulich@novell.com>
Remove all parameters from this function that aren't really variable.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/i386/mm/fault.c   |   18 ++++++++----------
 arch/x86_64/mm/fault.c |   18 ++++++++----------
 2 files changed, 16 insertions(+), 20 deletions(-)

Index: linux/arch/i386/mm/fault.c
===================================================================
--- linux.orig/arch/i386/mm/fault.c
+++ linux/arch/i386/mm/fault.c
@@ -46,17 +46,17 @@ int unregister_page_fault_notifier(struc
 }
 EXPORT_SYMBOL_GPL(unregister_page_fault_notifier);
 
-static inline int notify_page_fault(enum die_val val, const char *str,
-			struct pt_regs *regs, long err, int trap, int sig)
+static inline int notify_page_fault(struct pt_regs *regs, long err)
 {
 	struct die_args args = {
 		.regs = regs,
-		.str = str,
+		.str = "page fault",
 		.err = err,
-		.trapnr = trap,
-		.signr = sig
+		.trapnr = 14,
+		.signr = SIGSEGV
 	};
-	return atomic_notifier_call_chain(&notify_page_fault_chain, val, &args);
+	return atomic_notifier_call_chain(&notify_page_fault_chain,
+	                                  DIE_PAGE_FAULT, &args);
 }
 
 /*
@@ -353,8 +353,7 @@ fastcall void __kprobes do_page_fault(st
 	if (unlikely(address >= TASK_SIZE)) {
 		if (!(error_code & 0x0000000d) && vmalloc_fault(address) >= 0)
 			return;
-		if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14,
-						SIGSEGV) == NOTIFY_STOP)
+		if (notify_page_fault(regs, error_code) == NOTIFY_STOP)
 			return;
 		/*
 		 * Don't take the mm semaphore here. If we fixup a prefetch
@@ -363,8 +362,7 @@ fastcall void __kprobes do_page_fault(st
 		goto bad_area_nosemaphore;
 	}
 
-	if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14,
-					SIGSEGV) == NOTIFY_STOP)
+	if (notify_page_fault(regs, error_code) == NOTIFY_STOP)
 		return;
 
 	/* It's safe to allow irq's after cr2 has been saved and the vmalloc
Index: linux/arch/x86_64/mm/fault.c
===================================================================
--- linux.orig/arch/x86_64/mm/fault.c
+++ linux/arch/x86_64/mm/fault.c
@@ -56,17 +56,17 @@ int unregister_page_fault_notifier(struc
 }
 EXPORT_SYMBOL_GPL(unregister_page_fault_notifier);
 
-static inline int notify_page_fault(enum die_val val, const char *str,
-			struct pt_regs *regs, long err, int trap, int sig)
+static inline int notify_page_fault(struct pt_regs *regs, long err)
 {
 	struct die_args args = {
 		.regs = regs,
-		.str = str,
+		.str = "page fault",
 		.err = err,
-		.trapnr = trap,
-		.signr = sig
+		.trapnr = 14,
+		.signr = SIGSEGV
 	};
-	return atomic_notifier_call_chain(&notify_page_fault_chain, val, &args);
+	return atomic_notifier_call_chain(&notify_page_fault_chain,
+	                                  DIE_PAGE_FAULT, &args);
 }
 
 void bust_spinlocks(int yes)
@@ -376,8 +376,7 @@ asmlinkage void __kprobes do_page_fault(
 			if (vmalloc_fault(address) >= 0)
 				return;
 		}
-		if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14,
-						SIGSEGV) == NOTIFY_STOP)
+		if (notify_page_fault(regs, error_code) == NOTIFY_STOP)
 			return;
 		/*
 		 * Don't take the mm semaphore here. If we fixup a prefetch
@@ -386,8 +385,7 @@ asmlinkage void __kprobes do_page_fault(
 		goto bad_area_nosemaphore;
 	}
 
-	if (notify_page_fault(DIE_PAGE_FAULT, "page fault", regs, error_code, 14,
-					SIGSEGV) == NOTIFY_STOP)
+	if (notify_page_fault(regs, error_code) == NOTIFY_STOP)
 		return;
 
 	if (likely(regs->eflags & X86_EFLAGS_IF))

* [PATCH x86 for review II] [17/39] x86_64: Tighten mce_amd driver MSR reads
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (14 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [16/39] x86: simplify notify_page_fault() Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected Andi Kleen
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Jan Beulich, patches, linux-kernel


From: "Jan Beulich" <jbeulich@novell.com>

While debugging an unrelated problem in Xen, I noticed odd reads from
non-existent MSRs. Having now found time to look into why these happen, I
came up with the patch below, which
- prevents accessing MCi_MISCj with j > 0 when the block pointer in
MCi_MISC0 is zero
- accesses only contiguous MCi_MISCj until a non-implemented one is
found
- doesn't touch unimplemented blocks in mce_threshold_interrupt at all
- gives names to two bits previously derived from MASK_VALID_HI (it
took me some time to understand the code without this)

The first three items, besides apparently being closer to the spec, should
in particular help cut down on the time mce_threshold_interrupt() takes.

Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/x86_64/kernel/mce_amd.c |   40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)
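
For reviewers who want to see the new block walk in isolation, here is a
minimal user-space sketch of the address logic described above. It is not
the driver code: the valid/counter-present/locked checks are left out,
fake_rdmsr_safe() and walk_bank_blocks() are hypothetical names, and the
values of NR_BLOCKS, MSR_IA32_MC0_MISC, MCG_XBLK_ADDR and MASK_BLKPTR_LO
are not shown in this patch, so treat them as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; this patch only uses these names, it does not define them. */
#define NR_BLOCKS          9
#define MSR_IA32_MC0_MISC  0x403
#define MCG_XBLK_ADDR      0xC0000400
#define MASK_BLKPTR_LO     0xFF000000

typedef int (*read_msr_fn)(uint32_t addr, uint32_t *low, uint32_t *high);

/*
 * Hypothetical simulated MSR space: bank 0 has a zero block pointer,
 * bank 4 points at two implemented extended blocks, everything else faults.
 */
int fake_rdmsr_safe(uint32_t addr, uint32_t *low, uint32_t *high)
{
	switch (addr) {
	case 0x403:		/* bank 0 MISC0, block pointer == 0 */
		*low = 0; *high = 0;
		return 0;
	case 0x413:		/* bank 4 MISC0, block pointer == 8 */
		*low = 0x01000000; *high = 0;
		return 0;
	case 0xC0000408:	/* implemented extended blocks */
	case 0xC0000409:
		*low = 0; *high = 0;
		return 0;
	default:		/* unimplemented MSR */
		return -1;
	}
}

/* Record every MSR address successfully read for one bank; return the count. */
size_t walk_bank_blocks(unsigned int bank, read_msr_fn read_msr,
			uint32_t *out, size_t max)
{
	uint32_t low = 0, high = 0, address = 0;
	unsigned int block;
	size_t n = 0;

	assert(out != NULL && read_msr != NULL);

	for (block = 0; block < NR_BLOCKS && n < max; ++block) {
		if (block == 0) {
			address = MSR_IA32_MC0_MISC + bank * 4;
		} else if (block == 1) {
			/* low still holds MISC0 from the block 0 read */
			address = (low & MASK_BLKPTR_LO) >> 21;
			if (!address)
				break;	/* no extended blocks at all */
			address += MCG_XBLK_ADDR;
		} else {
			++address;
		}

		/* stop at the first unimplemented block instead of skipping it */
		if (read_msr(address, &low, &high))
			break;

		out[n++] = address;
	}
	return n;
}
```

With the simulated MSR space above, bank 0 stops after MISC0 because its
block pointer is zero, and bank 4 stops at the first extended block that
faults rather than probing all remaining block numbers.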

Index: linux/arch/x86_64/kernel/mce_amd.c
===================================================================
--- linux.orig/arch/x86_64/kernel/mce_amd.c
+++ linux/arch/x86_64/kernel/mce_amd.c
@@ -37,6 +37,8 @@
 #define THRESHOLD_MAX     0xFFF
 #define INT_TYPE_APIC     0x00020000
 #define MASK_VALID_HI     0x80000000
+#define MASK_CNTP_HI      0x40000000
+#define MASK_LOCKED_HI    0x20000000
 #define MASK_LVTOFF_HI    0x00F00000
 #define MASK_COUNT_EN_HI  0x00080000
 #define MASK_INT_TYPE_HI  0x00060000
@@ -122,14 +124,17 @@ void __cpuinit mce_amd_feature_init(stru
 		for (block = 0; block < NR_BLOCKS; ++block) {
 			if (block == 0)
 				address = MSR_IA32_MC0_MISC + bank * 4;
-			else if (block == 1)
-				address = MCG_XBLK_ADDR
-					+ ((low & MASK_BLKPTR_LO) >> 21);
+			else if (block == 1) {
+				address = (low & MASK_BLKPTR_LO) >> 21;
+				if (!address)
+					break;
+				address += MCG_XBLK_ADDR;
+			}
 			else
 				++address;
 
 			if (rdmsr_safe(address, &low, &high))
-				continue;
+				break;
 
 			if (!(high & MASK_VALID_HI)) {
 				if (block)
@@ -138,8 +143,8 @@ void __cpuinit mce_amd_feature_init(stru
 					break;
 			}
 
-			if (!(high & MASK_VALID_HI >> 1)  ||
-			     (high & MASK_VALID_HI >> 2))
+			if (!(high & MASK_CNTP_HI)  ||
+			     (high & MASK_LOCKED_HI))
 				continue;
 
 			if (!block)
@@ -187,17 +192,22 @@ asmlinkage void mce_threshold_interrupt(
 
 	/* assume first bank caused it */
 	for (bank = 0; bank < NR_BANKS; ++bank) {
+		if (!(per_cpu(bank_map, m.cpu) & (1 << bank)))
+			continue;
 		for (block = 0; block < NR_BLOCKS; ++block) {
 			if (block == 0)
 				address = MSR_IA32_MC0_MISC + bank * 4;
-			else if (block == 1)
-				address = MCG_XBLK_ADDR
-					+ ((low & MASK_BLKPTR_LO) >> 21);
+			else if (block == 1) {
+				address = (low & MASK_BLKPTR_LO) >> 21;
+				if (!address)
+					break;
+				address += MCG_XBLK_ADDR;
+			}
 			else
 				++address;
 
 			if (rdmsr_safe(address, &low, &high))
-				continue;
+				break;
 
 			if (!(high & MASK_VALID_HI)) {
 				if (block)
@@ -206,8 +216,8 @@ asmlinkage void mce_threshold_interrupt(
 					break;
 			}
 
-			if (!(high & MASK_VALID_HI >> 1)  ||
-			     (high & MASK_VALID_HI >> 2))
+			if (!(high & MASK_CNTP_HI)  ||
+			     (high & MASK_LOCKED_HI))
 				continue;
 
 			if (high & MASK_OVERFLOW_HI) {
@@ -385,7 +395,7 @@ static __cpuinit int allocate_threshold_
 		return 0;
 
 	if (rdmsr_safe(address, &low, &high))
-		goto recurse;
+		return 0;
 
 	if (!(high & MASK_VALID_HI)) {
 		if (block)
@@ -394,8 +404,8 @@ static __cpuinit int allocate_threshold_
 			return 0;
 	}
 
-	if (!(high & MASK_VALID_HI >> 1)  ||
-	     (high & MASK_VALID_HI >> 2))
+	if (!(high & MASK_CNTP_HI)  ||
+	     (high & MASK_LOCKED_HI))
 		goto recurse;
 
 	b = kzalloc(sizeof(struct threshold_block), GFP_KERNEL);

* [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (15 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [17/39] x86_64: Tighten mce_amd driver MSR reads Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:54   ` Oliver Neukum
  2007-02-12  7:38 ` [PATCH x86 for review II] [19/39] x86_64: remove get_pmd() Andi Kleen
                   ` (20 subsequent siblings)
  37 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: patches, linux-kernel


When a machine check event is detected (including an AMD RevF threshold
overflow event), allow a "trigger" program to be run. This lets user space
react to such events sooner.

The trigger is configured using a new trigger entry in the 
machinecheck sysfs interface. It is currently shared between
all CPUs.
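
As an illustration only, user space could install a trigger roughly like
this. set_mce_trigger() is a hypothetical helper, not part of this patch;
the sysfs directory is the one named in the documentation added below, and
the helper program path is whatever the administrator chooses.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write the path of the program to run into a trigger file.
 * Returns 0 on success, -1 if the file could not be opened or written.
 */
int set_mce_trigger(const char *trigger_file, const char *program)
{
	FILE *f;
	int ok;

	assert(trigger_file != NULL && program != NULL);

	f = fopen(trigger_file, "w");
	if (!f)
		return -1;

	/* The kernel side strips the trailing newline. */
	ok = fprintf(f, "%s\n", program) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A call would look like set_mce_trigger(
"/sys/devices/system/machinecheck/machinecheck0/trigger", path_to_helper).
Since the entry is shared, writing it for one CPU is enough. Note that the
kernel stores the path in a 128 byte buffer, so longer paths get truncated.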

I also fixed the AMD threshold handler to run the machine 
check polling code immediately to actually log any events
that might have caused the threshold interrupt.

Also added some documentation for the mce sysfs interface.

Signed-off-by: Andi Kleen <ak@suse.de>

---
 Documentation/x86_64/machinecheck |   70 ++++++++++++++++++++++++++++++++++++++
 arch/x86_64/kernel/mce.c          |   66 +++++++++++++++++++++++++++++------
 arch/x86_64/kernel/mce_amd.c      |    4 ++
 include/asm-x86_64/mce.h          |    2 +
 kernel/kmod.c                     |   44 ++++++++++++++++-------
 5 files changed, 160 insertions(+), 26 deletions(-)

Index: linux/arch/x86_64/kernel/mce.c
===================================================================
--- linux.orig/arch/x86_64/kernel/mce.c
+++ linux/arch/x86_64/kernel/mce.c
@@ -19,6 +19,7 @@
 #include <linux/cpu.h>
 #include <linux/percpu.h>
 #include <linux/ctype.h>
+#include <linux/kmod.h>
 #include <asm/processor.h> 
 #include <asm/msr.h>
 #include <asm/mce.h>
@@ -42,6 +43,10 @@ static unsigned long console_logged;
 static int notify_user;
 static int rip_msr;
 static int mce_bootlog = 1;
+static atomic_t mce_events;
+
+static char trigger[128];
+static char *trigger_argv[2] = { trigger, NULL };
 
 /*
  * Lockless MCE logging infrastructure.
@@ -57,6 +62,7 @@ struct mce_log mcelog = { 
 void mce_log(struct mce *mce)
 {
 	unsigned next, entry;
+	atomic_inc(&mce_events);
 	mce->finished = 0;
 	wmb();
 	for (;;) {
@@ -161,6 +167,17 @@ static inline void mce_get_rip(struct mc
 	}
 }
 
+static void do_mce_trigger(void)
+{
+	static atomic_t mce_logged;
+	int events = atomic_read(&mce_events);
+	if (events != atomic_read(&mce_logged) && trigger[0]) {
+		/* Small race window, but should be harmless.  */
+		atomic_set(&mce_logged, events);
+		call_usermodehelper(trigger, trigger_argv, NULL, -1);
+	}
+}
+
 /* 
  * The actual machine check handler
  */
@@ -234,8 +251,12 @@ void do_machine_check(struct pt_regs * r
 	}
 
 	/* Never do anything final in the polling timer */
-	if (!regs)
+	if (!regs) {
+		/* Normal interrupt context here. Call trigger for any new
+		   events. */
+		do_mce_trigger();
 		goto out;
+	}
 
 	/* If we didn't find an uncorrectable error, pick
 	   the last one (shouldn't happen, just being safe). */
@@ -606,17 +627,42 @@ DEFINE_PER_CPU(struct sys_device, device
 	}									   \
 	static SYSDEV_ATTR(name, 0644, show_ ## name, set_ ## name);
 
+/* TBD should generate these dynamically based on number of available banks */
 ACCESSOR(bank0ctl,bank[0],mce_restart())
 ACCESSOR(bank1ctl,bank[1],mce_restart())
 ACCESSOR(bank2ctl,bank[2],mce_restart())
 ACCESSOR(bank3ctl,bank[3],mce_restart())
 ACCESSOR(bank4ctl,bank[4],mce_restart())
 ACCESSOR(bank5ctl,bank[5],mce_restart())
-static struct sysdev_attribute * bank_attributes[NR_BANKS] = {
-	&attr_bank0ctl, &attr_bank1ctl, &attr_bank2ctl,
-	&attr_bank3ctl, &attr_bank4ctl, &attr_bank5ctl};
+
+static ssize_t show_trigger(struct sys_device *s, char *buf)
+{
+	strcpy(buf, trigger);
+	strcat(buf, "\n");
+	return strlen(trigger) + 1;
+}
+
+static ssize_t set_trigger(struct sys_device *s,const char *buf,size_t siz)
+{
+	char *p;
+	int len;
+	strncpy(trigger, buf, sizeof(trigger));
+	trigger[sizeof(trigger)-1] = 0;
+	len = strlen(trigger);
+	p = strchr(trigger, '\n');
+	if (p) *p = 0;
+	return len;
+}
+
+static SYSDEV_ATTR(trigger, 0644, show_trigger, set_trigger);
 ACCESSOR(tolerant,tolerant,)
 ACCESSOR(check_interval,check_interval,mce_restart())
+static struct sysdev_attribute *mce_attributes[] = {
+	&attr_bank0ctl, &attr_bank1ctl, &attr_bank2ctl,
+	&attr_bank3ctl, &attr_bank4ctl, &attr_bank5ctl,
+	&attr_tolerant, &attr_check_interval, &attr_trigger,
+	NULL
+};
 
 /* Per cpu sysdev init.  All of the cpus still share the same ctl bank */
 static __cpuinit int mce_create_device(unsigned int cpu)
@@ -632,11 +678,9 @@ static __cpuinit int mce_create_device(u
 	err = sysdev_register(&per_cpu(device_mce,cpu));
 
 	if (!err) {
-		for (i = 0; i < banks; i++)
+		for (i = 0; mce_attributes[i]; i++)
 			sysdev_create_file(&per_cpu(device_mce,cpu),
-				bank_attributes[i]);
-		sysdev_create_file(&per_cpu(device_mce,cpu), &attr_tolerant);
-		sysdev_create_file(&per_cpu(device_mce,cpu), &attr_check_interval);
+				mce_attributes[i]);
 	}
 	return err;
 }
@@ -645,11 +689,9 @@ static void mce_remove_device(unsigned i
 {
 	int i;
 
-	for (i = 0; i < banks; i++)
+	for (i = 0; mce_attributes[i]; i++)
 		sysdev_remove_file(&per_cpu(device_mce,cpu),
-			bank_attributes[i]);
-	sysdev_remove_file(&per_cpu(device_mce,cpu), &attr_tolerant);
-	sysdev_remove_file(&per_cpu(device_mce,cpu), &attr_check_interval);
+			mce_attributes[i]);
 	sysdev_unregister(&per_cpu(device_mce,cpu));
 	memset(&per_cpu(device_mce, cpu).kobj, 0, sizeof(struct kobject));
 }
Index: linux/arch/x86_64/kernel/mce_amd.c
===================================================================
--- linux.orig/arch/x86_64/kernel/mce_amd.c
+++ linux/arch/x86_64/kernel/mce_amd.c
@@ -220,6 +220,10 @@ asmlinkage void mce_threshold_interrupt(
 			     (high & MASK_LOCKED_HI))
 				continue;
 
+			/* Log the machine check that caused the threshold
+			   event. */
+			do_machine_check(NULL, 0);
+
 			if (high & MASK_OVERFLOW_HI) {
 				rdmsrl(address, m.misc);
 				rdmsrl(MSR_IA32_MC0_STATUS + bank * 4,
Index: linux/kernel/kmod.c
===================================================================
--- linux.orig/kernel/kmod.c
+++ linux/kernel/kmod.c
@@ -217,7 +217,10 @@ static int wait_for_helper(void *data)
 			sub_info->retval = ret;
 	}
 
-	complete(sub_info->complete);
+	if (sub_info->wait < 0)
+		kfree(sub_info);
+	else
+		complete(sub_info->complete);
 	return 0;
 }
 
@@ -239,6 +242,9 @@ static void __call_usermodehelper(struct
 		pid = kernel_thread(____call_usermodehelper, sub_info,
 				    CLONE_VFORK | SIGCHLD);
 
+	if (wait < 0)
+		return;
+
 	if (pid < 0) {
 		sub_info->retval = pid;
 		complete(sub_info->complete);
@@ -253,6 +259,9 @@ static void __call_usermodehelper(struct
  * @envp: null-terminated environment list
  * @session_keyring: session keyring for process (NULL for an empty keyring)
  * @wait: wait for the application to finish and return status.
+ *        when -1 don't wait at all, but you get no useful error back when
+ *        the program couldn't be exec'ed. This makes it safe to call
+ *        from interrupt context.
  *
  * Runs a user-space application.  The application is started
  * asynchronously if wait is not set, and runs as a child of keventd.
@@ -265,17 +274,8 @@ int call_usermodehelper_keys(char *path,
 			     struct key *session_keyring, int wait)
 {
 	DECLARE_COMPLETION_ONSTACK(done);
-	struct subprocess_info sub_info = {
-		.work		= __WORK_INITIALIZER(sub_info.work,
-						     __call_usermodehelper),
-		.complete	= &done,
-		.path		= path,
-		.argv		= argv,
-		.envp		= envp,
-		.ring		= session_keyring,
-		.wait		= wait,
-		.retval		= 0,
-	};
+	struct subprocess_info *sub_info;
+	int retval;
 
 	if (!khelper_wq)
 		return -EBUSY;
@@ -283,9 +283,25 @@ int call_usermodehelper_keys(char *path,
 	if (path[0] == '\0')
 		return 0;
 
-	queue_work(khelper_wq, &sub_info.work);
+	sub_info = kzalloc(sizeof(struct subprocess_info),  GFP_ATOMIC);
+	if (!sub_info)
+		return -ENOMEM;
+
+	INIT_WORK(&sub_info->work, __call_usermodehelper);
+	sub_info->complete = &done;
+	sub_info->path = path;
+	sub_info->argv = argv;
+	sub_info->envp = envp;
+	sub_info->ring = session_keyring;
+	sub_info->wait = wait;
+
+	queue_work(khelper_wq, &sub_info->work);
+	if (wait < 0) /* task has freed sub_info */
+		return 0;
 	wait_for_completion(&done);
-	return sub_info.retval;
+	retval = sub_info->retval;
+	kfree(sub_info);
+	return retval;
 }
 EXPORT_SYMBOL(call_usermodehelper_keys);
 
Index: linux/include/asm-x86_64/mce.h
===================================================================
--- linux.orig/include/asm-x86_64/mce.h
+++ linux/include/asm-x86_64/mce.h
@@ -103,6 +103,8 @@ void mce_log_therm_throt_event(unsigned 
 
 extern atomic_t mce_entry;
 
+extern void do_machine_check(struct pt_regs *, long);
+
 #endif
 
 #endif
Index: linux/Documentation/x86_64/machinecheck
===================================================================
--- /dev/null
+++ linux/Documentation/x86_64/machinecheck
@@ -0,0 +1,70 @@
+
+Configurable sysfs parameters for the x86-64 machine check code.
+
+Machine checks report internal hardware error conditions detected
+by the CPU. Uncorrected errors typically cause a machine check
+(often with panic), corrected ones cause a machine check log entry.
+
+Machine checks are organized in banks (normally associated with
+a hardware subsystem) and subevents in a bank. The exact meaning
+of the banks and subevents is CPU specific.
+
+mcelog knows how to decode them.
+
+When you see the "Machine check errors logged" message in the system
+log then mcelog should run to collect and decode machine check entries
+from /dev/mcelog. Normally mcelog should be run regularly from a cronjob.
+
+Each CPU has a directory in /sys/devices/system/machinecheck/machinecheckN
+(N = CPU number)
+
+The directory contains some configurable entries:
+
+Entries:
+
+bankNctl
+(N bank number)
+	64bit Hex bitmask enabling/disabling specific subevents for bank N
+	When a bit in the bitmask is zero then the respective
+	subevent will not be reported.
+	By default all events are enabled.
+	Note that the BIOS maintains another mask to disable specific events
+	per bank.  This is not visible here.
+
+The following entries appear for each CPU, but they are truly shared
+between all CPUs.
+
+check_interval
+	How often to poll for corrected machine check errors, in seconds
+	(note that the output is hexadecimal). Default 5 minutes.
+
+tolerant
+	Tolerance level. When a machine check exception occurs for an
+	uncorrected machine check the kernel can take different actions.
+	Since machine check exceptions can happen any time it is sometimes
+	risky for the kernel to kill a process because it defies
+	normal kernel locking rules. The tolerance level configures
+	how hard the kernel tries to recover even at some risk of deadlock.
+
+	0: always panic,
+	1: panic if deadlock possible,
+	2: try to avoid panic,
+	3: never panic or exit (for testing only)
+
+	Default: 1
+
+	Note this only makes a difference if the CPU allows recovery
+	from a machine check exception. Current x86 CPUs generally do not.
+
+trigger
+	Program to run when a machine check event is detected.
+	This is an alternative to running mcelog regularly from cron
+	and allows events to be detected faster.
+
+TBD document entries for AMD threshold interrupt configuration
+
+For more details about the x86 machine check architecture
+see the Intel and AMD architecture manuals from their developer websites.
+
+For more details about the architecture see
+http://one.firstfloor.org/~andi/mce.pdf

* [PATCH x86 for review II] [19/39] x86_64: remove get_pmd()
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (16 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [20/39] i386: Small cleanup to TLB flush code Andi Kleen
                   ` (19 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Jan Beulich, patches, linux-kernel


From: "Jan Beulich" <jbeulich@novell.com>
Function is dead.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 include/asm-x86_64/pgalloc.h |    5 -----
 1 file changed, 5 deletions(-)

Index: linux/include/asm-x86_64/pgalloc.h
===================================================================
--- linux.orig/include/asm-x86_64/pgalloc.h
+++ linux/include/asm-x86_64/pgalloc.h
@@ -18,11 +18,6 @@ static inline void pmd_populate(struct m
 	set_pmd(pmd, __pmd(_PAGE_TABLE | (page_to_pfn(pte) << PAGE_SHIFT)));
 }
 
-static inline pmd_t *get_pmd(void)
-{
-	return (pmd_t *)get_zeroed_page(GFP_KERNEL);
-}
-
 static inline void pmd_free(pmd_t *pmd)
 {
 	BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));

* [PATCH x86 for review II] [20/39] i386: Small cleanup to TLB flush code
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (17 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [19/39] x86_64: remove get_pmd() Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [21/39] i386: rdmsr_on_cpu, wrmsr_on_cpu Andi Kleen
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: patches, linux-kernel


- Remove outdated comment
- Use cpu_relax() in a busy loop

Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/i386/kernel/smp.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Index: linux/arch/i386/kernel/smp.c
===================================================================
--- linux.orig/arch/i386/kernel/smp.c
+++ linux/arch/i386/kernel/smp.c
@@ -375,8 +375,7 @@ static void flush_tlb_others(cpumask_t c
 	/*
 	 * i'm not happy about this global shared spinlock in the
 	 * MM hot path, but we'll see how contended it is.
-	 * Temporarily this turns IRQs off, so that lockups are
-	 * detected by the NMI watchdog.
+	 * AK: x86-64 has a faster method that could be ported.
 	 */
 	spin_lock(&tlbstate_lock);
 	
@@ -401,7 +400,7 @@ static void flush_tlb_others(cpumask_t c
 
 	while (!cpus_empty(flush_cpumask))
 		/* nothing. lockup detection does not belong here */
-		mb();
+		cpu_relax();
 
 	flush_mm = NULL;
 	flush_va = 0;

* [PATCH x86 for review II] [21/39] i386: rdmsr_on_cpu, wrmsr_on_cpu
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (18 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [20/39] i386: Small cleanup to TLB flush code Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [22/39] x86_64: Kconfig typos Andi Kleen
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Alexey Dobriyan, patches, linux-kernel


From: Alexey Dobriyan <adobriyan@openvz.org>
There was an OpenVZ-specific bug that rendered some cpufreq drivers unusable
on SMP. In short, when the cpufreq code thinks it has confined itself to the
needed CPU by means of set_cpus_allowed() in order to execute rdmsr, some
"virtual cpu" feature can migrate the process anywhere. This triggers
BUG_ON()s and does wrong things in general.

This was fixed by introducing rdmsr_on_cpu() and wrmsr_on_cpu(), which
execute rdmsr and wrmsr on a given physical CPU by means of
smp_call_function_single().

AK: link it into 64bit kernel too because cpufreq drivers use it.

 arch/i386/kernel/cpu/cpufreq/p4-clockmod.c |   30 ++----------
 arch/i386/lib/Makefile                     |    2
 arch/i386/lib/msr-on-cpu.c                 |   70 +++++++++++++++++++++++++++++
 arch/x86_64/lib/Makefile                   |    4 +
 include/asm-i386/msr.h                     |    3 +
 5 files changed, 84 insertions(+), 25 deletions(-)

Signed-off-by: Andi Kleen <ak@suse.de>
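
To make the calling pattern easier to follow, here is a small user-space
model of rdmsr_on_cpu(). Everything in it is a stand-in: fake_msr[] models
one MSR per CPU, current_cpu models smp_processor_id(), and call_on_cpu()
models a synchronous smp_call_function_single(). It only shows the shape
of the logic (read locally if already on the target CPU, otherwise run the
read on the target and copy the result back), not the real cross-CPU call.

```c
#include <assert.h>
#include <stdint.h>

#define NR_FAKE_CPUS 4

uint64_t fake_msr[NR_FAKE_CPUS];	/* one simulated MSR value per CPU */
unsigned int current_cpu;		/* stand-in for smp_processor_id() */

struct msr_info {
	uint32_t l, h;
};

/* Runs "on" whichever CPU current_cpu names and reads that CPU's MSR. */
static void read_local(void *info)
{
	struct msr_info *rv = info;

	rv->l = (uint32_t)fake_msr[current_cpu];
	rv->h = (uint32_t)(fake_msr[current_cpu] >> 32);
}

/* Stand-in for a synchronous cross call: run fn as if on the given CPU. */
static void call_on_cpu(unsigned int cpu, void (*fn)(void *), void *info)
{
	unsigned int saved = current_cpu;

	current_cpu = cpu;
	fn(info);
	current_cpu = saved;
}

/* Model of rdmsr_on_cpu(): the caller never has to migrate itself. */
void sketch_rdmsr_on_cpu(unsigned int cpu, uint32_t *l, uint32_t *h)
{
	struct msr_info rv;

	assert(cpu < NR_FAKE_CPUS);

	if (current_cpu == cpu)
		read_local(&rv);
	else
		call_on_cpu(cpu, read_local, &rv);

	*l = rv.l;
	*h = rv.h;
}
```

The caller stays on its own CPU throughout, which is why the
set_cpus_allowed() and BUG_ON() sequences in p4-clockmod.c can go away.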

Index: linux/arch/i386/lib/Makefile
===================================================================
--- linux.orig/arch/i386/lib/Makefile
+++ linux/arch/i386/lib/Makefile
@@ -7,3 +7,5 @@ lib-y = checksum.o delay.o usercopy.o ge
 	bitops.o semaphore.o
 
 lib-$(CONFIG_X86_USE_3DNOW) += mmx.o
+
+obj-y = msr-on-cpu.o
Index: linux/arch/i386/lib/msr-on-cpu.c
===================================================================
--- /dev/null
+++ linux/arch/i386/lib/msr-on-cpu.c
@@ -0,0 +1,70 @@
+#include <linux/module.h>
+#include <linux/preempt.h>
+#include <linux/smp.h>
+#include <asm/msr.h>
+
+#ifdef CONFIG_SMP
+struct msr_info {
+	u32 msr_no;
+	u32 l, h;
+};
+
+static void __rdmsr_on_cpu(void *info)
+{
+	struct msr_info *rv = info;
+
+	rdmsr(rv->msr_no, rv->l, rv->h);
+}
+
+void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h)
+{
+	preempt_disable();
+	if (smp_processor_id() == cpu)
+		rdmsr(msr_no, *l, *h);
+	else {
+		struct msr_info rv;
+
+		rv.msr_no = msr_no;
+		smp_call_function_single(cpu, __rdmsr_on_cpu, &rv, 0, 1);
+		*l = rv.l;
+		*h = rv.h;
+	}
+	preempt_enable();
+}
+
+static void __wrmsr_on_cpu(void *info)
+{
+	struct msr_info *rv = info;
+
+	wrmsr(rv->msr_no, rv->l, rv->h);
+}
+
+void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h)
+{
+	preempt_disable();
+	if (smp_processor_id() == cpu)
+		wrmsr(msr_no, l, h);
+	else {
+		struct msr_info rv;
+
+		rv.msr_no = msr_no;
+		rv.l = l;
+		rv.h = h;
+		smp_call_function_single(cpu, __wrmsr_on_cpu, &rv, 0, 1);
+	}
+	preempt_enable();
+}
+#else
+void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h)
+{
+	rdmsr(msr_no, *l, *h);
+}
+
+void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h)
+{
+	wrmsr(msr_no, l, h);
+}
+#endif
+
+EXPORT_SYMBOL(rdmsr_on_cpu);
+EXPORT_SYMBOL(wrmsr_on_cpu);
Index: linux/include/asm-i386/msr.h
===================================================================
--- linux.orig/include/asm-i386/msr.h
+++ linux/include/asm-i386/msr.h
@@ -83,6 +83,9 @@ static inline void wrmsrl (unsigned long
 			  : "c" (counter))
 #endif	/* !CONFIG_PARAVIRT */
 
+void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h);
+void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h);
+
 /* symbolic names for some interesting MSRs */
 /* Intel defined MSRs. */
 #define MSR_IA32_P5_MC_ADDR		0
Index: linux/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
+++ linux/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
@@ -62,7 +62,7 @@ static int cpufreq_p4_setdc(unsigned int
 	if (!cpu_online(cpu) || (newstate > DC_DISABLE) || (newstate == DC_RESV))
 		return -EINVAL;
 
-	rdmsr(MSR_IA32_THERM_STATUS, l, h);
+	rdmsr_on_cpu(cpu, MSR_IA32_THERM_STATUS, &l, &h);
 
 	if (l & 0x01)
 		dprintk("CPU#%d currently thermal throttled\n", cpu);
@@ -70,10 +70,10 @@ static int cpufreq_p4_setdc(unsigned int
 	if (has_N44_O17_errata[cpu] && (newstate == DC_25PT || newstate == DC_DFLT))
 		newstate = DC_38PT;
 
-	rdmsr(MSR_IA32_THERM_CONTROL, l, h);
+	rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h);
 	if (newstate == DC_DISABLE) {
 		dprintk("CPU#%d disabling modulation\n", cpu);
-		wrmsr(MSR_IA32_THERM_CONTROL, l & ~(1<<4), h);
+		wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l & ~(1<<4), h);
 	} else {
 		dprintk("CPU#%d setting duty cycle to %d%%\n",
 			cpu, ((125 * newstate) / 10));
@@ -84,7 +84,7 @@ static int cpufreq_p4_setdc(unsigned int
 		 */
 		l = (l & ~14);
 		l = l | (1<<4) | ((newstate & 0x7)<<1);
-		wrmsr(MSR_IA32_THERM_CONTROL, l, h);
+		wrmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, l, h);
 	}
 
 	return 0;
@@ -111,7 +111,6 @@ static int cpufreq_p4_target(struct cpuf
 {
 	unsigned int    newstate = DC_RESV;
 	struct cpufreq_freqs freqs;
-	cpumask_t cpus_allowed;
 	int i;
 
 	if (cpufreq_frequency_table_target(policy, &p4clockmod_table[0], target_freq, relation, &newstate))
@@ -132,17 +131,8 @@ static int cpufreq_p4_target(struct cpuf
 	/* run on each logical CPU, see section 13.15.3 of IA32 Intel Architecture Software
 	 * Developer's Manual, Volume 3
 	 */
-	cpus_allowed = current->cpus_allowed;
-
-	for_each_cpu_mask(i, policy->cpus) {
-		cpumask_t this_cpu = cpumask_of_cpu(i);
-
-		set_cpus_allowed(current, this_cpu);
-		BUG_ON(smp_processor_id() != i);
-
+	for_each_cpu_mask(i, policy->cpus)
 		cpufreq_p4_setdc(i, p4clockmod_table[newstate].index);
-	}
-	set_cpus_allowed(current, cpus_allowed);
 
 	/* notifiers */
 	for_each_cpu_mask(i, policy->cpus) {
@@ -256,17 +246,9 @@ static int cpufreq_p4_cpu_exit(struct cp
 
 static unsigned int cpufreq_p4_get(unsigned int cpu)
 {
-	cpumask_t cpus_allowed;
 	u32 l, h;
 
-	cpus_allowed = current->cpus_allowed;
-
-	set_cpus_allowed(current, cpumask_of_cpu(cpu));
-	BUG_ON(smp_processor_id() != cpu);
-
-	rdmsr(MSR_IA32_THERM_CONTROL, l, h);
-
-	set_cpus_allowed(current, cpus_allowed);
+	rdmsr_on_cpu(cpu, MSR_IA32_THERM_CONTROL, &l, &h);
 
 	if (l & 0x10) {
 		l = l >> 1;
Index: linux/arch/x86_64/lib/Makefile
===================================================================
--- linux.orig/arch/x86_64/lib/Makefile
+++ linux/arch/x86_64/lib/Makefile
@@ -4,10 +4,12 @@
 
 CFLAGS_csum-partial.o := -funroll-loops
 
-obj-y := io.o iomap_copy.o
+obj-y := io.o iomap_copy.o msr-on-cpu.o
 
 lib-y := csum-partial.o csum-copy.o csum-wrappers.o delay.o \
 	usercopy.o getuser.o putuser.o  \
 	thunk.o clear_page.o copy_page.o bitstr.o bitops.o
 lib-y += memcpy.o memmove.o memset.o copy_user.o rwlock.o copy_user_nocache.o
 lib-y += memcpy_uncached_read.o
+
+msr-on-cpu-y += ../../i386/lib/msr-on-cpu.o

* [PATCH x86 for review II] [22/39] x86_64: Kconfig typos
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (19 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [21/39] i386: rdmsr_on_cpu, wrmsr_on_cpu Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [23/39] i386: use smp_call_function_single() Andi Kleen
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Nicolas Kaiser, patches, linux-kernel


From: Nicolas Kaiser <nikai@nikai.net>
Some typos in Kconfig.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Andi Kleen <ak@suse.de>

---

---
 arch/x86_64/Kconfig |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux/arch/x86_64/Kconfig
===================================================================
--- linux.orig/arch/x86_64/Kconfig
+++ linux/arch/x86_64/Kconfig
@@ -148,18 +148,18 @@ config MPSC
 	  Optimize for Intel Pentium 4 and older Nocona/Dempsey Xeon CPUs
 	  with Intel Extended Memory 64 Technology(EM64T). For details see
 	  <http://www.intel.com/technology/64bitextensions/>.
-	  Note the the latest Xeons (Xeon 51xx and 53xx) are not based on the
-          Netburst core and shouldn't use this option. You can distingush them
+	  Note that the latest Xeons (Xeon 51xx and 53xx) are not based on the
+          Netburst core and shouldn't use this option. You can distinguish them
 	  using the cpu family field
-	  in /proc/cpuinfo. Family 15 is a older Xeon, Family 6 a newer one
-	  (this rule only applies to system that support EM64T)
+	  in /proc/cpuinfo. Family 15 is an older Xeon, Family 6 a newer one
+	  (this rule only applies to systems that support EM64T)
 
 config MCORE2
 	bool "Intel Core2 / newer Xeon"
 	help
 	  Optimize for Intel Core2 and newer Xeons (51xx)
-	  You can distingush the newer Xeons from the older ones using
-	  the cpu family field in /proc/cpuinfo. 15 is a older Xeon
+	  You can distinguish the newer Xeons from the older ones using
+	  the cpu family field in /proc/cpuinfo. 15 is an older Xeon
 	  (use CONFIG_MPSC then), 6 is a newer one. This rule only
 	  applies to CPUs that support EM64T.
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [23/39] i386: use smp_call_function_single()
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (20 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [22/39] x86_64: Kconfig typos Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [24/39] " Andi Kleen
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Alexey Dobriyan, patches, linux-kernel


From: Alexey Dobriyan <adobriyan@openvz.org>
It will execute rdmsr and wrmsr only on the cpu we need.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 arch/i386/kernel/msr.c |   13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

Index: linux/arch/i386/kernel/msr.c
===================================================================
--- linux.orig/arch/i386/kernel/msr.c
+++ linux/arch/i386/kernel/msr.c
@@ -68,7 +68,6 @@ static inline int rdmsr_eio(u32 reg, u32
 #ifdef CONFIG_SMP
 
 struct msr_command {
-	int cpu;
 	int err;
 	u32 reg;
 	u32 data[2];
@@ -78,16 +77,14 @@ static void msr_smp_wrmsr(void *cmd_bloc
 {
 	struct msr_command *cmd = (struct msr_command *)cmd_block;
 
-	if (cmd->cpu == smp_processor_id())
-		cmd->err = wrmsr_eio(cmd->reg, cmd->data[0], cmd->data[1]);
+	cmd->err = wrmsr_eio(cmd->reg, cmd->data[0], cmd->data[1]);
 }
 
 static void msr_smp_rdmsr(void *cmd_block)
 {
 	struct msr_command *cmd = (struct msr_command *)cmd_block;
 
-	if (cmd->cpu == smp_processor_id())
-		cmd->err = rdmsr_eio(cmd->reg, &cmd->data[0], &cmd->data[1]);
+	cmd->err = rdmsr_eio(cmd->reg, &cmd->data[0], &cmd->data[1]);
 }
 
 static inline int do_wrmsr(int cpu, u32 reg, u32 eax, u32 edx)
@@ -99,12 +96,11 @@ static inline int do_wrmsr(int cpu, u32 
 	if (cpu == smp_processor_id()) {
 		ret = wrmsr_eio(reg, eax, edx);
 	} else {
-		cmd.cpu = cpu;
 		cmd.reg = reg;
 		cmd.data[0] = eax;
 		cmd.data[1] = edx;
 
-		smp_call_function(msr_smp_wrmsr, &cmd, 1, 1);
+		smp_call_function_single(cpu, msr_smp_wrmsr, &cmd, 1, 1);
 		ret = cmd.err;
 	}
 	preempt_enable();
@@ -120,10 +116,9 @@ static inline int do_rdmsr(int cpu, u32 
 	if (cpu == smp_processor_id()) {
 		ret = rdmsr_eio(reg, eax, edx);
 	} else {
-		cmd.cpu = cpu;
 		cmd.reg = reg;
 
-		smp_call_function(msr_smp_rdmsr, &cmd, 1, 1);
+		smp_call_function_single(cpu, msr_smp_rdmsr, &cmd, 1, 1);
 
 		*eax = cmd.data[0];
 		*edx = cmd.data[1];

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [24/39] i386: use smp_call_function_single()
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (21 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [23/39] i386: use smp_call_function_single() Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [25/39] x86_64: Fix preprocessor condition Andi Kleen
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Alexey Dobriyan, patches, linux-kernel


From: Alexey Dobriyan <adobriyan@openvz.org>
It will execute cpuid only on the cpu we need.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 arch/i386/kernel/cpuid.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Index: linux/arch/i386/kernel/cpuid.c
===================================================================
--- linux.orig/arch/i386/kernel/cpuid.c
+++ linux/arch/i386/kernel/cpuid.c
@@ -48,7 +48,6 @@ static struct class *cpuid_class;
 #ifdef CONFIG_SMP
 
 struct cpuid_command {
-	int cpu;
 	u32 reg;
 	u32 *data;
 };
@@ -57,8 +56,7 @@ static void cpuid_smp_cpuid(void *cmd_bl
 {
 	struct cpuid_command *cmd = (struct cpuid_command *)cmd_block;
 
-	if (cmd->cpu == smp_processor_id())
-		cpuid(cmd->reg, &cmd->data[0], &cmd->data[1], &cmd->data[2],
+	cpuid(cmd->reg, &cmd->data[0], &cmd->data[1], &cmd->data[2],
 		      &cmd->data[3]);
 }
 
@@ -70,11 +68,10 @@ static inline void do_cpuid(int cpu, u32
 	if (cpu == smp_processor_id()) {
 		cpuid(reg, &data[0], &data[1], &data[2], &data[3]);
 	} else {
-		cmd.cpu = cpu;
 		cmd.reg = reg;
 		cmd.data = data;
 
-		smp_call_function(cpuid_smp_cpuid, &cmd, 1, 1);
+		smp_call_function_single(cpu, cpuid_smp_cpuid, &cmd, 1, 1);
 	}
 	preempt_enable();
 }

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [25/39] x86_64: Fix preprocessor condition
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (22 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [24/39] " Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [26/39] i386: fix 32-bit ioctls on x64_32 Andi Kleen
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Josef 'Jeff' Sipek, patches, linux-kernel


From: "Josef 'Jeff' Sipek" <jsipek@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 include/asm-x86_64/io.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/include/asm-x86_64/io.h
===================================================================
--- linux.orig/include/asm-x86_64/io.h
+++ linux/include/asm-x86_64/io.h
@@ -100,7 +100,7 @@ __OUTS(l)
 
 #define IO_SPACE_LIMIT 0xffff
 
-#if defined(__KERNEL__) && __x86_64__
+#if defined(__KERNEL__) && defined(__x86_64__)
 
 #include <linux/vmalloc.h>
 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [26/39] i386: fix 32-bit ioctls on x64_32
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (23 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [25/39] x86_64: Fix preprocessor condition Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12 13:24   ` Giuliano Procida
  2007-02-12  7:38 ` [PATCH x86 for review II] [27/39] i386: APM on i386 Andi Kleen
                   ` (12 subsequent siblings)
  37 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Giuliano Procida, patches, linux-kernel


From: Giuliano Procida <giuliano.procida@googlemail.com>
[MTRR] fix 32-bit ioctls on x64_32

Signed-off-by: Giuliano Procida <giuliano.procida@googlemail.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---

Fixed incomplete support for 32-bit compatibility ioctls in
2.6.19.1. They were unhandled in one of three case-statements.
Tested using X server before and after the change.

---
 arch/i386/kernel/cpu/mtrr/if.c |   30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

Index: linux/arch/i386/kernel/cpu/mtrr/if.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mtrr/if.c
+++ linux/arch/i386/kernel/cpu/mtrr/if.c
@@ -211,6 +211,9 @@ mtrr_ioctl(struct file *file, unsigned i
 	default:
 		return -ENOTTY;
 	case MTRRIOC_ADD_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_ADD_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err =
@@ -218,21 +221,33 @@ mtrr_ioctl(struct file *file, unsigned i
 				  file, 0);
 		break;
 	case MTRRIOC_SET_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_SET_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err = mtrr_add(sentry.base, sentry.size, sentry.type, 0);
 		break;
 	case MTRRIOC_DEL_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_DEL_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err = mtrr_file_del(sentry.base, sentry.size, file, 0);
 		break;
 	case MTRRIOC_KILL_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_KILL_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err = mtrr_del(-1, sentry.base, sentry.size);
 		break;
 	case MTRRIOC_GET_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_GET_ENTRY:
+#endif
 		if (gentry.regnum >= num_var_ranges)
 			return -EINVAL;
 		mtrr_if->get(gentry.regnum, &gentry.base, &size, &type);
@@ -249,6 +264,9 @@ mtrr_ioctl(struct file *file, unsigned i
 
 		break;
 	case MTRRIOC_ADD_PAGE_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_ADD_PAGE_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err =
@@ -256,21 +274,33 @@ mtrr_ioctl(struct file *file, unsigned i
 				  file, 1);
 		break;
 	case MTRRIOC_SET_PAGE_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_SET_PAGE_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err = mtrr_add_page(sentry.base, sentry.size, sentry.type, 0);
 		break;
 	case MTRRIOC_DEL_PAGE_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_DEL_PAGE_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err = mtrr_file_del(sentry.base, sentry.size, file, 1);
 		break;
 	case MTRRIOC_KILL_PAGE_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_KILL_PAGE_ENTRY:
+#endif
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
 		err = mtrr_del_page(-1, sentry.base, sentry.size);
 		break;
 	case MTRRIOC_GET_PAGE_ENTRY:
+#ifdef CONFIG_COMPAT
+	case MTRRIOC32_GET_PAGE_ENTRY:
+#endif
 		if (gentry.regnum >= num_var_ranges)
 			return -EINVAL;
 		mtrr_if->get(gentry.regnum, &gentry.base, &size, &type);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [27/39] i386: APM on i386
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (24 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [26/39] i386: fix 32-bit ioctls on x64_32 Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [28/39] i386: fix size_or_mask and size_and_mask Andi Kleen
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Alexey Dobriyan, patches, linux-kernel


From: Alexey Dobriyan <adobriyan@gmail.com>
Byte-to-byte identical /proc/apm here.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---

 arch/i386/kernel/apm.c |   26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

Index: linux/arch/i386/kernel/apm.c
===================================================================
--- linux.orig/arch/i386/kernel/apm.c
+++ linux/arch/i386/kernel/apm.c
@@ -211,6 +211,7 @@
 #include <linux/slab.h>
 #include <linux/stat.h>
 #include <linux/proc_fs.h>
+#include <linux/seq_file.h>
 #include <linux/miscdevice.h>
 #include <linux/apm_bios.h>
 #include <linux/init.h>
@@ -1636,9 +1637,8 @@ static int do_open(struct inode * inode,
 	return 0;
 }
 
-static int apm_get_info(char *buf, char **start, off_t fpos, int length)
+static int proc_apm_show(struct seq_file *m, void *v)
 {
-	char *		p;
 	unsigned short	bx;
 	unsigned short	cx;
 	unsigned short	dx;
@@ -1650,8 +1650,6 @@ static int apm_get_info(char *buf, char 
 	int             time_units     = -1;
 	char            *units         = "?";
 
-	p = buf;
-
 	if ((num_online_cpus() == 1) &&
 	    !(error = apm_get_power_status(&bx, &cx, &dx))) {
 		ac_line_status = (bx >> 8) & 0xff;
@@ -1705,7 +1703,7 @@ static int apm_get_info(char *buf, char 
 	      -1: Unknown
 	   8) min = minutes; sec = seconds */
 
-	p += sprintf(p, "%s %d.%d 0x%02x 0x%02x 0x%02x 0x%02x %d%% %d %s\n",
+	seq_printf(m, "%s %d.%d 0x%02x 0x%02x 0x%02x 0x%02x %d%% %d %s\n",
 		     driver_version,
 		     (apm_info.bios.version >> 8) & 0xff,
 		     apm_info.bios.version & 0xff,
@@ -1716,10 +1714,22 @@ static int apm_get_info(char *buf, char 
 		     percentage,
 		     time_units,
 		     units);
+	return 0;
+}
 
-	return p - buf;
+static int proc_apm_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, proc_apm_show, NULL);
 }
 
+static const struct file_operations apm_file_ops = {
+	.owner		= THIS_MODULE,
+	.open		= proc_apm_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
 static int apm(void *unused)
 {
 	unsigned short	bx;
@@ -2341,9 +2351,9 @@ static int __init apm_init(void)
 	set_base(gdt[APM_DS >> 3],
 		 __va((unsigned long)apm_info.bios.dseg << 4));
 
-	apm_proc = create_proc_info_entry("apm", 0, NULL, apm_get_info);
+	apm_proc = create_proc_entry("apm", 0, NULL);
 	if (apm_proc)
-		apm_proc->owner = THIS_MODULE;
+		apm_proc->proc_fops = &apm_file_ops;
 
 	kapmd_task = kthread_create(apm, NULL, "kapmd");
 	if (IS_ERR(kapmd_task)) {

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [28/39] i386: fix size_or_mask and size_and_mask
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (25 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [27/39] i386: APM on i386 Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [29/39] x86_64: - Ignore long SMI interrupts in clock calibration code - update 1 Andi Kleen
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Andreas Herrmann, patches, linux-kernel


From: "Andreas Herrmann" <andreas.herrmann3@amd.com>
mtrr: fix size_or_mask and size_and_mask

This fixes two bugs in the /proc/mtrr interface:
o If the physical address size crosses the 44-bit boundary,
  size_or_mask is evaluated incorrectly.
o size_and_mask limits the width of the physical base
  address for an MTRR to less than 44 bits.

TBD: later patch had one more change, but I think that was bogus.
TBD: need to double check

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
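For reviewers, a minimal userspace sketch of the mask arithmetic from the
patch below. The helper names are made up for illustration; only the two
expressions and the 0xfffff00000ULL constant come from the patch, and
PAGE_SHIFT is assumed to be 12.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical helper: page-frame bits above the physical address width. */
static uint64_t mtrr_size_or_mask(unsigned int phys_addr_bits)
{
	assert(phys_addr_bits > PAGE_SHIFT && phys_addr_bits <= 52);
	/* 1ULL keeps the shift in 64 bits even when the count exceeds 31. */
	return ~((1ULL << (phys_addr_bits - PAGE_SHIFT)) - 1);
}

/* Hypothetical helper: same expression as the patched size_and_mask. */
static uint64_t mtrr_size_and_mask(unsigned int phys_addr_bits)
{
	return ~mtrr_size_or_mask(phys_addr_bits) & 0xfffff00000ULL;
}
```

With 48 physical address bits the or-mask has all of its low 32 bits clear,
so a u32 variable cannot hold the information; with 36 bits the and-mask
comes out the same as with the old 0xfff00000 constant.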
 arch/i386/kernel/cpu/mtrr/main.c |    6 +++---
 arch/i386/kernel/cpu/mtrr/mtrr.h |    2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

Index: linux/arch/i386/kernel/cpu/mtrr/main.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mtrr/main.c
+++ linux/arch/i386/kernel/cpu/mtrr/main.c
@@ -50,7 +50,7 @@ u32 num_var_ranges = 0;
 unsigned int *usage_table;
 static DEFINE_MUTEX(mtrr_mutex);
 
-u32 size_or_mask, size_and_mask;
+u64 size_or_mask, size_and_mask;
 
 static struct mtrr_ops * mtrr_ops[X86_VENDOR_NUM] = {};
 
@@ -662,8 +662,8 @@ void __init mtrr_bp_init(void)
 			     boot_cpu_data.x86_mask == 0x4))
 				phys_addr = 36;
 
-			size_or_mask = ~((1 << (phys_addr - PAGE_SHIFT)) - 1);
-			size_and_mask = ~size_or_mask & 0xfff00000;
+			size_or_mask = ~((1ULL << (phys_addr - PAGE_SHIFT)) - 1);
+			size_and_mask = ~size_or_mask & 0xfffff00000ULL;
 		} else if (boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR &&
 			   boot_cpu_data.x86 == 6) {
 			/* VIA C* family have Intel style MTRRs, but
Index: linux/arch/i386/kernel/cpu/mtrr/mtrr.h
===================================================================
--- linux.orig/arch/i386/kernel/cpu/mtrr/mtrr.h
+++ linux/arch/i386/kernel/cpu/mtrr/mtrr.h
@@ -84,7 +84,7 @@ void get_mtrr_state(void);
 
 extern void set_mtrr_ops(struct mtrr_ops * ops);
 
-extern u32 size_or_mask, size_and_mask;
+extern u64 size_or_mask, size_and_mask;
 extern struct mtrr_ops * mtrr_if;
 
 #define is_cpu(vnd)	(mtrr_if && mtrr_if->vendor == X86_VENDOR_##vnd)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [29/39] x86_64: - Ignore long SMI interrupts in clock calibration code - update 1
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (26 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [28/39] i386: fix size_or_mask and size_and_mask Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [30/39] x86_64: Check return value of putreg in PTRACE_SETREGS Andi Kleen
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Jack Steiner, patches, linux-kernel


From: Jack Steiner <steiner@sgi.com>
Add failsafe mechanism to HPET/TSC clock calibration.

Updated to include failsafe mechanism & additional community feedback.
Patch built on latest 2.6.20-rc4-mm1 tree.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
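A rough userspace model of the bounded retry loop, in case it helps review.
The loop condition and the two constants mirror the patch below; the
function name, the result struct and the array of simulated TSC deltas are
assumptions made only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TICK_MIN         5000
#define MAX_READ_RETRIES 5

struct read_result {
	int  reads;	/* how many times the counters were sampled */
	bool warned;	/* whether the "exceeded max retries" path is taken */
};

/* deltas[i] stands in for tsc2 - tsc1 of the i-th sample (hypothetical). */
static struct read_result bounded_read(const int *deltas, size_t n)
{
	struct read_result r = { 0, false };
	int retries = 0;
	int delta;

	assert(n > 0);
	do {
		/* Past the end of the array, keep reusing the last delta. */
		size_t idx = (size_t)r.reads < n ? (size_t)r.reads : n - 1;

		delta = deltas[idx];
		r.reads++;
	} while (delta > TICK_MIN && retries++ < MAX_READ_RETRIES);

	r.warned = retries >= MAX_READ_RETRIES;
	return r;
}
```

In this model a permanently slow read is sampled six times before the loop
gives up, and a fast read exits at once without touching the retry counter.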
 arch/x86_64/kernel/time.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux/arch/x86_64/kernel/time.c
===================================================================
--- linux.orig/arch/x86_64/kernel/time.c
+++ linux/arch/x86_64/kernel/time.c
@@ -657,6 +657,7 @@ core_initcall(cpufreq_tsc);
 
 #define TICK_COUNT 100000000
 #define TICK_MIN   5000
+#define MAX_READ_RETRIES 5
 
 /*
  * Some platforms take periodic SMI interrupts with 5ms duration. Make sure none
@@ -664,13 +665,17 @@ core_initcall(cpufreq_tsc);
  */
 static void __init read_hpet_tsc(int *hpet, int *tsc)
 {
-	int tsc1, tsc2, hpet1;
+	int tsc1, tsc2, hpet1, retries = 0;
+	static int msg;
 
 	do {
 		tsc1 = get_cycles_sync();
 		hpet1 = hpet_readl(HPET_COUNTER);
 		tsc2 = get_cycles_sync();
-	} while (tsc2 - tsc1 > TICK_MIN);
+	} while (tsc2 - tsc1 > TICK_MIN && retries++ < MAX_READ_RETRIES);
+	if (retries >= MAX_READ_RETRIES && !msg++)
+		printk(KERN_WARNING
+		       "hpet.c: exceeded max retries to read HPET & TSC\n");
 	*hpet = hpet1;
 	*tsc = tsc2;
 }

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [30/39] x86_64: Check return value of putreg in PTRACE_SETREGS
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (27 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [29/39] x86_64: - Ignore long SMI interrupts in clock calibration code - update 1 Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [31/39] x86_64: Unexport __supported_pte_mask Andi Kleen
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: patches, linux-kernel


This means that if an illegal value is set for the segment registers,
ptrace will now error out with an errno instead of silently ignoring
it.

Signed-off-by: Andi Kleen <ak@suse.de>

---
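To illustrate the behaviour change, a small userspace sketch of the two loop
shapes. Everything here is hypothetical: store_reg() merely stands in for
putreg(), and its "reject anything above 0xffff" rule is an invented check,
not the kernel's real validation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NREGS 4

static unsigned long regs[NREGS];

/* Hypothetical stand-in for putreg(): refuses values wider than 16 bits. */
static int store_reg(size_t i, unsigned long v)
{
	if (v > 0xffff)
		return -EIO;
	regs[i] = v;
	return 0;
}

/* Old shape: the store result is dropped, so a bad value is skipped. */
static int set_regs_unchecked(const unsigned long *vals, size_t n)
{
	assert(n <= NREGS);
	for (size_t i = 0; i < n; i++)
		store_reg(i, vals[i]);
	return 0;
}

/* New shape: stop at the first failure and hand the error back. */
static int set_regs_checked(const unsigned long *vals, size_t n)
{
	int ret = 0;

	assert(n <= NREGS);
	for (size_t i = 0; i < n; i++) {
		ret = store_reg(i, vals[i]);
		if (ret)
			break;
	}
	return ret;
}
```

One visible consequence in this model: with the checked loop, registers
after the rejected one are left untouched, whereas the unchecked loop keeps
writing them and still reports success.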
 arch/x86_64/kernel/ptrace.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Index: linux/arch/x86_64/kernel/ptrace.c
===================================================================
--- linux.orig/arch/x86_64/kernel/ptrace.c
+++ linux/arch/x86_64/kernel/ptrace.c
@@ -536,8 +536,12 @@ long arch_ptrace(struct task_struct *chi
 		}
 		ret = 0;
 		for (ui = 0; ui < sizeof(struct user_regs_struct); ui += sizeof(long)) {
-			ret |= __get_user(tmp, (unsigned long __user *) data);
-			putreg(child, ui, tmp);
+			ret = __get_user(tmp, (unsigned long __user *) data);
+			if (ret)
+				break;
+			ret = putreg(child, ui, tmp);
+			if (ret)
+				break;
 			data += sizeof(long);
 		}
 		break;

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [31/39] x86_64: Unexport __supported_pte_mask
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (28 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [30/39] x86_64: Check return value of putreg in PTRACE_SETREGS Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [32/39] x86_64: x86_64 - Fix FS/GS registers for VT execution Andi Kleen
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: patches, linux-kernel


The symbol is needed to manipulate page tables, and modules shouldn't
do that.

Leftover from 2.4, but no in-tree module should need it now.

Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/x86_64/kernel/setup64.c |    1 -
 1 file changed, 1 deletion(-)

Index: linux/arch/x86_64/kernel/setup64.c
===================================================================
--- linux.orig/arch/x86_64/kernel/setup64.c
+++ linux/arch/x86_64/kernel/setup64.c
@@ -37,7 +37,6 @@ struct desc_ptr idt_descr = { 256 * 16 -
 char boot_cpu_stack[IRQSTACKSIZE] __attribute__((section(".bss.page_aligned")));
 
 unsigned long __supported_pte_mask __read_mostly = ~0UL;
-EXPORT_SYMBOL(__supported_pte_mask);
 static int do_not_nx __cpuinitdata = 0;
 
 /* noexec=on|off

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [32/39] x86_64: x86_64 - Fix FS/GS registers for VT execution
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (29 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [31/39] x86_64: Unexport __supported_pte_mask Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [33/39] x86_64: Fix off by one error in IOMMU boundary checking Andi Kleen
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Zachary Amsden, patches, linux-kernel


From: Zachary Amsden <zach@vmware.com>

Initialize FS and GS to __KERNEL_DS as well.  Their actual value is not
important, but it is important to reload them in protected mode.  At this time,
they still retain the real mode values from initial boot.  VT disallows
execution of code under such conditions, which means hardware virtualization
cannot be used to boot the kernel on Intel platforms, making the boot time
painfully slow.

This requires moving the GS load before the load of GS_BASE, so just move
all the segment loads there to keep them together in the code.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/x86_64/kernel/head.S |   20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

Index: linux/arch/x86_64/kernel/head.S
===================================================================
--- linux.orig/arch/x86_64/kernel/head.S
+++ linux/arch/x86_64/kernel/head.S
@@ -163,6 +163,20 @@ startup_64:
 	 */
 	lgdt	cpu_gdt_descr
 
+	/* set up data segments. actually 0 would do too */
+	movl $__KERNEL_DS,%eax
+	movl %eax,%ds
+	movl %eax,%ss
+	movl %eax,%es
+
+	/*
+	 * We don't really need to load %fs or %gs, but load them anyway
+	 * to kill any stale realmode selectors.  This allows execution
+	 * under VT hardware.
+	 */
+	movl %eax,%fs
+	movl %eax,%gs
+
 	/* 
 	 * Setup up a dummy PDA. this is just for some early bootup code
 	 * that does in_interrupt() 
@@ -173,12 +187,6 @@ startup_64:
 	shrq	$32,%rdx
 	wrmsr	
 
-	/* set up data segments. actually 0 would do too */
-	movl $__KERNEL_DS,%eax
-	movl %eax,%ds	
-	movl %eax,%ss
-	movl %eax,%es
-			
 	/* esi is pointer to real mode structure with interesting info.
 	   pass it to C */
 	movl	%esi, %edi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [33/39] x86_64: Fix off by one error in IOMMU boundary checking
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (30 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [32/39] x86_64: x86_64 - Fix FS/GS registers for VT execution Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [34/39] i386: Use stack arguments for calling into EFI Andi Kleen
                   ` (5 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: patches, linux-kernel


Should be harmless because there is normally no memory there, but
technically it was incorrect.

Pointed out by Leo Duran

Signed-off-by: Andi Kleen <ak@suse.de>

---
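A tiny userspace sketch of the one case where the old and new comparisons
disagree. The function names are invented for illustration and a 32-bit DMA
mask is assumed; only the two comparison expressions come from the patch
below.

```c
#include <stddef.h>
#include <stdint.h>

/* Comparison as it was before the patch. */
static int high_before(uint64_t mask, uint64_t addr, size_t size)
{
	return addr + size >= mask;
}

/* Comparison as it is after the patch. */
static int high_after(uint64_t mask, uint64_t addr, size_t size)
{
	return addr + size > mask;
}
```

With mask 0xffffffff, a buffer whose addr + size is exactly 0xffffffff was
treated as high before and is not any more; buffers ending above or well
below the mask get the same answer from both versions.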
 arch/x86_64/kernel/pci-gart.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/x86_64/kernel/pci-gart.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-gart.c
+++ linux/arch/x86_64/kernel/pci-gart.c
@@ -185,7 +185,7 @@ static void iommu_full(struct device *de
 static inline int need_iommu(struct device *dev, unsigned long addr, size_t size)
 { 
 	u64 mask = *dev->dma_mask;
-	int high = addr + size >= mask;
+	int high = addr + size > mask;
 	int mmu = high;
 	if (force_iommu) 
 		mmu = 1; 
@@ -195,7 +195,7 @@ static inline int need_iommu(struct devi
 static inline int nonforced_iommu(struct device *dev, unsigned long addr, size_t size)
 { 
 	u64 mask = *dev->dma_mask;
-	int high = addr + size >= mask;
+	int high = addr + size > mask;
 	int mmu = high;
 	return mmu; 
 }

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [34/39] i386: Use stack arguments for calling into EFI
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (31 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [33/39] x86_64: Fix off by one error in IOMMU boundary checking Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12 19:45   ` Frédéric RISS
  2007-02-12  7:38 ` [PATCH x86 for review II] [35/39] x86_64: Don't reserve ROMs Andi Kleen
                   ` (4 subsequent siblings)
  37 siblings, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: patches, linux-kernel


When calling into the EFI firmware, the parameters need to be passed on
the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
This patch is needed to allow the new Intel-based Macs to suspend to ram
(efi.get_time is called during the suspend phase).

Signed-off-by: Frederic Riss <frederic.riss@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 include/linux/efi.h |   43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

Index: linux/include/linux/efi.h
===================================================================
--- linux.orig/include/linux/efi.h
+++ linux/include/linux/efi.h
@@ -157,22 +157,33 @@ typedef struct {
 	unsigned long reset_system;
 } efi_runtime_services_t;
 
-typedef efi_status_t efi_get_time_t (efi_time_t *tm, efi_time_cap_t *tc);
-typedef efi_status_t efi_set_time_t (efi_time_t *tm);
-typedef efi_status_t efi_get_wakeup_time_t (efi_bool_t *enabled, efi_bool_t *pending,
-					    efi_time_t *tm);
-typedef efi_status_t efi_set_wakeup_time_t (efi_bool_t enabled, efi_time_t *tm);
-typedef efi_status_t efi_get_variable_t (efi_char16_t *name, efi_guid_t *vendor, u32 *attr,
-					 unsigned long *data_size, void *data);
-typedef efi_status_t efi_get_next_variable_t (unsigned long *name_size, efi_char16_t *name,
-					      efi_guid_t *vendor);
-typedef efi_status_t efi_set_variable_t (efi_char16_t *name, efi_guid_t *vendor, 
-					 unsigned long attr, unsigned long data_size, 
-					 void *data);
-typedef efi_status_t efi_get_next_high_mono_count_t (u32 *count);
-typedef void efi_reset_system_t (int reset_type, efi_status_t status,
-				 unsigned long data_size, efi_char16_t *data);
-typedef efi_status_t efi_set_virtual_address_map_t (unsigned long memory_map_size,
+typedef asmlinkage efi_status_t efi_get_time_t (efi_time_t *tm,
+						efi_time_cap_t *tc);
+typedef asmlinkage efi_status_t efi_set_time_t (efi_time_t *tm);
+typedef asmlinkage efi_status_t efi_get_wakeup_time_t (efi_bool_t *enabled,
+						       efi_bool_t *pending,
+						       efi_time_t *tm);
+typedef asmlinkage efi_status_t efi_set_wakeup_time_t (efi_bool_t enabled,
+						       efi_time_t *tm);
+typedef asmlinkage efi_status_t efi_get_variable_t (efi_char16_t *name,
+						    efi_guid_t *vendor,
+						    u32 *attr,
+						    unsigned long *data_size,
+						    void *data);
+typedef asmlinkage efi_status_t efi_get_next_variable_t (unsigned long *name_sz,
+							 efi_char16_t *name,
+							 efi_guid_t *vendor);
+typedef asmlinkage efi_status_t efi_set_variable_t (efi_char16_t *name,
+						    efi_guid_t *vendor,
+						    unsigned long attr,
+						    unsigned long data_size,
+						    void *data);
+typedef asmlinkage efi_status_t efi_get_next_high_mono_count_t (u32 *count);
+typedef asmlinkage void efi_reset_system_t (int reset_type,
+					    efi_status_t status,
+					    unsigned long data_size,
+					    efi_char16_t *data);
+typedef asmlinkage efi_status_t efi_set_virtual_address_map_t (unsigned long memory_map_size,
 						unsigned long descriptor_size,
 						u32 descriptor_version,
 						efi_memory_desc_t *virtual_map);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [35/39] x86_64: Don't reserve ROMs
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (32 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [34/39] i386: Use stack arguments for calling into EFI Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [36/39] x86_64: define dma noncoherent API functions Andi Kleen
                   ` (3 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: patches, linux-kernel


We trust the e820 table, so explicitly reserving ROMs shouldn't
be needed.

Signed-off-by: Andi Kleen <ak@suse.de>

---
 arch/x86_64/kernel/setup.c |  130 ---------------------------------------------
 1 file changed, 2 insertions(+), 128 deletions(-)

Index: linux/arch/x86_64/kernel/setup.c
===================================================================
--- linux.orig/arch/x86_64/kernel/setup.c
+++ linux/arch/x86_64/kernel/setup.c
@@ -138,128 +138,6 @@ struct resource code_resource = {
 	.flags = IORESOURCE_RAM,
 };
 
-#define IORESOURCE_ROM (IORESOURCE_BUSY | IORESOURCE_READONLY | IORESOURCE_MEM)
-
-static struct resource system_rom_resource = {
-	.name = "System ROM",
-	.start = 0xf0000,
-	.end = 0xfffff,
-	.flags = IORESOURCE_ROM,
-};
-
-static struct resource extension_rom_resource = {
-	.name = "Extension ROM",
-	.start = 0xe0000,
-	.end = 0xeffff,
-	.flags = IORESOURCE_ROM,
-};
-
-static struct resource adapter_rom_resources[] = {
-	{ .name = "Adapter ROM", .start = 0xc8000, .end = 0,
-		.flags = IORESOURCE_ROM },
-	{ .name = "Adapter ROM", .start = 0, .end = 0,
-		.flags = IORESOURCE_ROM },
-	{ .name = "Adapter ROM", .start = 0, .end = 0,
-		.flags = IORESOURCE_ROM },
-	{ .name = "Adapter ROM", .start = 0, .end = 0,
-		.flags = IORESOURCE_ROM },
-	{ .name = "Adapter ROM", .start = 0, .end = 0,
-		.flags = IORESOURCE_ROM },
-	{ .name = "Adapter ROM", .start = 0, .end = 0,
-		.flags = IORESOURCE_ROM }
-};
-
-static struct resource video_rom_resource = {
-	.name = "Video ROM",
-	.start = 0xc0000,
-	.end = 0xc7fff,
-	.flags = IORESOURCE_ROM,
-};
-
-static struct resource video_ram_resource = {
-	.name = "Video RAM area",
-	.start = 0xa0000,
-	.end = 0xbffff,
-	.flags = IORESOURCE_RAM,
-};
-
-#define romsignature(x) (*(unsigned short *)(x) == 0xaa55)
-
-static int __init romchecksum(unsigned char *rom, unsigned long length)
-{
-	unsigned char *p, sum = 0;
-
-	for (p = rom; p < rom + length; p++)
-		sum += *p;
-	return sum == 0;
-}
-
-static void __init probe_roms(void)
-{
-	unsigned long start, length, upper;
-	unsigned char *rom;
-	int	      i;
-
-	/* video rom */
-	upper = adapter_rom_resources[0].start;
-	for (start = video_rom_resource.start; start < upper; start += 2048) {
-		rom = isa_bus_to_virt(start);
-		if (!romsignature(rom))
-			continue;
-
-		video_rom_resource.start = start;
-
-		/* 0 < length <= 0x7f * 512, historically */
-		length = rom[2] * 512;
-
-		/* if checksum okay, trust length byte */
-		if (length && romchecksum(rom, length))
-			video_rom_resource.end = start + length - 1;
-
-		request_resource(&iomem_resource, &video_rom_resource);
-		break;
-			}
-
-	start = (video_rom_resource.end + 1 + 2047) & ~2047UL;
-	if (start < upper)
-		start = upper;
-
-	/* system rom */
-	request_resource(&iomem_resource, &system_rom_resource);
-	upper = system_rom_resource.start;
-
-	/* check for extension rom (ignore length byte!) */
-	rom = isa_bus_to_virt(extension_rom_resource.start);
-	if (romsignature(rom)) {
-		length = extension_rom_resource.end - extension_rom_resource.start + 1;
-		if (romchecksum(rom, length)) {
-			request_resource(&iomem_resource, &extension_rom_resource);
-			upper = extension_rom_resource.start;
-		}
-	}
-
-	/* check for adapter roms on 2k boundaries */
-	for (i = 0; i < ARRAY_SIZE(adapter_rom_resources) && start < upper;
-	     start += 2048) {
-		rom = isa_bus_to_virt(start);
-		if (!romsignature(rom))
-			continue;
-
-		/* 0 < length <= 0x7f * 512, historically */
-		length = rom[2] * 512;
-
-		/* but accept any length that fits if checksum okay */
-		if (!length || start + length > upper || !romchecksum(rom, length))
-			continue;
-
-		adapter_rom_resources[i].start = start;
-		adapter_rom_resources[i].end = start + length - 1;
-		request_resource(&iomem_resource, &adapter_rom_resources[i]);
-
-		start = adapter_rom_resources[i++].end & ~2047UL;
-	}
-}
-
 #ifdef CONFIG_PROC_VMCORE
 /* elfcorehdr= specifies the location of elf core header
  * stored by the crashed kernel. This option will be passed
@@ -524,15 +402,11 @@ void __init setup_arch(char **cmdline_p)
 	init_apic_mappings();
 
 	/*
-	 * Request address space for all standard RAM and ROM resources
-	 * and also for regions reported as reserved by the e820.
-	 */
-	probe_roms();
+	 * We trust e820 completely. No explicit ROM probing in memory.
+ 	 */
 	e820_reserve_resources(); 
 	e820_mark_nosave_regions();
 
-	request_resource(&iomem_resource, &video_ram_resource);
-
 	{
 	unsigned i;
 	/* request I/O space for devices used on all i[345]86 PCs */
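
For readers unfamiliar with what the removed probe_roms() was checking, here is a minimal userspace sketch of the legacy option-ROM test: a 0x55 0xaa signature, a length byte in 512-byte units, and a byte sum of zero. It is an illustration only, not the kernel code; all demo_/rom_ names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* An option ROM image starts with the bytes 0x55, 0xaa
 * (0xaa55 when read as a little-endian 16-bit word). */
static int rom_has_signature(const uint8_t *rom)
{
	return rom[0] == 0x55 && rom[1] == 0xaa;
}

/* Byte 2 holds the image length in 512-byte units. */
static size_t rom_length(const uint8_t *rom)
{
	return (size_t)rom[2] * 512;
}

/* A valid image sums to zero modulo 256 over its whole length. */
static int rom_checksum_ok(const uint8_t *rom, size_t length)
{
	uint8_t sum = 0;
	size_t i;

	for (i = 0; i < length; i++)
		sum = (uint8_t)(sum + rom[i]);
	return sum == 0;
}

/* Hypothetical helper: fill a 512-byte buffer with a minimal valid image. */
static void demo_make_rom(uint8_t *buf)
{
	uint8_t sum = 0;
	size_t i;

	memset(buf, 0, 512);
	buf[0] = 0x55;
	buf[1] = 0xaa;
	buf[2] = 1;			/* one 512-byte block */
	for (i = 0; i < 511; i++)
		sum = (uint8_t)(sum + buf[i]);
	buf[511] = (uint8_t)(0u - sum);	/* make the total sum zero */
}
```

With this patch the kernel no longer performs such scanning on x86_64 and relies on the e820 map to mark these regions as reserved.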

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [36/39] x86_64: define dma noncoherent API functions
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (33 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [35/39] x86_64: Don't reserve ROMs Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [37/39] x86_64: robustify bad_dma_address handling Andi Kleen
                   ` (2 subsequent siblings)
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Jeff Garzik, patches, linux-kernel


From: Jeff Garzik <jeff@garzik.org>

x86-64 is missing dma_alloc_noncoherent() and dma_free_noncoherent().
Define both as aliases for the coherent variants.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andi Kleen <ak@suse.de>

---
 include/asm-x86_64/dma-mapping.h |    3 +++
 1 file changed, 3 insertions(+)

Index: linux/include/asm-x86_64/dma-mapping.h
===================================================================
--- linux.orig/include/asm-x86_64/dma-mapping.h
+++ linux/include/asm-x86_64/dma-mapping.h
@@ -66,6 +66,9 @@ static inline int dma_mapping_error(dma_
 #define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
 #define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
 
+#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
+#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
+
 extern void *dma_alloc_coherent(struct device *dev, size_t size,
 				dma_addr_t *dma_handle, gfp_t gfp);
 extern void dma_free_coherent(struct device *dev, size_t size, void *vaddr,

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [37/39] x86_64: robustify bad_dma_address handling
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (34 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [36/39] x86_64: define dma noncoherent API functions Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [38/39] x86: fix laptop bootup hang in init_acpi() Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [39/39] i386: All Transmeta CPUs have constant TSCs Andi Kleen
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Muli Ben-Yehuda, Leo Duran, Jon Mason, patches, linux-kernel


From: Muli Ben-Yehuda <muli@il.ibm.com>

- set bad_dma_address explicitly to 0x0
- reserve 32 pages from bad_dma_address and up
- WARN_ON() a driver feeding us bad_dma_address

Thanks to Leo Duran <leo.duran@amd.com> for the suggestion.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Leo Duran <leo.duran@amd.com>
Cc: Jon Mason <jdmason@kudzu.us>
---
 arch/x86_64/kernel/pci-calgary.c |   17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Index: linux/arch/x86_64/kernel/pci-calgary.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-calgary.c
+++ linux/arch/x86_64/kernel/pci-calgary.c
@@ -138,6 +138,8 @@ static const unsigned long phb_debug_off
 
 #define PHB_DEBUG_STUFF_OFFSET	0x0020
 
+#define EMERGENCY_PAGES 32 /* = 128KB */
+
 unsigned int specified_table_size = TCE_TABLE_SIZE_UNSPECIFIED;
 static int translate_empty_slots __read_mostly = 0;
 static int calgary_detected __read_mostly = 0;
@@ -296,6 +298,16 @@ static void __iommu_free(struct iommu_ta
 {
 	unsigned long entry;
 	unsigned long badbit;
+	unsigned long badend;
+
+	/* were we called with bad_dma_address? */
+	badend = bad_dma_address + (EMERGENCY_PAGES * PAGE_SIZE);
+	if (unlikely((dma_addr >= bad_dma_address) && (dma_addr < badend))) {
+		printk(KERN_ERR "Calgary: driver tried unmapping bad DMA "
+		       "address 0x%Lx\n", dma_addr);
+		WARN_ON(1);
+		return;
+	}
 
 	entry = dma_addr >> PAGE_SHIFT;
 
@@ -656,8 +668,8 @@ static void __init calgary_reserve_regio
 	u64 start;
 	struct iommu_table *tbl = dev->sysdata;
 
-	/* reserve bad_dma_address in case it's a legal address */
-	iommu_range_reserve(tbl, bad_dma_address, 1);
+	/* reserve EMERGENCY_PAGES from bad_dma_address and up */
+	iommu_range_reserve(tbl, bad_dma_address, EMERGENCY_PAGES);
 
 	/* avoid the BIOS/VGA first 640KB-1MB region */
 	start = (640 * 1024);
@@ -1176,6 +1188,7 @@ int __init calgary_iommu_init(void)
 	}
 
 	force_iommu = 1;
+	bad_dma_address = 0x0;
 	dma_ops = &calgary_dma_ops;
 
 	return 0;
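
The range test added to __iommu_free() can be tried out in isolation. The following is a minimal userspace sketch, assuming 4 KiB pages; the DEMO_ constants and demo_in_emergency_range() are hypothetical names for this illustration, not part of the Calgary driver.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE       4096ULL	/* assumed 4 KiB pages */
#define DEMO_EMERGENCY_PAGES 32ULL	/* 32 * 4 KiB = 128 KiB */

/*
 * Returns 1 if addr falls inside the reserved window
 * [bad, bad + DEMO_EMERGENCY_PAGES * DEMO_PAGE_SIZE), else 0.
 */
static int demo_in_emergency_range(uint64_t addr, uint64_t bad)
{
	uint64_t end = bad + DEMO_EMERGENCY_PAGES * DEMO_PAGE_SIZE;

	return addr >= bad && addr < end;
}
```

With bad_dma_address set to 0x0, every address below 0x20000 is treated as bad, so a driver that adds a small offset to a failed mapping is still caught.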

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [38/39] x86: fix laptop bootup hang in init_acpi()
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (35 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [37/39] x86_64: robustify bad_dma_address handling Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  2007-02-12  7:38 ` [PATCH x86 for review II] [39/39] i386: All Transmeta CPUs have constant TSCs Andi Kleen
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: Ingo Molnar, Andi Kleen, Len Brown, patches, linux-kernel


From: Ingo Molnar <mingo@elte.hu>

During kernel bootup, a new T60 laptop (CoreDuo, 32-bit) hangs about
10%-20% of the time in acpi_init():

 Calling initcall 0xc055ce1a: topology_init+0x0/0x2f()
 Calling initcall 0xc055d75e: mtrr_init_finialize+0x0/0x2c()
 Calling initcall 0xc05664f3: param_sysfs_init+0x0/0x175()
 Calling initcall 0xc014cb65: pm_sysrq_init+0x0/0x17()
 Calling initcall 0xc0569f99: init_bio+0x0/0xf4()
 Calling initcall 0xc056b865: genhd_device_init+0x0/0x50()
 Calling initcall 0xc056c4bd: fbmem_init+0x0/0x87()
 Calling initcall 0xc056dd74: acpi_init+0x0/0x1ee()

It's a hard hang that not even an NMI could punch through!  Frustratingly,
adding printks or function tracing to the ACPI code made the hangs go away
...

After some time an additional detail emerged: disabling the NMI watchdog
made these occasional hangs go away.

So I spent the better part of today trying to debug this and trying out
various theories when I finally found the likely reason for the hang: if
acpi_ns_initialize_devices() executes an _INI AML method and an NMI
happens to hit that AML execution at the wrong moment, the machine
hangs.  (My theory is that this must be some sort of chipset setup method
doing stores to chipset MMIO registers.)

Unfortunately, given the characteristics of the hang, it was nearly
impossible to figure out which of the numerous AML methods is affected
by this problem.

As a workaround I wrote an interface to disable chipset-based NMIs while
executing _INI sections, and indeed this fixed the hang.  I did a
boot-loop of 100 separate reboots and none hung, while without the patch
it would hang every 5-10 attempts.  Out of caution I did not touch the
nmi_watchdog=2 case (it's not related to the chipset anyway and didn't
hang).

I implemented this for both x86_64 and i686, tested the i686 laptop both
with nmi_watchdog=1 [which triggered the hangs] and nmi_watchdog=2, and
tested an Athlon64 box with the 64-bit kernel as well. Everything builds
and works with the patch applied.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/i386/kernel/nmi.c          |   28 ++++++++++++++++++++++++++++
 arch/x86_64/kernel/nmi.c        |   27 +++++++++++++++++++++++++++
 drivers/acpi/namespace/nsinit.c |    9 +++++++++
 include/linux/nmi.h             |    9 ++++++++-
 4 files changed, 72 insertions(+), 1 deletion(-)

Index: linux/arch/i386/kernel/nmi.c
===================================================================
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -383,6 +383,34 @@ void enable_timer_nmi_watchdog(void)
 	}
 }
 
+static void __acpi_nmi_disable(void *__unused)
+{
+	apic_write_around(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
+}
+
+/*
+ * Disable timer based NMIs on all CPUs:
+ */
+void acpi_nmi_disable(void)
+{
+	if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
+		on_each_cpu(__acpi_nmi_disable, NULL, 0, 1);
+}
+
+static void __acpi_nmi_enable(void *__unused)
+{
+	apic_write_around(APIC_LVT0, APIC_DM_NMI);
+}
+
+/*
+ * Enable timer based NMIs on all CPUs:
+ */
+void acpi_nmi_enable(void)
+{
+	if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
+		on_each_cpu(__acpi_nmi_enable, NULL, 0, 1);
+}
+
 #ifdef CONFIG_PM
 
 static int nmi_pm_active; /* nmi_active before suspend */
Index: linux/arch/x86_64/kernel/nmi.c
===================================================================
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -368,6 +368,33 @@ void enable_timer_nmi_watchdog(void)
 	}
 }
 
+static void __acpi_nmi_disable(void *__unused)
+{
+	apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
+}
+
+/*
+ * Disable timer based NMIs on all CPUs:
+ */
+void acpi_nmi_disable(void)
+{
+	if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
+		on_each_cpu(__acpi_nmi_disable, NULL, 0, 1);
+}
+
+static void __acpi_nmi_enable(void *__unused)
+{
+	apic_write(APIC_LVT0, APIC_DM_NMI);
+}
+
+/*
+ * Enable timer based NMIs on all CPUs:
+ */
+void acpi_nmi_enable(void)
+{
+	if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
+		on_each_cpu(__acpi_nmi_enable, NULL, 0, 1);
+}
 #ifdef CONFIG_PM
 
 static int nmi_pm_active; /* nmi_active before suspend */
Index: linux/drivers/acpi/namespace/nsinit.c
===================================================================
--- linux.orig/drivers/acpi/namespace/nsinit.c
+++ linux/drivers/acpi/namespace/nsinit.c
@@ -45,6 +45,7 @@
 #include <acpi/acnamesp.h>
 #include <acpi/acdispat.h>
 #include <acpi/acinterp.h>
+#include <linux/nmi.h>
 
 #define _COMPONENT          ACPI_NAMESPACE
 ACPI_MODULE_NAME("nsinit")
@@ -534,7 +535,15 @@ acpi_ns_init_one_device(acpi_handle obj_
 	info->parameter_type = ACPI_PARAM_ARGS;
 	info->flags = ACPI_IGNORE_RETURN_VALUE;
 
+	/*
+	 * Some hardware relies on this being executed as atomically
+	 * as possible (without an NMI being received in the middle of
+	 * this) - so disable NMIs and initialize the device:
+	 */
+	acpi_nmi_disable();
 	status = acpi_ns_evaluate(info);
+	acpi_nmi_enable();
+
 	if (ACPI_SUCCESS(status)) {
 		walk_info->num_INI++;
 
Index: linux/include/linux/nmi.h
===================================================================
--- linux.orig/include/linux/nmi.h
+++ linux/include/linux/nmi.h
@@ -17,8 +17,15 @@
 #ifdef ARCH_HAS_NMI_WATCHDOG
 #include <asm/nmi.h>
 extern void touch_nmi_watchdog(void);
+extern void acpi_nmi_disable(void);
+extern void acpi_nmi_enable(void);
 #else
-# define touch_nmi_watchdog() touch_softlockup_watchdog()
+static inline void touch_nmi_watchdog(void)
+{
+	touch_softlockup_watchdog();
+}
+static inline void acpi_nmi_disable(void) { }
+static inline void acpi_nmi_enable(void) { }
 #endif
 
 #ifndef trigger_all_cpu_backtrace
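
The disable/evaluate/enable bracket around acpi_ns_evaluate() can be modelled in userspace. This is a minimal sketch in which a plain variable stands in for the local APIC LVT0 register; the DEMO_ constants are assumed to match the usual APIC_DM_NMI and APIC_LVT_MASKED encodings, and every demo_ name is hypothetical.

```c
#include <stdint.h>

#define DEMO_APIC_DM_NMI	0x00400u	/* delivery mode: NMI (assumed value) */
#define DEMO_APIC_LVT_MASKED	(1u << 16)	/* LVT mask bit (assumed value) */

/* Stand-in for the LVT0 register; starts unmasked, delivering NMIs. */
static uint32_t demo_lvt0 = DEMO_APIC_DM_NMI;

static void demo_nmi_disable(void)
{
	demo_lvt0 = DEMO_APIC_DM_NMI | DEMO_APIC_LVT_MASKED;
}

static void demo_nmi_enable(void)
{
	demo_lvt0 = DEMO_APIC_DM_NMI;
}

static int demo_nmi_masked(void)
{
	return (demo_lvt0 & DEMO_APIC_LVT_MASKED) != 0;
}

/* Run fn() with the simulated NMI source masked, then unmask again. */
static int demo_run_masked(int (*fn)(void))
{
	int ret;

	demo_nmi_disable();
	ret = fn();
	demo_nmi_enable();
	return ret;
}

/* Hypothetical callback: reports whether NMIs were masked while it ran. */
static int demo_probe(void)
{
	return demo_nmi_masked();
}
```

Note that, like the patch, the enable step unmasks unconditionally rather than restoring a saved state; that is fine here because the real functions only act when the IO-APIC watchdog is active.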

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH x86 for review II] [39/39] i386: All Transmeta CPUs have constant TSCs
  2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
                   ` (36 preceding siblings ...)
  2007-02-12  7:38 ` [PATCH x86 for review II] [38/39] x86: fix laptop bootup hang in init_acpi() Andi Kleen
@ 2007-02-12  7:38 ` Andi Kleen
  37 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  7:38 UTC (permalink / raw)
  To: H. Peter Anvin, Andi Kleen, patches, linux-kernel


From: "H. Peter Anvin" <hpa@zytor.com>

All Transmeta CPUs ever produced have constant-rate TSCs.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/i386/kernel/cpu/transmeta.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux/arch/i386/kernel/cpu/transmeta.c
===================================================================
--- linux.orig/arch/i386/kernel/cpu/transmeta.c
+++ linux/arch/i386/kernel/cpu/transmeta.c
@@ -72,6 +72,9 @@ static void __cpuinit init_transmeta(str
 	wrmsr(0x80860004, ~0, uk);
 	c->x86_capability[0] = cpuid_edx(0x00000001);
 	wrmsr(0x80860004, cap_mask, uk);
+
+	/* All Transmeta CPUs have a constant TSC */
+	set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
 	
 	/* If we can run i686 user-space code, call us an i686 */
 #define USER686 (X86_FEATURE_TSC|X86_FEATURE_CX8|X86_FEATURE_CMOV)
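
For context, feature flags such as X86_FEATURE_CONSTANT_TSC are bit numbers into an array of 32-bit capability words, encoded as word * 32 + bit. Below is a minimal userspace sketch of that indexing; the value 3*32+8 is an assumption about the header of that era, and the demo_ helpers are hypothetical stand-ins for set_bit()/test_bit().

```c
#include <stdint.h>

#define DEMO_NCAPINTS			8		/* hypothetical number of words */
#define DEMO_FEATURE_CONSTANT_TSC	(3 * 32 + 8)	/* assumed: word 3, bit 8 */

/* Set one feature bit in the capability array. */
static void demo_set_cap(uint32_t *caps, unsigned int feature)
{
	caps[feature / 32] |= UINT32_C(1) << (feature % 32);
}

/* Test one feature bit in the capability array. */
static int demo_has_cap(const uint32_t *caps, unsigned int feature)
{
	return (int)((caps[feature / 32] >> (feature % 32)) & 1u);
}
```

Setting the flag touches only the Linux-defined word, so the CPUID-derived word 0 that init_transmeta() reloads just above is left alone.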

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected
  2007-02-12  7:38 ` [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected Andi Kleen
@ 2007-02-12  7:54   ` Oliver Neukum
  2007-02-12  8:04     ` Andi Kleen
  0 siblings, 1 reply; 47+ messages in thread
From: Oliver Neukum @ 2007-02-12  7:54 UTC (permalink / raw)
  To: Andi Kleen; +Cc: patches, linux-kernel

On Monday, 12 February 2007 08:38, Andi Kleen wrote:
> When a machine check event is detected (including a AMD RevF threshold 
> overflow event) allow to run a "trigger" program. This allows user space
> to react to such events sooner.

Could this not be merged with other reporting mechanisms? This looks like
a new incarnation of the /etc/hotplug code.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected
  2007-02-12  7:54   ` Oliver Neukum
@ 2007-02-12  8:04     ` Andi Kleen
  2007-02-12  8:11       ` Bauke Jan Douma
  2007-02-12 15:05       ` [patches] " Pavel Machek
  0 siblings, 2 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12  8:04 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: patches, linux-kernel

On Monday 12 February 2007 08:54, Oliver Neukum wrote:
> Am Montag, 12. Februar 2007 08:38 schrieb Andi Kleen:
> > When a machine check event is detected (including a AMD RevF threshold 
> > overflow event) allow to run a "trigger" program. This allows user space
> > to react to such events sooner.
> 
> Could this not be merged with other reporting mechanisms? This looks like
> a new incarnation of the /etc/hotplug code.

I refuse to make mcelog depend on dbus. Just because some desktops want bloat
doesn't mean that everybody else wants it too.

I don't know of any other lightweight mechanism except for this.

-Andi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected
  2007-02-12  8:04     ` Andi Kleen
@ 2007-02-12  8:11       ` Bauke Jan Douma
  2007-02-12 15:05       ` [patches] " Pavel Machek
  1 sibling, 0 replies; 47+ messages in thread
From: Bauke Jan Douma @ 2007-02-12  8:11 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

Andi Kleen wrote on 12-02-07 09:04:
> On Monday 12 February 2007 08:54, Oliver Neukum wrote:
>> Am Montag, 12. Februar 2007 08:38 schrieb Andi Kleen:
>>> When a machine check event is detected (including a AMD RevF threshold 
>>> overflow event) allow to run a "trigger" program. This allows user space
>>> to react to such events sooner.
>> Could this not be merged with other reporting mechanisms? This looks like
>> a new incarnation of the /etc/hotplug code.
> 
> I refuse to make mcelog depend on dbus. Just because some desktops want bloat
> doesn't mean that everybody else wants too.

Man, how I agree with that!
Refreshing to hear someone just say no.

bjd




^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH x86 for review II] [26/39] i386: fix 32-bit ioctls on x64_32
  2007-02-12  7:38 ` [PATCH x86 for review II] [26/39] i386: fix 32-bit ioctls on x64_32 Andi Kleen
@ 2007-02-12 13:24   ` Giuliano Procida
  2007-02-12 22:28     ` Andi Kleen
  0 siblings, 1 reply; 47+ messages in thread
From: Giuliano Procida @ 2007-02-12 13:24 UTC (permalink / raw)
  To: Andi Kleen; +Cc: patches, linux-kernel

This is a nicer version of the MTRR compatibility ioctl patch; it compiles
to smaller code and has been tested.

Signed-off-by: Giuliano Procida <giuliano.procida@googlemail.com>

--- linux-source-2.6.19.1.orig/arch/i386/kernel/cpu/mtrr/if.c	2006-12-11 19:32:53.000000000 +0000
+++ linux-source-2.6.19.1/arch/i386/kernel/cpu/mtrr/if.c	2007-01-27 12:25:21.000000000 +0000
@@ -154,150 +154,164 @@
 mtrr_ioctl(struct file *file, unsigned int cmd, unsigned long __arg)
 {
 	int err = 0;
+	const unsigned ioctl_type = _IOC_TYPE(cmd);
+	const unsigned ioctl_dir = _IOC_DIR(cmd);
+	const unsigned ioctl_nr = _IOC_NR(cmd);
+	const unsigned ioctl_size = _IOC_SIZE(cmd);
 	mtrr_type type;
-	struct mtrr_sentry sentry;
-	struct mtrr_gentry gentry;
+	union mtrr_data {
+		struct mtrr_sentry sentry;
+		struct mtrr_gentry gentry;
+#ifdef CONFIG_COMPAT
+		struct mtrr_sentry32 sentry32;
+		struct mtrr_gentry32 gentry32;
+#endif
+	} u;
 	void __user *arg = (void __user *) __arg;

-	switch (cmd) {
-	case MTRRIOC_ADD_ENTRY:
-	case MTRRIOC_SET_ENTRY:
-	case MTRRIOC_DEL_ENTRY:
-	case MTRRIOC_KILL_ENTRY:
-	case MTRRIOC_ADD_PAGE_ENTRY:
-	case MTRRIOC_SET_PAGE_ENTRY:
-	case MTRRIOC_DEL_PAGE_ENTRY:
-	case MTRRIOC_KILL_PAGE_ENTRY:
-		if (copy_from_user(&sentry, arg, sizeof sentry))
-			return -EFAULT;
-		break;
-	case MTRRIOC_GET_ENTRY:
-	case MTRRIOC_GET_PAGE_ENTRY:
-		if (copy_from_user(&gentry, arg, sizeof gentry))
-			return -EFAULT;
+	/* check type and max size */
+	if (ioctl_type != MTRR_IOCTL_BASE || ioctl_size > sizeof(u))
+		return -ENOTTY;
+
+	/* copy from user */
+	if (ioctl_dir & _IOC_WRITE && copy_from_user(&u, arg, ioctl_size))
+		return -EFAULT;
+
+	/* check number, direction, size and permission */
+	switch (ioctl_nr) {
+	case _IOC_NR(MTRRIOC_ADD_ENTRY):
+	case _IOC_NR(MTRRIOC_SET_ENTRY):
+	case _IOC_NR(MTRRIOC_DEL_ENTRY):
+	case _IOC_NR(MTRRIOC_KILL_ENTRY):
+	case _IOC_NR(MTRRIOC_ADD_PAGE_ENTRY):
+	case _IOC_NR(MTRRIOC_SET_PAGE_ENTRY):
+	case _IOC_NR(MTRRIOC_DEL_PAGE_ENTRY):
+	case _IOC_NR(MTRRIOC_KILL_PAGE_ENTRY):
+		if (ioctl_dir != _IOC_WRITE)
+			return -ENOTTY;
+		switch (ioctl_size) {
+		case sizeof(struct mtrr_sentry):
 		break;
 #ifdef CONFIG_COMPAT
-	case MTRRIOC32_ADD_ENTRY:
-	case MTRRIOC32_SET_ENTRY:
-	case MTRRIOC32_DEL_ENTRY:
-	case MTRRIOC32_KILL_ENTRY:
-	case MTRRIOC32_ADD_PAGE_ENTRY:
-	case MTRRIOC32_SET_PAGE_ENTRY:
-	case MTRRIOC32_DEL_PAGE_ENTRY:
-	case MTRRIOC32_KILL_PAGE_ENTRY: {
-		struct mtrr_sentry32 __user *s32 = (struct mtrr_sentry32 __user *)__arg;
-		err = get_user(sentry.base, &s32->base);
-		err |= get_user(sentry.size, &s32->size);
-		err |= get_user(sentry.type, &s32->type);
-		if (err)
-			return err;
-		break;
-	}
-	case MTRRIOC32_GET_ENTRY:
-	case MTRRIOC32_GET_PAGE_ENTRY: {
-		struct mtrr_gentry32 __user *g32 = (struct mtrr_gentry32 __user *)__arg;
-		err = get_user(gentry.regnum, &g32->regnum);
-		err |= get_user(gentry.base, &g32->base);
-		err |= get_user(gentry.size, &g32->size);
-		err |= get_user(gentry.type, &g32->type);
-		if (err)
-			return err;
+		case sizeof(struct mtrr_sentry32):
+		{
+			struct mtrr_sentry32 s32 = u.sentry32;
+			u.sentry.base = s32.base;
+			u.sentry.size = s32.size;
+			u.sentry.type = s32.type;
+		}
 		break;
-	}
 #endif
+		default:
+			return -ENOTTY;
+		}
+		if (!capable(CAP_SYS_ADMIN))
+			return -EPERM;
+		break;
+	case _IOC_NR(MTRRIOC_GET_ENTRY):
+	case _IOC_NR(MTRRIOC_GET_PAGE_ENTRY):
+		if (ioctl_dir != (_IOC_READ|_IOC_WRITE))
+			return -ENOTTY;
+		switch (ioctl_size) {
+		case sizeof(struct mtrr_gentry):
+		break;
+#ifdef CONFIG_COMPAT
+		case sizeof(struct mtrr_gentry32):
+		{
+			struct mtrr_gentry32 g32 = u.gentry32;
+			u.gentry.base = g32.base;
+			u.gentry.size = g32.size;
+			u.gentry.regnum = g32.regnum;
+			u.gentry.type = g32.type;
+		}
+		break;
+#endif
+		default:
+			return -ENOTTY;
+		}
+		break;
+	default:
+		return -ENOTTY;
 	}

-	switch (cmd) {
+	/* perform command */
+	switch (ioctl_nr) {
 	default:
 		return -ENOTTY;
-	case MTRRIOC_ADD_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
+	case _IOC_NR(MTRRIOC_ADD_ENTRY):
 		err =
-		    mtrr_file_add(sentry.base, sentry.size, sentry.type, 1,
+		    mtrr_file_add(u.sentry.base, u.sentry.size, u.sentry.type, 1,
 				  file, 0);
 		break;
-	case MTRRIOC_SET_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
-		err = mtrr_add(sentry.base, sentry.size, sentry.type, 0);
+	case _IOC_NR(MTRRIOC_SET_ENTRY):
+		err = mtrr_add(u.sentry.base, u.sentry.size, u.sentry.type, 0);
 		break;
-	case MTRRIOC_DEL_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
-		err = mtrr_file_del(sentry.base, sentry.size, file, 0);
+	case _IOC_NR(MTRRIOC_DEL_ENTRY):
+		err = mtrr_file_del(u.sentry.base, u.sentry.size, file, 0);
 		break;
-	case MTRRIOC_KILL_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
-		err = mtrr_del(-1, sentry.base, sentry.size);
+	case _IOC_NR(MTRRIOC_KILL_ENTRY):
+		err = mtrr_del(-1, u.sentry.base, u.sentry.size);
 		break;
-	case MTRRIOC_GET_ENTRY:
-		if (gentry.regnum >= num_var_ranges)
+	case _IOC_NR(MTRRIOC_GET_ENTRY):
+		if (u.gentry.regnum >= num_var_ranges)
 			return -EINVAL;
-		mtrr_if->get(gentry.regnum, &gentry.base, &gentry.size, &type);
+		mtrr_if->get(u.gentry.regnum, &u.gentry.base, &u.gentry.size, &type);

 		/* Hide entries that go above 4GB */
-		if (gentry.base + gentry.size > 0x100000
-		    || gentry.size == 0x100000)
-			gentry.base = gentry.size = gentry.type = 0;
+		if (u.gentry.base + u.gentry.size > 0x100000
+		    || u.gentry.size == 0x100000)
+			u.gentry.base = u.gentry.size = u.gentry.type = 0;
 		else {
-			gentry.base <<= PAGE_SHIFT;
-			gentry.size <<= PAGE_SHIFT;
-			gentry.type = type;
+			u.gentry.base <<= PAGE_SHIFT;
+			u.gentry.size <<= PAGE_SHIFT;
+			u.gentry.type = type;
 		}

 		break;
-	case MTRRIOC_ADD_PAGE_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
+	case _IOC_NR(MTRRIOC_ADD_PAGE_ENTRY):
 		err =
-		    mtrr_file_add(sentry.base, sentry.size, sentry.type, 1,
+		    mtrr_file_add(u.sentry.base, u.sentry.size, u.sentry.type, 1,
 				  file, 1);
 		break;
-	case MTRRIOC_SET_PAGE_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
-		err = mtrr_add_page(sentry.base, sentry.size, sentry.type, 0);
+	case _IOC_NR(MTRRIOC_SET_PAGE_ENTRY):
+		err = mtrr_add_page(u.sentry.base, u.sentry.size, u.sentry.type, 0);
 		break;
-	case MTRRIOC_DEL_PAGE_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
-		err = mtrr_file_del(sentry.base, sentry.size, file, 1);
+	case _IOC_NR(MTRRIOC_DEL_PAGE_ENTRY):
+		err = mtrr_file_del(u.sentry.base, u.sentry.size, file, 1);
 		break;
-	case MTRRIOC_KILL_PAGE_ENTRY:
-		if (!capable(CAP_SYS_ADMIN))
-			return -EPERM;
-		err = mtrr_del_page(-1, sentry.base, sentry.size);
+	case _IOC_NR(MTRRIOC_KILL_PAGE_ENTRY):
+		err = mtrr_del_page(-1, u.sentry.base, u.sentry.size);
 		break;
-	case MTRRIOC_GET_PAGE_ENTRY:
-		if (gentry.regnum >= num_var_ranges)
+	case _IOC_NR(MTRRIOC_GET_PAGE_ENTRY):
+		if (u.gentry.regnum >= num_var_ranges)
 			return -EINVAL;
-		mtrr_if->get(gentry.regnum, &gentry.base, &gentry.size, &type);
-		gentry.type = type;
+		mtrr_if->get(u.gentry.regnum, &u.gentry.base, &u.gentry.size, &type);
+		u.gentry.type = type;
 		break;
 	}

 	if (err)
 		return err;

-	switch(cmd) {
-	case MTRRIOC_GET_ENTRY:
-	case MTRRIOC_GET_PAGE_ENTRY:
-		if (copy_to_user(arg, &gentry, sizeof gentry))
-			err = -EFAULT;
-		break;
+	/* copy to user */
+	if (ioctl_dir & _IOC_READ) {
+		switch (ioctl_size) {
 #ifdef CONFIG_COMPAT
-	case MTRRIOC32_GET_ENTRY:
-	case MTRRIOC32_GET_PAGE_ENTRY: {
-		struct mtrr_gentry32 __user *g32 = (struct mtrr_gentry32 __user *)__arg;
-		err = put_user(gentry.base, &g32->base);
-		err |= put_user(gentry.size, &g32->size);
-		err |= put_user(gentry.regnum, &g32->regnum);
-		err |= put_user(gentry.type, &g32->type);
+		case sizeof(struct mtrr_gentry32):
+		{
+			struct mtrr_gentry g64 = u.gentry;
+			u.gentry32.base = g64.base;
+			u.gentry32.size = g64.size;
+			u.gentry32.regnum = g64.regnum;
+			u.gentry32.type = g64.type;
+		}
 		break;
-	}
 #endif
+		default:
+		break;
+		}
+		if (copy_to_user(arg, &u, ioctl_size))
+			err = -EFAULT;
 	}
 	return err;
 }
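
The idea behind this version is that the 32-bit and 64-bit commands share the same type and number and differ only in the encoded argument size, so one switch on the number plus a check of the size covers both ABIs. A minimal userspace sketch of that decoding follows. The DEMO_ macros mirror what I believe is the generic ioctl field layout used on x86 (8-bit number, 8-bit type, 14-bit size, 2-bit direction); the structures and demo_sentry_abi() are hypothetical simplifications, not the real mtrr definitions.

```c
#include <stdint.h>

/* Field layout assumed to match the generic Linux ioctl encoding. */
#define DEMO_NRSHIFT	0
#define DEMO_TYPESHIFT	8
#define DEMO_SIZESHIFT	16
#define DEMO_DIRSHIFT	30
#define DEMO_WRITE	1u
#define DEMO_READ	2u

#define DEMO_IOC(dir, type, nr, size)				\
	(((unsigned int)(dir)  << DEMO_DIRSHIFT)  |		\
	 ((unsigned int)(type) << DEMO_TYPESHIFT) |		\
	 ((unsigned int)(nr)   << DEMO_NRSHIFT)   |		\
	 ((unsigned int)(size) << DEMO_SIZESHIFT))

#define DEMO_DIR(cmd)	(((cmd) >> DEMO_DIRSHIFT) & 0x3u)
#define DEMO_TYPE(cmd)	(((cmd) >> DEMO_TYPESHIFT) & 0xffu)
#define DEMO_NR(cmd)	(((cmd) >> DEMO_NRSHIFT) & 0xffu)
#define DEMO_SIZE(cmd)	(((cmd) >> DEMO_SIZESHIFT) & 0x3fffu)

/* Simplified stand-ins for the native and compat argument layouts. */
struct demo_sentry64 { uint64_t base; uint32_t size; uint32_t type; };
struct demo_sentry32 { uint32_t base; uint32_t size; uint32_t type; };

/*
 * Classify a write-only 'M' command by its encoded argument size.
 * Returns 64 or 32 for the two layouts, -1 for anything else.
 */
static int demo_sentry_abi(unsigned int cmd)
{
	if (DEMO_TYPE(cmd) != 'M' || DEMO_DIR(cmd) != DEMO_WRITE)
		return -1;

	switch (DEMO_SIZE(cmd)) {
	case sizeof(struct demo_sentry64):
		return 64;
	case sizeof(struct demo_sentry32):
		return 32;
	default:
		return -1;
	}
}
```

This only works because the two layouts have different sizes; if a compat structure happened to have the same size as the native one, the size field could not tell them apart.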

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [patches] [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected
  2007-02-12  8:04     ` Andi Kleen
  2007-02-12  8:11       ` Bauke Jan Douma
@ 2007-02-12 15:05       ` Pavel Machek
  1 sibling, 0 replies; 47+ messages in thread
From: Pavel Machek @ 2007-02-12 15:05 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Oliver Neukum, patches, linux-kernel

On Mon 2007-02-12 09:04:43, Andi Kleen wrote:
> On Monday 12 February 2007 08:54, Oliver Neukum wrote:
> > Am Montag, 12. Februar 2007 08:38 schrieb Andi Kleen:
> > > When a machine check event is detected (including a AMD RevF threshold 
> > > overflow event) allow to run a "trigger" program. This allows user space
> > > to react to such events sooner.
> > 
> > Could this not be merged with other reporting mechanisms? This looks like
> > a new incarnation of the /etc/hotplug code.
> 
> I refuse to make mcelog depend on dbus. Just because some desktops want bloat
> doesn't mean that everybody else wants too.

Hmm... fix the userspace, no? /etc/hotplug code in kernel should work
with other stuff than dbus...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH x86 for review II] [34/39] i386: Use stack arguments for calling into EFI
  2007-02-12  7:38 ` [PATCH x86 for review II] [34/39] i386: Use stack arguments for calling into EFI Andi Kleen
@ 2007-02-12 19:45   ` Frédéric RISS
  0 siblings, 0 replies; 47+ messages in thread
From: Frédéric RISS @ 2007-02-12 19:45 UTC (permalink / raw)
  To: Andi Kleen; +Cc: patches, linux-kernel

On Monday, 12 February 2007 at 08:38 +0100, Andi Kleen wrote:
> When calling into the EFI firmware, the parameters need to be passed on
> the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
> This patch is needed to allow the new Intel-based Macs to suspend to ram
> (efi.get_time is called during the suspend phase).
> 
> Signed-off-by: Frederic Riss <frederic.riss@gmail.com>
> Signed-off-by: Andi Kleen <ak@suse.de>

For 2.6.20, Linus merged a different version touching only
arch/i386/kernel/efi.c. If you really prefer this version, I guess it should
be discussed more thoroughly with him and the EFI guys.

Fred.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH x86 for review II] [26/39] i386: fix 32-bit ioctls on x64_32
  2007-02-12 13:24   ` Giuliano Procida
@ 2007-02-12 22:28     ` Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2007-02-12 22:28 UTC (permalink / raw)
  To: Giuliano Procida; +Cc: patches, linux-kernel

On Monday 12 February 2007 14:24, Giuliano Procida wrote:
> This is a nicer version of the MTRR compatibilty ioctl patch, compiles
> smaller and also tested.

Perhaps nicer, but it doesn't apply. I will stay with the old version for now.

Applying patch patches/fix-32-bit-ioctls-on-x64_32
patching file arch/i386/kernel/cpu/mtrr/if.c
Hunk #1 FAILED at 154.
1 out of 1 hunk FAILED -- rejects in file arch/i386/kernel/cpu/mtrr/if.c


-Andi


* Re: [PATCH x86 for review II] [13/39] i386: CONFIG_PHYSICAL_ALIGN limited to 4M?
  2007-02-12  7:38 ` [PATCH x86 for review II] [13/39] i386: CONFIG_PHYSICAL_ALIGN limited to 4M? Andi Kleen
@ 2007-02-13  6:36   ` Rene Herman
  0 siblings, 0 replies; 47+ messages in thread
From: Rene Herman @ 2007-02-13  6:36 UTC (permalink / raw)
  To: Andi Kleen; +Cc: patches, linux-kernel

On 02/12/2007 08:38 AM, Andi Kleen wrote:

> From: Rene Herman <rene.herman@gmail.com>

[ ... ]

> --- linux.orig/arch/i386/Kconfig
> +++ linux/arch/i386/Kconfig
> @@ -843,7 +843,7 @@ config RELOCATABLE
>  config PHYSICAL_ALIGN
>  	hex "Alignment value to which kernel should be aligned"
>  	default "0x100000"
> -	range 0x2000 0x400000
> +	range 0x2000 0x1000000
>  	help
>  	  This value puts the alignment restrictions on physical address
>   	  where kernel is loaded and run from. Kernel is compiled for an

Okay I guess, but in reply to this, Vivek Goyal pointed out a patch of
his, already in -mm, that restores CONFIG_PHYSICAL_START instead, since it
seems Xen also wanted it. That also better matches what I wanted it for:

http://lkml.org/lkml/2007/1/2/376

Rene.
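
As an illustration of what the widened range in the quoted hunk permits, here
is a small sketch. The helper names are hypothetical and this is not kernel
code. It also assumes the alignment is a power of two, which the Kconfig range
itself does not enforce.

```c
#include <stdint.h>

/* Bounds taken from the quoted Kconfig hunk (upper bound after the patch). */
#define PHYSICAL_ALIGN_MIN 0x2000u
#define PHYSICAL_ALIGN_MAX 0x1000000u

/* Returns 1 if align lies in the Kconfig range and is a power of two.
 * The power-of-two requirement is an assumption of this sketch. */
static int physical_align_ok(uint32_t align)
{
	return align >= PHYSICAL_ALIGN_MIN &&
	       align <= PHYSICAL_ALIGN_MAX &&
	       (align & (align - 1)) == 0;
}

/* Rounds addr up to the next multiple of align (align must be a power
 * of two). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

With the old upper bound of 0x400000, an 8M alignment (0x800000) would have
been rejected by Kconfig; with the new bound it is accepted.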



Thread overview: 47+ messages
2007-02-12  7:37 [PATCH x86 for review II] [1/39] i386: move startup_32() in text.head section Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [2/39] x86_64: Break init() in two parts to avoid MODPOST warnings Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [3/39] i386: arch/i386/kernel/cpu/mcheck/mce.c should #include <asm/mce.h> Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [4/39] i386: add idle notifier Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [5/39] i386: improve sched_clock() on i686 Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [6/39] i386: romsignature/checksum cleanup Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [7/39] x86_64: Fix fake numa for x86_64 machines with big IO hole Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [8/39] x86_64: Remove fastcall references in x86_64 code Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [9/39] x86_64: Use constant instead of raw number in x86_64 ioperm.c Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [10/39] x86_64: Handle 32 bit PerfMon Counter writes cleanly in x86_64 nmi_watchdog Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [11/39] i386: Handle 32 bit PerfMon Counter writes cleanly in i386 nmi_watchdog Andi Kleen
2007-02-12  7:37 ` [PATCH x86 for review II] [12/39] i386: Handle 32 bit PerfMon Counter writes cleanly in oprofile Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [13/39] i386: CONFIG_PHYSICAL_ALIGN limited to 4M? Andi Kleen
2007-02-13  6:36   ` Rene Herman
2007-02-12  7:38 ` [PATCH x86 for review II] [14/39] x86_64: cleanup Doc/x86_64/ files Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [15/39] x86_64: list x86_64 quilt tree Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [16/39] x86: simplify notify_page_fault() Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [17/39] x86_64: Tighten mce_amd driver MSR reads Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [18/39] x86_64: Allow to run a program when a machine check event is detected Andi Kleen
2007-02-12  7:54   ` Oliver Neukum
2007-02-12  8:04     ` Andi Kleen
2007-02-12  8:11       ` Bauke Jan Douma
2007-02-12 15:05       ` [patches] " Pavel Machek
2007-02-12  7:38 ` [PATCH x86 for review II] [19/39] x86_64: remove get_pmd() Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [20/39] i386: Small cleanup to TLB flush code Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [21/39] i386: rdmsr_on_cpu, wrmsr_on_cpu Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [22/39] x86_64: Kconfig typos Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [23/39] i386: use smp_call_function_single() Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [24/39] " Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [25/39] x86_64: Fix preprocessor condition Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [26/39] i386: fix 32-bit ioctls on x64_32 Andi Kleen
2007-02-12 13:24   ` Giuliano Procida
2007-02-12 22:28     ` Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [27/39] i386: APM on i386 Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [28/39] i386: fix size_or_mask and size_and_mask Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [29/39] x86_64: - Ignore long SMI interrupts in clock calibration code - update 1 Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [30/39] x86_64: Check return value of putreg in PTRACE_SETREGS Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [31/39] x86_64: Unexport __supported_pte_mask Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [32/39] x86_64: x86_64 - Fix FS/GS registers for VT execution Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [33/39] x86_64: Fix off by one error in IOMMU boundary checking Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [34/39] i386: Use stack arguments for calling into EFI Andi Kleen
2007-02-12 19:45   ` Frédéric RISS
2007-02-12  7:38 ` [PATCH x86 for review II] [35/39] x86_64: Don't reserve ROMs Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [36/39] x86_64: define dma noncoherent API functions Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [37/39] x86_64: robustify bad_dma_address handling Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [38/39] x86: fix laptop bootup hang in init_acpi() Andi Kleen
2007-02-12  7:38 ` [PATCH x86 for review II] [39/39] i386: All Transmeta CPUs have constant TSCs Andi Kleen
