* Re: 2.6.38-rc1 problems with khugepaged
2011-01-21 19:17 ` Andrea Arcangeli
@ 2011-01-21 21:14 ` Steven Rostedt
2011-01-22 2:12 ` [PATCH] thp: fix PARAVIRT x86 32bit noPAE Andrea Arcangeli
1 sibling, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2011-01-21 21:14 UTC
To: Andrea Arcangeli
Cc: werner, akpm, hannes, idryomov, linux-kernel, torvalds,
Minchan Kim, Johannes Weiner
[ annoying plug ]
On Fri, 2011-01-21 at 20:17 +0100, Andrea Arcangeli wrote:
> On Fri, Jan 21, 2011 at 06:43:44AM -0400, werner wrote:
> > 2.6.38-rc1-git1 runs normally with khugepaged switched
> > OFF. Thus, khugepaged is the problem; as said if
> > switched it on, during boot the computer becomes very slow
> > and sticks almost. wl
>
> Sorry, looks like x86 32bit THP wasn't tested in all possible .config
> (32bit support is upstream-only feature and because of lower userbase
> it also got a lot less testing than 64bit support).
>
> Does it make any difference if you both disable CONFIG_PARAVIRT and
> enable HIGHMEM64?
>
> For now keep THP off on 32bit, it should be trivial to reproduce so
> fix will come soon.
>
> Minchan info shows it's an 32bit arch bug so don't be too worried
> about it because it can't affect 64bit builds and it's not common code
> bug and we'll fix it ASAP.
Just want to mention that ktest.pl was included in this release, just
for this purpose :)
It does randconfigs (and other nice features) and will test various
kernels. All you need is to have a test machine that can be remotely
booted and have its console read via stdio from another machine.
This happened to be how I triggered the bug.
I would be happy to help anyone set it up. Below is the config I used to
have my box do 10 runs of randconfig with a minimal config to make it boot,
and then 10 runs of randconfig with the minimal config to make it boot
and enable networking:
-----
MACHINE = mitest
# The TEST_START can override most defaults
# do ten boot tests (as few forced config settings as possible)
TEST_START ITERATE 10
TEST_TYPE = boot
BUILD_TYPE = randconfig
MIN_CONFIG = /home/rostedt/work/autotest/configs/mitest/config-mitest-min
CHECKOUT = origin/master
# do ten boot and ssh hackbench tests (need network configs set)
TEST_START ITERATE 10
TEST_TYPE = TEST
BUILD_TYPE = randconfig
MIN_CONFIG = /home/rostedt/work/autotest/configs/mitest/config-mitest-net
CHECKOUT = origin/master
TEST = ssh root@mitest /work/c/hackbench_32 50
# defaults to use when watching test (SKIP will make ktest ignore it)
DEFAULTS SKIP
REBOOT_ON_ERROR = 1
POWEROFF_ON_ERROR = 0
POWEROFF_ON_SUCCESS = 0
REBOOT_ON_SUCCESS = 1
DIE_ON_FAILURE = 1
#defaults to kick off at night, and shut down at the end
DEFAULTS
REBOOT_ON_ERROR = 0
POWEROFF_ON_ERROR = 1
POWEROFF_ON_SUCCESS = 1
REBOOT_ON_SUCCESS = 0
DIE_ON_FAILURE = 0
STORE_FAILURES = /home/rostedt/work/autotest/nobackup/failures
# defaults for all tests
DEFAULTS
POWEROFF_AFTER_HALT = 60
CLEAR_LOG = 1
MIN_CONFIG = /home/rostedt/work/autotest/configs/mitest/config-mitest-min
SSH_USER = root
BUILD_DIR = /home/rostedt/work/autotest/nobackup/linux-test.git
OUTPUT_DIR = /home/rostedt/work/autotest/nobackup/mitest
BUILD_TARGET = arch/x86/boot/bzImage
TARGET_IMAGE = /boot/vmlinuz-test
POWER_CYCLE = /home/rostedt/work/autotest/cycle-mxtest
CONSOLE = nc -d fedora 3001
LOCALVERSION = -test
GRUB_MENU = Test Kernel
MAKE_CMD = distmake-32 ARCH=i386
POWER_OFF = /home/rostedt/work/autotest/poweroff-mxtest
BUILD_OPTIONS = -j20
LOG_FILE = /home/rostedt/work/autotest/nobackup/mitest/mitest.log
TEST = ssh root@mitest cat /debug/tracing/trace
ADD_CONFIG = /home/rostedt/work/autotest/configs/config-broken /home/rostedt/work/autotest/config-general
-----
-- Steve
* [PATCH] thp: fix PARAVIRT x86 32bit noPAE
2011-01-21 19:17 ` Andrea Arcangeli
2011-01-21 21:14 ` Steven Rostedt
@ 2011-01-22 2:12 ` Andrea Arcangeli
2011-01-22 3:18 ` Minchan Kim
1 sibling, 1 reply; 5+ messages in thread
From: Andrea Arcangeli @ 2011-01-22 2:12 UTC
To: werner
Cc: akpm, hannes, idryomov, rostedt, linux-kernel, torvalds,
Minchan Kim, Johannes Weiner
I reproduced your 32bit x86 problem with an initramfs that has a hugepage
benchmark placed as /init inside KVM (not exactly an extensive test,
but it was trivial to reproduce). I tested the fix below with HIGHMEM4G
and HIGHMEM64G with PARAVIRT=y. I also verified that setting PARAVIRT=n
fixed it, and in another email Minchan verified that setting HIGHMEM64G=y
also fixed it, which confirms my theory that the problem was the #ifdef below.
The stress test now reaches exit() successfully (before the fix it
crashed in out_of_memory() instead of reaching do_exit as expected):
NMI watchdog disabled for cpu0: unable to create perf event: -2
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Not tainted 2.6.37+ #17
Call Trace:
[<c14351d2>] ? panic+0x57/0x161
[<c107439b>] ? trace_hardirqs_on+0xb/0x10
[<c1045321>] ? do_exit+0x6b1/0x6e0
[<c10728ab>] ? trace_hardirqs_off+0xb/0x10
[<c1045389>] ? do_group_exit+0x39/0xa0
[<c1045403>] ? sys_exit_group+0x13/0x20
[<c1002d9c>] ? sysenter_do_call+0x12/0x3c
QEMU 0.12.1 monitor - type 'help' for more information
(qemu) quit
====
Subject: thp: fix PARAVIRT x86 32bit noPAE
From: Andrea Arcangeli <aarcange@redhat.com>
This fixes TRANSPARENT_HUGEPAGE=y with PARAVIRT=y and HIGHMEM64G=n.
The #ifdef that this patch removes was erroneously introduced to fix a build
error for noPAE (where pmd.pmd doesn't exist). The kernel then built, but it
failed at runtime because set_pmd_at was a noop. This patch corrects that by
enabling set_pmd_at for noPAE mode too.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 2071a8b..ebbc4d8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -558,13 +558,12 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 			      pmd_t *pmdp, pmd_t pmd)
 {
-#if PAGETABLE_LEVELS >= 3
 	if (sizeof(pmdval_t) > sizeof(long))
 		/* 5 arg words */
 		pv_mmu_ops.set_pmd_at(mm, addr, pmdp, pmd);
 	else
-		PVOP_VCALL4(pv_mmu_ops.set_pmd_at, mm, addr, pmdp, pmd.pmd);
-#endif
+		PVOP_VCALL4(pv_mmu_ops.set_pmd_at, mm, addr, pmdp,
+			    native_pmd_val(pmd));
 }
 #endif
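
[ Illustration, not part of the patch. The failure mode described in the
changelog (a setter whose body is compiled out by an #if, so the build
succeeds but the store never happens) can be modelled in a few lines of
userspace C. This is only a rough sketch under stated assumptions: every
model_* name, MODEL_LEVELS, set_pmd_guarded and set_pmd_fixed are
hypothetical and are not the kernel's definitions. The nested struct only
imitates the idea that a folded pmd has no .pmd member, which is why an
accessor is used instead of direct member access. ]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PAGETABLE_LEVELS; 2 models the noPAE case. */
#ifndef MODEL_LEVELS
#define MODEL_LEVELS 2
#endif

typedef uint32_t model_pmdval_t;

#if MODEL_LEVELS >= 3
/* 3-level model: the value lives directly in a .pmd member. */
typedef struct { model_pmdval_t pmd; } model_pmd_t;

static inline model_pmdval_t model_pmd_val(model_pmd_t p)
{
	return p.pmd;
}

static inline model_pmd_t model_make_pmd(model_pmdval_t v)
{
	model_pmd_t p = { v };
	return p;
}
#else
/* 2-level model: the pmd is folded into the upper levels, no .pmd member. */
typedef struct { model_pmdval_t pgd; } model_pgd_t;
typedef struct { model_pgd_t pgd; } model_pud_t;
typedef struct { model_pud_t pud; } model_pmd_t;

static inline model_pmdval_t model_pmd_val(model_pmd_t p)
{
	return p.pud.pgd.pgd;
}

static inline model_pmd_t model_make_pmd(model_pmdval_t v)
{
	model_pmd_t p = { { { v } } };
	return p;
}
#endif

/* "Before": the guard hides the .pmd access, but also the whole store. */
static inline void set_pmd_guarded(model_pmd_t *pmdp, model_pmd_t pmd)
{
#if MODEL_LEVELS >= 3
	pmdp->pmd = pmd.pmd;
#else
	(void)pmdp;	/* body compiled out: builds fine, stores nothing */
	(void)pmd;
#endif
}

/* "After": go through the accessor, which exists for both layouts. */
static inline void set_pmd_fixed(model_pmd_t *pmdp, model_pmd_t pmd)
{
	*pmdp = model_make_pmd(model_pmd_val(pmd));
}

/* Self-check: the guarded variant drops the store in the 2-level model. */
static inline void model_selfcheck(void)
{
	model_pmd_t slot = model_make_pmd(0);

	set_pmd_guarded(&slot, model_make_pmd(0x1234));
#if MODEL_LEVELS >= 3
	assert(model_pmd_val(slot) == 0x1234);
#else
	assert(model_pmd_val(slot) == 0);	/* silently unchanged */
#endif
	set_pmd_fixed(&slot, model_make_pmd(0x1234));
	assert(model_pmd_val(slot) == 0x1234);
}
```

[ Calling model_selfcheck() from a test program exercises both variants.
Building with -DMODEL_LEVELS=3 selects the 3-level model, where the guarded
variant stores the value as well. ]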