* -next20181010 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-10 19:59 UTC
To: kernel list, tglx, mingo, bp, hpa, x86

Hi!

I updated to today's next... and boot crashes with

..
Call Trace:
kick_ilb
trigger_load_balance
? active_load..
scheduler_tick
update_process_times
tick_nohz_handler

-next20181005 worked ok.

Tell me if x86-32 works for you...

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: -next20181010 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-10 20:03 UTC
To: kernel list, tglx, mingo, bp, hpa, x86

Hi!

> I updated to today's next... and boot crashes with
>
> ..
> Call Trace:
> kick_ilb
> trigger_load_balance
> ? active_load..
> scheduler_tick
> update_process_times
> tick_nohz_handler
>
> -next20181005 worked ok.

The backtrace indicates a problem with nohz, so I added idle=poll. Now I have

Run /sbin/init as init process
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: pgd_alloc...
...
dump_stack
....
pgd_alloc
mm_init.isra.
mm_alloc
__do_execve_file
do_execve
...

Good ideas welcome :-).

Pavel

* -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-11 18:03 UTC
To: kernel list, tglx, mingo, bp, hpa, x86

On Wed 2018-10-10 22:03:32, Pavel Machek wrote:
> Hi!
>
> > I updated to today's next... and boot crashes with
> >
> > ..
> > Call Trace:
> > kick_ilb
> > trigger_load_balance
> > ? active_load..
> > scheduler_tick
> > update_process_times
> > tick_nohz_handler
> >
> > -next20181005 worked ok.

Problem is still there in today's next.

Pavel

* Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Thomas Gleixner @ 2018-10-11 20:09 UTC
To: Pavel Machek
Cc: kernel list, mingo, bp, hpa, x86

On Thu, 11 Oct 2018, Pavel Machek wrote:
> On Wed 2018-10-10 22:03:32, Pavel Machek wrote:
> > Hi!
> >
> > > I updated to today's next... and boot crashes with
> > >
> > > ..
> > > Call Trace:
> > > kick_ilb
> > > trigger_load_balance
> > > ? active_load..
> > > scheduler_tick
> > > update_process_times
> > > tick_nohz_handler
> > >
> > > -next20181005 worked ok.
>
> Problem is still there in today's next.

So what came in between -next20181005 and the first bad one? kernel/sched/*
being the first place to look at.

Thanks,

tglx

* Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-12 10:24 UTC
To: Thomas Gleixner, sfr
Cc: kernel list, mingo, bp, hpa, x86

Hi!

> > > > I updated to today's next... and boot crashes with
> > > >
> > > > ..
> > > > Call Trace:
> > > > kick_ilb
> > > > trigger_load_balance
> > > > ? active_load..
> > > > scheduler_tick
> > > > update_process_times
> > > > tick_nohz_handler
> > > >
> > > > -next20181005 worked ok.
> >
> > Problem is still there in today's next.
>
> So what came in between -next20181005 and the first bad one? kernel/sched/*
> being the first place to look at.

kernel/sched does not seem to contain anything too scary.

I know that -next20181005 works ok, and I know -next20181010 is
bad. Is there an easy way to bisect using that information? I can do
bisect between -next and mainline, but that's a lot of patches and
thus not much fun :-(.

In the meantime, I reproduced the failure with T40p. Is there someone
with working x86-32 in -next?

Pavel

* Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Ingo Molnar @ 2018-10-12 10:52 UTC
To: Pavel Machek
Cc: Thomas Gleixner, sfr, kernel list, mingo, bp, hpa, x86

* Pavel Machek <pavel@ucw.cz> wrote:

> Hi!
>
> > > > > I updated to today's next... and boot crashes with
> > > > >
> > > > > ..
> > > > > Call Trace:
> > > > > kick_ilb
> > > > > trigger_load_balance
> > > > > ? active_load..
> > > > > scheduler_tick
> > > > > update_process_times
> > > > > tick_nohz_handler
> > > > >
> > > > > -next20181005 worked ok.
> > >
> > > Problem is still there in today's next.
> >
> > So what came in between -next20181005 and the first bad one? kernel/sched/*
> > being the first place to look at.
>
> kernel/sched does not seem to contain anything too scary.
>
> I know that -next20181005 works ok, and I know -next20181010 is
> bad. Is there an easy way to bisect using that information? I can do
> bisect between -next and mainline, but that's a lot of patches and
> thus not much fun :-(.
>
> In the meantime, I reproduced the failure with T40p. Is there someone
> with working x86-32 in -next?

Does latest -tip fail too? If yes then I suspect bisection would be needed.

Thanks,

Ingo

* Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-12 12:35 UTC
To: Ingo Molnar
Cc: Thomas Gleixner, sfr, kernel list, mingo, bp, hpa, x86

Hi!

> > > > Problem is still there in today's next.
> > >
> > > So what came in between -next20181005 and the first bad one? kernel/sched/*
> > > being the first place to look at.
> >
> > kernel/sched does not seem to contain anything too scary.
> >
> > I know that -next20181005 works ok, and I know -next20181010 is
> > bad. Is there an easy way to bisect using that information? I can do
> > bisect between -next and mainline, but that's a lot of patches and
> > thus not much fun :-(.
> >
> > In the meantime, I reproduced the failure with T40p. Is there someone
> > with working x86-32 in -next?
>
> Does latest -tip fail too? If yes then I suspect bisection would be needed.

I already started bisect on -next (T40p is my test machine, so bisect
is not that bad). The log so far is:

Pavel

# bad: [771b65e89c8a51d611b8049718693a4202e4f732] Add linux-next specific files for 20181011
# good: [7876320f88802b22d4e2daf7eb027dd14175a0f8] Linux 4.19-rc4
git bisect start '771b65e89c8a51d611b8049718693a4202e4f732' '7876320f88802b22d4e2daf7eb027dd14175a0f8'
# good: [43faff25da004eabce691268da34065b3690f5ca] Merge remote-tracking branch 'net-next/master'
git bisect good 43faff25da004eabce691268da34065b3690f5ca
# good: [3e2beb7db82a880319aa2f0dcafa820f3f5206d3] Merge remote-tracking branch 'spi/for-next'
git bisect good 3e2beb7db82a880319aa2f0dcafa820f3f5206d3
# bad: [74411e5fd30ae540491c4d6142af6ee6b2b22f09] Merge remote-tracking branch 'char-misc/char-misc-next'
git bisect bad 74411e5fd30ae540491c4d6142af6ee6b2b22f09
# bad: [c810d907775aa2aa753e836a122613fd2416b14d] Merge remote-tracking branch 'kvm/linux-next'
git bisect bad c810d907775aa2aa753e836a122613fd2416b14d
# good: [fac07d2ba7b2764e3002ff9bc7861742a84a2ef6] Merge branch 'perf/core'
git bisect good fac07d2ba7b2764e3002ff9bc7861742a84a2ef6
# bad: [d74865bd3996c7a6f3e8ce6e626c1fe474e39494] Merge branch 'x86/mm'
git bisect bad d74865bd3996c7a6f3e8ce6e626c1fe474e39494
# good: [dcd2d0cece1608b2be9184786c900807ec947076] Merge branch 'x86/asm'
git bisect good dcd2d0cece1608b2be9184786c900807ec947076

* Avoid VLA in pgd_alloc kills boot on 32-bit machines was Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-12 18:10 UTC
To: Ingo Molnar, arnd, akpm, luto, dave.hansen, jroedel, keescook, torvalds, toshi.kani
Cc: Thomas Gleixner, sfr, kernel list, mingo, bp, hpa, x86

Hi!

> > > So what came in between -next20181005 and the first bad one? kernel/sched/*
> > > being the first place to look at.
> >
> > kernel/sched does not seem to contain anything too scary.
> >
> > I know that -next20181005 works ok, and I know -next20181010 is
> > bad. Is there an easy way to bisect using that information? I can do
> > bisect between -next and mainline, but that's a lot of patches and
> > thus not much fun :-(.
> >
> > In the meantime, I reproduced the failure with T40p. Is there someone
> > with working x86-32 in -next?
>
> Does latest -tip fail too? If yes then I suspect bisection would be needed.

And the winner is...

[1be3f247c2882a82279cbcf43717581ea943b692] x86/mm: Avoid VLA in
pgd_alloc()

"Kernel stack is corrupted in: pgd_alloc" panic kind of suggests this
is the right commit.

* Re: Avoid VLA in pgd_alloc kills boot on 32-bit machines was Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Borislav Petkov @ 2018-10-12 18:13 UTC
To: Pavel Machek
Cc: Ingo Molnar, arnd, akpm, luto, dave.hansen, jroedel, keescook, torvalds, toshi.kani, Thomas Gleixner, sfr, kernel list, mingo, hpa, x86

On Fri, Oct 12, 2018 at 08:10:11PM +0200, Pavel Machek wrote:
> And the winner is...
>
> [1be3f247c2882a82279cbcf43717581ea943b692] x86/mm: Avoid VLA in
> pgd_alloc()

That should be fixed now:

https://git.kernel.org/tip/184d47f0fd365108bd06ab26cdb3450b716269fd

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

* Re: Avoid VLA in pgd_alloc kills boot on 32-bit machines was Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-12 18:57 UTC
To: Borislav Petkov
Cc: Ingo Molnar, arnd, akpm, luto, dave.hansen, jroedel, keescook, torvalds, toshi.kani, Thomas Gleixner, sfr, kernel list, mingo, hpa, x86

On Fri 2018-10-12 20:13:35, Borislav Petkov wrote:
> On Fri, Oct 12, 2018 at 08:10:11PM +0200, Pavel Machek wrote:
> > And the winner is...
> >
> > [1be3f247c2882a82279cbcf43717581ea943b692] x86/mm: Avoid VLA in
> > pgd_alloc()
>
> That should be fixed now:
>
> https://git.kernel.org/tip/184d47f0fd365108bd06ab26cdb3450b716269fd

Aha, ok. -next20181012 indeed works.

Best regards,
Pavel

* Re: Avoid VLA in pgd_alloc kills boot on 32-bit machines was Re: -next20181010,1011 regression: thinkpad x60 (32 bit) dies during boot.
From: Pavel Machek @ 2018-10-12 18:22 UTC
To: Ingo Molnar, arnd, akpm, luto, dave.hansen, jroedel, keescook, torvalds, toshi.kani
Cc: Thomas Gleixner, sfr, kernel list, mingo, bp, hpa, x86

On Fri 2018-10-12 20:10:11, Pavel Machek wrote:
> Hi!
>
> > > > So what came in between -next20181005 and the first bad one? kernel/sched/*
> > > > being the first place to look at.
> > >
> > > kernel/sched does not seem to contain anything too scary.
> > >
> > > I know that -next20181005 works ok, and I know -next20181010 is
> > > bad. Is there an easy way to bisect using that information? I can do
> > > bisect between -next and mainline, but that's a lot of patches and
> > > thus not much fun :-(.
> > >
> > > In the meantime, I reproduced the failure with T40p. Is there someone
> > > with working x86-32 in -next?
> >
> > Does latest -tip fail too? If yes then I suspect bisection would be needed.
>
> And the winner is...
>
> [1be3f247c2882a82279cbcf43717581ea943b692] x86/mm: Avoid VLA in
> pgd_alloc()
>
> "Kernel stack is corrupted in: pgd_alloc" panic kind of suggests this
> is the right commit.

git bisect log, for reference... and ~ my config.

Pavel

# bad: [771b65e89c8a51d611b8049718693a4202e4f732] Add linux-next specific files for 20181011
# good: [7876320f88802b22d4e2daf7eb027dd14175a0f8] Linux 4.19-rc4
git bisect start '771b65e89c8a51d611b8049718693a4202e4f732' '7876320f88802b22d4e2daf7eb027dd14175a0f8'
# good: [43faff25da004eabce691268da34065b3690f5ca] Merge remote-tracking branch 'net-next/master'
git bisect good 43faff25da004eabce691268da34065b3690f5ca
# good: [3e2beb7db82a880319aa2f0dcafa820f3f5206d3] Merge remote-tracking branch 'spi/for-next'
git bisect good 3e2beb7db82a880319aa2f0dcafa820f3f5206d3
# bad: [74411e5fd30ae540491c4d6142af6ee6b2b22f09] Merge remote-tracking branch 'char-misc/char-misc-next'
git bisect bad 74411e5fd30ae540491c4d6142af6ee6b2b22f09
# bad: [c810d907775aa2aa753e836a122613fd2416b14d] Merge remote-tracking branch 'kvm/linux-next'
git bisect bad c810d907775aa2aa753e836a122613fd2416b14d
# good: [fac07d2ba7b2764e3002ff9bc7861742a84a2ef6] Merge branch 'perf/core'
git bisect good fac07d2ba7b2764e3002ff9bc7861742a84a2ef6
# bad: [d74865bd3996c7a6f3e8ce6e626c1fe474e39494] Merge branch 'x86/mm'
git bisect bad d74865bd3996c7a6f3e8ce6e626c1fe474e39494
# good: [dcd2d0cece1608b2be9184786c900807ec947076] Merge branch 'x86/asm'
git bisect good dcd2d0cece1608b2be9184786c900807ec947076
# bad: [ae9260d80e517c8702b91b8e00d117e1e2834c33] Merge branch 'x86/cache'
git bisect bad ae9260d80e517c8702b91b8e00d117e1e2834c33
# good: [d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2] x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs
git bisect good d5a581d84ae6b8a4a740464b80d8d9cf1e7947b2
# good: [245e5707dd7df01428459d97a9121f14a57dac6b] Merge branch 'x86/build'
git bisect good 245e5707dd7df01428459d97a9121f14a57dac6b
# bad: [7d27cb68cc307ee103e116d357e9baca35151c55] Merge branch 'x86/urgent' into x86/cache
git bisect bad 7d27cb68cc307ee103e116d357e9baca35151c55
# good: [2cc81c6992248ea37d0241bc325977bab310bc3b] x86/intel_rdt: Show missing resctrl mount options
git bisect good 2cc81c6992248ea37d0241bc325977bab310bc3b
# bad: [e8bd1803aec89dfce5758d88022963fe3248bc4c] x86/intel_rdt: Fix out-of-bounds memory access in CBM tests
git bisect bad e8bd1803aec89dfce5758d88022963fe3248bc4c

[-- Attachment #1.2: config.gz --]
[-- Type: application/gzip, Size: 27906 bytes --]

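A bisect log like the one posted above can be saved with `git bisect log` and fed back with `git bisect replay` to restore the same bisection state, for example on another machine. The following is a minimal sketch on a throwaway repository, under the assumption that the log is replayed against a tree containing the same commits; the repository and the "pretend it booted" step are hypothetical.

```shell
#!/bin/sh
# Sketch: save a bisection in progress and restore it from its log.
# The repository below is a throwaway stand-in for a kernel tree.
set -e
repo=$(mktemp -d)
log=$(mktemp)
cd "$repo"
git init -q .
git config user.email bisect@example.invalid
git config user.name "bisect demo"

for i in 1 2 3 4 5 6 7 8; do
    echo "$i" > counter
    git add counter
    git commit -q -m "commit $i"
done

good=$(git rev-list --max-parents=0 HEAD)   # first commit, known good
bad=$(git rev-parse HEAD)                   # tip, known bad

git bisect start "$bad" "$good" >/dev/null
git bisect good >/dev/null        # pretend the first midpoint booted fine
before=$(git rev-parse HEAD)      # revision git wants tested next

# Save the log (same format as the one in the mail), then drop the state.
git bisect log > "$log"
git bisect reset >/dev/null 2>&1

# Replaying the log re-runs the recorded start/good/bad commands.
git bisect replay "$log" >/dev/null
after=$(git rev-parse HEAD)
git bisect reset >/dev/null 2>&1

echo "before: $before"
echo "after:  $after"
```

Both revisions should be the same, since the replay repeats the recorded decisions in order.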
Thread overview: 11+ messages

2018-10-10 19:59 -next20181010 regression: thinkpad x60 (32 bit) dies during boot Pavel Machek
2018-10-10 20:03 ` Pavel Machek
2018-10-11 18:03   ` -next20181010,1011 " Pavel Machek
2018-10-11 20:09     ` Thomas Gleixner
2018-10-12 10:24       ` Pavel Machek
2018-10-12 10:52         ` Ingo Molnar
2018-10-12 12:35           ` Pavel Machek
2018-10-12 18:10           ` Avoid VLA in pgd_alloc kills boot on 32-bit machines was " Pavel Machek
2018-10-12 18:13             ` Borislav Petkov
2018-10-12 18:57               ` Pavel Machek
2018-10-12 18:22             ` Pavel Machek