* 5.15-rc on x86-32: chromium dies with floating point exception
@ 2021-10-17  9:39 Pavel Machek
  2021-10-17 10:25 ` next-20211015: suspend to ram on x86-32 broken Pavel Machek
  2021-10-17 10:41 ` 5.15-rc on x86-32: chromium dies with floating point exception Borislav Petkov
  0 siblings, 2 replies; 9+ messages in thread
From: Pavel Machek @ 2021-10-17  9:39 UTC (permalink / raw)
  To: tglx, mingo, bp, x86, hpa, linux-kernel

Hi!

I have an "interesting" x86-32 system, a ThinkPad X60. Depending on
the kernel version, chromium either works or fails with a floating
point exception.

Working:

Linux amd 5.12.0+ #104 SMP PREEMPT Tue Apr 27 10:31:57 CEST 2021 i686
GNU/Linux

Broken:

Linux amd 5.15.0-rc5-next-20211015+ #204 SMP Sun Oct 17 10:29:18 CEST
2021 i686 GNU/Linux

pavel@amd:~$ chromium  --temp-profile
Floating point exception (core dumped)

Is there any chance to get anything useful from the coredump? Besides
bisection, does someone have any ideas?

Best regards,
								Pavel
-- 
http://www.livejournal.com/~pavelmachek


* next-20211015: suspend to ram on x86-32 broken
  2021-10-17  9:39 5.15-rc on x86-32: chromium dies with floating point exception Pavel Machek
@ 2021-10-17 10:25 ` Pavel Machek
  2021-10-18  7:13   ` Pavel Machek
  2021-10-17 10:41 ` 5.15-rc on x86-32: chromium dies with floating point exception Borislav Petkov
  1 sibling, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2021-10-17 10:25 UTC (permalink / raw)
  To: tglx, mingo, bp, x86, hpa, linux-kernel, rafael, len.brown,
	linux-pm, sfr

Hi!

On Thinkpad X60, suspend to ram no longer works. Suspend is okay, and
there are some signs of resume attempts for a second or so, but the
screen stays black and the sleep LED stays on.

Best regards,
								Pavel
-- 
http://www.livejournal.com/~pavelmachek


* Re: 5.15-rc on x86-32: chromium dies with floating point exception
  2021-10-17  9:39 5.15-rc on x86-32: chromium dies with floating point exception Pavel Machek
  2021-10-17 10:25 ` next-20211015: suspend to ram on x86-32 broken Pavel Machek
@ 2021-10-17 10:41 ` Borislav Petkov
  2021-10-17 12:58   ` Pavel Machek
  1 sibling, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2021-10-17 10:41 UTC (permalink / raw)
  To: Pavel Machek; +Cc: tglx, mingo, x86, hpa, linux-kernel

On Sun, Oct 17, 2021 at 11:39:05AM +0200, Pavel Machek wrote:
> does someone have any ideas?

Fix just went to Linus:

https://git.kernel.org/tip/b2381acd3fd9bacd2c63f53b2c610c89959b31cc

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: 5.15-rc on x86-32: chromium dies with floating point exception
  2021-10-17 10:41 ` 5.15-rc on x86-32: chromium dies with floating point exception Borislav Petkov
@ 2021-10-17 12:58   ` Pavel Machek
  0 siblings, 0 replies; 9+ messages in thread
From: Pavel Machek @ 2021-10-17 12:58 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, x86, hpa, linux-kernel

Hi!
> > does someone have any ideas?
> 
> Fix just went to Linus:
> 
> https://git.kernel.org/tip/b2381acd3fd9bacd2c63f53b2c610c89959b31cc

Thank you, that seems to solve it for me.

Best regards,
								Pavel
-- 
http://www.livejournal.com/~pavelmachek


* Re: next-20211015: suspend to ram on x86-32 broken
  2021-10-17 10:25 ` next-20211015: suspend to ram on x86-32 broken Pavel Machek
@ 2021-10-18  7:13   ` Pavel Machek
  2021-10-18  8:13     ` Pavel Machek
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2021-10-18  7:13 UTC (permalink / raw)
  To: tglx, mingo, bp, x86, hpa, linux-kernel, rafael, len.brown,
	linux-pm, sfr, peterz, gor

Hi!

> On Thinkpad X60, suspend to ram no longer works. Suspend is okay, and
> there are some signs of resume attempts for a second or so, but the
> screen stays black and the sleep LED stays on.

I did a bisection:

# bad: [7c832d2f9b959e3181370c8b0dacaf9efe13fc05] Add linux-next specific files for 20211015
# good: [64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc] Linux 5.15-rc5
git bisect start 'next-20211015' 'HEAD'
# good: [048c22e37f3dee5adf67e97c48735c325edbb178] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect good 048c22e37f3dee5adf67e97c48735c325edbb178
# bad: [ff4d6dddf948544ef8fa7e5b539ced1f854c0a7f] Merge branch 'auto-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
git bisect bad ff4d6dddf948544ef8fa7e5b539ced1f854c0a7f
# good: [3d4352bd49acf0f81a2dae28e910e80c80aec115] Merge branch 'drm-next' of https://gitlab.freedesktop.org/agd5f/linux
git bisect good 3d4352bd49acf0f81a2dae28e910e80c80aec115
# good: [5e135c8bb89c9f83e7db6216b2bff96c5433728c] Merge branch 'for-next' of git://git.kernel.dk/linux-block.git
git bisect good 5e135c8bb89c9f83e7db6216b2bff96c5433728c
# good: [6e85e7634927384c362395bda82e3759ae94c7c6] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git
git bisect good 6e85e7634927384c362395bda82e3759ae94c7c6
# good: [ac716d0d92cbb48cd635a4ac41829f8ebd7f369c] Merge remote-tracking branch 'tip/locking/core' into tip-master
git bisect good ac716d0d92cbb48cd635a4ac41829f8ebd7f369c
# bad: [a5dd661e53635877debbf48045913266b429950a] Merge remote-tracking branch 'tip/sched/core' into tip-master
git bisect bad a5dd661e53635877debbf48045913266b429950a
# good: [16d364ba6ef2aa59b409df70682770f3ed23f7c0] sched/topology: Introduce sched_group::flags
git bisect good 16d364ba6ef2aa59b409df70682770f3ed23f7c0
# good: [769fdf83df57b373660343ef4270b3ada91ef434] sched: Fix DEBUG && !SCHEDSTATS warn
git bisect good 769fdf83df57b373660343ef4270b3ada91ef434
# bad: [b6153093de41186e2c534ffffb8ce81b1666b110] sched/numa: Replace hard-coded number by a define in numa_task_group()
git bisect bad b6153093de41186e2c534ffffb8ce81b1666b110
# good: [00619f7c650e4e46c650cb2e2fd5f438b32dc64b] sched,livepatch: Use task_call_func()
git bisect good 00619f7c650e4e46c650cb2e2fd5f438b32dc64b
# bad: [2aa45be430a031c10d5f4a5bf3329ff8413a9187] sched,livepatch: Use wake_up_if_idle()
git bisect bad 2aa45be430a031c10d5f4a5bf3329ff8413a9187

It said

commit 8850cb663b5cda04d33f9cfbc38889d73d3c8e24 (HEAD)
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Tue Sep 21 22:16:02 2021 +0200

    sched: Simplify wake_up_*idle*()

is first bad commit.

-- 
http://www.livejournal.com/~pavelmachek


* Re: next-20211015: suspend to ram on x86-32 broken
  2021-10-18  7:13   ` Pavel Machek
@ 2021-10-18  8:13     ` Pavel Machek
  2021-10-18  9:15       ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2021-10-18  8:13 UTC (permalink / raw)
  To: tglx, mingo, bp, x86, hpa, linux-kernel, rafael, len.brown,
	linux-pm, sfr, peterz, gor

Hi!
> It said
> 
> commit 8850cb663b5cda04d33f9cfbc38889d73d3c8e24 (HEAD)
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Tue Sep 21 22:16:02 2021 +0200
> 
>     sched: Simplify wake_up_*idle*()
> 
> is first bad commit.

And reverting that one on top of -next indeed fixes resume on the
ThinkPad X60.

Best regards,
								Pavel
-- 
http://www.livejournal.com/~pavelmachek


* Re: next-20211015: suspend to ram on x86-32 broken
  2021-10-18  8:13     ` Pavel Machek
@ 2021-10-18  9:15       ` Peter Zijlstra
  2021-10-18 11:44         ` Pavel Machek
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2021-10-18  9:15 UTC (permalink / raw)
  To: Pavel Machek
  Cc: tglx, mingo, bp, x86, hpa, linux-kernel, rafael, len.brown,
	linux-pm, sfr, gor

On Mon, Oct 18, 2021 at 10:13:00AM +0200, Pavel Machek wrote:
> Hi!
> > It said
> > 
> > commit 8850cb663b5cda04d33f9cfbc38889d73d3c8e24 (HEAD)
> > Author: Peter Zijlstra <peterz@infradead.org>
> > Date:   Tue Sep 21 22:16:02 2021 +0200
> > 
> >     sched: Simplify wake_up_*idle*()
> > 
> > is first bad commit.
> 
> And reverting that one on the top of -next indeed fixes resume on
> thinkpad x60.

Can you try with just reverting the smp.c hunk and leaving the sched
hunk in place? I've got a hotplug lock related splat in my inbox from
late last week that I didn't get around to yet, I suspect they're
related.



* Re: next-20211015: suspend to ram on x86-32 broken
  2021-10-18  9:15       ` Peter Zijlstra
@ 2021-10-18 11:44         ` Pavel Machek
  2021-10-18 14:52           ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Machek @ 2021-10-18 11:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: tglx, mingo, bp, x86, hpa, linux-kernel, rafael, len.brown,
	linux-pm, sfr, gor

Hi!

> > > commit 8850cb663b5cda04d33f9cfbc38889d73d3c8e24 (HEAD)
> > > Author: Peter Zijlstra <peterz@infradead.org>
> > > Date:   Tue Sep 21 22:16:02 2021 +0200
> > > 
> > >     sched: Simplify wake_up_*idle*()
> > > 
> > > is first bad commit.
> > 
> > And reverting that one on the top of -next indeed fixes resume on
> > thinkpad x60.
> 
> Can you try with just reverting the smp.c hunk and leaving the sched
> hunk in place? I've got a hotplug lock related splat in my inbox from
> late last week that I didn't get around to yet, I suspect they're
> related.

Reverting the smp.c hunk alone is enough to get suspend/resume to work.

Best regards,
							Pavel
-- 
http://www.livejournal.com/~pavelmachek


* Re: next-20211015: suspend to ram on x86-32 broken
  2021-10-18 11:44         ` Pavel Machek
@ 2021-10-18 14:52           ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2021-10-18 14:52 UTC (permalink / raw)
  To: Pavel Machek
  Cc: tglx, mingo, bp, x86, hpa, linux-kernel, rafael, len.brown,
	linux-pm, sfr, gor

On Mon, Oct 18, 2021 at 01:44:29PM +0200, Pavel Machek wrote:

> Reverting smp.c hunk is enough to get suspend/resume to work.

Thanks! Queued the below.

---
Subject: sched: Partial revert: "sched: Simplify wake_up_*idle*()"
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon Oct 18 16:41:05 CEST 2021

As reported by syzbot and experienced by Pavel, using cpus_read_lock()
in wake_up_all_idle_cpus() generates lock inversion (against mmap_sem
and possibly others).

Therefore, undo this change and put in a comment :/

Fixes: 8850cb663b5c ("sched: Simplify wake_up_*idle*()")
Reported-by: syzbot+d5b23b18d2f4feae8a67@syzkaller.appspotmail.com
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Pavel Machek <pavel@ucw.cz>
---
 kernel/smp.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1170,14 +1170,23 @@ void wake_up_all_idle_cpus(void)
 {
 	int cpu;
 
-	cpus_read_lock();
+	/*
+	 * This really should be cpus_read_lock(), because disabling preemption
+	 * over iterating all CPUs is really bad when you have large numbers of
+	 * CPUs, giving rise to large latencies.
+	 *
+	 * Sadly this cannot be, since (ironically) this function is used from
+	 * the cpu_latency_qos stuff which in turn is used under all sorts of
+	 * locks yielding a hotplug lock inversion :/
+	 */
+	preempt_disable();
 	for_each_online_cpu(cpu) {
-		if (cpu == raw_smp_processor_id())
+		if (cpu == smp_processor_id())
 			continue;
 
 		wake_up_if_idle(cpu);
 	}
-	cpus_read_unlock();
+	preempt_enable();
 }
 EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
 

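For readers following the lock inversion described in the changelog
above, here is a minimal user-space sketch in C. It is not kernel code
and not part of the patch. It models a tiny lock-order tracker, loosely
in the spirit of what a lock validator reports: it remembers that one
path took lock B while holding lock A, and flags a later path that
takes A while holding B. The names order_tracker, tracker_acquire(),
LOCK_HOTPLUG and LOCK_MMAP are hypothetical, chosen only to echo the
patch description, and the sketch checks direct two-lock inversions
only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 8

/* Hypothetical lock classes, named after the locks in the changelog. */
enum lock_id { LOCK_HOTPLUG, LOCK_MMAP, LOCK_COUNT };

struct order_tracker {
	/* before[a][b] is true once some path took b while holding a. */
	bool before[LOCK_COUNT][LOCK_COUNT];
	/* Stack of locks held by the path currently being simulated. */
	enum lock_id held[MAX_HELD];
	size_t depth;
};

void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/*
 * Record the acquisition of 'id' on the current path.
 * Returns false if this acquisition inverts an order seen earlier,
 * i.e. some held lock h was previously taken while 'id' was held.
 */
bool tracker_acquire(struct order_tracker *t, enum lock_id id)
{
	bool ok = true;
	size_t i;

	assert(t->depth < MAX_HELD);
	for (i = 0; i < t->depth; i++) {
		enum lock_id h = t->held[i];

		if (t->before[id][h])
			ok = false;
		t->before[h][id] = true;
	}
	t->held[t->depth++] = id;
	return ok;
}

/* Release the most recently acquired lock on the current path. */
void tracker_release(struct order_tracker *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

Acquiring LOCK_HOTPLUG and then LOCK_MMAP on one path, releasing both,
and then acquiring LOCK_MMAP followed by LOCK_HOTPLUG on another path
makes the last tracker_acquire() return false. That is the shape of the
problem when cpus_read_lock() is taken inside a function that callers
invoke while already holding other locks.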