regressions.lists.linux.dev archive mirror
* boot of J1900 (quad-core Celeron) mobo: kernel <= 5.12.15, OK; kernel >= 5.12.17, 5.13.4, slow boot (>> 660 secs) + hang/FAIL
@ 2021-07-22 21:10 PGNet Dev
  2021-07-23  7:01 ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: PGNet Dev @ 2021-07-22 21:10 UTC
  To: stable; +Cc: regressions

My servers run Fedora 34, with latest kernels.

Updating to any of minimally- or un- patched 5.12.17 or 5.13.4 hangs/fails, as follows.

On just one server, an old ASRockRack J1900D2Y (quad-core Celeron) motherboard build,

	hwinfo --bios
		01: None 00.0: 10105 BIOS
		  [Created at bios.186]
		  Unique ID: rdCR.lZF+r4EgHp4
		  Hardware Class: bios
		  BIOS Keyboard LED Status:
		    Scroll Lock: off
		    Num Lock: off
		    Caps Lock: off
		  Serial Port 0: 0x3f8
		  Base Memory: 626 kB
		  PnP BIOS: @@@0000
		  MP spec rev 1.4 info:
		    OEM id: "A M I"
		    Product id: "ALASKA"
		    4 CPUs (0 disabled)
		  SMBIOS Version: 2.8
		  BIOS Info: #0
		    Vendor: "American Megatrends Inc."
		    Version: "P1.10"
		    Date: "01/23/2015"
		    Start Address: 0xf0000
		    ROM Size: 8192 kB
		    Features: 0x0d03000000013f8b9880
		      PCI supported
		      BIOS flashable
		      BIOS shadowing allowed
		      CD boot supported
		      Selectable boot supported
		      BIOS ROM socketed
		      EDD spec supported
		      1.2MB Floppy supported
		      720kB Floppy supported
		      2.88MB Floppy supported
		      Print Screen supported
		      8042 Keyboard Services supported
		      Serial Services supported
		      Printer Services supported
		      ACPI supported
		      USB Legacy supported
		      BIOS Boot Spec supported
		  System Info: #1
		    Manufacturer: "To Be Filled By O.E.M."
		    Product: "To Be Filled By O.E.M."
		    Version: "To Be Filled By O.E.M."
		    Serial: "To Be Filled By O.E.M."
		    UUID: 03000200-0400-0500-0006-000700080009
		    Wake-up: 0x06 (Power Switch)
		  Board Info: #2
		    Manufacturer: "ASRock"
		    Product: "J1900D2Y"
		    Type: 0x0a (Motherboard)
		    Features: 0x09
		      Hosting Board
		      Replaceable
		    Chassis: #3
		...

with Fedora 34 kernel <=

	Fedora (5.12.15-300.fc34.x86_64) 34 (Thirty Four)

the box boots/runs OK,

	[    0.000000] microcode: microcode updated early to revision 0x838, date = 2019-04-22
	[    0.000000] Linux version 5.12.15-300.fc34.x86_64 (mockbuild@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3), GNU ld version 2.35.1-41.fc34) #1 SMP Wed Jul 7 19:46:50 UTC 2021
	[    0.000000] Command line: BOOT_IMAGE=(mduuid/5687b25f8f25671a243a83cb661f7841)/vmlinuz-5.12.15-300.fc34.x86_64 root=/dev/mapper/VG0-LV_ROOT ro i915.modeset=1 vconsole.keymap=us vconsole.font=eurlatgr vconsole.font_map=trivial domdadm_
	[    0.000000] x86/fpu: x87 FPU will use FXSAVE
	[    0.000000] BIOS-provided physical RAM map:
	[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
	[    0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009efff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000000009f000-0x000000000009ffff] usable
	[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001fffffff] usable
	[    0.000000] BIOS-e820: [mem 0x0000000020000000-0x00000000200fffff] reserved
	[    0.000000] BIOS-e820: [mem 0x0000000020100000-0x000000006d66cfff] usable
	[    0.000000] BIOS-e820: [mem 0x000000006d66d000-0x000000006d69cfff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006d69d000-0x000000006d6acfff] ACPI data
	[    0.000000] BIOS-e820: [mem 0x000000006d6ad000-0x000000006d7f1fff] ACPI NVS
	[    0.000000] BIOS-e820: [mem 0x000000006d7f2000-0x000000006db80fff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006db81000-0x000000006db81fff] usable
	[    0.000000] BIOS-e820: [mem 0x000000006db82000-0x000000006dbc3fff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006dbc4000-0x000000006dd31fff] usable
	[    0.000000] BIOS-e820: [mem 0x000000006dd32000-0x000000006dff9fff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006dffa000-0x000000006dffffff] usable
	[    0.000000] BIOS-e820: [mem 0x00000000e00f8000-0x00000000e00f8fff] reserved
	[    0.000000] BIOS-e820: [mem 0x00000000fed01000-0x00000000fed01fff] reserved
	[    0.000000] BIOS-e820: [mem 0x00000000ffb00000-0x00000000ffffffff] reserved
	[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000047fffffff] usable
	[    0.000000] printk: console [earlyser0] enabled
	Memory KASLR using RDRAND RDTSC...
	Poking KASLR using RDRAND RDTSC...

	Welcome to Fedora 34 (Thirty Four) dracut-055-3.fc34 (Initramfs)!

	[  OK  ] Started Dispatch Password …ts to Console Directory Watch.
	[  OK  ] Reached target Local File Systems.
	[  OK  ] Reached target Path Units.
	...
	Fedora 34 (Thirty Four)
	Kernel 5.12.15-300.fc34.x86_64 on an x86_64 (ttyS0)

	srv01 login:
	...
	lsb_release -rd
		Description:    Fedora release 34 (Thirty Four)
		Release:        34
	uname -rm
		5.12.15-300.fc34.x86_64 x86_64

updating to any of minimally- or un- patched,

	Fedora (5.12.17-300.fc34.x86_64) 34 (Thirty Four)
		https://koji.fedoraproject.org/koji/buildinfo?buildID=1780670

	Fedora (5.13.4-200.fc34.x86_64) 34 (Thirty Four)
		https://koji.fedoraproject.org/koji/buildinfo?buildID=1782334

	Fedora (5.13.4-250.vanilla.1.fc34.x86_64) 34 (Thirty Four)
	    https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories
	    https://repos.fedorapeople.org/repos/thl/kernel-vanilla-stable/fedora-34/x86_64/
		https://fedorapeople.org/cgit/thl/public_git/kernel.git/tree/?h=kernel-5.13.3-250.vanilla.1.fc34

and regenerating the initrd, system boot hangs early in each case,


	[    0.000000] microcode: microcode updated early to revision 0x838, date = 2019-04-22
	[    0.000000] Linux version 5.13.4-200.fc34.x86_64 (mockbuild@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3), GNU ld version 2.35.1-41.fc34) #1 SMP Tue Jul 20 20:27:29 UTC 2021
	[    0.000000] Command line: BOOT_IMAGE=(mduuid/5687b25f8f25671a243a83cb661f7841)/vmlinuz-5.13.4-200.fc34.x86_64 root=/dev/mapper/VG0-LV_ROOT ro i915.modeset=1 vconsole.keymap=us vconsole.font=eurlatgr vconsole.font_map=trivial domdadm b
	[    0.000000] x86/fpu: x87 FPU will use FXSAVE
	[    0.000000] BIOS-provided physical RAM map:
	[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
	[    0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009efff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000000009f000-0x000000000009ffff] usable
	[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001fffffff] usable
	[    0.000000] BIOS-e820: [mem 0x0000000020000000-0x00000000200fffff] reserved
	[    0.000000] BIOS-e820: [mem 0x0000000020100000-0x000000006d66cfff] usable
	[    0.000000] BIOS-e820: [mem 0x000000006d66d000-0x000000006d69cfff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006d69d000-0x000000006d6acfff] ACPI data
	[    0.000000] BIOS-e820: [mem 0x000000006d6ad000-0x000000006d7f1fff] ACPI NVS
	[    0.000000] BIOS-e820: [mem 0x000000006d7f2000-0x000000006db80fff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006db81000-0x000000006db81fff] usable
	[    0.000000] BIOS-e820: [mem 0x000000006db82000-0x000000006dbc3fff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006dbc4000-0x000000006dd31fff] usable
	[    0.000000] BIOS-e820: [mem 0x000000006dd32000-0x000000006dff9fff] reserved
	[    0.000000] BIOS-e820: [mem 0x000000006dffa000-0x000000006dffffff] usable
	[    0.000000] BIOS-e820: [mem 0x00000000e00f8000-0x00000000e00f8fff] reserved
	[    0.000000] BIOS-e820: [mem 0x00000000fed01000-0x00000000fed01fff] reserved
	[    0.000000] BIOS-e820: [mem 0x00000000ffb00000-0x00000000ffffffff] reserved
	[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000047fffffff] usable
	[    0.000000] printk: console [earlyser0] enabled
	Memory KASLR using RDRAND RDTSC...
	Poking KASLR using RDRAND RDTSC...

, and proceeds no further.

Dropping back to

	Fedora (5.12.15-300.fc34.x86_64) 34 (Thirty Four)

boots, again, with no issues.

for the

	Fedora (5.13.4-200.fc34.x86_64) 34 (Thirty Four)

case, with logging dialed up,

logs show a very slow boot process (700+ seconds and counting), with an eventual, complete hang at

	"Starting Apply Kernel Variables..."

--> https://pastebin.com/dP6Lm84J

So far, this occurs on this J1900 box only; other-hardware F34 installs with kernel >= 5.12.17 are booting fine.

AND, for _this_ J1900, all kernels <= 5.12.15 boot OK.



If there's specific additional logging that would be informative, I can attach it.


* Re: boot of J1900 (quad-core Celeron) mobo: kernel <= 5.12.15, OK; kernel >= 5.12.17, 5.13.4, slow boot (>> 660 secs) + hang/FAIL
  2021-07-22 21:10 boot of J1900 (quad-core Celeron) mobo: kernel <= 5.12.15, OK; kernel >= 5.12.17, 5.13.4, slow boot (>> 660 secs) + hang/FAIL PGNet Dev
@ 2021-07-23  7:01 ` Greg KH
  2021-07-23 13:22   ` PGNet Dev
  0 siblings, 1 reply; 4+ messages in thread
From: Greg KH @ 2021-07-23  7:01 UTC
  To: PGNet Dev; +Cc: stable, regressions

On Thu, Jul 22, 2021 at 05:10:02PM -0400, PGNet Dev wrote:
> My servers run Fedora 34, with latest kernels.
> 
> Updating to any of minimally- or un- patched 5.12.17 or 5.13.4 hangs/fails, as follows.

Can you use 'git bisect' to find the offending change?

thanks,

greg k-h


* Re: boot of J1900 (quad-core Celeron) mobo: kernel <= 5.12.15, OK; kernel >= 5.12.17, 5.13.4, slow boot (>> 660 secs) + hang/FAIL
  2021-07-23  7:01 ` Greg KH
@ 2021-07-23 13:22   ` PGNet Dev
  2021-07-24 15:30     ` PGNet Dev
  0 siblings, 1 reply; 4+ messages in thread
From: PGNet Dev @ 2021-07-23 13:22 UTC
  To: greg; +Cc: stable, regressions

On 7/23/21 3:01 AM, Greg KH wrote:
> On Thu, Jul 22, 2021 at 05:10:02PM -0400, PGNet Dev wrote:
>> My servers run Fedora 34, with latest kernels.
>>
>> Updating to any of minimally- or un- patched 5.12.17 or 5.13.4 hangs/fails, as follows.
> 
> Can you use 'git bisect' to find the offending change?
> 
> thanks,
> 
> greg k-h

unfortunately, not simply.

These are rpm installs, from unpatched builds,

> updating to any of minimally- or un- patched,
>
>     Fedora (5.12.17-300.fc34.x86_64) 34 (Thirty Four)
>         https://koji.fedoraproject.org/koji/buildinfo?buildID=1780670
>
>     Fedora (5.13.4-200.fc34.x86_64) 34 (Thirty Four)
>         https://koji.fedoraproject.org/koji/buildinfo?buildID=1782334
>
>     Fedora (5.13.4-250.vanilla.1.fc34.x86_64) 34 (Thirty Four)
>         https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories
>         https://repos.fedorapeople.org/repos/thl/kernel-vanilla-stable/fedora-34/x86_64/
>         https://fedorapeople.org/cgit/thl/public_git/kernel.git/tree/?h=kernel-5.13.3-250.vanilla.1.fc34

available at stable release versions, not my own source builds.

and, of course, this issue appears only on this J1900/Celeron hardware *production* server.

Is there a specific set of kernel rd/systemd/etc debug logging flags that would shine more light on the problem?
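
For reference, a sketch of a debug flag set that is commonly combined for this kind of early-boot hang. The individual parameters are documented kernel, dracut and systemd options, but which of them are useful here is an assumption, as is the /boot path of the kernel image. The helper name debug_args is hypothetical, and the script only prints a grubby command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: emit a set of boot-time debug parameters.
#   ignore_loglevel, initcall_debug, log_buf_len, printk.devkmsg : kernel
#   rd.debug                                                     : dracut initramfs
#   systemd.log_level, systemd.log_target                        : systemd
debug_args() {
    printf '%s' "ignore_loglevel initcall_debug rd.debug systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M printk.devkmsg=on"
}

# Assumed install path of the failing kernel; adjust to the actual system.
kernel="/boot/vmlinuz-5.13.4-200.fc34.x86_64"

# Dry run: print the grubby invocation instead of executing it.
echo "grubby --update-kernel=$kernel --remove-args=\"rhgb quiet\" --args=\"$(debug_args)\""
```

With a serial console already enabled, as in the logs earlier in the thread, the extra output should appear there even if the boot never reaches a login prompt.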



* Re: boot of J1900 (quad-core Celeron) mobo: kernel <= 5.12.15, OK; kernel >= 5.12.17, 5.13.4, slow boot (>> 660 secs) + hang/FAIL
  2021-07-23 13:22   ` PGNet Dev
@ 2021-07-24 15:30     ` PGNet Dev
  0 siblings, 0 replies; 4+ messages in thread
From: PGNet Dev @ 2021-07-24 15:30 UTC
  To: greg; +Cc: stable, regressions

On 7/23/21 9:22 AM, PGNet Dev wrote:
>> Can you use 'git bisect' to find the offending change?
>>
>> thanks,
>>
>> greg k-h

Greg,


git clone https://gitlab.com/cki-project/kernel-ark.git
cd kernel-ark
git log -n1 | head -n1
       1 commit 94dd448e56d2446de85682efabd2f833c5f6dfc8 (HEAD -> os-build, origin/os-build, origin/HEAD)

git bisect start
git bisect good v5.12.15
git bisect bad  v5.12.17
git bisect visualize --oneline | wc -l
	715
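
As a rough sanity check on the build count: assuming a linear history, a binary search over N candidate commits needs at most ceil(log2(N)) test builds. The sketch below computes that bound; the function name bisect_steps is hypothetical and not part of git.

```shell
#!/bin/sh
# Upper bound on bisect test builds for N candidate commits (linear history):
# the smallest k such that 2^k >= N.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 715   # prints 10
```

A bound of 10 for 715 commits is in line with the nine test builds listed in the bisect log.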

bisect:

	1  BAD   [88611c8036bf96f91e59d223b1b8630e9ace82f2] mm: mmap_lock: use local locks instead of disabling preemption
	2  GOOD  [49e077e7c08ec4fd9299646d430bcc085ae81b86] mmc: sdhci-sprd: use sdhci_sprd_writew
	3  BAD   [2d3650748f83eb9fb5121a52c8b53e436e1f349c] crypto: ux500 - Fix error return code in hash_hw_final()
	4  GOOD  [b1bdf36471f2166725a688cf0204059455aedd66] fs: dlm: cancel work sync othercon
	5  BAD   [19d2497258ad98e1938c43ea00edfdd5408699a9] smb3: fix uninitialized value for port in witness protocol move
	6  BAD   [d401922918b0f36e2cef76413c07d1c223ee6df0] block: fix race between adding/removing rq qos and normal IO
	7  GOOD  [bc58f76172e8b80b9231abb275fac32c069df151] fs: dlm: fix memory leak when fenced
	8  BAD   [96b15a0b45182f1c3da5a861196da27000da2e3c] ACPI: resources: Add checks for ACPI IRQ override
	9  GOOD  [24743ca474860e2c350268b98cfff4ed1ff37fb4] ACPI: bus: Call kobject_put() in acpi_init() error path

	96b15a0b45182f1c3da5a861196da27000da2e3c is the first bad commit
	commit 96b15a0b45182f1c3da5a861196da27000da2e3c
	Author: Hui Wang <hui.wang@canonical.com>
	Date:   Wed Jun 9 10:14:42 2021 +0800

	    ACPI: resources: Add checks for ACPI IRQ override

	    [ Upstream commit 0ec4e55e9f571f08970ed115ec0addc691eda613 ]

	    The laptop keyboard doesn't work on many MEDION notebooks, but the
	    keyboard works well under Windows and Unix.

	    Through debugging, we found this log in the dmesg:

	     ACPI: IRQ 1 override to edge, high
	     pnp 00:03: Plug and Play ACPI device, IDs PNP0303 (active)

	     And we checked the IRQ definition in the DSDT, it is:

	        IRQ (Level, ActiveLow, Exclusive, )
	            {1}

	    So the BIOS defines the keyboard IRQ to Level_Low, but the Linux
	    kernel override it to Edge_High. If the Linux kernel is modified
	    to skip the IRQ override, the keyboard will work normally.

	    From the existing comment in acpi_dev_get_irqresource(), the override
	    function only needs to be called when IRQ() or IRQNoFlags() is used
	    to populate the resource descriptor, and according to Section 6.4.2.1
	    of ACPI 6.4 [1], if IRQ() is empty or IRQNoFlags() is used, the IRQ
	    is High true, edge sensitive and non-shareable. ACPICA also assumes
	    that to be the case (see acpi_rs_set_irq[] in rsirq.c).

	    In accordance with the above, check 3 additional conditions
	    (EdgeSensitive, ActiveHigh and Exclusive) when deciding whether or
	    not to treat an ACPI_RESOURCE_TYPE_IRQ resource as "legacy", in which
	    case the IRQ override is applicable to it.

	    Link: https://uefi.org/specs/ACPI/6.4/06_Device_Configuration/Device_Configuration.html#irq-descriptor # [1]
	    BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213031
	    BugLink: http://bugs.launchpad.net/bugs/1909814
	    Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
	    Reported-by: Manuel Krause <manuelkrause@netscape.net>
	    Tested-by: Manuel Krause <manuelkrause@netscape.net>
	    Signed-off-by: Hui Wang <hui.wang@canonical.com>
	    [ rjw: Subject rewrite, changelog edits ]
	    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
	    Signed-off-by: Sasha Levin <sashal@kernel.org>

	 drivers/acpi/resource.c | 9 ++++++++-
	 1 file changed, 8 insertions(+), 1 deletion(-)
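
One possible follow-up check, sketched under the assumption that boot logs from both a good and a bad kernel have been captured (for example over the serial console): filter them for the IRQ override message quoted in the commit text and compare which IRQs are overridden in each. The helper name irq_overrides is hypothetical, and the sample input is taken from the commit message, not from this J1900 box.

```shell
#!/bin/sh
# Print only the ACPI IRQ override lines from a kernel log read on stdin.
irq_overrides() {
    grep -E 'ACPI: IRQ [0-9]+ override to '
}

# Sample input from the commit message; on a live system one would pipe
# `dmesg` or a saved serial-console log into irq_overrides instead.
printf '%s\n' \
    'ACPI: IRQ 1 override to edge, high' \
    'pnp 00:03: Plug and Play ACPI device, IDs PNP0303 (active)' |
    irq_overrides
```

An override line that is present under 5.12.15 but missing under 5.12.17 would point at the device whose IRQ handling changed.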

