* [CRASH][BISECTED] 6.4.1 crash in boot
@ 2023-07-02 16:36 Mirsad Goran Todorovac
  2023-07-03  1:44 ` Bagas Sanjaya
  0 siblings, 1 reply; 27+ messages in thread
From: Mirsad Goran Todorovac @ 2023-07-02 16:36 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: linux-kbuild, Kees Cook

[-- Attachment #1: Type: text/plain, Size: 1051 bytes --]

Hi,

After a fresh git pull, the kernel from the Torvalds tree, built with the default
debug config, failed to boot. The error occurs before filesystems are mounted, so
there is no log other than the screenshot(s) here:

[1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/

Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):

# good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
# bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
.
.
.
# bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
# first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC

The system is Ubuntu 22.04; the lshw output and the config are given in the attachments.

Best regards,
Mirsad Todorovac

[-- Attachment #2: config-6.4.0-060400-generic.xz --]
[-- Type: application/x-xz, Size: 57448 bytes --]

[-- Attachment #3: lshw.txt.xz --]
[-- Type: application/x-xz, Size: 6500 bytes --]


* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-02 16:36 [CRASH][BISECTED] 6.4.1 crash in boot Mirsad Goran Todorovac
@ 2023-07-03  1:44 ` Bagas Sanjaya
  2023-07-03  3:20   ` Kees Cook
  2023-07-03  3:40   ` Mirsad Goran Todorovac
  0 siblings, 2 replies; 27+ messages in thread
From: Bagas Sanjaya @ 2023-07-03  1:44 UTC (permalink / raw)
  To: Mirsad Goran Todorovac, Linux Kernel Mailing List, Linux LLVM
  Cc: linux-kbuild, Kees Cook, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, Guenter Roeck

[-- Attachment #1: Type: text/plain, Size: 1441 bytes --]

On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
> Hi,
> 
> After new git pull the kernel in Torvalds tree with default debug config
> failed to boot with error that occurs prior to mounting filesystems, so there
> is no log safe for the screenshot(s) here:
> 
> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
> 
> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
> 
> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
> .
> .
> .
> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
> 
> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.

Can you show early kernel log (something like dmesg)?

Anyway, I'm adding it to regzbot:

#regzbot ^introduced: 2d47c6956ab3c8
#regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening

Thanks.

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]


* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  1:44 ` Bagas Sanjaya
@ 2023-07-03  3:20   ` Kees Cook
  2023-07-03  3:26     ` Guenter Roeck
  2023-07-03  3:40   ` Mirsad Goran Todorovac
  1 sibling, 1 reply; 27+ messages in thread
From: Kees Cook @ 2023-07-03  3:20 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Mirsad Goran Todorovac, Linux Kernel Mailing List, Linux LLVM,
	linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, Guenter Roeck

On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
> > Hi,
> > 
> > After new git pull the kernel in Torvalds tree with default debug config
> > failed to boot with error that occurs prior to mounting filesystems, so there
> > is no log safe for the screenshot(s) here:
> > 
> > [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
> > 
> > Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
> > 
> > # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
> > git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
> > # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
> > git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
> > .
> > .
> > .
> > # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
> > git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
> > # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
> > 
> > The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
> 
> Can you show early kernel log (something like dmesg)?
> 
> Anyway, I'm adding it to regzbot:
> 
> #regzbot ^introduced: 2d47c6956ab3c8
> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening

I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
tree... it's only in Linus's ToT.

Also, the config you included does not show CONFIG_UBSAN_BOUNDS_STRICT
as even being available, much less present. Something seems very wrong
with this report...

-Kees

-- 
Kees Cook


* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  3:20   ` Kees Cook
@ 2023-07-03  3:26     ` Guenter Roeck
  2023-07-03  3:53       ` Mirsad Goran Todorovac
  2023-07-03  3:58       ` Guenter Roeck
  0 siblings, 2 replies; 27+ messages in thread
From: Guenter Roeck @ 2023-07-03  3:26 UTC (permalink / raw)
  To: Kees Cook, Bagas Sanjaya
  Cc: Mirsad Goran Todorovac, Linux Kernel Mailing List, Linux LLVM,
	linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

On 7/2/23 20:20, Kees Cook wrote:
> On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
>> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
>>> Hi,
>>>
>>> After new git pull the kernel in Torvalds tree with default debug config
>>> failed to boot with error that occurs prior to mounting filesystems, so there
>>> is no log safe for the screenshot(s) here:
>>>
>>> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
>>>
>>> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
>>>
>>> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
>>> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
>>> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
>>> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
>>> .
>>> .
>>> .
>>> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
>>> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>
>>> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
>>
>> Can you show early kernel log (something like dmesg)?
>>
>> Anyway, I'm adding it to regzbot:
>>
>> #regzbot ^introduced: 2d47c6956ab3c8
>> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
> 
> I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
> tree... it's only in Linus's ToT.
> 

In ToT:

$ git describe 2d47c6956ab3
v6.4-rc2-1-g2d47c6956ab3

$ git describe --contains 2d47c6956ab3
next-20230616~2^2~51
$ git describe --contains --match 'v*' 2d47c6956ab3
fatal: cannot describe '2d47c6956ab3c8b580a59d7704aab3e2a4882b6c'

"git describe" always shows the parent tree, which I guess was based on
v6.4-rc2.

Guenter


> Also, the config you included does not show CONFIG_UBSAN_BOUNDS_STRICT
> as even being available, much less present. Something seems very wrong
> with this report...
> 
> -Kees
> 



* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  1:44 ` Bagas Sanjaya
  2023-07-03  3:20   ` Kees Cook
@ 2023-07-03  3:40   ` Mirsad Goran Todorovac
  1 sibling, 0 replies; 27+ messages in thread
From: Mirsad Goran Todorovac @ 2023-07-03  3:40 UTC (permalink / raw)
  To: Bagas Sanjaya, Linux Kernel Mailing List, Linux LLVM
  Cc: linux-kbuild, Kees Cook, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, Guenter Roeck

On 7/3/23 03:44, Bagas Sanjaya wrote:
> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
>> Hi,
>>
>> After new git pull the kernel in Torvalds tree with default debug config
>> failed to boot with error that occurs prior to mounting filesystems, so there
>> is no log safe for the screenshot(s) here:
>>
>> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
>>
>> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
>>
>> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
>> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
>> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
>> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
>> .
>> .
>> .
>> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
>> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>
>> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
> 
> Can you show early kernel log (something like dmesg)?

No, the machine freezes after those screenfuls, and I could only take a
screenshot.
  
> Anyway, I'm adding it to regzbot:
> 
> #regzbot ^introduced: 2d47c6956ab3c8
> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
> 
> Thanks.
> 


* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  3:26     ` Guenter Roeck
@ 2023-07-03  3:53       ` Mirsad Goran Todorovac
  2023-07-03  4:30         ` Kees Cook
  2023-07-03  3:58       ` Guenter Roeck
  1 sibling, 1 reply; 27+ messages in thread
From: Mirsad Goran Todorovac @ 2023-07-03  3:53 UTC (permalink / raw)
  To: Guenter Roeck, Kees Cook, Bagas Sanjaya
  Cc: Linux Kernel Mailing List, Linux LLVM, linux-kbuild,
	Linux Regressions, Nathan Chancellor, Nick Desaulniers

On 7/3/23 05:26, Guenter Roeck wrote:
> On 7/2/23 20:20, Kees Cook wrote:
>> On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
>>> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
>>>> Hi,
>>>>
>>>> After new git pull the kernel in Torvalds tree with default debug config
>>>> failed to boot with error that occurs prior to mounting filesystems, so there
>>>> is no log safe for the screenshot(s) here:
>>>>
>>>> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
>>>>
>>>> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
>>>>
>>>> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
>>>> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
>>>> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
>>>> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
>>>> .
>>>> .
>>>> .
>>>> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
>>>> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>
>>>> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
>>>
>>> Can you show early kernel log (something like dmesg)?
>>>
>>> Anyway, I'm adding it to regzbot:
>>>
>>> #regzbot ^introduced: 2d47c6956ab3c8
>>> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
>>
>> I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
>> tree... it's only in Linus's ToT.
>>
> 
> In ToT:
> 
> $ git describe 2d47c6956ab3
> v6.4-rc2-1-g2d47c6956ab3
> 
> $ git describe --contains 2d47c6956ab3
> next-20230616~2^2~51
> $ git describe --contains --match 'v*' 2d47c6956ab3
> fatal: cannot describe '2d47c6956ab3c8b580a59d7704aab3e2a4882b6c'
> 
> "git describe" always shows the parent tree, which I guess was based on
> v6.4-rc2.
> 
> Guenter
> 
> 
>> Also, the config you included does not show CONFIG_UBSAN_BOUNDS_STRICT
>> as even being available, much less present. Something seems very wrong
>> with this report...
>>
>> -Kees

Anyway, I have double checked and linux-image-6.4.0-rc2-crash boots while
linux-image-6.4.0-rc2-crash-00001-g2d47c6956ab3 freezes in early boot.

Of course, on the next boot dmesg is overwritten ... I can provide only
screenshots of the first screen.

The difference is only one commit.

It is a bit strange so I am available for any additional diagnostics.

Regards,
Mirsad



* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  3:26     ` Guenter Roeck
  2023-07-03  3:53       ` Mirsad Goran Todorovac
@ 2023-07-03  3:58       ` Guenter Roeck
  2023-07-03  5:18         ` Mirsad Goran Todorovac
  2023-07-03  5:18         ` Mirsad Goran Todorovac
  1 sibling, 2 replies; 27+ messages in thread
From: Guenter Roeck @ 2023-07-03  3:58 UTC (permalink / raw)
  To: Kees Cook, Bagas Sanjaya
  Cc: Mirsad Goran Todorovac, Linux Kernel Mailing List, Linux LLVM,
	linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

On 7/2/23 20:26, Guenter Roeck wrote:
> On 7/2/23 20:20, Kees Cook wrote:
>> On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
>>> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
>>>> Hi,
>>>>
>>>> After new git pull the kernel in Torvalds tree with default debug config
>>>> failed to boot with error that occurs prior to mounting filesystems, so there
>>>> is no log safe for the screenshot(s) here:
>>>>
>>>> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
>>>>
>>>> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
>>>>
>>>> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
>>>> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
>>>> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
>>>> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
>>>> .
>>>> .
>>>> .
>>>> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
>>>> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>
>>>> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
>>>
>>> Can you show early kernel log (something like dmesg)?
>>>
>>> Anyway, I'm adding it to regzbot:
>>>
>>> #regzbot ^introduced: 2d47c6956ab3c8
>>> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
>>
>> I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
>> tree... it's only in Linus's ToT.
>>
> 
> In ToT:
> 
> $ git describe 2d47c6956ab3
> v6.4-rc2-1-g2d47c6956ab3
> 
> $ git describe --contains 2d47c6956ab3
> next-20230616~2^2~51
> $ git describe --contains --match 'v*' 2d47c6956ab3
> fatal: cannot describe '2d47c6956ab3c8b580a59d7704aab3e2a4882b6c'
> 
> "git describe" always shows the parent tree, which I guess was based on
> v6.4-rc2.
> 

Ah, sorry, I didn't realize that the subject claims that the problem
would be in 6.4.1. That indeed does not match the bisect results.

Guenter

> Guenter
> 
> 
>> Also, the config you included does not show CONFIG_UBSAN_BOUNDS_STRICT
>> as even being available, much less present. Something seems very wrong
>> with this report...
>>
>> -Kees
>>
> 



* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  3:53       ` Mirsad Goran Todorovac
@ 2023-07-03  4:30         ` Kees Cook
  2023-07-03  4:38           ` Guenter Roeck
  2023-07-03  4:50           ` Mirsad Goran Todorovac
  0 siblings, 2 replies; 27+ messages in thread
From: Kees Cook @ 2023-07-03  4:30 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

On Mon, Jul 03, 2023 at 05:53:48AM +0200, Mirsad Goran Todorovac wrote:
> On 7/3/23 05:26, Guenter Roeck wrote:
> > On 7/2/23 20:20, Kees Cook wrote:
> > > On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
> > > > On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
> > > > > Hi,
> > > > > 
> > > > > After new git pull the kernel in Torvalds tree with default debug config
> > > > > failed to boot with error that occurs prior to mounting filesystems, so there
> > > > > is no log safe for the screenshot(s) here:
> > > > > 
> > > > > [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
> > > > > 
> > > > > Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
> > > > > 
> > > > > # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
> > > > > git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
> > > > > # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
> > > > > git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
> > > > > .
> > > > > .
> > > > > .
> > > > > # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
> > > > > git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
> > > > > # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
> > > > > 
> > > > > The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
> > > > 
> > > > Can you show early kernel log (something like dmesg)?
> > > > 
> > > > Anyway, I'm adding it to regzbot:
> > > > 
> > > > #regzbot ^introduced: 2d47c6956ab3c8
> > > > #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
> > > 
> > > I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
> > > tree... it's only in Linus's ToT.
> > > 
> > 
> > In ToT:
> > 
> > $ git describe 2d47c6956ab3
> > v6.4-rc2-1-g2d47c6956ab3
> > 
> > $ git describe --contains 2d47c6956ab3
> > next-20230616~2^2~51
> > $ git describe --contains --match 'v*' 2d47c6956ab3
> > fatal: cannot describe '2d47c6956ab3c8b580a59d7704aab3e2a4882b6c'
> > 
> > "git describe" always shows the parent tree, which I guess was based on
> > v6.4-rc2.
> > 
> > Guenter
> > 
> > 
> > > Also, the config you included does not show CONFIG_UBSAN_BOUNDS_STRICT
> > > as even being available, much less present. Something seems very wrong
> > > with this report...
> > > 
> > > -Kees
> 
> Anyway, I have double checked and linux-image-6.4.0-rc2-crash boots while
> linux-image-6.4.0-rc2-crash-00001-g2d47c6956ab3 freezes in early boot.

I don't understand what tree you're testing. 2d47c6956ab3 is only in
Linus's latest tree, which is not 6.4-rc2.

If you're testing Linus's tree, and you're bisecting to 2d47c6956ab3,
I don't understand why the .config you sent doesn't include
CONFIG_UBSAN_BOUNDS_STRICT (which was introduced by that commit) --
it should be visible whether or not it is selected.

> Of course, in the next boot dmesg appears overwritten ... I could provide
> only the first screen screenshots.

Without CONFIG_UBSAN_TRAP, I would not expect anything other than a
warning (i.e. boot would continue).

The only other thing I can think of that seems related (the backtrace
appears to show usb), might be this:
https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
which won't appear until after v6.5-rc1.

> The difference is only one commit.
> 
> It is a bit strange so I am available for any additional diagnostics.

Thanks! Can you send "grep UBSAN .config" output for the crashing kernel?

Are you booting on an EFI-capable machine? If you could configure pstore
to use the EFI-vars backend, you can capture the crash in EFI and
pstorefs will show it after the next boot. (If you're using systemd,
this all may already be happening -- check /var/lib/systemd/pstore/
or see[1] for more details.)

-Kees

[1] https://www.freedesktop.org/software/systemd/man/systemd-pstore.service.html

-- 
Kees Cook
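
Note (not part of the original message): the pstore check suggested above could
be sketched roughly as follows. This is a minimal sketch, not a definitive
procedure. It assumes pstorefs is mounted at its conventional location,
/sys/fs/pstore, and that systemd-pstore archives records under
/var/lib/systemd/pstore/ as mentioned in the message. The helper name
list_pstore_records is hypothetical, and reading these directories may
require root.

```shell
#!/bin/sh
# List crash records found in each directory given as an argument.
# Directories that do not exist are skipped; prints one path per record
# and prints nothing when no records are found.
list_pstore_records() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -type f 2>/dev/null
    done
}

# Check the live pstore filesystem and the systemd archive directory.
records=$(list_pstore_records /sys/fs/pstore /var/lib/systemd/pstore)
if [ -n "$records" ]; then
    printf '%s\n' "$records"
else
    echo "no pstore records found (may need root, or pstore is not set up)"
fi
```

If records do show up after a failed boot, they can be read with any pager;
an empty result only means nothing was captured, not that the crash did not
happen.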


* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  4:30         ` Kees Cook
@ 2023-07-03  4:38           ` Guenter Roeck
  2023-07-03  4:53             ` Kees Cook
  2023-07-03  4:50           ` Mirsad Goran Todorovac
  1 sibling, 1 reply; 27+ messages in thread
From: Guenter Roeck @ 2023-07-03  4:38 UTC (permalink / raw)
  To: Kees Cook, Mirsad Goran Todorovac
  Cc: Bagas Sanjaya, Linux Kernel Mailing List, Linux LLVM,
	linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

On 7/2/23 21:30, Kees Cook wrote:
> On Mon, Jul 03, 2023 at 05:53:48AM +0200, Mirsad Goran Todorovac wrote:
>> On 7/3/23 05:26, Guenter Roeck wrote:
>>> On 7/2/23 20:20, Kees Cook wrote:
>>>> On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
>>>>> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
>>>>>> Hi,
>>>>>>
>>>>>> After new git pull the kernel in Torvalds tree with default debug config
>>>>>> failed to boot with error that occurs prior to mounting filesystems, so there
>>>>>> is no log safe for the screenshot(s) here:
>>>>>>
>>>>>> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
>>>>>>
>>>>>> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
>>>>>>
>>>>>> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
>>>>>> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
>>>>>> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
>>>>>> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
>>>>>> .
>>>>>> .
>>>>>> .
>>>>>> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>>> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
>>>>>> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>>>
>>>>>> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
>>>>>
>>>>> Can you show early kernel log (something like dmesg)?
>>>>>
>>>>> Anyway, I'm adding it to regzbot:
>>>>>
>>>>> #regzbot ^introduced: 2d47c6956ab3c8
>>>>> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
>>>>
>>>> I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
>>>> tree... it's only in Linus's ToT.
>>>>
>>>
>>> In ToT:
>>>
>>> $ git describe 2d47c6956ab3
>>> v6.4-rc2-1-g2d47c6956ab3
>>>
>>> $ git describe --contains 2d47c6956ab3
>>> next-20230616~2^2~51
>>> $ git describe --contains --match 'v*' 2d47c6956ab3
>>> fatal: cannot describe '2d47c6956ab3c8b580a59d7704aab3e2a4882b6c'
>>>
>>> "git describe" always shows the parent tree, which I guess was based on
>>> v6.4-rc2.
>>>
>>> Guenter
>>>
>>>
>>>> Also, the config you included does not show CONFIG_UBSAN_BOUNDS_STRICT
>>>> as even being available, much less present. Something seems very wrong
>>>> with this report...
>>>>
>>>> -Kees
>>
>> Anyway, I have double checked and linux-image-6.4.0-rc2-crash boots while
>> linux-image-6.4.0-rc2-crash-00001-g2d47c6956ab3 freezes in early boot.
> 
> I don't understand what tree you're testing. 2d47c6956ab3 is only in
> Linus's latest tree, which is not 6.4-rc2.
> 

Maybe this ?

$ git checkout -b testing 2d47c6956ab3
Updating files: 100% (15501/15501), done.
Switched to a new branch 'testing'
groeck@server:~/src/linux-staging$ git describe
v6.4-rc2-1-g2d47c6956ab3

Guenter

> If you're testing Linus's tree, and you're bisecting to 2d47c6956ab3,
> I don't understand why the .config you sent doesn't include
> CONFIG_UBSAN_BOUNDS_STRICT (which was introduced by that commit) --
> it should be visible whether or not it is selected.
> 
>> Of course, in the next boot dmesg appears overwritten ... I could provide
>> only the first screen screenshots.
> 
> Without CONFIG_UBSAN_TRAP, I would not expect anything other than a
> warning (i.e. boot would continue).
> 
> The only other thing I can think of that seems related (the backtrace
> appears to show usb), might be this:
> https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
> which won't appear until after v6.5-rc1.
> 
>> The difference is only one commit.
>>
>> It is a bit strange so I am available for any additional diagnostics.
> 
> Thanks! Can you send "grep UBSAN .config" output for the crashing kernel?
> 
> Are you booting on an EFI-capable machine? If you could configure pstore
> to use the EFI-vars backend, you can capture the crash in EFI and
> pstorefs will show it after the next boot. (If you're using systemd,
> this all may already be happening -- check /var/lib/systemd/pstore/
> or see[1] for more details.)
> 
> -Kees
> 
> [1] https://www.freedesktop.org/software/systemd/man/systemd-pstore.service.html
> 



* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  4:30         ` Kees Cook
  2023-07-03  4:38           ` Guenter Roeck
@ 2023-07-03  4:50           ` Mirsad Goran Todorovac
  1 sibling, 0 replies; 27+ messages in thread
From: Mirsad Goran Todorovac @ 2023-07-03  4:50 UTC (permalink / raw)
  To: Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

[-- Attachment #1: Type: text/plain, Size: 4542 bytes --]

On 7/3/23 06:30, Kees Cook wrote:
> On Mon, Jul 03, 2023 at 05:53:48AM +0200, Mirsad Goran Todorovac wrote:
>> On 7/3/23 05:26, Guenter Roeck wrote:
>>> On 7/2/23 20:20, Kees Cook wrote:
>>>> On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
>>>>> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
>>>>>> Hi,
>>>>>>
>>>>>> After new git pull the kernel in Torvalds tree with default debug config
>>>>>> failed to boot with error that occurs prior to mounting filesystems, so there
>>>>>> is no log safe for the screenshot(s) here:
>>>>>>
>>>>>> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
>>>>>>
>>>>>> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
>>>>>>
>>>>>> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
>>>>>> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
>>>>>> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
>>>>>> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
>>>>>> .
>>>>>> .
>>>>>> .
>>>>>> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>>> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
>>>>>> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>>>
>>>>>> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
>>>>>
>>>>> Can you show early kernel log (something like dmesg)?
>>>>>
>>>>> Anyway, I'm adding it to regzbot:
>>>>>
>>>>> #regzbot ^introduced: 2d47c6956ab3c8
>>>>> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
>>>>
>>>> I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
>>>> tree... it's only in Linus's ToT.
>>>>
>>>
>>> In ToT:
>>>
>>> $ git describe 2d47c6956ab3
>>> v6.4-rc2-1-g2d47c6956ab3
>>>
>>> $ git describe --contains 2d47c6956ab3
>>> next-20230616~2^2~51
>>> $ git describe --contains --match 'v*' 2d47c6956ab3
>>> fatal: cannot describe '2d47c6956ab3c8b580a59d7704aab3e2a4882b6c'
>>>
>>> "git describe" always shows the parent tree, which I guess was based on
>>> v6.4-rc2.
>>>
>>> Guenter
>>>
>>>
>>>> Also, the config you included does not show CONFIG_UBSAN_BOUNDS_STRICT
>>>> as even being available, much less present. Something seems very wrong
>>>> with this report...
>>>>
>>>> -Kees
>>
>> Anyway, I have double checked and linux-image-6.4.0-rc2-crash boots while
>> linux-image-6.4.0-rc2-crash-00001-g2d47c6956ab3 freezes in early boot.
> 
> I don't understand what tree you're testing. 2d47c6956ab3 is only in
> Linus's latest tree, which is not 6.4-rc2.

> If you're testing Linus's tree, and you're bisecting to 2d47c6956ab3,
> I don't understand why the .config you sent doesn't include
> CONFIG_UBSAN_BOUNDS_STRICT (which was introduced by that commit) --
> it should be visible whether or not it is selected.

Hi, Mr. Cook,

I have cloned again from Torvalds' tree, and rebuilt both kernels with the config
attached.

linux-image-6.4.0-rc2-crash2 again boots, and linux-image-6.4.0-rc2-crash2-00001-g2d47c6956ab3
crashes during early boot. There is nothing from the -00001-g2d47c6956ab3 kernel in the
logs.

It is this very config and vanilla Torvalds tree from
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Hope this helps.

Best regards,
Mirsad Todorovac

>> Of course, in the next boot dmesg appears overwritten ... I could provide
>> only the first screen screenshots.
> 
> Without CONFIG_UBSAN_TRAP, I would not expect anything other than a
> warning (i.e. boot would continue).
> 
> The only other thing I can think of that seems related (the backtrace
> appears to show usb), might be this:
> https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
> which won't appear until after v6.5-rc1.
> 
>> The difference is only one commit.
>>
>> It is a bit strange so I am available for any additional diagnostics.
> 
> Thanks! Can you send "grep UBSAN .config" output for the crashing kernel?
> 
> Are you booting on an EFI-capable machine? If you could configure pstore
> to use the EFI-vars backend, you can capture the crash in EFI and
> pstorefs will show it after the next boot. (If you're using systemd,
> this all may already be happening -- check /var/lib/systemd/pstore/
> or see[1] for more details.)
> 
> -Kees
> 
> [1] https://www.freedesktop.org/software/systemd/man/systemd-pstore.service.html
> 

[-- Attachment #2: config-6.4.0-rc2-crash2-00001-g2d47c6956ab3.xz --]
[-- Type: application/x-xz, Size: 57688 bytes --]
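
Note (not part of the original message): the "grep UBSAN .config" request
earlier in the thread distinguishes between an option that is enabled, one
that is present but disabled, and one that does not appear at all. A .config
typically records a visible but disabled option as a "# CONFIG_FOO is not set"
comment, so a plain grep for "CONFIG_FOO=" would miss it. The sketch below
reports which of the three cases applies. The helper name kconfig_state and
the demo file contents are made up for illustration; they are not taken from
the attached configs.

```shell
#!/bin/sh
# Report how a Kconfig symbol appears in a .config file.
# Prints "set", "not set", or "absent" (no mention of the symbol at all).
kconfig_state() {
    config_file=$1
    symbol=$2
    if grep -q "^${symbol}=" "$config_file"; then
        echo "set"
    elif grep -q "^# ${symbol} is not set\$" "$config_file"; then
        echo "not set"
    else
        echo "absent"
    fi
}

# Demo on a throwaway file so the sketch runs anywhere.
demo_config=$(mktemp)
cat > "$demo_config" <<'EOF'
CONFIG_UBSAN=y
CONFIG_UBSAN_BOUNDS=y
# CONFIG_UBSAN_TRAP is not set
EOF

for sym in CONFIG_UBSAN_BOUNDS CONFIG_UBSAN_TRAP CONFIG_UBSAN_BOUNDS_STRICT; do
    printf '%s: %s\n' "$sym" "$(kconfig_state "$demo_config" "$sym")"
done
rm -f "$demo_config"
```

On a real build tree the same function would be pointed at the .config of
the crashing kernel rather than at the demo file.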


* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  4:38           ` Guenter Roeck
@ 2023-07-03  4:53             ` Kees Cook
  0 siblings, 0 replies; 27+ messages in thread
From: Kees Cook @ 2023-07-03  4:53 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mirsad Goran Todorovac, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

On Sun, Jul 02, 2023 at 09:38:50PM -0700, Guenter Roeck wrote:
> On 7/2/23 21:30, Kees Cook wrote:
> > I don't understand what tree you're testing. 2d47c6956ab3 is only in
> > Linus's latest tree, which is not 6.4-rc2.
> > 
> 
> Maybe this ?
> 
> $ git checkout -b testing 2d47c6956ab3
> Updating files: 100% (15501/15501), done.
> Switched to a new branch 'testing'
> groeck@server:~/src/linux-staging$ git describe
> v6.4-rc2-1-g2d47c6956ab3

Oh, it's the bisection position -- 2d47c6956ab3 was based on v6.4-rc2.
Got it. Thank you!
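
For anyone else tripping over this: the describe string reads as
<nearest tag>-<commits on top>-g<abbreviated hash>, so
v6.4-rc2-1-g2d47c6956ab3 is one commit on top of v6.4-rc2, not a 6.4-rc2
build. A minimal sketch of pulling the three fields apart in plain POSIX
shell (parse_describe is a made-up helper name, not a git command):

```shell
# Split a `git describe` string of the form <tag>-<count>-g<abbrev-hash>.
# parse_describe is a hypothetical helper for illustration only.
parse_describe() {
    desc=$1
    hash=${desc##*-g}     # everything after the last "-g"
    rest=${desc%-g*}      # drop the trailing "-g<hash>"
    count=${rest##*-}     # number of commits on top of the tag
    tag=${rest%-*}        # nearest tag reachable from the commit
    printf 'tag=%s commits_on_top=%s commit=%s\n' "$tag" "$count" "$hash"
}

parse_describe v6.4-rc2-1-g2d47c6956ab3
```

The tag only says where the branch forked from, which is why the kernel
built from it still calls itself 6.4.0-rc2.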

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  3:58       ` Guenter Roeck
  2023-07-03  5:18         ` Mirsad Goran Todorovac
@ 2023-07-03  5:18         ` Mirsad Goran Todorovac
  2023-07-03  5:41           ` Kees Cook
  1 sibling, 1 reply; 27+ messages in thread
From: Mirsad Goran Todorovac @ 2023-07-03  5:18 UTC (permalink / raw)
  To: Guenter Roeck, Kees Cook, Bagas Sanjaya
  Cc: Linux Kernel Mailing List, Linux LLVM, linux-kbuild,
	Linux Regressions, Nathan Chancellor, Nick Desaulniers

On 7/3/23 05:58, Guenter Roeck wrote:
> On 7/2/23 20:26, Guenter Roeck wrote:
>> On 7/2/23 20:20, Kees Cook wrote:
>>> On Mon, Jul 03, 2023 at 08:44:37AM +0700, Bagas Sanjaya wrote:
>>>> On Sun, Jul 02, 2023 at 06:36:12PM +0200, Mirsad Goran Todorovac wrote:
>>>>> Hi,
>>>>>
>>>>> After new git pull the kernel in Torvalds tree with default debug config
>>>>> failed to boot with error that occurs prior to mounting filesystems, so there
>>>>> is no log safe for the screenshot(s) here:
>>>>>
>>>>> [1] https://domac.alu.unizg.hr/~mtodorov/linux/crashes/2023-07-02/
>>>>>
>>>>> Bisect shows the first bad commit is 2d47c6956ab3 (v6.4-rc2-1-g2d47c6956ab3):
>>>>>
>>>>> # good: [98be618ad03010b1173fc3c35f6cbb4447ee2b07] Merge tag 'Smack-for-6.5' of https://github.com/cschaufler/smack-next
>>>>> git bisect good 98be618ad03010b1173fc3c35f6cbb4447ee2b07
>>>>> # bad: [f4a0659f823e5a828ea2f45b4849ea8e2dd2984c] drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
>>>>> git bisect bad f4a0659f823e5a828ea2f45b4849ea8e2dd2984c
>>>>> .
>>>>> .
>>>>> .
>>>>> # bad: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>> git bisect bad 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c
>>>>> # first bad commit: [2d47c6956ab3c8b580a59d7704aab3e2a4882b6c] ubsan: Tighten UBSAN_BOUNDS on GCC
>>>>>
>>>>> The architecture is Ubuntu 22.04 with lshw and config give in the attachment.
>>>>
>>>> Can you show early kernel log (something like dmesg)?
>>>>
>>>> Anyway, I'm adding it to regzbot:
>>>>
>>>> #regzbot ^introduced: 2d47c6956ab3c8
>>>> #regzbot title: Linux kernel fails to boot due to UBSAN_BOUNDS tightening
>>>
>>> I'm confused. Commit 2d47c6956ab3c8b580a59d7704aab3e2a4882b6c isn't in the v6.4
>>> tree... it's only in Linus's ToT.
>>>
>>
>> In ToT:
>>
>> $ git describe 2d47c6956ab3
>> v6.4-rc2-1-g2d47c6956ab3
>>
>> $ git describe --contains 2d47c6956ab3
>> next-20230616~2^2~51
>> $ git describe --contains --match 'v*' 2d47c6956ab3
>> fatal: cannot describe '2d47c6956ab3c8b580a59d7704aab3e2a4882b6c'
>>
>> "git describe" always shows the parent tree, which I guess was based on
>> v6.4-rc2.
>>
> 
> Ah, sorry, I didn't realize that the subject claims that the problem
> would be in 6.4.1. That indeed does not match the bisect results.

I apologise for the confusion. I cloned the Torvalds tree after 6.4.1 was
released; it was the Torvalds tree itself, not 6.4.1 from the stable
branch, as the Subject line may have suggested.

But I think the text explained that the Torvalds tree was cloned
and the method:

] After new git pull the kernel in Torvalds tree with default debug config
] failed to boot with error that occurs prior to mounting filesystems, so there
] is no log safe for the screenshot(s) here:

I will try to be more consistent and precise the next time.

Sorry again for the confusion.

I am right now cloning directly from the Torvalds tree for the third time
and with the Ubuntu generic production kernel and the result is the same:
crash in boot for 2d47c6956ab3.

Best regards,
Mirsad Todorovac

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  5:18         ` Mirsad Goran Todorovac
@ 2023-07-03  5:41           ` Kees Cook
  2023-07-03  7:03             ` Mirsad Goran Todorovac
  0 siblings, 1 reply; 27+ messages in thread
From: Kees Cook @ 2023-07-03  5:41 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

On Mon, Jul 03, 2023 at 07:18:57AM +0200, Mirsad Goran Todorovac wrote:
> I apologise for confusion. In fact, I have cloned the Torvalds tree after
> 6.4.1 was released, but I actually cloned the Torvalds tree, not the 6.4.1
> from the stable branch as the Subject line might have misled.

Thanks, no worries! I got myself confused too. :)

The config you sent looks like I'd expect now too. Questions for you, if
you have time to diagnose further:

- Are you able to catch the very beginning of the crash, where the Oops
  starts?

- Does pstore work for you to catch the crash?

- Can you try booting with this patch applied?
  https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/

I'll try to see if I can figure out anything more from the images you
posted.

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  5:41           ` Kees Cook
@ 2023-07-03  7:03             ` Mirsad Goran Todorovac
  2023-07-03 19:03               ` Kees Cook
  0 siblings, 1 reply; 27+ messages in thread
From: Mirsad Goran Todorovac @ 2023-07-03  7:03 UTC (permalink / raw)
  To: Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers

On 3.7.2023. 7:41, Kees Cook wrote:
> On Mon, Jul 03, 2023 at 07:18:57AM +0200, Mirsad Goran Todorovac wrote:
>> I apologise for confusion. In fact, I have cloned the Torvalds tree after
>> 6.4.1 was released, but I actually cloned the Torvalds tree, not the 6.4.1
>> from the stable branch as the Subject line might have misled.
> 
> Thanks, no worries! I got myself confused too. :)
> 
> The config you sent looks like I'd expect now too. Questions for you, if
> you have time to diagnose further:
> 
> - Are you able to catch the very beginning of the crash, where the Oops
>    starts?

It scrolls up very quickly. Couldn't catch that with the camera.

> - Does pstore work for you to catch the crash?

Haven't tried that yet. I will have to do some homework.

> - Can you try booting with this patch applied?
>    https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/

Sure, but after 4 PM UTC+02 I suppose.

> I'll try to see if I can figure out anything more from the images you
> posted.

I really couldn't figure out myself what went wrong with this one.

Best regards,
Mirsad Todorovac



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03  7:03             ` Mirsad Goran Todorovac
@ 2023-07-03 19:03               ` Kees Cook
  2023-07-03 23:09                 ` Kees Cook
  0 siblings, 1 reply; 27+ messages in thread
From: Kees Cook @ 2023-07-03 19:03 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

On Mon, Jul 03, 2023 at 09:03:38AM +0200, Mirsad Goran Todorovac wrote:
> On 3.7.2023. 7:41, Kees Cook wrote:
> > On Mon, Jul 03, 2023 at 07:18:57AM +0200, Mirsad Goran Todorovac wrote:
> > > I apologise for confusion. In fact, I have cloned the Torvalds tree after
> > > 6.4.1 was released, but I actually cloned the Torvalds tree, not the 6.4.1
> > > from the stable branch as the Subject line might have misled.
> > 
> > Thanks, no worries! I got myself confused too. :)
> > 
> > The config you sent looks like I'd expect now too. Questions for you, if
> > you have time to diagnose further:
> > 
> > - Are you able to catch the very beginning of the crash, where the Oops
> >    starts?
> 
> It scrolls up very quickly. Couldn't catch that with the camera.
> 
> > - Does pstore work for you to catch the crash?
> 
> Haven't tried that yet. I will have to do some homework.

Try adding this to the .config:

# Enable PSTORE support
CONFIG_PSTORE=y
CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
CONFIG_PSTORE_COMPRESS=y
CONFIG_PSTORE_DEFLATE_COMPRESS=y
# Enable UEFI pstore backend
CONFIG_EFI_VARS_PSTORE=y
# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set
# Enable ACPI ERST pstore backend
CONFIG_ACPI=y
CONFIG_ACPI_APEI=y

A good write-up about using it is here:
https://blogs.oracle.com/linux/post/pstore-linux-kernel-persistent-storage-file-system
and covers the systemd-pstore details too. Note that in the config I
suggested, I've enabled the efi backend by default.
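
Once a crash has been captured, something like the following should list
whatever records survived. This is only a sketch: list_pstore_records is
a hypothetical helper, /sys/fs/pstore is the usual pstorefs mount point,
and /var/lib/systemd/pstore is where systemd-pstore archives records if
it is running, so adjust the paths to your setup.

```shell
# Print every regular file found under the given directories.
# Directories that do not exist are skipped, and unreadable entries
# (reading pstore usually needs root) are ignored.
list_pstore_records() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -type f 2>/dev/null || :
    done
}

list_pstore_records /sys/fs/pstore /var/lib/systemd/pstore
```

No output means nothing was stored (or it was already cleaned up), in
which case the backend setup is the next thing to check.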

> > - Can you try booting with this patch applied?
> >    https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
> 
> Sure, but after 4 PM UTC+02 I suppose.

Cool. xhci-hub is in your backtrace, and the above patch was made for
something very similar (though, again, I don't see why you're getting a
_crash_, it should _warn_ and continue normally). And, actually, also
include this patch:
https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@kernel.org/

> > I'll try to see if I can figure out anything more from the images you
> > posted.

Yeah, the xhci-hub bit is the only clue I can see here. It's also in the
IRQ handler, which reminds me of this bug, where we still don't have a
root cause for the _crash_ during the warning:
https://lore.kernel.org/oe-lkp/202306131354.A499DE60@keescook/
but the new patch I linked to above fixes the source of the warning.

> I really couldn't figure out myself what went wrong with this one?

Having the crash scroll off the page is pretty frustrating. I wonder if
the kernel crash handler could be changed to repeat the RIP at the end of
the crash...

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03 19:03               ` Kees Cook
@ 2023-07-03 23:09                 ` Kees Cook
  2023-07-04 17:20                   ` Mirsad Todorovac
  0 siblings, 1 reply; 27+ messages in thread
From: Kees Cook @ 2023-07-03 23:09 UTC (permalink / raw)
  To: Mirsad Goran Todorovac
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
> Cool. xhci-hub is in your backtrace, and the above patch was made for
> something very similar (though, again, I don't see why you're getting a
> _crash_, it should _warn_ and continue normally). And, actually, also
> include this patch:
> https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@kernel.org/

This is now in Linus's tree:
09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")

Please also still try with the first patch I mentioned, which is very similar:
https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-03 23:09                 ` Kees Cook
@ 2023-07-04 17:20                   ` Mirsad Todorovac
  2023-07-04 21:36                     ` Kees Cook
  0 siblings, 1 reply; 27+ messages in thread
From: Mirsad Todorovac @ 2023-07-04 17:20 UTC (permalink / raw)
  To: Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

On 7/4/23 01:09, Kees Cook wrote:
> On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
>> Cool. xhci-hub is in your backtrace, and the above patch was made for
>> something very similar (though, again, I don't see why you're getting a
>> _crash_, it should _warn_ and continue normally). And, actually, also
>> include this patch:
>> https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@kernel.org/
> 
> This is now in Linus's tree:
> 09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")
> 
> Please also still try with the first patch I mentioned, which is very similar:
> https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/

Hi,

I have finally built with both patches (the recommended PSTORE settings
were already the default).

This second patch fixes the booting problem, but alas there is still a problem -
all Wayland and X11.org GUI applications fail to start, with errors like this one:

Jul  4 19:09:07 defiant kernel: [   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
Jul  4 19:09:07 defiant kernel: [   40.529723] CPU: 0 PID: 3492 Comm: thunderbird Not tainted 6.4.0-rc2-crash2-kees2-00001-g2d47c6956ab3-dirty #5
Jul  4 19:09:07 defiant kernel: [   40.529725] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
Jul  4 19:09:07 defiant kernel: [   40.529726] RIP: 0010:alloc_pid+0x46c/0x480
Jul  4 19:09:07 defiant kernel: [   40.529730] Code: 00 92 49 c7 c4 f4 ff ff ff e8 50 bc 15 01 4c 89 ff e8 68 50 13 00 e9 ec fd ff ff be 02 00 00 00 e8 89 5f 71 00 e9 f8 fe ff ff <0f> 0b 49 c7 c4 f4 ff ff ff e9 b9 fb ff ff 66 0f 1f 44 00 00 90 90
Jul  4 19:09:07 defiant kernel: [   40.529731] RSP: 0018:ffffad8c45313c48 EFLAGS: 00010202
Jul  4 19:09:07 defiant kernel: [   40.529733] RAX: 0000000080000000 RBX: 0000000000000001 RCX: 0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.529734] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.529734] RBP: ffffad8c45313c98 R08: 0000000000000000 R09: 0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.529735] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9cbdff1c63a8
Jul  4 19:09:07 defiant kernel: [   40.529735] R13: ffff9cbde9b08750 R14: 0000000000000001 R15: ffff9cbdff1c63a8
Jul  4 19:09:07 defiant kernel: [   40.529736] FS:  00007f50d863e780(0000) GS:ffff9ccc97a00000(0000) knlGS:0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.529737] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul  4 19:09:07 defiant kernel: [   40.529737] CR2: 0000000000000000 CR3: 00000001b0ae0000 CR4: 0000000000750ef0
Jul  4 19:09:07 defiant kernel: [   40.529738] PKRU: 55555554
Jul  4 19:09:07 defiant kernel: [   40.529739] Call Trace:
Jul  4 19:09:07 defiant kernel: [   40.529739]  <TASK>
Jul  4 19:09:07 defiant kernel: [   40.529741]  copy_process+0x165f/0x2110
Jul  4 19:09:07 defiant kernel: [   40.529744]  kernel_clone+0x9d/0x3a0
Jul  4 19:09:07 defiant kernel: [   40.529745]  ? find_held_lock+0x31/0xa0
Jul  4 19:09:07 defiant kernel: [   40.529747]  ? mntput_no_expire+0x89/0x4f0
Jul  4 19:09:07 defiant kernel: [   40.529749]  ? lock_release+0xc4/0x270
Jul  4 19:09:07 defiant kernel: [   40.529751]  __do_sys_clone+0x66/0xa0
Jul  4 19:09:07 defiant kernel: [   40.529754]  __x64_sys_clone+0x25/0x40
Jul  4 19:09:07 defiant kernel: [   40.529755]  do_syscall_64+0x59/0x90
Jul  4 19:09:07 defiant kernel: [   40.529758]  ? syscall_exit_to_user_mode+0x39/0x60
Jul  4 19:09:07 defiant kernel: [   40.529760]  ? do_syscall_64+0x69/0x90
Jul  4 19:09:07 defiant kernel: [   40.529761]  ? irqentry_exit_to_user_mode+0x27/0x40
Jul  4 19:09:07 defiant kernel: [   40.529762]  ? irqentry_exit+0x77/0xb0
Jul  4 19:09:07 defiant kernel: [   40.529764]  ? exc_page_fault+0xae/0x240
Jul  4 19:09:07 defiant kernel: [   40.529765]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
Jul  4 19:09:07 defiant kernel: [   40.529767] RIP: 0033:0x7f50d811ea3d
Jul  4 19:09:07 defiant kernel: [   40.529769] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48
Jul  4 19:09:07 defiant kernel: [   40.529770] RSP: 002b:00007ffcc449ce58 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
Jul  4 19:09:07 defiant kernel: [   40.529771] RAX: ffffffffffffffda RBX: 0000000000000051 RCX: 00007f50d811ea3d
Jul  4 19:09:07 defiant kernel: [   40.529771] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000030000011
Jul  4 19:09:07 defiant kernel: [   40.529772] RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f50d82b97c0
Jul  4 19:09:07 defiant kernel: [   40.529772] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000011
Jul  4 19:09:07 defiant kernel: [   40.529773] R13: 00007f50d7e16980 R14: 00007f50d863e6c0 R15: 00007f50d82ba3c0
Jul  4 19:09:07 defiant kernel: [   40.529775]  </TASK>
Jul  4 19:09:07 defiant kernel: [   40.529776] Modules linked in: binfmt_misc f2fs crc32_generic lz4hc_compress lz4_compress nls_iso8859_1 intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi edac_mce_amd crct10dif_pclmul snd_hda_intel polyval_clmulni snd_intel_dspcfg polyval_generic ghash_clmulni_intel snd_intel_sdw_acpi snd_seq_midi sha512_ssse3 snd_seq_midi_event snd_hda_codec aesni_intel snd_hda_core crypto_simd cryptd snd_hwdep joydev input_leds snd_rawmidi rapl amdgpu snd_pcm ccp wmi_bmof snd_seq k10temp snd_seq_device iommu_v2 snd_timer drm_buddy gpu_sched drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec snd drm_kms_helper i2c_algo_bit syscopyarea sysfillrect sysimgblt soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone fuse efi_pstore drm ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic nvme nvme_core ahci xhci_pci i2c_piix4 crc32_pclmul nvme_common libahci xhci_pci_renesas r8169 realtek video wmi
Jul  4 19:09:07 defiant kernel: [   40.529799]  gpio_amdpt
Jul  4 19:09:07 defiant kernel: [   40.529801] ---[ end trace 0000000000000000 ]---
Jul  4 19:09:07 defiant kernel: [   40.865489] RIP: 0010:alloc_pid+0x46c/0x480
Jul  4 19:09:07 defiant kernel: [   40.865491] Code: 00 92 49 c7 c4 f4 ff ff ff e8 50 bc 15 01 4c 89 ff e8 68 50 13 00 e9 ec fd ff ff be 02 00 00 00 e8 89 5f 71 00 e9 f8 fe ff ff <0f> 0b 49 c7 c4 f4 ff ff ff e9 b9 fb ff ff 66 0f 1f 44 00 00 90 90
Jul  4 19:09:07 defiant kernel: [   40.865492] RSP: 0018:ffffad8c45313c48 EFLAGS: 00010202
Jul  4 19:09:07 defiant kernel: [   40.865494] RAX: 0000000080000000 RBX: 0000000000000001 RCX: 0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.865495] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.865495] RBP: ffffad8c45313c98 R08: 0000000000000000 R09: 0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.865496] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9cbdff1c63a8
Jul  4 19:09:07 defiant kernel: [   40.865497] R13: ffff9cbde9b08750 R14: 0000000000000001 R15: ffff9cbdff1c63a8
Jul  4 19:09:07 defiant kernel: [   40.865497] FS:  00007f50d863e780(0000) GS:ffff9ccc97a00000(0000) knlGS:0000000000000000
Jul  4 19:09:07 defiant kernel: [   40.865498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul  4 19:09:07 defiant kernel: [   40.865499] CR2: 0000000000000000 CR3: 00000001b0ae0000 CR4: 0000000000750ef0
Jul  4 19:09:07 defiant kernel: [   40.865500] PKRU: 55555554

The interpretation of these findings is beyond the scope of my knowledge.

I hope you can make any use of them.

Best regards,
Mirsad Todorovac

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-04 17:20                   ` Mirsad Todorovac
@ 2023-07-04 21:36                     ` Kees Cook
  2023-07-04 23:15                       ` Mirsad Todorovac
  0 siblings, 1 reply; 27+ messages in thread
From: Kees Cook @ 2023-07-04 21:36 UTC (permalink / raw)
  To: Mirsad Todorovac, Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
>On 7/4/23 01:09, Kees Cook wrote:
>> On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
>>> Cool. xhci-hub is in your backtrace, and the above patch was made for
>>> something very similar (though, again, I don't see why you're getting a
>>> _crash_, it should _warn_ and continue normally). And, actually, also
>>> include this patch:
>>> https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@kernel.org/
>> 
>> This is now in Linus's tree:
>> 09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")
>> 
>> Please also still try with the first patch I mentioned, which is very similar:
>> https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
>
>Hi,
>
>I have finally built w both patches (and recommended PSTORE settings were
>default already).

Were you able to find the crashes saved by pstore?

>
>This second patch fixes the booting problem, but alas there is still a problem -

Ah! That's great! There is still an unexpected crash source, but the trigger is fixed.

>all Wayland and X11.org GUI applications fail to start, with errors like this one:
>
>Jul  4 19:09:07 defiant kernel: [   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI

Hmm, is CONFIG_UBSAN_TRAP set?
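
(Asking because the faulting bytes in the Code: line of your report are
"<0f> 0b", which is ud2 on x86. That is what a compiler-emitted trap
looks like, although BUG() ends up as ud2 as well, so it is a hint rather
than proof. A rough sketch for pulling the marked bytes out of a Code:
line; faulting_bytes is just an illustrative helper name.)

```shell
# Print the two bytes starting at the faulting instruction. The oops
# "Code:" line marks the first faulting byte by wrapping it in <>.
# faulting_bytes is a hypothetical helper for illustration only.
faulting_bytes() {
    printf '%s\n' "$1" |
        sed -n 's/.*<\([0-9a-f][0-9a-f]\)> \([0-9a-f][0-9a-f]\).*/\1 \2/p'
}

# Tail of the Code: line from the alloc_pid report.
code='e9 f8 fe ff ff <0f> 0b 49 c7 c4 f4 ff ff ff'
if [ "$(faulting_bytes "$code")" = "0f 0b" ]; then
    echo "faulting instruction is ud2 (trap)"
fi
```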

>Jul  4 19:09:07 defiant kernel: [   40.529726] RIP: 0010:alloc_pid+0x46c/0x480

Hmm, is this patch in your kernel?
https://git.kernel.org/linus/b69f0aeb068980af983d399deafc7477cec8bc04


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-04 21:36                     ` Kees Cook
@ 2023-07-04 23:15                       ` Mirsad Todorovac
  2023-07-05  2:09                         ` Kees Cook
  0 siblings, 1 reply; 27+ messages in thread
From: Mirsad Todorovac @ 2023-07-04 23:15 UTC (permalink / raw)
  To: Kees Cook, Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

[-- Attachment #1: Type: text/plain, Size: 5876 bytes --]

On 7/4/23 23:36, Kees Cook wrote:
> On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
>> On 7/4/23 01:09, Kees Cook wrote:
>>> On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
>>>> Cool. xhci-hub is in your backtrace, and the above patch was made for
>>>> something very similar (though, again, I don't see why you're getting a
>>>> _crash_, it should _warn_ and continue normally). And, actually, also
>>>> include this patch:
>>>> https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@kernel.org/
>>>
>>> This is now in Linus's tree:
>>> 09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")
>>>
>>> Please also still try with the first patch I mentioned, which is very similar:
>>> https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
>>
>> Hi,
>>
>> I have finally built w both patches (and recommended PSTORE settings were
>> default already).
> 
> Were you able to find the crashes saved by pstore?

No, only lkdtm and invalid opcode crashes ...

P.S.

Actually, I have recovered some pstore records. Please find them in the attachment:

>> This second patch fixes the booting problem, but alas there is still a problem -
> 
> Ah! That's great! There is still an unexpected crash source, but the trigger is fixed.

Glad I could be of help.

>> all Wayland and X11.org GUI applications fail to start, with errors like this one:
>>
>> Jul  4 19:09:07 defiant kernel: [   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> 
> Hmm, is CONFIG_UBSAN_TRAP set?

marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
CONFIG_UBSAN_TRAP=y
marvin@defiant:~/linux/kernel/linux_torvalds$

>> Jul  4 19:09:07 defiant kernel: [   40.529726] RIP: 0010:alloc_pid+0x46c/0x480
> 
> Hmm, is this patch in your kernel?
> https://git.kernel.org/linus/b69f0aeb068980af983d399deafc7477cec8bc04

No, it wasn't. I had only these:

marvin@defiant:~/linux/kernel/linux_torvalds$ more ../kees-[12].patch
::::::::::::::
../kees-1.patch
::::::::::::::
diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
index b17e3a21b15f..82ec6af71a1d 100644
--- a/include/uapi/linux/usb/ch9.h
+++ b/include/uapi/linux/usb/ch9.h
@@ -376,7 +376,10 @@ struct usb_string_descriptor {
  	__u8  bLength;
  	__u8  bDescriptorType;
  
-	__le16 wData[1];		/* UTF-16LE encoded */
+	union {
+		__le16 legacy_padding;
+		__DECLARE_FLEX_ARRAY(__le16, wData);	/* UTF-16LE encoded */
+	};
  } __attribute__ ((packed));
  
  /* note that "string" zero is special, it holds language codes that
::::::::::::::
../kees-2.patch
::::::::::::::
diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
index b17e3a21b15f..3ff98c7ba7e3 100644
--- a/include/uapi/linux/usb/ch9.h
+++ b/include/uapi/linux/usb/ch9.h
@@ -981,7 +981,11 @@ struct usb_ssp_cap_descriptor {
  #define USB_SSP_MIN_RX_LANE_COUNT		(0xf << 8)
  #define USB_SSP_MIN_TX_LANE_COUNT		(0xf << 12)
  	__le16 wReserved;
-	__le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
+	union {
+		__le32 legacy_padding;
+		/* list of sublink speed attrib entries */
+		__DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
+	};
  #define USB_SSP_SUBLINK_SPEED_SSID	(0xf)		/* sublink speed ID */
  #define USB_SSP_SUBLINK_SPEED_LSE	(0x3 << 4)	/* Lanespeed exponent */
  #define USB_SSP_SUBLINK_SPEED_LSE_BPS		0
marvin@defiant:~/linux/kernel/linux_torvalds$

---------------------------------------------------------

Now it works. Boot succeeds and X apps run with a freshly pulled Torvalds
tree plus kees-2.patch.

Praise God!

This is the git log --oneline:

d528014517f2 (HEAD, origin/master, origin/HEAD) Revert ".gitignore: ignore *.cover and *.mbx"
04f2933d375e Merge tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue
03275585cabd afs: Fix accidental truncation when storing data
538140ca602b Merge tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
94c76955e86a Merge tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
ccf46d853183 Merge tag 'pm-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
b869e9f49964 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
406fb9eb198a Merge tag 'firewire-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
f1962207150c module: fix init_module_from_file() error handling
40c565a429d7 Merge branches 'pm-cpufreq' and 'pm-cpuidle'
f679e89acdd3 clk: tegra: Avoid calling an uninitialized function

So, the included patch is:

marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
index 82ec6af71a1d..62d318377379 100644
--- a/include/uapi/linux/usb/ch9.h
+++ b/include/uapi/linux/usb/ch9.h
@@ -984,7 +984,11 @@ struct usb_ssp_cap_descriptor {
  #define USB_SSP_MIN_RX_LANE_COUNT              (0xf << 8)
  #define USB_SSP_MIN_TX_LANE_COUNT              (0xf << 12)
         __le16 wReserved;
-       __le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
+       union {
+               __le32 legacy_padding;
+               /* list of sublink speed attrib entries */
+               __DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
+       };
  #define USB_SSP_SUBLINK_SPEED_SSID     (0xf)           /* sublink speed ID */
  #define USB_SSP_SUBLINK_SPEED_LSE      (0x3 << 4)      /* Lanespeed exponent */
  #define USB_SSP_SUBLINK_SPEED_LSE_BPS          0
marvin@defiant:~/linux/kernel/linux_torvalds$

This means vanilla torvalds tree + https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
works, but vanilla torvalds tree w/o patch still crashes.

I am still rather new to using the pstore subsystem.
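
For anyone reading the attached dumps: each oops is stored as a series of
numbered chunks, and in these attachments the higher part numbers carry the
older text (Part17 starts around 7.6 s, Part1 ends with the trace at 40.5 s).
Below is a minimal sketch, not a kernel tool, of how such a concatenated dump
could be put back into one chronological log per oops. The function names
split_parts() and reassemble() are hypothetical, and the ordering rule is an
assumption inferred from the timestamps in the attachments.

```python
import re
from collections import defaultdict

# Chunk header lines as they appear in the attached dumps.
FILE_HDR = re.compile(r"^dmesg-efi_pstore-\d+:$")
PART_HDR = re.compile(r"^(\w+)#(\d+) Part(\d+)$")   # e.g. "Oops#1 Part17"
LEVEL = re.compile(r"^<\d+>")                       # syslog level prefix, e.g. "<6>"


def split_parts(text):
    """Return a list of (record_no, part_no, lines), one entry per chunk."""
    parts = []
    current = None
    for line in text.splitlines():
        if FILE_HDR.match(line):
            # A new chunk begins; its "Oops#N PartM" line follows.
            current = None
            continue
        m = PART_HDR.match(line)
        if m:
            current = (int(m.group(2)), int(m.group(3)), [])
            parts.append(current)
            continue
        if current is not None:
            current[2].append(line)
    return parts


def reassemble(text, strip_level=True):
    """Return {record_no: [log lines, oldest first]}."""
    by_record = defaultdict(list)
    for record_no, part_no, lines in split_parts(text):
        by_record[record_no].append((part_no, lines))

    merged = {}
    for record_no, chunks in by_record.items():
        # Assumption: a higher part number means older text, so sort descending.
        chunks.sort(key=lambda chunk: chunk[0], reverse=True)
        lines = [line for _, chunk_lines in chunks for line in chunk_lines]
        if strip_level:
            lines = [LEVEL.sub("", line) for line in lines]
        merged[record_no] = lines
    return merged
```

Calling reassemble() on the text of one attachment gives a dictionary keyed by
the oops number, so the lines for Oops#1 can be printed or compared against a
normal dmesg without the chunk headers in between.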

Best regards,
Mirsad Todorovac

[-- Attachment #2: 168849054-dmesg.txt --]
[-- Type: text/plain, Size: 30572 bytes --]

dmesg-efi_pstore-168849054717001:
Oops#1 Part17
<30>[    7.598794] systemd[1]: modprobe@fuse.service: Deactivated successfully.
<30>[    7.599944] systemd[1]: Finished Load Kernel Module fuse.
<30>[    7.600832] systemd[1]: modprobe@pstore_blk.service: Deactivated successfully.
<30>[    7.601939] systemd[1]: Finished Load Kernel Module pstore_blk.
<30>[    7.602831] systemd[1]: modprobe@pstore_zone.service: Deactivated successfully.
<6>[    7.603532] lp: driver loaded but no devices found
<30>[    7.603888] systemd[1]: Finished Load Kernel Module pstore_zone.
<30>[    7.604725] systemd[1]: modprobe@ramoops.service: Deactivated successfully.
<30>[    7.605784] systemd[1]: Finished Load Kernel Module ramoops.
<30>[    7.610356] systemd[1]: Finished Remount Root and Kernel File Systems.
<6>[    7.612497] ppdev: user-space parallel port driver
<30>[    7.655717] systemd[1]: Mounting FUSE Control File System...
<30>[    7.660962] systemd[1]: Mounting Kernel Configuration File System...
<30>[    7.665977] systemd[1]: Starting Create System Users...
<30>[    7.669066] systemd[1]: Finished Set the console keyboard layout.
<30>[    7.672371] systemd[1]: Finished Load Kernel Modules.
<30>[    7.674230] systemd[1]: Mounted FUSE Control File System.
<30>[    7.674620] systemd[1]: Mounted Kernel Configuration File System.
<30>[    7.678550] systemd[1]: Starting Apply Kernel Variables...
<30>[    7.697541] systemd[1]: Finished Create System Users.
<30>[    7.702198] systemd[1]: Starting Create Static Device Nodes in /dev...
<30>[    7.709620] systemd[1]: Finished Apply Kernel Variables.
<30>[    7.724718] systemd[1]: Finished Create Static Device Nodes in /dev.
<30>[    7.724852] systemd[1]: Reached target Preparation for Local File Systems.
<30>[    7.731570] systemd[1]: Starting Rule-based Manager for Device Events and Files...
dmesg-efi_pstore-168849054716001:
Oops#1 Part16
<30>[    7.734459] systemd[1]: modprobe@chromeos_pstore.service: Deactivated successfully.
<30>[    7.735482] systemd[1]: Finished Load Kernel Module chromeos_pstore.
<30>[    7.790440] systemd[1]: Started Journal Service.
<6>[    8.346947] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.
<6>[    8.458360] ccp 0000:1a:00.2: enabling device (0000 -> 0002)
<5>[    8.481124] ccp 0000:1a:00.2: psp: unable to access the device: you might be running a broken BIOS.
<6>[    9.094383] RAPL PMU: API unit is 2^-32 Joules, 1 fixed counters, 163840 ms ovfl timer
<6>[    9.094386] RAPL PMU: hw unit of domain package 2^-16 Joules
<6>[    9.142633] Adding 31999996k swap on /dev/nvme0n1p2.  Priority:-2 extents:1 across:31999996k SSFS
<6>[    9.244840] [drm] amdgpu kernel modesetting enabled.
<6>[    9.244931] amdgpu: vga_switcheroo: detected switching method \_SB_.PCI0.GP17.VGA_.ATPX handle
<6>[    9.246463] cryptd: max_cpu_qlen set to 1000
<4>[    9.253079] ATPX version 1, functions 0x00000000
<6>[    9.268677] AVX2 version of gcm_enc/dec engaged.
<6>[    9.268884] AES CTR mode by8 optimization enabled
<6>[    9.300405] snd_hda_intel 0000:03:00.1: enabling device (0000 -> 0002)
<6>[    9.301425] MCE: In-kernel MCE decoding enabled.
<6>[    9.305289] snd_hda_intel 0000:03:00.1: Handle vga_switcheroo audio client
<6>[    9.305292] snd_hda_intel 0000:03:00.1: Force to non-snoop mode
<6>[    9.309540] snd_hda_intel 0000:1a:00.1: enabling device (0000 -> 0002)
<6>[    9.313704] snd_hda_intel 0000:1a:00.1: Handle vga_switcheroo audio client
<6>[    9.316146] snd_hda_intel 0000:1a:00.6: enabling device (0000 -> 0002)
<6>[    9.354946] input: HD-Audio Generic HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:08.1/0000:1a:00.1/sound/card1/input14
dmesg-efi_pstore-168849054715001:
Oops#1 Part15
<6>[    9.355785] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.1/0000:01:00.0/0000:02:00.0/0000:03:00.1/sound/card0/input9
<6>[    9.356512] input: HD-Audio Generic HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:08.1/0000:1a:00.1/sound/card1/input15
<6>[    9.357336] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.1/0000:01:00.0/0000:02:00.0/0000:03:00.1/sound/card0/input10
<6>[    9.358079] input: HD-Audio Generic HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:08.1/0000:1a:00.1/sound/card1/input16
<6>[    9.358976] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.1/0000:01:00.0/0000:02:00.0/0000:03:00.1/sound/card0/input11
<6>[    9.359700] input: HD-Audio Generic HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:08.1/0000:1a:00.1/sound/card1/input17
<6>[    9.360572] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.1/0000:01:00.0/0000:02:00.0/0000:03:00.1/sound/card0/input12
<6>[    9.362270] input: HDA ATI HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:01.1/0000:01:00.0/0000:02:00.0/0000:03:00.1/sound/card0/input13
<6>[    9.364515] snd_hda_codec_realtek hdaudioC2D0: autoconfig for ALC897: line_outs=1 (0x14/0x0/0x0/0x0/0x0) type:line
<6>[    9.364519] snd_hda_codec_realtek hdaudioC2D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
<6>[    9.364520] snd_hda_codec_realtek hdaudioC2D0:    hp_outs=1 (0x1b/0x0/0x0/0x0/0x0)
<6>[    9.364522] snd_hda_codec_realtek hdaudioC2D0:    mono: mono_out=0x0
<6>[    9.364523] snd_hda_codec_realtek hdaudioC2D0:    inputs:
<6>[    9.364525] snd_hda_codec_realtek hdaudioC2D0:      Front Mic=0x19
<6>[    9.364526] snd_hda_codec_realtek hdaudioC2D0:      Rear Mic=0x18
dmesg-efi_pstore-168849054614001:
Oops#1 Part14
<6>[    9.364528] snd_hda_codec_realtek hdaudioC2D0:      Line=0x1a
<6>[    9.376461] BTRFS info (device nvme0n1p10): using crc32c (crc32c-intel) checksum algorithm
<6>[    9.376466] BTRFS info (device nvme0n1p10): using free space tree
<6>[    9.396181] BTRFS info (device nvme0n1p10): enabling ssd optimizations
<6>[    9.396185] BTRFS info (device nvme0n1p10): auto enabling async discard
<6>[    9.400594] input: HD-Audio Generic Front Mic as /devices/pci0000:00/0000:00:08.1/0000:1a:00.6/sound/card2/input18
<6>[    9.402438] input: HD-Audio Generic Rear Mic as /devices/pci0000:00/0000:00:08.1/0000:1a:00.6/sound/card2/input19
<6>[    9.408410] input: HD-Audio Generic Line as /devices/pci0000:00/0000:00:08.1/0000:1a:00.6/sound/card2/input20
<6>[    9.415243] input: HD-Audio Generic Line Out as /devices/pci0000:00/0000:00:08.1/0000:1a:00.6/sound/card2/input21
<6>[    9.418364] input: HD-Audio Generic Front Headphone as /devices/pci0000:00/0000:00:08.1/0000:1a:00.6/sound/card2/input22
<6>[    9.434264] EXT4-fs (nvme0n1p5): mounted filesystem 8f6cf2e5-aa47-49bc-b50f-fa8023306013 r/w with ordered data mode. Quota mode: none.
<6>[    9.483230] EXT4-fs (nvme0n1p1): mounted filesystem a4814207-8827-4b89-adcd-21899f72071b r/w with ordered data mode. Quota mode: none.
<6>[    9.487407] BTRFS info (device nvme0n1p7): using crc32c (crc32c-intel) checksum algorithm
<6>[    9.487413] BTRFS info (device nvme0n1p7): using free space tree
<6>[    9.504117] loop0: detected capacity change from 0 to 8
<6>[    9.514575] loop1: detected capacity change from 0 to 616344
<6>[    9.529056] BTRFS info (device nvme0n1p7): enabling ssd optimizations
<6>[    9.529059] BTRFS info (device nvme0n1p7): auto enabling async discard
<6>[    9.534084] loop2: detected capacity change from 0 to 632776
dmesg-efi_pstore-168849054613001:
Oops#1 Part13
<6>[    9.575757] loop3: detected capacity change from 0 to 239128
<6>[    9.576373] loop4: detected capacity change from 0 to 19584
<6>[    9.576896] loop5: detected capacity change from 0 to 19608
<6>[    9.589243] loop6: detected capacity change from 0 to 242136
<6>[    9.597931] loop7: detected capacity change from 0 to 129936
<6>[    9.617003] amdgpu: Ignoring ACPI CRAT on non-APU system
<6>[    9.617281] amdgpu: Virtual CRAT table created for CPU
<6>[    9.617713] amdgpu: Topology: Add CPU node
<6>[    9.619578] amdgpu 0000:03:00.0: enabling device (0006 -> 0007)
<6>[    9.620020] [drm] initializing kernel modesetting (DIMGREY_CAVEFISH 0x1002:0x73FF 0x1849:0x5217 0xC7).
<6>[    9.620057] [drm] register mmio base: 0xFCB00000
<6>[    9.620058] [drm] register mmio size: 1048576
<6>[    9.620722] loop8: detected capacity change from 0 to 129944
<6>[    9.627042] [drm] add ip block number 0 <nv_common>
<6>[    9.627045] [drm] add ip block number 1 <gmc_v10_0>
<6>[    9.627046] [drm] add ip block number 2 <navi10_ih>
<6>[    9.627047] [drm] add ip block number 3 <psp>
<6>[    9.627048] [drm] add ip block number 4 <smu>
<6>[    9.627049] [drm] add ip block number 5 <dm>
<6>[    9.627050] [drm] add ip block number 6 <gfx_v10_0>
<6>[    9.627051] [drm] add ip block number 7 <sdma_v5_2>
<6>[    9.627052] [drm] add ip block number 8 <vcn_v3_0>
<6>[    9.627053] [drm] add ip block number 9 <jpeg_v3_0>
<6>[    9.627140] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from VFCT
<6>[    9.627206] amdgpu: ATOM BIOS: 113-EXT800296-L04
<6>[    9.634091] loop9: detected capacity change from 0 to 151240
<6>[    9.639916] loop10: detected capacity change from 0 to 151248
<6>[    9.643019] [drm] VCN(0) decode is enabled in VM mode
<6>[    9.643022] [drm] VCN(0) encode is enabled in VM mode
dmesg-efi_pstore-168849054612001:
Oops#1 Part12
<6>[    9.645286] [drm] JPEG decode is enabled in VM mode
<6>[    9.646431] Console: switching to colour dummy device 80x25
<6>[    9.646952] loop11: detected capacity change from 0 to 316376
<6>[    9.647492] amdgpu 0000:03:00.0: vgaarb: deactivate vga console
<6>[    9.647496] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
<6>[    9.648767] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
<6>[    9.648843] amdgpu 0000:03:00.0: amdgpu: VRAM: 8176M 0x0000008000000000 - 0x00000081FEFFFFFF (8176M used)
<6>[    9.648846] amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
<6>[    9.648847] amdgpu 0000:03:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
<6>[    9.648871] [drm] Detected VRAM RAM=8176M, BAR=8192M
<6>[    9.648872] [drm] RAM width 128bits GDDR6
<6>[    9.652593] [drm] amdgpu: 8176M of VRAM memory ready
<6>[    9.652597] [drm] amdgpu: 31684M of GTT memory ready.
<6>[    9.652776] [drm] GART: num cpu pages 131072, num gpu pages 131072
<6>[    9.653346] [drm] PCIE GART of 512M enabled (table at 0x00000081FEB00000).
<6>[    9.658342] loop12: detected capacity change from 0 to 709280
<6>[    9.665978] loop13: detected capacity change from 0 to 716168
<6>[    9.671205] loop14: detected capacity change from 0 to 943480
<6>[    9.676602] loop15: detected capacity change from 0 to 955472
<6>[    9.687320] loop16: detected capacity change from 0 to 187776
<6>[    9.693840] loop17: detected capacity change from 0 to 2214880
<6>[    9.702875] loop18: detected capacity change from 0 to 94064
<6>[    9.710122] loop19: detected capacity change from 0 to 25240
<6>[    9.717587] loop20: detected capacity change from 0 to 109072
dmesg-efi_pstore-168849054611001:
Oops#1 Part11
<6>[    9.727701] loop21: detected capacity change from 0 to 109072
<6>[    9.733848] loop22: detected capacity change from 0 to 608
<6>[    9.743762] loop23: detected capacity change from 0 to 904
<6>[    9.755144] intel_rapl_common: Found RAPL domain package
<6>[    9.755150] intel_rapl_common: Found RAPL domain core
<6>[    9.770456] EXT4-fs (nvme0n1p11): mounted filesystem bb53eb49-b161-4879-82c2-ab28079074f0 r/w with ordered data mode. Quota mode: none.
<46>[    9.894744] systemd-journald[601]: Received client request to flush runtime journal.
<6>[   11.874650] amdgpu 0000:03:00.0: amdgpu: STB initialized to 2048 entries
<6>[   11.874995] BTRFS info (device sda2): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.875001] BTRFS info (device sda2): using free space tree
<6>[   11.875425] BTRFS info (device sda3): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.875430] BTRFS info (device sda3): using free space tree
<6>[   11.875881] [drm] Loading DMUB firmware via PSP: version=0x02020013
<6>[   11.880352] BTRFS info (device nvme0n1p4): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.880356] BTRFS info (device nvme0n1p4): using free space tree
<6>[   11.880381] BTRFS info (device nvme0n1p9): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.880388] BTRFS info (device nvme0n1p9): using free space tree
<6>[   11.880702] [drm] use_doorbell being set to: [true]
<6>[   11.881102] [drm] use_doorbell being set to: [true]
<6>[   11.881239] [drm] Found VCN firmware Version ENC: 1.21 DEC: 2 VEP: 0 Revision: 10
<6>[   11.881391] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
<6>[   11.885948] BTRFS info (device nvme0n1p9): enabling ssd optimizations
<6>[   11.885952] BTRFS info (device nvme0n1p9): auto enabling async discard
dmesg-efi_pstore-168849054610001:
Oops#1 Part10
<6>[   11.885986] BTRFS info (device nvme0n1p4): enabling ssd optimizations
<6>[   11.885988] BTRFS info (device nvme0n1p4): auto enabling async discard
<6>[   11.952111] [drm] reserve 0xa00000 from 0x8001000000 for PSP TMR
<5>[   12.006320] F2FS-fs (nvme0n1p13): Found nat_bits in checkpoint
<6>[   12.050401] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
<6>[   12.068300] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
<6>[   12.068340] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2900 (59.41.0)
<6>[   12.068342] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
<6>[   12.068372] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
<6>[   12.077567] BTRFS info (device sda2): auto enabling async discard
<5>[   12.107475] F2FS-fs (nvme0n1p13): Mounted with checkpoint version = 59c29f4c
<6>[   12.115553] BTRFS info (device sda3): auto enabling async discard
<6>[   12.118026] amdgpu 0000:03:00.0: amdgpu: SMU is initialized successfully!
<6>[   12.118728] [drm] Display Core initialized with v3.2.230!
<6>[   12.118729] [drm] DP-HDMI FRL PCON supported
<6>[   12.120090] [drm] DMUB hardware initialized: version=0x02020013
<6>[   12.123259] snd_hda_intel 0000:03:00.1: bound 0000:03:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
<6>[   12.159311] [drm] kiq ring mec 2 pipe 1 q 0
<6>[   12.163595] [drm] VCN decode and encode initialized successfully(under DPG Mode).
<6>[   12.164594] [drm] JPEG decode initialized successfully.
<6>[   12.167145] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
<6>[   12.167459] amdgpu: sdma_bitmap: ffff
<6>[   12.184812] amdgpu: HMM registered 8176MB device memory
dmesg-efi_pstore-168849054609001:
Oops#1 Part9
<4>[   12.185434] amdgpu: SRAT table not found
<6>[   12.185439] amdgpu: Virtual CRAT table created for GPU
<6>[   12.188127] amdgpu: Topology: Add dGPU node [0x73ff:0x1002]
<6>[   12.188132] kfd kfd: amdgpu: added device 1002:73ff
<6>[   12.188150] amdgpu 0000:03:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 8, active_cu_number 28
<6>[   12.188988] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
<6>[   12.188990] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
<6>[   12.188991] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
<6>[   12.188992] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
<6>[   12.188992] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
<6>[   12.188993] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
<6>[   12.188994] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
<6>[   12.188995] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
<6>[   12.188996] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
<6>[   12.188997] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
<6>[   12.188998] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
<6>[   12.188999] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
<6>[   12.188999] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
<6>[   12.189000] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
<6>[   12.189001] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
<6>[   12.189002] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
dmesg-efi_pstore-168849054608001:
Oops#1 Part8
<6>[   12.189990] amdgpu 0000:03:00.0: amdgpu: Using BACO for runtime pm
<5>[   12.195846] audit: type=1400 audit(1688490518.405:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=1181 comm="apparmor_parser"
<5>[   12.196393] audit: type=1400 audit(1688490518.406:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1183 comm="apparmor_parser"
<5>[   12.196431] audit: type=1400 audit(1688490518.406:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1183 comm="apparmor_parser"
<5>[   12.199324] audit: type=1400 audit(1688490518.409:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=1187 comm="apparmor_parser"
<5>[   12.199378] audit: type=1400 audit(1688490518.409:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=1187 comm="apparmor_parser"
<5>[   12.199417] audit: type=1400 audit(1688490518.409:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=1187 comm="apparmor_parser"
<6>[   12.200239] [drm] Initialized amdgpu 3.52.0 20150101 for 0000:03:00.0 on minor 0
<5>[   12.200709] audit: type=1400 audit(1688490518.410:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="tcpdump" pid=1188 comm="apparmor_parser"
<5>[   12.200979] audit: type=1400 audit(1688490518.410:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oosplash" pid=1189 comm="apparmor_parser"
<5>[   12.201723] audit: type=1400 audit(1688490518.411:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=1184 comm="apparmor_parser"
dmesg-efi_pstore-168849054607001:
Oops#1 Part7
<5>[   12.201758] audit: type=1400 audit(1688490518.411:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=1184 comm="apparmor_parser"
<6>[   12.208400] fbcon: amdgpudrmfb (fb0) is primary device
<6>[   12.210234] [drm] DSC precompute is not needed.
<6>[   12.330373] Console: switching to colour frame buffer device 240x67
<6>[   12.349806] amdgpu 0000:03:00.0: [drm] fb0: amdgpudrmfb frame buffer device
<6>[   12.360471] amdgpu 0000:1a:00.0: enabling device (0006 -> 0007)
<6>[   12.360658] [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x164E 0x1002:0x164E 0xC1).
<6>[   12.360691] [drm] register mmio base: 0xFCA00000
<6>[   12.360691] [drm] register mmio size: 524288
<6>[   12.363923] [drm] add ip block number 0 <nv_common>
<6>[   12.363924] [drm] add ip block number 1 <gmc_v10_0>
<6>[   12.363924] [drm] add ip block number 2 <navi10_ih>
<6>[   12.363925] [drm] add ip block number 3 <psp>
<6>[   12.363926] [drm] add ip block number 4 <smu>
<6>[   12.363926] [drm] add ip block number 5 <dm>
<6>[   12.363927] [drm] add ip block number 6 <gfx_v10_0>
<6>[   12.363927] [drm] add ip block number 7 <sdma_v5_2>
<6>[   12.363928] [drm] add ip block number 8 <vcn_v3_0>
<6>[   12.363928] [drm] add ip block number 9 <jpeg_v3_0>
<6>[   12.363948] amdgpu 0000:1a:00.0: amdgpu: Fetched VBIOS from VFCT
<6>[   12.363957] amdgpu: ATOM BIOS: 102-RAPHAEL-008
<6>[   12.373650] [drm] VCN(0) decode is enabled in VM mode
<6>[   12.373652] [drm] VCN(0) encode is enabled in VM mode
<6>[   12.375963] [drm] JPEG decode is enabled in VM mode
<6>[   12.375970] amdgpu 0000:1a:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
dmesg-efi_pstore-168849054606001:
Oops#1 Part6
<6>[   12.376502] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
<6>[   12.376539] amdgpu 0000:1a:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
<6>[   12.376541] amdgpu 0000:1a:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
<6>[   12.376542] amdgpu 0000:1a:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
<6>[   12.376554] [drm] Detected VRAM RAM=512M, BAR=512M
<6>[   12.376555] [drm] RAM width 128bits DDR5
<6>[   12.377052] [drm] amdgpu: 512M of VRAM memory ready
<6>[   12.377054] [drm] amdgpu: 31684M of GTT memory ready.
<6>[   12.377108] [drm] GART: num cpu pages 262144, num gpu pages 262144
<6>[   12.377308] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
<6>[   12.377805] [drm] Loading DMUB firmware via PSP: version=0x05000500
<6>[   12.379865] [drm] use_doorbell being set to: [true]
<6>[   12.379935] [drm] Found VCN firmware Version ENC: 1.24 DEC: 2 VEP: 0 Revision: 0
<6>[   12.380014] amdgpu 0000:1a:00.0: amdgpu: Will use PSP to load VCN firmware
<6>[   12.402426] [drm] reserve 0xa00000 from 0xf41e000000 for PSP TMR
<6>[   12.467280] amdgpu 0000:1a:00.0: amdgpu: RAS: optional ras ta ucode is not available
<6>[   12.473179] amdgpu 0000:1a:00.0: amdgpu: RAP: optional rap ta ucode is not available
<6>[   12.473181] amdgpu 0000:1a:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
<6>[   12.473512] amdgpu 0000:1a:00.0: amdgpu: smu driver if version = 0x00000004, smu fw if version = 0x00000005, smu fw program = 0, smu fw version = 0x00544fda (84.79.218)
<6>[   12.473514] amdgpu 0000:1a:00.0: amdgpu: SMU driver if version not matched
dmesg-efi_pstore-168849054605001:
Oops#1 Part5
<6>[   12.474674] amdgpu 0000:1a:00.0: amdgpu: SMU is initialized successfully!
<6>[   12.475981] [drm] Display Core initialized with v3.2.230!
<6>[   12.475983] [drm] DP-HDMI FRL PCON supported
<6>[   12.476643] [drm] DMUB hardware initialized: version=0x05000500
<6>[   12.479228] snd_hda_intel 0000:1a:00.1: bound 0000:1a:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
<6>[   12.483368] [drm] kiq ring mec 2 pipe 1 q 0
<6>[   12.485164] [drm] VCN decode and encode initialized successfully(under DPG Mode).
<6>[   12.485189] [drm] JPEG decode initialized successfully.
<6>[   12.487577] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
<6>[   12.487939] amdgpu: sdma_bitmap: 3
<6>[   12.500346] amdgpu: HMM registered 512MB device memory
<4>[   12.500545] amdgpu: SRAT table not found
<6>[   12.500547] amdgpu: Virtual CRAT table created for GPU
<6>[   12.505602] amdgpu: Topology: Add dGPU node [0x164e:0x1002]
<6>[   12.505606] kfd kfd: amdgpu: added device 1002:164e
<6>[   12.505619] amdgpu 0000:1a:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
<6>[   12.506662] amdgpu 0000:1a:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
<6>[   12.506665] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
<6>[   12.506668] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
<6>[   12.506670] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
<6>[   12.506672] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
<6>[   12.506674] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
<6>[   12.506677] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
<6>[   12.506679] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
dmesg-efi_pstore-168849054604001:
Oops#1 Part4
<6>[   12.506681] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
<6>[   12.506683] amdgpu 0000:1a:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
<6>[   12.506686] amdgpu 0000:1a:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
<6>[   12.506688] amdgpu 0000:1a:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
<6>[   12.506690] amdgpu 0000:1a:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
<6>[   12.506693] amdgpu 0000:1a:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
<6>[   12.506695] amdgpu 0000:1a:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
<6>[   12.517744] [drm] Initialized amdgpu 3.52.0 20150101 for 0000:1a:00.0 on minor 1
<6>[   12.521606] amdgpu 0000:1a:00.0: [drm] Cannot find any crtc or sizes
<6>[   13.505105] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-1000:00: attached PHY driver (mii_bus:phy_addr=r8169-0-1000:00, irq=MAC)
<6>[   13.560559] loop24: detected capacity change from 0 to 8
<6>[   13.673342] r8169 0000:10:00.0 enp16s0: Link is Down
<6>[   16.244015] r8169 0000:10:00.0 enp16s0: Link is Up - 1Gbps/Full - flow control rx/tx
<6>[   16.244041] IPv6: ADDRCONF(NETDEV_CHANGE): enp16s0: link becomes ready
<4>[   19.161428] kauditd_printk_skb: 71 callbacks suppressed
<5>[   19.161430] audit: type=1400 audit(1688490525.372:83): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2247 comm="snap-confine" capability=12  capname="net_admin"
<5>[   19.161596] audit: type=1400 audit(1688490525.372:84): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2247 comm="snap-confine" capability=38  capname="perfmon"
dmesg-efi_pstore-168849054603001:
Oops#1 Part3
<7>[   20.536339] rfkill: input handler disabled
<5>[   26.592370] audit: type=1400 audit(1688490532.803:85): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2700 comm="snap-confine" capability=12  capname="net_admin"
<5>[   26.592657] audit: type=1400 audit(1688490532.803:86): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2700 comm="snap-confine" capability=38  capname="perfmon"
<7>[   26.752279] rfkill: input handler enabled
<7>[   28.538548] rfkill: input handler disabled
<5>[   29.715605] audit: type=1400 audit(1688490535.926:87): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=3324 comm="snap-confine" capability=12  capname="net_admin"
<5>[   29.715780] audit: type=1400 audit(1688490535.926:88): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=3324 comm="snap-confine" capability=38  capname="perfmon"
<4>[   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[   40.529723] CPU: 0 PID: 3492 Comm: thunderbird Not tainted 6.4.0-rc2-crash2-kees2-00001-g2d47c6956ab3-dirty #5
<4>[   40.529725] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
<4>[   40.529726] RIP: 0010:alloc_pid+0x46c/0x480
<4>[   40.529730] Code: 00 92 49 c7 c4 f4 ff ff ff e8 50 bc 15 01 4c 89 ff e8 68 50 13 00 e9 ec fd ff ff be 02 00 00 00 e8 89 5f 71 00 e9 f8 fe ff ff <0f> 0b 49 c7 c4 f4 ff ff ff e9 b9 fb ff ff 66 0f 1f 44 00 00 90 90
dmesg-efi_pstore-168849054602001:
Oops#1 Part2
<4>[   40.529731] RSP: 0018:ffffad8c45313c48 EFLAGS: 00010202
<4>[   40.529733] RAX: 0000000080000000 RBX: 0000000000000001 RCX: 0000000000000000
<4>[   40.529734] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
<4>[   40.529734] RBP: ffffad8c45313c98 R08: 0000000000000000 R09: 0000000000000000
<4>[   40.529735] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9cbdff1c63a8
<4>[   40.529735] R13: ffff9cbde9b08750 R14: 0000000000000001 R15: ffff9cbdff1c63a8
<4>[   40.529736] FS:  00007f50d863e780(0000) GS:ffff9ccc97a00000(0000) knlGS:0000000000000000
<4>[   40.529737] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   40.529737] CR2: 0000000000000000 CR3: 00000001b0ae0000 CR4: 0000000000750ef0
<4>[   40.529738] PKRU: 55555554
<4>[   40.529739] Call Trace:
<4>[   40.529739]  <TASK>
<4>[   40.529741]  copy_process+0x165f/0x2110
<4>[   40.529744]  kernel_clone+0x9d/0x3a0
<4>[   40.529745]  ? find_held_lock+0x31/0xa0
<4>[   40.529747]  ? mntput_no_expire+0x89/0x4f0
<4>[   40.529749]  ? lock_release+0xc4/0x270
<4>[   40.529751]  __do_sys_clone+0x66/0xa0
<4>[   40.529754]  __x64_sys_clone+0x25/0x40
<4>[   40.529755]  do_syscall_64+0x59/0x90
<4>[   40.529758]  ? syscall_exit_to_user_mode+0x39/0x60
<4>[   40.529760]  ? do_syscall_64+0x69/0x90
<4>[   40.529761]  ? irqentry_exit_to_user_mode+0x27/0x40
<4>[   40.529762]  ? irqentry_exit+0x77/0xb0
<4>[   40.529764]  ? exc_page_fault+0xae/0x240
<4>[   40.529765]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
<4>[   40.529767] RIP: 0033:0x7f50d811ea3d
<4>[   40.529769] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48
dmesg-efi_pstore-168849054601001:
Oops#1 Part1
<4>[   40.529770] RSP: 002b:00007ffcc449ce58 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
<4>[   40.529771] RAX: ffffffffffffffda RBX: 0000000000000051 RCX: 00007f50d811ea3d
<4>[   40.529771] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000030000011
<4>[   40.529772] RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f50d82b97c0
<4>[   40.529772] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000011
<4>[   40.529773] R13: 00007f50d7e16980 R14: 00007f50d863e6c0 R15: 00007f50d82ba3c0
<4>[   40.529775]  </TASK>
<4>[   40.529776] Modules linked in: binfmt_misc f2fs crc32_generic lz4hc_compress lz4_compress nls_iso8859_1 intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi edac_mce_amd crct10dif_pclmul snd_hda_intel polyval_clmulni snd_intel_dspcfg polyval_generic ghash_clmulni_intel snd_intel_sdw_acpi snd_seq_midi sha512_ssse3 snd_seq_midi_event snd_hda_codec aesni_intel snd_hda_core crypto_simd cryptd snd_hwdep joydev input_leds snd_rawmidi rapl amdgpu snd_pcm ccp wmi_bmof snd_seq k10temp snd_seq_device iommu_v2 snd_timer drm_buddy gpu_sched drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec snd drm_kms_helper i2c_algo_bit syscopyarea sysfillrect sysimgblt soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone fuse efi_pstore drm ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic nvme nvme_core ahci xhci_pci i2c_piix4 crc32_pclmul nvme_common libahci xhci_pci_renesas r8169 realtek video wmi
<4>[   40.529799]  gpio_amdpt
<4>[   40.529801] ---[ end trace 0000000000000000 ]---

[-- Attachment #3: 168849056-dmesg.txt --]
[-- Type: text/plain, Size: 29022 bytes --]

dmesg-efi_pstore-168849056416002:
Oops#2 Part16
<6>[    9.487407] BTRFS info (device nvme0n1p7): using crc32c (crc32c-intel) checksum algorithm
<6>[    9.487413] BTRFS info (device nvme0n1p7): using free space tree
<6>[    9.504117] loop0: detected capacity change from 0 to 8
<6>[    9.514575] loop1: detected capacity change from 0 to 616344
<6>[    9.529056] BTRFS info (device nvme0n1p7): enabling ssd optimizations
<6>[    9.529059] BTRFS info (device nvme0n1p7): auto enabling async discard
<6>[    9.534084] loop2: detected capacity change from 0 to 632776
<6>[    9.575757] loop3: detected capacity change from 0 to 239128
<6>[    9.576373] loop4: detected capacity change from 0 to 19584
<6>[    9.576896] loop5: detected capacity change from 0 to 19608
<6>[    9.589243] loop6: detected capacity change from 0 to 242136
<6>[    9.597931] loop7: detected capacity change from 0 to 129936
<6>[    9.617003] amdgpu: Ignoring ACPI CRAT on non-APU system
<6>[    9.617281] amdgpu: Virtual CRAT table created for CPU
<6>[    9.617713] amdgpu: Topology: Add CPU node
<6>[    9.619578] amdgpu 0000:03:00.0: enabling device (0006 -> 0007)
<6>[    9.620020] [drm] initializing kernel modesetting (DIMGREY_CAVEFISH 0x1002:0x73FF 0x1849:0x5217 0xC7).
<6>[    9.620057] [drm] register mmio base: 0xFCB00000
<6>[    9.620058] [drm] register mmio size: 1048576
<6>[    9.620722] loop8: detected capacity change from 0 to 129944
<6>[    9.627042] [drm] add ip block number 0 <nv_common>
<6>[    9.627045] [drm] add ip block number 1 <gmc_v10_0>
<6>[    9.627046] [drm] add ip block number 2 <navi10_ih>
<6>[    9.627047] [drm] add ip block number 3 <psp>
<6>[    9.627048] [drm] add ip block number 4 <smu>
<6>[    9.627049] [drm] add ip block number 5 <dm>
dmesg-efi_pstore-168849056415002:
Oops#2 Part15
<6>[    9.627050] [drm] add ip block number 6 <gfx_v10_0>
<6>[    9.627051] [drm] add ip block number 7 <sdma_v5_2>
<6>[    9.627052] [drm] add ip block number 8 <vcn_v3_0>
<6>[    9.627053] [drm] add ip block number 9 <jpeg_v3_0>
<6>[    9.627140] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from VFCT
<6>[    9.627206] amdgpu: ATOM BIOS: 113-EXT800296-L04
<6>[    9.634091] loop9: detected capacity change from 0 to 151240
<6>[    9.639916] loop10: detected capacity change from 0 to 151248
<6>[    9.643019] [drm] VCN(0) decode is enabled in VM mode
<6>[    9.643022] [drm] VCN(0) encode is enabled in VM mode
<6>[    9.645286] [drm] JPEG decode is enabled in VM mode
<6>[    9.646431] Console: switching to colour dummy device 80x25
<6>[    9.646952] loop11: detected capacity change from 0 to 316376
<6>[    9.647492] amdgpu 0000:03:00.0: vgaarb: deactivate vga console
<6>[    9.647496] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
<6>[    9.648767] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
<6>[    9.648843] amdgpu 0000:03:00.0: amdgpu: VRAM: 8176M 0x0000008000000000 - 0x00000081FEFFFFFF (8176M used)
<6>[    9.648846] amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
<6>[    9.648847] amdgpu 0000:03:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
<6>[    9.648871] [drm] Detected VRAM RAM=8176M, BAR=8192M
<6>[    9.648872] [drm] RAM width 128bits GDDR6
<6>[    9.652593] [drm] amdgpu: 8176M of VRAM memory ready
<6>[    9.652597] [drm] amdgpu: 31684M of GTT memory ready.
<6>[    9.652776] [drm] GART: num cpu pages 131072, num gpu pages 131072
<6>[    9.653346] [drm] PCIE GART of 512M enabled (table at 0x00000081FEB00000).
dmesg-efi_pstore-168849056414002:
Oops#2 Part14
<6>[    9.658342] loop12: detected capacity change from 0 to 709280
<6>[    9.665978] loop13: detected capacity change from 0 to 716168
<6>[    9.671205] loop14: detected capacity change from 0 to 943480
<6>[    9.676602] loop15: detected capacity change from 0 to 955472
<6>[    9.687320] loop16: detected capacity change from 0 to 187776
<6>[    9.693840] loop17: detected capacity change from 0 to 2214880
<6>[    9.702875] loop18: detected capacity change from 0 to 94064
<6>[    9.710122] loop19: detected capacity change from 0 to 25240
<6>[    9.717587] loop20: detected capacity change from 0 to 109072
<6>[    9.727701] loop21: detected capacity change from 0 to 109072
<6>[    9.733848] loop22: detected capacity change from 0 to 608
<6>[    9.743762] loop23: detected capacity change from 0 to 904
<6>[    9.755144] intel_rapl_common: Found RAPL domain package
<6>[    9.755150] intel_rapl_common: Found RAPL domain core
<6>[    9.770456] EXT4-fs (nvme0n1p11): mounted filesystem bb53eb49-b161-4879-82c2-ab28079074f0 r/w with ordered data mode. Quota mode: none.
<46>[    9.894744] systemd-journald[601]: Received client request to flush runtime journal.
<6>[   11.874650] amdgpu 0000:03:00.0: amdgpu: STB initialized to 2048 entries
<6>[   11.874995] BTRFS info (device sda2): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.875001] BTRFS info (device sda2): using free space tree
<6>[   11.875425] BTRFS info (device sda3): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.875430] BTRFS info (device sda3): using free space tree
<6>[   11.875881] [drm] Loading DMUB firmware via PSP: version=0x02020013
<6>[   11.880352] BTRFS info (device nvme0n1p4): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.880356] BTRFS info (device nvme0n1p4): using free space tree
dmesg-efi_pstore-168849056413002:
Oops#2 Part13
<6>[   11.880381] BTRFS info (device nvme0n1p9): using crc32c (crc32c-intel) checksum algorithm
<6>[   11.880388] BTRFS info (device nvme0n1p9): using free space tree
<6>[   11.880702] [drm] use_doorbell being set to: [true]
<6>[   11.881102] [drm] use_doorbell being set to: [true]
<6>[   11.881239] [drm] Found VCN firmware Version ENC: 1.21 DEC: 2 VEP: 0 Revision: 10
<6>[   11.881391] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
<6>[   11.885948] BTRFS info (device nvme0n1p9): enabling ssd optimizations
<6>[   11.885952] BTRFS info (device nvme0n1p9): auto enabling async discard
<6>[   11.885986] BTRFS info (device nvme0n1p4): enabling ssd optimizations
<6>[   11.885988] BTRFS info (device nvme0n1p4): auto enabling async discard
<6>[   11.952111] [drm] reserve 0xa00000 from 0x8001000000 for PSP TMR
<5>[   12.006320] F2FS-fs (nvme0n1p13): Found nat_bits in checkpoint
<6>[   12.050401] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
<6>[   12.068300] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
<6>[   12.068340] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2900 (59.41.0)
<6>[   12.068342] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
<6>[   12.068372] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
<6>[   12.077567] BTRFS info (device sda2): auto enabling async discard
<5>[   12.107475] F2FS-fs (nvme0n1p13): Mounted with checkpoint version = 59c29f4c
<6>[   12.115553] BTRFS info (device sda3): auto enabling async discard
<6>[   12.118026] amdgpu 0000:03:00.0: amdgpu: SMU is initialized successfully!
<6>[   12.118728] [drm] Display Core initialized with v3.2.230!
dmesg-efi_pstore-168849056412002:
Oops#2 Part12
<6>[   12.118729] [drm] DP-HDMI FRL PCON supported
<6>[   12.120090] [drm] DMUB hardware initialized: version=0x02020013
<6>[   12.123259] snd_hda_intel 0000:03:00.1: bound 0000:03:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
<6>[   12.159311] [drm] kiq ring mec 2 pipe 1 q 0
<6>[   12.163595] [drm] VCN decode and encode initialized successfully(under DPG Mode).
<6>[   12.164594] [drm] JPEG decode initialized successfully.
<6>[   12.167145] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
<6>[   12.167459] amdgpu: sdma_bitmap: ffff
<6>[   12.184812] amdgpu: HMM registered 8176MB device memory
<4>[   12.185434] amdgpu: SRAT table not found
<6>[   12.185439] amdgpu: Virtual CRAT table created for GPU
<6>[   12.188127] amdgpu: Topology: Add dGPU node [0x73ff:0x1002]
<6>[   12.188132] kfd kfd: amdgpu: added device 1002:73ff
<6>[   12.188150] amdgpu 0000:03:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 8, active_cu_number 28
<6>[   12.188988] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
<6>[   12.188990] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
<6>[   12.188991] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
<6>[   12.188992] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
<6>[   12.188992] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
<6>[   12.188993] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
<6>[   12.188994] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
<6>[   12.188995] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
<6>[   12.188996] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
dmesg-efi_pstore-168849056411002:
Oops#2 Part11
<6>[   12.188997] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
<6>[   12.188998] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
<6>[   12.188999] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
<6>[   12.188999] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
<6>[   12.189000] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
<6>[   12.189001] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
<6>[   12.189002] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
<6>[   12.189990] amdgpu 0000:03:00.0: amdgpu: Using BACO for runtime pm
<5>[   12.195846] audit: type=1400 audit(1688490518.405:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=1181 comm="apparmor_parser"
<5>[   12.196393] audit: type=1400 audit(1688490518.406:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1183 comm="apparmor_parser"
<5>[   12.196431] audit: type=1400 audit(1688490518.406:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1183 comm="apparmor_parser"
<5>[   12.199324] audit: type=1400 audit(1688490518.409:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=1187 comm="apparmor_parser"
<5>[   12.199378] audit: type=1400 audit(1688490518.409:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=1187 comm="apparmor_parser"
<5>[   12.199417] audit: type=1400 audit(1688490518.409:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=1187 comm="apparmor_parser"
dmesg-efi_pstore-168849056410002:
Oops#2 Part10
<6>[   12.200239] [drm] Initialized amdgpu 3.52.0 20150101 for 0000:03:00.0 on minor 0
<5>[   12.200709] audit: type=1400 audit(1688490518.410:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="tcpdump" pid=1188 comm="apparmor_parser"
<5>[   12.200979] audit: type=1400 audit(1688490518.410:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oosplash" pid=1189 comm="apparmor_parser"
<5>[   12.201723] audit: type=1400 audit(1688490518.411:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=1184 comm="apparmor_parser"
<5>[   12.201758] audit: type=1400 audit(1688490518.411:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=1184 comm="apparmor_parser"
<6>[   12.208400] fbcon: amdgpudrmfb (fb0) is primary device
<6>[   12.210234] [drm] DSC precompute is not needed.
<6>[   12.330373] Console: switching to colour frame buffer device 240x67
<6>[   12.349806] amdgpu 0000:03:00.0: [drm] fb0: amdgpudrmfb frame buffer device
<6>[   12.360471] amdgpu 0000:1a:00.0: enabling device (0006 -> 0007)
<6>[   12.360658] [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x164E 0x1002:0x164E 0xC1).
<6>[   12.360691] [drm] register mmio base: 0xFCA00000
<6>[   12.360691] [drm] register mmio size: 524288
<6>[   12.363923] [drm] add ip block number 0 <nv_common>
<6>[   12.363924] [drm] add ip block number 1 <gmc_v10_0>
<6>[   12.363924] [drm] add ip block number 2 <navi10_ih>
<6>[   12.363925] [drm] add ip block number 3 <psp>
<6>[   12.363926] [drm] add ip block number 4 <smu>
<6>[   12.363926] [drm] add ip block number 5 <dm>
dmesg-efi_pstore-168849056409002:
Oops#2 Part9
<6>[   12.363927] [drm] add ip block number 6 <gfx_v10_0>
<6>[   12.363927] [drm] add ip block number 7 <sdma_v5_2>
<6>[   12.363928] [drm] add ip block number 8 <vcn_v3_0>
<6>[   12.363928] [drm] add ip block number 9 <jpeg_v3_0>
<6>[   12.363948] amdgpu 0000:1a:00.0: amdgpu: Fetched VBIOS from VFCT
<6>[   12.363957] amdgpu: ATOM BIOS: 102-RAPHAEL-008
<6>[   12.373650] [drm] VCN(0) decode is enabled in VM mode
<6>[   12.373652] [drm] VCN(0) encode is enabled in VM mode
<6>[   12.375963] [drm] JPEG decode is enabled in VM mode
<6>[   12.375970] amdgpu 0000:1a:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
<6>[   12.376502] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
<6>[   12.376539] amdgpu 0000:1a:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
<6>[   12.376541] amdgpu 0000:1a:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
<6>[   12.376542] amdgpu 0000:1a:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
<6>[   12.376554] [drm] Detected VRAM RAM=512M, BAR=512M
<6>[   12.376555] [drm] RAM width 128bits DDR5
<6>[   12.377052] [drm] amdgpu: 512M of VRAM memory ready
<6>[   12.377054] [drm] amdgpu: 31684M of GTT memory ready.
<6>[   12.377108] [drm] GART: num cpu pages 262144, num gpu pages 262144
<6>[   12.377308] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
<6>[   12.377805] [drm] Loading DMUB firmware via PSP: version=0x05000500
<6>[   12.379865] [drm] use_doorbell being set to: [true]
<6>[   12.379935] [drm] Found VCN firmware Version ENC: 1.24 DEC: 2 VEP: 0 Revision: 0
<6>[   12.380014] amdgpu 0000:1a:00.0: amdgpu: Will use PSP to load VCN firmware
<6>[   12.402426] [drm] reserve 0xa00000 from 0xf41e000000 for PSP TMR
dmesg-efi_pstore-168849056408002:
Oops#2 Part8
<6>[   12.467280] amdgpu 0000:1a:00.0: amdgpu: RAS: optional ras ta ucode is not available
<6>[   12.473179] amdgpu 0000:1a:00.0: amdgpu: RAP: optional rap ta ucode is not available
<6>[   12.473181] amdgpu 0000:1a:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
<6>[   12.473512] amdgpu 0000:1a:00.0: amdgpu: smu driver if version = 0x00000004, smu fw if version = 0x00000005, smu fw program = 0, smu fw version = 0x00544fda (84.79.218)
<6>[   12.473514] amdgpu 0000:1a:00.0: amdgpu: SMU driver if version not matched
<6>[   12.474674] amdgpu 0000:1a:00.0: amdgpu: SMU is initialized successfully!
<6>[   12.475981] [drm] Display Core initialized with v3.2.230!
<6>[   12.475983] [drm] DP-HDMI FRL PCON supported
<6>[   12.476643] [drm] DMUB hardware initialized: version=0x05000500
<6>[   12.479228] snd_hda_intel 0000:1a:00.1: bound 0000:1a:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
<6>[   12.483368] [drm] kiq ring mec 2 pipe 1 q 0
<6>[   12.485164] [drm] VCN decode and encode initialized successfully(under DPG Mode).
<6>[   12.485189] [drm] JPEG decode initialized successfully.
<6>[   12.487577] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
<6>[   12.487939] amdgpu: sdma_bitmap: 3
<6>[   12.500346] amdgpu: HMM registered 512MB device memory
<4>[   12.500545] amdgpu: SRAT table not found
<6>[   12.500547] amdgpu: Virtual CRAT table created for GPU
<6>[   12.505602] amdgpu: Topology: Add dGPU node [0x164e:0x1002]
<6>[   12.505606] kfd kfd: amdgpu: added device 1002:164e
<6>[   12.505619] amdgpu 0000:1a:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
<6>[   12.506662] amdgpu 0000:1a:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
<6>[   12.506665] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
dmesg-efi_pstore-168849056407002:
Oops#2 Part7
<6>[   12.506668] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
<6>[   12.506670] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
<6>[   12.506672] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
<6>[   12.506674] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
<6>[   12.506677] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
<6>[   12.506679] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
<6>[   12.506681] amdgpu 0000:1a:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
<6>[   12.506683] amdgpu 0000:1a:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
<6>[   12.506686] amdgpu 0000:1a:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
<6>[   12.506688] amdgpu 0000:1a:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
<6>[   12.506690] amdgpu 0000:1a:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
<6>[   12.506693] amdgpu 0000:1a:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
<6>[   12.506695] amdgpu 0000:1a:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
<6>[   12.517744] [drm] Initialized amdgpu 3.52.0 20150101 for 0000:1a:00.0 on minor 1
<6>[   12.521606] amdgpu 0000:1a:00.0: [drm] Cannot find any crtc or sizes
<6>[   13.505105] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-1000:00: attached PHY driver (mii_bus:phy_addr=r8169-0-1000:00, irq=MAC)
<6>[   13.560559] loop24: detected capacity change from 0 to 8
<6>[   13.673342] r8169 0000:10:00.0 enp16s0: Link is Down
<6>[   16.244015] r8169 0000:10:00.0 enp16s0: Link is Up - 1Gbps/Full - flow control rx/tx
<6>[   16.244041] IPv6: ADDRCONF(NETDEV_CHANGE): enp16s0: link becomes ready
<4>[   19.161428] kauditd_printk_skb: 71 callbacks suppressed
dmesg-efi_pstore-168849056406002:
Oops#2 Part6
<5>[   19.161430] audit: type=1400 audit(1688490525.372:83): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2247 comm="snap-confine" capability=12  capname="net_admin"
<5>[   19.161596] audit: type=1400 audit(1688490525.372:84): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2247 comm="snap-confine" capability=38  capname="perfmon"
<7>[   20.536339] rfkill: input handler disabled
<5>[   26.592370] audit: type=1400 audit(1688490532.803:85): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2700 comm="snap-confine" capability=12  capname="net_admin"
<5>[   26.592657] audit: type=1400 audit(1688490532.803:86): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=2700 comm="snap-confine" capability=38  capname="perfmon"
<7>[   26.752279] rfkill: input handler enabled
<7>[   28.538548] rfkill: input handler disabled
<5>[   29.715605] audit: type=1400 audit(1688490535.926:87): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=3324 comm="snap-confine" capability=12  capname="net_admin"
<5>[   29.715780] audit: type=1400 audit(1688490535.926:88): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=3324 comm="snap-confine" capability=38  capname="perfmon"
<4>[   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[   40.529723] CPU: 0 PID: 3492 Comm: thunderbird Not tainted 6.4.0-rc2-crash2-kees2-00001-g2d47c6956ab3-dirty #5
<4>[   40.529725] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
dmesg-efi_pstore-168849056405002:
Oops#2 Part5
<4>[   40.529726] RIP: 0010:alloc_pid+0x46c/0x480
<4>[   40.529730] Code: 00 92 49 c7 c4 f4 ff ff ff e8 50 bc 15 01 4c 89 ff e8 68 50 13 00 e9 ec fd ff ff be 02 00 00 00 e8 89 5f 71 00 e9 f8 fe ff ff <0f> 0b 49 c7 c4 f4 ff ff ff e9 b9 fb ff ff 66 0f 1f 44 00 00 90 90
<4>[   40.529731] RSP: 0018:ffffad8c45313c48 EFLAGS: 00010202
<4>[   40.529733] RAX: 0000000080000000 RBX: 0000000000000001 RCX: 0000000000000000
<4>[   40.529734] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
<4>[   40.529734] RBP: ffffad8c45313c98 R08: 0000000000000000 R09: 0000000000000000
<4>[   40.529735] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9cbdff1c63a8
<4>[   40.529735] R13: ffff9cbde9b08750 R14: 0000000000000001 R15: ffff9cbdff1c63a8
<4>[   40.529736] FS:  00007f50d863e780(0000) GS:ffff9ccc97a00000(0000) knlGS:0000000000000000
<4>[   40.529737] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   40.529737] CR2: 0000000000000000 CR3: 00000001b0ae0000 CR4: 0000000000750ef0
<4>[   40.529738] PKRU: 55555554
<4>[   40.529739] Call Trace:
<4>[   40.529739]  <TASK>
<4>[   40.529741]  copy_process+0x165f/0x2110
<4>[   40.529744]  kernel_clone+0x9d/0x3a0
<4>[   40.529745]  ? find_held_lock+0x31/0xa0
<4>[   40.529747]  ? mntput_no_expire+0x89/0x4f0
<4>[   40.529749]  ? lock_release+0xc4/0x270
<4>[   40.529751]  __do_sys_clone+0x66/0xa0
<4>[   40.529754]  __x64_sys_clone+0x25/0x40
<4>[   40.529755]  do_syscall_64+0x59/0x90
<4>[   40.529758]  ? syscall_exit_to_user_mode+0x39/0x60
<4>[   40.529760]  ? do_syscall_64+0x69/0x90
<4>[   40.529761]  ? irqentry_exit_to_user_mode+0x27/0x40
<4>[   40.529762]  ? irqentry_exit+0x77/0xb0
<4>[   40.529764]  ? exc_page_fault+0xae/0x240
<4>[   40.529765]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
<4>[   40.529767] RIP: 0033:0x7f50d811ea3d
dmesg-efi_pstore-168849056404002:
Oops#2 Part4
<4>[   40.529769] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48
<4>[   40.529770] RSP: 002b:00007ffcc449ce58 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
<4>[   40.529771] RAX: ffffffffffffffda RBX: 0000000000000051 RCX: 00007f50d811ea3d
<4>[   40.529771] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000030000011
<4>[   40.529772] RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f50d82b97c0
<4>[   40.529772] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000011
<4>[   40.529773] R13: 00007f50d7e16980 R14: 00007f50d863e6c0 R15: 00007f50d82ba3c0
<4>[   40.529775]  </TASK>
<4>[   40.529776] Modules linked in: binfmt_misc f2fs crc32_generic lz4hc_compress lz4_compress nls_iso8859_1 intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi edac_mce_amd crct10dif_pclmul snd_hda_intel polyval_clmulni snd_intel_dspcfg polyval_generic ghash_clmulni_intel snd_intel_sdw_acpi snd_seq_midi sha512_ssse3 snd_seq_midi_event snd_hda_codec aesni_intel snd_hda_core crypto_simd cryptd snd_hwdep joydev input_leds snd_rawmidi rapl amdgpu snd_pcm ccp wmi_bmof snd_seq k10temp snd_seq_device iommu_v2 snd_timer drm_buddy gpu_sched drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec snd drm_kms_helper i2c_algo_bit syscopyarea sysfillrect sysimgblt soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone fuse efi_pstore drm ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic nvme nvme_core ahci xhci_pci i2c_piix4 crc32_pclmul nvme_common libahci xhci_pci_renesas r8169 realtek video wmi
dmesg-efi_pstore-168849056403002:
Oops#2 Part3
<4>[   40.529799]  gpio_amdpt
<4>[   40.529801] ---[ end trace 0000000000000000 ]---
<4>[   40.865489] RIP: 0010:alloc_pid+0x46c/0x480
<4>[   40.865491] Code: 00 92 49 c7 c4 f4 ff ff ff e8 50 bc 15 01 4c 89 ff e8 68 50 13 00 e9 ec fd ff ff be 02 00 00 00 e8 89 5f 71 00 e9 f8 fe ff ff <0f> 0b 49 c7 c4 f4 ff ff ff e9 b9 fb ff ff 66 0f 1f 44 00 00 90 90
<4>[   40.865492] RSP: 0018:ffffad8c45313c48 EFLAGS: 00010202
<4>[   40.865494] RAX: 0000000080000000 RBX: 0000000000000001 RCX: 0000000000000000
<4>[   40.865495] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
<4>[   40.865495] RBP: ffffad8c45313c98 R08: 0000000000000000 R09: 0000000000000000
<4>[   40.865496] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9cbdff1c63a8
<4>[   40.865497] R13: ffff9cbde9b08750 R14: 0000000000000001 R15: ffff9cbdff1c63a8
<4>[   40.865497] FS:  00007f50d863e780(0000) GS:ffff9ccc97a00000(0000) knlGS:0000000000000000
<4>[   40.865498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   40.865499] CR2: 0000000000000000 CR3: 00000001b0ae0000 CR4: 0000000000750ef0
<4>[   40.865500] PKRU: 55555554
<4>[   58.206209] invalid opcode: 0000 [#2] PREEMPT SMP NOPTI
<4>[   58.206213] CPU: 0 PID: 3502 Comm: thunderbird Tainted: G      D            6.4.0-rc2-crash2-kees2-00001-g2d47c6956ab3-dirty #5
<4>[   58.206215] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
<4>[   58.206216] RIP: 0010:alloc_pid+0x46c/0x480
<4>[   58.206220] Code: 00 92 49 c7 c4 f4 ff ff ff e8 50 bc 15 01 4c 89 ff e8 68 50 13 00 e9 ec fd ff ff be 02 00 00 00 e8 89 5f 71 00 e9 f8 fe ff ff <0f> 0b 49 c7 c4 f4 ff ff ff e9 b9 fb ff ff 66 0f 1f 44 00 00 90 90
dmesg-efi_pstore-168849056402002:
Oops#2 Part2
<4>[   58.206221] RSP: 0018:ffffad8c45a7bc18 EFLAGS: 00010202
<4>[   58.206222] RAX: 0000000080000000 RBX: 0000000000000001 RCX: 0000000000000000
<4>[   58.206223] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
<4>[   58.206224] RBP: ffffad8c45a7bc68 R08: 0000000000000000 R09: 0000000000000000
<4>[   58.206224] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9cbdff1c50a8
<4>[   58.206225] R13: ffff9cbde9b0b850 R14: 0000000000000001 R15: ffff9cbdff1c50a8
<4>[   58.206226] FS:  00007f8d85c6a780(0000) GS:ffff9ccc97a00000(0000) knlGS:0000000000000000
<4>[   58.206226] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   58.206227] CR2: 0000000000000000 CR3: 0000000113cc6000 CR4: 0000000000750ef0
<4>[   58.206228] PKRU: 55555554
<4>[   58.206228] Call Trace:
<4>[   58.206229]  <TASK>
<4>[   58.206230]  copy_process+0x165f/0x2110
<4>[   58.206232]  ? trace_preempt_on+0x2e/0xa0
<4>[   58.206236]  kernel_clone+0x9d/0x3a0
<4>[   58.206237]  ? mntput_no_expire+0xa1/0x4f0
<4>[   58.206239]  ? __dentry_kill+0x15f/0x1c0
<4>[   58.206241]  __do_sys_clone+0x66/0xa0
<4>[   58.206243]  __x64_sys_clone+0x25/0x40
<4>[   58.206244]  do_syscall_64+0x59/0x90
<4>[   58.206247]  ? syscall_exit_to_user_mode+0x39/0x60
<4>[   58.206250]  ? do_syscall_64+0x69/0x90
<4>[   58.206251]  ? syscall_exit_to_user_mode+0x39/0x60
<4>[   58.206252]  ? do_syscall_64+0x69/0x90
<4>[   58.206253]  ? irqentry_exit+0x77/0xb0
<4>[   58.206254]  ? exc_page_fault+0xae/0x240
<4>[   58.206256]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
<4>[   58.206258] RIP: 0033:0x7f8d8571ea3d
<4>[   58.206260] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48
dmesg-efi_pstore-168849056401002:
Oops#2 Part1
<4>[   58.206260] RSP: 002b:00007ffcc4c1bc28 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
<4>[   58.206261] RAX: ffffffffffffffda RBX: 0000000000000051 RCX: 00007f8d8571ea3d
<4>[   58.206262] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000030000011
<4>[   58.206263] RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f8d858fe7c0
<4>[   58.206263] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000011
<4>[   58.206264] R13: 00007f8d85516980 R14: 00007f8d85c6a6c0 R15: 00007f8d858ff3c0
<4>[   58.206266]  </TASK>
<4>[   58.206266] Modules linked in: binfmt_misc f2fs crc32_generic lz4hc_compress lz4_compress nls_iso8859_1 intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi edac_mce_amd crct10dif_pclmul snd_hda_intel polyval_clmulni snd_intel_dspcfg polyval_generic ghash_clmulni_intel snd_intel_sdw_acpi snd_seq_midi sha512_ssse3 snd_seq_midi_event snd_hda_codec aesni_intel snd_hda_core crypto_simd cryptd snd_hwdep joydev input_leds snd_rawmidi rapl amdgpu snd_pcm ccp wmi_bmof snd_seq k10temp snd_seq_device iommu_v2 snd_timer drm_buddy gpu_sched drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec snd drm_kms_helper i2c_algo_bit syscopyarea sysfillrect sysimgblt soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone fuse efi_pstore drm ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic nvme nvme_core ahci xhci_pci i2c_piix4 crc32_pclmul nvme_common libahci xhci_pci_renesas r8169 realtek video wmi
<4>[   58.206290]  gpio_amdpt
<4>[   58.206299] ---[ end trace 0000000000000000 ]---


* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-04 23:15                       ` Mirsad Todorovac
@ 2023-07-05  2:09                         ` Kees Cook
  2023-07-05  5:18                           ` Mirsad Todorovac
                                             ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Kees Cook @ 2023-07-05  2:09 UTC (permalink / raw)
  To: Mirsad Todorovac, Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

On July 4, 2023 4:15:20 PM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
>On 7/4/23 23:36, Kees Cook wrote:
>> On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
>>> On 7/4/23 01:09, Kees Cook wrote:
>>>> On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
>>>>> Cool. xhci-hub is in your backtrace, and the above patch was made for
>>>>> something very similar (though, again, I don't see why you're getting a
>>>>> _crash_, it should _warn_ and continue normally). And, actually, also
>>>>> include this patch:
>>>>> https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@kernel.org/
>>>> 
>>>> This is now in Linus's tree:
>>>> 09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")
>>>> 
>>>> Please also still try with the first patch I mentioned, which is very similar:
>>>> https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
>>> 
>>> Hi,
>>> 
>>> I have finally built with both patches (and the recommended PSTORE settings were
>>> default already).
>> 
>> Were you able to find the crashes saved by pstore?
>
>No, only lkdtm and invalid opcode crashes ...
>
>P.S.
>
>Actually, I have recovered some pstore records. Please find them in the attachment:
>
>>> This second patch fixes the booting problem, but alas there is still a problem -
>> 
>> Ah! That's great! There is still an unexpected crash source, but the trigger is fixed.
>
>Glad I could be of help.
>
>>> all Wayland and X11.org GUI applications fail to start, with errors like this one:
>>> 
>>> Jul  4 19:09:07 defiant kernel: [   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>> 
>> Hmm, is CONFIG_UBSAN_TRAP set?
>
>marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
>CONFIG_UBSAN_TRAP=y

Ah-ha! Turn that off please. With it off you will get much more useful reports from UBSAN.

>marvin@defiant:~/linux/kernel/linux_torvalds$
>
>>> Jul  4 19:09:07 defiant kernel: [   40.529726] RIP: 0010:alloc_pid+0x46c/0x480
>> 
>> Hmm, is this patch in your kernel?
>> https://git.kernel.org/linus/b69f0aeb068980af983d399deafc7477cec8bc04
>
>No, it wasn't. I had only these:
>
>marvin@defiant:~/linux/kernel/linux_torvalds$ more ../kees-[12].patch
>::::::::::::::
>../kees-1.patch
>::::::::::::::
>diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>index b17e3a21b15f..82ec6af71a1d 100644
>--- a/include/uapi/linux/usb/ch9.h
>+++ b/include/uapi/linux/usb/ch9.h
>@@ -376,7 +376,10 @@ struct usb_string_descriptor {
> 	__u8  bLength;
> 	__u8  bDescriptorType;
> 
>-	__le16 wData[1];		/* UTF-16LE encoded */
>+	union {
>+		__le16 legacy_padding;
>+		__DECLARE_FLEX_ARRAY(__le16, wData);	/* UTF-16LE encoded */
>+	};
> } __attribute__ ((packed));
> 
> /* note that "string" zero is special, it holds language codes that
>::::::::::::::
>../kees-2.patch
>::::::::::::::
>diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>index b17e3a21b15f..3ff98c7ba7e3 100644
>--- a/include/uapi/linux/usb/ch9.h
>+++ b/include/uapi/linux/usb/ch9.h
>@@ -981,7 +981,11 @@ struct usb_ssp_cap_descriptor {
> #define USB_SSP_MIN_RX_LANE_COUNT		(0xf << 8)
> #define USB_SSP_MIN_TX_LANE_COUNT		(0xf << 12)
> 	__le16 wReserved;
>-	__le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
>+	union {
>+		__le32 legacy_padding;
>+		/* list of sublink speed attrib entries */
>+		__DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
>+	};
> #define USB_SSP_SUBLINK_SPEED_SSID	(0xf)		/* sublink speed ID */
> #define USB_SSP_SUBLINK_SPEED_LSE	(0x3 << 4)	/* Lanespeed exponent */
> #define USB_SSP_SUBLINK_SPEED_LSE_BPS		0
>marvin@defiant:~/linux/kernel/linux_torvalds$
>
>---------------------------------------------------------
>
>Now it works. Succeeded boot and running of X apps with the new git pull
>torvalds tree and the kees-2.patch.

Perfect! Okay, so it looks like all the issues are known and fixed. I'll work with Greg to get the other ch9 patch landed.
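
For reference, the pattern both ch9 patches use can be sketched in userspace. This is only an illustration, assuming GCC or Clang with GNU extensions: DEMO_FLEX_ARRAY and the demo_* structs are hypothetical names modelled on __DECLARE_FLEX_ARRAY() and struct usb_string_descriptor, not the kernel definitions themselves. It shows that replacing the 1-element array with a union of a padding member and a flexible array should leave the size and member offset unchanged, which is why the uapi layout stays compatible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in modelled on the kernel's __DECLARE_FLEX_ARRAY(). */
#define DEMO_FLEX_ARRAY(TYPE, NAME)			\
	struct {					\
		struct { } demo_empty_ ## NAME;		\
		TYPE NAME[];				\
	}

/* Old layout: a 1-element array that really holds a variable-length tail. */
struct demo_string_desc_old {
	uint8_t  bLength;
	uint8_t  bDescriptorType;
	uint16_t wData[1];
} __attribute__ ((packed));

/* New layout: same bytes, but wData is now a true flexible array. */
struct demo_string_desc_new {
	uint8_t  bLength;
	uint8_t  bDescriptorType;
	union {
		uint16_t legacy_padding;	/* keeps sizeof() unchanged */
		DEMO_FLEX_ARRAY(uint16_t, wData);
	};
} __attribute__ ((packed));

/* The layout seen by existing users must not move. */
static_assert(sizeof(struct demo_string_desc_old) ==
	      sizeof(struct demo_string_desc_new), "size changed");
static_assert(offsetof(struct demo_string_desc_old, wData) ==
	      offsetof(struct demo_string_desc_new, wData), "offset changed");
```

With the old layout the compiler believes wData has exactly one element, so an index of 1 or more is reported by the tightened UBSAN_BOUNDS; with the flexible array there is no fixed bound to violate.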

>
>Praise God!
>
>This is the git log --oneline:
>
>d528014517f2 (HEAD, origin/master, origin/HEAD) Revert ".gitignore: ignore *.cover and *.mbx"
>04f2933d375e Merge tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue
>03275585cabd afs: Fix accidental truncation when storing data
>538140ca602b Merge tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
>94c76955e86a Merge tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
>ccf46d853183 Merge tag 'pm-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>b869e9f49964 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
>406fb9eb198a Merge tag 'firewire-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
>f1962207150c module: fix init_module_from_file() error handling
>40c565a429d7 Merge branches 'pm-cpufreq' and 'pm-cpuidle'
>f679e89acdd3 clk: tegra: Avoid calling an uninitialized function
>
>So, the included patch is:
>
>marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
>diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>index 82ec6af71a1d..62d318377379 100644
>--- a/include/uapi/linux/usb/ch9.h
>+++ b/include/uapi/linux/usb/ch9.h
>@@ -984,7 +984,11 @@ struct usb_ssp_cap_descriptor {
> #define USB_SSP_MIN_RX_LANE_COUNT              (0xf << 8)
> #define USB_SSP_MIN_TX_LANE_COUNT              (0xf << 12)
>        __le16 wReserved;
>-       __le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
>+       union {
>+               __le32 legacy_padding;
>+               /* list of sublink speed attrib entries */
>+               __DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
>+       };
> #define USB_SSP_SUBLINK_SPEED_SSID     (0xf)           /* sublink speed ID */
> #define USB_SSP_SUBLINK_SPEED_LSE      (0x3 << 4)      /* Lanespeed exponent */
> #define USB_SSP_SUBLINK_SPEED_LSE_BPS          0
>marvin@defiant:~/linux/kernel/linux_torvalds$
>
>This means vanilla torvalds tree + https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
>works, but vanilla torvalds tree w/o patch still crashes.

Great, thanks again for testing it all!

-Kees

>
>I am still rather new to the utilisation of the PSTORE subsystem.
>
>Best regards,
>Mirsad Todorovac

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-05  2:09                         ` Kees Cook
@ 2023-07-05  5:18                           ` Mirsad Todorovac
  2023-07-05 15:16                           ` CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot] Jann Horn
  2023-07-06  5:02                           ` [CRASH][BISECTED] 6.4.1 crash in boot Mirsad Todorovac
  2 siblings, 0 replies; 27+ messages in thread
From: Mirsad Todorovac @ 2023-07-05  5:18 UTC (permalink / raw)
  To: Kees Cook, Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

On 7/5/23 04:09, Kees Cook wrote:
> On July 4, 2023 4:15:20 PM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
>> On 7/4/23 23:36, Kees Cook wrote:
>>> On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
>>>> On 7/4/23 01:09, Kees Cook wrote:
>>>>> On Mon, Jul 03, 2023 at 12:03:23PM -0700, Kees Cook wrote:
>>>>>> Cool. xhci-hub is in your backtrace, and the above patch was made for
>>>>>> something very similar (though, again, I don't see why you're getting a
>>>>>> _crash_, it should _warn_ and continue normally). And, actually, also
>>>>>> include this patch:
>>>>>> https://lore.kernel.org/lkml/20230614181307.gonna.256-kees@kernel.org/
>>>>>
>>>>> This is now in Linus's tree:
>>>>> 09b69dd4378b ("usb: ch9: Replace 1-element array with flexible array")
>>>>>
>>>>> Please also still try with the first patch I mentioned, which is very similar:
>>>>> https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
>>>>
>>>> Hi,
>>>>
>>>> I have finally built w both patches (and recommended PSTORE settings were
>>>> default already).
>>>
>>> Were you able to find the crashes saved by pstore?
>>
>> No, only lkdtm and invalid opcode crashes ...
>>
>> P.S.
>>
>> Actually, I have recovered some pstore records. Please find them in the attachment:
>>
>>>> This second patch fixes the booting problem, but alas there is still a problem -
>>>
>>> Ah! That's great! There is still an unexpected crash source, but the trigger is fixed.
>>
>> Glad I could be of help.
>>
>>>> all Wayland and X11.org GUI applications fail to start, with errors like this one:
>>>>
>>>> Jul  4 19:09:07 defiant kernel: [   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>>>
>>> Hmm, is CONFIG_UBSAN_TRAP set?
>>
>> marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
>> CONFIG_UBSAN_TRAP=y
> Ah-ha! Turn that off please. With it off you will get much more useful reports from UBSAN.

Will do that. Thanks for the hint.

>> marvin@defiant:~/linux/kernel/linux_torvalds$
>>
>>>> Jul  4 19:09:07 defiant kernel: [   40.529726] RIP: 0010:alloc_pid+0x46c/0x480
>>>
>>> Hmm, is this patch in your kernel?
>>> https://git.kernel.org/linus/b69f0aeb068980af983d399deafc7477cec8bc04
>>
>> No, it wasn't. I had only these:
>>
>> marvin@defiant:~/linux/kernel/linux_torvalds$ more ../kees-[12].patch
>> ::::::::::::::
>> ../kees-1.patch
>> ::::::::::::::
>> diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>> index b17e3a21b15f..82ec6af71a1d 100644
>> --- a/include/uapi/linux/usb/ch9.h
>> +++ b/include/uapi/linux/usb/ch9.h
>> @@ -376,7 +376,10 @@ struct usb_string_descriptor {
>> 	__u8  bLength;
>> 	__u8  bDescriptorType;
>> -	__le16 wData[1];		/* UTF-16LE encoded */
>> +	union {
>> +		__le16 legacy_padding;
>> +		__DECLARE_FLEX_ARRAY(__le16, wData);	/* UTF-16LE encoded */
>> +	};
>> } __attribute__ ((packed));
>>   /* note that "string" zero is special, it holds language codes that
>> ::::::::::::::
>> ../kees-2.patch
>> ::::::::::::::
>> diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>> index b17e3a21b15f..3ff98c7ba7e3 100644
>> --- a/include/uapi/linux/usb/ch9.h
>> +++ b/include/uapi/linux/usb/ch9.h
>> @@ -981,7 +981,11 @@ struct usb_ssp_cap_descriptor {
>> #define USB_SSP_MIN_RX_LANE_COUNT		(0xf << 8)
>> #define USB_SSP_MIN_TX_LANE_COUNT		(0xf << 12)
>> 	__le16 wReserved;
>> -	__le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
>> +	union {
>> +		__le32 legacy_padding;
>> +		/* list of sublink speed attrib entries */
>> +		__DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
>> +	};
>> #define USB_SSP_SUBLINK_SPEED_SSID	(0xf)		/* sublink speed ID */
>> #define USB_SSP_SUBLINK_SPEED_LSE	(0x3 << 4)	/* Lanespeed exponent */
>> #define USB_SSP_SUBLINK_SPEED_LSE_BPS		0
>> marvin@defiant:~/linux/kernel/linux_torvalds$
>>
>> ---------------------------------------------------------
>>
>> Now it works. Succeeded boot and running of X apps with the new git pull
>> torvalds tree and the kees-2.patch.
> 
> Perfect! Okay, so it looks like all the issues are known and fixed. I'll work with Greg to get the other ch9 patch landed.

Yes, maybe it should be tested more widely first. It was an unobvious bug and
I couldn't see what went wrong ...

>> Praise God!
>>
>> This is the git log --oneline:
>>
>> d528014517f2 (HEAD, origin/master, origin/HEAD) Revert ".gitignore: ignore *.cover and *.mbx"
>> 04f2933d375e Merge tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue
>> 03275585cabd afs: Fix accidental truncation when storing data
>> 538140ca602b Merge tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
>> 94c76955e86a Merge tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
>> ccf46d853183 Merge tag 'pm-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>> b869e9f49964 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
>> 406fb9eb198a Merge tag 'firewire-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
>> f1962207150c module: fix init_module_from_file() error handling
>> 40c565a429d7 Merge branches 'pm-cpufreq' and 'pm-cpuidle'
>> f679e89acdd3 clk: tegra: Avoid calling an uninitialized function
>>
>> So, the included patch is:
>>
>> marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
>> diff --git a/include/uapi/linux/usb/ch9.h b/include/uapi/linux/usb/ch9.h
>> index 82ec6af71a1d..62d318377379 100644
>> --- a/include/uapi/linux/usb/ch9.h
>> +++ b/include/uapi/linux/usb/ch9.h
>> @@ -984,7 +984,11 @@ struct usb_ssp_cap_descriptor {
>> #define USB_SSP_MIN_RX_LANE_COUNT              (0xf << 8)
>> #define USB_SSP_MIN_TX_LANE_COUNT              (0xf << 12)
>>         __le16 wReserved;
>> -       __le32 bmSublinkSpeedAttr[1]; /* list of sublink speed attrib entries */
>> +       union {
>> +               __le32 legacy_padding;
>> +               /* list of sublink speed attrib entries */
>> +               __DECLARE_FLEX_ARRAY(__le32, bmSublinkSpeedAttr);
>> +       };
>> #define USB_SSP_SUBLINK_SPEED_SSID     (0xf)           /* sublink speed ID */
>> #define USB_SSP_SUBLINK_SPEED_LSE      (0x3 << 4)      /* Lanespeed exponent */
>> #define USB_SSP_SUBLINK_SPEED_LSE_BPS          0
>> marvin@defiant:~/linux/kernel/linux_torvalds$
>>
>> This means vanilla torvalds tree + https://lore.kernel.org/lkml/20230629190900.never.787-kees@kernel.org/
>> works, but vanilla torvalds tree w/o patch still crashes.
> 
> Great, thanks again for testing it all!

Not at all, I'm glad I could be of assistance.

Best regards,
Mirsad Todorovac

> -Kees
> 
>>
>> I am still rather new to the utilisation of the PSTORE subsystem.
>>
>> Best regards,
>> Mirsad Todorovac
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot]
  2023-07-05  2:09                         ` Kees Cook
  2023-07-05  5:18                           ` Mirsad Todorovac
@ 2023-07-05 15:16                           ` Jann Horn
  2023-07-05 21:08                             ` Kees Cook
  2023-07-06  5:02                           ` [CRASH][BISECTED] 6.4.1 crash in boot Mirsad Todorovac
  2 siblings, 1 reply; 27+ messages in thread
From: Jann Horn @ 2023-07-05 15:16 UTC (permalink / raw)
  To: Kees Cook; +Cc: Linux Kernel Mailing List, Linux LLVM, the arch/x86 maintainers

On Wed, Jul 5, 2023 at 4:10 AM Kees Cook <kees@kernel.org> wrote:
> On July 4, 2023 4:15:20 PM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
> >On 7/4/23 23:36, Kees Cook wrote:
> >> On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
> >>> all Wayland and X11.org GUI applications fail to start, with errors like this one:
> >>>
> >>> Jul  4 19:09:07 defiant kernel: [   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> >>
> >> Hmm, is CONFIG_UBSAN_TRAP set?
> >
> >marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
> >CONFIG_UBSAN_TRAP=y
>
> Ah-ha! Turn that off please. With it off you will get much more useful reports from UBSAN.

It might be useful if the x86 code under handle_invalid_op() at least
printed a warning about this when the kernel crashes with #UD on a
system with CONFIG_UBSAN_TRAP=y? It seems pretty unintuitive and
unhelpful that the kernel just crashes itself with a #UD and no
further information in this configuration.

Even just a "WARNING: CONFIG_UBSAN_TRAP active, #UD might be caused by
that" on every #UD that does not come from a known BUG() location or
such might be better than nothing...

And maybe the Kconfig help text could be clearer on this, too.
Currently it does say that this turns warnings into "full exceptions
that abort the running kernel code" but it does not say that the
exception reporting will become pretty unhelpful, so it's probably not
really what you'd want for debugging.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot]
  2023-07-05 15:16                           ` CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot] Jann Horn
@ 2023-07-05 21:08                             ` Kees Cook
  2023-07-05 21:31                               ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Kees Cook @ 2023-07-05 21:08 UTC (permalink / raw)
  To: Jann Horn
  Cc: Kees Cook, Linux Kernel Mailing List, Linux LLVM,
	the arch/x86 maintainers

On Wed, Jul 05, 2023 at 05:16:36PM +0200, Jann Horn wrote:
> On Wed, Jul 5, 2023 at 4:10 AM Kees Cook <kees@kernel.org> wrote:
> > On July 4, 2023 4:15:20 PM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
> > >On 7/4/23 23:36, Kees Cook wrote:
> > >> On July 4, 2023 10:20:11 AM PDT, Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> wrote:
> > >>> all Wayland and X11.org GUI applications fail to start, with errors like this one:
> > >>>
> > >>> Jul  4 19:09:07 defiant kernel: [   40.529719] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > >>
> > >> Hmm, is CONFIG_UBSAN_TRAP set?
> > >
> > >marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
> > >CONFIG_UBSAN_TRAP=y
> >
> > Ah-ha! Turn that off please. With it off you will get much more useful reports from UBSAN.
> 
> It might be useful if the x86 code under handle_invalid_op() at least
> printed a warning about this when the kernel crashes with #UD on a
> system with CONFIG_UBSAN_TRAP=y? It seems pretty unintuitive and
> unhelpful that the kernel just crashes itself with a #UD and no
> further information in this configuration.
> 
> Even just a "WARNING: CONFIG_UBSAN_TRAP active, #UD might be caused by
> that" on every #UD that does not come from a known BUG() location or
> such might be better than nothing...

I've considered it, but usually CONFIG_UBSAN_TRAP isn't accidentally
set. Also, the crash info is something we can get help from on the
compiler side, to mark up where the traps are, similar to what we do
with KCFI, but it hasn't happened yet for x86. For example, arm64
already encodes the details in the trap instruction itself:
https://git.kernel.org/linus/25b84002afb9dc9a91a7ea67166879c13ad82422

> And maybe the Kconfig help text could be clearer on this, too.
> Currently it does say that this turns warnings into "full exceptions
> that abort the running kernel code" but it does not say that the
> exception reporting will become pretty unhelpful, so it's probably not
> really what you'd want for debugging.

Yeah, that's a reasonable change to make. Can you send a patch for this?
I can carry it.

Thanks!

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot]
  2023-07-05 21:08                             ` Kees Cook
@ 2023-07-05 21:31                               ` Peter Zijlstra
  2023-07-05 21:54                                 ` Kees Cook
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2023-07-05 21:31 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jann Horn, Kees Cook, Linux Kernel Mailing List, Linux LLVM,
	the arch/x86 maintainers

On Wed, Jul 05, 2023 at 02:08:09PM -0700, Kees Cook wrote:

> > Even just a "WARNING: CONFIG_UBSAN_TRAP active, #UD might be caused by
> > that" on every #UD that does not come from a known BUG() location or
> > such might be better than nothing...
> 
> I've considered it, but usually CONFIG_UBSAN_TRAP isn't accidentally
> set. Also, the crash info is something we can get help from on the
> compiler side, to mark up where the traps are, similar to what we do
> with KCFI, but it hasn't happened yet for x86. For example, arm64
> already encodes the details in the trap instruction itself:
> https://git.kernel.org/linus/25b84002afb9dc9a91a7ea67166879c13ad82422

Right, so you could easily use a different #UD instruction that has an
immediate, something like:

  0f b9 40 ff          ud1    -0x1(%rax),%rax

or even:

  0f b9 80 00 ff ff ff         ud1    -0x100(%rax),%rax

if you need a 32bit value.

It shouldn't be hard to fix up the #UD handler to decode the instruction
and obtain the displacement for a clue.

Typically we use ud2 because it's the smallest #UD instruction (2 bytes)
and that's enough, but if you want to provide additional clues, there's
options...
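
As a rough sketch of what "decode the instruction and obtain the displacement" could look like (an illustration only, not the kernel's #UD handler): ud1_displacement() below is a hypothetical helper that assumes the bytes start directly at the 0f b9 opcode with no prefixes, and it handles only the disp8 and disp32 forms shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: pull the displacement out of a
 * UD1 instruction (0f b9 /r) so it can be used as a small payload.
 * Returns 0 on success, -1 if the bytes are not a UD1 with a displacement.
 */
static int ud1_displacement(const uint8_t *insn, size_t len, int32_t *disp)
{
	assert(insn != NULL && disp != NULL);

	if (len < 3 || insn[0] != 0x0f || insn[1] != 0xb9)
		return -1;

	uint8_t modrm = insn[2];
	uint8_t mod = modrm >> 6;
	size_t pos = 3;

	/* rm == 4 with a memory operand means a SIB byte follows ModRM. */
	if (mod != 3 && (modrm & 7) == 4)
		pos++;

	if (mod == 1) {			/* 8-bit displacement, sign-extended */
		if (len < pos + 1)
			return -1;
		*disp = (int8_t)insn[pos];
		return 0;
	}
	if (mod == 2) {			/* 32-bit little-endian displacement */
		if (len < pos + 4)
			return -1;
		uint32_t raw = (uint32_t)insn[pos] |
			       (uint32_t)insn[pos + 1] << 8 |
			       (uint32_t)insn[pos + 2] << 16 |
			       (uint32_t)insn[pos + 3] << 24;
		*disp = (int32_t)raw;
		return 0;
	}
	return -1;			/* no displacement encoded */
}
```

Fed the two encodings above, this yields -1 for the 4-byte form and -0x100 for the 7-byte form; a plain ud2 (0f 0b) is rejected.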



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot]
  2023-07-05 21:31                               ` Peter Zijlstra
@ 2023-07-05 21:54                                 ` Kees Cook
  0 siblings, 0 replies; 27+ messages in thread
From: Kees Cook @ 2023-07-05 21:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jann Horn, Kees Cook, Linux Kernel Mailing List, Linux LLVM,
	the arch/x86 maintainers

On Wed, Jul 05, 2023 at 11:31:13PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 05, 2023 at 02:08:09PM -0700, Kees Cook wrote:
> 
> > > Even just a "WARNING: CONFIG_UBSAN_TRAP active, #UD might be caused by
> > > that" on every #UD that does not come from a known BUG() location or
> > > such might be better than nothing...
> > 
> > I've considered it, but usually CONFIG_UBSAN_TRAP isn't accidentally
> > set. Also, the crash info is something we can get help from on the
> > compiler side, to mark up where the traps are, similar to what we do
> > with KCFI, but it hasn't happened yet for x86. For example, arm64
> > already encodes the details in the trap instruction itself:
> > https://git.kernel.org/linus/25b84002afb9dc9a91a7ea67166879c13ad82422
> 
> Right, so you could easily use a different #UD instruction that has an
> immediate, something like:
> 
>   0f b9 40 ff          ud1    -0x1(%rax),%rax

Ah yeah, that would be easier, probably. It could match what arm64 does.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [CRASH][BISECTED] 6.4.1 crash in boot
  2023-07-05  2:09                         ` Kees Cook
  2023-07-05  5:18                           ` Mirsad Todorovac
  2023-07-05 15:16                           ` CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot] Jann Horn
@ 2023-07-06  5:02                           ` Mirsad Todorovac
  2 siblings, 0 replies; 27+ messages in thread
From: Mirsad Todorovac @ 2023-07-06  5:02 UTC (permalink / raw)
  To: Kees Cook, Kees Cook
  Cc: Guenter Roeck, Bagas Sanjaya, Linux Kernel Mailing List,
	Linux LLVM, linux-kbuild, Linux Regressions, Nathan Chancellor,
	Nick Desaulniers, linux-hardening

On 7/5/23 04:09, Kees Cook wrote:
>>>
>>> Hmm, is CONFIG_UBSAN_TRAP set?
>>
>> marvin@defiant:~/linux/kernel/linux_torvalds$ grep CONFIG_UBSAN_TRAP .config
>> CONFIG_UBSAN_TRAP=y
> 
> Ah-ha! Turn that off please. With it off you will get much more useful reports from UBSAN.

Done that. And it appears to work.

Great job.

There should be a way to store the earliest kernel messages while in the initrd phase, but
I can't think of any either ...

Have a nice day!

Best regards,
Mirsad Todorovac

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2023-07-06  5:02 UTC | newest]

Thread overview: 27+ messages
2023-07-02 16:36 [CRASH][BISECTED] 6.4.1 crash in boot Mirsad Goran Todorovac
2023-07-03  1:44 ` Bagas Sanjaya
2023-07-03  3:20   ` Kees Cook
2023-07-03  3:26     ` Guenter Roeck
2023-07-03  3:53       ` Mirsad Goran Todorovac
2023-07-03  4:30         ` Kees Cook
2023-07-03  4:38           ` Guenter Roeck
2023-07-03  4:53             ` Kees Cook
2023-07-03  4:50           ` Mirsad Goran Todorovac
2023-07-03  3:58       ` Guenter Roeck
2023-07-03  5:18         ` Mirsad Goran Todorovac
2023-07-03  5:18         ` Mirsad Goran Todorovac
2023-07-03  5:41           ` Kees Cook
2023-07-03  7:03             ` Mirsad Goran Todorovac
2023-07-03 19:03               ` Kees Cook
2023-07-03 23:09                 ` Kees Cook
2023-07-04 17:20                   ` Mirsad Todorovac
2023-07-04 21:36                     ` Kees Cook
2023-07-04 23:15                       ` Mirsad Todorovac
2023-07-05  2:09                         ` Kees Cook
2023-07-05  5:18                           ` Mirsad Todorovac
2023-07-05 15:16                           ` CONFIG_UBSAN_TRAP #UD error message on x86 [was: Re: [CRASH][BISECTED] 6.4.1 crash in boot] Jann Horn
2023-07-05 21:08                             ` Kees Cook
2023-07-05 21:31                               ` Peter Zijlstra
2023-07-05 21:54                                 ` Kees Cook
2023-07-06  5:02                           ` [CRASH][BISECTED] 6.4.1 crash in boot Mirsad Todorovac
2023-07-03  3:40   ` Mirsad Goran Todorovac
