* Regression in 028abd92 for Sun UltraSPARC T1
@ 2021-03-22 21:30 Frank Scheiner
  2021-03-22 21:48 ` John Paul Adrian Glaubitz
  0 siblings, 1 reply; 24+ messages in thread
From: Frank Scheiner @ 2021-03-22 21:30 UTC (permalink / raw)
  To: Sparc kernel list; +Cc: debian-sparc, Christoph Hellwig

Dear all,

Riccardo Mottola first recognized a problem with 5.10.x kernels on his
Sun T2000 with UltraSPARC T1 (details in [this thread]). I could verify
the problem also on my Sun T1000 and it looks like this specific issue
breaks the mounting of the root FS or maybe mounting file systems at
all. This affects both booting from disk and from network.

[this thread]: https://lists.debian.org/debian-sparc/2021/03/msg00004.html

I bisected the Linux kernel between:

bbf5c979011a099af5dc76498918ed7df445635b (good)

...and:

3650b228f83adda7e5ee532e2b90429c03f7b9ec (bad)

...and the process identified:

028abd9222df0cf5855dab5014a5ebaf06f90565 ([1])

...as first bad commit.

```
commit 028abd9222df0cf5855dab5014a5ebaf06f90565
Author: Christoph Hellwig <hch@lst.de>
Date:   Thu Sep 17 10:22:34 2020 +0200

    fs: remove compat_sys_mount

    compat_sys_mount is identical to the regular sys_mount now, so
    remove it and use the native version everywhere.
```

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=028abd9222df0cf5855dab5014a5ebaf06f90565

Details about the bisecting on [2].

[2]: https://lists.debian.org/debian-sparc/2021/03/msg00042.html

So far this only affects UltraSPARC T1 processors. I didn't see that
problem on a T5220 with UltraSPARC T2 and I also didn't see that problem
on a Sun Ultra Enterprise 450 with UltraSPARC II when testing a recent
Debian installation media with 5.10.x kernel some weeks ago. Other
UltraSPARC processors weren't tested yet. I plan to check UltraSPARC
IIIi and maybe others if time allows.

****

Do you maybe have an idea, what could go wrong with 028abd92
specifically on an UltraSPARC T1 processor?

I can provide a full log of a broken (network) boot process if that's
useful, I just need to re-create it. IIRC the kernel oopses for each
hardware thread (similar to what Riccardo wrote on the debian-sparc
mailing list above) and then stops.

Cheers,
Frank

^ permalink raw reply [flat|nested] 24+ messages in thread
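Editorial note: the bisection described in this message is, in effect, a binary search over an ordered range of commits. The following is only a minimal sketch of that idea, assuming a linear history. The commit names are placeholders, and `boots_ok` is a hypothetical stand-in for "build this revision, boot it on the T1000, and see whether the root FS mounts"; neither comes from the thread.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first bad commit in `commits`.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad (the same assumption a
    bisection relies on).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical linear history: placeholder names only, not the real commits
# between the good and bad endpoints named in the report.
history = ["good-base", "c1", "c2", "nfs-compat-refactor",
           "remove-compat-sys-mount", "c5", "bad-tip"]
broken_from = history.index("remove-compat-sys-mount")


def boots_ok(commit: str) -> bool:
    # Stand-in for: build this revision, boot it, check that the root FS mounts.
    return history.index(commit) < broken_from


print(first_bad_commit(history, boots_ok))
```

Each step halves the remaining range, which is why a few dozen build-and-boot cycles can cover thousands of commits.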
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-22 21:30 Regression in 028abd92 for Sun UltraSPARC T1 Frank Scheiner @ 2021-03-22 21:48 ` John Paul Adrian Glaubitz 2021-03-22 21:55 ` Frank Scheiner 0 siblings, 1 reply; 24+ messages in thread From: John Paul Adrian Glaubitz @ 2021-03-22 21:48 UTC (permalink / raw) To: Frank Scheiner, Sparc kernel list; +Cc: debian-sparc, Christoph Hellwig Hello! On 3/22/21 10:30 PM, Frank Scheiner wrote: > Riccardo Mottola first recognized a problem with 5.10.x kernels on his > Sun T2000 with UltraSPARC T1 (details in [this thread]). I could verify > the problem also on my Sun T1000 and it looks like this specific issue > breaks the mounting of the root FS or maybe mounting file systems at > all. This affects both booting from disk and from network. > (...) > ...as first bad commit. > > ``` > commit 028abd9222df0cf5855dab5014a5ebaf06f90565 > Author: Christoph Hellwig <hch@lst.de> > Date: Thu Sep 17 10:22:34 2020 +0200 > > fs: remove compat_sys_mount > > compat_sys_mount is identical to the regular sys_mount now, so > remove it > and use the native version everywhere. > ``` > > [1]: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=028abd9222df0cf5855dab5014a5ebaf06f90565 Looking at this change, I think it's rather unexpected that this particular change would break the kernel on a specific CPU target. Are you sure that this is the right bad commit? If you found the right commit, then I assume there is something wrong with the syscall handling on UltraSPARC T1. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz@debian.org `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-22 21:48 ` John Paul Adrian Glaubitz @ 2021-03-22 21:55 ` Frank Scheiner 2021-03-23 16:50 ` Jan Engelhardt 0 siblings, 1 reply; 24+ messages in thread From: Frank Scheiner @ 2021-03-22 21:55 UTC (permalink / raw) To: John Paul Adrian Glaubitz, Sparc kernel list Cc: debian-sparc, Christoph Hellwig Hi, On 22.03.21 22:48, John Paul Adrian Glaubitz wrote: > On 3/22/21 10:30 PM, Frank Scheiner wrote: >> Riccardo Mottola first recognized a problem with 5.10.x kernels on his >> Sun T2000 with UltraSPARC T1 (details in [this thread]). I could verify >> the problem also on my Sun T1000 and it looks like this specific issue >> breaks the mounting of the root FS or maybe mounting file systems at >> all. This affects both booting from disk and from network. >> (...) >> ...as first bad commit. >> >> ``` >> commit 028abd9222df0cf5855dab5014a5ebaf06f90565 >> Author: Christoph Hellwig <hch@lst.de> >> Date: Thu Sep 17 10:22:34 2020 +0200 >> >> fs: remove compat_sys_mount >> >> compat_sys_mount is identical to the regular sys_mount now, so >> remove it >> and use the native version everywhere. >> ``` >> >> [1]: >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=028abd9222df0cf5855dab5014a5ebaf06f90565 > > Looking at this change, I think it's rather unexpected that this particular > change would break the kernel on a specific CPU target. Are you sure that > this is the right bad commit? Well, I strictly followed the `git bisect` process and tested each and every proposed revision. It's indeed strange that this only affects UltraSPARC T1s, but the changes match the behavior: mounting of (root) FS is broken. > If you found the right commit, then I assume there is something wrong with > the syscall handling on UltraSPARC T1. Could be, all in all the T1 is a first of its kind. Cheers, Frank ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-22 21:55 ` Frank Scheiner
@ 2021-03-23 16:50 ` Jan Engelhardt
  2021-03-23 16:57 ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Engelhardt @ 2021-03-23 16:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc, Frank Scheiner

On Monday 2021-03-22 22:55, Frank Scheiner wrote:

>>> Riccardo Mottola first recognized a problem with 5.10.x kernels on his
>>> Sun T2000 with UltraSPARC T1 (details in [this thread]). I could verify
>>> the problem also on my Sun T1000 and it looks like this specific issue
>>> breaks the mounting of the root FS or maybe mounting file systems at
>>> all. This affects both booting from disk and from network.
>>> (...)
>>> ...as first bad commit.
>>>
>>> ```
>>> commit 028abd9222df0cf5855dab5014a5ebaf06f90565
>>> Author: Christoph Hellwig <hch@lst.de>
>>> fs: remove compat_sys_mount

Some participants in the discussion over at the debian-sparc list mentioned
"NFS" and "Invalid argument", which is something I know just too well from
iptables. NFS is a filesystem that uses an extra data blob (5th argument to the
mount syscall). Such blobs have historically not always been designed to bear
the same layout between ILP32 and LP64 modes, and nfs's structs fell prey to
this as well.

My hypothesis now is that fs/nfs/fs_context.c line 1160:

	if (in_compat_syscall())
		nfs4_compat_mount_data_conv(data);

and ones similar to it (I didn't look too closely at where nfs3 gets to do its
conversion), no longer trigger as a result of compat_sys_mount being
wiped from the syscall table:

	+++ arch/sparc/kernel/syscalls/syscall.tbl
	@@ -201,7 +201,7 @@
	 164	64	utrap_install		sys_utrap_install
	 165	common	quotactl		sys_quotactl
	 166	common	set_tid_address		sys_set_tid_address
	-167	common	mount			sys_mount	compat_sys_mount
	+167	common	mount			sys_mount

I didn't extract from the debian-sparc discussion whether people were
running the all-LP64 userspace, or had some older Debian with an
ILP32-on-64-bit-kernel setup.

[But that's just a theory - a kernel theory!]
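Editorial note: the layout problem this message alludes to can be illustrated with a toy structure. This is a sketch only. The two-field `demo_mount_data` blob is hypothetical and is not the real NFS mount data, and the thread leaves open whether this mechanism is involved in the regression at all. The formats are spelled out explicitly, so the result does not depend on the machine running the example.

```python
import struct

# Hypothetical mount-data blob, NOT the real NFS layout:
#     struct demo_mount_data { int version; char *hostname; };
# Modelled with explicit big-endian formats (SPARC is big-endian). Standard-size
# mode inserts no padding, so the LP64 alignment gap is written out as "4x".
ILP32 = struct.Struct(">iI")    # 4-byte int, 4-byte pointer
LP64 = struct.Struct(">i4xQ")   # 4-byte int, 4 pad bytes, 8-byte pointer


def convert_ilp32_to_lp64(blob: bytes) -> bytes:
    """Re-pack a 32-bit blob into the 64-bit layout (a compat-style conversion)."""
    version, hostname_ptr = ILP32.unpack(blob)
    return LP64.pack(version, hostname_ptr)


blob32 = ILP32.pack(4, 0x0001F000)

# Reading the 32-bit blob with the 64-bit layout, without conversion: the
# pointer bytes fall into the padding and the pointer field reads as zero.
misread = LP64.unpack(blob32.ljust(LP64.size, b"\0"))

print(ILP32.size, LP64.size)
print(misread)
print(LP64.unpack(convert_ilp32_to_lp64(blob32)))
```

The point of a compat conversion step is the last line: the same field values, re-packed into the layout the 64-bit kernel code expects.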
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-23 16:50 ` Jan Engelhardt
@ 2021-03-23 16:57 ` Christoph Hellwig
  2021-03-23 17:39 ` Frank Scheiner
  2021-03-23 22:17 ` Frank Scheiner
  0 siblings, 2 replies; 24+ messages in thread
From: Christoph Hellwig @ 2021-03-23 16:57 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Christoph Hellwig, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc, Frank Scheiner

On Tue, Mar 23, 2021 at 05:50:59PM +0100, Jan Engelhardt wrote:
> Some participants in the discussion over at the debian-sparc list mentioned
> "NFS" and "Invalid argument", which is something I know just too well from
> iptables. NFS is a filesystem that uses an extra data blob (5th argument to the
> mount syscall). Such blobs have historically not always been designed to bear
> the same layout between ILP32 and LP64 modes, and nfs's structs fell prey to
> this as well.
>
> My hypothesis now is that fs/nfs/fs_context.c line 1160:
>
> 	if (in_compat_syscall())
> 		nfs4_compat_mount_data_conv(data);
>
> and ones similar to it (I didn't look too close where nfs3 gets to do its
> conversion), no longer trigger as a result of compat_sys_mount being
> wiped from the syscall table:

No, if in_compat_syscall() doesn't trigger properly the kernel
would not get this far.

That being said, the NFS compat code was moved out of the compat mount
handler and into nfs and refactored in the commit just before this one.

Frank, can you double check that commit
67e306c6906137020267eb9bbdbc127034da3627 really still works, and
only 028abd9222df0cf5855dab5014a5ebaf06f90565 broke your setup?
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-23 16:57 ` Christoph Hellwig @ 2021-03-23 17:39 ` Frank Scheiner 2021-03-23 22:17 ` Frank Scheiner 1 sibling, 0 replies; 24+ messages in thread From: Frank Scheiner @ 2021-03-23 17:39 UTC (permalink / raw) To: Christoph Hellwig, Jan Engelhardt Cc: John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc On 23.03.21 17:57, Christoph Hellwig wrote: > On Tue, Mar 23, 2021 at 05:50:59PM +0100, Jan Engelhardt wrote: >> Some participants in the discussion over at the debian-sparc list mentioned >> "NFS" and "Invalid argument", which is something I know just too well from >> iptables. NFS is a filesystem that uses an extra data blob (5th argument to the >> mount syscall). Such blobs have historically not always been designed to bear >> the same layout between ILP32 and LP64 modes, and nfs's structs fell prey to >> this as well. >> >> My hypothesis now is that fs/nfs/fs_context.c line 1160: >> >> if (in_compat_syscall()) >> nfs4_compat_mount_data_conv(data); >> >> and ones similar to it (I didn't look too close where nfs3 gets to do its >> conversion), no longer trigger as a result of compat_sys_mount being >> wiped from the syscall table: > > No, if in_compat_syscall() syscall doesn't trigger properly the kernel > would not get this far. > > That being said, the NFS compat code was moved out of the compat mount > handler and into nfs and refactored in the commit just before this one. > > Frank, can you double check that commit > 67e306c6906137020267eb9bbdbc127034da3627 really still works, and > only 028abd9222df0cf5855dab5014a5ebaf06f90565 broke your setup? Indeed, I also expected 67e306c6906137020267eb9bbdbc127034da3627 to fail because of its commit message, but from my log it did work correctly. As the T1000 is at home and I don't have another T1 based system in my storage location where I am now, I'll double check that in the evening and report back. 
Strangely for a V245 (with UltraSPARC IIIi) both commits seem to work according to my testing, but 5.10.x (from Debian) doesn't work and 5.9.15 (also from Debian) does work - tested now both for boot from network and boot from disk. Possibly unrelated to the problem with the T1000, the V245 emits the following for boot from disk with 5.10.x: ``` [...] Loading Linux 5.10.0-5-sparc64-smp ... Loading initial ramdisk ... [ 2.602821] rtc_cmos rtc_cmos: IRQ index 0 not found /dev/sda2: clean, 33516/8454144 files, 1105784/33798750 blocks [ 13.542728] autofs4:pid:1:autofs_fill_super: called with bogus options [ 13.628931] systemd[1]: proc-sys-fs-binfmt_misc.automount: Failed to initialize automounter: Invalid argument [ 13.759917] systemd[1]: Failed to set up automount Arbitrary Executable File Formats File System Automount Point. [FAILED] Failed to set up automount File System Automount Point. [ 14.456396] Unable to handle kernel paging request in mna handler [ 14.456400] at virtual address da65f2fed110e482 [ 14.597474] current->{active_,}mm->context = 00000000000000ce [ 14.597478] current->{active_,}mm->pgd = fff0000006d5c000 [ 14.752380] Unable to handle kernel paging request in mna handler [ 14.752383] at virtual address da65f2fed110e482 [ 14.893509] current->{active_,}mm->context = 0000000000000094 [ 14.969141] current->{active_,}mm->pgd = fff00011010e0000 [ 15.040554] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [ 15.141430] Press Stop-A (L1-A) from sun keyboard or send break [ 15.141430] twice on console to return to the boot prom [ 15.141459] kernel BUG at kernel/cpu.c:960 ``` Cheers, Frank ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-23 16:57 ` Christoph Hellwig 2021-03-23 17:39 ` Frank Scheiner @ 2021-03-23 22:17 ` Frank Scheiner 2021-03-24 8:28 ` Christoph Hellwig 1 sibling, 1 reply; 24+ messages in thread From: Frank Scheiner @ 2021-03-23 22:17 UTC (permalink / raw) To: Christoph Hellwig, Jan Engelhardt Cc: John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc On 23.03.21 17:57, Christoph Hellwig wrote:> Frank, can you double check that commit > 67e306c6906137020267eb9bbdbc127034da3627 really still works, and > only 028abd9222df0cf5855dab5014a5ebaf06f90565 broke your setup? So I manually checked out both 67e306c6906137020267eb9bbdbc127034da3627 and 028abd9222df0cf5855dab5014a5ebaf06f90565 and recompiled both (doing `make [...] mrproper` before each run). The results didn't change from the ones from the bisecting process: 67e306c6906137020267eb9bbdbc127034da3627 ...is working and: 028abd9222df0cf5855dab5014a5ebaf06f90565 ...is broken on my T1000. As I don't know how big attachments can be on this list, I put the logs on pastebin. A log for 028abd9222df is here: https://pastebin.com/ApPYsMcu A log for 67e306c69061 is here: https://pastebin.com/uGLXX7RS Cheers, Frank ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-23 22:17 ` Frank Scheiner
@ 2021-03-24 8:28 ` Christoph Hellwig
  2021-03-24 12:30 ` Frank Scheiner
  ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Christoph Hellwig @ 2021-03-24 8:28 UTC (permalink / raw)
  To: Frank Scheiner
  Cc: Christoph Hellwig, Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc

On Tue, Mar 23, 2021 at 11:17:41PM +0100, Frank Scheiner wrote:
> 028abd9222df0cf5855dab5014a5ebaf06f90565
>
> ...is broken on my T1000.
>
> As I don't know how big attachments can be on this list, I put the logs
> on pastebin.
>
> A log for 028abd9222df is here:
>
> https://pastebin.com/ApPYsMcu

Just to confirm: in this tree line 304 in mm/slub.c is this BUG_ON:

	BUG_ON(object == fp); /* naive detection of double free or corruption */

which would mean we have a double free. In that case it would be
interesting which call to kfree this is, which could be done by
calling gdb on vmlinux and then typing:

	l *(sys_mount+0x114/0x1e0)

Not that a double free caused by this conversion makes any sense to me..
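Editorial note: the `BUG_ON(object == fp)` check quoted in this message compares the object being freed with the current head of the allocator's freelist. The sketch below is a loose toy model of that idea, assuming a simple LIFO freelist. `TinyFreelist` is a hypothetical name and this is not the actual mm/slub.c logic. It also shows why the kernel comment calls the detection "naive".

```python
class TinyFreelist:
    """Toy LIFO freelist; a loose model, not the real mm/slub.c code."""

    def __init__(self) -> None:
        self.head = None   # most recently freed object ("fp" in the quoted check)
        self.next = {}     # freed object -> freelist head at the time of the free

    def free(self, obj: int) -> None:
        # Same idea as the quoted BUG_ON(object == fp): an object that is
        # already at the head of the freelist is being freed a second time.
        if obj == self.head:
            raise RuntimeError(f"double free or corruption: {obj:#x}")
        self.next[obj] = self.head
        self.head = obj


fl = TinyFreelist()
fl.free(0x1000)
try:
    fl.free(0x1000)        # immediate second free: caught
except RuntimeError as err:
    print(err)

fl2 = TinyFreelist()
fl2.free(0x1000)
fl2.free(0x2000)
fl2.free(0x1000)           # not caught: 0x1000 is no longer the head
print(hex(fl2.head))
```

In this model only a back-to-back double free trips the check; a second free after some other object has been freed goes unnoticed, and the check can equally fire when the freelist pointer itself has been corrupted.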
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 8:28 ` Christoph Hellwig @ 2021-03-24 12:30 ` Frank Scheiner 2021-03-24 12:42 ` Anatoly Pugachev 2021-03-24 12:49 ` John Paul Adrian Glaubitz 2021-03-24 13:09 ` Frank Scheiner 2021-03-24 13:57 ` Frank Scheiner 2 siblings, 2 replies; 24+ messages in thread From: Frank Scheiner @ 2021-03-24 12:30 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc On 24.03.21 09:28, Christoph Hellwig wrote: > On Tue, Mar 23, 2021 at 11:17:41PM +0100, Frank Scheiner wrote: >> 028abd9222df0cf5855dab5014a5ebaf06f90565 >> >> ...is broken on my T1000. >> >> As I don't know how big attachments can be on this list, I put the logs >> on pastebin. >> >> A log for 028abd9222df is here: >> >> https://pastebin.com/ApPYsMcu > > Just do confirm: in this tree line 304 in mm/slub.c is this BUG_ON: > > BUG_ON(object == fp); /* naive detection of double free or corruption */ > > which would mean we have a double free. In that case it would be > interesting which call to kfree this is, which could be done by > calling gdb on vmlinux and then typing; > > l *(sys_mount+0x114/0x1e0) > > Not that a double free caused by this conversion makes any sense to me.. Sorry, but I can't install `gdb` on my T1000 ATM, because it depends on "libpython3.8" for sparc64 (see [1]) and "libpython3.9" for the other architectures, but "libpython3.8" is actually not available for sparc64, "libpython3.9" is available for sparc64 though: ``` root@t1000:~# apt install gdb Reading package lists... Done Building dependency tree... Done Reading state information... Done Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. 
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 gdb : Depends: libpython3.8 (>= 3.8.2) but it is not installable
       Recommends: libc-dbg
E: Unable to correct problems, you have held broken packages.
```

[1]: https://packages.debian.org/sid/gdb

Something wrong with the dependencies. Any suggestions?

Cheers,
Frank
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 12:30 ` Frank Scheiner @ 2021-03-24 12:42 ` Anatoly Pugachev 2021-03-24 12:48 ` Frank Scheiner 2021-03-24 12:49 ` John Paul Adrian Glaubitz 1 sibling, 1 reply; 24+ messages in thread From: Anatoly Pugachev @ 2021-03-24 12:42 UTC (permalink / raw) To: Frank Scheiner Cc: Christoph Hellwig, Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc On Wed, Mar 24, 2021 at 3:31 PM Frank Scheiner <frank.scheiner@web.de> wrote: > Sorry, but I can't install `gdb` on my T1000 ATM, because it depends on > "libpython3.8" for sparc64 (see [1]) and "libpython3.9" for the other > architectures, but "libpython3.8" is actually not available for sparc64, > "libpython3.9" is available for sparc64 though: > ... > The following packages have unmet dependencies: > gdb : Depends: libpython3.8 (>= 3.8.2) but it is not installable > Recommends: libc-dbg > E: Unable to correct problems, you have held broken packages. > ``` > Something wrong with the dependencies. Any suggestions? Frank, you could use http://snapshot.debian.org to install old versions of packages, i.e. gdb and libpython-3.8 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 12:42 ` Anatoly Pugachev @ 2021-03-24 12:48 ` Frank Scheiner 0 siblings, 0 replies; 24+ messages in thread From: Frank Scheiner @ 2021-03-24 12:48 UTC (permalink / raw) To: Anatoly Pugachev Cc: Christoph Hellwig, Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc On 24.03.21 13:42, Anatoly Pugachev wrote: > On Wed, Mar 24, 2021 at 3:31 PM Frank Scheiner <frank.scheiner@web.de> wrote: >> Sorry, but I can't install `gdb` on my T1000 ATM, because it depends on >> "libpython3.8" for sparc64 (see [1]) and "libpython3.9" for the other >> architectures, but "libpython3.8" is actually not available for sparc64, >> "libpython3.9" is available for sparc64 though: >> ... >> The following packages have unmet dependencies: >> gdb : Depends: libpython3.8 (>= 3.8.2) but it is not installable >> Recommends: libc-dbg >> E: Unable to correct problems, you have held broken packages. >> ``` >> Something wrong with the dependencies. Any suggestions? > > Frank, > > you could use http://snapshot.debian.org to install old versions of > packages, i.e. gdb and libpython-3.8 Of course, didn't think about that. Will try that and report my findings. Thanks and cheers, Frank ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 12:30 ` Frank Scheiner 2021-03-24 12:42 ` Anatoly Pugachev @ 2021-03-24 12:49 ` John Paul Adrian Glaubitz 1 sibling, 0 replies; 24+ messages in thread From: John Paul Adrian Glaubitz @ 2021-03-24 12:49 UTC (permalink / raw) To: Frank Scheiner, Christoph Hellwig Cc: Jan Engelhardt, Sparc kernel list, debian-sparc Hello Frank! On 3/24/21 1:30 PM, Frank Scheiner wrote: > Sorry, but I can't install `gdb` on my T1000 ATM, because it depends on > "libpython3.8" for sparc64 (see [1]) and "libpython3.9" for the other > architectures, but "libpython3.8" is actually not available for sparc64, > "libpython3.9" is available for sparc64 though: The reason for this is a bug in gdb [1] and the fact that we don't have cruft in Debian Ports [2]. If someone knows how to disable individual tests in the GDB testsuite, we could just disable the problematic test in src:gdb. Adrian > [1] https://sourceware.org/bugzilla/show_bug.cgi?id=26170 > [2] https://lists.debian.org/debian-sparc/2017/12/msg00060.html -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz@debian.org `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 8:28 ` Christoph Hellwig 2021-03-24 12:30 ` Frank Scheiner @ 2021-03-24 13:09 ` Frank Scheiner 2021-03-24 13:16 ` John Paul Adrian Glaubitz 2021-03-24 13:57 ` Frank Scheiner 2 siblings, 1 reply; 24+ messages in thread From: Frank Scheiner @ 2021-03-24 13:09 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc On 24.03.21 09:28, Christoph Hellwig wrote: > On Tue, Mar 23, 2021 at 11:17:41PM +0100, Frank Scheiner wrote: >> 028abd9222df0cf5855dab5014a5ebaf06f90565 >> >> ...is broken on my T1000. >> >> As I don't know how big attachments can be on this list, I put the logs >> on pastebin. >> >> A log for 028abd9222df is here: >> >> https://pastebin.com/ApPYsMcu > > Just do confirm: in this tree line 304 in mm/slub.c is this BUG_ON: > > BUG_ON(object == fp); /* naive detection of double free or corruption */ > > which would mean we have a double free. In that case it would be > interesting which call to kfree this is, which could be done by > calling gdb on vmlinux and then typing; > > l *(sys_mount+0x114/0x1e0) > > Not that a double free caused by this conversion makes any sense to me.. This is what I get: ``` root@t1000:~/kernels-in-question# gdb vmlinux-028abd9222df-new GNU gdb (Debian 9.2-1+b1) 9.2 Copyright (C) 2020 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "sparc64-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". 
Type "apropos word" to search for commands related to "word"... Reading symbols from vmlinux-028abd9222df-new... (gdb) l *(sys_mount+0x114/0x1e0) 0x6c6380 is in __se_sys_mount (fs/namespace.c:3390). 3385 fs/namespace.c: No such file or directory. (gdb) ``` Kernel sources are not available on the T1000. If need be, where do they need to exist and how should the directory be named - `/usr/src/[...]`? Cheers, Frank ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 13:09 ` Frank Scheiner @ 2021-03-24 13:16 ` John Paul Adrian Glaubitz 2021-03-24 13:19 ` Frank Scheiner 0 siblings, 1 reply; 24+ messages in thread From: John Paul Adrian Glaubitz @ 2021-03-24 13:16 UTC (permalink / raw) To: Frank Scheiner, Christoph Hellwig Cc: Jan Engelhardt, Sparc kernel list, debian-sparc On 3/24/21 2:09 PM, Frank Scheiner wrote:> Kernel sources are not available on the T1000. > > If need be, where do they need to exist and how should the directory be > named - `/usr/src/[...]`? Try installing "linux-source" and the "-dbg" package for your Debian kernel. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz@debian.org `. `' Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 13:16 ` John Paul Adrian Glaubitz @ 2021-03-24 13:19 ` Frank Scheiner 2021-03-24 13:24 ` Anatoly Pugachev 0 siblings, 1 reply; 24+ messages in thread From: Frank Scheiner @ 2021-03-24 13:19 UTC (permalink / raw) To: John Paul Adrian Glaubitz, Christoph Hellwig Cc: Jan Engelhardt, Sparc kernel list, debian-sparc On 24.03.21 14:16, John Paul Adrian Glaubitz wrote: > On 3/24/21 2:09 PM, Frank Scheiner wrote:> Kernel sources are not available on the T1000. >> >> If need be, where do they need to exist and how should the directory be >> named - `/usr/src/[...]`? > > Try installing "linux-source" and the "-dbg" package for your Debian kernel. But don't I need the source for the kernel at 028abd92? I figured, I need the sources in `/usr/src/linux-source-5.9.0-rc1+` because "5.9.0-rc1+" is the version the corresponding modules are installed - could that be correct? Cheers, Frank ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 13:19 ` Frank Scheiner @ 2021-03-24 13:24 ` Anatoly Pugachev 2021-03-24 13:29 ` Frank Scheiner 0 siblings, 1 reply; 24+ messages in thread From: Anatoly Pugachev @ 2021-03-24 13:24 UTC (permalink / raw) To: Frank Scheiner Cc: John Paul Adrian Glaubitz, Christoph Hellwig, Jan Engelhardt, Sparc kernel list, debian-sparc On Wed, Mar 24, 2021 at 4:19 PM Frank Scheiner <frank.scheiner@web.de> wrote: > On 24.03.21 14:16, John Paul Adrian Glaubitz wrote: > > On 3/24/21 2:09 PM, Frank Scheiner wrote:> Kernel sources are not available on the T1000. > >> > >> If need be, where do they need to exist and how should the directory be > >> named - `/usr/src/[...]`? > > > > Try installing "linux-source" and the "-dbg" package for your Debian kernel. > > But don't I need the source for the kernel at 028abd92? I figured, I > need the sources in `/usr/src/linux-source-5.9.0-rc1+` because > "5.9.0-rc1+" is the version the corresponding modules are installed - > could that be correct? Frank, i'm using gdb from kernel sources directory (from which kernel is installed), like: $ uname -a Linux ttip 5.12.0-rc4 #203 SMP Wed Mar 24 15:50:29 MSK 2021 sparc64 GNU/Linux $ cd linux-2.6 linux-2.6$ git describe v5.12-rc4 linux-2.6$ gdb -q vmlinux Reading symbols from vmlinux... (gdb) l *(sys_mount+0x114/0x1e0) 0x6dd7c0 is in __se_sys_mount (fs/namespace.c:3431). 3426 /* ... and return the root of (sub)tree on it */ 3427 return path.dentry; 3428 } 3429 EXPORT_SYMBOL(mount_subtree); 3430 3431 SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name, 3432 char __user *, type, unsigned long, flags, void __user *, data) 3433 { 3434 int ret; 3435 char *kernel_type; (gdb) ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 13:24 ` Anatoly Pugachev @ 2021-03-24 13:29 ` Frank Scheiner 0 siblings, 0 replies; 24+ messages in thread From: Frank Scheiner @ 2021-03-24 13:29 UTC (permalink / raw) To: Anatoly Pugachev Cc: John Paul Adrian Glaubitz, Christoph Hellwig, Jan Engelhardt, Sparc kernel list, debian-sparc On 24.03.21 14:24, Anatoly Pugachev wrote: > On Wed, Mar 24, 2021 at 4:19 PM Frank Scheiner <frank.scheiner@web.de> wrote: >> On 24.03.21 14:16, John Paul Adrian Glaubitz wrote: >>> On 3/24/21 2:09 PM, Frank Scheiner wrote:> Kernel sources are not available on the T1000. >>>> >>>> If need be, where do they need to exist and how should the directory be >>>> named - `/usr/src/[...]`? >>> >>> Try installing "linux-source" and the "-dbg" package for your Debian kernel. >> >> But don't I need the source for the kernel at 028abd92? I figured, I >> need the sources in `/usr/src/linux-source-5.9.0-rc1+` because >> "5.9.0-rc1+" is the version the corresponding modules are installed - >> could that be correct? > > Frank, > > i'm using gdb from kernel sources directory (from which kernel is > installed), like: > > $ uname -a > Linux ttip 5.12.0-rc4 #203 SMP Wed Mar 24 15:50:29 MSK 2021 sparc64 GNU/Linux > $ cd linux-2.6 > linux-2.6$ git describe > v5.12-rc4 > linux-2.6$ gdb -q vmlinux > Reading symbols from vmlinux... > (gdb) l *(sys_mount+0x114/0x1e0) > 0x6dd7c0 is in __se_sys_mount (fs/namespace.c:3431). > 3426 /* ... and return the root of (sub)tree on it */ > 3427 return path.dentry; > 3428 } > 3429 EXPORT_SYMBOL(mount_subtree); > 3430 > 3431 SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name, > 3432 char __user *, type, unsigned long, flags, > void __user *, data) > 3433 { > 3434 int ret; > 3435 char *kernel_type; > (gdb) > Ok, will try that approach. I'm currently `tar`ing the kernel sources @028abd92 on the cross-compiling host and will move them over to the T1000. 
Cheers,
Frank
* Re: Regression in 028abd92 for Sun UltraSPARC T1 2021-03-24 8:28 ` Christoph Hellwig 2021-03-24 12:30 ` Frank Scheiner 2021-03-24 13:09 ` Frank Scheiner @ 2021-03-24 13:57 ` Frank Scheiner 2021-03-24 15:22 ` Jan Engelhardt 2 siblings, 1 reply; 24+ messages in thread From: Frank Scheiner @ 2021-03-24 13:57 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc On 24.03.21 09:28, Christoph Hellwig wrote: > On Tue, Mar 23, 2021 at 11:17:41PM +0100, Frank Scheiner wrote: >> 028abd9222df0cf5855dab5014a5ebaf06f90565 >> >> ...is broken on my T1000. >> >> As I don't know how big attachments can be on this list, I put the logs >> on pastebin. >> >> A log for 028abd9222df is here: >> >> https://pastebin.com/ApPYsMcu > > Just do confirm: in this tree line 304 in mm/slub.c is this BUG_ON: > > BUG_ON(object == fp); /* naive detection of double free or corruption */ > > which would mean we have a double free. In that case it would be > interesting which call to kfree this is, which could be done by > calling gdb on vmlinux and then typing; > > l *(sys_mount+0x114/0x1e0) > > Not that a double free caused by this conversion makes any sense to me.. > Finally - a T1 thread is so slow (for untaring) that I untared the tarball from my X4270 cross-compile host to the T1000's root FS in the end: ``` root@t1000:~/mnt/torvalds-linux# git describe v5.9-rc1-3-g028abd9222df root@t1000:~/mnt/torvalds-linux# gdb -q vmlinux Reading symbols from vmlinux... (gdb) l *(sys_mount+0x114/0x1e0) 0x6c6380 is in __se_sys_mount (fs/namespace.c:3390). 3385 /* ... 
and return the root of (sub)tree on it */ 3386 return path.dentry; 3387 } 3388 EXPORT_SYMBOL(mount_subtree); 3389 3390 SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name, 3391 char __user *, type, unsigned long, flags, void __user *, data) 3392 { 3393 int ret; 3394 char *kernel_type; (gdb) ``` ...not sure if that adds anything to what Anatoly already provided apart from the "correct" line numbers for the actually used kernel. Cheers, Frank ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-24 13:57 ` Frank Scheiner
@ 2021-03-24 15:22 ` Jan Engelhardt
  2021-03-24 15:58 ` Frank Scheiner
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Engelhardt @ 2021-03-24 15:22 UTC (permalink / raw)
  To: Frank Scheiner
  Cc: Christoph Hellwig, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc

On Wednesday 2021-03-24 14:57, Frank Scheiner wrote:

> (gdb) l *(sys_mount+0x114/0x1e0)
> 0x6c6380 is in __se_sys_mount (fs/namespace.c:3390).

/0x1e0 does not normally belong there. Just

    l *(sys_mount+0x114)
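In kernel traces an entry such as `sys_mount+0x114/0x1e0` reads as symbol, offset into the function, and total function size. Only the symbol and the offset belong in the gdb expression. A minimal sketch of that conversion follows; the helper name `gdb_list_command` and the regular expression are illustrative assumptions, not part of any kernel or gdb tooling.

```python
import re

# Matches a trace entry of the form "symbol+0xOFFSET/0xSIZE".
TRACE_ENTRY = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)"
    r"/(?P<size>0x[0-9a-fA-F]+)"
)


def gdb_list_command(entry: str) -> str:
    """Turn 'symbol+0xOFF/0xSIZE' into a gdb 'l *(symbol+0xOFF)' command.

    The '/0xSIZE' part is the size of the function and is dropped.
    """
    match = TRACE_ENTRY.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"not a symbol+offset/size entry: {entry!r}")
    offset = int(match["offset"], 16)
    return f"l *({match['symbol']}+{offset:#x})"


if __name__ == "__main__":
    print(gdb_list_command("sys_mount+0x114/0x1e0"))  # l *(sys_mount+0x114)
```

The same helper can be applied to the other frames of the trace, for example `sys_mount+0xd4/0x1e0`.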
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-24 15:22 ` Jan Engelhardt
@ 2021-03-24 15:58 ` Frank Scheiner
  2021-03-24 16:10 ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Frank Scheiner @ 2021-03-24 15:58 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Christoph Hellwig, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc

On 24.03.21 16:22, Jan Engelhardt wrote:
>
> On Wednesday 2021-03-24 14:57, Frank Scheiner wrote:
>
>> (gdb) l *(sys_mount+0x114/0x1e0)
>> 0x6c6380 is in __se_sys_mount (fs/namespace.c:3390).
>
> /0x1e0 does not normally belong there. Just
>
>     l *(sys_mount+0x114)
>

I guess this comes from my log on [1]:

```
[...]
[ 20.089289] RPC: <kfree+0x3ac/0x420>
[ 20.089415] l0: ffff8001f8885cc8 l1: ffff8001f8881380 l2: ffff8001ec434558 l3: 0000000000201db0
[ 20.089586] l4: 000000000000029c l5: ffff80010000c1a0 l6: ffff8001ec79c000 l7: 00000000006c6380
[ 20.089802] i0: 0000000000001000 i1: ffff8001ec436000 i2: 00000000006c6494 i3: ffff8001ec436000
[ 20.089877] i4: ffff800008405340 i5: 00006000045396c0 i6: ffff8001ec79f561 i7: 00000000006c6494
[ 20.090051] I7: <sys_mount+0x114/0x1e0>
[ 20.090186] Call Trace:
[ 20.090279] [<00000000006c6494>] sys_mount+0x114/0x1e0
[ 20.090338] [<00000000006c6454>] sys_mount+0xd4/0x1e0
[ 20.090499] [<0000000000406274>] linux_sparc_syscall+0x34/0x44
[ 20.090697] Disabling lock debugging due to kernel taint
[ 20.090770] Caller[00000000006c6494]: sys_mount+0x114/0x1e0
[ 20.090926] Caller[00000000006c6454]: sys_mount+0xd4/0x1e0
[ 20.091133] Caller[0000000000406274]: linux_sparc_syscall+0x34/0x44
[ 20.091196] Caller[0000000000100aa8]: 0x100aa8
[...]
```

[1]: https://pastebin.com/ApPYsMcu

Here is the result for the suggested command:

```
root@t1000:~/mnt/torvalds-linux# gdb -q vmlinux
Reading symbols from vmlinux...
(gdb) l *(sys_mount+0x114)
0x6c6494 is in __se_sys_mount (fs/namespace.c:3415).
3410            if (IS_ERR(options))
3411                    goto out_data;
3412
3413            ret = do_mount(kernel_dev, dir_name, kernel_type, flags, options);
3414
3415            kfree(options);
3416    out_data:
3417            kfree(kernel_dev);
3418    out_dev:
3419            kfree(kernel_type);
(gdb)
```

Cheers,
Frank
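The two gdb results in this thread can be cross-checked with plain address arithmetic. The sketch below uses only the values from the log and the gdb sessions above; the assumption (consistent with the output, but not confirmed in the thread) is that gdb evaluated `0x114/0x1e0` as an integer division, which yields 0 and therefore lists the start of the function at line 3390.

```python
# Values taken from the register dump and call trace quoted above.
RETURN_ADDRESS = 0x6C6494  # i7 / first call trace entry
OFFSET = 0x114             # offset into sys_mount
SIZE = 0x1E0               # size of sys_mount


def function_base(address: int, offset: int) -> int:
    """Start address of the function that contains `address`."""
    return address - offset


base = function_base(RETURN_ADDRESS, OFFSET)

# Address gdb resolves for the expression without the "/0x1e0" suffix.
correct_target = base + OFFSET

# Address resolved when "/0x1e0" is kept, assuming integer division.
mistaken_target = base + OFFSET // SIZE

print(hex(base))             # 0x6c6380
print(hex(correct_target))   # 0x6c6494
print(hex(mistaken_target))  # 0x6c6380
```

Under that assumption the first listing (0x6c6380, line 3390) is simply the entry point of `sys_mount`, while the second (0x6c6494, line 3415) is the actual call site.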
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-24 15:58 ` Frank Scheiner
@ 2021-03-24 16:10 ` Christoph Hellwig
  2021-03-24 16:33 ` Frank Scheiner
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2021-03-24 16:10 UTC (permalink / raw)
  To: Frank Scheiner
  Cc: Jan Engelhardt, Christoph Hellwig, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc

On Wed, Mar 24, 2021 at 04:58:39PM +0100, Frank Scheiner wrote:
> [ 20.090279] [<00000000006c6494>] sys_mount+0x114/0x1e0
> [ 20.090338] [<00000000006c6454>] sys_mount+0xd4/0x1e0
> [ 20.090499] [<0000000000406274>] linux_sparc_syscall+0x34/0x44
> [ 20.090697] Disabling lock debugging due to kernel taint
> [ 20.090770] Caller[00000000006c6494]: sys_mount+0x114/0x1e0
> [ 20.090926] Caller[00000000006c6454]: sys_mount+0xd4/0x1e0
> [ 20.091133] Caller[0000000000406274]: linux_sparc_syscall+0x34/0x44
> [ 20.091196] Caller[0000000000100aa8]: 0x100aa8
> [...]
> ```
>
> [1]: https://pastebin.com/ApPYsMcu
>
> Here is the result for the suggested command:

Thanks. And very strange, as I can't find what would free options
before. Does the system boot if you comment out that kfree in line
3415 (even if that causes a memleak elsewhere)?
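The `BUG_ON(object == fp)` check quoted earlier in the thread compares the object being freed with the pointer that is about to be stored as its "next free" link, which is the current head of the freelist. The following is a toy model of that idea only, not the kernel's allocator; the class `ToyFreelist` and its methods are hypothetical names used for illustration.

```python
from typing import Dict, Optional


class DoubleFreeError(Exception):
    """Raised when the naive check spots an immediate double free."""


class ToyFreelist:
    """Singly linked freelist where each free object stores the next one."""

    def __init__(self) -> None:
        self.head: Optional[int] = None              # most recently freed object
        self.next_free: Dict[int, Optional[int]] = {}  # per-object free pointer

    def free(self, obj: int) -> None:
        # Naive detection: the object being freed is already the list head,
        # so it would end up pointing at itself.
        if obj == self.head:
            raise DoubleFreeError(f"object {obj:#x} freed twice in a row")
        self.next_free[obj] = self.head
        self.head = obj


if __name__ == "__main__":
    fl = ToyFreelist()
    fl.free(0x1000)
    try:
        fl.free(0x1000)           # back-to-back double free is caught
    except DoubleFreeError as err:
        print("detected:", err)

    fl = ToyFreelist()
    fl.free(0x1000)
    fl.free(0x2000)
    fl.free(0x1000)               # not caught: another free happened in between
    print("undetected double free, head =", hex(fl.head))
```

The model shows why the check is called naive: it only fires when the same object is freed twice with no other free in between, and it fires equally when the freelist head has been corrupted to match the object.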
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-24 16:10 ` Christoph Hellwig
@ 2021-03-24 16:33 ` Frank Scheiner
  2021-03-24 16:37 ` Frank Scheiner
  0 siblings, 1 reply; 24+ messages in thread
From: Frank Scheiner @ 2021-03-24 16:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc

On 24.03.21 17:10, Christoph Hellwig wrote:
> On Wed, Mar 24, 2021 at 04:58:39PM +0100, Frank Scheiner wrote:
>> [ 20.090279] [<00000000006c6494>] sys_mount+0x114/0x1e0
>> [ 20.090338] [<00000000006c6454>] sys_mount+0xd4/0x1e0
>> [ 20.090499] [<0000000000406274>] linux_sparc_syscall+0x34/0x44
>> [ 20.090697] Disabling lock debugging due to kernel taint
>> [ 20.090770] Caller[00000000006c6494]: sys_mount+0x114/0x1e0
>> [ 20.090926] Caller[00000000006c6454]: sys_mount+0xd4/0x1e0
>> [ 20.091133] Caller[0000000000406274]: linux_sparc_syscall+0x34/0x44
>> [ 20.091196] Caller[0000000000100aa8]: 0x100aa8
>> [...]
>> ```
>>
>> [1]: https://pastebin.com/ApPYsMcu
>>
>> Here is the result for the suggested command:
>
> Thanks. And very strange, as I can't find what would free options
> before. Does the system boot if you comment out that kfree in line
> 3415 (even if that causes a memleak elsewhere)?

Unfortunately not, the result with the kfree() commented out in
fs/namespace.c:3415 looks pretty similar in my eyes. The log is on [2].

[2]: https://pastebin.com/zmSFpv3R

Cheers,
Frank
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-24 16:33 ` Frank Scheiner
@ 2021-03-24 16:37 ` Frank Scheiner
  2021-03-25  7:50 ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Frank Scheiner @ 2021-03-24 16:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc

On 24.03.21 17:33, Frank Scheiner wrote:
> On 24.03.21 17:10, Christoph Hellwig wrote:
>> On Wed, Mar 24, 2021 at 04:58:39PM +0100, Frank Scheiner wrote:
>>> [ 20.090279] [<00000000006c6494>] sys_mount+0x114/0x1e0
>>> [ 20.090338] [<00000000006c6454>] sys_mount+0xd4/0x1e0
>>> [ 20.090499] [<0000000000406274>] linux_sparc_syscall+0x34/0x44
>>> [ 20.090697] Disabling lock debugging due to kernel taint
>>> [ 20.090770] Caller[00000000006c6494]: sys_mount+0x114/0x1e0
>>> [ 20.090926] Caller[00000000006c6454]: sys_mount+0xd4/0x1e0
>>> [ 20.091133] Caller[0000000000406274]: linux_sparc_syscall+0x34/0x44
>>> [ 20.091196] Caller[0000000000100aa8]: 0x100aa8
>>> [...]
>>> ```
>>>
>>> [1]: https://pastebin.com/ApPYsMcu
>>>
>>> Here is the result for the suggested command:
>>
>> Thanks. And very strange, as I can't find what would free options
>> before. Does the system boot if you comment out that kfree in line
>> 3415 (even if that causes a memleak elsewhere)?
>
> Unfortunately not, the result with the kfree() commented out in
> fs/namespace.c:3415 looks pretty similar in my eyes.

Actually, on second look the result is different. :-/
* Re: Regression in 028abd92 for Sun UltraSPARC T1
  2021-03-24 16:37 ` Frank Scheiner
@ 2021-03-25  7:50 ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2021-03-25 7:50 UTC (permalink / raw)
  To: Frank Scheiner
  Cc: Christoph Hellwig, Jan Engelhardt, John Paul Adrian Glaubitz, Sparc kernel list, debian-sparc

I have to admit I'm completely lost at this point. This new trace looks
totally strange to me, and I'm pretty sure whatever symptoms you see are
due to different alignments / code sections etc. that were just triggered
by the removal. We need help from the real sparc experts.
end of thread, other threads:[~2021-03-25 7:51 UTC | newest]

Thread overview: 24+ messages
2021-03-22 21:30 Regression in 028abd92 for Sun UltraSPARC T1 Frank Scheiner
2021-03-22 21:48 ` John Paul Adrian Glaubitz
2021-03-22 21:55 ` Frank Scheiner
2021-03-23 16:50 ` Jan Engelhardt
2021-03-23 16:57 ` Christoph Hellwig
2021-03-23 17:39 ` Frank Scheiner
2021-03-23 22:17 ` Frank Scheiner
2021-03-24  8:28 ` Christoph Hellwig
2021-03-24 12:30 ` Frank Scheiner
2021-03-24 12:42 ` Anatoly Pugachev
2021-03-24 12:48 ` Frank Scheiner
2021-03-24 12:49 ` John Paul Adrian Glaubitz
2021-03-24 13:09 ` Frank Scheiner
2021-03-24 13:16 ` John Paul Adrian Glaubitz
2021-03-24 13:19 ` Frank Scheiner
2021-03-24 13:24 ` Anatoly Pugachev
2021-03-24 13:29 ` Frank Scheiner
2021-03-24 13:57 ` Frank Scheiner
2021-03-24 15:22 ` Jan Engelhardt
2021-03-24 15:58 ` Frank Scheiner
2021-03-24 16:10 ` Christoph Hellwig
2021-03-24 16:33 ` Frank Scheiner
2021-03-24 16:37 ` Frank Scheiner
2021-03-25  7:50 ` Christoph Hellwig