linux-kernel.vger.kernel.org archive mirror
* aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Petr Štetiar @ 2022-09-22 11:32 UTC
  To: linux-kernel; +Cc: pauld, yury.norov, rafael, Greg Kroah-Hartman

Hi,

we've got a recent bug report[1] that lscpu segfaults on an aarch64 board
running a 5.15.y kernel. It works fine on a 5.10.y kernel.

I've tracked it down[2] to an issue with `topology/thread_siblings`, which,
apart from reporting a very strange file size, returns empty content. I assume
that it's somehow related to the changes done in commit bb9ec13d156e
("topology: use bin_attribute to break the size limitation of cpumap ABI"),
but I haven't tried reverting it yet to verify that.

Kernel 5.15.68:

  root@OpenWrt:/# uname -a
  Linux OpenWrt 5.15.68 #0 SMP Wed Sep 21 05:54:21 2022 aarch64 GNU/Linux

  root@OpenWrt:/# find /sys -name thread_siblings -exec ls -al {} \;
  -r--r--r--    1 root     root     18446744073709551615 Sep 22 08:37 /sys/devices/system/cpu/cpu1/topology/thread_siblings
  -r--r--r--    1 root     root     18446744073709551615 Sep 22 08:37 /sys/devices/system/cpu/cpu0/topology/thread_siblings

  root@OpenWrt:/# find /sys -name thread_siblings -exec cat {} \;
  root@OpenWrt:/# 

Kernel 5.10.138:

  root@OpenWrt:/# uname -a
  Linux OpenWrt 5.10.138 #0 SMP Sat Sep 3 02:55:34 2022 aarch64 GNU/Linux

  root@OpenWrt:/# find /sys -name thread_siblings -exec cat {} \;
  2
  1

  root@OpenWrt:/# find /sys -name thread_siblings -exec ls -al {} \;
  -r--r--r--    1 root     root          4096 Sep 22 11:12 /sys/devices/system/cpu/cpu1/topology/thread_siblings
  -r--r--r--    1 root     root          4096 Sep 22 11:12 /sys/devices/system/cpu/cpu0/topology/thread_siblings
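
For reference, thread_siblings holds a hexadecimal CPU mask, so the 5.10
output above ("2" for cpu1, "1" for cpu0) means each CPU is its own only
sibling. A minimal C sketch of decoding such a value follows. The helper names
are hypothetical, and it assumes a mask that fits in one word; larger systems
print comma-separated groups, which this does not handle.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a single hex mask word such as "2\n".
 * strtoul() stops at the trailing newline. */
static unsigned long parse_cpumask_word(const char *text)
{
    return strtoul(text, NULL, 16);
}

/* Hypothetical helper: nonzero if bit `cpu` is set in the mask. */
static int mask_has_cpu(unsigned long mask, unsigned int cpu)
{
    if (cpu >= sizeof(mask) * CHAR_BIT)
        return 0;               /* out of range for a one-word mask */
    return (int)((mask >> cpu) & 1UL);
}
```

With the values shown above, the mask "2" has only bit 1 set (CPU 1) and the
mask "1" has only bit 0 set (CPU 0).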


1. https://github.com/openwrt/openwrt/issues/10737
2. https://github.com/util-linux/util-linux/pull/1821


Cheers,

Petr

* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Phil Auld @ 2022-09-22 12:32 UTC
  To: Petr Štetiar; +Cc: linux-kernel, yury.norov, rafael, Greg Kroah-Hartman

On Thu, Sep 22, 2022 at 01:32:17PM +0200 Petr Štetiar wrote:
> Hi,
> 
> we've got a recent bug report[1], that lscpu segfaults on aarch64 board running
> 5.15.y kernel. It is working fine on 5.10.y kernel. 
> 
> I've tracked it down[2] to the issue with `topology/thread_siblings` which
> apart from very strange file size returns empty content. I assume, that it's
> somehow related to the changes done in commit bb9ec13d156e ("topology: use
> bin_attribute to break the size limitation of cpumap ABI"), but I didn't tried
> to revert it yet to verify it.
>

This is actually due to a fix for that commit, since returning a size of 0
breaks things as well.

  7ee951acd31a drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist

The fix for a small number of CPUs, as you have, is now in Greg's driver core tree

  d7f06bdd6ee8 drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES

and should work its way back to the stable trees soon.
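
The commit title points at an unsigned comparison against -1. Below is a rough
C sketch of that class of bug, not a copy of the kernel macro: the function
names, the 4096-byte page size and the (n * 9) / 32 sizing formula are
assumptions made for illustration.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumed page size, unsigned on purpose */

/* Buggy shape: subtract 1 first, then compare against an unsigned value. */
static size_t max_bytes_buggy(long nr_cpus)
{
    long bytes = (nr_cpus * 9) / 32 - 1;   /* -1 when nr_cpus is 2 */

    /* C converts the signed side for a mixed comparison; the cast only
     * makes that conversion visible. -1 becomes ULONG_MAX, so this is true. */
    if ((unsigned long)bytes > SKETCH_PAGE_SIZE)
        return (size_t)bytes;              /* all-ones size for small nr_cpus */
    return SKETCH_PAGE_SIZE;
}

/* Fixed shape: compare before subtracting, so nothing goes negative. */
static size_t max_bytes_fixed(long nr_cpus)
{
    unsigned long bytes = ((unsigned long)nr_cpus * 9) / 32;

    return bytes > SKETCH_PAGE_SIZE ? bytes - 1 : SKETCH_PAGE_SIZE;
}
```

On a 64-bit build, max_bytes_buggy(2) yields 18446744073709551615, the size in
the ls output from the original report, while max_bytes_fixed(2) yields 4096.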


Cheers,
Phil


* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Greg Kroah-Hartman @ 2022-09-22 12:40 UTC
  To: Phil Auld; +Cc: Petr Štetiar, linux-kernel, yury.norov, rafael

On Thu, Sep 22, 2022 at 08:32:10AM -0400, Phil Auld wrote:
> On Thu, Sep 22, 2022 at 01:32:17PM +0200 Petr Štetiar wrote:
> > Hi,
> > 
> > we've got a recent bug report[1], that lscpu segfaults on aarch64 board running
> > 5.15.y kernel. It is working fine on 5.10.y kernel. 
> > 
> > I've tracked it down[2] to the issue with `topology/thread_siblings` which
> > apart from very strange file size returns empty content. I assume, that it's
> > somehow related to the changes done in commit bb9ec13d156e ("topology: use
> > bin_attribute to break the size limitation of cpumap ABI"), but I didn't tried
> > to revert it yet to verify it.
> >
> 
> This is actually due to a fix for that since returning 0 size breaks
> things as well.
> 
>   7ee951acd31a drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist
> 
> The fix for small number of cpus as you have is now in Greg's driver core tree
> 
>   d7f06bdd6ee8 drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES
> 
> and should work it's way back to stable trees soon.

That should fix up the file size issue.

The main problem being reported here is:

> > Kernel 5.15.68:
> > 
> >   root@OpenWrt:/# uname -a
> >   Linux OpenWrt 5.15.68 #0 SMP Wed Sep 21 05:54:21 2022 aarch64 GNU/Linux
> > 
> >   root@OpenWrt:/# find /sys -name thread_siblings -exec ls -al {} \;
> >   -r--r--r--    1 root     root     18446744073709551615 Sep 22 08:37 /sys/devices/system/cpu/cpu1/topology/thread_siblings
> >   -r--r--r--    1 root     root     18446744073709551615 Sep 22 08:37 /sys/devices/system/cpu/cpu0/topology/thread_siblings
> > 
> >   root@OpenWrt:/# find /sys -name thread_siblings -exec cat {} \;
> >   root@OpenWrt:/# 

Nothing in the file in 5.15, yet 5.10:

> > 
> > Kernel 5.10.138:
> > 
> >   root@OpenWrt:/# uname -a
> >   Linux OpenWrt 5.10.138 #0 SMP Sat Sep 3 02:55:34 2022 aarch64 GNU/Linux
> > 
> >   root@OpenWrt:/# find /sys -name thread_siblings -exec cat {} \;
> >   2
> >   1

Has data in the files.

What caused that change?

thanks,

greg k-h

* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Phil Auld @ 2022-09-22 13:18 UTC
  To: Greg Kroah-Hartman; +Cc: Petr Štetiar, linux-kernel, yury.norov, rafael

On Thu, Sep 22, 2022 at 02:40:00PM +0200 Greg Kroah-Hartman wrote:
> On Thu, Sep 22, 2022 at 08:32:10AM -0400, Phil Auld wrote:
> > On Thu, Sep 22, 2022 at 01:32:17PM +0200 Petr Štetiar wrote:
> > > Hi,
> > > 
> > > we've got a recent bug report[1], that lscpu segfaults on aarch64 board running
> > > 5.15.y kernel. It is working fine on 5.10.y kernel. 
> > > 
> > > I've tracked it down[2] to the issue with `topology/thread_siblings` which
> > > apart from very strange file size returns empty content. I assume, that it's
> > > somehow related to the changes done in commit bb9ec13d156e ("topology: use
> > > bin_attribute to break the size limitation of cpumap ABI"), but I didn't tried
> > > to revert it yet to verify it.
> > >
> > 
> > This is actually due to a fix for that since returning 0 size breaks
> > things as well.
> > 
> >   7ee951acd31a drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist
> > 
> > The fix for small number of cpus as you have is now in Greg's driver core tree
> > 
> >   d7f06bdd6ee8 drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES
> > 
> > and should work it's way back to stable trees soon.
> 
> That should fix up the file size issue.
> 
> The main problem being reported here is:
> 
> > > Kernel 5.15.68:
> > > 
> > >   root@OpenWrt:/# uname -a
> > >   Linux OpenWrt 5.15.68 #0 SMP Wed Sep 21 05:54:21 2022 aarch64 GNU/Linux
> > > 
> > >   root@OpenWrt:/# find /sys -name thread_siblings -exec ls -al {} \;
> > >   -r--r--r--    1 root     root     18446744073709551615 Sep 22 08:37 /sys/devices/system/cpu/cpu1/topology/thread_siblings
> > >   -r--r--r--    1 root     root     18446744073709551615 Sep 22 08:37 /sys/devices/system/cpu/cpu0/topology/thread_siblings
> > > 
> > >   root@OpenWrt:/# find /sys -name thread_siblings -exec cat {} \;
> > >   root@OpenWrt:/# 
> 
> Nothing in the file in 5.15, yet 5.10:
> 
> > > 
> > > Kernel 5.10.138:
> > > 
> > >   root@OpenWrt:/# uname -a
> > >   Linux OpenWrt 5.10.138 #0 SMP Sat Sep 3 02:55:34 2022 aarch64 GNU/Linux
> > > 
> > >   root@OpenWrt:/# find /sys -name thread_siblings -exec cat {} \;
> > >   2
> > >   1
> 
> Has data in the files.
>

Good point. My eyes latched on to that huge file size for some reason ;)

> What caused that change?

I've seen the size cause problems for tools. Are we sure that it's the empty file and not
the size causing issues?  Maybe something is treating that as signed again for a count of
-1 bytes (which seems like it would be a bug anyway)?
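
To make that hypothesis concrete: if some read path kept the size in a signed
64-bit variable, the all-ones value would read back as -1, and an "offset past
end of file" check would already fire at offset 0. The sketch below is purely
illustrative. clamp_count is a hypothetical helper, and nothing in this thread
confirms that the kernel or any tool does exactly this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds clamp: how many of `count` bytes may be read at `pos`
 * from a file of `size` bytes. A size of 0 is treated as "unknown". */
static size_t clamp_count(int64_t size, int64_t pos, size_t count)
{
    if (size != 0) {
        if (pos >= size)
            return 0;                        /* looks like end of file */
        if ((uint64_t)(size - pos) < count)
            count = (size_t)(size - pos);    /* trim to what is left */
    }
    return count;
}
```

Here 18446744073709551615 stored as int64_t has the same bit pattern as -1, so
clamp_count(-1, 0, 4096) returns 0, while a sane size such as 4096 lets the
read through.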


Cheers,
Phil

>
> thanks,
> 
> greg k-h
> 

* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Petr Štetiar @ 2022-09-22 14:05 UTC
  To: Phil Auld; +Cc: Greg Kroah-Hartman, linux-kernel, yury.norov, rafael

Phil Auld <pauld@redhat.com> [2022-09-22 09:18:47]:

Hi,

> I've seen the size cause problems for tools. Are we sure that it's the empty file and not
> the size causing issues?  Maybe something is treating that as signed again for a count of
> -1 bytes (which seems like it would be a bug anyway)?

  root@OpenWrt:/# strace cat /sys/devices/system/cpu/cpu1/topology/thread_siblings
  ...snip...
  openat(AT_FDCWD, "/sys/devices/system/cpu/cpu1/topology/thread_siblings", O_RDONLY|O_LARGEFILE) = 3
  sendfile(1, 3, NULL, 16777216)          = 0

  root@OpenWrt:/# strace md5sum /sys/devices/system/cpu/cpu1/topology/thread_sibli
  ...snip...
  openat(AT_FDCWD, "/sys/devices/system/cpu/cpu1/topology/thread_siblings", O_RDONLY|O_LARGEFILE) = 3
  read(3, "", 4096)                       = 0

  root@OpenWrt:/# strace head /sys/devices/system/cpu/cpu1/topology/thread_siblings
  ...snip...
  openat(AT_FDCWD, "/sys/devices/system/cpu/cpu1/topology/thread_siblings", O_RDONLY|O_LARGEFILE) = 3
  read(3, "", 1024)                       = 0
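
The traces show every tool getting 0 bytes back on its first read, i.e. an
immediate end of file. For userspace code that has to cope with such
attributes, one defensive pattern is to size buffers from what read() returns
rather than from st_size. A minimal sketch, with read_attr_fd as a
hypothetical helper (not necessarily what the util-linux pull request does):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read at most cap - 1 bytes from fd into buf and
 * NUL-terminate it. Returns the number of bytes read (0 for an empty
 * attribute) or -1 on error. The reported file size is never consulted. */
static ssize_t read_attr_fd(int fd, char *buf, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;

    while (used < cap - 1) {
        ssize_t n = read(fd, buf + used, cap - 1 - used);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry */
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        used += (size_t)n;
    }

    buf[used] = '\0';
    return (ssize_t)used;
}
```

A caller can then treat a return value of 0 as "no topology data" instead of
parsing an empty or unterminated buffer.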

Cheers,

Petr

* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Phil Auld @ 2022-09-22 17:18 UTC
  To: Petr Štetiar; +Cc: Greg Kroah-Hartman, linux-kernel, yury.norov, rafael

On Thu, Sep 22, 2022 at 04:05:04PM +0200 Petr Štetiar wrote:
> Phil Auld <pauld@redhat.com> [2022-09-22 09:18:47]:
> 
> Hi,
> 
> > I've seen the size cause problems for tools. Are we sure that it's the empty file and not
> > the size causing issues?  Maybe something is treating that as signed again for a count of
> > -1 bytes (which seems like it would be a bug anyway)?
> 
>   root@OpenWrt:/# strace cat /sys/devices/system/cpu/cpu1/topology/thread_siblings
>   ...snip...
>   openat(AT_FDCWD, "/sys/devices/system/cpu/cpu1/topology/thread_siblings", O_RDONLY|O_LARGEFILE) = 3
>   sendfile(1, 3, NULL, 16777216)          = 0
> 
>   root@OpenWrt:/# strace md5sum /sys/devices/system/cpu/cpu1/topology/thread_sibli
>   ...snip...
>   openat(AT_FDCWD, "/sys/devices/system/cpu/cpu1/topology/thread_siblings", O_RDONLY|O_LARGEFILE) = 3
>   read(3, "", 4096)                       = 0
> 
>   root@OpenWrt:/# strace head /sys/devices/system/cpu/cpu1/topology/thread_siblings
>   ...snip...
>   openat(AT_FDCWD, "/sys/devices/system/cpu/cpu1/topology/thread_siblings", O_RDONLY|O_LARGEFILE) = 3
>   read(3, "", 1024)                       = 0
>

I tried this with the latest upstream (which doesn't yet have the fix
for the size issue) and got the same results.

Then I applied the fix and the problem went away:

6.0.0-rc6.nr_cpus2+
# find /sys -name thread_siblings -exec cat \{\} \; 
2
1

Cheers,
Phil

> Cheers,
> 
> Petr
> 

* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Petr Štetiar @ 2022-09-22 20:05 UTC
  To: Phil Auld; +Cc: Greg Kroah-Hartman, linux-kernel, yury.norov, rafael

Phil Auld <pauld@redhat.com> [2022-09-22 13:18:12]:

> Then I applied the fix and the problem went away:

I've just tried it on the same aarch64 board and can confirm that the
patch fixes the issue.

Cheers,

Petr

* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Greg Kroah-Hartman @ 2022-09-23  9:16 UTC
  To: Petr Štetiar; +Cc: Phil Auld, linux-kernel, yury.norov, rafael

On Thu, Sep 22, 2022 at 10:05:06PM +0200, Petr Štetiar wrote:
> Phil Auld <pauld@redhat.com> [2022-09-22 13:18:12]:
> 
> > Then I applied the fix and the problem went away:
> 
> I've just tried the same aarch64 and I can confirm, that the
> patch fixes the issue.

Wow, that's odd that the file size matters here.

Ok, I'll send this to Linus in a few hours, thanks.

greg k-h

* Re: aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content)
From: Phil Auld @ 2022-09-23 13:02 UTC
  To: Greg Kroah-Hartman; +Cc: Petr Štetiar, linux-kernel, yury.norov, rafael

On Fri, Sep 23, 2022 at 11:16:55AM +0200 Greg Kroah-Hartman wrote:
> On Thu, Sep 22, 2022 at 10:05:06PM +0200, Petr Štetiar wrote:
> > Phil Auld <pauld@redhat.com> [2022-09-22 13:18:12]:
> > 
> > > Then I applied the fix and the problem went away:
> > 
> > I've just tried the same aarch64 and I can confirm, that the
> > patch fixes the issue.
> 
> Wow, that's odd that the file size matters here.
>

Yeah, I looked through the code a bit, but nothing jumped out as a place
where that unsigned -1 could cause a problem (like count + 1 wrapping
to 0 or something).

> Ok, I'll send this to Linus in a few hours, thanks.

Thanks!


Cheers,
Phil

> 
> greg k-h
> 

Thread overview: 9+ messages
2022-09-22 11:32 aarch64 5.15.68 regression in topology/thread_siblings (huge file size and no content) Petr Štetiar
2022-09-22 12:32 ` Phil Auld
2022-09-22 12:40   ` Greg Kroah-Hartman
2022-09-22 13:18     ` Phil Auld
2022-09-22 14:05       ` Petr Štetiar
2022-09-22 17:18         ` Phil Auld
2022-09-22 20:05           ` Petr Štetiar
2022-09-23  9:16             ` Greg Kroah-Hartman
2022-09-23 13:02               ` Phil Auld
