Stable Archive on lore.kernel.org
 help / color / Atom feed
* ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue)
@ 2019-11-30  5:26 CKI Project
  2019-11-30 21:56 ` Jan Stancek
  0 siblings, 1 reply; 11+ messages in thread
From: CKI Project @ 2019-11-30  5:26 UTC (permalink / raw)
  To: Linux Stable maillist; +Cc: Memory Management, Jan Stancek, LTP Mailing List


Hello,

We ran automated tests on a recent commit from this kernel tree:

       Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
            Commit: 3b5f97139acc - KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel

The results of these automated tests are provided below.

    Overall result: FAILED (see details below)
             Merge: OK
           Compile: OK
             Tests: FAILED

All kernel binaries, config files, and logs are available for download here:

  https://artifacts.cki-project.org/pipelines/314344

One or more kernel tests failed:

    ppc64le:
     ❌ LTP

We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.

Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.

        ,-.   ,-.
       ( C ) ( K )  Continuous
        `-',-.`-'   Kernel
          ( I )     Integration
           `-'
______________________________________________________________________________

Compile testing
---------------

We compiled the kernel for 3 architectures:

    aarch64:
      make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg

    ppc64le:
      make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg

    x86_64:
      make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg


Hardware testing
----------------
We booted each kernel and ran the following tests:

  aarch64:
    Host 1:
       ✅ Boot test
       ✅ Podman system integration test (as root)
       ✅ Podman system integration test (as user)
       ✅ LTP
       ✅ Loopdev Sanity
       ✅ Memory function: memfd_create
       ✅ Memory function: kaslr
       ✅ AMTU (Abstract Machine Test Utility)
       ✅ Networking bridge: sanity
       ✅ Ethernet drivers sanity
       ✅ Networking MACsec: sanity
       ✅ Networking socket: fuzz
       ✅ Networking sctp-auth: sockopts test
       ✅ Networking: igmp conformance test
       ✅ Networking route: pmtu
       ✅ Networking route_func: local
       ✅ Networking route_func: forward
       ✅ Networking TCP: keepalive test
       ✅ Networking UDP: socket
       ✅ Networking tunnel: geneve basic test
       ✅ Networking tunnel: gre basic
       ✅ L2TP basic test
       ✅ Networking tunnel: vxlan basic
       ✅ Networking ipsec: basic netns transport
       ✅ Networking ipsec: basic netns tunnel
       ✅ audit: audit testsuite test
       ✅ httpd: mod_ssl smoke sanity
       ✅ tuned: tune-processes-through-perf
       ✅ ALSA PCM loopback test
       ✅ ALSA Control (mixer) Userspace Element test
       ✅ storage: SCSI VPD
       ✅ stress: stress-ng
       ✅ trace: ftrace/tracer
       🚧 ✅ CIFS Connectathon
       🚧 ✅ POSIX pjd-fstest suites
       🚧 ✅ jvm test suite
       🚧 ✅ LTP: openposix test suite
       🚧 ✅ Networking vnic: ipvlan/basic
       🚧 ✅ iotop: sanity
       🚧 ✅ Usex - version 1.9-29
       🚧 ✅ storage: dm/common

    Host 2:
       ✅ Boot test
       ✅ xfstests: ext4
       ✅ xfstests: xfs
       ✅ lvm thinp sanity
       ✅ storage: software RAID testing
       🚧 ✅ selinux-policy: serge-testsuite
       🚧 ✅ Storage blktests

  ppc64le:
    Host 1:
       ✅ Boot test
       ✅ Podman system integration test (as root)
       ✅ Podman system integration test (as user)
       ❌ LTP
       ✅ Loopdev Sanity
       ✅ Memory function: memfd_create
       ✅ Memory function: kaslr
       ✅ AMTU (Abstract Machine Test Utility)
       ✅ Networking bridge: sanity
       ✅ Ethernet drivers sanity
       ✅ Networking MACsec: sanity
       ✅ Networking socket: fuzz
       ✅ Networking sctp-auth: sockopts test
       ✅ Networking route: pmtu
       ✅ Networking route_func: local
       ✅ Networking route_func: forward
       ✅ Networking TCP: keepalive test
       ✅ Networking UDP: socket
       ✅ Networking tunnel: geneve basic test
       ✅ Networking tunnel: gre basic
       ✅ L2TP basic test
       ✅ Networking tunnel: vxlan basic
       ✅ Networking ipsec: basic netns tunnel
       ✅ audit: audit testsuite test
       ✅ httpd: mod_ssl smoke sanity
       ✅ tuned: tune-processes-through-perf
       ✅ ALSA PCM loopback test
       ✅ ALSA Control (mixer) Userspace Element test
       ✅ trace: ftrace/tracer
       🚧 ✅ CIFS Connectathon
       🚧 ✅ POSIX pjd-fstest suites
       🚧 ✅ jvm test suite
       🚧 ✅ LTP: openposix test suite
       🚧 ✅ Networking vnic: ipvlan/basic
       🚧 ✅ iotop: sanity
       🚧 ✅ Usex - version 1.9-29
       🚧 ✅ storage: dm/common

    Host 2:
       ✅ Boot test
       ✅ xfstests: ext4
       ✅ xfstests: xfs
       ✅ lvm thinp sanity
       ✅ storage: software RAID testing
       🚧 ✅ IPMI driver test
       🚧 ✅ IPMItool loop stress test
       🚧 ✅ selinux-policy: serge-testsuite
       🚧 ✅ Storage blktests

  x86_64:
    Host 1:
       ✅ Boot test
       ✅ Podman system integration test (as root)
       ✅ Podman system integration test (as user)
       ✅ LTP
       ✅ Loopdev Sanity
       ✅ Memory function: memfd_create
       ✅ Memory function: kaslr
       ✅ AMTU (Abstract Machine Test Utility)
       ✅ Networking bridge: sanity
       ✅ Ethernet drivers sanity
       ✅ Networking MACsec: sanity
       ✅ Networking socket: fuzz
       ✅ Networking sctp-auth: sockopts test
       ✅ Networking: igmp conformance test
       ✅ Networking route: pmtu
       ✅ Networking route_func: local
       ✅ Networking route_func: forward
       ✅ Networking TCP: keepalive test
       ✅ Networking UDP: socket
       ✅ Networking tunnel: geneve basic test
       ✅ Networking tunnel: gre basic
       ✅ L2TP basic test
       ✅ Networking tunnel: vxlan basic
       ✅ Networking ipsec: basic netns transport
       ✅ Networking ipsec: basic netns tunnel
       ✅ audit: audit testsuite test
       ✅ httpd: mod_ssl smoke sanity
       ✅ tuned: tune-processes-through-perf
       ✅ pciutils: sanity smoke test
       ✅ ALSA PCM loopback test
       ✅ ALSA Control (mixer) Userspace Element test
       ✅ storage: SCSI VPD
       ✅ stress: stress-ng
       ✅ trace: ftrace/tracer
       🚧 ✅ CIFS Connectathon
       🚧 ✅ POSIX pjd-fstest suites
       🚧 ✅ jvm test suite
       🚧 ✅ LTP: openposix test suite
       🚧 ✅ Networking vnic: ipvlan/basic
       🚧 ✅ iotop: sanity
       🚧 ✅ Usex - version 1.9-29
       🚧 ✅ storage: dm/common

    Host 2:
       ⏱  Boot test
       ⏱  Storage SAN device stress - mpt3sas driver

    Host 3:
       ⏱  Boot test
       ⏱  Storage SAN device stress - megaraid_sas

    Host 4:
       ✅ Boot test
       ✅ xfstests: ext4
       ✅ xfstests: xfs
       ✅ lvm thinp sanity
       ✅ storage: software RAID testing
       🚧 ✅ IOMMU boot test
       🚧 ✅ IPMI driver test
       🚧 ✅ IPMItool loop stress test
       🚧 ✅ selinux-policy: serge-testsuite
       🚧 ✅ Storage blktests

  Test sources: https://github.com/CKI-project/tests-beaker
    💚 Pull requests are welcome for new tests or improvements to existing tests!

Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.

Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running are marked with ⏱. Reports for non-upstream kernels have
a Beaker recipe linked to next to each host.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue)
  2019-11-30  5:26 ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue) CKI Project
@ 2019-11-30 21:56 ` Jan Stancek
  2019-12-02  5:46   ` Michael Ellerman
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stancek @ 2019-11-30 21:56 UTC (permalink / raw)
  To: CKI Project; +Cc: Linux Stable maillist, Memory Management, LTP Mailing List


----- Original Message -----
> 
> Hello,
> 
> We ran automated tests on a recent commit from this kernel tree:
> 
>        Kernel repo:
>        git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
>             Commit: 3b5f97139acc - KVM: PPC: Book3S HV: Flush link stack on
>             guest exit to host kernel
> 
> The results of these automated tests are provided below.
> 
>     Overall result: FAILED (see details below)
>              Merge: OK
>            Compile: OK
>              Tests: FAILED
> 
> All kernel binaries, config files, and logs are available for download here:
> 
>   https://artifacts.cki-project.org/pipelines/314344
> 
> One or more kernel tests failed:
> 
>     ppc64le:
>      ❌ LTP

I suspect kernel bug.

There were couple of 'math' runtest related failures in recent couple days.
In all cases, some data file used by test was missing. Presumably because
binary that generates it crashed.

I managed to reproduce one failure with this CKI build, which I believe
is the same problem.

We crash early during load, before any LTP code runs:

(gdb) r
Starting program: /mnt/testarea/ltp/testcases/bin/genasin

Program received signal SIGBUS, Bus error.
dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
1362        switch (ph->p_type)
(gdb) bt
#0  dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
#1  0x00007ffff7fcf3c8 in _dl_sysdep_start (start_argptr=<optimized out>, dl_main=0x7ffff7fb37b0 <dl_main>) at ../elf/dl-sysdep.c:253
#2  0x00007ffff7fb1d1c in _dl_start_final (arg=arg@entry=0x7fffffffee20, info=info@entry=0x7fffffffe870) at rtld.c:445
#3  0x00007ffff7fb2f5c in _dl_start (arg=0x7fffffffee20) at rtld.c:537
#4  0x00007ffff7fb14d8 in _start () from /lib64/ld64.so.2
(gdb) f 0
#0  dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
1362        switch (ph->p_type)
(gdb) l
1357      /* And it was opened directly.  */
1358      ++main_map->l_direct_opencount;
1359
1360      /* Scan the program header table for the dynamic section.  */
1361      for (ph = phdr; ph < &phdr[phnum]; ++ph)
1362        switch (ph->p_type)
1363          {
1364          case PT_PHDR:
1365            /* Find out the load address.  */
1366            main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;

(gdb) p ph
$1 = (const Elf64_Phdr *) 0x10000040

(gdb) p *ph
Cannot access memory at address 0x10000040

(gdb) info proc map
process 1110670
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
          0x10000000         0x10010000    0x10000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
          0x10010000         0x10030000    0x20000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
      0x7ffff7f90000     0x7ffff7fb0000    0x20000        0x0 [vdso]
      0x7ffff7fb0000     0x7ffff7fe0000    0x30000        0x0 /usr/lib64/ld-2.30.so
      0x7ffff7fe0000     0x7ffff8000000    0x20000    0x20000 /usr/lib64/ld-2.30.so
      0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]

(gdb) x/1x 0x10000040
0x10000040:     Cannot access memory at address 0x10000040

# /mnt/testarea/ltp/testcases/bin/genasin
Bus error (core dumped)

However, as soon as I copy that binary somewhere else, it works fine:

# cp /mnt/testarea/ltp/testcases/bin/genasin /tmp
# /tmp/genasin
# echo $?
0

# cp /mnt/testarea/ltp/testcases/bin/genasin /mnt/testarea/ltp/testcases/bin/genasin2
# /mnt/testarea/ltp/testcases/bin/genasin2
# echo $?
0

# /mnt/testarea/ltp/testcases/bin/genasin
Bus error (core dumped)

# diff /mnt/testarea/ltp/testcases/bin/genasin /mnt/testarea/ltp/testcases/bin/genasin2; echo $?
0

# lscpu
Architecture:                    ppc64le
Byte Order:                      Little Endian
CPU(s):                          160
On-line CPU(s) list:             0-159
Thread(s) per core:              4
Core(s) per socket:              20
Socket(s):                       2
NUMA node(s):                    2
Model:                           2.2 (pvr 004e 1202)
Model name:                      POWER9, altivec supported
Frequency boost:                 enabled
CPU max MHz:                     3800.0000
CPU min MHz:                     2166.0000
L1d cache:                       1.3 MiB
L1i cache:                       1.3 MiB
L2 cache:                        10 MiB
L3 cache:                        200 MiB
NUMA node0 CPU(s):               0-79
NUMA node8 CPU(s):               80-159
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Mitigation; RFI Flush, L1D private per thread
Vulnerability Spec store bypass: Mitigation; Kernel entry/exit barrier (eieio)
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization, ori31 speculation barrier enabled
Vulnerability Spectre v2:        Mitigation; Indirect branch cache disabled, Software link stack flush
Vulnerability Tsx async abort:   Not affected


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue)
  2019-11-30 21:56 ` Jan Stancek
@ 2019-12-02  5:46   ` Michael Ellerman
  2019-12-02 12:30     ` Jan Stancek
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Ellerman @ 2019-12-02  5:46 UTC (permalink / raw)
  To: Jan Stancek, CKI Project
  Cc: Linux Stable maillist, Memory Management, LTP Mailing List, linuxppc-dev

Hi Jan,

Jan Stancek <jstancek@redhat.com> writes:
> ----- Original Message -----
>> 
>> Hello,
>> 
>> We ran automated tests on a recent commit from this kernel tree:
>> 
>>        Kernel repo:
>>        git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
>>             Commit: 3b5f97139acc - KVM: PPC: Book3S HV: Flush link stack on
>>             guest exit to host kernel

I can't find this commit, I assume it's roughly the same as:

  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-5.3.y&id=0815f75f90178bc7e1933cf0d0c818b5f3f5a20c

>> The results of these automated tests are provided below.
>> 
>>     Overall result: FAILED (see details below)
>>              Merge: OK
>>            Compile: OK
>>              Tests: FAILED
>> 
>> All kernel binaries, config files, and logs are available for download here:
>> 
>>   https://artifacts.cki-project.org/pipelines/314344
>> 
>> One or more kernel tests failed:
>> 
>>     ppc64le:
>>      ❌ LTP
>
> I suspect kernel bug.

Looks that way, but I can't reproduce it on a machine here.

I have the same CPU revision and am booting the exact kernel binary &
modules linked above.

> There were couple of 'math' runtest related failures in recent couple days.
> In all cases, some data file used by test was missing. Presumably because
> binary that generates it crashed.
>
> I managed to reproduce one failure with this CKI build, which I believe
> is the same problem.
>
> We crash early during load, before any LTP code runs:
>
> (gdb) r
> Starting program: /mnt/testarea/ltp/testcases/bin/genasin

What is this /mnt/testarea? Looks like it's setup by some of the beaker
scripts or something?

I'm running LTP out of /home, which is ext4 directly on disk.

I tried getting the tests-beaker stuff working on my machine, but I
couldn't find all the libraries and so on it requires.


> Program received signal SIGBUS, Bus error.
> dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> 1362        switch (ph->p_type)
> (gdb) bt
> #0  dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> #1  0x00007ffff7fcf3c8 in _dl_sysdep_start (start_argptr=<optimized out>, dl_main=0x7ffff7fb37b0 <dl_main>) at ../elf/dl-sysdep.c:253
> #2  0x00007ffff7fb1d1c in _dl_start_final (arg=arg@entry=0x7fffffffee20, info=info@entry=0x7fffffffe870) at rtld.c:445
> #3  0x00007ffff7fb2f5c in _dl_start (arg=0x7fffffffee20) at rtld.c:537
> #4  0x00007ffff7fb14d8 in _start () from /lib64/ld64.so.2
> (gdb) f 0
> #0  dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> 1362        switch (ph->p_type)
> (gdb) l
> 1357      /* And it was opened directly.  */
> 1358      ++main_map->l_direct_opencount;
> 1359
> 1360      /* Scan the program header table for the dynamic section.  */
> 1361      for (ph = phdr; ph < &phdr[phnum]; ++ph)
> 1362        switch (ph->p_type)
> 1363          {
> 1364          case PT_PHDR:
> 1365            /* Find out the load address.  */
> 1366            main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;
>
> (gdb) p ph
> $1 = (const Elf64_Phdr *) 0x10000040
>
> (gdb) p *ph
> Cannot access memory at address 0x10000040
>
> (gdb) info proc map
> process 1110670
> Mapped address spaces:
>
>           Start Addr           End Addr       Size     Offset objfile
>           0x10000000         0x10010000    0x10000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
>           0x10010000         0x10030000    0x20000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
>       0x7ffff7f90000     0x7ffff7fb0000    0x20000        0x0 [vdso]
>       0x7ffff7fb0000     0x7ffff7fe0000    0x30000        0x0 /usr/lib64/ld-2.30.so
>       0x7ffff7fe0000     0x7ffff8000000    0x20000    0x20000 /usr/lib64/ld-2.30.so
>       0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]
>
> (gdb) x/1x 0x10000040
> 0x10000040:     Cannot access memory at address 0x10000040

Yeah that's weird.

> # /mnt/testarea/ltp/testcases/bin/genasin
> Bus error (core dumped)
>
> However, as soon as I copy that binary somewhere else, it works fine:
>
> # cp /mnt/testarea/ltp/testcases/bin/genasin /tmp
> # /tmp/genasin
> # echo $?
> 0

Is /tmp a real disk or tmpfs?

cheers

> # cp /mnt/testarea/ltp/testcases/bin/genasin /mnt/testarea/ltp/testcases/bin/genasin2
> # /mnt/testarea/ltp/testcases/bin/genasin2
> # echo $?
> 0
>
> # /mnt/testarea/ltp/testcases/bin/genasin
> Bus error (core dumped)
>
> # diff /mnt/testarea/ltp/testcases/bin/genasin /mnt/testarea/ltp/testcases/bin/genasin2; echo $?
> 0
>
> # lscpu
> Architecture:                    ppc64le
> Byte Order:                      Little Endian
> CPU(s):                          160
> On-line CPU(s) list:             0-159
> Thread(s) per core:              4
> Core(s) per socket:              20
> Socket(s):                       2
> NUMA node(s):                    2
> Model:                           2.2 (pvr 004e 1202)
> Model name:                      POWER9, altivec supported
> Frequency boost:                 enabled
> CPU max MHz:                     3800.0000
> CPU min MHz:                     2166.0000
> L1d cache:                       1.3 MiB
> L1i cache:                       1.3 MiB
> L2 cache:                        10 MiB
> L3 cache:                        200 MiB
> NUMA node0 CPU(s):               0-79
> NUMA node8 CPU(s):               80-159
> Vulnerability Itlb multihit:     Not affected
> Vulnerability L1tf:              Not affected
> Vulnerability Mds:               Not affected
> Vulnerability Meltdown:          Mitigation; RFI Flush, L1D private per thread
> Vulnerability Spec store bypass: Mitigation; Kernel entry/exit barrier (eieio)
> Vulnerability Spectre v1:        Mitigation; __user pointer sanitization, ori31 speculation barrier enabled
> Vulnerability Spectre v2:        Mitigation; Indirect branch cache disabled, Software link stack flush
> Vulnerability Tsx async abort:   Not affected

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue)
  2019-12-02  5:46   ` Michael Ellerman
@ 2019-12-02 12:30     ` Jan Stancek
  2019-12-03 12:50       ` [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later Jan Stancek
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stancek @ 2019-12-02 12:30 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: CKI Project, Linux Stable maillist, Memory Management,
	LTP Mailing List, linuxppc-dev



----- Original Message -----
> Hi Jan,
> 
> Jan Stancek <jstancek@redhat.com> writes:
> > ----- Original Message -----
> >> 
> >> Hello,
> >> 
> >> We ran automated tests on a recent commit from this kernel tree:
> >> 
> >>        Kernel repo:
> >>        git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
> >>             Commit: 3b5f97139acc - KVM: PPC: Book3S HV: Flush link stack
> >>             on
> >>             guest exit to host kernel
> 
> I can't find this commit, I assume it's roughly the same as:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-5.3.y&id=0815f75f90178bc7e1933cf0d0c818b5f3f5a20c

Hi,

yes, that looks like same one:
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?h=3b5f97139acc

Looking at CKI reports for past 2 weeks, there were 3 (unexplained) SIGBUS related failures:

5.3.13-3b5f971.cki@upstream-stable
LTP genpower Bus error

5.4.0-rc8-4b17a56.cki@upstream-stable
LTP genatan Bus error

5.3.11-200.fc30
xfstests
+/var/lib/xfstests/tests/generic/248: line 38: 161943 Bus error               (core dumped) $TEST_PROG $TESTFILE

All 3 are from ppc64le, all power9 systems.

> 
> >> The results of these automated tests are provided below.
> >> 
> >>     Overall result: FAILED (see details below)
> >>              Merge: OK
> >>            Compile: OK
> >>              Tests: FAILED
> >> 
> >> All kernel binaries, config files, and logs are available for download
> >> here:
> >> 
> >>   https://artifacts.cki-project.org/pipelines/314344
> >> 
> >> One or more kernel tests failed:
> >> 
> >>     ppc64le:
> >>      ❌ LTP
> >
> > I suspect kernel bug.
> 
> Looks that way, but I can't reproduce it on a machine here.
> 
> I have the same CPU revision and am booting the exact kernel binary &
> modules linked above.

I can semi-reliably reproduce it with:
(where LTP is installed to /mnt/testarea/ltp)

while [ True ]; do
        echo 3 > /proc/sys/vm/drop_caches
        rm -f /mnt/testarea/ltp/results/RUNTEST.log /mnt/testarea/ltp/output/RUNTEST.run.log
        ./runltp -p -d results -l RUNTEST.log -o RUNTEST.run.log -f math
        grep FAIL /mnt/testarea/ltp/results/RUNTEST.log && exit 1
done

and some stress activity in other terminal (e.g. kernel build).
Sometimes in minutes, sometimes in hours. I did try couple
older kernels and could reproduce it with v4.19 and v5.0 as well.

v4.18 ran OK for 2 hours, assuming that one is good, it could be
related to xfs switching to iomap in 4.19-rc1.

Tracing so far led me to filemap_fault(), where it reached this -EIO,
before returning SIGBUS.

page_not_uptodate:
        /*
         * Umm, take care of errors if the page isn't up-to-date.
         * Try to re-read it _once_. We do this synchronously,
         * because there really aren't any performance issues here
         * and we need to check for errors.
         */
        ClearPageError(page);
        fpin = maybe_unlock_mmap_for_io(vmf, fpin);
        error = mapping->a_ops->readpage(file, page);
        if (!error) {
                wait_on_page_locked(page);
                if (!PageUptodate(page))
                        error = -EIO;
        }

...
        return VM_FAULT_SIGBUS;

> 
> > There were couple of 'math' runtest related failures in recent couple days.
> > In all cases, some data file used by test was missing. Presumably because
> > binary that generates it crashed.
> >
> > I managed to reproduce one failure with this CKI build, which I believe
> > is the same problem.
> >
> > We crash early during load, before any LTP code runs:
> >
> > (gdb) r
> > Starting program: /mnt/testarea/ltp/testcases/bin/genasin
> 
> What is this /mnt/testarea? Looks like it's setup by some of the beaker
> scripts or something?

Correct, it's where beaker script installs LTP. It's not a real mount,
just a directory on /. In my case it's xfs. It should match default
Fedora-31 Server ppc64le installation.

> 
> I'm running LTP out of /home, which is ext4 directly on disk.
> 
> I tried getting the tests-beaker stuff working on my machine, but I
> couldn't find all the libraries and so on it requires.
> 
> 
> > Program received signal SIGBUS, Bus error.
> > dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760,
> > auxv=<optimized out>) at rtld.c:1362
> > 1362        switch (ph->p_type)
> > (gdb) bt
> > #0  dl_main (phdr=0x10000040, phnum=<optimized out>,
> > user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> > #1  0x00007ffff7fcf3c8 in _dl_sysdep_start (start_argptr=<optimized out>,
> > dl_main=0x7ffff7fb37b0 <dl_main>) at ../elf/dl-sysdep.c:253
> > #2  0x00007ffff7fb1d1c in _dl_start_final (arg=arg@entry=0x7fffffffee20,
> > info=info@entry=0x7fffffffe870) at rtld.c:445
> > #3  0x00007ffff7fb2f5c in _dl_start (arg=0x7fffffffee20) at rtld.c:537
> > #4  0x00007ffff7fb14d8 in _start () from /lib64/ld64.so.2
> > (gdb) f 0
> > #0  dl_main (phdr=0x10000040, phnum=<optimized out>,
> > user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> > 1362        switch (ph->p_type)
> > (gdb) l
> > 1357      /* And it was opened directly.  */
> > 1358      ++main_map->l_direct_opencount;
> > 1359
> > 1360      /* Scan the program header table for the dynamic section.  */
> > 1361      for (ph = phdr; ph < &phdr[phnum]; ++ph)
> > 1362        switch (ph->p_type)
> > 1363          {
> > 1364          case PT_PHDR:
> > 1365            /* Find out the load address.  */
> > 1366            main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;
> >
> > (gdb) p ph
> > $1 = (const Elf64_Phdr *) 0x10000040
> >
> > (gdb) p *ph
> > Cannot access memory at address 0x10000040
> >
> > (gdb) info proc map
> > process 1110670
> > Mapped address spaces:
> >
> >           Start Addr           End Addr       Size     Offset objfile
> >           0x10000000         0x10010000    0x10000        0x0
> >           /mnt/testarea/ltp/testcases/bin/genasin
> >           0x10010000         0x10030000    0x20000        0x0
> >           /mnt/testarea/ltp/testcases/bin/genasin
> >       0x7ffff7f90000     0x7ffff7fb0000    0x20000        0x0 [vdso]
> >       0x7ffff7fb0000     0x7ffff7fe0000    0x30000        0x0
> >       /usr/lib64/ld-2.30.so
> >       0x7ffff7fe0000     0x7ffff8000000    0x20000    0x20000
> >       /usr/lib64/ld-2.30.so
> >       0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]
> >
> > (gdb) x/1x 0x10000040
> > 0x10000040:     Cannot access memory at address 0x10000040
> 
> Yeah that's weird.
> 
> > # /mnt/testarea/ltp/testcases/bin/genasin
> > Bus error (core dumped)
> >
> > However, as soon as I copy that binary somewhere else, it works fine:
> >
> > # cp /mnt/testarea/ltp/testcases/bin/genasin /tmp
> > # /tmp/genasin
> > # echo $?
> > 0
> 
> Is /tmp a real disk or tmpfs?

tmpfs

Filesystem                           Type      1K-blocks     Used  Available Use% Mounted on
devtmpfs                             devtmpfs  254530176        0  254530176   0% /dev
tmpfs                                tmpfs     267992768        0  267992768   0% /dev/shm
tmpfs                                tmpfs     267992768     9152  267983616   1% /run
/dev/mapper/fedora_ibm--p9b--03-root xfs        15718400 13029284    2689116  83% /
tmpfs                                tmpfs     267992768        0  267992768   0% /tmp
/dev/sda1                            xfs         1038336   944588      93748  91% /boot
tmpfs                                tmpfs      53598528        0   53598528   0% /run/user/0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
  2019-12-02 12:30     ` Jan Stancek
@ 2019-12-03 12:50       ` Jan Stancek
  2019-12-03 13:07         ` Christoph Hellwig
                           ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Jan Stancek @ 2019-12-03 12:50 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel, hch, darrick.wong
  Cc: linuxppc-dev, Memory Management, LTP Mailing List,
	Linux Stable maillist, CKI Project, Michael Ellerman

Hi,

(This bug report is summary from thread [1] with some additions)

User-space binaries on Power9 ppc64le (with 64k pages) on xfs
filesystem are sporadically hitting SIGBUS:

---------- 8< ----------
(gdb) r
Starting program: /mnt/testarea/ltp/testcases/bin/genasin

Program received signal SIGBUS, Bus error.
dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
1362        switch (ph->p_type)

(gdb) p ph
$1 = (const Elf64_Phdr *) 0x10000040

(gdb) p *ph
Cannot access memory at address 0x10000040

(gdb) info proc map
process 1110670
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
          0x10000000         0x10010000    0x10000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
          0x10010000         0x10030000    0x20000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
      0x7ffff7f90000     0x7ffff7fb0000    0x20000        0x0 [vdso]
      0x7ffff7fb0000     0x7ffff7fe0000    0x30000        0x0 /usr/lib64/ld-2.30.so
      0x7ffff7fe0000     0x7ffff8000000    0x20000    0x20000 /usr/lib64/ld-2.30.so
      0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]

(gdb) x/1x 0x10000040
0x10000040:     Cannot access memory at address 0x10000040
---------- >8 ----------

When this happens the binary continues to hit SIGBUS until page
is released, for example by: echo 3 > /proc/sys/vm/drop_caches

The issue goes back to at least v4.19.

I can semi-reliably reproduce it with LTP is installed to /mnt/testarea/ltp by:
while [ True ]; do
        echo 3 > /proc/sys/vm/drop_caches
        rm -f /mnt/testarea/ltp/results/RUNTEST.log /mnt/testarea/ltp/output/RUNTEST.run.log
        ./runltp -p -d results -l RUNTEST.log -o RUNTEST.run.log -f math
        grep FAIL /mnt/testarea/ltp/results/RUNTEST.log && exit 1
done

and some stress activity in other terminal (e.g. kernel build).
Sometimes in minutes, sometimes in hours. It is not reliable
enough to get meaningful bisect results.

My theory is that there's a race in iomap. There appear to be
interleaved calls to iomap_set_range_uptodate() for same page
with varying offset and length. Each call sees bitmap as _not_
entirely "uptodate" and hence doesn't call SetPageUptodate().
Even though each bit in bitmap ends up uptodate by the time
all calls finish.

For example, with following traces:

iomap_set_range_uptodate()
...
        if (uptodate && !PageError(page))
                SetPageUptodate(page);
+       
+       if (mycheck(page)) {
+               trace_printk("page: %px, iop: %px, uptodate: %d, !PageError(page): %d, flags: %lx\n", page, iop, uptodate, !PageError(page), page->flags);
+               trace_printk("first: %u, last: %u, off: %u, len: %u, i: %u\n", first, last, off, len, i);
+       }

I get:
         genacos-18471 [057] ....   162.465730: iomap_readpages: mapping: c000003f185a1ab0
         genacos-18471 [057] ....   162.465732: iomap_page_create: iomap_page_create page: c00c00000fe26180, page->private: 0000000000000000, iop: c000003fc70a19c0, flags: 3ffff800000001
         genacos-18471 [057] ....   162.465736: iomap_set_range_uptodate: page: c00c00000fe26180, iop: c000003fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 3ffff800002001
         genacos-18471 [057] ....   162.465736: iomap_set_range_uptodate: first: 1, last: 14, off: 4096, len: 57344, i: 16
          <idle>-0     [060] ..s.   162.534862: iomap_set_range_uptodate: page: c00c00000fe26180, iop: c000003fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 3ffff800002081
          <idle>-0     [061] ..s.   162.534862: iomap_set_range_uptodate: page: c00c00000fe26180, iop: c000003fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 3ffff800002081
          <idle>-0     [060] ..s.   162.534864: iomap_set_range_uptodate: first: 0, last: 0, off: 0, len: 4096, i: 16
          <idle>-0     [061] ..s.   162.534864: iomap_set_range_uptodate: first: 15, last: 15, off: 61440, len: 4096, i: 16

This page doesn't have Uptodate flag set, which leads to filemap_fault()
returning VM_FAULT_SIGBUS:

crash> p/x ((struct page *) 0xc00c00000fe26180)->flags                                                                                                                                             
$1 = 0x3ffff800002032

crash> kmem -g 0x3ffff800002032
FLAGS: 3ffff800002032
  PAGE-FLAG       BIT  VALUE
  PG_error          1  0000002
  PG_dirty          4  0000010
  PG_lru            5  0000020
  PG_private_2     13  0002000
  PG_fscache       13  0002000
  PG_savepinned     4  0000010
  PG_double_map    13  0002000

But iomap_page->uptodate in page->private suggests all bits are uptodate:

crash> p/x ((struct page *) 0xc00c00000fe26180)->private
$2 = 0xc000003fc70a19c0

crash> p/x ((struct iomap_page *) 0xc000003fc70a19c0)->uptodate                                                                                                                                    
$3 = {0xffff, 0x0}


It appears (after ~4 hours) that I can avoid the problem if I split
the loop so that bits are checked only after all updates are made.
Not sure if this correct approach, or just making it less reproducible:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e25901ae3ff4..abe37031c93d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -131,7 +131,11 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
                for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
                        if (i >= first && i <= last)
                                set_bit(i, iop->uptodate);
-                       else if (!test_bit(i, iop->uptodate))
+               }
+               for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
+                       if (i >= first && i <= last)
+                               continue;
+                       if (!test_bit(i, iop->uptodate))
                                uptodate = false;
                }
        }

Thanks,
Jan

[1] https://lore.kernel.org/stable/1420623640.14527843.1575289859701.JavaMail.zimbra@redhat.com/T/#u


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
  2019-12-03 12:50       ` [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later Jan Stancek
@ 2019-12-03 13:07         ` Christoph Hellwig
  2019-12-03 14:35           ` Jan Stancek
  2019-12-03 19:09         ` Christoph Hellwig
  2019-12-07  0:02         ` dftxbs3e
  2 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2019-12-03 13:07 UTC (permalink / raw)
  To: Jan Stancek
  Cc: linux-xfs, linux-fsdevel, hch, darrick.wong, linuxppc-dev,
	Memory Management, LTP Mailing List, Linux Stable maillist,
	CKI Project, Michael Ellerman

On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> My theory is that there's a race in iomap. There appear to be
> interleaved calls to iomap_set_range_uptodate() for same page
> with varying offset and length. Each call sees bitmap as _not_
> entirely "uptodate" and hence doesn't call SetPageUptodate().
> Even though each bit in bitmap ends up uptodate by the time
> all calls finish.

Weird.  That should be prevented by the page lock that all callers
of iomap_set_range_uptodate.  But in case I miss something, does
the patch below trigger?  If not it is not jut a race, but might
be some weird ordering problem with the bitops, especially if it
only triggers on ppc, which is very weakly ordered.

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index d33c7bc5ee92..25e942c71590 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	unsigned int i;
 	bool uptodate = true;
 
+	WARN_ON_ONCE(!PageLocked(page));
+
 	if (iop) {
 		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
 			if (i >= first && i <= last)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
  2019-12-03 13:07         ` Christoph Hellwig
@ 2019-12-03 14:35           ` Jan Stancek
  2019-12-03 16:08             ` Darrick J. Wong
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stancek @ 2019-12-03 14:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-xfs, linux-fsdevel, darrick wong, linuxppc-dev,
	Memory Management, LTP Mailing List, Linux Stable maillist,
	CKI Project, Michael Ellerman


----- Original Message -----
> On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > My theory is that there's a race in iomap. There appear to be
> > interleaved calls to iomap_set_range_uptodate() for same page
> > with varying offset and length. Each call sees bitmap as _not_
> > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > Even though each bit in bitmap ends up uptodate by the time
> > all calls finish.
> 
> Weird.  That should be prevented by the page lock that all callers
> of iomap_set_range_uptodate.  But in case I miss something, does
> the patch below trigger?  If not it is not jut a race, but might
> be some weird ordering problem with the bitops, especially if it
> only triggers on ppc, which is very weakly ordered.
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index d33c7bc5ee92..25e942c71590 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off,
> unsigned len)
>  	unsigned int i;
>  	bool uptodate = true;
>  
> +	WARN_ON_ONCE(!PageLocked(page));
> +
>  	if (iop) {
>  		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
>  			if (i >= first && i <= last)
> 

Hit it pretty quick this time:

# uptime
 09:27:42 up 22 min,  2 users,  load average: 0.09, 13.38, 26.18

# /mnt/testarea/ltp/testcases/bin/genbessel                                                                                                                                     
Bus error (core dumped)

# dmesg | grep -i -e warn -e call                                                                                                                                               
[    0.000000] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or unsupported by kernel)
[    0.000000] random: get_random_u64 called from cache_random_seq_create+0x98/0x1e0 with crng_init=0
[    0.000000] rcu:     Offload RCU callbacks from CPUs: (none).
[    5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[    5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[    5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000

So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
did not trigger.

Is it possible for iomap code to submit multiple bio-s for same
locked page and then receive callbacks in parallel?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
  2019-12-03 14:35           ` Jan Stancek
@ 2019-12-03 16:08             ` Darrick J. Wong
  0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2019-12-03 16:08 UTC (permalink / raw)
  To: Jan Stancek
  Cc: Christoph Hellwig, linux-xfs, linux-fsdevel, linuxppc-dev,
	Memory Management, LTP Mailing List, Linux Stable maillist,
	CKI Project, Michael Ellerman

On Tue, Dec 03, 2019 at 09:35:28AM -0500, Jan Stancek wrote:
> 
> ----- Original Message -----
> > On Tue, Dec 03, 2019 at 07:50:39AM -0500, Jan Stancek wrote:
> > > My theory is that there's a race in iomap. There appear to be
> > > interleaved calls to iomap_set_range_uptodate() for same page
> > > with varying offset and length. Each call sees bitmap as _not_
> > > entirely "uptodate" and hence doesn't call SetPageUptodate().
> > > Even though each bit in bitmap ends up uptodate by the time
> > > all calls finish.
> > 
> > Weird.  That should be prevented by the page lock that all callers
> > of iomap_set_range_uptodate.  But in case I miss something, does
> > the patch below trigger?  If not it is not jut a race, but might
> > be some weird ordering problem with the bitops, especially if it
> > only triggers on ppc, which is very weakly ordered.
> > 
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index d33c7bc5ee92..25e942c71590 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -148,6 +148,8 @@ iomap_set_range_uptodate(struct page *page, unsigned off,
> > unsigned len)
> >  	unsigned int i;
> >  	bool uptodate = true;
> >  
> > +	WARN_ON_ONCE(!PageLocked(page));
> > +
> >  	if (iop) {
> >  		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
> >  			if (i >= first && i <= last)
> > 
> 
> Hit it pretty quick this time:
> 
> # uptime
>  09:27:42 up 22 min,  2 users,  load average: 0.09, 13.38, 26.18
> 
> # /mnt/testarea/ltp/testcases/bin/genbessel                                                                                                                                     
> Bus error (core dumped)
> 
> # dmesg | grep -i -e warn -e call                                                                                                                                               
> [    0.000000] dt-cpu-ftrs: not enabling: system-call-vectored (disabled or unsupported by kernel)
> [    0.000000] random: get_random_u64 called from cache_random_seq_create+0x98/0x1e0 with crng_init=0
> [    0.000000] rcu:     Offload RCU callbacks from CPUs: (none).
> [    5.312075] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [    5.357307] megaraid_sas 0031:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [    5.485126] megaraid_sas 0031:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
> 
> So, extra WARN_ON_ONCE applied on top of v5.4-8836-g81b6b96475ac
> did not trigger.
> 
> Is it possible for iomap code to submit multiple bio-s for same
> locked page and then receive callbacks in parallel?

Yes, if (say) you have 64k pages on a 4k-block filesystem and the extent
mapping for all 16 blocks aren't contiguous, then iomap will issue
separate bios for each physical fragment it finds.  iomap will call
submit_bio on those bios whenever it thinks it's done filling the bio,
so you can indeed get multiple callbacks in parallel.

--D

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
  2019-12-03 12:50       ` [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later Jan Stancek
  2019-12-03 13:07         ` Christoph Hellwig
@ 2019-12-03 19:09         ` Christoph Hellwig
  2019-12-04 14:43           ` Jan Stancek
  2019-12-07  0:02         ` dftxbs3e
  2 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2019-12-03 19:09 UTC (permalink / raw)
  To: Jan Stancek
  Cc: linux-xfs, linux-fsdevel, hch, darrick.wong, linuxppc-dev,
	Memory Management, LTP Mailing List, Linux Stable maillist,
	CKI Project, Michael Ellerman

Please try the patch below:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 512856a88106..340c15400423 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -28,6 +28,7 @@
 struct iomap_page {
 	atomic_t		read_count;
 	atomic_t		write_count;
+	spinlock_t		uptodate_lock;
 	DECLARE_BITMAP(uptodate, PAGE_SIZE / 512);
 };
 
@@ -51,6 +52,7 @@ iomap_page_create(struct inode *inode, struct page *page)
 	iop = kmalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
 	atomic_set(&iop->read_count, 0);
 	atomic_set(&iop->write_count, 0);
+	spin_lock_init(&iop->uptodate_lock);
 	bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE);
 
 	/*
@@ -139,25 +141,38 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
 }
 
 static void
-iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 {
 	struct iomap_page *iop = to_iomap_page(page);
 	struct inode *inode = page->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
-	unsigned int i;
 	bool uptodate = true;
+	unsigned long flags;
+	unsigned int i;
 
-	if (iop) {
-		for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
-			if (i >= first && i <= last)
-				set_bit(i, iop->uptodate);
-			else if (!test_bit(i, iop->uptodate))
-				uptodate = false;
-		}
+	spin_lock_irqsave(&iop->uptodate_lock, flags);
+	for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
+		if (i >= first && i <= last)
+			set_bit(i, iop->uptodate);
+		else if (!test_bit(i, iop->uptodate))
+			uptodate = false;
 	}
 
-	if (uptodate && !PageError(page))
+	if (uptodate)
+		SetPageUptodate(page);
+	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
+}
+
+static void
+iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+{
+	if (PageError(page))
+		return;
+
+	if (page_has_private(page))
+		iomap_iop_set_range_uptodate(page, off, len);
+	else
 		SetPageUptodate(page);
 }
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
  2019-12-03 19:09         ` Christoph Hellwig
@ 2019-12-04 14:43           ` Jan Stancek
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Stancek @ 2019-12-04 14:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-xfs, linux-fsdevel, darrick wong, linuxppc-dev,
	Memory Management, LTP Mailing List, Linux Stable maillist,
	CKI Project, Michael Ellerman


----- Original Message -----
> Please try the patch below:

I ran reproducer for 18 hours on 2 systems were it previously reproduced,
there were no crashes / SIGBUS.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later
  2019-12-03 12:50       ` [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later Jan Stancek
  2019-12-03 13:07         ` Christoph Hellwig
  2019-12-03 19:09         ` Christoph Hellwig
@ 2019-12-07  0:02         ` dftxbs3e
  2 siblings, 0 replies; 11+ messages in thread
From: dftxbs3e @ 2019-12-07  0:02 UTC (permalink / raw)
  To: stable

Hello!

I am very happy that someone has found this issue.

I have been suffering from rather random SIGBUS errors in similar
conditions described by the author.

I don't have much troubleshooting information to provide, however, I hit
the issue regularly so I could investigate during that.

How do you debug such an issue? I tried a debugger etc. but besides
crashing with SIGBUS, I couldnt get any other meaningful information.

dftxbs3e

On 12/3/19 1:50 PM, Jan Stancek wrote:
> Hi,
>
> (This bug report is summary from thread [1] with some additions)
>
> User-space binaries on Power9 ppc64le (with 64k pages) on xfs
> filesystem are sporadically hitting SIGBUS:
>
> ---------- 8< ----------
> (gdb) r
> Starting program: /mnt/testarea/ltp/testcases/bin/genasin
>
> Program received signal SIGBUS, Bus error.
> dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> 1362        switch (ph->p_type)
>
> (gdb) p ph
> $1 = (const Elf64_Phdr *) 0x10000040
>
> (gdb) p *ph
> Cannot access memory at address 0x10000040
>
> (gdb) info proc map
> process 1110670
> Mapped address spaces:
>
>            Start Addr           End Addr       Size     Offset objfile
>            0x10000000         0x10010000    0x10000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
>            0x10010000         0x10030000    0x20000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
>        0x7ffff7f90000     0x7ffff7fb0000    0x20000        0x0 [vdso]
>        0x7ffff7fb0000     0x7ffff7fe0000    0x30000        0x0 /usr/lib64/ld-2.30.so
>        0x7ffff7fe0000     0x7ffff8000000    0x20000    0x20000 /usr/lib64/ld-2.30.so
>        0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]
>
> (gdb) x/1x 0x10000040
> 0x10000040:     Cannot access memory at address 0x10000040
> ---------- >8 ----------
>
> When this happens the binary continues to hit SIGBUS until page
> is released, for example by: echo 3 > /proc/sys/vm/drop_caches
>
> The issue goes back to at least v4.19.
>
> I can semi-reliably reproduce it with LTP is installed to /mnt/testarea/ltp by:
> while [ True ]; do
>          echo 3 > /proc/sys/vm/drop_caches
>          rm -f /mnt/testarea/ltp/results/RUNTEST.log /mnt/testarea/ltp/output/RUNTEST.run.log
>          ./runltp -p -d results -l RUNTEST.log -o RUNTEST.run.log -f math
>          grep FAIL /mnt/testarea/ltp/results/RUNTEST.log && exit 1
> done
>
> and some stress activity in other terminal (e.g. kernel build).
> Sometimes in minutes, sometimes in hours. It is not reliable
> enough to get meaningful bisect results.
>
> My theory is that there's a race in iomap. There appear to be
> interleaved calls to iomap_set_range_uptodate() for same page
> with varying offset and length. Each call sees bitmap as _not_
> entirely "uptodate" and hence doesn't call SetPageUptodate().
> Even though each bit in bitmap ends up uptodate by the time
> all calls finish.
>
> For example, with following traces:
>
> iomap_set_range_uptodate()
> ...
>          if (uptodate && !PageError(page))
>                  SetPageUptodate(page);
> +
> +       if (mycheck(page)) {
> +               trace_printk("page: %px, iop: %px, uptodate: %d, !PageError(page): %d, flags: %lx\n", page, iop, uptodate, !PageError(page), page->flags);
> +               trace_printk("first: %u, last: %u, off: %u, len: %u, i: %u\n", first, last, off, len, i);
> +       }
>
> I get:
>           genacos-18471 [057] ....   162.465730: iomap_readpages: mapping: c000003f185a1ab0
>           genacos-18471 [057] ....   162.465732: iomap_page_create: iomap_page_create page: c00c00000fe26180, page->private: 0000000000000000, iop: c000003fc70a19c0, flags: 3ffff800000001
>           genacos-18471 [057] ....   162.465736: iomap_set_range_uptodate: page: c00c00000fe26180, iop: c000003fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 3ffff800002001
>           genacos-18471 [057] ....   162.465736: iomap_set_range_uptodate: first: 1, last: 14, off: 4096, len: 57344, i: 16
>            <idle>-0     [060] ..s.   162.534862: iomap_set_range_uptodate: page: c00c00000fe26180, iop: c000003fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 3ffff800002081
>            <idle>-0     [061] ..s.   162.534862: iomap_set_range_uptodate: page: c00c00000fe26180, iop: c000003fc70a19c0, uptodate: 0, !PageError(page): 1, flags: 3ffff800002081
>            <idle>-0     [060] ..s.   162.534864: iomap_set_range_uptodate: first: 0, last: 0, off: 0, len: 4096, i: 16
>            <idle>-0     [061] ..s.   162.534864: iomap_set_range_uptodate: first: 15, last: 15, off: 61440, len: 4096, i: 16
>
> This page doesn't have Uptodate flag set, which leads to filemap_fault()
> returning VM_FAULT_SIGBUS:
>
> crash> p/x ((struct page *) 0xc00c00000fe26180)->flags
> $1 = 0x3ffff800002032
>
> crash> kmem -g 0x3ffff800002032
> FLAGS: 3ffff800002032
>    PAGE-FLAG       BIT  VALUE
>    PG_error          1  0000002
>    PG_dirty          4  0000010
>    PG_lru            5  0000020
>    PG_private_2     13  0002000
>    PG_fscache       13  0002000
>    PG_savepinned     4  0000010
>    PG_double_map    13  0002000
>
> But iomap_page->uptodate in page->private suggests all bits are uptodate:
>
> crash> p/x ((struct page *) 0xc00c00000fe26180)->private
> $2 = 0xc000003fc70a19c0
>
> crash> p/x ((struct iomap_page *) 0xc000003fc70a19c0)->uptodate
> $3 = {0xffff, 0x0}
>
>
> It appears (after ~4 hours) that I can avoid the problem if I split
> the loop so that bits are checked only after all updates are made.
> Not sure if this correct approach, or just making it less reproducible:
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index e25901ae3ff4..abe37031c93d 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -131,7 +131,11 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
>                  for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
>                          if (i >= first && i <= last)
>                                  set_bit(i, iop->uptodate);
> -                       else if (!test_bit(i, iop->uptodate))
> +               }
> +               for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
> +                       if (i >= first && i <= last)
> +                               continue;
> +                       if (!test_bit(i, iop->uptodate))
>                                  uptodate = false;
>                  }
>          }
>
> Thanks,
> Jan
>
> [1] https://lore.kernel.org/stable/1420623640.14527843.1575289859701.JavaMail.zimbra@redhat.com/T/#u
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-30  5:26 ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue) CKI Project
2019-11-30 21:56 ` Jan Stancek
2019-12-02  5:46   ` Michael Ellerman
2019-12-02 12:30     ` Jan Stancek
2019-12-03 12:50       ` [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later Jan Stancek
2019-12-03 13:07         ` Christoph Hellwig
2019-12-03 14:35           ` Jan Stancek
2019-12-03 16:08             ` Darrick J. Wong
2019-12-03 19:09         ` Christoph Hellwig
2019-12-04 14:43           ` Jan Stancek
2019-12-07  0:02         ` dftxbs3e

Stable Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ https://lore.kernel.org/stable \
		stable@vger.kernel.org
	public-inbox-index stable

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.stable


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git