stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Ellerman <mpe@ellerman.id.au>
To: Jan Stancek <jstancek@redhat.com>, CKI Project <cki-project@redhat.com>
Cc: Linux Stable maillist <stable@vger.kernel.org>,
	Memory Management <mm-qe@redhat.com>,
	LTP Mailing List <ltp@lists.linux.it>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue)
Date: Mon, 02 Dec 2019 16:46:40 +1100	[thread overview]
Message-ID: <8736e3ffen.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <1738119916.14437244.1575151003345.JavaMail.zimbra@redhat.com>

Hi Jan,

Jan Stancek <jstancek@redhat.com> writes:
> ----- Original Message -----
>> 
>> Hello,
>> 
>> We ran automated tests on a recent commit from this kernel tree:
>> 
>>        Kernel repo:
>>        git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
>>             Commit: 3b5f97139acc - KVM: PPC: Book3S HV: Flush link stack on
>>             guest exit to host kernel

I can't find this commit, I assume it's roughly the same as:

  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-5.3.y&id=0815f75f90178bc7e1933cf0d0c818b5f3f5a20c

>> The results of these automated tests are provided below.
>> 
>>     Overall result: FAILED (see details below)
>>              Merge: OK
>>            Compile: OK
>>              Tests: FAILED
>> 
>> All kernel binaries, config files, and logs are available for download here:
>> 
>>   https://artifacts.cki-project.org/pipelines/314344
>> 
>> One or more kernel tests failed:
>> 
>>     ppc64le:
>>      ❌ LTP
>
> I suspect kernel bug.

Looks that way, but I can't reproduce it on a machine here.

I have the same CPU revision and am booting the exact kernel binary &
modules linked above.

> There were couple of 'math' runtest related failures in recent couple days.
> In all cases, some data file used by test was missing. Presumably because
> binary that generates it crashed.
>
> I managed to reproduce one failure with this CKI build, which I believe
> is the same problem.
>
> We crash early during load, before any LTP code runs:
>
> (gdb) r
> Starting program: /mnt/testarea/ltp/testcases/bin/genasin

What is this /mnt/testarea? Looks like it's setup by some of the beaker
scripts or something?

I'm running LTP out of /home, which is ext4 directly on disk.

I tried getting the tests-beaker stuff working on my machine, but I
couldn't find all the libraries and so on it requires.


> Program received signal SIGBUS, Bus error.
> dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> 1362        switch (ph->p_type)
> (gdb) bt
> #0  dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> #1  0x00007ffff7fcf3c8 in _dl_sysdep_start (start_argptr=<optimized out>, dl_main=0x7ffff7fb37b0 <dl_main>) at ../elf/dl-sysdep.c:253
> #2  0x00007ffff7fb1d1c in _dl_start_final (arg=arg@entry=0x7fffffffee20, info=info@entry=0x7fffffffe870) at rtld.c:445
> #3  0x00007ffff7fb2f5c in _dl_start (arg=0x7fffffffee20) at rtld.c:537
> #4  0x00007ffff7fb14d8 in _start () from /lib64/ld64.so.2
> (gdb) f 0
> #0  dl_main (phdr=0x10000040, phnum=<optimized out>, user_entry=0x7fffffffe760, auxv=<optimized out>) at rtld.c:1362
> 1362        switch (ph->p_type)
> (gdb) l
> 1357      /* And it was opened directly.  */
> 1358      ++main_map->l_direct_opencount;
> 1359
> 1360      /* Scan the program header table for the dynamic section.  */
> 1361      for (ph = phdr; ph < &phdr[phnum]; ++ph)
> 1362        switch (ph->p_type)
> 1363          {
> 1364          case PT_PHDR:
> 1365            /* Find out the load address.  */
> 1366            main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;
>
> (gdb) p ph
> $1 = (const Elf64_Phdr *) 0x10000040
>
> (gdb) p *ph
> Cannot access memory at address 0x10000040
>
> (gdb) info proc map
> process 1110670
> Mapped address spaces:
>
>           Start Addr           End Addr       Size     Offset objfile
>           0x10000000         0x10010000    0x10000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
>           0x10010000         0x10030000    0x20000        0x0 /mnt/testarea/ltp/testcases/bin/genasin
>       0x7ffff7f90000     0x7ffff7fb0000    0x20000        0x0 [vdso]
>       0x7ffff7fb0000     0x7ffff7fe0000    0x30000        0x0 /usr/lib64/ld-2.30.so
>       0x7ffff7fe0000     0x7ffff8000000    0x20000    0x20000 /usr/lib64/ld-2.30.so
>       0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]
>
> (gdb) x/1x 0x10000040
> 0x10000040:     Cannot access memory at address 0x10000040

Yeah that's weird.

> # /mnt/testarea/ltp/testcases/bin/genasin
> Bus error (core dumped)
>
> However, as soon as I copy that binary somewhere else, it works fine:
>
> # cp /mnt/testarea/ltp/testcases/bin/genasin /tmp
> # /tmp/genasin
> # echo $?
> 0

Is /tmp a real disk or tmpfs?

cheers

> # cp /mnt/testarea/ltp/testcases/bin/genasin /mnt/testarea/ltp/testcases/bin/genasin2
> # /mnt/testarea/ltp/testcases/bin/genasin2
> # echo $?
> 0
>
> # /mnt/testarea/ltp/testcases/bin/genasin
> Bus error (core dumped)
>
> # diff /mnt/testarea/ltp/testcases/bin/genasin /mnt/testarea/ltp/testcases/bin/genasin2; echo $?
> 0
>
> # lscpu
> Architecture:                    ppc64le
> Byte Order:                      Little Endian
> CPU(s):                          160
> On-line CPU(s) list:             0-159
> Thread(s) per core:              4
> Core(s) per socket:              20
> Socket(s):                       2
> NUMA node(s):                    2
> Model:                           2.2 (pvr 004e 1202)
> Model name:                      POWER9, altivec supported
> Frequency boost:                 enabled
> CPU max MHz:                     3800.0000
> CPU min MHz:                     2166.0000
> L1d cache:                       1.3 MiB
> L1i cache:                       1.3 MiB
> L2 cache:                        10 MiB
> L3 cache:                        200 MiB
> NUMA node0 CPU(s):               0-79
> NUMA node8 CPU(s):               80-159
> Vulnerability Itlb multihit:     Not affected
> Vulnerability L1tf:              Not affected
> Vulnerability Mds:               Not affected
> Vulnerability Meltdown:          Mitigation; RFI Flush, L1D private per thread
> Vulnerability Spec store bypass: Mitigation; Kernel entry/exit barrier (eieio)
> Vulnerability Spectre v1:        Mitigation; __user pointer sanitization, ori31 speculation barrier enabled
> Vulnerability Spectre v2:        Mitigation; Indirect branch cache disabled, Software link stack flush
> Vulnerability Tsx async abort:   Not affected

  reply	other threads:[~2019-12-02  5:46 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-30  5:26 ❌ FAIL: Test report for kernel 5.3.13-3b5f971.cki (stable-queue) CKI Project
2019-11-30 21:56 ` Jan Stancek
2019-12-02  5:46   ` Michael Ellerman [this message]
2019-12-02 12:30     ` Jan Stancek
2019-12-03 12:50       ` [bug] userspace hitting sporadic SIGBUS on xfs (Power9, ppc64le), v4.19 and later Jan Stancek
2019-12-03 13:07         ` Christoph Hellwig
2019-12-03 14:35           ` Jan Stancek
2019-12-03 16:08             ` Darrick J. Wong
2019-12-03 19:09         ` Christoph Hellwig
2019-12-04 14:43           ` Jan Stancek
2019-12-07  0:02         ` dftxbs3e

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8736e3ffen.fsf@mpe.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=cki-project@redhat.com \
    --cc=jstancek@redhat.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=ltp@lists.linux.it \
    --cc=mm-qe@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).