From: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
To: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: coherency issue observed after hotplug on POWER8
Date: Fri, 24 Sep 2021 22:47:09 +0530
Message-ID: <1632500323.sp1p885nv8.naveen@linux.ibm.com>
In-Reply-To: <YUpIqytZqpohq4EM@mussarela>
Hi Cascardo,
Thanks for reporting this.
Thadeu Lima de Souza Cascardo wrote:
> Hi, there.
>
> We have been investigating an issue we have observed on POWER8 POWERNV systems.
> When running the kernel selftests reuseport_bpf_cpu after a CPU hotplug, we see
> crashes, in different forms. [1]
Just to confirm: you are seeing this only on P8 powernv, and not in a
P8 guest/LPAR? I haven't been able to reproduce this on a Firestone --
can you share more details about your POWER8 machine?
Also, do you see this only with Ubuntu kernels, or can you also
reproduce it with the upstream tree?
>
> I managed to get xmon on that trap, and did some debugging. [2] I tried to dump
> the BPF JIT code, and it looks different when dumped from CPU#0 and CPU#0x9f
> (the one that was hotplugged, offlined, then onlined).
Next time you reproduce this, can you try dumping the SLBs for both CPUs
(command 'u' in xmon)?
>
> Here is my partial analysis [3]. Basically, the BPF JIT fills a page with
> invalid instructions (traps, in ppc64 case), and puts the BPF program in a
> random offset of the page. In the case of the hotplugged CPU, which was the one
> that compiled the program, the page had the expected contents (BPF program
> started at the offset used to run the program). On the other CPU (in many
> cases, CPU #0), the same memory address/page had different contents, with the
> program starting at a different offset.
From [3], I think fp->aux->jit_data can be NULL if there are subprogs.
But, I find it interesting that you don't always see the correct
bpf_func, as reported in comment #25. Can you also try dumping the full
bpf_prog structure (prog/fp) from xmon?
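For anyone following along, the layout described in the quoted analysis
can be sketched in userspace C roughly as below. This is only an
illustration under stated assumptions, not the kernel's code: every
sketch_* name is hypothetical, the 4096-byte page size is illustrative,
and 0x7fe00008 is assumed to be the trap encoding used as the fill value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u        /* assumption: illustrative page size */
#define SKETCH_TRAP_INSN 0x7fe00008u  /* assumption: ppc "trap" as fill value */
#define SKETCH_ALIGN     4u           /* ppc instructions are 4 bytes wide */
#define SKETCH_NWORDS    (SKETCH_PAGE_SIZE / sizeof(uint32_t))

/* Pick a 4-byte-aligned byte offset that leaves room for prog_len bytes. */
static size_t sketch_pick_offset(size_t prog_len, unsigned int rnd)
{
	size_t hole = SKETCH_PAGE_SIZE - prog_len;

	return (rnd % hole) & ~(size_t)(SKETCH_ALIGN - 1);
}

/*
 * Fill the page with traps, then copy the program to an offset derived
 * from rnd. Returns the byte offset at which the program starts.
 */
static size_t sketch_place(uint32_t *page, const uint32_t *prog,
			   size_t prog_len, unsigned int rnd)
{
	size_t i, off;

	/* The program must be whole instructions and smaller than a page. */
	assert(prog_len % SKETCH_ALIGN == 0 && prog_len < SKETCH_PAGE_SIZE);

	for (i = 0; i < SKETCH_NWORDS; i++)
		page[i] = SKETCH_TRAP_INSN;

	off = sketch_pick_offset(prog_len, rnd);
	memcpy((uint8_t *)page + off, prog, prog_len);
	return off;
}

/* Byte offset of the first non-trap word, or -1 if the page is all traps. */
static long sketch_find_start(const uint32_t *page)
{
	size_t i;

	for (i = 0; i < SKETCH_NWORDS; i++)
		if (page[i] != SKETCH_TRAP_INSN)
			return (long)(i * sizeof(uint32_t));
	return -1;
}
```

With this layout, a branch to the offset recorded for one copy of the
page lands on a trap if another CPU sees a copy in which the program
sits at a different offset, which matches the symptom described above.
Running sketch_find_start() over the page as dumped from each CPU is one
way to compare where the program starts in each view.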
>
> Is this a case of a bug in the micro-architecture or the firmware when
> doing the hotplug? Can someone chime in?
It's possible that something is going wrong when offlining the CPU. Can
you try booting the kernel with 'powersave=off' and see if the problem
goes away?
>
> Notice that we can't reproduce the same issue on a POWER9 system.
>
> Thanks.
> Cascardo.
>
> [1] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076
> [2] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076/comments/29
> [3] https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1927076/comments/30
>
- Naveen