From: Palmer Dabbelt <palmer@sifive.com>
To: schwab@suse.de
Cc: linux-riscv@lists.infradead.org,
David Abdurachmanov <david.abdurachmanov@gmail.com>,
opensbi@lists.infradead.org,
Paul Walmsley <paul.walmsley@sifive.com>
Subject: Re: Random memory corruption with v5.2
Date: Thu, 01 Aug 2019 19:00:07 -0700 (PDT)
Message-ID: <mhng-780916c8-0f2d-4487-b55c-2b1236e8778b@palmer-si-x1c4>
In-Reply-To: <mvmwofw68ji.fsf@suse.de>
On Thu, 01 Aug 2019 11:32:33 PDT (-0700), schwab@suse.de wrote:
> On Jul 30 2019, Paul Walmsley <paul.walmsley@sifive.com> wrote:
>
>> On Tue, 30 Jul 2019, Andreas Schwab wrote:
>>
>>> On Jul 30 2019, David Abdurachmanov <david.abdurachmanov@gmail.com> wrote:
>>>
>>> > On Mon, Jul 29, 2019 at 1:51 PM Andreas Schwab <schwab@suse.de> wrote:
>>> >>
>>> >> Since switching to 5.2 kernels I'm seeing random crashes and
>>> >> misbehaviors on the HiFive, for example while building gcc or glibc.
>>> >> Perhaps missing TLB flushes?
>>> >
>>> > Do you have some examples of crashes?
>>>
>>> While building glibc:
>>>
>>> an_ES.UTF-8...realloc(): invalid pointer
>>> /bin/sh: line 1: 7841 Aborted (core dumped) I18NPATH=. GCONV_PATH=/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/iconvdata LC_ALL=C /home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/elf/ld-linux-riscv64-lp64d.so.1 --library-path /home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/math:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/elf:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/dlfcn:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/nss:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/nis:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/rt:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/resolv:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/mathvec:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/support:/home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/nptl /home/abuild/rpmbuild/BUILD/glibc-2.29/cc-base/locale/localedef $flags --alias-file=../intl/locale.alias -i locales/$input -f charmaps/$charset --prefix=/home/abuild/rpmbuild/BUILDROOT/glibc-2.29-0.riscv64
>>>
>>> make[2]: *** [Makefile:422: install-archive-an_ES.UTF-8/UTF-8] Error 134
>>>
>>> While building gcc:
>>>
>>> ../../gcc/ada/exp_aggr.adb: In function 'Exp_Aggr.Expand_N_Aggregate':
>>> ../../gcc/ada/exp_aggr.adb:5311:21: warning: 'Csiz' may be used uninitialized in this function [-Wmaybe-uninitialized]
>>> ../../gcc/ada/exp_aggr.adb:5220:10: note: 'Csiz' was declared here
>>> +===========================GNAT BUG DETECTED==============================+
>>> | 10.0.0 20190727 (experimental) [trunk revision 273844] (riscv64-suse-linux) |
>>> | Storage_Error stack overflow or erroneous memory access |
>>> | Error detected at output.ads:39:8 |
>>> realloc(): invalid pointer
>>
>> I personally haven't seen these issues; but then again, I haven't done any
>> glibc or gcc builds on v5.2. Will take a closer look.
>
> I think there is some fundamental problem with SBI_REMOTE_SFENCE_VMA or
> the kernel interface to it.
>
> For example, flush_tlb_page is defined as:
>
> #define flush_tlb_page(vma, addr) flush_tlb_range(vma, addr, 0)
>
> But the third argument of flush_tlb_range is supposed to be the end
> address, so this should actually be:
>
> #define flush_tlb_page(vma, addr) flush_tlb_range(vma, addr, (addr) + PAGE_SIZE)
>
> Alas, that doesn't fix the crashes.
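
To make the end-address point concrete, here is a minimal sketch in plain C.
It assumes the range flush derives the size of the request as end - start in
unsigned arithmetic; that is an assumption about the implementation, not
something stated in this thread. range_size(), page_flush_size_broken(),
page_flush_size_fixed() and SKETCH_PAGE_SIZE are hypothetical names used only
for illustration.

```c
/* Hypothetical page size for this sketch; the kernel's PAGE_SIZE may differ. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: size of a flush request given a start address and an
 * END address, assuming the size is computed as end - start (unsigned).
 */
unsigned long range_size(unsigned long start, unsigned long end)
{
	return end - start;
}

/* Broken form: passes 0 as the end address, like the original macro. */
unsigned long page_flush_size_broken(unsigned long addr)
{
	return range_size(addr, 0);
}

/* Fixed form: passes addr + PAGE_SIZE as the end address. */
unsigned long page_flush_size_fixed(unsigned long addr)
{
	return range_size(addr, addr + SKETCH_PAGE_SIZE);
}
```

Under that assumption, the fixed form describes exactly one page for any
address, while the broken form wraps around to 0 - addr, so the request no
longer describes the single page that was meant.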

This line of reasoning smells like it'd find the issue: BBL just flushes the
entire TLB every time, but IIRC OpenSBI respects the ranges. It looks like

    Fixes: 90cb4917b584 ("lib: Implement sfence.vma correctly.")

is what introduced the new behavior in OpenSBI, which may have triggered a lot
of latent bugs in Linux. If you have an easy way to compile OpenSBI, does
something like

$ git diff | cat
diff --git a/lib/sbi/sbi_tlb.c b/lib/sbi/sbi_tlb.c
index cffda52d66ab..007266b1f970 100644
--- a/lib/sbi/sbi_tlb.c
+++ b/lib/sbi/sbi_tlb.c
@@ -133,50 +133,12 @@ static void sbi_tlb_flush_all(void)
 
 static void sbi_tlb_fifo_sfence_vma(struct sbi_tlb_info *tinfo)
 {
-	unsigned long start = tinfo->start;
-	unsigned long size  = tinfo->size;
-	unsigned long i;
-
-	if ((start == 0 && size == 0) || (size == SBI_TLB_FLUSH_ALL)) {
-		sbi_tlb_flush_all();
-		return;
-	}
-
-	for (i = 0; i < size; i += PAGE_SIZE) {
-		__asm__ __volatile__("sfence.vma %0"
-				     :
-				     : "r"(start + i)
-				     : "memory");
-	}
+	sbi_tlb_flush_all();
 }
 
 static void sbi_tlb_fifo_sfence_vma_asid(struct sbi_tlb_info *tinfo)
 {
-	unsigned long start = tinfo->start;
-	unsigned long size  = tinfo->size;
-	unsigned long asid  = tinfo->asid;
-	unsigned long i;
-
-	if (start == 0 && size == 0) {
-		sbi_tlb_flush_all();
-		return;
-	}
-
-	/* Flush entire MM context for a given ASID */
-	if (size == SBI_TLB_FLUSH_ALL) {
-		__asm__ __volatile__("sfence.vma x0, %0"
-				     :
-				     : "r"(asid)
-				     : "memory");
-		return;
-	}
-
-	for (i = 0; i < size; i += PAGE_SIZE) {
-		__asm__ __volatile__("sfence.vma %0, %1"
-				     :
-				     : "r"(start + i), "r"(asid)
-				     : "memory");
-	}
+	sbi_tlb_flush_all();
 }
 
 void sbi_tlb_fifo_process(struct sbi_scratch *scratch, u32 event)

cause the issue to go away? If so, then I'd bet we need to scour Linux for
broken TLB flushing: given that the one you found is pretty obvious, I'd bet
there are a lot more...
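
For readers who want to see what the experiment changes, the two policies can
be modelled without any RISC-V hardware. This is only a sketch: it mirrors the
control flow of the non-ASID function removed in the diff, but counts the
flushes instead of executing sfence.vma. struct flush_plan, plan_ranged(),
plan_always_all(), SKETCH_PAGE_SIZE and SKETCH_FLUSH_ALL are hypothetical
names, and the values chosen for the two constants are assumptions.

```c
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_FLUSH_ALL (~0UL) /* hypothetical stand-in for SBI_TLB_FLUSH_ALL */

struct flush_plan {
	int flush_all;          /* 1 if the entire TLB would be flushed */
	unsigned long per_page; /* otherwise, number of per-page flushes */
};

/* Ranged policy: same decisions as the removed sbi_tlb_fifo_sfence_vma body. */
struct flush_plan plan_ranged(unsigned long start, unsigned long size)
{
	struct flush_plan p = { 0, 0 };

	if ((start == 0 && size == 0) || size == SKETCH_FLUSH_ALL) {
		p.flush_all = 1;
		return p;
	}

	/*
	 * Equals the iteration count of
	 * "for (i = 0; i < size; i += PAGE_SIZE)" when i does not overflow.
	 */
	p.per_page = size / SKETCH_PAGE_SIZE + (size % SKETCH_PAGE_SIZE != 0);
	return p;
}

/* Policy after the experimental patch: every request flushes everything. */
struct flush_plan plan_always_all(unsigned long start, unsigned long size)
{
	struct flush_plan p = { 1, 0 };

	(void)start;
	(void)size;
	return p;
}
```

One case worth noting in this model: a request with a nonzero start and a
size of zero flushes nothing at all under the ranged policy, whereas the
always-flush policy hides any such mistake in the caller.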
>
> Andreas.
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv