All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexandre Ghiti <alexghiti@rivosinc.com>
To: Conor Dooley <conor.dooley@microchip.com>
Cc: Song Shuai <suagrfillet@gmail.com>,
	robh@kernel.org, Andrew Jones <ajones@ventanamicro.com>,
	anup@brainfault.org, palmer@rivosinc.com,
	jeeheng.sia@starfivetech.com, leyfoon.tan@starfivetech.com,
	mason.huo@starfivetech.com,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Guo Ren <guoren@kernel.org>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: Bug report: kernel paniced when system hibernates
Date: Wed, 17 May 2023 16:55:53 +0200	[thread overview]
Message-ID: <CAHVXubhMLgb54_7zV2yFuGPoMKCkUXwozHbDvghc7kQqNLK-JA@mail.gmail.com> (raw)
In-Reply-To: <20230517-preacher-primer-f41020b3376a@wendy>

On Wed, May 17, 2023 at 1:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> Hey Alex,
>
> On Wed, May 17, 2023 at 10:58:02AM +0200, Alexandre Ghiti wrote:
> > On Tue, May 16, 2023 at 1:12 PM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
>
> > > On Tue, May 16, 2023 at 11:24 AM Song Shuai <suagrfillet@gmail.com> wrote:
> > > I actually removed this flag a few years ago, and I have to admit that
> > > I need to check if that's necessary: the goal of commit 3335068f8721
> > > ("riscv: Use PUD/P4D/PGD pages for the linear mapping") is to expose
> > > the "right" start of DRAM so that we can align virtual and physical
> > > addresses on a 1GB boundary.
> > >
> > > So I have to check if a nomap region is actually added as a
> > > memblock.memory.regions[] or not: if yes, that's perfect, let's add
> > > the nomap attributes to the PMP regions, otherwise, I don't think that
> > > is a good solution.
> >
> > So here is the current linear mapping without nomap in openSBI:
> >
> > ---[ Linear mapping ]---
> > 0xff60000000000000-0xff60000000200000    0x0000000080000000         2M
> > PMD     D A G . . W R V
> > 0xff60000000200000-0xff60000000e00000    0x0000000080200000        12M
> > PMD     D A G . . . R V
> >
> > And below the linear mapping with nomap in openSBI:
> >
> > ---[ Linear mapping ]---
> > 0xff60000000080000-0xff60000000200000    0x0000000080080000      1536K
> > PTE     D A G . . W R V
> > 0xff60000000200000-0xff60000000e00000    0x0000000080200000        12M
> > PMD     D A G . . . R V
> >
> > So adding nomap does not misalign virtual and physical addresses, it
> > prevents the usage of 1GB page for this area though, so that's a
> > solution, we just lose this 1GB page here.
> >
> > But even though that may be the fix, I think we also need to fix that
> > in the kernel as it would break compatibility with certain versions of
> > openSBI *if* we fix openSBI...So here are a few solutions:
> >
> > 1. we can mark all "mmode_resv" nodes in the device tree as nomap,
> > before the linear mapping is established (IIUC, those nodes are added
> > by openSBI to advertise PMP regions)
> >     -> This amounts to the same fix as opensbi and we lose the 1GB hugepage.
>
> AFAIU, losing the 1 GB hugepage is a regression, which would make this
> not an option, right?

Not sure this is a real regression, I'd rather avoid it, but as
mentioned in my first answer, Mike Rapoport showed that it was making
no difference performance-wise...

>
> > 2. we can tweak pfn_is_nosave function to *not* save pfn corresponding
> > to PMP regions
> >     -> We don't lose the 1GB hugepage \o/
> > 3. we can use register_nosave_region() to not save the "mmode_resv"
> > regions (x86 does that
> > https://elixir.bootlin.com/linux/v6.4-rc1/source/arch/x86/kernel/e820.c#L753)
> >     -> We don't lose the 1GB hugepage \o/
> > 4. Given JeeHeng pointer to
> > https://elixir.bootlin.com/linux/v6.4-rc1/source/kernel/power/snapshot.c#L1340,
> > we can mark those pages as non-readable and make the hibernation
> > process not save those pages
> >     -> Very late-in-the-day idea, not sure what it's worth, we also
> > lose the 1GB hugepage...
>
> Ditto here re: introducing another regression.
>
> > To me, the best solution is 3 as it would prepare for other similar
> > issues later, it is similar to x86 and it allows us to keep 1GB
> > hugepages.
> >
> > I have been thinking, and to me nomap does not provide anything since
> > the kernel should not address this memory range, so if it does, we
> > must fix the kernel.
> >
> > Let me know what you all think, I'll be preparing a PoC of 3 in the meantime!
>
> #3 would probably get my vote too. It seems like you could use it
> dynamically if there was to be a future other provider of "mmode_resv"
> regions, rather than doing something location-specific.
>
> We should probably document these opensbi reserved memory nodes though
> in a dt-binding or w/e if we are going to be relying on them to not
> crash!

Yes, you're right, let's see what Atish and Anup think!

Thanks for your quick answers Conor and Song, really appreciated!

Alex

>
> Thanks for working on this,
> Conor.
>

WARNING: multiple messages have this Message-ID (diff)
From: Alexandre Ghiti <alexghiti@rivosinc.com>
To: Conor Dooley <conor.dooley@microchip.com>
Cc: Song Shuai <suagrfillet@gmail.com>,
	robh@kernel.org,  Andrew Jones <ajones@ventanamicro.com>,
	anup@brainfault.org, palmer@rivosinc.com,
	 jeeheng.sia@starfivetech.com, leyfoon.tan@starfivetech.com,
	 mason.huo@starfivetech.com,
	Paul Walmsley <paul.walmsley@sifive.com>,
	 Guo Ren <guoren@kernel.org>,
	linux-riscv@lists.infradead.org,  linux-kernel@vger.kernel.org
Subject: Re: Bug report: kernel paniced when system hibernates
Date: Wed, 17 May 2023 16:55:53 +0200	[thread overview]
Message-ID: <CAHVXubhMLgb54_7zV2yFuGPoMKCkUXwozHbDvghc7kQqNLK-JA@mail.gmail.com> (raw)
In-Reply-To: <20230517-preacher-primer-f41020b3376a@wendy>

On Wed, May 17, 2023 at 1:28 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> Hey Alex,
>
> On Wed, May 17, 2023 at 10:58:02AM +0200, Alexandre Ghiti wrote:
> > On Tue, May 16, 2023 at 1:12 PM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
>
> > > On Tue, May 16, 2023 at 11:24 AM Song Shuai <suagrfillet@gmail.com> wrote:
> > > I actually removed this flag a few years ago, and I have to admit that
> > > I need to check if that's necessary: the goal of commit 3335068f8721
> > > ("riscv: Use PUD/P4D/PGD pages for the linear mapping") is to expose
> > > the "right" start of DRAM so that we can align virtual and physical
> > > addresses on a 1GB boundary.
> > >
> > > So I have to check if a nomap region is actually added as a
> > > memblock.memory.regions[] or not: if yes, that's perfect, let's add
> > > the nomap attributes to the PMP regions, otherwise, I don't think that
> > > is a good solution.
> >
> > So here is the current linear mapping without nomap in openSBI:
> >
> > ---[ Linear mapping ]---
> > 0xff60000000000000-0xff60000000200000    0x0000000080000000         2M
> > PMD     D A G . . W R V
> > 0xff60000000200000-0xff60000000e00000    0x0000000080200000        12M
> > PMD     D A G . . . R V
> >
> > And below the linear mapping with nomap in openSBI:
> >
> > ---[ Linear mapping ]---
> > 0xff60000000080000-0xff60000000200000    0x0000000080080000      1536K
> > PTE     D A G . . W R V
> > 0xff60000000200000-0xff60000000e00000    0x0000000080200000        12M
> > PMD     D A G . . . R V
> >
> > So adding nomap does not misalign virtual and physical addresses, it
> > prevents the usage of 1GB page for this area though, so that's a
> > solution, we just lose this 1GB page here.
> >
> > But even though that may be the fix, I think we also need to fix that
> > in the kernel as it would break compatibility with certain versions of
> > openSBI *if* we fix openSBI...So here are a few solutions:
> >
> > 1. we can mark all "mmode_resv" nodes in the device tree as nomap,
> > before the linear mapping is established (IIUC, those nodes are added
> > by openSBI to advertise PMP regions)
> >     -> This amounts to the same fix as opensbi and we lose the 1GB hugepage.
>
> AFAIU, losing the 1 GB hugepage is a regression, which would make this
> not an option, right?

Not sure this is a real regression, I'd rather avoid it, but as
mentioned in my first answer, Mike Rapoport showed that it was making
no difference performance-wise...

>
> > 2. we can tweak pfn_is_nosave function to *not* save pfn corresponding
> > to PMP regions
> >     -> We don't lose the 1GB hugepage \o/
> > 3. we can use register_nosave_region() to not save the "mmode_resv"
> > regions (x86 does that
> > https://elixir.bootlin.com/linux/v6.4-rc1/source/arch/x86/kernel/e820.c#L753)
> >     -> We don't lose the 1GB hugepage \o/
> > 4. Given JeeHeng pointer to
> > https://elixir.bootlin.com/linux/v6.4-rc1/source/kernel/power/snapshot.c#L1340,
> > we can mark those pages as non-readable and make the hibernation
> > process not save those pages
> >     -> Very late-in-the-day idea, not sure what it's worth, we also
> > lose the 1GB hugepage...
>
> Ditto here re: introducing another regression.
>
> > To me, the best solution is 3 as it would prepare for other similar
> > issues later, it is similar to x86 and it allows us to keep 1GB
> > hugepages.
> >
> > I have been thinking, and to me nomap does not provide anything since
> > the kernel should not address this memory range, so if it does, we
> > must fix the kernel.
> >
> > Let me know what you all think, I'll be preparing a PoC of 3 in the meantime!
>
> #3 would probably get my vote too. It seems like you could use it
> dynamically if there was to be a future other provider of "mmode_resv"
> regions, rather than doing something location-specific.
>
> We should probably document these opensbi reserved memory nodes though
> in a dt-binding or w/e if we are going to be relying on them to not
> crash!

Yes, you're right, let's see what Atish and Anup think!

Thanks for your quick answers Conor and Song, really appreciated!

Alex

>
> Thanks for working on this,
> Conor.
>

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2023-05-17 14:56 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-16  9:24 Bug report: kernel paniced when system hibernates Song Shuai
2023-05-16  9:24 ` Song Shuai
2023-05-16  9:55 ` JeeHeng Sia
2023-05-16  9:55   ` JeeHeng Sia
2023-05-16 11:14   ` Alexandre Ghiti
2023-05-16 11:14     ` Alexandre Ghiti
2023-05-16 11:27     ` JeeHeng Sia
2023-05-16 11:27       ` JeeHeng Sia
2023-05-17  8:33       ` Alexandre Ghiti
2023-05-17  8:33         ` Alexandre Ghiti
2023-05-18  4:06         ` JeeHeng Sia
2023-05-18  4:06           ` JeeHeng Sia
2023-05-16 11:12 ` Alexandre Ghiti
2023-05-16 11:12   ` Alexandre Ghiti
2023-05-17  8:58   ` Alexandre Ghiti
2023-05-17  8:58     ` Alexandre Ghiti
2023-05-17 11:05     ` Song Shuai
2023-05-17 11:05       ` Song Shuai
     [not found]       ` <CAHVXubgjgMvFV0MOABbtKr+2TH85+0kow7wOrjxFCP5iXt1saQ@mail.gmail.com>
2023-05-17 14:42         ` Fwd: " Alexandre Ghiti
2023-05-17 14:42           ` Alexandre Ghiti
2023-05-18  1:29           ` JeeHeng Sia
2023-05-18  1:29             ` JeeHeng Sia
2023-05-18  9:13             ` Alexandre Ghiti
2023-05-18  9:13               ` Alexandre Ghiti
2023-05-18  3:29           ` Song Shuai
2023-05-18  3:29             ` Song Shuai
2023-05-18 11:54             ` Alexandre Ghiti
2023-05-18 11:54               ` Alexandre Ghiti
2023-05-17 11:27     ` Conor Dooley
2023-05-17 11:27       ` Conor Dooley
2023-05-17 14:55       ` Alexandre Ghiti [this message]
2023-05-17 14:55         ` Alexandre Ghiti
2023-05-18  6:53         ` Anup Patel
2023-05-18  6:53           ` Anup Patel
2023-05-18  7:59           ` Conor Dooley
2023-05-18  7:59             ` Conor Dooley
2023-05-18  8:41             ` Alexandre Ghiti
2023-05-18  8:41               ` Alexandre Ghiti
2023-05-18 10:35               ` Conor Dooley
2023-05-18 10:35                 ` Conor Dooley
2023-05-18 11:58                 ` Alexandre Ghiti
2023-05-18 11:58                   ` Alexandre Ghiti
2023-05-24 23:53                   ` Atish Patra
2023-05-24 23:53                     ` Atish Patra
2023-05-25 12:55                     ` Conor Dooley
2023-05-25 12:55                       ` Conor Dooley
2023-05-18 12:21                 ` JeeHeng Sia
2023-05-18 12:21                   ` JeeHeng Sia
2023-05-18 12:09           ` Alexandre Ghiti
2023-05-18 12:09             ` Alexandre Ghiti
2023-05-18 14:04             ` Anup Patel
2023-05-18 14:04               ` Anup Patel
2023-05-24 13:49               ` Conor Dooley
2023-05-24 13:49                 ` Conor Dooley
2023-05-24 13:57                 ` Alexandre Ghiti
2023-05-24 13:57                   ` Alexandre Ghiti
2023-05-24 15:59                   ` Conor Dooley
2023-05-24 15:59                     ` Conor Dooley
2023-05-24 23:45               ` Atish Patra
2023-05-24 23:45                 ` Atish Patra
2023-05-25 13:08                 ` Conor Dooley
2023-05-25 13:08                   ` Conor Dooley
2023-05-25 13:21                   ` Anup Patel
2023-05-25 13:21                     ` Anup Patel
2023-05-25 13:37                     ` Conor Dooley
2023-05-25 13:37                       ` Conor Dooley
2023-05-25 13:43                       ` Anup Patel
2023-05-25 13:43                         ` Anup Patel
2023-05-25 13:55                         ` Conor Dooley
2023-05-25 13:55                           ` Conor Dooley
2023-05-25 13:59                           ` Anup Patel
2023-05-25 13:59                             ` Anup Patel
2023-05-25 14:20                             ` Conor Dooley
2023-05-25 14:20                               ` Conor Dooley
2023-05-25 17:39                               ` Atish Patra
2023-05-25 17:39                                 ` Atish Patra
2023-05-25 18:22                                 ` Conor Dooley
2023-05-25 18:22                                   ` Conor Dooley
2023-05-25 18:37                                   ` Atish Patra
2023-05-25 18:37                                     ` Atish Patra
2023-05-25 18:39                                     ` Conor Dooley
2023-05-25 18:39                                       ` Conor Dooley
2023-05-25 20:06                                       ` Atish Patra
2023-05-25 20:06                                         ` Atish Patra
2023-05-25 21:24                                         ` Conor Dooley
2023-05-25 21:24                                           ` Conor Dooley
2023-05-26 13:14                                           ` Alexandre Ghiti
2023-05-26 13:14                                             ` Alexandre Ghiti
2023-05-26 14:59                                             ` Conor Dooley
2023-05-26 14:59                                               ` Conor Dooley
2023-05-26 15:12                                               ` Alexandre Ghiti
2023-05-26 15:12                                                 ` Alexandre Ghiti
2023-05-26 15:17                                                 ` Anup Patel
2023-05-26 15:17                                                   ` Anup Patel
2023-05-26 15:22                                                   ` Alexandre Ghiti
2023-05-26 15:22                                                     ` Alexandre Ghiti
2023-05-26 18:48                                                     ` Atish Patra
2023-05-26 18:48                                                       ` Atish Patra
2023-05-16 11:35 ` Conor Dooley
2023-05-16 11:35   ` Conor Dooley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHVXubhMLgb54_7zV2yFuGPoMKCkUXwozHbDvghc7kQqNLK-JA@mail.gmail.com \
    --to=alexghiti@rivosinc.com \
    --cc=ajones@ventanamicro.com \
    --cc=anup@brainfault.org \
    --cc=conor.dooley@microchip.com \
    --cc=guoren@kernel.org \
    --cc=jeeheng.sia@starfivetech.com \
    --cc=leyfoon.tan@starfivetech.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=mason.huo@starfivetech.com \
    --cc=palmer@rivosinc.com \
    --cc=paul.walmsley@sifive.com \
    --cc=robh@kernel.org \
    --cc=suagrfillet@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.