From: Ard Biesheuvel <ardb@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Linux ARM <linux-arm-kernel@lists.infradead.org>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	Will Deacon <will@kernel.org>,
	Jeremy Linton <jeremy.linton@arm.com>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Nicolas Saenz Julienne <nsaenzjulienne@suse.de>,
	Rob Herring <robh+dt@kernel.org>, Christoph Hellwig <hch@lst.de>,
	Robin Murphy <robin.murphy@arm.com>,
	Hanjun Guo <guohanjun@huawei.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>
Subject: Re: [PATCH] arm64: mm: set ZONE_DMA size based on early IORT scan
Date: Mon, 12 Oct 2020 17:55:45 +0200
Message-ID: <CAMj1kXFKRZ-eHtvqxZ84RSVcY8LQgkv1Vh6w8CvsWyOO-qJcuA@mail.gmail.com>
In-Reply-To: <20201012154954.GB6493@gaia>

On Mon, 12 Oct 2020 at 17:50, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Mon, Oct 12, 2020 at 04:19:08PM +0200, Ard Biesheuvel wrote:
> > On Mon, 12 Oct 2020 at 13:24, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Mon, Oct 12, 2020 at 12:43:05PM +0200, Ard Biesheuvel wrote:
> > > > On Mon, 12 Oct 2020 at 11:30, Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > > On Mon, 12 Oct 2020 at 11:28, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > > > On Sat, Oct 10, 2020 at 11:31:53AM +0200, Ard Biesheuvel wrote:
> > > > > > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > > > > > > index f0599ae73b8d..829fa63c3d72 100644
> > > > > > > --- a/arch/arm64/mm/init.c
> > > > > > > +++ b/arch/arm64/mm/init.c
> > > > > > > @@ -191,6 +191,14 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
> > > > > > >       unsigned long max_zone_pfns[MAX_NR_ZONES]  = {0};
> > > > > > >
> > > > > > >  #ifdef CONFIG_ZONE_DMA
> > > > > > > +     if (IS_ENABLED(CONFIG_ACPI)) {
> > > > > > > +             extern unsigned int acpi_iort_get_zone_dma_size(void);
> > > > > >
> > > > > > Nitpick: can we add this prototype to include/linux/acpi_iort.h?
> > > > > >
> > > > > > > +
> > > > > > > +             zone_dma_bits = min(zone_dma_bits,
> > > > > > > +                                 acpi_iort_get_zone_dma_size());
> > > > > > > +             arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
> > > > > > > +     }
> > > > > > > +
> > > > > > >       max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
> > > > > >
> > > > > > I think we should initialise zone_dma_bits slightly earlier via
> > > > > > arm64_memblock_init(). We'll eventually have reserve_crashkernel()
> > > > > > called before this and it will make use of arm64_dma_phys_limit for
> > > > > > "low" reservations:
> > > > > >
> > > > > > https://lore.kernel.org/linux-arm-kernel/20200907134745.25732-7-chenzhou10@huawei.com/
> > > > > >
> > > > >
> > > > > We don't have access to the ACPI tables yet at that point.
> > > >
> > > > Also, could someone give an executive summary of why it matters where
> > > > the crashkernel is loaded? As far as I can tell, reserve_crashkernel()
> > > > only allocates memory for the kernel's executable image itself, which
> > > > can usually be loaded anywhere in memory. I could see how a
> > > > crashkernel might need some DMA'able memory if it needs to use the
> > > > hardware, but I don't think that is what is going on here.
> > >
> > > I thought the crashkernel needs some additional reserved RAM as well to
> > > be able to run. It should not touch the original kernel's memory as it
> > > usually needs to dump it.
> >
> > Looking at the code, it is definitely allocating memory for the kernel
> > itself (as it refers to the 2 MB alignment requirement), and given
> > that we used to require the kernel to be at the base of the linear
> > region to even be able to access all of memory, I suspect that we
> > might be able to relax this requirement. Not sure what that means for
> > the userland tools, though.
>
> The 2MB is an interpretation of booting.txt that the DRAM must start at
> this alignment (not sure what we do these days, in lots of
> configurations we just use 4K pages for the linear map).
>

On 4k granule kernels, we still need 2 MB alignment today unless you
use a relocatable kernel. The reason is that virtual addresses are
assigned at link time, and we use section mappings to map the kernel.
If CONFIG_RELOCATABLE=y, the kernel can run happily at any 64k aligned
address (except for the 64k granule kernel with CONFIG_VMAP_STACK=y,
which needs 128k in this case).

So keeping a 2 MB alignment requirement in booting.txt still makes sense.
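
To make the rules above concrete, here is a rough sketch in plain C. It is
not kernel code: min_image_align() and load_addr_ok() are hypothetical
helpers, and the sketch assumes that every non-relocatable configuration
falls back to the 2 MB requirement from booting.txt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_64K	0x10000UL
#define SZ_128K	0x20000UL
#define SZ_2M	0x200000UL

/*
 * Hypothetical helper (not a kernel API): minimum physical load
 * alignment of the image, following the rules described above.
 * Assumption: any non-relocatable kernel needs 2 MB.
 */
static unsigned long min_image_align(bool relocatable, bool granule_64k,
				     bool vmap_stack)
{
	if (!relocatable)
		return SZ_2M;
	if (granule_64k && vmap_stack)
		return SZ_128K;
	return SZ_64K;
}

/* True if addr is a multiple of align; align must be a power of two. */
static bool load_addr_ok(uint64_t addr, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}
```

For example, a load address of 0x80010000 would be rejected for a
non-relocatable kernel but accepted for a relocatable 4k granule one.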

> However, the crashkernel=... range is meant for sufficiently large
> reservation to be able to run the kdump kernel, not just load the image.
>

Sure. But I was referring to the requirement that it is loaded low in
memory. Unless I am misunderstanding something, all we need for the
crashkernel to be able to operate is some ZONE_DMA memory in case it
is needed by the hardware, and beyond that, it could happily live
anywhere in memory.
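
For reference, the ZONE_DMA limit computed in the quoted hunk can be
modelled in userspace roughly as follows. This is only a sketch under
assumptions: model_max_zone_phys() approximates what I understand
max_zone_phys() to do, it takes the DRAM bounds as parameters instead of
querying memblock, and all names here are hypothetical.

```c
#include <stdint.h>

/* Smaller of two bit widths, standing in for the kernel's min(). */
static unsigned int min_bits(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Userspace model (assumption, not the kernel function): upper physical
 * bound of a zone that devices with zone_bits address bits can reach,
 * for DRAM spanning [dram_start, dram_end). Requires zone_bits < 64.
 */
static uint64_t model_max_zone_phys(unsigned int zone_bits,
				    uint64_t dram_start, uint64_t dram_end)
{
	/* Round the DRAM base down to a multiple of 2^zone_bits. */
	uint64_t offset = dram_start & ~((1ULL << zone_bits) - 1);
	uint64_t limit = offset + (1ULL << zone_bits);

	return limit < dram_end ? limit : dram_end;
}
```

A crashkernel that only needs some DMA'able memory would then need a
small reservation below this limit, with the rest placed anywhere.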


