linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
To: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: catalin.marinas@arm.com, labbott@redhat.com,
	linux-arm-kernel@lists.infradead.org, ard.biesheuvel@linaro.org
Subject: Re: Memory size and section boundary on armv7
Date: Thu, 11 Apr 2019 17:22:10 +0100	[thread overview]
Message-ID: <20190411162210.462hwmwlwrciq3re@shell.armlinux.org.uk> (raw)
In-Reply-To: <20190411155000.GA25323@apalos>

On Thu, Apr 11, 2019 at 06:50:00PM +0300, Ilias Apalodimas wrote:
> Hi Russell,
> 
> > On Thu, Apr 11, 2019 at 06:13:20PM +0300, Ilias Apalodimas wrote:
> > > Hello, 
> > > 
> > > While experimenting with u-boot and booting the kernel as an EFI stub i seem to
> > > hit an issue with memory mapping (map_lowmem).
> > > 
> > > I can only reproduce the crash when using u-boot's dtb (specified in
> > > fdtcontroladdr) and pass that to the kernel, instead of loading an external one
> > > with 'dtb=' on it's command line.
> > > 
> > > If CONFIG_ARM_LPAE=y, everything works fine. If the latter is not selected
> > > though the kernel seems to crash only if dtb is placed on a section boundary.
> > > The reason is crashes is the BUG_ON() on arm_pte_alloc() triggers.
> > > 
> > > fdt size is 0xc00 in both scenarios.
> > > So something like this works fine:
> > > In this case ff000 + c000 + f5000 = 200000
> > > [    0.000000] MAP addr: c7c00000, next c7e00000, phys c7c00000 len 200000
> > > [    0.000000] INI addr: c7e00000, next c7eff000, len ff000
> > > [    0.000000] INI addr: c7f0b000, next c8000000, len f5000
> > > [    0.000000] MAP addr: c8000000, next c8200000, phys c8000000 len 200000
> > > 
> > > In this working case kernel maps sizes ff000 and f5000 alloc_init_pte() and
> > > skips the 0xc000 of the fdt correctly
> > > 
> > > The non-working case is: 
> > > [    0.000000] MAP addr: c7e00000, next c7f00000, phys c7e00000 len 100000
> > > [    0.000000] INI addr: c7f0c000, next c8000000, len f4000
> > > 
> > > Both entries end up using the same pmd
> > > 
> > > I am not sure what's the best way to fix that. 
> > > Obviously skipping the alloc_init_pte() once this case is detected fixes things,
> > > but  is there a better idea?
> > 
> > Well, with the above information, all that I can say is that the
> > mapping code is quite rightfully BUG_ON()ing - it's working as
> > designed.
> > 
> > It seems that alloc_init_pte() is finding that a section mapping has
> > already been created for the virtual address range, but it is then
> > being asked to create a page table mapping over the top of it.  That
> > is one BIG no no.  Replacing the existing section mapping with a page
> > table means that the section mapping is destroyed.  If we allowed
> > such replacement, when the section mapping is accessed, we will end
> > up oopsing the kernel.
> 
> On the crashing case calling __map_init_section() maps 1mb should the rest be
> skipped then?

Sorry, I don't understand the question.  What "rest" are you referring
to?

> > It is also designed to allow hardware-section sized mappings (making
> > it possible to map sections on 1MB granularity) but as a single Linux
> > page table always occupies 2MB, it is not permitted for the unused
> > half of an aligned 2MB slot to be used for a page table mapping -
> > hence this BUG_ON().
> > 
> > The ARM early mapping routines are intentionally designed such that
> > areas of memory that they are asked to map are non-overlapping - it
> > is the caller's responsibility to ensure this.
> 
> And the case i described falls into 'overlapping'?

Yes, as I described above - note that I pointed out that a page table
takes a naturally aligned 2MB of space, and sections are naturally
aligned 1MB of space.

arch/arm/include/asm/pgtable-2level.h documents how we fit the
hardware's page table implementation to the kernel's page table
requirements.  The key thing to note is that, because of the mismatch
between the kernel and the hardware, a Linux page table occupies two
physical level 1 entries, whereas a section occupies one level 1
entry.

Hence, if there is a section mapping in the even-numbered level 1
entry, and we then get a request via alloc_init_pte() to create a
page table, we can't without destroying the mapping that is already
present.

So, yes, this absolutely is an overlapping mapping.  It may not be
at page level, but it is in exactly the same way as asking for the
first half of a page to be mapped to one physical address, and the
second half to be mapped to a different physical address would be -
that is beyond the capabilities of the system to achieve.

Having a section mapping in the first 1MB of a 2MB block and then
trying to create a page table in the second half of a 2MB block is
illegal as, as far as Linux is concerned, the level 1 page table
is indexed in 2MB chunks and contains 2048 entries, not 4096 that
the hardware actually has.

In fact, the software implementation completely ignores the odd-
numbered entries in the L1 page table beyond writing them.

> > 
> > So, this is not a problem with the mapping functions, but the way they
> > are being used.
> > Beyond that, it is impossible to say with the above information since:
> > 
> > 1. You've obviously added some additional printk()s, but you haven't
> >    said where they are or what they mean.  What is the difference
> >    between "MAP addr" and "INI addr" ?
> MAP comes from alloc_init_pmd() when calling ()
> INI comes from the same function when calling alloc_init_pte()
> > 
> > 2. You haven't included the text of the crash, so there's no way to
> >    know the call path to the BUG_ON(), and hence ascertain where the
> >    duplicated mapping is coming from.
> > 
> > Please always provide the full unaltered text from any kernel oops, bug
> > or warning.
> That's because the BUG_ON never shows up in my console. I added printks before
> and after the BUG_ON() though and i can only see the first print
> I can provide a dump_stack() if that helps.

Yes please.

It may also help to enable memblock debugging (memblock=debug) so we can
see what that looks like as well.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-04-11 16:22 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-11 15:13 Memory size and section boundary on armv7 Ilias Apalodimas
2019-04-11 15:41 ` Russell King - ARM Linux admin
2019-04-11 15:50   ` Ilias Apalodimas
2019-04-11 16:22     ` Russell King - ARM Linux admin [this message]
2019-04-12  5:23       ` Ilias Apalodimas
2019-04-12  8:40         ` Russell King - ARM Linux admin
2019-04-12 10:10           ` Ilias Apalodimas
2019-04-12 11:16             ` Russell King - ARM Linux admin
2019-04-12 11:26               ` Ilias Apalodimas
2019-04-12 11:43               ` Ilias Apalodimas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190411162210.462hwmwlwrciq3re@shell.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=ard.biesheuvel@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=labbott@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).