From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 1 Mar 2021 11:45:42 +0200
From: Mike Rapoport
To: Florian Fainelli
Cc: Serge Semin, Thomas Bogendoerfer, Serge Semin, Roman Gushchin,
	Andrew Morton, linux-mm@kvack.org, Kamal Dasu, Paul Cercueil,
	Jiaxun Yang, iamjoonsoo.kim@lge.com, riel@surriel.com, Michal Hocko,
	linux-kernel@vger.kernel.org, kernel-team@fb.com,
	"open list:BROADCOM BMIPS MIPS ARCHITECTURE"
Subject: Re: [PATCH v2 2/2] memblock: do not start bottom-up allocations with kernel_end
References: <20201217201214.3414100-1-guro@fb.com>
	<20201217201214.3414100-2-guro@fb.com>
	<23fc1ef9-7342-8bc2-d184-d898107c52b2@gmail.com>
	<20210228090041.GO1447004@kernel.org>
	<8cbafe95-0f8c-a9b7-2dc9-cded846622fd@gmail.com>
	<20210228230811.wdae7oaaf3mbpgwl@mobilestation>
	<2e973fa8-5f2b-6840-0874-9c15fa0ebea0@gmail.com>
In-Reply-To: <2e973fa8-5f2b-6840-0874-9c15fa0ebea0@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Feb 28, 2021 at 07:50:45PM -0800, Florian Fainelli wrote:
> Hi Serge,
>
> On 2/28/2021 3:08 PM, Serge Semin wrote:
> > Hi folks,
> > What you've got here seems to be a more complicated problem than it
> > originally looked. Please see my comments below.
> >
> > (Note I've discarded some of the email logs, which are of no interest
> > to the discovered problem. Please also note that I haven't got any
> > Broadcom hardware to test the solution suggested below.)
> >
> > On Sun, Feb 28, 2021 at 10:19:51AM -0800, Florian Fainelli wrote:
> >> Hi Mike,
> >>
> >> On 2/28/2021 1:00 AM, Mike Rapoport wrote:
> >>> Hi Florian,
> >>>
> >>> On Sat, Feb 27, 2021 at 08:18:47PM -0800, Florian Fainelli wrote:
> >>>>
> >
> >>>> [...]
> >
> >>>>
> >>>> Hi Roman, Thomas and other linux-mips folks,
> >>>>
> >>>> Kamal and I have been unable to boot v5.11 on MIPS since this
> >>>> commit; reverting it makes our MIPS platforms boot successfully. We
> >>>> do not see a warning like the one in the commit message. Instead,
> >>>> what happens appears to be a corrupted Device Tree, which prevents
> >>>> the parsing of the "rdb" node and leads to the interrupt controllers
> >>>> not being registered and the system eventually not booting.
> >>>>
> >>>> The Device Tree is built into the kernel image and resides at
> >>>> arch/mips/boot/dts/brcm/bcm97435svmb.dts.
> >>>>
> >>>> Do you have any idea what could be wrong with MIPS specifically here?
> >
> > Most likely the problem you've discovered has been there for quite
> > some time. The patch you are referring to just caused it to be
> > triggered by extending the early allocation range. Before that
> > patch was accepted, the early memory allocations had been performed
> > in the range:
> > [kernel_end, RAM_END].
> > The patch changed that, so the early allocations are done within
> > [RAM_START + PAGE_SIZE, RAM_END].
> >
> > In normal situations it's safe to do that as long as all the critical
> > memory regions (including the memory residing below the kernel) have
> > been reserved. But as soon as memory holding some critical structures
> > hasn't been reserved, the kernel may allocate it, for instance for
> > early initializations, with unpredictable and most of the time
> > unpleasant consequences.
> >
> >>>
> >>> Apparently there is a memblock allocation in one of the functions called
> >>> from arch_mem_init() between plat_mem_setup() and
> >>> early_init_fdt_reserve_self().
> >
> > Mike, alas, according to the log provided by Florian that's not the
> > cause of the problem. Please see my considerations below.
> >
> >> [...]
> >>
> >> [    0.000000] Linux version 5.11.0-g5695e5161974 (florian@localhost) (mipsel-linux-gcc (GCC) 8.3.0, GNU ld (GNU Binutils) 2.32) #84 SMP Sun Feb 28 10:01:50 PST 2021
> >> [    0.000000] CPU0 revision is: 00025b00 (Broadcom BMIPS5200)
> >> [    0.000000] FPU revision is: 00130001
> >
> >> [    0.000000] memblock_add: [0x00000000-0x0fffffff] early_init_dt_scan_memory+0x160/0x1e0
> >> [    0.000000] memblock_add: [0x20000000-0x4fffffff] early_init_dt_scan_memory+0x160/0x1e0
> >> [    0.000000] memblock_add: [0x90000000-0xcfffffff] early_init_dt_scan_memory+0x160/0x1e0
> >
> > Here the memory has been added to the memblock allocator.
> >
> >> [    0.000000] MIPS: machine is Broadcom BCM97435SVMB
> >> [    0.000000] earlycon: ns16550a0 at MMIO32 0x10406b00 (options '')
> >> [    0.000000] printk: bootconsole [ns16550a0] enabled
> >
> >> [    0.000000] memblock_reserve: [0x00aa7600-0x00aaa0a0] setup_arch+0x128/0x69c
> >
> > Here the fdt memory has been reserved. (Note it's built into the
> > kernel.)
> >
> >> [    0.000000] memblock_reserve: [0x00010000-0x018313cf] setup_arch+0x1f8/0x69c
> >
> > Here the kernel itself, together with the built-in dtb, has been
> > reserved. So far so good.
> >
> >> [    0.000000] Initrd not found or empty - disabling initrd
> >
> >> [    0.000000] memblock_alloc_try_nid: 10913 bytes align=0x40 nid=-1 from=0x00000000 max_addr=0x00000000 early_init_dt_alloc_memory_arch+0x40/0x84
> >> [    0.000000] memblock_reserve: [0x00001000-0x00003aa0] memblock_alloc_range_nid+0xf8/0x198
> >> [    0.000000] memblock_alloc_try_nid: 32680 bytes align=0x4 nid=-1 from=0x00000000 max_addr=0x00000000 early_init_dt_alloc_memory_arch+0x40/0x84
> >> [    0.000000] memblock_reserve: [0x00003aa4-0x0000ba4b] memblock_alloc_range_nid+0xf8/0x198
> >
> > The log above most likely belongs to the call chain:
> > setup_arch()
> > +-> arch_mem_init()
> >     +-> device_tree_init() - BMIPS specific method
> >         +-> unflatten_and_copy_device_tree()
> >
> > In other words, here we've copied the fdt from the original space
> > [0x00aa7600-0x00aaa0a0] into [0x00001000-0x00003aa0] and unflattened
> > it to [0x00003aa4-0x0000ba4b].
> >
> > The problem is that a bit later the following call chain is performed:
> > setup_arch()
> > +-> plat_smp_setup()
> >     +-> mp_ops->smp_setup(); - registered by prom_init()->register_bmips_smp_ops();
> >         +-> if (!board_ebase_setup)
> >                 board_ebase_setup = &bmips_ebase_setup;
> >
> > So at the moment of the CPU traps initialization the bmips_ebase_setup()
> > method is called. What trap_init() does isn't compatible with the
> > allocation performed by the unflatten_and_copy_device_tree() method.
> > See the next comment.
> >
> >> [    0.000000] memblock_alloc_try_nid: 25 bytes align=0x4 nid=-1 from=0x00000000 max_addr=0x00000000 early_init_dt_alloc_memory_arch+0x40/0x84
> > ...
> >> [    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> >
> >> [    0.000000] memblock_reserve: [0x00000000-0x000003ff] trap_init+0x70/0x4e8
> >
> > Most likely the corruption happened somewhere here.
> > The log above shows memory being reserved for the NMI/reset vectors:
> > arch/mips/kernel/traps.c: trap_init(void): Line 2373.
> >
> > But then the board_ebase_setup() pointer is dereferenced and called.
> > It was initialized with bmips_ebase_setup() earlier, which overwrites
> > the ebase variable with 0x80001000, as this is a CPU_BMIPS5000 CPU.
> > So any further calls of functions like
> > set_handler()/set_except_vector()/set_vi_srs_handler()/etc. may
> > corrupt the memory above 0x80001000, which, as we have discovered,
> > belongs to the fdt and the unflattened device tree.
> >
> >> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> >> [    0.000000] Memory: 2045268K/2097152K available (8226K kernel code, 1070K rwdata, 1336K rodata, 13808K init, 260K bss, 51884K reserved, 0K cma-reserved, 1835008K highmem)
> >> [    0.000000] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> >> [    0.000000] rcu: Hierarchical RCU implementation.
> >> [    0.000000] rcu: RCU event tracing is enabled.
> >> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
> >> [    0.000000] NR_IRQS: 256
> >
> >> [    0.000000] OF: Bad cell count for /rdb
> >> [    0.000000] irq_bcm7038_l1: failed to remap intc L1 registers
> >> [    0.000000] OF: of_irq_init: children remain, but no parents
> >
> > So here is the first time the consequence of the corruption pops up.
> > Luckily it's just the "Bad cell count" error. We could have got a
> > much less obvious log here, up to a crash somewhere further on...
> >
> >> [    0.000000] random: get_random_bytes called from start_kernel+0x444/0x654 with crng_init=0
> >> [    0.000000] sched_clock: 32 bits at 250 Hz, resolution 4000000ns, wraps every 8589934590000000ns
> >
> >>
> >> and with your patch applied, which unfortunately did not work, we have
> >> the following:
> >>
> >> [...]
> >
> > So a patch like this should work around the corruption:
> >
> > --- a/arch/mips/bmips/setup.c
> > +++ b/arch/mips/bmips/setup.c
> > @@ -174,6 +174,8 @@ void __init plat_mem_setup(void)
> >
> >  	__dt_setup_arch(dtb);
> >
> > +	memblock_reserve(0x0, 0x1000 + 0x100*64);
> > +
> >  	for (q = bmips_quirk_list; q->quirk_fn; q++) {
> >  		if (of_flat_dt_is_compatible(of_get_flat_dt_root(),
> >  					     q->compatible)) {
>
> This patch works, thanks a lot for the troubleshooting and analysis! How
> about the following, which works as well and should be more generic
> since it does not require each architecture to provide an appropriate
> call to memblock_reserve():
>
> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> index e0352958e2f7..b0a173b500e8 100644
> --- a/arch/mips/kernel/traps.c
> +++ b/arch/mips/kernel/traps.c
> @@ -2367,10 +2367,7 @@ void __init trap_init(void)
>
>  	if (!cpu_has_mips_r2_r6) {
>  		ebase = CAC_BASE;
> -		ebase_pa = virt_to_phys((void *)ebase);
>  		vec_size = 0x400;
> -
> -		memblock_reserve(ebase_pa, vec_size);
>  	} else {
>  		if (cpu_has_veic || cpu_has_vint)
>  			vec_size = 0x200 + VECTORSPACING*64;
> @@ -2410,6 +2407,14 @@ void __init trap_init(void)
>
>  	if (board_ebase_setup)
>  		board_ebase_setup();
> +
> +	/* board_ebase_setup() can change the exception base address
> +	 * reserve it now after changes were made.
> +	 */
> +	if (!cpu_has_mips_r2_r6) {
> +		ebase_pa = virt_to_phys((void *)ebase);
> +		memblock_reserve(ebase_pa, vec_size);
> +	}

With this it's still possible to have memblock allocations around ebase_pa
before it is reserved.

I think we have two options here to solve it in a more or less generic way:

* split the reservation of ebase out of trap_init() and move it earlier,
  to setup_arch(). I didn't check what the board_ebase_setup()
  implementations do; if they need to allocate memory, this would not work.
* add an API to memblock to set a lower limit for allocations and then set
  the lower limit, to e.g.
  kernel load address in arch_mem_init(). This may add complexity for
  configurations with relocatable kernel and kaslr.

>  	per_cpu_trap_init(true);
>  	memblock_set_bottom_up(false);

-- 
Sincerely yours,
Mike.