From: Jianyong Wu <Jianyong.Wu@arm.com>
To: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: "will@kernel.org" <will@kernel.org>,
	Anshuman Khandual <Anshuman.Khandual@arm.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"david@redhat.com" <david@redhat.com>,
	"quic_qiancai@quicinc.com" <quic_qiancai@quicinc.com>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"gshan@redhat.com" <gshan@redhat.com>,
	Justin He <Justin.He@arm.com>, nd <nd@arm.com>
Subject: RE: [PATCH v3] arm64/mm: avoid fixmap race condition when create pud mapping
Date: Fri, 7 Jan 2022 09:10:57 +0000
Message-ID: <AM9PR08MB7276B412F02CA0431E30E06CF44D9@AM9PR08MB7276.eurprd08.prod.outlook.com>
In-Reply-To: <YdcRLohx777jzWah@arm.com>

Hi Catalin,

I think I have roughly found the root cause.
alloc_init_pud is called at the very beginning of kernel boot, via create_mapping_noalloc, before any memory allocator is initialized. However, the lockdep check may need to allocate memory, so the kernel takes an exception when it acquires the lock. (I have not found the exact code that causes this.) That is to say, we may not be able to use a lock this early.

I have come up with two ways to address it:
1) Skip the deadlock check in the lockdep code at the very beginning of kernel boot.
2) Provide two versions of __create_pgd_mapping, one that takes the lock and one that does not. There should be no possibility of a memory-mapping race that early in kernel boot, so the lockless version of __create_pgd_mapping can be used safely there.
In my test, the issue is gone if no lock is held in create_mapping_noalloc. I think create_mapping_noalloc is called early enough to avoid memory-mapping races; however, I have not proved it.

For now, I prefer 2).
The rough change for method 2:
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index acfae9b41cc8..3d3c910f446b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -63,6 +63,7 @@ static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss __maybe_unused;
 static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss __maybe_unused;

 static DEFINE_SPINLOCK(swapper_pgdir_lock);
+static DEFINE_MUTEX(fixmap_lock);

 void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
 {
@@ -381,6 +382,41 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
        addr = virt & PAGE_MASK;
        end = PAGE_ALIGN(virt + size);

+       do {
+               next = pgd_addr_end(addr, end);
+               /*
+                * fixmap is used inside of alloc_init_pud, but we only have
+                * one fixmap entry per page-table level, so take the fixmap
+                * lock until we're done.
+                */
+               mutex_lock(&fixmap_lock);
+               alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc,
+                              flags);
+               mutex_unlock(&fixmap_lock);
+               phys += next - addr;
+       } while (pgdp++, addr = next, addr != end);
+}
+
+static void __create_pgd_mapping_nolock(pgd_t *pgdir, phys_addr_t phys,
+                                unsigned long virt, phys_addr_t size,
+                                pgprot_t prot,
+                                phys_addr_t (*pgtable_alloc)(int),
+                                int flags)
+{
+       unsigned long addr, end, next;
+       pgd_t *pgdp = pgd_offset_pgd(pgdir, virt);
+
+       /*
+        * If the virtual and physical address don't have the same offset
+        * within a page, we cannot map the region as the caller expects.
+        */
+       if (WARN_ON((phys ^ virt) & ~PAGE_MASK))
+               return;
+
+       phys &= PAGE_MASK;
+       addr = virt & PAGE_MASK;
+       end = PAGE_ALIGN(virt + size);
+
        do {
                next = pgd_addr_end(addr, end);
                alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc,
@@ -432,7 +468,10 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
                        &phys, virt);
                return;
        }
-       __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
+       /*
+        * This runs very early in boot, so no lock is needed here.
+        */
+       __create_pgd_mapping_nolock(init_mm.pgd, phys, virt, size, prot, NULL,
                             NO_CONT_MAPPINGS);
 }

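The rough diff above duplicates the whole body of __create_pgd_mapping. As a possible alternative shape, here is a minimal userspace sketch (not kernel code) in which the locked variant is a thin wrapper around the lockless one. Everything in it is an assumption made for illustration: the pthread mutex stands in for fixmap_lock, fixmap_slot_in_use and ranges_mapped are hypothetical stand-ins for the single fixmap entry and the page-table walk, and the names create_mapping and create_mapping_nolock are made up. Note that it holds the lock across the whole walk, whereas the diff takes it once per pgd iteration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's fixmap_lock (illustrative only). */
static pthread_mutex_t fixmap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical model of the one fixmap entry per page-table level. */
static int fixmap_slot_in_use;
static size_t ranges_mapped;

/*
 * Core walk: does all the work and never touches the lock.
 * Safe only when the caller knows nothing else can run concurrently.
 */
static void create_mapping_nolock(size_t nr_ranges)
{
	for (size_t i = 0; i < nr_ranges; i++) {
		/* A second concurrent user of the slot would trip this. */
		assert(!fixmap_slot_in_use);
		fixmap_slot_in_use = 1;
		ranges_mapped++;
		fixmap_slot_in_use = 0;
	}
}

/* Locked wrapper: for callers that may run concurrently. */
static void create_mapping(size_t nr_ranges)
{
	pthread_mutex_lock(&fixmap_lock);
	create_mapping_nolock(nr_ranges);
	pthread_mutex_unlock(&fixmap_lock);
}
```

With this shape the early-boot path would call the lockless function directly and every other caller would go through the wrapper, so the walk itself exists only once.
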
WDYT?

Thanks
Jianyong

> -----Original Message-----
> From: Catalin Marinas <catalin.marinas@arm.com>
> Sent: Thursday, January 6, 2022 11:57 PM
> To: Jianyong Wu <Jianyong.Wu@arm.com>
> Cc: will@kernel.org; Anshuman Khandual <Anshuman.Khandual@arm.com>;
> akpm@linux-foundation.org; david@redhat.com; quic_qiancai@quicinc.com;
> ardb@kernel.org; linux-kernel@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org; gshan@redhat.com;
> Justin He <Justin.He@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH v3] arm64/mm: avoid fixmap race condition when create
> pud mapping
> 
> On Thu, Jan 06, 2022 at 10:13:06AM +0000, Jianyong Wu wrote:
> > I test this patch in your way using both EDK2 V2.6 and EDK2 v2.7. it's
> > peculiar that this issue shows up on v2.6 but not on v2.7.
> > For now, I only find that if "CONFIG_DEBUG_LOCK_ALLOC" is enabled, the
> > kernel boot will hang. However, I can't debug it by printk as this
> > issue happens before pl11 is ready.
> 
> I tried earlycon but that doesn't help either.
> 
> > I will go on debugging, but very appreciated if someone can give some
> > hints on it.
> 
> FWIW, passing "nokaslr" on the kernel command line makes it boot (and this
> makes debugging harder). That's as far as I've gone.
> 
> --
> Catalin
