From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Nick Piggin <npiggin@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Joel Fernandes <joel@joelfernandes.org>,
"Kirill A . Shutemov" <kirill@shutemov.name>,
Mark Rutland <mark.rutland@arm.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Chintan Pandya <cpandya@codeaurora.org>,
Jun Yao <yaojun8558363@gmail.com>,
Laura Abbott <labbott@redhat.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-mm@kvack.org, Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v2 1/3] arm64: mm: use appropriate ctors for page tables
Date: Wed, 20 Feb 2019 15:57:59 +0530 [thread overview]
Message-ID: <f7e4db43-b836-4ac2-1aea-922be585d8b1@arm.com> (raw)
In-Reply-To: <20190219222828.GA68281@google.com>
On 02/20/2019 03:58 AM, Yu Zhao wrote:
> On Tue, Feb 19, 2019 at 11:47:12AM +0530, Anshuman Khandual wrote:
>> + Matthew Wilcox
>>
>> On 02/19/2019 11:02 AM, Yu Zhao wrote:
>>> On Tue, Feb 19, 2019 at 09:51:01AM +0530, Anshuman Khandual wrote:
>>>>
>>>>
>>>> On 02/19/2019 04:43 AM, Yu Zhao wrote:
>>>>> For pte page, use pgtable_page_ctor(); for pmd page, use
>>>>> pgtable_pmd_page_ctor() if not folded; and for the rest (pud,
>>>>> p4d and pgd), don't use any.
>>>> pgtable_page_ctor()/dtor() is not optional for any level page table page
>>>> as it determines the struct page state and zone statistics.
>>>
>>> This is not true. pgtable_page_ctor() is only meant for user pte
>>> page. The name isn't perfect (we named it this way before we had
>>> split pmd page table lock, and never bothered to change it).
>>>
>>> The commit cccd843f54be ("mm: mark pages in use for page tables")
>>> clearly states so:
>>> Note that only pages currently accounted as NR_PAGETABLES are
>>> tracked as PageTable; this does not include pgd/p4d/pud/pmd pages.
>>
>> I think the commit is the following one and it does say so. But what is
>> the rationale of tagging only PTE page as PageTable and updating the zone
>> stat but not doing so for higher level page table pages ? Are not they
>> used as page table pages ? Should not they count towards NR_PAGETABLE ?
>>
>> 1d40a5ea01d53251c ("mm: mark pages in use for page tables")
>
> Well, I was just trying to clarify how the ctor is meant to be used.
> The rational behind it is probably another topic.
>
> For starters, the number of pmd/pud/p4d/pgd is at least two orders
> of magnitude less than the number of pte, which makes them almost
> negligible. And some archs use kmem for them, so it's infeasible to
> SetPageTable on or account them in the way the ctor does on those
> archs.
>
I understand the kmem cases which are definitely problematic and should
be fixed. IIRC there is a mechanism to custom init pages allocated for
slab cache with a ctor function which in turn can call pgtable_page_ctor().
But destructor helper support for slab has been dropped I guess.
> But, as I said, it's not something can't be changed. It's just not
> the concern of this patch.
Using pgtable_pmd_page_ctor() during PMD level pgtable page allocation
as suggested in the patch breaks pmd_alloc_one() changes as per the
previous proposal. Hence we all would need some agreement here.
https://www.spinics.net/lists/arm-kernel/msg701960.html
We can still accommodate the split PMD ptlock feature in pmd_alloc_one().
A possible solution can be like this above and over the previous series.
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a4168d366127..c02abb2a69f7 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -9,6 +9,7 @@ config ARM64
select ACPI_SPCR_TABLE if ACPI
select ACPI_PPTT if ACPI
select ARCH_CLOCKSOURCE_DATA
+ select ARCH_ENABLE_SPLIT_PMD_PTLOCK if HAVE_ARCH_TRANSPARENT_HUGEPAGE
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_DMA_COHERENT_TO_PFN
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index a02a4d1d967d..258e09fb3ce2 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -37,13 +37,29 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t pte);
static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
{
- return (pmd_t *)pte_alloc_one_virt(mm);
+ pgtable_t ptr;
+
+ ptr = pte_alloc_one(mm);
+ if (!ptr)
+ return 0;
+
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && USE_SPLIT_PMD_PTLOCKS
+ ptr->pmd_huge_pte = NULL;
+#endif
+ return (pmd_t *)page_to_virt(ptr);
}
static inline void pmd_free(struct mm_struct *mm, pmd_t *pmdp)
{
+ struct page *page;
+
BUG_ON((unsigned long)pmdp & (PAGE_SIZE-1));
- pte_free(mm, virt_to_page(pmdp));
+ page = virt_to_page(pmdp);
+
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && USE_SPLIT_PMD_PTLOCKS
+ VM_BUG_ON_PAGE(page->pmd_huge_pte, page);
+#endif
+ pte_free(mm, page);
}
>
>>>
>>> I'm sure if we go back further, we can find similar stories: we
>>> don't set PageTable on page tables other than pte; and we don't
>>> account page tables other than pte. I don't have any objection if
>>> you want change these two. But please make sure they are consistent
>>> across all archs.
>>
>> pgtable_page_ctor/dtor() use across arch is not consistent and there is a need
>> for generalization which has been already acknowledged earlier. But for now we
>> can atleast fix this on arm64.
>>
>> https://lore.kernel.org/lkml/1547619692-7946-1-git-send-email-anshuman.khandual@arm.com/
>
> This is again not true. Please stop making claims not backed up by
> facts. And the link is completely irrelevant to the ctor.
>
> I just checked *all* arches. Only four arches call the ctor outside
> pte_alloc_one(). They are arm, arm64, ppc and s390. The last two do
> so not because they want to SetPageTable on or account pmd/pud/p4d/
> pgd, but because they have to work around something, as arm/arm64
> do.
That reaffirms the fact that pgtable_page_ctor()/dtor() are getting used
not in a consistent manner.
>
>>
>>>
>>>> We should not skip it for any page table page.
>>>
>>> In fact, calling it on pmd/pud/p4d is peculiar, and may even be
>>> considered wrong. AFAIK, no other arch does so.
>>
>> Why would it be considered wrong ? IIUC archs have their own understanding
>> of this and there are different implementations. But doing something for
>> PTE page and skipping for others is plain inconsistent.
>
> Allocating memory that will never be used is wrong. Please look into
> the ctor and find out what exactly it does under different configs.
Are you referring to ptlock_init() --> ptlock_alloc() triggered spinlock_t
allocations with USE_SPLIT_PTE_PTLOCKS and ALLOC_SPLIT_PTLOCKS.
>
> And why I said "may"? Because we know there is only negligible number
> of pmd/pud/p4d, so the memory allocated may be considered negligible
> as well.
Okay.
next prev parent reply other threads:[~2019-02-20 10:28 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-14 21:16 [PATCH] arm64: mm: enable per pmd page table lock Yu Zhao
2019-02-18 15:12 ` Will Deacon
2019-02-18 19:49 ` Yu Zhao
2019-02-18 20:48 ` Yu Zhao
2019-02-19 4:09 ` Anshuman Khandual
2019-02-18 23:13 ` [PATCH v2 1/3] arm64: mm: use appropriate ctors for page tables Yu Zhao
2019-02-18 23:13 ` [PATCH v2 2/3] arm64: mm: don't call page table ctors for init_mm Yu Zhao
2019-02-26 15:13 ` Mark Rutland
2019-03-09 3:52 ` Yu Zhao
2019-02-18 23:13 ` [PATCH v2 3/3] arm64: mm: enable per pmd page table lock Yu Zhao
2019-02-19 4:21 ` [PATCH v2 1/3] arm64: mm: use appropriate ctors for page tables Anshuman Khandual
2019-02-19 5:32 ` Yu Zhao
2019-02-19 6:17 ` Anshuman Khandual
2019-02-19 22:28 ` Yu Zhao
2019-02-20 10:27 ` Anshuman Khandual [this message]
2019-02-20 12:24 ` Matthew Wilcox
2019-02-20 20:22 ` Yu Zhao
2019-02-20 20:59 ` Matthew Wilcox
2019-02-20 1:34 ` Matthew Wilcox
2019-02-20 3:20 ` Anshuman Khandual
2019-02-20 21:03 ` Matthew Wilcox
2019-02-26 15:12 ` Mark Rutland
2019-03-09 4:01 ` Yu Zhao
2019-03-10 1:19 ` [PATCH v3 " Yu Zhao
2019-03-10 1:19 ` [PATCH v3 2/3] arm64: mm: don't call page table ctors for init_mm Yu Zhao
2019-03-10 1:19 ` [PATCH v3 3/3] arm64: mm: enable per pmd page table lock Yu Zhao
2019-03-11 8:28 ` Anshuman Khandual
2019-03-11 23:10 ` Yu Zhao
2019-03-11 12:12 ` Mark Rutland
2019-03-11 12:57 ` Anshuman Khandual
2019-03-11 23:11 ` Yu Zhao
2019-03-11 7:45 ` [PATCH v3 1/3] arm64: mm: use appropriate ctors for page tables Anshuman Khandual
2019-03-11 23:23 ` Yu Zhao
2019-03-12 0:57 ` [PATCH v4 1/4] " Yu Zhao
2019-03-12 0:57 ` [PATCH v4 2/4] arm64: mm: don't call page table ctors for init_mm Yu Zhao
2019-03-12 0:57 ` [PATCH v4 3/4] arm64: mm: call ctor for stage2 pmd page Yu Zhao
2019-03-12 0:57 ` [PATCH v4 4/4] arm64: mm: enable per pmd page table lock Yu Zhao
2019-02-19 3:08 ` [PATCH] " Anshuman Khandual
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f7e4db43-b836-4ac2-1aea-922be585d8b1@arm.com \
--to=anshuman.khandual@arm.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=ard.biesheuvel@linaro.org \
--cc=catalin.marinas@arm.com \
--cc=cpandya@codeaurora.org \
--cc=joel@joelfernandes.org \
--cc=kirill@shutemov.name \
--cc=labbott@redhat.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mark.rutland@arm.com \
--cc=npiggin@gmail.com \
--cc=peterz@infradead.org \
--cc=will.deacon@arm.com \
--cc=willy@infradead.org \
--cc=yaojun8558363@gmail.com \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).