* [PATCH 1/3] asm-generic/tlb: stub out pud_free_tlb() if __PAGETABLE_PUD_FOLDED ...
2019-10-09 22:26 [PATCH 0/3] elide generated code for folded p4d/pud Vineet Gupta
@ 2019-10-09 22:26 ` Vineet Gupta
2019-10-09 22:26 ` [PATCH 2/3] asm-generic/tlb: stub out p4d_free_tlb() if __PAGETABLE_P4D_FOLDED Vineet Gupta
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Vineet Gupta @ 2019-10-09 22:26 UTC (permalink / raw)
To: linux-snps-arc
... independent of __ARCH_HAS_4LEVEL_HACK
This came up when removing __ARCH_HAS_5LEVEL_HACK for ARC: pud_free_tlb()
still generated code despite pud being folded (ARC has 2 levels).
| bloat-o-meter2 vmlinux-B-elide-ARCH_USE_5LEVEL_HACK vmlinux-C-elide-pud_free_tlb
| add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-104 (-104)
| function old new delta
| free_pgd_range 656 552 -104
| Total: Before=4137276, After=4137172, chg -1.000000%
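The stubbing pattern can be illustrated with a minimal userspace sketch. This is not kernel code: MODEL_PUD_FOLDED stands in for __PAGETABLE_PUD_FOLDED, and every model_* name is hypothetical. When the level is folded, the macro expands to an empty statement, so the call site carries no code for it.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's __PAGETABLE_PUD_FOLDED. */
#define MODEL_PUD_FOLDED 1

/* Counts how often the "real" free path runs. */
int pud_pages_freed;

/* Hypothetical stand-in for the work done by a real pud_free_tlb(). */
void model_real_pud_free(void)
{
	pud_pages_freed++;
}

#ifndef MODEL_PUD_FOLDED
/* Level exists: the macro calls into the real free path. */
#define model_pud_free_tlb(pudp) \
	do { (void)(pudp); model_real_pud_free(); } while (0)
#else
/* Level is folded: the macro expands to an empty statement. */
#define model_pud_free_tlb(pudp) do { } while (0)
#endif

/* Models a caller such as free_pgd_range() reaching the pud level. */
int model_free_range(void)
{
	unsigned long entry = 0;
	unsigned long *pudp = &entry;

	assert(pudp != 0);
	model_pud_free_tlb(pudp);
	(void)pudp;
	return pud_pages_freed;
}
```

With MODEL_PUD_FOLDED defined, model_free_range() never reaches model_real_pud_free(). Removing the define restores the call.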
Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
include/asm-generic/4level-fixup.h | 2 --
include/asm-generic/tlb.h | 4 +++-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/asm-generic/4level-fixup.h b/include/asm-generic/4level-fixup.h
index e3667c9a33a5..d7c5ba1968d3 100644
--- a/include/asm-generic/4level-fixup.h
+++ b/include/asm-generic/4level-fixup.h
@@ -27,8 +27,6 @@
#define pud_page(pud) pgd_page(pud)
#define pud_page_vaddr(pud) pgd_page_vaddr(pud)
-#undef pud_free_tlb
-#define pud_free_tlb(tlb, x, addr) do { } while (0)
#define pud_free(mm, x) do { } while (0)
#define __pud_free_tlb(tlb, x, addr) do { } while (0)
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 04c0644006fd..1f83188cb331 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -584,7 +584,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
} while (0)
#endif
-#ifndef __ARCH_HAS_4LEVEL_HACK
+#ifndef __PAGETABLE_PUD_FOLDED
#ifndef pud_free_tlb
#define pud_free_tlb(tlb, pudp, address) \
do { \
@@ -594,6 +594,8 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
__pud_free_tlb(tlb, pudp, address); \
} while (0)
#endif
+#else
+#define pud_free_tlb(tlb, pudp, address) do { } while (0)
#endif
#ifndef __ARCH_HAS_5LEVEL_HACK
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] asm-generic/tlb: stub out p4d_free_tlb() if __PAGETABLE_P4D_FOLDED ...
2019-10-09 22:26 [PATCH 0/3] elide generated code for folded p4d/pud Vineet Gupta
2019-10-09 22:26 ` [PATCH 1/3] asm-generic/tlb: stub out pud_free_tlb() if __PAGETABLE_PUD_FOLDED Vineet Gupta
@ 2019-10-09 22:26 ` Vineet Gupta
2019-10-09 22:26 ` [PATCH 3/3] asm-generic/mm: stub out p{4, u}d_clear_bad() if __PAGETABLE_P{4, U}D_FOLDED Vineet Gupta
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Vineet Gupta @ 2019-10-09 22:26 UTC (permalink / raw)
To: linux-snps-arc
... independent of __ARCH_HAS_5LEVEL_HACK
This came up when removing __ARCH_HAS_5LEVEL_HACK for ARC: p4d_free_tlb()
still generated code despite p4d being folded (ARC has 2 levels).
| bloat-o-meter2 vmlinux-C-elide-pud_free_tlb vmlinux-D-elide-p4d_free_tlb
| add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-130 (-130)
| function old new delta
| free_pgd_range 552 422 -130
| Total: Before=4137172, After=4137042, chg -1.000000%
Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
include/asm-generic/5level-fixup.h | 2 --
include/asm-generic/tlb.h | 4 +++-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/asm-generic/5level-fixup.h b/include/asm-generic/5level-fixup.h
index f6947da70d71..c855b5cf4425 100644
--- a/include/asm-generic/5level-fixup.h
+++ b/include/asm-generic/5level-fixup.h
@@ -48,8 +48,6 @@ static inline int p4d_present(p4d_t p4d)
#define __p4d(x) __pgd(x)
#define set_p4d(p4dp, p4d) set_pgd(p4dp, p4d)
-#undef p4d_free_tlb
-#define p4d_free_tlb(tlb, x, addr) do { } while (0)
#define p4d_free(mm, x) do { } while (0)
#define __p4d_free_tlb(tlb, x, addr) do { } while (0)
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 1f83188cb331..f3dad87f4ecc 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -598,7 +598,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
#define pud_free_tlb(tlb, pudp, address) do { } while (0)
#endif
-#ifndef __ARCH_HAS_5LEVEL_HACK
+#ifndef __PAGETABLE_P4D_FOLDED
#ifndef p4d_free_tlb
#define p4d_free_tlb(tlb, pudp, address) \
do { \
@@ -607,6 +607,8 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
__p4d_free_tlb(tlb, pudp, address); \
} while (0)
#endif
+#else
+#define p4d_free_tlb(tlb, pudp, address) do { } while (0)
#endif
#endif /* CONFIG_MMU */
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] asm-generic/mm: stub out p{4, u}d_clear_bad() if __PAGETABLE_P{4, U}D_FOLDED
2019-10-09 22:26 [PATCH 0/3] elide generated code for folded p4d/pud Vineet Gupta
2019-10-09 22:26 ` [PATCH 1/3] asm-generic/tlb: stub out pud_free_tlb() if __PAGETABLE_PUD_FOLDED Vineet Gupta
2019-10-09 22:26 ` [PATCH 2/3] asm-generic/tlb: stub out p4d_free_tlb() if __PAGETABLE_P4D_FOLDED Vineet Gupta
@ 2019-10-09 22:26 ` Vineet Gupta
2019-10-10 7:29 ` [PATCH 0/3] elide generated code for folded p4d/pud Peter Zijlstra
2019-10-10 8:56 ` Kirill A. Shutemov
4 siblings, 0 replies; 9+ messages in thread
From: Vineet Gupta @ 2019-10-09 22:26 UTC (permalink / raw)
To: linux-snps-arc
This removes the code for 2 level paging as seen on ARC
| bloat-o-meter2 vmlinux-D-elide-p4d_free_tlb vmlinux-E-elide-p?d_clear_bad
| add/remove: 0/2 grow/shrink: 0/0 up/down: 0/-22 (-22)
| function old new delta
| pud_clear_bad 20 - -20
| p4d_clear_bad 2 - -2
| Total: Before=4137104, After=4137082, chg -1.000000%
Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
include/asm-generic/pgtable.h | 11 +++++++++++
mm/pgtable-generic.c | 4 ++++
2 files changed, 15 insertions(+)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 818691846c90..9cdcbc7c0b7b 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -558,8 +558,19 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
* Do the tests inline, but report and clear the bad entry in mm/memory.c.
*/
void pgd_clear_bad(pgd_t *);
+
+#ifndef __PAGETABLE_P4D_FOLDED
void p4d_clear_bad(p4d_t *);
+#else
+#define p4d_clear_bad(p4d) do { } while (0)
+#endif
+
+#ifndef __PAGETABLE_PUD_FOLDED
void pud_clear_bad(pud_t *);
+#else
+#define pud_clear_bad(p4d) do { } while (0)
+#endif
+
void pmd_clear_bad(pmd_t *);
static inline int pgd_none_or_clear_bad(pgd_t *pgd)
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 532c29276fce..856dc3bb77e6 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -24,17 +24,21 @@ void pgd_clear_bad(pgd_t *pgd)
pgd_clear(pgd);
}
+#ifndef __PAGETABLE_P4D_FOLDED
void p4d_clear_bad(p4d_t *p4d)
{
p4d_ERROR(*p4d);
p4d_clear(p4d);
}
+#endif
+#ifndef __PAGETABLE_PUD_FOLDED
void pud_clear_bad(pud_t *pud)
{
pud_ERROR(*pud);
pud_clear(pud);
}
+#endif
void pmd_clear_bad(pmd_t *pmd)
{
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 0/3] elide generated code for folded p4d/pud
2019-10-09 22:26 [PATCH 0/3] elide generated code for folded p4d/pud Vineet Gupta
` (2 preceding siblings ...)
2019-10-09 22:26 ` [PATCH 3/3] asm-generic/mm: stub out p{4, u}d_clear_bad() if __PAGETABLE_P{4, U}D_FOLDED Vineet Gupta
@ 2019-10-10 7:29 ` Peter Zijlstra
2019-10-10 8:56 ` Kirill A. Shutemov
4 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2019-10-10 7:29 UTC (permalink / raw)
To: linux-snps-arc
On Wed, Oct 09, 2019 at 03:26:55PM -0700, Vineet Gupta wrote:
> Hi,
>
> This series elides extraneous generated code for folded p4d/pud.
> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from ARC port.
> The code savings are not a whole lot, but still worthwhile IMHO.
>
> bloat-o-meter2 vmlinux-A-baseline vmlinux-E-elide-p?d_clear_bad
> add/remove: 0/2 grow/shrink: 0/1 up/down: 0/-146 (-146)
> function old new delta
> p4d_clear_bad 2 - -2
> pud_clear_bad 20 - -20
> free_pgd_range 546 422 -124
> Total: Before=4137148, After=4137002, chg -1.000000%
>
Works for me, thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 0/3] elide generated code for folded p4d/pud
2019-10-09 22:26 [PATCH 0/3] elide generated code for folded p4d/pud Vineet Gupta
` (3 preceding siblings ...)
2019-10-10 7:29 ` [PATCH 0/3] elide generated code for folded p4d/pud Peter Zijlstra
@ 2019-10-10 8:56 ` Kirill A. Shutemov
2019-10-10 20:05 ` Vineet Gupta
4 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2019-10-10 8:56 UTC (permalink / raw)
To: linux-snps-arc
On Wed, Oct 09, 2019 at 10:26:55PM +0000, Vineet Gupta wrote:
> Hi,
>
> This series elides extraneous generated code for folded p4d/pud.
> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from ARC port.
> The code savings are not a whole lot, but still worthwhile IMHO.
Agreed.
Acked-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 0/3] elide generated code for folded p4d/pud
2019-10-10 8:56 ` Kirill A. Shutemov
@ 2019-10-10 20:05 ` Vineet Gupta
2019-10-11 12:19 ` Kirill A. Shutemov
0 siblings, 1 reply; 9+ messages in thread
From: Vineet Gupta @ 2019-10-10 20:05 UTC (permalink / raw)
To: linux-snps-arc
Hi Kirill,
On 10/10/19 1:56 AM, Kirill A. Shutemov wrote:
> On Wed, Oct 09, 2019 at 10:26:55PM +0000, Vineet Gupta wrote:
>>
>> This series elides extraneous generated code for folded p4d/pud.
>> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from ARC port.
>> The code savings are not a whole lot, but still worthwhile IMHO.
>
> Agreed.
Thx.
So given we are folding pmd too, it seemed we could do the following as well.
+#ifndef __PAGETABLE_PMD_FOLDED
void pmd_clear_bad(pmd_t *);
+#else
+#define pmd_clear_bad(pmd) do { } while (0)
+#endif
+#ifndef __PAGETABLE_PMD_FOLDED
void pmd_clear_bad(pmd_t *pmd)
{
pmd_ERROR(*pmd);
pmd_clear(pmd);
}
+#endif
I stared at generated code and it seems a bit wrong.
free_pgd_range() -> pgd_none_or_clear_bad() is no longer checking for unmapped pgd
entries as pgd_none/pgd_bad are all stubs returning 0.
This whole pmd folding is a bit confusing considering I only revisit it every few
years :-) Abstraction-wise, __PAGETABLE_PMD_FOLDED leaves only pgd and pte, but even
in this regime a bunch of pmd macros are still valid:
pmd_set(pmdp, ptep) {
*pmdp.pud.p4d.pgd = (unsigned long)ptep
}
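One way to picture the folding is the following userspace sketch. It assumes the folded types nest as single-member structs, which is how the generic pgtable-nop4d.h, pgtable-nopud.h and pgtable-nopmd.h headers are understood to define them. The model_* helpers are hypothetical and only show that every folded level resolves to the same slot.

```c
#include <assert.h>

/* Assumed shape of the folded types: each level wraps the one above it. */
typedef struct { unsigned long pgd; } pgd_t;
typedef struct { pgd_t pgd; } p4d_t;
typedef struct { p4d_t p4d; } pud_t;
typedef struct { pud_t pud; } pmd_t;

/* Folded "offset" helpers: no table lookup, just a change of type. */
static p4d_t *model_p4d_offset(pgd_t *pgd) { return (p4d_t *)pgd; }
static pud_t *model_pud_offset(p4d_t *p4d) { return (pud_t *)p4d; }
static pmd_t *model_pmd_offset(pud_t *pud) { return (pmd_t *)pud; }

/* Storing a pte table pointer via the pmd writes the pgd entry itself. */
static void model_pmd_set(pmd_t *pmdp, unsigned long *ptep)
{
	pmdp->pud.p4d.pgd.pgd = (unsigned long)ptep;
}

/* Returns 1 if the pmd pointer and the pgd pointer name the same slot. */
static int model_walk_hits_same_slot(void)
{
	static unsigned long pte_table[4];
	pmd_t slot = { { { { 0 } } } };
	pgd_t *pgd = &slot.pud.p4d.pgd;
	pmd_t *pmdp = model_pmd_offset(model_pud_offset(model_p4d_offset(pgd)));

	/* Wrapping adds no storage, so all levels have the same size. */
	assert(sizeof(pmd_t) == sizeof(pgd_t));

	model_pmd_set(pmdp, pte_table);
	return (void *)pmdp == (void *)pgd &&
	       pgd->pgd == (unsigned long)pte_table;
}
```

In this model the pmd_* accessors stay meaningful under folding because they are the ones that actually touch the top-level entry.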
Is there a better way to make a mental model of this code folding?
In an ideal world pmd folded would have meant pmd_* routines just vanish - poof.
So in that sense I like your implementation under #[45]LEVEL_HACK where the level
simply vanishes by code like #define p4d_t pgd_t. Perhaps there is a lot of historic
baggage, proliferated into arch code and so hard to untangle.
Thx,
-Vineet
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 0/3] elide generated code for folded p4d/pud
2019-10-10 20:05 ` Vineet Gupta
@ 2019-10-11 12:19 ` Kirill A. Shutemov
2019-10-11 22:38 ` [RFC] asm-generic/tlb: stub out pmd_free_tlb() if __PAGETABLE_PMD_FOLDED Vineet Gupta
0 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2019-10-11 12:19 UTC (permalink / raw)
To: linux-snps-arc
On Thu, Oct 10, 2019 at 01:05:56PM -0700, Vineet Gupta wrote:
>
> Hi Kirill,
>
> On 10/10/19 1:56 AM, Kirill A. Shutemov wrote:
> > On Wed, Oct 09, 2019 at 10:26:55PM +0000, Vineet Gupta wrote:
> >>
> >> This series elides extraneous generated code for folded p4d/pud.
> >> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from ARC port.
> >> The code savings are not a whole lot, but still worthwhile IMHO.
> >
> > Agreed.
>
> Thx.
>
> So given we are folding pmd too, it seemed we could do the following as well.
>
> +#ifndef __PAGETABLE_PMD_FOLDED
> void pmd_clear_bad(pmd_t *);
> +#else
> +#define pmd_clear_bad(pmd) do { } while (0)
> +#endif
>
> +#ifndef __PAGETABLE_PMD_FOLDED
> void pmd_clear_bad(pmd_t *pmd)
> {
> pmd_ERROR(*pmd);
> pmd_clear(pmd);
> }
> +#endif
>
> I stared at generated code and it seems a bit wrong.
> free_pgd_range() -> pgd_none_or_clear_bad() is no longer checking for unmapped pgd
> entries as pgd_none/pgd_bad are all stubs returning 0.
>
> This whole pmd folding is a bit confusing considering I only revisit it every few
> years :-) Abstraction-wise, __PAGETABLE_PMD_FOLDED leaves only pgd and pte, but even
> in this regime a bunch of pmd macros are still valid:
>
> pmd_set(pmdp, ptep) {
> *pmdp.pud.p4d.pgd = (unsigned long)ptep
> }
>
> Is there a better way to make a mental model of this code folding?
I don't have any. PMD folding predates me and I have never looked at it
closely. A quick look brings more confusion than clarity. :P
> In an ideal world pmd folded would have meant pmd_* routines just vanish - poof.
> So in that sense I like your implementation under #[45]LEVEL_HACK where the level
> simply vanishes by code like #define p4d_t pgd_t. Perhaps there is a lot of historic
> baggage, proliferated into arch code and so hard to untangle.
In an ideal world all these pgd/p4d/pud/pmd/pte should die and we would have
something more flexible to begin with.
I played with this before:
https://lore.kernel.org/lkml/20180424154355.mfjgkf47kdp2by4e@black.fi.intel.com/
--
Kirill A. Shutemov
^ permalink raw reply [flat|nested] 9+ messages in thread
* [RFC] asm-generic/tlb: stub out pmd_free_tlb() if __PAGETABLE_PMD_FOLDED
2019-10-11 12:19 ` Kirill A. Shutemov
@ 2019-10-11 22:38 ` Vineet Gupta
0 siblings, 0 replies; 9+ messages in thread
From: Vineet Gupta @ 2019-10-11 22:38 UTC (permalink / raw)
To: linux-snps-arc
This is in line with similar patches for nopud [1] and nop4d [2] cases.
However I'm not really sure I understand clearly how the nopmd code is
supposed to work (for a 2 tier paging system) - hence the RFC.
Consider free_pmd_range() simplified/annotated below
free_pmd_range
    ...
    pmd = pmd_offset(pud, addr);
    do {
        next = pmd_addr_end(addr, end);
        if (pmd_none_or_clear_bad(pmd)) => *pmd_bad()/pmd_clear_bad() [a]*
            continue;
        free_pte_range(tlb, pmd, addr);
    } while (pmd++, addr = next, addr != end);
    ...
    *pmd_free_tlb(tlb, pmd, start); => [b]*
For the ARC/nopmd case, [a] is actually checking the pgd entry, and consequently
pmd_clear_bad() can't be stubbed out for the PMD_FOLDED case. However, it seems
case [b] can be stubbed out (hence this patch) along the same lines as [1] and [2].
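The reasoning about [a] can be sketched in userspace C. This is a simplified model, not the kernel implementation: the p4d and pud levels are left out, and all model_* names are hypothetical. It assumes, as described earlier in the thread, that the folded pgd_none() is a stub returning 0, so the pmd-level test is the only one that inspects the entry.

```c
#include <assert.h>

/* Simplified folded layout: pmd wraps pgd directly (p4d/pud omitted). */
typedef struct { unsigned long pgd; } pgd_t;
typedef struct { pgd_t pgd; } pmd_t;

/* Folded pgd check: a stub that always reports the entry as present. */
static int model_pgd_none(pgd_t pgd)
{
	(void)pgd;
	return 0;
}

/* The pmd check looks at the real value, which is the pgd entry. */
static int model_pmd_none(pmd_t pmd)
{
	return pmd.pgd.pgd == 0;
}

/* Walks nr top-level slots and counts the ones that are mapped. */
static int model_count_mapped(pmd_t *table, int nr)
{
	int mapped = 0;

	assert(nr >= 0);
	for (int i = 0; i < nr; i++) {
		pgd_t *pgd = &table[i].pgd;
		pmd_t *pmd;

		if (model_pgd_none(*pgd))	/* never true when folded */
			continue;
		pmd = (pmd_t *)pgd;		/* same slot, pmd view */
		if (model_pmd_none(*pmd))	/* the check marked [a] */
			continue;
		mapped++;
	}
	return mapped;
}
```

If the pmd-level check were stubbed out as well, nothing in this walk would skip empty slots, which is why only the free path [b] is a candidate for stubbing.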
| bloat-o-meter2 vmlinux-E-elide-p?d_clear_bad vmlinux-F-elide-pmd_free_tlb
| add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-112 (-112)
| function old new delta
| free_pgd_range 422 310 -112
| Total: Before=4137002, After=4136890, chg -1.000000%
[1] http://lists.infradead.org/pipermail/linux-snps-arc/2019-October/006266.html
[2] http://lists.infradead.org/pipermail/linux-snps-arc/2019-October/006265.html
Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
include/asm-generic/tlb.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index f3dad87f4ecc..a1edad7d4170 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -574,6 +574,7 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
} while (0)
#endif
+#ifndef __PAGETABLE_PMD_FOLDED
#ifndef pmd_free_tlb
#define pmd_free_tlb(tlb, pmdp, address) \
do { \
@@ -583,6 +584,9 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
__pmd_free_tlb(tlb, pmdp, address); \
} while (0)
#endif
+#else
+#define pmd_free_tlb(tlb, pmdp, address) do { } while (0)
+#endif
#ifndef __PAGETABLE_PUD_FOLDED
#ifndef pud_free_tlb
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread