linux-kernel.vger.kernel.org archive mirror
* [RFC][Question] the value of cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather
@ 2020-03-28  4:30 Zhenyu Ye
  2020-03-30 12:16 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Zhenyu Ye @ 2020-03-28  4:30 UTC
  To: Peter Zijlstra, npiggin, will.deacon, mingo, torvalds,
	schwidefsky, akpm, luto, bp, Marc Zyngier
  Cc: linux-arm-kernel, linux-kernel, linux-arch, arm, xiexiangyou, yezhenyu2

Hi all,

Commit a6d60245 ("Track which levels of the page tables have been
cleared") added cleared_(ptes|pmds|puds|p4ds) to struct mmu_gather, and
these fields are set in several places. For example:

In include/asm-generic/tlb.h, pte_free_tlb() sets tlb->cleared_pmds:
---8<---
#ifndef pte_free_tlb
#define pte_free_tlb(tlb, ptep, address)			\
	do {							\
		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
		tlb->freed_tables = 1;				\
		tlb->cleared_pmds = 1;				\
		__pte_free_tlb(tlb, ptep, address);		\
	} while (0)
#endif
---8<---


However, in arch/s390/include/asm/tlb.h, pte_free_tlb() sets tlb->cleared_ptes:
---8<---
static inline void pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
                                unsigned long address)
{
	__tlb_adjust_range(tlb, address, PAGE_SIZE);
	tlb->mm->context.flush_mm = 1;
	tlb->freed_tables = 1;
	tlb->cleared_ptes = 1;
	/*
	 * page_table_free_rcu takes care of the allocation bit masks
	 * of the 2K table fragments in the 4K page table page,
	 * then calls tlb_remove_table.
	 */
	page_table_free_rcu(tlb, (unsigned long *) pte, address);
}
---8<---


In my view, cleared_(ptes|pmds|puds) and (pte|pmd|pud)_free_tlb()
should correspond one-to-one, so pte_free_tlb() should set cleared_ptes,
which can then be used when needed.

I'm very confused about this. Which one is wrong? Or am I
misunderstanding something?


Thanks,
Zhenyu




* Re: [RFC][Question] the value of cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather
  2020-03-28  4:30 [RFC][Question] the value of cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather Zhenyu Ye
@ 2020-03-30 12:16 ` Peter Zijlstra
  2020-03-31  8:15   ` Zhenyu Ye
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Peter Zijlstra @ 2020-03-30 12:16 UTC
  To: Zhenyu Ye
  Cc: npiggin, will.deacon, mingo, torvalds, schwidefsky, akpm, luto,
	bp, Marc Zyngier, linux-arm-kernel, linux-kernel, linux-arch,
	arm, xiexiangyou

On Sat, Mar 28, 2020 at 12:30:50PM +0800, Zhenyu Ye wrote:
> Hi all,
> 
> commit a6d60245 "Track which levels of the page tables have been cleared"
> added cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather, and the values
> of them are set in some places. For example:
> 
> In include/asm-generic/tlb.h, pte_free_tlb() set the tlb->cleared_pmds:
> ---8<---
> #ifndef pte_free_tlb
> #define pte_free_tlb(tlb, ptep, address)			\
> 	do {							\
> 		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
> 		tlb->freed_tables = 1;				\
> 		tlb->cleared_pmds = 1;				\
> 		__pte_free_tlb(tlb, ptep, address);		\
> 	} while (0)
> #endif
> ---8<---
> 
> 
> However, in arch/s390/include/asm/tlb.h, pte_free_tlb() set the tlb->cleared_ptes:
> ---8<---
> static inline void pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
>                                 unsigned long address)
> {
> 	__tlb_adjust_range(tlb, address, PAGE_SIZE);
> 	tlb->mm->context.flush_mm = 1;
> 	tlb->freed_tables = 1;
> 	tlb->cleared_ptes = 1;
> 	/*
> 	 * page_table_free_rcu takes care of the allocation bit masks
> 	 * of the 2K table fragments in the 4K page table page,
> 	 * then calls tlb_remove_table.
> 	 */
> 	page_table_free_rcu(tlb, (unsigned long *) pte, address);
> }
> ---8<---
> 
> 
> In my view, the cleared_(ptes|pmds|puds) and (pte|pmd|pud)_free_tlb
> correspond one-to-one.  So we should set cleared_ptes in pte_free_tlb(),
> then use it when needed.

So pte_free_tlb() frees a table of PTE entries, which means clearing a
PMD-level entry; see also free_pte_range(). So the generic code makes
sense to me. The PTE-level invalidations will already have happened in
tlb_remove_tlb_entry().
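
To illustrate the ordering described here, a minimal userspace sketch
follows. It is not kernel code: the field names mirror struct
mmu_gather, but struct gather_model and the model_*() helpers are
hypothetical names made up for this example, and range tracking and
the actual freeing are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant bits of struct mmu_gather. */
struct gather_model {
	bool freed_tables;
	bool cleared_ptes;
	bool cleared_pmds;
};

/*
 * Models the effect of tlb_remove_tlb_entry(): one PTE was cleared,
 * so a PTE-level invalidation is recorded.
 */
static void model_remove_tlb_entry(struct gather_model *tlb)
{
	assert(tlb != NULL);
	tlb->cleared_ptes = true;
}

/*
 * Models the generic pte_free_tlb(): the whole PTE table is freed,
 * which means the PMD entry that pointed to it was cleared.
 */
static void model_pte_free_tlb(struct gather_model *tlb)
{
	assert(tlb != NULL);
	tlb->freed_tables = true;
	tlb->cleared_pmds = true;
}
```

Calling model_remove_tlb_entry() first and model_pte_free_tlb()
afterwards leaves both cleared_ptes and cleared_pmds set, each
recorded by the step that touched that level.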

> I'm very confused about this. Which is wrong? Or is there something
> I understand wrong?

I agree the s390 case is puzzling. Martin, does s390 need a PTE-level
invalidate when removing a PTE table, or was this a mistake?


* Re: [RFC][Question] the value of cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather
  2020-03-30 12:16 ` Peter Zijlstra
@ 2020-03-31  8:15   ` Zhenyu Ye
  2020-04-08  8:51   ` Christian Borntraeger
  2020-04-14  7:05   ` Christian Borntraeger
  2 siblings, 0 replies; 6+ messages in thread
From: Zhenyu Ye @ 2020-03-31  8:15 UTC
  To: Peter Zijlstra, schwidefsky
  Cc: npiggin, will.deacon, mingo, torvalds, akpm, luto, bp,
	Marc Zyngier, linux-arm-kernel, linux-kernel, linux-arch, arm,
	xiexiangyou

Hi Peter,

On 2020/3/30 20:16, Peter Zijlstra wrote:
> On Sat, Mar 28, 2020 at 12:30:50PM +0800, Zhenyu Ye wrote:
>> Hi all,
>>
>> commit a6d60245 "Track which levels of the page tables have been cleared"
>> added cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather, and the values
>> of them are set in some places. For example:
>>
>> In include/asm-generic/tlb.h, pte_free_tlb() set the tlb->cleared_pmds:
>> ---8<---
>> #ifndef pte_free_tlb
>> #define pte_free_tlb(tlb, ptep, address)			\
>> 	do {							\
>> 		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
>> 		tlb->freed_tables = 1;				\
>> 		tlb->cleared_pmds = 1;				\
>> 		__pte_free_tlb(tlb, ptep, address);		\
>> 	} while (0)
>> #endif
>> ---8<---
>>
>>
>> However, in arch/s390/include/asm/tlb.h, pte_free_tlb() set the tlb->cleared_ptes:
>> ---8<---
>> static inline void pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
>>                                 unsigned long address)
>> {
>> 	__tlb_adjust_range(tlb, address, PAGE_SIZE);
>> 	tlb->mm->context.flush_mm = 1;
>> 	tlb->freed_tables = 1;
>> 	tlb->cleared_ptes = 1;
>> 	/*
>> 	 * page_table_free_rcu takes care of the allocation bit masks
>> 	 * of the 2K table fragments in the 4K page table page,
>> 	 * then calls tlb_remove_table.
>> 	 */
>> 	page_table_free_rcu(tlb, (unsigned long *) pte, address);
>> }
>> ---8<---
>>
>>
>> In my view, the cleared_(ptes|pmds|puds) and (pte|pmd|pud)_free_tlb
>> correspond one-to-one.  So we should set cleared_ptes in pte_free_tlb(),
>> then use it when needed.
> 
> So pte_free_tlb() clears a table of PTE entries, or a PMD level entity,
> also see free_pte_range(). So the generic code makes sense to me. The
> PTE level invalidations will have happened on tlb_remove_tlb_entry().
> 

Thanks for your explanation. I understand now.

>> I'm very confused about this. Which is wrong? Or is there something
>> I understand wrong?
> 
> I agree the s390 case is puzzling, Martin does s390 need a PTE level
> invalidate for removing a PTE table or was this a mistake?
> 

Then we should wait for Martin's reply.  Although s390 has never used
this value, I think we should still correct it if it is a mistake.

Thanks,
Zhenyu




* Re: [RFC][Question] the value of cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather
  2020-03-30 12:16 ` Peter Zijlstra
  2020-03-31  8:15   ` Zhenyu Ye
@ 2020-04-08  8:51   ` Christian Borntraeger
  2020-04-20 16:20     ` Gerald Schaefer
  2020-04-14  7:05   ` Christian Borntraeger
  2 siblings, 1 reply; 6+ messages in thread
From: Christian Borntraeger @ 2020-04-08  8:51 UTC
  To: Peter Zijlstra, Zhenyu Ye
  Cc: npiggin, will.deacon, mingo, torvalds, Vasily Gorbik, akpm, luto,
	bp, Marc Zyngier, linux-arm-kernel, linux-kernel, linux-arch,
	arm, xiexiangyou, Gerald Schaefer, linux-s390

Sorry, I just saw this now.

On 30.03.20 14:16, Peter Zijlstra wrote:
> On Sat, Mar 28, 2020 at 12:30:50PM +0800, Zhenyu Ye wrote:
>> Hi all,
>>
>> commit a6d60245 "Track which levels of the page tables have been cleared"
>> added cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather, and the values
>> of them are set in some places. For example:
>>
>> In include/asm-generic/tlb.h, pte_free_tlb() set the tlb->cleared_pmds:
>> ---8<---
>> #ifndef pte_free_tlb
>> #define pte_free_tlb(tlb, ptep, address)			\
>> 	do {							\
>> 		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
>> 		tlb->freed_tables = 1;				\
>> 		tlb->cleared_pmds = 1;				\
>> 		__pte_free_tlb(tlb, ptep, address);		\
>> 	} while (0)
>> #endif
>> ---8<---
>>
>>
>> However, in arch/s390/include/asm/tlb.h, pte_free_tlb() set the tlb->cleared_ptes:
>> ---8<---
>> static inline void pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
>>                                 unsigned long address)
>> {
>> 	__tlb_adjust_range(tlb, address, PAGE_SIZE);
>> 	tlb->mm->context.flush_mm = 1;
>> 	tlb->freed_tables = 1;
>> 	tlb->cleared_ptes = 1;
>> 	/*
>> 	 * page_table_free_rcu takes care of the allocation bit masks
>> 	 * of the 2K table fragments in the 4K page table page,
>> 	 * then calls tlb_remove_table.
>> 	 */
>> 	page_table_free_rcu(tlb, (unsigned long *) pte, address);
>> }
>> ---8<---

Adding Gerald and Vasily. Gerald, can you have a look?

>>
>>
>> In my view, the cleared_(ptes|pmds|puds) and (pte|pmd|pud)_free_tlb
>> correspond one-to-one.  So we should set cleared_ptes in pte_free_tlb(),
>> then use it when needed.
> 
> So pte_free_tlb() clears a table of PTE entries, or a PMD level entity,
> also see free_pte_range(). So the generic code makes sense to me. The
> PTE level invalidations will have happened on tlb_remove_tlb_entry().
> 
>> I'm very confused about this. Which is wrong? Or is there something
>> I understand wrong?
> 
> I agree the s390 case is puzzling, Martin does s390 need a PTE level
> invalidate for removing a PTE table or was this a mistake?
> 



* Re: [RFC][Question] the value of cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather
  2020-03-30 12:16 ` Peter Zijlstra
  2020-03-31  8:15   ` Zhenyu Ye
  2020-04-08  8:51   ` Christian Borntraeger
@ 2020-04-14  7:05   ` Christian Borntraeger
  2 siblings, 0 replies; 6+ messages in thread
From: Christian Borntraeger @ 2020-04-14  7:05 UTC
  To: Peter Zijlstra, Zhenyu Ye, Gerald Schaefer
  Cc: npiggin, will.deacon, mingo, torvalds, schwidefsky, akpm, luto,
	bp, Marc Zyngier, linux-arm-kernel, linux-kernel, linux-arch,
	arm, xiexiangyou

Gerald,

can you have a look?

On 30.03.20 14:16, Peter Zijlstra wrote:
> [...]



* Re: [RFC][Question] the value of cleared_(ptes|pmds|puds|p4ds) in struct mmu_gather
  2020-04-08  8:51   ` Christian Borntraeger
@ 2020-04-20 16:20     ` Gerald Schaefer
  0 siblings, 0 replies; 6+ messages in thread
From: Gerald Schaefer @ 2020-04-20 16:20 UTC
  To: Christian Borntraeger
  Cc: Peter Zijlstra, Zhenyu Ye, npiggin, will.deacon, mingo, torvalds,
	Vasily Gorbik, akpm, luto, bp, Marc Zyngier, linux-arm-kernel,
	linux-kernel, linux-arch, arm, xiexiangyou, linux-s390,
	Gerald Schaefer

On Wed, 8 Apr 2020 10:51:59 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

[...]
> 
> adding Gerald and Vasily. Gerald can you have a look?
> 
> >>
> >>
> >> In my view, the cleared_(ptes|pmds|puds) and (pte|pmd|pud)_free_tlb
> >> correspond one-to-one.  So we should set cleared_ptes in pte_free_tlb(),
> >> then use it when needed.
> > 
> > So pte_free_tlb() clears a table of PTE entries, or a PMD level entity,
> > also see free_pte_range(). So the generic code makes sense to me. The
> > PTE level invalidations will have happened on tlb_remove_tlb_entry().
> > 
> >> I'm very confused about this. Which is wrong? Or is there something
> >> I understand wrong?
> > 
> > I agree the s390 case is puzzling, Martin does s390 need a PTE level
> > invalidate for removing a PTE table or was this a mistake?
> > 

Peter is right: the PTE-level invalidations will already have happened
earlier. On s390 this is not exactly in tlb_remove_tlb_entry() itself,
since __tlb_remove_tlb_entry() is not defined, but rather directly in
the preceding ptep_get_and_clear(). I think this is also the reason why
we cannot easily optimize for larger granularity.

Anyway, pte_free_tlb() will then later only take care of freeing
the page table page; no further PTE-level clearing/invalidation is
needed. I see no reason why s390 should behave differently from
the generic code with respect to the cleared_pxds setting in
pxd_free_tlb().

So I guess this was an "off-by-one" mistake in commit 9de7d833e3708
("s390/tlb: Convert to generic mmu_gather"), since the other
pxd_free_tlb() functions also show similar puzzling behavior.
Not consistently off-by-one though, as pmd_free_tlb() seems
to behave correctly, setting tlb->cleared_puds = 1, similar to
generic code.
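
The "one level up" rule that the generic code follows can be written
down as a small sketch. This is an illustration only: enum pt_level
and cleared_level_for_freed_table() are hypothetical names, not kernel
interfaces. The PTE and PMD cases are the ones discussed in this
thread; the PUD case is extrapolated from the same pattern and should
be checked against include/asm-generic/tlb.h.

```c
#include <assert.h>

/* Page table levels, lowest first (hypothetical enum for this sketch). */
enum pt_level { LVL_PTE, LVL_PMD, LVL_PUD, LVL_P4D };

/*
 * Freeing a table of level N clears one entry in the table of level
 * N + 1, so that is the level whose cleared_* flag should be set:
 *   pte_free_tlb() -> cleared_pmds
 *   pmd_free_tlb() -> cleared_puds
 *   pud_free_tlb() -> cleared_p4ds  (assumed, same pattern)
 */
static enum pt_level cleared_level_for_freed_table(enum pt_level table)
{
	/* There is no cleared_* flag above the P4D level in this model. */
	assert(table < LVL_P4D);
	return (enum pt_level)(table + 1);
}
```

Under this rule, setting cleared_ptes in pte_free_tlb() is off by one
level, while setting cleared_puds in pmd_free_tlb() matches.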

That was a very nice catch, Zhenyu, thanks for reporting!
We are not yet making use of the tlb->cleared_pxds for s390, but
we would certainly have stumbled over this if we ever tried.
Will send a patch to make s390 behave like generic code here.

Regards,
Gerald


