linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 5/8] x86/clear_page: add clear_page_uncached()
       [not found] <20201014083300.19077-1-ankur.a.arora@oracle.com>
@ 2020-10-14  8:32 ` Ankur Arora
  2020-10-14 15:45   ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Ankur Arora @ 2020-10-14  8:32 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: kirill, mhocko, boris.ostrovsky, konrad.wilk, Ankur Arora,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Arnd Bergmann, Andrew Morton, Ira Weiny,
	linux-arch

Define clear_page_uncached() as an alternative_call() to clear_page_nt()
if the CPU sets X86_FEATURE_NT_GOOD and fallback to clear_page() if it
doesn't.

Similarly define clear_page_uncached_flush() which provides an SFENCE
if the CPU sets X86_FEATURE_NT_GOOD.

Also, add the glue interface clear_user_highpage_uncached().

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/x86/include/asm/page.h    |  6 ++++++
 arch/x86/include/asm/page_32.h |  9 +++++++++
 arch/x86/include/asm/page_64.h | 14 ++++++++++++++
 include/asm-generic/page.h     |  3 +++
 include/linux/highmem.h        | 10 ++++++++++
 5 files changed, 42 insertions(+)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 7555b48803a8..ca0aa379ac7f 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -28,6 +28,12 @@ static inline void clear_user_page(void *page, unsigned long vaddr,
 	clear_page(page);
 }
 
+static inline void clear_user_page_uncached(void *page, unsigned long vaddr,
+					    struct page *pg)
+{
+	clear_page_uncached(page);
+}
+
 static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 				  struct page *topage)
 {
diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h
index 94dbd51df58f..7a03a274a9a4 100644
--- a/arch/x86/include/asm/page_32.h
+++ b/arch/x86/include/asm/page_32.h
@@ -39,6 +39,15 @@ static inline void clear_page(void *page)
 	memset(page, 0, PAGE_SIZE);
 }
 
+static inline void clear_page_uncached(void *page)
+{
+	clear_page(page);
+}
+
+static inline void clear_page_uncached_flush(void)
+{
+}
+
 static inline void copy_page(void *to, void *from)
 {
 	memcpy(to, from, PAGE_SIZE);
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index bde3c2785ec4..5897075e77dd 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -55,6 +55,20 @@ static inline void clear_page(void *page)
 			   : "cc", "memory", "rax", "rcx");
 }
 
+static inline void clear_page_uncached(void *page)
+{
+	alternative_call(clear_page,
+			 clear_page_nt, X86_FEATURE_NT_GOOD,
+			 "=D" (page),
+			 "0" (page)
+			 : "cc", "memory", "rax", "rcx");
+}
+
+static inline void clear_page_uncached_flush(void)
+{
+	alternative("", "sfence", X86_FEATURE_NT_GOOD);
+}
+
 void copy_page(void *to, void *from);
 
 #endif	/* !__ASSEMBLY__ */
diff --git a/include/asm-generic/page.h b/include/asm-generic/page.h
index fe801f01625e..60235a0cf24a 100644
--- a/include/asm-generic/page.h
+++ b/include/asm-generic/page.h
@@ -26,6 +26,9 @@
 #ifndef __ASSEMBLY__
 
 #define clear_page(page)	memset((page), 0, PAGE_SIZE)
+#define clear_page_uncached(page)	clear_page(page)
+#define clear_page_uncached_flush()	do { } while (0)
+
 #define copy_page(to,from)	memcpy((to), (from), PAGE_SIZE)
 
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 14e6202ce47f..f842593e2474 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -232,6 +232,16 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 }
 #endif
 
+#ifndef clear_user_highpage_uncached
+static inline void clear_user_highpage_uncached(struct page *page, unsigned long vaddr)
+{
+	void *addr = kmap_atomic(page);
+
+	clear_user_page_uncached(addr, vaddr, page);
+	kunmap_atomic(addr);
+}
+#endif
+
 #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 /**
  * __alloc_zeroed_user_highpage - Allocate a zeroed HIGHMEM page for a VMA with caller-specified movable GFP flags
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-14  8:32 ` [PATCH 5/8] x86/clear_page: add clear_page_uncached() Ankur Arora
@ 2020-10-14 15:45   ` Andy Lutomirski
  2020-10-14 19:58     ` Borislav Petkov
  2020-10-14 20:54     ` Ankur Arora
  0 siblings, 2 replies; 13+ messages in thread
From: Andy Lutomirski @ 2020-10-14 15:45 UTC (permalink / raw)
  To: Ankur Arora
  Cc: LKML, Linux-MM, Kirill A. Shutemov, Michal Hocko,
	Boris Ostrovsky, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin,
	Arnd Bergmann, Andrew Morton, Ira Weiny, linux-arch

On Wed, Oct 14, 2020 at 1:33 AM Ankur Arora <ankur.a.arora@oracle.com> wrote:
>
> Define clear_page_uncached() as an alternative_call() to clear_page_nt()
> if the CPU sets X86_FEATURE_NT_GOOD and fallback to clear_page() if it
> doesn't.
>
> Similarly define clear_page_uncached_flush() which provides an SFENCE
> if the CPU sets X86_FEATURE_NT_GOOD.

As long as you keep "NT" or "MOVNTI" in the names and keep functions
in arch/x86, I think it's reasonable to expect that callers understand
that MOVNTI has bizarre memory ordering rules.  But once you give
something a generic name like "clear_page_uncached" and stick it in
generic code, I think the semantics should be more obvious.

How about:

clear_page_uncached_unordered() or clear_page_uncached_incoherent()

and

flush_after_clear_page_uncached()

After all, a naive reader might expect "uncached" to imply "caches are
off and this is coherent with everything".  And the results of getting
this wrong will be subtle and possibly hard-to-reproduce corruption.

--Andy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-14 15:45   ` Andy Lutomirski
@ 2020-10-14 19:58     ` Borislav Petkov
  2020-10-14 21:07       ` Andy Lutomirski
  2020-10-14 20:54     ` Ankur Arora
  1 sibling, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2020-10-14 19:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ankur Arora, LKML, Linux-MM, Kirill A. Shutemov, Michal Hocko,
	Boris Ostrovsky, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, X86 ML, H. Peter Anvin, Arnd Bergmann,
	Andrew Morton, Ira Weiny, linux-arch

On Wed, Oct 14, 2020 at 08:45:37AM -0700, Andy Lutomirski wrote:
> On Wed, Oct 14, 2020 at 1:33 AM Ankur Arora <ankur.a.arora@oracle.com> wrote:
> >
> > Define clear_page_uncached() as an alternative_call() to clear_page_nt()
> > if the CPU sets X86_FEATURE_NT_GOOD and fallback to clear_page() if it
> > doesn't.
> >
> > Similarly define clear_page_uncached_flush() which provides an SFENCE
> > if the CPU sets X86_FEATURE_NT_GOOD.
> 
> As long as you keep "NT" or "MOVNTI" in the names and keep functions
> in arch/x86, I think it's reasonable to expect that callers understand
> that MOVNTI has bizarre memory ordering rules.  But once you give
> something a generic name like "clear_page_uncached" and stick it in
> generic code, I think the semantics should be more obvious.

Why does it have to be a separate call? Why isn't it behind the
clear_page() alternative machinery so that the proper function is
selected at boot? IOW, why does a user of clear_page functionality need
to know at all about an "uncached" variant?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-14 15:45   ` Andy Lutomirski
  2020-10-14 19:58     ` Borislav Petkov
@ 2020-10-14 20:54     ` Ankur Arora
  1 sibling, 0 replies; 13+ messages in thread
From: Ankur Arora @ 2020-10-14 20:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, Linux-MM, Kirill A. Shutemov, Michal Hocko,
	Boris Ostrovsky, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin,
	Arnd Bergmann, Andrew Morton, Ira Weiny, linux-arch

On 2020-10-14 8:45 a.m., Andy Lutomirski wrote:
> On Wed, Oct 14, 2020 at 1:33 AM Ankur Arora <ankur.a.arora@oracle.com> wrote:
>>
>> Define clear_page_uncached() as an alternative_call() to clear_page_nt()
>> if the CPU sets X86_FEATURE_NT_GOOD and fallback to clear_page() if it
>> doesn't.
>>
>> Similarly define clear_page_uncached_flush() which provides an SFENCE
>> if the CPU sets X86_FEATURE_NT_GOOD.
> 
> As long as you keep "NT" or "MOVNTI" in the names and keep functions
> in arch/x86, I think it's reasonable to expect that callers understand
> that MOVNTI has bizarre memory ordering rules.  But once you give
> something a generic name like "clear_page_uncached" and stick it in
> generic code, I think the semantics should be more obvious.
> 
> How about:
> 
> clear_page_uncached_unordered() or clear_page_uncached_incoherent()
> 
> and
> 
> flush_after_clear_page_uncached()
> 
> After all, a naive reader might expect "uncached" to imply "caches are
> off and this is coherent with everything".  And the results of getting
> this wrong will be subtle and possibly hard-to-reproduce corruption.
Yeah, these are a lot more obvious. Thanks. Will fix.

Ankur

>
> --Andy
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-14 19:58     ` Borislav Petkov
@ 2020-10-14 21:07       ` Andy Lutomirski
  2020-10-14 21:12         ` Borislav Petkov
  2020-10-15  3:21         ` Ankur Arora
  0 siblings, 2 replies; 13+ messages in thread
From: Andy Lutomirski @ 2020-10-14 21:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Ankur Arora, LKML, Linux-MM, Kirill A. Shutemov,
	Michal Hocko, Boris Ostrovsky, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Ingo Molnar, X86 ML, H. Peter Anvin,
	Arnd Bergmann, Andrew Morton, Ira Weiny, linux-arch




> On Oct 14, 2020, at 12:58 PM, Borislav Petkov <bp@alien8.de> wrote:
> 
> On Wed, Oct 14, 2020 at 08:45:37AM -0700, Andy Lutomirski wrote:
>>> On Wed, Oct 14, 2020 at 1:33 AM Ankur Arora <ankur.a.arora@oracle.com> wrote:
>>> 
>>> Define clear_page_uncached() as an alternative_call() to clear_page_nt()
>>> if the CPU sets X86_FEATURE_NT_GOOD and fallback to clear_page() if it
>>> doesn't.
>>> 
>>> Similarly define clear_page_uncached_flush() which provides an SFENCE
>>> if the CPU sets X86_FEATURE_NT_GOOD.
>> 
>> As long as you keep "NT" or "MOVNTI" in the names and keep functions
>> in arch/x86, I think it's reasonable to expect that callers understand
>> that MOVNTI has bizarre memory ordering rules.  But once you give
>> something a generic name like "clear_page_uncached" and stick it in
>> generic code, I think the semantics should be more obvious.
> 
> Why does it have to be a separate call? Why isn't it behind the
> clear_page() alternative machinery so that the proper function is
> selected at boot? IOW, why does a user of clear_page functionality need
> to know at all about an "uncached" variant?
> 
> 

I assume it’s for a little optimization of clearing more than one page per SFENCE.

In any event, based on the benchmark data upthread, we only want to do NT clears when they’re rather large, so this shouldn’t be just an alternative. I assume this is because a page or two will fit in cache and, for most uses that allocate zeroed pages, we prefer cache-hot pages.  When clearing 1G, on the other hand, cache-hot is impossible and we prefer the improved bandwidth and less cache trashing of NT clears.

Perhaps SFENCE is so fast that this is a silly optimization, though, and we don’t lose anything measurable by SFENCEing once per page.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-14 21:07       ` Andy Lutomirski
@ 2020-10-14 21:12         ` Borislav Petkov
  2020-10-15  3:37           ` Ankur Arora
  2020-10-15  3:21         ` Ankur Arora
  1 sibling, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2020-10-14 21:12 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, Ankur Arora, LKML, Linux-MM, Kirill A. Shutemov,
	Michal Hocko, Boris Ostrovsky, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Ingo Molnar, X86 ML, H. Peter Anvin,
	Arnd Bergmann, Andrew Morton, Ira Weiny, linux-arch

On Wed, Oct 14, 2020 at 02:07:30PM -0700, Andy Lutomirski wrote:
> I assume it’s for a little optimization of clearing more than one
> page per SFENCE.
>
> In any event, based on the benchmark data upthread, we only want to do
> NT clears when they’re rather large, so this shouldn’t be just an
> alternative. I assume this is because a page or two will fit in cache
> and, for most uses that allocate zeroed pages, we prefer cache-hot
> pages. When clearing 1G, on the other hand, cache-hot is impossible
> and we prefer the improved bandwidth and less cache trashing of NT
> clears.

Yeah, use case makes sense but people won't know what to use. At the
time I was experimenting with this crap, I remember Linus saying that
that selection should be made based on the size of the area cleared, so
users should not have to know the difference.

Which perhaps is the only sane use case I see for this.

> Perhaps SFENCE is so fast that this is a silly optimization, though,
> and we don’t lose anything measurable by SFENCEing once per page.

Yes, I'd like to see real use cases showing improvement from this, not
just microbenchmarks.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-14 21:07       ` Andy Lutomirski
  2020-10-14 21:12         ` Borislav Petkov
@ 2020-10-15  3:21         ` Ankur Arora
  2020-10-15 10:40           ` Borislav Petkov
  1 sibling, 1 reply; 13+ messages in thread
From: Ankur Arora @ 2020-10-15  3:21 UTC (permalink / raw)
  To: Andy Lutomirski, Borislav Petkov
  Cc: Andy Lutomirski, LKML, Linux-MM, Kirill A. Shutemov,
	Michal Hocko, Boris Ostrovsky, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Ingo Molnar, X86 ML, H. Peter Anvin,
	Arnd Bergmann, Andrew Morton, Ira Weiny, linux-arch

On 2020-10-14 2:07 p.m., Andy Lutomirski wrote:
> 
> 
> 
>> On Oct 14, 2020, at 12:58 PM, Borislav Petkov <bp@alien8.de> wrote:
>>
>> On Wed, Oct 14, 2020 at 08:45:37AM -0700, Andy Lutomirski wrote:
>>>> On Wed, Oct 14, 2020 at 1:33 AM Ankur Arora <ankur.a.arora@oracle.com> wrote:
>>>>
>>>> Define clear_page_uncached() as an alternative_call() to clear_page_nt()
>>>> if the CPU sets X86_FEATURE_NT_GOOD and fallback to clear_page() if it
>>>> doesn't.
>>>>
>>>> Similarly define clear_page_uncached_flush() which provides an SFENCE
>>>> if the CPU sets X86_FEATURE_NT_GOOD.
>>>
>>> As long as you keep "NT" or "MOVNTI" in the names and keep functions
>>> in arch/x86, I think it's reasonable to expect that callers understand
>>> that MOVNTI has bizarre memory ordering rules.  But once you give
>>> something a generic name like "clear_page_uncached" and stick it in
>>> generic code, I think the semantics should be more obvious.
>>
>> Why does it have to be a separate call? Why isn't it behind the
>> clear_page() alternative machinery so that the proper function is
>> selected at boot? IOW, why does a user of clear_page functionality need
>> to know at all about an "uncached" variant?
>
> I assume it’s for a little optimization of clearing more than one page
> per SFENCE.
>
> In any event, based on the benchmark data upthread, we only want to do
> NT clears when they’re rather large, so this shouldn’t be just an
> alternative. I assume this is because a page or two will fit in cache
> and, for most uses that allocate zeroed pages, we prefer cache-hot
> pages. When clearing 1G, on the other hand, cache-hot is impossible
> and we prefer the improved bandwidth and less cache trashing of NT
> clears.

Also, if we did extend clear_page() to take the page-size as parameter
we still might not have enough information (ex. a 4K or a 2MB page that
clear_page() sees could be part of a GUP of a much larger extent) to
decide whether to go uncached or not.

> Perhaps SFENCE is so fast that this is a silly optimization, though,
> and we don’t lose anything measurable by SFENCEing once per page.
Alas, no. I tried that and dropped about 15% performance on Rome.

Thanks
Ankur

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-14 21:12         ` Borislav Petkov
@ 2020-10-15  3:37           ` Ankur Arora
  2020-10-15 10:35             ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: Ankur Arora @ 2020-10-15  3:37 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski
  Cc: Andy Lutomirski, LKML, Linux-MM, Kirill A. Shutemov,
	Michal Hocko, Boris Ostrovsky, Konrad Rzeszutek Wilk,
	Thomas Gleixner, Ingo Molnar, X86 ML, H. Peter Anvin,
	Arnd Bergmann, Andrew Morton, Ira Weiny, linux-arch

On 2020-10-14 2:12 p.m., Borislav Petkov wrote:
> On Wed, Oct 14, 2020 at 02:07:30PM -0700, Andy Lutomirski wrote:
>> I assume it’s for a little optimization of clearing more than one
>> page per SFENCE.
>>
>> In any event, based on the benchmark data upthread, we only want to do
>> NT clears when they’re rather large, so this shouldn’t be just an
>> alternative. I assume this is because a page or two will fit in cache
>> and, for most uses that allocate zeroed pages, we prefer cache-hot
>> pages. When clearing 1G, on the other hand, cache-hot is impossible
>> and we prefer the improved bandwidth and less cache trashing of NT
>> clears.
> 
> Yeah, use case makes sense but people won't know what to use. At the
> time I was experimenting with this crap, I remember Linus saying that
> that selection should be made based on the size of the area cleared, so
> users should not have to know the difference.
I don't disagree but I think the selection of cached/uncached route should
be made where we have enough context available to be able to choose to do
this.

This could be for example, done in mm_populate() or gup where if say the
extent is larger than LLC-size, it takes the uncached path.

> 
> Which perhaps is the only sane use case I see for this.
> 
>> Perhaps SFENCE is so fast that this is a silly optimization, though,
>> and we don’t lose anything measurable by SFENCEing once per page.
> 
> Yes, I'd like to see real use cases showing improvement from this, not
> just microbenchmarks.
Sure will add.

Thanks
Ankur

> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-15  3:37           ` Ankur Arora
@ 2020-10-15 10:35             ` Borislav Petkov
  2020-10-15 21:20               ` Ankur Arora
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2020-10-15 10:35 UTC (permalink / raw)
  To: Ankur Arora
  Cc: Andy Lutomirski, Andy Lutomirski, LKML, Linux-MM,
	Kirill A. Shutemov, Michal Hocko, Boris Ostrovsky,
	Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, X86 ML,
	H. Peter Anvin, Arnd Bergmann, Andrew Morton, Ira Weiny,
	linux-arch

On Wed, Oct 14, 2020 at 08:37:44PM -0700, Ankur Arora wrote:
> I don't disagree but I think the selection of cached/uncached route should
> be made where we have enough context available to be able to choose to do
> this.
>
> This could be for example, done in mm_populate() or gup where if say the
> extent is larger than LLC-size, it takes the uncached path.

Are there examples where we don't know the size?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-15  3:21         ` Ankur Arora
@ 2020-10-15 10:40           ` Borislav Petkov
  2020-10-15 21:40             ` Ankur Arora
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2020-10-15 10:40 UTC (permalink / raw)
  To: Ankur Arora
  Cc: Andy Lutomirski, Andy Lutomirski, LKML, Linux-MM,
	Kirill A. Shutemov, Michal Hocko, Boris Ostrovsky,
	Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, X86 ML,
	H. Peter Anvin, Arnd Bergmann, Andrew Morton, Ira Weiny,
	linux-arch

On Wed, Oct 14, 2020 at 08:21:57PM -0700, Ankur Arora wrote:
> Also, if we did extend clear_page() to take the page-size as parameter
> we still might not have enough information (ex. a 4K or a 2MB page that
> clear_page() sees could be part of a GUP of a much larger extent) to
> decide whether to go uncached or not.

clear_page* assumes 4K. All of the lowlevel asm variants do. So adding
the size there won't bring you a whole lot.

So you'd need to devise this whole thing differently. Perhaps have a
clear_pages() helper which decides based on size what to do: uncached
clearing or the clear_page() as is now in a loop.

Looking at the callsites would give you a better idea I'd say.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-15 10:35             ` Borislav Petkov
@ 2020-10-15 21:20               ` Ankur Arora
  2020-10-16 18:21                 ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: Ankur Arora @ 2020-10-15 21:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Andy Lutomirski, LKML, Linux-MM,
	Kirill A. Shutemov, Michal Hocko, Boris Ostrovsky,
	Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, X86 ML,
	H. Peter Anvin, Arnd Bergmann, Andrew Morton, Ira Weiny,
	linux-arch

On 2020-10-15 3:35 a.m., Borislav Petkov wrote:
> On Wed, Oct 14, 2020 at 08:37:44PM -0700, Ankur Arora wrote:
>> I don't disagree but I think the selection of cached/uncached route should
>> be made where we have enough context available to be able to choose to do
>> this.
>>
>> This could be for example, done in mm_populate() or gup where if say the
>> extent is larger than LLC-size, it takes the uncached path.
> 
> Are there examples where we don't know the size?

The case I was thinking of was that clear_huge_page() or faultin_page() would
know the size to a page unit, while the higher level function would know the
whole extent and could optimize differently based on that.

Thanks
Ankur

> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-15 10:40           ` Borislav Petkov
@ 2020-10-15 21:40             ` Ankur Arora
  0 siblings, 0 replies; 13+ messages in thread
From: Ankur Arora @ 2020-10-15 21:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Andy Lutomirski, LKML, Linux-MM,
	Kirill A. Shutemov, Michal Hocko, Boris Ostrovsky,
	Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, X86 ML,
	H. Peter Anvin, Arnd Bergmann, Andrew Morton, Ira Weiny,
	linux-arch

On 2020-10-15 3:40 a.m., Borislav Petkov wrote:
> On Wed, Oct 14, 2020 at 08:21:57PM -0700, Ankur Arora wrote:
>> Also, if we did extend clear_page() to take the page-size as parameter
>> we still might not have enough information (ex. a 4K or a 2MB page that
>> clear_page() sees could be part of a GUP of a much larger extent) to
>> decide whether to go uncached or not.
> 
> clear_page* assumes 4K. All of the lowlevel asm variants do. So adding
> the size there won't bring you a whole lot.
> 
> So you'd need to devise this whole thing differently. Perhaps have a
> clear_pages() helper which decides based on size what to do: uncached
> clearing or the clear_page() as is now in a loop.

I think that'll work well for GB pages, where the clear_pages() helper
has enough information to make a decision.

But, unless I'm missing something, I'm not sure how that would work for
say, a 1GB mm_populate() using 4K pages. The clear_page() (or clear_pages())
in that case would only see the 4K size.

But let me think about this more (and look at the callsites as you suggest.)

> 
> Looking at the callsites would give you a better idea I'd say.
Thanks, yeah that's a good idea. Let me go do that.

Ankur

> 
> Thx.
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
  2020-10-15 21:20               ` Ankur Arora
@ 2020-10-16 18:21                 ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2020-10-16 18:21 UTC (permalink / raw)
  To: Ankur Arora
  Cc: Andy Lutomirski, Andy Lutomirski, LKML, Linux-MM,
	Kirill A. Shutemov, Michal Hocko, Boris Ostrovsky,
	Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar, X86 ML,
	H. Peter Anvin, Arnd Bergmann, Andrew Morton, Ira Weiny,
	linux-arch

On Thu, Oct 15, 2020 at 02:20:36PM -0700, Ankur Arora wrote:
> The case I was thinking of was that clear_huge_page()

That loop in clear_gigantic_page() there could be optimized not to
iterate over the pages but do a NTA moves in one go, provided they're
contiguous.

> or faultin_page() would

faultin_page() goes into the bowels of mm fault handling, you'd have to
be more precise what exactly you mean with that one.

> know the size to a page unit, while the higher level function would know the
> whole extent and could optimize differently based on that.

Just don't forget that this "optimization" of yours comes at the price
of added code complexity and you're putting the onus on the people to
know which function to call. So it is not for free and needs to be
carefully weighed.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-10-16 18:21 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20201014083300.19077-1-ankur.a.arora@oracle.com>
2020-10-14  8:32 ` [PATCH 5/8] x86/clear_page: add clear_page_uncached() Ankur Arora
2020-10-14 15:45   ` Andy Lutomirski
2020-10-14 19:58     ` Borislav Petkov
2020-10-14 21:07       ` Andy Lutomirski
2020-10-14 21:12         ` Borislav Petkov
2020-10-15  3:37           ` Ankur Arora
2020-10-15 10:35             ` Borislav Petkov
2020-10-15 21:20               ` Ankur Arora
2020-10-16 18:21                 ` Borislav Petkov
2020-10-15  3:21         ` Ankur Arora
2020-10-15 10:40           ` Borislav Petkov
2020-10-15 21:40             ` Ankur Arora
2020-10-14 20:54     ` Ankur Arora

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).