* Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86
From: Borislav Petkov @ 2021-04-13 11:01 UTC
To: Kemeng Shi; +Cc: tglx, mingo, x86, hpa, linux-kernel, linux-nvdimm

+ linux-nvdimm

Original mail at https://lkml.kernel.org/r/3f28adee-8214-fa8e-b368-eaf8b193469e@huawei.com

On Tue, Apr 13, 2021 at 02:25:58PM +0800, Kemeng Shi wrote:
> I'm using AEP with the dax_kmem driver, and AEP is exported as a NUMA node in

What is AEP?

> my system. I will move cold pages from the DRAM node to the AEP node with
> the move_pages system call. With the old "rep movsq", it costs 2030ms to move
> 1GB of pages. With "movnti", it costs only about 890ms to move 1GB of pages.

So there's __copy_user_nocache() which does NT stores.

> -	ALTERNATIVE "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD
> +	ALTERNATIVE_2 "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD, \
> +		      "jmp copy_page_nt", X86_FEATURE_XMM2

This makes every machine which has sse2 do NT stores now. Which means
*every* machine practically.

The folks on linux-nvdimm@ should be able to give you a better idea what
to do.

HTH.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
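For readers unfamiliar with the "movnti" approach discussed above, here is a minimal user-space sketch of a page copy done with non-temporal stores. It is not the code from the patch (the body of copy_page_nt is not shown in this thread). The names copy_page_nt_sketch, copy_page_nt_selftest and SKETCH_PAGE_SIZE are made up for illustration, and a 4 KiB page on x86-64 with SSE2 is assumed. The _mm_stream_si64 intrinsic compiles to MOVNTI.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si64 (MOVNTI), _mm_sfence */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumption: 4 KiB pages */

/*
 * Hypothetical user-space analogue of a non-temporal page copy.
 * Loads from src go through the cache as usual; stores to dst use
 * MOVNTI, which hints the CPU not to keep the destination lines cached.
 */
static void copy_page_nt_sketch(void *dst, const void *src)
{
	long long *d = dst;
	const long long *s = src;
	size_t i;

	/* Both pointers are accessed as 64-bit words, so require 8-byte alignment. */
	assert(((uintptr_t)dst & 7) == 0 && ((uintptr_t)src & 7) == 0);

	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(long long); i++)
		_mm_stream_si64(&d[i], s[i]);

	/* NT stores are weakly ordered; fence before others may rely on the data. */
	_mm_sfence();
}

/* Copies one patterned page and returns 1 if the destination matches the source. */
static int copy_page_nt_selftest(void)
{
	static long long src[SKETCH_PAGE_SIZE / sizeof(long long)];
	static long long dst[SKETCH_PAGE_SIZE / sizeof(long long)];
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(long long); i++)
		src[i] = (long long)(i ^ 0x5a5a);

	copy_page_nt_sketch(dst, src);
	return memcmp(dst, src, SKETCH_PAGE_SIZE) == 0;
}
```

A real kernel implementation would be written in assembly and would likely unroll the loop; the sketch only shows the store type and the trailing fence.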
* Re: Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86
From: Kemeng Shi @ 2021-04-13 12:54 UTC
To: Borislav Petkov; +Cc: tglx, mingo, x86, hpa, linux-kernel, linux-nvdimm

On 2021/4/13 19:01, Borislav Petkov wrote:
> + linux-nvdimm
>
> Original mail at https://lkml.kernel.org/r/3f28adee-8214-fa8e-b368-eaf8b193469e@huawei.com
>
> On Tue, Apr 13, 2021 at 02:25:58PM +0800, Kemeng Shi wrote:
>> I'm using AEP with the dax_kmem driver, and AEP is exported as a NUMA node in
>
> What is AEP?
>
AEP is a type of persistent memory produced by Intel. It's slower than
normal memory but is persistent.
>> my system. I will move cold pages from the DRAM node to the AEP node with
>> the move_pages system call. With the old "rep movsq", it costs 2030ms to move
>> 1GB of pages. With "movnti", it costs only about 890ms to move 1GB of pages.
>
> So there's __copy_user_nocache() which does NT stores.
>
>> -	ALTERNATIVE "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD
>> +	ALTERNATIVE_2 "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD, \
>> +		      "jmp copy_page_nt", X86_FEATURE_XMM2
>
> This makes every machine which has sse2 do NT stores now. Which means
> *every* machine practically.
>
Yes. And NT stores should be better for copy_page, especially when copying
a lot of pages, as only part of each copied page will be accessed soon
afterwards.
> The folks on linux-nvdimm@ should be able to give you a better idea what
> to do.
>
> HTH.
>
Thanks for the response and help.
* Re: Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86
From: Borislav Petkov @ 2021-04-13 14:53 UTC
To: Kemeng Shi; +Cc: tglx, mingo, x86, hpa, linux-kernel, linux-nvdimm

On Tue, Apr 13, 2021 at 08:54:55PM +0800, Kemeng Shi wrote:
> Yes. And NT stores should be better for copy_page, especially when copying
> a lot of pages, as only part of each copied page will be accessed soon
> afterwards.

I thought "should be better" too last time when I measured rep; movs vs
NT stores but actual measurements showed no real difference.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
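Since the disagreement here comes down to measurement, the following is a rough user-space sketch of how the two store types could be timed against each other. It is an illustration under stated assumptions, not the benchmark either participant ran: all names (bench_copy_cached, bench_copy_nt, bench_copy_ns, BENCH_PAGE_SIZE) are hypothetical, memcpy stands in for "rep movs", a 4 KiB page on x86-64 is assumed, and timing uses the C11 wall clock (timespec_get), which is coarse compared with what a kernel measurement would use.

```c
#include <emmintrin.h>  /* SSE2: _mm_stream_si64 (MOVNTI), _mm_sfence */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BENCH_PAGE_SIZE 4096  /* assumption: 4 KiB pages */

typedef void (*page_copy_fn)(void *dst, const void *src);

/* Baseline: ordinary cached copy (stand-in for the "rep movs" path). */
static void bench_copy_cached(void *dst, const void *src)
{
	memcpy(dst, src, BENCH_PAGE_SIZE);
}

/* Candidate: non-temporal 64-bit stores, fenced at the end of the page. */
static void bench_copy_nt(void *dst, const void *src)
{
	long long *d = dst;
	const long long *s = src;

	for (size_t i = 0; i < BENCH_PAGE_SIZE / sizeof(long long); i++)
		_mm_stream_si64(&d[i], s[i]);
	_mm_sfence();
}

/*
 * Copies npages pages with fn and returns the elapsed wall-clock time in
 * nanoseconds, or -1 if allocation failed or the copy did not verify.
 */
static long long bench_copy_ns(page_copy_fn fn, size_t npages)
{
	size_t bytes = npages * BENCH_PAGE_SIZE;
	char *src = aligned_alloc(BENCH_PAGE_SIZE, bytes);
	char *dst = aligned_alloc(BENCH_PAGE_SIZE, bytes);
	struct timespec t0, t1;
	long long ns = -1;

	if (src && dst) {
		/* Touch both buffers first so page faults are not timed. */
		memset(src, 0x5a, bytes);
		memset(dst, 0, bytes);

		timespec_get(&t0, TIME_UTC);
		for (size_t p = 0; p < npages; p++)
			fn(dst + p * BENCH_PAGE_SIZE, src + p * BENCH_PAGE_SIZE);
		timespec_get(&t1, TIME_UTC);

		if (memcmp(dst, src, bytes) == 0)
			ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL
			     + (t1.tv_nsec - t0.tv_nsec);
	}

	free(src);
	free(dst);
	return ns;
}
```

Comparing bench_copy_ns(bench_copy_cached, n) with bench_copy_ns(bench_copy_nt, n) for a large n gives only a DRAM-to-DRAM comparison; it would not reproduce the persistent-memory numbers quoted earlier unless the destination buffer actually lives on such a node.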
* Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86
From: Kemeng Shi @ 2021-04-14 5:25 UTC
To: Borislav Petkov; +Cc: tglx, mingo, x86, hpa, linux-kernel, linux-nvdimm

On 2021/4/13 22:53, Borislav Petkov wrote:
> I thought "should be better" too last time when I measured rep; movs vs
> NT stores but actual measurements showed no real difference.

Maybe the NT stores make a difference when storing to slow DIMMs, like the
persistent memory I just tested. Also, it likely reduces unnecessary cache
loads and flushes, and benefits running processes which have data
cached.

-- 
Best wishes
Kemeng Shi