linux-mm.kvack.org archive mirror
* Re: [PATCH -V2] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault
       [not found] <20210402082717.3525316-1-ying.huang@intel.com>
@ 2021-04-07  8:27 ` Mel Gorman
  2021-04-08 13:46   ` Huang, Ying
  0 siblings, 1 reply; 2+ messages in thread
From: Mel Gorman @ 2021-04-07  8:27 UTC
  To: Huang Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Peter Zijlstra, Peter Xu,
	Johannes Weiner, Vlastimil Babka, Matthew Wilcox, Will Deacon,
	Michel Lespinasse, Arjun Roy, Kirill A. Shutemov

On Fri, Apr 02, 2021 at 04:27:17PM +0800, Huang Ying wrote:
> With NUMA balancing, the hint page fault handler migrates the faulting
> page to the accessing node if necessary.  During the migration, the TLB
> is shot down on all CPUs that the process has run on recently, because
> the hint page fault handler makes the PTE accessible before the
> migration is tried.  The overhead of TLB shootdown can be high, so it
> is better avoided if possible.  In fact, if we delay mapping the page
> until migration, the shootdown can be avoided.  This is what this
> patch does.
> 
> <SNIP>
>
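
As a rough illustration of the ordering change described in the changelog,
here is a toy model, not kernel code. ToyMM, hint_fault and
map_before_migration are hypothetical names invented for this sketch, and
it assumes that unmapping an accessible PTE costs one interrupt per other
CPU the process has run on recently, while unmapping a PTE that was never
made accessible costs none.

```python
from dataclasses import dataclass


@dataclass
class ToyMM:
    """Hypothetical stand-in for a process address space, not a kernel structure."""
    cpus_run_on: set    # CPUs the process has run on recently
    ipis_sent: int = 0  # remote TLB flush interrupts issued so far

    def unmap_for_migration(self, pte_accessible, faulting_cpu):
        # Assumption of this model: an accessible PTE may be cached in
        # remote TLBs, so every other recently used CPU is interrupted.
        if pte_accessible:
            self.ipis_sent += len(self.cpus_run_on - {faulting_cpu})


def hint_fault(mm, faulting_cpu, map_before_migration):
    # A hint fault starts from an inaccessible PTE. The old ordering makes
    # it accessible before migration; the delayed ordering leaves it
    # inaccessible until migration has been tried.
    pte_accessible = map_before_migration
    mm.unmap_for_migration(pte_accessible, faulting_cpu)
    return mm.ipis_sent
```

With four recently used CPUs and the fault taken on CPU 0, the model charges
three interrupts to the old ordering and none to the delayed one. It ignores
the case where migration is not needed and says nothing about real costs.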

Thanks, I think this is ok for Andrew to pick up to see if anything
bisects to this commit, but it is low risk.

Reviewed-by: Mel Gorman <mgorman@suse.de>

More notes:

This is not a universal win, given that not all workloads exhibit the
pattern where accesses occur in parallel threads between when a page
is marked accessible and when it is migrated. The impact of the patch
appears to be neutral for those workloads. For workloads that do exhibit
the pattern, there is a small gain with a reduction in interrupts as
advertised, unlike v1 of the patch. Further tests are running to confirm
that the reduction is in TLB shootdown interrupts, but I'm reasonably
confident that will be the case. Gains are typically small, and the load
described in the changelog appears to be a best-case scenario, but a 1-5%
gain in some other workloads is still an improvement. There is still the
possibility that some workloads will unnecessarily stall for slightly
longer periods of time as a result of the patch, but that is a relatively
low risk and will be difficult to detect. If I'm wrong, a bisection will
find it.
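
One way to check whether a reduction is in TLB shootdown interrupts is to
compare the "TLB" row of /proc/interrupts before and after a run. The helper
below is a minimal sketch, assuming the x86 layout where that row starts with
"TLB:" followed by one counter per CPU; the function name and the sample
values are made up for illustration and are not measurements from this
thread.

```python
def tlb_shootdowns(interrupts_text):
    """Sum the per-CPU counters on the 'TLB:' row of /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "TLB:":
            # Keep only the numeric per-CPU columns and skip the trailing
            # description words.
            return sum(int(f) for f in fields[1:] if f.isdigit())
    return 0  # no TLB row, e.g. on an architecture that does not report it


# Made-up snapshots standing in for two reads of /proc/interrupts.
before = "           CPU0       CPU1\n TLB:       1000        500   TLB shootdowns\n"
after = "           CPU0       CPU1\n TLB:       1200        800   TLB shootdowns\n"
delta = tlb_shootdowns(after) - tlb_shootdowns(before)
```

On a real system the two snapshots would come from reading /proc/interrupts
before and after the workload, with and without the patch applied.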

Andrew?

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH -V2] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault
  2021-04-07  8:27 ` [PATCH -V2] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault Mel Gorman
@ 2021-04-08 13:46   ` Huang, Ying
  0 siblings, 0 replies; 2+ messages in thread
From: Huang, Ying @ 2021-04-08 13:46 UTC
  To: Mel Gorman, Andrew Morton
  Cc: linux-mm, linux-kernel, Peter Zijlstra, Peter Xu,
	Johannes Weiner, Vlastimil Babka, Matthew Wilcox, Will Deacon,
	Michel Lespinasse, Arjun Roy, Kirill A. Shutemov

Mel Gorman <mgorman@suse.de> writes:

> On Fri, Apr 02, 2021 at 04:27:17PM +0800, Huang Ying wrote:
>> With NUMA balancing, the hint page fault handler migrates the faulting
>> page to the accessing node if necessary.  During the migration, the TLB
>> is shot down on all CPUs that the process has run on recently, because
>> the hint page fault handler makes the PTE accessible before the
>> migration is tried.  The overhead of TLB shootdown can be high, so it
>> is better avoided if possible.  In fact, if we delay mapping the page
>> until migration, the shootdown can be avoided.  This is what this
>> patch does.
>> 
>> <SNIP>
>>
>
> Thanks, I think this is ok for Andrew to pick up to see if anything
> bisects to this commit, but it is low risk.
>
> Reviewed-by: Mel Gorman <mgorman@suse.de>
>
> More notes:
>
> This is not a universal win, given that not all workloads exhibit the
> pattern where accesses occur in parallel threads between when a page
> is marked accessible and when it is migrated. The impact of the patch
> appears to be neutral for those workloads. For workloads that do exhibit
> the pattern, there is a small gain with a reduction in interrupts as
> advertised, unlike v1 of the patch. Further tests are running to confirm
> that the reduction is in TLB shootdown interrupts, but I'm reasonably
> confident that will be the case. Gains are typically small, and the load
> described in the changelog appears to be a best-case scenario, but a 1-5%
> gain in some other workloads is still an improvement. There is still the
> possibility that some workloads will unnecessarily stall for slightly
> longer periods of time as a result of the patch, but that is a relatively
> low risk and will be difficult to detect. If I'm wrong, a bisection will
> find it.

Hi, Mel,

Thanks!

Hi, Andrew,

I found that V2 does not apply on top of the latest mmotm, so I have sent
V3 at the link below, in case you need it.

https://lore.kernel.org/lkml/20210408132236.1175607-1-ying.huang@intel.com/

Best Regards,
Huang, Ying

