From: kemi <kemi.wang@intel.com>
To: Christopher Lameter <cl@linux.com>, Michal Hocko <mhocko@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Nikolay Borisov <nborisov@suse.com>,
	Pavel Tatashin <pasha.tatashin@oracle.com>,
	David Rientjes <rientjes@google.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Dave <dave.hansen@linux.intel.com>,
	Andi Kleen <andi.kleen@intel.com>,
	Tim Chen <tim.c.chen@intel.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Ying Huang <ying.huang@intel.com>, Aaron Lu <aaron.lu@intel.com>,
	Aubrey Li <aubrey.li@intel.com>, Linux MM <linux-mm@kvack.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 2/5] mm: Extends local cpu counter vm_diff_nodestat from s8 to s16
Date: Wed, 20 Dec 2017 14:45:52 +0800	[thread overview]
Message-ID: <cc5c715f-2525-38e6-054e-500a95b12dff@intel.com> (raw)
In-Reply-To: <alpine.DEB.2.20.1712191116370.18938@nuc-kabylake>



On 2017/12/20 01:21, Christopher Lameter wrote:
> On Tue, 19 Dec 2017, Michal Hocko wrote:
> 
>>> Well, the reason for s8 was to keep the data structures small so that they
>>> fit in the higher level cpu caches. The larger these structures become, the
>>> more cachelines are used by the counters and the larger the performance
>>> influence on the code that should not be impacted by the overhead.
>>
>> I am not sure I understand. We usually do not access multiple counters in
>> a single code path (well, PGALLOC and the NUMA counters are more of an
>> exception). So it is rarely an advantage that the whole array is in the
>> same cache line. Besides that, this is allocated by the percpu allocator,
>> which aligns to the type size rather than cache lines AFAICS.
> 
> I thought we were talking about NUMA counters here?
> 
> Regardless: A typical fault, system call or OS action will access multiple
> zone and node counters when allocating or freeing memory. Enlarging the
> fields will increase the number of cachelines touched.
> 

Yes, theoretically this adds one more cache line to the access footprint.
But I don't think it would be a problem.
1) Not all of the counters need to be accessed in the fast path of page
allocation; the counters that fit in a single cache line are usually enough
for that, so we probably do not touch the extra cache line at all. I tend
to agree with Michal's argument.
Besides, in the slow paths where the code is protected by the zone lock or
the lru lock, touching one more cache line would not be a big problem,
since many other cache lines are accessed there anyway.
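
To make the "one more cache line" point concrete, here is a back-of-the-envelope
calculation (userspace sketch only; the 64-byte line size and the item count are
placeholders for illustration, the real NR_VM_NODE_STAT_ITEMS depends on the
kernel version and config):

/* Rough cache-line footprint of the per-cpu diff array (illustration only). */
#include <stdio.h>
#include <stdint.h>

#define CACHELINE_SIZE	64
#define NR_ITEMS	40	/* placeholder for NR_VM_NODE_STAT_ITEMS */

int main(void)
{
	size_t s8_bytes  = NR_ITEMS * sizeof(int8_t);
	size_t s16_bytes = NR_ITEMS * sizeof(int16_t);

	printf("s8  diffs: %3zu bytes -> %zu cache line(s)\n", s8_bytes,
	       (s8_bytes + CACHELINE_SIZE - 1) / CACHELINE_SIZE);
	printf("s16 diffs: %3zu bytes -> %zu cache line(s)\n", s16_bytes,
	       (s16_bytes + CACHELINE_SIZE - 1) / CACHELINE_SIZE);
	return 0;
}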

2) Enlarging vm_node_stat_diff from s8 to s16 lets a larger value accumulate
in the local per-cpu counters, which makes it possible to reduce the update
frequency of the global counters. Thus, we can benefit from less of the
expensive cache bouncing those global updates cause.
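
Roughly, the scheme works like the sketch below. It is only a simplified
userspace illustration of the diff/threshold idea, not the actual vmstat
code, and the threshold values are placeholders: the shared counter is only
touched when the local diff would cross the threshold, so a wider diff type
permits a larger threshold and fewer global updates.

/*
 * Simplified sketch of a per-cpu differential counter with a threshold.
 * Not the actual kernel code; the threshold values are placeholders.
 */
#include <stdio.h>
#include <stdint.h>

static long global_counter;	/* shared counter, updates bounce cache lines */
static long global_updates;	/* how often the shared counter was touched   */
static int16_t cpu_diff;	/* "per-cpu" differential, s16 here           */

static void mod_counter(int delta, int threshold)
{
	int v = cpu_diff + delta;

	if (v > threshold || v < -threshold) {
		/* fold the local diff into the shared global counter */
		global_counter += v;
		global_updates++;
		v = 0;
	}
	cpu_diff = (int16_t)v;
}

static void run(int threshold)
{
	global_counter = global_updates = 0;
	cpu_diff = 0;

	for (int i = 0; i < 100000; i++)
		mod_counter(1, threshold);

	printf("threshold %5d: %ld global updates\n",
	       threshold, global_updates);
}

int main(void)
{
	run(125);	/* roughly the most an s8 diff could ever hold  */
	run(32000);	/* a much larger threshold made possible by s16 */
	return 0;
}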

Well, if you still have some concerns, I can post some data for
will-it-scale.page_fault1. The benchmark forks nr_cpu processes, and each
process then does the following in a loop:
    1 mmap() 128M of anonymous space;
    2 write to each page there to trigger the actual page allocation;
    3 munmap() it.
https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c
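
(For reference, the core of the per-process loop looks roughly like the
sketch below; the fork-per-cpu harness, the iteration counter and most of
the error handling are omitted, see the link above for the real source.)

#include <string.h>
#include <sys/mman.h>

#define MEMSIZE (128UL * 1024 * 1024)	/* 128M anonymous mapping */

int main(void)
{
	for (int i = 0; i < 100; i++) {	/* the real benchmark loops forever */
		char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (c == MAP_FAILED)
			return 1;

		/* write to every page to trigger the actual page allocation */
		memset(c, 0, MEMSIZE);

		munmap(c, MEMSIZE);
	}
	return 0;
}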

Or you can suggest some other benchmarks on which you would like to see the
performance impact.

>> Maybe it used to be all different back when the code was added, but arguing
>> about cache lines seems to be a bit problematic here. Maybe you have some
>> specific workloads which can prove me wrong?
> 
> Run a workload that does some page faults? Heavy allocation and freeing of
> memory?
> 
> Maybe that is no longer relevant since the number of counters is so
> large that the accesses are so sparse that each action pulls in a whole
> cacheline. That would be something we tried to avoid when implementing
> the differentials.
> 
> 

Thread overview: 34+ messages
2017-12-19  6:39 [PATCH v2 0/5] mm: NUMA stats code cleanup and enhancement Kemi Wang
2017-12-19  6:39 ` [PATCH v2 1/5] mm: migrate NUMA stats from per-zone to per-node Kemi Wang
2017-12-19 12:28   ` Michal Hocko
2017-12-20  5:32     ` kemi
2017-12-19  6:39 ` [PATCH v2 2/5] mm: Extends local cpu counter vm_diff_nodestat from s8 to s16 Kemi Wang
2017-12-19 12:38   ` Michal Hocko
2017-12-20  3:05     ` kemi
2017-12-19 16:05   ` Christopher Lameter
2017-12-19 16:20     ` Michal Hocko
2017-12-19 17:21       ` Christopher Lameter
2017-12-20  6:45         ` kemi [this message]
2017-12-19  6:39 ` [PATCH v2 3/5] mm: enlarge NUMA counters threshold size Kemi Wang
2017-12-19 12:40   ` Michal Hocko
2017-12-20  5:52     ` kemi
2017-12-20 10:12       ` Michal Hocko
2017-12-20 10:21         ` kemi
2017-12-21  8:06         ` kemi
2017-12-21  8:17           ` Michal Hocko
2017-12-21  8:23             ` kemi
2017-12-21  8:59               ` Michal Hocko
2017-12-21 10:31                 ` kemi
2017-12-22 12:31                   ` Michal Hocko
2017-12-21 17:10           ` Christopher Lameter
2017-12-22  2:06             ` kemi
2017-12-26 19:05               ` Christopher Lameter
2017-12-19  6:39 ` [PATCH v2 4/5] mm: use node_page_state_snapshot to avoid deviation Kemi Wang
2017-12-19 12:43   ` Michal Hocko
2017-12-20  6:07     ` kemi
2017-12-20 10:06       ` Michal Hocko
2017-12-20 10:24         ` kemi
2017-12-20 15:58           ` Christopher Lameter
2017-12-21  1:39             ` kemi
2017-12-19  6:39 ` [PATCH v2 5/5] mm: Rename zone_statistics() to numa_statistics() Kemi Wang
2017-12-19 12:44   ` Michal Hocko
