linux-mm.kvack.org archive mirror
From: SeongJae Park <sjpark@amazon.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: SeongJae Park <sjpark@amazon.com>,
	<alexander.shishkin@linux.intel.com>, <linux-mm@kvack.org>,
	<akpm@linux-foundation.org>, SeongJae Park <sjpark@amazon.de>,
	<aarcange@redhat.com>, <acme@kernel.org>, <amit@kernel.org>,
	<brendan.d.gregg@gmail.com>, <brendanhiggins@google.com>,
	<cai@lca.pw>, <colin.king@canonical.com>, <corbet@lwn.net>,
	<dwmw@amazon.com>, <jolsa@redhat.com>, <kirill@shutemov.name>,
	<mark.rutland@arm.com>, <mgorman@suse.de>, <minchan@kernel.org>,
	<mingo@redhat.com>, <namhyung@kernel.org>, <peterz@infradead.org>,
	<rdunlap@infradead.org>, <riel@surriel.com>,
	<rientjes@google.com>, <rostedt@goodmis.org>,
	<shakeelb@google.com>, <shuah@kernel.org>, <sj38.park@gmail.com>,
	<vbabka@suse.cz>, <vdavydov.dev@gmail.com>,
	<yang.shi@linux.alibaba.com>, <ying.huang@intel.com>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: Re: [RFC v5 0/7] Implement Data Access Monitoring-based Memory Operation Schemes
Date: Wed, 1 Apr 2020 10:21:50 +0200	[thread overview]
Message-ID: <20200401082150.21124-1-sjpark@amazon.com> (raw)
In-Reply-To: <20200331173908.0000696f@Huawei.com> (raw)

On Tue, 31 Mar 2020 17:39:08 +0100 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Tue, 31 Mar 2020 18:18:19 +0200
> SeongJae Park <sjpark@amazon.com> wrote:
> 
> > On Tue, 31 Mar 2020 16:51:55 +0100 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> > 
> > > On Mon, 30 Mar 2020 13:50:35 +0200
> > > SeongJae Park <sjpark@amazon.com> wrote:
> > >   
> > > > From: SeongJae Park <sjpark@amazon.de>
> > > > 
> > > > DAMON[1] can be used as a primitive for data access aware memory management
> > > > optimizations.  That said, users who want such optimizations must run DAMON,
> > > > read the monitoring results, analyze them, plan a new memory management
> > > > scheme, and apply the new scheme by themselves.  Such efforts are inevitable
> > > > for some complicated optimizations.
> > > > 
> > > > However, in many other cases, users may simply want the system to apply a
> > > > memory management action to memory regions of a specific size having a
> > > > specific access frequency for a specific time.  For example, "page out a
> > > > memory region larger than 100 MiB keeping only rare accesses for more than
> > > > 2 minutes", or "do not use THP for a memory region larger than 2 MiB rarely
> > > > accessed for more than 1 second".
> > > > 
> > > > This RFC patchset makes DAMON handle such data access monitoring-based
> > > > operation schemes.  With this change, users can do data access aware
> > > > optimizations by simply specifying their schemes to DAMON.  
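
The scheme concept described above pairs an access-pattern condition (region size, access frequency, and how long the pattern has lasted) with a management action.  A minimal illustrative sketch in Python, purely to make the two quoted example schemes concrete — the field names and types here are hypothetical, and the patchset's real interface is a kernel debugfs file, not Python:

```python
# Hypothetical sketch of a data access monitoring-based operation scheme.
# Field names are illustrative; the real interface is a kernel debugfs file.
from dataclasses import dataclass

MiB = 1 << 20
US_PER_SEC = 10**6
NO_LIMIT = 2**63 - 1  # stands for "no upper bound"

@dataclass
class Scheme:
    min_size: int      # region size bounds, in bytes
    max_size: int
    min_freq: int      # observed accesses per monitoring snapshot
    max_freq: int
    min_age_us: int    # how long the pattern has persisted, microseconds
    max_age_us: int
    action: str        # e.g. "pageout", "hugepage", "nohugepage"

    def matches(self, size, freq, age_us):
        # A region gets the action when it falls inside all three ranges.
        return (self.min_size <= size <= self.max_size
                and self.min_freq <= freq <= self.max_freq
                and self.min_age_us <= age_us <= self.max_age_us)

# "Page out a memory region larger than 100 MiB keeping only rare
#  accesses for more than 2 minutes"
pageout_cold = Scheme(100 * MiB, NO_LIMIT, 0, 1,
                      2 * 60 * US_PER_SEC, NO_LIMIT, "pageout")

# "Do not use THP for a memory region larger than 2 MiB rarely
#  accessed for more than 1 second"
no_thp_cold = Scheme(2 * MiB, NO_LIMIT, 0, 1,
                     1 * US_PER_SEC, NO_LIMIT, "nohugepage")

print(pageout_cold.matches(200 * MiB, 0, 3 * 60 * US_PER_SEC))  # True
```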
[...]
> > > > 
> > > > Efficient THP
> > > > ~~~~~~~~~~~~~
> > > > 
> > > > The THP 'always' enabled policy achieves a 5.57% speedup but incurs 7.29%
> > > > memory overhead.  It achieves a 41.62% speedup in the best case, but 79.98%
> > > > memory overhead in the worst case.  Interestingly, both the best and worst
> > > > cases are with 'splash2x/ocean_ncp'.  
> > > 
> > > The results above don't seem to support this any more? 
> > >   
> > > > runtime                 orig     rec      (overhead) thp      (overhead) ethp     (overhead) prcl     (overhead)
> > > > splash2x/ocean_ncp      86.927   87.065   (0.16)     50.747   (-41.62)   86.855   (-0.08)    199.553  (129.57)   
> > 
> > Hmm... But I don't get your point...  In the data, the 'thp' column means
> > the THP 'always' enabled policy, and the following column shows its overhead
> > compared to 'orig', in percent.  Thus, the data says the kernel with THP
> > 'always' enabled consumes 50.747 seconds to finish splash2x/ocean_ncp,
> > while the THP-disabled original kernel consumes 86.927 seconds.
> 
> ah. I got myself confused. 
> 
> However, I was expecting to see a significant performance advantage
> for ethp in this particular case, as we did in the previous version.
> 
> In the previous version (posted in reply to v6 of DAMON), ethp had a significant gain:
> 
> runtime                 orig     rec      (overhead) thp      (overhead) ethp     (overhead) prcl     (overhead)
> splash2x/ocean_ncp      81.360   81.434   (0.09)     51.157   (-37.12)   66.711   (-18.00)   91.611   (12.60) 
> 
> So, in ethp we got roughly half the performance back (at the cost of some of the memory)
> 
> That was a result I had been trying to replicate, hence it was at the front of my mind!
> 
> Any idea why that changed so much? 

Ah, I forgot to explain this change.  Thank you for letting me know.

Overall, ETHP in the DAMON-based Operation Schemes RFC v5 shows lower peak
performance gains.

For example, splash2x/fft shows the best-case speedup with ETHP in both this
version and the previous one, but the speedup dropped from 19% to 12%.  In the
case of splash2x/ocean_ncp, it dropped from 18% to only 0.08%.

That said, the total performance speedup has improved: it was 1.83% before and
is 2.21% now.  Several workloads also show better speedup.  In the case of
parsec3/canneal, the speedup rose from 3.86% to 6.34%.

Also note that ETHP's memory savings for the workloads showing less speedup are
much improved.  For example, ETHP's memory overhead for splash2x/ocean_ncp was
24.4% before, but is only 3.5% now.
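
For reference, every '(overhead)' percentage in these tables is the relative
change of a variant against 'orig'; a quick sketch of the arithmetic, using
the splash2x/ocean_ncp and total runtimes quoted in this thread:

```python
# Overhead as reported in the tables: percentage change relative to 'orig'.
# A negative overhead is a speedup.
def overhead_pct(orig, variant):
    return (variant - orig) / orig * 100

# splash2x/ocean_ncp runtimes (seconds): orig vs THP 'always'
print(round(overhead_pct(86.927, 50.747), 2))      # -41.62, i.e. 41.62% speedup

# total runtimes (seconds): orig vs THP 'always' -> the 5.57% total speedup
print(round(overhead_pct(3020.570, 2852.190), 2))  # -5.57
```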

This is because I didn't update the schemes for the updated DAMON.  Basically,
the effect of a scheme is access pattern dependent.  Because DAMON has changed
in ways that can make it report access patterns different from those of the
previous version, the schemes should also be modified to get the best
performance.  However, I didn't update them, because they are only proofs of
concept, not for production.

Fortunately, the change in the reported patterns was not huge.  I also
confirmed this by visually comparing the visualized access patterns of the two
versions.  The overall trend (better performance, less memory overhead) also
changed only subtly.  However, some individual workloads saw remarkable
changes.


Thanks,
SeongJae Park

> 
> Thanks,
> 
> Jonathan
> 
> 
> > Thus, the overhead is ``(50.747 - 86.927) / 86.927 = -0.4162``, in other
> > words a 41.62% speedup.
> > 
> > Also, the 5.57% speedup and 7.29% memory overhead are for the _total_.
> > This data shows it.
> > 
> > > > runtime                 orig     rec      (overhead) thp      (overhead) ethp     (overhead) prcl     (overhead)
> > > > total                   3020.570 3028.080 (0.25)     2852.190 (-5.57)    2953.960 (-2.21)    3276.550 (8.47)      
> > 
> > Maybe I confused you by saying this ambiguously; sorry if so.  Or, if I'm
> > still misunderstanding your point, please let me know.
> 
> 
> > 
> > 
> > Thanks,
> > SeongJae Park
> >  
> > [...]
> 



Thread overview: 12+ messages
2020-03-30 11:50 [RFC v5 0/7] Implement Data Access Monitoring-based Memory Operation Schemes SeongJae Park
2020-03-30 11:50 ` [RFC v5 1/7] mm/madvise: Export do_madvise() to external GPL modules SeongJae Park
2020-03-30 11:50 ` [RFC v5 2/7] mm/damon: Account age of target regions SeongJae Park
2020-03-30 11:50 ` [RFC v5 3/7] mm/damon: Implement data access monitoring-based operation schemes SeongJae Park
2020-03-30 11:50 ` [RFC v5 4/7] mm/damon/schemes: Implement a debugfs interface SeongJae Park
2020-03-30 11:50 ` [RFC v5 5/7] mm/damon-test: Add kunit test case for regions age accounting SeongJae Park
2020-03-30 11:50 ` [RFC v5 6/7] mm/damon/selftests: Add 'schemes' debugfs tests SeongJae Park
2020-03-30 11:50 ` [RFC v5 7/7] damon/tools: Support more human friendly 'schemes' control SeongJae Park
2020-03-31 15:51 ` [RFC v5 0/7] Implement Data Access Monitoring-based Memory Operation Schemes Jonathan Cameron
2020-03-31 16:18   ` SeongJae Park
2020-03-31 16:39     ` Jonathan Cameron
2020-04-01  8:21       ` SeongJae Park [this message]
