linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Alex Shi <alex.shi@linux.alibaba.com>
To: Daniel Jordan <daniel.m.jordan@oracle.com>,
	Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	mgorman@techsingularity.net, tj@kernel.org,
	khlebnikov@yandex-team.ru, willy@infradead.org,
	hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	shakeelb@google.com, iamjoonsoo.kim@lge.com,
	richard.weiyang@gmail.com, kirill@shutemov.name,
	alexander.duyck@gmail.com, rong.a.chen@intel.com,
	mhocko@suse.com, vdavydov.dev@gmail.com, shy828301@gmail.com
Subject: Re: [PATCH v18 00/32] per memcg lru_lock
Date: Tue, 25 Aug 2020 11:26:58 +0800	[thread overview]
Message-ID: <ec62a835-f79d-2b8c-99c7-120834703b42@linux.alibaba.com> (raw)
In-Reply-To: <20200825015627.3c3pnwauqznnp3gc@ca-dmjordan1.us.oracle.com>



在 2020/8/25 上午9:56, Daniel Jordan 写道:
> On Mon, Aug 24, 2020 at 01:24:20PM -0700, Hugh Dickins wrote:
>> On Mon, 24 Aug 2020, Andrew Morton wrote:
>>> On Mon, 24 Aug 2020 20:54:33 +0800 Alex Shi <alex.shi@linux.alibaba.com> wrote:
>> Andrew demurred on version 17 for lack of review.  Alexander Duyck has
>> been doing a lot on that front since then.  I have intended to do so,
>> but it's a mirage that moves away from me as I move towards it: I have
> 
> Same, I haven't been able to keep up with the versions or the recent review
> feedback.  I got through about half of v17 last week and hope to have more time
> for the rest this week and beyond.
> 
>>>> Following Daniel Jordan's suggestion, I have run 208 'dd' with on 104
>>>> containers on a 2s * 26cores * HT box with a modefied case:
> 
> Alex, do you have a pointer to the modified readtwice case?

Sorry, no. my developer machine crashed, so I lost case my container and modified
case. I am struggling to get my container back from a account problematic repository. 

But some testing scripts is here, generally, the original readtwice case will
run each of threads on each of cpus. The new case will run one container on each cpus,
and just run one readtwice thead in each of containers.

Here is readtwice case changes(Just a reference)
diff --git a/case-lru-file-readtwice b/case-lru-file-readtwice
index 85533b248634..48c6b5f44256 100755
--- a/case-lru-file-readtwice
+++ b/case-lru-file-readtwice
@@ -15,12 +15,9 @@

 . ./hw_vars

-for i in `seq 1 $nr_task`
-do
        create_sparse_file $SPARSE_FILE-$i $((ROTATE_BYTES / nr_task))
        timeout --foreground -s INT ${runtime:-600} dd bs=4k if=$SPARSE_FILE-$i of=/dev/null > $TMPFS_MNT/dd-output-1-$i 2>&1 &
        timeout --foreground -s INT ${runtime:-600} dd bs=4k if=$SPARSE_FILE-$i of=/dev/null > $TMPFS_MNT/dd-output-2-$i 2>&1 &
-done

 wait
 sleep 1
@@ -31,7 +28,7 @@ do
                echo "dd output file empty: $file" >&2
        }
        cat $file
-       rm  $file
+       #rm  $file
 done

 rm `seq -f $SPARSE_FILE-%g 1 $nr_task`

And here is how to running the case: 
--------
#run all case on 24 cpu machine, lrulockv2 is the container with modified case.
for ((i=0; i<24; i++))
do
        #btw, vm-scalability need create 23 loop devices
        docker run --privileged=true --rm lrulockv2 bash -c " sleep 20000" &
done
sleep 15  #wait all container ready. 

#kick testing
for i in `docker ps | sed '1 d' | awk '{print $1 }'` ;do docker exec --privileged=true -it $i bash -c "cd vm-scalability/; bash -x ./run case-lru-file-readtwice "& done

#show testing result for all
for i in `docker ps | sed '1 d' | awk '{print $1 }'` ;do echo === $i ===; docker exec $i bash -c 'cat /tmp/vm-scalability-tmp/dd-output-* '  & done
for i in `docker ps | sed '1 d' | awk '{print $1 }'` ;do echo === $i ===; docker exec $i bash -c 'cat /tmp/vm-scalability-tmp/dd-output-* '  & done | grep MB | awk 'BEGIN {a=0
;} { a+=$8} END {print NR, a/(NR)}'


> 
> Even better would be a description of the problem you're having in production
> with lru_lock.  We might be able to create at least a simulation of it to show
> what the expected improvement of your real workload is.

we are using thousands memcgs in a machine, but as a simulation, I guess above case
could be helpful to show the problem.

Thanks a lot!
Alex

> 
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
>>>> With this patchset, the readtwice performance increased about 80%
>>>> in concurrent containers.
>>>
>>> That's rather a slight amount of performance testing for a huge
>>> performance patchset!
>>
>> Indeed.  And I see that clause about readtwice performance increased 80%
>> going back eight months to v6: a lot of fundamental bugs have been fixed
>> in it since then, so I do think it needs refreshing.  It could be faster
>> now: v16 or v17 fixed the last bug I knew of, which had been slowing
>> down reclaim considerably.
>>
>> When I last timed my repetitive swapping loads (not loads anyone sensible
>> would be running with), across only two memcgs, Alex's patchset was
>> slightly faster than without: it really did make a difference.  But
>> I tend to think that for all patchsets, there exists at least one
>> test that shows it faster, and another that shows it slower.

In my testing, case-lru-file-mmap-read has a bit slower, 10+% on 96 thread machine,
when memcg is enabled but unused, that may due to longer pointer jumpping on 
lruvec than pgdat->lru_lock, since cgroup_disable=memory could fully remove the
regression with the new lock path.

I tried reusing page->prviate to store lruvec pointer, that could remove some 
regression on this, since private is generally unused on a lru page. But the patch
is too buggy now. 

BTW, 
Guess memcg would cause more memory disturb on a large machine, if it's enabled but
unused, isn't it?


>>
>>> Is more detailed testing planned?
>>
>> Not by me, performance testing is not something I trust myself with,
>> just get lost in the numbers: Alex, this is what we hoped for months
>> ago, please make a more convincing case, I hope Daniel and others
>> can make more suggestions.  But my own evidence suggests it's good.
> 
> I ran a few benchmarks on v17 last week (sysbench oltp readonly, kerndevel from
> mmtests, a memcg-ized version of the readtwice case I cooked up) and then today
> discovered there's a chance I wasn't running the right kernels, so I'm redoing
> them on v18.  Plan to look into what other, more "macro" tests would be
> sensitive to these changes.
> 


  reply	other threads:[~2020-08-25  3:28 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-24 12:54 [PATCH v18 00/32] per memcg lru_lock Alex Shi
2020-08-24 12:54 ` [PATCH v18 01/32] mm/memcg: warning on !memcg after readahead page charged Alex Shi
2020-08-24 12:54 ` [PATCH v18 02/32] mm/memcg: bail out early from swap accounting when memcg is disabled Alex Shi
2020-08-24 12:54 ` [PATCH v18 03/32] mm/thp: move lru_add_page_tail func to huge_memory.c Alex Shi
2020-08-24 12:54 ` [PATCH v18 04/32] mm/thp: clean up lru_add_page_tail Alex Shi
2020-08-24 12:54 ` [PATCH v18 05/32] mm/thp: remove code path which never got into Alex Shi
2020-08-24 12:54 ` [PATCH v18 06/32] mm/thp: narrow lru locking Alex Shi
2020-09-10 13:49   ` Matthew Wilcox
2020-09-11  3:37     ` Alex Shi
2020-09-13 15:27       ` Matthew Wilcox
2020-09-19  1:00         ` Hugh Dickins
2020-08-24 12:54 ` [PATCH v18 07/32] mm/swap.c: stop deactivate_file_page if page not on lru Alex Shi
2020-08-24 12:54 ` [PATCH v18 08/32] mm/vmscan: remove unnecessary lruvec adding Alex Shi
2020-08-24 12:54 ` [PATCH v18 09/32] mm/page_idle: no unlikely double check for idle page counting Alex Shi
2020-08-24 12:54 ` [PATCH v18 10/32] mm/compaction: rename compact_deferred as compact_should_defer Alex Shi
2020-08-24 12:54 ` [PATCH v18 11/32] mm/memcg: add debug checking in lock_page_memcg Alex Shi
2020-08-24 12:54 ` [PATCH v18 12/32] mm/memcg: optimize mem_cgroup_page_lruvec Alex Shi
2020-08-24 12:54 ` [PATCH v18 13/32] mm/swap.c: fold vm event PGROTATED into pagevec_move_tail_fn Alex Shi
2020-08-24 12:54 ` [PATCH v18 14/32] mm/lru: move lru_lock holding in func lru_note_cost_page Alex Shi
2020-08-24 12:54 ` [PATCH v18 15/32] mm/lru: move lock into lru_note_cost Alex Shi
2020-09-21 21:36   ` Hugh Dickins
2020-09-21 22:03     ` Hugh Dickins
2020-09-22  3:39       ` Alex Shi
2020-09-22  3:38     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 16/32] mm/lru: introduce TestClearPageLRU Alex Shi
2020-09-21 23:16   ` Hugh Dickins
2020-09-22  3:53     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 17/32] mm/compaction: do page isolation first in compaction Alex Shi
2020-09-21 23:49   ` Hugh Dickins
2020-09-22  4:57     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 18/32] mm/thp: add tail pages into lru anyway in split_huge_page() Alex Shi
2020-08-24 12:54 ` [PATCH v18 19/32] mm/swap.c: serialize memcg changes in pagevec_lru_move_fn Alex Shi
2020-09-22  0:42   ` Hugh Dickins
2020-09-22  5:00     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 20/32] mm/lru: replace pgdat lru_lock with lruvec lock Alex Shi
2020-09-22  5:27   ` Hugh Dickins
2020-09-22  8:58     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 21/32] mm/lru: introduce the relock_page_lruvec function Alex Shi
2020-09-22  5:40   ` Hugh Dickins
2020-08-24 12:54 ` [PATCH v18 22/32] mm/vmscan: use relock for move_pages_to_lru Alex Shi
2020-09-22  5:44   ` Hugh Dickins
2020-09-23  1:55     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 23/32] mm/lru: revise the comments of lru_lock Alex Shi
2020-09-22  5:48   ` Hugh Dickins
2020-08-24 12:54 ` [PATCH v18 24/32] mm/pgdat: remove pgdat lru_lock Alex Shi
2020-09-22  5:53   ` Hugh Dickins
2020-09-23  1:55     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 25/32] mm/mlock: remove lru_lock on TestClearPageMlocked in munlock_vma_page Alex Shi
2020-08-26  5:52   ` Alex Shi
2020-09-22  6:13   ` Hugh Dickins
2020-09-23  1:58     ` Alex Shi
2020-08-24 12:54 ` [PATCH v18 26/32] mm/mlock: remove __munlock_isolate_lru_page Alex Shi
2020-08-24 12:55 ` [PATCH v18 27/32] mm/swap.c: optimizing __pagevec_lru_add lru_lock Alex Shi
2020-08-26  9:07   ` Alex Shi
2020-08-24 12:55 ` [PATCH v18 28/32] mm/compaction: Drop locked from isolate_migratepages_block Alex Shi
2020-08-24 12:55 ` [PATCH v18 29/32] mm: Identify compound pages sooner in isolate_migratepages_block Alex Shi
2020-08-24 12:55 ` [PATCH v18 30/32] mm: Drop use of test_and_set_skip in favor of just setting skip Alex Shi
2020-08-24 12:55 ` [PATCH v18 31/32] mm: Add explicit page decrement in exception path for isolate_lru_pages Alex Shi
2020-09-09  1:01   ` Matthew Wilcox
2020-09-09 15:43     ` Alexander Duyck
2020-09-09 17:07       ` Matthew Wilcox
2020-09-09 18:24       ` Hugh Dickins
2020-09-09 20:15         ` Matthew Wilcox
2020-09-09 21:05           ` Hugh Dickins
2020-09-09 21:17         ` Alexander Duyck
2020-08-24 12:55 ` [PATCH v18 32/32] mm: Split release_pages work into 3 passes Alex Shi
2020-08-24 18:42 ` [PATCH v18 00/32] per memcg lru_lock Andrew Morton
2020-08-24 20:24   ` Hugh Dickins
2020-08-25  1:56     ` Daniel Jordan
2020-08-25  3:26       ` Alex Shi [this message]
2020-08-25 11:39         ` Matthew Wilcox
2020-08-26  1:19         ` Daniel Jordan
2020-08-26  8:59           ` Alex Shi
2020-08-28  1:40             ` Daniel Jordan
2020-08-28  5:22               ` Alex Shi
2020-09-09  2:44               ` Aaron Lu
2020-09-09 11:40                 ` Michal Hocko
2020-08-25  8:52       ` Alex Shi
2020-08-25 13:00         ` Alex Shi
2020-08-27  7:01     ` Hugh Dickins
2020-08-27 12:20       ` Race between freeing and waking page Matthew Wilcox
2020-09-08 23:41       ` [PATCH v18 00/32] per memcg lru_lock: reviews Hugh Dickins
2020-09-09  2:24         ` Wei Yang
2020-09-09 15:08         ` Alex Shi
2020-09-09 23:16           ` Hugh Dickins
2020-09-11  2:50             ` Alex Shi
2020-09-12  2:13               ` Hugh Dickins
2020-09-13 14:21                 ` Alex Shi
2020-09-15  8:21                   ` Hugh Dickins
2020-09-15 16:58                     ` Daniel Jordan
2020-09-16 12:44                       ` Alex Shi
2020-09-17  2:37                       ` Alex Shi
2020-09-17 14:35                         ` Daniel Jordan
2020-09-17 15:39                           ` Alexander Duyck
2020-09-17 16:48                             ` Daniel Jordan
2020-09-12  8:38           ` Hugh Dickins
2020-09-13 14:22             ` Alex Shi
2020-09-09 16:11         ` Alexander Duyck
2020-09-10  0:32           ` Hugh Dickins
2020-09-10 14:24             ` Alexander Duyck
2020-09-12  5:12               ` Hugh Dickins
2020-08-25  7:21   ` [PATCH v18 00/32] per memcg lru_lock Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec62a835-f79d-2b8c-99c7-120834703b42@linux.alibaba.com \
    --to=alex.shi@linux.alibaba.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.duyck@gmail.com \
    --cc=cgroups@vger.kernel.org \
    --cc=daniel.m.jordan@oracle.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=richard.weiyang@gmail.com \
    --cc=rong.a.chen@intel.com \
    --cc=shakeelb@google.com \
    --cc=shy828301@gmail.com \
    --cc=tj@kernel.org \
    --cc=vdavydov.dev@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).