On Fri, Jun 01, 2018 at 03:26:04PM +0800, Aaron Lu wrote:
> On Mon, May 28, 2018 at 07:40:19PM +0800, kernel test robot wrote:
> >
> > Greeting,
> >
> > FYI, we noticed a +23.0% improvement of vm-scalability.throughput due to commit:
> >
> >
> > commit: 309fe96bfc0ae387f53612927a8f0dc3eb056efd ("mm, memcontrol: implement memory.swap.events")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> > in testcase: vm-scalability
> > on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
> > with following parameters:
> >
> > 	runtime: 300s
> > 	size: 1T
> > 	test: lru-shm
> > 	cpufreq_governor: performance
> >
> > test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> > test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
> >
>
> With the patch I just sent out:
> "mem_cgroup: make sure moving_account, move_lock_task and stat_cpu in the
> same cacheline"
>
> Applying this commit on top doesn't yield 23% improvement any more, but
> a 6% performace drop...
> I found the culprit being the following one line introduced in this commit:
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d90b0201a8c4..07ab974c0a49 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6019,13 +6019,17 @@ int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry)
>  	if (!memcg)
>  		return 0;
>
> -	if (!entry.val)
> +	if (!entry.val) {
> +		memcg_memory_event(memcg, MEMCG_SWAP_FAIL);

Removing this line restored performance, but that really doesn't make
any sense.

Ying suggested it might be related to code alignment and suggested
using a different compiler than gcc-7.2. I then used gcc-6.4, and the
test results turned out to be pretty much the same for the two commits
(each test was run 3 times):

$ grep throughput base/*/stats.json
base/0/stats.json:  "vm-scalability.throughput": 89207489,
base/1/stats.json:  "vm-scalability.throughput": 89982933,
base/2/stats.json:  "vm-scalability.throughput": 90436592,
$ grep throughput head/*/stats.json
head/0/stats.json:  "vm-scalability.throughput": 90882775,
head/1/stats.json:  "vm-scalability.throughput": 90675220,
head/2/stats.json:  "vm-scalability.throughput": 91173479,

So it's probably really related to code alignment, and this bisected
commit doesn't cause a performance change (as expected).

>  		return 0;
> +	}
>
>  	memcg = mem_cgroup_id_get_online(memcg);
>
> If I remove that memcg_memory_event() call, performance will restore.
>
> It's beyond my understanding why this code path matters since there is
> no swap device setup in the test machine so I don't see how possible
> get_swap_page() could ever be called.
>
> Still investigating...
>
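
As a quick sanity check on the gcc-6.4 numbers above, the comparison can be
sketched in a few lines of Python. This is only an illustration and not part
of the original test setup: the six throughput values are copied from the
grep output, while the variable names and the choice of a mean plus
max-min spread are assumptions made for this sketch.

```python
from statistics import mean

# Throughput values copied from the grep output (3 runs per commit, gcc-6.4).
base = [89207489, 89982933, 90436592]
head = [90882775, 90675220, 91173479]

base_mean = mean(base)
head_mean = mean(head)

# Relative change of head over base, as a fraction of the base mean.
rel_change = (head_mean - base_mean) / base_mean

# Run-to-run spread within base (max - min), on the same scale.
base_spread = (max(base) - min(base)) / base_mean

print(f"base mean:   {base_mean:.0f}")
print(f"head mean:   {head_mean:.0f}")
print(f"rel change:  {rel_change:+.2%}")
print(f"base spread: {base_spread:.2%}")
```

With these six values the difference between the means comes out at a bit
over 1%, which is about the size of the run-to-run spread within base, so
three runs per commit do not show a meaningful change either way.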