From: Xing Zhengjun <zhengjun.xing@linux.intel.com>
To: Matthew Wilcox <willy@infradead.org>, lkp <oliver.sang@intel.com>
Cc: David Howells <dhowells@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, lkp@intel.com
Subject: Re: [LKP] Re: [mm/writeback] e5dbd33218: will-it-scale.per_process_ops -3.8% regression
Date: Wed, 28 Apr 2021 14:15:15 +0800	[thread overview]
Message-ID: <e4ea454c-aa68-24de-709b-9fee462e3dcf@linux.intel.com> (raw)
In-Reply-To: <20210423124753.GA235567@casper.infradead.org>

Hi Matthew,

On 4/23/2021 8:47 PM, Matthew Wilcox wrote:
> On Fri, Apr 23, 2021 at 01:46:01PM +0800, kernel test robot wrote:
>> FYI, we noticed a -3.8% regression of will-it-scale.per_process_ops due to commit:
>> commit: e5dbd33218bd8d87ab69f730ab90aed5fab7eb26 ("mm/writeback: Add wait_on_page_writeback_killable")
> That commit just adds a function.  It doesn't add any callers.  It must
> just be moving something around ...
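
For context, the helper that commit adds is tiny and has no in-tree
callers at that point, so only code/data layout shifts.  Roughly its
shape (paraphrased from mm/page-writeback.c; details may differ
slightly from the actual commit):

    /*
     * Wait for a page to complete writeback.  Returns -EINTR if a
     * fatal signal arrives while waiting.
     */
    int wait_on_page_writeback_killable(struct page *page)
    {
            while (PageWriteback(page)) {
                    trace_wait_on_page_writeback(page, page_mapping(page));
                    if (wait_on_page_bit_killable(page, PG_writeback))
                            return -EINTR;
            }
            return 0;
    }
    EXPORT_SYMBOL_GPL(wait_on_page_writeback_killable);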

Micro-benchmarks like will-it-scale are sensitive to text/data
alignment, so I applied the data-alignment debug patch and re-tested;
the regression drops to -1.5%.  (An illustrative sketch of the
alignment idea follows the result table below.)
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor/ucode:
lkp-csl-2sp9/will-it-scale/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3-ge5dbd33218bd-no-dynamic/gcc-9/16/process/mmap2/performance/0x5003006

commit:
   a142a3781e3dc0c03a48688cac619c2684eed18f (fs/cachefiles: Remove wait_bit_key layout dependency)
   86460bf788cb360a14811fadb3f94f9765ba5a23 (mm/writeback: Add wait_on_page_writeback_killable)

a142a3781e3dc0c0 86460bf788cb360a14811fadb3f
---------------- ---------------------------
          %stddev     %change         %stddev
              \          |                \
    9089952            -1.5%    8953838        will-it-scale.16.processes
     568121            -1.5%     559614        will-it-scale.per_process_ops
    9089952            -1.5%    8953838        will-it-scale.workload
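
To illustrate why alignment matters here: adding even an uncalled
function can shift hot text/data across cache-line or branch-predictor
boundaries, and a tight mmap/munmap loop then swings by a few percent.
The debug patch forces generous alignment so that placement luck is
taken out of the comparison.  A minimal user-space sketch of the idea
(illustrative only, not the actual LKP data-align debug patch):

    /* align_sketch.c - illustrative only.
     * Build with:  gcc -O2 -falign-functions=64 -o align_sketch align_sketch.c
     * Forcing 64-byte alignment of hot code and data means that adding
     * or removing unrelated symbols elsewhere in the binary does not
     * move this hot path across cache-line boundaries between builds.
     */
    #include <stdio.h>

    /* Keep the hot counter on its own cache line. */
    static long hot_counter __attribute__((aligned(64)));

    /* Do not inline, and start the function on a 64-byte boundary. */
    __attribute__((noinline, aligned(64)))
    static void hot_path(void)
    {
            hot_counter++;
    }

    int main(void)
    {
            for (long i = 0; i < 100000000; i++)
                    hot_path();
            printf("hot_counter = %ld\n", hot_counter);
            return 0;
    }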

>> 39f985c8f667c80a e5dbd33218bd8d87ab69f730ab9
>> ---------------- ---------------------------
>>           %stddev     %change         %stddev
>>               \          |                \
>>     9359770            -3.8%    9001769        will-it-scale.16.processes
>>      584985            -3.8%     562610        will-it-scale.per_process_ops
>>     9359770            -3.8%    9001769        will-it-scale.workload
>>       15996            -1.2%      15811        proc-vmstat.nr_kernel_stack
>>       23577 ± 10%     +18.5%      27937 ±  7%  softirqs.CPU48.SCHED
>>        5183 ± 41%     +47.2%       7630 ±  7%  interrupts.CPU1.NMI:Non-maskable_interrupts
>>        5183 ± 41%     +47.2%       7630 ±  7%  interrupts.CPU1.PMI:Performance_monitoring_interrupts
>>       54.33 ± 12%     +18.4%      64.33 ±  7%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
>>      153.34 ± 24%     -45.9%      83.00 ± 25%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
>>      153.33 ± 24%     -45.9%      82.99 ± 25%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
>>   2.424e+10            -3.8%  2.332e+10        perf-stat.i.branch-instructions
>>        0.47            +3.7%       0.48        perf-stat.i.cpi
>>   2.529e+10            -4.0%  2.428e+10        perf-stat.i.dTLB-loads
>>    1.15e+10            -3.8%  1.106e+10        perf-stat.i.dTLB-stores
>>    54249733            -4.8%   51627939        perf-stat.i.iTLB-load-misses
>>   1.004e+11            -3.8%  9.661e+10        perf-stat.i.instructions
>>        2.15            -3.6%       2.07        perf-stat.i.ipc
>>      693.66            -3.9%     666.70        perf-stat.i.metric.M/sec
>>        0.46            +3.7%       0.48        perf-stat.overall.cpi
>>        2.15            -3.6%       2.08        perf-stat.overall.ipc
>>   2.416e+10            -3.8%  2.324e+10        perf-stat.ps.branch-instructions
>>    2.52e+10            -4.0%  2.419e+10        perf-stat.ps.dTLB-loads
>>   1.146e+10            -3.8%  1.102e+10        perf-stat.ps.dTLB-stores
>>    54065825            -4.8%   51454019        perf-stat.ps.iTLB-load-misses
>>   1.001e+11            -3.8%  9.628e+10        perf-stat.ps.instructions
>>   3.025e+13            -3.9%  2.908e+13        perf-stat.total.instructions
>>        0.89 ± 14%      -0.1        0.77 ± 11%  perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.shmem_mmap.mmap_region.do_mmap
>>        0.14 ± 13%      -0.1        0.04 ± 71%  perf-profile.children.cycles-pp.common_mmap
>>        0.61 ± 12%      -0.1        0.52 ± 12%  perf-profile.children.cycles-pp.common_file_perm
>>        0.21 ±  8%      -0.0        0.17 ± 11%  perf-profile.children.cycles-pp.vma_set_page_prot
>>        0.12 ±  8%      -0.0        0.09 ± 12%  perf-profile.children.cycles-pp.blocking_notifier_call_chain
>>        0.12 ± 14%      -0.0        0.09 ± 15%  perf-profile.children.cycles-pp.get_mmap_base
>>        0.09 ±  8%      -0.0        0.07 ± 11%  perf-profile.children.cycles-pp.vm_pgprot_modify
>>        0.13 ± 15%      +0.1        0.19 ±  8%  perf-profile.children.cycles-pp.cap_capable
>>        0.03 ±102%      +0.1        0.12 ± 12%  perf-profile.children.cycles-pp.munmap@plt
>>        0.14 ± 13%      +0.1        0.24 ±  6%  perf-profile.children.cycles-pp.testcase
>>        0.33 ± 10%      -0.1        0.23 ± 10%  perf-profile.self.cycles-pp.cap_vm_enough_memory
>>        0.13 ± 11%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.common_mmap
>>        0.48 ± 12%      -0.1        0.41 ± 12%  perf-profile.self.cycles-pp.common_file_perm
>>        0.49 ± 12%      -0.1        0.43 ± 13%  perf-profile.self.cycles-pp.vm_area_alloc
>>        0.12 ±  8%      -0.0        0.09 ± 12%  perf-profile.self.cycles-pp.blocking_notifier_call_chain
>>        0.12 ± 13%      -0.0        0.09 ± 14%  perf-profile.self.cycles-pp.get_mmap_base
>>        0.11 ±  8%      +0.0        0.16 ± 10%  perf-profile.self.cycles-pp.__x64_sys_munmap
>>        0.11 ± 14%      +0.1        0.18 ±  8%  perf-profile.self.cycles-pp.cap_capable
>>        0.12 ± 11%      +0.1        0.20 ±  6%  perf-profile.self.cycles-pp.testcase
>>        0.01 ±223%      +0.1        0.11 ± 13%  perf-profile.self.cycles-pp.munmap@plt
> I'm struggling to see anything in that that says anything other than
> "we did 3-4% less work".  Maybe someone else has something useful to
> say about it?
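
A quick sanity check of "less work" from the numbers above: both the
instruction rate and the operation rate drop by the same 3.8%, so
instructions per operation are essentially unchanged (treating the
will-it-scale numbers as ops/s):

    before:  1.001e11 insns/s / 9,359,770 ops/s  ~= 10,695 insns/op
    after:   9.628e10 insns/s / 9,001,769 ops/s  ~= 10,696 insns/op

The kernel is not executing extra code per mmap/munmap; each operation
just takes more cycles (cpi +3.7%, ipc -3.6%), which is consistent with
a code/data placement effect rather than an algorithmic change.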

-- 
Zhengjun Xing


Thread overview: 7+ messages
2021-04-23  5:46 [mm/writeback] e5dbd33218: will-it-scale.per_process_ops -3.8% regression kernel test robot
2021-04-23  5:46 ` kernel test robot
2021-04-23 12:47 ` Matthew Wilcox
2021-04-23 12:47   ` Matthew Wilcox
2021-04-28  6:09   ` Xing, Zhengjun
2021-04-28  6:15   ` Xing Zhengjun [this message]
2021-04-28  6:15     ` Xing Zhengjun
