From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: kernel test robot <oliver.sang@intel.com>,
	John Garry <john.garry@huawei.com>
Cc: Christoph Hellwig <hch@lst.de>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	linux-ide@vger.kernel.org, lkp@lists.01.org, lkp@intel.com,
	ying.huang@intel.com, feng.tang@intel.com,
	zhengjun.xing@linux.intel.com, fengwei.yin@intel.com
Subject: Re: [ata] 0568e61225: stress-ng.copy-file.ops_per_sec -15.0% regression
Date: Mon, 8 Aug 2022 07:52:17 -0700
Message-ID: <1f498d4a-f93f-ceb4-b713-753196e5e08d@opensource.wdc.com>
In-Reply-To: <YuzPMMnnY739Tnit@xsang-OptiPlex-9020>

On 2022/08/05 1:05, kernel test robot wrote:
> 
> 
> Greetings,
> 
> FYI, we noticed a -15.0% regression of stress-ng.copy-file.ops_per_sec due to commit:
> 
> 
> commit: 0568e6122574dcc1aded2979cd0245038efe22b6 ("ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
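For reference, the change itself can be inspected directly in a linux-next
checkout, e.g.:

        git show 0568e6122574dcc1aded2979cd0245038efe22b6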
> 
> in testcase: stress-ng
> on test machine: 96 threads 2 sockets Ice Lake with 256G memory
> with following parameters:
> 
> 	nr_threads: 10%
> 	disk: 1HDD
> 	testtime: 60s
> 	fs: f2fs
> 	class: filesystem
> 	test: copy-file
> 	cpufreq_governor: performance
> 	ucode: 0xb000280

Without knowing what the device adapter is, it is hard to say where the problem
is. I suspect that with the patch applied, we may end up with a small default
max_sectors value, causing overhead from issuing more commands than necessary.

I will check what I see with my test rig.
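
One way to confirm that would be to compare the block queue limits in sysfs
between the two kernels. A rough sketch (sda is assumed here to be the HDD
under test):

    # The resolved sysfs path names the controller the disk sits behind:
    readlink -f /sys/block/sda

    # Effective per-command limit vs. hardware limit, both in KiB:
    cat /sys/block/sda/queue/max_sectors_kb
    cat /sys/block/sda/queue/max_hw_sectors_kb

If max_sectors_kb drops with the patch applied, each request carries less data
and the same workload needs more commands, which would match the regression.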

> 
> 
> 
> 
> If you fix the issue, kindly add the following tag
> Reported-by: kernel test robot <oliver.sang@intel.com>
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         sudo bin/lkp install job.yaml           # job file is attached in this email
>         bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
>         sudo bin/lkp run generated-yaml-file
> 
>         # if you come across any failure that blocks the test,
>         # please remove the ~/.lkp and /lkp dirs to run from a clean state.
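
For a quicker check without the lkp harness, the job should roughly correspond
to a direct stress-ng invocation along these lines (device, mount point and
worker count are assumptions; nr_threads at 10% of 96 CPUs is about 9 workers):

        mkfs.f2fs /dev/sda                    # the 1HDD test disk
        mount /dev/sda /mnt/f2fs
        stress-ng --copy-file 9 --temp-path /mnt/f2fs \
                --timeout 60s --metrics-brief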
> 
> =========================================================================================
> class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
>   filesystem/gcc-11/performance/1HDD/f2fs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp1/copy-file/stress-ng/60s/0xb000280
> 
> commit: 
>   4cbfca5f77 ("scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit")
>   0568e61225 ("ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors")
> 
> 4cbfca5f7750520f 0568e6122574dcc1aded2979cd0 
> ---------------- --------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>       1627           -14.9%       1385        stress-ng.copy-file.ops
>      27.01           -15.0%      22.96        stress-ng.copy-file.ops_per_sec
>    8935079           -11.9%    7870629        stress-ng.time.file_system_outputs
>      14.88 ±  5%     -31.8%      10.14 ±  3%  stress-ng.time.percent_of_cpu_this_job_got
>      50912           -14.7%      43413        vmstat.io.bo
>      93.78            +1.4%      95.10        iostat.cpu.idle
>       3.89           -31.6%       2.66        iostat.cpu.iowait
>       4.01            -1.3        2.74        mpstat.cpu.all.iowait%
>       0.23 ±  9%      -0.1        0.17 ± 11%  mpstat.cpu.all.sys%
>       1.66 ± 37%      -1.2        0.51 ± 55%  perf-profile.calltrace.cycles-pp.f2fs_write_end.generic_perform_write.f2fs_buffered_write_iter.f2fs_file_write_iter.do_iter_readv_writev
>       1.66 ± 37%      -1.1        0.59 ± 25%  perf-profile.children.cycles-pp.f2fs_write_end
>       1.51 ± 40%      -1.1        0.45 ± 26%  perf-profile.children.cycles-pp.f2fs_dirty_data_folio
>       1.21 ± 49%      -1.0        0.23 ± 33%  perf-profile.children.cycles-pp.f2fs_update_dirty_folio
>       0.88 ± 56%      -0.8        0.04 ±111%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
>       0.14 ± 26%      +0.1        0.25 ± 28%  perf-profile.children.cycles-pp.page_cache_ra_unbounded
>       0.88 ± 56%      -0.8        0.04 ±112%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>    3164876 ±  9%     -20.2%    2524713 ±  7%  perf-stat.i.cache-misses
>  4.087e+08            -4.6%  3.899e+08        perf-stat.i.dTLB-loads
>     313050 ± 10%     -18.4%     255410 ±  6%  perf-stat.i.node-loads
>     972573 ±  9%     -16.4%     812873 ±  6%  perf-stat.i.node-stores
>    3114748 ±  9%     -20.2%    2484807 ±  7%  perf-stat.ps.cache-misses
>  4.022e+08            -4.6%  3.837e+08        perf-stat.ps.dTLB-loads
>     308178 ± 10%     -18.4%     251418 ±  6%  perf-stat.ps.node-loads
>     956996 ±  9%     -16.4%     799948 ±  6%  perf-stat.ps.node-stores
>     358486            -8.3%     328694        proc-vmstat.nr_active_file
>    1121620           -11.9%     987816        proc-vmstat.nr_dirtied
>     179906            -6.7%     167912        proc-vmstat.nr_dirty
>    1151201            -1.7%    1131322        proc-vmstat.nr_file_pages
>     100181            +9.9%     110078 ±  2%  proc-vmstat.nr_inactive_file
>     846362           -14.6%     722471        proc-vmstat.nr_written
>     358486            -8.3%     328694        proc-vmstat.nr_zone_active_file
>     100181            +9.9%     110078 ±  2%  proc-vmstat.nr_zone_inactive_file
>     180668            -6.8%     168456        proc-vmstat.nr_zone_write_pending
>     556469            -3.5%     536985        proc-vmstat.pgactivate
>    3385454           -14.6%    2889953        proc-vmstat.pgpgout
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 


-- 
Damien Le Moal
Western Digital Research

Thread overview:
2022-08-05  8:05 [ata] 0568e61225: stress-ng.copy-file.ops_per_sec -15.0% regression kernel test robot
2022-08-08 14:52 ` Damien Le Moal [this message]
2022-08-09  9:58   ` John Garry
2022-08-09 14:16     ` John Garry
2022-08-09 14:57       ` Damien Le Moal
2022-08-10  8:33         ` John Garry
2022-08-10 13:52           ` Damien Le Moal
2022-08-09 14:55     ` Damien Le Moal
2022-08-09 15:16       ` David Laight
2022-08-10 13:57         ` Damien Le Moal
2022-08-12  5:01       ` Oliver Sang
2022-08-12 11:13         ` John Garry
2022-08-12 14:58           ` John Garry
2022-08-16  6:57             ` Oliver Sang
2022-08-16 10:35               ` John Garry
2022-08-16 15:42                 ` Damien Le Moal
2022-08-16 16:38                   ` John Garry
2022-08-16 20:02                     ` Damien Le Moal
2022-08-16 20:44                       ` John Garry
2022-08-17 15:55                         ` Damien Le Moal
2022-08-17 13:51                     ` Oliver Sang
2022-08-17 14:04                       ` John Garry
2022-08-18  2:06                         ` Oliver Sang
2022-08-18  9:28                           ` John Garry
2022-08-19  6:24                             ` Oliver Sang
2022-08-19  7:54                               ` John Garry
2022-08-20 16:36                               ` Damien Le Moal
2022-08-12 15:41           ` Damien Le Moal
2022-08-12 17:17             ` John Garry
2022-08-12 18:27               ` Damien Le Moal
2022-08-13  7:23                 ` John Garry
2022-08-16  2:52           ` Oliver Sang
