From: Alexander Fomichev <fomichev.ru@gmail.com>
To: Hillf Danton <hdanton@sina.com>
Cc: Mel Gorman <mgorman@suse.de>,
	linux-kernel@vger.kernel.org, dmaengine@vger.kernel.org,
	linux@yadro.com, Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC] Scheduler: DMA Engine regression because of sched/fair changes
Date: Wed, 19 Jan 2022 15:55:13 +0300
Message-ID: <20220119125513.bpnf563tjc2u6g47@yadro.com>
In-Reply-To: <20220118020448.2399-1-hdanton@sina.com>

On Tue, Jan 18, 2022 at 10:04:48AM +0800, Hillf Danton wrote:
> On Mon, 17 Jan 2022 20:44:19 +0300 Alexander Fomichev wrote:
> > On Mon, Jan 17, 2022 at 10:27:01AM +0000, Mel Gorman wrote:
> > > > 1) You're right. When the options "noverify=1" and "polled=1" are
> > > > used, no performance reduction occurs.
> > > 
> > > How about just noverify=1 on its own? It's a stronger indicator that
> > > cache hotness is a factor.
> > > 
> > 
> > With "noverify=1 polled=0" the performance reduction is only 10-20%,
> > but it is still there.
> > 
> > -----< v5.15.8-vanilla >-----
> > [17057.866760] dmatest: Added 1 threads using dma0chan0
> > [17060.133880] dmatest: Started 1 threads using dma0chan0
> > [17060.154343] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 49338.85 iops 3157686 KB/s (0)
> > [17063.737887] dmatest: Added 1 threads using dma0chan0
> > [17065.113838] dmatest: Started 1 threads using dma0chan0
> > [17065.137659] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 42183.41 iops 2699738 KB/s (0)
> > [17100.339989] dmatest: Added 1 threads using dma0chan0
> > [17102.190764] dmatest: Started 1 threads using dma0chan0
> > [17102.214285] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 42844.89 iops 2742073 KB/s (0)
> > -----< end >-----
> > 
> > -----< 5.15.8-ioat-ptdma-dirty-fix+ >-----
> > [ 6183.356549] dmatest: Added 1 threads using dma0chan0
> > [ 6187.868237] dmatest: Started 1 threads using dma0chan0
> > [ 6187.887389] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 52753.74 iops 3376239 KB/s (0)
> > [ 6201.913154] dmatest: Added 1 threads using dma0chan0
> > [ 6204.701340] dmatest: Started 1 threads using dma0chan0
> > [ 6204.720490] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 52614.96 iops 3367357 KB/s (0)
> > [ 6285.114603] dmatest: Added 1 threads using dma0chan0
> > [ 6287.031875] dmatest: Started 1 threads using dma0chan0
> > [ 6287.050278] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 54939.01 iops 3516097 KB/s (0)
> > -----< end >-----
> > 
> 
> Check if cold cache provides some room for selecting CPU.
> 
> Only for thoughts now.
> 
> Hillf
> 
> +++ x/kernel/sched/fair.c
> @@ -5889,19 +5889,16 @@ static int
>  wake_affine_idle(int this_cpu, int prev_cpu, int sync)
>  {
>  	/*
> -	 * If this_cpu is idle, it implies the wakeup is from interrupt
> -	 * context. Only allow the move if cache is shared. Otherwise an
> -	 * interrupt intensive workload could force all tasks onto one
> -	 * node depending on the IO topology or IRQ affinity settings.
> -	 *
> -	 * If the prev_cpu is idle and cache affine then avoid a migration.
> -	 * There is no guarantee that the cache hot data from an interrupt
> -	 * is more important than cache hot data on the prev_cpu and from
> -	 * a cpufreq perspective, it's better to have higher utilisation
> -	 * on one CPU.
> +	 * select this cpu if both are idle because of
> +	 * cold shared cache
>  	 */
> -	if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
> -		return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
> +	if (cpus_share_cache(this_cpu, prev_cpu)) {
> +		if (available_idle_cpu(this_cpu))
> +			return this_cpu;
> +
> +		if (available_idle_cpu(prev_cpu))
> +			return prev_cpu;
> +	}
>  
>  	if (sync && cpu_rq(this_cpu)->nr_running == 1)
>  		return this_cpu;

Hi Hillf,

The results with your patch are inconsistent: one run reaches about 53k
iops, while the other two stay at about 42k iops:

-----< v5.15.8-Hillf-Danton-patch+ >-----
[ 1572.178884] dmatest: Added 1 threads using dma0chan0
[ 1577.413535] dmatest: Started 1 threads using dma0chan0
[ 1577.432495] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 53188.66 iops 3404074 KB/s (0)
[ 1592.356173] dmatest: Added 1 threads using dma0chan0
[ 1593.791100] dmatest: Started 1 threads using dma0chan0
[ 1593.815282] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 41668.40 iops 2666777 KB/s (0)
[ 1617.117040] dmatest: Added 1 threads using dma0chan0
[ 1619.545890] dmatest: Started 1 threads using dma0chan0
[ 1619.569639] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 42426.81 iops 2715316 KB/s (0)
-----< end >-----

As a reminder, these are the dmatest parameters used:

/sys/module/dmatest/parameters/iterations:1000
/sys/module/dmatest/parameters/alignment:-1
/sys/module/dmatest/parameters/verbose:N
/sys/module/dmatest/parameters/norandom:Y
/sys/module/dmatest/parameters/max_channels:0
/sys/module/dmatest/parameters/dmatest:0
/sys/module/dmatest/parameters/polled:N
/sys/module/dmatest/parameters/threads_per_chan:1
/sys/module/dmatest/parameters/noverify:Y
/sys/module/dmatest/parameters/test_buf_size:1048576
/sys/module/dmatest/parameters/transfer_size:65536
/sys/module/dmatest/parameters/run:N
/sys/module/dmatest/parameters/wait:Y
/sys/module/dmatest/parameters/timeout:2000
/sys/module/dmatest/parameters/xor_sources:3
/sys/module/dmatest/parameters/pq_sources:3


-- 
Regards,
  Alexander
