All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhang Qiao <zhangqiao22@huawei.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: <linux-kernel@vger.kernel.org>, <mingo@redhat.com>,
	<juri.lelli@redhat.com>, <dietmar.eggemann@arm.com>,
	<rostedt@goodmis.org>, <bsegall@google.com>, <mgorman@suse.de>,
	<bristot@redhat.com>, <vschneid@redhat.com>, <rkagan@amazon.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v2] sched/fair: sanitize vruntime of entity being migrated
Date: Sat, 11 Mar 2023 17:57:04 +0800	[thread overview]
Message-ID: <ef2f07f1-fe3a-624f-52e7-1089138dc137@huawei.com> (raw)
In-Reply-To: <ZAs+zV0o9ShO7nLT@vingu-book>



在 2023/3/10 22:29, Vincent Guittot 写道:
> Le jeudi 09 mars 2023 à 16:14:38 (+0100), Vincent Guittot a écrit :
>> On Thu, 9 Mar 2023 at 15:37, Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> On Thu, Mar 09, 2023 at 03:28:25PM +0100, Peter Zijlstra wrote:
>>>> On Thu, Mar 09, 2023 at 02:34:05PM +0100, Vincent Guittot wrote:
>>>>
>>>>> Then, even if we don't clear exec_start before migrating and keep
>>>>> current value to be used in place_entity on the new cpu, we can't
>>>>> compare the rq_clock_task(rq_of(cfs_rq)) of 2 different rqs AFAICT
>>>>
>>>> Blergh -- indeed, irq and steal time can skew them between CPUs :/
>>>> I suppose we can fudge that... wait_start (which is basically what we're
>>>> making it do) also does that IIRC.
>>>>
>>>> I really dislike having this placement muck spreadout like proposed.
>>>
>>> Also, I think we might be over-engineering this, we don't care about
>>> accuracy at all, all we really care about is 'long-time'.
>>
>> you mean taking the patch 1/2 that you mentioned here to add a
>> migrated field:
>> https://lore.kernel.org/all/68832dfbb60fda030540b5f4e39c5801942689b1.1648228023.git.tim.c.chen@linux.intel.com/T/#ma5637eb8010f3f4a4abff778af8db705429d003b
>>
>> And assume that the divergence between the rq_clock_task() can be ignored ?
>>
>> That could probably work but we need to replace the (60LL *
>> NSEC_PER_SEC) by ((1ULL << 63) / NICE_0_LOAD) because 60sec divergence
>> would not be unrealistic.
>> and a comment to explain why it's acceptable
> 
> Zhang,
> 
> Could you try the patch below ?
> This is a rebase/merge/update of:
> -patch 1/2 above and 
> -https://lore.kernel.org/lkml/20230209193107.1432770-1-rkagan@amazon.de/


I applyed and tested this patch, and it make hackbench slower.
According to my previous test results. The good result is 82.1(s).
But the result of this patch is 108.725(s).


> version1: v6.2
> version2: v6.2 + commit 829c1651e9c4
> version3: v6.2 + commit 829c1651e9c4 + this patch
>
> -------------------------------------------------
> 	version1	version2	version3
> test1	81.0 		118.1 		82.1
> test2	82.1 		116.9 		80.3
> test3	83.2 		103.9 		83.3
> avg(s)	82.1 		113.0 		81.9
>
> -------------------------------------------------
> 
> The proposal accepts a divergence of up to 52 days between the 2 rqs.
> 
> If this work, we will prepare a proper patch
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 63d242164b1a..cb8af0a137f7 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -550,6 +550,7 @@ struct sched_entity {
>         struct rb_node                  run_node;
>         struct list_head                group_node;
>         unsigned int                    on_rq;
> +       unsigned int                    migrated;
> 
>         u64                             exec_start;
>         u64                             sum_exec_runtime;
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7a1b1f855b96..36acd9598b40 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1057,6 +1057,7 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
>         /*
>          * We are starting a new run period:
>          */
> +       se->migrated = 0;
>         se->exec_start = rq_clock_task(rq_of(cfs_rq));
>  }
> 
> @@ -4684,13 +4685,23 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> 
>         /*
>          * Pull vruntime of the entity being placed to the base level of
> -        * cfs_rq, to prevent boosting it if placed backwards.  If the entity
> -        * slept for a long time, don't even try to compare its vruntime with
> -        * the base as it may be too far off and the comparison may get
> -        * inversed due to s64 overflow.
> +        * cfs_rq, to prevent boosting it if placed backwards.
> +        * However, min_vruntime can advance much faster than real time, with
> +        * the exterme being when an entity with the minimal weight always runs
> +        * on the cfs_rq. If the new entity slept for long, its vruntime
> +        * difference from min_vruntime may overflow s64 and their comparison
> +        * may get inversed, so ignore the entity's original vruntime in that
> +        * case.
> +        * The maximal vruntime speedup is given by the ratio of normal to
> +        * minimal weight: NICE_0_LOAD / MIN_SHARES, so cutting off on the

why not is `scale_load_down(NICE_0_LOAD) / MIN_SHARES` here ?


> +        * sleep time of 2^63 / NICE_0_LOAD should be safe.
> +        * When placing a migrated waking entity, its exec_start has been set
> +        * from a different rq. In order to take into account a possible
> +        * divergence between new and prev rq's clocks task because of irq and

This divergence might be larger, it cause `sleep_time` maybe negative.

> +        * stolen time, we take an additional margin.
>          */
>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> -       if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> +       if ((s64)sleep_time > (1ULL << 63) / NICE_0_LOAD / 2)>                 se->vruntime = vruntime;
>         else
>                 se->vruntime = max_vruntime(se->vruntime, vruntime);
> @@ -7658,7 +7669,7 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>         se->avg.last_update_time = 0;
> 
>         /* We have migrated, no longer consider this task hot */
> -       se->exec_start = 0;
> +       se->migrated = 1;
> 
>         update_scan_period(p, new_cpu);
>  }
> @@ -8344,6 +8355,9 @@ static int task_hot(struct task_struct *p, struct lb_env *env)
>         if (sysctl_sched_migration_cost == 0)
>                 return 0;
> 
> +       if (p->se.migrated)
> +               return 0;
> +
>         delta = rq_clock_task(env->src_rq) - p->se.exec_start;
> 
>         return delta < (s64)sysctl_sched_migration_cost;
> 
> 
> 
>>
>>
>>>
>>>
> .
> 

  reply	other threads:[~2023-03-11  9:57 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-06 13:24 [PATCH v2] sched/fair: sanitize vruntime of entity being migrated Zhang Qiao
2023-03-06 13:53 ` Vincent Guittot
2023-03-07 10:26   ` Vincent Guittot
2023-03-07 11:05     ` Zhang Qiao
2023-03-07 13:41     ` Zhang Qiao
2023-03-08  8:01       ` Vincent Guittot
2023-03-08 12:55         ` Vincent Guittot
2023-03-09  8:37           ` Zhang Qiao
2023-03-09  9:09             ` Dietmar Eggemann
2023-03-09  9:30               ` Zhang Qiao
2023-03-09 10:48             ` Vincent Guittot
2023-03-09 14:23               ` Zhang Qiao
2023-03-07  2:16 ` kernel test robot
2023-03-07 12:45 ` Dietmar Eggemann
2023-03-07 14:06   ` Zhang Qiao
2023-03-09  9:43   ` Zhang Qiao
2023-03-08 14:33 ` Chen Yu
2023-03-09 13:05 ` Peter Zijlstra
2023-03-09 13:34   ` Vincent Guittot
2023-03-09 14:28     ` Peter Zijlstra
2023-03-09 14:36       ` Peter Zijlstra
2023-03-09 15:14         ` Vincent Guittot
2023-03-10 14:29           ` Vincent Guittot
2023-03-11  9:57             ` Zhang Qiao [this message]
2023-03-13 14:23               ` Vincent Guittot
2023-03-14 11:03                 ` Zhang Qiao
2023-03-14 13:26                   ` Vincent Guittot
2023-03-14 13:38                     ` Zhang Qiao
2023-03-14 13:39                       ` Vincent Guittot
2023-03-14 15:32                         ` Vincent Guittot
2023-03-15  9:16                           ` Zhang Qiao
2023-03-15 15:30                             ` Vincent Guittot
2023-03-13  9:06             ` Dietmar Eggemann
2023-03-13 18:17               ` Dietmar Eggemann
2023-03-14  7:41                 ` Vincent Guittot
2023-03-14 12:07                   ` Peter Zijlstra
2023-03-14 13:24                     ` Vincent Guittot
2023-03-14 17:16                       ` Peter Zijlstra
2023-03-15  7:18                         ` Vincent Guittot
2023-03-15  8:42                           ` Vincent Guittot
2023-03-15 10:15                             ` Dietmar Eggemann
2023-03-15 10:21                               ` Vincent Guittot
2023-03-15 13:35                                 ` Dietmar Eggemann
2023-03-15 15:32                                   ` Vincent Guittot
2023-03-14 13:29                     ` Dietmar Eggemann
2023-03-14 13:37                       ` Dietmar Eggemann
2023-03-17 16:08 Vincent Guittot
2023-03-18  7:45 ` Zhang Qiao
2023-03-20 12:29   ` Dietmar Eggemann
2023-03-20 13:26     ` Vincent Guittot
2023-03-21 10:02 ` Peter Zijlstra
2023-03-21 10:29   ` Dietmar Eggemann
2023-03-21 10:49     ` Peter Zijlstra
2023-03-21 11:12       ` Vincent Guittot
2023-03-21 11:13       ` Dietmar Eggemann
2023-03-21 12:26         ` Peter Zijlstra
2023-03-21 12:28 ` Peter Zijlstra
2023-03-21 12:38   ` Vincent Guittot
2023-03-24  4:05 ` Chen Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ef2f07f1-fe3a-624f-52e7-1089138dc137@huawei.com \
    --to=zhangqiao22@huawei.com \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rkagan@amazon.de \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.