linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched/fair: add protection for delta of wait time
@ 2021-01-17 12:31 Jiang Biao
  2021-01-18  7:56 ` Vincent Guittot
  0 siblings, 1 reply; 5+ messages in thread
From: Jiang Biao @ 2021-01-17 12:31 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot
  Cc: dietmar.eggemann, rostedt, bsegall, mgorman, bristot,
	linux-kernel, Jiang Biao

From: Jiang Biao <benbjiang@tencent.com>

The delta computed in update_stats_wait_end() might be negative, which
would make the statistics derived from it go wrong.

Add protection for the delta of wait time, as has already been done in
update_stats_enqueue_sleeper() for the deltas of sleep/block time.

Signed-off-by: Jiang Biao <benbjiang@tencent.com>
---
 kernel/sched/fair.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c0374c1152e0..ac950ac950bc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -917,6 +917,9 @@ update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
 
 	delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(se->statistics.wait_start);
 
+	if ((s64)delta < 0)
+		delta = 0;
+
 	if (entity_is_task(se)) {
 		p = task_of(se);
 		if (task_on_rq_migrating(p)) {
-- 
2.21.0
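
The check the patch adds relies on a common idiom: subtracting two u64
timestamps wraps modulo 2^64 when the start is later than the current
clock, and reinterpreting the result as s64 makes that wrap visible as a
negative number. A minimal user-space sketch of the idiom follows. It is
not kernel code; clamp_delta() and clamp_delta_example() are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

/* Hypothetical helper: elapsed time, clamped to zero on wrap-around. */
static u64 clamp_delta(u64 now, u64 start)
{
	u64 delta = now - start;	/* wraps modulo 2^64 if start > now */

	if ((s64)delta < 0)		/* a wrapped value has its top bit set */
		delta = 0;
	return delta;
}

/* Usage sketch: one ordinary case and one wrapped case. */
void clamp_delta_example(void)
{
	assert(clamp_delta(1000, 400) == 600);	/* normal elapsed time */
	assert(clamp_delta(400, 1000) == 0);	/* start ahead of now */
}
```

Without the clamp, the second case would yield 2^64 - 600, which would
then be added to the wait-time sums and dominate any maximum.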



* Re: [PATCH] sched/fair: add protection for delta of wait time
  2021-01-17 12:31 [PATCH] sched/fair: add protection for delta of wait time Jiang Biao
@ 2021-01-18  7:56 ` Vincent Guittot
  2021-01-18 14:11   ` Jiang Biao
  0 siblings, 1 reply; 5+ messages in thread
From: Vincent Guittot @ 2021-01-18  7:56 UTC (permalink / raw)
  To: Jiang Biao
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Jiang Biao

On Sun, 17 Jan 2021 at 13:31, Jiang Biao <benbjiang@gmail.com> wrote:
>
> From: Jiang Biao <benbjiang@tencent.com>
>
> delta in update_stats_wait_end() might be negative, which would
> make following statistics go wrong.

Could you describe the use case that generates a negative delta?

rq_clock is always increasing, so this should not lead to a negative
value even if update_stats_wait_end/start are not called in the right
order.
This situation could happen after a migration if we forgot to call
update_stats_wait_start.

>
> Add protection for delta of wait time, like what have been done in
> update_stats_enqueue_sleeper() for deltas of sleep/block time.
>
> Signed-off-by: Jiang Biao <benbjiang@tencent.com>
> ---
>  kernel/sched/fair.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c0374c1152e0..ac950ac950bc 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -917,6 +917,9 @@ update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
>
>         delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(se->statistics.wait_start);
>
> +       if ((s64)delta < 0)
> +               delta = 0;
> +
>         if (entity_is_task(se)) {
>                 p = task_of(se);
>                 if (task_on_rq_migrating(p)) {
> --
> 2.21.0
>


* Re: [PATCH] sched/fair: add protection for delta of wait time
  2021-01-18  7:56 ` Vincent Guittot
@ 2021-01-18 14:11   ` Jiang Biao
  2021-01-18 15:32     ` Vincent Guittot
  0 siblings, 1 reply; 5+ messages in thread
From: Jiang Biao @ 2021-01-18 14:11 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Jiang Biao

Hi, Vincent

On Mon, 18 Jan 2021 at 15:56, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Sun, 17 Jan 2021 at 13:31, Jiang Biao <benbjiang@gmail.com> wrote:
> >
> > From: Jiang Biao <benbjiang@tencent.com>
> >
> > delta in update_stats_wait_end() might be negative, which would
> > make following statistics go wrong.
>
> Could you describe the use case that generates a negative delta ?
>
> rq_clock is always increasing so this should not lead to a negative
> value even if update_stats_wait_end/start are not called in the right
> order,
Yes, indeed.

> This situation could happen after a migration if we forgot to call
> update_stats_wait_start
The migration case was what I worried about, but no regular use case
comes to mind. :)
As an extreme case, would it be a problem if we disable/re-enable
sched_schedstats during migration?

static inline void
update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
        u64 wait_start, prev_wait_start;

        if (!schedstat_enabled()) // disable during migration
                return; // return here, and skip updating wait_start
...
}

static inline void
update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
        struct task_struct *p;
        u64 delta;

        if (!schedstat_enabled())  // re-enable again
                return;

        /*
         * When the sched_schedstat changes from 0 to 1, some sched se
         * maybe already in the runqueue, the se->statistics.wait_start
         * will be 0.So it will let the delta wrong. We need to avoid this
         * scenario.
         */
        if (unlikely(!schedstat_val(se->statistics.wait_start)))
                return;
        // stale wait_start, which might be bigger than rq_clock, would be
        // used here. :)
        delta = rq_clock(rq_of(cfs_rq)) -
                schedstat_val(se->statistics.wait_start);
...
}

Thanks a lot.
Regards,
Jiang

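
The disable/re-enable sequence described above can be replayed in a small
user-space model. This is only a sketch under stated assumptions: every
toy_* name is hypothetical, the two runqueue clocks are assumed to be out
of sync with the destination behind the source, and the special handling
the real functions apply to migrating tasks is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

/* Toy stand-ins for the kernel objects; all names are hypothetical. */
struct toy_rq { u64 clock; };
struct toy_se { u64 wait_start; };

static bool toy_schedstat_on = true;

static void toy_wait_start(struct toy_rq *rq, struct toy_se *se)
{
	if (!toy_schedstat_on)
		return;			/* wait_start keeps its old value */
	se->wait_start = rq->clock;
}

/* Returns the raw, unclamped delta reinterpreted as a signed value. */
static s64 toy_wait_end(struct toy_rq *rq, struct toy_se *se)
{
	if (!toy_schedstat_on || !se->wait_start)
		return 0;
	return (s64)(rq->clock - se->wait_start);
}

/* Replays the sequence from the mail and returns the final delta. */
s64 toy_replay_toggle_during_migration(void)
{
	struct toy_rq src = { .clock = 1000 }, dst = { .clock = 400 };
	struct toy_se se = { 0 };

	toy_schedstat_on = true;
	toy_wait_start(&src, &se);	/* stamped with the src clock */

	toy_schedstat_on = false;	/* disabled while the task migrates */
	toy_wait_start(&dst, &se);	/* skipped */
	assert(se.wait_start == 1000);	/* the stamp is now stale */

	toy_schedstat_on = true;	/* re-enabled before wait_end */
	dst.clock += 50;
	return toy_wait_end(&dst, &se);	/* 450 - 1000 */
}
```

In this model the final delta is -550. The non-zero check on wait_start
does not help here, because the stale stamp is non-zero.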
>
> >
> > Add protection for delta of wait time, like what have been done in
> > update_stats_enqueue_sleeper() for deltas of sleep/block time.
> >
> > Signed-off-by: Jiang Biao <benbjiang@tencent.com>
> > ---
> >  kernel/sched/fair.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index c0374c1152e0..ac950ac950bc 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -917,6 +917,9 @@ update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >
> >         delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(se->statistics.wait_start);
> >
> > +       if ((s64)delta < 0)
> > +               delta = 0;
> > +
> >         if (entity_is_task(se)) {
> >                 p = task_of(se);
> >                 if (task_on_rq_migrating(p)) {
> > --
> > 2.21.0
> >


* Re: [PATCH] sched/fair: add protection for delta of wait time
  2021-01-18 14:11   ` Jiang Biao
@ 2021-01-18 15:32     ` Vincent Guittot
  2021-01-18 16:05       ` Jiang Biao
  0 siblings, 1 reply; 5+ messages in thread
From: Vincent Guittot @ 2021-01-18 15:32 UTC (permalink / raw)
  To: Jiang Biao
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Jiang Biao

On Mon, 18 Jan 2021 at 15:11, Jiang Biao <benbjiang@gmail.com> wrote:
>
> Hi, Vincent
>
> On Mon, 18 Jan 2021 at 15:56, Vincent Guittot
> <vincent.guittot@linaro.org> wrote:
> >
> > On Sun, 17 Jan 2021 at 13:31, Jiang Biao <benbjiang@gmail.com> wrote:
> > >
> > > From: Jiang Biao <benbjiang@tencent.com>
> > >
> > > delta in update_stats_wait_end() might be negative, which would
> > > make following statistics go wrong.
> >
> > Could you describe the use case that generates a negative delta ?
> >
> > rq_clock is always increasing so this should not lead to a negative
> > value even if update_stats_wait_end/start are not called in the right
> > order,
> Yes, indeed.
>
> > This situation could happen after a migration if we forgot to call
> > update_stats_wait_start
> The migration case was what I worried about, but no regular use case
> comes into my mind. :)

IIUC, you haven't faced the problem and it's only based on studying the code.

> As an extreme case, would it be a problem if we disable/re-enable
> sched_schedstats during migration?
>
> static inline void
> update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
>         u64 wait_start, prev_wait_start;
>
>         if (!schedstat_enabled()) // disable during migration
>                 return; // return here, and skip updating wait_start
> ...
> }
>
> static inline void
> update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
>         struct task_struct *p;
>         u64 delta;
>
>         if (!schedstat_enabled())  // re-enable again
>                 return;
>
>         /*
>          * When the sched_schedstat changes from 0 to 1, some sched se
>          * maybe already in the runqueue, the se->statistics.wait_start
>          * will be 0.So it will let the delta wrong. We need to avoid this
>          * scenario.
>          */
>         if (unlikely(!schedstat_val(se->statistics.wait_start)))
>                 return;
>         // stale wait_start, which might be bigger than rq_clock, would be
>         // used here. :)
>         delta = rq_clock(rq_of(cfs_rq)) -
>                 schedstat_val(se->statistics.wait_start);
> ...
> }
>
> Thanks a lot.
> Regards,
> Jiang
>
> >
> > >
> > > Add protection for delta of wait time, like what have been done in
> > > update_stats_enqueue_sleeper() for deltas of sleep/block time.
> > >
> > > Signed-off-by: Jiang Biao <benbjiang@tencent.com>
> > > ---
> > >  kernel/sched/fair.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index c0374c1152e0..ac950ac950bc 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -917,6 +917,9 @@ update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > >
> > >         delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(se->statistics.wait_start);
> > >
> > > +       if ((s64)delta < 0)
> > > +               delta = 0;
> > > +
> > >         if (entity_is_task(se)) {
> > >                 p = task_of(se);
> > >                 if (task_on_rq_migrating(p)) {
> > > --
> > > 2.21.0
> > >


* Re: [PATCH] sched/fair: add protection for delta of wait time
  2021-01-18 15:32     ` Vincent Guittot
@ 2021-01-18 16:05       ` Jiang Biao
  0 siblings, 0 replies; 5+ messages in thread
From: Jiang Biao @ 2021-01-18 16:05 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Jiang Biao

Hi, Vincent

On Mon, 18 Jan 2021 at 23:32, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Mon, 18 Jan 2021 at 15:11, Jiang Biao <benbjiang@gmail.com> wrote:
> >
> > Hi, Vincent
> >
> > On Mon, 18 Jan 2021 at 15:56, Vincent Guittot
> > <vincent.guittot@linaro.org> wrote:
> > >
> > > On Sun, 17 Jan 2021 at 13:31, Jiang Biao <benbjiang@gmail.com> wrote:
> > > >
> > > > From: Jiang Biao <benbjiang@tencent.com>
> > > >
> > > > delta in update_stats_wait_end() might be negative, which would
> > > > make following statistics go wrong.
> > >
> > > Could you describe the use case that generates a negative delta ?
> > >
> > > rq_clock is always increasing so this should not lead to a negative
> > > value even if update_stats_wait_end/start are not called in the right
> > > order,
> > Yes, indeed.
> >
> > > This situation could happen after a migration if we forgot to call
> > > update_stats_wait_start
> > The migration case was what I worried about, but no regular use case
> > comes into my mind. :)
>
> IIUC, you haven't faced the problem and it's only based on studying the code.
Not yet. :) I just noticed that there are protections for
sleep_time/block_time, but none for wait_time.

Thinking about it more, sleep_time/block_time do need to be protected
for the migration case, because update_stats_enqueue_sleeper() could be
called right after migration with the src CPU's
sleep_start/block_start. But that is not the case for wait_time.
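
To illustrate why the sleep-time delta needs the clamp, here is a
miniature user-space sketch loosely modelled on the sleep accounting. It
is not the kernel's implementation: struct toy_stats and both functions
are hypothetical, and it assumes the destination CPU's clock can be
behind the stamp taken on the source CPU.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

/* Hypothetical miniature of the sleep statistics. */
struct toy_stats {
	u64 sleep_start;	/* stamped with the clock of the CPU slept on */
	u64 sleep_max;
	u64 sum_sleep_runtime;
};

/* Called on wakeup with the clock of the CPU the task is enqueued on. */
static void toy_enqueue_sleeper(struct toy_stats *st, u64 dst_clock)
{
	u64 delta;

	if (!st->sleep_start)
		return;

	delta = dst_clock - st->sleep_start;
	if ((s64)delta < 0)	/* dst clock is behind the src stamp */
		delta = 0;

	if (delta > st->sleep_max)
		st->sleep_max = delta;
	st->sleep_start = 0;
	st->sum_sleep_runtime += delta;
}

/* Usage sketch: the task slept on a CPU whose clock read 1000 and is
 * woken on a CPU whose clock reads 400. */
void toy_sleeper_example(void)
{
	struct toy_stats st = { .sleep_start = 1000 };

	toy_enqueue_sleeper(&st, 400);
	assert(st.sum_sleep_runtime == 0);	/* clamped, not 2^64 - 600 */
	assert(st.sleep_max == 0);
	assert(st.sleep_start == 0);
}
```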

The following case might be too extreme to happen. :)

Thanks a lot for your patience.

Regards,
Jiang

>
> > As an extreme case, would it be a problem if we disable/re-enable
> > sched_schedstats during migration?
> >
> > static inline void
> > update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > {
> >         u64 wait_start, prev_wait_start;
> >
> >         if (!schedstat_enabled()) // disable during migration
> >                 return; // return here, and skip updating wait_start
> > ...
> > }
> >
> > static inline void
> > update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > {
> >         struct task_struct *p;
> >         u64 delta;
> >
> >         if (!schedstat_enabled())  // re-enable again
> >                 return;
> >
> >         /*
> >          * When the sched_schedstat changes from 0 to 1, some sched se
> >          * maybe already in the runqueue, the se->statistics.wait_start
> >          * will be 0.So it will let the delta wrong. We need to avoid this
> >          * scenario.
> >          */
> >         if (unlikely(!schedstat_val(se->statistics.wait_start)))
> >                 return;
> >         // stale wait_start, which might be bigger than rq_clock, would be
> >         // used here. :)
> >         delta = rq_clock(rq_of(cfs_rq)) -
> >                 schedstat_val(se->statistics.wait_start);
> > ...
> >
> > Thanks a lot.
> > Regards,
> > Jiang

