* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
@ 2020-11-18 22:56 Guilherme G. Piccoli
  2020-11-18 23:30 ` Tao Zhou
  2020-11-18 23:50 ` Tao Zhou
  0 siblings, 2 replies; 12+ messages in thread
From: Guilherme G. Piccoli @ 2020-11-18 22:56 UTC (permalink / raw)
  To: vincent.guittot
  Cc: bsegall, dietmar.eggemann, juri.lelli, zohooouoto, mgorman,
	mingo, ouwen210, pauld, peterz, pkondeti, rostedt, Jay Vosburgh,
	Gavin Guo, halves, nivedita.singhvi, linux-kernel, gpiccoli

Hi Vincent (and all CCed), I'm sorry to ping about such an "old" patch, but
we experienced a condition similar to the one this patch addresses. It was
on an older kernel (4.15.x), but when suggesting that the users move to an
updated 5.4.x kernel, we noticed that this patch is not there, although
similar ones are (like [0] and [1]).

So, I'd like to ask if there's any particular reason not to backport
this fix to stable kernels, especially the longterm 5.4. The main reason
behind the question is that the code is very complex for developers without
scheduler experience, and I'm afraid of suggesting such a backport to 5.4
and introducing hard-to-debug issues.

Let me know your thoughts Vincent (and all CCed), thanks in advance.
Cheers,


Guilherme


P.S. For those that deleted this thread from the email client, here's a
link:
https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/


[0]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb

[1]
https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
<- great thread BTW!


* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-18 22:56 [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list Guilherme G. Piccoli
@ 2020-11-18 23:30 ` Tao Zhou
  2020-11-18 23:50 ` Tao Zhou
  1 sibling, 0 replies; 12+ messages in thread
From: Tao Zhou @ 2020-11-18 23:30 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: vincent.guittot, bsegall, dietmar.eggemann, juri.lelli,
	zohooouoto, mgorman, mingo, ouwen210, pauld, peterz, pkondeti,
	rostedt, Jay Vosburgh, Gavin Guo, halves, nivedita.singhvi,
	linux-kernel, t1zhou

Hi Guilherme,

On Wed, Nov 18, 2020 at 07:56:38PM -0300, Guilherme G. Piccoli wrote:
> Hi Vincent (and all CCed), I'm sorry to ping about such "old" patch, but
> we experienced a similar condition to what this patch addresses; it's an
> older kernel (4.15.x) but when suggesting the users to move to an
> updated 5.4.x kernel, we noticed that this patch is not there, although
> similar ones are (like [0] and [1]).
> 
> So, I'd like to ask if there's any particular reason to not backport
> this fix to stable kernels, specially the longterm 5.4. The main reason
> behind the question is that the code is very complex for non-experienced
> scheduler developers, and I'm afraid in suggesting such backport to 5.4
> and introduce complex-to-debug issues.
> 
> Let me know your thoughts Vincent (and all CCed), thanks in advance.
> Cheers,
> 
> 
> Guilherme
> 
> 
> P.S. For those that deleted this thread from the email client, here's a
> link:
> https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/
> 
> 
> [0]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb
> 
> [1]
> https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
> <- great thread BTW!

Backporting this patch to 5.4 needs runnable_avg, but that had not been
introduced in 5.4 at that time (please correct me if I am wrong).


Thanks,
Tao



* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-18 22:56 [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list Guilherme G. Piccoli
  2020-11-18 23:30 ` Tao Zhou
@ 2020-11-18 23:50 ` Tao Zhou
  2020-11-19  0:33   ` Tao Zhou
  1 sibling, 1 reply; 12+ messages in thread
From: Tao Zhou @ 2020-11-18 23:50 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: vincent.guittot, bsegall, dietmar.eggemann, juri.lelli,
	zohooouoto, mgorman, mingo, ouwen210, pauld, peterz, pkondeti,
	rostedt, Jay Vosburgh, Gavin Guo, halves, nivedita.singhvi,
	linux-kernel, t1zhou

On Wed, Nov 18, 2020 at 07:56:38PM -0300, Guilherme G. Piccoli wrote:
> Hi Vincent (and all CCed), I'm sorry to ping about such "old" patch, but
> we experienced a similar condition to what this patch addresses; it's an
> older kernel (4.15.x) but when suggesting the users to move to an
> updated 5.4.x kernel, we noticed that this patch is not there, although
> similar ones are (like [0] and [1]).
> 
> So, I'd like to ask if there's any particular reason to not backport
> this fix to stable kernels, specially the longterm 5.4. The main reason
> behind the question is that the code is very complex for non-experienced
> scheduler developers, and I'm afraid in suggesting such backport to 5.4
> and introduce complex-to-debug issues.
> 
> Let me know your thoughts Vincent (and all CCed), thanks in advance.
> Cheers,
> 
> 
> Guilherme
> 
> 
> P.S. For those that deleted this thread from the email client, here's a
> link:
> https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/
> 
> 
> [0]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb
> 
> [1]
> https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
> <- great thread BTW!

'"sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list" failed to apply to
5.4-stable tree'

You could check the message above, but I do not have a link to it. I can't
find it by searching the LKML web archive: https://lore.kernel.org/lkml/

BTW: 'ouwen210@hotmail.com' and 'zohooouoto@zoho.com.cn' are both me.

Sorry for the confusion.

Thanks.



* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-18 23:50 ` Tao Zhou
@ 2020-11-19  0:33   ` Tao Zhou
  2020-11-19  8:36     ` Vincent Guittot
  0 siblings, 1 reply; 12+ messages in thread
From: Tao Zhou @ 2020-11-19  0:33 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: vincent.guittot, bsegall, dietmar.eggemann, juri.lelli,
	zohooouoto, mgorman, mingo, ouwen210, pauld, peterz, pkondeti,
	rostedt, Jay Vosburgh, Gavin Guo, halves, nivedita.singhvi,
	linux-kernel, t1zhou

On Thu, Nov 19, 2020 at 07:50:15AM +0800, Tao Zhou wrote:
> On Wed, Nov 18, 2020 at 07:56:38PM -0300, Guilherme G. Piccoli wrote:
> > Hi Vincent (and all CCed), I'm sorry to ping about such "old" patch, but
> > we experienced a similar condition to what this patch addresses; it's an
> > older kernel (4.15.x) but when suggesting the users to move to an
> > updated 5.4.x kernel, we noticed that this patch is not there, although
> > similar ones are (like [0] and [1]).
> > 
> > So, I'd like to ask if there's any particular reason to not backport
> > this fix to stable kernels, specially the longterm 5.4. The main reason
> > behind the question is that the code is very complex for non-experienced
> > scheduler developers, and I'm afraid in suggesting such backport to 5.4
> > and introduce complex-to-debug issues.
> > 
> > Let me know your thoughts Vincent (and all CCed), thanks in advance.
> > Cheers,
> > 
> > 
> > Guilherme
> > 
> > 
> > P.S. For those that deleted this thread from the email client, here's a
> > link:
> > https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/
> > 
> > 
> > [0]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb
> > 
> > [1]
> > https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
> > <- great thread BTW!
> 
> 'sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list" failed to apply to
> 5.4-stable tree'
> 
> You could check above. But I do not have the link about this. Can't search it
> on LKML web: https://lore.kernel.org/lkml/
> 
> BTW: 'ouwen210@hotmail.com' and 'zohooouoto@zoho.com.cn' all is myself.
> 
> Sorry for the confusing..
> 
> Thanks.

Sorry again, I forgot something. It is on the stable list.

Here it is:

  https://lore.kernel.org/stable/159041776924279@kroah.com/



* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-19  0:33   ` Tao Zhou
@ 2020-11-19  8:36     ` Vincent Guittot
  2020-11-19 11:34       ` Guilherme G. Piccoli
  0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2020-11-19  8:36 UTC (permalink / raw)
  To: Tao Zhou, Guilherme G. Piccoli, gregkh, Sasha Levin, SeongJae Park
  Cc: Ben Segall, Dietmar Eggemann, Juri Lelli, Tao Zhou, Mel Gorman,
	Ingo Molnar, Tao Zhou, Phil Auld, Peter Zijlstra, Pavan Kondeti,
	Steven Rostedt, Jay Vosburgh, Gavin Guo, halves,
	nivedita.singhvi, linux-kernel, # v4 . 16+

On Thu, 19 Nov 2020 at 01:36, Tao Zhou <t1zhou@163.com> wrote:
>
> On Thu, Nov 19, 2020 at 07:50:15AM +0800, Tao Zhou wrote:
> > On Wed, Nov 18, 2020 at 07:56:38PM -0300, Guilherme G. Piccoli wrote:
> > > Hi Vincent (and all CCed), I'm sorry to ping about such "old" patch, but
> > > we experienced a similar condition to what this patch addresses; it's an
> > > older kernel (4.15.x) but when suggesting the users to move to an
> > > updated 5.4.x kernel, we noticed that this patch is not there, although
> > > similar ones are (like [0] and [1]).
> > >
> > > So, I'd like to ask if there's any particular reason to not backport
> > > this fix to stable kernels, specially the longterm 5.4. The main reason
> > > behind the question is that the code is very complex for non-experienced
> > > scheduler developers, and I'm afraid in suggesting such backport to 5.4
> > > and introduce complex-to-debug issues.
> > >
> > > Let me know your thoughts Vincent (and all CCed), thanks in advance.
> > > Cheers,
> > >
> > >
> > > Guilherme
> > >
> > >
> > > P.S. For those that deleted this thread from the email client, here's a
> > > link:
> > > https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/
> > >
> > >
> > > [0]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb
> > >
> > > [1]
> > > https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
> > > <- great thread BTW!
> >
> > 'sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list" failed to apply to
> > 5.4-stable tree'
> >
> > You could check above. But I do not have the link about this. Can't search it
> > on LKML web: https://lore.kernel.org/lkml/
> >
> > BTW: 'ouwen210@hotmail.com' and 'zohooouoto@zoho.com.cn' all is myself.
> >
> > Sorry for the confusing..
> >
> > Thanks.
>
> Sorry again. I forget something. It is in the stable.
>
> Here it is:
>
>   https://lore.kernel.org/stable/159041776924279@kroah.com/

I think it has never been applied to stable.
As you mentioned, the backport has been sent:
https://lore.kernel.org/stable/20200525172709.GB7427@vingu-book/

I received another email in September and pointed to the
backport: https://www.spinics.net/lists/stable/msg410445.html


>


* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-19  8:36     ` Vincent Guittot
@ 2020-11-19 11:34       ` Guilherme G. Piccoli
  2020-11-19 13:25         ` Vincent Guittot
  0 siblings, 1 reply; 12+ messages in thread
From: Guilherme G. Piccoli @ 2020-11-19 11:34 UTC (permalink / raw)
  To: Vincent Guittot, Tao Zhou
  Cc: gregkh, Sasha Levin, SeongJae Park, Ben Segall, Dietmar Eggemann,
	Juri Lelli, Tao Zhou, Mel Gorman, Ingo Molnar, Tao Zhou,
	Phil Auld, Peter Zijlstra, Pavan Kondeti, Steven Rostedt,
	Jay Vosburgh, Gavin Guo, halves, nivedita.singhvi, linux-kernel,
	# v4 . 16+



On 19/11/2020 05:36, Vincent Guittot wrote:
> On Thu, 19 Nov 2020 at 01:36, Tao Zhou <t1zhou@163.com> wrote:
>>
>> On Thu, Nov 19, 2020 at 07:50:15AM +0800, Tao Zhou wrote:
>>> On Wed, Nov 18, 2020 at 07:56:38PM -0300, Guilherme G. Piccoli wrote:
>>>> Hi Vincent (and all CCed), I'm sorry to ping about such "old" patch, but
>>>> we experienced a similar condition to what this patch addresses; it's an
>>>> older kernel (4.15.x) but when suggesting the users to move to an
>>>> updated 5.4.x kernel, we noticed that this patch is not there, although
>>>> similar ones are (like [0] and [1]).
>>>>
>>>> So, I'd like to ask if there's any particular reason to not backport
>>>> this fix to stable kernels, specially the longterm 5.4. The main reason
>>>> behind the question is that the code is very complex for non-experienced
>>>> scheduler developers, and I'm afraid in suggesting such backport to 5.4
>>>> and introduce complex-to-debug issues.
>>>>
>>>> Let me know your thoughts Vincent (and all CCed), thanks in advance.
>>>> Cheers,
>>>>
>>>>
>>>> Guilherme
>>>>
>>>>
>>>> P.S. For those that deleted this thread from the email client, here's a
>>>> link:
>>>> https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/
>>>>
>>>>
>>>> [0]
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb
>>>>
>>>> [1]
>>>> https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
>>>> <- great thread BTW!
>>>
>>> 'sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list" failed to apply to
>>> 5.4-stable tree'
>>>
>>> You could check above. But I do not have the link about this. Can't search it
>>> on LKML web: https://lore.kernel.org/lkml/
>>>
>>> BTW: 'ouwen210@hotmail.com' and 'zohooouoto@zoho.com.cn' all is myself.
>>>
>>> Sorry for the confusing..
>>>
>>> Thanks.
>>
>> Sorry again. I forget something. It is in the stable.
>>
>> Here it is:
>>
>>   https://lore.kernel.org/stable/159041776924279@kroah.com/
> 
> I think it has never been applied to stable.
> As you mentioned, the backport has been sent :
> https://lore.kernel.org/stable/20200525172709.GB7427@vingu-book/
> 
> I received another emailed in September and pointed out to the
> backport : https://www.spinics.net/lists/stable/msg410445.html
> 
> 
>>

Thanks a lot Tao and Vincent! Nice to know that you already worked on the
backport; it gives much more confidence when the author does that, heheh.

So, this should go to stable 5.4.y, but not 4.19.y, IIUC?
Cheers,


Guilherme


* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-19 11:34       ` Guilherme G. Piccoli
@ 2020-11-19 13:25         ` Vincent Guittot
  2020-11-19 14:07           ` Guilherme Piccoli
  2021-06-24 10:29           ` Po-Hsu Lin
  0 siblings, 2 replies; 12+ messages in thread
From: Vincent Guittot @ 2020-11-19 13:25 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: Tao Zhou, gregkh, Sasha Levin, SeongJae Park, Ben Segall,
	Dietmar Eggemann, Juri Lelli, Tao Zhou, Mel Gorman, Ingo Molnar,
	Tao Zhou, Phil Auld, Peter Zijlstra, Pavan Kondeti,
	Steven Rostedt, Jay Vosburgh, Gavin Guo, halves,
	nivedita.singhvi, linux-kernel, # v4 . 16+

On Thu, 19 Nov 2020 at 12:36, Guilherme G. Piccoli
<gpiccoli@canonical.com> wrote:
>
>
>
> On 19/11/2020 05:36, Vincent Guittot wrote:
> > On Thu, 19 Nov 2020 at 01:36, Tao Zhou <t1zhou@163.com> wrote:
> >>
> >> On Thu, Nov 19, 2020 at 07:50:15AM +0800, Tao Zhou wrote:
> >>> On Wed, Nov 18, 2020 at 07:56:38PM -0300, Guilherme G. Piccoli wrote:
> >>>> Hi Vincent (and all CCed), I'm sorry to ping about such "old" patch, but
> >>>> we experienced a similar condition to what this patch addresses; it's an
> >>>> older kernel (4.15.x) but when suggesting the users to move to an
> >>>> updated 5.4.x kernel, we noticed that this patch is not there, although
> >>>> similar ones are (like [0] and [1]).
> >>>>
> >>>> So, I'd like to ask if there's any particular reason to not backport
> >>>> this fix to stable kernels, specially the longterm 5.4. The main reason
> >>>> behind the question is that the code is very complex for non-experienced
> >>>> scheduler developers, and I'm afraid in suggesting such backport to 5.4
> >>>> and introduce complex-to-debug issues.
> >>>>
> >>>> Let me know your thoughts Vincent (and all CCed), thanks in advance.
> >>>> Cheers,
> >>>>
> >>>>
> >>>> Guilherme
> >>>>
> >>>>
> >>>> P.S. For those that deleted this thread from the email client, here's a
> >>>> link:
> >>>> https://lore.kernel.org/lkml/20200513135528.4742-1-vincent.guittot@linaro.org/
> >>>>
> >>>>
> >>>> [0]
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe61468b2cb
> >>>>
> >>>> [1]
> >>>> https://lore.kernel.org/lkml/20200506141821.GA9773@lorien.usersys.redhat.com/
> >>>> <- great thread BTW!
> >>>
> >>> 'sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list" failed to apply to
> >>> 5.4-stable tree'
> >>>
> >>> You could check above. But I do not have the link about this. Can't search it
> >>> on LKML web: https://lore.kernel.org/lkml/
> >>>
> >>> BTW: 'ouwen210@hotmail.com' and 'zohooouoto@zoho.com.cn' all is myself.
> >>>
> >>> Sorry for the confusing..
> >>>
> >>> Thanks.
> >>
> >> Sorry again. I forget something. It is in the stable.
> >>
> >> Here it is:
> >>
> >>   https://lore.kernel.org/stable/159041776924279@kroah.com/
> >
> > I think it has never been applied to stable.
> > As you mentioned, the backport has been sent :
> > https://lore.kernel.org/stable/20200525172709.GB7427@vingu-book/
> >
> > I received another emailed in September and pointed out to the
> > backport : https://www.spinics.net/lists/stable/msg410445.html
> >
> >
> >>
>
> Thanks a lot Tao and Vincent! Nice to know that you already worked the
> backport, gives much more confidence when the author does that heheh
>
> So, this should go to stable 5.4.y, but not 4.19.y IIUC?

Yeah, they should be backported as far back as v5.1, but not earlier.

Regards,
Vincent

> Cheers,
>
>
> Guilherme


* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-19 13:25         ` Vincent Guittot
@ 2020-11-19 14:07           ` Guilherme Piccoli
  2021-06-24 10:29           ` Po-Hsu Lin
  1 sibling, 0 replies; 12+ messages in thread
From: Guilherme Piccoli @ 2020-11-19 14:07 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Tao Zhou, gregkh, Sasha Levin, SeongJae Park, Ben Segall,
	Dietmar Eggemann, Juri Lelli, Tao Zhou, Mel Gorman, Ingo Molnar,
	Tao Zhou, Phil Auld, Peter Zijlstra, Pavan Kondeti,
	Steven Rostedt, Jay Vosburgh, Gavin Guo,
	Heitor R. Alves de Siqueira, Nivedita Singhvi, linux-kernel,
	# v4 . 16+

Thank you Vincent, much appreciated! I'll respond in the patch thread;
hopefully we can get that included in 5.4.y.

Cheers,


Guilherme


* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-11-19 13:25         ` Vincent Guittot
  2020-11-19 14:07           ` Guilherme Piccoli
@ 2021-06-24 10:29           ` Po-Hsu Lin
  2021-06-24 12:31             ` Vincent Guittot
  1 sibling, 1 reply; 12+ messages in thread
From: Po-Hsu Lin @ 2021-06-24 10:29 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Guilherme G. Piccoli, Tao Zhou, gregkh, Sasha Levin,
	SeongJae Park, Ben Segall, Dietmar Eggemann, Juri Lelli,
	Tao Zhou, Mel Gorman, Ingo Molnar, Tao Zhou, Phil Auld,
	Peter Zijlstra, Pavan Kondeti, Steven Rostedt, Jay Vosburgh,
	Gavin Guo, halves, nivedita.singhvi, linux-kernel, # v4 . 16+

Hello Vincent,

Sorry to resurrect this thread again.
I was trying to backport this patch and the corresponding fixes to our
Ubuntu 4.15 kernel [1] to fix an issue reported by the LTP cfs_bandwidth01
test [2], and my colleague Guilherme told me there was once a discussion
about backporting it in this thread.

You mentioned here that this should not be backported to earlier stable
kernels. I am curious whether there is any specific reason for that. Too
risky, maybe?
Thanks!
PHLin

[1] https://lists.ubuntu.com/archives/kernel-team/2021-June/121571.html
[2] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/cfs_bandwidth01.c




* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2021-06-24 10:29           ` Po-Hsu Lin
@ 2021-06-24 12:31             ` Vincent Guittot
  0 siblings, 0 replies; 12+ messages in thread
From: Vincent Guittot @ 2021-06-24 12:31 UTC (permalink / raw)
  To: Po-Hsu Lin
  Cc: Guilherme G. Piccoli, Tao Zhou, gregkh, Sasha Levin,
	SeongJae Park, Ben Segall, Dietmar Eggemann, Juri Lelli,
	Tao Zhou, Mel Gorman, Ingo Molnar, Tao Zhou, Phil Auld,
	Peter Zijlstra, Pavan Kondeti, Steven Rostedt, Jay Vosburgh,
	Gavin Guo, halves, nivedita.singhvi, linux-kernel, # v4 . 16+

On Thu, 24 Jun 2021 at 12:29, Po-Hsu Lin <po-hsu.lin@canonical.com> wrote:
>
> Hello Vincent,
>
> sorry to resurrect this thread again,
> I was trying to backport this patch and corresponding fixes to our
> Ubuntu 4.15 kernel [1] to fix an issue report by LTP cfs_bandwidth01
> test[2], my colleague Guilherme told me there once a discussion about
> backporting this on this thread.
>
> You mentioned here this should not be backported to earlier stable
> kernel, I am curious if there is any specific reason of it? Too risky
> maybe?

Yes, IIRC there are some dependencies on other patchsets that make
the backport complex and not straightforward.


> Thanks!
> PHLin
>
> [1] https://lists.ubuntu.com/archives/kernel-team/2021-June/121571.html
> [2] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/cfs_bandwidth01.c
>
>


* Re: [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
  2020-05-13 13:55 Vincent Guittot
@ 2020-05-13 18:25 ` bsegall
  0 siblings, 0 replies; 12+ messages in thread
From: bsegall @ 2020-05-13 18:25 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel, pauld, ouwen210, pkondeti

Vincent Guittot <vincent.guittot@linaro.org> writes:

> Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair()
> are quite close and follow the same sequence for enqueuing an entity in the
> cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as
> enqueue_task_fair(). This fixes a problem already faced with the latter and
> add an optimization in the last for_each_sched_entity loop.
>
> Reported-by Tao Zhou <zohooouoto@zoho.com.cn>
> Reviewed-by: Phil Auld <pauld@redhat.com>

Reviewed-by: Ben Segall <bsegall@google.com>

> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>
> v3 changes:
>   - remove the unused enqueue variable
>
>  kernel/sched/fair.c | 42 ++++++++++++++++++++++++++++++------------
>  1 file changed, 30 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4e12ba882663..9a58874ef104 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4792,7 +4792,6 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>  	struct rq *rq = rq_of(cfs_rq);
>  	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
>  	struct sched_entity *se;
> -	int enqueue = 1;
>  	long task_delta, idle_task_delta;
>  
>  	se = cfs_rq->tg->se[cpu_of(rq)];
> @@ -4816,26 +4815,44 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>  	idle_task_delta = cfs_rq->idle_h_nr_running;
>  	for_each_sched_entity(se) {
>  		if (se->on_rq)
> -			enqueue = 0;
> +			break;
> +		cfs_rq = cfs_rq_of(se);
> +		enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
>  
> +		cfs_rq->h_nr_running += task_delta;
> +		cfs_rq->idle_h_nr_running += idle_task_delta;
> +
> +		/* end evaluation on encountering a throttled cfs_rq */
> +		if (cfs_rq_throttled(cfs_rq))
> +			goto unthrottle_throttle;
> +	}
> +
> +	for_each_sched_entity(se) {
>  		cfs_rq = cfs_rq_of(se);
> -		if (enqueue) {
> -			enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> -		} else {
> -			update_load_avg(cfs_rq, se, 0);
> -			se_update_runnable(se);
> -		}
> +
> +		update_load_avg(cfs_rq, se, UPDATE_TG);
> +		se_update_runnable(se);
>  
>  		cfs_rq->h_nr_running += task_delta;
>  		cfs_rq->idle_h_nr_running += idle_task_delta;
>  
> +
> +		/* end evaluation on encountering a throttled cfs_rq */
>  		if (cfs_rq_throttled(cfs_rq))
> -			break;
> +			goto unthrottle_throttle;
> +
> +		/*
> +		 * One parent has been throttled and cfs_rq removed from the
> +		 * list. Add it back to not break the leaf list.
> +		 */
> +		if (throttled_hierarchy(cfs_rq))
> +			list_add_leaf_cfs_rq(cfs_rq);
>  	}
>  
> -	if (!se)
> -		add_nr_running(rq, task_delta);
> +	/* At this point se is NULL and we are at root level*/
> +	add_nr_running(rq, task_delta);
>  
> +unthrottle_throttle:
>  	/*
>  	 * The cfs_rq_throttled() breaks in the above iteration can result in
>  	 * incomplete leaf list maintenance, resulting in triggering the
> @@ -4844,7 +4861,8 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>  	for_each_sched_entity(se) {
>  		cfs_rq = cfs_rq_of(se);
>  
> -		list_add_leaf_cfs_rq(cfs_rq);
> +		if (list_add_leaf_cfs_rq(cfs_rq))
> +			break;
>  	}
>  
>  	assert_list_leaf_cfs_rq(rq);


* [PATCH v3] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
@ 2020-05-13 13:55 Vincent Guittot
  2020-05-13 18:25 ` bsegall
  0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2020-05-13 13:55 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel
  Cc: pauld, ouwen210, pkondeti, Vincent Guittot

Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair()
are quite close and follow the same sequence for enqueuing an entity in the
cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as
enqueue_task_fair(). This fixes a problem already faced with the latter and
adds an optimization in the last for_each_sched_entity loop.

Reported-by: Tao Zhou <zohooouoto@zoho.com.cn>
Reviewed-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---

v3 changes:
  - remove the unused enqueue variable

 kernel/sched/fair.c | 42 ++++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4e12ba882663..9a58874ef104 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4792,7 +4792,6 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	struct rq *rq = rq_of(cfs_rq);
 	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
 	struct sched_entity *se;
-	int enqueue = 1;
 	long task_delta, idle_task_delta;
 
 	se = cfs_rq->tg->se[cpu_of(rq)];
@@ -4816,26 +4815,44 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	idle_task_delta = cfs_rq->idle_h_nr_running;
 	for_each_sched_entity(se) {
 		if (se->on_rq)
-			enqueue = 0;
+			break;
+		cfs_rq = cfs_rq_of(se);
+		enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
 
+		cfs_rq->h_nr_running += task_delta;
+		cfs_rq->idle_h_nr_running += idle_task_delta;
+
+		/* end evaluation on encountering a throttled cfs_rq */
+		if (cfs_rq_throttled(cfs_rq))
+			goto unthrottle_throttle;
+	}
+
+	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
-		if (enqueue) {
-			enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
-		} else {
-			update_load_avg(cfs_rq, se, 0);
-			se_update_runnable(se);
-		}
+
+		update_load_avg(cfs_rq, se, UPDATE_TG);
+		se_update_runnable(se);
 
 		cfs_rq->h_nr_running += task_delta;
 		cfs_rq->idle_h_nr_running += idle_task_delta;
 
+
+		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
-			break;
+			goto unthrottle_throttle;
+
+		/*
+		 * One parent has been throttled and cfs_rq removed from the
+		 * list. Add it back to not break the leaf list.
+		 */
+		if (throttled_hierarchy(cfs_rq))
+			list_add_leaf_cfs_rq(cfs_rq);
 	}
 
-	if (!se)
-		add_nr_running(rq, task_delta);
+	/* At this point se is NULL and we are at root level*/
+	add_nr_running(rq, task_delta);
 
+unthrottle_throttle:
 	/*
 	 * The cfs_rq_throttled() breaks in the above iteration can result in
 	 * incomplete leaf list maintenance, resulting in triggering the
@@ -4844,7 +4861,8 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
 
-		list_add_leaf_cfs_rq(cfs_rq);
+		if (list_add_leaf_cfs_rq(cfs_rq))
+			break;
 	}
 
 	assert_list_leaf_cfs_rq(rq);
-- 
2.17.1
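
For readers who want the control flow without the kernel context, the
two-loop pattern described in the changelog can be sketched as a small
user-space model. This is only an illustration under stated assumptions:
the toy_* types, fields and functions are hypothetical stand-ins and not
kernel APIs, load tracking (update_load_avg(), se_update_runnable()) and
the leaf_cfs_rq list handling are left out entirely, and an early return
stands in for the jump to the unthrottle_throttle label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; field names are illustrative only. */
struct toy_cfs_rq {
	bool throttled;            /* models cfs_rq_throttled() */
	unsigned int h_nr_running; /* hierarchical count of runnable tasks */
};

struct toy_se {
	struct toy_se *parent;    /* next entity up the hierarchy, NULL at the top */
	struct toy_cfs_rq *queue; /* queue this entity sits on, like cfs_rq_of(se) */
	bool on_rq;               /* already enqueued on that queue? */
};

/*
 * Returns true when the walk reached the root, in which case the
 * rq-wide counter is updated as well.
 */
static bool toy_unthrottle(struct toy_se *se, unsigned int task_delta,
			   unsigned int *rq_nr_running)
{
	/* Phase 1: enqueue entities until one is found that is already queued. */
	for (; se; se = se->parent) {
		if (se->on_rq)
			break;
		se->on_rq = true; /* stands in for enqueue_entity() */
		se->queue->h_nr_running += task_delta;
		if (se->queue->throttled)
			return false; /* models "goto unthrottle_throttle" */
	}

	/* Phase 2: the remaining ancestors are queued; only propagate counts. */
	for (; se; se = se->parent) {
		se->queue->h_nr_running += task_delta;
		if (se->queue->throttled)
			return false;
	}

	/* Both loops ran to completion, so se is NULL: root level reached. */
	assert(se == NULL);
	*rq_nr_running += task_delta;
	return true;
}

/*
 * Hypothetical scenario: a leaf group holding 3 tasks is unthrottled below
 * a mid-level group whose entity is already queued on the root.
 * Returns the resulting rq-wide count.
 */
static unsigned int toy_scenario(bool mid_throttled)
{
	struct toy_cfs_rq root = { .throttled = false, .h_nr_running = 2 };
	struct toy_cfs_rq mid = { .throttled = mid_throttled, .h_nr_running = 2 };
	struct toy_se mid_se = { .parent = NULL, .queue = &root, .on_rq = true };
	struct toy_se leaf_se = { .parent = &mid_se, .queue = &mid, .on_rq = false };
	unsigned int rq_nr_running = 2;

	toy_unthrottle(&leaf_se, 3, &rq_nr_running);
	return rq_nr_running;
}
```

In this model, toy_scenario(false) walks all the way to the root and
returns 5, while toy_scenario(true) stops at the throttled mid-level queue
and leaves the rq-wide count at 2. That mirrors why the patch can call
add_nr_running() unconditionally after the second loop: any throttled
ancestor has already diverted control past it.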



