From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Valentin Schneider <valentin.schneider@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Oltean <olteanv@gmail.com>,
	Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>,
	Alison Wang <alison.wang@nxp.com>,
	catalin.marinas@arm.com, will@kernel.org, paulmck@kernel.org,
	mw@semihalf.com, leoyang.li@nxp.com, vladimir.oltean@nxp.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Anna-Maria Gleixner <anna-maria@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC PATCH] arm64: defconfig: Disable fine-grained task level IRQ time accounting
Date: Wed, 5 Aug 2020 10:50:29 +0200
Message-ID: <02195130-3d9a-a206-d931-fab7dc606061@arm.com>
In-Reply-To: <jhjft93i8mg.mognet@arm.com>

On 04/08/2020 01:59, Valentin Schneider wrote:
> 
> On 03/08/20 20:22, Thomas Gleixner wrote:
>> Valentin,
>>
>> Valentin Schneider <valentin.schneider@arm.com> writes:
>>> On 03/08/20 16:13, Thomas Gleixner wrote:
>>>> Vladimir Oltean <olteanv@gmail.com> writes:
>>>>>>  1) When irq accounting is disabled, RT throttling kicks in as
>>>>>>     expected.
>>>>>>
>>>>>>  2) With irq accounting the RT throttler does not kick in and the RCU
>>>>>>     stall/lockups happen.
>>>>> What is this telling us?
>>>>
>>>> It seems that the fine grained irq time accounting affects the runtime
>>>> accounting in some way which I haven't figured out yet.
>>>>
>>>
>>> With IRQ_TIME_ACCOUNTING, rq_clock_task() will always be incremented by a
>>> lesser-or-equal value than when not having the option; you start with the
>>> same delta_exec but slice some for the IRQ accounting, and leave the rest
>>> for the rq_clock_task() (+paravirt).
>>>
>>> IIUC this means that if you spend e.g. 10% of the time in IRQ and 90% of
>>> the time running the stress-ng RT tasks, despite having RT tasks hogging
>>> the entirety of the "available time" it is still only 90% runtime, which is
>>> below the 95% default and the throttling doesn't happen.
>>
>>    totaltime = irqtime + tasktime
>>
>> Ignoring irqtime and pretending that totaltime is what the scheduler
>> can control and deal with is naive at best.
>>
> 
> Agreed, however AFAICT rt_time is only incremented by rq_clock_task()
> deltas, which don't include IRQ time with IRQ_TIME_ACCOUNTING=y. That would
> then be directly compared to the sysctl runtime.
> 
> Adding some prints in sched_rt_runtime_exceeded() and running this test
> case on my Juno, I get:
>   # IRQ_TIME_ACCOUNTING=y
>   cpu=2 rt_time=713455220 runtime=950000000 rq->avg_irq.util_avg=265
>   (rt_time oscillates between [70.1e7, 75.1e7]; avg_irq between [220, 270])
> 
>   # IRQ_TIME_ACCOUNTING=n
>   cpu=2 rt_time=963035300 runtime=949951811
>   (rt_time oscillates between [94.1e7, 96.1e7])
> 
> Throttling happens for IRQ_TIME_ACCOUNTING=n and doesn't for
> IRQ_TIME_ACCOUNTING=y - clearly the accounted rt_time isn't high enough for
> that to happen, and it does look like what is missing in rt_time (or what
> should be subtracted from the available runtime) is there in the avg_irq.

I agree that w/ IRQ_TIME_ACCOUNTING=y rt_rq->rt_time isn't high enough
in this testcase.

stress-ng-hrtim-1655 [001] 462.897733: bprint: update_curr_rt:
rt_rq->rt_time=416716900 rt_rq->rt_runtime=950000000
rt_b->rt_runtime=950000000

The 5% reservation (1 - sched_rt_runtime_us/sched_rt_period_us) for CFS
is massively eclipsed by irqtime.
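
To put rough numbers on this, here is a small user-space sketch (not
kernel code). The function names and TOY_CAPACITY_SCALE are made up for
illustration. It assumes util_avg is scaled to 1024 and that the period
is the default 1 s; avg_irq is a decaying average, so adding it to
rt_time is only an approximation of the real IRQ share.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CAPACITY_SCALE 1024.0	/* assumed full scale of util_avg */

/* Share of each period left for non-RT tasks, in percent. */
double cfs_reservation_pct(uint64_t runtime_us, uint64_t period_us)
{
	assert(period_us > 0);
	return 100.0 * (1.0 - (double)runtime_us / (double)period_us);
}

/* Simplified throttle condition: accounted RT time exceeds the budget. */
int rt_would_throttle(uint64_t rt_time_ns, uint64_t runtime_ns)
{
	return rt_time_ns > runtime_ns;
}

/*
 * Rough "RT + IRQ" share of one period, in percent.
 * irq_util_avg stands in for rq->avg_irq.util_avg (assumption: 0..1024).
 */
double rt_plus_irq_pct(uint64_t rt_time_ns, uint64_t period_ns,
		       unsigned int irq_util_avg)
{
	assert(period_ns > 0);
	return 100.0 * (double)rt_time_ns / (double)period_ns
	     + 100.0 * (double)irq_util_avg / TOY_CAPACITY_SCALE;
}
```

With 950000 us of runtime per 1000000 us period the reservation is 5%.
For Valentin's IRQ_TIME_ACCOUNTING=y sample (rt_time=713455220,
runtime=950000000) rt_would_throttle() returns 0, while
rt_plus_irq_pct() with util_avg=265 comes out at roughly 97%, i.e.
above the 95% budget once IRQ time is counted.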

It's true that avg_irq tracks 'irq_delta + steal' time, but it is meant
to potentially reduce CPU capacity. It's also CPU- and frequency-
invariant (your CPU2 is a big CPU, so no issue here).

Could a rq_clock(rq)-derived rt_rq signal be used to compare against
rt_runtime?
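
A toy model of the difference between the two clocks, as a sketch only
and not a proposed patch: struct toy_rq and the toy_* helpers are
hypothetical and merely mimic how rq_clock() advances with the full
delta while rq_clock_task() (with IRQ_TIME_ACCOUNTING=y) advances with
the delta minus IRQ time. It assumes an RT task runs whenever no IRQ
does.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the two runqueue clocks; not the kernel's struct rq. */
struct toy_rq {
	uint64_t clock;		/* advances with every delta, like rq_clock() */
	uint64_t clock_task;	/* excludes IRQ time, like rq_clock_task() */
};

/* Advance both clocks by delta_ns, of which irq_ns was spent in IRQ. */
void toy_update_clock(struct toy_rq *rq, uint64_t delta_ns, uint64_t irq_ns)
{
	if (irq_ns > delta_ns)	/* IRQ time cannot exceed elapsed time */
		irq_ns = delta_ns;
	rq->clock += delta_ns;
	rq->clock_task += delta_ns - irq_ns;
}

/* Run one period in fixed ticks, spending irq_pct percent of each in IRQ. */
void toy_run_period(struct toy_rq *rq, uint64_t period_ns, uint64_t tick_ns,
		    unsigned int irq_pct)
{
	uint64_t t;

	assert(tick_ns > 0 && irq_pct <= 100);
	for (t = 0; t < period_ns; t += tick_ns)
		toy_update_clock(rq, tick_ns, tick_ns * irq_pct / 100);
}
```

Running one 1 s period in 1 ms ticks with 25% IRQ time leaves
clock_task at 750000000 ns, below the 950000000 ns budget, whereas
clock reaches 1000000000 ns and would exceed it. An rt_time fed from
the first never trips the throttle in this model; one fed from the
second does.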

BTW, DL already influences rt_rq->rt_time.

[...]
