linux-kernel.vger.kernel.org archive mirror
* [PATCH] psi: report zeroes for CPU full at the system level
@ 2022-03-05  2:13 Chengming Zhou
       [not found] ` <89d939a61f840683101542fe0da823e693ef6cc3.camel@proact.de>
  0 siblings, 1 reply; 2+ messages in thread
From: Chengming Zhou @ 2022-03-05  2:13 UTC (permalink / raw)
  To: corbet, hannes, mingo, peterz, ebiggers, surenb
  Cc: linux-doc, linux-kernel, songmuchun, Chengming Zhou, Martin Steigerwald

Martin found the /proc/pressure/cpu output confusing, and found no
mention of the CPU "full" line in the psi documentation.

% cat /proc/pressure/cpu
some avg10=0.92 avg60=0.91 avg300=0.73 total=933490489
full avg10=0.22 avg60=0.23 avg300=0.16 total=358783277

The PSI_CPU_FULL state was introduced by commit e7fcd7622823
("psi: Add PSI_CPU_FULL state"), mainly for the cgroup level,
but it is also counted at the system level as a side effect.

Naturally, the FULL state doesn't exist for the CPU resource at
the system level. These "full" numbers can come from CPU idle
scheduling latency. For example, if t1 is the time when a task wakes
up on an idle CPU and t2 is the time when the CPU picks it and
switches to it, the interval (t2 - t1) is counted as CPU_FULL state.

Another case in which all processes can be stalled is when all
cgroups are throttled at the same time, which is unlikely to happen.

In any case, the CPU_FULL metric is meaningless and confusing at
the system level. So this patch reports zeroes for CPU full at the
system level, and updates the psi documentation accordingly.

Fixes: e7fcd7622823 ("psi: Add PSI_CPU_FULL state")
Reported-by: Martin Steigerwald <Martin.Steigerwald@proact.de>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 Documentation/accounting/psi.rst |  6 +-----
 kernel/sched/psi.c               | 15 +++++++++------
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/Documentation/accounting/psi.rst b/Documentation/accounting/psi.rst
index 860fe651d645..7e15e37d3179 100644
--- a/Documentation/accounting/psi.rst
+++ b/Documentation/accounting/psi.rst
@@ -37,11 +37,7 @@ Pressure interface
 Pressure information for each resource is exported through the
 respective file in /proc/pressure/ -- cpu, memory, and io.
 
-The format for CPU is as such::
-
-	some avg10=0.00 avg60=0.00 avg300=0.00 total=0
-
-and for memory and IO::
+The format is as such::
 
 	some avg10=0.00 avg60=0.00 avg300=0.00 total=0
 	full avg10=0.00 avg60=0.00 avg300=0.00 total=0
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index e14358178849..97fd85c5143c 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1062,14 +1062,17 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
 	mutex_unlock(&group->avgs_lock);
 
 	for (full = 0; full < 2; full++) {
-		unsigned long avg[3];
-		u64 total;
+		unsigned long avg[3] = { 0, };
+		u64 total = 0;
 		int w;
 
-		for (w = 0; w < 3; w++)
-			avg[w] = group->avg[res * 2 + full][w];
-		total = div_u64(group->total[PSI_AVGS][res * 2 + full],
-				NSEC_PER_USEC);
+		/* CPU FULL is undefined at the system level */
+		if (!(group == &psi_system && res == PSI_CPU && full)) {
+			for (w = 0; w < 3; w++)
+				avg[w] = group->avg[res * 2 + full][w];
+			total = div_u64(group->total[PSI_AVGS][res * 2 + full],
+					NSEC_PER_USEC);
+		}
 
 		seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
 			   full ? "full" : "some",
-- 
2.20.1
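
For readers who want to try the reporting rule outside the kernel, here is a
minimal userspace sketch of the check added to psi_show() above. It is not the
kernel code: enum res_kind, report_real_values() and format_psi_line() are
hypothetical names, the "system group" test is reduced to a boolean, and the
averages are assumed to be pre-scaled to hundredths of a percent instead of
the kernel's fixed-point representation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's enum psi_res. */
enum res_kind { RES_IO, RES_MEM, RES_CPU };

/* Same predicate as the patch: only system-level CPU "full" is zeroed. */
static bool report_real_values(bool is_system_group, enum res_kind res,
                               bool full)
{
    return !(is_system_group && res == RES_CPU && full);
}

/*
 * Format one pressure line into buf and return snprintf()'s result.
 * avg_in[] holds avg10/avg60/avg300 in hundredths of a percent
 * (an assumption of this sketch), total_ns is in nanoseconds.
 */
static int format_psi_line(char *buf, size_t len, bool is_system_group,
                           enum res_kind res, bool full,
                           const unsigned long avg_in[3], uint64_t total_ns)
{
    unsigned long avg[3] = { 0, };
    unsigned long long total = 0;
    int w;

    if (report_real_values(is_system_group, res, full)) {
        for (w = 0; w < 3; w++)
            avg[w] = avg_in[w];
        total = total_ns / 1000; /* nanoseconds to microseconds */
    }

    return snprintf(buf, len,
                    "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
                    full ? "full" : "some",
                    avg[0] / 100, avg[0] % 100,
                    avg[1] / 100, avg[1] % 100,
                    avg[2] / 100, avg[2] % 100,
                    total);
}
```

With is_system_group set, a CPU "full" request always produces an all-zero
line, while the same inputs for a cgroup (is_system_group false) or for the
"some" line are printed unchanged.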



* Re: [Phishing Risk] [External] Re: [PATCH] psi: report zeroes for CPU full at the system level
       [not found] ` <89d939a61f840683101542fe0da823e693ef6cc3.camel@proact.de>
@ 2022-03-08  0:30   ` Chengming Zhou
  0 siblings, 0 replies; 2+ messages in thread
From: Chengming Zhou @ 2022-03-08  0:30 UTC (permalink / raw)
  To: Martin Steigerwald, mingo, peterz, corbet, ebiggers, hannes, surenb
  Cc: linux-doc, linux-kernel, songmuchun

On 2022/3/7 7:01 PM, Martin Steigerwald wrote:
> On Saturday, 05.03.2022 at 10:13 +0800, Chengming Zhou wrote:
>> Martin found the /proc/pressure/cpu output confusing, and found no
>> mention of the CPU "full" line in the psi documentation.
>>
>> % cat /proc/pressure/cpu
>> some avg10=0.92 avg60=0.91 avg300=0.73 total=933490489
>> full avg10=0.22 avg60=0.23 avg300=0.16 total=358783277
>>
>> The PSI_CPU_FULL state was introduced by commit e7fcd7622823
>> ("psi: Add PSI_CPU_FULL state"), mainly for the cgroup level,
>> but it is also counted at the system level as a side effect.
>>
>> Naturally, the FULL state doesn't exist for the CPU resource at
>> the system level. These "full" numbers can come from CPU idle
>> scheduling latency. For example, if t1 is the time when a task wakes
>> up on an idle CPU and t2 is the time when the CPU picks it and
>> switches to it, the interval (t2 - t1) is counted as CPU_FULL state.
>>
>> Another case in which all processes can be stalled is when all
>> cgroups are throttled at the same time, which is unlikely to happen.
>>
>> In any case, the CPU_FULL metric is meaningless and confusing at
>> the system level. So this patch reports zeroes for CPU full at the
>> system level, and updates the psi documentation accordingly.
>>
>> Fixes: e7fcd7622823 ("psi: Add PSI_CPU_FULL state")
>> Reported-by: Martin Steigerwald <Martin.Steigerwald@proact.de>
>> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
>> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
>> ---
>>  Documentation/accounting/psi.rst |  6 +-----
>>  kernel/sched/psi.c               | 15 +++++++++------
>>  2 files changed, 10 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/accounting/psi.rst b/Documentation/accounting/psi.rst
>> index 860fe651d645..7e15e37d3179 100644
>> --- a/Documentation/accounting/psi.rst
>> +++ b/Documentation/accounting/psi.rst
>> @@ -37,11 +37,7 @@ Pressure interface
>>  Pressure information for each resource is exported through the
>>  respective file in /proc/pressure/ -- cpu, memory, and io.
>>  
>> -The format for CPU is as such::
>> -
>> -       some avg10=0.00 avg60=0.00 avg300=0.00 total=0
>> -
>> -and for memory and IO::
>> +The format is as such::
>>  
>>         some avg10=0.00 avg60=0.00 avg300=0.00 total=0
>>         full avg10=0.00 avg60=0.00 avg300=0.00 total=0
> 
> This leaves unexplained why there is a CPU full line in the
> documentation. And I bet someone – not me this time – could wonder why
> it is always zero.
> 
> I recommend either removing the CPU full line completely, hoping to
> get away with it having been there for only one, maybe two kernel
> versions (5.17 and maybe 5.18), or putting a note in the documentation:
> 
> "CPU full on the system level is undefined, but has been reported in
> 5.17, so it is set to zero for backwards compatibility."

OK, it's better to leave a note there in case someone is curious about
the zeros in the CPU full line. The previous patch (v1) did remove the
CPU full line completely, but that may break some userspace parsers.

Thanks.
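
To illustrate the parser concern, below is a minimal sketch of a strict
userspace reader that expects both a "some" and a "full" line, as a tool
written against the 5.17 output might. struct psi_line, parse_psi_line() and
parse_psi_text() are hypothetical names used only for this example; no
existing tool is claimed to work this way.

```c
#include <stdio.h>
#include <string.h>

/* One parsed pressure line. */
struct psi_line {
    char kind[8];                 /* "some" or "full" */
    double avg10, avg60, avg300;  /* percentages */
    unsigned long long total;     /* stall time in microseconds */
};

/* Parse a single line; returns 0 on success, -1 on any mismatch. */
static int parse_psi_line(const char *line, struct psi_line *out)
{
    int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
                   out->kind, &out->avg10, &out->avg60, &out->avg300,
                   &out->total);

    if (n != 5)
        return -1;
    if (strcmp(out->kind, "some") != 0 && strcmp(out->kind, "full") != 0)
        return -1;
    return 0;
}

/*
 * Parse the whole file contents. This strict variant fails when the
 * "full" line is missing, which is the breakage that keeping a zeroed
 * line avoids.
 */
static int parse_psi_text(const char *text, struct psi_line *some,
                          struct psi_line *full)
{
    const char *second = strchr(text, '\n');

    if (parse_psi_line(text, some) != 0 || strcmp(some->kind, "some") != 0)
        return -1;
    if (second == NULL || parse_psi_line(second + 1, full) != 0 ||
        strcmp(full->kind, "full") != 0)
        return -1;
    return 0;
}
```

Fed the two-line output with a zeroed "full" line, parse_psi_text() succeeds
and reports zeros; fed only the "some" line, it returns -1.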

> 
>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>> index e14358178849..97fd85c5143c 100644
>> --- a/kernel/sched/psi.c
>> +++ b/kernel/sched/psi.c
>> @@ -1062,14 +1062,17 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
>>         mutex_unlock(&group->avgs_lock);
>>  
>>         for (full = 0; full < 2; full++) {
>> -               unsigned long avg[3];
>> -               u64 total;
>> +               unsigned long avg[3] = { 0, };
>> +               u64 total = 0;
>>                 int w;
>>  
>> -               for (w = 0; w < 3; w++)
>> -                       avg[w] = group->avg[res * 2 + full][w];
>> -               total = div_u64(group->total[PSI_AVGS][res * 2 + full],
>> -                               NSEC_PER_USEC);
>> +               /* CPU FULL is undefined at the system level */
>> +               if (!(group == &psi_system && res == PSI_CPU && full)) {
>> +                       for (w = 0; w < 3; w++)
>> +                               avg[w] = group->avg[res * 2 + full][w];
>> +                       total = div_u64(group->total[PSI_AVGS][res * 2 + full],
>> +                                       NSEC_PER_USEC);
>> +               }
>>  
>>                 seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
>>                            full ? "full" : "some",
> 
> 
