From: "tip-bot2 for Frederic Weisbecker" <tip-bot2@linutronix.de>
To: linux-tip-commits@vger.kernel.org
Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Rik van Riel <riel@surriel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Wanpeng Li <wanpengli@tencent.com>,
	Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@alien8.de>,
	linux-kernel@vger.kernel.org
Subject: [tip: sched/core] sched/kcpustat: Introduce vtime-aware kcpustat accessor for CPUTIME_SYSTEM
Date: Tue, 29 Oct 2019 09:52:19 -0000
Message-ID: <157234273931.29376.8892610322143430578.tip-bot2@tip-bot2>
In-Reply-To: <20191025020303.19342-1-frederic@kernel.org>

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     64eea63c19a2c386a96638f4e54a1355510709e3
Gitweb:        https://git.kernel.org/tip/64eea63c19a2c386a96638f4e54a1355510709e3
Author:        Frederic Weisbecker <frederic@kernel.org>
AuthorDate:    Fri, 25 Oct 2019 04:03:03 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 29 Oct 2019 10:01:17 +01:00

sched/kcpustat: Introduce vtime-aware kcpustat accessor for CPUTIME_SYSTEM

Kcpustat is not correctly supported on nohz_full CPUs. The tick doesn't
fire there, so the cputime doesn't move forward. The issue became
visible after the removal of the remaining 1Hz tick, which exposed the
stall.

We solve that by looking up, under RCU, the task running on the CPU and
adding its vtime delta to the raw kcpustat values.

We make sure that we fetch a coherent raw-kcpustat/vtime-delta pair
while checking, within the vtime seqcount read section, that the CPU
referred to by the target vtime is the correct one.

Only CPUTIME_SYSTEM is handled here as a start because it's the trivial
case. User and guest time will require more preparation work to
correctly handle niceness.

Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpengli@tencent.com>
Link: https://lkml.kernel.org/r/20191025020303.19342-1-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/kernel_stat.h | 11 ++++++
 kernel/sched/cputime.c      | 82 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 7ee2bb4..7978119 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -78,6 +78,17 @@ static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu)
 	return kstat_cpu(cpu).irqs_sum;
 }
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+extern u64 kcpustat_field(struct kernel_cpustat *kcpustat,
+			  enum cpu_usage_stat usage, int cpu);
+#else
+static inline u64 kcpustat_field(struct kernel_cpustat *kcpustat,
+				 enum cpu_usage_stat usage, int cpu)
+{
+	return kcpustat->cpustat[usage];
+}
+#endif
+
 extern void account_user_time(struct task_struct *, u64);
 extern void account_guest_time(struct task_struct *, u64);
 extern void account_system_time(struct task_struct *, int, u64);
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index b931a19..e0cd206 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -911,4 +911,86 @@ void task_cputime(struct task_struct *t, u64 *utime, u64 *stime)
 			*utime += vtime->utime + delta;
 	} while (read_seqcount_retry(&vtime->seqcount, seq));
 }
+
+static int kcpustat_field_vtime(u64 *cpustat,
+				struct vtime *vtime,
+				enum cpu_usage_stat usage,
+				int cpu, u64 *val)
+{
+	unsigned int seq;
+	int err;
+
+	do {
+		seq = read_seqcount_begin(&vtime->seqcount);
+
+		/*
+		 * We raced against context switch, fetch the
+		 * kcpustat task again.
+		 */
+		if (vtime->cpu != cpu && vtime->cpu != -1)
+			return -EAGAIN;
+
+		/*
+		 * Two possible things here:
+		 * 1) We are seeing the scheduling out task (prev) or any past one.
+		 * 2) We are seeing the scheduling in task (next) but it hasn't
+		 *    passed through vtime_task_switch() yet so the pending
+		 *    cputime of the prev task may not be flushed yet.
+		 *
+		 * Case 1) is ok but 2) is not. So wait for a safe VTIME state.
+		 */
+		if (vtime->state == VTIME_INACTIVE)
+			return -EAGAIN;
+
+		err = 0;
+
+		*val = cpustat[usage];
+
+		if (vtime->state == VTIME_SYS)
+			*val += vtime->stime + vtime_delta(vtime);
+
+	} while (read_seqcount_retry(&vtime->seqcount, seq));
+
+	return 0;
+}
+
+u64 kcpustat_field(struct kernel_cpustat *kcpustat,
+		   enum cpu_usage_stat usage, int cpu)
+{
+	u64 *cpustat = kcpustat->cpustat;
+	struct rq *rq;
+	u64 val;
+	int err;
+
+	if (!vtime_accounting_enabled_cpu(cpu))
+		return cpustat[usage];
+
+	/* Only support sys vtime for now */
+	if (usage != CPUTIME_SYSTEM)
+		return cpustat[usage];
+
+	rq = cpu_rq(cpu);
+
+	for (;;) {
+		struct task_struct *curr;
+		struct vtime *vtime;
+
+		rcu_read_lock();
+		curr = rcu_dereference(rq->curr);
+		if (WARN_ON_ONCE(!curr)) {
+			rcu_read_unlock();
+			return cpustat[usage];
+		}
+
+		vtime = &curr->vtime;
+		err = kcpustat_field_vtime(cpustat, vtime, usage, cpu, &val);
+		rcu_read_unlock();
+
+		if (!err)
+			return val;
+
+		cpu_relax();
+	}
+}
+EXPORT_SYMBOL_GPL(kcpustat_field);
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
