* [PATCH v2 0/7] Add new tracepoints required for EAS testing
@ 2019-05-10 11:30 Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 1/7] sched: autogroup: Make autogroup_path() always available Qais Yousef
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

Changes in v2:
	- Add include guards to the newly added headers
	- Rename tracepoints:
		sched_load_rq -> pelt_rq
		sched_load_se -> pelt_se
	- Rename helper functions: s/sched_tp/sched_trace/
	- Make sched_trace*() less fat by reducing the path size from 64 to
	  20 bytes.
	- Fix compilation error when building on UP


The following patches add the bare minimum tracepoints required to perform EAS
testing in Lisa[1].

The new tracepoints are bare in the sense that they don't export any info in
tracefs, hence they shouldn't introduce any ABI. The intended way to use them
is by loading a module that will probe the tracepoints and extract the info
required for userspace testing.

It is done this way because, AFAIU, adding new TRACE_EVENT()s is no longer
accepted.

The tracepoints are focused on tracking PELT signals, which is what EAS uses
to make its decisions. Hence knowing the value of PELT as it changes allows
verifying that EAS is doing the right thing based on synthetic tests that
simulate different scenarios.

Besides EAS, the new tracepoints can help investigate the CFS load balancer
and CFS taskgroup handling, as both are based on PELT signals too.

The first 2 patches do a bit of code shuffling to expose some required
functions.

Patch 3 adds a new cfs helper function.

Patches 4-6 add the new tracepoints.

Patch 7 exports the tracepoints so that out-of-tree modules can probe them
with the least amount of effort - which extends the usefulness of the
tracepoints, since creating a module to probe them is the only way to access
them.

An example module that uses these tracepoints is available in [2].

[1] https://github.com/ARM-software/lisa
[2] https://github.com/qais-yousef/tracepoints-helpers/blob/master/lisa_tp/lisa_tp.c
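
For reference, probing one of these bare tracepoints from a module boils down
to registering a probe function against it. The sketch below is illustrative
only and is not a copy of [2]; the probe and module names are made up, and it
assumes the pelt_rq tracepoint from patch 4 plus the export from patch 7:

	// SPDX-License-Identifier: GPL-2.0
	#include <linux/module.h>
	#include <linux/sched.h>
	#include <trace/events/sched.h>

	/* Probe args: private data pointer first, then the TP_PROTO arguments. */
	static void probe_pelt_rq(void *data, int cpu, const char *path,
				  struct sched_avg *avg)
	{
		/* path may be NULL for the rt/dl variants */
		pr_info("pelt_rq: cpu=%d path=%s util_avg=%lu\n",
			cpu, path ? path : "(null)", avg->util_avg);
	}

	static int __init eas_tp_init(void)
	{
		return register_trace_pelt_rq(probe_pelt_rq, NULL);
	}

	static void __exit eas_tp_exit(void)
	{
		unregister_trace_pelt_rq(probe_pelt_rq, NULL);
		tracepoint_synchronize_unregister();
	}

	module_init(eas_tp_init);
	module_exit(eas_tp_exit);
	MODULE_LICENSE("GPL");

Once the export in patch 7 is applied, loading such a module is enough to
start receiving the PELT updates.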

Qais Yousef (7):
  sched: autogroup: Make autogroup_path() always available
  sched: fair: move helper functions into fair.h
  sched: fair.h: add a new cfs_rq_tg_path()
  sched: Add pelt_rq tracepoint
  sched: Add pelt_se tracepoint
  sched: Add sched_overutilized tracepoint
  sched: export the newly added tracepoints

 include/trace/events/sched.h     |  17 +++
 kernel/sched/autogroup.c         |   2 -
 kernel/sched/core.c              |   8 ++
 kernel/sched/fair.c              | 212 ++----------------------------
 kernel/sched/fair.h              | 219 +++++++++++++++++++++++++++++++
 kernel/sched/pelt.c              |   6 +
 kernel/sched/sched.h             |   1 +
 kernel/sched/sched_tracepoints.h |  63 +++++++++
 8 files changed, 328 insertions(+), 200 deletions(-)
 create mode 100644 kernel/sched/fair.h
 create mode 100644 kernel/sched/sched_tracepoints.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/7] sched: autogroup: Make autogroup_path() always available
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
@ 2019-05-10 11:30 ` Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 2/7] sched: fair: move helper functions into fair.h Qais Yousef
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

Remove the #ifdef CONFIG_SCHED_DEBUG protection around autogroup_path().

Some of the tracepoints to be introduced in later patches need to access
this function. Hence make it always available, since the tracepoints are
not protected by CONFIG_SCHED_DEBUG.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 kernel/sched/autogroup.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/sched/autogroup.c b/kernel/sched/autogroup.c
index 2d4ff5353ded..2067080bb235 100644
--- a/kernel/sched/autogroup.c
+++ b/kernel/sched/autogroup.c
@@ -259,7 +259,6 @@ void proc_sched_autogroup_show_task(struct task_struct *p, struct seq_file *m)
 }
 #endif /* CONFIG_PROC_FS */
 
-#ifdef CONFIG_SCHED_DEBUG
 int autogroup_path(struct task_group *tg, char *buf, int buflen)
 {
 	if (!task_group_is_autogroup(tg))
@@ -267,4 +266,3 @@ int autogroup_path(struct task_group *tg, char *buf, int buflen)
 
 	return snprintf(buf, buflen, "%s-%ld", "/autogroup", tg->autogroup->id);
 }
-#endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/7] sched: fair: move helper functions into fair.h
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 1/7] sched: autogroup: Make autogroup_path() always available Qais Yousef
@ 2019-05-10 11:30 ` Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 3/7] sched: fair.h: add a new cfs_rq_tg_path() Qais Yousef
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

Move the small inlined cfs_rq helper functions into a new fair.h header.

In later patches we need a couple of these functions, and it made more sense
to move the majority of them into their own header rather than only the two
that are needed. This keeps the functions grouped together in the same file.

Always include the new header in sched.h to make them accessible to all
sched subsystem files like autogroup.h.

find_matching_se() was excluded because it isn't inlined.

The two required functions are:

	- cfs_rq_of()
	- group_cfs_rq()

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 kernel/sched/fair.c  | 195 -----------------------------------------
 kernel/sched/fair.h  | 201 +++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h |   1 +
 3 files changed, 202 insertions(+), 195 deletions(-)
 create mode 100644 kernel/sched/fair.h

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 35f3ea375084..2b4963bbeab4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -243,151 +243,7 @@ static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight
 
 const struct sched_class fair_sched_class;
 
-/**************************************************************
- * CFS operations on generic schedulable entities:
- */
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static inline struct task_struct *task_of(struct sched_entity *se)
-{
-	SCHED_WARN_ON(!entity_is_task(se));
-	return container_of(se, struct task_struct, se);
-}
-
-/* Walk up scheduling entities hierarchy */
-#define for_each_sched_entity(se) \
-		for (; se; se = se->parent)
-
-static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
-{
-	return p->se.cfs_rq;
-}
-
-/* runqueue on which this entity is (to be) queued */
-static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
-{
-	return se->cfs_rq;
-}
-
-/* runqueue "owned" by this group */
-static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
-{
-	return grp->my_q;
-}
-
-static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
-{
-	struct rq *rq = rq_of(cfs_rq);
-	int cpu = cpu_of(rq);
-
-	if (cfs_rq->on_list)
-		return rq->tmp_alone_branch == &rq->leaf_cfs_rq_list;
-
-	cfs_rq->on_list = 1;
-
-	/*
-	 * Ensure we either appear before our parent (if already
-	 * enqueued) or force our parent to appear after us when it is
-	 * enqueued. The fact that we always enqueue bottom-up
-	 * reduces this to two cases and a special case for the root
-	 * cfs_rq. Furthermore, it also means that we will always reset
-	 * tmp_alone_branch either when the branch is connected
-	 * to a tree or when we reach the top of the tree
-	 */
-	if (cfs_rq->tg->parent &&
-	    cfs_rq->tg->parent->cfs_rq[cpu]->on_list) {
-		/*
-		 * If parent is already on the list, we add the child
-		 * just before. Thanks to circular linked property of
-		 * the list, this means to put the child at the tail
-		 * of the list that starts by parent.
-		 */
-		list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
-			&(cfs_rq->tg->parent->cfs_rq[cpu]->leaf_cfs_rq_list));
-		/*
-		 * The branch is now connected to its tree so we can
-		 * reset tmp_alone_branch to the beginning of the
-		 * list.
-		 */
-		rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
-		return true;
-	}
-
-	if (!cfs_rq->tg->parent) {
-		/*
-		 * cfs rq without parent should be put
-		 * at the tail of the list.
-		 */
-		list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
-			&rq->leaf_cfs_rq_list);
-		/*
-		 * We have reach the top of a tree so we can reset
-		 * tmp_alone_branch to the beginning of the list.
-		 */
-		rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
-		return true;
-	}
-
-	/*
-	 * The parent has not already been added so we want to
-	 * make sure that it will be put after us.
-	 * tmp_alone_branch points to the begin of the branch
-	 * where we will add parent.
-	 */
-	list_add_rcu(&cfs_rq->leaf_cfs_rq_list, rq->tmp_alone_branch);
-	/*
-	 * update tmp_alone_branch to points to the new begin
-	 * of the branch
-	 */
-	rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
-	return false;
-}
-
-static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
-{
-	if (cfs_rq->on_list) {
-		struct rq *rq = rq_of(cfs_rq);
-
-		/*
-		 * With cfs_rq being unthrottled/throttled during an enqueue,
-		 * it can happen the tmp_alone_branch points the a leaf that
-		 * we finally want to del. In this case, tmp_alone_branch moves
-		 * to the prev element but it will point to rq->leaf_cfs_rq_list
-		 * at the end of the enqueue.
-		 */
-		if (rq->tmp_alone_branch == &cfs_rq->leaf_cfs_rq_list)
-			rq->tmp_alone_branch = cfs_rq->leaf_cfs_rq_list.prev;
-
-		list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
-		cfs_rq->on_list = 0;
-	}
-}
-
-static inline void assert_list_leaf_cfs_rq(struct rq *rq)
-{
-	SCHED_WARN_ON(rq->tmp_alone_branch != &rq->leaf_cfs_rq_list);
-}
-
-/* Iterate thr' all leaf cfs_rq's on a runqueue */
-#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos)			\
-	list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list,	\
-				 leaf_cfs_rq_list)
-
-/* Do the two (enqueued) entities belong to the same group ? */
-static inline struct cfs_rq *
-is_same_group(struct sched_entity *se, struct sched_entity *pse)
-{
-	if (se->cfs_rq == pse->cfs_rq)
-		return se->cfs_rq;
-
-	return NULL;
-}
-
-static inline struct sched_entity *parent_entity(struct sched_entity *se)
-{
-	return se->parent;
-}
-
 static void
 find_matching_se(struct sched_entity **se, struct sched_entity **pse)
 {
@@ -419,62 +275,11 @@ find_matching_se(struct sched_entity **se, struct sched_entity **pse)
 		*pse = parent_entity(*pse);
 	}
 }
-
 #else	/* !CONFIG_FAIR_GROUP_SCHED */
-
-static inline struct task_struct *task_of(struct sched_entity *se)
-{
-	return container_of(se, struct task_struct, se);
-}
-
-#define for_each_sched_entity(se) \
-		for (; se; se = NULL)
-
-static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
-{
-	return &task_rq(p)->cfs;
-}
-
-static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
-{
-	struct task_struct *p = task_of(se);
-	struct rq *rq = task_rq(p);
-
-	return &rq->cfs;
-}
-
-/* runqueue "owned" by this group */
-static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
-{
-	return NULL;
-}
-
-static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
-{
-	return true;
-}
-
-static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
-{
-}
-
-static inline void assert_list_leaf_cfs_rq(struct rq *rq)
-{
-}
-
-#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos)	\
-		for (cfs_rq = &rq->cfs, pos = NULL; cfs_rq; cfs_rq = pos)
-
-static inline struct sched_entity *parent_entity(struct sched_entity *se)
-{
-	return NULL;
-}
-
 static inline void
 find_matching_se(struct sched_entity **se, struct sched_entity **pse)
 {
 }
-
 #endif	/* CONFIG_FAIR_GROUP_SCHED */
 
 static __always_inline
diff --git a/kernel/sched/fair.h b/kernel/sched/fair.h
new file mode 100644
index 000000000000..2e5aefaf56de
--- /dev/null
+++ b/kernel/sched/fair.h
@@ -0,0 +1,201 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * CFS operations on generic schedulable entities:
+ */
+#ifndef __SCHED_FAIR_H
+#define __SCHED_FAIR_H
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+static inline struct task_struct *task_of(struct sched_entity *se)
+{
+	SCHED_WARN_ON(!entity_is_task(se));
+	return container_of(se, struct task_struct, se);
+}
+
+/* Walk up scheduling entities hierarchy */
+#define for_each_sched_entity(se) \
+		for (; se; se = se->parent)
+
+static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
+{
+	return p->se.cfs_rq;
+}
+
+/* runqueue on which this entity is (to be) queued */
+static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
+{
+	return se->cfs_rq;
+}
+
+/* runqueue "owned" by this group */
+static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
+{
+	return grp->my_q;
+}
+
+static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
+{
+	struct rq *rq = rq_of(cfs_rq);
+	int cpu = cpu_of(rq);
+
+	if (cfs_rq->on_list)
+		return rq->tmp_alone_branch == &rq->leaf_cfs_rq_list;
+
+	cfs_rq->on_list = 1;
+
+	/*
+	 * Ensure we either appear before our parent (if already
+	 * enqueued) or force our parent to appear after us when it is
+	 * enqueued. The fact that we always enqueue bottom-up
+	 * reduces this to two cases and a special case for the root
+	 * cfs_rq. Furthermore, it also means that we will always reset
+	 * tmp_alone_branch either when the branch is connected
+	 * to a tree or when we reach the top of the tree
+	 */
+	if (cfs_rq->tg->parent &&
+	    cfs_rq->tg->parent->cfs_rq[cpu]->on_list) {
+		/*
+		 * If parent is already on the list, we add the child
+		 * just before. Thanks to circular linked property of
+		 * the list, this means to put the child at the tail
+		 * of the list that starts by parent.
+		 */
+		list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
+			&(cfs_rq->tg->parent->cfs_rq[cpu]->leaf_cfs_rq_list));
+		/*
+		 * The branch is now connected to its tree so we can
+		 * reset tmp_alone_branch to the beginning of the
+		 * list.
+		 */
+		rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+		return true;
+	}
+
+	if (!cfs_rq->tg->parent) {
+		/*
+		 * cfs rq without parent should be put
+		 * at the tail of the list.
+		 */
+		list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
+			&rq->leaf_cfs_rq_list);
+		/*
+		 * We have reach the top of a tree so we can reset
+		 * tmp_alone_branch to the beginning of the list.
+		 */
+		rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+		return true;
+	}
+
+	/*
+	 * The parent has not already been added so we want to
+	 * make sure that it will be put after us.
+	 * tmp_alone_branch points to the begin of the branch
+	 * where we will add parent.
+	 */
+	list_add_rcu(&cfs_rq->leaf_cfs_rq_list, rq->tmp_alone_branch);
+	/*
+	 * update tmp_alone_branch to points to the new begin
+	 * of the branch
+	 */
+	rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
+	return false;
+}
+
+static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
+{
+	if (cfs_rq->on_list) {
+		struct rq *rq = rq_of(cfs_rq);
+
+		/*
+		 * With cfs_rq being unthrottled/throttled during an enqueue,
+		 * it can happen the tmp_alone_branch points the a leaf that
+		 * we finally want to del. In this case, tmp_alone_branch moves
+		 * to the prev element but it will point to rq->leaf_cfs_rq_list
+		 * at the end of the enqueue.
+		 */
+		if (rq->tmp_alone_branch == &cfs_rq->leaf_cfs_rq_list)
+			rq->tmp_alone_branch = cfs_rq->leaf_cfs_rq_list.prev;
+
+		list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
+		cfs_rq->on_list = 0;
+	}
+}
+
+static inline void assert_list_leaf_cfs_rq(struct rq *rq)
+{
+	SCHED_WARN_ON(rq->tmp_alone_branch != &rq->leaf_cfs_rq_list);
+}
+
+/* Iterate thr' all leaf cfs_rq's on a runqueue */
+#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos)			\
+	list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list,	\
+				 leaf_cfs_rq_list)
+
+/* Do the two (enqueued) entities belong to the same group ? */
+static inline struct cfs_rq *
+is_same_group(struct sched_entity *se, struct sched_entity *pse)
+{
+	if (se->cfs_rq == pse->cfs_rq)
+		return se->cfs_rq;
+
+	return NULL;
+}
+
+static inline struct sched_entity *parent_entity(struct sched_entity *se)
+{
+	return se->parent;
+}
+
+#else	/* !CONFIG_FAIR_GROUP_SCHED */
+
+static inline struct task_struct *task_of(struct sched_entity *se)
+{
+	return container_of(se, struct task_struct, se);
+}
+
+#define for_each_sched_entity(se) \
+		for (; se; se = NULL)
+
+static inline struct cfs_rq *task_cfs_rq(struct task_struct *p)
+{
+	return &task_rq(p)->cfs;
+}
+
+static inline struct cfs_rq *cfs_rq_of(struct sched_entity *se)
+{
+	struct task_struct *p = task_of(se);
+	struct rq *rq = task_rq(p);
+
+	return &rq->cfs;
+}
+
+/* runqueue "owned" by this group */
+static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
+{
+	return NULL;
+}
+
+static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
+{
+	return true;
+}
+
+static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
+{
+}
+
+static inline void assert_list_leaf_cfs_rq(struct rq *rq)
+{
+}
+
+#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos)	\
+		for (cfs_rq = &rq->cfs, pos = NULL; cfs_rq; cfs_rq = pos)
+
+static inline struct sched_entity *parent_entity(struct sched_entity *se)
+{
+	return NULL;
+}
+
+#endif	/* CONFIG_FAIR_GROUP_SCHED */
+
+#endif /* __SCHED_FAIR_H */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index efa686eeff26..509c1dba77fc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1418,6 +1418,7 @@ static inline void sched_ttwu_pending(void) { }
 
 #include "stats.h"
 #include "autogroup.h"
+#include "fair.h"
 
 #ifdef CONFIG_CGROUP_SCHED
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/7] sched: fair.h: add a new cfs_rq_tg_path()
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 1/7] sched: autogroup: Make autogroup_path() always available Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 2/7] sched: fair: move helper functions into fair.h Qais Yousef
@ 2019-05-10 11:30 ` Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 4/7] sched: Add pelt_rq tracepoint Qais Yousef
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

The new function will be used in later patches when introducing the new
PELT tracepoints.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 kernel/sched/fair.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/kernel/sched/fair.h b/kernel/sched/fair.h
index 2e5aefaf56de..109dd068be78 100644
--- a/kernel/sched/fair.h
+++ b/kernel/sched/fair.h
@@ -33,6 +33,18 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
 	return grp->my_q;
 }
 
+static inline void cfs_rq_tg_path(struct cfs_rq *cfs_rq, char *path, int len)
+{
+	int l = path ? len : 0;
+
+	if (cfs_rq && task_group_is_autogroup(cfs_rq->tg))
+		autogroup_path(cfs_rq->tg, path, l);
+	else if (cfs_rq && cfs_rq->tg->css.cgroup)
+		cgroup_path(cfs_rq->tg->css.cgroup, path, l);
+	else if (path)
+		strcpy(path, "(null)");
+}
+
 static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	struct rq *rq = rq_of(cfs_rq);
@@ -175,6 +187,12 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
 	return NULL;
 }
 
+static inline void cfs_rq_tg_path(struct cfs_rq *cfs_rq, char *path, int len)
+{
+	if (path)
+		strcpy(path, "(null)");
+}
+
 static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	return true;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/7] sched: Add pelt_rq tracepoint
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
                   ` (2 preceding siblings ...)
  2019-05-10 11:30 ` [PATCH v2 3/7] sched: fair.h: add a new cfs_rq_tg_path() Qais Yousef
@ 2019-05-10 11:30 ` Qais Yousef
  2019-05-13 12:14   ` Peter Zijlstra
  2019-05-10 11:30 ` [PATCH v2 5/7] sched: Add pelt_se tracepoint Qais Yousef
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

The new tracepoint allows tracking PELT signals at the rq level for all
scheduling classes.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 include/trace/events/sched.h     |  9 ++++++
 kernel/sched/fair.c              |  9 ++++--
 kernel/sched/pelt.c              |  4 +++
 kernel/sched/sched_tracepoints.h | 50 ++++++++++++++++++++++++++++++++
 4 files changed, 70 insertions(+), 2 deletions(-)
 create mode 100644 kernel/sched/sched_tracepoints.h

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 9a4bdfadab07..50346098e026 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -587,6 +587,15 @@ TRACE_EVENT(sched_wake_idle_without_ipi,
 
 	TP_printk("cpu=%d", __entry->cpu)
 );
+
+/*
+ * Following tracepoints are not exported in tracefs and provide hooking
+ * mechanisms only for testing and debugging purposes.
+ */
+DECLARE_TRACE(pelt_rq,
+	TP_PROTO(int cpu, const char *path, struct sched_avg *avg),
+	TP_ARGS(cpu, path, avg));
+
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2b4963bbeab4..34782e37387c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -21,8 +21,7 @@
  *  Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra
  */
 #include "sched.h"
-
-#include <trace/events/sched.h>
+#include "sched_tracepoints.h"
 
 /*
  * Targeted preemption latency for CPU-bound tasks:
@@ -3139,6 +3138,8 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
 	update_tg_cfs_util(cfs_rq, se, gcfs_rq);
 	update_tg_cfs_runnable(cfs_rq, se, gcfs_rq);
 
+	sched_trace_pelt_cfs_rq(cfs_rq);
+
 	return 1;
 }
 
@@ -3291,6 +3292,8 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 	add_tg_cfs_propagate(cfs_rq, se->avg.load_sum);
 
 	cfs_rq_util_change(cfs_rq, flags);
+
+	sched_trace_pelt_cfs_rq(cfs_rq);
 }
 
 /**
@@ -3310,6 +3313,8 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 	add_tg_cfs_propagate(cfs_rq, -se->avg.load_sum);
 
 	cfs_rq_util_change(cfs_rq, 0);
+
+	sched_trace_pelt_cfs_rq(cfs_rq);
 }
 
 /*
diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
index befce29bd882..39418e80699f 100644
--- a/kernel/sched/pelt.c
+++ b/kernel/sched/pelt.c
@@ -26,6 +26,7 @@
 
 #include <linux/sched.h>
 #include "sched.h"
+#include "sched_tracepoints.h"
 #include "pelt.h"
 
 /*
@@ -292,6 +293,7 @@ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq)
 				cfs_rq->curr != NULL)) {
 
 		___update_load_avg(&cfs_rq->avg, 1, 1);
+		sched_trace_pelt_cfs_rq(cfs_rq);
 		return 1;
 	}
 
@@ -317,6 +319,7 @@ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running)
 				running)) {
 
 		___update_load_avg(&rq->avg_rt, 1, 1);
+		sched_trace_pelt_rt_rq(rq);
 		return 1;
 	}
 
@@ -340,6 +343,7 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
 				running)) {
 
 		___update_load_avg(&rq->avg_dl, 1, 1);
+		sched_trace_pelt_dl_rq(rq);
 		return 1;
 	}
 
diff --git a/kernel/sched/sched_tracepoints.h b/kernel/sched/sched_tracepoints.h
new file mode 100644
index 000000000000..5f804629d3b7
--- /dev/null
+++ b/kernel/sched/sched_tracepoints.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Scheduler tracepoints that are probe-able only and aren't exported ABI in
+ * tracefs.
+ */
+#ifndef __SCHED_TRACEPOINTS_H
+#define __SCHED_TRACEPOINTS_H
+
+#include <trace/events/sched.h>
+
+#define SCHED_TP_PATH_LEN		20
+
+
+#ifdef CONFIG_SMP
+static __always_inline void sched_trace_pelt_cfs_rq(struct cfs_rq *cfs_rq)
+{
+	if (trace_pelt_rq_enabled()) {
+		int cpu = cpu_of(rq_of(cfs_rq));
+		char path[SCHED_TP_PATH_LEN];
+
+		cfs_rq_tg_path(cfs_rq, path, SCHED_TP_PATH_LEN);
+		trace_pelt_rq(cpu, path, &cfs_rq->avg);
+	}
+}
+
+static __always_inline void sched_trace_pelt_rt_rq(struct rq *rq)
+{
+	if (trace_pelt_rq_enabled()) {
+		int cpu = cpu_of(rq);
+
+		trace_pelt_rq(cpu, NULL, &rq->avg_rt);
+	}
+}
+
+static __always_inline void sched_trace_pelt_dl_rq(struct rq *rq)
+{
+	if (trace_pelt_rq_enabled()) {
+		int cpu = cpu_of(rq);
+
+		trace_pelt_rq(cpu, NULL, &rq->avg_dl);
+	}
+}
+#else
+static inline void sched_trace_pelt_cfs_rq(struct cfs_rq *cfs_rq) {}
+static inline void sched_trace_pelt_rt_rq(struct rq *rq) {}
+static inline void sched_trace_pelt_dl_rq(struct rq *rq) {}
+#endif /* CONFIG_SMP */
+
+
+#endif /* __SCHED_TRACEPOINTS_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 5/7] sched: Add pelt_se tracepoint
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
                   ` (3 preceding siblings ...)
  2019-05-10 11:30 ` [PATCH v2 4/7] sched: Add pelt_rq tracepoint Qais Yousef
@ 2019-05-10 11:30 ` Qais Yousef
  2019-05-10 11:30 ` [PATCH v2 6/7] sched: Add sched_overutilized tracepoint Qais Yousef
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

The new tracepoint allows tracking PELT signals at the sched_entity level,
which is supported for CFS tasks and taskgroups only.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 include/trace/events/sched.h     |  4 ++++
 kernel/sched/fair.c              |  1 +
 kernel/sched/pelt.c              |  2 ++
 kernel/sched/sched_tracepoints.h | 13 +++++++++++++
 4 files changed, 20 insertions(+)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 50346098e026..cbcb47972232 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -596,6 +596,10 @@ DECLARE_TRACE(pelt_rq,
 	TP_PROTO(int cpu, const char *path, struct sched_avg *avg),
 	TP_ARGS(cpu, path, avg));
 
+DECLARE_TRACE(pelt_se,
+	TP_PROTO(int cpu, const char *path, struct sched_entity *se),
+	TP_ARGS(cpu, path, se));
+
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 34782e37387c..81036c34fd29 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3139,6 +3139,7 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
 	update_tg_cfs_runnable(cfs_rq, se, gcfs_rq);
 
 	sched_trace_pelt_cfs_rq(cfs_rq);
+	sched_trace_pelt_se(se);
 
 	return 1;
 }
diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
index 39418e80699f..75eea3b61a97 100644
--- a/kernel/sched/pelt.c
+++ b/kernel/sched/pelt.c
@@ -266,6 +266,7 @@ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se)
 {
 	if (___update_load_sum(now, &se->avg, 0, 0, 0)) {
 		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
+		sched_trace_pelt_se(se);
 		return 1;
 	}
 
@@ -279,6 +280,7 @@ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se
 
 		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
 		cfs_se_util_change(&se->avg);
+		sched_trace_pelt_se(se);
 		return 1;
 	}
 
diff --git a/kernel/sched/sched_tracepoints.h b/kernel/sched/sched_tracepoints.h
index 5f804629d3b7..d1992f04ee27 100644
--- a/kernel/sched/sched_tracepoints.h
+++ b/kernel/sched/sched_tracepoints.h
@@ -47,4 +47,17 @@ static inline void sched_trace_pelt_dl_rq(struct rq *rq) {}
 #endif /* CONFIG_SMP */
 
 
+static __always_inline void sched_trace_pelt_se(struct sched_entity *se)
+{
+	if (trace_pelt_se_enabled()) {
+		struct cfs_rq *gcfs_rq = group_cfs_rq(se);
+		struct cfs_rq *cfs_rq = cfs_rq_of(se);
+		int cpu = cpu_of(rq_of(cfs_rq));
+		char path[SCHED_TP_PATH_LEN];
+
+		cfs_rq_tg_path(gcfs_rq, path, SCHED_TP_PATH_LEN);
+		trace_pelt_se(cpu, path, se);
+	}
+}
+
 #endif /* __SCHED_TRACEPOINTS_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 6/7] sched: Add sched_overutilized tracepoint
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
                   ` (4 preceding siblings ...)
  2019-05-10 11:30 ` [PATCH v2 5/7] sched: Add pelt_se tracepoint Qais Yousef
@ 2019-05-10 11:30 ` Qais Yousef
  2019-05-13 12:08   ` Peter Zijlstra
  2019-05-10 11:30 ` [PATCH v2 7/7] sched: export the newly added tracepoints Qais Yousef
  2019-05-13 12:28 ` [PATCH v2 0/7] Add new tracepoints required for EAS testing Peter Zijlstra
  7 siblings, 1 reply; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

The new tracepoint allows us to track changes in the overutilized status.

The overutilized status is associated with EAS. It indicates that the system
is in a high performance state. EAS is disabled when the system is in this
state, since there are not many energy savings to be had while high
performance tasks are pushing the system to the limit, and it's better to
default to the spreading behavior of the scheduler.

This tracepoint helps in understanding and debugging the conditions under
which this happens.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 include/trace/events/sched.h | 4 ++++
 kernel/sched/fair.c          | 7 ++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index cbcb47972232..0cf42d13d6c4 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -600,6 +600,10 @@ DECLARE_TRACE(pelt_se,
 	TP_PROTO(int cpu, const char *path, struct sched_entity *se),
 	TP_ARGS(cpu, path, se));
 
+DECLARE_TRACE(sched_overutilized,
+	TP_PROTO(int overutilized),
+	TP_ARGS(overutilized));
+
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 81036c34fd29..494032220fe7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4965,8 +4965,10 @@ static inline bool cpu_overutilized(int cpu)
 
 static inline void update_overutilized_status(struct rq *rq)
 {
-	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
+	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
 		WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
+		trace_sched_overutilized(1);
+	}
 }
 #else
 static inline void update_overutilized_status(struct rq *rq) { }
@@ -8330,8 +8332,11 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 
 		/* Update over-utilization (tipping point, U >= 0) indicator */
 		WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
+
+		trace_sched_overutilized(!!(sg_status & SG_OVERUTILIZED));
 	} else if (sg_status & SG_OVERUTILIZED) {
 		WRITE_ONCE(env->dst_rq->rd->overutilized, SG_OVERUTILIZED);
+		trace_sched_overutilized(1);
 	}
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 7/7] sched: export the newly added tracepoints
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
                   ` (5 preceding siblings ...)
  2019-05-10 11:30 ` [PATCH v2 6/7] sched: Add sched_overutilized tracepoint Qais Yousef
@ 2019-05-10 11:30 ` Qais Yousef
  2019-05-13 12:28 ` [PATCH v2 0/7] Add new tracepoints required for EAS testing Peter Zijlstra
  7 siblings, 0 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-10 11:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Steven Rostedt
  Cc: linux-kernel, Pavankumar Kondeti, Sebastian Andrzej Siewior,
	Uwe Kleine-Konig, Dietmar Eggemann, Quentin Perret, Qais Yousef

So that external modules can hook into them and extract the info they
need. Since these new tracepoints have no events associated with them,
exporting them is what makes them useful to external modules for testing
and debugging. There is no other way to access them otherwise.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 kernel/sched/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4778c48a7fda..0f16e445cca1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -22,6 +22,14 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/sched.h>
 
+/*
+ * Export tracepoints that act as a bare tracehook (ie: have no trace event
+ * associated with them) to allow external modules to probe them.
+ */
+EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_rq);
+EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_se);
+EXPORT_TRACEPOINT_SYMBOL_GPL(sched_overutilized);
+
 DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 
 #if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_JUMP_LABEL)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 6/7] sched: Add sched_overutilized tracepoint
  2019-05-10 11:30 ` [PATCH v2 6/7] sched: Add sched_overutilized tracepoint Qais Yousef
@ 2019-05-13 12:08   ` Peter Zijlstra
  2019-05-13 12:42     ` Qais Yousef
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2019-05-13 12:08 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret

On Fri, May 10, 2019 at 12:30:12PM +0100, Qais Yousef wrote:

> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index cbcb47972232..0cf42d13d6c4 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -600,6 +600,10 @@ DECLARE_TRACE(pelt_se,
>  	TP_PROTO(int cpu, const char *path, struct sched_entity *se),
>  	TP_ARGS(cpu, path, se));
>  
> +DECLARE_TRACE(sched_overutilized,
> +	TP_PROTO(int overutilized),
> +	TP_ARGS(overutilized));
> +
>  #endif /* _TRACE_SCHED_H */
>  
>  /* This part must be outside protection */
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 81036c34fd29..494032220fe7 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4965,8 +4965,10 @@ static inline bool cpu_overutilized(int cpu)
>  
>  static inline void update_overutilized_status(struct rq *rq)
>  {
> -	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
> +	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
>  		WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
> +		trace_sched_overutilized(1);
> +	}
>  }
>  #else
>  static inline void update_overutilized_status(struct rq *rq) { }
> @@ -8330,8 +8332,11 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>  
>  		/* Update over-utilization (tipping point, U >= 0) indicator */
>  		WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
> +
> +		trace_sched_overutilized(!!(sg_status & SG_OVERUTILIZED));
>  	} else if (sg_status & SG_OVERUTILIZED) {
>  		WRITE_ONCE(env->dst_rq->rd->overutilized, SG_OVERUTILIZED);
> +		trace_sched_overutilized(1);
>  	}
>  }

Note how the state is per root domain and the tracepoint doesn't
communicate that.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/7] sched: Add pelt_rq tracepoint
  2019-05-10 11:30 ` [PATCH v2 4/7] sched: Add pelt_rq tracepoint Qais Yousef
@ 2019-05-13 12:14   ` Peter Zijlstra
  2019-05-13 12:48     ` Qais Yousef
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2019-05-13 12:14 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret

On Fri, May 10, 2019 at 12:30:10PM +0100, Qais Yousef wrote:

> +DECLARE_TRACE(pelt_rq,
> +	TP_PROTO(int cpu, const char *path, struct sched_avg *avg),
> +	TP_ARGS(cpu, path, avg));
> +

> +static __always_inline void sched_trace_pelt_cfs_rq(struct cfs_rq *cfs_rq)
> +{
> +	if (trace_pelt_rq_enabled()) {
> +		int cpu = cpu_of(rq_of(cfs_rq));
> +		char path[SCHED_TP_PATH_LEN];
> +
> +		cfs_rq_tg_path(cfs_rq, path, SCHED_TP_PATH_LEN);
> +		trace_pelt_rq(cpu, path, &cfs_rq->avg);
> +	}
> +}
> +
> +static __always_inline void sched_trace_pelt_rt_rq(struct rq *rq)
> +{
> +	if (trace_pelt_rq_enabled()) {
> +		int cpu = cpu_of(rq);
> +
> +		trace_pelt_rq(cpu, NULL, &rq->avg_rt);
> +	}
> +}
> +
> +static __always_inline void sched_trace_pelt_dl_rq(struct rq *rq)
> +{
> +	if (trace_pelt_rq_enabled()) {
> +		int cpu = cpu_of(rq);
> +
> +		trace_pelt_rq(cpu, NULL, &rq->avg_dl);
> +	}
> +}

Since it is only the one real tracepoint, how do we know which avg is
which?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] Add new tracepoints required for EAS testing
  2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
                   ` (6 preceding siblings ...)
  2019-05-10 11:30 ` [PATCH v2 7/7] sched: export the newly added tracepoints Qais Yousef
@ 2019-05-13 12:28 ` Peter Zijlstra
  2019-05-13 13:42   ` Qais Yousef
  7 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2019-05-13 12:28 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret



diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index c8c7c7efb487..11555f95a88e 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -594,6 +594,23 @@ TRACE_EVENT(sched_wake_idle_without_ipi,
 
 	TP_printk("cpu=%d", __entry->cpu)
 );
+
+/*
+ * Following tracepoints are not exported in tracefs and provide hooking
+ * mechanisms only for testing and debugging purposes.
+ */
+DECLARE_TRACE(pelt_cfs_rq,
+	TP_PROTO(struct cfs_rq *cfs_rq),
+	TP_ARGS(cfs_rq));
+
+DECLARE_TRACE(pelt_se,
+	TP_PROTO(struct sched_entity *se),
+	TP_ARGS(se));
+
+DECLARE_TRACE(sched_overutilized,
+	TP_PROTO(int overutilized),
+	TP_ARGS(overutilized));
+
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/autogroup.c b/kernel/sched/autogroup.c
index 2d4ff5353ded..2067080bb235 100644
--- a/kernel/sched/autogroup.c
+++ b/kernel/sched/autogroup.c
@@ -259,7 +259,6 @@ void proc_sched_autogroup_show_task(struct task_struct *p, struct seq_file *m)
 }
 #endif /* CONFIG_PROC_FS */
 
-#ifdef CONFIG_SCHED_DEBUG
 int autogroup_path(struct task_group *tg, char *buf, int buflen)
 {
 	if (!task_group_is_autogroup(tg))
@@ -267,4 +266,3 @@ int autogroup_path(struct task_group *tg, char *buf, int buflen)
 
 	return snprintf(buf, buflen, "%s-%ld", "/autogroup", tg->autogroup->id);
 }
-#endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 102dfcf0a29a..629bbf4f4247 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -22,6 +22,14 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/sched.h>
 
+/*
+ * Export tracepoints that act as a bare tracehook (ie: have no trace event
+ * associated with them) to allow external modules to probe them.
+ */
+EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_cfs_rq);
+EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_se);
+EXPORT_TRACEPOINT_SYMBOL_GPL(sched_overutilized);
+
 DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 
 #if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_JUMP_LABEL)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f35930f5e528..e7f82b1778b1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3334,6 +3334,9 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
 	update_tg_cfs_util(cfs_rq, se, gcfs_rq);
 	update_tg_cfs_runnable(cfs_rq, se, gcfs_rq);
 
+	trace_pelt_cfs_rq(cfs_rq);
+	trace_pelt_se(se);
+
 	return 1;
 }
 
@@ -3486,6 +3489,8 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 	add_tg_cfs_propagate(cfs_rq, se->avg.load_sum);
 
 	cfs_rq_util_change(cfs_rq, flags);
+
+	trace_pelt_cfs_rq(cfs_rq);
 }
 
 /**
@@ -3505,6 +3510,8 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
 	add_tg_cfs_propagate(cfs_rq, -se->avg.load_sum);
 
 	cfs_rq_util_change(cfs_rq, 0);
+
+	trace_pelt_cfs_rq(cfs_rq);
 }
 
 /*
@@ -5153,8 +5160,10 @@ static inline bool cpu_overutilized(int cpu)
 
 static inline void update_overutilized_status(struct rq *rq)
 {
-	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
+	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
 		WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
+		trace_sched_overutilized(1);
+	}
 }
 #else
 static inline void update_overutilized_status(struct rq *rq) { }
@@ -8516,8 +8525,11 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 
 		/* Update over-utilization (tipping point, U >= 0) indicator */
 		WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
+
+		trace_sched_overutilized(!!(sg_status & SG_OVERUTILIZED));
 	} else if (sg_status & SG_OVERUTILIZED) {
 		WRITE_ONCE(env->dst_rq->rd->overutilized, SG_OVERUTILIZED);
+		trace_sched_overutilized(1);
 	}
 }
 
@@ -10737,3 +10749,17 @@ __init void init_sched_fair_class(void)
 #endif /* SMP */
 
 }
+
+char *sched_trace_cfs_rq_path(struct cfs_rq *cfs_rq, char *str, size_t len)
+{
+	cfs_rq_tg_path(cfs_rq, str, len);
+	return str;
+}
+EXPORT_SYMBOL_GPL(sched_trace_cfs_rq_path);
+
+int sched_trace_cfs_rq_cpu(struct cfs_rq *cfs_rq)
+{
+	return cpu_of(rq_of(cfs_rq));
+}
+EXPORT_SYMBOL_GPL(sched_trace_cfs_rq_cpu);
+
diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
index befce29bd882..ebca40ba71f3 100644
--- a/kernel/sched/pelt.c
+++ b/kernel/sched/pelt.c
@@ -25,6 +25,7 @@
  */
 
 #include <linux/sched.h>
+#include <trace/events/sched.h>
 #include "sched.h"
 #include "pelt.h"
 
@@ -265,6 +266,7 @@ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se)
 {
 	if (___update_load_sum(now, &se->avg, 0, 0, 0)) {
 		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
+		trace_pelt_se(se);
 		return 1;
 	}
 
@@ -278,6 +280,7 @@ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se
 
 		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
 		cfs_se_util_change(&se->avg);
+		trace_pelt_se(se);
 		return 1;
 	}
 
@@ -292,6 +295,7 @@ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq)
 				cfs_rq->curr != NULL)) {
 
 		___update_load_avg(&cfs_rq->avg, 1, 1);
+		trace_pelt_cfs_rq(cfs_rq);
 		return 1;
 	}
 
@@ -317,6 +321,7 @@ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running)
 				running)) {
 
 		___update_load_avg(&rq->avg_rt, 1, 1);
+//		sched_trace_pelt_rt_rq(rq);
 		return 1;
 	}
 
@@ -340,6 +345,7 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
 				running)) {
 
 		___update_load_avg(&rq->avg_dl, 1, 1);
+//		sched_trace_pelt_dl_rq(rq);
 		return 1;
 	}
 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 6/7] sched: Add sched_overutilized tracepoint
  2019-05-13 12:08   ` Peter Zijlstra
@ 2019-05-13 12:42     ` Qais Yousef
  0 siblings, 0 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-13 12:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret

On 05/13/19 14:08, Peter Zijlstra wrote:
> On Fri, May 10, 2019 at 12:30:12PM +0100, Qais Yousef wrote:
> 
> > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> > index cbcb47972232..0cf42d13d6c4 100644
> > --- a/include/trace/events/sched.h
> > +++ b/include/trace/events/sched.h
> > @@ -600,6 +600,10 @@ DECLARE_TRACE(pelt_se,
> >  	TP_PROTO(int cpu, const char *path, struct sched_entity *se),
> >  	TP_ARGS(cpu, path, se));
> >  
> > +DECLARE_TRACE(sched_overutilized,
> > +	TP_PROTO(int overutilized),
> > +	TP_ARGS(overutilized));
> > +
> >  #endif /* _TRACE_SCHED_H */
> >  
> >  /* This part must be outside protection */
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 81036c34fd29..494032220fe7 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4965,8 +4965,10 @@ static inline bool cpu_overutilized(int cpu)
> >  
> >  static inline void update_overutilized_status(struct rq *rq)
> >  {
> > -	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
> > +	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
> >  		WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
> > +		trace_sched_overutilized(1);
> > +	}
> >  }
> >  #else
> >  static inline void update_overutilized_status(struct rq *rq) { }
> > @@ -8330,8 +8332,11 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
> >  
> >  		/* Update over-utilization (tipping point, U >= 0) indicator */
> >  		WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
> > +
> > +		trace_sched_overutilized(!!(sg_status & SG_OVERUTILIZED));
> >  	} else if (sg_status & SG_OVERUTILIZED) {
> >  		WRITE_ONCE(env->dst_rq->rd->overutilized, SG_OVERUTILIZED);
> > +		trace_sched_overutilized(1);
> >  	}
> >  }
> 
> Note how the state is per root domain and the tracepoint doesn't
> communicate that.

Right! I can pass rd->span so that the probing function can use it to
differentiate the root domains if they care?

--
Qais Yousef

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/7] sched: Add pelt_rq tracepoint
  2019-05-13 12:14   ` Peter Zijlstra
@ 2019-05-13 12:48     ` Qais Yousef
  2019-05-13 13:37       ` Dietmar Eggemann
  0 siblings, 1 reply; 17+ messages in thread
From: Qais Yousef @ 2019-05-13 12:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret

On 05/13/19 14:14, Peter Zijlstra wrote:
> On Fri, May 10, 2019 at 12:30:10PM +0100, Qais Yousef wrote:
> 
> > +DECLARE_TRACE(pelt_rq,
> > +	TP_PROTO(int cpu, const char *path, struct sched_avg *avg),
> > +	TP_ARGS(cpu, path, avg));
> > +
> 
> > +static __always_inline void sched_trace_pelt_cfs_rq(struct cfs_rq *cfs_rq)
> > +{
> > +	if (trace_pelt_rq_enabled()) {
> > +		int cpu = cpu_of(rq_of(cfs_rq));
> > +		char path[SCHED_TP_PATH_LEN];
> > +
> > +		cfs_rq_tg_path(cfs_rq, path, SCHED_TP_PATH_LEN);
> > +		trace_pelt_rq(cpu, path, &cfs_rq->avg);
> > +	}
> > +}
> > +
> > +static __always_inline void sched_trace_pelt_rt_rq(struct rq *rq)
> > +{
> > +	if (trace_pelt_rq_enabled()) {
> > +		int cpu = cpu_of(rq);
> > +
> > +		trace_pelt_rq(cpu, NULL, &rq->avg_rt);
> > +	}
> > +}
> > +
> > +static __always_inline void sched_trace_pelt_dl_rq(struct rq *rq)
> > +{
> > +	if (trace_pelt_rq_enabled()) {
> > +		int cpu = cpu_of(rq);
> > +
> > +		trace_pelt_rq(cpu, NULL, &rq->avg_dl);
> > +	}
> > +}
> 
> Since it is only the one real tracepoint, how do we know which avg is
> which?

Good question. I missed that, to be honest, since we are mainly interested in
cfs and I was focused on not adding too many tracepoints.

I'm happy to create a tracepoint per class assuming that's what you're
suggesting.

Thanks

--
Qais Yousef

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/7] sched: Add pelt_rq tracepoint
  2019-05-13 12:48     ` Qais Yousef
@ 2019-05-13 13:37       ` Dietmar Eggemann
  0 siblings, 0 replies; 17+ messages in thread
From: Dietmar Eggemann @ 2019-05-13 13:37 UTC (permalink / raw)
  To: Qais Yousef, Peter Zijlstra
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Quentin Perret

On 5/13/19 2:48 PM, Qais Yousef wrote:
> On 05/13/19 14:14, Peter Zijlstra wrote:
>> On Fri, May 10, 2019 at 12:30:10PM +0100, Qais Yousef wrote:
>>
>>> +DECLARE_TRACE(pelt_rq,
>>> +	TP_PROTO(int cpu, const char *path, struct sched_avg *avg),
>>> +	TP_ARGS(cpu, path, avg));
>>> +
>>
>>> +static __always_inline void sched_trace_pelt_cfs_rq(struct cfs_rq *cfs_rq)
>>> +{
>>> +	if (trace_pelt_rq_enabled()) {
>>> +		int cpu = cpu_of(rq_of(cfs_rq));
>>> +		char path[SCHED_TP_PATH_LEN];
>>> +
>>> +		cfs_rq_tg_path(cfs_rq, path, SCHED_TP_PATH_LEN);
>>> +		trace_pelt_rq(cpu, path, &cfs_rq->avg);
>>> +	}
>>> +}
>>> +
>>> +static __always_inline void sched_trace_pelt_rt_rq(struct rq *rq)
>>> +{
>>> +	if (trace_pelt_rq_enabled()) {
>>> +		int cpu = cpu_of(rq);
>>> +
>>> +		trace_pelt_rq(cpu, NULL, &rq->avg_rt);
>>> +	}
>>> +}
>>> +
>>> +static __always_inline void sched_trace_pelt_dl_rq(struct rq *rq)
>>> +{
>>> +	if (trace_pelt_rq_enabled()) {
>>> +		int cpu = cpu_of(rq);
>>> +
>>> +		trace_pelt_rq(cpu, NULL, &rq->avg_dl);
>>> +	}
>>> +}
>>
>> Since it is only the one real tracepoint, how do we know which avg is
>> which?
> 
> Good question. I missed that to be honest since we are mainly interested in cfs
> and I was focused into not adding too many tracepoints..
> 
> I'm happy to create a tracepoint per class assuming that's what you're
> suggesting.

IMHO, you should also consider irq (rq->avg_irq), so when people are
tracing a system with 'IRQ' or 'paravirtual steal' time accounting, they
will get the full picture.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] Add new tracepoints required for EAS testing
  2019-05-13 12:28 ` [PATCH v2 0/7] Add new tracepoints required for EAS testing Peter Zijlstra
@ 2019-05-13 13:42   ` Qais Yousef
  2019-05-13 15:06     ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Qais Yousef @ 2019-05-13 13:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret

On 05/13/19 14:28, Peter Zijlstra wrote:
> 
> 
> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index c8c7c7efb487..11555f95a88e 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -594,6 +594,23 @@ TRACE_EVENT(sched_wake_idle_without_ipi,
>  
>  	TP_printk("cpu=%d", __entry->cpu)
>  );
> +
> +/*
> + * Following tracepoints are not exported in tracefs and provide hooking
> + * mechanisms only for testing and debugging purposes.
> + */
> +DECLARE_TRACE(pelt_cfs_rq,
> +	TP_PROTO(struct cfs_rq *cfs_rq),
> +	TP_ARGS(cfs_rq));
> +
> +DECLARE_TRACE(pelt_se,
> +	TP_PROTO(struct sched_entity *se),
> +	TP_ARGS(se));
> +
> +DECLARE_TRACE(sched_overutilized,
> +	TP_PROTO(int overutilized),
> +	TP_ARGS(overutilized));
> +

If I decoded this patch correctly, what you're saying is:

	1. Move struct cfs_rq to the exported sched.h header
	2. Get rid of the fatty wrapper functions and export any necessary
	   helper functions.
	3. No need for RT and DL pelt tracking at the moment.

I'm okay with this. The RT and DL might need to be revisited later but we don't
have immediate need for them now.

I'll add to this passing rd->span to sched_overutilized.
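
For illustration, the probe on the module side would then receive the opaque
pointer and decode it through your exported accessors - a rough sketch only,
assuming sched_trace_cfs_rq_path()/sched_trace_cfs_rq_cpu() land with the
signatures from your diff:

	static void probe_pelt_cfs_rq(void *data, struct cfs_rq *cfs_rq)
	{
		char path[64];
		int cpu = sched_trace_cfs_rq_cpu(cfs_rq);

		sched_trace_cfs_rq_path(cfs_rq, path, sizeof(path));
		pr_info("pelt_cfs_rq: cpu=%d path=%s\n", cpu, path);
	}

That way the module never dereferences cfs_rq itself, so the structure can
stay private to the scheduler.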

Thanks

--
Qais Yousef

>  #endif /* _TRACE_SCHED_H */
>  
>  /* This part must be outside protection */
> diff --git a/kernel/sched/autogroup.c b/kernel/sched/autogroup.c
> index 2d4ff5353ded..2067080bb235 100644
> --- a/kernel/sched/autogroup.c
> +++ b/kernel/sched/autogroup.c
> @@ -259,7 +259,6 @@ void proc_sched_autogroup_show_task(struct task_struct *p, struct seq_file *m)
>  }
>  #endif /* CONFIG_PROC_FS */
>  
> -#ifdef CONFIG_SCHED_DEBUG
>  int autogroup_path(struct task_group *tg, char *buf, int buflen)
>  {
>  	if (!task_group_is_autogroup(tg))
> @@ -267,4 +266,3 @@ int autogroup_path(struct task_group *tg, char *buf, int buflen)
>  
>  	return snprintf(buf, buflen, "%s-%ld", "/autogroup", tg->autogroup->id);
>  }
> -#endif
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 102dfcf0a29a..629bbf4f4247 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -22,6 +22,14 @@
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/sched.h>
>  
> +/*
> + * Export tracepoints that act as a bare tracehook (ie: have no trace event
> + * associated with them) to allow external modules to probe them.
> + */
> +EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_cfs_rq);
> +EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_se);
> +EXPORT_TRACEPOINT_SYMBOL_GPL(sched_overutilized);
> +
>  DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
>  
>  #if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_JUMP_LABEL)
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index f35930f5e528..e7f82b1778b1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3334,6 +3334,9 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
>  	update_tg_cfs_util(cfs_rq, se, gcfs_rq);
>  	update_tg_cfs_runnable(cfs_rq, se, gcfs_rq);
>  
> +	trace_pelt_cfs_rq(cfs_rq);
> +	trace_pelt_se(se);
> +
>  	return 1;
>  }
>  
> @@ -3486,6 +3489,8 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
>  	add_tg_cfs_propagate(cfs_rq, se->avg.load_sum);
>  
>  	cfs_rq_util_change(cfs_rq, flags);
> +
> +	trace_pelt_cfs_rq(cfs_rq);
>  }
>  
>  /**
> @@ -3505,6 +3510,8 @@ static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
>  	add_tg_cfs_propagate(cfs_rq, -se->avg.load_sum);
>  
>  	cfs_rq_util_change(cfs_rq, 0);
> +
> +	trace_pelt_cfs_rq(cfs_rq);
>  }
>  
>  /*
> @@ -5153,8 +5160,10 @@ static inline bool cpu_overutilized(int cpu)
>  
>  static inline void update_overutilized_status(struct rq *rq)
>  {
> -	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
> +	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
>  		WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
> +		trace_sched_overutilized(1);
> +	}
>  }
>  #else
>  static inline void update_overutilized_status(struct rq *rq) { }
> @@ -8516,8 +8525,11 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>  
>  		/* Update over-utilization (tipping point, U >= 0) indicator */
>  		WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
> +
> +		trace_sched_overutilized(!!(sg_status & SG_OVERUTILIZED));
>  	} else if (sg_status & SG_OVERUTILIZED) {
>  		WRITE_ONCE(env->dst_rq->rd->overutilized, SG_OVERUTILIZED);
> +		trace_sched_overutilized(1);
>  	}
>  }
>  
> @@ -10737,3 +10749,17 @@ __init void init_sched_fair_class(void)
>  #endif /* SMP */
>  
>  }
> +
> +char *sched_trace_cfs_rq_path(struct cfs_rq *cfs_rq, char *str, size_t len)
> +{
> +	cfs_rq_tg_path(cfs_rq, str, len);
> +	return str;
> +}
> +EXPORT_SYMBOL_GPL(sched_trace_cfs_rq_path);
> +
> +int sched_trace_cfs_rq_cpu(struct cfs_rq *cfs_rq)
> +{
> +	return cpu_of(rq_of(cfs_rq));
> +}
> +EXPORT_SYMBOL_GPL(sched_trace_cfs_rq_cpu);
> +
> diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
> index befce29bd882..ebca40ba71f3 100644
> --- a/kernel/sched/pelt.c
> +++ b/kernel/sched/pelt.c
> @@ -25,6 +25,7 @@
>   */
>  
>  #include <linux/sched.h>
> +#include <trace/events/sched.h>
>  #include "sched.h"
>  #include "pelt.h"
>  
> @@ -265,6 +266,7 @@ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se)
>  {
>  	if (___update_load_sum(now, &se->avg, 0, 0, 0)) {
>  		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
> +		trace_pelt_se(se);
>  		return 1;
>  	}
>  
> @@ -278,6 +280,7 @@ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se
>  
>  		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
>  		cfs_se_util_change(&se->avg);
> +		trace_pelt_se(se);
>  		return 1;
>  	}
>  
> @@ -292,6 +295,7 @@ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq)
>  				cfs_rq->curr != NULL)) {
>  
>  		___update_load_avg(&cfs_rq->avg, 1, 1);
> +		trace_pelt_cfs_rq(cfs_rq);
>  		return 1;
>  	}
>  
> @@ -317,6 +321,7 @@ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running)
>  				running)) {
>  
>  		___update_load_avg(&rq->avg_rt, 1, 1);
> +//		sched_trace_pelt_rt_rq(rq);
>  		return 1;
>  	}
>  
> @@ -340,6 +345,7 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
>  				running)) {
>  
>  		___update_load_avg(&rq->avg_dl, 1, 1);
> +//		sched_trace_pelt_dl_rq(rq);
>  		return 1;
>  	}
>  

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] Add new tracepoints required for EAS testing
  2019-05-13 13:42   ` Qais Yousef
@ 2019-05-13 15:06     ` Peter Zijlstra
  2019-05-13 15:18       ` Qais Yousef
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2019-05-13 15:06 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret

On Mon, May 13, 2019 at 02:42:03PM +0100, Qais Yousef wrote:
> On 05/13/19 14:28, Peter Zijlstra wrote:
> > 
> > 
> > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> > index c8c7c7efb487..11555f95a88e 100644
> > --- a/include/trace/events/sched.h
> > +++ b/include/trace/events/sched.h
> > @@ -594,6 +594,23 @@ TRACE_EVENT(sched_wake_idle_without_ipi,
> >  
> >  	TP_printk("cpu=%d", __entry->cpu)
> >  );
> > +
> > +/*
> > + * Following tracepoints are not exported in tracefs and provide hooking
> > + * mechanisms only for testing and debugging purposes.
> > + */
> > +DECLARE_TRACE(pelt_cfs_rq,
> > +	TP_PROTO(struct cfs_rq *cfs_rq),
> > +	TP_ARGS(cfs_rq));
> > +
> > +DECLARE_TRACE(pelt_se,
> > +	TP_PROTO(struct sched_entity *se),
> > +	TP_ARGS(se));
> > +
> > +DECLARE_TRACE(sched_overutilized,
> > +	TP_PROTO(int overutilized),
> > +	TP_ARGS(overutilized));
> > +
> 
> If I decoded this patch correctly, what you're saying:
> 
> 	1. Move struct cfs_rq to the exported sched.h header

No, don't expose the structure, we want to keep that private. You can
use unqualified pointers.

> 	2. Get rid of the fatty wrapper functions and export any necessary
> 	   helper functions.

Right, that should get them read-only access to the members of those
structures and avoids the tracing code itself becoming ugleh and
also avoids us having to export those structures (which we really don't
want to do).

> 	3. No need for RT and DL pelt tracking at the moment.

Nah, you probably want rt,dl,irq (as Dietmar pointed out), it's just
that your patches didn't do it right and I was lazy.

> I'm okay with this. The RT and DL might need to be revisited later but we don't
> have immediate need for them now.
> 
> I'll add to this passing rd->span to sched_overutilized.

Or pass the rd itself and add another wrapper to extract the span.
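
Something like this perhaps (untested sketch, the name is made up):

	/* CONFIG_SMP only; root_domain doesn't exist otherwise */
	const struct cpumask *sched_trace_rd_span(struct root_domain *rd)
	{
		return rd ? rd->span : NULL;
	}
	EXPORT_SYMBOL_GPL(sched_trace_rd_span);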

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/7] Add new tracepoints required for EAS testing
  2019-05-13 15:06     ` Peter Zijlstra
@ 2019-05-13 15:18       ` Qais Yousef
  0 siblings, 0 replies; 17+ messages in thread
From: Qais Yousef @ 2019-05-13 15:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Steven Rostedt, linux-kernel, Pavankumar Kondeti,
	Sebastian Andrzej Siewior, Uwe Kleine-Konig, Dietmar Eggemann,
	Quentin Perret

On 05/13/19 17:06, Peter Zijlstra wrote:
> On Mon, May 13, 2019 at 02:42:03PM +0100, Qais Yousef wrote:
> > On 05/13/19 14:28, Peter Zijlstra wrote:
> > > 
> > > 
> > > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> > > index c8c7c7efb487..11555f95a88e 100644
> > > --- a/include/trace/events/sched.h
> > > +++ b/include/trace/events/sched.h
> > > @@ -594,6 +594,23 @@ TRACE_EVENT(sched_wake_idle_without_ipi,
> > >  
> > >  	TP_printk("cpu=%d", __entry->cpu)
> > >  );
> > > +
> > > +/*
> > > + * Following tracepoints are not exported in tracefs and provide hooking
> > > + * mechanisms only for testing and debugging purposes.
> > > + */
> > > +DECLARE_TRACE(pelt_cfs_rq,
> > > +	TP_PROTO(struct cfs_rq *cfs_rq),
> > > +	TP_ARGS(cfs_rq));
> > > +
> > > +DECLARE_TRACE(pelt_se,
> > > +	TP_PROTO(struct sched_entity *se),
> > > +	TP_ARGS(se));
> > > +
> > > +DECLARE_TRACE(sched_overutilized,
> > > +	TP_PROTO(int overutilized),
> > > +	TP_ARGS(overutilized));
> > > +
> > 
> > If I decoded this patch correctly, what you're saying:
> > 
> > 	1. Move struct cfs_rq to the exported sched.h header
> 
> No, don't expose the structure, we want to keep that private. You can
> use unqualified pointers.
> 
> > 	2. Get rid of the fatty wrapper functions and export any necessary
> > 	   helper functions.
> 
> Right, that should get them read-only access to the members of those
> structures and avoids the tracing code itself from becoming ugleh and
> also avoids us having to export those structures (which we really don't
> want to do).
> 
> > 	3. No need for RT and DL pelt tracking at the moment.
> 
> Nah, you probably want rt,dl,irq (as Dietmar pointed out), it's just
> that your patched didn't do it right and I was lazy.
> 
> > I'm okay with this. The RT and DL might need to be revisited later but we don't
> > have immediate need for them now.
> > 
> > I'll add to this passing rd->span to sched_overutilized.
> 
> Or pass the rd itself and add another wrapper to extract the span.

Ok got ya. Will do.

Thanks

--
Qais Yousef

^ permalink raw reply	[flat|nested] 17+ messages in thread

Thread overview: 17+ messages
2019-05-10 11:30 [PATCH v2 0/7] Add new tracepoints required for EAS testing Qais Yousef
2019-05-10 11:30 ` [PATCH v2 1/7] sched: autogroup: Make autogroup_path() always available Qais Yousef
2019-05-10 11:30 ` [PATCH v2 2/7] sched: fair: move helper functions into fair.h Qais Yousef
2019-05-10 11:30 ` [PATCH v2 3/7] sched: fair.h: add a new cfs_rq_tg_path() Qais Yousef
2019-05-10 11:30 ` [PATCH v2 4/7] sched: Add pelt_rq tracepoint Qais Yousef
2019-05-13 12:14   ` Peter Zijlstra
2019-05-13 12:48     ` Qais Yousef
2019-05-13 13:37       ` Dietmar Eggemann
2019-05-10 11:30 ` [PATCH v2 5/7] sched: Add pelt_se tracepoint Qais Yousef
2019-05-10 11:30 ` [PATCH v2 6/7] sched: Add sched_overutilized tracepoint Qais Yousef
2019-05-13 12:08   ` Peter Zijlstra
2019-05-13 12:42     ` Qais Yousef
2019-05-10 11:30 ` [PATCH v2 7/7] sched: export the newly added tracepoints Qais Yousef
2019-05-13 12:28 ` [PATCH v2 0/7] Add new tracepoints required for EAS testing Peter Zijlstra
2019-05-13 13:42   ` Qais Yousef
2019-05-13 15:06     ` Peter Zijlstra
2019-05-13 15:18       ` Qais Yousef
