All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] pgo: Add PGO support for module profile data
@ 2021-06-12  3:24 Jarmo Tiitto
  2021-06-12  3:24 ` [RFC PATCH 1/5] pgo: Expose module sections for clang PGO instumentation Jarmo Tiitto
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Jarmo Tiitto @ 2021-06-12  3:24 UTC (permalink / raw)
  To: Nathan Chancellor, Nick Desaulniers, clang-built-linux
  Cc: Jarmo Tiitto, samitolvanen, morbo, wcw, keescook, jeyu, linux-kernel

This patch series intends to extend the current Clang PGO code to
support profile data from modules. Note that current PGO can and *does* 
instrument all kernel code, including modules, but this profile data
is inaccessible.

This patch series adds pgo/<module>.profraw files from what
per loaded module profile data can be read.

The final profile can be generated by merging all these profile files:
llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw ...
and then building the optimized kernel.

This v2 patch series is still an bit of RFC so I'd like feedback how
to do things better still.

The patches itself are based on Kees/for-next/clang/features tree
where I have two of my bug fix patches already in. :-)

I have done some initial testing:
 * Booted the instrumented kernel on qemu *and* bare hardware.
 * Module un/loading via test_module in QEMU.
 * Built optimized kernel using the new profile data.

Jarmo Tiitto (5):
  pgo: Expose module sections for clang PGO instumentation.
  pgo: Make serializing functions to take prf_object
  pgo: Wire up the new more generic code for modules
  pgo: Add module notifier machinery
  pgo: Cleanup code in pgo/fs.c

 include/linux/module.h  |  15 +++
 kernel/Makefile         |   6 +
 kernel/module.c         |   7 ++
 kernel/pgo/fs.c         | 241 ++++++++++++++++++++++++++++++++++------
 kernel/pgo/instrument.c |  57 +++++++---
 kernel/pgo/pgo.h        |  85 ++++++++++----
 6 files changed, 342 insertions(+), 69 deletions(-)


base-commit: 0039303120c0065f3952698597e0c9916b76ebd5
-- 
2.32.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH 1/5] pgo: Expose module sections for clang PGO instumentation.
  2021-06-12  3:24 [RFC PATCH 0/5] pgo: Add PGO support for module profile data Jarmo Tiitto
@ 2021-06-12  3:24 ` Jarmo Tiitto
  2021-06-12  3:24 ` [RFC PATCH 2/5] pgo: Make serializing functions to take prf_object Jarmo Tiitto
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Jarmo Tiitto @ 2021-06-12  3:24 UTC (permalink / raw)
  To: Jessica Yu, Nathan Chancellor, Nick Desaulniers, Kees Cook,
	Andrew Morton, Sami Tolvanen, Marco Elver, Alexander Popov,
	Masahiro Yamada, Paul E. McKenney, Jarmo Tiitto, Dmitry Vyukov,
	Arnd Bergmann, Alexei Starovoitov, linux-kernel,
	clang-built-linux
  Cc: morbo, Andy Shevchenko, Thomas Gleixner, Chris Wilson

Expose module sections for clang PGO:
In find_module_sections() add code to grab pointer and size
of __llvm_prf_{data,cnts,names,vnds} sections.

This data is used by pgo/instrument.c and pgo/fs.c
in following patches together with explicitly exposed
 vmlinux's core __llvm_prf_xxx sections.

Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
---
the reason of disabling profiling for module.c
is that instrumented kernel changes struct module layout,
and thus invalidates profile data collected from module.c
when optimized kernel it built.

More over the profile data from kernel/module.c
is probably not needed either way.
---
 include/linux/module.h | 15 +++++++++++++++
 kernel/Makefile        |  6 ++++++
 kernel/module.c        |  7 +++++++
 3 files changed, 28 insertions(+)

diff --git a/include/linux/module.h b/include/linux/module.h
index 8100bb477d86..7f557016e879 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -492,6 +492,21 @@ struct module {
 	unsigned long *kprobe_blacklist;
 	unsigned int num_kprobe_blacklist;
 #endif
+#ifdef CONFIG_PGO_CLANG
+	/*
+	 * Keep in sync with the PGO_CLANG_DATA sections
+	 * in include/asm-generic/vmlinux.lds.h
+	 * The prf_xxx_size is the section size in bytes.
+	 */
+	void *prf_data; /* struct llvm_prf_data */
+	int prf_data_size;
+	void *prf_cnts;
+	int prf_cnts_size;
+	const void *prf_names;
+	int prf_names_size;
+	void *prf_vnds; /* struct llvm_prf_value_node */
+	int prf_vnds_size;
+#endif
 #ifdef CONFIG_HAVE_STATIC_CALL_INLINE
 	int num_static_call_sites;
 	struct static_call_site *static_call_sites;
diff --git a/kernel/Makefile b/kernel/Makefile
index 6deef4effbdb..8657d67b771c 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -44,6 +44,12 @@ CFLAGS_kcov.o := $(call cc-option, -fno-conserve-stack) -fno-stack-protector
 # Don't instrument error handlers
 CFLAGS_REMOVE_cfi.o := $(CC_FLAGS_CFI)
 
+# Don't profile module.c:
+# CLANG_PGO changes the layout of struct module
+# for instrumented kernel so the profile data
+# will mismatch on final build.
+PGO_PROFILE_module.o := n
+
 obj-y += sched/
 obj-y += locking/
 obj-y += power/
diff --git a/kernel/module.c b/kernel/module.c
index b5dd92e35b02..a2f65c247c41 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3329,6 +3329,13 @@ static int find_module_sections(struct module *mod, struct load_info *info)
 					      sizeof(*mod->static_call_sites),
 					      &mod->num_static_call_sites);
 #endif
+#ifdef CONFIG_PGO_CLANG
+	mod->prf_data = section_objs(info, "__llvm_prf_data", 1, &mod->prf_data_size);
+	mod->prf_cnts = section_objs(info, "__llvm_prf_cnts", 1, &mod->prf_cnts_size);
+	mod->prf_names = section_objs(info, "__llvm_prf_names", 1, &mod->prf_names_size);
+	mod->prf_vnds = section_objs(info, "__llvm_prf_vnds", 1, &mod->prf_vnds_size);
+#endif
+
 	mod->extable = section_objs(info, "__ex_table",
 				    sizeof(*mod->extable), &mod->num_exentries);
 

base-commit: 0039303120c0065f3952698597e0c9916b76ebd5
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 2/5] pgo: Make serializing functions to take prf_object
  2021-06-12  3:24 [RFC PATCH 0/5] pgo: Add PGO support for module profile data Jarmo Tiitto
  2021-06-12  3:24 ` [RFC PATCH 1/5] pgo: Expose module sections for clang PGO instumentation Jarmo Tiitto
@ 2021-06-12  3:24 ` Jarmo Tiitto
  2021-06-12  3:24 ` [RFC PATCH 3/5] pgo: Wire up the new more generic code for modules Jarmo Tiitto
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Jarmo Tiitto @ 2021-06-12  3:24 UTC (permalink / raw)
  To: Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel
  Cc: Jarmo Tiitto, morbo

* Add struct prf_object to hold module sections in
  the new "prf_list" linked list.

* Rewrite serialization code such that they take struct
  prf_object as an argument and work on it instead
  of the global: __llvm_prf_{data,cnts,names,vnds} sections.

* No functional user visible changes yet.

The prf_vmlinux is initialized using vmlinux core sections.
This struct is then passed down into the serializing functions.

Idea here is that the profiling nor serialization code
doesn't need to care if the prf_object points into
vmlinux's core sections or into some loaded module's sections.

* The prf_list is RCU protected so that in following commits
  allocate_node() can walk the prf_list uninterupted.

* List updaters must take the new prf_list_lock() lock.

Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
---
This is the main commit of this patch series,
since it re-structures most of the pgo/fs.c around prf_object.

I have squashed a lot of stuff into this single commit.
---
 kernel/pgo/fs.c         | 105 +++++++++++++++++++++++++++++-----------
 kernel/pgo/instrument.c |   3 +-
 kernel/pgo/pgo.h        |  83 +++++++++++++++++++++++--------
 3 files changed, 142 insertions(+), 49 deletions(-)

diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
index 3c5aa7c2a4ce..7e269d69bcd7 100644
--- a/kernel/pgo/fs.c
+++ b/kernel/pgo/fs.c
@@ -34,6 +34,31 @@ struct prf_private_data {
 	size_t size;
 };
 
+static struct prf_object prf_vmlinux;
+
+/*
+ * This lock guards prf_list from concurrent access:
+ * - New module is registered.
+ * - Module is unloaded.
+ * It does *not* protect any PGO serialization functions.
+ */
+DEFINE_SPINLOCK(prf_reg_lock);
+LIST_HEAD(prf_list);
+
+static inline unsigned long prf_list_lock(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&prf_reg_lock, flags);
+
+	return flags;
+}
+
+static inline void prf_list_unlock(unsigned long flags)
+{
+	spin_unlock_irqrestore(&prf_reg_lock, flags);
+}
+
 /*
  * Raw profile data format:
  *
@@ -51,7 +76,7 @@ struct prf_private_data {
  *		...
  */
 
-static void prf_fill_header(void **buffer)
+static void prf_fill_header(struct prf_object *po, void **buffer)
 {
 	struct llvm_prf_header *header = *(struct llvm_prf_header **)buffer;
 
@@ -61,13 +86,13 @@ static void prf_fill_header(void **buffer)
 	header->magic = LLVM_INSTR_PROF_RAW_MAGIC_32;
 #endif
 	header->version = LLVM_VARIANT_MASK_IR_PROF | LLVM_INSTR_PROF_RAW_VERSION;
-	header->data_size = prf_data_count();
+	header->data_size = po->data_num;
 	header->padding_bytes_before_counters = 0;
-	header->counters_size = prf_cnts_count();
+	header->counters_size = po->cnts_num;
 	header->padding_bytes_after_counters = 0;
-	header->names_size = prf_names_count();
-	header->counters_delta = (u64)__llvm_prf_cnts_start;
-	header->names_delta = (u64)__llvm_prf_names_start;
+	header->names_size = po->names_num;
+	header->counters_delta = (u64)po->cnts;
+	header->names_delta = (u64)po->names;
 	header->value_kind_last = LLVM_INSTR_PROF_IPVK_LAST;
 
 	*buffer += sizeof(*header);
@@ -77,7 +102,7 @@ static void prf_fill_header(void **buffer)
  * Copy the source into the buffer, incrementing the pointer into buffer in the
  * process.
  */
-static void prf_copy_to_buffer(void **buffer, void *src, unsigned long size)
+static void prf_copy_to_buffer(void **buffer, const void *src, unsigned long size)
 {
 	memcpy(*buffer, src, size);
 	*buffer += size;
@@ -129,12 +154,13 @@ static u32 __prf_get_value_size(struct llvm_prf_data *p, u32 *value_kinds)
 	return size;
 }
 
-static u32 prf_get_value_size(void)
+static u32 prf_get_value_size(struct prf_object *po)
 {
 	u32 size = 0;
 	struct llvm_prf_data *p;
+	struct llvm_prf_data *end = po->data + po->data_num;
 
-	for (p = __llvm_prf_data_start; p < __llvm_prf_data_end; p++)
+	for (p = po->data; p < end; p++)
 		size += __prf_get_value_size(p, NULL);
 
 	return size;
@@ -201,11 +227,12 @@ static void prf_serialize_value(struct llvm_prf_data *p, void **buffer)
 	}
 }
 
-static void prf_serialize_values(void **buffer)
+static void prf_serialize_values(struct prf_object *po, void **buffer)
 {
 	struct llvm_prf_data *p;
+	struct llvm_prf_data *end = po->data + po->data_num;
 
-	for (p = __llvm_prf_data_start; p < __llvm_prf_data_end; p++)
+	for (p = po->data; p < end; p++)
 		prf_serialize_value(p, buffer);
 }
 
@@ -215,14 +242,14 @@ static inline unsigned long prf_get_padding(unsigned long size)
 }
 
 /* Note: caller *must* hold pgo_lock */
-static unsigned long prf_buffer_size(void)
+static unsigned long prf_buffer_size(struct prf_object *po)
 {
-	return sizeof(struct llvm_prf_header) +
-			prf_data_size()	+
-			prf_cnts_size() +
-			prf_names_size() +
-			prf_get_padding(prf_names_size()) +
-			prf_get_value_size();
+	return sizeof(struct llvm_prf_header)	+
+			prf_data_size(po)	+
+			prf_cnts_size(po)	+
+			prf_names_size(po)	+
+			prf_get_padding(prf_names_size(po)) +
+			prf_get_value_size(po);
 }
 
 /*
@@ -232,12 +259,12 @@ static unsigned long prf_buffer_size(void)
  * area of at least prf_buffer_size() in size.
  * Note: caller *must* hold pgo_lock.
  */
-static int prf_serialize(struct prf_private_data *p, size_t buf_size)
+static int prf_serialize(struct prf_object *po, struct prf_private_data *p, size_t buf_size)
 {
 	void *buffer;
 
 	/* get buffer size, again. */
-	p->size = prf_buffer_size();
+	p->size = prf_buffer_size(po);
 
 	/* check for unlikely overflow. */
 	if (p->size > buf_size)
@@ -245,15 +272,16 @@ static int prf_serialize(struct prf_private_data *p, size_t buf_size)
 
 	buffer = p->buffer;
 
-	prf_fill_header(&buffer);
-	prf_copy_to_buffer(&buffer, __llvm_prf_data_start,  prf_data_size());
-	prf_copy_to_buffer(&buffer, __llvm_prf_cnts_start,  prf_cnts_size());
-	prf_copy_to_buffer(&buffer, __llvm_prf_names_start, prf_names_size());
-	buffer += prf_get_padding(prf_names_size());
+	prf_fill_header(po, &buffer);
+	prf_copy_to_buffer(&buffer, po->data,  prf_data_size(po));
+	prf_copy_to_buffer(&buffer, po->cnts,  prf_cnts_size(po));
+	prf_copy_to_buffer(&buffer, po->names, prf_names_size(po));
+	buffer += prf_get_padding(prf_names_size(po));
 
-	prf_serialize_values(&buffer);
+	prf_serialize_values(po, &buffer);
 
 	return 0;
+
 }
 
 /* open() implementation for PGO. Creates a copy of the profiling data set. */
@@ -270,7 +298,7 @@ static int prf_open(struct inode *inode, struct file *file)
 
 	/* Get initial buffer size. */
 	flags = prf_lock();
-	data->size = prf_buffer_size();
+	data->size = prf_buffer_size(&prf_vmlinux);
 	prf_unlock(flags);
 
 	do {
@@ -290,7 +318,7 @@ static int prf_open(struct inode *inode, struct file *file)
 		 * data length in data->size.
 		 */
 		flags = prf_lock();
-		err = prf_serialize(data, buf_size);
+		err = prf_serialize(&prf_vmlinux, data, buf_size);
 		prf_unlock(flags);
 		/* In unlikely case, try again. */
 	} while (err == -EAGAIN);
@@ -346,7 +374,7 @@ static ssize_t reset_write(struct file *file, const char __user *addr,
 {
 	struct llvm_prf_data *data;
 
-	memset(__llvm_prf_cnts_start, 0, prf_cnts_size());
+	memset(__llvm_prf_cnts_start, 0, prf_cnts_size(&prf_vmlinux));
 
 	for (data = __llvm_prf_data_start; data < __llvm_prf_data_end; data++) {
 		struct llvm_prf_value_node **vnodes;
@@ -384,6 +412,25 @@ static const struct file_operations prf_reset_fops = {
 /* Create debugfs entries. */
 static int __init pgo_init(void)
 {
+	/* Init profiler vmlinux core entry */
+	memset(&prf_vmlinux, 0, sizeof(prf_vmlinux));
+	prf_vmlinux.data = __llvm_prf_data_start;
+	prf_vmlinux.data_num = prf_get_count(__llvm_prf_data_start,
+		__llvm_prf_data_end, sizeof(__llvm_prf_data_start[0]));
+
+	prf_vmlinux.cnts = __llvm_prf_cnts_start;
+	prf_vmlinux.cnts_num = prf_get_count(__llvm_prf_cnts_start,
+		__llvm_prf_cnts_end, sizeof(__llvm_prf_cnts_start[0]));
+
+	prf_vmlinux.names = __llvm_prf_names_start;
+	prf_vmlinux.names_num = prf_get_count(__llvm_prf_names_start,
+		__llvm_prf_names_end, sizeof(__llvm_prf_names_start[0]));
+
+	prf_vmlinux.vnds = __llvm_prf_vnds_start;
+	prf_vmlinux.vnds_num = prf_get_count(__llvm_prf_vnds_start,
+		__llvm_prf_vnds_end, sizeof(__llvm_prf_vnds_start[0]));
+
+
 	directory = debugfs_create_dir("pgo", NULL);
 	if (!directory)
 		goto err_remove;
diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
index 8b54fb6be336..24fdeb79b674 100644
--- a/kernel/pgo/instrument.c
+++ b/kernel/pgo/instrument.c
@@ -56,7 +56,8 @@ void prf_unlock(unsigned long flags)
 static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
 						 u32 index, u64 value)
 {
-	const int max_vnds = prf_vnds_count();
+	const int max_vnds = prf_get_count(__llvm_prf_vnds_start,
+		__llvm_prf_vnds_end, sizeof(struct llvm_prf_value_node));
 
 	/*
 	 * Check that p is within vmlinux __llvm_prf_data section.
diff --git a/kernel/pgo/pgo.h b/kernel/pgo/pgo.h
index ba3f8499254a..44d79e2861e1 100644
--- a/kernel/pgo/pgo.h
+++ b/kernel/pgo/pgo.h
@@ -185,27 +185,72 @@ void __llvm_profile_instrument_range(u64 target_value, void *data,
 void __llvm_profile_instrument_memop(u64 target_value, void *data,
 				     u32 counter_index);
 
-#define __DEFINE_PRF_SIZE(s) \
-	static inline unsigned long prf_ ## s ## _size(void)		\
-	{								\
-		unsigned long start =					\
-			(unsigned long)__llvm_prf_ ## s ## _start;	\
-		unsigned long end =					\
-			(unsigned long)__llvm_prf_ ## s ## _end;	\
-		return roundup(end - start,				\
-				sizeof(__llvm_prf_ ## s ## _start[0]));	\
-	}								\
-	static inline unsigned long prf_ ## s ## _count(void)		\
-	{								\
-		return prf_ ## s ## _size() /				\
-			sizeof(__llvm_prf_ ## s ## _start[0]);		\
+/*
+ * profiler object:
+ * maintain profiler internal state
+ * of vmlinux or any instrumented module.
+ */
+struct prf_object {
+	struct list_head link;
+	struct rcu_head rcu;
+
+	/*
+	 * module name of this prf_object
+	 * refers to struct module::name
+	 * or "vmlinux"
+	 */
+	const char *mod_name;
+
+	/* debugfs file of this profile data set */
+	struct dentry *file;
+
+	int current_node;
+
+	struct llvm_prf_data *data;
+	int data_num;
+	u64 *cnts;
+	int cnts_num;
+	const char *names;
+	int names_num;
+	struct llvm_prf_value_node *vnds;
+	int vnds_num;
+};
+
+/*
+ * List of profiled modules and vmlinux.
+ * Readers must take rcu_read_lock() and
+ * updaters must take prf_list_lock() mutex.
+ */
+extern struct list_head prf_list;
+
+/*
+ * define helpers to get __llvm_prf_xxx sections bounds
+ */
+#define __DEFINE_PRF_OBJ_SIZE(s) \
+	static inline unsigned long prf_ ## s ## _size(struct prf_object *po) \
+	{ \
+		return po->s ## _num * sizeof(po->s[0]); \
+	} \
+	static inline unsigned long prf_ ## s ## _count(struct prf_object *po) \
+	{ \
+		return po->s ## _num; \
 	}
 
-__DEFINE_PRF_SIZE(data);
-__DEFINE_PRF_SIZE(cnts);
-__DEFINE_PRF_SIZE(names);
-__DEFINE_PRF_SIZE(vnds);
+__DEFINE_PRF_OBJ_SIZE(data);
+__DEFINE_PRF_OBJ_SIZE(cnts);
+__DEFINE_PRF_OBJ_SIZE(names);
+__DEFINE_PRF_OBJ_SIZE(vnds);
+
+#undef __DEFINE_PRF_OBJ_SIZE
+
+/* count number of items in range */
+static inline unsigned int prf_get_count(const void *_start, const void *_end,
+	unsigned int objsize)
+{
+	unsigned long start = (unsigned long)_start;
+	unsigned long end =	(unsigned long)_end;
 
-#undef __DEFINE_PRF_SIZE
+	return roundup(end - start, objsize) / objsize;
+}
 
 #endif /* _PGO_H */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 3/5] pgo: Wire up the new more generic code for modules
  2021-06-12  3:24 [RFC PATCH 0/5] pgo: Add PGO support for module profile data Jarmo Tiitto
  2021-06-12  3:24 ` [RFC PATCH 1/5] pgo: Expose module sections for clang PGO instumentation Jarmo Tiitto
  2021-06-12  3:24 ` [RFC PATCH 2/5] pgo: Make serializing functions to take prf_object Jarmo Tiitto
@ 2021-06-12  3:24 ` Jarmo Tiitto
  2021-06-14 15:55   ` Kees Cook
  2021-06-12  3:24 ` [RFC PATCH 4/5] pgo: Add module notifier machinery Jarmo Tiitto
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Jarmo Tiitto @ 2021-06-12  3:24 UTC (permalink / raw)
  To: Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel
  Cc: Jarmo Tiitto, morbo

prf_open() now uses the inode->i_private to get
the prf_object for the file. This can be either
vmlinux.profraw or any module.profraw file.

The prf_vmlinux object is now added into prf_list and
allocate_node() scans the list and reserves vnodes
from corresponding prf_object(s).

Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
---
note: There is no module notifier code yet,
so only vmlinux.profraw profile data
is available with this commit.

Another thing is that pgo/reset will only
reset vmlinux.profraw.
Profile data reset for modules may be added later:
maybe writing module's name into pgo/reset would reset only
the specified module's profile data?
Then writing "all" or zero would atomically reset everything.

I'm bit unsure about the new allocate_node() code since
it is the first place I had to put rcu_read_lock()
and the code is likely to change from this.
---
 kernel/pgo/fs.c         | 30 ++++++++++++++++++++-----
 kernel/pgo/instrument.c | 49 +++++++++++++++++++++++++++--------------
 kernel/pgo/pgo.h        |  2 ++
 3 files changed, 60 insertions(+), 21 deletions(-)

diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
index 7e269d69bcd7..84b36e61758b 100644
--- a/kernel/pgo/fs.c
+++ b/kernel/pgo/fs.c
@@ -32,8 +32,10 @@ static struct dentry *directory;
 struct prf_private_data {
 	void *buffer;
 	size_t size;
+	struct prf_object *core;
 };
 
+/* vmlinux's prf core */
 static struct prf_object prf_vmlinux;
 
 /*
@@ -281,7 +283,6 @@ static int prf_serialize(struct prf_object *po, struct prf_private_data *p, size
 	prf_serialize_values(po, &buffer);
 
 	return 0;
-
 }
 
 /* open() implementation for PGO. Creates a copy of the profiling data set. */
@@ -292,13 +293,21 @@ static int prf_open(struct inode *inode, struct file *file)
 	size_t buf_size;
 	int err = -EINVAL;
 
+	if (WARN_ON(!inode->i_private)) {
+		/* bug: inode was not initialized by us */
+		return err;
+	}
+
 	data = kzalloc(sizeof(*data), GFP_KERNEL);
 	if (!data)
 		return -ENOMEM;
 
+	/* Get prf_object of this inode */
+	data->core = inode->i_private;
+
 	/* Get initial buffer size. */
 	flags = prf_lock();
-	data->size = prf_buffer_size(&prf_vmlinux);
+	data->size = prf_buffer_size(data->core);
 	prf_unlock(flags);
 
 	do {
@@ -318,12 +327,13 @@ static int prf_open(struct inode *inode, struct file *file)
 		 * data length in data->size.
 		 */
 		flags = prf_lock();
-		err = prf_serialize(&prf_vmlinux, data, buf_size);
+		err = prf_serialize(data->core, data, buf_size);
 		prf_unlock(flags);
 		/* In unlikely case, try again. */
 	} while (err == -EAGAIN);
 
 	if (err < 0) {
+
 		if (data)
 			vfree(data->buffer);
 		kfree(data);
@@ -412,6 +422,8 @@ static const struct file_operations prf_reset_fops = {
 /* Create debugfs entries. */
 static int __init pgo_init(void)
 {
+	unsigned long flags;
+
 	/* Init profiler vmlinux core entry */
 	memset(&prf_vmlinux, 0, sizeof(prf_vmlinux));
 	prf_vmlinux.data = __llvm_prf_data_start;
@@ -430,19 +442,27 @@ static int __init pgo_init(void)
 	prf_vmlinux.vnds_num = prf_get_count(__llvm_prf_vnds_start,
 		__llvm_prf_vnds_end, sizeof(__llvm_prf_vnds_start[0]));
 
+	/* enable profiling */
+	flags = prf_list_lock();
+	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
+	prf_list_unlock(flags);
 
 	directory = debugfs_create_dir("pgo", NULL);
 	if (!directory)
 		goto err_remove;
 
-	if (!debugfs_create_file("vmlinux.profraw", 0600, directory, NULL,
-				 &prf_fops))
+	prf_vmlinux.file = debugfs_create_file("vmlinux.profraw",
+		0600, directory, &prf_vmlinux, &prf_fops);
+	if (!prf_vmlinux.file)
 		goto err_remove;
 
 	if (!debugfs_create_file("reset", 0200, directory, NULL,
 				 &prf_reset_fops))
 		goto err_remove;
 
+	/* show notice why the system slower: */
+	pr_notice("Clang PGO instrumentation is active.");
+
 	return 0;
 
 err_remove:
diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
index 24fdeb79b674..e214c9d7a113 100644
--- a/kernel/pgo/instrument.c
+++ b/kernel/pgo/instrument.c
@@ -24,6 +24,7 @@
 #include <linux/export.h>
 #include <linux/spinlock.h>
 #include <linux/types.h>
+#include <linux/rculist.h>
 #include "pgo.h"
 
 /*
@@ -56,22 +57,38 @@ void prf_unlock(unsigned long flags)
 static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
 						 u32 index, u64 value)
 {
-	const int max_vnds = prf_get_count(__llvm_prf_vnds_start,
-		__llvm_prf_vnds_end, sizeof(struct llvm_prf_value_node));
-
-	/*
-	 * Check that p is within vmlinux __llvm_prf_data section.
-	 * If not, don't allocate since we can't handle modules yet.
-	 */
-	if (!memory_contains(__llvm_prf_data_start,
-		__llvm_prf_data_end, p, sizeof(*p)))
-		return NULL;
-
-	if (WARN_ON_ONCE(current_node >= max_vnds))
-		return NULL; /* Out of nodes */
-
-	/* reserve vnode for vmlinux */
-	return &__llvm_prf_vnds_start[current_node++];
+	struct llvm_prf_value_node *vnode = NULL;
+	struct prf_object *po;
+	struct llvm_prf_data *data_end;
+	int max_vnds;
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(po, &prf_list, link) {
+		/* get section limits */
+		max_vnds = prf_vnds_count(po);
+		data_end = po->data + prf_data_count(po);
+
+		/*
+		 * Check that p is within:
+		 * [po->data, po->data + prf_data_count(po)] section.
+		 * If yes, allocate vnode from this prf_object.
+		 */
+		if (memory_contains(po->data, data_end, p, sizeof(*p))) {
+
+
+			if (WARN_ON_ONCE(po->current_node >= max_vnds))
+				return NULL; /* Out of nodes */
+
+			/* reserve the vnode */
+			vnode = &po->vnds[po->current_node++];
+			goto out;
+		}
+	}
+
+out:
+	rcu_read_unlock();
+	return vnode;
 }
 
 /*
diff --git a/kernel/pgo/pgo.h b/kernel/pgo/pgo.h
index 44d79e2861e1..59d0aa966fbe 100644
--- a/kernel/pgo/pgo.h
+++ b/kernel/pgo/pgo.h
@@ -19,6 +19,8 @@
 #ifndef _PGO_H
 #define _PGO_H
 
+#include <linux/rculist.h>
+
 /*
  * Note: These internal LLVM definitions must match the compiler version.
  * See llvm/include/llvm/ProfileData/InstrProfData.inc in LLVM's source code.
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 4/5] pgo: Add module notifier machinery
  2021-06-12  3:24 [RFC PATCH 0/5] pgo: Add PGO support for module profile data Jarmo Tiitto
                   ` (2 preceding siblings ...)
  2021-06-12  3:24 ` [RFC PATCH 3/5] pgo: Wire up the new more generic code for modules Jarmo Tiitto
@ 2021-06-12  3:24 ` Jarmo Tiitto
  2021-06-14 16:00   ` Kees Cook
  2021-06-12  3:24 ` [RFC PATCH 5/5] pgo: Cleanup code in pgo/fs.c Jarmo Tiitto
  2021-06-14 16:05 ` [RFC PATCH 0/5] pgo: Add PGO support for module profile data Kees Cook
  5 siblings, 1 reply; 14+ messages in thread
From: Jarmo Tiitto @ 2021-06-12  3:24 UTC (permalink / raw)
  To: Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel
  Cc: Jarmo Tiitto, morbo

Add module notifier callback and register modules
into prf_list.

For each module that has __llvm_prf_{data,cnts,names,vnds} sections
The callback creates debugfs <module>.profraw entry and
links new prf_object into prf_list.

This enables profiling for all loaded modules.

* Moved rcu_read_lock() outside of allocate_node() in order
  to protect __llvm_profile_instrument_target() from module unload(s)

Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
---
v2: Passed QEMU SMP boot test into minimal Arch Linux distro,
module loading and unloading work without warnings.
Module profile data looks ok. :-)
---
 kernel/pgo/fs.c         | 111 ++++++++++++++++++++++++++++++++++++++++
 kernel/pgo/instrument.c |  19 ++++---
 2 files changed, 122 insertions(+), 8 deletions(-)

diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
index 84b36e61758b..98b982245b58 100644
--- a/kernel/pgo/fs.c
+++ b/kernel/pgo/fs.c
@@ -419,6 +419,110 @@ static const struct file_operations prf_reset_fops = {
 	.llseek		= noop_llseek,
 };
 
+static void pgo_module_init(struct module *mod)
+{
+	struct prf_object *po;
+	char fname[MODULE_NAME_LEN + 9]; /* +strlen(".profraw") */
+	unsigned long flags;
+
+	/* add new prf_object entry for the module */
+	po = kzalloc(sizeof(*po), GFP_KERNEL);
+	if (!po)
+		goto out;
+
+	po->mod_name = mod->name;
+
+	fname[0] = 0;
+	snprintf(fname, sizeof(fname), "%s.profraw", po->mod_name);
+
+	/* setup prf_object sections */
+	po->data = mod->prf_data;
+	po->data_num = prf_get_count(mod->prf_data,
+		(char *)mod->prf_data + mod->prf_data_size, sizeof(po->data[0]));
+
+	po->cnts = mod->prf_cnts;
+	po->cnts_num = prf_get_count(mod->prf_cnts,
+		(char *)mod->prf_cnts + mod->prf_cnts_size, sizeof(po->cnts[0]));
+
+	po->names = mod->prf_names;
+	po->names_num = prf_get_count(mod->prf_names,
+		(char *)mod->prf_names + mod->prf_names_size, sizeof(po->names[0]));
+
+	po->vnds = mod->prf_vnds;
+	po->vnds_num = prf_get_count(mod->prf_vnds,
+		(char *)mod->prf_vnds + mod->prf_vnds_size, sizeof(po->vnds[0]));
+
+	/* create debugfs entry */
+	po->file = debugfs_create_file(fname, 0600, directory, po, &prf_fops);
+	if (!po->file) {
+		pr_err("Failed to setup module pgo: %s", fname);
+		kfree(po);
+		goto out;
+	}
+
+	/* finally enable profiling for the module */
+	flags = prf_list_lock();
+	list_add_tail_rcu(&po->link, &prf_list);
+	prf_list_unlock(flags);
+
+out:
+	return;
+}
+
+static int pgo_module_notifier(struct notifier_block *nb, unsigned long event,
+				void *pdata)
+{
+	struct module *mod = pdata;
+	struct prf_object *po;
+	unsigned long flags;
+
+	if (event == MODULE_STATE_LIVE) {
+		/* does the module have profiling info? */
+		if (mod->prf_data
+			&& mod->prf_cnts
+			&& mod->prf_names
+			&& mod->prf_vnds) {
+
+			/* setup module profiling */
+			pgo_module_init(mod);
+		}
+	}
+
+	if (event == MODULE_STATE_GOING) {
+		/* find the prf_object from the list */
+		rcu_read_lock();
+
+		list_for_each_entry_rcu(po, &prf_list, link) {
+			if (strcmp(po->mod_name, mod->name) == 0)
+				goto out_unlock;
+		}
+		/* no such module */
+		po = NULL;
+
+out_unlock:
+		rcu_read_unlock();
+
+		if (po) {
+			/* remove from profiled modules */
+			flags = prf_list_lock();
+			list_del_rcu(&po->link);
+			prf_list_unlock(flags);
+
+			debugfs_remove(po->file);
+			po->file = NULL;
+
+			/* cleanup memory */
+			kfree_rcu(po, rcu);
+		}
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block pgo_module_nb = {
+	.notifier_call = pgo_module_notifier
+};
+
 /* Create debugfs entries. */
 static int __init pgo_init(void)
 {
@@ -444,6 +548,7 @@ static int __init pgo_init(void)
 
 	/* enable profiling */
 	flags = prf_list_lock();
+	prf_vmlinux.mod_name = "vmlinux";
 	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
 	prf_list_unlock(flags);
 
@@ -460,6 +565,9 @@ static int __init pgo_init(void)
 				 &prf_reset_fops))
 		goto err_remove;
 
+	/* register module notifer */
+	register_module_notifier(&pgo_module_nb);
+
 	/* show notice why the system slower: */
 	pr_notice("Clang PGO instrumentation is active.");
 
@@ -473,6 +581,9 @@ static int __init pgo_init(void)
 /* Remove debugfs entries. */
 static void __exit pgo_exit(void)
 {
+	/* unsubscribe the notifier and do cleanup. */
+	unregister_module_notifier(&pgo_module_nb);
+
 	debugfs_remove_recursive(directory);
 }
 
diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
index e214c9d7a113..70bab7e4c153 100644
--- a/kernel/pgo/instrument.c
+++ b/kernel/pgo/instrument.c
@@ -33,7 +33,6 @@
  * ensures that we don't try to serialize data that's only partially updated.
  */
 static DEFINE_SPINLOCK(pgo_lock);
-static int current_node;
 
 unsigned long prf_lock(void)
 {
@@ -62,8 +61,6 @@ static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
 	struct llvm_prf_data *data_end;
 	int max_vnds;
 
-	rcu_read_lock();
-
 	list_for_each_entry_rcu(po, &prf_list, link) {
 		/* get section limits */
 		max_vnds = prf_vnds_count(po);
@@ -76,7 +73,6 @@ static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
 		 */
 		if (memory_contains(po->data, data_end, p, sizeof(*p))) {
 
-
 			if (WARN_ON_ONCE(po->current_node >= max_vnds))
 				return NULL; /* Out of nodes */
 
@@ -87,7 +83,6 @@ static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
 	}
 
 out:
-	rcu_read_unlock();
 	return vnode;
 }
 
@@ -108,8 +103,14 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
 	u8 values = 0;
 	unsigned long flags;
 
+	/*
+	 * lock the underlying prf_object(s) in place,
+	 * so they won't vanish while we are operating on it.
+	 */
+	rcu_read_lock();
+
 	if (!p || !p->values)
-		return;
+		goto rcu_unlock;
 
 	counters = (struct llvm_prf_value_node **)p->values;
 	curr = counters[index];
@@ -117,7 +118,7 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
 	while (curr) {
 		if (target_value == curr->value) {
 			curr->count++;
-			return;
+			goto rcu_unlock;
 		}
 
 		if (curr->count < min_count) {
@@ -136,7 +137,7 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
 			curr->value = target_value;
 			curr->count++;
 		}
-		return;
+		goto rcu_unlock;
 	}
 
 	/* Lock when updating the value node structure. */
@@ -156,6 +157,8 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
 
 out:
 	prf_unlock(flags);
+rcu_unlock:
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL(__llvm_profile_instrument_target);
 
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH 5/5] pgo: Cleanup code in pgo/fs.c
  2021-06-12  3:24 [RFC PATCH 0/5] pgo: Add PGO support for module profile data Jarmo Tiitto
                   ` (3 preceding siblings ...)
  2021-06-12  3:24 ` [RFC PATCH 4/5] pgo: Add module notifier machinery Jarmo Tiitto
@ 2021-06-12  3:24 ` Jarmo Tiitto
  2021-06-14 15:50   ` Kees Cook
  2021-06-14 16:05 ` [RFC PATCH 0/5] pgo: Add PGO support for module profile data Kees Cook
  5 siblings, 1 reply; 14+ messages in thread
From: Jarmo Tiitto @ 2021-06-12  3:24 UTC (permalink / raw)
  To: Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel
  Cc: Jarmo Tiitto, morbo

Cleanups to comments and punctuation.
Cleanup return path in pgo_module_init.

Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
---
 kernel/pgo/fs.c | 47 +++++++++++++++++++++++------------------------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
index 98b982245b58..855d5e3050fa 100644
--- a/kernel/pgo/fs.c
+++ b/kernel/pgo/fs.c
@@ -294,7 +294,7 @@ static int prf_open(struct inode *inode, struct file *file)
 	int err = -EINVAL;
 
 	if (WARN_ON(!inode->i_private)) {
-		/* bug: inode was not initialized by us */
+		/* Bug: inode was not initialized by us. */
 		return err;
 	}
 
@@ -302,7 +302,7 @@ static int prf_open(struct inode *inode, struct file *file)
 	if (!data)
 		return -ENOMEM;
 
-	/* Get prf_object of this inode */
+	/* Get prf_object of this inode. */
 	data->core = inode->i_private;
 
 	/* Get initial buffer size. */
@@ -425,17 +425,17 @@ static void pgo_module_init(struct module *mod)
 	char fname[MODULE_NAME_LEN + 9]; /* +strlen(".profraw") */
 	unsigned long flags;
 
-	/* add new prf_object entry for the module */
+	/* Add new prf_object entry for the module. */
 	po = kzalloc(sizeof(*po), GFP_KERNEL);
 	if (!po)
-		goto out;
+		return; /* -ENOMEM */
 
 	po->mod_name = mod->name;
 
 	fname[0] = 0;
 	snprintf(fname, sizeof(fname), "%s.profraw", po->mod_name);
 
-	/* setup prf_object sections */
+	/* Setup prf_object sections. */
 	po->data = mod->prf_data;
 	po->data_num = prf_get_count(mod->prf_data,
 		(char *)mod->prf_data + mod->prf_data_size, sizeof(po->data[0]));
@@ -452,20 +452,19 @@ static void pgo_module_init(struct module *mod)
 	po->vnds_num = prf_get_count(mod->prf_vnds,
 		(char *)mod->prf_vnds + mod->prf_vnds_size, sizeof(po->vnds[0]));
 
-	/* create debugfs entry */
+	/* Create debugfs entry. */
 	po->file = debugfs_create_file(fname, 0600, directory, po, &prf_fops);
 	if (!po->file) {
 		pr_err("Failed to setup module pgo: %s", fname);
 		kfree(po);
-		goto out;
-	}
 
-	/* finally enable profiling for the module */
-	flags = prf_list_lock();
-	list_add_tail_rcu(&po->link, &prf_list);
-	prf_list_unlock(flags);
+	} else {
+		/* Finally enable profiling for the module. */
+		flags = prf_list_lock();
+		list_add_tail_rcu(&po->link, &prf_list);
+		prf_list_unlock(flags);
+	}
 
-out:
 	return;
 }
 
@@ -477,33 +476,33 @@ static int pgo_module_notifier(struct notifier_block *nb, unsigned long event,
 	unsigned long flags;
 
 	if (event == MODULE_STATE_LIVE) {
-		/* does the module have profiling info? */
+		/* Does the module have profiling info? */
 		if (mod->prf_data
 			&& mod->prf_cnts
 			&& mod->prf_names
 			&& mod->prf_vnds) {
 
-			/* setup module profiling */
+			/* Setup module profiling. */
 			pgo_module_init(mod);
 		}
 	}
 
 	if (event == MODULE_STATE_GOING) {
-		/* find the prf_object from the list */
+		/* Find the prf_object from the list. */
 		rcu_read_lock();
 
 		list_for_each_entry_rcu(po, &prf_list, link) {
 			if (strcmp(po->mod_name, mod->name) == 0)
 				goto out_unlock;
 		}
-		/* no such module */
+		/* No such module. */
 		po = NULL;
 
 out_unlock:
 		rcu_read_unlock();
 
 		if (po) {
-			/* remove from profiled modules */
+			/* Remove from profiled modules. */
 			flags = prf_list_lock();
 			list_del_rcu(&po->link);
 			prf_list_unlock(flags);
@@ -511,7 +510,7 @@ static int pgo_module_notifier(struct notifier_block *nb, unsigned long event,
 			debugfs_remove(po->file);
 			po->file = NULL;
 
-			/* cleanup memory */
+			/* Cleanup memory. */
 			kfree_rcu(po, rcu);
 		}
 	}
@@ -528,7 +527,7 @@ static int __init pgo_init(void)
 {
 	unsigned long flags;
 
-	/* Init profiler vmlinux core entry */
+	/* Init profiler vmlinux core entry. */
 	memset(&prf_vmlinux, 0, sizeof(prf_vmlinux));
 	prf_vmlinux.data = __llvm_prf_data_start;
 	prf_vmlinux.data_num = prf_get_count(__llvm_prf_data_start,
@@ -546,7 +545,7 @@ static int __init pgo_init(void)
 	prf_vmlinux.vnds_num = prf_get_count(__llvm_prf_vnds_start,
 		__llvm_prf_vnds_end, sizeof(__llvm_prf_vnds_start[0]));
 
-	/* enable profiling */
+	/* Enable profiling. */
 	flags = prf_list_lock();
 	prf_vmlinux.mod_name = "vmlinux";
 	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
@@ -565,10 +564,10 @@ static int __init pgo_init(void)
 				 &prf_reset_fops))
 		goto err_remove;
 
-	/* register module notifer */
+	/* Register module notifer. */
 	register_module_notifier(&pgo_module_nb);
 
-	/* show notice why the system slower: */
+	/* Show notice why the system slower: */
 	pr_notice("Clang PGO instrumentation is active.");
 
 	return 0;
@@ -581,7 +580,7 @@ static int __init pgo_init(void)
 /* Remove debugfs entries. */
 static void __exit pgo_exit(void)
 {
-	/* unsubscribe the notifier and do cleanup. */
+	/* Unsubscribe the notifier and do cleanup. */
 	unregister_module_notifier(&pgo_module_nb);
 
 	debugfs_remove_recursive(directory);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 5/5] pgo: Cleanup code in pgo/fs.c
  2021-06-12  3:24 ` [RFC PATCH 5/5] pgo: Cleanup code in pgo/fs.c Jarmo Tiitto
@ 2021-06-14 15:50   ` Kees Cook
  0 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2021-06-14 15:50 UTC (permalink / raw)
  To: Jarmo Tiitto
  Cc: Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo

On Sat, Jun 12, 2021 at 06:24:26AM +0300, Jarmo Tiitto wrote:
> Cleanups to comments and punctuation.
> Cleanup return path in pgo_module_init.

Can you include these changes in the patches that introduce the various
comments, etc? It looks like most (all?) are from patches in this
series.

-Kees

> 
> Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
> ---
>  kernel/pgo/fs.c | 47 +++++++++++++++++++++++------------------------
>  1 file changed, 23 insertions(+), 24 deletions(-)
> 
> diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
> index 98b982245b58..855d5e3050fa 100644
> --- a/kernel/pgo/fs.c
> +++ b/kernel/pgo/fs.c
> @@ -294,7 +294,7 @@ static int prf_open(struct inode *inode, struct file *file)
>  	int err = -EINVAL;
>  
>  	if (WARN_ON(!inode->i_private)) {
> -		/* bug: inode was not initialized by us */
> +		/* Bug: inode was not initialized by us. */
>  		return err;
>  	}
>  
> @@ -302,7 +302,7 @@ static int prf_open(struct inode *inode, struct file *file)
>  	if (!data)
>  		return -ENOMEM;
>  
> -	/* Get prf_object of this inode */
> +	/* Get prf_object of this inode. */
>  	data->core = inode->i_private;
>  
>  	/* Get initial buffer size. */
> @@ -425,17 +425,17 @@ static void pgo_module_init(struct module *mod)
>  	char fname[MODULE_NAME_LEN + 9]; /* +strlen(".profraw") */
>  	unsigned long flags;
>  
> -	/* add new prf_object entry for the module */
> +	/* Add new prf_object entry for the module. */
>  	po = kzalloc(sizeof(*po), GFP_KERNEL);
>  	if (!po)
> -		goto out;
> +		return; /* -ENOMEM */
>  
>  	po->mod_name = mod->name;
>  
>  	fname[0] = 0;
>  	snprintf(fname, sizeof(fname), "%s.profraw", po->mod_name);
>  
> -	/* setup prf_object sections */
> +	/* Setup prf_object sections. */
>  	po->data = mod->prf_data;
>  	po->data_num = prf_get_count(mod->prf_data,
>  		(char *)mod->prf_data + mod->prf_data_size, sizeof(po->data[0]));
> @@ -452,20 +452,19 @@ static void pgo_module_init(struct module *mod)
>  	po->vnds_num = prf_get_count(mod->prf_vnds,
>  		(char *)mod->prf_vnds + mod->prf_vnds_size, sizeof(po->vnds[0]));
>  
> -	/* create debugfs entry */
> +	/* Create debugfs entry. */
>  	po->file = debugfs_create_file(fname, 0600, directory, po, &prf_fops);
>  	if (!po->file) {
>  		pr_err("Failed to setup module pgo: %s", fname);
>  		kfree(po);
> -		goto out;
> -	}
>  
> -	/* finally enable profiling for the module */
> -	flags = prf_list_lock();
> -	list_add_tail_rcu(&po->link, &prf_list);
> -	prf_list_unlock(flags);
> +	} else {
> +		/* Finally enable profiling for the module. */
> +		flags = prf_list_lock();
> +		list_add_tail_rcu(&po->link, &prf_list);
> +		prf_list_unlock(flags);
> +	}
>  
> -out:
>  	return;
>  }
>  
> @@ -477,33 +476,33 @@ static int pgo_module_notifier(struct notifier_block *nb, unsigned long event,
>  	unsigned long flags;
>  
>  	if (event == MODULE_STATE_LIVE) {
> -		/* does the module have profiling info? */
> +		/* Does the module have profiling info? */
>  		if (mod->prf_data
>  			&& mod->prf_cnts
>  			&& mod->prf_names
>  			&& mod->prf_vnds) {
>  
> -			/* setup module profiling */
> +			/* Setup module profiling. */
>  			pgo_module_init(mod);
>  		}
>  	}
>  
>  	if (event == MODULE_STATE_GOING) {
> -		/* find the prf_object from the list */
> +		/* Find the prf_object from the list. */
>  		rcu_read_lock();
>  
>  		list_for_each_entry_rcu(po, &prf_list, link) {
>  			if (strcmp(po->mod_name, mod->name) == 0)
>  				goto out_unlock;
>  		}
> -		/* no such module */
> +		/* No such module. */
>  		po = NULL;
>  
>  out_unlock:
>  		rcu_read_unlock();
>  
>  		if (po) {
> -			/* remove from profiled modules */
> +			/* Remove from profiled modules. */
>  			flags = prf_list_lock();
>  			list_del_rcu(&po->link);
>  			prf_list_unlock(flags);
> @@ -511,7 +510,7 @@ static int pgo_module_notifier(struct notifier_block *nb, unsigned long event,
>  			debugfs_remove(po->file);
>  			po->file = NULL;
>  
> -			/* cleanup memory */
> +			/* Cleanup memory. */
>  			kfree_rcu(po, rcu);
>  		}
>  	}
> @@ -528,7 +527,7 @@ static int __init pgo_init(void)
>  {
>  	unsigned long flags;
>  
> -	/* Init profiler vmlinux core entry */
> +	/* Init profiler vmlinux core entry. */
>  	memset(&prf_vmlinux, 0, sizeof(prf_vmlinux));
>  	prf_vmlinux.data = __llvm_prf_data_start;
>  	prf_vmlinux.data_num = prf_get_count(__llvm_prf_data_start,
> @@ -546,7 +545,7 @@ static int __init pgo_init(void)
>  	prf_vmlinux.vnds_num = prf_get_count(__llvm_prf_vnds_start,
>  		__llvm_prf_vnds_end, sizeof(__llvm_prf_vnds_start[0]));
>  
> -	/* enable profiling */
> +	/* Enable profiling. */
>  	flags = prf_list_lock();
>  	prf_vmlinux.mod_name = "vmlinux";
>  	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
> @@ -565,10 +564,10 @@ static int __init pgo_init(void)
>  				 &prf_reset_fops))
>  		goto err_remove;
>  
> -	/* register module notifer */
> +	/* Register module notifer. */
>  	register_module_notifier(&pgo_module_nb);
>  
> -	/* show notice why the system slower: */
> +	/* Show notice why the system slower: */
>  	pr_notice("Clang PGO instrumentation is active.");
>  
>  	return 0;
> @@ -581,7 +580,7 @@ static int __init pgo_init(void)
>  /* Remove debugfs entries. */
>  static void __exit pgo_exit(void)
>  {
> -	/* unsubscribe the notifier and do cleanup. */
> +	/* Unsubscribe the notifier and do cleanup. */
>  	unregister_module_notifier(&pgo_module_nb);
>  
>  	debugfs_remove_recursive(directory);
> -- 
> 2.32.0
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/5] pgo: Wire up the new more generic code for modules
  2021-06-12  3:24 ` [RFC PATCH 3/5] pgo: Wire up the new more generic code for modules Jarmo Tiitto
@ 2021-06-14 15:55   ` Kees Cook
  2021-06-14 19:08     ` jarmo.tiitto
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2021-06-14 15:55 UTC (permalink / raw)
  To: Jarmo Tiitto
  Cc: Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo

On Sat, Jun 12, 2021 at 06:24:24AM +0300, Jarmo Tiitto wrote:
> prf_open() now uses the inode->i_private to get
> the prf_object for the file. This can be either
> vmlinux.profraw or any module.profraw file.
> 
> The prf_vmlinux object is now added into prf_list and
> allocate_node() scans the list and reserves vnodes
> from corresponding prf_object(s).
> 
> Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
> ---
> note: There is no module notifier code yet,
> so only vmlinux.profraw profile data
> is available with this commit.
> 
> Another thing is that pgo/reset will only
> reset vmlinux.profraw.
> Profile data reset for modules may be added later:
> maybe writing module's name into pgo/reset would reset only
> the specified module's profile data?
> Then writing "all" or zero would atomically reset everything.

Yeah, I think matching the internal naming is right. "vmlinux",
module::name, and "all"?

> I'm bit unsure about the new allocate_node() code since
> it is the first place I had to put rcu_read_lock()
> and the code is likely to change from this.

Comments below...

> ---
>  kernel/pgo/fs.c         | 30 ++++++++++++++++++++-----
>  kernel/pgo/instrument.c | 49 +++++++++++++++++++++++++++--------------
>  kernel/pgo/pgo.h        |  2 ++
>  3 files changed, 60 insertions(+), 21 deletions(-)
> 
> diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
> index 7e269d69bcd7..84b36e61758b 100644
> --- a/kernel/pgo/fs.c
> +++ b/kernel/pgo/fs.c
> @@ -32,8 +32,10 @@ static struct dentry *directory;
>  struct prf_private_data {
>  	void *buffer;
>  	size_t size;
> +	struct prf_object *core;
>  };
>  
> +/* vmlinux's prf core */
>  static struct prf_object prf_vmlinux;
>  
>  /*
> @@ -281,7 +283,6 @@ static int prf_serialize(struct prf_object *po, struct prf_private_data *p, size
>  	prf_serialize_values(po, &buffer);
>  
>  	return 0;
> -
>  }
>  
>  /* open() implementation for PGO. Creates a copy of the profiling data set. */
> @@ -292,13 +293,21 @@ static int prf_open(struct inode *inode, struct file *file)
>  	size_t buf_size;
>  	int err = -EINVAL;
>  
> +	if (WARN_ON(!inode->i_private)) {
> +		/* bug: inode was not initialized by us */
> +		return err;
> +	}
> +
>  	data = kzalloc(sizeof(*data), GFP_KERNEL);
>  	if (!data)
>  		return -ENOMEM;
>  
> +	/* Get prf_object of this inode */
> +	data->core = inode->i_private;
> +
>  	/* Get initial buffer size. */
>  	flags = prf_lock();
> -	data->size = prf_buffer_size(&prf_vmlinux);
> +	data->size = prf_buffer_size(data->core);
>  	prf_unlock(flags);
>  
>  	do {
> @@ -318,12 +327,13 @@ static int prf_open(struct inode *inode, struct file *file)
>  		 * data length in data->size.
>  		 */
>  		flags = prf_lock();
> -		err = prf_serialize(&prf_vmlinux, data, buf_size);
> +		err = prf_serialize(data->core, data, buf_size);
>  		prf_unlock(flags);
>  		/* In unlikely case, try again. */
>  	} while (err == -EAGAIN);
>  
>  	if (err < 0) {
> +
>  		if (data)
>  			vfree(data->buffer);
>  		kfree(data);
> @@ -412,6 +422,8 @@ static const struct file_operations prf_reset_fops = {
>  /* Create debugfs entries. */
>  static int __init pgo_init(void)
>  {
> +	unsigned long flags;
> +
>  	/* Init profiler vmlinux core entry */
>  	memset(&prf_vmlinux, 0, sizeof(prf_vmlinux));
>  	prf_vmlinux.data = __llvm_prf_data_start;
> @@ -430,19 +442,27 @@ static int __init pgo_init(void)
>  	prf_vmlinux.vnds_num = prf_get_count(__llvm_prf_vnds_start,
>  		__llvm_prf_vnds_end, sizeof(__llvm_prf_vnds_start[0]));
>  
> +	/* enable profiling */
> +	flags = prf_list_lock();
> +	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
> +	prf_list_unlock(flags);
>  
>  	directory = debugfs_create_dir("pgo", NULL);
>  	if (!directory)
>  		goto err_remove;
>  
> -	if (!debugfs_create_file("vmlinux.profraw", 0600, directory, NULL,
> -				 &prf_fops))
> +	prf_vmlinux.file = debugfs_create_file("vmlinux.profraw",
> +		0600, directory, &prf_vmlinux, &prf_fops);
> +	if (!prf_vmlinux.file)
>  		goto err_remove;
>  
>  	if (!debugfs_create_file("reset", 0200, directory, NULL,
>  				 &prf_reset_fops))
>  		goto err_remove;
>  
> +	/* show notice why the system slower: */
> +	pr_notice("Clang PGO instrumentation is active.");
> +

Please pull this change into a separate patch and make it pr_info()
("notice" is, I think, not right here).

>  	return 0;
>  
>  err_remove:
> diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
> index 24fdeb79b674..e214c9d7a113 100644
> --- a/kernel/pgo/instrument.c
> +++ b/kernel/pgo/instrument.c
> @@ -24,6 +24,7 @@
>  #include <linux/export.h>
>  #include <linux/spinlock.h>
>  #include <linux/types.h>
> +#include <linux/rculist.h>
>  #include "pgo.h"
>  
>  /*
> @@ -56,22 +57,38 @@ void prf_unlock(unsigned long flags)
>  static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
>  						 u32 index, u64 value)
>  {
> -	const int max_vnds = prf_get_count(__llvm_prf_vnds_start,
> -		__llvm_prf_vnds_end, sizeof(struct llvm_prf_value_node));
> -
> -	/*
> -	 * Check that p is within vmlinux __llvm_prf_data section.
> -	 * If not, don't allocate since we can't handle modules yet.
> -	 */
> -	if (!memory_contains(__llvm_prf_data_start,
> -		__llvm_prf_data_end, p, sizeof(*p)))
> -		return NULL;
> -
> -	if (WARN_ON_ONCE(current_node >= max_vnds))
> -		return NULL; /* Out of nodes */
> -
> -	/* reserve vnode for vmlinux */
> -	return &__llvm_prf_vnds_start[current_node++];
> +	struct llvm_prf_value_node *vnode = NULL;
> +	struct prf_object *po;
> +	struct llvm_prf_data *data_end;
> +	int max_vnds;
> +
> +	rcu_read_lock();

AIUI, list readers are using rcu_read_lock(), and writers are using
prf_list_lock()?

> +
> +	list_for_each_entry_rcu(po, &prf_list, link) {
> +		/* get section limits */
> +		max_vnds = prf_vnds_count(po);
> +		data_end = po->data + prf_data_count(po);
> +
> +		/*
> +		 * Check that p is within:
> +		 * [po->data, po->data + prf_data_count(po)] section.
> +		 * If yes, allocate vnode from this prf_object.
> +		 */
> +		if (memory_contains(po->data, data_end, p, sizeof(*p))) {
> +
> +
> +			if (WARN_ON_ONCE(po->current_node >= max_vnds))
> +				return NULL; /* Out of nodes */
> +
> +			/* reserve the vnode */
> +			vnode = &po->vnds[po->current_node++];
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	rcu_read_unlock();
> +	return vnode;
>  }
>  
>  /*
> diff --git a/kernel/pgo/pgo.h b/kernel/pgo/pgo.h
> index 44d79e2861e1..59d0aa966fbe 100644
> --- a/kernel/pgo/pgo.h
> +++ b/kernel/pgo/pgo.h
> @@ -19,6 +19,8 @@
>  #ifndef _PGO_H
>  #define _PGO_H
>  
> +#include <linux/rculist.h>
> +
>  /*
>   * Note: These internal LLVM definitions must match the compiler version.
>   * See llvm/include/llvm/ProfileData/InstrProfData.inc in LLVM's source code.
> -- 
> 2.32.0
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 4/5] pgo: Add module notifier machinery
  2021-06-12  3:24 ` [RFC PATCH 4/5] pgo: Add module notifier machinery Jarmo Tiitto
@ 2021-06-14 16:00   ` Kees Cook
  2021-06-14 18:31     ` jarmo.tiitto
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2021-06-14 16:00 UTC (permalink / raw)
  To: Jarmo Tiitto
  Cc: Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo

On Sat, Jun 12, 2021 at 06:24:25AM +0300, Jarmo Tiitto wrote:
> Add module notifier callback and register modules
> into prf_list.
> 
> For each module that has __llvm_prf_{data,cnts,names,vnds} sections
> The callback creates debugfs <module>.profraw entry and
> links new prf_object into prf_list.
> 
> This enables profiling for all loaded modules.
> 
> * Moved rcu_read_lock() outside of allocate_node() in order
>   to protect __llvm_profile_instrument_target() from module unload(s)
> 
> Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
> ---
> v2: Passed QEMU SMP boot test into minimal Arch Linux distro,
> module loading and unloading work without warnings.
> Module profile data looks ok. :-)
> ---
>  kernel/pgo/fs.c         | 111 ++++++++++++++++++++++++++++++++++++++++
>  kernel/pgo/instrument.c |  19 ++++---
>  2 files changed, 122 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
> index 84b36e61758b..98b982245b58 100644
> --- a/kernel/pgo/fs.c
> +++ b/kernel/pgo/fs.c
> @@ -419,6 +419,110 @@ static const struct file_operations prf_reset_fops = {
>  	.llseek		= noop_llseek,
>  };
>  
> +static void pgo_module_init(struct module *mod)
> +{
> +	struct prf_object *po;
> +	char fname[MODULE_NAME_LEN + 9]; /* +strlen(".profraw") */
> +	unsigned long flags;
> +
> +	/* add new prf_object entry for the module */
> +	po = kzalloc(sizeof(*po), GFP_KERNEL);
> +	if (!po)
> +		goto out;
> +
> +	po->mod_name = mod->name;
> +
> +	fname[0] = 0;
> +	snprintf(fname, sizeof(fname), "%s.profraw", po->mod_name);
> +
> +	/* setup prf_object sections */
> +	po->data = mod->prf_data;
> +	po->data_num = prf_get_count(mod->prf_data,
> +		(char *)mod->prf_data + mod->prf_data_size, sizeof(po->data[0]));
> +
> +	po->cnts = mod->prf_cnts;
> +	po->cnts_num = prf_get_count(mod->prf_cnts,
> +		(char *)mod->prf_cnts + mod->prf_cnts_size, sizeof(po->cnts[0]));
> +
> +	po->names = mod->prf_names;
> +	po->names_num = prf_get_count(mod->prf_names,
> +		(char *)mod->prf_names + mod->prf_names_size, sizeof(po->names[0]));
> +
> +	po->vnds = mod->prf_vnds;
> +	po->vnds_num = prf_get_count(mod->prf_vnds,
> +		(char *)mod->prf_vnds + mod->prf_vnds_size, sizeof(po->vnds[0]));
> +
> +	/* create debugfs entry */
> +	po->file = debugfs_create_file(fname, 0600, directory, po, &prf_fops);
> +	if (!po->file) {
> +		pr_err("Failed to setup module pgo: %s", fname);
> +		kfree(po);
> +		goto out;
> +	}
> +
> +	/* finally enable profiling for the module */
> +	flags = prf_list_lock();
> +	list_add_tail_rcu(&po->link, &prf_list);
> +	prf_list_unlock(flags);
> +
> +out:
> +	return;
> +}
> +
> +static int pgo_module_notifier(struct notifier_block *nb, unsigned long event,
> +				void *pdata)
> +{
> +	struct module *mod = pdata;
> +	struct prf_object *po;
> +	unsigned long flags;
> +
> +	if (event == MODULE_STATE_LIVE) {
> +		/* does the module have profiling info? */
> +		if (mod->prf_data
> +			&& mod->prf_cnts
> +			&& mod->prf_names
> +			&& mod->prf_vnds) {
> +
> +			/* setup module profiling */
> +			pgo_module_init(mod);
> +		}
> +	}
> +
> +	if (event == MODULE_STATE_GOING) {
> +		/* find the prf_object from the list */
> +		rcu_read_lock();
> +
> +		list_for_each_entry_rcu(po, &prf_list, link) {
> +			if (strcmp(po->mod_name, mod->name) == 0)
> +				goto out_unlock;
> +		}
> +		/* no such module */
> +		po = NULL;
> +
> +out_unlock:
> +		rcu_read_unlock();

Is it correct to do the unlock before the possible list_del_rcu()?

> +
> +		if (po) {
> +			/* remove from profiled modules */
> +			flags = prf_list_lock();
> +			list_del_rcu(&po->link);
> +			prf_list_unlock(flags);
> +
> +			debugfs_remove(po->file);
> +			po->file = NULL;
> +
> +			/* cleanup memory */
> +			kfree_rcu(po, rcu);
> +		}

Though I thought module load/unload was already under a global lock, so
maybe a race isn't possible here.

For the next version of this series, please Cc the module subsystem
maintainer as well:
	Jessica Yu <jeyu@kernel.org>

> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
> +static struct notifier_block pgo_module_nb = {
> +	.notifier_call = pgo_module_notifier
> +};
> +
>  /* Create debugfs entries. */
>  static int __init pgo_init(void)
>  {
> @@ -444,6 +548,7 @@ static int __init pgo_init(void)
>  
>  	/* enable profiling */
>  	flags = prf_list_lock();
> +	prf_vmlinux.mod_name = "vmlinux";
>  	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
>  	prf_list_unlock(flags);
>  
> @@ -460,6 +565,9 @@ static int __init pgo_init(void)
>  				 &prf_reset_fops))
>  		goto err_remove;
>  
> +	/* register module notifer */
> +	register_module_notifier(&pgo_module_nb);
> +
>  	/* show notice why the system slower: */
>  	pr_notice("Clang PGO instrumentation is active.");
>  
> @@ -473,6 +581,9 @@ static int __init pgo_init(void)
>  /* Remove debugfs entries. */
>  static void __exit pgo_exit(void)
>  {
> +	/* unsubscribe the notifier and do cleanup. */
> +	unregister_module_notifier(&pgo_module_nb);
> +
>  	debugfs_remove_recursive(directory);
>  }
>  
> diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
> index e214c9d7a113..70bab7e4c153 100644
> --- a/kernel/pgo/instrument.c
> +++ b/kernel/pgo/instrument.c
> @@ -33,7 +33,6 @@
>   * ensures that we don't try to serialize data that's only partially updated.
>   */
>  static DEFINE_SPINLOCK(pgo_lock);
> -static int current_node;
>  
>  unsigned long prf_lock(void)
>  {
> @@ -62,8 +61,6 @@ static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
>  	struct llvm_prf_data *data_end;
>  	int max_vnds;
>  
> -	rcu_read_lock();
> -

Please move these rcu locks change into the patch that introduces them
to avoid adding those changes here.

>  	list_for_each_entry_rcu(po, &prf_list, link) {
>  		/* get section limits */
>  		max_vnds = prf_vnds_count(po);
> @@ -76,7 +73,6 @@ static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
>  		 */
>  		if (memory_contains(po->data, data_end, p, sizeof(*p))) {
>  
> -
>  			if (WARN_ON_ONCE(po->current_node >= max_vnds))
>  				return NULL; /* Out of nodes */
>  
> @@ -87,7 +83,6 @@ static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
>  	}
>  
>  out:
> -	rcu_read_unlock();
>  	return vnode;
>  }
>  
> @@ -108,8 +103,14 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
>  	u8 values = 0;
>  	unsigned long flags;
>  
> +	/*
> +	 * lock the underlying prf_object(s) in place,
> +	 * so they won't vanish while we are operating on it.
> +	 */
> +	rcu_read_lock();
> +
>  	if (!p || !p->values)
> -		return;
> +		goto rcu_unlock;
>  
>  	counters = (struct llvm_prf_value_node **)p->values;
>  	curr = counters[index];
> @@ -117,7 +118,7 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
>  	while (curr) {
>  		if (target_value == curr->value) {
>  			curr->count++;
> -			return;
> +			goto rcu_unlock;
>  		}
>  
>  		if (curr->count < min_count) {
> @@ -136,7 +137,7 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
>  			curr->value = target_value;
>  			curr->count++;
>  		}
> -		return;
> +		goto rcu_unlock;
>  	}
>  
>  	/* Lock when updating the value node structure. */
> @@ -156,6 +157,8 @@ void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
>  
>  out:
>  	prf_unlock(flags);
> +rcu_unlock:
> +	rcu_read_unlock();
>  }
>  EXPORT_SYMBOL(__llvm_profile_instrument_target);
>  
> -- 
> 2.32.0
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/5] pgo: Add PGO support for module profile data
  2021-06-12  3:24 [RFC PATCH 0/5] pgo: Add PGO support for module profile data Jarmo Tiitto
                   ` (4 preceding siblings ...)
  2021-06-12  3:24 ` [RFC PATCH 5/5] pgo: Cleanup code in pgo/fs.c Jarmo Tiitto
@ 2021-06-14 16:05 ` Kees Cook
  2021-06-14 21:57   ` Kees Cook
  5 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2021-06-14 16:05 UTC (permalink / raw)
  To: Jarmo Tiitto
  Cc: Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo

[oops, I failed to CC correctly -- resending]

On Sat, Jun 12, 2021 at 06:24:21AM +0300, Jarmo Tiitto wrote:
> This patch series intends to extend the current Clang PGO code to
> support profile data from modules. Note that current PGO can and *does* 
> instrument all kernel code, including modules, but this profile data
> is inaccessible.
> 
> This patch series adds pgo/<module>.profraw files from what
> per loaded module profile data can be read.
> 
> The final profile can be generated by merging all these profile files:
> llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw ...
> and then building the optimized kernel.
> 
> This v2 patch series is still an bit of RFC so I'd like feedback how
> to do things better still.

This looks pretty good; thank you! I sent some notes, which are mostly
just clean-ups and patch hunk moves.

> The patches itself are based on Kees/for-next/clang/features tree
> where I have two of my bug fix patches already in. :-)
> 
> I have done some initial testing:
>  * Booted the instrumented kernel on qemu *and* bare hardware.
>  * Module un/loading via test_module in QEMU.
>  * Built optimized kernel using the new profile data.

If you haven't already, can you also test this with lock testing
enabled? i.e. these configs:

# Detect potential deadlocks.
CONFIG_PROVE_LOCKING=y
# Detect sleep-while-atomic.
CONFIG_DEBUG_ATOMIC_SLEEP=y

Thanks!

-Kees

> 
> Jarmo Tiitto (5):
>   pgo: Expose module sections for clang PGO instumentation.
>   pgo: Make serializing functions to take prf_object
>   pgo: Wire up the new more generic code for modules
>   pgo: Add module notifier machinery
>   pgo: Cleanup code in pgo/fs.c
> 
>  include/linux/module.h  |  15 +++
>  kernel/Makefile         |   6 +
>  kernel/module.c         |   7 ++
>  kernel/pgo/fs.c         | 241 ++++++++++++++++++++++++++++++++++------
>  kernel/pgo/instrument.c |  57 +++++++---
>  kernel/pgo/pgo.h        |  85 ++++++++++----
>  6 files changed, 342 insertions(+), 69 deletions(-)
> 
> 
> base-commit: 0039303120c0065f3952698597e0c9916b76ebd5
> -- 
> 2.32.0
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 4/5] pgo: Add module notifier machinery
  2021-06-14 16:00   ` Kees Cook
@ 2021-06-14 18:31     ` jarmo.tiitto
  0 siblings, 0 replies; 14+ messages in thread
From: jarmo.tiitto @ 2021-06-14 18:31 UTC (permalink / raw)
  To: Jarmo Tiitto, Kees Cook
  Cc: Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo, jeyu

Kees Cook wrote maanantaina 14. kesäkuuta 2021 19.00.34 EEST:
> On Sat, Jun 12, 2021 at 06:24:25AM +0300, Jarmo Tiitto wrote:
> > Add module notifier callback and register modules
> > into prf_list.
> > 
> > For each module that has __llvm_prf_{data,cnts,names,vnds} sections
> > The callback creates debugfs <module>.profraw entry and
> > links new prf_object into prf_list.
> > 
> > This enables profiling for all loaded modules.
> > 
> > * Moved rcu_read_lock() outside of allocate_node() in order
> > 
> >   to protect __llvm_profile_instrument_target() from module unload(s)
> > 
> > Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
> > ---
> > v2: Passed QEMU SMP boot test into minimal Arch Linux distro,
> > module loading and unloading work without warnings.
> > Module profile data looks ok. :-)
> > ---
> > 
> >  kernel/pgo/fs.c         | 111 ++++++++++++++++++++++++++++++++++++++++
> >  kernel/pgo/instrument.c |  19 ++++---
> >  2 files changed, 122 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
> > index 84b36e61758b..98b982245b58 100644
> > --- a/kernel/pgo/fs.c
> > +++ b/kernel/pgo/fs.c
> > @@ -419,6 +419,110 @@ static const struct file_operations prf_reset_fops = 
{
> > 
> >  	.llseek		= noop_llseek,
> >  
> >  };
> > 
> > +static void pgo_module_init(struct module *mod)
> > +{
> > +	struct prf_object *po;
> > +	char fname[MODULE_NAME_LEN + 9]; /* +strlen(".profraw") */
> > +	unsigned long flags;
> > +
> > +	/* add new prf_object entry for the module */
> > +	po = kzalloc(sizeof(*po), GFP_KERNEL);
> > +	if (!po)
> > +		goto out;
> > +
> > +	po->mod_name = mod->name;
> > +
> > +	fname[0] = 0;
> > +	snprintf(fname, sizeof(fname), "%s.profraw", po->mod_name);
> > +
> > +	/* setup prf_object sections */
> > +	po->data = mod->prf_data;
> > +	po->data_num = prf_get_count(mod->prf_data,
> > +		(char *)mod->prf_data + mod->prf_data_size, 
sizeof(po->data[0]));
> > +
> > +	po->cnts = mod->prf_cnts;
> > +	po->cnts_num = prf_get_count(mod->prf_cnts,
> > +		(char *)mod->prf_cnts + mod->prf_cnts_size, 
sizeof(po->cnts[0]));
> > +
> > +	po->names = mod->prf_names;
> > +	po->names_num = prf_get_count(mod->prf_names,
> > +		(char *)mod->prf_names + mod->prf_names_size, 
sizeof(po->names[0]));
> > +
> > +	po->vnds = mod->prf_vnds;
> > +	po->vnds_num = prf_get_count(mod->prf_vnds,
> > +		(char *)mod->prf_vnds + mod->prf_vnds_size, 
sizeof(po->vnds[0]));
> > +
> > +	/* create debugfs entry */
> > +	po->file = debugfs_create_file(fname, 0600, directory, po, 
&prf_fops);
> > +	if (!po->file) {
> > +		pr_err("Failed to setup module pgo: %s", fname);
> > +		kfree(po);
> > +		goto out;
> > +	}
> > +
> > +	/* finally enable profiling for the module */
> > +	flags = prf_list_lock();
> > +	list_add_tail_rcu(&po->link, &prf_list);
> > +	prf_list_unlock(flags);
> > +
> > +out:
> > +	return;
> > +}
> > +
> > +static int pgo_module_notifier(struct notifier_block *nb, unsigned long
> > event, +				void *pdata)
> > +{
> > +	struct module *mod = pdata;
> > +	struct prf_object *po;
> > +	unsigned long flags;
> > +
> > +	if (event == MODULE_STATE_LIVE) {
> > +		/* does the module have profiling info? */
> > +		if (mod->prf_data
> > +			&& mod->prf_cnts
> > +			&& mod->prf_names
> > +			&& mod->prf_vnds) {
> > +
> > +			/* setup module profiling */
> > +			pgo_module_init(mod);
> > +		}
> > +	}
> > +
> > +	if (event == MODULE_STATE_GOING) {
> > +		/* find the prf_object from the list */
> > +		rcu_read_lock();
> > +
> > +		list_for_each_entry_rcu(po, &prf_list, link) {
> > +			if (strcmp(po->mod_name, mod->name) == 
0)
> > +				goto out_unlock;
> > +		}
> > +		/* no such module */
> > +		po = NULL;
> > +
> > +out_unlock:
> > +		rcu_read_unlock();
> 
> Is it correct to do the unlock before the possible list_del_rcu()?
> 

Oh, list_del_rcu() should then propably go before unlocking. I'll fix that.

> > +
> > +		if (po) {
> > +			/* remove from profiled modules */
> > +			flags = prf_list_lock();
> > +			list_del_rcu(&po->link);
> > +			prf_list_unlock(flags);
> > +
> > +			debugfs_remove(po->file);
> > +			po->file = NULL;
> > +
> > +			/* cleanup memory */
> > +			kfree_rcu(po, rcu);
> > +		}
> 
> Though I thought module load/unload was already under a global lock, so
> maybe a race isn't possible here.
> 

I searched a bit and found out that module.c/module_mutex is not held during
the module notifier MODULE_STATE_GOING callbacks.
But the callback it only invoked only once per module un/load so I think it is 
okay.
If I remember correctly, the main reason I moved the tear down code after 
rcu_read_unlock() was that debugfs_remove() may sleep.


> For the next version of this series, please Cc the module subsystem
> maintainer as well:
> 	Jessica Yu <jeyu@kernel.org>
> 

OK, after all there is a lot of pointers to the kernel's module subsys.

> > +	}
> > +
> > +	return NOTIFY_OK;
> > +}
> > +
> > +static struct notifier_block pgo_module_nb = {
> > +	.notifier_call = pgo_module_notifier
> > +};
> > +
> > 
> >  /* Create debugfs entries. */
> >  static int __init pgo_init(void)
> >  {
> > 
> > @@ -444,6 +548,7 @@ static int __init pgo_init(void)
> > 
> >  	/* enable profiling */
> >  	flags = prf_list_lock();
> > 
> > +	prf_vmlinux.mod_name = "vmlinux";
> > 
> >  	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
> >  	prf_list_unlock(flags);
> > 
> > @@ -460,6 +565,9 @@ static int __init pgo_init(void)
> > 
> >  				 &prf_reset_fops))
> >  		
> >  		goto err_remove;
> > 
> > +	/* register module notifer */
> > +	register_module_notifier(&pgo_module_nb);
> > +
> > 
> >  	/* show notice why the system slower: */
> >  	pr_notice("Clang PGO instrumentation is active.");
> > 
> > @@ -473,6 +581,9 @@ static int __init pgo_init(void)
> > 
> >  /* Remove debugfs entries. */
> >  static void __exit pgo_exit(void)
> >  {
> > 
> > +	/* unsubscribe the notifier and do cleanup. */
> > +	unregister_module_notifier(&pgo_module_nb);
> > +
> > 
> >  	debugfs_remove_recursive(directory);
> >  
> >  }
> > 
> > diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
> > index e214c9d7a113..70bab7e4c153 100644
> > --- a/kernel/pgo/instrument.c
> > +++ b/kernel/pgo/instrument.c
> > @@ -33,7 +33,6 @@
> > 
> >   * ensures that we don't try to serialize data that's only partially
> >   updated.
> >   */
> >  
> >  static DEFINE_SPINLOCK(pgo_lock);
> > 
> > -static int current_node;
> > 
> >  unsigned long prf_lock(void)
> >  {
> > 
> > @@ -62,8 +61,6 @@ static struct llvm_prf_value_node *allocate_node(struct
> > llvm_prf_data *p,> 
> >  	struct llvm_prf_data *data_end;
> >  	int max_vnds;
> > 
> > -	rcu_read_lock();
> > -
> 
> Please move these rcu locks change into the patch that introduces them
> to avoid adding those changes here.
> 
> >  	list_for_each_entry_rcu(po, &prf_list, link) {
> >  	
> >  		/* get section limits */
> >  		max_vnds = prf_vnds_count(po);
> > 
> > @@ -76,7 +73,6 @@ static struct llvm_prf_value_node *allocate_node(struct
> > llvm_prf_data *p,> 
> >  		 */
> >  		
> >  		if (memory_contains(po->data, data_end, p, 
sizeof(*p))) {
> > 
> > -
> > 
> >  			if (WARN_ON_ONCE(po->current_node >= 
max_vnds))
> >  			
> >  				return NULL; /* Out of 
nodes */
> > 
> > @@ -87,7 +83,6 @@ static struct llvm_prf_value_node *allocate_node(struct
> > llvm_prf_data *p,> 
> >  	}
> >  
> >  out:
> > -	rcu_read_unlock();
> > 
> >  	return vnode;
> >  
> >  }
> > 
> > @@ -108,8 +103,14 @@ void __llvm_profile_instrument_target(u64 
target_value,
> > void *data, u32 index)> 
> >  	u8 values = 0;
> >  	unsigned long flags;
> > 
> > +	/*
> > +	 * lock the underlying prf_object(s) in place,
> > +	 * so they won't vanish while we are operating on it.
> > +	 */
> > +	rcu_read_lock();
> > +
> > 
> >  	if (!p || !p->values)
> > 
> > -		return;
> > +		goto rcu_unlock;
> > 
> >  	counters = (struct llvm_prf_value_node **)p->values;
> >  	curr = counters[index];
> > 
> > @@ -117,7 +118,7 @@ void __llvm_profile_instrument_target(u64 target_value,
> > void *data, u32 index)> 
> >  	while (curr) {
> >  	
> >  		if (target_value == curr->value) {
> >  		
> >  			curr->count++;
> > 
> > -			return;
> > +			goto rcu_unlock;
> > 
> >  		}
> >  		
> >  		if (curr->count < min_count) {
> > 
> > @@ -136,7 +137,7 @@ void __llvm_profile_instrument_target(u64 target_value,
> > void *data, u32 index)> 
> >  			curr->value = target_value;
> >  			curr->count++;
> >  		
> >  		}
> > 
> > -		return;
> > +		goto rcu_unlock;
> > 
> >  	}
> >  	
> >  	/* Lock when updating the value node structure. */
> > 
> > @@ -156,6 +157,8 @@ void __llvm_profile_instrument_target(u64 target_value,
> > void *data, u32 index)> 
> >  out:
> >  	prf_unlock(flags);
> > 
> > +rcu_unlock:
> > +	rcu_read_unlock();
> > 
> >  }
> >  EXPORT_SYMBOL(__llvm_profile_instrument_target);
> > 
> > --
> > 2.32.0
> 
> --
> Kees Cook





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 3/5] pgo: Wire up the new more generic code for modules
  2021-06-14 15:55   ` Kees Cook
@ 2021-06-14 19:08     ` jarmo.tiitto
  0 siblings, 0 replies; 14+ messages in thread
From: jarmo.tiitto @ 2021-06-14 19:08 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo

Kees Cook wrote maanantaina 14. kesäkuuta 2021 18.55.23 EEST:
> On Sat, Jun 12, 2021 at 06:24:24AM +0300, Jarmo Tiitto wrote:
> > prf_open() now uses the inode->i_private to get
> > the prf_object for the file. This can be either
> > vmlinux.profraw or any module.profraw file.
> > 
> > The prf_vmlinux object is now added into prf_list and
> > allocate_node() scans the list and reserves vnodes
> > from corresponding prf_object(s).
> > 
> > Signed-off-by: Jarmo Tiitto <jarmo.tiitto@gmail.com>
> > ---
> > note: There is no module notifier code yet,
> > so only vmlinux.profraw profile data
> > is available with this commit.
> > 
> > Another thing is that pgo/reset will only
> > reset vmlinux.profraw.
> > Profile data reset for modules may be added later:
> > maybe writing module's name into pgo/reset would reset only
> > the specified module's profile data?
> > Then writing "all" or zero would atomically reset everything.
> 
> Yeah, I think matching the internal naming is right. "vmlinux",
> module::name, and "all"?
> 
> > I'm bit unsure about the new allocate_node() code since
> > it is the first place I had to put rcu_read_lock()
> > and the code is likely to change from this.
> 
> Comments below...
> 
> > ---
> > 
> >  kernel/pgo/fs.c         | 30 ++++++++++++++++++++-----
> >  kernel/pgo/instrument.c | 49 +++++++++++++++++++++++++++--------------
> >  kernel/pgo/pgo.h        |  2 ++
> >  3 files changed, 60 insertions(+), 21 deletions(-)
> > 
> > diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c
> > index 7e269d69bcd7..84b36e61758b 100644
> > --- a/kernel/pgo/fs.c
> > +++ b/kernel/pgo/fs.c
> > @@ -32,8 +32,10 @@ static struct dentry *directory;
> > 
> >  struct prf_private_data {
> >  
> >  	void *buffer;
> >  	size_t size;
> > 
> > +	struct prf_object *core;
> > 
> >  };
> > 
> > +/* vmlinux's prf core */
> > 
> >  static struct prf_object prf_vmlinux;
> >  
> >  /*
> > 
> > @@ -281,7 +283,6 @@ static int prf_serialize(struct prf_object *po, struct
> > prf_private_data *p, size> 
> >  	prf_serialize_values(po, &buffer);
> >  	
> >  	return 0;
> > 
> > -
> > 
> >  }
> >  
> >  /* open() implementation for PGO. Creates a copy of the profiling data 
set.
> >  */> 
> > @@ -292,13 +293,21 @@ static int prf_open(struct inode *inode, struct file
> > *file)> 
> >  	size_t buf_size;
> >  	int err = -EINVAL;
> > 
> > +	if (WARN_ON(!inode->i_private)) {
> > +		/* bug: inode was not initialized by us */
> > +		return err;
> > +	}
> > +
> > 
> >  	data = kzalloc(sizeof(*data), GFP_KERNEL);
> >  	if (!data)
> >  	
> >  		return -ENOMEM;
> > 
> > +	/* Get prf_object of this inode */
> > +	data->core = inode->i_private;
> > +
> > 
> >  	/* Get initial buffer size. */
> >  	flags = prf_lock();
> > 
> > -	data->size = prf_buffer_size(&prf_vmlinux);
> > +	data->size = prf_buffer_size(data->core);
> > 
> >  	prf_unlock(flags);
> >  	
> >  	do {
> > 
> > @@ -318,12 +327,13 @@ static int prf_open(struct inode *inode, struct file
> > *file)> 
> >  		 * data length in data->size.
> >  		 */
> >  		
> >  		flags = prf_lock();
> > 
> > -		err = prf_serialize(&prf_vmlinux, data, buf_size);
> > +		err = prf_serialize(data->core, data, buf_size);
> > 
> >  		prf_unlock(flags);
> >  		/* In unlikely case, try again. */
> >  	
> >  	} while (err == -EAGAIN);
> >  	
> >  	if (err < 0) {
> > 
> > +
> > 
> >  		if (data)
> >  		
> >  			vfree(data->buffer);
> >  		
> >  		kfree(data);
> > 
> > @@ -412,6 +422,8 @@ static const struct file_operations prf_reset_fops = {
> > 
> >  /* Create debugfs entries. */
> >  static int __init pgo_init(void)
> >  {
> > 
> > +	unsigned long flags;
> > +
> > 
> >  	/* Init profiler vmlinux core entry */
> >  	memset(&prf_vmlinux, 0, sizeof(prf_vmlinux));
> >  	prf_vmlinux.data = __llvm_prf_data_start;
> > 
> > @@ -430,19 +442,27 @@ static int __init pgo_init(void)
> > 
> >  	prf_vmlinux.vnds_num = prf_get_count(__llvm_prf_vnds_start,
> >  	
> >  		__llvm_prf_vnds_end, 
sizeof(__llvm_prf_vnds_start[0]));
> > 
> > +	/* enable profiling */
> > +	flags = prf_list_lock();
> > +	list_add_tail_rcu(&prf_vmlinux.link, &prf_list);
> > +	prf_list_unlock(flags);
> > 
> >  	directory = debugfs_create_dir("pgo", NULL);
> >  	if (!directory)
> >  	
> >  		goto err_remove;
> > 
> > -	if (!debugfs_create_file("vmlinux.profraw", 0600, directory, NULL,
> > -				 &prf_fops))
> > +	prf_vmlinux.file = debugfs_create_file("vmlinux.profraw",
> > +		0600, directory, &prf_vmlinux, &prf_fops);
> > +	if (!prf_vmlinux.file)
> > 
> >  		goto err_remove;
> >  	
> >  	if (!debugfs_create_file("reset", 0200, directory, NULL,
> >  	
> >  				 &prf_reset_fops))
> >  		
> >  		goto err_remove;
> > 
> > +	/* show notice why the system slower: */
> > +	pr_notice("Clang PGO instrumentation is active.");
> > +
> 
> Please pull this change into a separate patch and make it pr_info()
> ("notice" is, I think, not right here).
> 

All rightly then.

> >  	return 0;
> >  
> >  err_remove:
> > diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
> > index 24fdeb79b674..e214c9d7a113 100644
> > --- a/kernel/pgo/instrument.c
> > +++ b/kernel/pgo/instrument.c
> > @@ -24,6 +24,7 @@
> > 
> >  #include <linux/export.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/types.h>
> > 
> > +#include <linux/rculist.h>
> > 
> >  #include "pgo.h"
> >  
> >  /*
> > 
> > @@ -56,22 +57,38 @@ void prf_unlock(unsigned long flags)
> > 
> >  static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
> >  
> >  						 
u32 index, u64 value)
> >  
> >  {
> > 
> > -	const int max_vnds = prf_get_count(__llvm_prf_vnds_start,
> > -		__llvm_prf_vnds_end, sizeof(struct 
llvm_prf_value_node));
> > -
> > -	/*
> > -	 * Check that p is within vmlinux __llvm_prf_data section.
> > -	 * If not, don't allocate since we can't handle modules yet.
> > -	 */
> > -	if (!memory_contains(__llvm_prf_data_start,
> > -		__llvm_prf_data_end, p, sizeof(*p)))
> > -		return NULL;
> > -
> > -	if (WARN_ON_ONCE(current_node >= max_vnds))
> > -		return NULL; /* Out of nodes */
> > -
> > -	/* reserve vnode for vmlinux */
> > -	return &__llvm_prf_vnds_start[current_node++];
> > +	struct llvm_prf_value_node *vnode = NULL;
> > +	struct prf_object *po;
> > +	struct llvm_prf_data *data_end;
> > +	int max_vnds;
> > +
> > +	rcu_read_lock();
> 
> AIUI, list readers are using rcu_read_lock(), and writers are using
> prf_list_lock()?
> 

Yes, I intended the list readers to use rcu_read_lock() and writers to take 
the prf_list_lock().

Sadly after I sent this patch set I found during more testing that there are 
few problems that I need to work on:

There is an lockup that only occurs during bare metal run after +15min, so I 
haven't been able to catch it in VM.
I suspect this is caused by the RCU locking I added such that it results in 
recursive calls into __llvm_profile_instrument_target()

I will try build with CONFIG_PROVE_LOCKING, but I have had problems with
the kernel not getting past cgroup_init_early()... even without my patches
applied. (stock -rc kernel) :-(

> > +
> > +	list_for_each_entry_rcu(po, &prf_list, link) {
> > +		/* get section limits */
> > +		max_vnds = prf_vnds_count(po);
> > +		data_end = po->data + prf_data_count(po);
> > +
> > +		/*
> > +		 * Check that p is within:
> > +		 * [po->data, po->data + prf_data_count(po)] section.
> > +		 * If yes, allocate vnode from this prf_object.
> > +		 */
> > +		if (memory_contains(po->data, data_end, p, 
sizeof(*p))) {
> > +
> > +
> > +			if (WARN_ON_ONCE(po->current_node >= 
max_vnds))
> > +				return NULL; /* Out of 
nodes */
> > +
> > +			/* reserve the vnode */
> > +			vnode = &po->vnds[po->current_node++];
> > +			goto out;
> > +		}
> > +	}
> > +
> > +out:
> > +	rcu_read_unlock();
> > +	return vnode;
> > 
> >  }
> >  
> >  /*
> > 
> > diff --git a/kernel/pgo/pgo.h b/kernel/pgo/pgo.h
> > index 44d79e2861e1..59d0aa966fbe 100644
> > --- a/kernel/pgo/pgo.h
> > +++ b/kernel/pgo/pgo.h
> > @@ -19,6 +19,8 @@
> > 
> >  #ifndef _PGO_H
> >  #define _PGO_H
> > 
> > +#include <linux/rculist.h>
> > +
> > 
> >  /*
> >  
> >   * Note: These internal LLVM definitions must match the compiler version.
> >   * See llvm/include/llvm/ProfileData/InstrProfData.inc in LLVM's source
> >   code.
> > 
> > --
> > 2.32.0
> 
> --
> Kees Cook





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/5] pgo: Add PGO support for module profile data
  2021-06-14 16:05 ` [RFC PATCH 0/5] pgo: Add PGO support for module profile data Kees Cook
@ 2021-06-14 21:57   ` Kees Cook
  2021-06-14 22:27     ` jarmo.tiitto
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2021-06-14 21:57 UTC (permalink / raw)
  To: Jarmo Tiitto
  Cc: Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo

On Sat, Jun 12, 2021 at 06:24:21AM +0300, Jarmo Tiitto wrote:
> [...]
> The patches itself are based on Kees/for-next/clang/features tree
> where I have two of my bug fix patches already in. :-)

BTW, due to the (soon to be addressed) requirements of noinstr[1],
I've removed PGO from my -next tree, and moved it into its own tree in
"for-next/clang/pgo".

-Kees

[1] https://lore.kernel.org/lkml/202106140921.5E591BD@keescook/

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH 0/5] pgo: Add PGO support for module profile data
  2021-06-14 21:57   ` Kees Cook
@ 2021-06-14 22:27     ` jarmo.tiitto
  0 siblings, 0 replies; 14+ messages in thread
From: jarmo.tiitto @ 2021-06-14 22:27 UTC (permalink / raw)
  To: Jarmo Tiitto, Kees Cook
  Cc: Sami Tolvanen, Bill Wendling, Nathan Chancellor,
	Nick Desaulniers, clang-built-linux, linux-kernel, morbo

Kees Cook wrote tiistaina 15. kesäkuuta 2021 0.57.46 EEST:
> On Sat, Jun 12, 2021 at 06:24:21AM +0300, Jarmo Tiitto wrote:
> > [...]
> > The patches itself are based on Kees/for-next/clang/features tree
> > where I have two of my bug fix patches already in. :-)
> 
> BTW, due to the (soon to be addressed) requirements of noinstr[1],
> I've removed PGO from my -next tree, and moved it into its own tree in
> "for-next/clang/pgo".
> 
> -Kees
> 
> [1] https://lore.kernel.org/lkml/202106140921.5E591BD@keescook/
> 
> --
> Kees Cook

Yeah, I noticed that. Actually, I think we really should wait for that noinstr 
stuff since:

> There is an lockup that only occurs during bare metal run after +15min, so I 
> haven't been able to catch it in VM.
> I suspect this is caused by the RCU locking I added such that it results in 
> recursive calls into __llvm_profile_instrument_target()

That basically means LLVM is instrumenting code that I call from
__llvm_profile_instrument_target() resulting in nice cycle of doom.
Sigh...

I wrote fix for this but it would be nice to be sure the compiler
doesn't pull the rug under my toes. :-)
--
Jarmo



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-06-14 22:27 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-12  3:24 [RFC PATCH 0/5] pgo: Add PGO support for module profile data Jarmo Tiitto
2021-06-12  3:24 ` [RFC PATCH 1/5] pgo: Expose module sections for clang PGO instumentation Jarmo Tiitto
2021-06-12  3:24 ` [RFC PATCH 2/5] pgo: Make serializing functions to take prf_object Jarmo Tiitto
2021-06-12  3:24 ` [RFC PATCH 3/5] pgo: Wire up the new more generic code for modules Jarmo Tiitto
2021-06-14 15:55   ` Kees Cook
2021-06-14 19:08     ` jarmo.tiitto
2021-06-12  3:24 ` [RFC PATCH 4/5] pgo: Add module notifier machinery Jarmo Tiitto
2021-06-14 16:00   ` Kees Cook
2021-06-14 18:31     ` jarmo.tiitto
2021-06-12  3:24 ` [RFC PATCH 5/5] pgo: Cleanup code in pgo/fs.c Jarmo Tiitto
2021-06-14 15:50   ` Kees Cook
2021-06-14 16:05 ` [RFC PATCH 0/5] pgo: Add PGO support for module profile data Kees Cook
2021-06-14 21:57   ` Kees Cook
2021-06-14 22:27     ` jarmo.tiitto

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.