* [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support
@ 2016-07-27 21:27 Hari Bathini
  2016-07-27 21:27 ` [RFC PATCH v2 1/3] perf: filter container events based on cgroup namespace Hari Bathini
                   ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Hari Bathini @ 2016-07-27 21:27 UTC
  To: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, rostedt, viro
  Cc: aravinda, ananth

This RFC patch set adds support for filtering container-specific events
when the perf tool is run inside a container. The patches apply
cleanly on v4.7.0-rc7.

Changes from v1:
1/3. Revived the earlier approach [1], using the cgroup namespace
     instead of the PID namespace
2/3. New patch that adds instance support for uprobe events in
     the tracefs filesystem
3/3. New patch that adds a "newinstance" mount option for the
     tracefs filesystem

[1] https://lkml.org/lkml/2015/7/15/192

---

Aravinda Prasad (1):
      perf: filter container events based on cgroup namespace

Hari Bathini (2):
      tracefs: add instances support for uprobe events
      tracefs: add 'newinstance' mount option


 fs/tracefs/inode.c           |  171 ++++++++++++++++++++++++++++++++++--------
 include/linux/trace_events.h |    3 -
 include/linux/tracefs.h      |   11 ++-
 kernel/events/core.c         |   51 +++++++++----
 kernel/trace/trace.c         |   54 +++++++++----
 kernel/trace/trace.h         |   12 +++
 kernel/trace/trace_events.c  |   15 +++-
 kernel/trace/trace_kprobe.c  |    2 
 kernel/trace/trace_uprobe.c  |  158 ++++++++++++++++++++++++++++-----------
 9 files changed, 361 insertions(+), 116 deletions(-)


* [RFC PATCH v2 1/3] perf: filter container events based on cgroup namespace
  2016-07-27 21:27 [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support Hari Bathini
@ 2016-07-27 21:27 ` Hari Bathini
  2016-07-27 21:27 ` [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events Hari Bathini
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Hari Bathini @ 2016-07-27 21:27 UTC
  To: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, rostedt, viro
  Cc: aravinda, ananth

From: Aravinda Prasad <aravinda@linux.vnet.ibm.com>

This patch adds support for filtering container-specific events when
the perf utility is invoked within a container, without any change to
the user interface.

Our earlier patch [1] required the container to be created with a PID
namespace. However, during the discussion at Plumbers [2] it was pointed
out that requiring a PID namespace is not suitable for containers
that need access to the host PID namespace [3]. Now that the kernel
supports cgroup namespaces, we have modified the patch to look at the
cgroup namespace instead of the PID namespace when filtering events.
This keeps the basic idea of approach [1] while addressing [3].

The patch assumes that tracefs is available within the container and
that all the processes running inside the container are grouped into a
single cgroup of the perf_event subsystem.


When the command below is run inside a container that uses the global
cgroup namespace:

  $ perf record -e kmem:kmalloc -aR

the perf report looks like this (with a lot of noise):

  $ perf report --sort pid,symbol -n
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 8K of event 'kmem:kmalloc'
  # Event count (approx.): 8487
  #
  # Overhead       Samples    Pid:Command        Symbol
  # ........  ............  ...................  ..........................
  #
      71.56%          6073      0:kworker/dying  [k] __kmalloc
      26.82%          2276      0:kworker/dying  [k] kmem_cache_alloc_trace
       1.48%           126      0:kworker/dying  [k] __kmalloc_track_caller
       0.07%             6      0:curl           [k] kmalloc_order_trace
       0.05%             4    186:perf           [k] __kmalloc
       0.02%             2     61:java           [k] __kmalloc


  $

When the same perf record command is run inside a container with a new
cgroup namespace, only the samples that belong to that container are listed:

  $ perf report --sort pid,dso,symbol -n
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 3  of event 'kmem:kmalloc'
  # Event count (approx.): 3
  #
  # Overhead       Samples    Pid:Command  Symbol
  # ........  ............  .............  .............
  #
     100.00%             3     61:java     [k] __kmalloc


  $

In order to filter events specific to a container, this patch assumes the
container is created with a new cgroup namespace.
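
The selection logic this patch adds to perf_cgroup_connect() can be
summarized with a small user-space model. This is only a sketch under
stated assumptions: pick_cgroup_source(), struct connect_input and the
enum values are hypothetical names made up for illustration, not kernel
interfaces, and the real function additionally takes a css reference and
handles error paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where the event's cgroup filter comes from in this user-space model. */
enum cgroup_source {
	CGRP_NONE,	/* no filter: events from all tasks are reported */
	CGRP_FROM_FD,	/* cgroup named by an explicit cgroup fd */
	CGRP_FROM_TASK,	/* perf_event cgroup of the calling task */
};

/* Hypothetical input bundle; these fields are not kernel structures. */
struct connect_input {
	int  cgroup_fd;		/* -1 when no cgroup fd was passed */
	bool attached_to_task;	/* the event traces a specific PID */
	bool in_init_cgroup_ns;	/* caller is in the initial cgroup namespace */
	bool in_root_cgroup;	/* caller sits in the root perf_event cgroup */
};

/* Mirror the order of the checks in the patched perf_cgroup_connect(). */
enum cgroup_source pick_cgroup_source(const struct connect_input *in)
{
	assert(in != NULL);

	if (in->cgroup_fd != -1)
		return CGRP_FROM_FD;	/* explicit cgroup always wins */
	if (in->attached_to_task)
		return CGRP_NONE;	/* tracing a PID: no cgroup needed */
	if (!in->in_init_cgroup_ns && !in->in_root_cgroup)
		return CGRP_FROM_TASK;	/* container context: filter events */
	return CGRP_NONE;		/* global context or root cgroup */
}
```

In this model, only a caller in a non-initial cgroup namespace that is
not in the root perf_event cgroup gets an implicit filter; every other
case keeps the behaviour perf had before the patch.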

[1] https://lkml.org/lkml/2015/7/15/192
[2] http://linuxplumbersconf.org/2015/ocw/sessions/2667.html
[3] Notes for container-aware tracing:
	https://etherpad.openstack.org/p/LPC2015_Containers

Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
---
 kernel/events/core.c |   51 +++++++++++++++++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 43d43a2d..d7ef1e1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -764,17 +764,38 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event,
 {
 	struct perf_cgroup *cgrp;
 	struct cgroup_subsys_state *css;
-	struct fd f = fdget(fd);
+	struct fd f;
 	int ret = 0;
 
-	if (!f.file)
-		return -EBADF;
+	if (fd != -1) {
+		f = fdget(fd);
+		if (!f.file)
+			return -EBADF;
 
-	css = css_tryget_online_from_dir(f.file->f_path.dentry,
-					 &perf_event_cgrp_subsys);
-	if (IS_ERR(css)) {
-		ret = PTR_ERR(css);
-		goto out;
+		css = css_tryget_online_from_dir(f.file->f_path.dentry,
+						 &perf_event_cgrp_subsys);
+		if (IS_ERR(css)) {
+			ret = PTR_ERR(css);
+			fdput(f);
+			return ret;
+		}
+	} else if (event->attach_state == PERF_ATTACH_TASK) {
+		/* Tracing on a PID. No need to set event->cgrp */
+		return ret;
+	} else if (current->nsproxy->cgroup_ns != &init_cgroup_ns) {
+		/* Don't set event->cgrp if task belongs to root cgroup */
+		if (task_css_is_root(current, perf_event_cgrp_id))
+			return ret;
+
+		css = task_css(current, perf_event_cgrp_id);
+		if (!css || !css_tryget_online(css))
+			return -ENOENT;
+	} else {
+		/*
+		 * perf invoked from global context and hence don't set
+		 * event->cgrp as all the events should be included
+		 */
+		return ret;
 	}
 
 	cgrp = container_of(css, struct perf_cgroup, css);
@@ -789,8 +810,10 @@ static inline int perf_cgroup_connect(int fd, struct perf_event *event,
 		perf_detach_cgroup(event);
 		ret = -EINVAL;
 	}
-out:
-	fdput(f);
+
+	if (fd != -1)
+		fdput(f);
+
 	return ret;
 }
 
@@ -8864,11 +8887,9 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	if (!has_branch_stack(event))
 		event->attr.branch_sample_type = 0;
 
-	if (cgroup_fd != -1) {
-		err = perf_cgroup_connect(cgroup_fd, event, attr, group_leader);
-		if (err)
-			goto err_ns;
-	}
+	err = perf_cgroup_connect(cgroup_fd, event, attr, group_leader);
+	if (err)
+		goto err_ns;
 
 	pmu = perf_init_event(event);
 	if (!pmu)


* [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-07-27 21:27 [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support Hari Bathini
  2016-07-27 21:27 ` [RFC PATCH v2 1/3] perf: filter container events based on cgroup namespace Hari Bathini
@ 2016-07-27 21:27 ` Hari Bathini
  2016-08-01 21:45   ` Steven Rostedt
  2016-07-27 21:27 ` [RFC PATCH v2 3/3] tracefs: add 'newinstance' mount option Hari Bathini
       [not found] ` <146965470618.23765.7329786743211962695.stgit-2ivJzYymj6EA+286u2LMdEEOCMrvLtNR@public.gmane.org>
  3 siblings, 1 reply; 26+ messages in thread
From: Hari Bathini @ 2016-07-27 21:27 UTC
  To: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, rostedt, viro
  Cc: aravinda, ananth

If a uprobe event is set on a library function and a similar uprobe
event trace is needed for a container, a duplicate is created, leaving
the uprobe list with multiple entries for the same function:

  $ perf probe --list
    probe_libc:malloc    (on 0x80490 in /lib64/libc.so.6)
    probe_libc:malloc_1  (on __libc_malloc in /lib64/libc.so.6)
  $

This can soon get out of hand if multiple containers want to probe the
same function/address in their libraries. This patch tries to resolve this
by adding uprobe event trace files to every new instance. Currently, the
perf tool can make use of this through the --debugfs-dir option, something
like (assuming the instance directory is named 'tracing'):

  $ perf --debugfs-dir=$MOUNT_PNT/instances probe /lib64/libc.so.6 malloc
  $
  $
  $ perf --debugfs-dir=$MOUNT_PNT/instances probe --list
    probe_libc:malloc    (on __libc_malloc in /lib64/libc.so.6)
  $
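
The per-instance lookup that makes the listing above possible can be
modeled in user space. The sketch below is an assumption-laden
illustration: struct instance_model, find_probe() and add_probe() are
hypothetical stand-ins for the per-trace_array uprobe_list and
find_probe_event() in the patch, and a fixed-size array replaces the
kernel list and mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROBES 8

/* One registered probe, identified by group and event name. */
struct probe_model {
	const char *group;
	const char *event;
};

/* Hypothetical stand-in for a trace instance with its own probe list. */
struct instance_model {
	struct probe_model probes[MAX_PROBES];
	size_t count;
};

/* Search only the given instance, never a global list. */
const struct probe_model *find_probe(const struct instance_model *inst,
				     const char *group, const char *event)
{
	size_t i;

	assert(inst && group && event);
	for (i = 0; i < inst->count; i++)
		if (strcmp(inst->probes[i].event, event) == 0 &&
		    strcmp(inst->probes[i].group, group) == 0)
			return &inst->probes[i];
	return NULL;
}

/*
 * Add a probe to one instance. A probe with the same name in the same
 * instance is treated as replaced; returns -1 when the array is full.
 */
int add_probe(struct instance_model *inst, const char *group,
	      const char *event)
{
	if (find_probe(inst, group, event))
		return 0;
	if (inst->count == MAX_PROBES)
		return -1;
	inst->probes[inst->count].group = group;
	inst->probes[inst->count].event = event;
	inst->count++;
	return 0;
}
```

Because each instance is searched separately, two instances can both
hold probe_libc:malloc without one of them being renamed to malloc_1.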

New uprobe events can be added to the uprobe_events file under the instance
directory, and the profile information for these events will be available in
the uprobe_profile file in the same instance directory.
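
As a rough illustration of what gets written to an instance's
uprobe_events file, the helpers below build an add line in the
"p:GRP/EVENT PATH:OFFSET" form and a remove line in the "-:GRP/EVENT"
form. The function names are hypothetical, the group is required here
only to keep the sketch short, and the caller is assumed to write the
resulting string to the uprobe_events file of the instance in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build an add command such as
 *   p:probe_libc/malloc /lib64/libc.so.6:0x80490
 * Returns the string length, or -1 on empty names or a short buffer.
 */
int format_uprobe_add(char *buf, size_t len, const char *group,
		      const char *event, const char *path,
		      unsigned long offset)
{
	int n;

	assert(buf && group && event && path);
	if (strlen(group) == 0 || strlen(event) == 0)
		return -1;

	n = snprintf(buf, len, "p:%s/%s %s:0x%lx", group, event, path, offset);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output was truncated */
	return n;
}

/* Build the matching remove command, e.g. "-:probe_libc/malloc". */
int format_uprobe_del(char *buf, size_t len, const char *group,
		      const char *event)
{
	int n;

	assert(buf && group && event);
	if (strlen(group) == 0 || strlen(event) == 0)
		return -1;

	n = snprintf(buf, len, "-:%s/%s", group, event);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output was truncated */
	return n;
}
```

With this patch applied, the same command written to two different
instances would register two independent probes rather than a duplicate
in a shared list.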

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
---
 include/linux/trace_events.h |    3 +
 kernel/trace/trace.c         |    2 +
 kernel/trace/trace.h         |   12 +++
 kernel/trace/trace_events.c  |   15 +++-
 kernel/trace/trace_kprobe.c  |    2 -
 kernel/trace/trace_uprobe.c  |  158 +++++++++++++++++++++++++++++++-----------
 6 files changed, 144 insertions(+), 48 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index be00761..f893223 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -451,7 +451,8 @@ extern int trace_event_raw_init(struct trace_event_call *call);
 extern int trace_define_field(struct trace_event_call *call, const char *type,
 			      const char *name, int offset, int size,
 			      int is_signed, int filter_type);
-extern int trace_add_event_call(struct trace_event_call *call);
+extern int trace_add_event_call(struct trace_event_call *call,
+				struct trace_array *tr);
 extern int trace_remove_event_call(struct trace_event_call *call);
 extern int trace_event_get_offsets(struct trace_event_call *call);
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8a4bd6b..23a8111 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6966,6 +6966,8 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
 			&tr->max_latency, &tracing_max_lat_fops);
 #endif
 
+	uprobe_create_trace_files(tr, d_tracer);
+
 	if (ftrace_create_function_files(tr, d_tracer))
 		WARN(1, "Could not allocate function filter files");
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 5167c36..a8360e9 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -245,6 +245,10 @@ struct trace_array {
 	struct list_head	events;
 	cpumask_var_t		tracing_cpumask; /* only trace on set CPUs */
 	int			ref;
+#ifdef CONFIG_UPROBE_EVENT
+	struct mutex		uprobe_lock;
+	struct list_head	uprobe_list;
+#endif
 #ifdef CONFIG_FUNCTION_TRACER
 	struct ftrace_ops	*ops;
 	/* function tracing enabled */
@@ -819,6 +823,14 @@ print_graph_function_flags(struct trace_iterator *iter, u32 flags)
 
 extern struct list_head ftrace_pids;
 
+#ifdef CONFIG_UPROBE_EVENT
+void uprobe_create_trace_files(struct trace_array *tr,
+			       struct dentry *parent);
+#else
+static inline void
+uprobe_create_trace_files(struct trace_array *tr, struct dentry *parent) { }
+#endif /* CONFIG_UPROBE_EVENT */
+
 #ifdef CONFIG_FUNCTION_TRACER
 extern bool ftrace_filter_param __initdata;
 static inline int ftrace_trace_task(struct task_struct *task)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 3d41558..2e0f986 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2441,15 +2441,20 @@ struct ftrace_module_file_ops;
 static void __add_event_to_tracers(struct trace_event_call *call);
 
 /* Add an additional event_call dynamically */
-int trace_add_event_call(struct trace_event_call *call)
+int trace_add_event_call(struct trace_event_call *call, struct trace_array *tr)
 {
 	int ret;
 	mutex_lock(&trace_types_lock);
 	mutex_lock(&event_mutex);
 
 	ret = __register_event(call, NULL);
-	if (ret >= 0)
-		__add_event_to_tracers(call);
+	if (ret >= 0) {
+		if (tr)
+			/* If a tracer is specified, add event only to it */
+			__trace_add_new_event(call, tr);
+		else
+			__add_event_to_tracers(call);
+	}
 
 	mutex_unlock(&event_mutex);
 	mutex_unlock(&trace_types_lock);
@@ -2609,6 +2614,10 @@ __trace_add_event_dirs(struct trace_array *tr)
 	int ret;
 
 	list_for_each_entry(call, &ftrace_events, list) {
+		/* Don't add dynamic uprobe events to new tracers */
+		if (call->flags & TRACE_EVENT_FL_UPROBE)
+			continue;
+
 		ret = __trace_add_new_event(call, tr);
 		if (ret < 0)
 			pr_warn("Could not create directory for event %s\n",
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 5546eec..b82a328 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1296,7 +1296,7 @@ static int register_kprobe_event(struct trace_kprobe *tk)
 	call->flags = TRACE_EVENT_FL_KPROBE;
 	call->class->reg = kprobe_register;
 	call->data = tk;
-	ret = trace_add_event_call(call);
+	ret = trace_add_event_call(call, NULL);
 	if (ret) {
 		pr_info("Failed to register kprobe event: %s\n",
 			trace_event_name(call));
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index c534854..ea8c4e4 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -64,12 +64,10 @@ struct trace_uprobe {
 	(offsetof(struct trace_uprobe, tp.args) +	\
 	(sizeof(struct probe_arg) * (n)))
 
-static int register_uprobe_event(struct trace_uprobe *tu);
+static int register_uprobe_event(struct trace_array *tr,
+				 struct trace_uprobe *tu);
 static int unregister_uprobe_event(struct trace_uprobe *tu);
 
-static DEFINE_MUTEX(uprobe_lock);
-static LIST_HEAD(uprobe_list);
-
 struct uprobe_dispatch_data {
 	struct trace_uprobe	*tu;
 	unsigned long		bp_addr;
@@ -288,11 +286,12 @@ static void free_trace_uprobe(struct trace_uprobe *tu)
 	kfree(tu);
 }
 
-static struct trace_uprobe *find_probe_event(const char *event, const char *group)
+static struct trace_uprobe *
+find_probe_event(struct trace_array *tr, const char *event, const char *group)
 {
 	struct trace_uprobe *tu;
 
-	list_for_each_entry(tu, &uprobe_list, list)
+	list_for_each_entry(tu, &tr->uprobe_list, list)
 		if (strcmp(trace_event_name(&tu->tp.call), event) == 0 &&
 		    strcmp(tu->tp.call.class->system, group) == 0)
 			return tu;
@@ -315,15 +314,16 @@ static int unregister_trace_uprobe(struct trace_uprobe *tu)
 }
 
 /* Register a trace_uprobe and probe_event */
-static int register_trace_uprobe(struct trace_uprobe *tu)
+static int register_trace_uprobe(struct trace_array *tr,
+				 struct trace_uprobe *tu)
 {
 	struct trace_uprobe *old_tu;
 	int ret;
 
-	mutex_lock(&uprobe_lock);
+	mutex_lock(&tr->uprobe_lock);
 
 	/* register as an event */
-	old_tu = find_probe_event(trace_event_name(&tu->tp.call),
+	old_tu = find_probe_event(tr, trace_event_name(&tu->tp.call),
 			tu->tp.call.class->system);
 	if (old_tu) {
 		/* delete old event */
@@ -332,16 +332,16 @@ static int register_trace_uprobe(struct trace_uprobe *tu)
 			goto end;
 	}
 
-	ret = register_uprobe_event(tu);
+	ret = register_uprobe_event(tr, tu);
 	if (ret) {
 		pr_warn("Failed to register probe event(%d)\n", ret);
 		goto end;
 	}
 
-	list_add_tail(&tu->list, &uprobe_list);
+	list_add_tail(&tu->list, &tr->uprobe_list);
 
 end:
-	mutex_unlock(&uprobe_lock);
+	mutex_unlock(&tr->uprobe_lock);
 
 	return ret;
 }
@@ -352,7 +352,7 @@ end:
  *
  *  - Remove uprobe: -:[GRP/]EVENT
  */
-static int create_trace_uprobe(int argc, char **argv)
+static int create_trace_uprobe(struct trace_array *tr, int argc, char **argv)
 {
 	struct trace_uprobe *tu;
 	struct inode *inode;
@@ -409,17 +409,17 @@ static int create_trace_uprobe(int argc, char **argv)
 			pr_info("Delete command needs an event name.\n");
 			return -EINVAL;
 		}
-		mutex_lock(&uprobe_lock);
-		tu = find_probe_event(event, group);
+		mutex_lock(&tr->uprobe_lock);
+		tu = find_probe_event(tr, event, group);
 
 		if (!tu) {
-			mutex_unlock(&uprobe_lock);
+			mutex_unlock(&tr->uprobe_lock);
 			pr_info("Event %s/%s doesn't exist.\n", group, event);
 			return -ENOENT;
 		}
 		/* delete an event */
 		ret = unregister_trace_uprobe(tu);
-		mutex_unlock(&uprobe_lock);
+		mutex_unlock(&tr->uprobe_lock);
 		return ret;
 	}
 
@@ -543,7 +543,7 @@ static int create_trace_uprobe(int argc, char **argv)
 		}
 	}
 
-	ret = register_trace_uprobe(tu);
+	ret = register_trace_uprobe(tr, tu);
 	if (ret)
 		goto error;
 	return 0;
@@ -560,37 +560,45 @@ fail_address_parse:
 	return ret;
 }
 
-static int cleanup_all_probes(void)
+static int cleanup_all_probes(struct trace_array *tr)
 {
 	struct trace_uprobe *tu;
 	int ret = 0;
 
-	mutex_lock(&uprobe_lock);
-	while (!list_empty(&uprobe_list)) {
-		tu = list_entry(uprobe_list.next, struct trace_uprobe, list);
+	mutex_lock(&tr->uprobe_lock);
+	while (!list_empty(&tr->uprobe_list)) {
+		tu = list_entry(tr->uprobe_list.next,
+				struct trace_uprobe,
+				list);
 		ret = unregister_trace_uprobe(tu);
 		if (ret)
 			break;
 	}
-	mutex_unlock(&uprobe_lock);
+	mutex_unlock(&tr->uprobe_lock);
 	return ret;
 }
 
 /* Probes listing interfaces */
 static void *probes_seq_start(struct seq_file *m, loff_t *pos)
 {
-	mutex_lock(&uprobe_lock);
-	return seq_list_start(&uprobe_list, *pos);
+	struct trace_array *tr = m->file->f_inode->i_private;
+
+	mutex_lock(&tr->uprobe_lock);
+	return seq_list_start(&tr->uprobe_list, *pos);
 }
 
 static void *probes_seq_next(struct seq_file *m, void *v, loff_t *pos)
 {
-	return seq_list_next(v, &uprobe_list, pos);
+	struct trace_array *tr = m->file->f_inode->i_private;
+
+	return seq_list_next(v, &tr->uprobe_list, pos);
 }
 
 static void probes_seq_stop(struct seq_file *m, void *v)
 {
-	mutex_unlock(&uprobe_lock);
+	struct trace_array *tr = m->file->f_inode->i_private;
+
+	mutex_unlock(&tr->uprobe_lock);
 }
 
 static int probes_seq_show(struct seq_file *m, void *v)
@@ -635,9 +643,10 @@ static const struct seq_operations probes_seq_op = {
 static int probes_open(struct inode *inode, struct file *file)
 {
 	int ret;
+	struct trace_array *tr = inode->i_private;
 
 	if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_TRUNC)) {
-		ret = cleanup_all_probes();
+		ret = cleanup_all_probes(tr);
 		if (ret)
 			return ret;
 	}
@@ -645,10 +654,72 @@ static int probes_open(struct inode *inode, struct file *file)
 	return seq_open(file, &probes_seq_op);
 }
 
+#define WRITE_BUFSIZE  4096
+
 static ssize_t probes_write(struct file *file, const char __user *buffer,
 			    size_t count, loff_t *ppos)
 {
-	return traceprobe_probes_write(file, buffer, count, ppos, create_trace_uprobe);
+	char *kbuf, *tmp;
+	char **argv;
+	int argc;
+	int ret = 0;
+	size_t done = 0;
+	size_t size;
+	struct trace_array *tr = file->f_inode->i_private;
+
+	kbuf = kmalloc(WRITE_BUFSIZE, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	while (done < count) {
+		size = count - done;
+
+		if (size >= WRITE_BUFSIZE)
+			size = WRITE_BUFSIZE - 1;
+
+		if (copy_from_user(kbuf, buffer + done, size)) {
+			ret = -EFAULT;
+			goto out;
+		}
+		kbuf[size] = '\0';
+		tmp = strchr(kbuf, '\n');
+
+		if (tmp) {
+			*tmp = '\0';
+			size = tmp - kbuf + 1;
+		} else if (done + size < count) {
+			pr_warn("Line length is too long: Should be less than %d\n",
+				WRITE_BUFSIZE);
+			ret = -EINVAL;
+			goto out;
+		}
+		done += size;
+		/* Remove comments */
+		tmp = strchr(kbuf, '#');
+
+		if (tmp)
+			*tmp = '\0';
+
+		argc = 0;
+		argv = argv_split(GFP_KERNEL, kbuf, &argc);
+		if (!argv) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		if (argc)
+			ret = create_trace_uprobe(tr, argc, argv);
+
+		argv_free(argv);
+		if (ret)
+			goto out;
+	}
+	ret = done;
+
+out:
+	kfree(kbuf);
+
+	return ret;
 }
 
 static const struct file_operations uprobe_events_ops = {
@@ -1290,7 +1361,8 @@ static struct trace_event_functions uprobe_funcs = {
 	.trace		= print_uprobe_event
 };
 
-static int register_uprobe_event(struct trace_uprobe *tu)
+static int register_uprobe_event(struct trace_array *tr,
+				 struct trace_uprobe *tu)
 {
 	struct trace_event_call *call = &tu->tp.call;
 	int ret;
@@ -1312,7 +1384,7 @@ static int register_uprobe_event(struct trace_uprobe *tu)
 	call->flags = TRACE_EVENT_FL_UPROBE;
 	call->class->reg = trace_uprobe_register;
 	call->data = tu;
-	ret = trace_add_event_call(call);
+	ret = trace_add_event_call(call, tr);
 
 	if (ret) {
 		pr_info("Failed to register uprobe event: %s\n",
@@ -1338,20 +1410,20 @@ static int unregister_uprobe_event(struct trace_uprobe *tu)
 }
 
 /* Make a trace interface for controling probe points */
-static __init int init_uprobe_trace(void)
+void uprobe_create_trace_files(struct trace_array *tr,
+			       struct dentry *parent)
 {
-	struct dentry *d_tracer;
+	if (!tr) {
+		WARN(1, "Need a trace array for uprobe events");
+		return;
+	}
 
-	d_tracer = tracing_init_dentry();
-	if (IS_ERR(d_tracer))
-		return 0;
+	mutex_init(&tr->uprobe_lock);
+	INIT_LIST_HEAD(&tr->uprobe_list);
 
-	trace_create_file("uprobe_events", 0644, d_tracer,
-				    NULL, &uprobe_events_ops);
+	trace_create_file("uprobe_events", 0644, parent,
+				tr, &uprobe_events_ops);
 	/* Profile interface */
-	trace_create_file("uprobe_profile", 0444, d_tracer,
-				    NULL, &uprobe_profile_ops);
-	return 0;
+	trace_create_file("uprobe_profile", 0444, parent,
+				tr, &uprobe_profile_ops);
 }
-
-fs_initcall(init_uprobe_trace);


* [RFC PATCH v2 3/3] tracefs: add 'newinstance' mount option
  2016-07-27 21:27 [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support Hari Bathini
  2016-07-27 21:27 ` [RFC PATCH v2 1/3] perf: filter container events based on cgroup namespace Hari Bathini
  2016-07-27 21:27 ` [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events Hari Bathini
@ 2016-07-27 21:27 ` Hari Bathini
  2016-08-04  2:54   ` Eric W. Biederman
       [not found] ` <146965470618.23765.7329786743211962695.stgit-2ivJzYymj6EA+286u2LMdEEOCMrvLtNR@public.gmane.org>
  3 siblings, 1 reply; 26+ messages in thread
From: Hari Bathini @ 2016-07-27 21:27 UTC
  To: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, rostedt, viro
  Cc: aravinda, ananth

When tracefs is mounted inside a container, its files are visible to
all containers. This means that a user inside a container can
list/delete uprobes registered elsewhere, leading to security issues
and/or denial of service (e.g. deleting a probe that was registered
elsewhere). This patch addresses the problem by adding a 'newinstance'
mount option, which allows containers to have their own instance mounted
separately. For example, from within a container:

  $ mount -o newinstance -t tracefs tracefs /sys/kernel/tracing
  $
  $
  $ perf probe /lib/x86_64-linux-gnu/libc.so.6 malloc
  Added new event:
    probe_libc:malloc    (on malloc in /lib/x86_64-linux-gnu/libc.so.6)

  You can now use it in all perf tools, such as:

  	perf record -e probe_libc:malloc -aR sleep 1

  $
  $
  $ perf probe --list
    probe_libc:malloc    (on __libc_malloc in /lib64/libc.so.6)
  $

while another container/host has a completely different view:


  $ perf probe --list
    probe_libc:memset    (on __libc_memset in /lib64/libc.so.6)
  $

This patch reuses the code that supports creating new instances
under the tracefs instances directory.
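
The way 'newinstance' is handled on mount versus remount can be sketched
with a user-space model of the option parser. Everything here is a
simplified assumption: parse_opts_model() and struct mount_opts_model
are hypothetical names, only the 'newinstance' token is interpreted, and
all other options are skipped rather than parsed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum parse_op { PARSE_MOUNT, PARSE_REMOUNT };

/* Hypothetical, reduced version of the mount options. */
struct mount_opts_model {
	bool newinstance;
};

/*
 * Walk a comma-separated option string. 'newinstance' is honoured only
 * on the initial mount; on remount the stored value is left untouched.
 */
void parse_opts_model(const char *data, enum parse_op op,
		      struct mount_opts_model *opts)
{
	const char *p = data;

	assert(opts != NULL);

	if (op == PARSE_MOUNT)
		opts->newinstance = false;

	while (p && *p) {
		size_t len = strcspn(p, ",");	/* length of this token */

		if (op == PARSE_MOUNT &&
		    len == strlen("newinstance") &&
		    strncmp(p, "newinstance", len) == 0)
			opts->newinstance = true;

		p += len;
		if (*p == ',')
			p++;			/* step over the separator */
	}
}
```

In this model a remount can neither turn an existing mount into a new
instance nor clear the flag on a mount that was created with it.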

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
---
 fs/tracefs/inode.c      |  171 ++++++++++++++++++++++++++++++++++++++---------
 include/linux/tracefs.h |   11 ++-
 kernel/trace/trace.c    |   52 ++++++++++----
 3 files changed, 181 insertions(+), 53 deletions(-)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 4a0e48f..2d6acda 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -51,9 +51,9 @@ static const struct file_operations tracefs_file_operations = {
 };
 
 static struct tracefs_dir_ops {
-	int (*mkdir)(const char *name);
-	int (*rmdir)(const char *name);
-} tracefs_ops;
+	int (*mkdir)(int instance_type, void *data);
+	int (*rmdir)(int instance_type, void *data);
+} tracefs_instance_ops;
 
 static char *get_dname(struct dentry *dentry)
 {
@@ -85,7 +85,7 @@ static int tracefs_syscall_mkdir(struct inode *inode, struct dentry *dentry, umo
 	 * mkdir routine to handle races.
 	 */
 	inode_unlock(inode);
-	ret = tracefs_ops.mkdir(name);
+	ret = tracefs_instance_ops.mkdir(INSTANCE_DIR, name);
 	inode_lock(inode);
 
 	kfree(name);
@@ -112,7 +112,7 @@ static int tracefs_syscall_rmdir(struct inode *inode, struct dentry *dentry)
 	inode_unlock(inode);
 	inode_unlock(dentry->d_inode);
 
-	ret = tracefs_ops.rmdir(name);
+	ret = tracefs_instance_ops.rmdir(INSTANCE_DIR, name);
 
 	inode_lock_nested(inode, I_MUTEX_PARENT);
 	inode_lock(dentry->d_inode);
@@ -142,12 +142,14 @@ struct tracefs_mount_opts {
 	kuid_t uid;
 	kgid_t gid;
 	umode_t mode;
+	int newinstance;
 };
 
 enum {
 	Opt_uid,
 	Opt_gid,
 	Opt_mode,
+	Opt_newinstance,
 	Opt_err
 };
 
@@ -155,14 +157,26 @@ static const match_table_t tokens = {
 	{Opt_uid, "uid=%u"},
 	{Opt_gid, "gid=%u"},
 	{Opt_mode, "mode=%o"},
+	{Opt_newinstance, "newinstance"},
 	{Opt_err, NULL}
 };
 
 struct tracefs_fs_info {
 	struct tracefs_mount_opts mount_opts;
+	struct super_block *sb;
 };
 
-static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
+static inline struct tracefs_fs_info *TRACEFS_SB(struct super_block *sb)
+{
+	return sb->s_fs_info;
+}
+
+#define PARSE_MOUNT		0
+#define PARSE_REMOUNT		1
+
+static int tracefs_parse_options(char *data,
+				 int op,
+				 struct tracefs_mount_opts *opts)
 {
 	substring_t args[MAX_OPT_ARGS];
 	int option;
@@ -173,6 +187,10 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
 
 	opts->mode = TRACEFS_DEFAULT_MODE;
 
+	/* newinstance makes sense only on initial mount */
+	if (op == PARSE_MOUNT)
+		opts->newinstance = 0;
+
 	while ((p = strsep(&data, ",")) != NULL) {
 		if (!*p)
 			continue;
@@ -200,6 +218,11 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
 				return -EINVAL;
 			opts->mode = option & S_IALLUGO;
 			break;
+		case Opt_newinstance:
+			/* newinstance makes sense only on initial mount */
+			if (op == PARSE_MOUNT)
+				opts->newinstance = 1;
+			break;
 		/*
 		 * We might like to report bad mount options here;
 		 * but traditionally tracefs has ignored all mount options
@@ -231,7 +254,7 @@ static int tracefs_remount(struct super_block *sb, int *flags, char *data)
 	struct tracefs_fs_info *fsi = sb->s_fs_info;
 
 	sync_filesystem(sb);
-	err = tracefs_parse_options(data, &fsi->mount_opts);
+	err = tracefs_parse_options(data, PARSE_REMOUNT, &fsi->mount_opts);
 	if (err)
 		goto fail;
 
@@ -254,6 +277,8 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
 			   from_kgid_munged(&init_user_ns, opts->gid));
 	if (opts->mode != TRACEFS_DEFAULT_MODE)
 		seq_printf(m, ",mode=%o", opts->mode);
+	if (opts->newinstance)
+		seq_puts(m, ",newinstance");
 
 	return 0;
 }
@@ -264,53 +289,130 @@ static const struct super_operations tracefs_super_operations = {
 	.show_options	= tracefs_show_options,
 };
 
-static int trace_fill_super(struct super_block *sb, void *data, int silent)
+static void *new_tracefs_fs_info(struct super_block *sb)
 {
-	static struct tree_descr trace_files[] = {{""}};
 	struct tracefs_fs_info *fsi;
-	int err;
-
-	save_mount_options(sb, data);
 
 	fsi = kzalloc(sizeof(struct tracefs_fs_info), GFP_KERNEL);
-	sb->s_fs_info = fsi;
-	if (!fsi) {
-		err = -ENOMEM;
-		goto fail;
-	}
+	if (!fsi)
+		return NULL;
 
-	err = tracefs_parse_options(data, &fsi->mount_opts);
-	if (err)
+	fsi->mount_opts.mode = TRACEFS_DEFAULT_MODE;
+	fsi->sb = sb;
+
+	return fsi;
+}
+
+static int trace_fill_super(struct super_block *sb, void *data, int silent)
+{
+	struct inode *inode;
+
+	sb->s_blocksize = PAGE_SIZE;
+	sb->s_blocksize_bits = PAGE_SHIFT;
+	sb->s_magic = TRACEFS_MAGIC;
+	sb->s_op = &tracefs_super_operations;
+	sb->s_time_gran = 1;
+
+	sb->s_fs_info = new_tracefs_fs_info(sb);
+	if (!sb->s_fs_info)
 		goto fail;
 
-	err  =  simple_fill_super(sb, TRACEFS_MAGIC, trace_files);
-	if (err)
+	inode = new_inode(sb);
+	if (!inode)
 		goto fail;
+	inode->i_ino = 1;
+	inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
+	inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO | S_IWUSR;
+	inode->i_op = &simple_dir_inode_operations;
+	inode->i_fop = &simple_dir_operations;
+	set_nlink(inode, 2);
 
-	sb->s_op = &tracefs_super_operations;
+	sb->s_root = d_make_root(inode);
+	if (sb->s_root)
+		return 0;
 
-	tracefs_apply_options(sb);
+	pr_err("get root dentry failed\n");
 
 	return 0;
 
 fail:
-	kfree(fsi);
-	sb->s_fs_info = NULL;
-	return err;
+	return -ENOMEM;
+}
+
+static int compare_init_tracefs_sb(struct super_block *s, void *p)
+{
+	if (tracefs_mount)
+		return tracefs_mount->mnt_sb == s;
+	return 0;
 }
 
 static struct dentry *trace_mount(struct file_system_type *fs_type,
 			int flags, const char *dev_name,
 			void *data)
 {
-	return mount_single(fs_type, flags, data, trace_fill_super);
+	int err;
+	struct tracefs_mount_opts opts;
+	struct super_block *s;
+
+	err = tracefs_parse_options(data, PARSE_MOUNT, &opts);
+	if (err)
+		return ERR_PTR(err);
+
+	/* Require newinstance for all user namespace mounts to ensure
+	 * the mount options are not changed.
+	 */
+	if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
+		return ERR_PTR(-EINVAL);
+
+	if (opts.newinstance)
+		s = sget(fs_type, NULL, set_anon_super, flags, NULL);
+	else
+		s = sget(fs_type, compare_init_tracefs_sb, set_anon_super,
+			 flags, NULL);
+
+	if (IS_ERR(s))
+		return ERR_CAST(s);
+
+	if (!s->s_root) {
+		err = trace_fill_super(s, data, flags & MS_SILENT ? 1 : 0);
+		if (err)
+			goto out_undo_sget;
+		s->s_flags |= MS_ACTIVE;
+	}
+
+	if (opts.newinstance) {
+		err = tracefs_instance_ops.mkdir(INSTANCE_MNT, s->s_root);
+		if (err)
+			goto out_undo_sget;
+	}
+
+	memcpy(&(TRACEFS_SB(s))->mount_opts, &opts, sizeof(opts));
+
+	tracefs_apply_options(s);
+
+	return dget(s->s_root);
+
+out_undo_sget:
+	deactivate_locked_super(s);
+	return ERR_PTR(err);
+}
+
+static void trace_kill_sb(struct super_block *sb)
+{
+	struct tracefs_fs_info *fsi = TRACEFS_SB(sb);
+
+	if (fsi->mount_opts.newinstance)
+		tracefs_instance_ops.rmdir(INSTANCE_MNT, sb->s_root);
+
+	kfree(fsi);
+	kill_litter_super(sb);
 }
 
 static struct file_system_type trace_fs_type = {
 	.owner =	THIS_MODULE,
 	.name =		"tracefs",
 	.mount =	trace_mount,
-	.kill_sb =	kill_litter_super,
+	.kill_sb =	trace_kill_sb,
 };
 MODULE_ALIAS_FS("tracefs");
 
@@ -480,22 +582,23 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent)
  *
  * Returns the dentry of the instances directory.
  */
-struct dentry *tracefs_create_instance_dir(const char *name, struct dentry *parent,
-					  int (*mkdir)(const char *name),
-					  int (*rmdir)(const char *name))
+struct dentry *
+tracefs_create_instance_dir(const char *name, struct dentry *parent,
+			    int (*mkdir)(int instance_type, void *data),
+			    int (*rmdir)(int instance_type, void *data))
 {
 	struct dentry *dentry;
 
 	/* Only allow one instance of the instances directory. */
-	if (WARN_ON(tracefs_ops.mkdir || tracefs_ops.rmdir))
+	if (WARN_ON(tracefs_instance_ops.mkdir || tracefs_instance_ops.rmdir))
 		return NULL;
 
 	dentry = __create_dir(name, parent, &tracefs_dir_inode_operations);
 	if (!dentry)
 		return NULL;
 
-	tracefs_ops.mkdir = mkdir;
-	tracefs_ops.rmdir = rmdir;
+	tracefs_instance_ops.mkdir = mkdir;
+	tracefs_instance_ops.rmdir = rmdir;
 
 	return dentry;
 }
diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
index 5b727a1..30d4e55 100644
--- a/include/linux/tracefs.h
+++ b/include/linux/tracefs.h
@@ -25,6 +25,10 @@ struct file_operations;
 
 #ifdef CONFIG_TRACING
 
+/* instance types */
+#define INSTANCE_DIR	0	/* created inside instances dir */
+#define INSTANCE_MNT	1	/* created with newinstance mount option */
+
 struct dentry *tracefs_create_file(const char *name, umode_t mode,
 				   struct dentry *parent, void *data,
 				   const struct file_operations *fops);
@@ -34,9 +38,10 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent);
 void tracefs_remove(struct dentry *dentry);
 void tracefs_remove_recursive(struct dentry *dentry);
 
-struct dentry *tracefs_create_instance_dir(const char *name, struct dentry *parent,
-					   int (*mkdir)(const char *name),
-					   int (*rmdir)(const char *name));
+struct dentry *
+tracefs_create_instance_dir(const char *name, struct dentry *parent,
+			    int (*mkdir)(int instance_type, void *data),
+			    int (*rmdir)(int instance_type, void *data));
 
 bool tracefs_initialized(void);
 
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 23a8111..a991e9d 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6782,17 +6782,24 @@ static void update_tracer_options(struct trace_array *tr)
 	mutex_unlock(&trace_types_lock);
 }
 
-static int instance_mkdir(const char *name)
+static int instance_mkdir(int instance_type, void *data)
 {
+	const char *name =  "tracing";
+	struct dentry *mnt_root = NULL;
 	struct trace_array *tr;
 	int ret;
 
 	mutex_lock(&trace_types_lock);
 
-	ret = -EEXIST;
-	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
-		if (tr->name && strcmp(tr->name, name) == 0)
-			goto out_unlock;
+	if (instance_type == INSTANCE_MNT)
+		mnt_root = data;
+	else {
+		name = data;
+		ret = -EEXIST;
+		list_for_each_entry(tr, &ftrace_trace_arrays, list) {
+			if (tr->name && strcmp(tr->name, name) == 0)
+				goto out_unlock;
+		}
 	}
 
 	ret = -ENOMEM;
@@ -6823,9 +6830,14 @@ static int instance_mkdir(const char *name)
 	if (allocate_trace_buffers(tr, trace_buf_size) < 0)
 		goto out_free_tr;
 
-	tr->dir = tracefs_create_dir(name, trace_instance_dir);
-	if (!tr->dir)
-		goto out_free_tr;
+	if (instance_type == INSTANCE_MNT) {
+		mnt_root->d_inode->i_private = tr;
+		tr->dir = mnt_root;
+	} else {
+		tr->dir = tracefs_create_dir(name, trace_instance_dir);
+		if (!tr->dir)
+			goto out_free_tr;
+	}
 
 	ret = event_trace_add_tracer(tr->dir, tr);
 	if (ret) {
@@ -6856,8 +6868,10 @@ static int instance_mkdir(const char *name)
 
 }
 
-static int instance_rmdir(const char *name)
+static int instance_rmdir(int instance_type, void *data)
 {
+	const char *name =  "tracing";
+	struct dentry *mnt_root = NULL;
 	struct trace_array *tr;
 	int found = 0;
 	int ret;
@@ -6865,15 +6879,21 @@ static int instance_rmdir(const char *name)
 
 	mutex_lock(&trace_types_lock);
 
-	ret = -ENODEV;
-	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
-		if (tr->name && strcmp(tr->name, name) == 0) {
-			found = 1;
-			break;
+	if (instance_type == INSTANCE_MNT) {
+		mnt_root = data;
+		tr = mnt_root->d_inode->i_private;
+	} else {
+		name = data;
+		ret = -ENODEV;
+		list_for_each_entry(tr, &ftrace_trace_arrays, list) {
+			if (tr->name && strcmp(tr->name, name) == 0) {
+				found = 1;
+				break;
+			}
 		}
+		if (!found)
+			goto out_unlock;
 	}
-	if (!found)
-		goto out_unlock;
 
 	ret = -EBUSY;
 	if (tr->ref || (tr->current_trace && tr->current_trace->ref))

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-07-27 21:27 ` [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events Hari Bathini
@ 2016-08-01 21:45   ` Steven Rostedt
  2016-08-02 17:27     ` Hari Bathini
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2016-08-01 21:45 UTC (permalink / raw)
  To: Hari Bathini
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, viro, aravinda, ananth

On Thu, 28 Jul 2016 02:57:38 +0530
Hari Bathini <hbathini@linux.vnet.ibm.com> wrote:

> If a uprobe event is set on a library function, and if a similar uprobe
> event trace is needed for a container, a duplicate is created leaving
> the uprobe list with multiple entries of the same function:
> 
>   $ perf probe --list
>     probe_libc:malloc    (on 0x80490 in /lib64/libc.so.6)
>     probe_libc:malloc_1  (on __libc_malloc in /lib64/libc.so.6)
>   $
> 
> This can soon get out of hand if multiple containers want to probe the
> same function/address in their libraries. This patch tries to resolve this
> by adding uprobe event trace files to every new instance. Currently, perf
> tool can leverage this by using --debugfs-dir option - something like
> (assuming instance dir name is 'tracing'):
> 
>   $ perf --debugfs-dir=$MOUNT_PNT/instances probe /lib64/libc.so.6 malloc
>   $
>   $
>   $ perf --debugfs-dir=$MOUNT_PNT/instances probe --list
>     probe_libc:malloc    (on __libc_malloc in /lib64/libc.so.6)
>   $
> 
> New uprobe events can be added to the uprobe_events file under the instance
> directory and the profile information for these events will be available in
> uprobe_profile file in the same instance directory.

Hmm, this does change the behavior of normal instances.

# cd /sys/kernel/debug/tracing
# echo 'p /bin/bash:0x41adf0' > uprobe_events
# ls events/uprobes
enable filter p_bash_0x41adf0

# mkdir instances/foo
# ls instances/foo/events/uprobes
ls: cannot access instances/foo/events/uprobes: No such file or directory

Usually, instances will have the same events as the top level
directory. This will make uprobes, and only uprobes different. I'm not
sure if this is a bad thing or not, I'll have to think about it more.
But what would it take to have this only differ for containers, and not
normal instances?

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-01 21:45   ` Steven Rostedt
@ 2016-08-02 17:27     ` Hari Bathini
  2016-08-02 17:32       ` Hari Bathini
  2016-08-02 17:49       ` Steven Rostedt
  0 siblings, 2 replies; 26+ messages in thread
From: Hari Bathini @ 2016-08-02 17:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, viro, aravinda, ananth

Hi Steve,


Thanks for the review


On Tuesday 02 August 2016 03:15 AM, Steven Rostedt wrote:
> On Thu, 28 Jul 2016 02:57:38 +0530
> Hari Bathini <hbathini@linux.vnet.ibm.com> wrote:
>
>> If a uprobe event is set on a library function, and if a similar uprobe
>> event trace is needed for a container, a duplicate is created leaving
>> the uprobe list with multiple entries of the same function:
>>
>>    $ perf probe --list
>>      probe_libc:malloc    (on 0x80490 in /lib64/libc.so.6)
>>      probe_libc:malloc_1  (on __libc_malloc in /lib64/libc.so.6)
>>    $
>>
>> This can soon get out of hand if multiple containers want to probe the
>> same function/address in their libraries. This patch tries to resolve this
>> by adding uprobe event trace files to every new instance. Currently, perf
>> tool can leverage this by using --debugfs-dir option - something like
>> (assuming instance dir name is 'tracing'):
>>
>>    $ perf --debugfs-dir=$MOUNT_PNT/instances probe /lib64/libc.so.6 malloc
>>    $
>>    $
>>    $ perf --debugfs-dir=$MOUNT_PNT/instances probe --list
>>      probe_libc:malloc    (on __libc_malloc in /lib64/libc.so.6)
>>    $
>>
>> New uprobe events can be added to the uprobe_events file under the instance
>> directory and the profile information for these events will be available in
>> uprobe_profile file in the same instance directory.
> Hmm, this does change the behavior of normal instances.
>
> # cd /sys/kernel/debug/tracing
> # echo 'p /bin/bash:0x41adf0' > uprobe_events
> # ls events/uprobes
> enable filter p_bash_0x41adf0
>
> # mkdir instances/foo
> # ls instances/foo/events/uprobes
> ls: cannot access instances/foo/events/uprobes: No such file or directory
>
> Usually, instances will have the same events as the top level
> directory. This will make uprobes, and only uprobes different. I'm not
> sure if this is a bad thing or not, I'll have to think about it more.

Hmmm. I think making uprobes an exception is worth considering.

> But what would it take to have this only differ for containers, and not
> normal instances?

With the current approach, instances created in instances directory and
the ones created with newinstance mount option (patch 3 of 3) are similar.
Each instance corresponds to a trace_array structure.
An alternate approach I could think of is something like below:

struct trace_instance {
     struct trace_array tr;
     struct mutex uprobe_lock;
     struct list_head uprobe_list;
     /* any other new data specific to a mount instance */
};

where a mountable instance is more than a trace array.
This may need addition of new flags for trace array saying
whether it is a global trace or directory instance or mountable instance.
Also, the helper functions that add/remove events need to be tweaked 
accordingly.
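
A minimal userspace sketch of the idea above, assuming the kind of
instance is recorded in the trace array itself. The enum
trace_array_kind, its values, and the helper instance_uprobe_list() are
hypothetical names for illustration; struct trace_array is reduced to a
stand-in, and the uprobe_lock mutex is left out so the sketch compiles
outside the kernel:

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Same pointer arithmetic as the kernel's container_of(), minus type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical flag values; the proposal only says "new flags". */
enum trace_array_kind {
	TRACE_ARRAY_GLOBAL,	/* top-level tracing directory */
	TRACE_ARRAY_DIR,	/* created under instances/ */
	TRACE_ARRAY_MNT,	/* created by a newinstance mount */
};

/* Heavily reduced stand-in for the kernel's struct trace_array. */
struct trace_array {
	const char *name;
	enum trace_array_kind kind;
};

/* A mountable instance: a trace array plus per-mount uprobe state. */
struct trace_instance {
	struct trace_array tr;
	struct list_head uprobe_list;
};

/*
 * Return the per-mount uprobe list for a mountable instance, or NULL
 * for the global trace array and for directory instances, which would
 * keep using the shared uprobe list.
 */
static struct list_head *instance_uprobe_list(struct trace_array *tr)
{
	if (tr->kind != TRACE_ARRAY_MNT)
		return NULL;
	return &container_of(tr, struct trace_instance, tr)->uprobe_list;
}
```

With a helper of this shape, the code that adds or removes uprobe events
could choose the list from the trace array it was handed, so directory
instances would behave as they do today.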


Thanks
Hari

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-02 17:27     ` Hari Bathini
@ 2016-08-02 17:32       ` Hari Bathini
  2016-08-02 17:49       ` Steven Rostedt
  1 sibling, 0 replies; 26+ messages in thread
From: Hari Bathini @ 2016-08-02 17:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, viro, aravinda, ananth



On Tuesday 02 August 2016 10:57 PM, Hari Bathini wrote:
> Hi Steve,
>
>
> Thanks for the review
>
>
> On Tuesday 02 August 2016 03:15 AM, Steven Rostedt wrote:
>> On Thu, 28 Jul 2016 02:57:38 +0530
>> Hari Bathini <hbathini@linux.vnet.ibm.com> wrote:
>>
>>> If a uprobe event is set on a library function, and if a similar uprobe
>>> event trace is needed for a container, a duplicate is created leaving
>>> the uprobe list with multiple entries of the same function:
>>>
>>>    $ perf probe --list
>>>      probe_libc:malloc    (on 0x80490 in /lib64/libc.so.6)
>>>      probe_libc:malloc_1  (on __libc_malloc in /lib64/libc.so.6)
>>>    $
>>>
>>> This can soon get out of hand if multiple containers want to probe the
>>> same function/address in their libraries. This patch tries to 
>>> resolve this
>>> by adding uprobe event trace files to every new instance. Currently, 
>>> perf
>>> tool can leverage this by using --debugfs-dir option - something like
>>> (assuming instance dir name is 'tracing'):
>>>
>>>    $ perf --debugfs-dir=$MOUNT_PNT/instances probe /lib64/libc.so.6 
>>> malloc
>>>    $
>>>    $
>>>    $ perf --debugfs-dir=$MOUNT_PNT/instances probe --list
>>>      probe_libc:malloc    (on __libc_malloc in /lib64/libc.so.6)
>>>    $
>>>
>>> New uprobe events can be added to the uprobe_events file under the 
>>> instance
>>> directory and the profile information for these events will be 
>>> available in
>>> uprobe_profile file in the same instance directory.
>> Hmm, this does change the behavior of normal instances.
>>
>> # cd /sys/kernel/debug/tracing
>> # echo 'p /bin/bash:0x41adf0' > uprobe_events
>> # ls events/uprobes
>> enable filter p_bash_0x41adf0
>>
>> # mkdir instances/foo
>> # ls instances/foo/events/uprobes
>> ls: cannot access instances/foo/events/uprobes: No such file or 
>> directory
>>
>> Usually, instances will have the same events as the top level
>> directory. This will make uprobes, and only uprobes different. I'm not
>> sure if this is a bad thing or not, I'll have to think about it more.
>
> Hmmm. I think making uprobes an exception is worth considering.
>
>> But what would it take to have this only differ for containers, and not
>> normal instances?
>
> With the current approach, instances created in instances directory and
> the ones created with newinstance mount option (patch 3 of 3) are 
> similar.
> Each instance corresponds to a trace_array structure.
> An alternate approach I could think of is something like below:
>
> struct trace_instance {
>     struct trace_array tr;
>     struct mutex uprobe_lock;
>     struct list_head uprobe_list;
>     /* any other new data specific to a mount instance */
> };
>
> where a mountable instance is more than a trace array.
> This may need addition of new flags for trace array saying
> whether it is a global trace or directory instance or mountable instance.
> Also, the helper functions that add/remove events need to be tweaked 
> accordingly.
>

... and the mountable instance can be used for containers without changing
the behavior of directory instances.

Thanks
Hari

>
> Thanks
> Hari

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-02 17:27     ` Hari Bathini
  2016-08-02 17:32       ` Hari Bathini
@ 2016-08-02 17:49       ` Steven Rostedt
  2016-08-03 19:30         ` Aravinda Prasad
  1 sibling, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2016-08-02 17:49 UTC (permalink / raw)
  To: Hari Bathini
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, viro, aravinda, ananth

On Tue, 2 Aug 2016 22:57:30 +0530
Hari Bathini <hbathini@linux.vnet.ibm.com> wrote:

> where a mountable instance is more than a trace array.
> This may need addition of new flags for trace array saying
> whether it is a global trace or directory instance or mountable instance.
> Also, the helper functions that add/remove events need to be tweaked 
> accordingly.

BTW, I'm curious as to how you handle the rest of the trace files in a
container? The tracing system really looks at the Linux kernel as a
whole, and for the most part ignores things like name spaces.

Can a container have its own function tracing?

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-02 17:49       ` Steven Rostedt
@ 2016-08-03 19:30         ` Aravinda Prasad
  2016-08-03 20:10           ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Aravinda Prasad @ 2016-08-03 19:30 UTC (permalink / raw)
  To: Steven Rostedt, Hari Bathini
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, ebiederm, kernel, viro, ananth



On Tuesday 02 August 2016 11:19 PM, Steven Rostedt wrote:
> On Tue, 2 Aug 2016 22:57:30 +0530
> Hari Bathini <hbathini@linux.vnet.ibm.com> wrote:
> 
>> where a mountable instance is more than a trace array.
>> This may need addition of new flags for trace array saying
>> whether it is a global trace or directory instance or mountable instance.
>> Also, the helper functions that add/remove events need to be tweaked 
>> accordingly.
> 
> BTW, I'm curious as to how you handle the rest of the trace files in a
> container? The tracing system really looks at the Linux kernel as a
> whole, and for the most part ignores things like name spaces.

We started by trying to support perf inside a container and currently we
are exploring approaches to support function tracing inside a container.

One approach that we are working on is along the lines of patch 1/3:
for example, filtering events based on the namespace in which the trace
file is read. We are still trying to understand the ftrace implementation,
so we are not yet sure whether this is feasible.
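
To illustrate the filtering idea only, here is a rough userspace
sketch. It assumes each recorded event carries an identifier for the
namespace of the task that generated it; struct trace_event_rec, its
ns_id field, and filter_events_by_ns() are hypothetical and do not
correspond to existing ftrace structures:

```c
#include <stddef.h>

/* Hypothetical event record; the ns_id field is an assumption of this sketch. */
struct trace_event_rec {
	unsigned int ns_id;	/* namespace of the task that generated the event */
	const char *msg;
};

/*
 * Collect pointers to the events whose namespace matches the reader's.
 * Returns the number of events stored in out[] (at most out_len).
 */
static size_t filter_events_by_ns(const struct trace_event_rec *in, size_t n,
				  unsigned int reader_ns,
				  const struct trace_event_rec **out,
				  size_t out_len)
{
	size_t i, kept = 0;

	for (i = 0; i < n && kept < out_len; i++) {
		if (in[i].ns_id == reader_ns)
			out[kept++] = &in[i];
	}
	return kept;
}
```

The filter runs when the events are read rather than when they are
recorded, which matches "the namespace in which the trace file is read".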

We would be happy to explore if you have any suggestions/feedback on
supporting function tracing inside a container.

> 
> Can a container have its own function tracing?

Sorry, I didn't understand that. Do you mean having separate
per-container trace files?

Regards,
Aravinda

> 
> -- Steve
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-03 19:30         ` Aravinda Prasad
@ 2016-08-03 20:10           ` Steven Rostedt
  2016-08-03 20:16             ` Aravinda Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2016-08-03 20:10 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, ebiederm, kernel, viro,
	ananth

On Thu, 4 Aug 2016 01:00:51 +0530
Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:

> 
> > Can a container have its own function tracing?  
> 
> Sorry, I didn't understand that. Do you mean having separate
> per-container trace files?

Actually, it's more my ignorance of containers, as I haven't had the
need to play with them. Although, I think it may be time to do so.

When a container enters kernel mode, I'm assuming that it's part of the
host at that moment, and the host needs to take care of separating
everything? That is, there's not a "second kernel" like VMs have, right?

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-03 20:10           ` Steven Rostedt
@ 2016-08-03 20:16             ` Aravinda Prasad
  2016-08-04  1:04               ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Aravinda Prasad @ 2016-08-03 20:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, ebiederm, kernel, viro,
	ananth



On Thursday 04 August 2016 01:40 AM, Steven Rostedt wrote:
> On Thu, 4 Aug 2016 01:00:51 +0530
> Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:
> 
>>
>>> Can a container have its own function tracing?  
>>
>> Sorry, I didn't understand that. Do you mean having separate
>> per-container trace files?
> 
> Actually, it's more my ignorance of containers, as I haven't had the
> need to play with them. Although, I think it may be time to do so.
> 
> When a container enters kernel mode, I'm assuming that it's part of the
> host at that moment, and the host needs to take care of separating
> everything? That is, there's not a "second kernel" like VMs have, right?

Yes. The host needs to take care of separating everything. There is no
"second kernel".

> 
> -- Steve
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-03 20:16             ` Aravinda Prasad
@ 2016-08-04  1:04               ` Steven Rostedt
  2016-08-04 13:46                 ` Aravinda Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2016-08-04  1:04 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, ebiederm, kernel, viro,
	ananth

On Thu, 4 Aug 2016 01:46:04 +0530
Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:

> On Thursday 04 August 2016 01:40 AM, Steven Rostedt wrote:
> > On Thu, 4 Aug 2016 01:00:51 +0530
> > Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:
> >   
> >>  
> >>> Can a container have its own function tracing?    
> >>
> >> Sorry, I didn't understand that. Do you mean having separate
> >> per-container trace files?
> > 
> > Actually, it's more my ignorance of containers, as I haven't had the
> > need to play with them. Although, I think it may be time to do so.
> > 
> > When a container enters kernel mode, I'm assuming that it's part of the
> > host at that moment, and the host needs to take care of separating
> > everything? That is, there's not a "second kernel" like VMs have, right?  
> 
> Yes. The host needs to take care of separating everything. There is no
> "second kernel".

That's what I figured. Thus, my worry is that something like the
function tracer can leak information to a container. How would
you separate functions for the container from functions for the host?

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 3/3] tracefs: add 'newinstance' mount option
  2016-07-27 21:27 ` [RFC PATCH v2 3/3] tracefs: add 'newinstance' mount option Hari Bathini
@ 2016-08-04  2:54   ` Eric W. Biederman
  2016-08-04 12:26     ` Hari Bathini
  0 siblings, 1 reply; 26+ messages in thread
From: Eric W. Biederman @ 2016-08-04  2:54 UTC (permalink / raw)
  To: Hari Bathini
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, kernel, rostedt, viro, aravinda, ananth

Hari Bathini <hbathini@linux.vnet.ibm.com> writes:

> When tracefs is mounted inside a container, its files are visible to
> all containers. This implies that a user from within a container can
> list/delete uprobes registered elsewhere, leading to security issues
> and/or denial of service (Eg. deleting a probe that is registered from
> elsewhere). This patch addresses this problem by adding mount option
> 'newinstance', allowing containers to have their own instance mounted
> separately. Something like the below from within a container:

newinstance is an anti-pattern in devpts and should not be copied.
To fix some severe defects of devpts we had to always create new
instances and the code and the testing to make that all work was
not pleasant.  Please don't add another option that we will just have to
make redundant later.

Eric


>   $ mount -o newinstance -t tracefs tracefs /sys/kernel/tracing
>   $
>   $
>   $ perf probe /lib/x86_64-linux-gnu/libc.so.6 malloc
>   Added new event:
>     probe_libc:malloc    (on malloc in /lib/x86_64-linux-gnu/libc.so.6)
>
>   You can now use it in all perf tools, such as:
>
>   	perf record -e probe_libc:malloc -aR sleep 1
>
>   $
>   $
>   $ perf probe --list
>     probe_libc:malloc    (on __libc_malloc in /lib64/libc.so.6)
>   $
>
> while another container/host has a completely different view:
>
>
>   $ perf probe --list
>     probe_libc:memset    (on __libc_memset in /lib64/libc.so.6)
>   $
>
> This patch reuses the code that provides support to create new instances
> under tracefs instances directory.
>
> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
> ---
>  fs/tracefs/inode.c      |  171 ++++++++++++++++++++++++++++++++++++++---------
>  include/linux/tracefs.h |   11 ++-
>  kernel/trace/trace.c    |   52 ++++++++++----
>  3 files changed, 181 insertions(+), 53 deletions(-)
>
> diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
> index 4a0e48f..2d6acda 100644
> --- a/fs/tracefs/inode.c
> +++ b/fs/tracefs/inode.c
> @@ -51,9 +51,9 @@ static const struct file_operations tracefs_file_operations = {
>  };
>  
>  static struct tracefs_dir_ops {
> -	int (*mkdir)(const char *name);
> -	int (*rmdir)(const char *name);
> -} tracefs_ops;
> +	int (*mkdir)(int instance_type, void *data);
> +	int (*rmdir)(int instance_type, void *data);
> +} tracefs_instance_ops;
>  
>  static char *get_dname(struct dentry *dentry)
>  {
> @@ -85,7 +85,7 @@ static int tracefs_syscall_mkdir(struct inode *inode, struct dentry *dentry, umo
>  	 * mkdir routine to handle races.
>  	 */
>  	inode_unlock(inode);
> -	ret = tracefs_ops.mkdir(name);
> +	ret = tracefs_instance_ops.mkdir(INSTANCE_DIR, name);
>  	inode_lock(inode);
>  
>  	kfree(name);
> @@ -112,7 +112,7 @@ static int tracefs_syscall_rmdir(struct inode *inode, struct dentry *dentry)
>  	inode_unlock(inode);
>  	inode_unlock(dentry->d_inode);
>  
> -	ret = tracefs_ops.rmdir(name);
> +	ret = tracefs_instance_ops.rmdir(INSTANCE_DIR, name);
>  
>  	inode_lock_nested(inode, I_MUTEX_PARENT);
>  	inode_lock(dentry->d_inode);
> @@ -142,12 +142,14 @@ struct tracefs_mount_opts {
>  	kuid_t uid;
>  	kgid_t gid;
>  	umode_t mode;
> +	int newinstance;
>  };
>  
>  enum {
>  	Opt_uid,
>  	Opt_gid,
>  	Opt_mode,
> +	Opt_newinstance,
>  	Opt_err
>  };
>  
> @@ -155,14 +157,26 @@ static const match_table_t tokens = {
>  	{Opt_uid, "uid=%u"},
>  	{Opt_gid, "gid=%u"},
>  	{Opt_mode, "mode=%o"},
> +	{Opt_newinstance, "newinstance"},
>  	{Opt_err, NULL}
>  };
>  
>  struct tracefs_fs_info {
>  	struct tracefs_mount_opts mount_opts;
> +	struct super_block *sb;
>  };
>  
> -static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
> +static inline struct tracefs_fs_info *TRACEFS_SB(struct super_block *sb)
> +{
> +	return sb->s_fs_info;
> +}
> +
> +#define PARSE_MOUNT		0
> +#define PARSE_REMOUNT		1
> +
> +static int tracefs_parse_options(char *data,
> +				 int op,
> +				 struct tracefs_mount_opts *opts)
>  {
>  	substring_t args[MAX_OPT_ARGS];
>  	int option;
> @@ -173,6 +187,10 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
>  
>  	opts->mode = TRACEFS_DEFAULT_MODE;
>  
> +	/* newinstance makes sense only on initial mount */
> +	if (op == PARSE_MOUNT)
> +		opts->newinstance = 0;
> +
>  	while ((p = strsep(&data, ",")) != NULL) {
>  		if (!*p)
>  			continue;
> @@ -200,6 +218,11 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
>  				return -EINVAL;
>  			opts->mode = option & S_IALLUGO;
>  			break;
> +		case Opt_newinstance:
> +			/* newinstance makes sense only on initial mount */
> +			if (op == PARSE_MOUNT)
> +				opts->newinstance = 1;
> +			break;
>  		/*
>  		 * We might like to report bad mount options here;
>  		 * but traditionally tracefs has ignored all mount options
> @@ -231,7 +254,7 @@ static int tracefs_remount(struct super_block *sb, int *flags, char *data)
>  	struct tracefs_fs_info *fsi = sb->s_fs_info;
>  
>  	sync_filesystem(sb);
> -	err = tracefs_parse_options(data, &fsi->mount_opts);
> +	err = tracefs_parse_options(data, PARSE_REMOUNT, &fsi->mount_opts);
>  	if (err)
>  		goto fail;
>  
> @@ -254,6 +277,8 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
>  			   from_kgid_munged(&init_user_ns, opts->gid));
>  	if (opts->mode != TRACEFS_DEFAULT_MODE)
>  		seq_printf(m, ",mode=%o", opts->mode);
> +	if (opts->newinstance)
> +		seq_puts(m, ",newinstance");
>  
>  	return 0;
>  }
> @@ -264,53 +289,130 @@ static const struct super_operations tracefs_super_operations = {
>  	.show_options	= tracefs_show_options,
>  };
>  
> -static int trace_fill_super(struct super_block *sb, void *data, int silent)
> +static void *new_tracefs_fs_info(struct super_block *sb)
>  {
> -	static struct tree_descr trace_files[] = {{""}};
>  	struct tracefs_fs_info *fsi;
> -	int err;
> -
> -	save_mount_options(sb, data);
>  
>  	fsi = kzalloc(sizeof(struct tracefs_fs_info), GFP_KERNEL);
> -	sb->s_fs_info = fsi;
> -	if (!fsi) {
> -		err = -ENOMEM;
> -		goto fail;
> -	}
> +	if (!fsi)
> +		return NULL;
>  
> -	err = tracefs_parse_options(data, &fsi->mount_opts);
> -	if (err)
> +	fsi->mount_opts.mode = TRACEFS_DEFAULT_MODE;
> +	fsi->sb = sb;
> +
> +	return fsi;
> +}
> +
> +static int trace_fill_super(struct super_block *sb, void *data, int silent)
> +{
> +	struct inode *inode;
> +
> +	sb->s_blocksize = PAGE_SIZE;
> +	sb->s_blocksize_bits = PAGE_SHIFT;
> +	sb->s_magic = TRACEFS_MAGIC;
> +	sb->s_op = &tracefs_super_operations;
> +	sb->s_time_gran = 1;
> +
> +	sb->s_fs_info = new_tracefs_fs_info(sb);
> +	if (!sb->s_fs_info)
>  		goto fail;
>  
> -	err  =  simple_fill_super(sb, TRACEFS_MAGIC, trace_files);
> -	if (err)
> +	inode = new_inode(sb);
> +	if (!inode)
>  		goto fail;
> +	inode->i_ino = 1;
> +	inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
> +	inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO | S_IWUSR;
> +	inode->i_op = &simple_dir_inode_operations;
> +	inode->i_fop = &simple_dir_operations;
> +	set_nlink(inode, 2);
>  
> -	sb->s_op = &tracefs_super_operations;
> +	sb->s_root = d_make_root(inode);
> +	if (sb->s_root)
> +		return 0;
>  
> -	tracefs_apply_options(sb);
> +	pr_err("get root dentry failed\n");
>  
>  	return 0;
>  
>  fail:
> -	kfree(fsi);
> -	sb->s_fs_info = NULL;
> -	return err;
> +	return -ENOMEM;
> +}
> +
> +static int compare_init_tracefs_sb(struct super_block *s, void *p)
> +{
> +	if (tracefs_mount)
> +		return tracefs_mount->mnt_sb == s;
> +	return 0;
>  }
>  
>  static struct dentry *trace_mount(struct file_system_type *fs_type,
>  			int flags, const char *dev_name,
>  			void *data)
>  {
> -	return mount_single(fs_type, flags, data, trace_fill_super);
> +	int err;
> +	struct tracefs_mount_opts opts;
> +	struct super_block *s;
> +
> +	err = tracefs_parse_options(data, PARSE_MOUNT, &opts);
> +	if (err)
> +		return ERR_PTR(err);
> +
> +	/* Require newinstance for all user namespace mounts to ensure
> +	 * the mount options are not changed.
> +	 */
> +	if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
> +		return ERR_PTR(-EINVAL);
> +
> +	if (opts.newinstance)
> +		s = sget(fs_type, NULL, set_anon_super, flags, NULL);
> +	else
> +		s = sget(fs_type, compare_init_tracefs_sb, set_anon_super,
> +			 flags, NULL);
> +
> +	if (IS_ERR(s))
> +		return ERR_CAST(s);
> +
> +	if (!s->s_root) {
> +		err = trace_fill_super(s, data, flags & MS_SILENT ? 1 : 0);
> +		if (err)
> +			goto out_undo_sget;
> +		s->s_flags |= MS_ACTIVE;
> +	}
> +
> +	if (opts.newinstance) {
> +		err = tracefs_instance_ops.mkdir(INSTANCE_MNT, s->s_root);
> +		if (err)
> +			goto out_undo_sget;
> +	}
> +
> +	memcpy(&(TRACEFS_SB(s))->mount_opts, &opts, sizeof(opts));
> +
> +	tracefs_apply_options(s);
> +
> +	return dget(s->s_root);
> +
> +out_undo_sget:
> +	deactivate_locked_super(s);
> +	return ERR_PTR(err);
> +}
> +
> +static void trace_kill_sb(struct super_block *sb)
> +{
> +	struct tracefs_fs_info *fsi = TRACEFS_SB(sb);
> +
> +	if (fsi->mount_opts.newinstance)
> +		tracefs_instance_ops.rmdir(INSTANCE_MNT, sb->s_root);
> +
> +	kfree(fsi);
> +	kill_litter_super(sb);
>  }
>  
>  static struct file_system_type trace_fs_type = {
>  	.owner =	THIS_MODULE,
>  	.name =		"tracefs",
>  	.mount =	trace_mount,
> -	.kill_sb =	kill_litter_super,
> +	.kill_sb =	trace_kill_sb,
>  };
>  MODULE_ALIAS_FS("tracefs");
>  
> @@ -480,22 +582,23 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent)
>   *
>   * Returns the dentry of the instances directory.
>   */
> -struct dentry *tracefs_create_instance_dir(const char *name, struct dentry *parent,
> -					  int (*mkdir)(const char *name),
> -					  int (*rmdir)(const char *name))
> +struct dentry *
> +tracefs_create_instance_dir(const char *name, struct dentry *parent,
> +			    int (*mkdir)(int instance_type, void *data),
> +			    int (*rmdir)(int instance_type, void *data))
>  {
>  	struct dentry *dentry;
>  
>  	/* Only allow one instance of the instances directory. */
> -	if (WARN_ON(tracefs_ops.mkdir || tracefs_ops.rmdir))
> +	if (WARN_ON(tracefs_instance_ops.mkdir || tracefs_instance_ops.rmdir))
>  		return NULL;
>  
>  	dentry = __create_dir(name, parent, &tracefs_dir_inode_operations);
>  	if (!dentry)
>  		return NULL;
>  
> -	tracefs_ops.mkdir = mkdir;
> -	tracefs_ops.rmdir = rmdir;
> +	tracefs_instance_ops.mkdir = mkdir;
> +	tracefs_instance_ops.rmdir = rmdir;
>  
>  	return dentry;
>  }
> diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
> index 5b727a1..30d4e55 100644
> --- a/include/linux/tracefs.h
> +++ b/include/linux/tracefs.h
> @@ -25,6 +25,10 @@ struct file_operations;
>  
>  #ifdef CONFIG_TRACING
>  
> +/* instance types */
> +#define INSTANCE_DIR	0	/* created inside instances dir */
> +#define INSTANCE_MNT	1	/* created with newinstance mount option */
> +
>  struct dentry *tracefs_create_file(const char *name, umode_t mode,
>  				   struct dentry *parent, void *data,
>  				   const struct file_operations *fops);
> @@ -34,9 +38,10 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent);
>  void tracefs_remove(struct dentry *dentry);
>  void tracefs_remove_recursive(struct dentry *dentry);
>  
> -struct dentry *tracefs_create_instance_dir(const char *name, struct dentry *parent,
> -					   int (*mkdir)(const char *name),
> -					   int (*rmdir)(const char *name));
> +struct dentry *
> +tracefs_create_instance_dir(const char *name, struct dentry *parent,
> +			    int (*mkdir)(int instance_type, void *data),
> +			    int (*rmdir)(int instance_type, void *data));
>  
>  bool tracefs_initialized(void);
>  
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 23a8111..a991e9d 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -6782,17 +6782,24 @@ static void update_tracer_options(struct trace_array *tr)
>  	mutex_unlock(&trace_types_lock);
>  }
>  
> -static int instance_mkdir(const char *name)
> +static int instance_mkdir(int instance_type, void *data)
>  {
> +	const char *name =  "tracing";
> +	struct dentry *mnt_root = NULL;
>  	struct trace_array *tr;
>  	int ret;
>  
>  	mutex_lock(&trace_types_lock);
>  
> -	ret = -EEXIST;
> -	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
> -		if (tr->name && strcmp(tr->name, name) == 0)
> -			goto out_unlock;
> +	if (instance_type == INSTANCE_MNT)
> +		mnt_root = data;
> +	else {
> +		name = data;
> +		ret = -EEXIST;
> +		list_for_each_entry(tr, &ftrace_trace_arrays, list) {
> +			if (tr->name && strcmp(tr->name, name) == 0)
> +				goto out_unlock;
> +		}
>  	}
>  
>  	ret = -ENOMEM;
> @@ -6823,9 +6830,14 @@ static int instance_mkdir(const char *name)
>  	if (allocate_trace_buffers(tr, trace_buf_size) < 0)
>  		goto out_free_tr;
>  
> -	tr->dir = tracefs_create_dir(name, trace_instance_dir);
> -	if (!tr->dir)
> -		goto out_free_tr;
> +	if (instance_type == INSTANCE_MNT) {
> +		mnt_root->d_inode->i_private = tr;
> +		tr->dir = mnt_root;
> +	} else {
> +		tr->dir = tracefs_create_dir(name, trace_instance_dir);
> +		if (!tr->dir)
> +			goto out_free_tr;
> +	}
>  
>  	ret = event_trace_add_tracer(tr->dir, tr);
>  	if (ret) {
> @@ -6856,8 +6868,10 @@ static int instance_mkdir(const char *name)
>  
>  }
>  
> -static int instance_rmdir(const char *name)
> +static int instance_rmdir(int instance_type, void *data)
>  {
> +	const char *name =  "tracing";
> +	struct dentry *mnt_root = NULL;
>  	struct trace_array *tr;
>  	int found = 0;
>  	int ret;
> @@ -6865,15 +6879,21 @@ static int instance_rmdir(const char *name)
>  
>  	mutex_lock(&trace_types_lock);
>  
> -	ret = -ENODEV;
> -	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
> -		if (tr->name && strcmp(tr->name, name) == 0) {
> -			found = 1;
> -			break;
> +	if (instance_type == INSTANCE_MNT) {
> +		mnt_root = data;
> +		tr = mnt_root->d_inode->i_private;
> +	} else {
> +		name = data;
> +		ret = -ENODEV;
> +		list_for_each_entry(tr, &ftrace_trace_arrays, list) {
> +			if (tr->name && strcmp(tr->name, name) == 0) {
> +				found = 1;
> +				break;
> +			}
>  		}
> +		if (!found)
> +			goto out_unlock;
>  	}
> -	if (!found)
> -		goto out_unlock;
>  
>  	ret = -EBUSY;
>  	if (tr->ref || (tr->current_trace && tr->current_trace->ref))

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support
@ 2016-08-04  2:59     ` Eric W. Biederman
  0 siblings, 0 replies; 26+ messages in thread
From: Eric W. Biederman @ 2016-08-04  2:59 UTC (permalink / raw)
  To: Hari Bathini
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, kernel, rostedt, viro, aravinda, ananth,
	Linux Containers

Hari Bathini <hbathini@linux.vnet.ibm.com> writes:

> This RFC patch set supports filtering container specific events
> when perf tool is executed inside a container. The patches apply
> cleanly on v4.7.0-rc7
>
> Changes from v1:
> 1/3. Revived earlier approach[1] with cgroup namespace instead
>      of pid namespace
> 2/3. New patch that adds instance support for uprobe events in
>      tracefs filesystem
> 3/3. New patch that adds "newinstance" mount option for tracefs
>      filesystem
       "newinstance" ick no.

I see no justification anywhere why the perf cgroup is not enough for
this.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 3/3] tracefs: add 'newinstance' mount option
  2016-08-04  2:54   ` Eric W. Biederman
@ 2016-08-04 12:26     ` Hari Bathini
  2016-08-04 14:12       ` Eric W. Biederman
  0 siblings, 1 reply; 26+ messages in thread
From: Hari Bathini @ 2016-08-04 12:26 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, kernel, rostedt, viro, aravinda, ananth

Hi Eric,


Thanks for the comments..


On Thursday 04 August 2016 08:24 AM, Eric W. Biederman wrote:
> Hari Bathini <hbathini@linux.vnet.ibm.com> writes:
>
>> When tracefs is mounted inside a container, its files are visible to
>> all containers. This implies that a user from within a container can
>> list/delete uprobes registered elsewhere, leading to security issues
>> and/or denial of service (Eg. deleting a probe that is registered from
>> elsewhere). This patch addresses this problem by adding mount option
>> 'newinstance', allowing containers to have their own instance mounted
>> separately. Something like the below from within a container:
> newinstance is an anti-pattern in devpts and should not be copied.
> To fix some severe defects of devpts we had to always create new
> instances and the code and the testing to make that all work was

OK..

> not pleasant.  Please don't add another option that we will just have to
> make redundant later.

IIUC, you mean implicitly creating a new instance for a tracefs mount
inside a container, without the need for a new option?

Thanks
Hari

> Eric
>
>
>>    $ mount -o newinstance -t tracefs tracefs /sys/kernel/tracing
>>    $
>>    $
>>    $ perf probe /lib/x86_64-linux-gnu/libc.so.6 malloc
>>    Added new event:
>>      probe_libc:malloc    (on malloc in /lib/x86_64-linux-gnu/libc.so.6)
>>
>>    You can now use it in all perf tools, such as:
>>
>>    	perf record -e probe_libc:malloc -aR sleep 1
>>
>>    $
>>    $
>>    $ perf probe --list
>>      probe_libc:malloc    (on __libc_malloc in /lib64/libc.so.6)
>>    $
>>
>> while another container/host has a completely different view:
>>
>>
>>    $ perf probe --list
>>      probe_libc:memset    (on __libc_memset in /lib64/libc.so.6)
>>    $
>>
>> This patch reuses the code that provides support to create new instances
>> under tracefs instances directory.
>>
>> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
>> ---
>>   fs/tracefs/inode.c      |  171 ++++++++++++++++++++++++++++++++++++++---------
>>   include/linux/tracefs.h |   11 ++-
>>   kernel/trace/trace.c    |   52 ++++++++++----
>>   3 files changed, 181 insertions(+), 53 deletions(-)
>>
>> diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
>> index 4a0e48f..2d6acda 100644
>> --- a/fs/tracefs/inode.c
>> +++ b/fs/tracefs/inode.c
>> @@ -51,9 +51,9 @@ static const struct file_operations tracefs_file_operations = {
>>   };
>>   
>>   static struct tracefs_dir_ops {
>> -	int (*mkdir)(const char *name);
>> -	int (*rmdir)(const char *name);
>> -} tracefs_ops;
>> +	int (*mkdir)(int instance_type, void *data);
>> +	int (*rmdir)(int instance_type, void *data);
>> +} tracefs_instance_ops;
>>   
>>   static char *get_dname(struct dentry *dentry)
>>   {
>> @@ -85,7 +85,7 @@ static int tracefs_syscall_mkdir(struct inode *inode, struct dentry *dentry, umo
>>   	 * mkdir routine to handle races.
>>   	 */
>>   	inode_unlock(inode);
>> -	ret = tracefs_ops.mkdir(name);
>> +	ret = tracefs_instance_ops.mkdir(INSTANCE_DIR, name);
>>   	inode_lock(inode);
>>   
>>   	kfree(name);
>> @@ -112,7 +112,7 @@ static int tracefs_syscall_rmdir(struct inode *inode, struct dentry *dentry)
>>   	inode_unlock(inode);
>>   	inode_unlock(dentry->d_inode);
>>   
>> -	ret = tracefs_ops.rmdir(name);
>> +	ret = tracefs_instance_ops.rmdir(INSTANCE_DIR, name);
>>   
>>   	inode_lock_nested(inode, I_MUTEX_PARENT);
>>   	inode_lock(dentry->d_inode);
>> @@ -142,12 +142,14 @@ struct tracefs_mount_opts {
>>   	kuid_t uid;
>>   	kgid_t gid;
>>   	umode_t mode;
>> +	int newinstance;
>>   };
>>   
>>   enum {
>>   	Opt_uid,
>>   	Opt_gid,
>>   	Opt_mode,
>> +	Opt_newinstance,
>>   	Opt_err
>>   };
>>   
>> @@ -155,14 +157,26 @@ static const match_table_t tokens = {
>>   	{Opt_uid, "uid=%u"},
>>   	{Opt_gid, "gid=%u"},
>>   	{Opt_mode, "mode=%o"},
>> +	{Opt_newinstance, "newinstance"},
>>   	{Opt_err, NULL}
>>   };
>>   
>>   struct tracefs_fs_info {
>>   	struct tracefs_mount_opts mount_opts;
>> +	struct super_block *sb;
>>   };
>>   
>> -static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
>> +static inline struct tracefs_fs_info *TRACEFS_SB(struct super_block *sb)
>> +{
>> +	return sb->s_fs_info;
>> +}
>> +
>> +#define PARSE_MOUNT		0
>> +#define PARSE_REMOUNT		1
>> +
>> +static int tracefs_parse_options(char *data,
>> +				 int op,
>> +				 struct tracefs_mount_opts *opts)
>>   {
>>   	substring_t args[MAX_OPT_ARGS];
>>   	int option;
>> @@ -173,6 +187,10 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
>>   
>>   	opts->mode = TRACEFS_DEFAULT_MODE;
>>   
>> +	/* newinstance makes sense only on initial mount */
>> +	if (op == PARSE_MOUNT)
>> +		opts->newinstance = 0;
>> +
>>   	while ((p = strsep(&data, ",")) != NULL) {
>>   		if (!*p)
>>   			continue;
>> @@ -200,6 +218,11 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
>>   				return -EINVAL;
>>   			opts->mode = option & S_IALLUGO;
>>   			break;
>> +		case Opt_newinstance:
>> +			/* newinstance makes sense only on initial mount */
>> +			if (op == PARSE_MOUNT)
>> +				opts->newinstance = 1;
>> +			break;
>>   		/*
>>   		 * We might like to report bad mount options here;
>>   		 * but traditionally tracefs has ignored all mount options
>> @@ -231,7 +254,7 @@ static int tracefs_remount(struct super_block *sb, int *flags, char *data)
>>   	struct tracefs_fs_info *fsi = sb->s_fs_info;
>>   
>>   	sync_filesystem(sb);
>> -	err = tracefs_parse_options(data, &fsi->mount_opts);
>> +	err = tracefs_parse_options(data, PARSE_REMOUNT, &fsi->mount_opts);
>>   	if (err)
>>   		goto fail;
>>   
>> @@ -254,6 +277,8 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
>>   			   from_kgid_munged(&init_user_ns, opts->gid));
>>   	if (opts->mode != TRACEFS_DEFAULT_MODE)
>>   		seq_printf(m, ",mode=%o", opts->mode);
>> +	if (opts->newinstance)
>> +		seq_puts(m, ",newinstance");
>>   
>>   	return 0;
>>   }
>> @@ -264,53 +289,130 @@ static const struct super_operations tracefs_super_operations = {
>>   	.show_options	= tracefs_show_options,
>>   };
>>   
>> -static int trace_fill_super(struct super_block *sb, void *data, int silent)
>> +static void *new_tracefs_fs_info(struct super_block *sb)
>>   {
>> -	static struct tree_descr trace_files[] = {{""}};
>>   	struct tracefs_fs_info *fsi;
>> -	int err;
>> -
>> -	save_mount_options(sb, data);
>>   
>>   	fsi = kzalloc(sizeof(struct tracefs_fs_info), GFP_KERNEL);
>> -	sb->s_fs_info = fsi;
>> -	if (!fsi) {
>> -		err = -ENOMEM;
>> -		goto fail;
>> -	}
>> +	if (!fsi)
>> +		return NULL;
>>   
>> -	err = tracefs_parse_options(data, &fsi->mount_opts);
>> -	if (err)
>> +	fsi->mount_opts.mode = TRACEFS_DEFAULT_MODE;
>> +	fsi->sb = sb;
>> +
>> +	return fsi;
>> +}
>> +
>> +static int trace_fill_super(struct super_block *sb, void *data, int silent)
>> +{
>> +	struct inode *inode;
>> +
>> +	sb->s_blocksize = PAGE_SIZE;
>> +	sb->s_blocksize_bits = PAGE_SHIFT;
>> +	sb->s_magic = TRACEFS_MAGIC;
>> +	sb->s_op = &tracefs_super_operations;
>> +	sb->s_time_gran = 1;
>> +
>> +	sb->s_fs_info = new_tracefs_fs_info(sb);
>> +	if (!sb->s_fs_info)
>>   		goto fail;
>>   
>> -	err  =  simple_fill_super(sb, TRACEFS_MAGIC, trace_files);
>> -	if (err)
>> +	inode = new_inode(sb);
>> +	if (!inode)
>>   		goto fail;
>> +	inode->i_ino = 1;
>> +	inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
>> +	inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO | S_IWUSR;
>> +	inode->i_op = &simple_dir_inode_operations;
>> +	inode->i_fop = &simple_dir_operations;
>> +	set_nlink(inode, 2);
>>   
>> -	sb->s_op = &tracefs_super_operations;
>> +	sb->s_root = d_make_root(inode);
>> +	if (sb->s_root)
>> +		return 0;
>>   
>> -	tracefs_apply_options(sb);
>> +	pr_err("get root dentry failed\n");
>>   
>>   	return 0;
>>   
>>   fail:
>> -	kfree(fsi);
>> -	sb->s_fs_info = NULL;
>> -	return err;
>> +	return -ENOMEM;
>> +}
>> +
>> +static int compare_init_tracefs_sb(struct super_block *s, void *p)
>> +{
>> +	if (tracefs_mount)
>> +		return tracefs_mount->mnt_sb == s;
>> +	return 0;
>>   }
>>   
>>   static struct dentry *trace_mount(struct file_system_type *fs_type,
>>   			int flags, const char *dev_name,
>>   			void *data)
>>   {
>> -	return mount_single(fs_type, flags, data, trace_fill_super);
>> +	int err;
>> +	struct tracefs_mount_opts opts;
>> +	struct super_block *s;
>> +
>> +	err = tracefs_parse_options(data, PARSE_MOUNT, &opts);
>> +	if (err)
>> +		return ERR_PTR(err);
>> +
>> +	/* Require newinstance for all user namespace mounts to ensure
>> +	 * the mount options are not changed.
>> +	 */
>> +	if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	if (opts.newinstance)
>> +		s = sget(fs_type, NULL, set_anon_super, flags, NULL);
>> +	else
>> +		s = sget(fs_type, compare_init_tracefs_sb, set_anon_super,
>> +			 flags, NULL);
>> +
>> +	if (IS_ERR(s))
>> +		return ERR_CAST(s);
>> +
>> +	if (!s->s_root) {
>> +		err = trace_fill_super(s, data, flags & MS_SILENT ? 1 : 0);
>> +		if (err)
>> +			goto out_undo_sget;
>> +		s->s_flags |= MS_ACTIVE;
>> +	}
>> +
>> +	if (opts.newinstance) {
>> +		err = tracefs_instance_ops.mkdir(INSTANCE_MNT, s->s_root);
>> +		if (err)
>> +			goto out_undo_sget;
>> +	}
>> +
>> +	memcpy(&(TRACEFS_SB(s))->mount_opts, &opts, sizeof(opts));
>> +
>> +	tracefs_apply_options(s);
>> +
>> +	return dget(s->s_root);
>> +
>> +out_undo_sget:
>> +	deactivate_locked_super(s);
>> +	return ERR_PTR(err);
>> +}
>> +
>> +static void trace_kill_sb(struct super_block *sb)
>> +{
>> +	struct tracefs_fs_info *fsi = TRACEFS_SB(sb);
>> +
>> +	if (fsi->mount_opts.newinstance)
>> +		tracefs_instance_ops.rmdir(INSTANCE_MNT, sb->s_root);
>> +
>> +	kfree(fsi);
>> +	kill_litter_super(sb);
>>   }
>>   
>>   static struct file_system_type trace_fs_type = {
>>   	.owner =	THIS_MODULE,
>>   	.name =		"tracefs",
>>   	.mount =	trace_mount,
>> -	.kill_sb =	kill_litter_super,
>> +	.kill_sb =	trace_kill_sb,
>>   };
>>   MODULE_ALIAS_FS("tracefs");
>>   
>> @@ -480,22 +582,23 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent)
>>    *
>>    * Returns the dentry of the instances directory.
>>    */
>> -struct dentry *tracefs_create_instance_dir(const char *name, struct dentry *parent,
>> -					  int (*mkdir)(const char *name),
>> -					  int (*rmdir)(const char *name))
>> +struct dentry *
>> +tracefs_create_instance_dir(const char *name, struct dentry *parent,
>> +			    int (*mkdir)(int instance_type, void *data),
>> +			    int (*rmdir)(int instance_type, void *data))
>>   {
>>   	struct dentry *dentry;
>>   
>>   	/* Only allow one instance of the instances directory. */
>> -	if (WARN_ON(tracefs_ops.mkdir || tracefs_ops.rmdir))
>> +	if (WARN_ON(tracefs_instance_ops.mkdir || tracefs_instance_ops.rmdir))
>>   		return NULL;
>>   
>>   	dentry = __create_dir(name, parent, &tracefs_dir_inode_operations);
>>   	if (!dentry)
>>   		return NULL;
>>   
>> -	tracefs_ops.mkdir = mkdir;
>> -	tracefs_ops.rmdir = rmdir;
>> +	tracefs_instance_ops.mkdir = mkdir;
>> +	tracefs_instance_ops.rmdir = rmdir;
>>   
>>   	return dentry;
>>   }
>> diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
>> index 5b727a1..30d4e55 100644
>> --- a/include/linux/tracefs.h
>> +++ b/include/linux/tracefs.h
>> @@ -25,6 +25,10 @@ struct file_operations;
>>   
>>   #ifdef CONFIG_TRACING
>>   
>> +/* instance types */
>> +#define INSTANCE_DIR	0	/* created inside instances dir */
>> +#define INSTANCE_MNT	1	/* created with newinstance mount option */
>> +
>>   struct dentry *tracefs_create_file(const char *name, umode_t mode,
>>   				   struct dentry *parent, void *data,
>>   				   const struct file_operations *fops);
>> @@ -34,9 +38,10 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent);
>>   void tracefs_remove(struct dentry *dentry);
>>   void tracefs_remove_recursive(struct dentry *dentry);
>>   
>> -struct dentry *tracefs_create_instance_dir(const char *name, struct dentry *parent,
>> -					   int (*mkdir)(const char *name),
>> -					   int (*rmdir)(const char *name));
>> +struct dentry *
>> +tracefs_create_instance_dir(const char *name, struct dentry *parent,
>> +			    int (*mkdir)(int instance_type, void *data),
>> +			    int (*rmdir)(int instance_type, void *data));
>>   
>>   bool tracefs_initialized(void);
>>   
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index 23a8111..a991e9d 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -6782,17 +6782,24 @@ static void update_tracer_options(struct trace_array *tr)
>>   	mutex_unlock(&trace_types_lock);
>>   }
>>   
>> -static int instance_mkdir(const char *name)
>> +static int instance_mkdir(int instance_type, void *data)
>>   {
>> +	const char *name =  "tracing";
>> +	struct dentry *mnt_root = NULL;
>>   	struct trace_array *tr;
>>   	int ret;
>>   
>>   	mutex_lock(&trace_types_lock);
>>   
>> -	ret = -EEXIST;
>> -	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
>> -		if (tr->name && strcmp(tr->name, name) == 0)
>> -			goto out_unlock;
>> +	if (instance_type == INSTANCE_MNT)
>> +		mnt_root = data;
>> +	else {
>> +		name = data;
>> +		ret = -EEXIST;
>> +		list_for_each_entry(tr, &ftrace_trace_arrays, list) {
>> +			if (tr->name && strcmp(tr->name, name) == 0)
>> +				goto out_unlock;
>> +		}
>>   	}
>>   
>>   	ret = -ENOMEM;
>> @@ -6823,9 +6830,14 @@ static int instance_mkdir(const char *name)
>>   	if (allocate_trace_buffers(tr, trace_buf_size) < 0)
>>   		goto out_free_tr;
>>   
>> -	tr->dir = tracefs_create_dir(name, trace_instance_dir);
>> -	if (!tr->dir)
>> -		goto out_free_tr;
>> +	if (instance_type == INSTANCE_MNT) {
>> +		mnt_root->d_inode->i_private = tr;
>> +		tr->dir = mnt_root;
>> +	} else {
>> +		tr->dir = tracefs_create_dir(name, trace_instance_dir);
>> +		if (!tr->dir)
>> +			goto out_free_tr;
>> +	}
>>   
>>   	ret = event_trace_add_tracer(tr->dir, tr);
>>   	if (ret) {
>> @@ -6856,8 +6868,10 @@ static int instance_mkdir(const char *name)
>>   
>>   }
>>   
>> -static int instance_rmdir(const char *name)
>> +static int instance_rmdir(int instance_type, void *data)
>>   {
>> +	const char *name =  "tracing";
>> +	struct dentry *mnt_root = NULL;
>>   	struct trace_array *tr;
>>   	int found = 0;
>>   	int ret;
>> @@ -6865,15 +6879,21 @@ static int instance_rmdir(const char *name)
>>   
>>   	mutex_lock(&trace_types_lock);
>>   
>> -	ret = -ENODEV;
>> -	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
>> -		if (tr->name && strcmp(tr->name, name) == 0) {
>> -			found = 1;
>> -			break;
>> +	if (instance_type == INSTANCE_MNT) {
>> +		mnt_root = data;
>> +		tr = mnt_root->d_inode->i_private;
>> +	} else {
>> +		name = data;
>> +		ret = -ENODEV;
>> +		list_for_each_entry(tr, &ftrace_trace_arrays, list) {
>> +			if (tr->name && strcmp(tr->name, name) == 0) {
>> +				found = 1;
>> +				break;
>> +			}
>>   		}
>> +		if (!found)
>> +			goto out_unlock;
>>   	}
>> -	if (!found)
>> -		goto out_unlock;
>>   
>>   	ret = -EBUSY;
>>   	if (tr->ref || (tr->current_trace && tr->current_trace->ref))

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-04  1:04               ` Steven Rostedt
@ 2016-08-04 13:46                 ` Aravinda Prasad
  2016-08-04 14:08                   ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Aravinda Prasad @ 2016-08-04 13:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, ebiederm, kernel, viro,
	ananth



On Thursday 04 August 2016 06:34 AM, Steven Rostedt wrote:
> On Thu, 4 Aug 2016 01:46:04 +0530
> Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:
> 
>> On Thursday 04 August 2016 01:40 AM, Steven Rostedt wrote:
>>> On Thu, 4 Aug 2016 01:00:51 +0530
>>> Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:
>>>   
>>>>  
>>>>> Can a container have its own function tracing?    
>>>>
>>>> Sorry, I didn't understand that. Do you mean to have a separate
>>>> per-container trace files?  
>>>
>>> Actually, it's more my ignorance of containers, as I haven't had the
>>> need to play with them. Although, I think it may be time to do so.
>>>
>>> When a container enters kernel mode, I'm assuming that it's part of the
>>> host at that moment, and the host needs to take care of separating
>>> everything? That is, there's not a "second kernel" like VMs have, right?  
>>
>> Yes. The host needs to take care of separating everything. There is no
>> "second kernel".
> 
> That's what I figured. Thus, my worry is that something like the
> function tracer can cause information leak to a container. 

Yes, and thus the function tracer is currently disabled inside a
container unless it is a privileged container.

> How would
> you separate functions for the container from functions for the host?

Separation is based on the context in which the function is called.
Hence, containers can see only those kernel functions that are
triggered/invoked by the processes running inside that container and
should not see other kernel functions, for example, called by RCU grace
period kthread or any other kthread.
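
To make the idea concrete, here is a minimal user-space sketch of such a
context-based check. It is only an illustration, not code from the patch
set: struct task, its fields, the convention that id 0 means the host, and
event_visible() are all hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
	int container_id;	/* assumption: 0 means the host */
	bool is_kthread;	/* kernel threads belong to no container */
};

/*
 * Return true if a function call made while 'curr' was running should be
 * visible to a tracer running inside container 'viewer_id'.
 */
static bool event_visible(const struct task *curr, int viewer_id)
{
	if (viewer_id == 0)
		return true;	/* the host sees everything */
	if (curr == NULL || curr->is_kthread)
		return false;	/* e.g. the RCU grace period kthread */
	return curr->container_id == viewer_id;
}
```

With this check, a call made by a process of container 1 is shown to
container 1 and to the host, but not to container 2, and calls made by
kthreads are shown only to the host.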

Regards,
Aravinda

> 
> -- Steve
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-04 13:46                 ` Aravinda Prasad
@ 2016-08-04 14:08                   ` Steven Rostedt
  2016-08-04 14:34                     ` Aravinda Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2016-08-04 14:08 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, ebiederm, kernel, viro,
	ananth

On Thu, 4 Aug 2016 19:16:03 +0530
Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:


> Separation is based on the context in which the function is called.
> Hence, containers can see only those kernel functions that are
> triggered/invoked by the processes running inside that container and
> should not see other kernel functions, for example, called by RCU grace
> period kthread or any other kthread.
> 

What about interrupts and softirqs? They run under the container
process's context, but service other processes outside the container.
Same goes for trace events.
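
A small user-space sketch of the concern, under stated assumptions: the
event record, the exec_ctx values and both filter functions are
hypothetical and exist only for illustration. Hiding everything outside
task context is just one conceivable policy, not something proposed in
this thread.

```c
#include <stdbool.h>

/* Hypothetical execution contexts; the kernel tracks these differently. */
enum exec_ctx { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ };

/* Hypothetical event record used only for this illustration. */
struct event {
	int curr_container;	/* container of the current (interrupted) task */
	enum exec_ctx ctx;	/* context in which the event fired */
};

/* Naive filter: trusts the current task, so interrupt work slips through. */
static bool naive_visible(const struct event *ev, int viewer)
{
	return ev->curr_container == viewer;
}

/* Conservative filter (an assumption): hide anything not in task context. */
static bool strict_visible(const struct event *ev, int viewer)
{
	return ev->ctx == CTX_TASK && ev->curr_container == viewer;
}
```

An interrupt that fires while a process of container 1 is running passes
the naive filter for container 1, even though the work it does may be for
a process elsewhere. The strict filter drops it, at the cost of also
hiding interrupt work that really was for that container.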

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 3/3] tracefs: add 'newinstance' mount option
  2016-08-04 12:26     ` Hari Bathini
@ 2016-08-04 14:12       ` Eric W. Biederman
  0 siblings, 0 replies; 26+ messages in thread
From: Eric W. Biederman @ 2016-08-04 14:12 UTC (permalink / raw)
  To: Hari Bathini
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, kernel, rostedt, viro, aravinda, ananth

Hari Bathini <hbathini@linux.vnet.ibm.com> writes:

> Hi Eric,
>
>
> Thanks for the comments..
>
>
> On Thursday 04 August 2016 08:24 AM, Eric W. Biederman wrote:
>> Hari Bathini <hbathini@linux.vnet.ibm.com> writes:
>>
>>> When tracefs is mounted inside a container, its files are visible to
>>> all containers. This implies that a user from within a container can
>>> list/delete uprobes registered elsewhere, leading to security issues
>>> and/or denial of service (Eg. deleting a probe that is registered from
>>> elsewhere). This patch addresses this problem by adding mount option
>>> 'newinstance', allowing containers to have their own instance mounted
>>> separately. Something like the below from within a container:
>> newinstance is an anti-pattern in devpts and should not be copied.
>> To fix some severe defects of devpts we had to always create new
>> instances and the code and the testing to make that all work was
>
> OK..
>
>> not pleasant.  Please don't add another option that we will just have to
>> make redundant later.
>
> IIUC, you mean, implicitly create a new instance for tracefs mount
> inside container without the need for a new option?

Yes.  Or always create a new instance.  Whatever makes sense.  If we
don't have to bind things to a namespace all the better.

Eric

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events
  2016-08-04 14:08                   ` Steven Rostedt
@ 2016-08-04 14:34                     ` Aravinda Prasad
  0 siblings, 0 replies; 26+ messages in thread
From: Aravinda Prasad @ 2016-08-04 14:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, ebiederm, kernel, viro,
	ananth



On Thursday 04 August 2016 07:38 PM, Steven Rostedt wrote:
> On Thu, 4 Aug 2016 19:16:03 +0530
> Aravinda Prasad <aravinda@linux.vnet.ibm.com> wrote:
> 
> 
>> Separation is based on the context in which the function is called.
>> Hence, containers can see only those kernel functions that are
>> triggered/invoked by the processes running inside that container and
>> should not see other kernel functions, for example, called by RCU grace
>> period kthread or any other kthread.
>>
> 
> What about interrupts and softirqs? They run under the container
> process's context, but service other processes outside the container.
> Same goes for trace events.

Interrupts and softirqs are tricky. We have not yet figured that out.

The same goes for trace events. We had a similar discussion about trace
events with Brendan:

http://www.spinics.net/lists/linux-perf-users/msg03018.html
(The last section of that mail is about trace events.)

Regards,
Aravinda

> 
> -- Steve
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support
  2016-08-04  2:59     ` Eric W. Biederman
  (?)
  (?)
@ 2016-08-04 14:48     ` Aravinda Prasad
       [not found]       ` <57A355C1.4090004-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2016-08-04 18:27       ` Eric W. Biederman
  -1 siblings, 2 replies; 26+ messages in thread
From: Aravinda Prasad @ 2016-08-04 14:48 UTC (permalink / raw)
  To: Eric W. Biederman, Hari Bathini
  Cc: daniel, peterz, linux-kernel, acme, alexander.shishkin, mingo,
	paulus, kernel, rostedt, viro, ananth, Linux Containers



On Thursday 04 August 2016 08:29 AM, Eric W. Biederman wrote:
> Hari Bathini <hbathini@linux.vnet.ibm.com> writes:
> 
>> This RFC patch set supports filtering container specific events
>> when perf tool is executed inside a container. The patches apply
>> cleanly on v4.7.0-rc7
>>
>> Changes from v1:
>> 1/3. Revived earlier approach[1] with cgroup namespace instead
>>      of pid namespace
>> 2/3. New patch that adds instance support for uprobe events in
>>      tracefs filesystem
>> 3/3. New patch that adds "newinstance" mount option for tracefs
>>      filesystem
>        "newinstance" ick no.
> 
> I see no justification anywhere why the perf cgroup is not enough for
> this.

perf cgroup is not enough for uprobes, because even with perf cgroups a
user within a container can still list/delete uprobes registered in
other containers.
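
A rough user-space sketch of why the probe list itself is the problem.
The names here (struct probe_list, probe_add(), probe_del()) are
hypothetical and are not the tracefs or uprobe interfaces. The only point
is that a single list shared by every mount lets anyone remove any entry,
while one list per instance does not.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROBES 8

/* Hypothetical probe list; one per tracefs instance in this sketch. */
struct probe_list {
	const char *names[MAX_PROBES];
	size_t count;
};

/* Register a probe by name; fails when the list is full. */
static bool probe_add(struct probe_list *pl, const char *name)
{
	if (pl->count >= MAX_PROBES)
		return false;
	pl->names[pl->count++] = name;
	return true;
}

/* Remove a probe by name; returns false if it is not in this list. */
static bool probe_del(struct probe_list *pl, const char *name)
{
	for (size_t i = 0; i < pl->count; i++) {
		if (strcmp(pl->names[i], name) == 0) {
			/* Fill the hole with the last entry. */
			pl->names[i] = pl->names[--pl->count];
			return true;
		}
	}
	return false;
}
```

If two containers operate on the same probe_list, either one can delete
"probe_libc:malloc" no matter who added it, and cgroup-based event
filtering does not change that. If each container gets its own list, a
delete issued from one container cannot touch the other's probes.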

Regards,
Aravinda

> 
> Eric
> 

-- 
Regards,
Aravinda

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support
  2016-08-04 14:48     ` Aravinda Prasad
       [not found]       ` <57A355C1.4090004-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2016-08-04 18:27       ` Eric W. Biederman
       [not found]         ` <87h9b01ja2.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
  1 sibling, 1 reply; 26+ messages in thread
From: Eric W. Biederman @ 2016-08-04 18:27 UTC (permalink / raw)
  To: Aravinda Prasad
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, kernel, rostedt, viro, ananth,
	Linux Containers

Aravinda Prasad <aravinda@linux.vnet.ibm.com> writes:

> On Thursday 04 August 2016 08:29 AM, Eric W. Biederman wrote:
>> Hari Bathini <hbathini@linux.vnet.ibm.com> writes:
>> 
>>> This RFC patch set supports filtering container specific events
>>> when perf tool is executed inside a container. The patches apply
>>> cleanly on v4.7.0-rc7
>>>
>>> Changes from v1:
>>> 1/3. Revived earlier approach[1] with cgroup namespace instead
>>>      of pid namespace
>>> 2/3. New patch that adds instance support for uprobe events in
>>>      tracefs filesystem
>>> 3/3. New patch that adds "newinstance" mount option for tracefs
>>>      filesystem
>>        "newinstance" ick no.
>> 
>> I see no justification anywhere why the perf cgroup is not enough for
>> this.
>
> perf cgroup is not enough for uprobes, because even with perf cgroups a
> user within a container can still list/delete uprobes registered in
> other containers.

Just to be clear, even if there is one cgroup per container?

Eric


* Re: [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support
@ 2016-08-04 19:11             ` Aravinda Prasad
  0 siblings, 0 replies; 26+ messages in thread
From: Aravinda Prasad @ 2016-08-04 19:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Hari Bathini, daniel, peterz, linux-kernel, acme,
	alexander.shishkin, mingo, paulus, kernel, rostedt, viro, ananth,
	Linux Containers



On Thursday 04 August 2016 11:57 PM, Eric W. Biederman wrote:
> Aravinda Prasad <aravinda@linux.vnet.ibm.com> writes:
> 
>> On Thursday 04 August 2016 08:29 AM, Eric W. Biederman wrote:
>>> Hari Bathini <hbathini@linux.vnet.ibm.com> writes:
>>>
>>>> This RFC patch set supports filtering container specific events
>>>> when perf tool is executed inside a container. The patches apply
>>>> cleanly on v4.7.0-rc7
>>>>
>>>> Changes from v1:
>>>> 1/3. Revived earlier approach[1] with cgroup namespace instead
>>>>      of pid namespace
>>>> 2/3. New patch that adds instance support for uprobe events in
>>>>      tracefs filesystem
>>>> 3/3. New patch that adds "newinstance" mount option for tracefs
>>>>      filesystem
>>>        "newinstance" ick no.
>>>
>>> I see no justification anywhere why the perf cgroup is not enough for
>>> this.
>>
>> perf cgroup is not enough for uprobes, because even with perf cgroups a
>> user within a container can still list/delete uprobes registered in
>> other containers.
> 
> Just to be clear, even if there is one cgroup per container?

Yes. Using uprobes with perf takes two steps. The first step is to
define/add the probe (for example: "perf probe /bin/zsh zfree"), which
does not take a cgroup argument. Adding a probe writes an entry to the
/sys/kernel/debug/tracing/uprobe_events file. The uprobe_events file is
shared, and hence users in other containers can list/delete these entries.

Once the probe is added, the second step is to record. We can record by
passing the cgroup argument to perf record, and the events are then
filtered based on the cgroup.
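
The two steps can be sketched as command lines. This is only an
illustration, not part of the patches: the helper names, the cgroup
name "container1" and the event name "probe_zsh:zfree" (the name perf
probe normally assigns for zfree in /bin/zsh) are assumptions, and
"-x" is the perf probe option that names the target binary. The
commands are only built and printed, never run.

```python
import shlex


def probe_cmd(binary, function):
    # Step 1: define the uprobe. Note that no cgroup appears here.
    return ["perf", "probe", "-x", binary, function]


def record_cmd(event, cgroup):
    # Step 2: record system-wide (-a), keeping only events from
    # tasks in the given perf cgroup (-G).
    return ["perf", "record", "-e", event, "-a", "-G", cgroup]


if __name__ == "__main__":
    # Print the command lines without executing them.
    print(shlex.join(probe_cmd("/bin/zsh", "zfree")))
    print(shlex.join(record_cmd("probe_zsh:zfree", "container1")))
```

Only the second command carries a cgroup, which is why cgroup
filtering alone cannot protect the probe definition made in step one.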

The problem with the first step is handled (in patches 2 and 3) by
creating a separate uprobe_events file per container, reusing the
already existing "instances" functionality.
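
As a rough illustration of the difference, here is a toy model in
which a plain Python set stands in for one uprobe_events file. It is
not kernel code; the class, the container names and the event name
are all hypothetical.

```python
class ProbeFile:
    """Toy stand-in for a single uprobe_events file."""

    def __init__(self):
        self.events = set()

    def add(self, event):
        self.events.add(event)

    def delete(self, event):
        self.events.discard(event)

    def listing(self):
        return sorted(self.events)


# Today: one shared file, so container B can see and delete A's probe.
shared = ProbeFile()
shared.add("probe_zsh:zfree")          # defined from container A
shared_view_from_b = shared.listing()  # B lists A's probe
shared.delete("probe_zsh:zfree")       # B deletes it
shared_after_delete = shared.listing()

# With one file per container (the "instances" idea), B's file is
# separate, so A's probe is neither visible nor removable from B.
per_container = {"A": ProbeFile(), "B": ProbeFile()}
per_container["A"].add("probe_zsh:zfree")
isolated_view_from_b = per_container["B"].listing()
per_container["B"].delete("probe_zsh:zfree")
a_after_b_delete = per_container["A"].listing()
```

In the shared case B's delete removes A's probe; in the per-container
case A's entry survives because B only ever touches its own file.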

Regards,
Aravinda

> 
> Eric
> 

-- 
Regards,
Aravinda


end of thread, other threads:[~2016-08-04 19:12 UTC | newest]

Thread overview: 26+ messages
2016-07-27 21:27 [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support Hari Bathini
2016-07-27 21:27 ` [RFC PATCH v2 1/3] perf: filter container events based on cgroup namespace Hari Bathini
2016-07-27 21:27 ` [RFC PATCH v2 2/3] tracefs: add instances support for uprobe events Hari Bathini
2016-08-01 21:45   ` Steven Rostedt
2016-08-02 17:27     ` Hari Bathini
2016-08-02 17:32       ` Hari Bathini
2016-08-02 17:49       ` Steven Rostedt
2016-08-03 19:30         ` Aravinda Prasad
2016-08-03 20:10           ` Steven Rostedt
2016-08-03 20:16             ` Aravinda Prasad
2016-08-04  1:04               ` Steven Rostedt
2016-08-04 13:46                 ` Aravinda Prasad
2016-08-04 14:08                   ` Steven Rostedt
2016-08-04 14:34                     ` Aravinda Prasad
2016-07-27 21:27 ` [RFC PATCH v2 3/3] tracefs: add 'newinstance' mount option Hari Bathini
2016-08-04  2:54   ` Eric W. Biederman
2016-08-04 12:26     ` Hari Bathini
2016-08-04 14:12       ` Eric W. Biederman
     [not found] ` <146965470618.23765.7329786743211962695.stgit-2ivJzYymj6EA+286u2LMdEEOCMrvLtNR@public.gmane.org>
2016-08-04  2:59   ` [RFC PATCH v2 0/3] perf/tracefs: Container-aware tracing support Eric W. Biederman
2016-08-04  2:59     ` Eric W. Biederman
     [not found]     ` <87twf1ck95.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-08-04 14:48       ` Aravinda Prasad
2016-08-04 14:48     ` Aravinda Prasad
     [not found]       ` <57A355C1.4090004-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2016-08-04 18:27         ` Eric W. Biederman
2016-08-04 18:27       ` Eric W. Biederman
     [not found]         ` <87h9b01ja2.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-08-04 19:11           ` Aravinda Prasad
2016-08-04 19:11             ` Aravinda Prasad
