* [PATCH 0/2] trace/kprobe: Two fixes for kretprobes @ 2021-06-14 18:03 Naveen N. Rao 2021-06-14 18:03 ` [PATCH 1/2] trace/kprobe: Fix count of missed kretprobes in kprobe_profile Naveen N. Rao 2021-06-14 18:03 ` [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive Naveen N. Rao 0 siblings, 2 replies; 17+ messages in thread From: Naveen N. Rao @ 2021-06-14 18:03 UTC (permalink / raw) To: linux-kernel Cc: Masami Hiramatsu, Peter Zijlstra, Steven Rostedt, Anton Blanchard The first patch fixes accounting of missed kretprobes in kprobe_profile. The second patch removes limit on the maximum active kretprobe instances, when registering a kretprobe through tracefs. - Naveen Naveen N. Rao (2): trace/kprobe: Fix count of missed kretprobes in kprobe_profile trace/kprobe: Remove limit on kretprobe maxactive kernel/trace/trace_kprobe.c | 11 ++--------- kernel/trace/trace_probe.h | 1 - .../ftrace/test.d/kprobe/kprobe_syntax_errors.tc | 1 - .../ftrace/test.d/kprobe/kretprobe_maxactive.tc | 3 --- 4 files changed, 2 insertions(+), 14 deletions(-) base-commit: 0b42677e2e5d87c730ddc41544b289b88596738c -- 2.31.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] trace/kprobe: Fix count of missed kretprobes in kprobe_profile 2021-06-14 18:03 [PATCH 0/2] trace/kprobe: Two fixes for kretprobes Naveen N. Rao @ 2021-06-14 18:03 ` Naveen N. Rao 2021-06-15 5:47 ` Masami Hiramatsu 2021-06-14 18:03 ` [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive Naveen N. Rao 1 sibling, 1 reply; 17+ messages in thread From: Naveen N. Rao @ 2021-06-14 18:03 UTC (permalink / raw) To: linux-kernel Cc: Masami Hiramatsu, Peter Zijlstra, Steven Rostedt, Anton Blanchard For a kretprobe, the miss count includes the number of times the probe on function entry was missed, as well as the number of times we ran out of kretprobe_instance structures due to maxactive being too low. Fixes: cd7e7bd5e44718 ("tracing: Add kprobes event profiling interface") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> --- kernel/trace/trace_kprobe.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index ea6178cb5e334d..0475e2a6d0825e 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -1192,7 +1192,8 @@ static int probes_profile_seq_show(struct seq_file *m, void *v) seq_printf(m, " %-44s %15lu %15lu\n", trace_probe_name(&tk->tp), trace_kprobe_nhit(tk), - tk->rp.kp.nmissed); + trace_kprobe_is_return(tk) ? tk->rp.kp.nmissed + tk->rp.nmissed + : tk->rp.kp.nmissed); return 0; } -- 2.31.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] trace/kprobe: Fix count of missed kretprobes in kprobe_profile 2021-06-14 18:03 ` [PATCH 1/2] trace/kprobe: Fix count of missed kretprobes in kprobe_profile Naveen N. Rao @ 2021-06-15 5:47 ` Masami Hiramatsu 0 siblings, 0 replies; 17+ messages in thread From: Masami Hiramatsu @ 2021-06-15 5:47 UTC (permalink / raw) To: Naveen N. Rao Cc: linux-kernel, Masami Hiramatsu, Peter Zijlstra, Steven Rostedt, Anton Blanchard On Mon, 14 Jun 2021 23:33:28 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > For a kretprobe, the miss count includes the number of times the probe > on function entry was missed, as well as the number of times we ran out > of kretprobe_instance structures due to maxactive being too low. > > Fixes: cd7e7bd5e44718 ("tracing: Add kprobes event profiling interface") > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Good catch! > --- > kernel/trace/trace_kprobe.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c > index ea6178cb5e334d..0475e2a6d0825e 100644 > --- a/kernel/trace/trace_kprobe.c > +++ b/kernel/trace/trace_kprobe.c > @@ -1192,7 +1192,8 @@ static int probes_profile_seq_show(struct seq_file *m, void *v) > seq_printf(m, " %-44s %15lu %15lu\n", > trace_probe_name(&tk->tp), > trace_kprobe_nhit(tk), > - tk->rp.kp.nmissed); > + trace_kprobe_is_return(tk) ? tk->rp.kp.nmissed + tk->rp.nmissed > + : tk->rp.kp.nmissed); Can you add a static trace_kprobe_nmissed(tk) for wrapping this ? Thank you, > > return 0; > } > -- > 2.31.1 > -- Masami Hiramatsu <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
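The wrapper requested in the review above might look like the following minimal sketch. It is not the code that was eventually posted; the structures are simplified stand-ins for the kernel ones, and the is_return flag is a hypothetical replacement for trace_kprobe_is_return() so that the snippet compiles outside the kernel tree.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures; only the fields
 * relevant to miss accounting are modelled. */
struct kprobe {
	unsigned long nmissed;	/* probe hits skipped at function entry */
};

struct kretprobe {
	struct kprobe kp;
	int nmissed;		/* returns lost: no free kretprobe_instance */
};

struct trace_kprobe {
	struct kretprobe rp;
	bool is_return;		/* hypothetical: stands in for trace_kprobe_is_return() */
};

/* Hypothetical helper: total misses to report in kprobe_profile. */
static unsigned long trace_kprobe_nmissed(const struct trace_kprobe *tk)
{
	unsigned long nmissed = tk->rp.kp.nmissed;

	/* A kretprobe can additionally miss when maxactive is too low. */
	if (tk->is_return)
		nmissed += (unsigned long)tk->rp.nmissed;

	return nmissed;
}
```

With such a helper, the seq_printf() call in probes_profile_seq_show() would pass trace_kprobe_nmissed(tk) instead of the open-coded ternary.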
* [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-14 18:03 [PATCH 0/2] trace/kprobe: Two fixes for kretprobes Naveen N. Rao 2021-06-14 18:03 ` [PATCH 1/2] trace/kprobe: Fix count of missed kretprobes in kprobe_profile Naveen N. Rao @ 2021-06-14 18:03 ` Naveen N. Rao 2021-06-15 9:35 ` Masami Hiramatsu 1 sibling, 1 reply; 17+ messages in thread From: Naveen N. Rao @ 2021-06-14 18:03 UTC (permalink / raw) To: linux-kernel Cc: Masami Hiramatsu, Peter Zijlstra, Steven Rostedt, Anton Blanchard We currently limit maxactive for a kretprobe to 4096 when registering the same through tracefs. The comment indicates that this is done so as to keep list traversal reasonable. However, we don't ever iterate over all kretprobe_instance structures. The core kprobes infrastructure also imposes no such limitation. Remove the limit from the tracefs interface. This limit is easy to hit on large cpu machines when tracing functions that can sleep. Reported-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> --- kernel/trace/trace_kprobe.c | 8 -------- kernel/trace/trace_probe.h | 1 - .../ftrace/test.d/kprobe/kprobe_syntax_errors.tc | 1 - .../selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc | 3 --- 4 files changed, 13 deletions(-) diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 0475e2a6d0825e..b3e214980eed3d 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -21,7 +21,6 @@ #include "trace_probe_tmpl.h" #define KPROBE_EVENT_SYSTEM "kprobes" -#define KRETPROBE_MAXACTIVE_MAX 4096 /* Kprobe early definition from command line */ static char kprobe_boot_events_buf[COMMAND_LINE_SIZE] __initdata; @@ -786,13 +785,6 @@ static int __trace_kprobe_create(int argc, const char *argv[]) trace_probe_log_err(1, BAD_MAXACT); goto parse_error; } - /* kretprobes instances are iterated over via a list. The - * maximum should stay reasonable. 
- */ - if (maxactive > KRETPROBE_MAXACTIVE_MAX) { - trace_probe_log_err(1, MAXACT_TOO_BIG); - goto parse_error; - } } /* try to parse an address. if that fails, try to read the diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h index 227d518e5ba521..e331017dc086ed 100644 --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -389,7 +389,6 @@ extern int traceprobe_define_arg_fields(struct trace_event_call *event_call, C(BAD_UPROBE_OFFS, "Invalid uprobe offset"), \ C(MAXACT_NO_KPROBE, "Maxactive is not for kprobe"), \ C(BAD_MAXACT, "Invalid maxactive number"), \ - C(MAXACT_TOO_BIG, "Maxactive is too big"), \ C(BAD_PROBE_ADDR, "Invalid probed address or symbol"), \ C(BAD_RETPROBE, "Retprobe address must be an function entry"), \ C(BAD_ADDR_SUFFIX, "Invalid probed address suffix"), \ diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc index fa928b431555ca..be3360a258bae8 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc @@ -10,7 +10,6 @@ check_error() { # command-with-error-pos-by-^ if grep -q 'r\[maxactive\]' README; then check_error 'p^100 vfs_read' # MAXACT_NO_KPROBE check_error 'r^1a111 vfs_read' # BAD_MAXACT -check_error 'r^100000 vfs_read' # MAXACT_TOO_BIG fi check_error 'p ^non_exist_func' # BAD_PROBE_ADDR (enoent) diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc index 4f0b268c12332a..f57c95bfc5ed5a 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc @@ -6,9 +6,6 @@ # Test if we successfully reject unknown messages if echo 'a:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi -# Test if we successfully 
reject too big maxactive -if echo 'r1000000:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi - # Test if we successfully reject unparsable numbers for maxactive if echo 'r10fuzz:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi -- 2.31.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-14 18:03 ` [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive Naveen N. Rao @ 2021-06-15 9:35 ` Masami Hiramatsu 2021-06-15 17:41 ` Naveen N. Rao 0 siblings, 1 reply; 17+ messages in thread From: Masami Hiramatsu @ 2021-06-15 9:35 UTC (permalink / raw) To: Naveen N. Rao Cc: linux-kernel, Masami Hiramatsu, Peter Zijlstra, Steven Rostedt, Anton Blanchard On Mon, 14 Jun 2021 23:33:29 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > We currently limit maxactive for a kretprobe to 4096 when registering > the same through tracefs. The comment indicates that this is done so as > to keep list traversal reasonable. However, we don't ever iterate over > all kretprobe_instance structures. The core kprobes infrastructure also > imposes no such limitation. > > Remove the limit from the tracefs interface. This limit is easy to hit > on large cpu machines when tracing functions that can sleep. > > Reported-by: Anton Blanchard <anton@ozlabs.org> > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> OK, but I don't like to just remove the limit (since it can cause memory shortage easily.) Can't we make it configurable? I don't mean Kconfig, but tracefs/options/kretprobe_maxactive, or kprobes's debugfs knob. Hmm, maybe debugfs/kprobes/kretprobe_maxactive will be better since it can limit both trace_kprobe and kprobes itself. Let me fix that. 
Thank you, > --- > kernel/trace/trace_kprobe.c | 8 -------- > kernel/trace/trace_probe.h | 1 - > .../ftrace/test.d/kprobe/kprobe_syntax_errors.tc | 1 - > .../selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc | 3 --- > 4 files changed, 13 deletions(-) > > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c > index 0475e2a6d0825e..b3e214980eed3d 100644 > --- a/kernel/trace/trace_kprobe.c > +++ b/kernel/trace/trace_kprobe.c > @@ -21,7 +21,6 @@ > #include "trace_probe_tmpl.h" > > #define KPROBE_EVENT_SYSTEM "kprobes" > -#define KRETPROBE_MAXACTIVE_MAX 4096 > > /* Kprobe early definition from command line */ > static char kprobe_boot_events_buf[COMMAND_LINE_SIZE] __initdata; > @@ -786,13 +785,6 @@ static int __trace_kprobe_create(int argc, const char *argv[]) > trace_probe_log_err(1, BAD_MAXACT); > goto parse_error; > } > - /* kretprobes instances are iterated over via a list. The > - * maximum should stay reasonable. > - */ > - if (maxactive > KRETPROBE_MAXACTIVE_MAX) { > - trace_probe_log_err(1, MAXACT_TOO_BIG); > - goto parse_error; > - } > } > > /* try to parse an address. 
if that fails, try to read the > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h > index 227d518e5ba521..e331017dc086ed 100644 > --- a/kernel/trace/trace_probe.h > +++ b/kernel/trace/trace_probe.h > @@ -389,7 +389,6 @@ extern int traceprobe_define_arg_fields(struct trace_event_call *event_call, > C(BAD_UPROBE_OFFS, "Invalid uprobe offset"), \ > C(MAXACT_NO_KPROBE, "Maxactive is not for kprobe"), \ > C(BAD_MAXACT, "Invalid maxactive number"), \ > - C(MAXACT_TOO_BIG, "Maxactive is too big"), \ > C(BAD_PROBE_ADDR, "Invalid probed address or symbol"), \ > C(BAD_RETPROBE, "Retprobe address must be an function entry"), \ > C(BAD_ADDR_SUFFIX, "Invalid probed address suffix"), \ > diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc > index fa928b431555ca..be3360a258bae8 100644 > --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc > +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc > @@ -10,7 +10,6 @@ check_error() { # command-with-error-pos-by-^ > if grep -q 'r\[maxactive\]' README; then > check_error 'p^100 vfs_read' # MAXACT_NO_KPROBE > check_error 'r^1a111 vfs_read' # BAD_MAXACT > -check_error 'r^100000 vfs_read' # MAXACT_TOO_BIG > fi > > check_error 'p ^non_exist_func' # BAD_PROBE_ADDR (enoent) > diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc > index 4f0b268c12332a..f57c95bfc5ed5a 100644 > --- a/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc > +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_maxactive.tc > @@ -6,9 +6,6 @@ > # Test if we successfully reject unknown messages > if echo 'a:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi > > -# Test if we successfully reject too big maxactive > -if echo 'r1000000:myprobeaccept inet_csk_accept' > 
kprobe_events; then false; else true; fi > - > # Test if we successfully reject unparsable numbers for maxactive > if echo 'r10fuzz:myprobeaccept inet_csk_accept' > kprobe_events; then false; else true; fi > > -- > 2.31.1 > -- Masami Hiramatsu <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-15 9:35 ` Masami Hiramatsu @ 2021-06-15 17:41 ` Naveen N. Rao 2021-06-16 0:46 ` Masami Hiramatsu 0 siblings, 1 reply; 17+ messages in thread From: Naveen N. Rao @ 2021-06-15 17:41 UTC (permalink / raw) To: Masami Hiramatsu Cc: Anton Blanchard, linux-kernel, Peter Zijlstra, Steven Rostedt Masami Hiramatsu wrote: > On Mon, 14 Jun 2021 23:33:29 +0530 > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > >> We currently limit maxactive for a kretprobe to 4096 when registering >> the same through tracefs. The comment indicates that this is done so as >> to keep list traversal reasonable. However, we don't ever iterate over >> all kretprobe_instance structures. The core kprobes infrastructure also >> imposes no such limitation. >> >> Remove the limit from the tracefs interface. This limit is easy to hit >> on large cpu machines when tracing functions that can sleep. >> >> Reported-by: Anton Blanchard <anton@ozlabs.org> >> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> > > OK, but I don't like to just remove the limit (since it can cause > memory shortage easily.) > Can't we make it configurable? I don't mean Kconfig, but > tracefs/options/kretprobe_maxactive, or kprobes's debugfs knob. > > Hmm, maybe debugfs/kprobes/kretprobe_maxactive will be better since > it can limit both trace_kprobe and kprobes itself. I don't think it is good to put a new tunable in debugfs -- we don't have any kprobes tunable there, so this adds a dependency on debugfs which shouldn't be necessary. /proc/sys/debug/ may be a better fit since we have the kprobes-optimization flag to disable optprobes there, though I'm not sure if a new sysfs file is agreeable. But, I'm not too sure this really is a problem. Maxactive is a user _opt-in_ feature which needs to be explicitly added to an event definition. In that sense, isn't this already a tunable? 
- Naveen
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-15 17:41 ` Naveen N. Rao @ 2021-06-16 0:46 ` Masami Hiramatsu 2021-06-16 1:03 ` Steven Rostedt 2021-06-17 16:19 ` Naveen N. Rao 0 siblings, 2 replies; 17+ messages in thread From: Masami Hiramatsu @ 2021-06-16 0:46 UTC (permalink / raw) To: Naveen N. Rao Cc: Anton Blanchard, linux-kernel, Peter Zijlstra, Steven Rostedt On Tue, 15 Jun 2021 23:11:27 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > Masami Hiramatsu wrote: > > On Mon, 14 Jun 2021 23:33:29 +0530 > > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > > > >> We currently limit maxactive for a kretprobe to 4096 when registering > >> the same through tracefs. The comment indicates that this is done so as > >> to keep list traversal reasonable. However, we don't ever iterate over > >> all kretprobe_instance structures. The core kprobes infrastructure also > >> imposes no such limitation. > >> > >> Remove the limit from the tracefs interface. This limit is easy to hit > >> on large cpu machines when tracing functions that can sleep. > >> > >> Reported-by: Anton Blanchard <anton@ozlabs.org> > >> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> > > > > OK, but I don't like to just remove the limit (since it can cause > > memory shortage easily.) > > Can't we make it configurable? I don't mean Kconfig, but > > tracefs/options/kretprobe_maxactive, or kprobes's debugfs knob. > > > > Hmm, maybe debugfs/kprobes/kretprobe_maxactive will be better since > > it can limit both trace_kprobe and kprobes itself. > > I don't think it is good to put a new tunable in debugfs -- we don't > have any kprobes tunable there, so this adds a dependency on debugfs > which shouldn't be necessary. > > /proc/sys/debug/ may be a better fit since we have the > kprobes-optimization flag to disable optprobes there, though I'm not > sure if a new sysfs file is agreeable. Indeed. > But, I'm not too sure this really is a problem. 
Maxactive is a user
> _opt-in_ feature which needs to be explicitly added to an event
> definition. In that sense, isn't this already a tunable?

Let me explain the background of the limitation.

Maxactive currently has no limit in the kprobes kernel module API,
because the kernel module developer must take care of the maximum
memory usage (and they can).

But the tracefs user may NOT have enough information about what
happens if they pass something like 10M for maxactive (it will consume
around 500MB of kernel memory for one kretprobe).

To avoid such trouble, I had set the 4096 limit for the maxactive
parameter. Of course, 4096 may not be enough for some use cases. I would
welcome raising it (e.g. 32k, isn't that enough?), but removing the limit
may easily cause OOM trouble.

Thank you,

> - Naveen

--
Masami Hiramatsu <mhiramat@kernel.org>
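The memory argument above can be made concrete with a small back-of-the-envelope sketch. The 50 bytes per instance is an assumption derived purely from the "10M maxactive is around 500MB" figure in this message; the real per-instance cost depends on the architecture, the kernel configuration, and any per-instance data size.

```c
#include <stdint.h>

/* ASSUMPTION: ~50 bytes per preallocated kretprobe_instance, inferred
 * from the "10M -> ~500MB" estimate in the thread, not measured. */
#define ASSUMED_BYTES_PER_INSTANCE 50ULL

/* Rough memory cost of one kretprobe registered with a given maxactive. */
static uint64_t kretprobe_pool_bytes(uint64_t maxactive)
{
	return maxactive * ASSUMED_BYTES_PER_INSTANCE;
}
```

Under that assumption, the existing limit of 4096 costs about 200 KB per kretprobe, the suggested 32k about 1.6 MB, and 10M about 500 MB.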
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive
  2021-06-16 0:46 ` Masami Hiramatsu
@ 2021-06-16 1:03 ` Steven Rostedt
  2021-06-16 2:27 ` Masami Hiramatsu
  2021-06-17 16:19 ` Naveen N. Rao
  1 sibling, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2021-06-16 1:03 UTC
To: Masami Hiramatsu
Cc: Naveen N. Rao, Anton Blanchard, linux-kernel, Peter Zijlstra

On Wed, 16 Jun 2021 09:46:22 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> To avoid such trouble, I had set the 4096 limitation for the maxactive
> parameter. Of course 4096 may not enough for some use-cases. I'm welcome
> to expand it (e.g. 32k, isn't it enough?), but removing the limitation
> may cause OOM trouble easily.

What if you just made the max 10 * number of possible CPUs, or 4096,
whichever is greater? Why would a user need more?

I'd still like to get a wrapper around function graph tracing so that
kretprobes could use it. I think that would get rid of the requirement
for maxactive, because isn't that just used to have a way to know the
original return value?

-- Steve
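The cap suggested above (10 times the number of possible CPUs, or 4096, whichever is greater) can be sketched as follows. This is only an illustration of the arithmetic: maxactive_cap() and the macro name are hypothetical, and the possible_cpus parameter stands in for the kernel's num_possible_cpus().

```c
/* The limit currently enforced by the tracefs interface. */
#define MAXACTIVE_FLOOR 4096U

/* Hypothetical helper: upper bound for maxactive, scaled by CPU count.
 * possible_cpus stands in for num_possible_cpus() in the kernel. */
static unsigned int maxactive_cap(unsigned int possible_cpus)
{
	unsigned int scaled = 10U * possible_cpus;

	return scaled > MAXACTIVE_FLOOR ? scaled : MAXACTIVE_FLOOR;
}
```

Under this rule the cap stays at 4096 until a machine has more than 409 possible CPUs; a 1024-CPU machine would get 10240.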
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive
  2021-06-16 1:03 ` Steven Rostedt
@ 2021-06-16 2:27 ` Masami Hiramatsu
  2021-06-16 15:10 ` Masami Hiramatsu
  0 siblings, 1 reply; 17+ messages in thread
From: Masami Hiramatsu @ 2021-06-16 2:27 UTC
To: Steven Rostedt
Cc: Naveen N. Rao, Anton Blanchard, linux-kernel, Peter Zijlstra

On Tue, 15 Jun 2021 21:03:51 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 16 Jun 2021 09:46:22 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> > To avoid such trouble, I had set the 4096 limitation for the maxactive
> > parameter. Of course 4096 may not enough for some use-cases. I'm welcome
> > to expand it (e.g. 32k, isn't it enough?), but removing the limitation
> > may cause OOM trouble easily.
>
> What if you just made the max as 10 * number of possible cpus, or 4096,
> which ever is greater? Why would a user need more?

It could be. But actually, that is not the correct number, because the
number of instances depends on the number of processes and the possibility
of recursion. Thus a huge system that runs more than 64k processes
may need more than that.

> I'd still like to get a wrapper around function graph tracing so that
> kretprobes could use it. I think that would get rid of the requirement
> of maxactive, because isn't that just used to have a way to know the
> original return value?

Hmm, yes, on some architectures it can be done. But on other architectures
we still need the current implementation as a generic solution.
What I need is not a full wrapper around the function graph tracer, but
just to share the per-task (software) shadow stack.

Thank you,

--
Masami Hiramatsu <mhiramat@kernel.org>
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-16 2:27 ` Masami Hiramatsu @ 2021-06-16 15:10 ` Masami Hiramatsu 2021-06-17 16:34 ` Naveen N. Rao 0 siblings, 1 reply; 17+ messages in thread From: Masami Hiramatsu @ 2021-06-16 15:10 UTC (permalink / raw) To: Masami Hiramatsu Cc: Steven Rostedt, Naveen N. Rao, Anton Blanchard, linux-kernel, Peter Zijlstra On Wed, 16 Jun 2021 11:27:11 +0900 Masami Hiramatsu <mhiramat@kernel.org> wrote: > On Tue, 15 Jun 2021 21:03:51 -0400 > Steven Rostedt <rostedt@goodmis.org> wrote: > > > On Wed, 16 Jun 2021 09:46:22 +0900 > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > To avoid such trouble, I had set the 4096 limitation for the maxactive > > > parameter. Of course 4096 may not enough for some use-cases. I'm welcome > > > to expand it (e.g. 32k, isn't it enough?), but removing the limitation > > > may cause OOM trouble easily. > > > > What if you just made the max as 10 * number of possible cpus, or 4096, > > which ever is greater? Why would a user need more? > > It could be. But actually, that is not correct number because the > number of instances depends on the number of processes and the possiblity > of recursive. Thus the huge system which runs more than 64k processes, > may need more than that. > > > I'd still like to get a wrapper around function graph tracing so that > > kretprobes could use it. I think that would get rid of the requirement > > of maxactive, because isn't that just used to have a way to know the > > original return value? > > Hmm, yes, on some arch, it can be done. But on other arch we still need > current implementation for generic solution. > What I need is not fully wrapped by the function graph, but just share > the per-task (software) shadow stack. BTW, I have 2 ideas to fix this except for wrapper. 1. Use func-graph tracer API directly from dynamic event instead of kretprobes. This will be enabled only if the arch supports fgraph tracer and enable it. 
   maxactive will be ignored if this is enabled, and the tracefs user
   may not need anything except the return value.
   (BTW, is it possible to access the stack? In some cases, the return
   value can be passed via the stack.)

2. Move the kretprobe instance pool from kretprobe to struct task.
   This pool will allocate one page per task, shared among all
   kretprobes. The pool will be allocated when the first kretprobe
   is registered. maxactive will be kept for those who want to
   use per-instance data. But since dynamic events don't use it,
   it will be removed from tracefs and perf.

Thank you,

--
Masami Hiramatsu <mhiramat@kernel.org>
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-16 15:10 ` Masami Hiramatsu @ 2021-06-17 16:34 ` Naveen N. Rao 2021-06-17 17:07 ` Steven Rostedt 0 siblings, 1 reply; 17+ messages in thread From: Naveen N. Rao @ 2021-06-17 16:34 UTC (permalink / raw) To: Masami Hiramatsu Cc: Anton Blanchard, linux-kernel, Peter Zijlstra, Steven Rostedt Masami Hiramatsu wrote: > On Wed, 16 Jun 2021 11:27:11 +0900 > Masami Hiramatsu <mhiramat@kernel.org> wrote: > >> On Tue, 15 Jun 2021 21:03:51 -0400 >> Steven Rostedt <rostedt@goodmis.org> wrote: >> >> > On Wed, 16 Jun 2021 09:46:22 +0900 >> > Masami Hiramatsu <mhiramat@kernel.org> wrote: >> > >> > > To avoid such trouble, I had set the 4096 limitation for the maxactive >> > > parameter. Of course 4096 may not enough for some use-cases. I'm welcome >> > > to expand it (e.g. 32k, isn't it enough?), but removing the limitation >> > > may cause OOM trouble easily. >> > >> > What if you just made the max as 10 * number of possible cpus, or 4096, >> > which ever is greater? Why would a user need more? >> >> It could be. But actually, that is not correct number because the >> number of instances depends on the number of processes and the possiblity >> of recursive. Thus the huge system which runs more than 64k processes, >> may need more than that. >> >> > I'd still like to get a wrapper around function graph tracing so that >> > kretprobes could use it. I think that would get rid of the requirement >> > of maxactive, because isn't that just used to have a way to know the >> > original return value? >> >> Hmm, yes, on some arch, it can be done. But on other arch we still need >> current implementation for generic solution. >> What I need is not fully wrapped by the function graph, but just share >> the per-task (software) shadow stack. > > BTW, I have 2 ideas to fix this except for wrapper. > > 1. Use func-graph tracer API directly from dynamic event instead of > kretprobes. 
This will be enabled only if the arch supports fgraph > tracer and enable it. maxactive will be ignored if this is enabled, > and tracefs user may not need except for the return value > (BTW, is that possible to access the stack? In some case, return > value can be passed via stack) > > 2. Move the kretprobe instance pool from kretprobe to struct task. > This pool will allocates one page per task, and shared among all > kretprobes. This pool will be allocated when the 1st kretprobe > is registered. maxactive will be kept for someone who wants to > use per-instance data. But since dynamic event doesn't use it, > it will be removed from tracefs and perf. Won't this result in _more_ memory usage compared to what we have now? Thanks, Naveen ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-17 16:34 ` Naveen N. Rao @ 2021-06-17 17:07 ` Steven Rostedt 2021-06-18 4:26 ` Masami Hiramatsu 0 siblings, 1 reply; 17+ messages in thread From: Steven Rostedt @ 2021-06-17 17:07 UTC (permalink / raw) To: Naveen N. Rao Cc: Masami Hiramatsu, Anton Blanchard, linux-kernel, Peter Zijlstra On Thu, 17 Jun 2021 22:04:34 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > > 2. Move the kretprobe instance pool from kretprobe to struct task. > > This pool will allocates one page per task, and shared among all > > kretprobes. This pool will be allocated when the 1st kretprobe > > is registered. maxactive will be kept for someone who wants to > > use per-instance data. But since dynamic event doesn't use it, > > it will be removed from tracefs and perf. > > Won't this result in _more_ memory usage compared to what we have now? Maybe or maybe not. At least with this approach (or the function graph one), you will allocate enough for the environment involved. If there's thousands of tasks, then yes, it will allocate more memory. But if you are running thousands of tasks, you should have a lot of memory in the machine. If you are only running a few tasks, it will be less than the current approach. -- Steve ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-17 17:07 ` Steven Rostedt @ 2021-06-18 4:26 ` Masami Hiramatsu 2021-06-18 8:41 ` Naveen N. Rao 0 siblings, 1 reply; 17+ messages in thread From: Masami Hiramatsu @ 2021-06-18 4:26 UTC (permalink / raw) To: Steven Rostedt Cc: Naveen N. Rao, Masami Hiramatsu, Anton Blanchard, linux-kernel, Peter Zijlstra On Thu, 17 Jun 2021 13:07:13 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Thu, 17 Jun 2021 22:04:34 +0530 > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > > > > 2. Move the kretprobe instance pool from kretprobe to struct task. > > > This pool will allocates one page per task, and shared among all > > > kretprobes. This pool will be allocated when the 1st kretprobe > > > is registered. maxactive will be kept for someone who wants to > > > use per-instance data. But since dynamic event doesn't use it, > > > it will be removed from tracefs and perf. > > > > Won't this result in _more_ memory usage compared to what we have now? > > Maybe or maybe not. At least with this approach (or the function graph > one), you will allocate enough for the environment involved. If there's > thousands of tasks, then yes, it will allocate more memory. But if you are > running thousands of tasks, you should have a lot of memory in the machine. > > If you are only running a few tasks, it will be less than the current > approach. Right, this depends on how many tasks you are running on your machine. Anyway, since you may not sure how much maxactive is enough, you will set maxactive high, then it can consume more than that. Of course you can optimize by trial and error. But that does not guarantee all cases, because the number of tasks can be increased while tracing. You might need to re-configure it by checking the nmissed count again. Thank you, -- Masami Hiramatsu <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-18 4:26 ` Masami Hiramatsu @ 2021-06-18 8:41 ` Naveen N. Rao 0 siblings, 0 replies; 17+ messages in thread From: Naveen N. Rao @ 2021-06-18 8:41 UTC (permalink / raw) To: Masami Hiramatsu, Steven Rostedt Cc: Anton Blanchard, linux-kernel, Peter Zijlstra Masami Hiramatsu wrote: > On Thu, 17 Jun 2021 13:07:13 -0400 > Steven Rostedt <rostedt@goodmis.org> wrote: > >> On Thu, 17 Jun 2021 22:04:34 +0530 >> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: >> >> > > 2. Move the kretprobe instance pool from kretprobe to struct task. >> > > This pool will allocates one page per task, and shared among all >> > > kretprobes. This pool will be allocated when the 1st kretprobe >> > > is registered. maxactive will be kept for someone who wants to >> > > use per-instance data. But since dynamic event doesn't use it, >> > > it will be removed from tracefs and perf. >> > >> > Won't this result in _more_ memory usage compared to what we have now? >> >> Maybe or maybe not. At least with this approach (or the function graph >> one), you will allocate enough for the environment involved. If there's >> thousands of tasks, then yes, it will allocate more memory. But if you are >> running thousands of tasks, you should have a lot of memory in the machine. >> >> If you are only running a few tasks, it will be less than the current >> approach. > > Right, this depends on how many tasks you are running on your machine. > Anyway, since you may not sure how much maxactive is enough, you will > set maxactive high, then it can consume more than that. Of course you > can optimize by trial and error. But that does not guarantee all cases, > because the number of tasks can be increased while tracing. You might > need to re-configure it by checking the nmissed count again. Yes. If we go down this route, we should limit the per-task allocation to a more reasonable 4k -- powerpc uses 64k pages. 
Thanks, Naveen ^ permalink raw reply [flat|nested] 17+ messages in thread
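To put rough numbers on the page-size concern: a per-task pool costs (number of tasks) x (pool size per task), so one full page per task is 16 times more expensive with 64k pages than with a 4k cap. A minimal sketch of that arithmetic in userspace C follows; the helper name and the task count used below are illustrative assumptions, not anything taken from the kernel.

```c
#include <stdint.h>

/*
 * Total bytes consumed by a per-task kretprobe instance pool.
 * Hypothetical helper for illustration only, not a kernel API.
 */
static uint64_t pool_bytes(uint64_t nr_tasks, uint64_t per_task_bytes)
{
	return nr_tasks * per_task_bytes;
}
```

For an assumed 10000 tasks, `pool_bytes(10000, 65536)` is 655360000 bytes (625 MiB), while `pool_bytes(10000, 4096)` is 40960000 bytes (about 39 MiB).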
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-16 0:46 ` Masami Hiramatsu 2021-06-16 1:03 ` Steven Rostedt @ 2021-06-17 16:19 ` Naveen N. Rao 2021-06-18 6:17 ` Masami Hiramatsu 1 sibling, 1 reply; 17+ messages in thread From: Naveen N. Rao @ 2021-06-17 16:19 UTC (permalink / raw) To: Masami Hiramatsu Cc: Anton Blanchard, linux-kernel, Peter Zijlstra, Steven Rostedt Masami Hiramatsu wrote: > On Tue, 15 Jun 2021 23:11:27 +0530 > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > >> Masami Hiramatsu wrote: >> > On Mon, 14 Jun 2021 23:33:29 +0530 >> > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: >> > >> >> We currently limit maxactive for a kretprobe to 4096 when registering >> >> the same through tracefs. The comment indicates that this is done so as >> >> to keep list traversal reasonable. However, we don't ever iterate over >> >> all kretprobe_instance structures. The core kprobes infrastructure also >> >> imposes no such limitation. >> >> >> >> Remove the limit from the tracefs interface. This limit is easy to hit >> >> on large cpu machines when tracing functions that can sleep. >> >> >> >> Reported-by: Anton Blanchard <anton@ozlabs.org> >> >> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> >> > >> > OK, but I don't like to just remove the limit (since it can cause >> > memory shortage easily.) >> > Can't we make it configurable? I don't mean Kconfig, but >> > tracefs/options/kretprobe_maxactive, or kprobes's debugfs knob. >> > >> > Hmm, maybe debugfs/kprobes/kretprobe_maxactive will be better since >> > it can limit both trace_kprobe and kprobes itself. >> >> I don't think it is good to put a new tunable in debugfs -- we don't >> have any kprobes tunable there, so this adds a dependency on debugfs >> which shouldn't be necessary. 
>> >> /proc/sys/debug/ may be a better fit since we have the >> kprobes-optimization flag to disable optprobes there, though I'm not >> sure if a new sysfs file is agreeable. > > Indeed. > >> But, I'm not too sure this really is a problem. Maxactive is a user >> _opt-in_ feature which needs to be explicitly added to an event >> definition. In that sense, isn't this already a tunable? > > Let me explain the background of the limiation. Thanks for the background on this. > > Maxactive is currently no limit for the kprobe kernel module API, > because the kernel module developer must take care of the max memory > usage (and they can). > > But the tracefs user may NOT have enough information about what > happens if they pass something like 10M for maxactive (it will consume > around 500MB kernel memory for one kretprobe). Ok, thinking more about this... Right now, the only way for a user to notice that kretprobe maxactive is an issue is by looking at kprobe_profile. This is not even possible if using a bcc tool, which uses perf_event_open(). It took the reporting team some effort to even identify that the reason why they were getting weird results when tracing was due to the default value used for kretprobe maxactive; and then that 4096 was the hard limit through tracefs. So, IMO, anyone using any existing bcc tool, or a pre-canned perf script will not even be able to identify this as a problem to begin with... at least, not without some effort. To address this, as a first step, we should probably consider parsing kprobe_profile and printing a warning with 'perf' if we detect a non-zero miss count for a probe -- both a regular probe, as well as a retprobe. If we do this, the nice thing with kprobe_profile is that the probe miss count is available, and can serve as a good way to decide what a more reasonable maxactive value should be. This should help prevent users from trying with arbitrary maxactive values. 
For perf_event_open(), perhaps we can introduce an ioctl to query the probe miss count. > > To avoid such trouble, I had set the 4096 limitation for the maxactive > parameter. Of course 4096 may not enough for some use-cases. I'm welcome > to expand it (e.g. 32k, isn't it enough?), but removing the limitation > may cause OOM trouble easily. Do you have suggestions for how we can determine a better limit? As you point out in the other email, there could very well be 64k or more processes on a large machine. Since the primary concern is memory usage, we probably need to decide this based on total memory. But, memory usage will vary depending on system load... Perhaps we can start by making maxactive limit be a tunable with a default value of 4096, with the understanding that users will be careful when bumping up this value. Hopefully, scripts won't simply start writing into this file ;) If we can feed back the probe miss count, tools should be able to guide users on what would be a reasonable maxactive value to use. Thanks, Naveen ^ permalink raw reply [flat|nested] 17+ messages in thread
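The "parse kprobe_profile and warn" step proposed above could look roughly like the sketch below. It assumes each line of kprobe_profile holds an event name followed by the hit count and the miss count, as in the seq_printf() format from patch 1/2. The function names are hypothetical, and this is plain userspace C rather than actual perf code.

```c
#include <stdio.h>

/*
 * Parse one kprobe_profile line of the form "<event> <nhit> <nmissed>".
 * Returns 1 on success, 0 if the line does not match.
 */
static int parse_profile_line(const char *line, char name[64],
			      unsigned long *nhit, unsigned long *nmissed)
{
	return sscanf(line, "%63s %lu %lu", name, nhit, nmissed) == 3;
}

/*
 * Warn about every event with a non-zero miss count.
 * Returns the number of events that missed at least once.
 */
static int warn_missed(FILE *fp)
{
	char line[256], name[64];
	unsigned long nhit, nmissed;
	int warned = 0;

	while (fgets(line, sizeof(line), fp)) {
		if (!parse_profile_line(line, name, &nhit, &nmissed))
			continue;	/* skip anything that is not a data line */
		if (nmissed) {
			fprintf(stderr, "warning: %s missed %lu probes (hit %lu)\n",
				name, nmissed, nhit);
			warned++;
		}
	}
	return warned;
}
```

A caller would open the profile file (typically /sys/kernel/tracing/kprobe_profile, or under /sys/kernel/debug/tracing on older setups) with fopen() and pass the stream to warn_missed(). As noted later in the thread, the miss count is cumulative, so it signals that maxactive is too low without directly giving the value needed.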
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-17 16:19 ` Naveen N. Rao @ 2021-06-18 6:17 ` Masami Hiramatsu 2021-06-18 13:19 ` Naveen N. Rao 0 siblings, 1 reply; 17+ messages in thread From: Masami Hiramatsu @ 2021-06-18 6:17 UTC (permalink / raw) To: Naveen N. Rao Cc: Anton Blanchard, linux-kernel, Peter Zijlstra, Steven Rostedt On Thu, 17 Jun 2021 21:49:36 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > Masami Hiramatsu wrote: > > On Tue, 15 Jun 2021 23:11:27 +0530 > > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > > > >> Masami Hiramatsu wrote: > >> > On Mon, 14 Jun 2021 23:33:29 +0530 > >> > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > >> > > >> >> We currently limit maxactive for a kretprobe to 4096 when registering > >> >> the same through tracefs. The comment indicates that this is done so as > >> >> to keep list traversal reasonable. However, we don't ever iterate over > >> >> all kretprobe_instance structures. The core kprobes infrastructure also > >> >> imposes no such limitation. > >> >> > >> >> Remove the limit from the tracefs interface. This limit is easy to hit > >> >> on large cpu machines when tracing functions that can sleep. > >> >> > >> >> Reported-by: Anton Blanchard <anton@ozlabs.org> > >> >> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> > >> > > >> > OK, but I don't like to just remove the limit (since it can cause > >> > memory shortage easily.) > >> > Can't we make it configurable? I don't mean Kconfig, but > >> > tracefs/options/kretprobe_maxactive, or kprobes's debugfs knob. > >> > > >> > Hmm, maybe debugfs/kprobes/kretprobe_maxactive will be better since > >> > it can limit both trace_kprobe and kprobes itself. > >> > >> I don't think it is good to put a new tunable in debugfs -- we don't > >> have any kprobes tunable there, so this adds a dependency on debugfs > >> which shouldn't be necessary. 
> >> > >> /proc/sys/debug/ may be a better fit since we have the > >> kprobes-optimization flag to disable optprobes there, though I'm not > >> sure if a new sysfs file is agreeable. > > > > Indeed. > > > >> But, I'm not too sure this really is a problem. Maxactive is a user > >> _opt-in_ feature which needs to be explicitly added to an event > >> definition. In that sense, isn't this already a tunable? > > > > Let me explain the background of the limiation. > > Thanks for the background on this. > > > > > Maxactive is currently no limit for the kprobe kernel module API, > > because the kernel module developer must take care of the max memory > > usage (and they can). > > > > But the tracefs user may NOT have enough information about what > > happens if they pass something like 10M for maxactive (it will consume > > around 500MB kernel memory for one kretprobe). > > Ok, thinking more about this... > > Right now, the only way for a user to notice that kretprobe maxactive is > an issue is by looking at kprobe_profile. This is not even possible if > using a bcc tool, which uses perf_event_open(). It took the reporting > team some effort to even identify that the reason why they were getting > weird results when tracing was due to the default value used for > kretprobe maxactive; and then that 4096 was the hard limit through > tracefs. > > So, IMO, anyone using any existing bcc tool, or a pre-canned perf script > will not even be able to identify this as a problem to begin with... at > least, not without some effort. Yeah, the nmissed counter must be exposed in that case via tracefs or debugfs. Maybe eBPF can also warn about it (by checking the nmissed count). > To address this, as a first step, we should probably consider parsing > kprobe_profile and printing a warning with 'perf' if we detect a > non-zero miss count for a probe -- both a regular probe, as well as a > retprobe. Yeah, it is doable.
Note that perf-probe only sets up the event, and perf-trace or other commands will use it. > If we do this, the nice thing with kprobe_profile is that the probe miss > count is available, and can serve as a good way to decide what a more > reasonable maxactive value should be. This should help prevent users > from trying with arbitrary maxactive values. Such a feedback loop is an interesting idea. Note that the nmissed count is an accumulated value, not the maximum number of instances that will be needed. > For perf_event_open(), perhaps we can introduce an ioctl to query the > probe miss count. Or, maybe we can expand maxactive at runtime, e.g. add a shortage counter on the kretprobe, and run a monitor kernel thread (or kworker). If the shortage counter is incremented, the monitor allocates instances (2x counter) and gives them to the kretprobe. And it resets the shortage counter. This adaptive maxactive may cause misses in the beginning, but will eventually find the optimal maxactive value automatically. > > To avoid such trouble, I had set the 4096 limitation for the maxactive > > parameter. Of course 4096 may not enough for some use-cases. I'm welcome > > to expand it (e.g. 32k, isn't it enough?), but removing the limitation > > may cause OOM trouble easily. > > Do you have suggestions for how we can determine a better limit? As you > point out in the other email, there could very well be 64k or more > processes on a large machine. Since the primary concern is memory usage, > we probably need to decide this based on total memory. But, memory usage > will vary depending on system load... This is a very good question. IMHO, it might be better to calculate the total maxactive from the system memory size. For example, if 1% of system memory can be used for kretprobes, a 16GB system will allow using 160MB for kretprobes, which means about "30M" is the max number of maxactive, or multiple kretprobes can share it. Doesn't that sound like enough?
Of course this will need to show the current usage of the kretprobe instance objects via tracefs or debugfs. But this total cap seems reasonable to me to avoid OOM trouble. > Perhaps we can start by making maxactive limit be a tunable with a > default value of 4096, with the understanding that users will be careful > when bumping up this value. Hopefully, scripts won't simply start > writing into this file ;) Yeah, that's what I suggested at first, because the best maxactive will depend on the max number of *processes* and on the probed function. If the probed function will NOT be preempted and will not sleep, maxactive will be the number of *processor cores*. Or, if it can be preempted or can sleep, it will be the max number of *processes*. If the probed function can be called recursively (Note: this is a rare case), the maxactive has to be multiplied. It is hard to estimate the max number of processes, since it depends on the system. Small embedded systems don't run thousands of processes, but big servers will run more than ten thousand processes. Thus making it tunable will be a good idea. Thank you, > > If we can feed back the probe miss count, tools should be able to guide > users on what would be a reasonable maxactive value to use. > > > Thanks, > Naveen > -- Masami Hiramatsu <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 17+ messages in thread
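The memory-based cap can be checked with a few lines of C. This sketch assumes roughly 50 bytes per kretprobe_instance, a figure inferred only from the earlier estimate in this thread that a maxactive of 10M costs around 500MB; the real size depends on the architecture and on any per-instance data. Under that assumption a 160MB budget covers about 3.2M instances, which is lower than the 30M quoted above, so the figure is worth rechecking before settling on a cap. The helper name is hypothetical.

```c
#include <stdint.h>

/*
 * Number of kretprobe instances that fit in a given percentage of
 * system memory. Illustrative helper only; instance_bytes is an
 * assumption supplied by the caller, not a kernel constant.
 */
static uint64_t instance_cap(uint64_t mem_bytes, unsigned int percent,
			     uint64_t instance_bytes)
{
	if (instance_bytes == 0)
		return 0;	/* avoid dividing by zero on bad input */
	return mem_bytes / 100 * percent / instance_bytes;
}
```

With 16GB taken as 16000000000 bytes, `instance_cap(16000000000ULL, 1, 50)` gives 3200000. This is a system-wide total, so it would have to be shared among all registered kretprobes.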
* Re: [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive 2021-06-18 6:17 ` Masami Hiramatsu @ 2021-06-18 13:19 ` Naveen N. Rao 0 siblings, 0 replies; 17+ messages in thread From: Naveen N. Rao @ 2021-06-18 13:19 UTC (permalink / raw) To: Masami Hiramatsu Cc: Anton Blanchard, linux-kernel, Peter Zijlstra, Steven Rostedt Masami Hiramatsu wrote: > >> To address this, as a first step, we should probably consider parsing >> kprobe_profile and printing a warning with 'perf' if we detect a >> non-zero miss count for a probe -- both a regular probe, as well as a >> retprobe. > > Yeah, it is doable. Note that perf-probe only set up the event and > perf-trace or other commands will use it. > > >> If we do this, the nice thing with kprobe_profile is that the probe miss >> count is available, and can serve as a good way to decide what a more >> reasonable maxactive value should be. This should help prevent users >> from trying with arbitrary maxactive values. > > Such feedback loop is an interesting idea. > Note that nmissed count is an accumulate value, not the max number of > the instance which will be needed. Yes, we will have to factor-in the duration during which the event was active. This will still be an approximation, but serves as a good starting point. It may need a few tries to get this right, but more importantly, the user knows instantly that there are missed probes. > >> For perf_event_open(), perhaps we can introduce an ioctl to query the >> probe miss count. > > Or, maybe we can expand the maxactive in runtime. e.g. add a shortage > counter on the kretprobe, and run a monitor kernel thread (or kworker). > If the shortage counter is incremented, the monitor allocates instances > (2x counter) and give it to the kretprobe. And it resets the shortage > counter. This adaptive maxactive may cause mis-hit in the beginning, > but finally find the optimal maxactive value automatically. I like this idea and I have been thinking along these lines too. 
If we start with a better default (rather than just num_possible_cpus() used today), I suspect we may be able to get this to work well enough to not have to miss any probes. Specifying 'maxactive' can still serve as a workaround to allocate a larger initial set of kretprobe_instances in case this doesn't work. > > >> > To avoid such trouble, I had set the 4096 limitation for the maxactive >> > parameter. Of course 4096 may not enough for some use-cases. I'm >> > welcome >> > to expand it (e.g. 32k, isn't it enough?), but removing the limitation >> > may cause OOM trouble easily. >> >> Do you have suggestions for how we can determine a better limit? As you >> point out in the other email, there could very well be 64k or more >> processes on a large machine. Since the primary concern is memory usage, >> we probably need to decide this based on total memory. But, memory usage >> will vary depending on system load... > > This is very good question. IMHO, it might better to calculate the total > maxactive from the system memory size. For example, 1% of system memory > can be used for the kretprobes, 16GB system will allow using 160MB for > kretprobes, which means about "30M" is the max number of maxactive, or > multiple kretprobes can share it. Doesn't it sound enough? Of course > this will need to show the current usage of the kretprobe instance objects > via tracefs or debugfs. But this total cap seems reasonable for me to > avoid OOM trouble. > >> Perhaps we can start by making maxactive limit be a tunable with a >> default value of 4096, with the understanding that users will be careful >> when bumping up this value. Hopefully, scripts won't simply start >> writing into this file ;) > > Yeah, that's what I suggested at first, because the best maxactive will > depend on the max number of the *processes* and the probed function. > > If the probed function will NOT be preempted or slept, maxactive will be > the number of *processor cores*. 
Or, if it can be preempted or slept, it > will be the max number of *processes*. If the probed function can > recursively called (Note: this is rare case), the maxactive has to > be multiplied. > > It is hard to estimate the max number of processes, since it depends > on the system. Small embedded systems don't run thousands of processes, > but big servers will run more than ten thousands of processes. > Thus make it tunable will be a good idea. Agree. Thanks, Naveen ^ permalink raw reply [flat|nested] 17+ messages in thread
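The adaptive scheme discussed in this exchange (a shortage counter, plus a monitor that allocates twice the counter and then resets it) can be modelled in userspace to see how it converges. The sketch below is only a model under simplifying assumptions: all names are hypothetical, there is no locking, and a real kernel implementation would have to allocate from a context where that is safe. The last helper encodes the rule of thumb for an initial value: CPU count when the probed function cannot sleep or be preempted, task count otherwise, multiplied by the recursion depth.

```c
#include <stddef.h>

/* Userspace model of a kretprobe instance pool; not kernel code. */
struct pool {
	size_t free;		/* instances currently available */
	size_t total;		/* instances allocated so far */
	size_t shortage;	/* misses since the monitor last ran */
};

/* Function entry: take an instance, or record a miss if none is free. */
static int pool_get(struct pool *p)
{
	if (p->free == 0) {
		p->shortage++;
		return 0;
	}
	p->free--;
	return 1;
}

/* Function return: hand the instance back. */
static void pool_put(struct pool *p)
{
	p->free++;
}

/* Monitor step: add 2x the observed shortage, then reset the counter. */
static size_t pool_grow(struct pool *p)
{
	size_t added = 2 * p->shortage;

	p->free += added;
	p->total += added;
	p->shortage = 0;
	return added;
}

/* Initial maxactive guess following the rule of thumb in the thread. */
static size_t estimate_maxactive(size_t nr_cpus, size_t nr_tasks,
				 int can_sleep, size_t recursion_depth)
{
	size_t base = can_sleep ? nr_tasks : nr_cpus;

	return base * (recursion_depth ? recursion_depth : 1);
}
```

Starting from a pool of 2 instances, five concurrent entries produce three misses; one monitor pass then adds six instances, so the same burst no longer misses on the next round.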
end of thread, other threads:[~2021-06-18 13:19 UTC | newest]

Thread overview: 17+ messages
2021-06-14 18:03 [PATCH 0/2] trace/kprobe: Two fixes for kretprobes Naveen N. Rao
2021-06-14 18:03 ` [PATCH 1/2] trace/kprobe: Fix count of missed kretprobes in kprobe_profile Naveen N. Rao
2021-06-15 5:47 ` Masami Hiramatsu
2021-06-14 18:03 ` [PATCH 2/2] trace/kprobe: Remove limit on kretprobe maxactive Naveen N. Rao
2021-06-15 9:35 ` Masami Hiramatsu
2021-06-15 17:41 ` Naveen N. Rao
2021-06-16 0:46 ` Masami Hiramatsu
2021-06-16 1:03 ` Steven Rostedt
2021-06-16 2:27 ` Masami Hiramatsu
2021-06-16 15:10 ` Masami Hiramatsu
2021-06-17 16:34 ` Naveen N. Rao
2021-06-17 17:07 ` Steven Rostedt
2021-06-18 4:26 ` Masami Hiramatsu
2021-06-18 8:41 ` Naveen N. Rao
2021-06-17 16:19 ` Naveen N. Rao
2021-06-18 6:17 ` Masami Hiramatsu
2021-06-18 13:19 ` Naveen N. Rao