----- On Jul 9, 2020, at 7:19 AM, lttng-dev <lttng-dev@lists.lttng.org> wrote:
Hello!

I'm currently developing a process monitor based on LTTng, and I'm having trouble accessing the command-line arguments passed to the execve syscall.
I'm using an LTTng live session and the Babeltrace 2 C API to analyze events online.

The syscall_entry_execve event has 3 payload fields: filename, argv, and envp. The first is a normal C string; the second and third are semantically `char *const *`,
but LTTng provides them as plain unsigned integers (the corresponding fields in the Babeltrace 2 event payload have type BT_FIELD_CLASS_TYPE_UNSIGNED_INTEGER,
while I expected BT_FIELD_CLASS_TYPE_DYNAMIC_ARRAY). As far as I understand, these integers are the argv and envp pointers cast to uint64_t. But in most
cases, events produced by LTTng are analyzed by another process, often even offline, so these pointers are useless.

Could you tell me whether there are configuration parameters that make it possible to include the argv and envp contents in the syscall_entry_execve payload? Or is
there some other way to get this information from LTTng?

P.S. I have considered getting this information from /proc/pid/cmdline, but it does not look like a clean solution.
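
For reference, the /proc fallback mentioned in the P.S. can be sketched roughly as follows. This is only an illustration, not something LTTng or Babeltrace 2 provides: the helper names read_cmdline() and print_args() are made up for this example. It relies on /proc/<pid>/cmdline holding the arguments as NUL-separated strings. The lookup is inherently racy, since the process may have exited or called execve again by the time the trace event is processed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not an LTTng or Babeltrace 2 API): read all of
 * /proc/<pid>/cmdline into a heap buffer. Returns NULL on failure; on
 * success the caller frees the buffer and *len holds the byte count.
 * The file is empty for kernel threads and zombies, so *len may be 0. */
static char *read_cmdline(pid_t pid, size_t *len)
{
    char path[64];
    size_t cap = 4096, used = 0;
    char *buf, *tmp;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/cmdline", (long)pid);
    f = fopen(path, "rb");
    if (!f)
        return NULL;                /* process already gone, or no permission */

    buf = malloc(cap);
    while (buf) {
        used += fread(buf + used, 1, cap - used, f);
        if (used < cap)
            break;                  /* short read: end of file (or error) */
        cap *= 2;                   /* buffer full: grow it and keep reading */
        tmp = realloc(buf, cap);
        if (!tmp)
            free(buf);
        buf = tmp;
    }
    fclose(f);
    if (buf)
        *len = used;
    return buf;
}

/* Walk the NUL-separated arguments in buf[0..len). Prints each one to
 * `out` when it is non-NULL and returns the number of arguments found. */
static size_t print_args(const char *buf, size_t len, FILE *out)
{
    size_t argc = 0, off = 0;

    while (off < len) {
        const char *arg = buf + off;
        const char *nul = memchr(arg, '\0', len - off);
        size_t n = nul ? (size_t)(nul - arg) : len - off;

        if (out)
            fprintf(out, "argv[%zu] = %.*s\n", argc, (int)n, arg);
        argc++;
        off += n + 1;               /* skip the argument and its terminator */
    }
    return argc;
}
```

A caller would take the pid from the event context, call read_cmdline(pid, &len), and pass the result to print_args(buf, len, stdout) before freeing the buffer. Note that a process can overwrite its own argument area, so the content is not guaranteed to match what was passed to execve.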

The main reason we don't implement this kind of instrumentation is that it would
capture security-sensitive data into the trace. The same goes for the payload of the read() and
write() system calls, for instance.

I am not against instrumenting this information, but it should be done by add-on modules which
can be compiled-out, and would be runtime-disabled by default. Also, we would need to extend the
tracepoint instrumentation to identify fields which are security-sensitive, so they could be specifically
disabled at runtime. This would also require CTF2 (Common Trace Format 2) to happen, so we can
tag specific fields as containing sensitive data. Users should really know that they are tracing sensitive
information when they do so.

So adding the instrumentation to the project is not the hard part. The hard part is making sure it is
configurable, not captured by default, and clearly identified in the traces.

There is a second technical issue that would need solving for capturing argv and envp: we would need
to ensure tracepoints hooked on system calls can take page faults, which is not possible today. The
odds of taking a page fault when reading through argv and envp in a newly forked process are probably
quite high, which would cause incomplete data. This cannot be solved in lttng-modules alone; we need
to improve the kernel tracepoint instrumentation subsystem to do so.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com