Linux-Trace-Devel Archive on lore.kernel.org
 help / color / Atom feed
From: Tzvetomir Stoyanov <tstoyanov@vmware.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>,
	Linux Trace Devel <linux-trace-devel@vger.kernel.org>
Subject: Re: [PATCH v4] trace-cmd: Save the tracee memory map into the trace.dat file.
Date: Fri, 9 Aug 2019 12:16:09 +0000
Message-ID: <CACqStofTwKeD_K+=s71_XHwWbVidarO7k28Zw0dhOoSKTtTXAQ@mail.gmail.com> (raw)
In-Reply-To: <20190808185622.67ad09e2@gandalf.local.home>

On Fri, Aug 9, 2019 at 1:56 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed,  3 Jul 2019 14:51:34 +0300
> tz.stoyanov@gmail.com wrote:
>
> > From: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
> >
> > A new trace-cmd record option is added: "--mmap". When it is set with
> > combination of -F or -P options, the memory map of the traced applications
> > is stored in the trace.dat file. A new API tracecmd_search_task_mmap()
> > can be used to look up into stored memory maps. The map is retrieved from
> > /proc/<pid>/maps file.
> >
>
> I also have some notes below.
>
> > diff --git a/Documentation/trace-cmd-record.1.txt b/Documentation/trace-cmd-record.1.txt
> > index 26a8299..4a59de9 100644
> > --- a/Documentation/trace-cmd-record.1.txt
> > +++ b/Documentation/trace-cmd-record.1.txt
> > @@ -119,6 +119,9 @@ OPTIONS
> >       Used with either *-F* (or *-P* if kernel supports it) to trace the process'
> >       children too.
> >
> > +*--mmap*::
> > +     Used with either *-F* or *-P*, save the traced process memory map into
> > +     the trace.dat file.
>
> Do we need to use -F? If we want to record the proc/mmap of the process
> that we execute (trace-cmd recode -e all some-program, where we save
> the process map of some-program). Do we have to only record
> some-program? Which is what -F will do.
>
The only reason for this limitation is that current ptrace logic in trace-cmd is
bound to filtered task, and ptrace is used to track when the process finishes.
It makes sense to have the proc/mmap in this case, I will change the ptrace
logic to not depend on filtered task.

>
> >  *-C* 'clock'::
> >       Set the trace clock to "clock".
> >
> > diff --git a/include/trace-cmd/trace-cmd.h b/include/trace-cmd/trace-cmd.h
> > index 6f62ab9..17edb9d 100644
> > --- a/include/trace-cmd/trace-cmd.h
> > +++ b/include/trace-cmd/trace-cmd.h
> > @@ -82,6 +82,7 @@ enum {
> >       TRACECMD_OPTION_OFFSET,
> >       TRACECMD_OPTION_CPUCOUNT,
> >       TRACECMD_OPTION_VERSION,
> > +     TRACECMD_OPTION_PIDMMAPS,
> >  };
> >
> >  enum {
> > @@ -97,6 +98,12 @@ struct tracecmd_ftrace {
> >       int long_size;
> >  };
> >
> > +struct lib_mem_map {
>
> If this is going to be exported to external users, it needs to use the
> tracecmd_ prefix. Probably should also add "proc" as well.
>
>   struct tracecmd_proc_mem_map ?
>
> I'm also thinking of removing the "mem" as it's not really memory maps,
> it's really called address mapping. Perhaps just drop the "mem", and
> call it tracecmd_proc_map. And address may not even be pointing into
> memory (It could have a device I/O mapping).
>
>
This struct will not be exposed, I'll rename it to tracecmd_proc_map
move it to trace-cmd-local.h

>
> > +     unsigned long long      start;
> > +     unsigned long long      end;
> > +     char                    *lib_name;
> > +};
> > +
> >  typedef void (*tracecmd_show_data_func)(struct tracecmd_input *handle,
> >                                       struct tep_record *record);
> >  typedef void (*tracecmd_handle_init_func)(struct tracecmd_input *handle,
> > @@ -208,6 +215,8 @@ unsigned long long tracecmd_page_ts(struct tracecmd_input *handle,
> >  unsigned int tracecmd_record_ts_delta(struct tracecmd_input *handle,
> >                                     struct tep_record *record);
> >
> > +struct lib_mem_map *tracecmd_search_task_mmap(struct tracecmd_input *handle,
> > +                                           int pid, unsigned long long addr);
> >  #ifndef SWIG
> >  /* hack for function graph work around */
> >  extern __thread struct tracecmd_input *tracecmd_curr_thread_handle;
> > diff --git a/lib/trace-cmd/trace-input.c b/lib/trace-cmd/trace-input.c
> > index 61566ba..d9abbcf 100644
> > --- a/lib/trace-cmd/trace-input.c
> > +++ b/lib/trace-cmd/trace-input.c
> > @@ -101,6 +101,7 @@ struct tracecmd_input {
> >       struct tracecmd_ftrace  finfo;
> >
> >       struct hook_list        *hooks;
> > +     struct pid_mem_maps     *pid_mmaps;
> >       /* file information */
> >       size_t                  header_files_start;
> >       size_t                  ftrace_files_start;
> > @@ -2134,6 +2135,166 @@ void tracecmd_set_ts2secs(struct tracecmd_input *handle,
> >       handle->use_trace_clock = false;
> >  }
> >
> > +static int trace_pid_mmap_cmp(const void *a, const void *b)
> > +{
> > +     struct lib_mem_map *map_a = (struct lib_mem_map *)a;
> > +     struct lib_mem_map *map_b = (struct lib_mem_map *)b;
> > +
> > +     if (map_a->start > map_b->start)
> > +             return 1;
> > +     if (map_a->start < map_b->start)
> > +             return -1;
> > +     return 0;
> > +}
> > +
> > +static void mmap_free(struct pid_mem_maps *maps)
> > +{
> > +     int i;
> > +
> > +     if (!maps)
> > +             return;
> > +     if (maps->lib_maps) {
> > +             for (i = 0; i < maps->nr_lib_maps; i++)
> > +                     free(maps->lib_maps[i].lib_name);
> > +             free(maps->lib_maps);
> > +     }
> > +     free(maps->proc_name);
> > +     free(maps);
> > +}
> > +
> > +#define STR_MMAP_LINE_MAX    (PATH_MAX+34)
>
> What's the reason for the +34?
I have no idea, cannot remember what was in my head at that time.
It should be the maximum possible size of
  sscanf(buf, "%x %x %s", &maps->pid, &maps->nr_lib_maps, mapname);
string, where:
      %s is the name of the library, up to PATH_MAX
      the two "%x", int, should be 20 (counting the leading 0x)
      and we have two spaces.
so according to this, the define should be (PATH_MAX+22).
We need somehow to protect the local buffer char mapname[], to not
be overflowed.
I guess this string was in a different format in the first version of
the code, and the
define was not updated when the string was changed.

>
> > +static int trace_pid_mmap_load(struct tracecmd_input *handle, char *buf)
> > +{
> > +     struct pid_mem_maps *maps = NULL;
> > +     char mapname[STR_MMAP_LINE_MAX];
> > +     char *line;
> > +     int res;
> > +     int ret;
> > +     int i;
> > +
> > +     maps = calloc(1, sizeof(*maps));
> > +     if (!maps)
> > +             return -ENOMEM;
> > +
> > +     ret  = -EINVAL;
> > +     line = strchr(buf, '\n');
> > +     if (!line)
> > +             goto out_fail;
> > +
> > +     *line = '\0';
> > +     if (strlen(buf) > STR_MMAP_LINE_MAX)
> > +             goto out_fail;
> > +
> > +     res = sscanf(buf, "%x %x %s", &maps->pid, &maps->nr_lib_maps, mapname);
> > +     if (res != 3)
> > +             goto out_fail;
> > +
> > +     ret  = -ENOMEM;
> > +     maps->proc_name = strdup(mapname);
> > +     if (!maps->proc_name)
> > +             goto out_fail;
> > +
> > +     maps->lib_maps = calloc(maps->nr_lib_maps, sizeof(struct lib_mem_map));
> > +     if (!maps->lib_maps)
> > +             goto out_fail;
> > +
> > +     buf = line + 1;
> > +     line = strchr(buf, '\n');
> > +     for (i = 0; i < maps->nr_lib_maps; i++) {
> > +             if (!line)
> > +                     break;
> > +             *line = '\0';
> > +             if (strlen(buf) > STR_MMAP_LINE_MAX)
> > +                     break;
> > +             res = sscanf(buf, "%llx %llx %s", &maps->lib_maps[i].start,
> > +                          &maps->lib_maps[i].end, mapname);
> > +             if (res != 3)
> > +                     break;
> > +             maps->lib_maps[i].lib_name = strdup(mapname);
> > +             if (!maps->lib_maps[i].lib_name)
> > +                     goto out_fail;
> > +             buf = line + 1;
> > +             line = strchr(buf, '\n');
> > +     }
> > +
> > +     ret  = -EINVAL;
> > +     if (i != maps->nr_lib_maps)
> > +             goto out_fail;
> > +
> > +     qsort(maps->lib_maps, maps->nr_lib_maps,
> > +           sizeof(*maps->lib_maps), trace_pid_mmap_cmp);
> > +
> > +     maps->next = handle->pid_mmaps;
> > +     handle->pid_mmaps = maps;
> > +
> > +     return 0;
> > +
> > +out_fail:
> > +     mmap_free(maps);
> > +     return ret;
> > +}
> > +
> > +static void trace_pid_mmap_free(struct pid_mem_maps *mmaps)
> > +{
> > +     struct pid_mem_maps *del;
> > +
> > +     while (mmaps) {
> > +             del = mmaps;
> > +             mmaps = mmaps->next;
> > +             mmap_free(del);
> > +     }
> > +}
> > +
> > +static int trace_pid_mmap_search(const void *a, const void *b)
> > +{
> > +     struct lib_mem_map *key = (struct lib_mem_map *)a;
> > +     struct lib_mem_map *map = (struct lib_mem_map *)b;
> > +
> > +     if (key->start >= map->end)
> > +             return 1;
> > +     if (key->start < map->start)
> > +             return -1;
> > +     return 0;
> > +}
> > +
> > +/**
> > + * tracecmd_search_task_mmap - Search task memory address map
> > + * @handle: input handle to the trace.dat file
> > + * @pid: pid of the task
> > + * @addr: address from the task memory space.
> > + *
> > + * Map of the task memory can be saved in the trace.dat file, using the option
> > + * "--mmap". If there is such information, this API can be used to look up into
> > + * this memory map to find what library is loaded at the given @addr.
> > + *
> > + * A pointer to struct lib_mem_map is returned, containing the name of
> > + * the library at given task @addr and the library start and end addresses.
> > + */
> > +struct lib_mem_map *tracecmd_search_task_mmap(struct tracecmd_input *handle,
> > +                                           int pid, unsigned long long addr)
> > +{
> > +     struct pid_mem_maps *maps;
> > +     struct lib_mem_map *lib;
> > +     struct lib_mem_map key;
> > +
> > +     if (!handle || !handle->pid_mmaps)
> > +             return NULL;
> > +
> > +     maps = handle->pid_mmaps;
> > +     while (maps) {
> > +             if (maps->pid == pid)
> > +                     break;
> > +             maps = maps->next;
> > +     }
> > +     if (!maps || !maps->nr_lib_maps || !maps->lib_maps)
> > +             return NULL;
> > +     key.start = addr;
> > +     lib = bsearch(&key, maps->lib_maps, maps->nr_lib_maps,
> > +                   sizeof(*maps->lib_maps), trace_pid_mmap_search);
> > +
> > +     return lib;
> > +}
> > +
> >  static int handle_options(struct tracecmd_input *handle)
> >  {
> >       unsigned long long offset;
> > @@ -2221,9 +2382,6 @@ static int handle_options(struct tracecmd_input *handle)
> >               case TRACECMD_OPTION_UNAME:
> >                       handle->uname = strdup(buf);
> >                       break;
> > -             case TRACECMD_OPTION_VERSION:
> > -                     handle->version = strdup(buf);
> > -                     break;
> >               case TRACECMD_OPTION_HOOK:
> >                       hook = tracecmd_create_event_hook(buf);
> >                       hook->next = handle->hooks;
> > @@ -2233,6 +2391,10 @@ static int handle_options(struct tracecmd_input *handle)
> >                       cpus = *(int *)buf;
> >                       handle->cpus = tep_read_number(handle->pevent, &cpus, 4);
> >                       break;
> > +             case TRACECMD_OPTION_PIDMMAPS:
>
>  OPTION_PROCMAPS ?
>
> > +                     if (buf[size] == '\0')
> > +                             trace_pid_mmap_load(handle, buf);
> > +                     break;
> >               default:
> >                       warning("unknown option %d", option);
> >                       break;
> > @@ -2842,10 +3004,12 @@ void tracecmd_close(struct tracecmd_input *handle)
> >       free(handle->cpu_data);
> >       free(handle->uname);
> >       close(handle->fd);
> > -
>
> I wouldn't remove this blank line.
>
>
> >       tracecmd_free_hooks(handle->hooks);
> >       handle->hooks = NULL;
> >
> > +     trace_pid_mmap_free(handle->pid_mmaps);
> > +     handle->pid_mmaps = NULL;
> > +
> >       if (handle->flags & TRACECMD_FL_BUFFER_INSTANCE)
> >               tracecmd_close(handle->parent);
> >       else {
> > diff --git a/tracecmd/include/trace-local.h b/tracecmd/include/trace-local.h
> > index 1cad3cc..ae1632c 100644
> > --- a/tracecmd/include/trace-local.h
> > +++ b/tracecmd/include/trace-local.h
> > @@ -157,6 +157,14 @@ struct func_list {
> >       const char *mod;
> >  };
> >
> > +struct pid_mem_maps {
> > +     struct pid_mem_maps     *next;
> > +     struct lib_mem_map      *lib_maps;
> > +     unsigned int            nr_lib_maps;
> > +     char                    *proc_name;
> > +     int                     pid;
> > +};
> > +
> >  struct buffer_instance {
> >       struct buffer_instance  *next;
> >       const char              *name;
> > @@ -183,6 +191,8 @@ struct buffer_instance {
> >       struct tracecmd_msg_handle *msg_handle;
> >       struct tracecmd_output *network_handle;
> >
> > +     struct pid_mem_maps     *mem_maps;
> > +
> >       char                    *max_graph_depth;
> >
> >       int                     flags;
> > diff --git a/tracecmd/trace-record.c b/tracecmd/trace-record.c
> > index 5dc6f17..48081d4 100644
> > --- a/tracecmd/trace-record.c
> > +++ b/tracecmd/trace-record.c
> > @@ -84,6 +84,7 @@ static int max_kb;
> >  static bool use_tcp;
> >
> >  static int do_ptrace;
> > +static int get_mmap;
> >
> >  static int filter_task;
> >  static int filter_pid = -1;
> > @@ -1062,6 +1063,120 @@ static char *make_pid_filter(char *curr_filter, const char *field)
> >       return filter;
> >  }
> >
> > +static int get_pid_mmaps(int pid)
>
> Should probably remove all references to anything with 'mmap' here and
> even remove "mem" as well. Again, these are technically not memory
> maps but address mappings. Subtle difference, but a solid difference
> none the less.

OK
I'll rename all 'mmaps', related to this feature, to 'procmaps', for
consistency.
>
> > +{
> > +     struct buffer_instance *instance = &top_instance;
> > +     struct pid_mem_maps *maps = instance->mem_maps;
> > +     struct pid_mem_maps *m;
> > +     unsigned long long begin, end, inode, tmp;
> > +     struct lib_mem_map *map;
> > +     char mapname[PATH_MAX+1];
> > +     char fname[PATH_MAX+1];
> > +     char buf[PATH_MAX+100];
>
> What's the reason for the +100?
This should count the maximum size of
   "%llx-%llx %s %llx %s %lld "
 Where we have:
   3 * %llx -> 48
   %lld -> 21
   %s, 4 bytes permisions
   %s, 5 bytes device
   spaces - at least 6 bytes, can vary.
>
> > +     char perm[5];
> > +     char dev[6];
> > +     FILE *f;
> > +     int ret;
> > +     int res;
> > +     int i;
> > +
> > +     sprintf(fname, "/proc/%d/exe", pid);
> > +     ret = readlink(fname, mapname, PATH_MAX);
> > +     if (ret >= PATH_MAX || ret < 0)
> > +             return -ENOENT;
> > +     mapname[ret] = 0;
> > +
> > +     sprintf(fname, "/proc/%d/maps", pid);
> > +     f = fopen(fname, "r");
> > +     if (!f)
> > +             return -ENOENT;
> > +
> > +     while (maps) {
> > +             if (pid == maps->pid)
> > +                     break;
> > +             maps = maps->next;
> > +     }
> > +
> > +     ret = -ENOMEM;
> > +     if (!maps) {
> > +             maps = calloc(1, sizeof(*maps));
> > +             if (!maps)
> > +                     goto out_fail;
> > +             maps->pid = pid;
> > +             maps->next = instance->mem_maps;
> > +             instance->mem_maps = maps;
> > +     } else {
> > +             for (i = 0; i < maps->nr_lib_maps; i++)
> > +                     free(maps->lib_maps[i].lib_name);
> > +             free(maps->lib_maps);
> > +             maps->lib_maps = NULL;
> > +             maps->nr_lib_maps = 0;
> > +             free(maps->proc_name);
> > +     }
> > +
> > +     maps->proc_name = strdup(mapname);
> > +     if (!maps->proc_name)
> > +             goto out;
> > +
> > +     while (fgets(buf, sizeof(buf), f)) {
> > +             mapname[0] = '\0';
> > +             res = sscanf(buf, "%llx-%llx %s %llx %s %lld %s",
> > +                          &begin, &end, perm, &tmp, dev, &inode, mapname);
>
> This is dangerous with perm and dev, as if either string is bigger than
> their size, you just created an overflow bug.
>
>         buf, "%llx-%llx %4s %llx %5s %lld %" STRINGIFY(PATH_MAX) "s"
>
> Which will keep sscanf from overwriting the size of the variables.
>
> Note, the STRINGIFY() is defined as:
>
> #define _STRINGIFY(x) #x
> #define STRINGIFY(x) _STRINGIFY(x)
>
Thanks, I'll change it. I will remove perm and dev, as these
are not used in our code and their length can vary. Going to read them
with "%*s".

> -- Steve
>
>
--
Tzvetomir (Ceco) Stoyanov
VMware Open Source Technology Center

      reply index

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-03 11:51 tz.stoyanov
2019-07-04 12:36 ` Yordan Karadzhov (VMware)
2019-08-08 22:03 ` Steven Rostedt
2019-08-08 22:56 ` Steven Rostedt
2019-08-09 12:16   ` Tzvetomir Stoyanov [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACqStofTwKeD_K+=s71_XHwWbVidarO7k28Zw0dhOoSKTtTXAQ@mail.gmail.com' \
    --to=tstoyanov@vmware.com \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tz.stoyanov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Trace-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-trace-devel/0 linux-trace-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-trace-devel linux-trace-devel/ https://lore.kernel.org/linux-trace-devel \
		linux-trace-devel@vger.kernel.org linux-trace-devel@archiver.kernel.org
	public-inbox-index linux-trace-devel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-trace-devel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox