From: Pratyush Anand <panand@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, dyoung@redhat.com, xlpang@redhat.com
Subject: Re: [trace-cmd Patch RFC] trace-cmd: top: A new interface to detect peak memory
Date: Thu, 27 Apr 2017 19:32:43 +0530 [thread overview]
Message-ID: <727920fa-219c-6d8d-3ba6-2f0553b2cbdc@redhat.com> (raw)
In-Reply-To: <20170426100135.758cb098@gandalf.local.home>
Hi Steve,
On Wednesday 26 April 2017 07:31 PM, Steven Rostedt wrote:
>
> Sorry for the late reply. I finally have time to start looking at
> trace-cmd again.
Thanks a lot for your review :-)
>
> On Wed, 1 Mar 2017 20:32:37 +0530
> Pratyush Anand <panand@redhat.com> wrote:
>
[...]
> This is as nice idea, but it needs some clean ups. I tried it out just
> to play with it.
>
> First, I just did:
>
> trace-cmd top
>
> And nothing happened. Either a help message needs to be printed, or it
> starts something. Maybe have it be interactive, that is, print out data
> as it comes in?
>
[...]
>
> But then for some reason it -p stopped doing anything. And -e didn't
> stop it either. But after killing it manually and removing the pid
> file, it appeared to work normal again. I can't seem to repeat it, so
> lets not worry about that now.
>
Yes, this version was sharing info between two processes using files. I have a
new version [1] which has sharing through socket and that works better. Help
issue is also fixed in that version.
> More below.
>
[...]
>> +void top_disable_events(void)
>> +{
>> + char *path;
>> + char c;
>> + int fd;
>> + int ret;
>> +
>> + /*
>> + * WARNING: We want only few events related to memory allocation to
>> + * be enabled. Disable all events here. Latter, selective events
>> + * would be enabled
>
> trace-cmd reset does this. You may want to call that instead of doing
> it manually here. Someone else wanted to pull out trace-cmd reset into
> its own file. I'll have to get those patches. But then we can call that
> and enable tracing again.
>
> Or better yet, if we are only tracing memory events, just create your
> own instance and use that. Why bother with normal tracing?
>
> mkdir /sys/kernel/debug/tracing/instances/trace-cmd-memory
Yes, it seems a good idea. Will do.
>
>> + */
>> + c = '0';
>> + path = get_instance_file(&top_instance, "events/enable");
>> + fd = open(path, O_WRONLY);
>> + if (fd < 0)
>> + die("failed to open '%s'", path);
>> + ret = write(fd, &c, 1);
>> + close(fd);
>> + tracecmd_put_tracing_file(path);
>> + if (ret < 1)
>> + die("failed to write 0 to events/enable");
>> +
>> + path = get_instance_file(&top_instance, "tracing_on");
>> + fd = open(path, O_WRONLY);
>> + if (fd < 0)
>> + die("failed to open '%s'", path);
>> + ret = write(fd, &c, 1);
>> + close(fd);
>> + tracecmd_put_tracing_file(path);
>> + if (ret < 1)
>> + die("failed to write 0 to tracing_on\n");
>> +
>> + path = get_instance_file(&top_instance, "set_event");
>> + fd = open(path, O_WRONLY);
>> + if (fd < 0)
>> + die("failed to open '%s'", path);
>> + close(fd);
>> + tracecmd_put_tracing_file(path);
>> +}
>> +
>> +void top_exit(void)
>> +{
>> + int i;
>> + /*
>> + * free all the dynamic memories which could have been allocated by
>> + * this program
>> + */
>> + if (kbuf)
>> + kbuffer_free(kbuf);
>> + if (record)
>> + free_record(record);
>> + if (pevent)
>> + pevent_free(pevent);
>
> The above three functions all can handle a NULL pointer. Remove the if
> statements.
ok
>
>> + if (top_task_stats) {
>> + for (i = 0; i < top_task_cnt; i++)
>> + free(top_task_stats[i].comm);
>> + free(top_task_stats);
>> + }
>> +
>> + top_disable_events();
>> + remove(TRACE_CMD_TOP_PID_FILE);
>> +}
>> +
>> +void top_initialize_events(void)
>> +{
>> + char *path;
>> + int fd;
>> + char c;
>> + int ret;
>> + char *buf;
>> + int size;
>> + enum kbuffer_long_size long_size;
>> + enum kbuffer_endian endian;
>> + char id_str[8];
>> +
>
> Might want to look at trace_stream_init() and use that instead. It is
> made for live streaming of events.
OK, will look into trace_stream_init().
>
>> + /* trace data is read from the trace_raw_pipe directly */
>> + if (tracecmd_host_bigendian())
>> + endian = KBUFFER_ENDIAN_BIG;
>> + else
>> + endian = KBUFFER_ENDIAN_LITTLE;
>> +
[...]
>> +void top_start(void)
>> +{
>> + pid_t pid;
>> + int fd, ret;
>> +
>> + top_disable_events();
>> + top_initialize_events();
>> + page_size = getpagesize();
>> +
>> + pid = fork();
>> + if (!pid) {
>> + do {
>> + top_process_events();
>> + sleep(1);
>
> We should probably just sleep on the buffer, waiting for data.
OK
>
>> + if (set_print_only) {
>> + top_print_stats();
>> + set_print_only = false;
>> + }
>> + if (set_print_and_exit) {
>> + top_print_stats();
>> + top_exit();
>> + break;
>> + }
>> + } while (1);
>> + } else {
>> + fd = open(TRACE_CMD_TOP_PID_FILE, O_WRONLY | O_CREAT);
>> + if (fd < 0)
>> + goto kill_child;
>> + ret = write(fd, &pid, sizeof(pid));
>> + if (ret < sizeof(pid))
>> + goto close_fd;
>> + close(fd);
>> + }
>> +
>> + return;
>> +
>> +close_fd:
>> + close(fd);
>> +kill_child:
>> + kill(pid, SIGUSR2);
>> + die("Failed to start trace-top");
>> + return;
>> +}
>> +
[...]
> Looks good, I think we can extend on this.
Thanks :-)
I will implement your review comments and will send next revision.
However, I had couple of observation which I was unable to justify:
# ./trace-cmd top -s /tmp/test
# ./trace-cmd top -p /tmp/test | grep trace-cmd
15292 trace-cmd 22144 15808
Here,15292 is the pid of trace-cmd task
22144 KB is the peak memory usage
15808 KB is the current memory usage
Now check rss component from statm
# cat /proc/15292/statm
50 35 23 7 0 12 0 36
This is a result on ARM64/64KB page size. Therefore, as per statm rss is 35
pages = 35*64 = 2240KB
I patched my kernel [2] for test purpose, so that statm reports peak memory as
well. Here, the last extra entry in statm output is peak memory and it is 36
pages = 2304KB.
So, this is a huge difference between what has been reported by statm and what
we get from trace-cmd.
I understand that `trace-cmd top` output would only be approximate, because
many of the memory could be allocated by task and freed in interrupt context.
So, many a time it can differ. But, is such a huge difference justified? If
yes, can we count on the output of this utility to find early boot time oom
issues?
[1]
https://github.com/pratyushanand/trace-cmd/commit/602c2cd96aa613633ad20c6d382e41f7db37a2a4
[2]
https://github.com/pratyushanand/linux/commit/197e2045361b6b70085c46c78ea6607d8afce517
next prev parent reply other threads:[~2017-04-27 14:02 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-01 15:02 [trace-cmd Patch RFC] trace-cmd: top: A new interface to detect peak memory Pratyush Anand
2017-04-26 14:01 ` Steven Rostedt
2017-04-27 14:02 ` Pratyush Anand [this message]
2017-04-27 16:49 ` Steven Rostedt
2017-04-27 17:33 ` Pratyush Anand
2017-04-28 2:27 ` Steven Rostedt
2019-01-04 4:02 ` Steven Rostedt
2019-01-04 4:07 ` Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=727920fa-219c-6d8d-3ba6-2f0553b2cbdc@redhat.com \
--to=panand@redhat.com \
--cc=dyoung@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=xlpang@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).