linux-kernel.vger.kernel.org archive mirror
* getrusage vs /proc/pid/stat?
@ 2001-06-18  5:17 Dan Kegel
  2001-06-18 17:44 ` Pete Wyckoff
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Kegel @ 2001-06-18  5:17 UTC (permalink / raw)
  To: linux-kernel

I'd like to monitor CPU, memory, and I/O utilization in a 
long-running multithreaded daemon under kernels 2.2, 2.4,
and possibly also Solaris (#ifdefs are ok).

getrusage() looked promising, and might even work for CPU utilization.
Dunno if it returns info for all child threads yet, haven't tried it.
In Linux, though, getrusage() doesn't return any info about RAM.

I know I can get the RSS and VSIZE under Linux by parsing /proc/pid/stat,
but was hoping for a faster interface (although I suppose a seek,
a read, and an ascii parse isn't *that* slow).  Is /proc/pid/stat
the only way to go under Linux to monitor RSS?
- Dan
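
A minimal sketch of the /proc parse described above, for reference. It assumes the field layout documented in 'man proc' (vsize is field 23 in bytes, rss is field 24 in pages); the names MemUsage, parseStatMemory and readSelfMemory are made up for illustration and come from no library.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical result type: virtual size in bytes, resident set in pages.
struct MemUsage {
    unsigned long vsizeBytes;
    long rssPages;
};

// Parse the single line found in /proc/<pid>/stat.
// Returns false if the line does not look like a stat line.
bool parseStatMemory(const std::string& stat, MemUsage& out)
{
    // Field 2 (the command name) may contain spaces and parentheses,
    // so the remaining fields are located after the LAST ')'.
    std::string::size_type close = stat.rfind(')');
    if (close == std::string::npos)
        return false;
    const char* p = stat.c_str() + close + 1;

    // Field 3 (state) comes next.  Skip fields 3..22 to reach vsize.
    for (int field = 3; field < 23; ++field) {
        while (*p == ' ') ++p;
        if (!*p) return false;
        while (*p && *p != ' ') ++p;
    }
    return sscanf(p, "%lu %ld", &out.vsizeBytes, &out.rssPages) == 2;
}

// Convenience wrapper for the calling process (Linux only).
bool readSelfMemory(MemUsage& out)
{
    std::ifstream in("/proc/self/stat");
    std::string line;
    return std::getline(in, line) && parseStatMemory(line, out);
}
```

To get RSS in bytes, multiply rssPages by sysconf(_SC_PAGESIZE). Keeping the parser separate from the file read makes it easy to check against a canned line.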

-- 
"A computer is a state machine.
 Threads are for people who can't program state machines."
         - Alan Cox


* Re: getrusage vs /proc/pid/stat?
  2001-06-18  5:17 getrusage vs /proc/pid/stat? Dan Kegel
@ 2001-06-18 17:44 ` Pete Wyckoff
  2001-06-18 21:20   ` Dan Kegel
  0 siblings, 1 reply; 5+ messages in thread
From: Pete Wyckoff @ 2001-06-18 17:44 UTC (permalink / raw)
  To: Dan Kegel; +Cc: linux-kernel

dank@kegel.com said:
> I'd like to monitor CPU, memory, and I/O utilization in a 
> long-running multithreaded daemon under kernels 2.2, 2.4,
> and possibly also Solaris (#ifdefs are ok).
> 
> getrusage() looked promising, and might even work for CPU utilization.
> Dunno if it returns info for all child threads yet, haven't tried it.
> In Linux, though, getrusage() doesn't return any info about RAM.
> 
> I know I can get the RSS and VSIZE under Linux by parsing /proc/pid/stat,
> but was hoping for a faster interface (although I suppose a seek,
> a read, and an ascii parse isn't *that* slow).  Is /proc/pid/stat
> the only way to go under Linux to monitor RSS?

getrusage() isn't really the system call you want for this.

There is a max RSS value, which linux could support but doesn't, but
you seem to want to see the current RSS over time.  Search the archive
for various patches/complaints about getrusage.

For vsize, most OSes offer time-integral averages of text, data, and
stack sizes via getrusage().  Again, more of an aggregate than a current
snapshot, and again, linux returns zero for these currently.

		-- Pete
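
To see what a particular kernel actually fills in, something like the sketch below can be used. It only prints the standard struct rusage fields; printSelfRusage is a made-up name. On the kernels discussed in this thread the memory fields come back as zero, and according to the getrusage(2) man page later Linux kernels (2.6.32 onward) do report ru_maxrss, so treat any value as kernel-dependent.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/resource.h>
#include <sys/time.h>

// Print the CPU and memory fields of getrusage(RUSAGE_SELF).
// Returns false if the call itself fails.
bool printSelfRusage(FILE* out)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return false;

    fprintf(out, "utime  %ld.%06ld s\n",
            (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
    fprintf(out, "stime  %ld.%06ld s\n",
            (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);

    // Max RSS, then the time-integral text/data/stack sizes.
    // A zero here may simply mean "not implemented by this kernel".
    fprintf(out, "maxrss %ld\n", (long)ru.ru_maxrss);
    fprintf(out, "ixrss  %ld\n", (long)ru.ru_ixrss);
    fprintf(out, "idrss  %ld\n", (long)ru.ru_idrss);
    fprintf(out, "isrss  %ld\n", (long)ru.ru_isrss);
    return true;
}
```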


* Re: getrusage vs /proc/pid/stat?
  2001-06-18 17:44 ` Pete Wyckoff
@ 2001-06-18 21:20   ` Dan Kegel
  2001-06-18 23:34     ` J . A . Magallon
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Kegel @ 2001-06-18 21:20 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: linux-kernel

Pete Wyckoff wrote:
> 
> dank@kegel.com said:
> > I'd like to monitor CPU, memory, and I/O utilization in a
> > long-running multithreaded daemon under kernels 2.2, 2.4,
> > and possibly also Solaris (#ifdefs are ok).
> 
> getrusage() isn't really the system call you want for this.

I'll buy that.  Looks like a lot of unices don't implement that
call fully, and Linux is one of them.

What is the proper way to measure total CPU time used by a multithreaded program?
The only way I can figure to do it is to sum /proc/pid/stat across
the threads of interest (see below).  Is this the easiest way, 
or am I missing something?  (Forgive the C++.  I'd recode this in C
if it were for general use.)

========= LinuxTimes.h ==========
#include <sys/times.h>
#include <pthread.h>

/*--------------------------------------------------------------------------
 Source and test case at http://www.kegel.com/lt.tar.gz

 Monitor the CPU usage of a bunch of threads in the same process.
 This is a simulation of the system call times() 
 providing traditional semantics under LinuxThreads.
 On e.g. Solaris, you don't need this; you just call the standard times().
--------------------------------------------------------------------------*/
class LinuxTimes {
    const static int MAXTHREADS = 100;

    /// number of threads being monitored
    int m_nthreads;

    /// fd open to /proc/pid/stat for each thread
    int m_proc_pid_stat_fd[MAXTHREADS];

    /// make addSelf threadsafe
    pthread_mutex_t m_lock;

public:

    LinuxTimes() { m_nthreads = 0; pthread_mutex_init(&m_lock, NULL); }

    /**
     New threads call this to add themselves to the group.
     Threadsafe.
     @return 0 on success, Unix error code on failure
     */
    int addSelf();

    /**
     Calculate user and system time accumulated by all threads in group.
     Return result in tms_utime and tms_stime fields of given struct tms.
     Similar to 'man 2 times' on Solaris (where all CPU time of all threads
     is counted as CPU time towards the process).
     @return 0 on success, Unix error code on failure
     */
    int times(struct tms *buf);
};

========= LinuxTimes.cc ==========

#include "LinuxTimes.h"
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>      

/**
 New threads call this to add themselves to the group.
 Threadsafe.
 @return 0 on success, Unix error code on failure
 */
int LinuxTimes::addSelf()
{
    int fd;
    char fname[256];
    int err = 0;

    if (pthread_mutex_lock(&m_lock))
        return EINVAL;

    if (m_nthreads >= MAXTHREADS) {
        err = E2BIG;
    } else {
        // Under LinuxThreads, each thread has its own pid
        sprintf(fname, "/proc/%d/stat", getpid());
        fd = open(fname, O_RDONLY);
        if (fd == -1) 
            err = errno;
        else {
            m_proc_pid_stat_fd[m_nthreads++] = fd;
        }
    }

    if (pthread_mutex_unlock(&m_lock))
        return EINVAL;

    return err;
}

/* Skip to char after nth whitespace.  Returns NULL on failure. */
static const char *skipNspace(const char *p, int n)
{
    while (*++p)
        if (isspace(*p) && ! --n) 
            return p+1;
    return NULL;
}

/**
 Calculate user and system time accumulated by all threads in group.
 Return result in tms_utime and tms_stime fields of given struct tms.
 Similar to 'man 2 times' on Solaris (where all CPU time of all threads
 is counted as CPU time towards the process).
 @return 0 on success, Unix error code on failure
 */
int LinuxTimes::times(struct tms *buf)
{
    int i;
    int nread;

    buf->tms_utime = 0;
    buf->tms_stime = 0;
    for (i=0; i<m_nthreads; i++) {
        char scratch[1024]; // FIXME: is that big enough?
        int fd = m_proc_pid_stat_fd[i];

        // rewind to start of stat file.  (Not all /proc entries support this.)
        if (lseek(fd, 0, SEEK_SET))
            return EBADF;

        // Read in ASCII data and parse out utime and stime fields
        // (see 'man proc')
        nread = read(fd, scratch, sizeof(scratch)-1);
        if (nread < 50)     // FIXME: cheesy
            return EINVAL;
        scratch[nread] = 0;
        
        // Skip to end of command field.  The command may itself contain
        // ')', so search for the last one in the line.
        const char *p = strrchr(scratch, ')');
        if (!p || !p[1] || !p[2]) return EINVAL;
        p += 2;

        // Skip to utime field
        p = skipNspace(p, 11);
        if (!p) return EINVAL;
        buf->tms_utime += strtoul(p, NULL, 10);

        // Skip to stime field
        p = skipNspace(p, 1);
        if (!p) return EINVAL;
        buf->tms_stime += strtoul(p, NULL, 10);
    }

    return 0;
}

==============
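
The tms_utime and tms_stime values summed above are in clock ticks. A rough sketch of turning two such samples into a utilization figure follows; cpuUtilization is a hypothetical helper, and the caller is assumed to supply the wall-clock interval and the tick rate (normally sysconf(_SC_CLK_TCK)).

```cpp
#include <cassert>
#include <sys/times.h>
#include <unistd.h>

// CPU time used between two samples, as a fraction of one CPU.
// The result can exceed 1.0 when threads run on several CPUs at once.
double cpuUtilization(const struct tms& earlier, const struct tms& later,
                      double wallSeconds, long ticksPerSecond)
{
    if (wallSeconds <= 0.0 || ticksPerSecond <= 0)
        return 0.0;

    // user + system ticks consumed during the interval
    double ticks = double(later.tms_utime - earlier.tms_utime)
                 + double(later.tms_stime - earlier.tms_stime);

    return ticks / double(ticksPerSecond) / wallSeconds;
}
```

For example, with 100 ticks per second, 200 ticks of CPU over a 2 second interval works out to a utilization of 1.0, i.e. one CPU kept busy.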

Thanks,
Dan

-- 
"A computer is a state machine.
 Threads are for people who can't program state machines."
         - Alan Cox


* Re: getrusage vs /proc/pid/stat?
  2001-06-18 21:20   ` Dan Kegel
@ 2001-06-18 23:34     ` J . A . Magallon
  2001-06-19 15:05       ` Dan Kegel
  0 siblings, 1 reply; 5+ messages in thread
From: J . A . Magallon @ 2001-06-18 23:34 UTC (permalink / raw)
  To: Dan Kegel; +Cc: Pete Wyckoff, linux-kernel @ vger . kernel . org


On 20010618 Dan Kegel wrote:
>Pete Wyckoff wrote:
>> 
>> dank@kegel.com said:
>> > I'd like to monitor CPU, memory, and I/O utilization in a
>> > long-running multithreaded daemon under kernels 2.2, 2.4,
>> > and possibly also Solaris (#ifdefs are ok).
>> 
>> getrusage() isn't really the system call you want for this.
>
>I'll buy that.  Looks like a lot of unices don't implement that
>call fully, and Linux is one of them.
>
>What is the proper way to measure total CPU time used by a multithreaded program?

I have just the same problem. getrusage() did not catch the CPU time for
children, even though the man page says it should. Now I am using times(2),
which seems to work on Solaris but gives nothing on Linux.

If you look at the time(1) manpage, it says time is implemented on top of the
times(2) system call. But if I make that call myself, it gives me only zero.

This is the output on Solaris:

den:~/ask/tst/cbox0> time box @options
Rendering box.jpg: 64x64
****************************************************************
Wall Time:0000:00:00.241
User Time:0000:00:00.600
Sys  Time:0000:00:00.000

real    0m0.59s
user    0m0.76s
sys     0m0.07s

(user is greater because the program uses 4 CPUs and times() is cumulative)

And this is the output on Linux (2.4.5-ac15, glibc2.2.3)

werewolf:~/ask/tst/cbox0> time -p box @options
Rendering box.jpg: 64x64
****************************************************************
Wall Time:0000:00:01.299
User Time:0000:00:00.000
Sys  Time:0000:00:00.000
real 1.43
user 2.63
sys 0.02

????? time gives good results for summed CPU time, but my own call
to times(2) fails ???

In case it helps (and in case you spot an error), here is my Timer:

timer.h:

#ifndef AST_TIMER_H
#define AST_TIMER_H

#include <ast/api.h>

class __apit Timer
{
protected:
	double	wstart,wlast,wnow;
	double	ustart,ulast,unow;
	double	sstart,slast,snow;
public:
static char*	format(double t);
	Timer();

	void reset();
	void update();

	double wall();
	double user();
	double system();

	double ewall();
	double euser();
	double esystem();
};

#endif // AST_TIMER_H 

timer.cc:

#include <ast/timer.h>
#include <ast/stream.h>
#include <string.h>
#include <stdio.h>
#include <time.h>

#ifdef __UNX__
#include <sys/time.h>
#include <sys/times.h>
#include <sys/resource.h>
#include <unistd.h>
#endif

//#define USE_GETRUSAGE

char *Timer::format(double t)
{
	int	h  = int(t/3600);
	int	m  = int((t-h*3600)/60);
	int	s  = int(t-h*3600-m*60);
	int	ms = int(1000*(t-h*3600-m*60-s));

	char* str = new char[64];
	ostrstream ss(str,64,ios::trunc);
	ss << setfill('0');
	ss << setw(4) << h;
	ss << ":" << setw(2) << m;
	ss << ":" << setw(2) << s;
	ss << "." << setw(3) << ms;
	ss << ends;

	return str;
}

Timer::Timer()
{
	reset();
}

void Timer::reset()
{
	update();

	wstart = wnow;
	ustart = unow;
	sstart = snow;
}

void Timer::update()
{
// Wall clock time
#ifdef __UNX__
	struct timeval tv;

	gettimeofday(&tv,NULL);
	// multiply in floating point: 1000*tv_sec overflows a 32-bit long
	wnow = 1000.0*tv.tv_sec+1e-3*tv.tv_usec;
#endif
#ifdef __WIN__
	wnow = ::time(NULL)*1000.0;
#endif
#ifdef __MAC__
	wnow = ::time(NULL)*1000.0;
#endif

// CPU User and System times
#ifdef __UNX__
#ifdef USE_GETRUSAGE
	struct rusage rus,ruc;
	getrusage(RUSAGE_SELF,&rus);
	getrusage(RUSAGE_CHILDREN,&ruc);

	double s,u;

	s = rus.ru_utime.tv_sec+ruc.ru_utime.tv_sec;
	u = rus.ru_utime.tv_usec+ruc.ru_utime.tv_usec;
	unow = 1000*s+1e-3*u;

	s = rus.ru_stime.tv_sec+ruc.ru_stime.tv_sec;
	u = rus.ru_stime.tv_usec+ruc.ru_stime.tv_usec;
	snow = 1000*s+1e-3*u;
#else
	struct tms t;
	times(&t);

	double s;

	s = t.tms_utime+t.tms_cutime;
#ifdef __linux__
	unow = double(1000*s)/double(sysconf(_SC_CLK_TCK));
#else
	unow = double(1000*s)/double(CLK_TCK);
#endif

	s = t.tms_stime+t.tms_cstime;
#ifdef __linux__
	snow = double(1000*s)/double(sysconf(_SC_CLK_TCK));
#else
	snow = double(1000*s)/double(CLK_TCK);
#endif

#endif
#endif
#ifdef __WIN__
	// floating point multiply avoids overflowing a 32-bit time_t
	unow = ::time(NULL)*1000.0;
	snow = 0;
#endif
#ifdef __MAC__
	unow = ::time(NULL)*1000.0;
	snow = 0;
#endif

	wlast = wnow;
	ulast = unow;
	slast = snow;
}

double Timer::wall()
{
	update();
	return 0.001*(wnow-wstart);
}

double Timer::user()
{
	update();
	return 0.001*(unow-ustart);
}

double Timer::system()
{
	update();
	return 0.001*(snow-sstart);
}

double Timer::ewall()
{
	double last = wlast;
	update();
	return 0.001*(wnow-last);
}

double Timer::euser()
{
	double last = ulast;
	update();
	return 0.001*(unow-last);
}

double Timer::esystem()
{
	double last = slast;
	update();
	return 0.001*(snow-last);
}

-- 
J.A. Magallon                           #  Let the source be with you...        
mailto:jamagallon@able.es
Linux Mandrake release 8.1 (Cooker) for i586
Linux werewolf 2.4.5-ac15 #2 SMP Sun Jun 17 02:12:45 CEST 2001 i686


* Re: getrusage vs /proc/pid/stat?
  2001-06-18 23:34     ` J . A . Magallon
@ 2001-06-19 15:05       ` Dan Kegel
  0 siblings, 0 replies; 5+ messages in thread
From: Dan Kegel @ 2001-06-19 15:05 UTC (permalink / raw)
  To: J . A . Magallon; +Cc: Pete Wyckoff, linux-kernel @ vger . kernel . org

"J . A . Magallon" wrote:
> I have just the same problem. getrusage() did not catch the CPU time for
> children, even though the man page says it should. Now I am using times(2),
> which seems to work on Solaris but gives nothing on Linux.
> 
> If you look at the time(1) manpage, it says time is implemented on top of the
> times(2) system call. But if I make that call myself, it gives me only zero.
>
> ????? time gives good results for summed CPU time, but my own call
> to times(2) fails ???

It could be that you have to wait() for the child before times()
includes it in 'children time'.
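
A small sketch for checking that guess is below. It forks a child that burns some CPU, then samples the children's time from times() once before and once after waitpid(). ChildTicks and measureChildTicks are names invented for this example, and the pipe is only there so the parent knows the child has finished its work before taking the first sample. Note this exercises an ordinary forked child; whether LinuxThreads threads are ever counted this way is a separate question.

```cpp
#include <cassert>
#include <sys/times.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Children's CPU ticks (tms_cutime + tms_cstime) gained since the fork,
// sampled before and after the parent waits for the child.
struct ChildTicks {
    bool ok;
    long beforeWait;
    long afterWait;
};

static long childTicksNow()
{
    struct tms t;
    if (times(&t) == (clock_t)-1)
        return -1;
    return (long)(t.tms_cutime + t.tms_cstime);
}

ChildTicks measureChildTicks()
{
    ChildTicks r = { false, 0, 0 };
    int fds[2];
    long base = childTicksNow();
    if (base < 0 || pipe(fds) != 0)
        return r;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return r;
    }
    if (pid == 0) {
        // Child: burn roughly 0.2 s of CPU, tell the parent, exit.
        close(fds[0]);
        clock_t stop = clock() + CLOCKS_PER_SEC / 5;
        while (clock() < stop) { }
        char done = 1;
        if (write(fds[1], &done, 1) != 1)
            _exit(1);
        _exit(0);
    }

    // Parent: block until the child reports it has done its work.
    close(fds[1]);
    char done = 0;
    bool signalled = (read(fds[0], &done, 1) == 1);
    close(fds[0]);

    r.beforeWait = childTicksNow() - base;   // child not yet waited for

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return r;

    r.afterWait = childTicksNow() - base;    // child has been reaped
    r.ok = signalled;
    return r;
}
```

If the guess is right, beforeWait stays at zero and the child's ticks only show up in afterWait.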

By the way, the source for time is easy to find.  Here's debian's
(just search for time.c, then click on 'main'):
http://src.openresources.com/debian/src/utils/HTML/mains.html

If that doesn't help, maybe the code I sent you that reads /proc/pid/stat
for all threads of interest will.  Either way, let me know...
- Dan

