From: Thomas Gleixner <tglx@linutronix.de>
To: David Miller <davem@davemloft.net>
Cc: peterz@infradead.org, linux-kernel@vger.kernel.org
Subject: Re: process time < thread time?
Date: Thu, 1 Sep 2011 11:56:42 +0200 (CEST)
Message-ID: <alpine.LFD.2.02.1109011148060.2723@ionos>
In-Reply-To: <20110831.230718.2029810906806382170.davem@davemloft.net>

Dave,

On Wed, 31 Aug 2011, David Miller wrote:

> If someone who understands our thread/process time implementation can
> look into this, I'd appreciate it.
>
> Attached below is a watered-down version of rt/tst-cpuclock2.c from
> GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
> similar.
>
> Run it several times, and you will see cases where the main thread
> will measure a process clock difference before and after the nanosleep
> which is smaller than the cpu-burner thread's individual thread clock
> difference.  This doesn't make any sense since the cpu-burner thread
> is part of the top-level process's thread group.
>
> I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
> 64-bit binaries).
>
> For example:
>
> [davem@boricha build-x86_64-linux]$ ./test
> process: before(0.001221967) after(0.498624371) diff(497402404)
> thread:  before(0.000081692) after(0.498316431) diff(498234739)
> self:    before(0.001223521) after(0.001240219) diff(16698)
> [davem@boricha build-x86_64-linux]$
>
> The diff of 'process' should always be >= the diff of 'thread'.
>
> I make sure to wrap the 'thread' clock measurements the most tightly
> around the nanosleep() call, and that the 'process' clock measurements
> are the outer-most ones.
>
> I suspect this might be some kind of artifact of how the partial
> runqueue ->clock and ->clock_task updates work?  Maybe some weird
> interaction with ->skip_clock_update?
>
> Or is this some known issue?

That's an SMP artifact. If you run "taskset 01 ./test" the result is
always correct.

The reason why this shows deviations on SMP is how the thread times
are accumulated in thread_group_cputime(). We sum
t->se.sum_exec_runtime of all threads. So if the hog thread is
currently running on the other core (which is likely) then the
runtime field of that thread is not up to date.

The untested patch below should cure this.

Thanks,

	tglx

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 58f405b..42378cb 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -250,7 +250,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 	do {
 		times->utime = cputime_add(times->utime, t->utime);
 		times->stime = cputime_add(times->stime, t->stime);
-		times->sum_exec_runtime += t->se.sum_exec_runtime;
+		times->sum_exec_runtime += task_sched_runtime(t);
 	} while_each_thread(tsk, t);
 out:
 	rcu_read_unlock();
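
For illustration only, here is a small userspace toy model of the
effect described above. It is not kernel code: struct toy_task and the
toy_*() helpers are hypothetical names made up for this sketch, and the
model assumes that a read of a single thread's clock already includes
the slice that has run since the last accounting update, while the
unpatched group sum only adds up the cached fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; these are not the kernel's data structures. */
struct toy_task {
	uint64_t sum_exec_runtime;	/* ns accounted at the last update */
	uint64_t exec_start;		/* timestamp of that last update */
	int on_cpu;			/* nonzero while running on some CPU */
};

/* Runtime of one task, including the slice not yet accounted for. */
static uint64_t toy_task_runtime(const struct toy_task *t, uint64_t now)
{
	uint64_t delta = t->on_cpu ? now - t->exec_start : 0;

	return t->sum_exec_runtime + delta;
}

/* Models the unpatched sum: cached fields only, stale for running tasks. */
static uint64_t toy_group_runtime_cached(const struct toy_task *tasks, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += tasks[i].sum_exec_runtime;
	return sum;
}

/* Models the patched sum: each task contributes its up-to-date runtime. */
static uint64_t toy_group_runtime_fresh(const struct toy_task *tasks, size_t n,
					uint64_t now)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += toy_task_runtime(&tasks[i], now);
	return sum;
}
```

With made-up numbers, take one task that is not running with 16698 ns
accounted, and a hog with 497000000 ns accounted that has been running
for another 1234739 ns since its last update. The hog alone then reads
498234739 ns, the cached group sum is only 497016698 ns, and the fresh
group sum is 498251437 ns. The cached sum is smaller than the hog's own
time, which is the kind of inversion reported above.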