* setitimer vs. threads: SIGALRM returned to which thread? (process master or individual child)
@ 2010-04-09 19:48 Frantisek Rysanek
       [not found] ` <4BBF9B8F.3030106@gmail.com>
  2010-04-11 20:56 ` Andi Kleen
  0 siblings, 2 replies; 6+ messages in thread
From: Frantisek Rysanek @ 2010-04-09 19:48 UTC (permalink / raw)
  To: linux-kernel

Dear everyone,

I hope I'm not way too much off topic in this list... specifically, I 
hope the issue takes place in the kernel, as opposed to the user-space 
part of NPTL that ships with libc, distros etc.
At the same time, I feel shame for asking this noob question in the 
very LKML - except that there doesn't seem to be a better place to 
ask... :->

Some years ago, I wrote a couple of programs that use the 
setitimer() syscall in a threaded environment, making use of its 
special property at the time: setitimer() had per-thread granularity. 
It used to deliver a SIGALRM from the timer to the particular thread 
that called setitimer(). I believe that was around RH8 to Fedora 5.

Recently I've recompiled the programs on a newer distro (Fedora 10) 
and voila: setitimer() now yields a SIGALRM to the program's master 
thread, no matter what child thread called setitimer()...

Based on further reading, I assume this is related to making the NPTL 
more POSIX-compliant. The latter is a correct POSIX behavior, the 
former was not. See "man pthreads", and under the NPTL heading, 
find a note saying
"Threads do not share interval timers (fixed in kernel 2.6.12)."

Yes, it used to be quite a relief to have Linux do the management of 
timers for me. Now I have two options to choose from:
1) write my own "timer queueing" (timekeeping) code to order the 
timers for me in the master thread
2) find another function, similar to setitimer(), that would function 
the way setitimer() used to work in the old days...

Obviously option #2 is much easier for me to abuse :-)
Such as, does select() work in the desired per-thread way?
In the app that I'm trying to update right now, I have a serial 
device open per thread, and I need to detect character timeouts 
(frame breaks).
But I have other apps where I have a *myriad* of stand-alone timers, 
not related to a "file descriptor like" device of any kind, 
generating "spurious events" for me, used to propel a bunch of 
threads doing some polling on various dumb "networked" devices 
(external bus slaves)...

For a moment I was wondering how complex the relevant kernel patch 
was, how difficult it would be to revert it - but then again such a 
revert might disrupt various other pieces of user-space code in my 
distro, so it's probably not such a good idea anyway :-) Also, if I 
resort to patching my kernel, it makes my user-space code fairly non-
portable to other people's machines. Let alone the bulk of code 
evolution in Linux kernel timekeeping and process management since 
2.6.12, overlaying the original patch.
AIX appears to have ITIMER_REAL_TH [sob]. Not that I'm going to try 
AIX for this particular reason :-)

Wouldn't it be in fact more straightforward and "cheaper" (in terms 
of processing overhead) to have the timers thread-aware? If I just 
call a setitimer() in each thread, that requires some number of 
system calls. Now if I need to do my own timekeeping (event 
queueing) in user space, I'll probably need to call getitimer() or 
gettimeofday() ahead of every setitimer(), every time a thread needs 
to set a timer. Not sure about the required number of pointer 
indirections in the kernel for either case :-)

I understand that POSIX compliance is a good thing, for portability 
reasons. At the same time, resorting to per-process granularity of 
timers somehow "feels backwards" - from thread awareness, back to the 
old "no threads" UNIX world. It seems to remind me of the occasional 
debate whether GCC extensions to standard C are a good thing to use, 
or whether they should be avoided...

I haven't found much debate about this "timers vs. threads 
granularity" point in mailinglist archives or on the web.
Any further hints/pointers/kicks in the right direction/recommended 
reading are welcome :-)
If you've read this far, thanks for your time...

Frank Rysanek



* Re: setitimer vs. threads: SIGALRM returned to which thread? (process master or individual child)
       [not found] ` <4BBF9B8F.3030106@gmail.com>
@ 2010-04-10  7:26   ` Frantisek Rysanek
  0 siblings, 0 replies; 6+ messages in thread
From: Frantisek Rysanek @ 2010-04-10  7:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: bill o gallmeister

On 9 Apr 2010 at 23:26, bill o gallmeister wrote:
>
> Check out timer_create() rather than setitimer().
> 
Oh I *see* :-)  There seems to be a way to deliver an event to a 
specific thread. Just a quick guess, haven't validated this by a 
compiler:

============ PSEUDOCODE SNIPPET ==========
#include <pthread.h>
#include <signal.h>
#include <time.h>

struct my_thr_data
{
   pthread_t ID; /* to be set upon pthread_create() */
   /* ...further members... */
};

/* SIGEV_THREAD callback: runs in a helper thread created by the
   library and re-sends SIGALRM to the thread that owns the timer */
void my_fn(union sigval my_user_data)
{
    pthread_kill( ((struct my_thr_data*)my_user_data.sival_ptr)->ID, SIGALRM);
}

struct my_thr_data this_thread;
timer_t my_timer;
struct sigevent my_event =
{
   .sigev_notify = SIGEV_THREAD,
   .sigev_notify_function = my_fn,
   .sigev_value.sival_ptr = &this_thread,
   .sigev_notify_attributes = NULL
};

timer_create(CLOCK_REALTIME, &my_event, &my_timer);

/* by now we're set up, but the timer doesn't tick yet. */

/* someplace later in the code: */
timer_settime(my_timer, ...  );


=========== /PSEUDOCODE SNIPPET ==============
thank you :-)

Frank Rysanek


* Re: setitimer vs. threads: SIGALRM returned to which thread? (process master or individual child)
  2010-04-09 19:48 setitimer vs. threads: SIGALRM returned to which thread? (process master or individual child) Frantisek Rysanek
       [not found] ` <4BBF9B8F.3030106@gmail.com>
@ 2010-04-11 20:56 ` Andi Kleen
  2010-04-11 21:52   ` Davide Libenzi
  2010-04-11 22:09   ` Thomas Gleixner
  1 sibling, 2 replies; 6+ messages in thread
From: Andi Kleen @ 2010-04-11 20:56 UTC (permalink / raw)
  To: Frantisek Rysanek; +Cc: linux-kernel

"Frantisek Rysanek" <Frantisek.Rysanek@post.cz> writes:

> Yes, it used to be quite a relief to have Linux do the management of 
> timers for me. Now I have two options to choose from:
> 1) write my own "timer queueing" (timekeeping) code to order the 
> timers for me in the master thread
> 2) find another function, similar to setitimer(), that would function 
> the way setitimer() used to work in the old days...

POSIX timers (timer_create et.al.) allow specifying the signal.

So if you use custom RT signals for each threads and block them in the
threads you don't want them it should work. This would limit the
maximum number of threads though because there's only a limited
range of RT signals.

There are probably other ways to do this too, e.g. with some clever
use of timerfd_create in recent kernels.

Or you could overwrite the clone in the thread library to not 
set signal sharing semantics. This might have other bad side effects
though.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: setitimer vs. threads: SIGALRM returned to which thread? (process master or individual child)
  2010-04-11 20:56 ` Andi Kleen
@ 2010-04-11 21:52   ` Davide Libenzi
  2010-04-11 22:09   ` Thomas Gleixner
  1 sibling, 0 replies; 6+ messages in thread
From: Davide Libenzi @ 2010-04-11 21:52 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Frantisek Rysanek, Linux Kernel Mailing List

On Sun, 11 Apr 2010, Andi Kleen wrote:

> "Frantisek Rysanek" <Frantisek.Rysanek@post.cz> writes:
> 
> > Yes, it used to be quite a relief to have Linux do the management of 
> > timers for me. Now I have two options to choose from:
> > 1) write my own "timer queueing" (timekeeping) code to order the 
> > timers for me in the master thread
> > 2) find another function, similar to setitimer(), that would function 
> > the way setitimer() used to work in the old days...
> 
> POSIX timers (timer_create et.al.) allow specifying the signal.
> 
> So if you use custom RT signals for each threads and block them in the
> threads you don't want them it should work. This would limit the
> maximum number of threads though because there's only a limited
> range of RT signals.
> 
> There are probably other ways to do this too, e.g. with some clever
> use of timerfd_create in recent kernels.

Definitely timerfd allows you to handle the timer event wherever you 
like, independently from signals.  Much much simpler routing.
But if you need to be compatible with multiple unixes, or even older Linux 
kernels, you are out of luck with timerfd.
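
To illustrate the timerfd route mentioned above, here is a minimal sketch,
assuming Linux 2.6.25 or later and a glibc that ships <sys/timerfd.h>. The
helper name wait_on_timerfd() is hypothetical. Each thread can own its own
descriptor, so the expiry is seen only by the thread that reads it, and the
descriptor could just as well sit in a select()/poll() set next to a serial
port.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <sys/timerfd.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: arm a one-shot timer of `ms` milliseconds on a new
 * timerfd and block in read() until it fires.
 * Returns the expiration count (1 for a one-shot timer), or 0 on error. */
uint64_t wait_on_timerfd(long ms)
{
    uint64_t expirations = 0;
    struct itimerspec its = {
        .it_value = { .tv_sec = ms / 1000, .tv_nsec = (ms % 1000) * 1000000L },
        /* it_interval stays zero, which makes the timer one-shot */
    };
    int fd;

    if (ms <= 0)          /* an all-zero it_value would disarm the timer */
        return 0;

    fd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (fd < 0)
        return 0;

    if (timerfd_settime(fd, 0, &its, NULL) == 0) {
        /* read() blocks only the calling thread until the timer expires */
        if (read(fd, &expirations, sizeof expirations)
                != (ssize_t)sizeof expirations)
            expirations = 0;
    }
    close(fd);
    return expirations;
}
```

Calling wait_on_timerfd(20) from any thread blocks that thread alone for
roughly 20 ms. No signal is involved, so no signal routing is needed.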


- Davide




* Re: setitimer vs. threads: SIGALRM returned to which thread? (process master or individual child)
  2010-04-11 20:56 ` Andi Kleen
  2010-04-11 21:52   ` Davide Libenzi
@ 2010-04-11 22:09   ` Thomas Gleixner
  2010-10-17 20:42     ` Frantisek Rysanek
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2010-04-11 22:09 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Frantisek Rysanek, linux-kernel



On Sun, 11 Apr 2010, Andi Kleen wrote:

> "Frantisek Rysanek" <Frantisek.Rysanek@post.cz> writes:
> 
> > Yes, it used to be quite a relief to have Linux do the management of 
> > timers for me. Now I have two options to choose from:
> > 1) write my own "timer queueing" (timekeeping) code to order the 
> > timers for me in the master thread
> > 2) find another function, similar to setitimer(), that would function 
> > the way setitimer() used to work in the old days...
> 
> POSIX timers (timer_create et.al.) allow specifying the signal.
> 
> So if you use custom RT signals for each threads and block them in the
> threads you don't want them it should work. This would limit the
> maximum number of threads though because there's only a limited
> range of RT signals.
> 
> There are probably other ways to do this too, e.g. with some clever
> use of timerfd_create in recent kernels.
> 
> Or you could overwrite the clone in the thread library to not 
> set signal sharing semantics. This might have other bad side effects
> though.

Nonsense. Just use the right flags when creating the posix
timer. posix timers support per thread delivery of a signal, i.e. you
can use the same signal for all threads.

   sigev.sigev_notify = SIGEV_THREAD_ID | SIGEV_SIGNAL;
   sigev.sigev_signo = YOUR_SIGNAL;
   sigev.sigev_notify_thread_id = gettid();
   timer_create(CLOCK_MONOTONIC, &sigev, &timer);

That signal for that timer will not be delivered to any other thread
than the one specified in sigev.sigev_notify_thread_id as long as that
thread has not exited w/o canceling the timer.

Thanks,

	tglx


* Re: setitimer vs. threads: SIGALRM returned to which thread? (process master or individual child)
  2010-04-11 22:09   ` Thomas Gleixner
@ 2010-10-17 20:42     ` Frantisek Rysanek
  0 siblings, 0 replies; 6+ messages in thread
From: Frantisek Rysanek @ 2010-10-17 20:42 UTC (permalink / raw)
  To: linux-kernel

Dear Everyone,

apologies for following up on a thread after half a year :-)
I'm not gonna pretend it took me half a year to discover the points 
presented below - I just got buried by a dumptruck of other stuff,
then did my homework, and then couldn't find the time to post my 
follow-up...
Before this LKML thread, I couldn't find this sort of information 
anywhere (anywhere except for the source code itself). Maybe I didn't 
look into enough places where Google cannot see... anyway, I guess 
it's worth leaving a trace about the things I've learned, at a 
relevant place for the cyber crawlers to find it - for the benefit of 
future wondering apprentices who come after me.
So here it goes...

On 12 Apr 2010 at 0:09, Thomas Gleixner wrote:
> 
> Just use the right flags when creating the posix
> timer. posix timers support per thread delivery of a signal, i.e. you
> can use the same signal for all threads.
> 
>    sigev.sigev_notify = SIGEV_THREAD_ID | SIGEV_SIGNAL;
>    sigev.sigev_signo = YOUR_SIGNAL;
>    sigev.sigev_notify_thread_id = gettid();
>    timer_create(CLOCK_MONOTONIC, &sigev, &timer);
> 
> That signal for that timer will not be delivered to any other thread
> than the one specified in sigev.sigev_notify_thread_id as long as that
> thread has not exited w/o canceling the timer.
>
Thanks for that gem of ultra-compact yet precise information :-)
It does work precisely as advertised after all - except that for me, 
it was not without further homework.

I have to confess that when writing code in user space, I'm a bit 
ignorant of details - such as, whether it's bare kernel syscalls or 
some higher-level glibc abstraction that I'm talking to.
This snippet gave me a neat lesson in that particular "grey" area :-)
Well I shouldn't be surprised, if I ask kernel people, that I obtain 
a response in kernel terms :-)

I first pasted your code snippet into my program verbatim.
Followed by some timer_settime() of course...
It took a little bit of massage to get it to compile - such as, glibc 
didn't offer me a member called sigev_notify_thread_id, but I figured 
(by analogy with other macros in the relevant header) that it was 
pointing to a member called _tid in a union inside struct sigevent, 
as declared in /usr/include/bits/siginfo.h. I merely added 
#define sigev_notify_thread_id _sigev_un._tid
just below my #defines on top of the relevant C file.
Next, I couldn't find gettid() anywhere within the libraries (nothing 
to link to in user space) - so I decided to instead use 
 * the pthread_t provided by pthread_create(). *
After all, in LinuxThreads in the old days, pthread_t and pid_t were 
the same.

Guess what happened :-)
At a first run, I got an immediate SIGSEGV.

What ho? Let's ask GDB for some advice...
Hmm... timer_settime() segfaulting? Why? Old libc?
Tried compiling on a much newer distro, with the same result.
Google suggested that I was submitting a 0 for the timer_t...
How could that happen? Well maybe I should check the return value 
from timer_create(), and try perror(), right?
Uh oh, that was correct, timer_create() fails with EINVAL.
Why is that?
(...shuffling the various parameters, trying CLOCK_MONOTONIC instead 
of CLOCK_REALTIME, googling some more...)
Found an old e-mail thread from back in 2005, suggesting in vague 
terms that timer_create(SIGEV_THREAD_ID) really still worked with 
PIDs, rather than TIDs, and that the per-thread logic is somehow 
completely bogus and void... so, reluctantly, I tried 
_tid = getpid() instead of  "pthread_t my_thr_ID". That worked to the 
extent that timer_create() didn't yell and timer_settime() did set up 
a timer - except that of course the SIGALRM got again delivered to 
the process master thread. Ah well... now, why on earth is there 
something called a _tid, embedded in the struct sigevent?
Time to take a dive into more source code, right?
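
For reference, the EINVAL described above can be reproduced in isolation. A
minimal sketch, assuming Linux and glibc: try_thread_timer() and
current_tid() are hypothetical helper names, and the #define fallback
mirrors the workaround mentioned earlier in this message. Any value that
does not name a thread of the calling process (a pthread_t cast to an
integer, or simply -1) makes timer_create() fail.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Older glibc headers do not name this member; fall back to the union field. */
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

/* Hypothetical helper: kernel thread ID of the calling thread. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: try to create a SIGEV_THREAD_ID timer aimed at `tid`.
 * Returns 0 on success (the timer is deleted again), else the errno value. */
int try_thread_timer(pid_t tid)
{
    struct sigevent sev;
    timer_t timer;

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_THREAD_ID;
    sev.sigev_signo = SIGALRM;
    sev.sigev_notify_thread_id = tid;

    /* Always check the result: on failure `timer` is left uninitialized. */
    if (timer_create(CLOCK_MONOTONIC, &sev, &timer) != 0)
        return errno;

    timer_delete(timer);
    return 0;
}
```

With this, try_thread_timer(-1) should report EINVAL, while
try_thread_timer(current_tid()) should succeed. On older glibc versions the
program may need to be linked with -lrt.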

I happened to have the source code of Libc 2.6 lying around, so I 
looked at that. And Linux 2.6.35.7.
The code did try my mediocre coding & code reading skills, but 
finally it started to dawn on me. I tried further googling more about 
the precise mapping between NPTL and the Linux kernel threading 
arrangement, and found nothing other than the usual PR factoids (N:1 
vs. M:N vs. 1:1) - which meant I really had to find out the hard way 
= by reading the code :-)

It turns out that:

NPTL (a part of Libc in the user space) uses something called "struct 
pthread" internally. It is declared in some private header inside the 
glibc source code (namely nptl/descr.h), but not in the public 
headers that end up in the systemwide /usr/include. The "pthread_t" 
that gets passed around among the various pthread_create() et al. 
library functions, although it looks like an opaque "unsigned long" or 
the like on the outside, is really assigned the value of a 
struct pthread *
(pointer to the NPTL-private pthread struct). Outside of the glibc 
source tree, you don't know that such a struct exists, and you have 
no chance to access its internal members, such as the one called 
pid_t tid.

Within the kernel, it seems that the processes or threads behind the 
NPTL's threading model are called just a "task". Each task is 
described by an instance of a uniform "struct task_struct", declared 
in $KERNEL_SRC/include/linux/sched.h. Each task has its own pid (and 
this one is a genuine integer). Interesting point: struct task_struct 
contains a member called
struct task_struct* group_leader;

And that's it. In the kernel space, there's a group of mostly equal 
tasks who have a leader. This group and their leader correspond to a 
user-space NPTL process containing several lightweight threads. The 
kernel-space PID of the task group leader is equal to the user-space 
PID, used to refer to the whole multi-threaded process.
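
The relationship between the shared PID and the per-task ID can be observed
from user space. A minimal sketch, assuming Linux with NPTL: struct ids,
report_ids() and ids_in_new_thread() are hypothetical names, and the thread
ID is fetched through the raw gettid syscall.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* What a thread sees about itself. */
struct ids {
    pid_t pid;   /* getpid(): PID of the task group leader */
    pid_t tid;   /* gettid(): kernel PID of this particular task */
};

/* Thread body: record both IDs as seen from inside the new thread. */
static void *report_ids(void *arg)
{
    struct ids *out = arg;

    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Hypothetical helper: run report_ids() in a fresh thread.
 * Returns 0 on success, -1 if the thread could not be created or joined. */
int ids_in_new_thread(struct ids *out)
{
    pthread_t th;

    if (pthread_create(&th, NULL, report_ids, out) != 0)
        return -1;
    return pthread_join(th, NULL) == 0 ? 0 : -1;
}
```

In a thread created this way, pid matches the getpid() of every other thread
in the process, while tid differs from it. Only in the initial thread are the
two values equal.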

Okay... so how do we get our hands on the back-end "tid" (really a 
PID in kernel vocabulary) of a single user-space thread? We already 
know that we need a function called gettid(). It turns out that this 
is a syscall, implemented in the kernel, even known to glibc, but not 
exported by glibc to the user space. In the kernel space, 
interestingly this syscall is implemented in a file called 
kernel/timer.c (I'd expect it in kernel/pid.c or maybe 
kernel/sched.c) - well maybe the choice of translation unit hints at 
the practical use of this syscall :-) If you follow gettid(), through 
an inline function called task_pid_vnr(), all the way to 
__task_pid_nr_ns(PIDTYPE_PID), you'll find out that indeed this stack 
of calls will retrieve task->pid (and the function __task_pid_nr_ns 
also mentions task->group_leader in a different context).

So essentially in the user space (using glibc) you have a choice 
whether to
1) copy and paste the declaration of "struct pthread" from your glibc 
version's source code into your program, or "publish" the relevant 
header, or some such
2) call the gettid() syscall (in)directly. 

I chose the latter option. In my program, I added
#include <unistd.h>       /* declares syscall() */
#include <sys/syscall.h>  /* defines __NR_gettid */
#define gettid() syscall(__NR_gettid)
...all of the gears can be found in the public headers.
This way of invoking a syscall by the generic syscall() function and 
the integer syscall number, is called an "indirect" invocation of a 
syscall, and can only be used for syscalls with simple argument sets, 
which luckily is the case of gettid().
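
Putting the pieces together, the whole arrangement might look roughly like
the following. This is a sketch under stated assumptions, not the code from
my program: timer_worker() and worker_got_its_own_alarm() are hypothetical
names, the 10 ms period and the 2 s safety timeout are arbitrary, and the
worker collects the signal with sigtimedwait() rather than with a handler so
the example stays short.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

/* Worker: arms a 10 ms one-shot timer aimed at itself, then waits for it.
 * Returns (void *)1 if SIGALRM arrived in this thread, NULL otherwise. */
static void *timer_worker(void *unused)
{
    sigset_t set;
    struct sigevent sev;
    struct itimerspec its = { .it_value = { .tv_sec = 0, .tv_nsec = 10000000L } };
    struct timespec limit = { .tv_sec = 2, .tv_nsec = 0 };
    timer_t timer;
    int signo;

    (void)unused;

    /* Block SIGALRM here so it stays pending until sigtimedwait() takes it. */
    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_THREAD_ID;
    sev.sigev_signo = SIGALRM;
    sev.sigev_notify_thread_id = (pid_t)syscall(SYS_gettid);

    if (timer_create(CLOCK_MONOTONIC, &sev, &timer) != 0)
        return NULL;
    if (timer_settime(timer, 0, &its, NULL) != 0) {
        timer_delete(timer);
        return NULL;
    }

    /* Returns the signal number, or -1 if nothing arrived within 2 s. */
    signo = sigtimedwait(&set, NULL, &limit);
    timer_delete(timer);
    return signo == SIGALRM ? (void *)1 : NULL;
}

/* Hypothetical driver: 1 if the worker received its own timer signal. */
int worker_got_its_own_alarm(void)
{
    pthread_t th;
    void *result = NULL;

    if (pthread_create(&th, NULL, timer_worker, NULL) != 0)
        return 0;
    if (pthread_join(th, &result) != 0)
        return 0;
    return result != NULL;
}
```

The calling thread never blocks or handles SIGALRM here. If the signal were
delivered process-wide, its default action would terminate the program
instead of the worker returning success. Older glibc versions may need
-pthread and -lrt at link time.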

So yes, I can have my cake and eat it too.
I can deliver timer-based SIGALRM directly to a particular user-space 
thread, without "rethrowing" via the process master or another 
dedicated "signal dispatch" thread.
Only to get my hands on the "tid" (really the PID of a kernel-space 
task corresponding to my user-space thread), I have to call a Linux 
syscall fairly explicitly. It feels like less of a sin than accessing 
some private (however obvious) struct under the hood of glibc/NPTL.
Calling gettid() directly doesn't seem "posixly correct", but it 
would appear that neither is SIGEV_THREAD_ID (what use would that be, 
without a possibility to get your hands on the internal TID?)
The important point for me is that it gets the job done, over a wide 
range of glibc and kernel versions.

It's been an exciting adventure. The kernel guts around pid.c and 
sched.c are a fantastic read - the code is almost amazingly clean and 
straight-forward, split into neat small functions. An interesting 
discovery after all the past claims that programming language purity 
and beauty doesn't mix well with system-level programming :-)

Thanks for your time and attention...

Frank Rysanek


