* Signal delivery order
@ 2009-03-14 16:50 Gábor Melis
  2009-03-15  9:44 ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Melis @ 2009-03-14 16:50 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 953 bytes --]

(Please CC me, I'm not subscribed.)

Attached is a test program that tests the order in which synchronously 
triggered sigsegvs and pthread_kill() generated signals get delivered.

The test program triggers sigsegvs from a thread and tests whether the 
sigsegv handler is invoked when the sigusr1 handler is already 
running. If that happens it exits with exit code 27. It seems that if 
the signal is sent by pthread_kill() and its signum is lower than that 
of sigsegv then it can be delivered before sigsegv. A normal kill does 
not seem to do this.

I would expect that no asynchronously generated signal (and here I 
include those sent by pthread_kill()) can overtake a sigsegv even if 
its signum is lower.

Reaching this diagnosis led me to an identical-looking issue documented 
at: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4355769 .

Cheers,
Gabor Melis

PS: tested on x86 linux Debian Lenny, kernel: 2.6.26-1-686, glibc: 
2.7-18

[-- Attachment #2: signal-delivery-order.c --]
[-- Type: text/x-csrc, Size: 1960 bytes --]

#include <signal.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/ucontext.h>
#include <unistd.h>

int test_signal;
int *page_address;
pthread_t tid;

int is_signal_blocked(int signum)
{
    sigset_t set;
    pthread_sigmask(SIG_BLOCK, 0, &set);
    return (sigismember(&set, signum));
}

void test_handler(int signal, siginfo_t *info, void *context)
{
}

void sigsegv_handler(int signal, siginfo_t *info, void *context)
{
    /* The test signal is blocked only in test_handler. */
    if (is_signal_blocked(test_signal)) {
        _exit(27);
    }
    mprotect(page_address, 4096, PROT_READ | PROT_WRITE);
}

void *make_faults(void *arg)
{
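    /* Repeatedly re-protect the page and write to it so that the
       write faults and SIGSEGV is raised over and over. */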
    while (1) {
        mprotect(page_address, 4096, PROT_NONE);
        *page_address = 1;
    }
}

void *reserve_page(void)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
    void *actual = mmap(0, 4096, PROT_NONE, flags, -1, 0);
    if (actual == MAP_FAILED) {
        perror("mmap");
        return 0;
    }
    return actual;
}

void install_handlers(void)
{
    struct sigaction sa;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = test_handler;
    sigaction(test_signal, &sa, 0);
    sa.sa_sigaction = sigsegv_handler;
    sigaction(SIGSEGV, &sa, 0);
}

void test_with_pthread_kill()
{
    page_address = (int *)reserve_page();
    install_handlers();
    if (pthread_create(&tid, 0, make_faults, 0) != 0)
        perror("pthread_create");
    while(1) {
        pthread_kill(tid, test_signal);
    }
}

void test_with_kill()
{
    pid_t pid = fork();
    if (pid == 0) {
        page_address = (int *)reserve_page();
        install_handlers();
        make_faults(0);
    } else {
        while (1) {
            kill(pid, test_signal);
        }
    }
}

int main(void)
{
    test_signal = SIGUSR1;
    test_with_pthread_kill();
    /* Forking and kill()ing works as expected: */
    /* test_with_kill(); */
}


* Re: Signal delivery order
  2009-03-14 16:50 Signal delivery order Gábor Melis
@ 2009-03-15  9:44 ` Oleg Nesterov
  2009-03-15 14:40   ` Gábor Melis
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-15  9:44 UTC (permalink / raw)
  To: Gábor Melis; +Cc: linux-kernel, Andrew Morton

On 03/14, Gábor Melis wrote:
>
> The test program triggers sigsegvs from a thread and tests whether the
> sigsegv handler is invoked when the sigusr1 handler is already
> running.

No, this is not what happens with this test case. SIGSEGV can't be
generated when we run SIGUSR1 handler.

	void sigsegv_handler(int signal, siginfo_t *info, void *context)
	{
	     /* The test signal is blocked only in test_handler. */

The comment is not right. We can't be here (in SIGSEGV handler) if
we are in test_handler.

	     if (is_signal_blocked(test_signal)) {
		 _exit(27);
	     }

If test_signal (SIGUSR1) is blocked, this means it is already delivered,
and the handler will be invoked when we return from sigsegv_handler(),
please see below.

> A normal kill does
> not seem to do this.

Yes. Because the task dequeues the private signals first (sent by tkill,
or generated by the kernel when the test case does "*page_address = 1"),
then it dequeues the shared signals (sent by kill).

But please note that it is still possible to hit is_signal_blocked()
even with test_with_kill(), but the probability is very low.

> I would expect that no asynchronously generated signal (and here I
> include those sent by pthread_kill()) can overtake a sigsegv even if
> its signum is lower.

Signum doesn't matter. Any unblocked signal can preempt the task, whether
it runs inside a signal handler or not. This is correct, you can use
sigaction->sa_mask to specify which signals should be blocked during
execution of the signal handler.
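
For example, a minimal fragment (illustrative only, not taken from the
test case; it reuses sigsegv_handler from the attached program) that
keeps SIGUSR1 blocked while the SIGSEGV handler runs:

	struct sigaction sa;
	sa.sa_flags = SA_SIGINFO;
	sa.sa_sigaction = sigsegv_handler;
	sigemptyset(&sa.sa_mask);
	sigaddset(&sa.sa_mask, SIGUSR1);	/* blocked while the handler runs */
	sigaction(SIGSEGV, &sa, 0);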

OK, let's do a simple test:

	#include <signal.h>
	#include <stdio.h>
	#include <unistd.h>

	int is_blocked(int sig)
	{
		sigset_t set;
		sigprocmask(SIG_BLOCK, NULL, &set);
		return sigismember(&set, sig);
	}

	void sig_1(int sig)
	{
		printf("%d %d\n", sig, is_blocked(2));
	}

	void sig_2(int sig)
	{
		printf("%d %d\n", sig, is_blocked(1));
	}

	int main(void)
	{
		sigset_t set;

		signal(1, sig_1);
		signal(2, sig_2);

		sigemptyset(&set);
		sigaddset(&set, 1);
		sigaddset(&set, 2);
		sigprocmask(SIG_BLOCK, &set, NULL);

		kill(getpid(), 1);
		kill(getpid(), 2);

		sigprocmask(SIG_UNBLOCK, &set, NULL);

		return 0;
	}

output is:

	2 1
	1 0

When sigprocmask(SIG_UNBLOCK) returns, both signals are delivered.
The kernel dequeues 1 first, then 2. This means that the handler for
"2" will be called first.

But if you change kill(getpid(), 2) to tkill(getpid(), 2) then the
output should be:

	1 1
	2 0

So, what happens with test_with_pthread_kill() is: the sub-thread
likely dequeues both signals, SIGSEGV=11 and SIGUSR1=10 and starts
the handler for SIGSEGV.

With test_with_kill(), the child dequeues both signals too, but
runs the handler for SIGUSR1 first, because it was sent by kill(),
not tkill().

If you modify your test-case so that test_signal == SIGIO, then
I bet test_with_pthread_kill() won't hit is_signal_blocked() too.
Or you can modify test_with_kill() to use tkill(), in that case
you should see the same behaviour as with test_with_pthread_kill().
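
(There is no glibc wrapper for tkill, so a minimal sketch of one via
syscall(2); in a single-threaded program the tid is just getpid().)

	#include <sys/syscall.h>
	#include <unistd.h>

	static int tkill(int tid, int sig)
	{
		return syscall(SYS_tkill, tid, sig);
	}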

Please don't hesitate to ask more questions.

Oleg.



* Re: Signal delivery order
  2009-03-15  9:44 ` Oleg Nesterov
@ 2009-03-15 14:40   ` Gábor Melis
  2009-03-15 17:29     ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Melis @ 2009-03-15 14:40 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel, Andrew Morton

On Sunday 15 March 2009, Oleg Nesterov wrote:
> On 03/14, Gábor Melis wrote:
> > The test program triggers sigsegvs from a thread and tests whether
> > the sigsegv handler is invoked when the sigusr1 handler is
> > already running.
>
> No, this is not what happens with this test case. SIGSEGV can't be
> generated when we run SIGUSR1 handler.
>
> 	void sigsegv_handler(int signal, siginfo_t *info, void *context)
> 	{
> 	     /* The test signal is blocked only in test_handler. */
>
> The comment is not right. We can't be here (in SIGSEGV handler) if
> we are in test_handler.
> 	     if (is_signal_blocked(test_signal)) {
> 		 _exit(27);
> 	     }
>
> If test_signal (SIGUSR1) is blocked, this means it is already
> delivered, and the handler will be invoked when we return from
> sigsegv_handler(), please see below.

SIGUSR1 is delivered, its sigmask is added to the current mask but the 
handler is not yet invoked and in this instant synchronous sigsegv is 
delivered, its handler invoked?

> > A normal kill does
> > not seem to do this.
>
> Yes. Because the task dequeues the private signals first (sent by
> tkill, or generated by the kernel when the test case does "*page_address =
> 1"), then it dequeues the shared signals (sent by kill).
>
> But please note that it is still possible to hit is_signal_blocked()
> even with test_with_kill(), but the probability is very low.
>
> > I would expect that no asynchronously generated signal (and here I
> > include those sent by pthread_kill()) can overtake a sigsegv even
> > if its signum is lower.
>
> Signum doesn't matter. Any unblocked signal can preempt the task,
> whether it runs inside a signal handler or not. This is correct, you
> can use sigaction->sa_mask to specify which signals should be
> blocked during execution of the signal handler.
>
> OK, let's do a simple test:
>
> 	int is_blocked(int sig)
> 	{
> 		sigset_t set;
> 		sigprocmask(SIG_BLOCK, NULL, &set);
> 		return sigismember(&set, sig);
> 	}
>
> 	void sig_1(int sig)
> 	{
> 		printf("%d %d\n", sig, is_blocked(2));
> 	}
>
> 	void sig_2(int sig)
> 	{
> 		printf("%d %d\n", sig, is_blocked(1));
> 	}
>
> 	int main(void)
> 	{
> 		sigset_t set;
>
> 		signal(1, sig_1);
> 		signal(2, sig_2);
>
> 		sigemptyset(&set);
> 		sigaddset(&set, 1);
> 		sigaddset(&set, 2);
> 		sigprocmask(SIG_BLOCK, &set, NULL);
>
> 		kill(getpid(), 1);
> 		kill(getpid(), 2);
>
> 		sigprocmask(SIG_UNBLOCK, &set, NULL);
>
> 		return 0;
> 	}
>
> output is:
>
> 	2 1
> 	1 0
>
> When sigprocmask(SIG_UNBLOCK) returns, both signals are delivered.
> The kernel dequeues 1 first, then 2. This means that the handler for
> "2" will be called first.

My mental model that matches what I quickly glean from the sources (from 
kernel/signal.c, arch/x86/kernel/signal_32.c) goes like this:

- signal 1 and signal 2 are generated and made pending
- they are unblocked by sigprocmask
- signal 1 is delivered: signals in its mask (only itself here) are 
blocked, its handler is invoked
- barely into sig_1 (the printf didn't get a chance to run), signal 2 is 
delivered, its sigmask is added to the current one that already includes 
signal 1, sig_2 is invoked
- sig_2 prints "2 1" and returns
- sig_1 reaches the printf printing "1 0"
The output matches this model.

I can't find how 'handler for "2" will be called first'. Furthermore, if 
it's indeed sig_2 that's invoked first then sig_1's sigmask is added to 
the current mask upon dequeueing???

> But if you change kill(getpid(), 2) to tkill(getpid(), 2) then the
> output should be:
>
> 	1 1
> 	2 0

Right.

> So, what happens with test_with_pthread_kill() is: the sub-thread
> likely dequeues both signals, SIGSEGV=11 and SIGUSR1=10 and starts
> the handler for SIGSEGV.
>
> With test_with_kill(), the child dequeues both signals too, but
> runs the handler for SIGUSR1 first, because it was sent by kill(),
> not tkill().
>
> If you modify your test-case so that test_signal == SIGIO, then
> I bet test_with_pthread_kill() won't hit is_signal_blocked() too.
> Or you can modify test_with_kill() to use tkill(), in that case
> you should see the same behaviour as with test_with_pthread_kill().
>
> Please don't hesitate to ask more questions.

Thank you,
Gabor

> Oleg.


* Re: Signal delivery order
  2009-03-15 14:40   ` Gábor Melis
@ 2009-03-15 17:29     ` Oleg Nesterov
  2009-03-15 22:06       ` Gábor Melis
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-15 17:29 UTC (permalink / raw)
  To: Gábor Melis; +Cc: linux-kernel, Andrew Morton

On 03/15, Gábor Melis wrote:
>
> On Sunday 15 March 2009, Oleg Nesterov wrote:
> >
> > If test_signal (SIGUSR1) is blocked, this means it is already
> > delivered, and the handler will be invoked when we return from
> > sigsegv_handler(), please see below.
>
> SIGUSR1 is delivered, its sigmask is added to the current mask but the
> handler is not yet invoked and in this instant synchronous sigsegv is
> delivered, its handler invoked?

Can't understand the question. Could you reiterate?

> > When sigprocmask(SIG_UNBLOCK) returns, both signals are delivered.
> > The kernel dequeues 1 first, then 2. This means that the handler for
> > "2" will be called first.
>
> My mental model that matches what I quickly glean from the sources (from 
> kernel/signal.c, arch/x86/kernel/signal_32.c) goes like this:
>
> - signal 1 and signal 2 are generated and made pending
> - they are unblocked by sigprocmask
> - signal 1 is delivered: signals in its mask (only itself here) are
> blocked

yes.

the kernel changes ip (instruction pointer) to sig_1.

> its handler is invoked

no.

We never return to user-space with a pending signal. We dequeue signal 2
too, and change ip to sig_2.

Now, since there are no more pending signals, we return to the user space,
and start sig_2().
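
Roughly, the user stack then looks like this (just a sketch, top of
stack first):

	[ signal frame for 2 ]	sigreturn restores ip = sig_1, mask with 1 blocked
	[ signal frame for 1 ]	sigreturn restores the originally interrupted context
	[ interrupted user stack ]

so execution resumes in sig_2(), and sig_2's ucontext shows sig_1 as
the interrupted ip.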

> I can't find how 'handler for "2" will be called first'.

see above,

> Furthermore, if
> it's indeed sig_2 that's invoked first then sig_1's sigmask is added to
> the current mask upon dequeueing???

sig_1's sigmask was added to ->blocked when we dequeued signal 1.

Oleg.



* Re: Signal delivery order
  2009-03-15 17:29     ` Oleg Nesterov
@ 2009-03-15 22:06       ` Gábor Melis
  2009-03-16  0:28         ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Melis @ 2009-03-15 22:06 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 2098 bytes --]

On Sunday 15 March 2009, Oleg Nesterov wrote:
> On 03/15, Gábor Melis wrote:
> > On Sunday 15 March 2009, Oleg Nesterov wrote:
> > > If test_signal (SIGUSR1) is blocked, this means it is already
> > > delivered, and the handler will be invoked when we return from
> > > sigsegv_handler(), please see below.
> >
> > SIGUSR1 is delivered, its sigmask is added to the current mask but
> > the handler is not yet invoked and in this instant synchronous
> > sigsegv is delivered, its handler invoked?
>
> Can't understand the question. Could you reiterate?

No need, my question was answered below.

> > > When sigprocmask(SIG_UNBLOCK) returns, both signals are
> > > delivered. The kernel dequeues 1 first, then 2. This means that the
> > > handler for "2" will be called first.
> >
> > My mental model that matches what I quickly glean from the sources
> > (from kernel/signal.c, arch/x86/kernel/signal_32.c) goes like this:
> >
> > - signal 1 and signal 2 are generated and made pending
> > - they are unblocked by sigprocmask
> > - signal 1 is delivered: signals in its mask (only itself here) are
> > blocked
>
> yes.
>
> the kernel changes ip (instruction pointer) to sig_1.
>
> > its handler is invoked
>
> no.
>
> We never return to user-space with a pending signal. We dequeue
> signal 2 too, and change ip to sig_2.
>
> Now, since there are no more pending signals, we return to the user
> space, and start sig_2().

I see. I guess in addition to changing the ip, the stack frobbing magic 
arranges that sig_2 returns to sig_1 or some code that calls sig_1.

From the point of view of sig_2 it seems that sig_1 is already invoked 
because it has its sigmask in effect and the ip in the ucontext of 
sig_2 points to sig_1 as the attached signal-test.c shows:

sig_1=80485a7
sig_2=80485ed
2 1
eip: 80485a7
1 0
eip: b7fab424

The revised signal-delivery-order.c (also attached) outputs:

test_handler=8048727
sigsegv_handler=804872c
eip: 8048727
esp: b7d94cb8

which shows that sigsegv_handler also has incorrect eip in the context.

[-- Attachment #2: signal-delivery-order.c --]
[-- Type: text/x-csrc, Size: 2195 bytes --]

#include <signal.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/ucontext.h>
#include <unistd.h>

int test_signal;
int *page_address;
pthread_t tid;

int is_signal_blocked(int signum)
{
    sigset_t set;
    pthread_sigmask(SIG_BLOCK, 0, &set);
    return (sigismember(&set, signum));
}

void *eip(struct ucontext *context)
{
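    /* gregs[14] is REG_EIP in the i386 ucontext (<sys/ucontext.h>). */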
    return (void *)context->uc_mcontext.gregs[14];
}

void test_handler(int signal, siginfo_t *info, void *context)
{
}

void sigsegv_handler(int signal, siginfo_t *info, void *context)
{
    /* The test signal is blocked only in test_handler. */
    if (is_signal_blocked(test_signal)) {
        printf("eip: %x\n", eip(context));
        _exit(27);
    }
    mprotect(page_address, 4096, PROT_READ | PROT_WRITE);
}

void *make_faults(void *arg)
{
    while (1) {
        mprotect(page_address, 4096, PROT_NONE);
        *page_address = 1;
    }
}

void *reserve_page(void)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
    void *actual = mmap(0, 4096, PROT_NONE, flags, -1, 0);
    if (actual == MAP_FAILED) {
        perror("mmap");
        return 0;
    }
    return actual;
}

void install_handlers(void)
{
    struct sigaction sa;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = test_handler;
    sigaction(test_signal, &sa, 0);
    sa.sa_sigaction = sigsegv_handler;
    sigaction(SIGSEGV, &sa, 0);
}

void test_with_pthread_kill()
{
    page_address = (int *)reserve_page();
    install_handlers();
    if (pthread_create(&tid, 0, make_faults, 0) != 0)
        perror("pthread_create");
    while(1) {
        pthread_kill(tid, test_signal);
    }
}

void test_with_kill()
{
    pid_t pid = fork();
    if (pid == 0) {
        page_address = (int *)reserve_page();
        install_handlers();
        make_faults(0);
    } else {
        while (1) {
            kill(pid, test_signal);
        }
    }
}

int main(void)
{
    test_signal = SIGUSR1;
    printf("test_handler=%x\n", test_handler);
    printf("sigsegv_handler=%x\n", sigsegv_handler);
    test_with_pthread_kill();
    /* Forking and kill()ing works as expected: */
    /* test_with_kill(); */
}

[-- Attachment #3: signal-test.c --]
[-- Type: text/x-csrc, Size: 1180 bytes --]

#include <syscall.h>
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <ucontext.h>

int is_blocked(int sig)
{
    sigset_t set;
    sigprocmask(SIG_BLOCK, NULL, &set);
    return sigismember(&set, sig);
}

void *eip(struct ucontext *context)
{
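    /* gregs[14] is REG_EIP in the i386 ucontext (<sys/ucontext.h>). */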
    return (void *)context->uc_mcontext.gregs[14];
}

void sig_1(int sig, siginfo_t *info, struct ucontext *context)
{
    printf("%d %d\n", sig, is_blocked(2));
    printf("eip: %x\n", eip(context));
}

void sig_2(int sig, siginfo_t *info, struct ucontext *context)
{
    printf("%d %d\n", sig, is_blocked(1));
    printf("eip: %x\n", eip(context));
}

int tkill(int tid, int sig)
{
    return syscall(SYS_tkill, tid, sig);
}

int main(void)
{
    sigset_t set;

    struct sigaction sa;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = sig_1;
    sigaction(1, &sa, 0);
    sa.sa_sigaction = sig_2;
    sigaction(2, &sa, 0);

    sigemptyset(&set);
    sigaddset(&set, 1);
    sigaddset(&set, 2);
    printf("sig_1=%x\n", sig_1);
    printf("sig_2=%x\n", sig_2);
    sigprocmask(SIG_BLOCK, &set, NULL);

    kill(getpid(), 1);
    kill(getpid(), 2);

    sigprocmask(SIG_UNBLOCK, &set, NULL);

    return 0;
}


* Re: Signal delivery order
  2009-03-15 22:06       ` Gábor Melis
@ 2009-03-16  0:28         ` Oleg Nesterov
  2009-03-16  8:34           ` Gábor Melis
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-16  0:28 UTC (permalink / raw)
  To: Gábor Melis; +Cc: linux-kernel, Andrew Morton

On 03/15, Gábor Melis wrote:
>
> On Domingo 15 Marzo 2009, Oleg Nesterov wrote:
> > >
> > Now, since there are no more pending signals, we return to the user
> > space, and start sig_2().
>
> I see. I guess in addition to changing the ip, the stack frobbing magic
> arranges that sig_2 returns to sig_1 or some code that calls sig_1.

yes. "some code" == rt_sigreturn,

> The revised signal-delivery-order.c (also attached) outputs:
>
> test_handler=8048727
> sigsegv_handler=804872c
> eip: 8048727
> esp: b7d94cb8
>
> which shows that sigsegv_handler also has incorrect eip in the context.

Why do you think it is not correct?

I didn't try your test-case, but I can't see where "esp: b7d94cb8"
comes from. But "eip: 8048727" looks exactly right, this is the
address of test_handler.

Oleg.



* Re: Signal delivery order
  2009-03-16  0:28         ` Oleg Nesterov
@ 2009-03-16  8:34           ` Gábor Melis
  2009-03-16 21:13             ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Melis @ 2009-03-16  8:34 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel, Andrew Morton

On Monday 16 March 2009, Oleg Nesterov wrote:
> On 03/15, Gábor Melis wrote:
> > On Sunday 15 March 2009, Oleg Nesterov wrote:
> > > Now, since there are no more pending signals, we return to the
> > > user space, and start sig_2().
> >
> > I see. I guess in addition to changing the ip, the stack frobbing
> > magic arranges that sig_2 returns to sig_1 or some code that calls
> > sig_1.
>
> yes. "some code" == rt_sigreturn,
>
> > The revised signal-delivery-order.c (also attached) outputs:
> >
> > test_handler=8048727
> > sigsegv_handler=804872c
> > eip: 8048727
> > esp: b7d94cb8
> >
> > which shows that sigsegv_handler also has incorrect eip in the
> > context.
>
> Why do you think it is not correct?
>
> I didn't try your test-case, but I can't see where "esp: b7d94cb8"
> comes from. But "eip: 8048727" looks exactly right, this is the
> address of test_handler.

Sorry, I removed the printing of esp from the code as it was not 
relevant to my point but pasted the output of a previous run.

Anyway, I think eip is incorrect in sigsegv because it's not pointing to 
the instruction that caused the sigsegv. In general the ucontext is 
wrong, because it's as if sigsegv_handler were invoked within 
test_handler.

This is problematic if the sigsegv handler wants to do something with 
the context. The real life sigsegv handler that's been failing does 
this:
- skip the offending instruction by incrementing eip
- take esp from the context and frob the control stack so that some 
function is called on return from the handler (the handler itself is on 
the altstack). This is not unlike what the kernel does, it seems. A 
rough sketch of this follows below.
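
(Purely illustrative and i386-specific; fixup() here just exits instead
of doing real work, and a real handler also has to fix up esp and the
registers before resuming:)

	#define _GNU_SOURCE
	#include <signal.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <ucontext.h>

	static int *page;

	/* Runs on the interrupted thread's stack once the handler returns. */
	static void fixup(void)
	{
		printf("fixup() reached via the frobbed eip\n");
		exit(0);
	}

	static void segv_handler(int sig, siginfo_t *info, void *ctx_)
	{
		ucontext_t *ctx = ctx_;

		/* Point the interrupted context at fixup(): when this
		   handler returns, execution resumes there instead of at
		   the faulting instruction. */
		ctx->uc_mcontext.gregs[REG_EIP] = (greg_t)fixup;
	}

	int main(void)
	{
		struct sigaction sa;

		memset(&sa, 0, sizeof(sa));
		sa.sa_flags = SA_SIGINFO;
		sa.sa_sigaction = segv_handler;
		sigemptyset(&sa.sa_mask);
		sigaction(SIGSEGV, &sa, 0);

		page = mmap(0, 4096, PROT_NONE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		*page = 1;	/* faults; the handler redirects eip */
		return 1;	/* not reached */
	}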

Now, "some function" cannot be called with SIGUSR1 blocked because that 
would potentially lead to deadlocks. (SIGUSR1 is sent to a thread when 
the garbage collector wants to stop it, and some function does 
allocations.)

So the sigsegv handler whose context points to the handler of SIGUSR1 
loses because it finds an unexpected sigmask: SIGUSR1 is blocked. It 
loses because the eip is not pointing to the right instruction, and it 
loses because the SIGUSR1 handler won't finish until "some function" 
returns ...

It seems to me that the same problem could be triggered by 
pthread_kill()ing a thread that's sigtrapping if the signum of the 
signal sent is lower than that of sigtrap, say it's SIGINT.

In a nutshell, the context argument is wrong.

> Oleg.


* Re: Signal delivery order
  2009-03-16  8:34           ` Gábor Melis
@ 2009-03-16 21:13             ` Oleg Nesterov
  2009-03-16 22:56               ` Chris Friesen
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-16 21:13 UTC (permalink / raw)
  To: Gábor Melis; +Cc: linux-kernel, Andrew Morton

On 03/16, Gábor Melis wrote:
>
> In a nutshell, the context argument is wrong.

I strongly disagree. This all is correct and works as expected.
Yes, it doesn't match your expectations/needs, but this doesn't
mean it is wrong.

> On Monday 16 March 2009, Oleg Nesterov wrote:
> > On 03/15, Gábor Melis wrote:
> >
> > > The revised signal-delivery-order.c (also attached) outputs:
> > >
> > > test_handler=8048727
> > > sigsegv_handler=804872c
> > > eip: 8048727
> > > esp: b7d94cb8
> > >
> > > which shows that sigsegv_handler also has incorrect eip in the
> > > context.
> >
> > Why do you think it is not correct?
> >
> > I didn't try your test-case, but I can't see where "esp: b7d94cb8"
> > comes from. But "eip: 8048727" looks exactly right, this is the
> > address of test_handler.
>
> Sorry, I removed the printing of esp from the code as it was not
> relevant to my point but pasted the output of a previous run.
>
> Anyway, I think eip is incorrect in sigsegv because it's not pointing to
> the instruction that caused the sigsegv. In general the ucontext is
> wrong, because it's as if sigsegv_handler were invoked within
> test_handler.
>
> This is problematic if the sigsegv handler wants to do something with
> the context. The real life sigsegv handler that's been failing does
> this:
> - skip the offending instruction by incrementing eip

I don't know how to solve your problem cleanly. Please let me know
if you find the solution ;)

As a workaround, perhaps sigsegv_handler() can just return if
uc_mcontext->ip points to another signal handler (assuming that
the first instruction of the signal handler can't cause the fault).
In this case the task will repeat the faulting instruction, and
sigsegv_handler() will be called again.

Agreed, this is not nice and the task should track sigaction()
calls, or sigsegv_handler() can do sigaction(signr, NULL, &oldact)
in a loop to see which handler we have.
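
Something like this fragment, say (purely illustrative, and it assumes
the application knows the addresses of its own handlers):

	void sigsegv_handler(int sig, siginfo_t *info, void *ctx_)
	{
		ucontext_t *ctx = ctx_;
		void *ip = (void *)ctx->uc_mcontext.gregs[REG_EIP];

		if (ip == (void *)test_handler) {
			/* The context describes another handler, not the
			   fault. Just return; the faulting instruction is
			   restarted and we get SIGSEGV again, next time
			   with a usable context. */
			return;
		}

		/* ... real SIGSEGV handling ... */
	}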


Can the kernel help?

Perhaps we can add si_ip to _sigfault, in this case we could just
check uc_mcontext->ip != info->si_ip and return. Or we can unwind
the stack and find the "correct" rt_sigframe which matches the
page fault.

Or, we can change force_sig_info_fault() to block all signals
except si_signo and use set_restore_sigmask() logic. This means
no other signal can be dequeued until sigsegv_handler() returns.

I dunno. I am not sure your problem is common enough to do these
changes, but I can't judge.

Oleg.



* Re: Signal delivery order
  2009-03-16 21:13             ` Oleg Nesterov
@ 2009-03-16 22:56               ` Chris Friesen
  2009-03-17  4:13                 ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Friesen @ 2009-03-16 22:56 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Gábor Melis, linux-kernel, Andrew Morton

Oleg Nesterov wrote:
> On 03/16, Gábor Melis wrote:
>> In a nutshell, the context argument is wrong.
> 
> I strongly disagree. This all is correct and works as expected.
> Yes, it doesn't match your expectations/needs, but this doesn't
> mean it is wrong.

It would appear that the standard may allow this, depending on interpretation.

 From SUSv3, regarding sigaction():

"...the third argument can be cast to a pointer to an object of type 
ucontext_t to refer to the receiving thread's context that was 
interrupted when the signal was delivered."

 From the "signal concepts" section, "A signal is said to be 
``delivered'' to a process when the appropriate action for the process 
and signal is taken."

Given that the SIGSEGV isn't "delivered" until its handler runs, it 
could possibly be valid for the instruction pointer in the SIGSEGV 
handler to reference test_handler, if the system was set up in such a 
way that the context was set to test_handler() at the time the SIGSEGV 
handler was run.

However, there are quality-of-implementation issues here--being able to 
handle SIGSEGV is pretty useless if the instruction pointer being passed 
to the handler doesn't actually match the instruction that caused the 
segfault.

> I dunno. I am not sure your problem is common enough to do these
> changes, but I can't judge.

What he's trying to do isn't all that unusual.  Certainly any 
application wanting to do something smart (log the instruction pointer, 
for instance) on a segfault could run into the same problem.

Chris


* Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-16 22:56               ` Chris Friesen
@ 2009-03-17  4:13                 ` Oleg Nesterov
  2009-03-17  4:25                   ` Oleg Nesterov
                                     ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-17  4:13 UTC (permalink / raw)
  To: Davide Libenzi, Ingo Molnar, Linus Torvalds, Roland McGrath
  Cc: Andrew Morton, Chris Friesen, Gábor Melis, linux-kernel

(see http://marc.info/?t=123704955800002)

First of all, perhaps I missed something and this is solvable without
kernel changes, but I can't see how.

To simplify, suppose that the application wants to log the faulting
instruction before exit, so it installs the handler for SIGSEGV

	void sigsegv_handler(int sig, siginfo_t *info, struct ucontext *context)
	{
		fprintf(stderr, "bug at %p\n", context->uc_mcontext->ip);
		exit(1);
	}

But this doesn't work. It is possible that, before the task dequeues SIGSEGV,
another private signal, say, SIGHUP (note that SIGHUP < SIGSEGV) is sent to
this app.

In this case the task dequeues both signals, SIGHUP and then SIGSEGV before
return to user-space. This is correct, but now uc_mcontext->ip points to
sighup_handler, not to the faulted instruction.


Can/should we change force_sig_info_fault() to help?

We can add siginfo_t->_sigfault.ip to solve this problem. This shouldn't
enlarge the size of siginfo_t. With this change sigsegv_handler() always
knows the address of the instruction which caused the fault.


But this doesn't allow changing uc_mcontext->ip to, say, skip the offending
instruction or call another function which does a fixup. Actually, ->si_ip
helps, we can do

	void sigsegv_handler(int sig, siginfo_t *info, struct ucontext *context)
	{
		if (info->si_ip != context->uc_mcontext->ip)
			/*
			 * The offending instruction will be repeated, and
			 * we will have the "same" SIGSEGV again.
			 */
			return;

		context->uc_mcontext->ip = fixup;
		...
	}

But this doesn't look very nice. So, perhaps we can do another change?

	--- arch/x86/mm/fault.c
	+++ arch/x86/mm/fault.c
	@@ -177,6 +177,13 @@ static void force_sig_info_fault(int si_
	 {
		siginfo_t info;
	 
	+	current->saved_sigmask = current->blocked;
	+	spin_lock_irq(&current->sighand->siglock);
	+	siginitsetinv(&current->blocked, sigmask(si_signo) |
	+			sigmask(SIGKILL) | sigmask(SIGSTOP));
	+	spin_unlock_irq(&current->sighand->siglock);
	+	set_restore_sigmask();
	+
		info.si_signo = si_signo;
		info.si_errno = 0;
		info.si_code = si_code;

But this is a user-visible change, all signals will be blocked until
sigsegv_handler() returns. But with this change sigsegv_handler()
always has the "correct" rt_sigframe.


Comments?

And once again, I have a nasty feeling I missed something and we don't
need to change the kernel.

Oleg.



* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17  4:13                 ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Oleg Nesterov
@ 2009-03-17  4:25                   ` Oleg Nesterov
  2009-03-17  8:23                   ` Gábor Melis
  2009-03-18  7:59                   ` Roland McGrath
  2 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-17  4:25 UTC (permalink / raw)
  To: Davide Libenzi, Ingo Molnar, Linus Torvalds, Roland McGrath
  Cc: Andrew Morton, Chris Friesen, Gábor Melis, linux-kernel

Sorry for noise, forgot to mention,

On 03/17, Oleg Nesterov wrote:
>
> 	--- arch/x86/mm/fault.c
> 	+++ arch/x86/mm/fault.c
> 	@@ -177,6 +177,13 @@ static void force_sig_info_fault(int si_
> 	 {
> 		siginfo_t info;
> 	 
> 	+	current->saved_sigmask = current->blocked;
> 	+	spin_lock_irq(&current->sighand->siglock);
> 	+	siginitsetinv(&current->blocked, sigmask(si_signo) |
> 	+			sigmask(SIGKILL) | sigmask(SIGSTOP));
> 	+	spin_unlock_irq(&current->sighand->siglock);
> 	+	set_restore_sigmask();
> 	+

Of course, this change is wrong, it is just for illustration.
We shouldn't unblock si_signo if it was blocked, force_sig_info()
sets SIG_DFL in this case.

Oleg.



* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17  4:13                 ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Oleg Nesterov
  2009-03-17  4:25                   ` Oleg Nesterov
@ 2009-03-17  8:23                   ` Gábor Melis
  2009-03-17  9:25                     ` Oleg Nesterov
  2009-03-17 15:56                     ` Linus Torvalds
  2009-03-18  7:59                   ` Roland McGrath
  2 siblings, 2 replies; 22+ messages in thread
From: Gábor Melis @ 2009-03-17  8:23 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Davide Libenzi, Ingo Molnar, Linus Torvalds, Roland McGrath,
	Andrew Morton, Chris Friesen, linux-kernel

On Tuesday 17 March 2009, Oleg Nesterov wrote:
> (see http://marc.info/?t=123704955800002)
>
> First of all, perhaps I missed something and this is solvable
> without kernel changes, but I can't see how.
>
> To simplify, suppose that the application wants to log the faulting
> instruction before exit, so it installs the handler for SIGSEGV
>
> 	void sigsegv_handler(int sig, siginfo_t *info, struct ucontext
> *context) {
> 		fprintf(stderr, "bug at %p\n", context->uc_mcontext->ip);
> 		exit(1);
> 	}
>
> But this doesn't work. It is possible that, before the task dequeues
> SIGSEGV, another private signal, say, SIGHUP (note that SIGHUP <
> SIGSEGV) is sent to this app.
>
> In this case the task dequeues both signals, SIGHUP and then SIGSEGV
> before return to user-space. This is correct, but now uc_mcontext->ip
> points to sighup_handler, not to the faulted instruction.
>
>
> Can/should we change force_sig_info_fault() to help?
>
> We can add siginfo_t->_sigfault.ip to solve this problem. This
> shouldn't enlarge the size of siginfo_t. With this change
> sigsegv_handler() always knows the address of the instruction which
> caused the fault.
>
>
> But this doesn't allow changing uc_mcontext->ip to, say, skip the
> offending instruction or call another function which does a fixup.
> Actually, ->si_ip helps, we can do
>
> 	void sigsegv_handler(int sig, siginfo_t *info, struct ucontext
> *context) {
> 		if (info->si_ip != context->uc_mcontext->ip)
> 			/*
> 			 * The offending instruction will be repeated, and
> 			 * we will have the "same" SIGSEGV again.
> 			 */
> 			return;
>
> 		context->uc_mcontext->ip = fixup;
> 		...
> 	}
>
> But this doesn't look very nice. So, perhaps we can do another
> change?
>
> 	--- arch/x86/mm/fault.c
> 	+++ arch/x86/mm/fault.c
> 	@@ -177,6 +177,13 @@ static void force_sig_info_fault(int si_
> 	 {
> 		siginfo_t info;
>
> 	+	current->saved_sigmask = current->blocked;
> 	+	spin_lock_irq(&current->sighand->siglock);
> 	+	siginitsetinv(&current->blocked, sigmask(si_signo) |
> 	+			sigmask(SIGKILL) | sigmask(SIGSTOP));
> 	+	spin_unlock_irq(&current->sighand->siglock);
> 	+	set_restore_sigmask();
> 	+
> 		info.si_signo = si_signo;
> 		info.si_errno = 0;
> 		info.si_code = si_code;
>
> But this is a user-visible change, all signals will be blocked until
> sigsegv_handler() returns. But with this change sigsegv_handler()
> always has the "correct" rt_sigframe.
>
>
> Comments?
>
> And once again, I have a nasty feeling I missed something and we
> don't need to change the kernel.
>
> Oleg.

As an application developer what I'd like to have is this: synchronously 
generated signals are delivered before asynchronously generated ones. 
That is, if a number of signals are generated but not yet delivered 
then the synchronously generated ones are delivered first. I guess, in 
the kernel this would mean that the private/non-private distinction is 
not enough.

I think Chris is right in that the standard allows the current behaviour. I 
would have quoted it already if I had thought otherwise ... He's also 
right about quality of implementation. The standard doesn't say much 
about synchronously and asynchronously generated signals and it even 
goes further:

"The order in which multiple, simultaneously pending signals outside the 
range SIGRTMIN to SIGRTMAX are delivered to or accepted by a process is 
unspecified."

Bleh. It also says that the context argument represents the context at 
the time of delivery. For a handler for sigsegv, sigtrap (to name just 
the most likely suspects) the context -  in order for it to be useful 
at all - must be from the time of generation. There is an obvious 
contradiction here and the only way I see to resolve this is to deliver 
synchronously generated signals while the two contexts are the same, 
which is exactly what allowing other signals to 'overtake' violates. Of 
course, if sigsegv is blocked then this is impossible to do, but that's 
fine.

For the application I'm working on this whole issue is kind of moot 
since kernels with the current behaviour will be around for ages to 
come and I have the workaround of not pthread_kill()ing with a signal 
whose signum is lower than the signum of any of the synchronously 
generated signals. In practice, it only means that I'll use SIGUSR2 
instead of SIGUSR1 because that's greater than SIGSEGV and pay 
attention not to pthread_kill() with SIGINT. The only thing that 
worries me is this remark from Oleg 
(http://marc.info/?l=linux-kernel&m=123711058421913&w=2):

"But please note that it is still possible to hit is_signal_blocked()
even with test_with_kill(), but the probability is very low."

I still can't see how that's possible, but if it is, then it defeats the 
workaround and I have even bigger problems than I thought. Not only me, 
I guess: most applications with a sigtrap or sigsegv handler that use the 
context will be broken by a C-c.

Gabor


* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17  8:23                   ` Gábor Melis
@ 2009-03-17  9:25                     ` Oleg Nesterov
  2009-03-17 10:20                       ` Gábor Melis
  2009-03-17 15:56                     ` Linus Torvalds
  1 sibling, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-17  9:25 UTC (permalink / raw)
  To: Gábor Melis
  Cc: Davide Libenzi, Ingo Molnar, Linus Torvalds, Roland McGrath,
	Andrew Morton, Chris Friesen, linux-kernel

On 03/17, Gábor Melis wrote:
>
> On Tuesday 17 March 2009, Oleg Nesterov wrote:
> >
> > But this doesn't look very nice. So, perhaps we can do another
> > change?
> >
> > 	--- arch/x86/mm/fault.c
> > 	+++ arch/x86/mm/fault.c
> > 	@@ -177,6 +177,13 @@ static void force_sig_info_fault(int si_
> > 	 {
> > 		siginfo_t info;
> >
> > 	+	current->saved_sigmask = current->blocked;
> > 	+	spin_lock_irq(&current->sighand->siglock);
> > 	+	siginitsetinv(&current->blocked, sigmask(si_signo) |
> > 	+			sigmask(SIGKILL) | sigmask(SIGSTOP));
> > 	+	spin_unlock_irq(&current->sighand->siglock);
> > 	+	set_restore_sigmask();
> > 	+
> > 		info.si_signo = si_signo;
> > 		info.si_errno = 0;
> > 		info.si_code = si_code;
> >
> > But this is a user-visible change, all signals will be blocked until
> > sigsegv_handler() returns. But with this change sigsegv_handler()
> > always has the "correct" rt_sigframe.
>
> As an application developer what I'd like to have is this: synchronously
> generated signals are delivered before asynchronously generated ones.
> That is, if a number of signals are generated but not yet delivered
> then the synchronously generated ones are delivered first. I guess, in
> the kernel this would mean that the private/non-private distinction is
> not enough.

With the change like above, no other signal (except SIGKILL) can be
delivered until the signal handler returns.

Probably it is better to just change force_sig_info(), in this case
SIGFPE/etc will have the same behaviour.

> The only thing that
> worries me is this remark from Oleg
> (http://marc.info/?l=linux-kernel&m=123711058421913&w=2):
>
> "But please note that it is still possible to hit is_signal_blocked()
> even with test_with_kill(), but the probability is very low."

Sorry for the confusion. Initially I misread the test_with_kill() case, and then
forgot to remove this part. I think this is not possible.

Oleg.



* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17  9:25                     ` Oleg Nesterov
@ 2009-03-17 10:20                       ` Gábor Melis
  2009-03-17 10:43                         ` Oleg Nesterov
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Melis @ 2009-03-17 10:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Davide Libenzi, Ingo Molnar, Linus Torvalds, Roland McGrath,
	Andrew Morton, Chris Friesen, linux-kernel

On Tuesday 17 March 2009, Oleg Nesterov wrote:
> On 03/17, Gábor Melis wrote:
> > On Tuesday 17 March 2009, Oleg Nesterov wrote:
> > > But this doesn't look very nice. So, perhaps we can do another
> > > change?
> > >
> > > 	--- arch/x86/mm/fault.c
> > > 	+++ arch/x86/mm/fault.c
> > > 	@@ -177,6 +177,13 @@ static void force_sig_info_fault(int si_
> > > 	 {
> > > 		siginfo_t info;
> > >
> > > 	+	current->saved_sigmask = current->blocked;
> > > 	+	spin_lock_irq(&current->sighand->siglock);
> > > 	+	siginitsetinv(&current->blocked, sigmask(si_signo) |
> > > 	+			sigmask(SIGKILL) | sigmask(SIGSTOP));
> > > 	+	spin_unlock_irq(&current->sighand->siglock);
> > > 	+	set_restore_sigmask();
> > > 	+
> > > 		info.si_signo = si_signo;
> > > 		info.si_errno = 0;
> > > 		info.si_code = si_code;
> > >
> > > But this is a user-visible change, all signals will be blocked
> > > until sigsegv_handler() returns. But with this change
> > > sigsegv_handler() always has the "correct" rt_sigframe.
> >
> > As an application developer what I'd like to have is this:
> > synchronously generated signals are delivered before asynchronously
> > generated ones. That is, if a number of signals are generated but
> > not yet delivered then the synchronously generated ones are
> > delivered first. I guess, in the kernel this would mean that the
> > private/non-private distinction is not enough.
>
> With the change like above, no other signal (except SIGKILL) can be
> delivered until the signal handler returns.

Surely, you don't mean the above literally: it would violate the 
standard to prevent all other signals from being delivered until the 
sigsegv handler returns.

> Probably it is better to just change force_sig_info(), in this case
> SIGFPE/etc will have the same behaviour.

Indeed, uniformity seems preferable to me.

While we are at it, an interesting case is when a synchronously 
generated signal and an asynchronously generated signal - that is also 
of the type that can be synchronously generated - are to be delivered. 
Say we have a fault and a sigsegv generated but some misguided soul 
pthread_kill()s with sigtrap. In this case the sigsegv shall be 
delivered first, and the async sigtrap later.

> > The only thing that
> > worries me is this remark from Oleg
> > (http://marc.info/?l=linux-kernel&m=123711058421913&w=2):
> >
> > "But please note that it is still possible to hit
> > is_signal_blocked() even with test_with_kill(), but the probability
> > is very low."
>
> Sorry for confusion. Initially I misread test_with_kill() case, and
> then forgot to remove this part. I think this is not possible.

Thanks for the clarification.

> Oleg.


* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17 10:20                       ` Gábor Melis
@ 2009-03-17 10:43                         ` Oleg Nesterov
  0 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-03-17 10:43 UTC (permalink / raw)
  To: Gábor Melis
  Cc: Davide Libenzi, Ingo Molnar, Linus Torvalds, Roland McGrath,
	Andrew Morton, Chris Friesen, linux-kernel

On 03/17, Gábor Melis wrote:
>
> On Tuesday 17 March 2009, Oleg Nesterov wrote:
> > On 03/17, Gábor Melis wrote:
> > > On Tuesday 17 March 2009, Oleg Nesterov wrote:
> > > > But this doesn't look very nice. So, perhaps we can do another
> > > > change?
> > > >
> > > > 	--- arch/x86/mm/fault.c
> > > > 	+++ arch/x86/mm/fault.c
> > > > 	@@ -177,6 +177,13 @@ static void force_sig_info_fault(int si_
> > > > 	 {
> > > > 		siginfo_t info;
> > > >
> > > > 	+	current->saved_sigmask = current->blocked;
> > > > 	+	spin_lock_irq(&current->sighand->siglock);
> > > > 	+	siginitsetinv(&current->blocked, sigmask(si_signo) |
> > > > 	+			sigmask(SIGKILL) | sigmask(SIGSTOP));
> > > > 	+	spin_unlock_irq(&current->sighand->siglock);
> > > > 	+	set_restore_sigmask();
> > > > 	+
> > > > 		info.si_signo = si_signo;
> > > > 		info.si_errno = 0;
> > > > 		info.si_code = si_code;
> > > >
> > > > But this is a user-visible change, all signals will be blocked
> > > > until sigsegv_handler() returns. But with this change
> > > > sigsegv_handler() always has the "correct" rt_sigframe.
> > >
> > > As an application developer what I'd like to have is this:
> > > synchronously generated signals are delivered before asynchronously
> > > generated ones. That is, if a number of signals are generated but
> > > not yet delivered then the synchronously generated ones are
> > > delivered first. I guess, in the kernel this would mean that the
> > > private/non-private distinction is not enough.
> >
> > With the change like above, no other signal (except SIGKILL) can be
> > delivered until the signal handler returns.
>
> Surely, you don't mean the above literally:

I literally meant the above ;)

> it would violate the
> standard to prevent all other signals from being delivered until the
> sigsegv handler returns.

Yes. That is why I didn't send the patch but asked the question.

But, just in case, this will not happen if the signal was sent from
user-space via kill/tkill.

> While we are at it, an interesting case is when a synchronously
> generated signal and an asynchronously generated signal - that is also
> of the type that can be synchronously generated - are to be delivered.
> Say we have a fault and a sigsegv generated but some misguided soul
> pthread_kill()s with sigtrap. In this case the sigsegv shall be
> delivered first, and the async sigtrap later.

In this case we can do nothing. The second signal will be lost.
But this is not the problem. If sigsegv_handler() wants to play
with context, it should check SI_FROMKERNEL() first. If we lose
the SIGSEGV from the fault, it will be re-generated.
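
In user space that check is roughly "info->si_code > 0": kernel-generated
faults carry a positive si_code (SEGV_MAPERR, SEGV_ACCERR), while signals
sent with kill/tkill carry si_code <= 0 (SI_USER, SI_TKILL). As a sketch:

	if (info->si_code <= 0)
		return;	/* sent from user space, not a real fault */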

Oleg.



* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17  8:23                   ` Gábor Melis
  2009-03-17  9:25                     ` Oleg Nesterov
@ 2009-03-17 15:56                     ` Linus Torvalds
  2009-03-17 19:20                       ` Q: SEGSEGV && uc_mcontext->ip David Miller
  2009-03-18  9:58                       ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Gábor Melis
  1 sibling, 2 replies; 22+ messages in thread
From: Linus Torvalds @ 2009-03-17 15:56 UTC (permalink / raw)
  To: Gábor Melis
  Cc: Oleg Nesterov, Davide Libenzi, Ingo Molnar, Roland McGrath,
	Andrew Morton, Chris Friesen, linux-kernel



On Tue, 17 Mar 2009, Gábor Melis wrote:
> 
> As an application developer what I'd like to have is this: synchronously 
> generated signals are delivered before asynchronously generated ones. 

I agree that it would be nice, but quite frankly, it's simply not how 
signals work. It would be a reasonably invasive change, and you wouldn't 
really be able to rely on it anyway since most kernels don't work that 
way.

What you might be able to do instead is to walk signal frames backwards by 
hand. IOW, accept the fact that sometimes signals end up being nested, but 
then you could try to find the right frame by just looking at them.

And your trick of comparing 'info->si_ip' with 'context->uc_mcontext->ip' 
is pretty good, and lets the code itself walk the signal frames by just 
depending on the fault happening again.

			Linus


* Re: Q: SEGSEGV && uc_mcontext->ip
  2009-03-17 15:56                     ` Linus Torvalds
@ 2009-03-17 19:20                       ` David Miller
  2009-03-18  9:58                       ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Gábor Melis
  1 sibling, 0 replies; 22+ messages in thread
From: David Miller @ 2009-03-17 19:20 UTC (permalink / raw)
  To: torvalds; +Cc: mega, oleg, davidel, mingo, roland, akpm, cfriesen, linux-kernel

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 17 Mar 2009 08:56:34 -0700 (PDT)

> What you might be able to do instead is to walk signal frames backwards by 
> hand. IOW, accept the fact that sometimes signals end up being nested, but 
> then you could try to find the right frame by just looking at them.

FWIW, GCC's dwarf2 unwind handlers already know how to walk through
Linux signal frames and identify them.


* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17  4:13                 ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Oleg Nesterov
  2009-03-17  4:25                   ` Oleg Nesterov
  2009-03-17  8:23                   ` Gábor Melis
@ 2009-03-18  7:59                   ` Roland McGrath
  2009-03-18  9:02                     ` RT signal queue overflow (Was: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)) Gábor Melis
  2 siblings, 1 reply; 22+ messages in thread
From: Roland McGrath @ 2009-03-18  7:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Davide Libenzi, Ingo Molnar, Linus Torvalds, Andrew Morton,
	Chris Friesen, Gábor Melis, linux-kernel

> First of all, perhaps I missed something and this is solvable without
> kernel changes, but I can't see how.

It depends what kind of "solved" you mean.

Signals pending for the thread are always delivered before signals pending
for the process.  POSIX does not guarantee this to the application, but it
has always been so in Linux and it's fine enough to rely on that.  Truly
externally-generated and asynchronous signals go to the process, so it's
really only pthread_kill use within your own program that raises the issue.

Among signals pending for the thread, signals < SIGRTMIN are always
delivered before ones >= SIGRTMIN.  POSIX does not guarantee this to the
application, but it has always been so in Linux and it's fine enough to
rely on that.  The most sensible thing to use with pthread_kill is some
SIGRTMIN+n signal anyway, since they are never confused with any other use.
If your program is doing that, you don't have a problem.
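
For instance (a minimal sketch; SIGRTMIN+1 is only an example number,
use whatever your runtime reserves):

	#include <pthread.h>
	#include <signal.h>
	#include <string.h>

	static void gc_stop_handler(int sig, siginfo_t *info, void *ctx)
	{
		/* stop-the-thread work goes here */
	}

	static void install_gc_signal(void)
	{
		struct sigaction sa;

		memset(&sa, 0, sizeof(sa));
		sa.sa_flags = SA_SIGINFO;
		sa.sa_sigaction = gc_stop_handler;
		sigemptyset(&sa.sa_mask);
		sigaction(SIGRTMIN + 1, &sa, 0);
	}

	/* later: pthread_kill(target_thread, SIGRTMIN + 1); */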

So on the one hand it seems pretty reasonable to say it's "solved" by
accepting it when we say, "Welcome to Unix, these things should have
stopped surprising you in the 1980s."  It's a strange pitfall of how
everything fits together, granted.  But you do sort of have to make an
effort to do things screwily before you can fall into it.

All that said, it's actually probably a pretty easy hack to arrange that
the signal posted by force_sig_info is the first one dequeued in all but
the most utterly strange situations.


Thanks,
Roland


* Re: RT signal queue overflow (Was: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order))
  2009-03-18  7:59                   ` Roland McGrath
@ 2009-03-18  9:02                     ` Gábor Melis
  2009-03-18 14:52                       ` Linus Torvalds
  0 siblings, 1 reply; 22+ messages in thread
From: Gábor Melis @ 2009-03-18  9:02 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, Davide Libenzi, Ingo Molnar, Linus Torvalds,
	Andrew Morton, Chris Friesen, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3309 bytes --]

On Wednesday 18 March 2009, Roland McGrath wrote:
> > First of all, perhaps I missed somethings and this is solvable
> > without kernel changes, but I can't see how.
>
> It depends what kind of "solved" you mean.
>
> Signals pending for the thread are always delivered before signals
> pending for the process.  POSIX does not guarantee this to the
> application, but it has always been so in Linux and it's fine enough
> to rely on that.  Truly externally-generated and asynchronous signals
> go to the process, so it's really only pthread_kill use within your
> own program that raises the issue.
>
> Among signals pending for the thread, signals < SIGRTMIN are always
> delivered before ones >= SIGRTMIN.  POSIX does not guarantee this to
> the application, but it has always been so in Linux and it's fine
> enough to rely on that.  The most sensible thing to use with
> pthread_kill is some SIGRTMIN+n signal anyway, since they are never
> confused with any other use. If your program is doing that, you don't
> have a problem.

It was just a month or so ago when I finally made the change to use a 
non-real-time signal for signalling stop-for-gc. It was motivated by 
the fact that even with rt signals there needs to be a fallback 
mechanism for when the rt signal queue overflows. Another reason was 
that _different processes_ could interfere with each other: if one 
filled the queue the other processes would hang too (there was no 
fallback mechanism implemented). From this behaviour, it seemed that 
the rt signal queue was global. Attached is a test program that 
reproduces this. 

$ gcc -lpthread rt-signal-queue-overflow.c
$ (./a.out &); sleep 1; ./a.out
pthread_kill returned EAGAIN, errno=0, count=24566
pthread_kill returned EAGAIN, errno=0, count=0

There are two notable things here. The first is that pthread_kill 
returns EAGAIN, which is not mentioned on the man page, but does not set 
errno. The other is that the first process filled the rt signal queue 
and the second one could not send a single signal successfully.

Granted, without a fallback mechanism my app deserved to lose. However, 
it seemed to me that there were other programs lacking in this regard 
on my desktop as I managed to hang a few of them.

Even though within my app I could have guaranteed that the number of 
pending rt signals stays below a reasonable limit, there was no way to 
defend against other processes filling up the queue, so I had to 
implement a fallback mechanism that used non-rt signals (changing a few 
other things as well), and when that was done, there was no reason to 
keep the rt-signal-based one around.

Consider this another quality-of-implementation report.

> So on the one hand it seems pretty reasonable to say it's "solved" by
> accepting it when we say, "Welcome to Unix, these things should have
> stopped surprising you in the 1980s."  It's a strange pitfall of how
> everything fits together, granted.  But you do sort of have to make
> an effort to do things screwily before you can fall into it.
>
> All that said, it's actually probably a pretty easy hack to arrange
> that the signal posted by force_sig_info is the first one dequeued in
> all but the most utterly strange situations.
>
>
> Thanks,
> Roland

[-- Attachment #2: rt-signal-queue-overflow.c --]
[-- Type: text/x-csrc, Size: 1036 bytes --]

#include <signal.h>
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/ucontext.h>
#include <unistd.h>
#include <errno.h>

int test_signal;

void test_handler(int signal, siginfo_t *info, void *context)
{
}

void install_handlers(void)
{
    struct sigaction sa;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = test_handler;
    sigaction(test_signal, &sa, 0);
}

int main(void)
{
    sigset_t sigset;
    test_signal = SIGRTMIN;
    install_handlers();
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &sigset, 0);
    {
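        /* SIGRTMIN stays blocked, so every successful pthread_kill()
           queues one more pending signal, until the queued-signal limit
           (RLIMIT_SIGPENDING, accounted per user) is hit and
           pthread_kill() returns EAGAIN. */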
        int r;
        int count = 0;
        do {
            r = pthread_kill(pthread_self(), test_signal);
            if (r == EAGAIN) {
                printf("pthread_kill returned EAGAIN, errno=%d, count=%d\n",
                       errno, count);
                sleep(2);
                exit(27);
            }
            if (!r)
                count++;
        } while (!r);
    }
}


* Re: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)
  2009-03-17 15:56                     ` Linus Torvalds
  2009-03-17 19:20                       ` Q: SEGSEGV && uc_mcontext->ip David Miller
@ 2009-03-18  9:58                       ` Gábor Melis
  1 sibling, 0 replies; 22+ messages in thread
From: Gábor Melis @ 2009-03-18  9:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Oleg Nesterov, Davide Libenzi, Ingo Molnar, Roland McGrath,
	Andrew Morton, Chris Friesen, linux-kernel

On Tuesday 17 March 2009, Linus Torvalds wrote:
> On Tue, 17 Mar 2009, Gábor Melis wrote:
> > As an application developer what I'd like to have is this:
> > synchronously generated signals are delivered before asynchronously
> > generated ones.
>
> I agree that it would be nice, but quite frankly, it's simply not how
> signals work. It would be a reasonably invasive change, and you
> wouldn't really be able to rely on it anyway since most kernels don't
> work that way.

True. I won't be able to rely on this and I'll stick to the SIGUSR2 
workaround that's confirmed by Oleg.

> What you might be able to do instead is to walk signal frames
> backwards by hand. IOW, accept the fact that sometimes signals end up
> being nested, but then you could try to find the right frame by just
> looking at them.

Walking the frames is not enough, because even if the right one is 
found, I can't put a new frame on it if it's not at the top ...

> And your trick of comparing 'info->si_ip' with
> 'context->uc_mcontext->ip' is pretty good, and lets the code itself
> walk the signal frames by just depending on the fault happening
> again.

Another example. Suppose there is a stack with an mprotected guard page 
at the end. The app's stack grows into the guard page, a sigsegv is 
generated and its handler would be invoked, but a pthread_kill'ed 
SIGUSR1 gets delivered first. Now the SIGUSR1 handler accesses the 
stack and triggers another fault; the sigsegv handler sees that 
si_ip == uc_mcontext->ip, so it unprotects the page and puts a frame 
on the stack, arranging for a function to be called. Then that 
function deadlocks because it waits for a signal that's blocked in the 
SIGUSR1 handler.
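
(For concreteness, the guard page in this scenario is just a PROT_NONE 
page at the low end of the stack; below is a minimal sketch of such a 
setup, with made-up sizes and names, not taken from my app.)

#include <sys/mman.h>

#define GUARD_SIZE 4096
#define STACK_SIZE (64 * 1024)

/* Hypothetical: reserve a stack with a PROT_NONE guard page at its low
   end; any access that grows into the guard page raises SIGSEGV. */
void *make_guarded_stack(void)
{
    char *base = mmap(0, STACK_SIZE + GUARD_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return 0;
    mprotect(base, GUARD_SIZE, PROT_NONE);  /* guard page below the stack */
    return base + GUARD_SIZE;               /* usable stack starts above it */
}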

I think it would be a definite improvement to prevent all these 
headaches from occurring and deliver asynchronously generated, 
thread-private signals after the synchronous ones.

> 			Linus

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: RT signal queue overflow (Was: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order))
  2009-03-18  9:02                     ` RT signal queue overflow (Was: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)) Gábor Melis
@ 2009-03-18 14:52                       ` Linus Torvalds
  2009-03-18 15:23                         ` Gábor Melis
  0 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2009-03-18 14:52 UTC (permalink / raw)
  To: Gábor Melis
  Cc: Roland McGrath, Oleg Nesterov, Davide Libenzi, Ingo Molnar,
	Andrew Morton, Chris Friesen, linux-kernel



On Wed, 18 Mar 2009, Gábor Melis wrote:
> 
> It was just a month or so ago when I finally made the change to use a 
> non-real-time signal for signalling stop-for-gc.

Ahh, and that is when you noticed the issue with SIGSEGV?

One thing you might try is to still use a non-real-time signal, but simply 
depend on the fact that signals are basically peeled off the signal 
bitmask in a signal number order (with the exception that fatal signals 
are handled specially).

IOW, on x86, just about the _only_ normal user-generated signal that can 
happen before SIGSEGV is SIGUSR1, because SIGSEGV is 11, and SIGUSR1 is 
10.

If you were to use SIGUSR2 (signal #12) you'd probably never see this.

Of course, there are other signals with numbers < 11, but they are 
generally meant for other synchronous traps (ie signals like 
> SIGTRAP/SIGABRT/SIGFPE/SIGBUS), or for killing the process (signals
SIGHUP/SIGINT/SIGQUIT).

So maybe you'd be happy with just picking another signal? It's not a 
_pretty_ solution, but it should work across most kernel versions.

			Linus

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: RT signal queue overflow (Was: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order))
  2009-03-18 14:52                       ` Linus Torvalds
@ 2009-03-18 15:23                         ` Gábor Melis
  0 siblings, 0 replies; 22+ messages in thread
From: Gábor Melis @ 2009-03-18 15:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Roland McGrath, Oleg Nesterov, Davide Libenzi, Ingo Molnar,
	Andrew Morton, Chris Friesen, linux-kernel

On Wednesday 18 March 2009, Linus Torvalds wrote:
> On Wed, 18 Mar 2009, Gábor Melis wrote:
> > It was just a month or so ago when I finally made the change to use
> > a non-real-time signal for signalling stop-for-gc.
>
> Ahh, and that is when you noticed the issue with SIGSEGV?

Unfortunately not immediately, because another change was needed to 
make context-frobbing sigsegvs frequent enough for this to be noticed ...

> One thing you might try is to still use a non-real-time signal, but
> simply depend on the fact that signals are basically peeled off the
> signal bitmask in a signal number order (with the exception that
> fatal signals are handled specially).
>
> IOW, on x86, just about the _only_ normal user-generated signal that
> can happen before SIGSEGV is SIGUSR1, because SIGSEGV is 11, and
> SIGUSR1 is 10.
>
> If you were to use SIGUSR2 (signal #12) you'd probably never see
> this.
>
> Of course, there are other signals with numbers < 11, but they are
> generally meant for other synchronous traps (ie signals like
> SIGTRAP/SIGABRT/SIGFPE/SIGBUS), or for killing the process (signals
> SIGHUP/SIGINT/SIGQUIT).
>
> So maybe you'd be happy with just picking another signal? It's not a
> _pretty_ solution, but it should work across most kernel versions.

So I did already. This is the workaround that I referred to before:

"... I'll stick to the SIGUSR2 workaround that's confirmed by Oleg."

So I'm happy enough that it's fixed in my app, but considering how hard 
it was to figure out what was going on, I thought the kernel people 
might be interested.

> 			Linus

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2009-03-18 15:23 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-14 16:50 Signal delivery order Gábor Melis
2009-03-15  9:44 ` Oleg Nesterov
2009-03-15 14:40   ` Gábor Melis
2009-03-15 17:29     ` Oleg Nesterov
2009-03-15 22:06       ` Gábor Melis
2009-03-16  0:28         ` Oleg Nesterov
2009-03-16  8:34           ` Gábor Melis
2009-03-16 21:13             ` Oleg Nesterov
2009-03-16 22:56               ` Chris Friesen
2009-03-17  4:13                 ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Oleg Nesterov
2009-03-17  4:25                   ` Oleg Nesterov
2009-03-17  8:23                   ` Gábor Melis
2009-03-17  9:25                     ` Oleg Nesterov
2009-03-17 10:20                       ` Gábor Melis
2009-03-17 10:43                         ` Oleg Nesterov
2009-03-17 15:56                     ` Linus Torvalds
2009-03-17 19:20                       ` Q: SEGSEGV && uc_mcontext->ip David Miller
2009-03-18  9:58                       ` Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order) Gábor Melis
2009-03-18  7:59                   ` Roland McGrath
2009-03-18  9:02                     ` RT signal queue overflow (Was: Q: SEGSEGV && uc_mcontext->ip (Was: Signal delivery order)) Gábor Melis
2009-03-18 14:52                       ` Linus Torvalds
2009-03-18 15:23                         ` Gábor Melis
