* [BUG - HRT patch] disabling timer hangs system when multiple over runs
@ 2003-01-13 18:26 Fleischer, Julie N
2003-01-14 22:26 ` [BUG - HRT patch] disabling timer hangs system when multiple overruns george anzinger
0 siblings, 1 reply; 3+ messages in thread
From: Fleischer, Julie N @ 2003-01-13 18:26 UTC
To: 'george@mvista.com'
Cc: 'high-res-timers-discourse@lists.sourceforge.net',
'linux-kernel@vger.kernel.org'
George -
I'm testing your 2.5.54-bk1 high-res-timers patches and am debugging
an issue where my system hangs (i.e., doesn't accept input and I have to
reboot). It happens when I disable the timer by setting it_value to 0.
I've narrowed it down: it only happens after multiple overruns have been
generated (i.e., when I set up a repeating timer and keep its signal
blocked for more than one timer expiry, my system hangs when I try to
disable that timer -- I'm disabling before unblocking the signals).
I know "system hang" is not very descriptive. If you have input on what
types of logs I should be looking at to figure out what's really going on or
other ways I can debug, I'll do that.
I have added the tests I'm using to reproduce this issue to
http://posixtest.sf.net. The original one where I noticed it was
posixtestsuite/conformance/interfaces/timer_gettime/2-3.c after Jim
Houston's bug fix. Then I added
posixtestsuite/conformance/interfaces/timer_settime/3-2.c and 3-3.c to help
me get to the root cause. To reproduce the issue, you can either run
timer_gettime/2-3.c, or change timer_settime/3-3.c to use a repeating timer
(in nsecs). I have included the latter below.
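To give a rough sense of how many overruns the modified test builds up, here is a small back-of-the-envelope calculation. It is only a sketch under idealized assumptions: the values are the ones used in the test below (first expiry TIMEREXPIRENSEC = 10 ms, interval 5*TIMEREXPIRENSEC = 50 ms, signal blocked for SLEEPTIME = 1 s), the sleep is taken to last exactly one second, and timer resolution and scheduling jitter are ignored. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Idealized count of expirations of a periodic timer within a window
 * that starts when the timer is armed. All arguments in nanoseconds.
 */
int64_t expirations_within(int64_t first_ns, int64_t interval_ns,
                           int64_t window_ns)
{
    if (first_ns <= 0 || window_ns < first_ns)
        return 0;   /* disarmed, or window ends before the first expiry */
    if (interval_ns <= 0)
        return 1;   /* one-shot timer: a single expiry */
    return 1 + (window_ns - first_ns) / interval_ns;
}

/*
 * While the signal stays blocked, the first expiry leaves one pending
 * signal; every later expiry in the window counts as an overrun.
 */
int64_t overruns_while_blocked(int64_t first_ns, int64_t interval_ns,
                               int64_t window_ns)
{
    int64_t n = expirations_within(first_ns, interval_ns, window_ns);
    return n > 0 ? n - 1 : 0;
}
```

With the test's values, expirations_within(10000000, 50000000, 1000000000) gives 20 and overruns_while_blocked gives 19, so the disarm happens with well over one overrun outstanding.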
==> One related (possibly ignorant) question: I wanted to test this against
your latest version (2.5.54-bk6), but when I went to get the bk patches
for 2.5.54 today, I couldn't find them. Are those only available for the
current kernel version? That makes sense -- I should have been quicker.
I just wanted to check whether there is another way for me to get that
version.
Additional information is below:
kernel used = 2.5.54-bk1
HRT patches applied =
hrtimers-core-2.5.54-bk1-1.0.patch
hrtimers-hrposix-2.5.54-bk1-1.0.patch
hrtimers-i386-2.5.54-bk1-1.0.patch
hrtimers-posix-2.5.54-bk1-1.0.patch
hrtimers-support-2.5.52-1.0.patch
Thanks.
- Julie Fleischer
timer_settime/3-3.c - with modifications to show issue
/*
* Copyright (c) 2002, Intel Corporation. All rights reserved.
* Created by: julie.n.fleischer REMOVE-THIS AT intel DOT com
* This file is licensed under the GPL license. For the full content
* of this license, see the COPYING file at the top level of this
* source tree.
* Test that if value.it_value = 0, the timer is disarmed. Test by
* disarming a currently armed and blocked timer.
*
* For this test, signal SIGTOTEST will be used, clock CLOCK_REALTIME
* will be used.
*/
#include <time.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include "posixtest.h"
#define TIMEREXPIRENSEC 10000000
#define SLEEPTIME 1
#define SIGTOTEST SIGALRM

void handler(int signo)
{
    printf("OK to be in once\n");
}

int main(int argc, char *argv[])
{
    sigset_t set;
    struct sigevent ev;
    struct sigaction act;
    timer_t tid;
    struct itimerspec its;
    struct timespec ts;

    ev.sigev_notify = SIGEV_SIGNAL;
    ev.sigev_signo = SIGTOTEST;

    if (sigemptyset(&set) != 0) {
        perror("sigemptyset() did not return success\n");
        return PTS_UNRESOLVED;
    }

    if (sigaddset(&set, SIGTOTEST) != 0) {
        perror("sigaddset() did not return success\n");
        return PTS_UNRESOLVED;
    }

    if (sigprocmask(SIG_SETMASK, &set, NULL) != 0) {
        perror("sigprocmask() did not return success\n");
        return PTS_UNRESOLVED;
    }

    if (timer_create(CLOCK_REALTIME, &ev, &tid) != 0) {
        perror("timer_create() did not return success\n");
        return PTS_UNRESOLVED;
    }

    /*
     * First set up timer to be blocked
     */
    its.it_interval.tv_sec = 0;
    its.it_interval.tv_nsec = 5*TIMEREXPIRENSEC;
    its.it_value.tv_sec = 0;
    its.it_value.tv_nsec = TIMEREXPIRENSEC;

    printf("setup first timer\n");
    if (timer_settime(tid, 0, &its, NULL) != 0) {
        perror("timer_settime() did not return success\n");
        return PTS_UNRESOLVED;
    }

    printf("sleep\n");
    sleep(SLEEPTIME);
    printf("awoke\n");

    /*
     * Second, set value.it_value = 0 and set up handler to catch
     * signal.
     */
    act.sa_handler = handler;
    act.sa_flags = 0;

    if (sigemptyset(&act.sa_mask) == -1) {
        perror("Error calling sigemptyset\n");
        return PTS_UNRESOLVED;
    }
    if (sigaction(SIGTOTEST, &act, 0) == -1) {
        perror("Error calling sigaction\n");
        return PTS_UNRESOLVED;
    }

    its.it_interval.tv_sec = 0;
    its.it_interval.tv_nsec = 0;
    its.it_value.tv_sec = 0;
    its.it_value.tv_nsec = 0;

    printf("setup second timer\n");
    if (timer_settime(tid, 0, &its, NULL) != 0) {
        perror("timer_settime() did not return success\n");
        return PTS_UNRESOLVED;
    }

    printf("unblock\n");
    if (sigprocmask(SIG_UNBLOCK, &set, NULL) != 0) {
        perror("sigprocmask() did not return success\n");
        return PTS_UNRESOLVED;
    }

    /*
     * Ensure sleep for SLEEPTIME seconds is not interrupted
     */
    ts.tv_sec = SLEEPTIME;
    ts.tv_nsec = 0;

    printf("sleep again\n");
    if (nanosleep(&ts, NULL) == -1) {
        printf("nanosleep() interrupted\n");
        printf("Test FAILED\n");
        return PTS_FAIL;
    }

    printf("Test PASSED\n");
    return PTS_PASS;
}

**These views are not necessarily those of my employer.**
* Re: [BUG - HRT patch] disabling timer hangs system when multiple overruns
2003-01-13 18:26 [BUG - HRT patch] disabling timer hangs system when multiple over runs Fleischer, Julie N
@ 2003-01-14 22:26 ` george anzinger
0 siblings, 0 replies; 3+ messages in thread
From: george anzinger @ 2003-01-14 22:26 UTC
To: Fleischer, Julie N
Cc: 'high-res-timers-discourse@lists.sourceforge.net',
'linux-kernel@vger.kernel.org'
"Fleischer, Julie N" wrote:
>
> George -
> I'm testing your 2.5.54-bk1 high-res-timers patches and am working on
> debugging an issue I'm seeing where my system hangs (i.e., doesn't accept
> input and I have to reboot). It happens when I'm disabling the timer by
> setting the it_value to 0. I've been able to nail it down to know that it
> only happens when you have generated multiple overruns (i.e., when I set up
> a repeating timer and block it for > 1 timer expiry, my system then hangs
> when I try to disable that timer -- I'm disabling before unblocking the
> signals).
>
> I know "system hang" is not very descriptive. If you have input on what
> types of logs I should be looking at to figure out what's really going on or
> other ways I can debug, I'll do that.
I suspect that you have run into a bug I fixed in the latest
version, having to do with handing off a timer from the id
lookup to the spin lock on the timer. I was releasing the
lookup lock prior to taking the timer lock, which allowed an
interrupt to sneak in there and set up a deadlock with the
interrupt code. This is most likely to happen when processing
overrunning timers.
This is fixed in the latest patch.
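
The hand-off described above can be pictured with a small userspace sketch. This is not the code from the patch: it uses C11 atomic_flag spinlocks as stand-ins for the kernel locks, it does not model interrupts at all, and every name in it (sketch_lock, lookup_lock, lock_timer, and so on) is hypothetical. It only shows the ordering that closes the window, assuming the fix amounts to taking the timer lock before the lookup lock is dropped.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal test-and-set spinlock; a userspace stand-in for a kernel lock. */
typedef struct { atomic_flag held; } sketch_lock;

void sketch_lock_acquire(sketch_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* spin until the holder releases it */
}

/* Returns true if the lock was free and is now held by the caller. */
bool sketch_lock_try(sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void sketch_lock_release(sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct sketch_timer {
    int id;
    sketch_lock lock;   /* protects this one timer */
};

#define SKETCH_NTIMERS 2

/* Protects the id -> timer lookup table. */
sketch_lock lookup_lock = { ATOMIC_FLAG_INIT };

struct sketch_timer timers[SKETCH_NTIMERS] = {
    { 1, { ATOMIC_FLAG_INIT } },
    { 2, { ATOMIC_FLAG_INIT } },
};

/*
 * Look up a timer by id and return it with its own lock held.
 * The timer lock is taken while the lookup lock is still held, so
 * there is no instant at which the timer has been found but is
 * protected by neither lock.
 */
struct sketch_timer *lock_timer(int id)
{
    struct sketch_timer *found = NULL;

    sketch_lock_acquire(&lookup_lock);
    for (size_t i = 0; i < SKETCH_NTIMERS; i++) {
        if (timers[i].id == id) {
            found = &timers[i];
            break;
        }
    }
    if (found)
        sketch_lock_acquire(&found->lock);  /* take timer lock first... */
    sketch_lock_release(&lookup_lock);      /* ...then drop lookup lock */
    return found;
}

void unlock_timer(struct sketch_timer *t)
{
    sketch_lock_release(&t->lock);
}
```

The buggy ordering would be the same function with the release of lookup_lock moved above the acquire of found->lock. In the kernel the gap between the two would additionally need interrupts to stay disabled, which this sketch cannot show.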
>
> I have added the tests I'm using to reproduce this issue to
> http://posixtest.sf.net. The original one where I noticed it was
> posixtestsuite/conformance/interfaces/timer_gettime/2-3.c after Jim
> Houston's bug fix. Then, I added
> posixtestsuite/conformance/interfaces/timer_settime/3-2.c and 3-3.c to help
> me get to root cause. To view the issue, you can either run
> timer_gettime/2-3.c, or change timer_settime/3-3.c to use a repeating timer
> (in nsecs). I have included the latter below.
>
> ==> One related ignorant question I had is I wanted to test this against
> your latest version (2.5.54-bk6), but when I went today to get the bk
> patches for 2.5.54, I couldn't find them. Are those only available for the
> current kernel version? That makes sense -- I should have been quicker.
> But, just wanted to check if there was another way for me to get that
> version.
Oh, you mean the kernel.org patches. Yes, they are only on
kernel.org until the next version. It is a rather large
patch. I suppose I could send it if you can stand
megabyte-sized attachments.
I have been offline trying to bring up my new computer on
RH8.0, so I have not moved to the latest kernel yet.
-g
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
* RE: [BUG - HRT patch] disabling timer hangs system when multiple overruns
@ 2003-01-16 0:12 Fleischer, Julie N
0 siblings, 0 replies; 3+ messages in thread
From: Fleischer, Julie N @ 2003-01-16 0:12 UTC
To: 'george anzinger'
Cc: 'high-res-timers-discourse@lists.sourceforge.net',
'linux-kernel@vger.kernel.org'
> George Anzinger wrote:
> I suspect that you have run into a bug I fixed in the latest
> version, having to do with handing off a timer from the id
> lookup to the spin lock on the timer. I was releasing the
> lookup lock prior to taking the timer lock, which allowed an
> interrupt to sneak in there and set up a deadlock with the
> interrupt code. This is most likely to happen when processing
> overrunning timers.
>
> This is fixed in the latest patch.
George -
Again, sorry about not testing on your latest version (and thanks for the
bk6 patch! :) ). I ran this test again on your latest patch (the
2.5.54-bk6-1.0 patches), and I'm still seeing a hang of the test case.
There is a good difference, though, which I think is due to your fix. With
2.5.54-bk1, I usually had to reboot the system after the hang (about 2/3
of the time). Now I do not have to reboot the system (3/3 times so far),
but the test case still hangs (i.e., I have to manually kill the session
the test case was started in).
I forgot to mention reproducibility before. The test case hang is always
reproducible (bk6 or bk1 patches). As I mentioned, the system hang no
longer happens with bk6 (probably the issue you fixed), but the system would
hang ~1/3 times with the bk1 patches.
Someone also suggested I use strace to get more output. That helped
pinpoint that the hang occurs during the "timer_settime(<an its with
it_value=0>)" call.
Here's that output:
(...)
write(1, "setup first timer\n", 18setup first timer
) = 18
ipc_subcall(0x8000003, 0, 0xbffff8e0, 0) = 0
write(1, "sleep\n", 6sleep
) = 6
rt_sigprocmask(SIG_BLOCK, [CHLD], [ALRM], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [ALRM], NULL, 8) = 0
nanosleep({1, 0}, {1, 0}) = 0
write(1, "awoke\n", 6awoke
) = 6
rt_sigaction(SIGALRM, {0x80485c0, [], 0x4000000}, NULL, 8) = 0
write(1, "setup second timer\n", 19setup second timer
) = 19
ipc_subcall(0x8000003, 0, 0xbffff8e0, 0
This is the point where the test case hangs. If I'm using strace, I just do
a Ctrl-C to get out.
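
Since the test case itself now hangs rather than the whole system, one way to avoid killing the session by hand might be to run the test body in a child process under a timeout. The following is only a sketch: run_with_timeout, the WATCHDOG_* values, and the two example test functions are hypothetical names, not part of posixtestsuite. It also assumes the hung child still responds to SIGKILL; a task spinning inside the kernel may not, in which case the final waitpid() would block as well.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define WATCHDOG_TIMED_OUT (-1)   /* child was killed after the timeout */
#define WATCHDOG_ERROR     (-2)   /* fork/wait failed or child died oddly */

/*
 * Run test_fn in a child process. Returns the child's exit status if
 * it finishes within timeout_ms, WATCHDOG_TIMED_OUT if it had to be
 * killed, or WATCHDOG_ERROR on any other failure.
 */
int run_with_timeout(int (*test_fn)(void), int timeout_ms)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    pid_t pid = fork();

    if (pid < 0)
        return WATCHDOG_ERROR;
    if (pid == 0)
        _exit(test_fn());          /* child: run the test and exit */

    /* Parent: poll for the child's exit in 10 ms steps. */
    for (int waited = 0; waited < timeout_ms; waited += 10) {
        int status;
        pid_t done = waitpid(pid, &status, WNOHANG);

        if (done == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status)
                                     : WATCHDOG_ERROR;
        if (done < 0)
            return WATCHDOG_ERROR;
        nanosleep(&tick, NULL);
    }

    /* Timed out: kill the child and reap it (may block if unkillable). */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return WATCHDOG_TIMED_OUT;
}

/* Hypothetical stand-ins for a real test body. */
int example_passing_test(void) { return 0; }
int example_hanging_test(void) { for (;;) pause(); }
```

A wrapper could call run_with_timeout() on the body of the test above and report a timeout as a distinct result, keeping the shell usable when the timer_settime() call never returns.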
I'm using these patches:
hrtimers-core-2.5.54-bk6-1.0.patch
hrtimers-hrposix-2.5.54-bk6-1.0.patch
hrtimers-i386-2.5.54-bk6-1.0.patch
hrtimers-posix-2.5.54-bk6-1.0.patch
hrtimers-support-2.5.52-1.0.patch
Thanks.
- Julie
**These views are not necessarily those of my employer.**