linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Process with many NPTL threads terminates slowly on core dump signal
@ 2005-09-22 12:13 Michael Kerrisk
  2005-09-22 13:10 ` Michael Kerrisk
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Michael Kerrisk @ 2005-09-22 12:13 UTC (permalink / raw)
  To: mingo, roland; +Cc: linux-kernel, michael.kerrisk

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="us-ascii", Size: 4466 bytes --]

Roland, Ingo,

I'm guessing that one of you might best be able to determine the 
cause of the behaviour I'm seeing below.

I wrote a program (below) to investigate the operation of the
RLIMIT_CPU resource limit and I encountered a strangeness: if the
program creates a large number of threads, then it takes a very long
to terminate if it receives a signal that may cause a core dump.

I first noticed this happening on receipt of a SIGXCPU (since the
program is designed to consume infinite CPU time).  However, I then
determined that the behaviour occurs on receipt of any signal that
can generate a core dump.

Here's an example run (Linux 2.6.14-rc2, x86, NPTL 2.3.4) -- the
program is asked to set a soft RLIMIT_CPU of 5 seconds, and to
create 20 threads:

==========
$ ulimit -c 0
$ time ./thread_share_RLIMIT_CPU 5 20
Linux tekapo 2.6.14-rc2 #6 SMP PREEMPT Wed Sep 21 09:29:36 CEST 2005 i686
i686 i386 GNU/Linux
Thu Sep 22 14:01:16 CEST 2005
Main thread changed soft CPU resource limit to: 5
Thread 3 cpu=0.21
Thread 3 cpu=0.42
Thread 3 cpu=0.63
Thread 3 cpu=0.84
Thread 3 cpu=1.05
Thread 3 cpu=1.26
[Here, I type ^\, generating SIGQUIT]
Quit

real    0m13.690s
user    0m1.388s
sys     0m11.977s
==========

In the above, one can see a large amount of (system) CPU time 
consumed.

In a similar run, sending SIGINT (^C) does not cause this long delay
before process termination.  Some further experimentation determined
the following:

* The slow process termination time only occurs for those signals that
  may generate core dumps (verified by sending signals using kill(1)
  from another terminal window).

* The slow startup time occurs even if RLIMIT_CORE is set to 0 (see
  the example above above).

* If the example program is made set-user-UD-root (thus
  mm->dumpable == 0), then the problem does not occur.

* I wondered if the fact that the threads were all trying to hog CPU
  might have some bearing on the problem.  However, even if I make
  the threads pause(), the problem is still observable.

All of this makes me think the problem might be somewhere around
do_coredump()/soredump_wait(), but the problem is not immediately
obvious to me.

Cheers,

Michael


/* thread_share_RLIMIT_CPU.c

   Usage: thread_share_RLIMIT_CPU [CPU-soft-limit [num-threads]]
*/
#include <sys/types.h>
#include <sys/times.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

#define errExit(msg)            { perror(msg); exit(EXIT_FAILURE); }

#define errExitEN(en, msg)      { errno = en; perror(msg); \
                                  exit(EXIT_FAILURE); }

static void *
tfunc(void *x)
{
    struct tms tms;
    int cpuCentisecs, prevCpu, j;

    usleep(20000);      /* Give main() a small chance to create other
                           threads before we start consuming CPU time */

    // pause();

   /* Following consumes CPU time and prints messages
       allowing us to track CPU consumption */

    prevCpu = 0;
    for (;;) {
        for (j = 0; j < 100000; j++)
            continue;           /* So main loop consumes mostly
                                   user-mode CPU time */

        if (times(&tms) == -1) errExit("times");
        cpuCentisecs = (tms.tms_utime + tms.tms_stime) * 100 /
                        sysconf(_SC_CLK_TCK);
        if (cpuCentisecs > prevCpu + 20) {
            printf("Thread %d cpu=%0.2f\n", (int) x,
                    cpuCentisecs / 100.0);
            prevCpu = cpuCentisecs;
        }
    }
    return NULL;
}

int
main(int argc, char *argv[])
{
    pthread_t thr;
    int s, numThreads, tn;
    struct rlimit rlim;

    system("uname -a; date");

    if (getrlimit(RLIMIT_CPU, &rlim) == -1) errExit("getrlimit");
    rlim.rlim_cur = (argc > 1) ? atoi(argv[1]) : 3;
    if (setrlimit(RLIMIT_CPU, &rlim) == -1)
        errExit("setrlimit-ORIG_LIMIT");

    if (getrlimit(RLIMIT_CPU, &rlim) == -1) errExit("getrlimit");
    printf("Main thread changed soft CPU resource limit to: %ld\n",
            (long) rlim.rlim_cur);

    numThreads = (argc > 2) ? atoi(argv[2]) : 2;

    for (tn = 0; tn < numThreads; tn++) {
        s = pthread_create(&thr, NULL, tfunc, (void *) tn);
        if (s != 0) errExitEN(s, "pthread_create");
    } 

    pause();
} /* main */

-- 
Lust, ein paar Euro nebenbei zu verdienen? Ohne Kosten, ohne Risiko!
Satte Provisionen für GMX Partner: http://www.gmx.net/de/go/partner

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Process with many NPTL threads terminates slowly on core dump signal
  2005-09-22 12:13 Process with many NPTL threads terminates slowly on core dump signal Michael Kerrisk
@ 2005-09-22 13:10 ` Michael Kerrisk
  2005-09-24 15:28 ` Daniel Jacobowitz
  2005-10-12 13:34 ` Michael Kerrisk
  2 siblings, 0 replies; 7+ messages in thread
From: Michael Kerrisk @ 2005-09-22 13:10 UTC (permalink / raw)
  To: Michael Kerrisk; +Cc: mingo, roland, linux-kernel, michael.kerrisk

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="us-ascii", Size: 292 bytes --]

Just one further point to my ealier report.  Some investigation 
shows that the problem seems to have appaeared with kernel 2.6.11.

Cheers,

Michael

-- 
Lust, ein paar Euro nebenbei zu verdienen? Ohne Kosten, ohne Risiko!
Satte Provisionen für GMX Partner: http://www.gmx.net/de/go/partner

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Process with many NPTL threads terminates slowly on core dump signal
  2005-09-22 12:13 Process with many NPTL threads terminates slowly on core dump signal Michael Kerrisk
  2005-09-22 13:10 ` Michael Kerrisk
@ 2005-09-24 15:28 ` Daniel Jacobowitz
  2005-10-12 13:34 ` Michael Kerrisk
  2 siblings, 0 replies; 7+ messages in thread
From: Daniel Jacobowitz @ 2005-09-24 15:28 UTC (permalink / raw)
  To: Michael Kerrisk; +Cc: mingo, roland, linux-kernel, michael.kerrisk

On Thu, Sep 22, 2005 at 02:13:50PM +0200, Michael Kerrisk wrote:
> I first noticed this happening on receipt of a SIGXCPU (since the
> program is designed to consume infinite CPU time).  However, I then
> determined that the behaviour occurs on receipt of any signal that
> can generate a core dump.

You might want to try using /proc/profile (profile= option) to see
where all the kernel time is going?


-- 
Daniel Jacobowitz
CodeSourcery, LLC

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Process with many NPTL threads terminates slowly on core dump signal
  2005-09-22 12:13 Process with many NPTL threads terminates slowly on core dump signal Michael Kerrisk
  2005-09-22 13:10 ` Michael Kerrisk
  2005-09-24 15:28 ` Daniel Jacobowitz
@ 2005-10-12 13:34 ` Michael Kerrisk
  2 siblings, 0 replies; 7+ messages in thread
From: Michael Kerrisk @ 2005-10-12 13:34 UTC (permalink / raw)
  To: roland
  Cc: mingo, linux-kernel, michael.kerrisk, Daniel Jacobowitz, Michael Kerrisk

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="us-ascii", Size: 7180 bytes --]

Roland,

Following up (belatedly) from my earlier message, I took Daniel 
Jacobowitz's suggestion to investigate the result from booting 
with "profile=2".  When running my program (shwon below) on 
2.6.14-rc4 to create 100 threads, and sending a core dump signal, 
the program takes 90 seconds to terminate, and readprofile shows 
the following:

# readprofile|sort -k3nr|head -10
 15504 _spin_unlock_irq                         484.5000
  3648 do_notify_resume                          60.8000
  3867 acpi_processor_idle                        5.8063
    63 resume_userspace                           2.6250
   630 do_signal                                  1.9688
    44 _spin_unlock_irqrestore                    1.3750
    21 work_notifysig                             0.8400
    17 fput                                       0.5312
   185 get_signal_to_deliver                      0.2141
    26 fget                                       0.1806

The relevant call chain seems to be:

  resume_userspace()
    work_notifysig()
      do_notify_resume()
        do_signal()
          get_signal_to_deliver()

And these lines in do_notify_resume() (which is called 
heavily, while do_signal() does not seem to be):

        /* deal with pending signal delivery */
        if (thread_info_flags & _TIF_SIGPENDING)
                do_signal(regs,oldset);

make me suspect that the issue could be something to do 
with _TIF_SIGPENDING being erroneously cleared.  And the 
fact that the behaviour appeared in 2.6.11, where you made 
some changes to the signal handling code makes me wonder if 
the following patch might have triggered this behaviour:

http://marc.theaimsgroup.com/?l=linux-kernel&m=110567323014110&w=2
Subject:    [PATCH] clear false pending signal indication in core dump
From:       Roland McGrath <roland () redhat ! com>
Date:       2005-01-14 3:05:59

This patch adds the following lines to do_coredump()

+	/*
+	 * Clear any false indication of pending signals that might
+	 * be seen by the filesystem code called to write the core file.
+	 */
+	current->signal->group_stop_count = 0;
+	clear_thread_flag(TIF_SIGPENDING);

However, I might be right off the track here, as I haven't 
completely understood the code.  Do you have any thoughts?

Cheers,

Michael


> Betreff: Process with many NPTL threads terminates slowly on core dump
> signal
> Datum: Thu, 22 Sep 2005 14:13:50 +0200 (MEST)
> 
> Roland, Ingo,
> 
> I'm guessing that one of you might best be able to determine the 
> cause of the behaviour I'm seeing below.
> 
> I wrote a program (below) to investigate the operation of the
> RLIMIT_CPU resource limit and I encountered a strangeness: if the
> program creates a large number of threads, then it takes a very long
> to terminate if it receives a signal that may cause a core dump.
> 
> I first noticed this happening on receipt of a SIGXCPU (since the
> program is designed to consume infinite CPU time).  However, I then
> determined that the behaviour occurs on receipt of any signal that
> can generate a core dump.
> 
> Here's an example run (Linux 2.6.14-rc2, x86, NPTL 2.3.4) -- the
> program is asked to set a soft RLIMIT_CPU of 5 seconds, and to
> create 20 threads:
> 
> ==========
> $ ulimit -c 0
> $ time ./thread_share_RLIMIT_CPU 5 20
> Linux tekapo 2.6.14-rc2 #6 SMP PREEMPT Wed Sep 21 09:29:36 CEST 2005 i686
> i686 i386 GNU/Linux
> Thu Sep 22 14:01:16 CEST 2005
> Main thread changed soft CPU resource limit to: 5
> Thread 3 cpu=0.21
> Thread 3 cpu=0.42
> Thread 3 cpu=0.63
> Thread 3 cpu=0.84
> Thread 3 cpu=1.05
> Thread 3 cpu=1.26
> [Here, I type ^\, generating SIGQUIT]
> Quit
> 
> real    0m13.690s
> user    0m1.388s
> sys     0m11.977s
> ==========
> 
> In the above, one can see a large amount of (system) CPU time 
> consumed.
> 
> In a similar run, sending SIGINT (^C) does not cause this long delay
> before process termination.  Some further experimentation determined
> the following:
> 
> * The slow process termination time only occurs for those signals that
>   may generate core dumps (verified by sending signals using kill(1)
>   from another terminal window).
> 
> * The slow startup time occurs even if RLIMIT_CORE is set to 0 (see
>   the example above above).
> 
> * If the example program is made set-user-UD-root (thus
>   mm->dumpable == 0), then the problem does not occur.
> 
> * I wondered if the fact that the threads were all trying to hog CPU
>   might have some bearing on the problem.  However, even if I make
>   the threads pause(), the problem is still observable.
> 
> All of this makes me think the problem might be somewhere around
> do_coredump()/soredump_wait(), but the problem is not immediately
> obvious to me.
> 
> Cheers,
> 
> Michael
> 
> 
> /* thread_share_RLIMIT_CPU.c
> 
>    Usage: thread_share_RLIMIT_CPU [CPU-soft-limit [num-threads]]
> */
> #include <sys/types.h>
> #include <sys/times.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <pthread.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <errno.h>
> 
> #define errExit(msg)            { perror(msg); exit(EXIT_FAILURE); }
> 
> #define errExitEN(en, msg)      { errno = en; perror(msg); \
>                                   exit(EXIT_FAILURE); }
> 
> static void *
> tfunc(void *x)
> {
>     struct tms tms;
>     int cpuCentisecs, prevCpu, j;
> 
>     usleep(20000);      /* Give main() a small chance to create other
>                            threads before we start consuming CPU time */
> 
>     // pause();
> 
>    /* Following consumes CPU time and prints messages
>        allowing us to track CPU consumption */
> 
>     prevCpu = 0;
>     for (;;) {
>         for (j = 0; j < 100000; j++)
>             continue;           /* So main loop consumes mostly
>                                    user-mode CPU time */
> 
>         if (times(&tms) == -1) errExit("times");
>         cpuCentisecs = (tms.tms_utime + tms.tms_stime) * 100 /
>                         sysconf(_SC_CLK_TCK);
>         if (cpuCentisecs > prevCpu + 20) {
>             printf("Thread %d cpu=%0.2f\n", (int) x,
>                     cpuCentisecs / 100.0);
>             prevCpu = cpuCentisecs;
>         }
>     }
>     return NULL;
> }
> 
> int
> main(int argc, char *argv[])
> {
>     pthread_t thr;
>     int s, numThreads, tn;
>     struct rlimit rlim;
> 
>     system("uname -a; date");
> 
>     if (getrlimit(RLIMIT_CPU, &rlim) == -1) errExit("getrlimit");
>     rlim.rlim_cur = (argc > 1) ? atoi(argv[1]) : 3;
>     if (setrlimit(RLIMIT_CPU, &rlim) == -1)
>         errExit("setrlimit-ORIG_LIMIT");
> 
>     if (getrlimit(RLIMIT_CPU, &rlim) == -1) errExit("getrlimit");
>     printf("Main thread changed soft CPU resource limit to: %ld\n",
>             (long) rlim.rlim_cur);
> 
>     numThreads = (argc > 2) ? atoi(argv[2]) : 2;
> 
>     for (tn = 0; tn < numThreads; tn++) {
>         s = pthread_create(&thr, NULL, tfunc, (void *) tn);
>         if (s != 0) errExitEN(s, "pthread_create");
>     } 
> 
>     pause();
> } /* main */

-- 
Lust, ein paar Euro nebenbei zu verdienen? Ohne Kosten, ohne Risiko!
Satte Provisionen für GMX Partner: http://www.gmx.net/de/go/partner

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Process with many NPTL threads terminates slowly on core dump signal
  2005-10-13  8:41 ` Michael Kerrisk
@ 2005-10-13 17:18   ` Oleg Nesterov
  0 siblings, 0 replies; 7+ messages in thread
From: Oleg Nesterov @ 2005-10-13 17:18 UTC (permalink / raw)
  To: Michael Kerrisk; +Cc: dan, roland, mingo, linux-kernel, michael.kerrisk

Michael Kerrisk wrote:
> 
> Thanks.  I've applied it to 2.6.14-rc4: this patch does fix the
> specific behaviour that my program demonstrates.
> 
> What remains to be solved?

The problem remains that all threads spin with TIF_SIGPENDING flag
after the coredumping thread sets SIGNAL_GROUP_EXIT in do_coredump()
and before it kills them in zap_threads(). If one of these threads is
SCHED_FIFO we have a deadlock. Also, the coredumping thread could be
preempted, placed in ->expired array ...

Oleg.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Process with many NPTL threads terminates slowly on core dump signal
  2005-10-12 17:33 Oleg Nesterov
@ 2005-10-13  8:41 ` Michael Kerrisk
  2005-10-13 17:18   ` Oleg Nesterov
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Kerrisk @ 2005-10-13  8:41 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: dan, roland, mingo, linux-kernel, michael.kerrisk

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="us-ascii", Size: 1323 bytes --]

> Von: Oleg Nesterov <oleg@tv-sign.ru>
> Datum: Wed, 12 Oct 2005 21:33:46 +0400
> 
> Michael Kerrisk wrote:
> >
> > Following up (belatedly) from my earlier message, I took Daniel 
> > Jacobowitz's suggestion to investigate the result from booting 
> > with "profile=2".  When running my program (shwon below) on 
> > 2.6.14-rc4 to create 100 threads, and sending a core dump signal, 
> > the program takes 90 seconds to terminate, and readprofile shows 
> > the following:
> 
> I think the coredumping code in __group_complete_signal() is bogus
> and what happens is:

[...]

> TIF_SIGPENDING is not cleared, so get_signal_to_deliver() will be
> called again on return to userspace. When all threads will eat their
> ->time_slice, P will return from yield() and kill all threads.

Thanks for investiagting this further.

> Could you try this patch (added to mm tree):
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=112887453531139
> ? It does not solve the whole problem, but may help.
>
> Please report the result, if possible.

Thanks.  I've applied it to 2.6.14-rc4: this patch does fix the 
specific behaviour that my program demonstrates.  

What remains to be solved?

Cheers,

Michael

-- 
10 GB Mailbox, 100 FreeSMS/Monat http://www.gmx.net/de/go/topmail
+++ GMX - die erste Adresse für Mail, Message, More +++

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Process with many NPTL threads terminates slowly on core dump signal
@ 2005-10-12 17:33 Oleg Nesterov
  2005-10-13  8:41 ` Michael Kerrisk
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Nesterov @ 2005-10-12 17:33 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Daniel Jacobowitz, Roland McGrath, Ingo Molnar, linux-kernel

Michael Kerrisk wrote:
>
> Following up (belatedly) from my earlier message, I took Daniel 
> Jacobowitz's suggestion to investigate the result from booting 
> with "profile=2".  When running my program (shwon below) on 
> 2.6.14-rc4 to create 100 threads, and sending a core dump signal, 
> the program takes 90 seconds to terminate, and readprofile shows 
> the following:

I think the coredumping code in __group_complete_signal() is bogus
and what happens is:

group_send_sig_info(P, SIGQUIT):

	adds SIGQUIT to ->shared_pending

	__group_complete_signal:

		->signal->group_exit_task = P;

		for_each_thread(t) {
			P->signal->group_stop_count ++;
			// sets TIF_SIGPENDING
			signal_wake_up(t)
		}
	

Now, P receives the signal:

get_signal_to_deliver:

	if (->signal->group_stop_count > 0) // YES
		handle_group_stop():
			if (->signal->group_exit_task == current) { // YES
				->signal->group_exit_task = NULL
				return 0;
			}

	signr = dequeue_signal(); // SIGQUIT

	do_coredump:

		->signal->flags = SIGNAL_GROUP_EXIT;

		coredump_wait:

			yield();


Now all other threads do:

get_signal_to_deliver:

	if (->signal->group_stop_count > 0) // YES
		handle_group_stop();
			if (->signal->flags & SIGNAL_GROUP_EXIT) // YES
				return 0;

	signr = dequeue_signal(); // no pending signals
		// recalc_sigpending_tsk() DOES NOT clear TIF_SIGPENDING,
		// because it sees ->group_stop_count != 0.
	
	return 0;

TIF_SIGPENDING is not cleared, so get_signal_to_deliver() will be
called again on return to userspace. When all threads will eat their
->time_slice, P will return from yield() and kill all threads.

Could you try this patch (added to mm tree):
	http://marc.theaimsgroup.com/?l=linux-kernel&m=112887453531139
? It does not solve the whole problem, but may help.

Please report the result, if possible.

Oleg.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-10-13 17:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-09-22 12:13 Process with many NPTL threads terminates slowly on core dump signal Michael Kerrisk
2005-09-22 13:10 ` Michael Kerrisk
2005-09-24 15:28 ` Daniel Jacobowitz
2005-10-12 13:34 ` Michael Kerrisk
2005-10-12 17:33 Oleg Nesterov
2005-10-13  8:41 ` Michael Kerrisk
2005-10-13 17:18   ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).