All of lore.kernel.org
 help / color / mirror / Atom feed
* clock_nanosleep() task_struct leak
@ 2013-02-01 13:39 Tommi Rantala
  2013-02-01 13:52 ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Tommi Rantala @ 2013-02-01 13:39 UTC (permalink / raw)
  To: Thomas Gleixner, LKML; +Cc: Dave Jones

Hello,

Trinity discovered a task_struct leak with clock_nanosleep(), reproducible with:

-----8<-----8<-----8<-----
#include <time.h>

static const struct timespec req;

int main(void) {
        return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID,
                        TIMER_ABSTIME, &req, NULL);
}
-----8<-----8<-----8<-----

Tommi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: clock_nanosleep() task_struct leak
  2013-02-01 13:39 clock_nanosleep() task_struct leak Tommi Rantala
@ 2013-02-01 13:52 ` Thomas Gleixner
  2013-02-04 19:32   ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2013-02-01 13:52 UTC (permalink / raw)
  To: Tommi Rantala; +Cc: LKML, Dave Jones, John Stultz, Oleg Nesterov

B1;2601;0cOn Fri, 1 Feb 2013, Tommi Rantala wrote:

> Hello,
> 
> Trinity discovered a task_struct leak with clock_nanosleep(), reproducible with:
> 
> -----8<-----8<-----8<-----
> #include <time.h>
> 
> static const struct timespec req;
> 
> int main(void) {
>         return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID,
>                         TIMER_ABSTIME, &req, NULL);
> }
> -----8<-----8<-----8<-----

Groan. That code probably has more than that leak burried inside.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: clock_nanosleep() task_struct leak
  2013-02-01 13:52 ` Thomas Gleixner
@ 2013-02-04 19:32   ` Oleg Nesterov
  2013-02-05 10:34     ` Stanislaw Gruszka
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2013-02-04 19:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Tommi Rantala, LKML, Dave Jones, John Stultz, Stanislaw Gruszka

On 02/01, Thomas Gleixner wrote:
>
> B1;2601;0cOn Fri, 1 Feb 2013, Tommi Rantala wrote:
>
> > Hello,
> >
> > Trinity discovered a task_struct leak with clock_nanosleep(), reproducible with:
> >
> > -----8<-----8<-----8<-----
> > #include <time.h>
> >
> > static const struct timespec req;
> >
> > int main(void) {
> >         return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID,
> >                         TIMER_ABSTIME, &req, NULL);
> > }
> > -----8<-----8<-----8<-----

posix_cpu_timer_create()->get_task_struct() I guess...

Cough. I am not sure I ever understood this code, but now it certainly
looks as if I never saw it before.

All I can say is: I'll try to take another look tomorrow :/

> Groan. That code probably has more than that leak burried inside.

Oh. yes... but if you meant something connected to this particular
"test-case" please tell us ;)

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: clock_nanosleep() task_struct leak
  2013-02-04 19:32   ` Oleg Nesterov
@ 2013-02-05 10:34     ` Stanislaw Gruszka
  2013-02-05 10:55       ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Stanislaw Gruszka @ 2013-02-05 10:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Thomas Gleixner, Tommi Rantala, LKML, Dave Jones, John Stultz

On Mon, Feb 04, 2013 at 08:32:23PM +0100, Oleg Nesterov wrote:
> On 02/01, Thomas Gleixner wrote:
> >
> > B1;2601;0cOn Fri, 1 Feb 2013, Tommi Rantala wrote:
> >
> > > Hello,
> > >
> > > Trinity discovered a task_struct leak with clock_nanosleep(), reproducible with:
> > >
> > > -----8<-----8<-----8<-----
> > > #include <time.h>
> > >
> > > static const struct timespec req;
> > >
> > > int main(void) {
> > >         return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID,
> > >                         TIMER_ABSTIME, &req, NULL);
> > > }
> > > -----8<-----8<-----8<-----
> 
> posix_cpu_timer_create()->get_task_struct() I guess...
> 
> Cough. I am not sure I ever understood this code, but now it certainly
> looks as if I never saw it before.

Looks on do_cpu_nanosleep() we call posix_cpu_timer_create(), but we do
not call posix_cpu_timer_del() at the end. Fix will not be super simple,
since we need to care about error cases. I can cook a patch if nobody
else want to do this.

Stanislaw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: clock_nanosleep() task_struct leak
  2013-02-05 10:34     ` Stanislaw Gruszka
@ 2013-02-05 10:55       ` Thomas Gleixner
  2013-02-06 11:23         ` Stanislaw Gruszka
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2013-02-05 10:55 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Oleg Nesterov, Tommi Rantala, LKML, Dave Jones, John Stultz

On Tue, 5 Feb 2013, Stanislaw Gruszka wrote:
> On Mon, Feb 04, 2013 at 08:32:23PM +0100, Oleg Nesterov wrote:
> > On 02/01, Thomas Gleixner wrote:
> > >
> > > B1;2601;0cOn Fri, 1 Feb 2013, Tommi Rantala wrote:
> > >
> > > > Hello,
> > > >
> > > > Trinity discovered a task_struct leak with clock_nanosleep(), reproducible with:
> > > >
> > > > -----8<-----8<-----8<-----
> > > > #include <time.h>
> > > >
> > > > static const struct timespec req;
> > > >
> > > > int main(void) {
> > > >         return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID,
> > > >                         TIMER_ABSTIME, &req, NULL);
> > > > }
> > > > -----8<-----8<-----8<-----
> > 
> > posix_cpu_timer_create()->get_task_struct() I guess...
> > 
> > Cough. I am not sure I ever understood this code, but now it certainly
> > looks as if I never saw it before.
> 
> Looks on do_cpu_nanosleep() we call posix_cpu_timer_create(), but we do
> not call posix_cpu_timer_del() at the end. Fix will not be super simple,
> since we need to care about error cases. I can cook a patch if nobody
> else want to do this.

Would be much appreciated!

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: clock_nanosleep() task_struct leak
  2013-02-05 10:55       ` Thomas Gleixner
@ 2013-02-06 11:23         ` Stanislaw Gruszka
  2013-02-06 12:01           ` Tommi Rantala
  0 siblings, 1 reply; 13+ messages in thread
From: Stanislaw Gruszka @ 2013-02-06 11:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Oleg Nesterov, Tommi Rantala, LKML, Dave Jones, John Stultz

On Tue, Feb 05, 2013 at 11:55:19AM +0100, Thomas Gleixner wrote:
> On Tue, 5 Feb 2013, Stanislaw Gruszka wrote:
> > On Mon, Feb 04, 2013 at 08:32:23PM +0100, Oleg Nesterov wrote:
> > > On 02/01, Thomas Gleixner wrote:
> > > >
> > > > B1;2601;0cOn Fri, 1 Feb 2013, Tommi Rantala wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > Trinity discovered a task_struct leak with clock_nanosleep(), reproducible with:
> > > > >
> > > > > -----8<-----8<-----8<-----
> > > > > #include <time.h>
> > > > >
> > > > > static const struct timespec req;
> > > > >
> > > > > int main(void) {
> > > > >         return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID,
> > > > >                         TIMER_ABSTIME, &req, NULL);
> > > > > }
> > > > > -----8<-----8<-----8<-----
> > > 
> > > posix_cpu_timer_create()->get_task_struct() I guess...
> > > 
> > > Cough. I am not sure I ever understood this code, but now it certainly
> > > looks as if I never saw it before.
> > 
> > Looks on do_cpu_nanosleep() we call posix_cpu_timer_create(), but we do
> > not call posix_cpu_timer_del() at the end. Fix will not be super simple,
> > since we need to care about error cases. I can cook a patch if nobody
> > else want to do this.
> 
> Would be much appreciated!

Below is proposed fix. Error cases wasn't that bad since there are
various limitations when timer could be fired (i.e. timer which
already fired can not be fired again).

Tommi, please check if patch really fixes the problem. I tested it
with signal interrupt and timeout scenarios, but I don't know how
to confirm if it fix the leak or not.

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 125cb67..07a38b6 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -1424,6 +1424,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 				/*
 				 * Our timer fired and was reset.
 				 */
+				posix_cpu_timer_del(&timer);
 				spin_unlock_irq(&timer.it_lock);
 				return 0;
 			}
@@ -1441,9 +1442,17 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 		 * We were interrupted by a signal.
 		 */
 		sample_to_timespec(which_clock, timer.it.cpu.expires, rqtp);
-		posix_cpu_timer_set(&timer, 0, &zero_it, it);
+		error = posix_cpu_timer_set(&timer, 0, &zero_it, it);
+		if (!error)
+			posix_cpu_timer_del(&timer);
 		spin_unlock_irq(&timer.it_lock);
 
+		while (error == TIMER_RETRY) {
+			spin_lock_irq(&timer.it_lock);
+			error = posix_cpu_timer_del(&timer);
+			spin_unlock_irq(&timer.it_lock);
+		}
+
 		if ((it->it_value.tv_sec | it->it_value.tv_nsec) == 0) {
 			/*
 			 * It actually did fire already.

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: clock_nanosleep() task_struct leak
  2013-02-06 11:23         ` Stanislaw Gruszka
@ 2013-02-06 12:01           ` Tommi Rantala
  2013-02-06 15:15             ` [PATCH] posix-cpu-timers: fix nanosleep " Stanislaw Gruszka
  0 siblings, 1 reply; 13+ messages in thread
From: Tommi Rantala @ 2013-02-06 12:01 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Thomas Gleixner, Oleg Nesterov, LKML, Dave Jones, John Stultz

2013/2/6 Stanislaw Gruszka <sgruszka@redhat.com>:
> Below is proposed fix. Error cases wasn't that bad since there are
> various limitations when timer could be fired (i.e. timer which
> already fired can not be fired again).
>
> Tommi, please check if patch really fixes the problem. I tested it
> with signal interrupt and timeout scenarios, but I don't know how
> to confirm if it fix the leak or not.

Hi, looks good, this patch fixes the leaks I'm seeing.

Without the patch, running the program from my earlier mail shows
task_struct count growing:

{ttrantal@arkki ~}> uname -r
3.8.0-rc6+
{ttrantal@arkki ~}> sudo grep task_struct /proc/slabinfo
task_struct           93    123   8880    3    8 : tunables    0    0
  0 : slabdata     41     41      0
{ttrantal@arkki ~}> for i in `seq 1000` ; do ./leak ; done
{ttrantal@arkki ~}> sudo grep task_struct /proc/slabinfo
task_struct         1089   1089   8880    3    8 : tunables    0    0
  0 : slabdata    363    363      0
{ttrantal@arkki ~}> for i in `seq 1000` ; do ./leak ; done
{ttrantal@arkki ~}> sudo grep task_struct /proc/slabinfo
task_struct         2088   2088   8880    3    8 : tunables    0    0
  0 : slabdata    696    696      0
{ttrantal@arkki ~}>

With the patch applied, the leak is gone:

{ttrantal@arkki ~}> sudo grep task_struct /proc/slabinfo
task_struct           92    108   8880    3    8 : tunables    0    0
  0 : slabdata     36     36      0
{ttrantal@arkki ~}> for i in `seq 1000` ; do ./leak ; done
{ttrantal@arkki ~}> sudo grep task_struct /proc/slabinfo
task_struct           92    108   8880    3    8 : tunables    0    0
  0 : slabdata     36     36      0
{ttrantal@arkki ~}> for i in `seq 1000` ; do ./leak ; done
{ttrantal@arkki ~}> sudo grep task_struct /proc/slabinfo
task_struct           92    108   8880    3    8 : tunables    0    0
  0 : slabdata     36     36      0
{ttrantal@arkki ~}>

Running Trinity with kmemleak enabled also resulted to a lot of
detected leaks, which are all gone now based on a quick run.

> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 125cb67..07a38b6 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -1424,6 +1424,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
>                                 /*
>                                  * Our timer fired and was reset.
>                                  */
> +                               posix_cpu_timer_del(&timer);
>                                 spin_unlock_irq(&timer.it_lock);
>                                 return 0;
>                         }
> @@ -1441,9 +1442,17 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
>                  * We were interrupted by a signal.
>                  */
>                 sample_to_timespec(which_clock, timer.it.cpu.expires, rqtp);
> -               posix_cpu_timer_set(&timer, 0, &zero_it, it);
> +               error = posix_cpu_timer_set(&timer, 0, &zero_it, it);
> +               if (!error)
> +                       posix_cpu_timer_del(&timer);
>                 spin_unlock_irq(&timer.it_lock);
>
> +               while (error == TIMER_RETRY) {
> +                       spin_lock_irq(&timer.it_lock);
> +                       error = posix_cpu_timer_del(&timer);
> +                       spin_unlock_irq(&timer.it_lock);
> +               }
> +
>                 if ((it->it_value.tv_sec | it->it_value.tv_nsec) == 0) {
>                         /*
>                          * It actually did fire already.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] posix-cpu-timers: fix nanosleep task_struct leak
  2013-02-06 12:01           ` Tommi Rantala
@ 2013-02-06 15:15             ` Stanislaw Gruszka
  2013-02-06 16:10               ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Stanislaw Gruszka @ 2013-02-06 15:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Oleg Nesterov, LKML, Dave Jones, John Stultz, Tommi Rantala

In do_cpu_nanosleep() we do posix_cpu_timer_create(), but forgot
corresponding posix_cpu_timer_del(), what lead to task_struct leak.

Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 kernel/posix-cpu-timers.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index a278cad..a01caac 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -1403,6 +1403,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 				/*
 				 * Our timer fired and was reset.
 				 */
+				posix_cpu_timer_del(&timer);
 				spin_unlock_irq(&timer.it_lock);
 				return 0;
 			}
@@ -1420,9 +1421,17 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 		 * We were interrupted by a signal.
 		 */
 		sample_to_timespec(which_clock, timer.it.cpu.expires, rqtp);
-		posix_cpu_timer_set(&timer, 0, &zero_it, it);
+		error = posix_cpu_timer_set(&timer, 0, &zero_it, it);
+		if (!error)
+			posix_cpu_timer_del(&timer);
 		spin_unlock_irq(&timer.it_lock);
 
+		while (error == TIMER_RETRY) {
+			spin_lock_irq(&timer.it_lock);
+			error = posix_cpu_timer_del(&timer);
+			spin_unlock_irq(&timer.it_lock);
+		}
+
 		if ((it->it_value.tv_sec | it->it_value.tv_nsec) == 0) {
 			/*
 			 * It actually did fire already.
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] posix-cpu-timers: fix nanosleep task_struct leak
  2013-02-06 15:15             ` [PATCH] posix-cpu-timers: fix nanosleep " Stanislaw Gruszka
@ 2013-02-06 16:10               ` Oleg Nesterov
  2013-02-07 12:22                 ` Stanislaw Gruszka
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2013-02-06 16:10 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Thomas Gleixner, LKML, Dave Jones, John Stultz, Tommi Rantala

Stanislaw,

First of all, thank you so much. I knew it was a good idea to cc you ;)

And let me repeat that I forgot everything about this code.

On 02/06, Stanislaw Gruszka wrote:
>
> In do_cpu_nanosleep() we do posix_cpu_timer_create(), but forgot
> corresponding posix_cpu_timer_del(), what lead to task_struct leak.

Plus, it seems we can leave the timer on ->cpu_timers list...

> @@ -1403,6 +1403,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
>  				/*
>  				 * Our timer fired and was reset.
>  				 */
> +				posix_cpu_timer_del(&timer);
>  				spin_unlock_irq(&timer.it_lock);
>  				return 0;
>  			}
> @@ -1420,9 +1421,17 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
>  		 * We were interrupted by a signal.
>  		 */
>  		sample_to_timespec(which_clock, timer.it.cpu.expires, rqtp);
> -		posix_cpu_timer_set(&timer, 0, &zero_it, it);
> +		error = posix_cpu_timer_set(&timer, 0, &zero_it, it);
> +		if (!error)
> +			posix_cpu_timer_del(&timer);
>  		spin_unlock_irq(&timer.it_lock);
>
> +		while (error == TIMER_RETRY) {
> +			spin_lock_irq(&timer.it_lock);
> +			error = posix_cpu_timer_del(&timer);

It is not clear to me why other posix_cpu_timer_del's above can't fail..
May be you can add a comment.

And I am not sure that TIMER_RETRY is the only error we should worry.
And perhaps we need even more posix_cpu_timer_del's?

For example. Suppose that posix_cpu_timer_create() succeeds and does
get_task_struct(p). But than p dies, and the first posix_cpu_timer_set()
fails with -ESRCH. No?

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] posix-cpu-timers: fix nanosleep task_struct leak
  2013-02-06 16:10               ` Oleg Nesterov
@ 2013-02-07 12:22                 ` Stanislaw Gruszka
  2013-02-07 13:24                   ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Stanislaw Gruszka @ 2013-02-07 12:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Thomas Gleixner, LKML, Dave Jones, John Stultz, Tommi Rantala

On Wed, Feb 06, 2013 at 05:10:11PM +0100, Oleg Nesterov wrote:
> First of all, thank you so much. I knew it was a good idea to cc you ;)

:-)

> On 02/06, Stanislaw Gruszka wrote:
> >
> > In do_cpu_nanosleep() we do posix_cpu_timer_create(), but forgot
> > corresponding posix_cpu_timer_del(), what lead to task_struct leak.
> 
> Plus, it seems we can leave the timer on ->cpu_timers list...
> 
> > @@ -1403,6 +1403,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
> >  				/*
> >  				 * Our timer fired and was reset.
> >  				 */
> > +				posix_cpu_timer_del(&timer);
> >  				spin_unlock_irq(&timer.it_lock);
> >  				return 0;
> >  			}
> > @@ -1420,9 +1421,17 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
> >  		 * We were interrupted by a signal.
> >  		 */
> >  		sample_to_timespec(which_clock, timer.it.cpu.expires, rqtp);
> > -		posix_cpu_timer_set(&timer, 0, &zero_it, it);
> > +		error = posix_cpu_timer_set(&timer, 0, &zero_it, it);
> > +		if (!error)
> > +			posix_cpu_timer_del(&timer);
> >  		spin_unlock_irq(&timer.it_lock);
> >
> > +		while (error == TIMER_RETRY) {
> > +			spin_lock_irq(&timer.it_lock);
> > +			error = posix_cpu_timer_del(&timer);
> 
> It is not clear to me why other posix_cpu_timer_del's above can't fail..
> May be you can add a comment.

Sure, I'll add more comments.

Once posix_cpu_timer_set(..., &zero_it, it) succeed with 0 return value,
it's not possible to fire timer, so posix_cpu_timer_del() will not fail.
Similar assumption is with first posix_cpu_timer_del() call I added
in the patch.

> And I am not sure that TIMER_RETRY is the only error we should worry.
> And perhaps we need even more posix_cpu_timer_del's?
>
> For example. Suppose that posix_cpu_timer_create() succeeds and does
> get_task_struct(p). But than p dies, and the first posix_cpu_timer_set()
> fails with -ESRCH. No?

On second -ESRCH case posix_cpu_timer_set() internally call
put_task_struct(). It does not remove from cpu_timers list, but
that is done at exit(). First -ESRCH case, i.e. calling
posix_cpu_timer_set() with timer->it.cpu.task == NULL, is not possible
in our case.

BTW: I don't think we handle correctly case when traced process -
- timer->it.cpu.task will die. Tracing process - timer->it_process will
probably not be woken up.

Stanislaw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] posix-cpu-timers: fix nanosleep task_struct leak
  2013-02-07 12:22                 ` Stanislaw Gruszka
@ 2013-02-07 13:24                   ` Oleg Nesterov
  2013-02-07 16:04                     ` [PATCH v2] " Stanislaw Gruszka
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2013-02-07 13:24 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Thomas Gleixner, LKML, Dave Jones, John Stultz, Tommi Rantala

On 02/07, Stanislaw Gruszka wrote:
>
> On Wed, Feb 06, 2013 at 05:10:11PM +0100, Oleg Nesterov wrote:
> >
> > It is not clear to me why other posix_cpu_timer_del's above can't fail..
> > May be you can add a comment.
>
> Sure, I'll add more comments.

Yes, please. This will help the next reader to understand that no, we
do not forget to check the error code, just we do not need to do this.

> Once posix_cpu_timer_set(..., &zero_it, it) succeed with 0 return value,
> it's not possible to fire timer, so posix_cpu_timer_del() will not fail.
> Similar assumption is with first posix_cpu_timer_del() call I added
> in the patch.

OK, I see.

> > And I am not sure that TIMER_RETRY is the only error we should worry.
> > And perhaps we need even more posix_cpu_timer_del's?
> >
> > For example. Suppose that posix_cpu_timer_create() succeeds and does
> > get_task_struct(p). But than p dies, and the first posix_cpu_timer_set()
> > fails with -ESRCH. No?
>
> On second -ESRCH case posix_cpu_timer_set() internally call
> put_task_struct().

Ah, missed that, thanks.

> It does not remove from cpu_timers list, but
> that is done at exit().

Yes, posix_cpu_timers_exit().

OK, thanks for your explanations, I think the patch is fine.


> BTW: I don't think we handle correctly case when traced process -
> - timer->it.cpu.task will die. Tracing process - timer->it_process will
> probably not be woken up.

Probably... or perhaps we can treat this case as "timer never expires".

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] posix-cpu-timers: fix nanosleep task_struct leak
  2013-02-07 13:24                   ` Oleg Nesterov
@ 2013-02-07 16:04                     ` Stanislaw Gruszka
  2013-02-07 18:37                       ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Stanislaw Gruszka @ 2013-02-07 16:04 UTC (permalink / raw)
  To: Oleg Nesterov, Thomas Gleixner
  Cc: LKML, Dave Jones, John Stultz, Tommi Rantala

In do_cpu_nanosleep() we do posic_cpu_timer_create(), but forgot
corresponding posix_cpu_timer_del() what lead to task_struct leak.

Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
v1 -> v2: add comments

 kernel/posix-cpu-timers.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index a278cad..942ca27 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -1401,8 +1401,10 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 		while (!signal_pending(current)) {
 			if (timer.it.cpu.expires.sched == 0) {
 				/*
-				 * Our timer fired and was reset.
+				 * Our timer fired and was reset, below
+				 * deletion can not fail.
 				 */
+				posix_cpu_timer_del(&timer);
 				spin_unlock_irq(&timer.it_lock);
 				return 0;
 			}
@@ -1420,9 +1422,26 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 		 * We were interrupted by a signal.
 		 */
 		sample_to_timespec(which_clock, timer.it.cpu.expires, rqtp);
-		posix_cpu_timer_set(&timer, 0, &zero_it, it);
+		error = posix_cpu_timer_set(&timer, 0, &zero_it, it);
+		if (!error) {
+			/*
+			 * Timer is now unarmed, deletion can not fail.
+			 */
+			posix_cpu_timer_del(&timer);
+		}
 		spin_unlock_irq(&timer.it_lock);
 
+		while (error == TIMER_RETRY) {
+			/*
+			 * We need to handle case when timer was or is in the
+			 * middle of firing. In other cases we already freed
+			 * resources.
+			 */
+			spin_lock_irq(&timer.it_lock);
+			error = posix_cpu_timer_del(&timer);
+			spin_unlock_irq(&timer.it_lock);
+		}
+
 		if ((it->it_value.tv_sec | it->it_value.tv_nsec) == 0) {
 			/*
 			 * It actually did fire already.
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] posix-cpu-timers: fix nanosleep task_struct leak
  2013-02-07 16:04                     ` [PATCH v2] " Stanislaw Gruszka
@ 2013-02-07 18:37                       ` Oleg Nesterov
  0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2013-02-07 18:37 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Thomas Gleixner, LKML, Dave Jones, John Stultz, Tommi Rantala

Thanks Stanislaw.

I think this patch is fine.

On 02/07, Stanislaw Gruszka wrote:
>
> In do_cpu_nanosleep() we do posic_cpu_timer_create(), but forgot
> corresponding posix_cpu_timer_del() what lead to task_struct leak.
> 
> Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> ---
> v1 -> v2: add comments
> 
>  kernel/posix-cpu-timers.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index a278cad..942ca27 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -1401,8 +1401,10 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
>  		while (!signal_pending(current)) {
>  			if (timer.it.cpu.expires.sched == 0) {
>  				/*
> -				 * Our timer fired and was reset.
> +				 * Our timer fired and was reset, below
> +				 * deletion can not fail.
>  				 */
> +				posix_cpu_timer_del(&timer);
>  				spin_unlock_irq(&timer.it_lock);
>  				return 0;
>  			}
> @@ -1420,9 +1422,26 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
>  		 * We were interrupted by a signal.
>  		 */
>  		sample_to_timespec(which_clock, timer.it.cpu.expires, rqtp);
> -		posix_cpu_timer_set(&timer, 0, &zero_it, it);
> +		error = posix_cpu_timer_set(&timer, 0, &zero_it, it);
> +		if (!error) {
> +			/*
> +			 * Timer is now unarmed, deletion can not fail.
> +			 */
> +			posix_cpu_timer_del(&timer);
> +		}
>  		spin_unlock_irq(&timer.it_lock);
>  
> +		while (error == TIMER_RETRY) {
> +			/*
> +			 * We need to handle case when timer was or is in the
> +			 * middle of firing. In other cases we already freed
> +			 * resources.
> +			 */
> +			spin_lock_irq(&timer.it_lock);
> +			error = posix_cpu_timer_del(&timer);
> +			spin_unlock_irq(&timer.it_lock);
> +		}
> +
>  		if ((it->it_value.tv_sec | it->it_value.tv_nsec) == 0) {
>  			/*
>  			 * It actually did fire already.
> -- 
> 1.7.11.7
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-02-07 18:38 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-01 13:39 clock_nanosleep() task_struct leak Tommi Rantala
2013-02-01 13:52 ` Thomas Gleixner
2013-02-04 19:32   ` Oleg Nesterov
2013-02-05 10:34     ` Stanislaw Gruszka
2013-02-05 10:55       ` Thomas Gleixner
2013-02-06 11:23         ` Stanislaw Gruszka
2013-02-06 12:01           ` Tommi Rantala
2013-02-06 15:15             ` [PATCH] posix-cpu-timers: fix nanosleep " Stanislaw Gruszka
2013-02-06 16:10               ` Oleg Nesterov
2013-02-07 12:22                 ` Stanislaw Gruszka
2013-02-07 13:24                   ` Oleg Nesterov
2013-02-07 16:04                     ` [PATCH v2] " Stanislaw Gruszka
2013-02-07 18:37                       ` Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.