All of lore.kernel.org
 help / color / mirror / Atom feed
* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 15:19 ` Mark Brown
  0 siblings, 0 replies; 66+ messages in thread
From: Mark Brown @ 2011-08-23 15:19 UTC (permalink / raw)
  To: Tejun Heo, Rafael J. Wysocki; +Cc: linux-kernel, linux-arm-kernel

The recent series of commits reworking the freezer appear to have
caused serious issues on ARM.  The kernel constantly complains that
try_to_freeze() is bring called with interrupts disabled:

[   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
[   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
[   75.380000] no locks held by Xorg/1517.
[   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
[<c0464400>] (dump_stack+0x20/0x24)
[   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
(__might_sleep+0xfc/0x11c)
[   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
(do_signal+0x94/0x230)
[   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
(do_notify_resume+0x28/0x6c)
[   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
[<c000eaf8>] (work_pending+0x24/0x28)

and the boot runs very slowly.  Reverting the series merged in 56f0be
appears to resolve the issue, though looking at the changes I'd expect
there's some underlying bug here that just doesn't trigger very often.
I don't really have the bandwidth to understand what's gone wrong right
now but should be able to run tests if you've got anything you'd like
looking atl.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 15:19 ` Mark Brown
  0 siblings, 0 replies; 66+ messages in thread
From: Mark Brown @ 2011-08-23 15:19 UTC (permalink / raw)
  To: linux-arm-kernel

The recent series of commits reworking the freezer appear to have
caused serious issues on ARM.  The kernel constantly complains that
try_to_freeze() is bring called with interrupts disabled:

[   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
[   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
[   75.380000] no locks held by Xorg/1517.
[   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
[<c0464400>] (dump_stack+0x20/0x24)
[   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
(__might_sleep+0xfc/0x11c)
[   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
(do_signal+0x94/0x230)
[   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
(do_notify_resume+0x28/0x6c)
[   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
[<c000eaf8>] (work_pending+0x24/0x28)

and the boot runs very slowly.  Reverting the series merged in 56f0be
appears to resolve the issue, though looking at the changes I'd expect
there's some underlying bug here that just doesn't trigger very often.
I don't really have the bandwidth to understand what's gone wrong right
now but should be able to run tests if you've got anything you'd like
looking atl.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 15:19 ` Mark Brown
@ 2011-08-23 15:43   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 15:43 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tejun Heo, Rafael J. Wysocki, linux-kernel, linux-arm-kernel

On Tue, Aug 23, 2011 at 04:19:36PM +0100, Mark Brown wrote:
> The recent series of commits reworking the freezer appear to have
> caused serious issues on ARM.  The kernel constantly complains that
> try_to_freeze() is bring called with interrupts disabled:
> 
> [   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
> [   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
> [   75.380000] no locks held by Xorg/1517.
> [   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
> [<c0464400>] (dump_stack+0x20/0x24)
> [   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
> (__might_sleep+0xfc/0x11c)
> [   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
> (do_signal+0x94/0x230)
> [   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
> (do_notify_resume+0x28/0x6c)
> [   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
> [<c000eaf8>] (work_pending+0x24/0x28)
> 
> and the boot runs very slowly.  Reverting the series merged in 56f0be
> appears to resolve the issue, though looking at the changes I'd expect
> there's some underlying bug here that just doesn't trigger very often.
> I don't really have the bandwidth to understand what's gone wrong right
> now but should be able to run tests if you've got anything you'd like
> looking atl.

The signal handling is rather yucky, and it seems it needs to run with
IRQs enabled - but we run it with IRQs disabled.

There's horrible issues in there.  One such issue is the syscall restart
handling.  Here's an example:

	sys_ppoll
	...
	receives signal
	returns ERESTARTNOHAND (restart syscall if no handler)

On the exit path, we notice that we have TIF_SIGPENDING set.  So we call
do_notify_resume() and eventually do_signal().  At this point, 'syscall'
is set.

The first thing do_signal() does is fiddle with the stack to fake a
restart.  If we don't gain a signal here, everything is fine and we
return.  So far so good.

Lets say for the sake of argument that we've run do_signal() with IRQs
enabled.  Now, a scheduling event has come along and set TIF_NEED_RESCHED.

So, having returned from do_notify_resume(), we disable interrupts, reload
the work mask, and notice that TIF_NEED_RESCHED is set.  We now switch
away from this thread and do something else.

While something else is running, lets say a SIGHUP is sent to our process,
and there's a handler installed.

When we switch back, we again disable interrupts, reload the work mask,
and notice this time that TIF_SIGPENDING is again set.  So, just like
last time, we call do_notify_resume() and eventually do_signal().  This
time, syscall is false.

So we obtain the signal, save the current state, and set userspace up to
call the handler.

The state which we've saved though is the state required for the syscall
restart - but that's wrong, because the return code said 'restart if
no handler' and we're now invoking a handler.

What should happen is that the syscall is not restarted, and -EINTR is
returned instead.

What has this got to do with the issue you raise?  Having interrupts
off during signal handling helps to avoid this problem by ensuring that
we can't get resched events if there's a signal with no handler present.
Enabling interrupts for do_signal opens that race up.

Now, having interrupts disabled for writing to the processes stack is
technically bad news (although in the last 18 or so years its never
caused a problem.)  That said, we really should enable interrupts
before calling handle_signal() - which should be safe from the above
race.

But... without serious amounts of rework of the signal handling code
to avoid the above race I don't see how we can allow try_to_freeze()
to run with IRQs enabled and still have race-free syscall restarting.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 15:43   ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 15:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 23, 2011 at 04:19:36PM +0100, Mark Brown wrote:
> The recent series of commits reworking the freezer appear to have
> caused serious issues on ARM.  The kernel constantly complains that
> try_to_freeze() is bring called with interrupts disabled:
> 
> [   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
> [   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
> [   75.380000] no locks held by Xorg/1517.
> [   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
> [<c0464400>] (dump_stack+0x20/0x24)
> [   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
> (__might_sleep+0xfc/0x11c)
> [   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
> (do_signal+0x94/0x230)
> [   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
> (do_notify_resume+0x28/0x6c)
> [   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
> [<c000eaf8>] (work_pending+0x24/0x28)
> 
> and the boot runs very slowly.  Reverting the series merged in 56f0be
> appears to resolve the issue, though looking at the changes I'd expect
> there's some underlying bug here that just doesn't trigger very often.
> I don't really have the bandwidth to understand what's gone wrong right
> now but should be able to run tests if you've got anything you'd like
> looking atl.

The signal handling is rather yucky, and it seems it needs to run with
IRQs enabled - but we run it with IRQs disabled.

There's horrible issues in there.  One such issue is the syscall restart
handling.  Here's an example:

	sys_ppoll
	...
	receives signal
	returns ERESTARTNOHAND (restart syscall if no handler)

On the exit path, we notice that we have TIF_SIGPENDING set.  So we call
do_notify_resume() and eventually do_signal().  At this point, 'syscall'
is set.

The first thing do_signal() does is fiddle with the stack to fake a
restart.  If we don't gain a signal here, everything is fine and we
return.  So far so good.

Lets say for the sake of argument that we've run do_signal() with IRQs
enabled.  Now, a scheduling event has come along and set TIF_NEED_RESCHED.

So, having returned from do_notify_resume(), we disable interrupts, reload
the work mask, and notice that TIF_NEED_RESCHED is set.  We now switch
away from this thread and do something else.

While something else is running, lets say a SIGHUP is sent to our process,
and there's a handler installed.

When we switch back, we again disable interrupts, reload the work mask,
and notice this time that TIF_SIGPENDING is again set.  So, just like
last time, we call do_notify_resume() and eventually do_signal().  This
time, syscall is false.

So we obtain the signal, save the current state, and set userspace up to
call the handler.

The state which we've saved though is the state required for the syscall
restart - but that's wrong, because the return code said 'restart if
no handler' and we're now invoking a handler.

What should happen is that the syscall is not restarted, and -EINTR is
returned instead.

What has this got to do with the issue you raise?  Having interrupts
off during signal handling helps to avoid this problem by ensuring that
we can't get resched events if there's a signal with no handler present.
Enabling interrupts for do_signal opens that race up.

Now, having interrupts disabled for writing to the processes stack is
technically bad news (although in the last 18 or so years its never
caused a problem.)  That said, we really should enable interrupts
before calling handle_signal() - which should be safe from the above
race.

But... without serious amounts of rework of the signal handling code
to avoid the above race I don't see how we can allow try_to_freeze()
to run with IRQs enabled and still have race-free syscall restarting.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 15:19 ` Mark Brown
@ 2011-08-23 21:51   ` Rafael J. Wysocki
  -1 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-23 21:51 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tejun Heo, linux-kernel, linux-arm-kernel

On Tuesday, August 23, 2011, Mark Brown wrote:
> The recent series of commits reworking the freezer appear to have
> caused serious issues on ARM.  The kernel constantly complains that
> try_to_freeze() is bring called with interrupts disabled:
> 
> [   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
> [   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
> [   75.380000] no locks held by Xorg/1517.
> [   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
> [<c0464400>] (dump_stack+0x20/0x24)
> [   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
> (__might_sleep+0xfc/0x11c)
> [   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
> (do_signal+0x94/0x230)
> [   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
> (do_notify_resume+0x28/0x6c)
> [   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
> [<c000eaf8>] (work_pending+0x24/0x28)
> 
> and the boot runs very slowly.  Reverting the series merged in 56f0be
> appears to resolve the issue,

In fact, the patch from:

https://patchwork.kernel.org/patch/1083602/

is sufficient to make the calltrace go away.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 21:51   ` Rafael J. Wysocki
  0 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-23 21:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday, August 23, 2011, Mark Brown wrote:
> The recent series of commits reworking the freezer appear to have
> caused serious issues on ARM.  The kernel constantly complains that
> try_to_freeze() is bring called with interrupts disabled:
> 
> [   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
> [   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
> [   75.380000] no locks held by Xorg/1517.
> [   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
> [<c0464400>] (dump_stack+0x20/0x24)
> [   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
> (__might_sleep+0xfc/0x11c)
> [   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
> (do_signal+0x94/0x230)
> [   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
> (do_notify_resume+0x28/0x6c)
> [   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
> [<c000eaf8>] (work_pending+0x24/0x28)
> 
> and the boot runs very slowly.  Reverting the series merged in 56f0be
> appears to resolve the issue,

In fact, the patch from:

https://patchwork.kernel.org/patch/1083602/

is sufficient to make the calltrace go away.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 21:51   ` Rafael J. Wysocki
@ 2011-08-23 21:53     ` Tejun Heo
  -1 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 21:53 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Mark Brown, linux-kernel, linux-arm-kernel

Hello,

On Tue, Aug 23, 2011 at 11:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> and the boot runs very slowly.  Reverting the series merged in 56f0be
>> appears to resolve the issue,
>
> In fact, the patch from:
>
> https://patchwork.kernel.org/patch/1083602/
>
> is sufficient to make the calltrace go away.

Yes, that's true but the added might_sleep() discovered an actual bug
here, so I still think it's better to keep it there.

If proper fix is difficult, I think doing something like if
(!in_atomic()) try_to_freeze() with big fat warning explaining what's
broken and how it should be fixed should do it for now if the proper
fix is gonna take some time. That way, we document what's broken where
it's broken and get to keep the useful debugging annotation.

Thank you.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 21:53     ` Tejun Heo
  0 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 21:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Tue, Aug 23, 2011 at 11:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> and the boot runs very slowly. ?Reverting the series merged in 56f0be
>> appears to resolve the issue,
>
> In fact, the patch from:
>
> https://patchwork.kernel.org/patch/1083602/
>
> is sufficient to make the calltrace go away.

Yes, that's true but the added might_sleep() discovered an actual bug
here, so I still think it's better to keep it there.

If proper fix is difficult, I think doing something like if
(!in_atomic()) try_to_freeze() with big fat warning explaining what's
broken and how it should be fixed should do it for now if the proper
fix is gonna take some time. That way, we document what's broken where
it's broken and get to keep the useful debugging annotation.

Thank you.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 21:53     ` Tejun Heo
@ 2011-08-23 22:00       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 22:00 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Tue, Aug 23, 2011 at 11:53:42PM +0200, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 23, 2011 at 11:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> and the boot runs very slowly.  Reverting the series merged in 56f0be
> >> appears to resolve the issue,
> >
> > In fact, the patch from:
> >
> > https://patchwork.kernel.org/patch/1083602/
> >
> > is sufficient to make the calltrace go away.
> 
> Yes, that's true but the added might_sleep() discovered an actual bug
> here, so I still think it's better to keep it there.
> 
> If proper fix is difficult, I think doing something like if
> (!in_atomic()) try_to_freeze() with big fat warning explaining what's
> broken and how it should be fixed should do it for now if the proper
> fix is gonna take some time. That way, we document what's broken where
> it's broken and get to keep the useful debugging annotation.

How does that solve it?  IRQs disabled from assembly, which doesn't touch
the preempt counter.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 22:00       ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 22:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 23, 2011 at 11:53:42PM +0200, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 23, 2011 at 11:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> and the boot runs very slowly. ?Reverting the series merged in 56f0be
> >> appears to resolve the issue,
> >
> > In fact, the patch from:
> >
> > https://patchwork.kernel.org/patch/1083602/
> >
> > is sufficient to make the calltrace go away.
> 
> Yes, that's true but the added might_sleep() discovered an actual bug
> here, so I still think it's better to keep it there.
> 
> If proper fix is difficult, I think doing something like if
> (!in_atomic()) try_to_freeze() with big fat warning explaining what's
> broken and how it should be fixed should do it for now if the proper
> fix is gonna take some time. That way, we document what's broken where
> it's broken and get to keep the useful debugging annotation.

How does that solve it?  IRQs disabled from assembly, which doesn't touch
the preempt counter.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 22:00       ` Russell King - ARM Linux
@ 2011-08-23 22:08         ` Tejun Heo
  -1 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 22:08 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Wed, Aug 24, 2011 at 12:00 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> How does that solve it?  IRQs disabled from assembly, which doesn't touch
> the preempt counter.

Then test something else, be it the IRQ mask bit or some other context.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 22:08         ` Tejun Heo
  0 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 22:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 24, 2011 at 12:00 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> How does that solve it? ?IRQs disabled from assembly, which doesn't touch
> the preempt counter.

Then test something else, be it the IRQ mask bit or some other context.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 15:43   ` Russell King - ARM Linux
@ 2011-08-23 22:08     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-23 22:08 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Mark Brown, Tejun Heo, linux-kernel, linux-arm-kernel

On Tuesday, August 23, 2011, Russell King - ARM Linux wrote:
> On Tue, Aug 23, 2011 at 04:19:36PM +0100, Mark Brown wrote:
> > The recent series of commits reworking the freezer appear to have
> > caused serious issues on ARM.  The kernel constantly complains that
> > try_to_freeze() is bring called with interrupts disabled:
> > 
> > [   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
> > [   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
> > [   75.380000] no locks held by Xorg/1517.
> > [   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
> > [<c0464400>] (dump_stack+0x20/0x24)
> > [   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
> > (__might_sleep+0xfc/0x11c)
> > [   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
> > (do_signal+0x94/0x230)
> > [   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
> > (do_notify_resume+0x28/0x6c)
> > [   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
> > [<c000eaf8>] (work_pending+0x24/0x28)
> > 
> > and the boot runs very slowly.  Reverting the series merged in 56f0be
> > appears to resolve the issue, though looking at the changes I'd expect
> > there's some underlying bug here that just doesn't trigger very often.
> > I don't really have the bandwidth to understand what's gone wrong right
> > now but should be able to run tests if you've got anything you'd like
> > looking atl.
> 
> The signal handling is rather yucky, and it seems it needs to run with
> IRQs enabled - but we run it with IRQs disabled.
> 
> There's horrible issues in there.  One such issue is the syscall restart
> handling.  Here's an example:
> 
> 	sys_ppoll
> 	...
> 	receives signal
> 	returns ERESTARTNOHAND (restart syscall if no handler)
> 
> On the exit path, we notice that we have TIF_SIGPENDING set.  So we call
> do_notify_resume() and eventually do_signal().  At this point, 'syscall'
> is set.
> 
> The first thing do_signal() does is fiddle with the stack to fake a
> restart.  If we don't gain a signal here, everything is fine and we
> return.  So far so good.
> 
> Lets say for the sake of argument that we've run do_signal() with IRQs
> enabled.  Now, a scheduling event has come along and set TIF_NEED_RESCHED.
> 
> So, having returned from do_notify_resume(), we disable interrupts, reload
> the work mask, and notice that TIF_NEED_RESCHED is set.  We now switch
> away from this thread and do something else.
> 
> While something else is running, lets say a SIGHUP is sent to our process,
> and there's a handler installed.
> 
> When we switch back, we again disable interrupts, reload the work mask,
> and notice this time that TIF_SIGPENDING is again set.  So, just like
> last time, we call do_notify_resume() and eventually do_signal().  This
> time, syscall is false.
> 
> So we obtain the signal, save the current state, and set userspace up to
> call the handler.
> 
> The state which we've saved though is the state required for the syscall
> restart - but that's wrong, because the return code said 'restart if
> no handler' and we're now invoking a handler.
> 
> What should happen is that the syscall is not restarted, and -EINTR is
> returned instead.
> 
> What has this got to do with the issue you raise?  Having interrupts
> off during signal handling helps to avoid this problem by ensuring that
> we can't get resched events if there's a signal with no handler present.
> Enabling interrupts for do_signal opens that race up.
> 
> Now, having interrupts disabled for writing to the processes stack is
> technically bad news (although in the last 18 or so years its never
> caused a problem.)  That said, we really should enable interrupts
> before calling handle_signal() - which should be safe from the above
> race.
> 
> But... without serious amounts of rework of the signal handling code
> to avoid the above race I don't see how we can allow try_to_freeze()
> to run with IRQs enabled and still have race-free syscall restarting.

I'm not sure if it's necessary to call try_to_freeze() from do_signal().
At least x86 doesn't do that.

Rafael

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 22:08     ` Rafael J. Wysocki
  0 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-23 22:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday, August 23, 2011, Russell King - ARM Linux wrote:
> On Tue, Aug 23, 2011 at 04:19:36PM +0100, Mark Brown wrote:
> > The recent series of commits reworking the freezer appear to have
> > caused serious issues on ARM.  The kernel constantly complains that
> > try_to_freeze() is bring called with interrupts disabled:
> > 
> > [   75.380000] BUG: sleeping function called from invalid context at include/linux/freezer.h:44
> > [   75.380000] in_atomic(): 0, irqs_disabled(): 128, pid: 1517, name: Xorg
> > [   75.380000] no locks held by Xorg/1517.
> > [   75.380000] [<c0014308>] (unwind_backtrace+0x0/0x12c) from
> > [<c0464400>] (dump_stack+0x20/0x24)
> > [   75.380000] [<c0464400>] (dump_stack+0x20/0x24) from [<c0022b80>]
> > (__might_sleep+0xfc/0x11c)
> > [   75.380000] [<c0022b80>] (__might_sleep+0xfc/0x11c) from [<c0011520>]
> > (do_signal+0x94/0x230)
> > [   75.380000] [<c0011520>] (do_signal+0x94/0x230) from [<c00116e4>]
> > (do_notify_resume+0x28/0x6c)
> > [   75.380000] [<c00116e4>] (do_notify_resume+0x28/0x6c) from
> > [<c000eaf8>] (work_pending+0x24/0x28)
> > 
> > and the boot runs very slowly.  Reverting the series merged in 56f0be
> > appears to resolve the issue, though looking at the changes I'd expect
> > there's some underlying bug here that just doesn't trigger very often.
> > I don't really have the bandwidth to understand what's gone wrong right
> > now but should be able to run tests if you've got anything you'd like
> > looking atl.
> 
> The signal handling is rather yucky, and it seems it needs to run with
> IRQs enabled - but we run it with IRQs disabled.
> 
> There's horrible issues in there.  One such issue is the syscall restart
> handling.  Here's an example:
> 
> 	sys_ppoll
> 	...
> 	receives signal
> 	returns ERESTARTNOHAND (restart syscall if no handler)
> 
> On the exit path, we notice that we have TIF_SIGPENDING set.  So we call
> do_notify_resume() and eventually do_signal().  At this point, 'syscall'
> is set.
> 
> The first thing do_signal() does is fiddle with the stack to fake a
> restart.  If we don't gain a signal here, everything is fine and we
> return.  So far so good.
> 
> Lets say for the sake of argument that we've run do_signal() with IRQs
> enabled.  Now, a scheduling event has come along and set TIF_NEED_RESCHED.
> 
> So, having returned from do_notify_resume(), we disable interrupts, reload
> the work mask, and notice that TIF_NEED_RESCHED is set.  We now switch
> away from this thread and do something else.
> 
> While something else is running, lets say a SIGHUP is sent to our process,
> and there's a handler installed.
> 
> When we switch back, we again disable interrupts, reload the work mask,
> and notice this time that TIF_SIGPENDING is again set.  So, just like
> last time, we call do_notify_resume() and eventually do_signal().  This
> time, syscall is false.
> 
> So we obtain the signal, save the current state, and set userspace up to
> call the handler.
> 
> The state which we've saved though is the state required for the syscall
> restart - but that's wrong, because the return code said 'restart if
> no handler' and we're now invoking a handler.
> 
> What should happen is that the syscall is not restarted, and -EINTR is
> returned instead.
> 
> What has this got to do with the issue you raise?  Having interrupts
> off during signal handling helps to avoid this problem by ensuring that
> we can't get resched events if there's a signal with no handler present.
> Enabling interrupts for do_signal opens that race up.
> 
> Now, having interrupts disabled for writing to the processes stack is
> technically bad news (although in the last 18 or so years its never
> caused a problem.)  That said, we really should enable interrupts
> before calling handle_signal() - which should be safe from the above
> race.
> 
> But... without serious amounts of rework of the signal handling code
> to avoid the above race I don't see how we can allow try_to_freeze()
> to run with IRQs enabled and still have race-free syscall restarting.

I'm not sure if it's necessary to call try_to_freeze() from do_signal().
At least x86 doesn't do that.

Rafael

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 22:08         ` Tejun Heo
@ 2011-08-23 22:13           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 22:13 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Wed, Aug 24, 2011 at 12:08:23AM +0200, Tejun Heo wrote:
> On Wed, Aug 24, 2011 at 12:00 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > How does that solve it?  IRQs disabled from assembly, which doesn't touch
> > the preempt counter.
> 
> Then test something else, be it the IRQ mask bit or some other context.

Thereby entirely preventing threads from being frozen?  You're asking
me to effectively disable suspend/resume on an architecture where it's
heavily used.  That's not a good idea, and would be an out-right
regression.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 22:13           ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 22:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 24, 2011 at 12:08:23AM +0200, Tejun Heo wrote:
> On Wed, Aug 24, 2011 at 12:00 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > How does that solve it? ?IRQs disabled from assembly, which doesn't touch
> > the preempt counter.
> 
> Then test something else, be it the IRQ mask bit or some other context.

Thereby entirely preventing threads from being frozen?  You're asking
me to effectively disable suspend/resume on an architecture where it's
heavily used.  That's not a good idea, and would be an out-right
regression.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 21:53     ` Tejun Heo
@ 2011-08-23 22:13       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-23 22:13 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Mark Brown, linux-kernel, linux-arm-kernel

On Tuesday, August 23, 2011, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 23, 2011 at 11:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> and the boot runs very slowly.  Reverting the series merged in 56f0be
> >> appears to resolve the issue,
> >
> > In fact, the patch from:
> >
> > https://patchwork.kernel.org/patch/1083602/
> >
> > is sufficient to make the calltrace go away.
> 
> Yes, that's true but the added might_sleep() discovered an actual bug
> here, so I still think it's better to keep it there.

Well, I think we _have_ _to_ put it into __refrigerator() at least until
the ARM issue is fixed (other architectures seem to do similar things, BTW),
since otherwise this can be reported over and over again.

Let's try to fix the known issues first and _then_ we can move the
might_sleep() back into try_to_freeze().

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 22:13       ` Rafael J. Wysocki
  0 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-23 22:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday, August 23, 2011, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 23, 2011 at 11:51 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> and the boot runs very slowly.  Reverting the series merged in 56f0be
> >> appears to resolve the issue,
> >
> > In fact, the patch from:
> >
> > https://patchwork.kernel.org/patch/1083602/
> >
> > is sufficient to make the calltrace go away.
> 
> Yes, that's true but the added might_sleep() discovered an actual bug
> here, so I still think it's better to keep it there.

Well, I think we _have_ _to_ put it into __refrigerator() at least until
the ARM issue is fixed (other architectures seem to do similar things, BTW),
since otherwise this can be reported over and over again.

Let's try to fix the known issues first and _then_ we can move the
might_sleep() back into try_to_freeze().

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 22:13           ` Russell King - ARM Linux
@ 2011-08-23 22:17             ` Tejun Heo
  -1 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 22:17 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Wed, Aug 24, 2011 at 12:13 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Thereby entirely preventing threads from being frozen?  You're asking
> me to effectively disable suspend/resume on an architecture where it's
> heavily used.  That's not a good idea, and would be an out-right
> regression.

Eh? So, it's supposed to enter refrigerator with IRQ disabled? Then,
moving might_sleep() inside refrigerator() doesn't help either, does
it? Then we should be doing is,

if (freezing() && IRQ disabled) {
      bust on IRQ;
      try_to_freeze();
      replug IRQ;
}

But, that can't be right. The current code isn't triggering warning
from scheduler code, right? If the above is the case, it should be
triggering that. What am I missing?

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 22:17             ` Tejun Heo
  0 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 22:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 24, 2011 at 12:13 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Thereby entirely preventing threads from being frozen? ?You're asking
> me to effectively disable suspend/resume on an architecture where it's
> heavily used. ?That's not a good idea, and would be an out-right
> regression.

Eh? So, it's supposed to enter refrigerator with IRQ disabled? Then,
moving might_sleep() inside refrigerator() doesn't help either, does
it? Then we should be doing is,

if (freezing() && IRQ disabled) {
      bust on IRQ;
      try_to_freeze();
      replug IRQ;
}

But, that can't be right. The current code isn't triggering warning
from scheduler code, right? If the above is the case, it should be
triggering that. What am I missing?

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 22:17             ` Tejun Heo
@ 2011-08-23 22:35               ` Tejun Heo
  -1 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 22:35 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Wed, Aug 24, 2011 at 12:17:03AM +0200, Tejun Heo wrote:
> if (freezing() && IRQ disabled) {
>       bust on IRQ;
>       try_to_freeze();
>       replug IRQ;
> }
> 
> But, that can't be right. The current code isn't triggering warning
> from scheduler code, right? If the above is the case, it should be
> triggering that. What am I missing?

I think the refrigerator() code was actually doing that through
spin_[un]lock_irq(), so it was accidentally masking the problem.  It
definitely seems to need fixing.

Anyways, for now, we can do two things,

1. if (freezing()) { irq_save; try_to_freeze(); irq_restore; } w/ BIG
   FAT UGLY comment.

2. Drop might_sleep() from try_to_freeze().  Moving it to
   refrigerator() wouldn't help much.  It would just trigger more
   sporadically during freeze, which is arguably worse than now.

I'd prefer #1 given that it documents the breakage while also
restoring the IRQ state afterwards FWIW.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-23 22:35               ` Tejun Heo
  0 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-23 22:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 24, 2011 at 12:17:03AM +0200, Tejun Heo wrote:
> if (freezing() && IRQ disabled) {
>       bust on IRQ;
>       try_to_freeze();
>       replug IRQ;
> }
> 
> But, that can't be right. The current code isn't triggering warning
> from scheduler code, right? If the above is the case, it should be
> triggering that. What am I missing?

I think the refrigerator() code was actually doing that through
spin_[un]lock_irq(), so it was accidentally masking the problem.  It
definitely seems to need fixing.

Anyways, for now, we can do two things,

1. if (freezing()) { irq_save; try_to_freeze(); irq_restore; } w/ BIG
   FAT UGLY comment.

2. Drop might_sleep() from try_to_freeze().  Moving it to
   refrigerator() wouldn't help much.  It would just trigger more
   sporadically during freeze, which is arguably worse than now.

I'd prefer #1 given that it documents the breakage while also
restoring the IRQ state afterwards FWIW.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 22:35               ` Tejun Heo
@ 2011-08-24 23:15                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-24 23:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Russell King - ARM Linux, Mark Brown, linux-kernel, linux-arm-kernel

On Wednesday, August 24, 2011, Tejun Heo wrote:
> On Wed, Aug 24, 2011 at 12:17:03AM +0200, Tejun Heo wrote:
> > if (freezing() && IRQ disabled) {
> >       bust on IRQ;
> >       try_to_freeze();
> >       replug IRQ;
> > }
> > 
> > But, that can't be right. The current code isn't triggering warning
> > from scheduler code, right? If the above is the case, it should be
> > triggering that. What am I missing?
> 
> I think the refrigerator() code was actually doing that through
> spin_[un]lock_irq(), so it was accidentally masking the problem.  It
> definitely seems to need fixing.
> 
> Anyways, for now, we can do two things,
> 
> 1. if (freezing()) { irq_save; try_to_freeze(); irq_restore; } w/ BIG
>    FAT UGLY comment.
> 
> 2. Drop might_sleep() from try_to_freeze().  Moving it to
>    refrigerator() wouldn't help much.  It would just trigger more
>    sporadically during freeze, which is arguably worse than now.
> 
> I'd prefer #1 given that it documents the breakage while also
> restoring the IRQ state afterwards FWIW.

OK, I'm fine with 1.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-24 23:15                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 66+ messages in thread
From: Rafael J. Wysocki @ 2011-08-24 23:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday, August 24, 2011, Tejun Heo wrote:
> On Wed, Aug 24, 2011 at 12:17:03AM +0200, Tejun Heo wrote:
> > if (freezing() && IRQ disabled) {
> >       bust on IRQ;
> >       try_to_freeze();
> >       replug IRQ;
> > }
> > 
> > But, that can't be right. The current code isn't triggering warning
> > from scheduler code, right? If the above is the case, it should be
> > triggering that. What am I missing?
> 
> I think the refrigerator() code was actually doing that through
> spin_[un]lock_irq(), so it was accidentally masking the problem.  It
> definitely seems to need fixing.
> 
> Anyways, for now, we can do two things,
> 
> 1. if (freezing()) { irq_save; try_to_freeze(); irq_restore; } w/ BIG
>    FAT UGLY comment.
> 
> 2. Drop might_sleep() from try_to_freeze().  Moving it to
>    refrigerator() wouldn't help much.  It would just trigger more
>    sporadically during freeze, which is arguably worse than now.
> 
> I'd prefer #1 given that it documents the breakage while also
> restoring the IRQ state afterwards FWIW.

OK, I'm fine with 1.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 21:51   ` Rafael J. Wysocki
@ 2011-08-25 11:37     ` Mark Brown
  -1 siblings, 0 replies; 66+ messages in thread
From: Mark Brown @ 2011-08-25 11:37 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Tejun Heo, linux-kernel, linux-arm-kernel

On Tue, Aug 23, 2011 at 11:51:55PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, August 23, 2011, Mark Brown wrote:

> > and the boot runs very slowly.  Reverting the series merged in 56f0be
> > appears to resolve the issue,

> In fact, the patch from:

> https://patchwork.kernel.org/patch/1083602/

> is sufficient to make the calltrace go away.

Yes, that works for me too thanks!

Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 11:37     ` Mark Brown
  0 siblings, 0 replies; 66+ messages in thread
From: Mark Brown @ 2011-08-25 11:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 23, 2011 at 11:51:55PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, August 23, 2011, Mark Brown wrote:

> > and the boot runs very slowly.  Reverting the series merged in 56f0be
> > appears to resolve the issue,

> In fact, the patch from:

> https://patchwork.kernel.org/patch/1083602/

> is sufficient to make the calltrace go away.

Yes, that works for me too thanks!

Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-23 22:17             ` Tejun Heo
@ 2011-08-25 12:14               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 12:14 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Wed, Aug 24, 2011 at 12:17:03AM +0200, Tejun Heo wrote:
> On Wed, Aug 24, 2011 at 12:13 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Thereby entirely preventing threads from being frozen?  You're asking
> > me to effectively disable suspend/resume on an architecture where it's
> > heavily used.  That's not a good idea, and would be an out-right
> > regression.
> 
> Eh? So, it's supposed to enter refrigerator with IRQ disabled? Then,
> moving might_sleep() inside refrigerator() doesn't help either, does
> it? Then we should be doing is,
> 
> if (freezing() && IRQ disabled) {
>       bust on IRQ;
>       try_to_freeze();
>       replug IRQ;
> }
> 
> But, that can't be right. The current code isn't triggering warning
> from scheduler code, right? If the above is the case, it should be
> triggering that. What am I missing?

The scheduler code does not check for CPU IRQs being masked.  It just
checks the preempt count, nothing more.

might_sleep() on the other hand checks the preempt count _and_ CPU IRQ
mask state.

Note that your 'IRQ disabled' will always be true for ARM at this point
at the moment - and conditionalizing this won't help (see below).

Given that get_signal_to_deliver() already forcefully enables IRQs,
I think our syscall restarting is already horribly racy here in the
way I've previously described.

It also looks to me like get_signal_to_deliver() already handles the
freezing stuff, so does ARM even need this call here?  Maybe when:

commit fc558a7496bfab3d29a68953b07a95883fdcfbb1
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Thu Mar 23 03:00:05 2006 -0800

    [PATCH] swsusp: finally solve mysqld problem

was introduced, every other architecture should have been updated for
that change...  So this call in the ARM signal handling code to
try_to_freeze() should just be deleted as it should've been done five
years ago.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 12:14               ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 12:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 24, 2011 at 12:17:03AM +0200, Tejun Heo wrote:
> On Wed, Aug 24, 2011 at 12:13 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Thereby entirely preventing threads from being frozen? ?You're asking
> > me to effectively disable suspend/resume on an architecture where it's
> > heavily used. ?That's not a good idea, and would be an out-right
> > regression.
> 
> Eh? So, it's supposed to enter refrigerator with IRQ disabled? Then,
> moving might_sleep() inside refrigerator() doesn't help either, does
> it? Then we should be doing is,
> 
> if (freezing() && IRQ disabled) {
>       bust on IRQ;
>       try_to_freeze();
>       replug IRQ;
> }
> 
> But, that can't be right. The current code isn't triggering warning
> from scheduler code, right? If the above is the case, it should be
> triggering that. What am I missing?

The scheduler code does not check for CPU IRQs being masked.  It just
checks the preempt count, nothing more.

might_sleep() on the other hand checks the preempt count _and_ CPU IRQ
mask state.

Note that your 'IRQ disabled' will always be true for ARM at this point
at the moment - and conditionalizing this won't help (see below).

Given that get_signal_to_deliver() already forcefully enables IRQs,
I think our syscall restarting is already horribly racy here in the
way I've previously described.

It also looks to me like get_signal_to_deliver() already handles the
freezing stuff, so does ARM even need this call here?  Maybe when:

commit fc558a7496bfab3d29a68953b07a95883fdcfbb1
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Thu Mar 23 03:00:05 2006 -0800

    [PATCH] swsusp: finally solve mysqld problem

was introduced, every other architecture should have been updated for
that change...  So this call in the ARM signal handling code to
try_to_freeze() should just be deleted as it should've been done five
years ago.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 12:14               ` Russell King - ARM Linux
@ 2011-08-25 12:17                 ` Tejun Heo
  -1 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-25 12:17 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

Hello, Russell.

On Thu, Aug 25, 2011 at 01:14:16PM +0100, Russell King - ARM Linux wrote:
> was introduced, every other architecture should have been updated for
> that change...  So this call in the ARM signal handling code to
> try_to_freeze() should just be deleted as it should've been done five
> years ago.

I agree.  I don't think it's necessary there, so for now, let's just
delete that call, but get_signal_to_deliver() expects to be called
with IRQ enabled so it would still be a good idea to turn on/off IRQs
explicitly around it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 12:17                 ` Tejun Heo
  0 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-25 12:17 UTC (permalink / raw)
  To: linux-arm-kernel

Hello, Russell.

On Thu, Aug 25, 2011 at 01:14:16PM +0100, Russell King - ARM Linux wrote:
> was introduced, every other architecture should have been updated for
> that change...  So this call in the ARM signal handling code to
> try_to_freeze() should just be deleted as it should've been done five
> years ago.

I agree.  I don't think it's necessary there, so for now, let's just
delete that call, but get_signal_to_deliver() expects to be called
with IRQ enabled so it would still be a good idea to turn on/off IRQs
explicitly around it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 12:17                 ` Tejun Heo
@ 2011-08-25 12:25                   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 12:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Thu, Aug 25, 2011 at 02:17:10PM +0200, Tejun Heo wrote:
> Hello, Russell.
> 
> On Thu, Aug 25, 2011 at 01:14:16PM +0100, Russell King - ARM Linux wrote:
> > was introduced, every other architecture should have been updated for
> > that change...  So this call in the ARM signal handling code to
> > try_to_freeze() should just be deleted as it should've been done five
> > years ago.
> 
> I agree.  I don't think it's necessary there, so for now, let's just
> delete that call, but get_signal_to_deliver() expects to be called
> with IRQ enabled so it would still be a good idea to turn on/off IRQs
> explicitly around it.

No.  Stop bodging and hiding problems.  Anywhere which does this:

	local_irq_enable();
	do something
	local_irq_disable();

is a bug.  Things are called with IRQs disabled for a reason - randomly
re-enabling IRQs does not "fix" stuff, it merely introduces subtle bugs
while hiding warnings of those bugs.

Please go back and read my response to Mark at the beginning of this
thread, where I describe why IRQs are disabled here.

The only solution here is to fix the problem properly, and I'm working
on a patch to fix the problem I highlighted in my earlier response to
Mark.  Once we have that problem fixed, we can then (more) safely call
do_signal() with IRQs enabled.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 12:25                   ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 12:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 25, 2011 at 02:17:10PM +0200, Tejun Heo wrote:
> Hello, Russell.
> 
> On Thu, Aug 25, 2011 at 01:14:16PM +0100, Russell King - ARM Linux wrote:
> > was introduced, every other architecture should have been updated for
> > that change...  So this call in the ARM signal handling code to
> > try_to_freeze() should just be deleted as it should've been done five
> > years ago.
> 
> I agree.  I don't think it's necessary there, so for now, let's just
> delete that call, but get_signal_to_deliver() expects to be called
> with IRQ enabled so it would still be a good idea to turn on/off IRQs
> explicitly around it.

No.  Stop bodging and hiding problems.  Anywhere which does this:

	local_irq_enable();
	do something
	local_irq_disable();

is a bug.  Things are called with IRQs disabled for a reason - randomly
re-enabling IRQs does not "fix" stuff, it merely introduces subtle bugs
while hiding warnings of those bugs.

Please go back and read my response to Mark at the beginning of this
thread, where I describe why IRQs are disabled here.

The only solution here is to fix the problem properly, and I'm working
on a patch to fix the problem I highlighted in my earlier response to
Mark.  Once we have that problem fixed, we can then (more) safely call
do_signal() with IRQs enabled.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 12:25                   ` Russell King - ARM Linux
@ 2011-08-25 12:35                     ` Tejun Heo
  -1 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-25 12:35 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

Hello,

On Thu, Aug 25, 2011 at 01:25:43PM +0100, Russell King - ARM Linux wrote:
> No.  Stop bodging and hiding problems.  Anywhere which does this:
> 
> 	local_irq_enable();
> 	do something
> 	local_irq_disable();
> 
> is a bug.  Things are called with IRQs disabled for a reason - randomly
> re-enabling IRQs does not "fix" stuff, it merely introduces subtle bugs
> while hiding warnings of those bugs.
> 
> Please go back and read my response to Mark at the beginning of this
> thread, where I describe why IRQs are disabled here.
> 
> The only solution here is to fix the problem properly, and I'm working
> on a patch to fix the problem I highlighted in my earlier response to
> Mark.  Once we have that problem fixed, we can then (more) safely call
> do_signal() with IRQs enabled.

Arrrrrrrrgghhhhhhhhhhhh, why is communication so difficult with you?
When did I ever suggest that as a proper fix.  ARM is frigging broken
in that path so don't push it over to PM and just deal with it inside
ARM arch code.  For now, we need to make the warning go away until
it's properly fixed so turn off and on IRQ around
get_signal_to_deliver() and mark it with giant FIXME comment.  How
many times did I write that?  I don't know how to make that any
clearer to you.  Gees...

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 12:35                     ` Tejun Heo
  0 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-25 12:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Thu, Aug 25, 2011 at 01:25:43PM +0100, Russell King - ARM Linux wrote:
> No.  Stop bodging and hiding problems.  Anywhere which does this:
> 
> 	local_irq_enable();
> 	do something
> 	local_irq_disable();
> 
> is a bug.  Things are called with IRQs disabled for a reason - randomly
> re-enabling IRQs does not "fix" stuff, it merely introduces subtle bugs
> while hiding warnings of those bugs.
> 
> Please go back and read my response to Mark at the beginning of this
> thread, where I describe why IRQs are disabled here.
> 
> The only solution here is to fix the problem properly, and I'm working
> on a patch to fix the problem I highlighted in my earlier response to
> Mark.  Once we have that problem fixed, we can then (more) safely call
> do_signal() with IRQs enabled.

Arrrrrrrrgghhhhhhhhhhhh, why is communication so difficult with you?
When did I ever suggest that as a proper fix.  ARM is frigging broken
in that path so don't push it over to PM and just deal with it inside
ARM arch code.  For now, we need to make the warning go away until
it's properly fixed so turn off and on IRQ around
get_signal_to_deliver() and mark it with giant FIXME comment.  How
many times did I write that?  I don't know how to make that any
clearer to you.  Gees...

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 12:35                     ` Tejun Heo
@ 2011-08-25 13:04                       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 13:04 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

On Thu, Aug 25, 2011 at 02:35:42PM +0200, Tejun Heo wrote:
> Hello,
> 
> On Thu, Aug 25, 2011 at 01:25:43PM +0100, Russell King - ARM Linux wrote:
> > No.  Stop bodging and hiding problems.  Anywhere which does this:
> > 
> > 	local_irq_enable();
> > 	do something
> > 	local_irq_disable();
> > 
> > is a bug.  Things are called with IRQs disabled for a reason - randomly
> > re-enabling IRQs does not "fix" stuff, it merely introduces subtle bugs
> > while hiding warnings of those bugs.
> > 
> > Please go back and read my response to Mark at the beginning of this
> > thread, where I describe why IRQs are disabled here.
> > 
> > The only solution here is to fix the problem properly, and I'm working
> > on a patch to fix the problem I highlighted in my earlier response to
> > Mark.  Once we have that problem fixed, we can then (more) safely call
> > do_signal() with IRQs enabled.
> 
> Arrrrrrrrgghhhhhhhhhhhh, why is communication so difficult with you?
> When did I ever suggest that as a proper fix.

I'm not the problem here - I'm pointing out that your suggestion is just
going to make things worse not better.  But it seems you can't cope with
people who criticise your solutions in any way.

> ARM is frigging broken
> in that path so don't push it over to PM and just deal with it inside
> ARM arch code.

For fuck sake, who said anything about pushing it over to PM to deal
with?

I'm talking about not putting sticky plasters over the problem, but fixing
the problem PROPERLY.  I'm sorry if doing a job properly offends you, but
you have to live with that.

I _HATE_ solutions which just paper over problems and make then disappear.
I've always been firmly in the "if there's a problem, fix the problem
properly" camp.  You seem to be firmly in the "lets paper over the problem
and forget about it".

> For now, we need to make the warning go away until it's properly fixed
> so turn off and on IRQ around get_signal_to_deliver() and mark it with
> giant FIXME comment.

No.  This is where we disagree.  It doesn't warrant that because it is
possible to fix the problem properly.  While _you_ may not be capable of
doing that, that's no reason not to seek help to have it fixed.

Given that I've posted a description of the problem, would it not be
reasonable to assume that I'm working on fixing it?  Strangely enough
before writing _any_ of the replies to this thread, I have a patch which
does just that - I just haven't tested it yet apart from a compile test.

> How many times did I write that?  I don't know how to make that any
> clearer to you.  Gees...

Hahaha.  Thanks for the laugh - because you never actually said that -
and you're now blaming me for miscommunication?

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 13:04                       ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 13:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 25, 2011 at 02:35:42PM +0200, Tejun Heo wrote:
> Hello,
> 
> On Thu, Aug 25, 2011 at 01:25:43PM +0100, Russell King - ARM Linux wrote:
> > No.  Stop bodging and hiding problems.  Anywhere which does this:
> > 
> > 	local_irq_enable();
> > 	do something
> > 	local_irq_disable();
> > 
> > is a bug.  Things are called with IRQs disabled for a reason - randomly
> > re-enabling IRQs does not "fix" stuff, it merely introduces subtle bugs
> > while hiding warnings of those bugs.
> > 
> > Please go back and read my response to Mark at the beginning of this
> > thread, where I describe why IRQs are disabled here.
> > 
> > The only solution here is to fix the problem properly, and I'm working
> > on a patch to fix the problem I highlighted in my earlier response to
> > Mark.  Once we have that problem fixed, we can then (more) safely call
> > do_signal() with IRQs enabled.
> 
> Arrrrrrrrgghhhhhhhhhhhh, why is communication so difficult with you?
> When did I ever suggest that as a proper fix.

I'm not the problem here - I'm pointing out that your suggestion is just
going to make things worse not better.  But it seems you can't cope with
people who criticise your solutions in any way.

> ARM is frigging broken
> in that path so don't push it over to PM and just deal with it inside
> ARM arch code.

For fuck sake, who said anything about pushing it over to PM to deal
with?

I'm talking about not putting sticky plasters over the problem, but fixing
the problem PROPERLY.  I'm sorry if doing a job properly offends you, but
you have to live with that.

I _HATE_ solutions which just paper over problems and make then disappear.
I've always been firmly in the "if there's a problem, fix the problem
properly" camp.  You seem to be firmly in the "lets paper over the problem
and forget about it".

> For now, we need to make the warning go away until it's properly fixed
> so turn off and on IRQ around get_signal_to_deliver() and mark it with
> giant FIXME comment.

No.  This is where we disagree.  It doesn't warrant that because it is
possible to fix the problem properly.  While _you_ may not be capable of
doing that, that's no reason not to seek help to have it fixed.

Given that I've posted a description of the problem, would it not be
reasonable to assume that I'm working on fixing it?  Strangely enough
before writing _any_ of the replies to this thread, I have a patch which
does just that - I just haven't tested it yet apart from a compile test.

> How many times did I write that?  I don't know how to make that any
> clearer to you.  Gees...

Hahaha.  Thanks for the laugh - because you never actually said that -
and you're now blaming me for miscommunication?

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 13:04                       ` Russell King - ARM Linux
@ 2011-08-25 13:09                         ` Tejun Heo
  -1 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-25 13:09 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael J. Wysocki, Mark Brown, linux-kernel, linux-arm-kernel

Hey, Russell.

If you can fix it properly without going through temporary step,
that's awesome.  Let's put the arguments behind, okay?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 13:09                         ` Tejun Heo
  0 siblings, 0 replies; 66+ messages in thread
From: Tejun Heo @ 2011-08-25 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hey, Russell.

If you can fix it properly without going through temporary step,
that's awesome.  Let's put the arguments behind, okay?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 13:09                         ` Tejun Heo
@ 2011-08-25 14:55                           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 14:55 UTC (permalink / raw)
  To: Tejun Heo, Arnd Bergmann, Mark Brown
  Cc: Rafael J. Wysocki, linux-kernel, linux-arm-kernel

On Thu, Aug 25, 2011 at 03:09:07PM +0200, Tejun Heo wrote:
> Hey, Russell.
> 
> If you can fix it properly without going through temporary step,
> that's awesome.  Let's put the arguments behind, okay?

Here's the patch.  As the kernel I've run this against doesn't have the
change to try_to_freeze(), I added a might_sleep() in do_signal() during
my testing to verify that it fixes Mark's problem (which it does.)

I've tested functions returning -ERESTARTSYS, -ERESTARTNOHAND and
-ERESTART_RESTARTBLOCK, all of which seem to behave as expected with
signals such as SIGCONT (without handler) and SIGALRM (with handler).
I haven't tested -ERESTARTNOINTR.

I don't have a test case for the race condition I mentioned (which is
admittedly pretty difficult to construct, requiring an explicit
signal, schedule, signal sequence) but this should plug that too.

How do we achieve this?  Effectively the steps in this patch are:

1. Undo Arnd's fixups to the syscall restart processing (but don't worry,
   we restore it in step 3).

2. Introduce TIF_SYS_RESTART, which is set when we enter signal handling
   and the syscall has returned one of the restart codes.  This is used
   as a flag to indicate that we have some syscall restart processing to
   do at some point.

3. Clear TIF_SYS_RESTART whenever ptrace is used to set the GP registers
   (thereby restoring Arnd's fixup for his gdb testsuite problem - it
   would be good if Arnd could reconfirm that.)

4. When we setup a user handler to run, check TIF_SYS_RESTART and clear it.
   If it was set, we need to set things up to return -EINTR or restart the
   syscall as appropriate.  As we've cleared it, no further restart
   processing will occur.

5. Once we've run all work (signal delivery, and rescheduling events), and
   we're about to return to userspace, make a final check for TIF_SYS_RESTART.
   If it's still set, then we're returning to userspace having not setup
   any user handlers, and we need to restart the syscall.  This is mostly
   trivial, except for OABI restartblock which requires the user stack to
   be written.  We have to re-enable IRQs for this write, which means we
   have to manually re-check for rescheduling events, abort the restart,
   and try again later.

One of the side effects of reverting Arnd's patch is that we restore the
strace behaviour which we've had for years on ARM, and can still be seen
on x86: strace can see the -ERESTART return codes from the kernel syscalls,
rather than what seems to be the signal number:

Before:
rt_sigsuspend([] <unfinished ...>
--- SIGIO (I/O possible) ---
<... rt_sigsuspend resumed> )           = 29
sigreturn()                             = ? (mask now [])

vs:
rt_sigsuspend([])                       = ? ERESTARTNOHAND (To be restarted)
--- SIGIO (I/O possible) @ 0 (0) ---
sigreturn()                             = ? (mask now [])

x86:
rt_sigsuspend([])                       = ? ERESTARTNOHAND (To be restarted)
--- {si_signo=SIGIO, si_code=SI_USER} (I/O possible) ---
sigreturn()                             = ? (mask now [])

So, this patch should fix:
1. The race which I identified in the signal handling code (I think x86
   and other architectures can suffer from it too.)
2. The warning from try_to_freeze.
3. The unanticipated change to strace output.

Arnd, can you test this to make sure your gdb test case still works, and
Mark, can you test this to make sure it fixes your problem please?

Thanks.

 arch/arm/include/asm/thread_info.h |    3 +
 arch/arm/kernel/entry-common.S     |   11 ++
 arch/arm/kernel/ptrace.c           |    2 +
 arch/arm/kernel/signal.c           |  209 ++++++++++++++++++++++++------------
 4 files changed, 155 insertions(+), 70 deletions(-)

diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 7b5cc8d..40df533 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -129,6 +129,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
 /*
  * thread information flags:
  *  TIF_SYSCALL_TRACE	- syscall trace active
+ *  TIF_SYS_RESTART	- syscall restart processing
  *  TIF_SIGPENDING	- signal pending
  *  TIF_NEED_RESCHED	- rescheduling necessary
  *  TIF_NOTIFY_RESUME	- callback before returning to user
@@ -139,6 +140,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
 #define TIF_NEED_RESCHED	1
 #define TIF_NOTIFY_RESUME	2	/* callback before returning to user */
 #define TIF_SYSCALL_TRACE	8
+#define TIF_SYS_RESTART		9
 #define TIF_POLLING_NRFLAG	16
 #define TIF_USING_IWMMXT	17
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
@@ -147,6 +149,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
 #define TIF_SECCOMP		21
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
+#define _TIF_SYS_RESTART	(1 << TIF_SYS_RESTART)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index b2a27b6..e922b85 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -45,6 +45,7 @@ ret_fast_syscall:
 fast_work_pending:
 	str	r0, [sp, #S_R0+S_OFF]!		@ returned r0
 work_pending:
+	enable_irq
 	tst	r1, #_TIF_NEED_RESCHED
 	bne	work_resched
 	tst	r1, #_TIF_SIGPENDING|_TIF_NOTIFY_RESUME
@@ -56,6 +57,13 @@ work_pending:
 	bl	do_notify_resume
 	b	ret_slow_syscall		@ Check work again
 
+work_syscall_restart:
+	mov	r0, sp				@ 'regs'
+	bl	syscall_restart			@ process system call restart
+	teq	r0, #0				@ if ret=0 -> success, so
+	beq	ret_restart			@ return to userspace directly
+	b	ret_slow_syscall		@ otherwise, we have a segfault
+
 work_resched:
 	bl	schedule
 /*
@@ -69,6 +77,9 @@ ENTRY(ret_to_user_from_irq)
 	tst	r1, #_TIF_WORK_MASK
 	bne	work_pending
 no_work_pending:
+	tst	r1, #_TIF_SYS_RESTART
+	bne	work_syscall_restart
+ret_restart:
 #if defined(CONFIG_IRQSOFF_TRACER)
 	asm_trace_hardirqs_on
 #endif
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 2491f3b..ac8c34e 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -177,6 +177,7 @@ put_user_reg(struct task_struct *task, int offset, long data)
 
 	if (valid_user_regs(&newregs)) {
 		regs->uregs[offset] = data;
+		clear_ti_thread_flag(task_thread_info(task), TIF_SYS_RESTART);
 		ret = 0;
 	}
 
@@ -604,6 +605,7 @@ static int gpr_set(struct task_struct *target,
 		return -EINVAL;
 
 	*task_pt_regs(target) = newregs;
+	clear_ti_thread_flag(task_thread_info(target), TIF_SYS_RESTART);
 	return 0;
 }
 
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 0340224..42a1521 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -649,6 +649,135 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
 }
 
 /*
+ * Syscall restarting codes
+ *
+ * -ERESTARTSYS: restart system call if no handler, or if there is a
+ *	handler but it's marked SA_RESTART.  Otherwise return -EINTR.
+ * -ERESTARTNOINTR: always restart system call
+ * -ERESTARTNOHAND: restart system call only if no handler, otherwise
+ *	return -EINTR if invoking a user signal handler.
+ * -ERESTART_RESTARTBLOCK: call restart syscall if no handler, otherwise
+ *	return -EINTR if invoking a user signal handler.
+ */
+static void setup_syscall_restart(struct pt_regs *regs)
+{
+	regs->ARM_r0 = regs->ARM_ORIG_r0;
+	regs->ARM_pc -= thumb_mode(regs) ? 2 : 4;
+}
+
+/*
+ * Depending on the signal settings we may need to revert the decision
+ * to restart the system call.  But skip this if a debugger has chosen
+ * to restart at a different PC.
+ */
+static void syscall_restart_handler(struct pt_regs *regs, struct k_sigaction *ka)
+{
+	if (test_and_clear_thread_flag(TIF_SYS_RESTART)) {
+		long r0 = regs->ARM_r0;
+
+		/*
+		 * By default, return -EINTR to the user process for any
+		 * syscall which would otherwise be restarted.
+		 */
+		regs->ARM_r0 = -EINTR;
+
+		if (r0 == -ERESTARTNOINTR ||
+		    (r0 == -ERESTARTSYS && !(ka->sa.sa_flags & SA_RESTART)))
+			setup_syscall_restart(regs);
+	}
+}
+
+/*
+ * Handle syscall restarting when there is no user handler in place for
+ * a delivered signal.  Rather than doing this as part of the normal
+ * signal processing, we do this on the final return to userspace, after
+ * we've finished handling signals and checking for schedule events.
+ *
+ * This avoids bad behaviour such as:
+ *  - syscall returns -ERESTARTNOHAND
+ *  - signal with no handler (so we set things up to restart the syscall)
+ *  - schedule
+ *  - signal with handler (eg, SIGALRM)
+ *  - we call the handler and then restart the syscall
+ *
+ * In order to avoid races with TIF_NEED_RESCHED, IRQs must be disabled
+ * when this function is called and remain disabled until we exit to
+ * userspace.
+ */
+asmlinkage int syscall_restart(struct pt_regs *regs)
+{
+	struct thread_info *thread = current_thread_info();
+
+	clear_ti_thread_flag(thread, TIF_SYS_RESTART);
+	
+	/*
+	 * Restart the system call.  We haven't setup a signal handler
+	 * to invoke, and the regset hasn't been usurped by ptrace.
+	 */
+	if (regs->ARM_r0 == -ERESTART_RESTARTBLOCK) {
+		if (thumb_mode(regs)) {
+			regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
+			regs->ARM_pc -= 2;
+		} else {
+#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
+			regs->ARM_r7 = __NR_restart_syscall;
+			regs->ARM_pc -= 4;
+#else
+			u32 sp = regs->ARM_sp - 4;
+			u32 __user *usp = (u32 __user *)sp;
+			int ret;
+
+			/*
+			 * For OABI, we need to play some extra games, because
+			 * we need to write to the users stack, which we can't
+			 * do reliably from IRQs-disabled context.  Temporarily
+			 * re-enable IRQs, perform the store, and then plug
+			 * the resulting race afterwards.
+			 */
+			local_irq_enable();
+			ret = put_user(regs->ARM_pc, usp);
+			local_irq_disable();
+
+			/*
+			 * Plug the reschedule race - if we need to reschedule,
+			 * abort the syscall restarting.  We haven't modified
+			 * anything other than the attempted write to the stack
+			 * so we can merely retry later.
+			 */
+			if (need_resched()) {
+				set_ti_thread_flag(thread, TIF_SYS_RESTART);
+				return -EINTR;
+			}
+
+			/*
+			 * We failed (for some reason) to write to the stack.
+			 * Terminate the task.
+			 */
+			if (ret) {
+				force_sigsegv(0, current);
+				return -EFAULT;
+			}
+
+			/*
+			 * Success, update the stack pointer and point the
+			 * PC at the restarting code.
+			 */
+			regs->ARM_sp = sp;
+			regs->ARM_pc = KERN_RESTART_CODE;
+#endif
+		}
+	} else {
+		/*
+		 * Simple restart - just back up and re-execute the last
+		 * instruction.
+		 */
+		setup_syscall_restart(regs);
+	}
+
+	return 0;
+}
+
+/*
  * Note that 'init' is a special process: it doesn't get signals it doesn't
  * want to handle. Thus you cannot kill init even with a SIGKILL even by
  * mistake.
@@ -659,7 +788,6 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
  */
 static void do_signal(struct pt_regs *regs, int syscall)
 {
-	unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
 	struct k_sigaction ka;
 	siginfo_t info;
 	int signr;
@@ -674,32 +802,16 @@ static void do_signal(struct pt_regs *regs, int syscall)
 		return;
 
 	/*
-	 * If we were from a system call, check for system call restarting...
+	 * Set the SYS_RESTART flag to indicate that we have some
+	 * cleanup of the restart state to perform when returning to
+	 * userspace.
 	 */
-	if (syscall) {
-		continue_addr = regs->ARM_pc;
-		restart_addr = continue_addr - (thumb_mode(regs) ? 2 : 4);
-		retval = regs->ARM_r0;
-
-		/*
-		 * Prepare for system call restart.  We do this here so that a
-		 * debugger will see the already changed PSW.
-		 */
-		switch (retval) {
-		case -ERESTARTNOHAND:
-		case -ERESTARTSYS:
-		case -ERESTARTNOINTR:
-			regs->ARM_r0 = regs->ARM_ORIG_r0;
-			regs->ARM_pc = restart_addr;
-			break;
-		case -ERESTART_RESTARTBLOCK:
-			regs->ARM_r0 = -EINTR;
-			break;
-		}
-	}
-
-	if (try_to_freeze())
-		goto no_signal;
+	if (syscall &&
+	    (regs->ARM_r0 == -ERESTARTSYS ||
+	     regs->ARM_r0 == -ERESTARTNOINTR ||
+	     regs->ARM_r0 == -ERESTARTNOHAND ||
+	     regs->ARM_r0 == -ERESTART_RESTARTBLOCK))
+		set_thread_flag(TIF_SYS_RESTART);
 
 	/*
 	 * Get the signal to deliver.  When running under ptrace, at this
@@ -709,19 +821,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
 	if (signr > 0) {
 		sigset_t *oldset;
 
-		/*
-		 * Depending on the signal settings we may need to revert the
-		 * decision to restart the system call.  But skip this if a
-		 * debugger has chosen to restart at a different PC.
-		 */
-		if (regs->ARM_pc == restart_addr) {
-			if (retval == -ERESTARTNOHAND
-			    || (retval == -ERESTARTSYS
-				&& !(ka.sa.sa_flags & SA_RESTART))) {
-				regs->ARM_r0 = -EINTR;
-				regs->ARM_pc = continue_addr;
-			}
-		}
+		syscall_restart_handler(regs, &ka);
 
 		if (test_thread_flag(TIF_RESTORE_SIGMASK))
 			oldset = &current->saved_sigmask;
@@ -740,38 +840,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
 		return;
 	}
 
- no_signal:
 	if (syscall) {
-		/*
-		 * Handle restarting a different system call.  As above,
-		 * if a debugger has chosen to restart at a different PC,
-		 * ignore the restart.
-		 */
-		if (retval == -ERESTART_RESTARTBLOCK
-		    && regs->ARM_pc == continue_addr) {
-			if (thumb_mode(regs)) {
-				regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
-				regs->ARM_pc -= 2;
-			} else {
-#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
-				regs->ARM_r7 = __NR_restart_syscall;
-				regs->ARM_pc -= 4;
-#else
-				u32 __user *usp;
-
-				regs->ARM_sp -= 4;
-				usp = (u32 __user *)regs->ARM_sp;
-
-				if (put_user(regs->ARM_pc, usp) == 0) {
-					regs->ARM_pc = KERN_RESTART_CODE;
-				} else {
-					regs->ARM_sp += 4;
-					force_sigsegv(0, current);
-				}
-#endif
-			}
-		}
-
 		/* If there's no signal to deliver, we just put the saved sigmask
 		 * back.
 		 */


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-25 14:55                           ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-25 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 25, 2011 at 03:09:07PM +0200, Tejun Heo wrote:
> Hey, Russell.
> 
> If you can fix it properly without going through temporary step,
> that's awesome.  Let's put the arguments behind, okay?

Here's the patch.  As the kernel I've run this against doesn't have the
change to try_to_freeze(), I added a might_sleep() in do_signal() during
my testing to verify that it fixes Mark's problem (which it does.)

I've tested functions returning -ERESTARTSYS, -ERESTARTNOHAND and
-ERESTART_RESTARTBLOCK, all of which seem to behave as expected with
signals such as SIGCONT (without handler) and SIGALRM (with handler).
I haven't tested -ERESTARTNOINTR.

I don't have a test case for the race condition I mentioned (which is
admittedly pretty difficult to construct, requiring an explicit
signal, schedule, signal sequence) but this should plug that too.

How do we achieve this?  Effectively the steps in this patch are:

1. Undo Arnd's fixups to the syscall restart processing (but don't worry,
   we restore it in step 3).

2. Introduce TIF_SYS_RESTART, which is set when we enter signal handling
   and the syscall has returned one of the restart codes.  This is used
   as a flag to indicate that we have some syscall restart processing to
   do at some point.

3. Clear TIF_SYS_RESTART whenever ptrace is used to set the GP registers
   (thereby restoring Arnd's fixup for his gdb testsuite problem - it
   would be good if Arnd could reconfirm that.)

4. When we setup a user handler to run, check TIF_SYS_RESTART and clear it.
   If it was set, we need to set things up to return -EINTR or restart the
   syscall as appropriate.  As we've cleared it, no further restart
   processing will occur.

5. Once we've run all work (signal delivery, and rescheduling events), and
   we're about to return to userspace, make a final check for TIF_SYS_RESTART.
   If it's still set, then we're returning to userspace having not setup
   any user handlers, and we need to restart the syscall.  This is mostly
   trivial, except for OABI restartblock which requires the user stack to
   be written.  We have to re-enable IRQs for this write, which means we
   have to manually re-check for rescheduling events, abort the restart,
   and try again later.

One of the side effects of reverting Arnd's patch is that we restore the
strace behaviour which we've had for years on ARM, and can still be seen
on x86: strace can see the -ERESTART return codes from the kernel syscalls,
rather than what seems to be the signal number:

Before:
rt_sigsuspend([] <unfinished ...>
--- SIGIO (I/O possible) ---
<... rt_sigsuspend resumed> )           = 29
sigreturn()                             = ? (mask now [])

vs:
rt_sigsuspend([])                       = ? ERESTARTNOHAND (To be restarted)
--- SIGIO (I/O possible) @ 0 (0) ---
sigreturn()                             = ? (mask now [])

x86:
rt_sigsuspend([])                       = ? ERESTARTNOHAND (To be restarted)
--- {si_signo=SIGIO, si_code=SI_USER} (I/O possible) ---
sigreturn()                             = ? (mask now [])

So, this patch should fix:
1. The race which I identified in the signal handling code (I think x86
   and other architectures can suffer from it too.)
2. The warning from try_to_freeze.
3. The unanticipated change to strace output.

Arnd, can you test this to make sure your gdb test case still works, and
Mark, can you test this to make sure it fixes your problem please?

Thanks.

 arch/arm/include/asm/thread_info.h |    3 +
 arch/arm/kernel/entry-common.S     |   11 ++
 arch/arm/kernel/ptrace.c           |    2 +
 arch/arm/kernel/signal.c           |  209 ++++++++++++++++++++++++------------
 4 files changed, 155 insertions(+), 70 deletions(-)

diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 7b5cc8d..40df533 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -129,6 +129,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
 /*
  * thread information flags:
  *  TIF_SYSCALL_TRACE	- syscall trace active
+ *  TIF_SYS_RESTART	- syscall restart processing
  *  TIF_SIGPENDING	- signal pending
  *  TIF_NEED_RESCHED	- rescheduling necessary
  *  TIF_NOTIFY_RESUME	- callback before returning to user
@@ -139,6 +140,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
 #define TIF_NEED_RESCHED	1
 #define TIF_NOTIFY_RESUME	2	/* callback before returning to user */
 #define TIF_SYSCALL_TRACE	8
+#define TIF_SYS_RESTART		9
 #define TIF_POLLING_NRFLAG	16
 #define TIF_USING_IWMMXT	17
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
@@ -147,6 +149,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
 #define TIF_SECCOMP		21
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
+#define _TIF_SYS_RESTART	(1 << TIF_SYS_RESTART)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index b2a27b6..e922b85 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -45,6 +45,7 @@ ret_fast_syscall:
 fast_work_pending:
 	str	r0, [sp, #S_R0+S_OFF]!		@ returned r0
 work_pending:
+	enable_irq
 	tst	r1, #_TIF_NEED_RESCHED
 	bne	work_resched
 	tst	r1, #_TIF_SIGPENDING|_TIF_NOTIFY_RESUME
@@ -56,6 +57,13 @@ work_pending:
 	bl	do_notify_resume
 	b	ret_slow_syscall		@ Check work again
 
+work_syscall_restart:
+	mov	r0, sp				@ 'regs'
+	bl	syscall_restart			@ process system call restart
+	teq	r0, #0				@ if ret=0 -> success, so
+	beq	ret_restart			@ return to userspace directly
+	b	ret_slow_syscall		@ otherwise, we have a segfault
+
 work_resched:
 	bl	schedule
 /*
@@ -69,6 +77,9 @@ ENTRY(ret_to_user_from_irq)
 	tst	r1, #_TIF_WORK_MASK
 	bne	work_pending
 no_work_pending:
+	tst	r1, #_TIF_SYS_RESTART
+	bne	work_syscall_restart
+ret_restart:
 #if defined(CONFIG_IRQSOFF_TRACER)
 	asm_trace_hardirqs_on
 #endif
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 2491f3b..ac8c34e 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -177,6 +177,7 @@ put_user_reg(struct task_struct *task, int offset, long data)
 
 	if (valid_user_regs(&newregs)) {
 		regs->uregs[offset] = data;
+		clear_ti_thread_flag(task_thread_info(task), TIF_SYS_RESTART);
 		ret = 0;
 	}
 
@@ -604,6 +605,7 @@ static int gpr_set(struct task_struct *target,
 		return -EINVAL;
 
 	*task_pt_regs(target) = newregs;
+	clear_ti_thread_flag(task_thread_info(target), TIF_SYS_RESTART);
 	return 0;
 }
 
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 0340224..42a1521 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -649,6 +649,135 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
 }
 
 /*
+ * Syscall restarting codes
+ *
+ * -ERESTARTSYS: restart system call if no handler, or if there is a
+ *	handler but it's marked SA_RESTART.  Otherwise return -EINTR.
+ * -ERESTARTNOINTR: always restart system call
+ * -ERESTARTNOHAND: restart system call only if no handler, otherwise
+ *	return -EINTR if invoking a user signal handler.
+ * -ERESTART_RESTARTBLOCK: call restart syscall if no handler, otherwise
+ *	return -EINTR if invoking a user signal handler.
+ */
+static void setup_syscall_restart(struct pt_regs *regs)
+{
+	regs->ARM_r0 = regs->ARM_ORIG_r0;
+	regs->ARM_pc -= thumb_mode(regs) ? 2 : 4;
+}
+
+/*
+ * Depending on the signal settings we may need to revert the decision
+ * to restart the system call.  But skip this if a debugger has chosen
+ * to restart at a different PC.
+ */
+static void syscall_restart_handler(struct pt_regs *regs, struct k_sigaction *ka)
+{
+	if (test_and_clear_thread_flag(TIF_SYS_RESTART)) {
+		long r0 = regs->ARM_r0;
+
+		/*
+		 * By default, return -EINTR to the user process for any
+		 * syscall which would otherwise be restarted.
+		 */
+		regs->ARM_r0 = -EINTR;
+
+		if (r0 == -ERESTARTNOINTR ||
+		    (r0 == -ERESTARTSYS && !(ka->sa.sa_flags & SA_RESTART)))
+			setup_syscall_restart(regs);
+	}
+}
+
+/*
+ * Handle syscall restarting when there is no user handler in place for
+ * a delivered signal.  Rather than doing this as part of the normal
+ * signal processing, we do this on the final return to userspace, after
+ * we've finished handling signals and checking for schedule events.
+ *
+ * This avoids bad behaviour such as:
+ *  - syscall returns -ERESTARTNOHAND
+ *  - signal with no handler (so we set things up to restart the syscall)
+ *  - schedule
+ *  - signal with handler (eg, SIGALRM)
+ *  - we call the handler and then restart the syscall
+ *
+ * In order to avoid races with TIF_NEED_RESCHED, IRQs must be disabled
+ * when this function is called and remain disabled until we exit to
+ * userspace.
+ */
+asmlinkage int syscall_restart(struct pt_regs *regs)
+{
+	struct thread_info *thread = current_thread_info();
+
+	clear_ti_thread_flag(thread, TIF_SYS_RESTART);
+	
+	/*
+	 * Restart the system call.  We haven't setup a signal handler
+	 * to invoke, and the regset hasn't been usurped by ptrace.
+	 */
+	if (regs->ARM_r0 == -ERESTART_RESTARTBLOCK) {
+		if (thumb_mode(regs)) {
+			regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
+			regs->ARM_pc -= 2;
+		} else {
+#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
+			regs->ARM_r7 = __NR_restart_syscall;
+			regs->ARM_pc -= 4;
+#else
+			u32 sp = regs->ARM_sp - 4;
+			u32 __user *usp = (u32 __user *)sp;
+			int ret;
+
+			/*
+			 * For OABI, we need to play some extra games, because
+			 * we need to write to the users stack, which we can't
+			 * do reliably from IRQs-disabled context.  Temporarily
+			 * re-enable IRQs, perform the store, and then plug
+			 * the resulting race afterwards.
+			 */
+			local_irq_enable();
+			ret = put_user(regs->ARM_pc, usp);
+			local_irq_disable();
+
+			/*
+			 * Plug the reschedule race - if we need to reschedule,
+			 * abort the syscall restarting.  We haven't modified
+			 * anything other than the attempted write to the stack
+			 * so we can merely retry later.
+			 */
+			if (need_resched()) {
+				set_ti_thread_flag(thread, TIF_SYS_RESTART);
+				return -EINTR;
+			}
+
+			/*
+			 * We failed (for some reason) to write to the stack.
+			 * Terminate the task.
+			 */
+			if (ret) {
+				force_sigsegv(0, current);
+				return -EFAULT;
+			}
+
+			/*
+			 * Success, update the stack pointer and point the
+			 * PC at the restarting code.
+			 */
+			regs->ARM_sp = sp;
+			regs->ARM_pc = KERN_RESTART_CODE;
+#endif
+		}
+	} else {
+		/*
+		 * Simple restart - just back up and re-execute the last
+		 * instruction.
+		 */
+		setup_syscall_restart(regs);
+	}
+
+	return 0;
+}
+
+/*
  * Note that 'init' is a special process: it doesn't get signals it doesn't
  * want to handle. Thus you cannot kill init even with a SIGKILL even by
  * mistake.
@@ -659,7 +788,6 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
  */
 static void do_signal(struct pt_regs *regs, int syscall)
 {
-	unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
 	struct k_sigaction ka;
 	siginfo_t info;
 	int signr;
@@ -674,32 +802,16 @@ static void do_signal(struct pt_regs *regs, int syscall)
 		return;
 
 	/*
-	 * If we were from a system call, check for system call restarting...
+	 * Set the SYS_RESTART flag to indicate that we have some
+	 * cleanup of the restart state to perform when returning to
+	 * userspace.
 	 */
-	if (syscall) {
-		continue_addr = regs->ARM_pc;
-		restart_addr = continue_addr - (thumb_mode(regs) ? 2 : 4);
-		retval = regs->ARM_r0;
-
-		/*
-		 * Prepare for system call restart.  We do this here so that a
-		 * debugger will see the already changed PSW.
-		 */
-		switch (retval) {
-		case -ERESTARTNOHAND:
-		case -ERESTARTSYS:
-		case -ERESTARTNOINTR:
-			regs->ARM_r0 = regs->ARM_ORIG_r0;
-			regs->ARM_pc = restart_addr;
-			break;
-		case -ERESTART_RESTARTBLOCK:
-			regs->ARM_r0 = -EINTR;
-			break;
-		}
-	}
-
-	if (try_to_freeze())
-		goto no_signal;
+	if (syscall &&
+	    (regs->ARM_r0 == -ERESTARTSYS ||
+	     regs->ARM_r0 == -ERESTARTNOINTR ||
+	     regs->ARM_r0 == -ERESTARTNOHAND ||
+	     regs->ARM_r0 == -ERESTART_RESTARTBLOCK))
+		set_thread_flag(TIF_SYS_RESTART);
 
 	/*
 	 * Get the signal to deliver.  When running under ptrace, at this
@@ -709,19 +821,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
 	if (signr > 0) {
 		sigset_t *oldset;
 
-		/*
-		 * Depending on the signal settings we may need to revert the
-		 * decision to restart the system call.  But skip this if a
-		 * debugger has chosen to restart at a different PC.
-		 */
-		if (regs->ARM_pc == restart_addr) {
-			if (retval == -ERESTARTNOHAND
-			    || (retval == -ERESTARTSYS
-				&& !(ka.sa.sa_flags & SA_RESTART))) {
-				regs->ARM_r0 = -EINTR;
-				regs->ARM_pc = continue_addr;
-			}
-		}
+		syscall_restart_handler(regs, &ka);
 
 		if (test_thread_flag(TIF_RESTORE_SIGMASK))
 			oldset = &current->saved_sigmask;
@@ -740,38 +840,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
 		return;
 	}
 
- no_signal:
 	if (syscall) {
-		/*
-		 * Handle restarting a different system call.  As above,
-		 * if a debugger has chosen to restart at a different PC,
-		 * ignore the restart.
-		 */
-		if (retval == -ERESTART_RESTARTBLOCK
-		    && regs->ARM_pc == continue_addr) {
-			if (thumb_mode(regs)) {
-				regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
-				regs->ARM_pc -= 2;
-			} else {
-#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
-				regs->ARM_r7 = __NR_restart_syscall;
-				regs->ARM_pc -= 4;
-#else
-				u32 __user *usp;
-
-				regs->ARM_sp -= 4;
-				usp = (u32 __user *)regs->ARM_sp;
-
-				if (put_user(regs->ARM_pc, usp) == 0) {
-					regs->ARM_pc = KERN_RESTART_CODE;
-				} else {
-					regs->ARM_sp += 4;
-					force_sigsegv(0, current);
-				}
-#endif
-			}
-		}
-
 		/* If there's no signal to deliver, we just put the saved sigmask
 		 * back.
 		 */

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 14:55                           ` Russell King - ARM Linux
@ 2011-08-26 14:44                             ` Arnd Bergmann
  -1 siblings, 0 replies; 66+ messages in thread
From: Arnd Bergmann @ 2011-08-26 14:44 UTC (permalink / raw)
  To: Russell King - ARM Linux, ulrich.weigand
  Cc: Tejun Heo, Mark Brown, Rafael J. Wysocki, linux-kernel, linux-arm-kernel

On Thursday 25 August 2011, Russell King - ARM Linux wrote:
> 
> Arnd, can you test this to make sure your gdb test case still works, and
> Mark, can you test this to make sure it fixes your problem please?

Hi Russell,

The patch in question was not actually from me but from Ulrich Weigand,
so he's probably the right person to test your patch.
I'm forwarding it in full to Uli for reference.

	Arnd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-26 14:44                             ` Arnd Bergmann
  0 siblings, 0 replies; 66+ messages in thread
From: Arnd Bergmann @ 2011-08-26 14:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 25 August 2011, Russell King - ARM Linux wrote:
> 
> Arnd, can you test this to make sure your gdb test case still works, and
> Mark, can you test this to make sure it fixes your problem please?

Hi Russell,

The patch in question was not actually from me but from Ulrich Weigand,
so he's probably the right person to test your patch.
I'm forwarding it in full to Uli for reference.

	Arnd

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 14:55                           ` Russell King - ARM Linux
@ 2011-08-30 20:58                             ` Mark Brown
  -1 siblings, 0 replies; 66+ messages in thread
From: Mark Brown @ 2011-08-30 20:58 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Tejun Heo, Arnd Bergmann, Rafael J. Wysocki, linux-kernel,
	linux-arm-kernel

On Thu, Aug 25, 2011 at 03:55:58PM +0100, Russell King - ARM Linux wrote:

> Here's the patch.  As the kernel I've run this against doesn't have the
> change to try_to_freeze(), I added a might_sleep() in do_signal() during
> my testing to verify that it fixes Mark's problem (which it does.)

> Mark, can you test this to make sure it fixes your problem please?

Yes, it resolves the problem for me - thanks!

Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

Sorry about the delay, was away for the bank holiday weekend.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-30 20:58                             ` Mark Brown
  0 siblings, 0 replies; 66+ messages in thread
From: Mark Brown @ 2011-08-30 20:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 25, 2011 at 03:55:58PM +0100, Russell King - ARM Linux wrote:

> Here's the patch.  As the kernel I've run this against doesn't have the
> change to try_to_freeze(), I added a might_sleep() in do_signal() during
> my testing to verify that it fixes Mark's problem (which it does.)

> Mark, can you test this to make sure it fixes your problem please?

Yes, it resolves the problem for me - thanks!

Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

Sorry about the delay, was away for the bank holiday weekend.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-30 20:58                             ` Mark Brown
@ 2011-08-30 21:10                               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-30 21:10 UTC (permalink / raw)
  To: Mark Brown
  Cc: Tejun Heo, Arnd Bergmann, Rafael J. Wysocki, linux-kernel,
	linux-arm-kernel

On Tue, Aug 30, 2011 at 09:58:03PM +0100, Mark Brown wrote:
> On Thu, Aug 25, 2011 at 03:55:58PM +0100, Russell King - ARM Linux wrote:
> 
> > Here's the patch.  As the kernel I've run this against doesn't have the
> > change to try_to_freeze(), I added a might_sleep() in do_signal() during
> > my testing to verify that it fixes Mark's problem (which it does.)
> 
> > Mark, can you test this to make sure it fixes your problem please?
> 
> Yes, it resolves the problem for me - thanks!
> 
> Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
> 
> Sorry about the delay, was away for the bank holiday weekend.

That's good, thanks.

Just waiting for Ulrich Weigand to verify that his gdb test-case doesn't
break with this patch.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-08-30 21:10                               ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-08-30 21:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 30, 2011 at 09:58:03PM +0100, Mark Brown wrote:
> On Thu, Aug 25, 2011 at 03:55:58PM +0100, Russell King - ARM Linux wrote:
> 
> > Here's the patch.  As the kernel I've run this against doesn't have the
> > change to try_to_freeze(), I added a might_sleep() in do_signal() during
> > my testing to verify that it fixes Mark's problem (which it does.)
> 
> > Mark, can you test this to make sure it fixes your problem please?
> 
> Yes, it resolves the problem for me - thanks!
> 
> Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
> 
> Sorry about the delay, was away for the bank holiday weekend.

That's good, thanks.

Just waiting for Ulrich Weigand to verify that his gdb test-case doesn't
break with this patch.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-26 14:44                             ` Arnd Bergmann
@ 2011-09-01 13:41                               ` Ulrich Weigand
  -1 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-01 13:41 UTC (permalink / raw)
  To: Arnd Bergmann, Russell King - ARM Linux
  Cc: Mark Brown, linux-arm-kernel, linux-kernel, Rafael J. Wysocki,
	Tejun Heo, Nicolas Pitre, linaro-toolchain

Arnd Bergmann <arnd@arndb.de> wrote on 08/26/2011 04:44:26 PM:
> On Thursday 25 August 2011, Russell King - ARM Linux wrote:
> >
> > Arnd, can you test this to make sure your gdb test case still works,
and
> > Mark, can you test this to make sure it fixes your problem please?
>
> Hi Russell,
>
> The patch in question was not actually from me but from Ulrich Weigand,
> so he's probably the right person to test your patch.
> I'm forwarding it in full to Uli for reference.

Hi Arnd, hi Russell,

sorry for the late reply, I've just returned from vacation today ...

I've not yet run the test, but just from reading through the patch
it seems that this will at least partially re-introduce the problem
my original patch was trying to fix.

The situation here is about GDB performing an "inferior function
call", e.g. via the GDB "call" command.  To do so, GDB will:

0. [ Have gotten control of the target process via some ptrace
   intercept previously, and then ... ]
1. Save the register state
2. Create a dummy frame on the stack and set up registers (PC, SP,
   argument registers, ...) as appropriate for a function call
3. Restart via PTRACE_CONTINUE
   [ ... at this point, the target process runs the function until
     it returns to a breakpoint instruction and GDB gets control
     again via another ptrace intercept ... ]
4. Restore the register state saved in [1.]
5. At some later point, continue the target process [at its
   original location] with PTRACE_CONTINUE

The problem now occurs if at point [0.] the target process just
happened to be blocked in a restartable system call.  For this
sequence to then work as expected, two things have to happen:

- at point [3.], the kernel must *not* attempt to restart a
  system call, even though it thinks we're stopped in a
  restartable system call

- at point [5.], the kernel now *must* restart the originally
  interrupted system call, even though it thinks we're stopped
  at some breakpoint, and not within a system call

My patch achieved both these goals, while it would seem your
patch only solves the first issue, not the second one.  In
fact, since any interaction with ptrace will always cause the
TIF_SYS_RESTART flag to be *reset*, and there is no way at all
to *set* it, there doesn't appear to be any way for GDB to
achive that second goal.

[ With my patch, that second goal was implicitly achieved by
the fact that at [1.] GDB would save a register state that
already corresponds to the way things should be for restarting
the system call.  When that register set is then restored in [4.],
restart just happens automatically without any further kernel
intervention. ]

One way to fix this might be to make the TIF_SYS_RESTART flag
itself visible to ptrace, so the GDB could save/restore it
along with the rest of the register set; this would be similar
to how that problem is handled on other platforms.  However,
there doesn't appear to be an obvious place for the flag in
the ptrace register set ...


Bye,
Ulrich


^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-01 13:41                               ` Ulrich Weigand
  0 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-01 13:41 UTC (permalink / raw)
  To: linux-arm-kernel

Arnd Bergmann <arnd@arndb.de> wrote on 08/26/2011 04:44:26 PM:
> On Thursday 25 August 2011, Russell King - ARM Linux wrote:
> >
> > Arnd, can you test this to make sure your gdb test case still works,
and
> > Mark, can you test this to make sure it fixes your problem please?
>
> Hi Russell,
>
> The patch in question was not actually from me but from Ulrich Weigand,
> so he's probably the right person to test your patch.
> I'm forwarding it in full to Uli for reference.

Hi Arnd, hi Russell,

sorry for the late reply, I've just returned from vacation today ...

I've not yet run the test, but just from reading through the patch
it seems that this will at least partially re-introduce the problem
my original patch was trying to fix.

The situation here is about GDB performing an "inferior function
call", e.g. via the GDB "call" command.  To do so, GDB will:

0. [ Have gotten control of the target process via some ptrace
   intercept previously, and then ... ]
1. Save the register state
2. Create a dummy frame on the stack and set up registers (PC, SP,
   argument registers, ...) as appropriate for a function call
3. Restart via PTRACE_CONTINUE
   [ ... at this point, the target process runs the function until
     it returns to a breakpoint instruction and GDB gets control
     again via another ptrace intercept ... ]
4. Restore the register state saved in [1.]
5. At some later point, continue the target process [at its
   original location] with PTRACE_CONTINUE

The problem now occurs if at point [0.] the target process just
happened to be blocked in a restartable system call.  For this
sequence to then work as expected, two things have to happen:

- at point [3.], the kernel must *not* attempt to restart a
  system call, even though it thinks we're stopped in a
  restartable system call

- at point [5.], the kernel now *must* restart the originally
  interrupted system call, even though it thinks we're stopped
  at some breakpoint, and not within a system call

My patch achieved both these goals, while it would seem your
patch only solves the first issue, not the second one.  In
fact, since any interaction with ptrace will always cause the
TIF_SYS_RESTART flag to be *reset*, and there is no way at all
to *set* it, there doesn't appear to be any way for GDB to
achive that second goal.

[ With my patch, that second goal was implicitly achieved by
the fact that at [1.] GDB would save a register state that
already corresponds to the way things should be for restarting
the system call.  When that register set is then restored in [4.],
restart just happens automatically without any further kernel
intervention. ]

One way to fix this might be to make the TIF_SYS_RESTART flag
itself visible to ptrace, so the GDB could save/restore it
along with the rest of the register set; this would be similar
to how that problem is handled on other platforms.  However,
there doesn't appear to be an obvious place for the flag in
the ptrace register set ...


Bye,
Ulrich

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-09-01 13:41                               ` Ulrich Weigand
@ 2011-09-01 14:00                                 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-09-01 14:00 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Arnd Bergmann, Mark Brown, linux-arm-kernel, linux-kernel,
	Rafael J. Wysocki, Tejun Heo, Nicolas Pitre, linaro-toolchain

On Thu, Sep 01, 2011 at 03:41:22PM +0200, Ulrich Weigand wrote:
> The problem now occurs if at point [0.] the target process just
> happened to be blocked in a restartable system call.  For this
> sequence to then work as expected, two things have to happen:
> 
> - at point [3.], the kernel must *not* attempt to restart a
>   system call, even though it thinks we're stopped in a
>   restartable system call
> 
> - at point [5.], the kernel now *must* restart the originally
>   interrupted system call, even though it thinks we're stopped
>   at some breakpoint, and not within a system call
> 
> My patch achieved both these goals, while it would seem your
> patch only solves the first issue, not the second one.  In
> fact, since any interaction with ptrace will always cause the
> TIF_SYS_RESTART flag to be *reset*, and there is no way at all
> to *set* it, there doesn't appear to be any way for GDB to
> achive that second goal.
...
> One way to fix this might be to make the TIF_SYS_RESTART flag
> itself visible to ptrace, so the GDB could save/restore it
> along with the rest of the register set; this would be similar
> to how that problem is handled on other platforms.  However,
> there doesn't appear to be an obvious place for the flag in
> the ptrace register set ...

Thanks for looking at this.

I don't think we can augment the ptrace register set - that would be a
major API change which would immediately break lots of userspace,
causing user stack overflows and such like.

I can't see a way out of this - and given the seriousness of the kernel
side issue (causing kernel warnings), and that your change altered the
strace behaviour (an unintended user-visible change) I think we're going
to have to live with the gdb testcase failing until we can come up with
a better fix for it.

I also wonder what the validity of this behaviour is - there are cases
where you can't do what gdb's trying to do - eg, with a syscall using
a restart block (-ERESTART_RESTARTBLOCK) because the restart information
could be wiped out by a new syscall performed by the function gdb wants
to run.  Or when the program receives a signal for it to handle while
running that function.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-01 14:00                                 ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-09-01 14:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 01, 2011 at 03:41:22PM +0200, Ulrich Weigand wrote:
> The problem now occurs if at point [0.] the target process just
> happened to be blocked in a restartable system call.  For this
> sequence to then work as expected, two things have to happen:
> 
> - at point [3.], the kernel must *not* attempt to restart a
>   system call, even though it thinks we're stopped in a
>   restartable system call
> 
> - at point [5.], the kernel now *must* restart the originally
>   interrupted system call, even though it thinks we're stopped
>   at some breakpoint, and not within a system call
> 
> My patch achieved both these goals, while it would seem your
> patch only solves the first issue, not the second one.  In
> fact, since any interaction with ptrace will always cause the
> TIF_SYS_RESTART flag to be *reset*, and there is no way at all
> to *set* it, there doesn't appear to be any way for GDB to
> achive that second goal.
...
> One way to fix this might be to make the TIF_SYS_RESTART flag
> itself visible to ptrace, so the GDB could save/restore it
> along with the rest of the register set; this would be similar
> to how that problem is handled on other platforms.  However,
> there doesn't appear to be an obvious place for the flag in
> the ptrace register set ...

Thanks for looking at this.

I don't think we can augment the ptrace register set - that would be a
major API change which would immediately break lots of userspace,
causing user stack overflows and such like.

I can't see a way out of this - and given the seriousness of the kernel
side issue (causing kernel warnings), and that your change altered the
strace behaviour (an unintended user-visible change) I think we're going
to have to live with the gdb testcase failing until we can come up with
a better fix for it.

I also wonder what the validity of this behaviour is - there are cases
where you can't do what gdb's trying to do - eg, with a syscall using
a restart block (-ERESTART_RESTARTBLOCK) because the restart information
could be wiped out by a new syscall performed by the function gdb wants
to run.  Or when the program receives a signal for it to handle while
running that function.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-09-01 14:00                                 ` Russell King - ARM Linux
@ 2011-09-02 14:47                                   ` Ulrich Weigand
  -1 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-02 14:47 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, Mark Brown, linaro-toolchain, linux-arm-kernel,
	linux-kernel, Nicolas Pitre, Rafael J. Wysocki, Tejun Heo,
	Martin Schwidefsky

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/01/2011
04:00:00 PM:
> On Thu, Sep 01, 2011 at 03:41:22PM +0200, Ulrich Weigand wrote:
> > One way to fix this might be to make the TIF_SYS_RESTART flag
> > itself visible to ptrace, so the GDB could save/restore it
> > along with the rest of the register set; this would be similar
> > to how that problem is handled on other platforms.  However,
> > there doesn't appear to be an obvious place for the flag in
> > the ptrace register set ...
>
> Thanks for looking at this.
>
> I don't think we can augment the ptrace register set - that would be a
> major API change which would immediately break lots of userspace,
> causing user stack overflows and such like.

Well, it wouldn't need to go into the default register set (REGSET_GPR),
but could be in a new register set.  This shouldn't change anything for
existing applications, but GDB could probe for support for the new
regset and use it if present.

> I can't see a way out of this - and given the seriousness of the kernel
> side issue (causing kernel warnings), and that your change altered the
> strace behaviour (an unintended user-visible change) I think we're going
> to have to live with the gdb testcase failing until we can come up with
> a better fix for it.

The change you suggest, with the addition of making the flag available
to the debugger in a new register set, should be fine for GDB.

I agree that the race you pointed out in the initial mail to this thread
needs to be fixed; an in fact, it seems that this race is also present
on s390 ...  In discussions with Martin, we were wondering however
whether the fix you suggest does actually fully plug the race:

Assume the scenario you initally describe, where a first signal is
ignored and leads to system call restart.  With your latest patch,
you call into syscall_restart which sets everything up to restart
the call (with interrupts disabled).  Then, you go back to user space,
with PC pointing to the SVC that is to be restarted.  Now assume that
some interrupt is pending at this point.  At the switch to user
space, interrupts will be enabled.  Can it now happen that this
pending interrupt is recognized *before* the SVC is executed?
(Maybe the hardware doesn't allow that on ARM ... it definitely
*is* possible on s390, though.)

If so, can this not then lead to the exact same race you initially
described (the interrupt causing some other task to be scheduled,
and then an second signal to be delivered to the task about to
restart the SVC)?

I guess one way to fix this would be to not actually go back to
user space for system call restart at all, but instead just short-
cut this in entry.S and simply go right back to the system call
entry path.  As a side benefit, this could eliminate the need to
create a stack frame for the OABI RESTARTBLOCK case ...  (This is
what we do on s390 b.t.w., where there is a similar issue with the
system call number being an immediate argument to the SVC
instruction.)

> I also wonder what the validity of this behaviour is - there are cases
> where you can't do what gdb's trying to do - eg, with a syscall using
> a restart block (-ERESTART_RESTARTBLOCK) because the restart information
> could be wiped out by a new syscall performed by the function gdb wants
> to run.  Or when the program receives a signal for it to handle while
> running that function.

Yes, performing an inferior call during a system call that restarts
using RESTARTBLOCK is broken.  This is a problem on all architectures,
and has been ever since the RESTARTBLOCK mechanism was introduced ...

To really fix this case would probably require some way for the
debugger to save and restore the restore_block saved state.  This
is not quite trivial, since it would expose that state to user space,
effectively creating a new ABI (and probably requiring sanity checks
to ensure a valid state is restored).  This probably cannot be fixed
by one architecture for itself, but would need support from common
kernel code.

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-02 14:47                                   ` Ulrich Weigand
  0 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-02 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/01/2011
04:00:00 PM:
> On Thu, Sep 01, 2011 at 03:41:22PM +0200, Ulrich Weigand wrote:
> > One way to fix this might be to make the TIF_SYS_RESTART flag
> > itself visible to ptrace, so the GDB could save/restore it
> > along with the rest of the register set; this would be similar
> > to how that problem is handled on other platforms.  However,
> > there doesn't appear to be an obvious place for the flag in
> > the ptrace register set ...
>
> Thanks for looking at this.
>
> I don't think we can augment the ptrace register set - that would be a
> major API change which would immediately break lots of userspace,
> causing user stack overflows and such like.

Well, it wouldn't need to go into the default register set (REGSET_GPR),
but could be in a new register set.  This shouldn't change anything for
existing applications, but GDB could probe for support for the new
regset and use it if present.

> I can't see a way out of this - and given the seriousness of the kernel
> side issue (causing kernel warnings), and that your change altered the
> strace behaviour (an unintended user-visible change) I think we're going
> to have to live with the gdb testcase failing until we can come up with
> a better fix for it.

The change you suggest, with the addition of making the flag available
to the debugger in a new register set, should be fine for GDB.

I agree that the race you pointed out in the initial mail to this thread
needs to be fixed; an in fact, it seems that this race is also present
on s390 ...  In discussions with Martin, we were wondering however
whether the fix you suggest does actually fully plug the race:

Assume the scenario you initally describe, where a first signal is
ignored and leads to system call restart.  With your latest patch,
you call into syscall_restart which sets everything up to restart
the call (with interrupts disabled).  Then, you go back to user space,
with PC pointing to the SVC that is to be restarted.  Now assume that
some interrupt is pending at this point.  At the switch to user
space, interrupts will be enabled.  Can it now happen that this
pending interrupt is recognized *before* the SVC is executed?
(Maybe the hardware doesn't allow that on ARM ... it definitely
*is* possible on s390, though.)

If so, can this not then lead to the exact same race you initially
described (the interrupt causing some other task to be scheduled,
and then an second signal to be delivered to the task about to
restart the SVC)?

I guess one way to fix this would be to not actually go back to
user space for system call restart at all, but instead just short-
cut this in entry.S and simply go right back to the system call
entry path.  As a side benefit, this could eliminate the need to
create a stack frame for the OABI RESTARTBLOCK case ...  (This is
what we do on s390 b.t.w., where there is a similar issue with the
system call number being an immediate argument to the SVC
instruction.)

> I also wonder what the validity of this behaviour is - there are cases
> where you can't do what gdb's trying to do - eg, with a syscall using
> a restart block (-ERESTART_RESTARTBLOCK) because the restart information
> could be wiped out by a new syscall performed by the function gdb wants
> to run.  Or when the program receives a signal for it to handle while
> running that function.

Yes, performing an inferior call during a system call that restarts
using RESTARTBLOCK is broken.  This is a problem on all architectures,
and has been ever since the RESTARTBLOCK mechanism was introduced ...

To really fix this case would probably require some way for the
debugger to save and restore the restore_block saved state.  This
is not quite trivial, since it would expose that state to user space,
effectively creating a new ABI (and probably requiring sanity checks
to ensure a valid state is restored).  This probably cannot be fixed
by one architecture for itself, but would need support from common
kernel code.

Bye,
Ulrich

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-09-02 14:47                                   ` Ulrich Weigand
@ 2011-09-02 17:22                                     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-09-02 17:22 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Arnd Bergmann, Mark Brown, linaro-toolchain, linux-arm-kernel,
	linux-kernel, Nicolas Pitre, Rafael J. Wysocki, Tejun Heo,
	Martin Schwidefsky

On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> Assume the scenario you initally describe, where a first signal is
> ignored and leads to system call restart.  With your latest patch,
> you call into syscall_restart which sets everything up to restart
> the call (with interrupts disabled).

I don't think SIG_IGN signals even set the TIF work flag, so they
never even cause a call into do_signal().  Therefore, as far as
syscalls go, attempting to send a process (eg) a SIGINT which its
handler is set to SIG_IGN results in the process not even being
notified about the attempt - we won't even wake up while the
syscall is sleeping.

> To really fix this case would probably require some way for the
> debugger to save and restore the restore_block saved state.  This
> is not quite trivial, since it would expose that state to user space,
> effectively creating a new ABI (and probably requiring sanity checks
> to ensure a valid state is restored).  This probably cannot be fixed
> by one architecture for itself, but would need support from common
> kernel code.

Such state would have to be crytographically signed or kept entirely
within the kernel, as it would otherwise mean that you could redirect
the kernel PC to anywhere...

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-02 17:22                                     ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-09-02 17:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> Assume the scenario you initally describe, where a first signal is
> ignored and leads to system call restart.  With your latest patch,
> you call into syscall_restart which sets everything up to restart
> the call (with interrupts disabled).

I don't think SIG_IGN signals even set the TIF work flag, so they
never even cause a call into do_signal().  Therefore, as far as
syscalls go, attempting to send a process (eg) a SIGINT which its
handler is set to SIG_IGN results in the process not even being
notified about the attempt - we won't even wake up while the
syscall is sleeping.

> To really fix this case would probably require some way for the
> debugger to save and restore the restore_block saved state.  This
> is not quite trivial, since it would expose that state to user space,
> effectively creating a new ABI (and probably requiring sanity checks
> to ensure a valid state is restored).  This probably cannot be fixed
> by one architecture for itself, but would need support from common
> kernel code.

Such state would have to be crytographically signed or kept entirely
within the kernel, as it would otherwise mean that you could redirect
the kernel PC to anywhere...

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-09-02 17:22                                     ` Russell King - ARM Linux
@ 2011-09-02 17:40                                       ` Ulrich Weigand
  -1 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-02 17:40 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, Mark Brown, linaro-toolchain, linux-arm-kernel,
	linux-kernel, Martin Schwidefsky, Nicolas Pitre,
	Rafael J. Wysocki, Tejun Heo

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
07:22:59 PM:
> On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> > Assume the scenario you initally describe, where a first signal is
> > ignored and leads to system call restart.  With your latest patch,
> > you call into syscall_restart which sets everything up to restart
> > the call (with interrupts disabled).
>
> I don't think SIG_IGN signals even set the TIF work flag, so they
> never even cause a call into do_signal().  Therefore, as far as
> syscalls go, attempting to send a process (eg) a SIGINT which its
> handler is set to SIG_IGN results in the process not even being
> notified about the attempt - we won't even wake up while the
> syscall is sleeping.

I don't see why SIG_IGN signals shouldn't set the TIF work flag;
the decision whether to ignore a signal is only made once we've
got to get_signal_to_deliver.  In any case, whether or not the
signal is SIG_IGN doesn't matter for the example at all; I'm
simply talking about the case whether the first signal we get
leads to system call restart, exactly the same as in the original
example you initially described here:
http://lists.arm.linux.org.uk/lurker/message/20110823.154329.a3e65f95.en.html

> > To really fix this case would probably require some way for the
> > debugger to save and restore the restore_block saved state.  This
> > is not quite trivial, since it would expose that state to user space,
> > effectively creating a new ABI (and probably requiring sanity checks
> > to ensure a valid state is restored).  This probably cannot be fixed
> > by one architecture for itself, but would need support from common
> > kernel code.
>
> Such state would have to be crytographically signed or kept entirely
> within the kernel, as it would otherwise mean that you could redirect
> the kernel PC to anywhere...

Agreed, that's why the state would need to be verified (in the case of
the function pointer, we probably would not want to export the kernel
code address to user space in any case, but identify which of the
possible target functions is to be called in some other manner).

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-02 17:40                                       ` Ulrich Weigand
  0 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-02 17:40 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
07:22:59 PM:
> On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> > Assume the scenario you initally describe, where a first signal is
> > ignored and leads to system call restart.  With your latest patch,
> > you call into syscall_restart which sets everything up to restart
> > the call (with interrupts disabled).
>
> I don't think SIG_IGN signals even set the TIF work flag, so they
> never even cause a call into do_signal().  Therefore, as far as
> syscalls go, attempting to send a process (eg) a SIGINT which its
> handler is set to SIG_IGN results in the process not even being
> notified about the attempt - we won't even wake up while the
> syscall is sleeping.

I don't see why SIG_IGN signals shouldn't set the TIF work flag;
the decision whether to ignore a signal is only made once we've
got to get_signal_to_deliver.  In any case, whether or not the
signal is SIG_IGN doesn't matter for the example at all; I'm
simply talking about the case whether the first signal we get
leads to system call restart, exactly the same as in the original
example you initially described here:
http://lists.arm.linux.org.uk/lurker/message/20110823.154329.a3e65f95.en.html

> > To really fix this case would probably require some way for the
> > debugger to save and restore the restore_block saved state.  This
> > is not quite trivial, since it would expose that state to user space,
> > effectively creating a new ABI (and probably requiring sanity checks
> > to ensure a valid state is restored).  This probably cannot be fixed
> > by one architecture for itself, but would need support from common
> > kernel code.
>
> Such state would have to be crytographically signed or kept entirely
> within the kernel, as it would otherwise mean that you could redirect
> the kernel PC to anywhere...

Agreed, that's why the state would need to be verified (in the case of
the function pointer, we probably would not want to export the kernel
code address to user space in any case, but identify which of the
possible target functions is to be called in some other manner).

Bye,
Ulrich

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-09-02 17:40                                       ` Ulrich Weigand
@ 2011-09-02 17:48                                         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-09-02 17:48 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Arnd Bergmann, Mark Brown, linaro-toolchain, linux-arm-kernel,
	linux-kernel, Martin Schwidefsky, Nicolas Pitre,
	Rafael J. Wysocki, Tejun Heo

On Fri, Sep 02, 2011 at 07:40:34PM +0200, Ulrich Weigand wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
> 07:22:59 PM:
> > On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> > > Assume the scenario you initally describe, where a first signal is
> > > ignored and leads to system call restart.  With your latest patch,
> > > you call into syscall_restart which sets everything up to restart
> > > the call (with interrupts disabled).
> >
> > I don't think SIG_IGN signals even set the TIF work flag, so they
> > never even cause a call into do_signal().  Therefore, as far as
> > syscalls go, attempting to send a process (eg) a SIGINT which its
> > handler is set to SIG_IGN results in the process not even being
> > notified about the attempt - we won't even wake up while the
> > syscall is sleeping.
> 
> I don't see why SIG_IGN signals shouldn't set the TIF work flag;
> the decision whether to ignore a signal is only made once we've
> got to get_signal_to_deliver.

Yes, having looked deeper, you seem to be right - but only if the thread
is being ptraced.  If it's not being ptraced, ignored signals don't
make it that far.

And yes, we can end up processing the interrupt before the SVC is
executed, which is still a hole.  So we need to avoid doing the
restart in userspace - which might actually make things easier.
I'll take a look into that over the weekend.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-02 17:48                                         ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2011-09-02 17:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 02, 2011 at 07:40:34PM +0200, Ulrich Weigand wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
> 07:22:59 PM:
> > On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> > > Assume the scenario you initally describe, where a first signal is
> > > ignored and leads to system call restart.  With your latest patch,
> > > you call into syscall_restart which sets everything up to restart
> > > the call (with interrupts disabled).
> >
> > I don't think SIG_IGN signals even set the TIF work flag, so they
> > never even cause a call into do_signal().  Therefore, as far as
> > syscalls go, attempting to send a process (eg) a SIGINT which its
> > handler is set to SIG_IGN results in the process not even being
> > notified about the attempt - we won't even wake up while the
> > syscall is sleeping.
> 
> I don't see why SIG_IGN signals shouldn't set the TIF work flag;
> the decision whether to ignore a signal is only made once we've
> got to get_signal_to_deliver.

Yes, having looked deeper, you seem to be right - but only if the thread
is being ptraced.  If it's not being ptraced, ignored signals don't
make it that far.

And yes, we can end up processing the interrupt before the SVC is
executed, which is still a hole.  So we need to avoid doing the
restart in userspace - which might actually make things easier.
I'll take a look into that over the weekend.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-09-02 17:48                                         ` Russell King - ARM Linux
@ 2011-09-16 10:31                                           ` Martin Schwidefsky
  -1 siblings, 0 replies; 66+ messages in thread
From: Martin Schwidefsky @ 2011-09-16 10:31 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, Mark Brown, linaro-toolchain, linux-arm-kernel,
	linux-kernel, Nicolas Pitre, Rafael J. Wysocki, Tejun Heo,
	Ulrich Weigand

Hi Russell,

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
07:48:12 PM:

> On Fri, Sep 02, 2011 at 07:40:34PM +0200, Ulrich Weigand wrote:
> > Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
> > 07:22:59 PM:
> > > On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> > > > Assume the scenario you initally describe, where a first signal is
> > > > ignored and leads to system call restart.  With your latest patch,
> > > > you call into syscall_restart which sets everything up to restart
> > > > the call (with interrupts disabled).
> > >
> > > I don't think SIG_IGN signals even set the TIF work flag, so they
> > > never even cause a call into do_signal().  Therefore, as far as
> > > syscalls go, attempting to send a process (eg) a SIGINT which its
> > > handler is set to SIG_IGN results in the process not even being
> > > notified about the attempt - we won't even wake up while the
> > > syscall is sleeping.
> >
> > I don't see why SIG_IGN signals shouldn't set the TIF work flag;
> > the decision whether to ignore a signal is only made once we've
> > got to get_signal_to_deliver.
>
> Yes, having looked deeper, you seem to be right - but only if the thread
> is being ptraced.  If it's not being ptraced, ignored signals don't
> make it that far.
>
> And yes, we can end up processing the interrupt before the SVC is
> executed, which is still a hole.  So we need to avoid doing the
> restart in userspace - which might actually make things easier.
> I'll take a look into that over the weekend.

After some more discussions with Uli I've now created a patch that
hopefully will address the issue on s390. The new code always uses
the TIF_RESTART_SVC work flag to restart the system call if there
is no signal to deliver. Only if there is a signal to deliver the
traditional rewind of the instruction pointer is used to restart
the system call (for -ERESTARTNOINTR and -ERESTARTSYS with a
SA_RESTART signal handler).

In case you are interested in the code is available at
git://git390.marist.edu/pub/scm/linux-2.6.git features
in particular git commit 4a6a001c39ac196530d8378408b61d6d2fee70d9.

blue skies,
   Martin


^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-16 10:31                                           ` Martin Schwidefsky
  0 siblings, 0 replies; 66+ messages in thread
From: Martin Schwidefsky @ 2011-09-16 10:31 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
07:48:12 PM:

> On Fri, Sep 02, 2011 at 07:40:34PM +0200, Ulrich Weigand wrote:
> > Russell King - ARM Linux <linux@arm.linux.org.uk> wrote on 09/02/2011
> > 07:22:59 PM:
> > > On Fri, Sep 02, 2011 at 04:47:35PM +0200, Ulrich Weigand wrote:
> > > > Assume the scenario you initally describe, where a first signal is
> > > > ignored and leads to system call restart.  With your latest patch,
> > > > you call into syscall_restart which sets everything up to restart
> > > > the call (with interrupts disabled).
> > >
> > > I don't think SIG_IGN signals even set the TIF work flag, so they
> > > never even cause a call into do_signal().  Therefore, as far as
> > > syscalls go, attempting to send a process (eg) a SIGINT which its
> > > handler is set to SIG_IGN results in the process not even being
> > > notified about the attempt - we won't even wake up while the
> > > syscall is sleeping.
> >
> > I don't see why SIG_IGN signals shouldn't set the TIF work flag;
> > the decision whether to ignore a signal is only made once we've
> > got to get_signal_to_deliver.
>
> Yes, having looked deeper, you seem to be right - but only if the thread
> is being ptraced.  If it's not being ptraced, ignored signals don't
> make it that far.
>
> And yes, we can end up processing the interrupt before the SVC is
> executed, which is still a hole.  So we need to avoid doing the
> restart in userspace - which might actually make things easier.
> I'll take a look into that over the weekend.

After some more discussions with Uli I've now created a patch that
hopefully will address the issue on s390. The new code always uses
the TIF_RESTART_SVC work flag to restart the system call if there
is no signal to deliver. Only if there is a signal to deliver the
traditional rewind of the instruction pointer is used to restart
the system call (for -ERESTARTNOINTR and -ERESTARTSYS with a
SA_RESTART signal handler).

In case you are interested in the code is available at
git://git390.marist.edu/pub/scm/linux-2.6.git features
in particular git commit 4a6a001c39ac196530d8378408b61d6d2fee70d9.

blue skies,
   Martin

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-09-02 17:48                                         ` Russell King - ARM Linux
@ 2011-09-27 17:45                                           ` Ulrich Weigand
  -1 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-27 17:45 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, Mark Brown, linaro-toolchain, linux-arm-kernel,
	linux-kernel, Martin Schwidefsky, Nicolas Pitre,
	Rafael J. Wysocki, Tejun Heo

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> And yes, we can end up processing the interrupt before the SVC is
> executed, which is still a hole.  So we need to avoid doing the
> restart in userspace - which might actually make things easier.
> I'll take a look into that over the weekend.

Hi Russell, I was wondering whether is there any news on this?
If you do rewrite the signal handler processing, please let me
know so I can update GDB accordingly ...  Thanks!


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand | Phone: +49-7031/16-3727
  STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
  IBM Deutschland Research & Development GmbH
  Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
  Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294


^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2011-09-27 17:45                                           ` Ulrich Weigand
  0 siblings, 0 replies; 66+ messages in thread
From: Ulrich Weigand @ 2011-09-27 17:45 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> And yes, we can end up processing the interrupt before the SVC is
> executed, which is still a hole.  So we need to avoid doing the
> restart in userspace - which might actually make things easier.
> I'll take a look into that over the weekend.

Hi Russell, I was wondering whether is there any news on this?
If you do rewrite the signal handler processing, please let me
know so I can update GDB accordingly ...  Thanks!


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
  Dr. Ulrich Weigand | Phone: +49-7031/16-3727
  STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
  IBM Deutschland Research & Development GmbH
  Vorsitzender des Aufsichtsrats: Martin Jetter | Gesch?ftsf?hrung: Dirk
Wittkopp
  Sitz der Gesellschaft: B?blingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2011-08-25 14:55                           ` Russell King - ARM Linux
@ 2012-06-26 16:39                             ` Mandeep Singh Baines
  -1 siblings, 0 replies; 66+ messages in thread
From: Mandeep Singh Baines @ 2012-06-26 16:39 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Tejun Heo, Arnd Bergmann, Mark Brown, Rafael J. Wysocki,
	linux-kernel, linux-arm-kernel, Doug Anderson

On Thu, Aug 25, 2011 at 7:55 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Aug 25, 2011 at 03:09:07PM +0200, Tejun Heo wrote:
>> Hey, Russell.
>>
>> If you can fix it properly without going through temporary step,
>> that's awesome.  Let's put the arguments behind, okay?
>
> Here's the patch.  As the kernel I've run this against doesn't have the
> change to try_to_freeze(), I added a might_sleep() in do_signal() during
> my testing to verify that it fixes Mark's problem (which it does.)
>
> I've tested functions returning -ERESTARTSYS, -ERESTARTNOHAND and
> -ERESTART_RESTARTBLOCK, all of which seem to behave as expected with
> signals such as SIGCONT (without handler) and SIGALRM (with handler).
> I haven't tested -ERESTARTNOINTR.
>
> I don't have a test case for the race condition I mentioned (which is
> admittedly pretty difficult to construct, requiring an explicit
> signal, schedule, signal sequence) but this should plug that too.
>
> How do we achieve this?  Effectively the steps in this patch are:
>
> 1. Undo Arnd's fixups to the syscall restart processing (but don't worry,
>   we restore it in step 3).
>
> 2. Introduce TIF_SYS_RESTART, which is set when we enter signal handling
>   and the syscall has returned one of the restart codes.  This is used
>   as a flag to indicate that we have some syscall restart processing to
>   do at some point.
>
> 3. Clear TIF_SYS_RESTART whenever ptrace is used to set the GP registers
>   (thereby restoring Arnd's fixup for his gdb testsuite problem - it
>   would be good if Arnd could reconfirm that.)
>
> 4. When we setup a user handler to run, check TIF_SYS_RESTART and clear it.
>   If it was set, we need to set things up to return -EINTR or restart the
>   syscall as appropriate.  As we've cleared it, no further restart
>   processing will occur.
>
> 5. Once we've run all work (signal delivery, and rescheduling events), and
>   we're about to return to userspace, make a final check for TIF_SYS_RESTART.
>   If it's still set, then we're returning to userspace having not setup
>   any user handlers, and we need to restart the syscall.  This is mostly
>   trivial, except for OABI restartblock which requires the user stack to
>   be written.  We have to re-enable IRQs for this write, which means we
>   have to manually re-check for rescheduling events, abort the restart,
>   and try again later.
>
> One of the side effects of reverting Arnd's patch is that we restore the
> strace behaviour which we've had for years on ARM, and can still be seen
> on x86: strace can see the -ERESTART return codes from the kernel syscalls,
> rather than what seems to be the signal number:
>
> Before:
> rt_sigsuspend([] <unfinished ...>
> --- SIGIO (I/O possible) ---
> <... rt_sigsuspend resumed> )           = 29
> sigreturn()                             = ? (mask now [])
>
> vs:
> rt_sigsuspend([])                       = ? ERESTARTNOHAND (To be restarted)
> --- SIGIO (I/O possible) @ 0 (0) ---
> sigreturn()                             = ? (mask now [])
>
> x86:
> rt_sigsuspend([])                       = ? ERESTARTNOHAND (To be restarted)
> --- {si_signo=SIGIO, si_code=SI_USER} (I/O possible) ---
> sigreturn()                             = ? (mask now [])
>
> So, this patch should fix:
> 1. The race which I identified in the signal handling code (I think x86
>   and other architectures can suffer from it too.)
> 2. The warning from try_to_freeze.
> 3. The unanticipated change to strace output.
>
> Arnd, can you test this to make sure your gdb test case still works, and
> Mark, can you test this to make sure it fixes your problem please?
>
> Thanks.
>

Hi Russell,

We are seeing this on ARM with a 3.4 kernel. How was it eventually
resolved? This patch isn't in any tree and I can't find any patch in
linus/master or in any ARM tree that looks like it fixes this.

Here is a trace:

[  968.040000] BUG: sleeping function called from invalid context at
include/linux/freezer.h:46
[  968.050000] in_atomic(): 0, irqs_disabled(): 128, pid: 1, name: init
[  968.060000] [<8001550c>] (unwind_backtrace+0x0/0xec) from
[<805050c4>] (dump_stack+0x20/0x24)
[  968.060000] [<805050c4>] (dump_stack+0x20/0x24) from [<80052db8>]
(__might_sleep+0xec/0x10c)
[  968.070000] [<80052db8>] (__might_sleep+0xec/0x10c) from
[<80011290>] (do_signal+0x94/0x54c)
[  968.080000] [<80011290>] (do_signal+0x94/0x54c) from [<80011c24>]
(do_notify_resume+0x28/0x60)
[  968.090000] [<80011c24>] (do_notify_resume+0x28/0x60) from
[<8000e018>] (work_pending+0x24/0x28)

Regards,
Mandeep

>  arch/arm/include/asm/thread_info.h |    3 +
>  arch/arm/kernel/entry-common.S     |   11 ++
>  arch/arm/kernel/ptrace.c           |    2 +
>  arch/arm/kernel/signal.c           |  209 ++++++++++++++++++++++++------------
>  4 files changed, 155 insertions(+), 70 deletions(-)
>
> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
> index 7b5cc8d..40df533 100644
> --- a/arch/arm/include/asm/thread_info.h
> +++ b/arch/arm/include/asm/thread_info.h
> @@ -129,6 +129,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
>  /*
>  * thread information flags:
>  *  TIF_SYSCALL_TRACE  - syscall trace active
> + *  TIF_SYS_RESTART    - syscall restart processing
>  *  TIF_SIGPENDING     - signal pending
>  *  TIF_NEED_RESCHED   - rescheduling necessary
>  *  TIF_NOTIFY_RESUME  - callback before returning to user
> @@ -139,6 +140,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
>  #define TIF_NEED_RESCHED       1
>  #define TIF_NOTIFY_RESUME      2       /* callback before returning to user */
>  #define TIF_SYSCALL_TRACE      8
> +#define TIF_SYS_RESTART                9
>  #define TIF_POLLING_NRFLAG     16
>  #define TIF_USING_IWMMXT       17
>  #define TIF_MEMDIE             18      /* is terminating due to OOM killer */
> @@ -147,6 +149,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
>  #define TIF_SECCOMP            21
>
>  #define _TIF_SIGPENDING                (1 << TIF_SIGPENDING)
> +#define _TIF_SYS_RESTART       (1 << TIF_SYS_RESTART)
>  #define _TIF_NEED_RESCHED      (1 << TIF_NEED_RESCHED)
>  #define _TIF_NOTIFY_RESUME     (1 << TIF_NOTIFY_RESUME)
>  #define _TIF_SYSCALL_TRACE     (1 << TIF_SYSCALL_TRACE)
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index b2a27b6..e922b85 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -45,6 +45,7 @@ ret_fast_syscall:
>  fast_work_pending:
>        str     r0, [sp, #S_R0+S_OFF]!          @ returned r0
>  work_pending:
> +       enable_irq
>        tst     r1, #_TIF_NEED_RESCHED
>        bne     work_resched
>        tst     r1, #_TIF_SIGPENDING|_TIF_NOTIFY_RESUME
> @@ -56,6 +57,13 @@ work_pending:
>        bl      do_notify_resume
>        b       ret_slow_syscall                @ Check work again
>
> +work_syscall_restart:
> +       mov     r0, sp                          @ 'regs'
> +       bl      syscall_restart                 @ process system call restart
> +       teq     r0, #0                          @ if ret=0 -> success, so
> +       beq     ret_restart                     @ return to userspace directly
> +       b       ret_slow_syscall                @ otherwise, we have a segfault
> +
>  work_resched:
>        bl      schedule
>  /*
> @@ -69,6 +77,9 @@ ENTRY(ret_to_user_from_irq)
>        tst     r1, #_TIF_WORK_MASK
>        bne     work_pending
>  no_work_pending:
> +       tst     r1, #_TIF_SYS_RESTART
> +       bne     work_syscall_restart
> +ret_restart:
>  #if defined(CONFIG_IRQSOFF_TRACER)
>        asm_trace_hardirqs_on
>  #endif
> diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
> index 2491f3b..ac8c34e 100644
> --- a/arch/arm/kernel/ptrace.c
> +++ b/arch/arm/kernel/ptrace.c
> @@ -177,6 +177,7 @@ put_user_reg(struct task_struct *task, int offset, long data)
>
>        if (valid_user_regs(&newregs)) {
>                regs->uregs[offset] = data;
> +               clear_ti_thread_flag(task_thread_info(task), TIF_SYS_RESTART);
>                ret = 0;
>        }
>
> @@ -604,6 +605,7 @@ static int gpr_set(struct task_struct *target,
>                return -EINVAL;
>
>        *task_pt_regs(target) = newregs;
> +       clear_ti_thread_flag(task_thread_info(target), TIF_SYS_RESTART);
>        return 0;
>  }
>
> diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
> index 0340224..42a1521 100644
> --- a/arch/arm/kernel/signal.c
> +++ b/arch/arm/kernel/signal.c
> @@ -649,6 +649,135 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
>  }
>
>  /*
> + * Syscall restarting codes
> + *
> + * -ERESTARTSYS: restart system call if no handler, or if there is a
> + *     handler but it's marked SA_RESTART.  Otherwise return -EINTR.
> + * -ERESTARTNOINTR: always restart system call
> + * -ERESTARTNOHAND: restart system call only if no handler, otherwise
> + *     return -EINTR if invoking a user signal handler.
> + * -ERESTART_RESTARTBLOCK: call restart syscall if no handler, otherwise
> + *     return -EINTR if invoking a user signal handler.
> + */
> +static void setup_syscall_restart(struct pt_regs *regs)
> +{
> +       regs->ARM_r0 = regs->ARM_ORIG_r0;
> +       regs->ARM_pc -= thumb_mode(regs) ? 2 : 4;
> +}
> +
> +/*
> + * Depending on the signal settings we may need to revert the decision
> + * to restart the system call.  But skip this if a debugger has chosen
> + * to restart at a different PC.
> + */
> +static void syscall_restart_handler(struct pt_regs *regs, struct k_sigaction *ka)
> +{
> +       if (test_and_clear_thread_flag(TIF_SYS_RESTART)) {
> +               long r0 = regs->ARM_r0;
> +
> +               /*
> +                * By default, return -EINTR to the user process for any
> +                * syscall which would otherwise be restarted.
> +                */
> +               regs->ARM_r0 = -EINTR;
> +
> +               if (r0 == -ERESTARTNOINTR ||
> +                   (r0 == -ERESTARTSYS && !(ka->sa.sa_flags & SA_RESTART)))
> +                       setup_syscall_restart(regs);
> +       }
> +}
> +
> +/*
> + * Handle syscall restarting when there is no user handler in place for
> + * a delivered signal.  Rather than doing this as part of the normal
> + * signal processing, we do this on the final return to userspace, after
> + * we've finished handling signals and checking for schedule events.
> + *
> + * This avoids bad behaviour such as:
> + *  - syscall returns -ERESTARTNOHAND
> + *  - signal with no handler (so we set things up to restart the syscall)
> + *  - schedule
> + *  - signal with handler (eg, SIGALRM)
> + *  - we call the handler and then restart the syscall
> + *
> + * In order to avoid races with TIF_NEED_RESCHED, IRQs must be disabled
> + * when this function is called and remain disabled until we exit to
> + * userspace.
> + */
> +asmlinkage int syscall_restart(struct pt_regs *regs)
> +{
> +       struct thread_info *thread = current_thread_info();
> +
> +       clear_ti_thread_flag(thread, TIF_SYS_RESTART);
> +
> +       /*
> +        * Restart the system call.  We haven't setup a signal handler
> +        * to invoke, and the regset hasn't been usurped by ptrace.
> +        */
> +       if (regs->ARM_r0 == -ERESTART_RESTARTBLOCK) {
> +               if (thumb_mode(regs)) {
> +                       regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
> +                       regs->ARM_pc -= 2;
> +               } else {
> +#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
> +                       regs->ARM_r7 = __NR_restart_syscall;
> +                       regs->ARM_pc -= 4;
> +#else
> +                       u32 sp = regs->ARM_sp - 4;
> +                       u32 __user *usp = (u32 __user *)sp;
> +                       int ret;
> +
> +                       /*
> +                        * For OABI, we need to play some extra games, because
> +                        * we need to write to the users stack, which we can't
> +                        * do reliably from IRQs-disabled context.  Temporarily
> +                        * re-enable IRQs, perform the store, and then plug
> +                        * the resulting race afterwards.
> +                        */
> +                       local_irq_enable();
> +                       ret = put_user(regs->ARM_pc, usp);
> +                       local_irq_disable();
> +
> +                       /*
> +                        * Plug the reschedule race - if we need to reschedule,
> +                        * abort the syscall restarting.  We haven't modified
> +                        * anything other than the attempted write to the stack
> +                        * so we can merely retry later.
> +                        */
> +                       if (need_resched()) {
> +                               set_ti_thread_flag(thread, TIF_SYS_RESTART);
> +                               return -EINTR;
> +                       }
> +
> +                       /*
> +                        * We failed (for some reason) to write to the stack.
> +                        * Terminate the task.
> +                        */
> +                       if (ret) {
> +                               force_sigsegv(0, current);
> +                               return -EFAULT;
> +                       }
> +
> +                       /*
> +                        * Success, update the stack pointer and point the
> +                        * PC at the restarting code.
> +                        */
> +                       regs->ARM_sp = sp;
> +                       regs->ARM_pc = KERN_RESTART_CODE;
> +#endif
> +               }
> +       } else {
> +               /*
> +                * Simple restart - just back up and re-execute the last
> +                * instruction.
> +                */
> +               setup_syscall_restart(regs);
> +       }
> +
> +       return 0;
> +}
> +
> +/*
>  * Note that 'init' is a special process: it doesn't get signals it doesn't
>  * want to handle. Thus you cannot kill init even with a SIGKILL even by
>  * mistake.
> @@ -659,7 +788,6 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
>  */
>  static void do_signal(struct pt_regs *regs, int syscall)
>  {
> -       unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
>        struct k_sigaction ka;
>        siginfo_t info;
>        int signr;
> @@ -674,32 +802,16 @@ static void do_signal(struct pt_regs *regs, int syscall)
>                return;
>
>        /*
> -        * If we were from a system call, check for system call restarting...
> +        * Set the SYS_RESTART flag to indicate that we have some
> +        * cleanup of the restart state to perform when returning to
> +        * userspace.
>         */
> -       if (syscall) {
> -               continue_addr = regs->ARM_pc;
> -               restart_addr = continue_addr - (thumb_mode(regs) ? 2 : 4);
> -               retval = regs->ARM_r0;
> -
> -               /*
> -                * Prepare for system call restart.  We do this here so that a
> -                * debugger will see the already changed PSW.
> -                */
> -               switch (retval) {
> -               case -ERESTARTNOHAND:
> -               case -ERESTARTSYS:
> -               case -ERESTARTNOINTR:
> -                       regs->ARM_r0 = regs->ARM_ORIG_r0;
> -                       regs->ARM_pc = restart_addr;
> -                       break;
> -               case -ERESTART_RESTARTBLOCK:
> -                       regs->ARM_r0 = -EINTR;
> -                       break;
> -               }
> -       }
> -
> -       if (try_to_freeze())
> -               goto no_signal;
> +       if (syscall &&
> +           (regs->ARM_r0 == -ERESTARTSYS ||
> +            regs->ARM_r0 == -ERESTARTNOINTR ||
> +            regs->ARM_r0 == -ERESTARTNOHAND ||
> +            regs->ARM_r0 == -ERESTART_RESTARTBLOCK))
> +               set_thread_flag(TIF_SYS_RESTART);
>
>        /*
>         * Get the signal to deliver.  When running under ptrace, at this
> @@ -709,19 +821,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
>        if (signr > 0) {
>                sigset_t *oldset;
>
> -               /*
> -                * Depending on the signal settings we may need to revert the
> -                * decision to restart the system call.  But skip this if a
> -                * debugger has chosen to restart at a different PC.
> -                */
> -               if (regs->ARM_pc == restart_addr) {
> -                       if (retval == -ERESTARTNOHAND
> -                           || (retval == -ERESTARTSYS
> -                               && !(ka.sa.sa_flags & SA_RESTART))) {
> -                               regs->ARM_r0 = -EINTR;
> -                               regs->ARM_pc = continue_addr;
> -                       }
> -               }
> +               syscall_restart_handler(regs, &ka);
>
>                if (test_thread_flag(TIF_RESTORE_SIGMASK))
>                        oldset = &current->saved_sigmask;
> @@ -740,38 +840,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
>                return;
>        }
>
> - no_signal:
>        if (syscall) {
> -               /*
> -                * Handle restarting a different system call.  As above,
> -                * if a debugger has chosen to restart at a different PC,
> -                * ignore the restart.
> -                */
> -               if (retval == -ERESTART_RESTARTBLOCK
> -                   && regs->ARM_pc == continue_addr) {
> -                       if (thumb_mode(regs)) {
> -                               regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
> -                               regs->ARM_pc -= 2;
> -                       } else {
> -#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
> -                               regs->ARM_r7 = __NR_restart_syscall;
> -                               regs->ARM_pc -= 4;
> -#else
> -                               u32 __user *usp;
> -
> -                               regs->ARM_sp -= 4;
> -                               usp = (u32 __user *)regs->ARM_sp;
> -
> -                               if (put_user(regs->ARM_pc, usp) == 0) {
> -                                       regs->ARM_pc = KERN_RESTART_CODE;
> -                               } else {
> -                                       regs->ARM_sp += 4;
> -                                       force_sigsegv(0, current);
> -                               }
> -#endif
> -                       }
> -               }
> -
>                /* If there's no signal to deliver, we just put the saved sigmask
>                 * back.
>                 */
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2012-06-26 16:39                             ` Mandeep Singh Baines
  0 siblings, 0 replies; 66+ messages in thread
From: Mandeep Singh Baines @ 2012-06-26 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 25, 2011 at 7:55 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Aug 25, 2011 at 03:09:07PM +0200, Tejun Heo wrote:
>> Hey, Russell.
>>
>> If you can fix it properly without going through temporary step,
>> that's awesome. ?Let's put the arguments behind, okay?
>
> Here's the patch. ?As the kernel I've run this against doesn't have the
> change to try_to_freeze(), I added a might_sleep() in do_signal() during
> my testing to verify that it fixes Mark's problem (which it does.)
>
> I've tested functions returning -ERESTARTSYS, -ERESTARTNOHAND and
> -ERESTART_RESTARTBLOCK, all of which seem to behave as expected with
> signals such as SIGCONT (without handler) and SIGALRM (with handler).
> I haven't tested -ERESTARTNOINTR.
>
> I don't have a test case for the race condition I mentioned (which is
> admittedly pretty difficult to construct, requiring an explicit
> signal, schedule, signal sequence) but this should plug that too.
>
> How do we achieve this? ?Effectively the steps in this patch are:
>
> 1. Undo Arnd's fixups to the syscall restart processing (but don't worry,
> ? we restore it in step 3).
>
> 2. Introduce TIF_SYS_RESTART, which is set when we enter signal handling
> ? and the syscall has returned one of the restart codes. ?This is used
> ? as a flag to indicate that we have some syscall restart processing to
> ? do at some point.
>
> 3. Clear TIF_SYS_RESTART whenever ptrace is used to set the GP registers
> ? (thereby restoring Arnd's fixup for his gdb testsuite problem - it
> ? would be good if Arnd could reconfirm that.)
>
> 4. When we setup a user handler to run, check TIF_SYS_RESTART and clear it.
> ? If it was set, we need to set things up to return -EINTR or restart the
> ? syscall as appropriate. ?As we've cleared it, no further restart
> ? processing will occur.
>
> 5. Once we've run all work (signal delivery, and rescheduling events), and
> ? we're about to return to userspace, make a final check for TIF_SYS_RESTART.
> ? If it's still set, then we're returning to userspace having not setup
> ? any user handlers, and we need to restart the syscall. ?This is mostly
> ? trivial, except for OABI restartblock which requires the user stack to
> ? be written. ?We have to re-enable IRQs for this write, which means we
> ? have to manually re-check for rescheduling events, abort the restart,
> ? and try again later.
>
> One of the side effects of reverting Arnd's patch is that we restore the
> strace behaviour which we've had for years on ARM, and can still be seen
> on x86: strace can see the -ERESTART return codes from the kernel syscalls,
> rather than what seems to be the signal number:
>
> Before:
> rt_sigsuspend([] <unfinished ...>
> --- SIGIO (I/O possible) ---
> <... rt_sigsuspend resumed> ) ? ? ? ? ? = 29
> sigreturn() ? ? ? ? ? ? ? ? ? ? ? ? ? ? = ? (mask now [])
>
> vs:
> rt_sigsuspend([]) ? ? ? ? ? ? ? ? ? ? ? = ? ERESTARTNOHAND (To be restarted)
> --- SIGIO (I/O possible) @ 0 (0) ---
> sigreturn() ? ? ? ? ? ? ? ? ? ? ? ? ? ? = ? (mask now [])
>
> x86:
> rt_sigsuspend([]) ? ? ? ? ? ? ? ? ? ? ? = ? ERESTARTNOHAND (To be restarted)
> --- {si_signo=SIGIO, si_code=SI_USER} (I/O possible) ---
> sigreturn() ? ? ? ? ? ? ? ? ? ? ? ? ? ? = ? (mask now [])
>
> So, this patch should fix:
> 1. The race which I identified in the signal handling code (I think x86
> ? and other architectures can suffer from it too.)
> 2. The warning from try_to_freeze.
> 3. The unanticipated change to strace output.
>
> Arnd, can you test this to make sure your gdb test case still works, and
> Mark, can you test this to make sure it fixes your problem please?
>
> Thanks.
>

Hi Russell,

We are seeing this on ARM with a 3.4 kernel. How was it eventually
resolved? This patch isn't in any tree and I can't find any patch in
linus/master or in any ARM tree that looks like it fixes this.

Here is a trace:

[  968.040000] BUG: sleeping function called from invalid context at
include/linux/freezer.h:46
[  968.050000] in_atomic(): 0, irqs_disabled(): 128, pid: 1, name: init
[  968.060000] [<8001550c>] (unwind_backtrace+0x0/0xec) from
[<805050c4>] (dump_stack+0x20/0x24)
[  968.060000] [<805050c4>] (dump_stack+0x20/0x24) from [<80052db8>]
(__might_sleep+0xec/0x10c)
[  968.070000] [<80052db8>] (__might_sleep+0xec/0x10c) from
[<80011290>] (do_signal+0x94/0x54c)
[  968.080000] [<80011290>] (do_signal+0x94/0x54c) from [<80011c24>]
(do_notify_resume+0x28/0x60)
[  968.090000] [<80011c24>] (do_notify_resume+0x28/0x60) from
[<8000e018>] (work_pending+0x24/0x28)

Regards,
Mandeep

> ?arch/arm/include/asm/thread_info.h | ? ?3 +
> ?arch/arm/kernel/entry-common.S ? ? | ? 11 ++
> ?arch/arm/kernel/ptrace.c ? ? ? ? ? | ? ?2 +
> ?arch/arm/kernel/signal.c ? ? ? ? ? | ?209 ++++++++++++++++++++++++------------
> ?4 files changed, 155 insertions(+), 70 deletions(-)
>
> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
> index 7b5cc8d..40df533 100644
> --- a/arch/arm/include/asm/thread_info.h
> +++ b/arch/arm/include/asm/thread_info.h
> @@ -129,6 +129,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
> ?/*
> ?* thread information flags:
> ?* ?TIF_SYSCALL_TRACE ?- syscall trace active
> + * ?TIF_SYS_RESTART ? ?- syscall restart processing
> ?* ?TIF_SIGPENDING ? ? - signal pending
> ?* ?TIF_NEED_RESCHED ? - rescheduling necessary
> ?* ?TIF_NOTIFY_RESUME ?- callback before returning to user
> @@ -139,6 +140,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
> ?#define TIF_NEED_RESCHED ? ? ? 1
> ?#define TIF_NOTIFY_RESUME ? ? ?2 ? ? ? /* callback before returning to user */
> ?#define TIF_SYSCALL_TRACE ? ? ?8
> +#define TIF_SYS_RESTART ? ? ? ? ? ? ? ?9
> ?#define TIF_POLLING_NRFLAG ? ? 16
> ?#define TIF_USING_IWMMXT ? ? ? 17
> ?#define TIF_MEMDIE ? ? ? ? ? ? 18 ? ? ?/* is terminating due to OOM killer */
> @@ -147,6 +149,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
> ?#define TIF_SECCOMP ? ? ? ? ? ?21
>
> ?#define _TIF_SIGPENDING ? ? ? ? ? ? ? ?(1 << TIF_SIGPENDING)
> +#define _TIF_SYS_RESTART ? ? ? (1 << TIF_SYS_RESTART)
> ?#define _TIF_NEED_RESCHED ? ? ?(1 << TIF_NEED_RESCHED)
> ?#define _TIF_NOTIFY_RESUME ? ? (1 << TIF_NOTIFY_RESUME)
> ?#define _TIF_SYSCALL_TRACE ? ? (1 << TIF_SYSCALL_TRACE)
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index b2a27b6..e922b85 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -45,6 +45,7 @@ ret_fast_syscall:
> ?fast_work_pending:
> ? ? ? ?str ? ? r0, [sp, #S_R0+S_OFF]! ? ? ? ? ?@ returned r0
> ?work_pending:
> + ? ? ? enable_irq
> ? ? ? ?tst ? ? r1, #_TIF_NEED_RESCHED
> ? ? ? ?bne ? ? work_resched
> ? ? ? ?tst ? ? r1, #_TIF_SIGPENDING|_TIF_NOTIFY_RESUME
> @@ -56,6 +57,13 @@ work_pending:
> ? ? ? ?bl ? ? ?do_notify_resume
> ? ? ? ?b ? ? ? ret_slow_syscall ? ? ? ? ? ? ? ?@ Check work again
>
> +work_syscall_restart:
> + ? ? ? mov ? ? r0, sp ? ? ? ? ? ? ? ? ? ? ? ? ?@ 'regs'
> + ? ? ? bl ? ? ?syscall_restart ? ? ? ? ? ? ? ? @ process system call restart
> + ? ? ? teq ? ? r0, #0 ? ? ? ? ? ? ? ? ? ? ? ? ?@ if ret=0 -> success, so
> + ? ? ? beq ? ? ret_restart ? ? ? ? ? ? ? ? ? ? @ return to userspace directly
> + ? ? ? b ? ? ? ret_slow_syscall ? ? ? ? ? ? ? ?@ otherwise, we have a segfault
> +
> ?work_resched:
> ? ? ? ?bl ? ? ?schedule
> ?/*
> @@ -69,6 +77,9 @@ ENTRY(ret_to_user_from_irq)
> ? ? ? ?tst ? ? r1, #_TIF_WORK_MASK
> ? ? ? ?bne ? ? work_pending
> ?no_work_pending:
> + ? ? ? tst ? ? r1, #_TIF_SYS_RESTART
> + ? ? ? bne ? ? work_syscall_restart
> +ret_restart:
> ?#if defined(CONFIG_IRQSOFF_TRACER)
> ? ? ? ?asm_trace_hardirqs_on
> ?#endif
> diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
> index 2491f3b..ac8c34e 100644
> --- a/arch/arm/kernel/ptrace.c
> +++ b/arch/arm/kernel/ptrace.c
> @@ -177,6 +177,7 @@ put_user_reg(struct task_struct *task, int offset, long data)
>
> ? ? ? ?if (valid_user_regs(&newregs)) {
> ? ? ? ? ? ? ? ?regs->uregs[offset] = data;
> + ? ? ? ? ? ? ? clear_ti_thread_flag(task_thread_info(task), TIF_SYS_RESTART);
> ? ? ? ? ? ? ? ?ret = 0;
> ? ? ? ?}
>
> @@ -604,6 +605,7 @@ static int gpr_set(struct task_struct *target,
> ? ? ? ? ? ? ? ?return -EINVAL;
>
> ? ? ? ?*task_pt_regs(target) = newregs;
> + ? ? ? clear_ti_thread_flag(task_thread_info(target), TIF_SYS_RESTART);
> ? ? ? ?return 0;
> ?}
>
> diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
> index 0340224..42a1521 100644
> --- a/arch/arm/kernel/signal.c
> +++ b/arch/arm/kernel/signal.c
> @@ -649,6 +649,135 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
> ?}
>
> ?/*
> + * Syscall restarting codes
> + *
> + * -ERESTARTSYS: restart system call if no handler, or if there is a
> + * ? ? handler but it's marked SA_RESTART. ?Otherwise return -EINTR.
> + * -ERESTARTNOINTR: always restart system call
> + * -ERESTARTNOHAND: restart system call only if no handler, otherwise
> + * ? ? return -EINTR if invoking a user signal handler.
> + * -ERESTART_RESTARTBLOCK: call restart syscall if no handler, otherwise
> + * ? ? return -EINTR if invoking a user signal handler.
> + */
> +static void setup_syscall_restart(struct pt_regs *regs)
> +{
> + ? ? ? regs->ARM_r0 = regs->ARM_ORIG_r0;
> + ? ? ? regs->ARM_pc -= thumb_mode(regs) ? 2 : 4;
> +}
> +
> +/*
> + * Depending on the signal settings we may need to revert the decision
> + * to restart the system call. ?But skip this if a debugger has chosen
> + * to restart at a different PC.
> + */
> +static void syscall_restart_handler(struct pt_regs *regs, struct k_sigaction *ka)
> +{
> + ? ? ? if (test_and_clear_thread_flag(TIF_SYS_RESTART)) {
> + ? ? ? ? ? ? ? long r0 = regs->ARM_r0;
> +
> + ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ?* By default, return -EINTR to the user process for any
> + ? ? ? ? ? ? ? ?* syscall which would otherwise be restarted.
> + ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? regs->ARM_r0 = -EINTR;
> +
> + ? ? ? ? ? ? ? if (r0 == -ERESTARTNOINTR ||
> + ? ? ? ? ? ? ? ? ? (r0 == -ERESTARTSYS && !(ka->sa.sa_flags & SA_RESTART)))
> + ? ? ? ? ? ? ? ? ? ? ? setup_syscall_restart(regs);
> + ? ? ? }
> +}
> +
> +/*
> + * Handle syscall restarting when there is no user handler in place for
> + * a delivered signal. ?Rather than doing this as part of the normal
> + * signal processing, we do this on the final return to userspace, after
> + * we've finished handling signals and checking for schedule events.
> + *
> + * This avoids bad behaviour such as:
> + * ?- syscall returns -ERESTARTNOHAND
> + * ?- signal with no handler (so we set things up to restart the syscall)
> + * ?- schedule
> + * ?- signal with handler (eg, SIGALRM)
> + * ?- we call the handler and then restart the syscall
> + *
> + * In order to avoid races with TIF_NEED_RESCHED, IRQs must be disabled
> + * when this function is called and remain disabled until we exit to
> + * userspace.
> + */
> +asmlinkage int syscall_restart(struct pt_regs *regs)
> +{
> + ? ? ? struct thread_info *thread = current_thread_info();
> +
> + ? ? ? clear_ti_thread_flag(thread, TIF_SYS_RESTART);
> +
> + ? ? ? /*
> + ? ? ? ?* Restart the system call. ?We haven't setup a signal handler
> + ? ? ? ?* to invoke, and the regset hasn't been usurped by ptrace.
> + ? ? ? ?*/
> + ? ? ? if (regs->ARM_r0 == -ERESTART_RESTARTBLOCK) {
> + ? ? ? ? ? ? ? if (thumb_mode(regs)) {
> + ? ? ? ? ? ? ? ? ? ? ? regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
> + ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc -= 2;
> + ? ? ? ? ? ? ? } else {
> +#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
> + ? ? ? ? ? ? ? ? ? ? ? regs->ARM_r7 = __NR_restart_syscall;
> + ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc -= 4;
> +#else
> + ? ? ? ? ? ? ? ? ? ? ? u32 sp = regs->ARM_sp - 4;
> + ? ? ? ? ? ? ? ? ? ? ? u32 __user *usp = (u32 __user *)sp;
> + ? ? ? ? ? ? ? ? ? ? ? int ret;
> +
> + ? ? ? ? ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ? ? ? ? ?* For OABI, we need to play some extra games, because
> + ? ? ? ? ? ? ? ? ? ? ? ?* we need to write to the users stack, which we can't
> + ? ? ? ? ? ? ? ? ? ? ? ?* do reliably from IRQs-disabled context. ?Temporarily
> + ? ? ? ? ? ? ? ? ? ? ? ?* re-enable IRQs, perform the store, and then plug
> + ? ? ? ? ? ? ? ? ? ? ? ?* the resulting race afterwards.
> + ? ? ? ? ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? ? ? ? ? local_irq_enable();
> + ? ? ? ? ? ? ? ? ? ? ? ret = put_user(regs->ARM_pc, usp);
> + ? ? ? ? ? ? ? ? ? ? ? local_irq_disable();
> +
> + ? ? ? ? ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ? ? ? ? ?* Plug the reschedule race - if we need to reschedule,
> + ? ? ? ? ? ? ? ? ? ? ? ?* abort the syscall restarting. ?We haven't modified
> + ? ? ? ? ? ? ? ? ? ? ? ?* anything other than the attempted write to the stack
> + ? ? ? ? ? ? ? ? ? ? ? ?* so we can merely retry later.
> + ? ? ? ? ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? ? ? ? ? if (need_resched()) {
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? set_ti_thread_flag(thread, TIF_SYS_RESTART);
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? return -EINTR;
> + ? ? ? ? ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ? ? ? ? ?* We failed (for some reason) to write to the stack.
> + ? ? ? ? ? ? ? ? ? ? ? ?* Terminate the task.
> + ? ? ? ? ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? ? ? ? ? if (ret) {
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? force_sigsegv(0, current);
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? return -EFAULT;
> + ? ? ? ? ? ? ? ? ? ? ? }
> +
> + ? ? ? ? ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ? ? ? ? ?* Success, update the stack pointer and point the
> + ? ? ? ? ? ? ? ? ? ? ? ?* PC at the restarting code.
> + ? ? ? ? ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? ? ? ? ? regs->ARM_sp = sp;
> + ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc = KERN_RESTART_CODE;
> +#endif
> + ? ? ? ? ? ? ? }
> + ? ? ? } else {
> + ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ?* Simple restart - just back up and re-execute the last
> + ? ? ? ? ? ? ? ?* instruction.
> + ? ? ? ? ? ? ? ?*/
> + ? ? ? ? ? ? ? setup_syscall_restart(regs);
> + ? ? ? }
> +
> + ? ? ? return 0;
> +}
> +
> +/*
> ?* Note that 'init' is a special process: it doesn't get signals it doesn't
> ?* want to handle. Thus you cannot kill init even with a SIGKILL even by
> ?* mistake.
> @@ -659,7 +788,6 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
> ?*/
> ?static void do_signal(struct pt_regs *regs, int syscall)
> ?{
> - ? ? ? unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
> ? ? ? ?struct k_sigaction ka;
> ? ? ? ?siginfo_t info;
> ? ? ? ?int signr;
> @@ -674,32 +802,16 @@ static void do_signal(struct pt_regs *regs, int syscall)
> ? ? ? ? ? ? ? ?return;
>
> ? ? ? ?/*
> - ? ? ? ?* If we were from a system call, check for system call restarting...
> + ? ? ? ?* Set the SYS_RESTART flag to indicate that we have some
> + ? ? ? ?* cleanup of the restart state to perform when returning to
> + ? ? ? ?* userspace.
> ? ? ? ? */
> - ? ? ? if (syscall) {
> - ? ? ? ? ? ? ? continue_addr = regs->ARM_pc;
> - ? ? ? ? ? ? ? restart_addr = continue_addr - (thumb_mode(regs) ? 2 : 4);
> - ? ? ? ? ? ? ? retval = regs->ARM_r0;
> -
> - ? ? ? ? ? ? ? /*
> - ? ? ? ? ? ? ? ?* Prepare for system call restart. ?We do this here so that a
> - ? ? ? ? ? ? ? ?* debugger will see the already changed PSW.
> - ? ? ? ? ? ? ? ?*/
> - ? ? ? ? ? ? ? switch (retval) {
> - ? ? ? ? ? ? ? case -ERESTARTNOHAND:
> - ? ? ? ? ? ? ? case -ERESTARTSYS:
> - ? ? ? ? ? ? ? case -ERESTARTNOINTR:
> - ? ? ? ? ? ? ? ? ? ? ? regs->ARM_r0 = regs->ARM_ORIG_r0;
> - ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc = restart_addr;
> - ? ? ? ? ? ? ? ? ? ? ? break;
> - ? ? ? ? ? ? ? case -ERESTART_RESTARTBLOCK:
> - ? ? ? ? ? ? ? ? ? ? ? regs->ARM_r0 = -EINTR;
> - ? ? ? ? ? ? ? ? ? ? ? break;
> - ? ? ? ? ? ? ? }
> - ? ? ? }
> -
> - ? ? ? if (try_to_freeze())
> - ? ? ? ? ? ? ? goto no_signal;
> + ? ? ? if (syscall &&
> + ? ? ? ? ? (regs->ARM_r0 == -ERESTARTSYS ||
> + ? ? ? ? ? ?regs->ARM_r0 == -ERESTARTNOINTR ||
> + ? ? ? ? ? ?regs->ARM_r0 == -ERESTARTNOHAND ||
> + ? ? ? ? ? ?regs->ARM_r0 == -ERESTART_RESTARTBLOCK))
> + ? ? ? ? ? ? ? set_thread_flag(TIF_SYS_RESTART);
>
> ? ? ? ?/*
> ? ? ? ? * Get the signal to deliver. ?When running under ptrace, at this
> @@ -709,19 +821,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
> ? ? ? ?if (signr > 0) {
> ? ? ? ? ? ? ? ?sigset_t *oldset;
>
> - ? ? ? ? ? ? ? /*
> - ? ? ? ? ? ? ? ?* Depending on the signal settings we may need to revert the
> - ? ? ? ? ? ? ? ?* decision to restart the system call. ?But skip this if a
> - ? ? ? ? ? ? ? ?* debugger has chosen to restart at a different PC.
> - ? ? ? ? ? ? ? ?*/
> - ? ? ? ? ? ? ? if (regs->ARM_pc == restart_addr) {
> - ? ? ? ? ? ? ? ? ? ? ? if (retval == -ERESTARTNOHAND
> - ? ? ? ? ? ? ? ? ? ? ? ? ? || (retval == -ERESTARTSYS
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? && !(ka.sa.sa_flags & SA_RESTART))) {
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_r0 = -EINTR;
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc = continue_addr;
> - ? ? ? ? ? ? ? ? ? ? ? }
> - ? ? ? ? ? ? ? }
> + ? ? ? ? ? ? ? syscall_restart_handler(regs, &ka);
>
> ? ? ? ? ? ? ? ?if (test_thread_flag(TIF_RESTORE_SIGMASK))
> ? ? ? ? ? ? ? ? ? ? ? ?oldset = &current->saved_sigmask;
> @@ -740,38 +840,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
> ? ? ? ? ? ? ? ?return;
> ? ? ? ?}
>
> - no_signal:
> ? ? ? ?if (syscall) {
> - ? ? ? ? ? ? ? /*
> - ? ? ? ? ? ? ? ?* Handle restarting a different system call. ?As above,
> - ? ? ? ? ? ? ? ?* if a debugger has chosen to restart at a different PC,
> - ? ? ? ? ? ? ? ?* ignore the restart.
> - ? ? ? ? ? ? ? ?*/
> - ? ? ? ? ? ? ? if (retval == -ERESTART_RESTARTBLOCK
> - ? ? ? ? ? ? ? ? ? && regs->ARM_pc == continue_addr) {
> - ? ? ? ? ? ? ? ? ? ? ? if (thumb_mode(regs)) {
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc -= 2;
> - ? ? ? ? ? ? ? ? ? ? ? } else {
> -#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_r7 = __NR_restart_syscall;
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc -= 4;
> -#else
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? u32 __user *usp;
> -
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_sp -= 4;
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? usp = (u32 __user *)regs->ARM_sp;
> -
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if (put_user(regs->ARM_pc, usp) == 0) {
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_pc = KERN_RESTART_CODE;
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? } else {
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? regs->ARM_sp += 4;
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? force_sigsegv(0, current);
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? }
> -#endif
> - ? ? ? ? ? ? ? ? ? ? ? }
> - ? ? ? ? ? ? ? }
> -
> ? ? ? ? ? ? ? ?/* If there's no signal to deliver, we just put the saved sigmask
> ? ? ? ? ? ? ? ? * back.
> ? ? ? ? ? ? ? ? */
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: try_to_freeze() called with IRQs disabled on ARM
  2012-06-26 16:39                             ` Mandeep Singh Baines
@ 2012-06-26 17:16                               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2012-06-26 17:16 UTC (permalink / raw)
  To: Mandeep Singh Baines
  Cc: Tejun Heo, Arnd Bergmann, Mark Brown, Rafael J. Wysocki,
	linux-kernel, linux-arm-kernel, Doug Anderson

On Tue, Jun 26, 2012 at 09:39:14AM -0700, Mandeep Singh Baines wrote:
> Hi Russell,
> 
> We are seeing this on ARM with a 3.4 kernel. How was it eventually
> resolved? This patch isn't in any tree and I can't find any patch in
> linus/master or in any ARM tree that looks like it fixes this.

It's still work in progress.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* try_to_freeze() called with IRQs disabled on ARM
@ 2012-06-26 17:16                               ` Russell King - ARM Linux
  0 siblings, 0 replies; 66+ messages in thread
From: Russell King - ARM Linux @ 2012-06-26 17:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 26, 2012 at 09:39:14AM -0700, Mandeep Singh Baines wrote:
> Hi Russell,
> 
> We are seeing this on ARM with a 3.4 kernel. How was it eventually
> resolved? This patch isn't in any tree and I can't find any patch in
> linus/master or in any ARM tree that looks like it fixes this.

It's still work in progress.

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2012-06-26 17:16 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-23 15:19 try_to_freeze() called with IRQs disabled on ARM Mark Brown
2011-08-23 15:19 ` Mark Brown
2011-08-23 15:43 ` Russell King - ARM Linux
2011-08-23 15:43   ` Russell King - ARM Linux
2011-08-23 22:08   ` Rafael J. Wysocki
2011-08-23 22:08     ` Rafael J. Wysocki
2011-08-23 21:51 ` Rafael J. Wysocki
2011-08-23 21:51   ` Rafael J. Wysocki
2011-08-23 21:53   ` Tejun Heo
2011-08-23 21:53     ` Tejun Heo
2011-08-23 22:00     ` Russell King - ARM Linux
2011-08-23 22:00       ` Russell King - ARM Linux
2011-08-23 22:08       ` Tejun Heo
2011-08-23 22:08         ` Tejun Heo
2011-08-23 22:13         ` Russell King - ARM Linux
2011-08-23 22:13           ` Russell King - ARM Linux
2011-08-23 22:17           ` Tejun Heo
2011-08-23 22:17             ` Tejun Heo
2011-08-23 22:35             ` Tejun Heo
2011-08-23 22:35               ` Tejun Heo
2011-08-24 23:15               ` Rafael J. Wysocki
2011-08-24 23:15                 ` Rafael J. Wysocki
2011-08-25 12:14             ` Russell King - ARM Linux
2011-08-25 12:14               ` Russell King - ARM Linux
2011-08-25 12:17               ` Tejun Heo
2011-08-25 12:17                 ` Tejun Heo
2011-08-25 12:25                 ` Russell King - ARM Linux
2011-08-25 12:25                   ` Russell King - ARM Linux
2011-08-25 12:35                   ` Tejun Heo
2011-08-25 12:35                     ` Tejun Heo
2011-08-25 13:04                     ` Russell King - ARM Linux
2011-08-25 13:04                       ` Russell King - ARM Linux
2011-08-25 13:09                       ` Tejun Heo
2011-08-25 13:09                         ` Tejun Heo
2011-08-25 14:55                         ` Russell King - ARM Linux
2011-08-25 14:55                           ` Russell King - ARM Linux
2011-08-26 14:44                           ` Arnd Bergmann
2011-08-26 14:44                             ` Arnd Bergmann
2011-09-01 13:41                             ` Ulrich Weigand
2011-09-01 13:41                               ` Ulrich Weigand
2011-09-01 14:00                               ` Russell King - ARM Linux
2011-09-01 14:00                                 ` Russell King - ARM Linux
2011-09-02 14:47                                 ` Ulrich Weigand
2011-09-02 14:47                                   ` Ulrich Weigand
2011-09-02 17:22                                   ` Russell King - ARM Linux
2011-09-02 17:22                                     ` Russell King - ARM Linux
2011-09-02 17:40                                     ` Ulrich Weigand
2011-09-02 17:40                                       ` Ulrich Weigand
2011-09-02 17:48                                       ` Russell King - ARM Linux
2011-09-02 17:48                                         ` Russell King - ARM Linux
2011-09-16 10:31                                         ` Martin Schwidefsky
2011-09-16 10:31                                           ` Martin Schwidefsky
2011-09-27 17:45                                         ` Ulrich Weigand
2011-09-27 17:45                                           ` Ulrich Weigand
2011-08-30 20:58                           ` Mark Brown
2011-08-30 20:58                             ` Mark Brown
2011-08-30 21:10                             ` Russell King - ARM Linux
2011-08-30 21:10                               ` Russell King - ARM Linux
2012-06-26 16:39                           ` Mandeep Singh Baines
2012-06-26 16:39                             ` Mandeep Singh Baines
2012-06-26 17:16                             ` Russell King - ARM Linux
2012-06-26 17:16                               ` Russell King - ARM Linux
2011-08-23 22:13     ` Rafael J. Wysocki
2011-08-23 22:13       ` Rafael J. Wysocki
2011-08-25 11:37   ` Mark Brown
2011-08-25 11:37     ` Mark Brown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.