linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* use of preempt_count instead of in_atomic() at leds-gpio.c
@ 2008-03-16 18:43 Henrique de Moraes Holschuh
  2008-03-16 19:46 ` David Brownell
  0 siblings, 1 reply; 45+ messages in thread
From: Henrique de Moraes Holschuh @ 2008-03-16 18:43 UTC (permalink / raw)
  To: David Brownell, Richard Purdie; +Cc: linux-kernel, David Brownell

David, Richard,

Is the use of "if (preempt_count())" to know when to defer led gpio work to
a workqueue needed?  Shouldn't "if (in_atomic())" be enough?

I have found no other such uses of preempt_count() anywhere in kernel code,
while in_atomic() is used for that sort of heuristic in various places.

Relevant git commit id is: 00852279af5ad26956bc7f4d0e86fdb40192e542
"leds: Teach leds-gpio to handle timer-unsafe GPIOs".   It made mainline in
2.6.23-rc1.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-16 18:43 use of preempt_count instead of in_atomic() at leds-gpio.c Henrique de Moraes Holschuh
@ 2008-03-16 19:46 ` David Brownell
  2008-03-18  7:14   ` Andrew Morton
  0 siblings, 1 reply; 45+ messages in thread
From: David Brownell @ 2008-03-16 19:46 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh; +Cc: Richard Purdie, linux-kernel

On Sunday 16 March 2008, Henrique de Moraes Holschuh wrote:
> Is the use of "if (preempt_count())" to know when to defer led gpio work to
> a workqueue needed?  Shouldn't "if (in_atomic())" be enough?

At this point, I don't know of any such reason.

I remember hunting for the right heuristic, and settling on
that one for reasons that I can't recall now.  They may even
be no longer applicable.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-16 19:46 ` David Brownell
@ 2008-03-18  7:14   ` Andrew Morton
  2008-03-18 19:06     ` David Brownell
  2008-03-20 22:56     ` Henrique de Moraes Holschuh
  0 siblings, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2008-03-18  7:14 UTC (permalink / raw)
  To: David Brownell
  Cc: Henrique de Moraes Holschuh, Richard Purdie, linux-kernel, Ingo Molnar

On Sun, 16 Mar 2008 11:46:23 -0800 David Brownell <david-b@pacbell.net> wrote:

> On Sunday 16 March 2008, Henrique de Moraes Holschuh wrote:
> > Is the use of "if (preempt_count())" to know when to defer led gpio work to
> > a workqueue needed? __Shouldn't "if (in_atomic())" be enough?
> 
> At this point, I don't know of any such reason.
> 
> I remember hunting for the right heuristic, and settling on
> that one for reasons that I can't recall now.  They may even
> be no longer applicable.

Both are incorrect.  When CONFIG_PREEMPT=n we have no support for
determining whether schedule() may be called.  The calling code has to sort
out its stuff on its own.

<greps for preempt_count>

The LEDs code seems to be the sole offender.  print_vma_addr() might be
wrong too, but Ingo did it, and perhaps he knows that all code paths which
call print_vma_addr() from deadlockable contexts have already called
inc_preempt_count().  But is that true for all architectures?

<greps for in_atomic>

omigawd, what have we done, and how can we fix it? :(

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-18  7:14   ` Andrew Morton
@ 2008-03-18 19:06     ` David Brownell
  2008-03-18 20:07       ` Andrew Morton
  2008-03-20 22:56     ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 45+ messages in thread
From: David Brownell @ 2008-03-18 19:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Henrique de Moraes Holschuh, Richard Purdie, linux-kernel, Ingo Molnar

On Tuesday 18 March 2008, Andrew Morton wrote:
> On Sun, 16 Mar 2008 11:46:23 -0800 David Brownell <david-b@pacbell.net> wrote:
> 
> > On Sunday 16 March 2008, Henrique de Moraes Holschuh wrote:
> > > Is the use of "if (preempt_count())" to know when to defer led gpio work to
> > > a workqueue needed? __Shouldn't "if (in_atomic())" be enough?
> > 
> > At this point, I don't know of any such reason.
> > 
> > I remember hunting for the right heuristic, and settling on
> > that one for reasons that I can't recall now.  They may even
> > be no longer applicable.
> 
> Both are incorrect. 

So something like the appended patch would seem "better"?


> <greps for in_atomic>
> 
> omigawd, what have we done, and how can we fix it? :(

 
==============
It appears that we can't just check to see if we're in a task
context ... so instead of trying that, just make the relevant
leds always schedule a little worklet.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
---
 drivers/leds/leds-gpio.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- g26.orig/drivers/leds/leds-gpio.c	2008-03-18 01:32:08.000000000 -0700
+++ g26/drivers/leds/leds-gpio.c	2008-03-18 02:01:23.000000000 -0700
@@ -49,13 +49,13 @@ static void gpio_led_set(struct led_clas
 	if (led_dat->active_low)
 		level = !level;
 
-	/* setting GPIOs with I2C/etc requires a preemptible task context */
+	/* Setting GPIOs with I2C/etc requires a task context, and we don't
+	 * seem to have a reliable way to know if we're already in one; so
+	 * let's just assume the worst.
+	 */
 	if (led_dat->can_sleep) {
-		if (preempt_count()) {
-			led_dat->new_level = level;
-			schedule_work(&led_dat->work);
-		} else
-			gpio_set_value_cansleep(led_dat->gpio, level);
+		led_dat->new_level = level;
+		schedule_work(&led_dat->work);
 	} else
 		gpio_set_value(led_dat->gpio, level);
 }



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-18 19:06     ` David Brownell
@ 2008-03-18 20:07       ` Andrew Morton
  0 siblings, 0 replies; 45+ messages in thread
From: Andrew Morton @ 2008-03-18 20:07 UTC (permalink / raw)
  To: David Brownell; +Cc: hmh, rpurdie, linux-kernel, mingo

On Tue, 18 Mar 2008 11:06:13 -0800
David Brownell <david-b@pacbell.net> wrote:

> On Tuesday 18 March 2008, Andrew Morton wrote:
> > On Sun, 16 Mar 2008 11:46:23 -0800 David Brownell <david-b@pacbell.net> wrote:
> > 
> > > On Sunday 16 March 2008, Henrique de Moraes Holschuh wrote:
> > > > Is the use of "if (preempt_count())" to know when to defer led gpio work to
> > > > a workqueue needed? __Shouldn't "if (in_atomic())" be enough?
> > > 
> > > At this point, I don't know of any such reason.
> > > 
> > > I remember hunting for the right heuristic, and settling on
> > > that one for reasons that I can't recall now.  They may even
> > > be no longer applicable.
> > 
> > Both are incorrect. 
> 
> So something like the appended patch would seem "better"?
> 
> 
> > <greps for in_atomic>
> > 
> > omigawd, what have we done, and how can we fix it? :(
> 
>  
> ==============
> It appears that we can't just check to see if we're in a task
> context ... so instead of trying that, just make the relevant
> leds always schedule a little worklet.
> 
> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
> ---
>  drivers/leds/leds-gpio.c |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> --- g26.orig/drivers/leds/leds-gpio.c	2008-03-18 01:32:08.000000000 -0700
> +++ g26/drivers/leds/leds-gpio.c	2008-03-18 02:01:23.000000000 -0700
> @@ -49,13 +49,13 @@ static void gpio_led_set(struct led_clas
>  	if (led_dat->active_low)
>  		level = !level;
>  
> -	/* setting GPIOs with I2C/etc requires a preemptible task context */
> +	/* Setting GPIOs with I2C/etc requires a task context, and we don't
> +	 * seem to have a reliable way to know if we're already in one; so
> +	 * let's just assume the worst.
> +	 */
>  	if (led_dat->can_sleep) {
> -		if (preempt_count()) {
> -			led_dat->new_level = level;
> -			schedule_work(&led_dat->work);
> -		} else
> -			gpio_set_value_cansleep(led_dat->gpio, level);
> +		led_dat->new_level = level;
> +		schedule_work(&led_dat->work);
>  	} else
>  		gpio_set_value(led_dat->gpio, level);
>  }
> 

Better, I guess.

There's a design problem in the LED interface, though.  If callers really
do want to be able to call led_classdev.brightness_set() from atomic
contexts then we should either

a) make that function atomic (as you've done).  But that's inefficient.

b) pass in a mode flag to tell the callee whether it is allowed to
   sleep.  Ugly, but there's lots of precedent: GFP_ATOMIC-vs-GFP_KERNEL.

c) create a separate led_classdev.brightness_set_atomic() which callers
   should use when they're in atomic contexts.


Option c) would be best from a cleanness and efficiency POV.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-18  7:14   ` Andrew Morton
  2008-03-18 19:06     ` David Brownell
@ 2008-03-20 22:56     ` Henrique de Moraes Holschuh
  2008-03-20 23:47       ` Andrew Morton
  1 sibling, 1 reply; 45+ messages in thread
From: Henrique de Moraes Holschuh @ 2008-03-20 22:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Brownell, Richard Purdie, linux-kernel, Ingo Molnar

On Tue, 18 Mar 2008, Andrew Morton wrote:
> On Sun, 16 Mar 2008 11:46:23 -0800 David Brownell <david-b@pacbell.net> wrote:
> > On Sunday 16 March 2008, Henrique de Moraes Holschuh wrote:
> > > Is the use of "if (preempt_count())" to know when to defer led gpio work to
> > > a workqueue needed? __Shouldn't "if (in_atomic())" be enough?
> > 
> > At this point, I don't know of any such reason.
> > 
> > I remember hunting for the right heuristic, and settling on
> > that one for reasons that I can't recall now.  They may even
> > be no longer applicable.
> 
> Both are incorrect.  When CONFIG_PREEMPT=n we have no support for
> determining whether schedule() may be called.  The calling code has to sort
> out its stuff on its own.
> 
> <greps for preempt_count>
> 
> The LEDs code seems to be the sole offender.  print_vma_addr() might be
> wrong too, but Ingo did it, and perhaps he knows that all code paths which
> call print_vma_addr() from deadlockable contexts have already called
> inc_preempt_count().  But is that true for all architectures?
> 
> <greps for in_atomic>
> 
> omigawd, what have we done, and how can we fix it? :(

Well, it is obvious an "are we in a sleep-ok state?" query that works in any
case is desired by a lot of code.

I certainly don't want to punt every thinkpad LED write to a workqueue,
because that would mean less time with the CPU in C3 in the bottom line,
even if there are some benefits to always doing it the workqueue way (the
workqueue helper colapses attempts to change the LED state too often).

Can we add "in_scheduleable()", or maybe "can_schedule()", that returns
in_atomic() if CONFIG_PREEMT, or 0 if there is no way to know?   To my
limited knowledge of how that part of the kernel works, it would do the
right thing.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-20 22:56     ` Henrique de Moraes Holschuh
@ 2008-03-20 23:47       ` Andrew Morton
  2008-03-21  0:36         ` Henrique de Moraes Holschuh
  2008-03-21  0:56         ` Richard Purdie
  0 siblings, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2008-03-20 23:47 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: David Brownell, Richard Purdie, linux-kernel, Ingo Molnar

On Thu, 20 Mar 2008 19:56:12 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:

> Can we add "in_scheduleable()", or maybe "can_schedule()", that returns
> in_atomic() if CONFIG_PREEMT, or 0 if there is no way to know?   To my
> limited knowledge of how that part of the kernel works, it would do the
> right thing.

If we did that, then people would use it.  And that would be bad.  It'll
lead to code which behaves differently on non-preemptible kernels, to code
which works less well on non-preemptible kernels and it will lead to less
well-thought-out code in general.

Really, this all points at an ill-designed part of the leds interface.  The
consistent pattern we use in the kernel is that callers keep track of
whether they are running in a schedulable context and, if necessary, they
will inform callees about that.  Callees don't work it out for themselves.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-20 23:47       ` Andrew Morton
@ 2008-03-21  0:36         ` Henrique de Moraes Holschuh
  2008-03-21  1:08           ` Andrew Morton
  2008-03-21  0:56         ` Richard Purdie
  1 sibling, 1 reply; 45+ messages in thread
From: Henrique de Moraes Holschuh @ 2008-03-21  0:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Brownell, Richard Purdie, linux-kernel, Ingo Molnar

On Thu, 20 Mar 2008, Andrew Morton wrote:
> On Thu, 20 Mar 2008 19:56:12 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:
> > Can we add "in_scheduleable()", or maybe "can_schedule()", that returns
> > in_atomic() if CONFIG_PREEMT, or 0 if there is no way to know?   To my
> > limited knowledge of how that part of the kernel works, it would do the
> > right thing.
> 
> If we did that, then people would use it.  And that would be bad.  It'll
> lead to code which behaves differently on non-preemptible kernels, to code
> which works less well on non-preemptible kernels and it will lead to less
> well-thought-out code in general.
> 
> Really, this all points at an ill-designed part of the leds interface.  The
> consistent pattern we use in the kernel is that callers keep track of
> whether they are running in a schedulable context and, if necessary, they
> will inform callees about that.  Callees don't work it out for themselves.

ACK.  Richard?  I have changed the thinkpad-acpi LED support to always defer
to a workqueue right now, but this *really* wants a LED class API fixup.
I'm for adding an specific hook for atomic access, but a flag would be good
enough too.

Well, so far so good for LEDs, but what about the other users of in_atomic
that apparently should not be doing it either?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-20 23:47       ` Andrew Morton
  2008-03-21  0:36         ` Henrique de Moraes Holschuh
@ 2008-03-21  0:56         ` Richard Purdie
  2008-03-21  2:10           ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 45+ messages in thread
From: Richard Purdie @ 2008-03-21  0:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Henrique de Moraes Holschuh, David Brownell, linux-kernel, Ingo Molnar

On Thu, 2008-03-20 at 16:47 -0700, Andrew Morton wrote:
> On Thu, 20 Mar 2008 19:56:12 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:
> 
> > Can we add "in_scheduleable()", or maybe "can_schedule()", that returns
> > in_atomic() if CONFIG_PREEMT, or 0 if there is no way to know?   To my
> > limited knowledge of how that part of the kernel works, it would do the
> > right thing.
> 
> If we did that, then people would use it.  And that would be bad.  It'll
> lead to code which behaves differently on non-preemptible kernels, to code
> which works less well on non-preemptible kernels and it will lead to less
> well-thought-out code in general.
> 
> Really, this all points at an ill-designed part of the leds interface.  The
> consistent pattern we use in the kernel is that callers keep track of
> whether they are running in a schedulable context and, if necessary, they
> will inform callees about that.  Callees don't work it out for themselves.

The LED interface said that the brightness_set implementation should not
sleep since it was intended to be a 'cheap' function and to allow LED
triggers changing the LED brightness to be simple. A lot of embedded LED
hardware doesn't need to sleep to toggle gpios.

Some drivers do have a problem with that however and its usually been
suggested they offload the brightness changes into a workqueue. The gpio
driver tries to be clever and only uses the workqueue if the gpio
backend can sleep *and* the calling context requires it, the latter part
being the problem.

So the options are:

* fix the gpio driver not to be so clever and clearly document
* move the workqueue into the LED class, use it for everyone and remove
the limitation of the function (punishes the hardware which doesn't need
to sleep)
* move the workqueue into the LED class and have LED drivers state
whether they can sleep or not
* start passing around GFP_* flags

Passing flags around and maintaining a track of schedulable state for
the LED class sounds like overkill. I also don't like the idea of
needlessly always using a workqueue. The reason the workqueue was never
implemented in the core was basically a question of timing. If you know
the LED is on a serial bus running at 9600 baud you might not schedule
work quite as often as something on a faster bus. Yes you could start
passing this info around but to me it makes sense to leave this kind of
policy to the drivers.

So I'm leaning towards 'fixing' the gpio driver as I think David has
already offered. I will also improve the documentation on this function
and its requirements as I agree the current isn't as clear as it should
be.

Cheers,

Richard


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  0:36         ` Henrique de Moraes Holschuh
@ 2008-03-21  1:08           ` Andrew Morton
  2008-03-21  1:31             ` Alan Stern
                               ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Andrew Morton @ 2008-03-21  1:08 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: David Brownell, Richard Purdie, linux-kernel, Ingo Molnar,
	Geert Uytterhoeven, netdev, Martin Schwidefsky, Heiko Carstens,
	linux-usb, linux-wireless, video4linux-list, Stefan Richter,
	lm-sensors

On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:

> Well, so far so good for LEDs, but what about the other users of in_atomic
> that apparently should not be doing it either?

Ho hum.  Lots of cc's added.



./arch/x86/mm/pageattr.c

  Looks wrong.

./arch/m68k/atari/time.c

  Possibly buggy: deadlockable

./sound/core/seq/seq_virmidi.c

  Possibly buggy

./net/iucv/iucv.c
./kernel/power/process.c

  Just a debug check.

./drivers/s390/char/sclp_tty.c

  Possibly buggy: deadlockable
  
./drivers/s390/char/sclp_vt220.c

  Possibly buggy: deadlockable

./drivers/s390/net/netiucv.c

  Possibly buggy: deadlockable

./drivers/char/isicom.c

  Possibly buggy: deadlockable

./drivers/usb/misc/sisusbvga/sisusb_con.c

  Possibly buggy: deadlockable

./drivers/net/usb/pegasus.c

  Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/airo.c

  Possibly buggy: deadlockable

./drivers/net/wireless/rt2x00/rt73usb.c

  Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/rt2x00/rt2500usb.c

  Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/hostap/hostap_ioctl.c

  Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/zd1211rw/zd_usb.c

  Possibly buggy: deadlockable (I assume)

./drivers/net/irda/sir_dev.c

  Possibly buggy: deadlockable

./drivers/net/netxen/netxen_nic_niu.c

  Possibly buggy: deadlockable

./drivers/net/netxen/netxen_nic_init.c

  Possibly buggy: deadlockable

./drivers/ieee1394/ieee1394_transactions.c

  Possibly buggy: deadlockable

./drivers/video/amba-clcd.c

  Possibly buggy: deadlockable

./drivers/i2c/i2c-core.c

  Possibly buggy: deadlockable


The usual pattern for most of the above is

	if (!in_atomic())
		do_something_which_might_sleep();

problem is, in_atomic() returns false inside spinlock on non-preptible
kernels.  So if anyone calls those functions inside spinlock they will
incorrectly schedule and another task can then come in and try take the
already-held lock.

Now, it happens that in_atomic() returns true on non-preemtible kernels
when running in interrupt or softirq context.  But if the above code really
is using in_atomic() to detect am-i-called-from-interrupt and NOT
am-i-called-from-inside-spinlock, they should be using in_irq(),
in_softirq() or in_interrupt().


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  1:08           ` Andrew Morton
@ 2008-03-21  1:31             ` Alan Stern
  2008-03-21  1:36               ` Michael Buesch
  2008-03-21  9:21             ` Stefan Richter
  2008-03-21 17:04             ` David Brownell
  2 siblings, 1 reply; 45+ messages in thread
From: Alan Stern @ 2008-03-21  1:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Henrique de Moraes Holschuh, David Brownell, Richard Purdie,
	linux-kernel, Ingo Molnar, Geert Uytterhoeven, netdev,
	Martin Schwidefsky, Heiko Carstens, linux-usb, linux-wireless,
	video4linux-list, Stefan Richter, lm-sensors

On Thu, 20 Mar 2008, Andrew Morton wrote:

> On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:
> 
> > Well, so far so good for LEDs, but what about the other users of in_atomic
> > that apparently should not be doing it either?
> 
> Ho hum.  Lots of cc's added.

...

> The usual pattern for most of the above is
> 
> 	if (!in_atomic())
> 		do_something_which_might_sleep();
> 
> problem is, in_atomic() returns false inside spinlock on non-preptible
> kernels.  So if anyone calls those functions inside spinlock they will
> incorrectly schedule and another task can then come in and try take the
> already-held lock.
> 
> Now, it happens that in_atomic() returns true on non-preemtible kernels
> when running in interrupt or softirq context.  But if the above code really
> is using in_atomic() to detect am-i-called-from-interrupt and NOT
> am-i-called-from-inside-spinlock, they should be using in_irq(),
> in_softirq() or in_interrupt().

Presumably most of these places are actually trying to detect 
am-i-allowed-to-sleep.  Isn't that what in_atomic() is supposed to do?  
Why doesn't it do that in non-preemptible kernels?

For that matter, isn't it also the sort of thing that might_sleep() is 
supposed to check?  But looking at the definitions in 
include/linux/kernel.h, it appears that might_sleep() does nothing at 
all when neither CONFIG_PREEMPT_VOLUNTARY nor 
CONFIG_DEBUG_SPINLOCK_SLEEP is set.

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  1:31             ` Alan Stern
@ 2008-03-21  1:36               ` Michael Buesch
  2008-03-21  2:27                 ` Andrew Morton
  0 siblings, 1 reply; 45+ messages in thread
From: Michael Buesch @ 2008-03-21  1:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrew Morton, Henrique de Moraes Holschuh, David Brownell,
	Richard Purdie, linux-kernel, Ingo Molnar, Geert Uytterhoeven,
	netdev, Martin Schwidefsky, Heiko Carstens, linux-usb,
	linux-wireless, video4linux-list, Stefan Richter, lm-sensors

On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> On Thu, 20 Mar 2008, Andrew Morton wrote:
> 
> > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:
> > 
> > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > that apparently should not be doing it either?
> > 
> > Ho hum.  Lots of cc's added.
> 
> ...
> 
> > The usual pattern for most of the above is
> > 
> > 	if (!in_atomic())
> > 		do_something_which_might_sleep();
> > 
> > problem is, in_atomic() returns false inside spinlock on non-preptible
> > kernels.  So if anyone calls those functions inside spinlock they will
> > incorrectly schedule and another task can then come in and try take the
> > already-held lock.
> > 
> > Now, it happens that in_atomic() returns true on non-preemtible kernels
> > when running in interrupt or softirq context.  But if the above code really
> > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > in_softirq() or in_interrupt().
> 
> Presumably most of these places are actually trying to detect 
> am-i-allowed-to-sleep.  Isn't that what in_atomic() is supposed to do?  

No, I think there is no such check in the kernel. Most likely for performance
reasons, as it would require a global flag that is set on each spinlock.
You simply must always _know_, if you are allowed to sleep or not. This is
done by defining an API. The call-context is part of any kernel API.

-- 
Greetings Michael.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  0:56         ` Richard Purdie
@ 2008-03-21  2:10           ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 45+ messages in thread
From: Henrique de Moraes Holschuh @ 2008-03-21  2:10 UTC (permalink / raw)
  To: Richard Purdie; +Cc: Andrew Morton, David Brownell, linux-kernel, Ingo Molnar

On Fri, 21 Mar 2008, Richard Purdie wrote:
> The LED interface said that the brightness_set implementation should not
> sleep since it was intended to be a 'cheap' function and to allow LED
> triggers changing the LED brightness to be simple. A lot of embedded LED
> hardware doesn't need to sleep to toggle gpios.

So far so good.

> Some drivers do have a problem with that however and its usually been
> suggested they offload the brightness changes into a workqueue. The gpio

Also good.  But the fact is, the LED core *does* know when it is calling
from a scheduleable context (e.g. from sysfs handlers), and that's not an
uncommon path either.

The trigger code is more complicated, I don't know if most of its calls to
brightness_set are in safe or unsafe contexts for sleep.  But the people
calling the trigger code certainly would.

> * fix the gpio driver not to be so clever and clearly document
> * move the workqueue into the LED class, use it for everyone and remove
> the limitation of the function (punishes the hardware which doesn't need
> to sleep)
> * move the workqueue into the LED class and have LED drivers state
> whether they can sleep or not
> * start passing around GFP_* flags
> 
> Passing flags around and maintaining a track of schedulable state for
> the LED class sounds like overkill. I also don't like the idea of

It is the preferred way to do these things.  If you don't do it like that,
both gpio and *all* ACPI-based LED devices will have to always defer to
workqueues.

> So I'm leaning towards 'fixing' the gpio driver as I think David has
> already offered. I will also improve the documentation on this function
> and its requirements as I agree the current isn't as clear as it should
> be.

And we will have to always defer to workqueues on drivers that can't operate
from an atomic/interrupt context?  Even when there would be no need for it
because brightness_set is not being called from an non-scheduleable context
at all?

I hope I can live with that for LEDs (I have to think about LED
brightness_get first before I am sure about that), but I don't like it at
all for the long term.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  1:36               ` Michael Buesch
@ 2008-03-21  2:27                 ` Andrew Morton
  2008-03-21  3:07                   ` Alan Stern
  2008-03-21 13:47                   ` Heiko Carstens
  0 siblings, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2008-03-21  2:27 UTC (permalink / raw)
  To: Michael Buesch
  Cc: Alan Stern, Henrique de Moraes Holschuh, David Brownell,
	Richard Purdie, linux-kernel, Ingo Molnar, Geert Uytterhoeven,
	netdev, Martin Schwidefsky, Heiko Carstens, linux-usb,
	linux-wireless, video4linux-list, Stefan Richter, lm-sensors

On Fri, 21 Mar 2008 02:36:51 +0100 Michael Buesch <mb@bu3sch.de> wrote:

> On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> > On Thu, 20 Mar 2008, Andrew Morton wrote:
> > 
> > > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:
> > > 
> > > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > > that apparently should not be doing it either?
> > > 
> > > Ho hum.  Lots of cc's added.
> > 
> > ...
> > 
> > > The usual pattern for most of the above is
> > > 
> > > 	if (!in_atomic())
> > > 		do_something_which_might_sleep();
> > > 
> > > problem is, in_atomic() returns false inside spinlock on non-preptible
> > > kernels.  So if anyone calls those functions inside spinlock they will
> > > incorrectly schedule and another task can then come in and try take the
> > > already-held lock.
> > > 
> > > Now, it happens that in_atomic() returns true on non-preemtible kernels
> > > when running in interrupt or softirq context.  But if the above code really
> > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > in_softirq() or in_interrupt().
> > 
> > Presumably most of these places are actually trying to detect 
> > am-i-allowed-to-sleep.  Isn't that what in_atomic() is supposed to do?  
> 
> No, I think there is no such check in the kernel. Most likely for performance
> reasons, as it would require a global flag that is set on each spinlock.

Yup.  non-preemptible kernels avoid the inc/dec of
current_thread_info->preempt_count on spin_lock/spin_unlock

> You simply must always _know_, if you are allowed to sleep or not. This is
> done by defining an API. The call-context is part of any kernel API.

Yup.  99.99% of kernel code manages to do this...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  2:27                 ` Andrew Morton
@ 2008-03-21  3:07                   ` Alan Stern
  2008-03-21  3:17                     ` Andrew Morton
  2008-03-21 13:47                   ` Heiko Carstens
  1 sibling, 1 reply; 45+ messages in thread
From: Alan Stern @ 2008-03-21  3:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michael Buesch, Henrique de Moraes Holschuh, David Brownell,
	Richard Purdie, linux-kernel, Ingo Molnar, Geert Uytterhoeven,
	netdev, Martin Schwidefsky, Heiko Carstens, linux-usb,
	linux-wireless, video4linux-list, Stefan Richter, lm-sensors

On Thu, 20 Mar 2008, Andrew Morton wrote:

> > > > Now, it happens that in_atomic() returns true on non-preemtible kernels
> > > > when running in interrupt or softirq context.  But if the above code really
> > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > in_softirq() or in_interrupt().
> > > 
> > > Presumably most of these places are actually trying to detect 
> > > am-i-allowed-to-sleep.  Isn't that what in_atomic() is supposed to do?  
> > 
> > No, I think there is no such check in the kernel. Most likely for performance
> > reasons, as it would require a global flag that is set on each spinlock.
> 
> Yup.  non-preemptible kernels avoid the inc/dec of
> current_thread_info->preempt_count on spin_lock/spin_unlock

So then what's the point of having in_atomic() at all?  Is it nothing 
more than a shorthand form of (in_irq() | in_softirq() | 
in_interrupt())?

In short, you are saying that there is _no_ reliable way to determine
am-i-called-from-inside-spinlock.  Well, why isn't there?  Would it be 
so terrible if non-preemptible kernels did adjust preempt_count on 
spin_lock/unlock?

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  3:07                   ` Alan Stern
@ 2008-03-21  3:17                     ` Andrew Morton
  2008-03-21  9:53                       ` Jean Delvare
  2008-03-21 15:11                       ` Tetsuo Handa
  0 siblings, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2008-03-21  3:17 UTC (permalink / raw)
  To: Alan Stern
  Cc: Michael Buesch, Henrique de Moraes Holschuh, David Brownell,
	Richard Purdie, linux-kernel, Ingo Molnar, Geert Uytterhoeven,
	netdev, Martin Schwidefsky, Heiko Carstens, linux-usb,
	linux-wireless, video4linux-list, Stefan Richter, lm-sensors

On Thu, 20 Mar 2008 23:07:16 -0400 (EDT) Alan Stern <stern@rowland.harvard.edu> wrote:

> On Thu, 20 Mar 2008, Andrew Morton wrote:
> 
> > > > > Now, it happens that in_atomic() returns true on non-preemtible kernels
> > > > > when running in interrupt or softirq context.  But if the above code really
> > > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > > in_softirq() or in_interrupt().
> > > > 
> > > > Presumably most of these places are actually trying to detect 
> > > > am-i-allowed-to-sleep.  Isn't that what in_atomic() is supposed to do?  
> > > 
> > > No, I think there is no such check in the kernel. Most likely for performance
> > > reasons, as it would require a global flag that is set on each spinlock.
> > 
> > Yup.  non-preemptible kernels avoid the inc/dec of
> > current_thread_info->preempt_count on spin_lock/spin_unlock
> 
> So then what's the point of having in_atomic() at all?  Is it nothing 
> more than a shorthand form of (in_irq() | in_softirq() | 
> in_interrupt())?

in_atomic() is for core kernel use only.  Because in special circumstances
(ie: kmap_atomic()) we run inc_preempt_count() even on non-preemptible
kernels to tell the per-arch fault handler that it was invoked by
copy_*_user() inside kmap_atomic(), and it must fail.

> In short, you are saying that there is _no_ reliable way to determine
> am-i-called-from-inside-spinlock.

That's correct.

>  Well, why isn't there?

The reasons I identified: it adds additional overhead and it encourages
poorly-thought-out design.

Now we _could_ change kernel design principles from
caller-knows-whats-going-on over to callee-works-out-whats-going-on.  But
that would affect more than this particular thing.

>  Would it be 
> so terrible if non-preemptible kernels did adjust preempt_count on 
> spin_lock/unlock?

The vast, vast majority of kernel code has managed to get through life
without needing this hidden-argument-passing.  The handful of errant
callsites should be able to do so as well...


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  1:08           ` Andrew Morton
  2008-03-21  1:31             ` Alan Stern
@ 2008-03-21  9:21             ` Stefan Richter
  2008-03-21  9:27               ` Stefan Richter
  2008-03-21 12:37               ` Henrique de Moraes Holschuh
  2008-03-21 17:04             ` David Brownell
  2 siblings, 2 replies; 45+ messages in thread
From: Stefan Richter @ 2008-03-21  9:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Henrique de Moraes Holschuh, David Brownell, Richard Purdie,
	linux-kernel, Ingo Molnar, Geert Uytterhoeven, netdev,
	Martin Schwidefsky, Heiko Carstens, linux-usb, linux-wireless,
	video4linux-list, lm-sensors

Andrew Morton wrote:
> ./drivers/ieee1394/ieee1394_transactions.c
> 
>   Possibly buggy: deadlockable

That's in hpsb_get_tlabel(), an exported symbol of the ieee1394 core.

The in_atomic() there didn't cause problems yet and is unlikely to do so 
in the future, because there are no plans for substantial changes to the 
whole drivers/ieee1394/ anymore (because of drivers/firewire/).

Nevertheless I shall look into replacing the in_atomic() by in_softirq() 
or something like that.  Touching this legacy code is dangerous though.


Some background:

This in_atomic() is just one symptom of one of the fundamental design 
flaws of the ieee1394 stack:  The "tlabels" (transaction labels, a 
limited resource) are acquired not only in process context but also in 
soft IRQ context --- but they are released only in process context. 
Unsurprisingly (in hindsight), the stack used to run out of tlabels 
simply because the tlabel consumers were scheduled more frequently than 
the tlabel recycler.  This resulted in IO failures in sbp2 and eth1394.

This is one of the design problems which inspired the submission of a 
new alternative driver stack.  (Though this particular one of the 
ieee1394 stack's problems could of course also be solved by a rework of 
the stack --- with a respective need of resources for testing and some 
danger of regressions.)

In the meantime (Linux 2.6.19 and 2.6.22) I added workarounds in sbp2 
and eth1394 to deal with temporary lack of of tlabels.  Alas I just 
recently received a report that eth1394's workaround is unsuccessful on 
non-preemptible uniprocessor kernels.  I suspect the same issue exists 
with sbp2's workaround, it just isn't as likely to happen there.

The new drivers/firewire/ recycle tlabels in bottom halves context and 
in timer context, which is the appropriate approach.  Alas 
drivers/firewire/ don't have an eth1394 equivalent yet...
-- 
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  9:21             ` Stefan Richter
@ 2008-03-21  9:27               ` Stefan Richter
  2008-03-21 12:37               ` Henrique de Moraes Holschuh
  1 sibling, 0 replies; 45+ messages in thread
From: Stefan Richter @ 2008-03-21  9:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Henrique de Moraes Holschuh, David Brownell, Richard Purdie,
	linux-kernel, Ingo Molnar, Geert Uytterhoeven, netdev,
	Martin Schwidefsky, Heiko Carstens, linux-usb, linux-wireless,
	video4linux-list, lm-sensors

I wrote:
> Andrew Morton wrote:
>> ./drivers/ieee1394/ieee1394_transactions.c
>>
>>   Possibly buggy: deadlockable
> 
> That's in hpsb_get_tlabel(), an exported symbol of the ieee1394 core.
> 
> The in_atomic() there didn't cause problems yet and is unlikely to do so 
> in the future, because there are no plans for substantial changes to the 
> whole drivers/ieee1394/ anymore (because of drivers/firewire/).
> 
> Nevertheless I shall look into replacing the in_atomic() by in_softirq() 
> or something like that.

Or extend the API to have separate calls for callers which can sleep and 
callers which can't.  But that may be thwarted by deep call chains.

> Touching this legacy code is dangerous though.
-- 
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  3:17                     ` Andrew Morton
@ 2008-03-21  9:53                       ` Jean Delvare
  2008-03-21 17:37                         ` Andrew Morton
  2008-03-21 15:11                       ` Tetsuo Handa
  1 sibling, 1 reply; 45+ messages in thread
From: Jean Delvare @ 2008-03-21  9:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Stern, Michael Buesch, Henrique de Moraes Holschuh,
	David Brownell, Richard Purdie, linux-kernel, Ingo Molnar,
	Geert Uytterhoeven

On Thu, 20 Mar 2008 20:17:23 -0700, Andrew Morton wrote:
> in_atomic() is for core kernel use only.  (...)

Then why is it made available to drivers through <linux/hardirq.h>? If
it's such a dangerous macro to call from drivers, it shouldn't be made
available, or at the very least there should be a big fat warning in
<linux/hardirq.h> that drivers aren't supposed to use it. This would
have avoided the 23 uses cases in drivers we have right now.

-- 
Jean Delvare

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  9:21             ` Stefan Richter
  2008-03-21  9:27               ` Stefan Richter
@ 2008-03-21 12:37               ` Henrique de Moraes Holschuh
  2008-03-21 13:16                 ` Stefan Richter
  1 sibling, 1 reply; 45+ messages in thread
From: Henrique de Moraes Holschuh @ 2008-03-21 12:37 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Andrew Morton, David Brownell, Richard Purdie, linux-kernel,
	Ingo Molnar, Geert Uytterhoeven, netdev, Martin Schwidefsky,
	Heiko Carstens, linux-usb, linux-wireless, video4linux-list,
	lm-sensors

On Fri, 21 Mar 2008, Stefan Richter wrote:
> and eth1394 to deal with temporary lack of of tlabels.  Alas I just  
> recently received a report that eth1394's workaround is unsuccessful on  
> non-preemptible uniprocessor kernels.  I suspect the same issue exists  

Which, I think, is exactly the config where in_atomic() can't be used to
mean "in_scheduleable_context()" ?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 12:37               ` Henrique de Moraes Holschuh
@ 2008-03-21 13:16                 ` Stefan Richter
  2008-03-22 11:29                   ` Stefan Richter
  0 siblings, 1 reply; 45+ messages in thread
From: Stefan Richter @ 2008-03-21 13:16 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Andrew Morton, David Brownell, Richard Purdie, linux-kernel,
	Ingo Molnar, Geert Uytterhoeven, netdev, Martin Schwidefsky,
	Heiko Carstens, linux-usb, linux-wireless, video4linux-list,
	lm-sensors

Henrique de Moraes Holschuh wrote:
> On Fri, 21 Mar 2008, Stefan Richter wrote:
>> and eth1394 to deal with temporary lack of of tlabels.  Alas I just  
>> recently received a report that eth1394's workaround is unsuccessful on  
>> non-preemptible uniprocessor kernels.  
> 
> Which, I think, is exactly the config where in_atomic() can't be used to
> mean "in_scheduleable_context()" ?

That's coincidence.

The mentioned workaround fails this way:
   - tlabel consumer eth1394 (IPv4 over FireWire) grabs lots of tlabels
     in soft IRQ context.
   - tlabel recycler khpsbpkt (a kthread of ieee1394) sleeps even though
     it could start putting tlabels back into the pool.
   - eth1394 can't get tlabels anymore, stops the transmit queue,
     schedules a workqueue job.
   - eth1394's workqueue job (run by the events kthread) tries to acquire
     a tlabel.  It does so in non-atomic context and hence sleeps in
     hpsb_get_tlabel() until the tlabel pool is nonempty again.  It would
     then wake up the eth1394 transmit queue again.
   - Normally, khpsbpkt would have been woken up by now and would have
     released a lot of now unused tlabels back into the pool again.
     However, on UP preempt_none kernels, khpsbpkt continues to sleep.
     (The 1394 stack's lower level runing in IRQ context or perhaps
     tasklet context wakes up khpsbpkt.)
   - Since it doesn't get a tlabel, eth1394's workqueue jobs sleeps
     forever as well.

Result is that all other tasks of the shared workqueue can't be 
serviced, notably the keyboard is stuck, and that the eth1394 connection 
breaks down.  (I haven't started working on a fix, or opened a bugzilla 
ticket for it yet.  The reporter currently switched his kernel to 
PREEMPT which is not affected.)

IOW:
The failure in the workaround is *not* about the in_atomic() being the 
wrong question asked in hpsb_get_tlabel() --- no, ieee1394's in_atomic() 
abuse works just fine even on UP PREEMPT_NONE.  Instead, the failure is 
about kthreads not being scheduled in the way that I thought they would.
-- 
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  2:27                 ` Andrew Morton
  2008-03-21  3:07                   ` Alan Stern
@ 2008-03-21 13:47                   ` Heiko Carstens
  2008-03-21 16:54                     ` Greg KH
  1 sibling, 1 reply; 45+ messages in thread
From: Heiko Carstens @ 2008-03-21 13:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michael Buesch, Alan Stern, Henrique de Moraes Holschuh,
	David Brownell, Richard Purdie, linux-kernel, Ingo Molnar,
	Geert Uytterhoeven, netdev, Martin Schwidefsky, linux-usb,
	linux-wireless, video4linux-list, Stefan Richter, lm-sensors

On Thu, Mar 20, 2008 at 07:27:19PM -0700, Andrew Morton wrote:
> On Fri, 21 Mar 2008 02:36:51 +0100 Michael Buesch <mb@bu3sch.de> wrote:
> > On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> > > On Thu, 20 Mar 2008, Andrew Morton wrote:
> > > > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:
> > > > 
> > > > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > > > that apparently should not be doing it either?
> > > > 
> > > > Ho hum.  Lots of cc's added.
> > > 
> > > ...
> > > 
> > > > The usual pattern for most of the above is
> > > > 
> > > > 	if (!in_atomic())
> > > > 		do_something_which_might_sleep();
> > > > 
> > > > problem is, in_atomic() returns false inside spinlock on non-preptible
> > > > kernels.  So if anyone calls those functions inside spinlock they will
> > > > incorrectly schedule and another task can then come in and try take the
> > > > already-held lock.
> > > > 
> > > > Now, it happens that in_atomic() returns true on non-preemtible kernels
> > > > when running in interrupt or softirq context.  But if the above code really
> > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > in_softirq() or in_interrupt().
> > > 
> > > Presumably most of these places are actually trying to detect 
> > > am-i-allowed-to-sleep.  Isn't that what in_atomic() is supposed to do?  
> > 
> > No, I think there is no such check in the kernel. Most likely for performance
> > reasons, as it would require a global flag that is set on each spinlock.
> 
> Yup.  non-preemptible kernels avoid the inc/dec of
> current_thread_info->preempt_count on spin_lock/spin_unlock
> 
> > You simply must always _know_, if you are allowed to sleep or not. This is
> > done by defining an API. The call-context is part of any kernel API.
> 
> Yup.  99.99% of kernel code manages to do this...

This is difficult for console drivers. They get called and are supposed to
print something and don't have the slightest clue which context they are
running in and if they are allowed to schedule.
This is the problem with e.g. s390's sclp driver. If there are no write
buffers available anymore it tries to allocate memory if schedule is allowed
or otherwise has to wait until finally a request finished and memory is
available again.
And now we have to always busy wait if we are out of buffers, since we
cannot tell which context we are in?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  3:17                     ` Andrew Morton
  2008-03-21  9:53                       ` Jean Delvare
@ 2008-03-21 15:11                       ` Tetsuo Handa
  2008-03-21 16:54                         ` Stefan Richter
  1 sibling, 1 reply; 45+ messages in thread
From: Tetsuo Handa @ 2008-03-21 15:11 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel

Hello.

> > In short, you are saying that there is _no_ reliable way to determine
> > am-i-called-from-inside-spinlock.
> 
> That's correct.

So, it is impossible to know whether I am inside a spinlock or not.
OK. That's not what I want to do.

I want to make sure that my code (not a device driver) is called only from a context
where use of down()/mutex_lock()/kmalloc(GFP_KERNEL)/get_user_pages()/kmap() etc. are permitted.
Is "if (in_atomic()) return;" check a correct method for avoiding deadlocks
when my code was accidentally called from a context
where use of down()/mutex_lock()/kmalloc(GFP_KERNEL)/get_user_pages()/kmap() etc. are not permitted?
I'm assuming that in_atomic() returns nonzero whenever scheduling is not permitted.

Regards.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 13:47                   ` Heiko Carstens
@ 2008-03-21 16:54                     ` Greg KH
  2008-03-21 19:59                       ` Andrew Morton
  0 siblings, 1 reply; 45+ messages in thread
From: Greg KH @ 2008-03-21 16:54 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Andrew Morton, Michael Buesch, Alan Stern,
	Henrique de Moraes Holschuh, David Brownell, Richard Purdie,
	linux-kernel, Ingo Molnar, Geert Uytterhoeven, netdev,
	Martin Schwidefsky, linux-usb, linux-wireless, video4linux-list,
	Stefan Richter, lm-sensors

On Fri, Mar 21, 2008 at 02:47:50PM +0100, Heiko Carstens wrote:
> On Thu, Mar 20, 2008 at 07:27:19PM -0700, Andrew Morton wrote:
> > On Fri, 21 Mar 2008 02:36:51 +0100 Michael Buesch <mb@bu3sch.de> wrote:
> > > On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> > > > On Thu, 20 Mar 2008, Andrew Morton wrote:
> > > > > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <hmh@hmh.eng.br> wrote:
> > > > > 
> > > > > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > > > > that apparently should not be doing it either?
> > > > > 
> > > > > Ho hum.  Lots of cc's added.
> > > > 
> > > > ...
> > > > 
> > > > > The usual pattern for most of the above is
> > > > > 
> > > > > 	if (!in_atomic())
> > > > > 		do_something_which_might_sleep();
> > > > > 
> > > > > problem is, in_atomic() returns false inside spinlock on non-preptible
> > > > > kernels.  So if anyone calls those functions inside spinlock they will
> > > > > incorrectly schedule and another task can then come in and try take the
> > > > > already-held lock.
> > > > > 
> > > > > Now, it happens that in_atomic() returns true on non-preemtible kernels
> > > > > when running in interrupt or softirq context.  But if the above code really
> > > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > > in_softirq() or in_interrupt().
> > > > 
> > > > Presumably most of these places are actually trying to detect 
> > > > am-i-allowed-to-sleep.  Isn't that what in_atomic() is supposed to do?  
> > > 
> > > No, I think there is no such check in the kernel. Most likely for performance
> > > reasons, as it would require a global flag that is set on each spinlock.
> > 
> > Yup.  non-preemptible kernels avoid the inc/dec of
> > current_thread_info->preempt_count on spin_lock/spin_unlock
> > 
> > > You simply must always _know_, if you are allowed to sleep or not. This is
> > > done by defining an API. The call-context is part of any kernel API.
> > 
> > Yup.  99.99% of kernel code manages to do this...
> 
> This is difficult for console drivers. They get called and are supposed to
> print something and don't have the slightest clue which context they are
> running in and if they are allowed to schedule.
> This is the problem with e.g. s390's sclp driver. If there are no write
> buffers available anymore it tries to allocate memory if schedule is allowed
> or otherwise has to wait until finally a request finished and memory is
> available again.
> And now we have to always busy wait if we are out of buffers, since we
> cannot tell which context we are in?

This is the reason why the drivers/usb/misc/sisusbvga driver is trying
to test for in_atomic:
        /* We can't handle console calls in non-schedulable
         * context due to our locks and the USB transport.
         * So we simply ignore them. This should only affect
         * some calls to printk.
         */
        if (in_atomic())
                return NULL;


So how should this be "fixed" if in_atomic() is not a valid test?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 15:11                       ` Tetsuo Handa
@ 2008-03-21 16:54                         ` Stefan Richter
  2008-03-21 17:02                           ` Stefan Richter
  0 siblings, 1 reply; 45+ messages in thread
From: Stefan Richter @ 2008-03-21 16:54 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: akpm, linux-kernel

Tetsuo Handa wrote:
> Hello.
> 
>>> In short, you are saying that there is _no_ reliable way to determine
>>> am-i-called-from-inside-spinlock.
>> That's correct.
> 
> So, it is impossible to know whether I am inside a spinlock or not.
> OK. That's not what I want to do.
> 
> I want to make sure that my code (not a device driver) is called only from a context
> where use of down()/mutex_lock()/kmalloc(GFP_KERNEL)/get_user_pages()/kmap() etc. are permitted.
> Is "if (in_atomic()) return;" check a correct method for avoiding deadlocks
> when my code was accidentally called from a context
> where use of down()/mutex_lock()/kmalloc(GFP_KERNEL)/get_user_pages()/kmap() etc. are not permitted?
> I'm assuming that in_atomic() returns nonzero whenever scheduling is not permitted.

You shouldn't sleep while holding a spinlock.  As soon as another thread 
attempts to take the spinlock, it will be stuck in a busy-wait loop.

So, it's better if you specify that your code either can be called in 
atomic context or must not be called in atomic context, and all callers 
observe this restriction.  Or callers pass a flag to your code which 
says whether your code is allowed to sleep or not.
-- 
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 16:54                         ` Stefan Richter
@ 2008-03-21 17:02                           ` Stefan Richter
  2008-03-23  5:53                             ` Tetsuo Handa
  0 siblings, 1 reply; 45+ messages in thread
From: Stefan Richter @ 2008-03-21 17:02 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: akpm, linux-kernel

PS,
I wrote:
> Tetsuo Handa wrote:
>> So, it is impossible to know whether I am inside a spinlock or not.
>> OK. That's not what I want to do.
>>
>> I want to make sure that my code (not a device driver) is called only 
>> from a context where use of down()/mutex_lock()/kmalloc(GFP_KERNEL)/
>> get_user_pages()/kmap() etc. are permitted.
>> Is "if (in_atomic()) return;" check a correct method for avoiding 
>> deadlocks when my code was accidentally called from a context
>> where use of down()/mutex_lock()/kmalloc(GFP_KERNEL)/get_user_pages()/
>> kmap() etc. are not permitted?

No.  Quoting Andrew:  "in_atomic() returns false inside spinlock on 
non-preemptible kernels."

> You shouldn't sleep while holding a spinlock.  As soon as another thread 

or interrupt handler or tasklet

> attempts to take the spinlock, it will be stuck in a busy-wait loop.
-- 
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  1:08           ` Andrew Morton
  2008-03-21  1:31             ` Alan Stern
  2008-03-21  9:21             ` Stefan Richter
@ 2008-03-21 17:04             ` David Brownell
  2 siblings, 0 replies; 45+ messages in thread
From: David Brownell @ 2008-03-21 17:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Henrique de Moraes Holschuh, Richard Purdie, linux-kernel,
	Ingo Molnar, Geert Uytterhoeven, netdev, Martin Schwidefsky,
	Heiko Carstens, linux-usb, linux-wireless, video4linux-list,
	Stefan Richter, lm-sensors

On Thursday 20 March 2008, Andrew Morton wrote:
> ./drivers/net/usb/pegasus.c
> 
>   Possibly buggy: deadlockable (I assume)

Looks just unecessary to me ... ethtool MII ops get called from
a task context, as I recall, and other drivers just rely on that.

- Dave

========= CUT HERE
Remove superfluous in-atomic() check; ethtool MII ops are called
from task context.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
---
 drivers/net/usb/pegasus.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

--- g26.orig/drivers/net/usb/pegasus.c	2008-03-21 08:53:28.000000000 -0700
+++ g26/drivers/net/usb/pegasus.c	2008-03-21 08:54:07.000000000 -0700
@@ -1128,12 +1128,8 @@ pegasus_get_settings(struct net_device *
 {
 	pegasus_t *pegasus;
 
-	if (in_atomic())
-		return 0;
-
 	pegasus = netdev_priv(dev);
 	mii_ethtool_gset(&pegasus->mii, ecmd);
-
 	return 0;
 }
 


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21  9:53                       ` Jean Delvare
@ 2008-03-21 17:37                         ` Andrew Morton
  2008-03-21 18:05                           ` Alan Stern
  0 siblings, 1 reply; 45+ messages in thread
From: Andrew Morton @ 2008-03-21 17:37 UTC (permalink / raw)
  To: Jean Delvare
  Cc: Alan Stern, Michael Buesch, Henrique de Moraes Holschuh,
	David Brownell, Richard Purdie, linux-kernel, Ingo Molnar,
	Geert Uytterhoeven

On Fri, 21 Mar 2008 10:53:11 +0100 Jean Delvare <khali@linux-fr.org> wrote:

> On Thu, 20 Mar 2008 20:17:23 -0700, Andrew Morton wrote:
> > in_atomic() is for core kernel use only.  (...)
> 
> Then why is it made available to drivers through <linux/hardirq.h>?

Because we suck.

> If
> it's such a dangerous macro to call from drivers, it shouldn't be made
> available, or at the very least there should be a big fat warning in
> <linux/hardirq.h> that drivers aren't supposed to use it. This would
> have avoided the 23 uses cases in drivers we have right now.

True.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 17:37                         ` Andrew Morton
@ 2008-03-21 18:05                           ` Alan Stern
  2008-03-24 19:34                             ` Jonathan Corbet
  0 siblings, 1 reply; 45+ messages in thread
From: Alan Stern @ 2008-03-21 18:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jean Delvare, Michael Buesch, Henrique de Moraes Holschuh,
	David Brownell, Richard Purdie, Kernel development list,
	Ingo Molnar, Geert Uytterhoeven, Jonathan Corbet

On Fri, 21 Mar 2008, Andrew Morton wrote:

> On Fri, 21 Mar 2008 10:53:11 +0100 Jean Delvare <khali@linux-fr.org> wrote:
> 
> > On Thu, 20 Mar 2008 20:17:23 -0700, Andrew Morton wrote:
> > > in_atomic() is for core kernel use only.  (...)
> > 
> > Then why is it made available to drivers through <linux/hardirq.h>?
> 
> Because we suck.
> 
> > If
> > it's such a dangerous macro to call from drivers, it shouldn't be made
> > available, or at the very least there should be a big fat warning in
> > <linux/hardirq.h> that drivers aren't supposed to use it. This would
> > have avoided the 23 uses cases in drivers we have right now.
> 
> True.

There's also a section about in_atomic() in the Linux Device Drivers 
(3rd ed.) book which may have contributed to the confusion.  On p. 198:

	A function related to in_interrupt() is in_atomic().  Its 
	return value is nonzero whenever scheduling is not allowed;
	this includes hardware and software interrupt contexts as well
	as any time when a spinlock is held.  In the latter case, 
	current may be valid, but access to user space is forbidden, 
	since it can cause scheduling to happen.  Whenever you are
	using in_interrupt(), you should really consider whether 
	in_atomic() is what you actually mean.  Both functions are
	declared in <asm/hardirq.h>.

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 16:54                     ` Greg KH
@ 2008-03-21 19:59                       ` Andrew Morton
  2008-03-21 20:16                         ` Michael Buesch
  0 siblings, 1 reply; 45+ messages in thread
From: Andrew Morton @ 2008-03-21 19:59 UTC (permalink / raw)
  To: Greg KH
  Cc: heiko.carstens, mb, stern, hmh, david-b, rpurdie, linux-kernel,
	mingo, geert, netdev, schwidefsky, linux-usb, linux-wireless,
	video4linux-list, stefanr, lm-sensors

On Fri, 21 Mar 2008 09:54:05 -0700 Greg KH <greg@kroah.com> wrote:

> > > > You simply must always _know_, if you are allowed to sleep or not. This is
> > > > done by defining an API. The call-context is part of any kernel API.
> > > 
> > > Yup.  99.99% of kernel code manages to do this...
> > 
> > This is difficult for console drivers. They get called and are supposed to
> > print something and don't have the slightest clue which context they are
> > running in and if they are allowed to schedule.
> > This is the problem with e.g. s390's sclp driver. If there are no write
> > buffers available anymore it tries to allocate memory if schedule is allowed
> > or otherwise has to wait until finally a request finished and memory is
> > available again.
> > And now we have to always busy wait if we are out of buffers, since we
> > cannot tell which context we are in?
> 
> This is the reason why the drivers/usb/misc/sisusbvga driver is trying
> to test for in_atomic:
>         /* We can't handle console calls in non-schedulable
>          * context due to our locks and the USB transport.
>          * So we simply ignore them. This should only affect
>          * some calls to printk.
>          */
>         if (in_atomic())
>                 return NULL;
> 
> 
> So how should this be "fixed" if in_atomic() is not a valid test?

Well.  The kernel has traditionally assumed that console writes are atomic.

But we now have complex sleepy drivers acting as consoles.  Presumably this
means that large amounts of device driver code, page allocator code, etc
cannot have printks in them without going recursive.  Except printk itself
internally handles that, due to its need to be able to handle
printk-from-interrupt-when-this-cpu-is-already-running-printk.

The typical fix is for these console drivers to just assume that they
cannot sleep: pass GFP_ATOMIC down into the device driver code.  But I bet
the device driver code was designed assuming that it could sleep,
oops-bad-we-lose.

And it's not just sleep-in-spinlock.  If any of that device driver code
uses alloc_pages(GFP_KERNEL) then it can deadlock if we do a printk from
within the page allocator (and hence a lot of the block and storage layer).
Because in those code paths we must use GFP_NOFS or GFP_NOIO to allocate
memory.

So I think the right fix here is to switch those drivers to being
unconditionally atomic: don't schedule, don't take mutexes, don't use
__GFP_WAIT allocations.

They could of course be switched to using
kmalloc(GFP_ATOMIC)+memcpy()+schedule_task().  That's rather slow, but this
is not a performance-sensitive area.  But more seriously, this could lead
to messages getting lost from a dying machine.

One possibility would be to do current->called_for_console_output=1 and
then test that in various places.  But a) ugh and b) that's only useful for
memory allocations - it doesn't help if sleeping locks need to be taken.

Another possibility might be:

	if (current->called_for_console_output == false) {
		mutex_lock(lock);
	} else {
		if (!mutex_trylock(lock))
			return -EAGAIN;
	}

and then teach the console-calling code to requeue the message for later. 
But that's hard, because the straightforward implementation would result in
the output being queued for _all_ the currently active consoles, but some
of them might already have displayed this output - there's only one
log_buf.


An interesting problem ;)


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 19:59                       ` Andrew Morton
@ 2008-03-21 20:16                         ` Michael Buesch
  2008-03-21 20:20                           ` Michael Buesch
  0 siblings, 1 reply; 45+ messages in thread
From: Michael Buesch @ 2008-03-21 20:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg KH, heiko.carstens, stern, hmh, david-b, rpurdie,
	linux-kernel, mingo, geert, netdev, schwidefsky, linux-usb,
	linux-wireless, video4linux-list, stefanr, lm-sensors

On Friday 21 March 2008 20:59:50 Andrew Morton wrote:
> They could of course be switched to using
> kmalloc(GFP_ATOMIC)+memcpy()+schedule_task().  That's rather slow, but this
> is not a performance-sensitive area.  But more seriously, this could lead
> to messages getting lost from a dying machine.

Well, IMO drivers that need to sleep to transmit some data (to whatever,
the screen or something) are not useful for debugging a crashing kernel anyway.
Or how high is the possibility that it'd survive the actual sleep in the
memory allocation? I'd say almost zero.
So that schedule_task() is not that bad.

-- 
Greetings Michael.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 20:16                         ` Michael Buesch
@ 2008-03-21 20:20                           ` Michael Buesch
  0 siblings, 0 replies; 45+ messages in thread
From: Michael Buesch @ 2008-03-21 20:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg KH, heiko.carstens, stern, hmh, david-b, rpurdie,
	linux-kernel, mingo, geert, netdev, schwidefsky, linux-usb,
	linux-wireless, video4linux-list, stefanr, lm-sensors

On Friday 21 March 2008 21:16:48 Michael Buesch wrote:
> On Friday 21 March 2008 20:59:50 Andrew Morton wrote:
> > They could of course be switched to using
> > kmalloc(GFP_ATOMIC)+memcpy()+schedule_task().  That's rather slow, but this
> > is not a performance-sensitive area.  But more seriously, this could lead
> > to messages getting lost from a dying machine.
> 
> Well, IMO drivers that need to sleep to transmit some data (to whatever,
> the screen or something) are not useful for debugging a crashing kernel anyway.
> Or how high is the possibility that it'd survive the actual sleep in the
> memory allocation? I'd say almost zero.
> So that schedule_task() is not that bad.

and

transmit_data_func()
{
	if (!oops_in_progress) {
		schedule_transmission_for_later();
	} else {
		/* We crash anyway, so we don't care about
		 * possible deadlocks from memory alloc sleeps
		 * or whatever. */
		close_eyes_and_transmit_it_now();
	}
}


-- 
Greetings Michael.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 13:16                 ` Stefan Richter
@ 2008-03-22 11:29                   ` Stefan Richter
  0 siblings, 0 replies; 45+ messages in thread
From: Stefan Richter @ 2008-03-22 11:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Henrique de Moraes Holschuh, Andrew Morton, David Brownell,
	Richard Purdie, Ingo Molnar, Geert Uytterhoeven, netdev,
	Martin Schwidefsky, Heiko Carstens, linux-usb, linux-wireless,
	video4linux-list, lm-sensors

I wrote:
>>> and eth1394 to deal with temporary lack of of tlabels.  Alas I just  
>>> recently received a report that eth1394's workaround is unsuccessful 
>>> on  non-preemptible uniprocessor kernels.  
> (I haven't started working on a fix, or opened a bugzilla 
> ticket for it yet.  The reporter currently switched his kernel to 
> PREEMPT which is not affected.)

now logged as http://bugzilla.kernel.org/show_bug.cgi?id=10306

> The failure in the workaround is *not* about the in_atomic() being the 
> wrong question asked in hpsb_get_tlabel() --- no, ieee1394's in_atomic() 
> abuse works just fine even on UP PREEMPT_NONE.  Instead, the failure is 
> about kthreads not being scheduled in the way that I thought they would.
-- 
Stefan Richter
-=====-==--- --== =-==-
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 17:02                           ` Stefan Richter
@ 2008-03-23  5:53                             ` Tetsuo Handa
  0 siblings, 0 replies; 45+ messages in thread
From: Tetsuo Handa @ 2008-03-23  5:53 UTC (permalink / raw)
  To: stefanr; +Cc: akpm, linux-kernel

Hello.

> >> Is "if (in_atomic()) return;" check a correct method for avoiding 
> >> deadlocks when my code was accidentally called from a context
> >> where use of down()/mutex_lock()/kmalloc(GFP_KERNEL)/get_user_pages()/
> >> kmap() etc. are not permitted?
> 
> No.  Quoting Andrew:  "in_atomic() returns false inside spinlock on 
> non-preemptible kernels."

So, just "if (in_atomic()) return;" check is insufficient for detecting
all cases when it is not permitted to sleep. I see.

Is "in_atomic() returns false inside spinlock on non-preemptible kernels."
the only case that the in_atomic() can't tell whether it is permitted to sleep or not?
If this is the only case, can't we somehow know it by remembering
"how many spinlocks does this CPU is holding now" since
"it is not permitted to sleep inside the spinlock" means
"the CPU the current process is running will not change".
Something like

#ifdef CONFIG_COUNT_SPINLOCKS_HELD
atomic_t spinlock_held_counter[NR_CPUS];
#endif

void spin_lock(x)
{
	/* obtain this spinlock. */
#ifdef CONFIG_COUNT_SPINLOCKS_HELD
	/* increment spinlock_held_counter[this_CPU]. */
#endif
}

void spin_unlock(x)
{
#ifdef CONFIG_COUNT_SPINLOCKS_HELD
	/* decrement spinlock_held_counter[this_CPU]. */
#endif
	/* release this spinlock. */
}

bool in_spinlock()
{
#ifdef CONFIG_COUNT_SPINLOCKS_HELD
	/* return spinlock_held_counter[this_CPU] != 0. */
#else
	return false;
#endif
}

and use "if (in_atomic() || in_spinlock()) return;" instead of "if (in_atomic()) return;" ?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-21 18:05                           ` Alan Stern
@ 2008-03-24 19:34                             ` Jonathan Corbet
  2008-03-24 19:42                               ` Andrew Morton
  0 siblings, 1 reply; 45+ messages in thread
From: Jonathan Corbet @ 2008-03-24 19:34 UTC (permalink / raw)
  To: Alan Stern
  Cc: Jean Delvare, Michael Buesch, Henrique de Moraes Holschuh,
	David Brownell, Richard Purdie, Kernel development list,
	Ingo Molnar, Geert Uytterhoeven, akpm

Alan Stern <stern@rowland.harvard.edu> wrote:

> There's also a section about in_atomic() in the Linux Device Drivers 
> (3rd ed.) book which may have contributed to the confusion.

My fault (again).  Obviously it *looked* like something people could use
to me...  

How about the following patch as a short-term penance to keep others
from making the same mistake?

jon

--

Discourage people from using in_atomic()

Signed-off-by: Jonathan Corbet <corbet@lwn.net>

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 4982998..3d196cb 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -72,6 +72,11 @@
 #define in_softirq()		(softirq_count())
 #define in_interrupt()		(irq_count())
 
+/*
+ * Are we running in atomic context?  WARNING: this macro cannot
+ * always detect atomic context and should not be used to determine
+ * whether sleeping is possible.  Do not use it in driver code.
+ */
 #define in_atomic()		((preempt_count() & ~PREEMPT_ACTIVE) != 0)
 
 #ifdef CONFIG_PREEMPT


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-24 19:34                             ` Jonathan Corbet
@ 2008-03-24 19:42                               ` Andrew Morton
  2008-03-24 19:53                                 ` Jonathan Corbet
  0 siblings, 1 reply; 45+ messages in thread
From: Andrew Morton @ 2008-03-24 19:42 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: stern, khali, mb, hmh, david-b, rpurdie, linux-kernel, mingo, geert

On Mon, 24 Mar 2008 13:34:49 -0600
corbet@lwn.net (Jonathan Corbet) wrote:

> Discourage people from using in_atomic()
> 
> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
> 
> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index 4982998..3d196cb 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -72,6 +72,11 @@
>  #define in_softirq()		(softirq_count())
>  #define in_interrupt()		(irq_count())
>  
> +/*
> + * Are we running in atomic context?  WARNING: this macro cannot
> + * always detect atomic context and should not be used to determine
> + * whether sleeping is possible.  Do not use it in driver code.
> + */
>  #define in_atomic()		((preempt_count() & ~PREEMPT_ACTIVE) != 0)

It'd be better if the comment were to describe _why_ in_atomic() is
unreliable.  ie: "does not account for held spinlocks on non-preemptible
kernels".


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-24 19:42                               ` Andrew Morton
@ 2008-03-24 19:53                                 ` Jonathan Corbet
  2008-03-25  8:52                                   ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: Jonathan Corbet @ 2008-03-24 19:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: stern, khali, mb, hmh, david-b, rpurdie, linux-kernel, mingo, geert

Andrew Morton <akpm@linux-foundation.org> wrote:

> It'd be better if the comment were to describe _why_ in_atomic() is
> unreliable.  ie: "does not account for held spinlocks on non-preemptible
> kernels".

But then...why would anybody have a reason to read the upcoming LWN
article on the subject?  

OK, how's this?

jon

--

Discourage people from inappropriately using in_atomic()

Signed-off-by: Jonathan Corbet <corbet@lwn.net>

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 4982998..63a7782 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -72,6 +72,13 @@
 #define in_softirq()		(softirq_count())
 #define in_interrupt()		(irq_count())
 
+/*
+ * Are we running in atomic context?  WARNING: this macro cannot
+ * always detect atomic context; in particular, it cannot know about
+ * held spinlocks in non-preemptible kernels.  Thus it should not be
+ * used in the general case to determine whether sleeping is possible.
+ * Do not use in_atomic() in driver code.
+ */
 #define in_atomic()		((preempt_count() & ~PREEMPT_ACTIVE) != 0)
 
 #ifdef CONFIG_PREEMPT

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-24 19:53                                 ` Jonathan Corbet
@ 2008-03-25  8:52                                   ` Junio C Hamano
  2008-03-25 10:39                                     ` Jean Delvare
  2008-03-25 13:44                                     ` Jonathan Corbet
  0 siblings, 2 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-03-25  8:52 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Andrew Morton, stern, khali, mb, hmh, david-b, rpurdie,
	linux-kernel, mingo, geert

corbet@lwn.net (Jonathan Corbet) writes:

> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index 4982998..63a7782 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -72,6 +72,13 @@
>  #define in_softirq()		(softirq_count())
>  #define in_interrupt()		(irq_count())
>  
> +/*
> + * Are we running in atomic context?  WARNING: this macro cannot
> + * always detect atomic context; in particular, it cannot know about
> + * held spinlocks in non-preemptible kernels.  Thus it should not be
> + * used in the general case to determine whether sleeping is possible.
> + * Do not use in_atomic() in driver code.
> + */
>  #define in_atomic()		((preempt_count() & ~PREEMPT_ACTIVE) != 0)
>  
>  #ifdef CONFIG_PREEMPT

Is it just me who feels this comment that says "in_atomic() is not a way
to tell if we are in atomic reliably and cannot be used for such and such"
very reader-unfriendly?  Ok, maybe the macro is not reliable and is not
meant to be used for the purpose its name seems to suggest (at least to a
non-kernel person).  An inevitable question is, then what is it good for?
What's the right situation to use this macro?

I guess an additional comment "even if this says no, you could still be in
atomic, but if this says yes, then you definitely are in atomic and cannot
sleep" may help unconfuse a clueless reader like myself.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-25  8:52                                   ` Junio C Hamano
@ 2008-03-25 10:39                                     ` Jean Delvare
  2008-03-25 13:44                                     ` Jonathan Corbet
  1 sibling, 0 replies; 45+ messages in thread
From: Jean Delvare @ 2008-03-25 10:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Corbet, Andrew Morton, stern, mb, hmh, david-b, rpurdie,
	linux-kernel, mingo, geert

On Tue, 25 Mar 2008 01:52:58 -0700, Junio C Hamano wrote:
> corbet@lwn.net (Jonathan Corbet) writes:
> 
> > diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> > index 4982998..63a7782 100644
> > --- a/include/linux/hardirq.h
> > +++ b/include/linux/hardirq.h
> > @@ -72,6 +72,13 @@
> >  #define in_softirq()		(softirq_count())
> >  #define in_interrupt()		(irq_count())
> >  
> > +/*
> > + * Are we running in atomic context?  WARNING: this macro cannot
> > + * always detect atomic context; in particular, it cannot know about
> > + * held spinlocks in non-preemptible kernels.  Thus it should not be
> > + * used in the general case to determine whether sleeping is possible.
> > + * Do not use in_atomic() in driver code.
> > + */
> >  #define in_atomic()		((preempt_count() & ~PREEMPT_ACTIVE) != 0)
> >  
> >  #ifdef CONFIG_PREEMPT
> 
> Is it just me who feels this comment that says "in_atomic() is not a way
> to tell if we are in atomic reliably and cannot be used for such and such"
> very reader-unfriendly?  Ok, maybe the macro is not reliable and is not
> meant to be used for the purpose its name seems to suggest (at least to a
> non-kernel person).  An inevitable question is, then what is it good for?
> What's the right situation to use this macro?
> 
> I guess an additional comment "even if this says no, you could still be in
> atomic, but if this says yes, then you definitely are in atomic and cannot
> sleep" may help unconfuse a clueless reader like myself.

Andrew explained that in_atomic() could deadlock if called in a
condition where it is unreliable (although I did not understand the
details). Documenting that a "yes" from in_atomic() can always be
trusted, would invite driver authors to still use it, when my
understanding is that they still shouldn't.

If drivers shouldn't use in_atomic() at all then I think that the
long-term solution is to move its definition out of <linux/hardirq.h>.
But of course this means fixing all the drivers that still use it first.

-- 
Jean Delvare

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-25  8:52                                   ` Junio C Hamano
  2008-03-25 10:39                                     ` Jean Delvare
@ 2008-03-25 13:44                                     ` Jonathan Corbet
  2008-03-25 23:20                                       ` David Brownell
  1 sibling, 1 reply; 45+ messages in thread
From: Jonathan Corbet @ 2008-03-25 13:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Andrew Morton, stern, khali, mb, hmh, david-b, rpurdie,
	linux-kernel, mingo, geert

Junio C Hamano <gitster@pobox.com> wrote:

> Is it just me who feels this comment that says "in_atomic() is not a way
> to tell if we are in atomic reliably and cannot be used for such and such"
> very reader-unfriendly?  Ok, maybe the macro is not reliable and is not
> meant to be used for the purpose its name seems to suggest (at least to a
> non-kernel person).  An inevitable question is, then what is it good for?
> What's the right situation to use this macro?

The "right situation" would appear to be "you're deep in the mm code and
really know what you're doing."  It is not a useful way for code to
determine whether it's running in atomic context - as was discussed
elsewhere in the thread, that information really needs to be passed in
by the caller.
 
Look for more detail on LWN, probably later today :)

> I guess an additional comment "even if this says no, you could still be in
> atomic, but if this says yes, then you definitely are in atomic and cannot
> sleep" may help unconfuse a clueless reader like myself.

The point being that "you just *might* be in atomic context, where
sleeping would be a bad idea, but I can't tell you" really isn't all
that useful.  It's a trap which can only lead to incorrect code.

What really needs to happen, IMHO, is that this macro should be ripped
out of hardirq.h entirely and cleverly hidden somewhere.  That can't be
done, though, until the drivers which use it are fixed.  But while that
is happening, we can at least put up a skull-and-crossbones sign to
discourage others from making the same mistake.

jon

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-25 13:44                                     ` Jonathan Corbet
@ 2008-03-25 23:20                                       ` David Brownell
  2008-03-26 14:28                                         ` Alan Stern
  0 siblings, 1 reply; 45+ messages in thread
From: David Brownell @ 2008-03-25 23:20 UTC (permalink / raw)
  To: rpurdie
  Cc: gitster, corbet, stern, mingo, mb, linux-kernel, khali, hmh, geert, akpm

I _almost_ hate bringing this lovely flamage back onto $SUBJECT ... but
what's the resolution for the leds-gpio.c issue?  I've not seen a merge
notice for the patch I submitted a week ago now:

	http://marc.info/?l=linux-kernel&m=120597839009399&w=2

Just a "leaning..." comment:

	http://marc.info/?l=linux-kernel&m=120606104619198&w=2

Seems to me that by now there ought to be resolution on at least
one of the issues brought up on this thread.  :)

- Dave


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-25 23:20                                       ` David Brownell
@ 2008-03-26 14:28                                         ` Alan Stern
  2008-03-26 16:17                                           ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 45+ messages in thread
From: Alan Stern @ 2008-03-26 14:28 UTC (permalink / raw)
  To: David Brownell
  Cc: rpurdie, gitster, corbet, mingo, mb, linux-kernel, khali, hmh,
	geert, akpm

On Tue, 25 Mar 2008, David Brownell wrote:

> I _almost_ hate bringing this lovely flamage back onto $SUBJECT ... but
> what's the resolution for the leds-gpio.c issue?  I've not seen a merge
> notice for the patch I submitted a week ago now:
> 
> 	http://marc.info/?l=linux-kernel&m=120597839009399&w=2
> 
> Just a "leaning..." comment:
> 
> 	http://marc.info/?l=linux-kernel&m=120606104619198&w=2
> 
> Seems to me that by now there ought to be resolution on at least
> one of the issues brought up on this thread.  :)

Is it reasonable to have two version of that subroutine: one meant to 
be called in a sleepable context and the other to be called when 
sleeping isn't allowed?

Alan Stern


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-26 14:28                                         ` Alan Stern
@ 2008-03-26 16:17                                           ` Henrique de Moraes Holschuh
  2008-03-26 16:46                                             ` Richard Purdie
  2008-03-27 18:51                                             ` David Brownell
  0 siblings, 2 replies; 45+ messages in thread
From: Henrique de Moraes Holschuh @ 2008-03-26 16:17 UTC (permalink / raw)
  To: Alan Stern
  Cc: David Brownell, rpurdie, gitster, corbet, mingo, mb,
	linux-kernel, khali, geert, akpm

On Wed, 26 Mar 2008, Alan Stern wrote:
> On Tue, 25 Mar 2008, David Brownell wrote:
> > I _almost_ hate bringing this lovely flamage back onto $SUBJECT ... but
> > what's the resolution for the leds-gpio.c issue?  I've not seen a merge
> > notice for the patch I submitted a week ago now:
> > 
> > 	http://marc.info/?l=linux-kernel&m=120597839009399&w=2
> > 
> > Just a "leaning..." comment:
> > 
> > 	http://marc.info/?l=linux-kernel&m=120606104619198&w=2
> > 
> > Seems to me that by now there ought to be resolution on at least
> > one of the issues brought up on this thread.  :)
> 
> Is it reasonable to have two version of that subroutine: one meant to 
> be called in a sleepable context and the other to be called when 
> sleeping isn't allowed?

I have changed the thinkpad-acpi leds code to always assume an atomic
context, but I too would appreciate a proper flag (or secondary hook)
from the LED class to know when I am in an atomic context or not.

LED Triggers also need to be modified, they are mostly called from an
atomic context so we have to assume that by default, but we'd do well to
add a way to call them from non-atomic contexts.

Richard?  AFAIK, the ball *is* in your court as the LED maintainer.  You
have to decide which way to go and tell us.  I suppose either I or Alan
could write up patches to implement it, but I have better things to do
than to write patches that would be rejected anyway... OTOH, I don't
mind writing ones that I know are at least attempting to implement an
approved idea and would be rejected only if they need some fixing.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-26 16:17                                           ` Henrique de Moraes Holschuh
@ 2008-03-26 16:46                                             ` Richard Purdie
  2008-03-27 18:51                                             ` David Brownell
  1 sibling, 0 replies; 45+ messages in thread
From: Richard Purdie @ 2008-03-26 16:46 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Alan Stern, David Brownell, gitster, corbet, mingo, mb,
	linux-kernel, khali, geert, akpm


On Wed, 2008-03-26 at 13:17 -0300, Henrique de Moraes Holschuh wrote:
> On Wed, 26 Mar 2008, Alan Stern wrote:
> > On Tue, 25 Mar 2008, David Brownell wrote:
> > > I _almost_ hate bringing this lovely flamage back onto $SUBJECT ... but
> > > what's the resolution for the leds-gpio.c issue?  I've not seen a merge
> > > notice for the patch I submitted a week ago now:
> > > 
> > > 	http://marc.info/?l=linux-kernel&m=120597839009399&w=2
>
[...]
> I have changed the thinkpad-acpi leds code to always assume an atomic
> context, but I too would appreciate a proper flag (or secondary hook)
> from the LED class to know when I am in an atomic context or not.
> 
> LED Triggers also need to be modified, they are mostly called from an
> atomic context so we have to assume that by default, but we'd do well to
> add a way to call them from non-atomic contexts.
> 
> Richard?  AFAIK, the ball *is* in your court as the LED maintainer.  You
> have to decide which way to go and tell us.  I suppose either I or Alan
> could write up patches to implement it, but I have better things to do
> than to write patches that would be rejected anyway... OTOH, I don't
> mind writing ones that I know are at least attempting to implement an
> approved idea and would be rejected only if they need some fixing.

I've been meaning to merge David's patch and will get that done shortly,
sorry for the delay.

As I've said, I don't really think we need context flags for the LED
class at the moment. As you say, most of the triggers are atomic context
and those which can sleep are not time critical so I don't really see
the point in the added complexity.

Cheers,

Richard



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: use of preempt_count instead of in_atomic() at leds-gpio.c
  2008-03-26 16:17                                           ` Henrique de Moraes Holschuh
  2008-03-26 16:46                                             ` Richard Purdie
@ 2008-03-27 18:51                                             ` David Brownell
  1 sibling, 0 replies; 45+ messages in thread
From: David Brownell @ 2008-03-27 18:51 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh, Alan Stern, rpurdie
  Cc: gitster, corbet, mingo, mb, linux-kernel, khali, geert, akpm

On Wednesday 26 March 2008, Henrique de Moraes Holschuh wrote:
> On Wed, 26 Mar 2008, Alan Stern wrote:
> > On Tue, 25 Mar 2008, David Brownell wrote:
> > > I _almost_ hate bringing this lovely flamage back onto $SUBJECT ... but
> > > what's the resolution for the leds-gpio.c issue?  I've not seen a merge
> > > notice for the patch I submitted a week ago now:
> > > 
> > > 	http://marc.info/?l=linux-kernel&m=120597839009399&w=2
> > > 
> > > Just a "leaning..." comment:
> > > 
> > > 	http://marc.info/?l=linux-kernel&m=120606104619198&w=2
> > > 
> > > Seems to me that by now there ought to be resolution on at least
> > > one of the issues brought up on this thread.  :)
> > 
> > Is it reasonable to have two version of that subroutine: one meant to 
> > be called in a sleepable context and the other to be called when 
> > sleeping isn't allowed?

Not before 2.6.25 ships it isn't.  :)


> I have changed the thinkpad-acpi leds code to always assume an atomic
> context, but I too would appreciate a proper flag (or secondary hook)
> from the LED class to know when I am in an atomic context or not.
> 
> LED Triggers also need to be modified, they are mostly called from an
> atomic context so we have to assume that by default, but we'd do well to
> add a way to call them from non-atomic contexts.
> 
> Richard?  AFAIK, the ball *is* in your court as the LED maintainer.  You
> have to decide which way to go and tell us.

Presumably, both near-term and long-term solutions are needed.

I'd suggest merging the leds-gpio and thinkpad-acpi fixes
before 2.6.25 ships, and then *maybe* adopting something
more invasive.

- Dave

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2008-03-27 19:04 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-16 18:43 use of preempt_count instead of in_atomic() at leds-gpio.c Henrique de Moraes Holschuh
2008-03-16 19:46 ` David Brownell
2008-03-18  7:14   ` Andrew Morton
2008-03-18 19:06     ` David Brownell
2008-03-18 20:07       ` Andrew Morton
2008-03-20 22:56     ` Henrique de Moraes Holschuh
2008-03-20 23:47       ` Andrew Morton
2008-03-21  0:36         ` Henrique de Moraes Holschuh
2008-03-21  1:08           ` Andrew Morton
2008-03-21  1:31             ` Alan Stern
2008-03-21  1:36               ` Michael Buesch
2008-03-21  2:27                 ` Andrew Morton
2008-03-21  3:07                   ` Alan Stern
2008-03-21  3:17                     ` Andrew Morton
2008-03-21  9:53                       ` Jean Delvare
2008-03-21 17:37                         ` Andrew Morton
2008-03-21 18:05                           ` Alan Stern
2008-03-24 19:34                             ` Jonathan Corbet
2008-03-24 19:42                               ` Andrew Morton
2008-03-24 19:53                                 ` Jonathan Corbet
2008-03-25  8:52                                   ` Junio C Hamano
2008-03-25 10:39                                     ` Jean Delvare
2008-03-25 13:44                                     ` Jonathan Corbet
2008-03-25 23:20                                       ` David Brownell
2008-03-26 14:28                                         ` Alan Stern
2008-03-26 16:17                                           ` Henrique de Moraes Holschuh
2008-03-26 16:46                                             ` Richard Purdie
2008-03-27 18:51                                             ` David Brownell
2008-03-21 15:11                       ` Tetsuo Handa
2008-03-21 16:54                         ` Stefan Richter
2008-03-21 17:02                           ` Stefan Richter
2008-03-23  5:53                             ` Tetsuo Handa
2008-03-21 13:47                   ` Heiko Carstens
2008-03-21 16:54                     ` Greg KH
2008-03-21 19:59                       ` Andrew Morton
2008-03-21 20:16                         ` Michael Buesch
2008-03-21 20:20                           ` Michael Buesch
2008-03-21  9:21             ` Stefan Richter
2008-03-21  9:27               ` Stefan Richter
2008-03-21 12:37               ` Henrique de Moraes Holschuh
2008-03-21 13:16                 ` Stefan Richter
2008-03-22 11:29                   ` Stefan Richter
2008-03-21 17:04             ` David Brownell
2008-03-21  0:56         ` Richard Purdie
2008-03-21  2:10           ` Henrique de Moraes Holschuh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).