All of lore.kernel.org
 help / color / mirror / Atom feed
* acpi_evaluate_integer broken by design
@ 2008-11-25 11:05 Pavel Machek
  2008-11-25 18:41 ` Moore, Robert
  2008-11-26 21:20 ` Andrew Morton
  0 siblings, 2 replies; 19+ messages in thread
From: Pavel Machek @ 2008-11-25 11:05 UTC (permalink / raw)
  To: kernel list, ACPI mailing list, Len Brown


Now I know why I had strange "scheduling in atomic" problems:
acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
: GFP_KERNEL)... which is (of course) broken.

There's no way to reliably tell if we need GFP_ATOMIC or not from
code, this one for example fails to detect spinlocks held.

Fortunately, allocation seems small enough to be done on stack.

Signed-off-by: Pavel Machek <pavel@suse.cz>

diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
index e827be3..f844941 100644
--- a/drivers/acpi/utils.c
+++ b/drivers/acpi/utils.c
@@ -259,34 +259,26 @@ acpi_evaluate_integer(acpi_handle handle
 		      struct acpi_object_list *arguments, unsigned long long *data)
 {
 	acpi_status status = AE_OK;
-	union acpi_object *element;
+	union acpi_object element;
 	struct acpi_buffer buffer = { 0, NULL };
 
-
 	if (!data)
 		return AE_BAD_PARAMETER;
 
-	element = kzalloc(sizeof(union acpi_object), irqs_disabled() ? GFP_ATOMIC: GFP_KERNEL);
-	if (!element)
-		return AE_NO_MEMORY;
-
 	buffer.length = sizeof(union acpi_object);
-	buffer.pointer = element;
+	buffer.pointer = &element;
 	status = acpi_evaluate_object(handle, pathname, arguments, &buffer);
 	if (ACPI_FAILURE(status)) {
 		acpi_util_eval_error(handle, pathname, status);
-		kfree(element);
 		return status;
 	}
 
-	if (element->type != ACPI_TYPE_INTEGER) {
+	if (element.type != ACPI_TYPE_INTEGER) {
 		acpi_util_eval_error(handle, pathname, AE_BAD_DATA);
-		kfree(element);
 		return AE_BAD_DATA;
 	}
 
-	*data = element->integer.value;
-	kfree(element);
+	*data = element.integer.value;
 
 	ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Return value [%llu]\n", *data));
 



-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* RE: acpi_evaluate_integer broken by design
  2008-11-25 11:05 acpi_evaluate_integer broken by design Pavel Machek
@ 2008-11-25 18:41 ` Moore, Robert
  2008-11-26 21:20 ` Andrew Morton
  1 sibling, 0 replies; 19+ messages in thread
From: Moore, Robert @ 2008-11-25 18:41 UTC (permalink / raw)
  To: Pavel Machek, kernel list, ACPI mailing list, Brown, Len

I agree, it should be allocated on the stack, like it is most everywhere else.


>-----Original Message-----
>From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-
>owner@vger.kernel.org] On Behalf Of Pavel Machek
>Sent: Tuesday, November 25, 2008 3:05 AM
>To: kernel list; ACPI mailing list; Brown, Len
>Subject: acpi_evaluate_integer broken by design
>
>
>Now I know why I had strange "scheduling in atomic" problems:
>acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
>: GFP_KERNEL)... which is (of course) broken.
>
>There's no way to reliably tell if we need GFP_ATOMIC or not from
>code, this one for example fails to detect spinlocks held.
>
>Fortunately, allocation seems small enough to be done on stack.
>
>Signed-off-by: Pavel Machek <pavel@suse.cz>
>
>diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
>index e827be3..f844941 100644
>--- a/drivers/acpi/utils.c
>+++ b/drivers/acpi/utils.c
>@@ -259,34 +259,26 @@ acpi_evaluate_integer(acpi_handle handle
> 		      struct acpi_object_list *arguments, unsigned long long
>*data)
> {
> 	acpi_status status = AE_OK;
>-	union acpi_object *element;
>+	union acpi_object element;
> 	struct acpi_buffer buffer = { 0, NULL };
>
>-
> 	if (!data)
> 		return AE_BAD_PARAMETER;
>
>-	element = kzalloc(sizeof(union acpi_object), irqs_disabled() ?
>GFP_ATOMIC: GFP_KERNEL);
>-	if (!element)
>-		return AE_NO_MEMORY;
>-
> 	buffer.length = sizeof(union acpi_object);
>-	buffer.pointer = element;
>+	buffer.pointer = &element;
> 	status = acpi_evaluate_object(handle, pathname, arguments, &buffer);
> 	if (ACPI_FAILURE(status)) {
> 		acpi_util_eval_error(handle, pathname, status);
>-		kfree(element);
> 		return status;
> 	}
>
>-	if (element->type != ACPI_TYPE_INTEGER) {
>+	if (element.type != ACPI_TYPE_INTEGER) {
> 		acpi_util_eval_error(handle, pathname, AE_BAD_DATA);
>-		kfree(element);
> 		return AE_BAD_DATA;
> 	}
>
>-	*data = element->integer.value;
>-	kfree(element);
>+	*data = element.integer.value;
>
> 	ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Return value [%llu]\n", *data));
>
>
>
>
>--
>(english) http://www.livejournal.com/~pavelmachek
>(cesky, pictures)
>http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
>--
>To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_evaluate_integer broken by design
  2008-11-25 11:05 acpi_evaluate_integer broken by design Pavel Machek
  2008-11-25 18:41 ` Moore, Robert
@ 2008-11-26 21:20 ` Andrew Morton
  2008-11-26 22:37   ` Len Brown
  2008-11-27 13:56   ` acpi_evaluate_integer broken by design Pavel Machek
  1 sibling, 2 replies; 19+ messages in thread
From: Andrew Morton @ 2008-11-26 21:20 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel, linux-acpi, len.brown

On Tue, 25 Nov 2008 12:05:08 +0100
Pavel Machek <pavel@suse.cz> wrote:

> Subject: acpi_evaluate_integer broken by design

Please don't give patches daft and not very meaningful titles.  It just
means that someone else has to invent a useful title and then we get
the same patch floating about with two different titles.

> Date: 	Tue, 25 Nov 2008 12:05:08 +0100
> Sender: linux-kernel-owner@vger.kernel.org
> User-Agent: Mutt/1.5.18 (2008-05-17)
>
> 
> Now I know why I had strange "scheduling in atomic" problems:
> acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
> : GFP_KERNEL)... which is (of course) broken.

That is kinda weird.  When did this all start happening?

> There's no way to reliably tell if we need GFP_ATOMIC or not from
> code, this one for example fails to detect spinlocks held.
> 
> Fortunately, allocation seems small enough to be done on stack.

It's 24 bytes in an x86_64 allmodconfig build.  Clearly it should be a
local.

> diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
> index e827be3..f844941 100644
> --- a/drivers/acpi/utils.c
> +++ b/drivers/acpi/utils.c
> @@ -259,34 +259,26 @@ acpi_evaluate_integer(acpi_handle handle
>  		      struct acpi_object_list *arguments, unsigned long long *data)
>  {
>  	acpi_status status = AE_OK;
> -	union acpi_object *element;
> +	union acpi_object element;
>  	struct acpi_buffer buffer = { 0, NULL };
>  
> -
>  	if (!data)
>  		return AE_BAD_PARAMETER;
>  
> -	element = kzalloc(sizeof(union acpi_object), irqs_disabled() ? GFP_ATOMIC: GFP_KERNEL);
> -	if (!element)
> -		return AE_NO_MEMORY;
> -
>  	buffer.length = sizeof(union acpi_object);
> -	buffer.pointer = element;
> +	buffer.pointer = &element;
>  	status = acpi_evaluate_object(handle, pathname, arguments, &buffer);
>  	if (ACPI_FAILURE(status)) {
>  		acpi_util_eval_error(handle, pathname, status);
> -		kfree(element);
>  		return status;
>  	}
>  
> -	if (element->type != ACPI_TYPE_INTEGER) {
> +	if (element.type != ACPI_TYPE_INTEGER) {
>  		acpi_util_eval_error(handle, pathname, AE_BAD_DATA);
> -		kfree(element);
>  		return AE_BAD_DATA;
>  	}
>  
> -	*data = element->integer.value;
> -	kfree(element);
> +	*data = element.integer.value;
>  
>  	ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Return value [%llu]\n", *data));

Len, this looks like 2.6.28 material.  But given the poor quality of
the changelog it is hard to be sure about this.  Why isn't everyone
seeing these warnings?  What did Pavel do to provoke these alleged
warnings?  Nobody knows...




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_evaluate_integer broken by design
  2008-11-26 21:20 ` Andrew Morton
@ 2008-11-26 22:37   ` Len Brown
  2008-11-26 23:02     ` Andrew Morton
  2008-11-27 13:56   ` acpi_evaluate_integer broken by design Pavel Machek
  1 sibling, 1 reply; 19+ messages in thread
From: Len Brown @ 2008-11-26 22:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Pavel Machek, Linux Kernel Mailing List, linux-acpi



> > Now I know why I had strange "scheduling in atomic" problems:
> > acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
> > : GFP_KERNEL)... which is (of course) broken.
> 
> That is kinda weird.  When did this all start happening?

> > There's no way to reliably tell if we need GFP_ATOMIC or not from
> > code, this one for example fails to detect spinlocks held.

> Len, this looks like 2.6.28 material.  But given the poor quality of
> the changelog it is hard to be sure about this.  Why isn't everyone
> seeing these warnings?  What did Pavel do to provoke these alleged
> warnings?  Nobody knows...

I don't know know why pavel sees this and nobody else --
maybe something unusual he's doing with suspend?

The reason that the ACPI code is littered with bogus
irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL)
is because, like boot, resume starts life with interrupts off.

I would prefer that resume and boot handle this the same way,
with system_state.  However, a few years ago when I suggested
using system_state for resume, Andrew thought that was a very
bad idea.  Andrew, do you still feel that way?

-Len

ps. I'll put this particular fix in my tree now.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_evaluate_integer broken by design
  2008-11-26 22:37   ` Len Brown
@ 2008-11-26 23:02     ` Andrew Morton
  2008-11-27 13:58       ` Pavel Machek
  2008-12-16  5:24       ` acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design) Len Brown
  0 siblings, 2 replies; 19+ messages in thread
From: Andrew Morton @ 2008-11-26 23:02 UTC (permalink / raw)
  To: Len Brown; +Cc: pavel, linux-kernel, linux-acpi

On Wed, 26 Nov 2008 17:37:29 -0500 (EST)
Len Brown <lenb@kernel.org> wrote:

> 
> 
> > > Now I know why I had strange "scheduling in atomic" problems:
> > > acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
> > > : GFP_KERNEL)... which is (of course) broken.
> > 
> > That is kinda weird.  When did this all start happening?
> 
> > > There's no way to reliably tell if we need GFP_ATOMIC or not from
> > > code, this one for example fails to detect spinlocks held.
> 
> > Len, this looks like 2.6.28 material.  But given the poor quality of
> > the changelog it is hard to be sure about this.  Why isn't everyone
> > seeing these warnings?  What did Pavel do to provoke these alleged
> > warnings?  Nobody knows...
> 
> I don't know know why pavel sees this and nobody else --
> maybe something unusual he's doing with suspend?
> 
> The reason that the ACPI code is littered with bogus
> irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL)
> is because, like boot, resume starts life with interrupts off.

yes, suspend's disabling of interrupts causes problems all over the place.

If we really do need to inspect global state to handle this, it would
be much better to create a new

bool resume_is_running_so_you_cant_sleep;

and to test that.  Something which is clear, highly specific and which
cannot be accidentally triggered via other means.

> I would prefer that resume and boot handle this the same way,
> with system_state.  However, a few years ago when I suggested
> using system_state for resume, Andrew thought that was a very
> bad idea.  Andrew, do you still feel that way?

yep.  We've had problems in the past with system_state, where people add
new states, and old code breaks.  Plus as we add more dependencies on
system_state, that reduces our ability to add new states and makes it
harder to alter (ie: fix) system_state transitions, etc.  Just run
away!

As I said above, if we have a piece of code which wants to know when a
separate piece of code is in a particualr state, it is better to add a
new state flag just for that application, rather than trying to infer
things from the heavily overloaded system_state.


Obviously, it is even better to do neither.  It is a basic and
oft-reoccurring design mistake for a low-level piece of code to
hardwire its GFP_foo decision.  The gfp_t should be passed in from the
callers, all the way down the chain from the top-level code which
actually knows what state this thread of control is in.

> -Len
> 
> ps. I'll put this particular fix in my tree now.

bewdy.  It's a good change.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_evaluate_integer broken by design
  2008-11-26 21:20 ` Andrew Morton
  2008-11-26 22:37   ` Len Brown
@ 2008-11-27 13:56   ` Pavel Machek
  1 sibling, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2008-11-27 13:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-acpi, len.brown

Hi!

> > Subject: acpi_evaluate_integer broken by design
> 
> Please don't give patches daft and not very meaningful titles.  It just
> means that someone else has to invent a useful title and then we get
> the same patch floating about with two different titles.

Sorry. I was kind of angry at that code because it does not contain
any remarks when it is okay to call it, etc.

> > Date: 	Tue, 25 Nov 2008 12:05:08 +0100
> > Sender: linux-kernel-owner@vger.kernel.org
> > User-Agent: Mutt/1.5.18 (2008-05-17)
> >
> > 
> > Now I know why I had strange "scheduling in atomic" problems:
> > acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
> > : GFP_KERNEL)... which is (of course) broken.
> 
> That is kinda weird.  When did this all start happening?

With the HP LED driver... so this is "would be nice for 2.6.28, not
worth backporting" material.

> > There's no way to reliably tell if we need GFP_ATOMIC or not from
> > code, this one for example fails to detect spinlocks held.
> > 
> > Fortunately, allocation seems small enough to be done on stack.
> 
> It's 24 bytes in an x86_64 allmodconfig build.  Clearly it should be a
> local.

Yep.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_evaluate_integer broken by design
  2008-11-26 23:02     ` Andrew Morton
@ 2008-11-27 13:58       ` Pavel Machek
  2008-12-16  5:24       ` acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design) Len Brown
  1 sibling, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2008-11-27 13:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Len Brown, linux-kernel, linux-acpi, Rafael J. Wysocki

Hi!

> > > Len, this looks like 2.6.28 material.  But given the poor quality of
> > > the changelog it is hard to be sure about this.  Why isn't everyone
> > > seeing these warnings?  What did Pavel do to provoke these alleged
> > > warnings?  Nobody knows...
> > 
> > I don't know know why pavel sees this and nobody else --
> > maybe something unusual he's doing with suspend?

No, playing with driver for HP LED.

Now... I guess the test confused me. Because it tests for
irqs_disabled(), I assumed it was okay to call from interrupt context,
but now I'm not sure at all. Can you help? Comment would be nice.

> If we really do need to inspect global state to handle this, it would
> be much better to create a new
> 
> bool resume_is_running_so_you_cant_sleep;
> 
> and to test that.  Something which is clear, highly specific and which
> cannot be accidentally triggered via other means.

Works for me.
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design)
  2008-11-26 23:02     ` Andrew Morton
  2008-11-27 13:58       ` Pavel Machek
@ 2008-12-16  5:24       ` Len Brown
  2008-12-16  5:41         ` Andrew Morton
  1 sibling, 1 reply; 19+ messages in thread
From: Len Brown @ 2008-12-16  5:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: pavel, Rafael J. Wysocki, Linux Kernel Mailing List, linux-acpi


> > The reason that the ACPI code is littered with bogus
> > irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL)
> > is because, like boot, resume starts life with interrupts off.
> 
> yes, suspend's disabling of interrupts causes problems all over the place.
> 
> If we really do need to inspect global state to handle this, it would
> be much better to create a new
> 
> bool resume_is_running_so_you_cant_sleep;
>
> and to test that.  Something which is clear, highly specific and which
> cannot be accidentally triggered via other means.

can do.

> > I would prefer that resume and boot handle this the same way,
> > with system_state.  However, a few years ago when I suggested
> > using system_state for resume, Andrew thought that was a very
> > bad idea.  Andrew, do you still feel that way?
> 
> yep.  We've had problems in the past with system_state, where people add
> new states, and old code breaks.  Plus as we add more dependencies on
> system_state, that reduces our ability to add new states and makes it
> harder to alter (ie: fix) system_state transitions, etc.  Just run
> away!
> 
> As I said above, if we have a piece of code which wants to know when a
> separate piece of code is in a particualr state, it is better to add a
> new state flag just for that application, rather than trying to infer
> things from the heavily overloaded system_state.
> 
> 
> Obviously, it is even better to do neither.  It is a basic and
> oft-reoccurring design mistake for a low-level piece of code to
> hardwire its GFP_foo decision.  The gfp_t should be passed in from the
> callers, all the way down the chain from the top-level code which
> actually knows what state this thread of control is in.

Here the caller does not know.

The crux of the problem it that the AML interpreter
may need to allocate memory and thus may sleep.
Under nominal conditions the interpreter only runs in 
user-context and allocatges with GFP_KERNEL
so sleeping is just dandy.

But AML also needs to run at boot time and at resume time
to do things like configure interrupts before interrupts are
enabled...

boot-time is handled by _cond_resched()
not calling __cond_resched unless
system_state == SYSTEM_RUNNING

I suppose I could do something like this,
though I'm not sure of the prettiest place
to check system_resume_irqsoff == true,
so that is left out for now...

-- Len Brown, Intel Open Source Technology Center

diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
index e52ad91..def91fa 100644
--- a/drivers/acpi/pci_link.c
+++ b/drivers/acpi/pci_link.c
@@ -320,7 +320,7 @@ static int acpi_pci_link_set(struct acpi_pci_link *link, int irq)
 	if (!link || !irq)
 		return -EINVAL;
 
-	resource = kzalloc(sizeof(*resource) + 1, irqs_disabled() ? GFP_ATOMIC: GFP_KERNEL);
+	resource = kzalloc(sizeof(*resource) + 1, GFP_KERNEL);
 	if (!resource)
 		return -ENOMEM;
 
diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c
index bb7d50d..27243f9 100644
--- a/drivers/acpi/power.c
+++ b/drivers/acpi/power.c
@@ -218,8 +218,7 @@ static int acpi_power_on(acpi_handle handle, struct acpi_device *dev)
 	}
 
 	if (!found) {
-		ref = kmalloc(sizeof (struct acpi_power_reference),
-		    irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL);
+		ref = kmalloc(sizeof (struct acpi_power_reference), GFP_KERNEL);
 		if (!ref) {
 			ACPI_DEBUG_PRINT((ACPI_DB_INFO, "kmalloc() failed\n"));
 			mutex_unlock(&resource->resource_lock);
diff --git a/include/acpi/platform/aclinux.h b/include/acpi/platform/aclinux.h
index 029c8c0..9fbf73f 100644
--- a/include/acpi/platform/aclinux.h
+++ b/include/acpi/platform/aclinux.h
@@ -121,17 +121,16 @@ static inline acpi_thread_id acpi_os_get_thread_id(void)
 #include <acpi/actypes.h>
 static inline void *acpi_os_allocate(acpi_size size)
 {
-	return kmalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
+	return kmalloc(size, GFP_KERNEL);
 }
 static inline void *acpi_os_allocate_zeroed(acpi_size size)
 {
-	return kzalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
+	return kzalloc(size, GFP_KERNEL);
 }
 
 static inline void *acpi_os_acquire_object(acpi_cache_t * cache)
 {
-	return kmem_cache_zalloc(cache,
-				 irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
+	return kmem_cache_zalloc(cache, GFP_KERNEL);
 }
 
 #define ACPI_ALLOCATE(a)	acpi_os_allocate(a)
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 2ce8207..a716de8 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -128,9 +128,16 @@ extern void arch_suspend_disable_irqs(void);
  */
 extern void arch_suspend_enable_irqs(void);
 
+/**
+ * suspend_irqs_off
+ *
+ * flag to indicate that IRQs are off for the benefit of suspend
+ */
+extern bool system_suspend_irqs_off;
 extern int pm_suspend(suspend_state_t state);
 #else /* !CONFIG_SUSPEND */
 #define suspend_valid_only_mem	NULL
+#define system_suspend_irqs_off false
 
 static inline void suspend_set_ops(struct platform_suspend_ops *ops) {}
 static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; }
diff --git a/kernel/power/main.c b/kernel/power/main.c
index b8f7ce9..2eadae7 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -268,16 +268,20 @@ static int suspend_prepare(void)
 	return error;
 }
 
+bool system_resume_irqsoff;	/* IRQs are off for resume */
+
 /* default implementation */
 void __attribute__ ((weak)) arch_suspend_disable_irqs(void)
 {
 	local_irq_disable();
+	system_resume_irqsoff = true;
 }
 
 /* default implementation */
 void __attribute__ ((weak)) arch_suspend_enable_irqs(void)
 {
 	local_irq_enable();
+	system_resume_irqsoff = false;
 }
 
 /**

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design)
  2008-12-16  5:24       ` acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design) Len Brown
@ 2008-12-16  5:41         ` Andrew Morton
  2008-12-16 15:34           ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2008-12-16  5:41 UTC (permalink / raw)
  To: Len Brown; +Cc: pavel, Rafael J. Wysocki, Linux Kernel Mailing List, linux-acpi

On Tue, 16 Dec 2008 00:24:43 -0500 (EST) Len Brown <lenb@kernel.org> wrote:

> > Obviously, it is even better to do neither.  It is a basic and
> > oft-reoccurring design mistake for a low-level piece of code to
> > hardwire its GFP_foo decision.  The gfp_t should be passed in from the
> > callers, all the way down the chain from the top-level code which
> > actually knows what state this thread of control is in.
> 
> Here the caller does not know.

I bet it does it's just that the caller is thirty five levels of
function call higher up...

> The crux of the problem it that the AML interpreter
> may need to allocate memory and thus may sleep.
> Under nominal conditions the interpreter only runs in 
> user-context and allocatges with GFP_KERNEL
> so sleeping is just dandy.
> 
> But AML also needs to run at boot time and at resume time
> to do things like configure interrupts before interrupts are
> enabled...
> 
> boot-time is handled by _cond_resched()
> not calling __cond_resched unless
> system_state == SYSTEM_RUNNING
> 
> I suppose I could do something like this,
> though I'm not sure of the prettiest place
> to check system_resume_irqsoff == true,
> so that is left out for now...
> 
>
> ...
> 
> diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
> index e52ad91..def91fa 100644
> --- a/drivers/acpi/pci_link.c
> +++ b/drivers/acpi/pci_link.c
> @@ -320,7 +320,7 @@ static int acpi_pci_link_set(struct acpi_pci_link *link, int irq)
>  	if (!link || !irq)
>  		return -EINVAL;
>  
> -	resource = kzalloc(sizeof(*resource) + 1, irqs_disabled() ? GFP_ATOMIC: GFP_KERNEL);
> +	resource = kzalloc(sizeof(*resource) + 1, GFP_KERNEL);
>  	if (!resource)
>  		return -ENOMEM;
>  
> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c
> index bb7d50d..27243f9 100644
> --- a/drivers/acpi/power.c
> +++ b/drivers/acpi/power.c
> @@ -218,8 +218,7 @@ static int acpi_power_on(acpi_handle handle, struct acpi_device *dev)
>  	}
>  
>  	if (!found) {
> -		ref = kmalloc(sizeof (struct acpi_power_reference),
> -		    irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL);
> +		ref = kmalloc(sizeof (struct acpi_power_reference), GFP_KERNEL);
>  		if (!ref) {
>  			ACPI_DEBUG_PRINT((ACPI_DB_INFO, "kmalloc() failed\n"));
>  			mutex_unlock(&resource->resource_lock);
> diff --git a/include/acpi/platform/aclinux.h b/include/acpi/platform/aclinux.h
> index 029c8c0..9fbf73f 100644
> --- a/include/acpi/platform/aclinux.h
> +++ b/include/acpi/platform/aclinux.h
> @@ -121,17 +121,16 @@ static inline acpi_thread_id acpi_os_get_thread_id(void)
>  #include <acpi/actypes.h>
>  static inline void *acpi_os_allocate(acpi_size size)
>  {
> -	return kmalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
> +	return kmalloc(size, GFP_KERNEL);
>  }
>  static inline void *acpi_os_allocate_zeroed(acpi_size size)
>  {
> -	return kzalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
> +	return kzalloc(size, GFP_KERNEL);
>  }
>  
>  static inline void *acpi_os_acquire_object(acpi_cache_t * cache)
>  {
> -	return kmem_cache_zalloc(cache,
> -				 irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
> +	return kmem_cache_zalloc(cache, GFP_KERNEL);
>  }
>  
>  #define ACPI_ALLOCATE(a)	acpi_os_allocate(a)
> diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> index 2ce8207..a716de8 100644
> --- a/include/linux/suspend.h
> +++ b/include/linux/suspend.h
> @@ -128,9 +128,16 @@ extern void arch_suspend_disable_irqs(void);
>   */
>  extern void arch_suspend_enable_irqs(void);
>  
> +/**
> + * suspend_irqs_off
> + *
> + * flag to indicate that IRQs are off for the benefit of suspend
> + */
> +extern bool system_suspend_irqs_off;
>  extern int pm_suspend(suspend_state_t state);
>  #else /* !CONFIG_SUSPEND */
>  #define suspend_valid_only_mem	NULL
> +#define system_suspend_irqs_off false
>  
>  static inline void suspend_set_ops(struct platform_suspend_ops *ops) {}
>  static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; }
> diff --git a/kernel/power/main.c b/kernel/power/main.c
> index b8f7ce9..2eadae7 100644
> --- a/kernel/power/main.c
> +++ b/kernel/power/main.c
> @@ -268,16 +268,20 @@ static int suspend_prepare(void)
>  	return error;
>  }
>  
> +bool system_resume_irqsoff;	/* IRQs are off for resume */

I guess that was supposed to be "system_suspend_irqs_off"?

>  /* default implementation */
>  void __attribute__ ((weak)) arch_suspend_disable_irqs(void)
>  {
>  	local_irq_disable();
> +	system_resume_irqsoff = true;
>  }
>  
>  /* default implementation */
>  void __attribute__ ((weak)) arch_suspend_enable_irqs(void)
>  {
>  	local_irq_enable();
> +	system_resume_irqsoff = false;
>  }

(please use __weak)

Whether it's "system_resume_irqsoff" or "system_suspend_irqs_off",
neither of them actually got used ;)

If the code is really this simple and localised then perhaps:

gfp_t acpi_aml_gfp_flags = GFP_ATOMIC;

void __weak arch_suspend_disable_irqs(void)
{
	local_irq_disable();
	acpi_aml_gfp_flags = GFP_ATOMIC;
}

/* default implementation */
void __weak arch_suspend_enable_irqs(void)
{
	local_irq_enable();
	acpi_aml_gfp_flags = GFP_KERNEL;
}

would suffice.  Dunno.

Why are we providing for an arch override of this?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design)
  2008-12-16  5:41         ` Andrew Morton
@ 2008-12-16 15:34           ` Rafael J. Wysocki
  2008-12-16 23:40             ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2008-12-16 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Len Brown, pavel, Linux Kernel Mailing List, linux-acpi, Johannes Berg

On Tuesday, 16 of December 2008, Andrew Morton wrote:
> On Tue, 16 Dec 2008 00:24:43 -0500 (EST) Len Brown <lenb@kernel.org> wrote:
> 
> > > Obviously, it is even better to do neither.  It is a basic and
> > > oft-reoccurring design mistake for a low-level piece of code to
> > > hardwire its GFP_foo decision.  The gfp_t should be passed in from the
> > > callers, all the way down the chain from the top-level code which
> > > actually knows what state this thread of control is in.
> > 
> > Here the caller does not know.
> 
> I bet it does it's just that the caller is thirty five levels of
> function call higher up...
> 
> > The crux of the problem it that the AML interpreter
> > may need to allocate memory and thus may sleep.
> > Under nominal conditions the interpreter only runs in 
> > user-context and allocatges with GFP_KERNEL
> > so sleeping is just dandy.
> > 
> > But AML also needs to run at boot time and at resume time
> > to do things like configure interrupts before interrupts are
> > enabled...
> > 
> > boot-time is handled by _cond_resched()
> > not calling __cond_resched unless
> > system_state == SYSTEM_RUNNING
> > 
> > I suppose I could do something like this,
> > though I'm not sure of the prettiest place
> > to check system_resume_irqsoff == true,
> > so that is left out for now...
> > 
> >
> > ...
> > 
> > diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
> > index e52ad91..def91fa 100644
> > --- a/drivers/acpi/pci_link.c
> > +++ b/drivers/acpi/pci_link.c
> > @@ -320,7 +320,7 @@ static int acpi_pci_link_set(struct acpi_pci_link *link, int irq)
> >  	if (!link || !irq)
> >  		return -EINVAL;
> >  
> > -	resource = kzalloc(sizeof(*resource) + 1, irqs_disabled() ? GFP_ATOMIC: GFP_KERNEL);
> > +	resource = kzalloc(sizeof(*resource) + 1, GFP_KERNEL);
> >  	if (!resource)
> >  		return -ENOMEM;
> >  
> > diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c
> > index bb7d50d..27243f9 100644
> > --- a/drivers/acpi/power.c
> > +++ b/drivers/acpi/power.c
> > @@ -218,8 +218,7 @@ static int acpi_power_on(acpi_handle handle, struct acpi_device *dev)
> >  	}
> >  
> >  	if (!found) {
> > -		ref = kmalloc(sizeof (struct acpi_power_reference),
> > -		    irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL);
> > +		ref = kmalloc(sizeof (struct acpi_power_reference), GFP_KERNEL);
> >  		if (!ref) {
> >  			ACPI_DEBUG_PRINT((ACPI_DB_INFO, "kmalloc() failed\n"));
> >  			mutex_unlock(&resource->resource_lock);
> > diff --git a/include/acpi/platform/aclinux.h b/include/acpi/platform/aclinux.h
> > index 029c8c0..9fbf73f 100644
> > --- a/include/acpi/platform/aclinux.h
> > +++ b/include/acpi/platform/aclinux.h
> > @@ -121,17 +121,16 @@ static inline acpi_thread_id acpi_os_get_thread_id(void)
> >  #include <acpi/actypes.h>
> >  static inline void *acpi_os_allocate(acpi_size size)
> >  {
> > -	return kmalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
> > +	return kmalloc(size, GFP_KERNEL);
> >  }
> >  static inline void *acpi_os_allocate_zeroed(acpi_size size)
> >  {
> > -	return kzalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
> > +	return kzalloc(size, GFP_KERNEL);
> >  }
> >  
> >  static inline void *acpi_os_acquire_object(acpi_cache_t * cache)
> >  {
> > -	return kmem_cache_zalloc(cache,
> > -				 irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
> > +	return kmem_cache_zalloc(cache, GFP_KERNEL);
> >  }
> >  
> >  #define ACPI_ALLOCATE(a)	acpi_os_allocate(a)
> > diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> > index 2ce8207..a716de8 100644
> > --- a/include/linux/suspend.h
> > +++ b/include/linux/suspend.h
> > @@ -128,9 +128,16 @@ extern void arch_suspend_disable_irqs(void);
> >   */
> >  extern void arch_suspend_enable_irqs(void);
> >  
> > +/**
> > + * suspend_irqs_off
> > + *
> > + * flag to indicate that IRQs are off for the benefit of suspend
> > + */
> > +extern bool system_suspend_irqs_off;
> >  extern int pm_suspend(suspend_state_t state);
> >  #else /* !CONFIG_SUSPEND */
> >  #define suspend_valid_only_mem	NULL
> > +#define system_suspend_irqs_off false
> >  
> >  static inline void suspend_set_ops(struct platform_suspend_ops *ops) {}
> >  static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; }
> > diff --git a/kernel/power/main.c b/kernel/power/main.c
> > index b8f7ce9..2eadae7 100644
> > --- a/kernel/power/main.c
> > +++ b/kernel/power/main.c
> > @@ -268,16 +268,20 @@ static int suspend_prepare(void)
> >  	return error;
> >  }
> >  
> > +bool system_resume_irqsoff;	/* IRQs are off for resume */
> 
> I guess that was supposed to be "system_suspend_irqs_off"?
> 
> >  /* default implementation */
> >  void __attribute__ ((weak)) arch_suspend_disable_irqs(void)
> >  {
> >  	local_irq_disable();
> > +	system_resume_irqsoff = true;
> >  }
> >  
> >  /* default implementation */
> >  void __attribute__ ((weak)) arch_suspend_enable_irqs(void)
> >  {
> >  	local_irq_enable();
> > +	system_resume_irqsoff = false;
> >  }
> 
> (please use __weak)
> 
> Whether it's "system_resume_irqsoff" or "system_suspend_irqs_off",
> neither of them actually got used ;)
> 
> If the code is really this simple and localised then perhaps:
> 
> gfp_t acpi_aml_gfp_flags = GFP_ATOMIC;
> 
> void __weak arch_suspend_disable_irqs(void)
> {
> 	local_irq_disable();
> 	acpi_aml_gfp_flags = GFP_ATOMIC;
> }
> 
> /* default implementation */
> void __weak arch_suspend_enable_irqs(void)
> {
> 	local_irq_enable();
> 	acpi_aml_gfp_flags = GFP_KERNEL;
> }
> 
> would suffice.  Dunno.

arch_suspend_disable_irqs() and arch_suspend_enable_irqs() are only used during
suspend to RAM, but we have the 'suspend atomic context' during hibernation as
well.

> Why are we providing for an arch override of this?

Because of powerpc that does additional stuff here, AFAICS.

Anyway, analogous situation can happen on shutdown when the nonboot CPUs
are disabled and interrupts are off on the BSP.

I think what we need is something like 'if the nonboot CPUs are disabled and
interrupts are off (on the BSP, which is the only CPU active), use GFP_ATOMIC'.

I'll prepare a patch for that, unless someone is faster.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design)
  2008-12-16 15:34           ` Rafael J. Wysocki
@ 2008-12-16 23:40             ` Rafael J. Wysocki
  2008-12-16 23:49               ` Rafael J. Wysocki
                                 ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2008-12-16 23:40 UTC (permalink / raw)
  To: Andrew Morton, Len Brown
  Cc: pavel, Linux Kernel Mailing List, linux-acpi, Johannes Berg

On Tuesday, 16 of December 2008, Rafael J. Wysocki wrote:
> On Tuesday, 16 of December 2008, Andrew Morton wrote:
> > On Tue, 16 Dec 2008 00:24:43 -0500 (EST) Len Brown <lenb@kernel.org> wrote:
[--snip--]
> > 
> > (please use __weak)
> > 
> > Whether it's "system_resume_irqsoff" or "system_suspend_irqs_off",
> > neither of them actually got used ;)
> > 
> > If the code is really this simple and localised then perhaps:
> > 
> > gfp_t acpi_aml_gfp_flags = GFP_ATOMIC;
> > 
> > void __weak arch_suspend_disable_irqs(void)
> > {
> > 	local_irq_disable();
> > 	acpi_aml_gfp_flags = GFP_ATOMIC;
> > }
> > 
> > /* default implementation */
> > void __weak arch_suspend_enable_irqs(void)
> > {
> > 	local_irq_enable();
> > 	acpi_aml_gfp_flags = GFP_KERNEL;
> > }
> > 
> > would suffice.  Dunno.
> 
> arch_suspend_disable_irqs() and arch_suspend_enable_irqs() are only used during
> suspend to RAM, but we have the 'suspend atomic context' during hibernation as
> well.

In fact, we should be using GFP_ATOMIC already while devices are being
suspended, so we can switch from GFP_KERNEL to GFP_ATOMIC even before
interrupts are enabled.

In particular, we can do that in the suspend/hibernation platform callback as
shown in the patch below.

The patch hasn't been tested and is indended for discussion.

[Note: I'm not sure where to place the definition of acpi_gfp_flags.
 Temporarily I put that into bus.c, but if there's a better place, please
 tell me.]

Thanks,
Rafael

---
 drivers/acpi/bus.c              |    3 +++
 drivers/acpi/pci_link.c         |    2 +-
 drivers/acpi/sleep/main.c       |   22 ++++++++++++++++++----
 include/acpi/platform/aclinux.h |   15 +++++++++------
 include/linux/acpi.h            |    1 +
 5 files changed, 32 insertions(+), 11 deletions(-)

Index: linux-2.6/drivers/acpi/sleep/main.c
===================================================================
--- linux-2.6.orig/drivers/acpi/sleep/main.c
+++ linux-2.6/drivers/acpi/sleep/main.c
@@ -122,6 +122,7 @@ void __init acpi_s4_no_nvs(void)
 static int acpi_pm_disable_gpes(void)
 {
 	acpi_hw_disable_all_gpes();
+	acpi_gfp_flags = GFP_ATOMIC;
 	return 0;
 }
 
@@ -148,8 +149,10 @@ static int acpi_pm_prepare(void)
 {
 	int error = __acpi_pm_prepare();
 
-	if (!error)
+	if (!error) {
 		acpi_hw_disable_all_gpes();
+		acpi_gfp_flags = GFP_ATOMIC;
+	}
 	return error;
 }
 
@@ -175,6 +178,8 @@ static void acpi_pm_finish(void)
 	acpi_set_firmware_waking_vector((acpi_physical_address) 0);
 
 	acpi_target_sleep_state = ACPI_STATE_S0;
+
+	acpi_gfp_flags = GFP_KERNEL;
 }
 
 /**
@@ -188,6 +193,7 @@ static void acpi_pm_end(void)
 	 */
 	acpi_target_sleep_state = ACPI_STATE_S0;
 	acpi_sleep_tts_switch(acpi_target_sleep_state);
+	acpi_gfp_flags = GFP_KERNEL;
 }
 #else /* !CONFIG_ACPI_SLEEP */
 #define acpi_target_sleep_state	ACPI_STATE_S0
@@ -328,7 +334,9 @@ static int acpi_suspend_begin_old(suspen
 {
 	int error = acpi_suspend_begin(pm_state);
 
-	if (!error)
+	if (error)
+		acpi_gfp_flags = GFP_KERNEL;
+	else
 		error = __acpi_pm_prepare();
 	return error;
 }
@@ -423,7 +431,9 @@ static int acpi_hibernation_pre_snapshot
 {
 	int error = acpi_pm_prepare();
 
-	if (!error)
+	if (error)
+		acpi_gfp_flags = GFP_KERNEL;
+	else
 		hibernate_nvs_save();
 
 	return error;
@@ -475,6 +485,7 @@ static void acpi_hibernation_leave(void)
 static void acpi_pm_enable_gpes(void)
 {
 	acpi_hw_enable_all_runtime_gpes();
+	acpi_gfp_flags = GFP_KERNEL;
 }
 
 static struct platform_hibernation_ops acpi_hibernation_ops = {
@@ -520,7 +531,9 @@ static int acpi_hibernation_pre_snapshot
 {
 	int error = acpi_pm_disable_gpes();
 
-	if (!error)
+	if (error)
+		acpi_gfp_flags = GFP_KERNEL;
+	else
 		hibernate_nvs_save();
 
 	return error;
@@ -675,6 +688,7 @@ static void acpi_power_off_prepare(void)
 	/* Prepare to power off the system */
 	acpi_sleep_prepare(ACPI_STATE_S5);
 	acpi_hw_disable_all_gpes();
+	acpi_gfp_flags = GFP_ATOMIC;
 }
 
 static void acpi_power_off(void)
Index: linux-2.6/drivers/acpi/bus.c
===================================================================
--- linux-2.6.orig/drivers/acpi/bus.c
+++ linux-2.6/drivers/acpi/bus.c
@@ -46,6 +46,9 @@ struct acpi_device *acpi_root;
 struct proc_dir_entry *acpi_root_dir;
 EXPORT_SYMBOL(acpi_root_dir);
 
+gfp_t acpi_gfp_flags = GFP_KERNEL;
+EXPORT_SYMBOL(acpi_gfp_flags);
+
 #define STRUCT_TO_INT(s)	(*((int*)&s))
 
 static int set_power_nocheck(const struct dmi_system_id *id)
Index: linux-2.6/drivers/acpi/pci_link.c
===================================================================
--- linux-2.6.orig/drivers/acpi/pci_link.c
+++ linux-2.6/drivers/acpi/pci_link.c
@@ -320,7 +320,7 @@ static int acpi_pci_link_set(struct acpi
 	if (!link || !irq)
 		return -EINVAL;
 
-	resource = kzalloc(sizeof(*resource) + 1, irqs_disabled() ? GFP_ATOMIC: GFP_KERNEL);
+	resource = kzalloc(sizeof(*resource) + 1, acpi_gfp_flags);
 	if (!resource)
 		return -ENOMEM;
 
Index: linux-2.6/include/acpi/platform/aclinux.h
===================================================================
--- linux-2.6.orig/include/acpi/platform/aclinux.h
+++ linux-2.6/include/acpi/platform/aclinux.h
@@ -113,25 +113,28 @@ static inline acpi_thread_id acpi_os_get
 }
 
 /*
- * The irqs_disabled() check is for resume from RAM.
- * Interrupts are off during resume, just like they are for boot.
+ * acpi_gfp_flags is GFP_KERNEL except for the suspend-to-RAM and hibernation
+ * code paths, where it equals GFP_ATOMIC.
+ * Interrupts are off during suspend-resume, just like they are for boot.
  * However, boot has  (system_state != SYSTEM_RUNNING)
  * to quiet __might_sleep() in kmalloc() and resume does not.
  */
 #include <acpi/actypes.h>
+
+extern gfp_t acpi_gfp_flags;
+
 static inline void *acpi_os_allocate(acpi_size size)
 {
-	return kmalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
+	return kmalloc(size, acpi_gfp_flags);
 }
 static inline void *acpi_os_allocate_zeroed(acpi_size size)
 {
-	return kzalloc(size, irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
+	return kzalloc(size, acpi_gfp_flags);
 }
 
 static inline void *acpi_os_acquire_object(acpi_cache_t * cache)
 {
-	return kmem_cache_zalloc(cache,
-				 irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
+	return kmem_cache_zalloc(cache, acpi_gfp_flags);
 }
 
 #define ACPI_ALLOCATE(a)	acpi_os_allocate(a)
Index: linux-2.6/include/linux/acpi.h
===================================================================
--- linux-2.6.orig/include/linux/acpi.h
+++ linux-2.6/include/linux/acpi.h
@@ -115,6 +115,7 @@ extern int pci_mmcfg_config_num;
 
 extern int sbf_port;
 extern unsigned long acpi_realmode_flags;
+extern gfp_t acpi_gfp_flags;
 
 int acpi_register_gsi (u32 gsi, int triggering, int polarity);
 int acpi_gsi_to_irq (u32 gsi, unsigned int *irq);

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design)
  2008-12-16 23:40             ` Rafael J. Wysocki
@ 2008-12-16 23:49               ` Rafael J. Wysocki
  2008-12-18  6:07               ` Len Brown
  2008-12-18  6:08               ` irqs_disabled() vs ACPI interpreter vs suspend Len Brown
  2 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2008-12-16 23:49 UTC (permalink / raw)
  To: Andrew Morton, Len Brown
  Cc: pavel, Linux Kernel Mailing List, linux-acpi, Johannes Berg

On Wednesday, 17 of December 2008, Rafael J. Wysocki wrote:
> On Tuesday, 16 of December 2008, Rafael J. Wysocki wrote:
> > On Tuesday, 16 of December 2008, Andrew Morton wrote:
> > > On Tue, 16 Dec 2008 00:24:43 -0500 (EST) Len Brown <lenb@kernel.org> wrote:
> [--snip--]
> > > 
> > > (please use __weak)
> > > 
> > > Whether it's "system_resume_irqsoff" or "system_suspend_irqs_off",
> > > neither of them actually got used ;)
> > > 
> > > If the code is really this simple and localised then perhaps:
> > > 
> > > gfp_t acpi_aml_gfp_flags = GFP_ATOMIC;
> > > 
> > > void __weak arch_suspend_disable_irqs(void)
> > > {
> > > 	local_irq_disable();
> > > 	acpi_aml_gfp_flags = GFP_ATOMIC;
> > > }
> > > 
> > > /* default implementation */
> > > void __weak arch_suspend_enable_irqs(void)
> > > {
> > > 	local_irq_enable();
> > > 	acpi_aml_gfp_flags = GFP_KERNEL;
> > > }
> > > 
> > > would suffice.  Dunno.
> > 
> > arch_suspend_disable_irqs() and arch_suspend_enable_irqs() are only used during
> > suspend to RAM, but we have the 'suspend atomic context' during hibernation as
> > well.
> 
> In fact, we should be using GFP_ATOMIC already while devices are being
> suspended, so we can switch from GFP_KERNEL to GFP_ATOMIC even before
> interrupts are enabled.
> 
> In particular, we can do that in the suspend/hibernation platform callback as
> shown in the patch below.
> 
> The patch hasn't been tested and is indended for discussion.

Below is another patch for discussion.

It standardizes the switching of interrupts etc. during suspend to RAM and
hibernation and adds disabling/enabling of preemption so that the
cond_resched() in the ACPI code doesn't try to schedule.

The patch hasn't been tested yet.

Thanks,
Rafael

---
 include/linux/suspend.h |    3 +
 kernel/power/disk.c     |   26 ++++++++--------
 kernel/power/main.c     |   75 ++++++++++++++++++++++++++++++++++++------------
 kernel/power/power.h    |    4 ++
 4 files changed, 78 insertions(+), 30 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -214,8 +214,8 @@ static int create_image(int platform_mod
 	if (error)
 		return error;
 
-	device_pm_lock();
-	local_irq_disable();
+	suspend_disable_concurrency();
+
 	/* At this point, device_suspend() has been called, but *not*
 	 * device_power_down(). We *must* call device_power_down() now.
 	 * Otherwise, drivers for some devices (e.g. interrupt controllers)
@@ -248,9 +248,10 @@ static int create_image(int platform_mod
 	 */
 	device_power_up(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+
  Enable_irqs:
-	local_irq_enable();
-	device_pm_unlock();
+	suspend_enable_concurrency();
+
 	return error;
 }
 
@@ -330,8 +331,8 @@ static int resume_target_kernel(void)
 {
 	int error;
 
-	device_pm_lock();
-	local_irq_disable();
+	suspend_disable_concurrency();
+
 	error = device_power_down(PMSG_QUIESCE);
 	if (error) {
 		printk(KERN_ERR "PM: Some devices failed to power down, "
@@ -361,9 +362,10 @@ static int resume_target_kernel(void)
 	restore_processor_state();
 	touch_softlockup_watchdog();
 	device_power_up(PMSG_RECOVER);
+
  Enable_irqs:
-	local_irq_enable();
-	device_pm_unlock();
+	suspend_enable_concurrency();
+
 	return error;
 }
 
@@ -442,16 +444,16 @@ int hibernation_platform_enter(void)
 	if (error)
 		goto Finish;
 
-	device_pm_lock();
-	local_irq_disable();
+	suspend_disable_concurrency();
+
 	error = device_power_down(PMSG_HIBERNATE);
 	if (!error) {
 		hibernation_ops->enter();
 		/* We should never get here */
 		while (1);
 	}
-	local_irq_enable();
-	device_pm_unlock();
+
+	suspend_enable_concurrency();
 
 	/*
 	 * We don't need to reenable the nonboot CPUs or resume consoles, since
Index: linux-2.6/kernel/power/main.c
===================================================================
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -55,6 +55,46 @@ int pm_notifier_call_chain(unsigned long
 			== NOTIFY_BAD) ? -EINVAL : 0;
 }
 
+/* default implementation */
+void __weak arch_suspend_disable_irqs(void)
+{
+	local_irq_disable();
+}
+
+/* default implementation */
+void __weak arch_suspend_enable_irqs(void)
+{
+	local_irq_enable();
+}
+
+/**
+ *	suspend_disable_concurrency - disable the execution of any other code
+ *                                    in parallel with the calling thread
+ *
+ *	To be executed with only one CPU on line.
+ */
+void suspend_disable_concurrency(void)
+{
+	device_pm_lock();
+	preempt_disable();
+	arch_suspend_disable_irqs();
+	BUG_ON(!irqs_disabled());
+}
+
+/**
+ *	suspend_enable_concurrency - allow other code to be executed in parallel
+ *                                   with the calling thread
+ *
+ *	To be executed with only one CPU on line.
+ */
+void suspend_enable_concurrency(void)
+{
+	arch_suspend_enable_irqs();
+	BUG_ON(irqs_disabled());
+	preempt_enable_no_resched();
+	device_pm_unlock();
+}
+
 #ifdef CONFIG_PM_DEBUG
 int pm_test_level = TEST_NONE;
 
@@ -268,18 +321,6 @@ static int suspend_prepare(void)
 	return error;
 }
 
-/* default implementation */
-void __attribute__ ((weak)) arch_suspend_disable_irqs(void)
-{
-	local_irq_disable();
-}
-
-/* default implementation */
-void __attribute__ ((weak)) arch_suspend_enable_irqs(void)
-{
-	local_irq_enable();
-}
-
 /**
  *	suspend_enter - enter the desired system sleep state.
  *	@state:		state to enter
@@ -290,9 +331,7 @@ static int suspend_enter(suspend_state_t
 {
 	int error = 0;
 
-	device_pm_lock();
-	arch_suspend_disable_irqs();
-	BUG_ON(!irqs_disabled());
+	suspend_disable_concurrency();
 
 	if ((error = device_power_down(PMSG_SUSPEND))) {
 		printk(KERN_ERR "PM: Some devices failed to power down\n");
@@ -303,10 +342,10 @@ static int suspend_enter(suspend_state_t
 		error = suspend_ops->enter(state);
 
 	device_power_up(PMSG_RESUME);
+
  Done:
-	arch_suspend_enable_irqs();
-	BUG_ON(irqs_disabled());
-	device_pm_unlock();
+	suspend_enable_concurrency();
+
 	return error;
 }
 
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -163,11 +163,15 @@ extern void swsusp_show_speed(struct tim
 #ifdef CONFIG_SUSPEND
 /* kernel/power/main.c */
 extern int suspend_devices_and_enter(suspend_state_t state);
+extern void suspend_disable_concurrency(void);
+extern void suspend_enable_concurrency(void);
 #else /* !CONFIG_SUSPEND */
 static inline int suspend_devices_and_enter(suspend_state_t state)
 {
 	return -ENOSYS;
 }
+static inline void suspend_disable_concurrency(void) {}
+static inline void suspend_enable_concurrency(void) {}
 #endif /* !CONFIG_SUSPEND */
 
 #ifdef CONFIG_PM_SLEEP

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design)
  2008-12-16 23:40             ` Rafael J. Wysocki
  2008-12-16 23:49               ` Rafael J. Wysocki
@ 2008-12-18  6:07               ` Len Brown
  2008-12-18 16:48                 ` Pavel Machek
  2008-12-18  6:08               ` irqs_disabled() vs ACPI interpreter vs suspend Len Brown
  2 siblings, 1 reply; 19+ messages in thread
From: Len Brown @ 2008-12-18  6:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, pavel, Linux Kernel Mailing List, linux-acpi,
	Johannes Berg



-- Len Brown, Intel Open Source Technology Center

> In fact, we should be using GFP_ATOMIC already while devices are being
> suspended, so we can switch from GFP_KERNEL to GFP_ATOMIC even before
> interrupts are enabled.

true.  it makes me wonder if tinkering with ACPI
is a bad idea -- for it will address only one user of kmalloc.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* irqs_disabled() vs ACPI interpreter vs suspend
  2008-12-16 23:40             ` Rafael J. Wysocki
  2008-12-16 23:49               ` Rafael J. Wysocki
  2008-12-18  6:07               ` Len Brown
@ 2008-12-18  6:08               ` Len Brown
  2008-12-18 16:32                 ` Rafael J. Wysocki
  2 siblings, 1 reply; 19+ messages in thread
From: Len Brown @ 2008-12-18  6:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, pavel, Linux Kernel Mailing List, linux-acpi,
	Johannes Berg

Rafael,

to answer your question "what happens at boot"...

interrupts are enabled in start_kernel()
well before the ACPI interpreter is started
up in a subsys_initcall().

The first use of the interpreter indeed allocates memory
(as every invocation of acpi_evaluate_object() does)
to evaluate _PIC
ie. when we print out "ACPI: Using IOAPIC for interrupt routing".

So one would first think we could WARN_ON(irqs_disabled())
right at acpi_evaluate_object(), or at any external
entry to the AML interpreter.

But _GTS and _BFS are counter-examples --
they are ONLY evaluated with interrupts OFF,
since they are between the invocation of arch_suspend_disable_irqs()
and arch_suspend_enable_irqs().  I believe that they are the
ONLY counter-examples, and for those we'd conceivably
WARN_ON(!irqs_disabled).

But at resume...
irqrouter_resume() is called to restore ACPI PCI Interrupt Link Devices
while we still have interrupts disabled.  If we called it after interrupts
were enabled, then an incorrectly resumed link could cause a
screaming interrupt.

This is different from boot-time.  At boot time
we disable all the links b/c we know that the drivers
that use them will all request_irq() and we'll set
up the links one by one at that time.

Originally we had planned for suspend to be like boot --
every driver would free_irq() at .suspend
and request_irq() at .resume -- indirectly for pci devices
via pci_enable_device()...
This would leave the Links disabled at suspend time, like we
disable them at boot time -- and then the request_irq()'s would
come in from the resumed drivers and the links would be re-programmed.
I don't think we succeeded here, and IIR Linus didn't like our
suggestion that every driver must do something, rather than do nothing....
So the irqrouter_resume safety-net remains.

But restoring a PCI Interrupt Link Device evaluates _CRS, _PRS, _SRS --
general methods which are also invoked at other times with
interrupts enabled.  So for those we'd not be able to WARN_ON()
for either irqs enabled or disabled:-(

I have to think about irqrouter_resume a bit.
I don't like it, but I don't see an alternative -- unless we
do something like ENFORCE all users of the links have to
stop using them at suspend, so we can _DIS them,
and they must request their IRQs at resume
like they do at boot...  IIR we'd have to add
some reference counting to handle shard links
so we could _DIS when the last user freed the irq.

So it looks like we will indeed need something like the
patch to transform ACPI's use of GFP_KERNEL
to GFP_ATOMIC across late suspend
and early resume; to avoid warnings from
_GTS, _BFS, and irqrouter_resume use of kmalloc.

thanks,
-- Len Brown, Intel Open Source Technology Center


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: irqs_disabled() vs ACPI interpreter vs suspend
  2008-12-18  6:08               ` irqs_disabled() vs ACPI interpreter vs suspend Len Brown
@ 2008-12-18 16:32                 ` Rafael J. Wysocki
  2008-12-18 16:52                   ` Pavel Machek
  2008-12-18 16:55                   ` Pavel Machek
  0 siblings, 2 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2008-12-18 16:32 UTC (permalink / raw)
  To: Len Brown
  Cc: Andrew Morton, pavel, Linux Kernel Mailing List, linux-acpi,
	Johannes Berg

On Thursday, 18 of December 2008, Len Brown wrote:
> Rafael,
> 
> to answer your question "what happens at boot"...
> 
> interrupts are enabled in start_kernel()
> well before the ACPI interpreter is started
> up in a subsys_initcall().
> 
> The first use of the interpreter indeed allocates memory
> (as every invocation of acpi_evaluate_object() does)
> to evaluate _PIC
> ie. when we print out "ACPI: Using IOAPIC for interrupt routing".
> 
> So one would first think we could WARN_ON(irqs_disabled())
> right at acpi_evaluate_object(), or at any external
> entry to the AML interpreter.
> 
> But _GTS and _BFS are counter-examples --
> they are ONLY evaluated with interrupts OFF,
> since they are between the invocation of arch_suspend_disable_irqs()
> and arch_suspend_enable_irqs().  I believe that they are the
> ONLY counter-examples, and for those we'd conceivably
> WARN_ON(!irqs_disabled).
> 
> But at resume...
> irqrouter_resume() is called to restore ACPI PCI Interrupt Link Devices
> while we still have interrupts disabled.  If we called it after interrupts
> were enabled, then an incorrectly resumed link could cause a
> screaming interrupt.
> 
> This is different from boot-time.  At boot time
> we disable all the links b/c we know that the drivers
> that use them will all request_irq() and we'll set
> up the links one by one at that time.
> 
> Originally we had planned for suspend to be like boot --
> every driver would free_irq() at .suspend
> and request_irq() at .resume -- indirectly for pci devices
> via pci_enable_device()...
> This would leave the Links disabled at suspend time, like we
> disable them at boot time -- and then the request_irq()'s would
> come in from the resumed drivers and the links would be re-programmed.
> I don't think we succeeded here, and IIR Linus didn't like our
> suggestion that every driver must do something, rather than do nothing....
> So the irqrouter_resume safety-net remains.
> 
> But restoring a PCI Interrupt Link Device evaluates _CRS, _PRS, _SRS --
> general methods which are also invoked at other times with
> interrupts enabled.  So for those we'd not be able to WARN_ON()
> for either irqs enabled or disabled:-(
> 
> I have to think about irqrouter_resume a bit.
> I don't like it, but I don't see an alternative -- unless we
> do something like ENFORCE all users of the links have to
> stop using them at suspend, so we can _DIS them,
> and they must request their IRQs at resume
> like they do at boot...

Well, that was my question to Linus in a recent discussion.

I see some technical reasons to require drivers to free IRQs during
suspend.  First, the issue above.  Second, some removable devices may not
even be present during resume while we may still think they use an IRQ
(or is that impossible for some reason?).

Now, _if_ we decide we want devices to free IRQs during suspend, we can make
the core do that, so that the drivers won't have to worry about it.

Does the ACPI spec say anything about handling interrupts during
suspend-resume?

> IIR we'd have to add some reference counting to handle shard links
> so we could _DIS when the last user freed the irq.
> 
> So it looks like we will indeed need something like the
> patch to transform ACPI's use of GFP_KERNEL
> to GFP_ATOMIC across late suspend
> and early resume; to avoid warnings from
> _GTS, _BFS, and irqrouter_resume use of kmalloc.

OK, so there are two possibilities, IMO.

Either we switch that in the suspend callbacks like in my patch #1, or we
can add a bool variable that will be 'true' is the system is during
suspend-resume with interrupts off.  Any of them is fine by me, so please let
me know.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design)
  2008-12-18  6:07               ` Len Brown
@ 2008-12-18 16:48                 ` Pavel Machek
  0 siblings, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2008-12-18 16:48 UTC (permalink / raw)
  To: Len Brown
  Cc: Rafael J. Wysocki, Andrew Morton, Linux Kernel Mailing List,
	linux-acpi, Johannes Berg

On Thu 2008-12-18 01:07:08, Len Brown wrote:
> 
> 
> -- Len Brown, Intel Open Source Technology Center
> 
> > In fact, we should be using GFP_ATOMIC already while devices are being
> > suspended, so we can switch from GFP_KERNEL to GFP_ATOMIC even before
> > interrupts are enabled.
> 
> true.  it makes me wonder if tinkering with ACPI
> is a bad idea -- for it will address only one user of kmalloc.

Other users should already know that they are called while suspending
device, and behave accordingly (use right flags to kmalloc, etc.)

If not, that's a bug...
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: irqs_disabled() vs ACPI interpreter vs suspend
  2008-12-18 16:32                 ` Rafael J. Wysocki
@ 2008-12-18 16:52                   ` Pavel Machek
  2008-12-18 21:20                     ` Rafael J. Wysocki
  2008-12-18 16:55                   ` Pavel Machek
  1 sibling, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2008-12-18 16:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Andrew Morton, Linux Kernel Mailing List, linux-acpi,
	Johannes Berg

Hi!

> > to answer your question "what happens at boot"...
> > 
> > interrupts are enabled in start_kernel()
> > well before the ACPI interpreter is started
> > up in a subsys_initcall().
> > 
> > The first use of the interpreter indeed allocates memory
> > (as every invocation of acpi_evaluate_object() does)
> > to evaluate _PIC
> > ie. when we print out "ACPI: Using IOAPIC for interrupt routing".
> > 
> > So one would first think we could WARN_ON(irqs_disabled())
> > right at acpi_evaluate_object(), or at any external
> > entry to the AML interpreter.
> > 
> > But _GTS and _BFS are counter-examples --
> > they are ONLY evaluated with interrupts OFF,
> > since they are between the invocation of arch_suspend_disable_irqs()
> > and arch_suspend_enable_irqs().  I believe that they are the
> > ONLY counter-examples, and for those we'd conceivably
> > WARN_ON(!irqs_disabled).
> > 
> > But at resume...
> > irqrouter_resume() is called to restore ACPI PCI Interrupt Link Devices
> > while we still have interrupts disabled.  If we called it after interrupts
> > were enabled, then an incorrectly resumed link could cause a
> > screaming interrupt.
> > 
> > This is different from boot-time.  At boot time
> > we disable all the links b/c we know that the drivers
> > that use them will all request_irq() and we'll set
> > up the links one by one at that time.
> > 
> > Originally we had planned for suspend to be like boot --
> > every driver would free_irq() at .suspend
> > and request_irq() at .resume -- indirectly for pci devices
> > via pci_enable_device()...
> > This would leave the Links disabled at suspend time, like we
> > disable them at boot time -- and then the request_irq()'s would
> > come in from the resumed drivers and the links would be re-programmed.
> > I don't think we succeeded here, and IIR Linus didn't like our
> > suggestion that every driver must do something, rather than do nothing....
> > So the irqrouter_resume safety-net remains.
> > 
> > But restoring a PCI Interrupt Link Device evaluates _CRS, _PRS, _SRS --
> > general methods which are also invoked at other times with
> > interrupts enabled.  So for those we'd not be able to WARN_ON()
> > for either irqs enabled or disabled:-(
> > 
> > I have to think about irqrouter_resume a bit.
> > I don't like it, but I don't see an alternative -- unless we
> > do something like ENFORCE all users of the links have to
> > stop using them at suspend, so we can _DIS them,
> > and they must request their IRQs at resume
> > like they do at boot...
> 
> Well, that was my question to Linus in a recent discussion.
> 
> I see some technical reasons to require drivers to free IRQs during
> suspend.  First, the issue above.  Second, some removable devices may not
> even be present during resume while we may still think they use an IRQ
> (or is that impossible for some reason?).
> 
> Now, _if_ we decide we want devices to free IRQs during suspend, we can make
> the core do that, so that the drivers won't have to worry about it.

How can we make core do that?

I think that would be everyones prefered suggestion:

Linus: happy, because drivers don't have to do anything

me, Len: happy, because interrupts are freed over suspend

...but I don't see how to do that easily&nicely. AFAICT currently
drivers do something like

private_struct->my_interrupt = register_irq()...

If we put my_interrupt in struct device (or somewhere) we could handle
99% devices that have just one interrupt nicely... maybe that's good
enough? (And the rest can free/register them by hand without telling core).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: irqs_disabled() vs ACPI interpreter vs suspend
  2008-12-18 16:32                 ` Rafael J. Wysocki
  2008-12-18 16:52                   ` Pavel Machek
@ 2008-12-18 16:55                   ` Pavel Machek
  1 sibling, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2008-12-18 16:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Andrew Morton, Linux Kernel Mailing List, linux-acpi,
	Johannes Berg

Hi!

> > So it looks like we will indeed need something like the
> > patch to transform ACPI's use of GFP_KERNEL
> > to GFP_ATOMIC across late suspend
> > and early resume; to avoid warnings from
> > _GTS, _BFS, and irqrouter_resume use of kmalloc.
> 
> OK, so there are two possibilities, IMO.
> 
> Either we switch that in the suspend callbacks like in my patch #1, or we
> can add a bool variable that will be 'true' is the system is during

Do you mean acpi_gfp_flags? 

I think that's ugly, because not only gfp_flags need change during
late suspend (you may not sleep, for example). I'd prefer boolean.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: irqs_disabled() vs ACPI interpreter vs suspend
  2008-12-18 16:52                   ` Pavel Machek
@ 2008-12-18 21:20                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2008-12-18 21:20 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Len Brown, Andrew Morton, Linux Kernel Mailing List, linux-acpi,
	Johannes Berg

On Thursday, 18 of December 2008, Pavel Machek wrote:
> Hi!
> 
> > > to answer your question "what happens at boot"...
> > > 
> > > interrupts are enabled in start_kernel()
> > > well before the ACPI interpreter is started
> > > up in a subsys_initcall().
> > > 
> > > The first use of the interpreter indeed allocates memory
> > > (as every invocation of acpi_evaluate_object() does)
> > > to evaluate _PIC
> > > ie. when we print out "ACPI: Using IOAPIC for interrupt routing".
> > > 
> > > So one would first think we could WARN_ON(irqs_disabled())
> > > right at acpi_evaluate_object(), or at any external
> > > entry to the AML interpreter.
> > > 
> > > But _GTS and _BFS are counter-examples --
> > > they are ONLY evaluated with interrupts OFF,
> > > since they are between the invocation of arch_suspend_disable_irqs()
> > > and arch_suspend_enable_irqs().  I believe that they are the
> > > ONLY counter-examples, and for those we'd conceivably
> > > WARN_ON(!irqs_disabled).
> > > 
> > > But at resume...
> > > irqrouter_resume() is called to restore ACPI PCI Interrupt Link Devices
> > > while we still have interrupts disabled.  If we called it after interrupts
> > > were enabled, then an incorrectly resumed link could cause a
> > > screaming interrupt.
> > > 
> > > This is different from boot-time.  At boot time
> > > we disable all the links b/c we know that the drivers
> > > that use them will all request_irq() and we'll set
> > > up the links one by one at that time.
> > > 
> > > Originally we had planned for suspend to be like boot --
> > > every driver would free_irq() at .suspend
> > > and request_irq() at .resume -- indirectly for pci devices
> > > via pci_enable_device()...
> > > This would leave the Links disabled at suspend time, like we
> > > disable them at boot time -- and then the request_irq()'s would
> > > come in from the resumed drivers and the links would be re-programmed.
> > > I don't think we succeeded here, and IIR Linus didn't like our
> > > suggestion that every driver must do something, rather than do nothing....
> > > So the irqrouter_resume safety-net remains.
> > > 
> > > But restoring a PCI Interrupt Link Device evaluates _CRS, _PRS, _SRS --
> > > general methods which are also invoked at other times with
> > > interrupts enabled.  So for those we'd not be able to WARN_ON()
> > > for either irqs enabled or disabled:-(
> > > 
> > > I have to think about irqrouter_resume a bit.
> > > I don't like it, but I don't see an alternative -- unless we
> > > do something like ENFORCE all users of the links have to
> > > stop using them at suspend, so we can _DIS them,
> > > and they must request their IRQs at resume
> > > like they do at boot...
> > 
> > Well, that was my question to Linus in a recent discussion.
> > 
> > I see some technical reasons to require drivers to free IRQs during
> > suspend.  First, the issue above.  Second, some removable devices may not
> > even be present during resume while we may still think they use an IRQ
> > (or is that impossible for some reason?).
> > 
> > Now, _if_ we decide we want devices to free IRQs during suspend, we can make
> > the core do that, so that the drivers won't have to worry about it.
> 
> How can we make core do that?
> 
> I think that would be everyones prefered suggestion:
> 
> Linus: happy, because drivers don't have to do anything
> 
> me, Len: happy, because interrupts are freed over suspend
> 
> ...but I don't see how to do that easily&nicely. AFAICT currently
> drivers do something like
> 
> private_struct->my_interrupt = register_irq()...
> 
> If we put my_interrupt in struct device (or somewhere) we could handle
> 99% devices that have just one interrupt nicely... maybe that's good
> enough? (And the rest can free/register them by hand without telling core).

Well, for PCI devices we have pdev->irq, so that would be easy.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2008-12-18 21:21 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-11-25 11:05 acpi_evaluate_integer broken by design Pavel Machek
2008-11-25 18:41 ` Moore, Robert
2008-11-26 21:20 ` Andrew Morton
2008-11-26 22:37   ` Len Brown
2008-11-26 23:02     ` Andrew Morton
2008-11-27 13:58       ` Pavel Machek
2008-12-16  5:24       ` acpi_os_allocate(GFP_KERNEL) (was Re: acpi_evaluate_integer broken by design) Len Brown
2008-12-16  5:41         ` Andrew Morton
2008-12-16 15:34           ` Rafael J. Wysocki
2008-12-16 23:40             ` Rafael J. Wysocki
2008-12-16 23:49               ` Rafael J. Wysocki
2008-12-18  6:07               ` Len Brown
2008-12-18 16:48                 ` Pavel Machek
2008-12-18  6:08               ` irqs_disabled() vs ACPI interpreter vs suspend Len Brown
2008-12-18 16:32                 ` Rafael J. Wysocki
2008-12-18 16:52                   ` Pavel Machek
2008-12-18 21:20                     ` Rafael J. Wysocki
2008-12-18 16:55                   ` Pavel Machek
2008-11-27 13:56   ` acpi_evaluate_integer broken by design Pavel Machek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.