linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
@ 2017-04-25 19:58 Dan Williams
  2017-04-26 22:25 ` Rafael J. Wysocki
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Williams @ 2017-04-25 19:58 UTC (permalink / raw)
  To: rafael.j.wysocki
  Cc: Anush Seetharaman, Tiffany Kasanicky, Ryon Jensen, linux-kernel,
	stable, linux-acpi, Kristin Jacque, Zhang Rui

Reading an ACPI table through the /sys/firmware/acpi/tables interface
more than 65,536 times leads to the following log message:

 ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
  (20170119/tbutils-423)

...and the table being unavailable until the next reboot. Add the
missing acpi_put_table() so the table ->validation_count is decremented
after each read.

Cc: <stable@vger.kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: Kristin Jacque <kristin.jacque@intel.com>
Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
Cc: Ryon Jensen <ryon.jensen@intel.com>
Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---

Changes in v2:
* compile fix s/table/table_header/. Sorry, I forgot to do 'stg refresh'
  before 'stg mail'

 drivers/acpi/sysfs.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c
index cf05ae973381..5180fef9eb49 100644
--- a/drivers/acpi/sysfs.c
+++ b/drivers/acpi/sysfs.c
@@ -333,14 +333,17 @@ static ssize_t acpi_table_show(struct file *filp, struct kobject *kobj,
 	    container_of(bin_attr, struct acpi_table_attr, attr);
 	struct acpi_table_header *table_header = NULL;
 	acpi_status status;
+	ssize_t rc;
 
 	status = acpi_get_table(table_attr->name, table_attr->instance,
 				&table_header);
 	if (ACPI_FAILURE(status))
 		return -ENODEV;
 
-	return memory_read_from_buffer(buf, count, &offset,
-				       table_header, table_header->length);
+	rc = memory_read_from_buffer(buf, count, &offset, table_header,
+			table_header->length);
+	acpi_put_table(table_header);
+	return rc;
 }
 
 static int acpi_table_attr_init(struct kobject *tables_obj,

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
  2017-04-25 19:58 [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service Dan Williams
@ 2017-04-26 22:25 ` Rafael J. Wysocki
  2017-04-26 22:37   ` Dan Williams
  2017-04-27  6:49   ` Zheng, Lv
  0 siblings, 2 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2017-04-26 22:25 UTC (permalink / raw)
  To: Dan Williams
  Cc: Rafael Wysocki, Anush Seetharaman, Tiffany Kasanicky,
	Ryon Jensen, Linux Kernel Mailing List, Stable,
	ACPI Devel Maling List, Kristin Jacque, Zhang Rui

On Tue, Apr 25, 2017 at 9:58 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> Reading an ACPI table through the /sys/firmware/acpi/tables interface
> more than 65,536 times leads to the following log message:
>
>  ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
>   (20170119/tbutils-423)
>
> ...and the table being unavailable until the next reboot. Add the
> missing acpi_put_table() so the table ->validation_count is decremented
> after each read.
>
> Cc: <stable@vger.kernel.org>
> Cc: Zhang Rui <rui.zhang@intel.com>
> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
> Cc: Kristin Jacque <kristin.jacque@intel.com>
> Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
> Cc: Ryon Jensen <ryon.jensen@intel.com>
> Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
> Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

I'm going to apply this, but your Fixes tag is not correct.

validation_count was added to struct acpi_table_desc by commit

commit 174cc7187e6f088942c8e74daa7baff7b44b33c9
Author: Lv Zheng <lv.zheng@intel.com>
Date:   Wed Dec 14 15:04:25 2016 +0800

    ACPICA: Tables: Back port acpi_get_table_with_size() and
early_acpi_os_unmap_memory()
from Linux kernel

from the 4.10 time frame, so IMO it should be

Fixes: 174cc7187e6f (ACPICA: Tables: Back port
acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
kernel)

> ---
>
> Changes in v2:
> * compile fix s/table/table_header/. Sorry, I forgot to do 'stg refresh'
>   before 'stg mail'
>
>  drivers/acpi/sysfs.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c
> index cf05ae973381..5180fef9eb49 100644
> --- a/drivers/acpi/sysfs.c
> +++ b/drivers/acpi/sysfs.c
> @@ -333,14 +333,17 @@ static ssize_t acpi_table_show(struct file *filp, struct kobject *kobj,
>             container_of(bin_attr, struct acpi_table_attr, attr);
>         struct acpi_table_header *table_header = NULL;
>         acpi_status status;
> +       ssize_t rc;
>
>         status = acpi_get_table(table_attr->name, table_attr->instance,
>                                 &table_header);
>         if (ACPI_FAILURE(status))
>                 return -ENODEV;
>
> -       return memory_read_from_buffer(buf, count, &offset,
> -                                      table_header, table_header->length);
> +       rc = memory_read_from_buffer(buf, count, &offset, table_header,
> +                       table_header->length);
> +       acpi_put_table(table_header);
> +       return rc;
>  }
>
>  static int acpi_table_attr_init(struct kobject *tables_obj,
>
> --

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
  2017-04-26 22:25 ` Rafael J. Wysocki
@ 2017-04-26 22:37   ` Dan Williams
  2017-05-24 19:01     ` Dan Williams
  2017-04-27  6:49   ` Zheng, Lv
  1 sibling, 1 reply; 8+ messages in thread
From: Dan Williams @ 2017-04-26 22:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Anush Seetharaman, Tiffany Kasanicky,
	Ryon Jensen, Linux Kernel Mailing List, Stable,
	ACPI Devel Maling List, Kristin Jacque, Zhang Rui

On Wed, Apr 26, 2017 at 3:25 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Tue, Apr 25, 2017 at 9:58 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> Reading an ACPI table through the /sys/firmware/acpi/tables interface
>> more than 65,536 times leads to the following log message:
>>
>>  ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
>>   (20170119/tbutils-423)
>>
>> ...and the table being unavailable until the next reboot. Add the
>> missing acpi_put_table() so the table ->validation_count is decremented
>> after each read.
>>
>> Cc: <stable@vger.kernel.org>
>> Cc: Zhang Rui <rui.zhang@intel.com>
>> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
>> Cc: Kristin Jacque <kristin.jacque@intel.com>
>> Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
>> Cc: Ryon Jensen <ryon.jensen@intel.com>
>> Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
>> Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> I'm going to apply this, but your Fixes tag is not correct.
>
> validation_count was added to struct acpi_table_desc by commit
>
> commit 174cc7187e6f088942c8e74daa7baff7b44b33c9
> Author: Lv Zheng <lv.zheng@intel.com>
> Date:   Wed Dec 14 15:04:25 2016 +0800
>
>     ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory()
> from Linux kernel
>
> from the 4.10 time frame, so IMO it should be
>
> Fixes: 174cc7187e6f (ACPICA: Tables: Back port
> acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
> kernel)
>

Ah, thanks for the catch, I missed that detail and was wrong to argue
it was a 7 year old bug. Apologies Lv!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
  2017-04-26 22:25 ` Rafael J. Wysocki
  2017-04-26 22:37   ` Dan Williams
@ 2017-04-27  6:49   ` Zheng, Lv
  2017-04-27 14:48     ` Dan Williams
  1 sibling, 1 reply; 8+ messages in thread
From: Zheng, Lv @ 2017-04-27  6:49 UTC (permalink / raw)
  To: Rafael J. Wysocki, Williams, Dan J
  Cc: Wysocki, Rafael J, Seetharaman, Anush, Kasanicky, Tiffany J,
	Jensen, Ryon, Linux Kernel Mailing List, Stable,
	ACPI Devel Maling List, Jacque, Kristin, Zhang, Rui

Hi, Rafael

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Rafael J.
> Wysocki
> Subject: Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
> 
> On Tue, Apr 25, 2017 at 9:58 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> > Reading an ACPI table through the /sys/firmware/acpi/tables interface
> > more than 65,536 times leads to the following log message:
> >
> >  ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
> >   (20170119/tbutils-423)
> >
> > ...and the table being unavailable until the next reboot. Add the
> > missing acpi_put_table() so the table ->validation_count is decremented
> > after each read.
> >
> > Cc: <stable@vger.kernel.org>
> > Cc: Zhang Rui <rui.zhang@intel.com>
> > Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
> > Cc: Kristin Jacque <kristin.jacque@intel.com>
> > Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
> > Cc: Ryon Jensen <ryon.jensen@intel.com>
> > Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
> > Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> I'm going to apply this, but your Fixes tag is not correct.

Even we want to skip the so-called "short period".
This fix is not 100% correct.
It's written in a very casual way.
I would think it was just a material to mention me to work on this issue.
See, there are several acpi_get_table clones invoked in sysfs.c, it just fixed one of them.
So it only fixes very limited cases.
I'll post one for fixing sysfs calls later.

There are also several cases needing urgent care, for example, FACS table used between suspend/resume process.
Which can also increase the count by 1 for each suspend/resume cycle.
I'll post an urgent fix for this along with the sysfs one.

> 
> validation_count was added to struct acpi_table_desc by commit
> 
> commit 174cc7187e6f088942c8e74daa7baff7b44b33c9
> Author: Lv Zheng <lv.zheng@intel.com>
> Date:   Wed Dec 14 15:04:25 2016 +0800
> 
>     ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory()
> from Linux kernel
> 
> from the 4.10 time frame, so IMO it should be
> 
> Fixes: 174cc7187e6f (ACPICA: Tables: Back port
> acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
> kernel)

And TBH, IMO, my posted patch is a regression fix fixing this tag.
https://patchwork.kernel.org/patch/9700175/
It fixes my damaged brain of trying to enable this mechanism.
It's actually a mistake, the planned way is not that.

I don't think fixing ~50 users could be easily achieved in just 1 cycle without triggering any other issues.
Especially among them, there are design changes.
In order to upstream this mechanism to ACPICA, I've already changed the original design.
Originally it's done in a different way:
A acpi_get_validate_table() API is prepared for those wants to pin the ACPI table in memory.
And it is not reference counting based.
Now all solution need to be re-considered due to the design change.
So I don't think it will be a short period.

On the contrary, if there is no one executing a script to read/write sysfs more than 60000 times,
the error won't flood logs.

So I was actually trying to do the right things for Linux kernel.
While just laying down a simple hammer without caring about user experiences could be destructive.
It's a disaster if some servers need to be rebooted due to this issue.

Thanks and best regards
Lv

> 
> > ---
> >
> > Changes in v2:
> > * compile fix s/table/table_header/. Sorry, I forgot to do 'stg refresh'
> >   before 'stg mail'
> >
> >  drivers/acpi/sysfs.c |    7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c
> > index cf05ae973381..5180fef9eb49 100644
> > --- a/drivers/acpi/sysfs.c
> > +++ b/drivers/acpi/sysfs.c
> > @@ -333,14 +333,17 @@ static ssize_t acpi_table_show(struct file *filp, struct kobject *kobj,
> >             container_of(bin_attr, struct acpi_table_attr, attr);
> >         struct acpi_table_header *table_header = NULL;
> >         acpi_status status;
> > +       ssize_t rc;
> >
> >         status = acpi_get_table(table_attr->name, table_attr->instance,
> >                                 &table_header);
> >         if (ACPI_FAILURE(status))
> >                 return -ENODEV;
> >
> > -       return memory_read_from_buffer(buf, count, &offset,
> > -                                      table_header, table_header->length);
> > +       rc = memory_read_from_buffer(buf, count, &offset, table_header,
> > +                       table_header->length);
> > +       acpi_put_table(table_header);
> > +       return rc;
> >  }
> >
> >  static int acpi_table_attr_init(struct kobject *tables_obj,
> >
> > --
> 
> Thanks,
> Rafael
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
  2017-04-27  6:49   ` Zheng, Lv
@ 2017-04-27 14:48     ` Dan Williams
  2017-04-28  0:31       ` Rafael J. Wysocki
  2017-04-28  1:13       ` Zheng, Lv
  0 siblings, 2 replies; 8+ messages in thread
From: Dan Williams @ 2017-04-27 14:48 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Rafael J. Wysocki, Wysocki, Rafael J, Seetharaman, Anush,
	Kasanicky, Tiffany J, Jensen, Ryon, Linux Kernel Mailing List,
	Stable, ACPI Devel Maling List, Jacque, Kristin, Zhang, Rui

On Wed, Apr 26, 2017 at 11:49 PM, Zheng, Lv <lv.zheng@intel.com> wrote:
> Hi, Rafael
>
>> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Rafael J.
>> Wysocki
>> Subject: Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
>>
>> On Tue, Apr 25, 2017 at 9:58 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> > Reading an ACPI table through the /sys/firmware/acpi/tables interface
>> > more than 65,536 times leads to the following log message:
>> >
>> >  ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
>> >   (20170119/tbutils-423)
>> >
>> > ...and the table being unavailable until the next reboot. Add the
>> > missing acpi_put_table() so the table ->validation_count is decremented
>> > after each read.
>> >
>> > Cc: <stable@vger.kernel.org>
>> > Cc: Zhang Rui <rui.zhang@intel.com>
>> > Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
>> > Cc: Kristin Jacque <kristin.jacque@intel.com>
>> > Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
>> > Cc: Ryon Jensen <ryon.jensen@intel.com>
>> > Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
>> > Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
>> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>
>> I'm going to apply this, but your Fixes tag is not correct.
>
> Even we want to skip the so-called "short period".
> This fix is not 100% correct.
> It's written in a very casual way.
> I would think it was just a material to mention me to work on this issue.
> See, there are several acpi_get_table clones invoked in sysfs.c, it just fixed one of them.
> So it only fixes very limited cases.
> I'll post one for fixing sysfs calls later.

That's the point, the other problem areas can be fixed up incrementally.

>
> There are also several cases needing urgent care, for example, FACS table used between suspend/resume process.
> Which can also increase the count by 1 for each suspend/resume cycle.
> I'll post an urgent fix for this along with the sysfs one.
>
>>
>> validation_count was added to struct acpi_table_desc by commit
>>
>> commit 174cc7187e6f088942c8e74daa7baff7b44b33c9
>> Author: Lv Zheng <lv.zheng@intel.com>
>> Date:   Wed Dec 14 15:04:25 2016 +0800
>>
>>     ACPICA: Tables: Back port acpi_get_table_with_size() and
>> early_acpi_os_unmap_memory()
>> from Linux kernel
>>
>> from the 4.10 time frame, so IMO it should be
>>
>> Fixes: 174cc7187e6f (ACPICA: Tables: Back port
>> acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
>> kernel)
>
> And TBH, IMO, my posted patch is a regression fix fixing this tag.
> https://patchwork.kernel.org/patch/9700175/
> It fixes my damaged brain of trying to enable this mechanism.
> It's actually a mistake, the planned way is not that.
>
> I don't think fixing ~50 users could be easily achieved in just 1 cycle without triggering any other issues.
> Especially among them, there are design changes.
> In order to upstream this mechanism to ACPICA, I've already changed the original design.
> Originally it's done in a different way:
> A acpi_get_validate_table() API is prepared for those wants to pin the ACPI table in memory.
> And it is not reference counting based.
> Now all solution need to be re-considered due to the design change.
> So I don't think it will be a short period.
>
> On the contrary, if there is no one executing a script to read/write sysfs more than 60000 times,
> the error won't flood logs.

That's not true. Here is the log flood is how we discovered the
problem in the first place:

https://gist.github.com/djbw/ac03ac15bde6db0301a73a33bea70521

...this was not a deliberate loop. Now, we did find that this
application was reading the tables more often than it should, and
hopefully there aren't too many more applications like this out in the
wild.

> So I was actually trying to do the right things for Linux kernel.
> While just laying down a simple hammer without caring about user experiences could be destructive.
> It's a disaster if some servers need to be rebooted due to this issue.

It's not a disaster in that most systems should not be re-reading the
tables at a high enough frequency for this to matter. The fact that
this bug has shipped in Fedora and other places without issue helps
point out that this is hard to hit in practice.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
  2017-04-27 14:48     ` Dan Williams
@ 2017-04-28  0:31       ` Rafael J. Wysocki
  2017-04-28  1:13       ` Zheng, Lv
  1 sibling, 0 replies; 8+ messages in thread
From: Rafael J. Wysocki @ 2017-04-28  0:31 UTC (permalink / raw)
  To: Dan Williams
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Seetharaman,
	Anush, Kasanicky, Tiffany J, Jensen, Ryon,
	Linux Kernel Mailing List, Stable, ACPI Devel Maling List,
	Jacque, Kristin, Zhang, Rui

On Thursday, April 27, 2017 07:48:32 AM Dan Williams wrote:
> On Wed, Apr 26, 2017 at 11:49 PM, Zheng, Lv <lv.zheng@intel.com> wrote:
> > Hi, Rafael
> >
> >> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Rafael J.
> >> Wysocki
> >> Subject: Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
> >>
> >> On Tue, Apr 25, 2017 at 9:58 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> >> > Reading an ACPI table through the /sys/firmware/acpi/tables interface
> >> > more than 65,536 times leads to the following log message:
> >> >
> >> >  ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
> >> >   (20170119/tbutils-423)
> >> >
> >> > ...and the table being unavailable until the next reboot. Add the
> >> > missing acpi_put_table() so the table ->validation_count is decremented
> >> > after each read.
> >> >
> >> > Cc: <stable@vger.kernel.org>
> >> > Cc: Zhang Rui <rui.zhang@intel.com>
> >> > Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
> >> > Cc: Kristin Jacque <kristin.jacque@intel.com>
> >> > Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
> >> > Cc: Ryon Jensen <ryon.jensen@intel.com>
> >> > Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
> >> > Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
> >> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >>
> >> I'm going to apply this, but your Fixes tag is not correct.
> >
> > Even we want to skip the so-called "short period".
> > This fix is not 100% correct.
> > It's written in a very casual way.
> > I would think it was just a material to mention me to work on this issue.
> > See, there are several acpi_get_table clones invoked in sysfs.c, it just fixed one of them.
> > So it only fixes very limited cases.
> > I'll post one for fixing sysfs calls later.
> 
> That's the point, the other problem areas can be fixed up incrementally.

Right.

> >
> > There are also several cases needing urgent care, for example, FACS table used between suspend/resume process.
> > Which can also increase the count by 1 for each suspend/resume cycle.
> > I'll post an urgent fix for this along with the sysfs one.
> >
> >>
> >> validation_count was added to struct acpi_table_desc by commit
> >>
> >> commit 174cc7187e6f088942c8e74daa7baff7b44b33c9
> >> Author: Lv Zheng <lv.zheng@intel.com>
> >> Date:   Wed Dec 14 15:04:25 2016 +0800
> >>
> >>     ACPICA: Tables: Back port acpi_get_table_with_size() and
> >> early_acpi_os_unmap_memory()
> >> from Linux kernel
> >>
> >> from the 4.10 time frame, so IMO it should be
> >>
> >> Fixes: 174cc7187e6f (ACPICA: Tables: Back port
> >> acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
> >> kernel)
> >
> > And TBH, IMO, my posted patch is a regression fix fixing this tag.
> > https://patchwork.kernel.org/patch/9700175/
> > It fixes my damaged brain of trying to enable this mechanism.
> > It's actually a mistake, the planned way is not that.
> >
> > I don't think fixing ~50 users could be easily achieved in just 1 cycle without triggering any other issues.
> > Especially among them, there are design changes.
> > In order to upstream this mechanism to ACPICA, I've already changed the original design.
> > Originally it's done in a different way:
> > A acpi_get_validate_table() API is prepared for those wants to pin the ACPI table in memory.
> > And it is not reference counting based.
> > Now all solution need to be re-considered due to the design change.
> > So I don't think it will be a short period.
> >
> > On the contrary, if there is no one executing a script to read/write sysfs more than 60000 times,
> > the error won't flood logs.
> 
> That's not true. Here is the log flood is how we discovered the
> problem in the first place:
> 
> https://gist.github.com/djbw/ac03ac15bde6db0301a73a33bea70521
> 
> ...this was not a deliberate loop. Now, we did find that this
> application was reading the tables more often than it should, and
> hopefully there aren't too many more applications like this out in the
> wild.
> 
> > So I was actually trying to do the right things for Linux kernel.
> > While just laying down a simple hammer without caring about user experiences could be destructive.
> > It's a disaster if some servers need to be rebooted due to this issue.
> 
> It's not a disaster in that most systems should not be re-reading the
> tables at a high enough frequency for this to matter. The fact that
> this bug has shipped in Fedora and other places without issue helps
> point out that this is hard to hit in practice.

Again, right.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
  2017-04-27 14:48     ` Dan Williams
  2017-04-28  0:31       ` Rafael J. Wysocki
@ 2017-04-28  1:13       ` Zheng, Lv
  1 sibling, 0 replies; 8+ messages in thread
From: Zheng, Lv @ 2017-04-28  1:13 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Rafael J. Wysocki, Wysocki, Rafael J, Seetharaman, Anush,
	Kasanicky, Tiffany J, Jensen, Ryon, Linux Kernel Mailing List,
	Stable, ACPI Devel Maling List, Jacque, Kristin, Zhang, Rui

Hi,

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Dan
> Williams
> Subject: Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
> 
> On Wed, Apr 26, 2017 at 11:49 PM, Zheng, Lv <lv.zheng@intel.com> wrote:
> > Hi, Rafael
> >
> >> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> Rafael J.
> >> Wysocki
> >> Subject: Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
> >>
> >> On Tue, Apr 25, 2017 at 9:58 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> >> > Reading an ACPI table through the /sys/firmware/acpi/tables interface
> >> > more than 65,536 times leads to the following log message:
> >> >
> >> >  ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
> >> >   (20170119/tbutils-423)
> >> >
> >> > ...and the table being unavailable until the next reboot. Add the
> >> > missing acpi_put_table() so the table ->validation_count is decremented
> >> > after each read.
> >> >
> >> > Cc: <stable@vger.kernel.org>
> >> > Cc: Zhang Rui <rui.zhang@intel.com>
> >> > Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
> >> > Cc: Kristin Jacque <kristin.jacque@intel.com>
> >> > Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
> >> > Cc: Ryon Jensen <ryon.jensen@intel.com>
> >> > Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
> >> > Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
> >> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >>
> >> I'm going to apply this, but your Fixes tag is not correct.
> >
> > Even we want to skip the so-called "short period".
> > This fix is not 100% correct.
> > It's written in a very casual way.
> > I would think it was just a material to mention me to work on this issue.
> > See, there are several acpi_get_table clones invoked in sysfs.c, it just fixed one of them.
> > So it only fixes very limited cases.
> > I'll post one for fixing sysfs calls later.
> 
> That's the point, the other problem areas can be fixed up incrementally.

No, the others acpi_get_table() invocations are called each time when a table is accessed.
Have you dumped validation count for your commit?
For SSDTs, it does nothing, the validation counts are still increased each time acpidump is executed.
While SSDT is the major type of the tables listed in the XSDT/RSDT.
So IMO, your partial fix doesn't fix anything in the end.

> 
> >
> > There are also several cases needing urgent care, for example, FACS table used between
> suspend/resume process.
> > Which can also increase the count by 1 for each suspend/resume cycle.
> > I'll post an urgent fix for this along with the sysfs one.
> >
> >>
> >> validation_count was added to struct acpi_table_desc by commit
> >>
> >> commit 174cc7187e6f088942c8e74daa7baff7b44b33c9
> >> Author: Lv Zheng <lv.zheng@intel.com>
> >> Date:   Wed Dec 14 15:04:25 2016 +0800
> >>
> >>     ACPICA: Tables: Back port acpi_get_table_with_size() and
> >> early_acpi_os_unmap_memory()
> >> from Linux kernel
> >>
> >> from the 4.10 time frame, so IMO it should be
> >>
> >> Fixes: 174cc7187e6f (ACPICA: Tables: Back port
> >> acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
> >> kernel)
> >
> > And TBH, IMO, my posted patch is a regression fix fixing this tag.
> > https://patchwork.kernel.org/patch/9700175/
> > It fixes my damaged brain of trying to enable this mechanism.
> > It's actually a mistake, the planned way is not that.
> >
> > I don't think fixing ~50 users could be easily achieved in just 1 cycle without triggering any other
> issues.
> > Especially among them, there are design changes.
> > In order to upstream this mechanism to ACPICA, I've already changed the original design.
> > Originally it's done in a different way:
> > A acpi_get_validate_table() API is prepared for those wants to pin the ACPI table in memory.
> > And it is not reference counting based.
> > Now all solution need to be re-considered due to the design change.
> > So I don't think it will be a short period.
> >
> > On the contrary, if there is no one executing a script to read/write sysfs more than 60000 times,
> > the error won't flood logs.
> 
> That's not true. Here is the log flood is how we discovered the
> problem in the first place:
> 
> https://gist.github.com/djbw/ac03ac15bde6db0301a73a33bea70521
> 
> ...this was not a deliberate loop. Now, we did find that this
> application was reading the tables more often than it should, and
> hopefully there aren't too many more applications like this out in the
> wild.

I already followed your suggestion to change the error in v2, making it muted for this period.
So you won't see such log flood.

> 
> > So I was actually trying to do the right things for Linux kernel.
> > While just laying down a simple hammer without caring about user experiences could be destructive.
> > It's a disaster if some servers need to be rebooted due to this issue.
> 
> It's not a disaster in that most systems should not be re-reading the
> tables at a high enough frequency for this to matter. The fact that
> this bug has shipped in Fedora and other places without issue helps
> point out that this is hard to hit in practice.

One case you don't know is:
In ACPICA, when AML Load opcode is invoked, acpi_get_table_by_index() can happen.
So if Load/Unload is frequently invoked on servers (I would imagine this can happen to some per-cpu dynamic loading tables),
You'll soon reach the same problem.
And then it will cause servers per-cpu ACPI related functionality to stop working.

My first version includes a new function acpi_tb_get_validated_table().
To be invoked internally by ACPICA, so tables pinned (for example, due to no full Unload support) by ACPICA, the validation count will only be increased once.

However making changes in ACPICA requires excessive communication efforts.
So we chose the current PLAN:
1. Keep ACPICA less changed and keep the old behavior.
2. Make a paired acpi_put_table() available to eliminate kernel side wrong APIs (there are early_acpi_unmap_memory() need to be cleaned up at that time).
3. Go back to fix all use cases then enable the new behavior.

For changes related to ACPICA, I don't think it will be simple and can be done in a short period.
We shall do things while in the meanwhile to ensure no user experience regressions.

See, I already said my apologies to the community for leaking the wrong line to the kernel.
Why do we still argue around a possible user experience disaster?
Why shall we make our own preference conflict the higher priority than the user experience?

Thanks and best regards
Lv

> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service
  2017-04-26 22:37   ` Dan Williams
@ 2017-05-24 19:01     ` Dan Williams
  0 siblings, 0 replies; 8+ messages in thread
From: Dan Williams @ 2017-05-24 19:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Anush Seetharaman, Tiffany Kasanicky,
	Ryon Jensen, Linux Kernel Mailing List, Stable,
	ACPI Devel Maling List, Kristin Jacque, Zhang Rui

On Wed, Apr 26, 2017 at 3:37 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Wed, Apr 26, 2017 at 3:25 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Tue, Apr 25, 2017 at 9:58 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>>> Reading an ACPI table through the /sys/firmware/acpi/tables interface
>>> more than 65,536 times leads to the following log message:
>>>
>>>  ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment
>>>   (20170119/tbutils-423)
>>>
>>> ...and the table being unavailable until the next reboot. Add the
>>> missing acpi_put_table() so the table ->validation_count is decremented
>>> after each read.
>>>
>>> Cc: <stable@vger.kernel.org>
>>> Cc: Zhang Rui <rui.zhang@intel.com>
>>> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
>>> Cc: Kristin Jacque <kristin.jacque@intel.com>
>>> Cc: Tiffany Kasanicky <tiffany.j.kasanicky@intel.com>
>>> Cc: Ryon Jensen <ryon.jensen@intel.com>
>>> Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
>>> Fixes: 1c8fce27e275 ("ACPI: introduce drivers/acpi/sysfs.c")
>>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>>
>> I'm going to apply this, but your Fixes tag is not correct.
>>
>> validation_count was added to struct acpi_table_desc by commit
>>
>> commit 174cc7187e6f088942c8e74daa7baff7b44b33c9
>> Author: Lv Zheng <lv.zheng@intel.com>
>> Date:   Wed Dec 14 15:04:25 2016 +0800
>>
>>     ACPICA: Tables: Back port acpi_get_table_with_size() and
>> early_acpi_os_unmap_memory()
>> from Linux kernel
>>
>> from the 4.10 time frame, so IMO it should be
>>
>> Fixes: 174cc7187e6f (ACPICA: Tables: Back port
>> acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
>> kernel)
>>
>
> Ah, thanks for the catch, I missed that detail and was wrong to argue
> it was a 7 year old bug. Apologies Lv!

Hi Rafael, I don't see this in latest Linus master or queued in your
bleeding-edge branch.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-05-24 19:01 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-25 19:58 [PATCH v2] acpi: fix acpi_get_table() leak / acpi-sysfs denial of service Dan Williams
2017-04-26 22:25 ` Rafael J. Wysocki
2017-04-26 22:37   ` Dan Williams
2017-05-24 19:01     ` Dan Williams
2017-04-27  6:49   ` Zheng, Lv
2017-04-27 14:48     ` Dan Williams
2017-04-28  0:31       ` Rafael J. Wysocki
2017-04-28  1:13       ` Zheng, Lv

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).