linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
@ 2017-04-26  1:49 Lv Zheng
  2017-04-26  5:00 ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Lv Zheng @ 2017-04-26  1:49 UTC (permalink / raw)
  To: Rafael J . Wysocki, Rafael J . Wysocki, Len Brown
  Cc: Lv Zheng, Lv Zheng, linux-kernel, linux-acpi, Dan Williams

On the Linux kernel side, acpi_get_table() calls are not yet fully balanced
by acpi_put_table() calls, so this is not a good time to report errors.
The strict balanced validation count check should only be enabled after
confirming that all kernel-side invocations are safe.

Thus this patch removes the fatal error but leaves the error report to
indicate the leak so that developers can notice the required engineering
change. Reported by Dan Williams, fixed by Lv Zheng.

Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
---
 drivers/acpi/acpica/tbutils.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/acpi/acpica/tbutils.c b/drivers/acpi/acpica/tbutils.c
index 5a968a7..9e7d95cf 100644
--- a/drivers/acpi/acpica/tbutils.c
+++ b/drivers/acpi/acpica/tbutils.c
@@ -422,7 +422,6 @@ acpi_tb_get_table(struct acpi_table_desc *table_desc,
 			    "Table %p, Validation count is zero after increment\n",
 			    table_desc));
 		table_desc->validation_count--;
-		return_ACPI_STATUS(AE_LIMIT);
 	}
 
 	*out_table = table_desc->pointer;
-- 
2.7.4
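
The effect of the one-line removal above can be illustrated with a small
userspace model. This is only a sketch: struct table_desc, model_get_table()
and the ST_* codes are hypothetical stand-ins rather than the ACPICA
definitions, and the 16-bit counter width is an assumption. With strict set,
the model fails the call when the counter wraps (the behavior before the
patch); with strict clear, it still reports the wrap but hands out the table
(the behavior the patch proposes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct acpi_table_desc; not the kernel type. */
struct table_desc {
	const void *pointer;		/* mapped table, opaque here */
	uint16_t validation_count;	/* counter width is an assumption */
};

enum status { ST_OK = 0, ST_LIMIT = 1 };	/* illustrative status codes */

/*
 * Take one reference on the table.
 * strict == true  models the code before the patch (wrap is fatal).
 * strict == false models the code after the patch (wrap is only reported).
 */
static enum status model_get_table(struct table_desc *desc,
				   const void **out, bool strict)
{
	assert(desc != NULL && out != NULL);

	desc->validation_count++;
	if (desc->validation_count == 0) {
		/* The counter wrapped: report it and undo the increment. */
		fprintf(stderr,
			"Table %p, Validation count is zero after increment\n",
			(void *)desc);
		desc->validation_count--;
		if (strict)
			return ST_LIMIT;
	}

	*out = desc->pointer;
	return ST_OK;
}
```

In this model a caller that never releases its references eventually hits
the wrap. In strict mode every later call then fails, while in non-strict
mode every later call succeeds but prints the message again.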


* Re: [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
  2017-04-26  1:49 [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling Lv Zheng
@ 2017-04-26  5:00 ` Dan Williams
  2017-04-26  5:15   ` Zheng, Lv
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2017-04-26  5:00 UTC (permalink / raw)
  To: Lv Zheng
  Cc: Rafael J . Wysocki, Rafael J . Wysocki, Len Brown, Lv Zheng,
	linux-kernel, Linux ACPI

On Tue, Apr 25, 2017 at 6:49 PM, Lv Zheng <lv.zheng@intel.com> wrote:
> In the Linux kernel side, acpi_get_table() hasn't been fully balanced by
> acpi_put_table() invocations. So it is not a good timing to report errors.
> The strict balanced validation count check should only be enabled after
> confirming that all kernel side invocations are safe.

We've been living with this bug for 7 years, let's just go fix all
acpi_get_table() invocations to make sure they have a corresponding
acpi_put_table().
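
A balanced call site, as asked for above, might look like the following
sketch. It is a userspace model only: demo_get_table(), demo_put_table(),
demo_read_length() and the table entries are hypothetical names invented for
illustration, not the ACPICA API. The point is that every exit path taken
after a successful get also performs the put.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table record; a stand-in, not struct acpi_table_header. */
struct demo_table {
	const char *signature;
	unsigned int length;
	unsigned int refs;	/* outstanding get calls */
};

static struct demo_table demo_tables[] = {
	{ "NFIT", 64, 0 },
	{ "EMPT", 0, 0 },	/* made-up entry used to exercise the error path */
};

static struct demo_table *demo_find(const char *signature)
{
	for (size_t i = 0; i < sizeof(demo_tables) / sizeof(demo_tables[0]); i++)
		if (strcmp(demo_tables[i].signature, signature) == 0)
			return &demo_tables[i];
	return NULL;
}

/* Take a reference; returns 0 on success, -1 if the table is absent. */
static int demo_get_table(const char *signature, struct demo_table **out)
{
	struct demo_table *t = demo_find(signature);

	if (t == NULL)
		return -1;
	t->refs++;
	*out = t;
	return 0;
}

/* Drop a reference previously taken with demo_get_table(). */
static void demo_put_table(struct demo_table *t)
{
	assert(t != NULL && t->refs > 0);
	t->refs--;
}

/* Copy out what is needed, then release on every path after the get. */
static int demo_read_length(const char *signature, unsigned int *length)
{
	struct demo_table *t;
	int ret = 0;

	if (demo_get_table(signature, &t) != 0)
		return -1;	/* nothing acquired, nothing to release */

	if (t->length == 0) {
		ret = -1;	/* error path still goes through the put */
		goto out_put;
	}
	*length = t->length;

out_put:
	demo_put_table(t);
	return ret;
}
```

The single out_put label keeps the release in one place, which makes a
call-site audit a matter of checking that no return bypasses it.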

>
> Thus this patch removes the fatal error but leaves the error report to
> indicate the leak so that developers can notice the required engineering
> change. Reported by Dan Williams, fixed by Lv Zheng.
>
> Reported-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Lv Zheng <lv.zheng@intel.com>
> ---
>  drivers/acpi/acpica/tbutils.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/acpi/acpica/tbutils.c b/drivers/acpi/acpica/tbutils.c
> index 5a968a7..9e7d95cf 100644
> --- a/drivers/acpi/acpica/tbutils.c
> +++ b/drivers/acpi/acpica/tbutils.c
> @@ -422,7 +422,6 @@ acpi_tb_get_table(struct acpi_table_desc *table_desc,
>                             "Table %p, Validation count is zero after increment\n",
>                             table_desc));
>                 table_desc->validation_count--;
> -               return_ACPI_STATUS(AE_LIMIT);

If you want to leave the error report turn it into a WARN_ON_ONCE() so
it doesn't keep triggering, but I'd rather we just focus on the
missing acpi_put_table() calls.
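
The warn-once idea can be sketched in userspace as below. This is not the
kernel's WARN_ON_ONCE() macro: demo_warn_once() and demo_reports are
hypothetical names, and unlike the kernel macro, which latches per call
site, this model uses one latch for all callers. It only shows why a
one-shot report does not flood the log the way an unconditional error
message does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static unsigned int demo_reports;	/* number of messages actually emitted */

/*
 * Report a condition the first time it is seen and stay silent afterwards.
 * Returns the condition so callers can still branch on it.
 */
static bool demo_warn_once(bool condition, const char *msg)
{
	static bool warned;	/* single latch shared by all callers */

	assert(msg != NULL);

	if (condition && !warned) {
		warned = true;
		demo_reports++;
		fprintf(stderr, "warning: %s\n", msg);
	}
	return condition;
}
```

Calling demo_warn_once(true, ...) in a loop prints the message once and
leaves demo_reports at 1, while the return value remains true on every call.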


* RE: [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
  2017-04-26  5:00 ` Dan Williams
@ 2017-04-26  5:15   ` Zheng, Lv
  2017-04-26 14:13     ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Zheng, Lv @ 2017-04-26  5:15 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Wysocki, Rafael J, Rafael J . Wysocki, Brown, Len, Lv Zheng,
	linux-kernel, Linux ACPI

Hi,

> From: Dan Williams [mailto:dan.j.williams@intel.com]
> Subject: Re: [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
> 
> On Tue, Apr 25, 2017 at 6:49 PM, Lv Zheng <lv.zheng@intel.com> wrote:
> > In the Linux kernel side, acpi_get_table() hasn't been fully balanced by
> > acpi_put_table() invocations. So it is not a good timing to report errors.
> > The strict balanced validation count check should only be enabled after
> > confirming that all kernel side invocations are safe.
> 
> We've been living with this bug for 7 years, let's just go fix all
> acpi_get_table() invocations to make sure they have a corresponding
> acpi_put_table().

We knew that; you should have already seen a series from me, internally
or externally, that achieves this.
It was done several years ago, but it takes a long time to get the
ACPICA part upstreamed.

Now my plan is:
1. introduce the APIs but allow old usage models in order not to
   change old ACPICA behavior and its users.
2. fix all users
3. disallow old usage models.
It was just my mistake to leak the final-stage approach into the ACPICA
upstream from my local repo.
Now we can try to jump to the final step, but as far as I know,
not only Linux but ACPICA itself also contains several broken cases.

The bottom line for the Linux kernel is that we shouldn't break any
running system.
So IMO, we will need this commit during this special period.

I didn't say the final step is wrong or is not required.
We can do both in parallel.

So could you please help to confirm whether it's working?
And I would like to suggest that Linux take this first-step fix along
with the other final-step fixes during this period.

Thanks and best regards
Lv



* Re: [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
  2017-04-26  5:15   ` Zheng, Lv
@ 2017-04-26 14:13     ` Dan Williams
  2017-04-26 15:34       ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2017-04-26 14:13 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Wysocki, Rafael J, Rafael J . Wysocki, Brown, Len, Lv Zheng,
	linux-kernel, Linux ACPI

On Tue, Apr 25, 2017 at 10:15 PM, Zheng, Lv <lv.zheng@intel.com> wrote:
> Hi,
>
>> From: Dan Williams [mailto:dan.j.williams@intel.com]
>> Subject: Re: [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
>>
>> On Tue, Apr 25, 2017 at 6:49 PM, Lv Zheng <lv.zheng@intel.com> wrote:
>> > In the Linux kernel side, acpi_get_table() hasn't been fully balanced by
>> > acpi_put_table() invocations. So it is not a good timing to report errors.
>> > The strict balanced validation count check should only be enabled after
>> > confirming that all kernel side invocations are safe.
>>
>> We've been living with this bug for 7 years, let's just go fix all
>> acpi_get_table() invocations to make sure they have a corresponding
>> acpi_put_table().
>
> We knew that, you should have already seen a series internally or
> externally from me achieving this.
> It's done several years ago. But it takes long time to make the
> ACPICA part upstreamed.
>
> Now my plan is:
> 1. introduce the APIs but allow old usage models in order not to
>    change old ACPICA behavior and its users.
> 2. fix all users
> 3. disallow old usage models.
> It's just my mistake to leak the final stage approach to the ACPICA
> upstream from my local repo.
> Now we can try to jump to the final step, but as far as I know,
> not only Linux, ACPICA itself also contains several broken cases.
>
> Bottom line of Linux kernel is we shouldn't break any running system.
> So IMO, we will need this commit during this special period.
>
> I didn't say the final step is wrong or is not required.
> We can do both in parallel.
>
> So could you please help to confirm if it's working.
> And I would like to suggest linux to take this first step fix along
> with other final step fixes during this period.

I just think "this period" is very short and we can skip the band-aid
and go straight to auditing the 48 call sites of acpi_get_table.


* Re: [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
  2017-04-26 14:13     ` Dan Williams
@ 2017-04-26 15:34       ` Dan Williams
  0 siblings, 0 replies; 5+ messages in thread
From: Dan Williams @ 2017-04-26 15:34 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Wysocki, Rafael J, Rafael J . Wysocki, Brown, Len, Lv Zheng,
	linux-kernel, Linux ACPI

On Wed, Apr 26, 2017 at 7:13 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Apr 25, 2017 at 10:15 PM, Zheng, Lv <lv.zheng@intel.com> wrote:
>> Hi,
>>
>>> From: Dan Williams [mailto:dan.j.williams@intel.com]
>>> Subject: Re: [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling
>>>
>>> On Tue, Apr 25, 2017 at 6:49 PM, Lv Zheng <lv.zheng@intel.com> wrote:
>>> > In the Linux kernel side, acpi_get_table() hasn't been fully balanced by
>>> > acpi_put_table() invocations. So it is not a good timing to report errors.
>>> > The strict balanced validation count check should only be enabled after
>>> > confirming that all kernel side invocations are safe.
>>>
>>> We've been living with this bug for 7 years, let's just go fix all
>>> acpi_get_table() invocations to make sure they have a corresponding
>>> acpi_put_table().
>>
>> We knew that, you should have already seen a series internally or
>> externally from me achieving this.
>> It's done several years ago. But it takes long time to make the
>> ACPICA part upstreamed.
>>
>> Now my plan is:
>> 1. introduce the APIs but allow old usage models in order not to
>>    change old ACPICA behavior and its users.
>> 2. fix all users
>> 3. disallow old usage models.
>> It's just my mistake to leak the final stage approach to the ACPICA
>> upstream from my local repo.
>> Now we can try to jump to the final step, but as far as I know,
>> not only Linux, ACPICA itself also contains several broken cases.
>>
>> Bottom line of Linux kernel is we shouldn't break any running system.
>> So IMO, we will need this commit during this special period.
>>
>> I didn't say the final step is wrong or is not required.
>> We can do both in parallel.
>>
>> So could you please help to confirm if it's working.
>> And I would like to suggest linux to take this first step fix along
>> with other final step fixes during this period.
>
> I just think "this period" is very short and we can skip the band-aid
> and go straight to auditing the 48 call sites of acpi_get_table.

Moreover, I don't think this workaround is a workable approach because
it leaves the ACPI_ERROR() in place to continue to spam the logs.


end of thread, other threads:[~2017-04-26 15:34 UTC | newest]

Thread overview: 5+ messages
2017-04-26  1:49 [RFC PATCH] ACPICA: Tables: Fix regression introduced by a too early mechanism enabling Lv Zheng
2017-04-26  5:00 ` Dan Williams
2017-04-26  5:15   ` Zheng, Lv
2017-04-26 14:13     ` Dan Williams
2017-04-26 15:34       ` Dan Williams
