Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
       [not found] <20170107204227.bwdb5yzrjpiggkmo@pd.tnic>
@ 2017-01-07 23:30 ` Rafael J. Wysocki
  2017-01-08  0:07   ` Borislav Petkov
  0 siblings, 1 reply; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-07 23:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Lv Zheng, Bob Moore, Jörg Rödel, lkml, Linux ACPI

On 1/7/2017 9:42 PM, Borislav Petkov wrote:
> Hi,
>
> I'm bisecting a boot freeze with 4.10-rc2 on a laptop and the commit in
> $Subject is introducing a breakage, see attached splat. Unfortunately,
> it is not complete as I don't have any other means of logging dmesg on a
> laptop.
>
> A temporary workaround is to boot with "intremap=off".
>
> Unfortunately, I cannot revert
>
>    174cc7187e6f ("ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel")
>
> as it doesn't revert cleanly anymore. From looking at the callstack,
> though, it looks like AMD IOMMU is calling acpi_put_table() and patch in
> $Subject touches it so I'm guessing the AMD IOMMU needs to be updated to
> the new way of parsing the IRQ remapping tables or whatever is going on
> there - I'm just guessing.

Hi,

Please check if this helps:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=696c7f8e0373026e8bfb29b2d9ff2d0a92059d4d

Thanks,
Rafael


> Thanks.
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-07 23:30 ` 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel Rafael J. Wysocki
@ 2017-01-08  0:07   ` Borislav Petkov
  2017-01-08  0:22     ` Rafael J. Wysocki
  0 siblings, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-08  0:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lv Zheng, Bob Moore, Jörg Rödel, lkml, Linux ACPI

On Sun, Jan 08, 2017 at 12:30:27AM +0100, Rafael J. Wysocki wrote:
> Please check if this helps:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=696c7f8e0373026e8bfb29b2d9ff2d0a92059d4d

Unfortunately no, still same early freeze. :-\

The splat happens when booting 6b11d1d67713 directly, 4.10-rc2 doesn't
splat but freezes simply very early during boot.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  0:07   ` Borislav Petkov
@ 2017-01-08  0:22     ` Rafael J. Wysocki
  2017-01-08  0:37       ` Borislav Petkov
  0 siblings, 1 reply; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-08  0:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael J. Wysocki, Lv Zheng, Bob Moore, Jörg Rödel,
	lkml, Linux ACPI

On Sun, Jan 8, 2017 at 1:07 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Jan 08, 2017 at 12:30:27AM +0100, Rafael J. Wysocki wrote:
>> Please check if this helps:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=696c7f8e0373026e8bfb29b2d9ff2d0a92059d4d
>
> Unfortunately no, still same early freeze. :-\
>
> The splat happens when booting 6b11d1d67713 directly, 4.10-rc2 doesn't
> splat but freezes simply very early during boot.

Is an IVRS table actually present on this machine?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  0:22     ` Rafael J. Wysocki
@ 2017-01-08  0:37       ` Borislav Petkov
  2017-01-08  0:52         ` Rafael J. Wysocki
  0 siblings, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-08  0:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Lv Zheng, Bob Moore, Jörg Rödel,
	lkml, Linux ACPI

On Sun, Jan 08, 2017 at 01:22:55AM +0100, Rafael J. Wysocki wrote:
> Is an IVRS table actually present on this machine?

Like this?

[    0.000000] ACPI: IVRS 0x000000009CFD6000 0000D0 (v02 AMD    AGESA    00000001 AMD  00000000)

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  0:37       ` Borislav Petkov
@ 2017-01-08  0:52         ` Rafael J. Wysocki
  2017-01-08  1:01           ` Borislav Petkov
  0 siblings, 1 reply; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-08  0:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Lv Zheng, Bob Moore,
	Jörg Rödel, lkml, Linux ACPI

[-- Attachment #1: Type: text/plain, Size: 524 bytes --]

On Sun, Jan 8, 2017 at 1:37 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Jan 08, 2017 at 01:22:55AM +0100, Rafael J. Wysocki wrote:
>> Is an IVRS table actually present on this machine?
>
> Like this?
>
> [    0.000000] ACPI: IVRS 0x000000009CFD6000 0000D0 (v02 AMD    AGESA    00000001 AMD  00000000)

Yup.

So we get the table, but apparently we crash when we attempt to put it.

Let's try to check the obvious just to rule it out (see attached), but
honestly I'm not sure what's going on in there.

Thanks,
Rafael

[-- Attachment #2: acpi-table-put-debug.patch --]
[-- Type: text/x-patch, Size: 790 bytes --]


---
 drivers/iommu/amd_iommu_init.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/iommu/amd_iommu_init.c
===================================================================
--- linux-pm.orig/drivers/iommu/amd_iommu_init.c
+++ linux-pm/drivers/iommu/amd_iommu_init.c
@@ -2337,8 +2337,10 @@ static int __init early_amd_iommu_init(v
 
 out:
 	/* Don't leak any ACPI memory */
-	acpi_put_table(ivrs_base);
-	ivrs_base = NULL;
+	if (ivrs_base) {
+		acpi_put_table(ivrs_base);
+		ivrs_base = NULL;
+	}
 
 	return ret;
 }
@@ -2372,7 +2374,8 @@ static bool detect_ivrs(void)
 		return false;
 	}
 
-	acpi_put_table(ivrs_base);
+	if (ivrs_base)
+		acpi_put_table(ivrs_base);
 
 	/* Make sure ACS will be enabled during PCI probe */
 	pci_request_acs();

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  0:52         ` Rafael J. Wysocki
@ 2017-01-08  1:01           ` Borislav Petkov
  2017-01-08  1:45             ` Rafael J. Wysocki
  0 siblings, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-08  1:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Lv Zheng, Bob Moore, Jörg Rödel,
	lkml, Linux ACPI

On Sun, Jan 08, 2017 at 01:52:50AM +0100, Rafael J. Wysocki wrote:
> So we get the table, but apparently we crash when we attempt to put it.

Right, except on 4.10-rc2 we don't crash but we freeze early. These are
the last lines:

...
[    0.004778] mce: CPU supports 7 MCE banks
[    0.004861] LVT offset 1 assigned for vector 0xf9
[    0.004945] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512
[    0.005025] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512, 1GB 0
[    0.005165] Freeing SMP alternatives memory: 24K
[    0.211154] ftrace: allocating 25022 entries in 98 pages
[    0.219614] smpboot: Max logical packages: 2
<EOF>

> Let's try to check the obvious just to rule it out (see attached), but
> honestly I'm not sure what's going on in there.

No change, same freeze.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  1:01           ` Borislav Petkov
@ 2017-01-08  1:45             ` Rafael J. Wysocki
  2017-01-08  2:20               ` Rafael J. Wysocki
  0 siblings, 1 reply; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-08  1:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Lv Zheng, Bob Moore,
	Jörg Rödel, lkml, Linux ACPI

On Sun, Jan 8, 2017 at 2:01 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Jan 08, 2017 at 01:52:50AM +0100, Rafael J. Wysocki wrote:
>> So we get the table, but apparently we crash when we attempt to put it.
>
> Right, except on 4.10-rc2 we don't crash but we freeze early. These are
> the last lines:
>
> ...
> [    0.004778] mce: CPU supports 7 MCE banks
> [    0.004861] LVT offset 1 assigned for vector 0xf9
> [    0.004945] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512
> [    0.005025] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512, 1GB 0
> [    0.005165] Freeing SMP alternatives memory: 24K
> [    0.211154] ftrace: allocating 25022 entries in 98 pages
> [    0.219614] smpboot: Max logical packages: 2
> <EOF>
>
>> Let's try to check the obvious just to rule it out (see attached), but
>> honestly I'm not sure what's going on in there.
>
> No change, same freeze.

I was afraid that that would be the case.

Can you try to comment out the acpi_put_table() in
early_amd_iommu_init() and see if that makes any difference?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  1:45             ` Rafael J. Wysocki
@ 2017-01-08  2:20               ` Rafael J. Wysocki
  2017-01-08 13:03                 ` Borislav Petkov
  2017-01-09 10:52                 ` Jörg Rödel
  0 siblings, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-08  2:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Borislav Petkov, Rafael J. Wysocki, Lv Zheng, Bob Moore,
	Jörg Rödel, lkml, Linux ACPI

[-- Attachment #1: Type: text/plain, Size: 1332 bytes --]

On Sun, Jan 8, 2017 at 2:45 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Sun, Jan 8, 2017 at 2:01 AM, Borislav Petkov <bp@alien8.de> wrote:
>> On Sun, Jan 08, 2017 at 01:52:50AM +0100, Rafael J. Wysocki wrote:
>>> So we get the table, but apparently we crash when we attempt to put it.
>>
>> Right, except on 4.10-rc2 we don't crash but we freeze early. These are
>> the last lines:
>>
>> ...
>> [    0.004778] mce: CPU supports 7 MCE banks
>> [    0.004861] LVT offset 1 assigned for vector 0xf9
>> [    0.004945] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512
>> [    0.005025] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512, 1GB 0
>> [    0.005165] Freeing SMP alternatives memory: 24K
>> [    0.211154] ftrace: allocating 25022 entries in 98 pages
>> [    0.219614] smpboot: Max logical packages: 2
>> <EOF>
>>
>>> Let's try to check the obvious just to rule it out (see attached), but
>>> honestly I'm not sure what's going on in there.
>>
>> No change, same freeze.
>
> I was afraid that that would be the case.
>
> Can you try to comment out the acpi_put_table() in
> early_amd_iommu_init() and see if that makes any difference?

Well, there is a bug in early_amd_iommu_init() that may matter in
theory if the table checksum is incorrect.

Please see if the attached makes any difference.

Thanks,
Rafael

[-- Attachment #2: amd-iommu-fix.patch --]
[-- Type: text/x-patch, Size: 595 bytes --]

---
 drivers/iommu/amd_iommu_init.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pm/drivers/iommu/amd_iommu_init.c
===================================================================
--- linux-pm.orig/drivers/iommu/amd_iommu_init.c
+++ linux-pm/drivers/iommu/amd_iommu_init.c
@@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
 	 */
 	ret = check_ivrs_checksum(ivrs_base);
 	if (ret)
-		return ret;
+		goto out;
 
 	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
 	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  2:20               ` Rafael J. Wysocki
@ 2017-01-08 13:03                 ` Borislav Petkov
  2017-01-09  1:58                   ` Zheng, Lv
  2017-01-09 10:52                 ` Jörg Rödel
  1 sibling, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-08 13:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Lv Zheng, Bob Moore, Jörg Rödel,
	lkml, Linux ACPI

On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
>  drivers/iommu/amd_iommu_init.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/iommu/amd_iommu_init.c
> ===================================================================
> --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> +++ linux-pm/drivers/iommu/amd_iommu_init.c
> @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
>  	 */
>  	ret = check_ivrs_checksum(ivrs_base);
>  	if (ret)
> -		return ret;
> +		goto out;
>  
>  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
>  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);

Good catch, this one needs to be applied regardless.

However, it doesn't fix my issue though.

But I think I have it - I went and applied the well-proven debugging
technique of sprinkling printks around. Here's what I'm seeing:

early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited	<-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y

Now that function goes and sends IPIs, i.e., schedule_work()
but this is too early - we haven't even done workqueue_init().
Actually, from looking at the callstack, we do
kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
comes next.

And this makes sense because the splat rIP points to __queue_work() but
we haven't done that yet.

So that acpi_put_table() is happening too early. Looks like AMD IOMMU
should not put the table but WTH do I know?!

In any case, commenting out:

        acpi_put_table(ivrs_base);
        ivrs_base = NULL;

and the end of early_amd_iommu_init() makes the box boot again.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08 13:03                 ` Borislav Petkov
@ 2017-01-09  1:58                   ` Zheng, Lv
  2017-01-09  2:36                     ` Zheng, Lv
  2017-01-09  5:21                     ` Zheng, Lv
  0 siblings, 2 replies; 39+ messages in thread
From: Zheng, Lv @ 2017-01-09  1:58 UTC (permalink / raw)
  To: Borislav Petkov, Rafael J. Wysocki
  Cc: Wysocki, Rafael J, Moore, Robert, J?rg R?del, lkml, Linux ACPI

Hi,

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Borislav
> Petkov
> Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
> 
> On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> >  drivers/iommu/amd_iommu_init.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > ===================================================================
> > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> >  	 */
> >  	ret = check_ivrs_checksum(ivrs_base);
> >  	if (ret)
> > -		return ret;
> > +		goto out;
> >
> >  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> >  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> 
> Good catch, this one needs to be applied regardless.
> 
> However, it doesn't fix my issue though.
> 
> But I think I have it - I went and applied the well-proven debugging
> technique of sprinkling printks around. Here's what I'm seeing:
> 
> early_amd_iommu_init()
> |-> acpi_put_table(ivrs_base);
> |-> acpi_tb_put_table(table_desc);
> |-> acpi_tb_invalidate_table(table_desc);
> |-> acpi_tb_release_table(...)
> |-> acpi_os_unmap_memory
> |-> acpi_os_unmap_iomem
> |-> acpi_os_map_cleanup
> |-> synchronize_rcu_expedited	<-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> 
> Now that function goes and sends IPIs, i.e., schedule_work()
> but this is too early - we haven't even done workqueue_init().
> Actually, from looking at the callstack, we do
> kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> comes next.
> 
> And this makes sense because the splat rIP points to __queue_work() but
> we haven't done that yet.
> 
> So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> should not put the table but WTH do I know?!
> 
> In any case, commenting out:
> 
>         acpi_put_table(ivrs_base);
>         ivrs_base = NULL;
> 
> and the end of early_amd_iommu_init() makes the box boot again.

So please help to comment out these 2 lines (with descriptions and do not delete them).
Until acpi_os_unmap_memory() is able to handle such an early case.

Thanks and best regards
Lv

> 
> --
> Regards/Gruss,
>     Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09  1:58                   ` Zheng, Lv
@ 2017-01-09  2:36                     ` Zheng, Lv
  2017-01-09  9:33                       ` Borislav Petkov
  2017-01-09  5:21                     ` Zheng, Lv
  1 sibling, 1 reply; 39+ messages in thread
From: Zheng, Lv @ 2017-01-09  2:36 UTC (permalink / raw)
  To: Zheng, Lv, Borislav Petkov, Rafael J. Wysocki
  Cc: Wysocki, Rafael J, Moore, Robert, J?rg R?del, lkml, Linux ACPI

Hi,

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> Lv
> Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
> 
> Hi,
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> Borislav
> > Petkov
> > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > early_acpi_os_unmap_memory() from Linux kernel
> >
> > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> > >  drivers/iommu/amd_iommu_init.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> > >  	 */
> > >  	ret = check_ivrs_checksum(ivrs_base);
> > >  	if (ret)
> > > -		return ret;
> > > +		goto out;
> > >
> > >  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> > >  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> >
> > Good catch, this one needs to be applied regardless.
> >
> > However, it doesn't fix my issue though.
> >
> > But I think I have it - I went and applied the well-proven debugging
> > technique of sprinkling printks around. Here's what I'm seeing:
> >
> > early_amd_iommu_init()
> > |-> acpi_put_table(ivrs_base);
> > |-> acpi_tb_put_table(table_desc);
> > |-> acpi_tb_invalidate_table(table_desc);
> > |-> acpi_tb_release_table(...)
> > |-> acpi_os_unmap_memory
> > |-> acpi_os_unmap_iomem
> > |-> acpi_os_map_cleanup
> > |-> synchronize_rcu_expedited	<-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> >
> > Now that function goes and sends IPIs, i.e., schedule_work()
> > but this is too early - we haven't even done workqueue_init().
> > Actually, from looking at the callstack, we do
> > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> > comes next.
> >
> > And this makes sense because the splat rIP points to __queue_work() but
> > we haven't done that yet.
> >
> > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> > should not put the table but WTH do I know?!
> >
> > In any case, commenting out:
> >
> >         acpi_put_table(ivrs_base);
> >         ivrs_base = NULL;
> >
> > and the end of early_amd_iommu_init() makes the box boot again.
> 
> So please help to comment out these 2 lines (with descriptions and do not delete them).
> Until acpi_os_unmap_memory() is able to handle such an early case.

IMO, synchronize_rcu_expedited() should be improved:
If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.

Thanks and best regards
Lv

> 
> Thanks and best regards
> Lv
> 
> >
> > --
> > Regards/Gruss,
> >     Boris.
> >
> > Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09  1:58                   ` Zheng, Lv
  2017-01-09  2:36                     ` Zheng, Lv
@ 2017-01-09  5:21                     ` Zheng, Lv
  1 sibling, 0 replies; 39+ messages in thread
From: Zheng, Lv @ 2017-01-09  5:21 UTC (permalink / raw)
  To: Zheng, Lv, Borislav Petkov, Rafael J. Wysocki
  Cc: Wysocki, Rafael J, Moore, Robert, J?rg R?del, lkml, Linux ACPI

Hi, Borislav

> From: Zheng, Lv
> Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
> 
> Hi,
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> > Lv
> > Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > early_acpi_os_unmap_memory() from Linux kernel
> >
> > Hi,
> >
> > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> > Borislav
> > > Petkov
> > > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > > early_acpi_os_unmap_memory() from Linux kernel
> > >
> > > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> > > >  drivers/iommu/amd_iommu_init.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> > > >  	 */
> > > >  	ret = check_ivrs_checksum(ivrs_base);
> > > >  	if (ret)
> > > > -		return ret;
> > > > +		goto out;
> > > >
> > > >  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> > > >  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> > >
> > > Good catch, this one needs to be applied regardless.
> > >
> > > However, it doesn't fix my issue though.
> > >
> > > But I think I have it - I went and applied the well-proven debugging
> > > technique of sprinkling printks around. Here's what I'm seeing:
> > >
> > > early_amd_iommu_init()
> > > |-> acpi_put_table(ivrs_base);
> > > |-> acpi_tb_put_table(table_desc);
> > > |-> acpi_tb_invalidate_table(table_desc);
> > > |-> acpi_tb_release_table(...)
> > > |-> acpi_os_unmap_memory
> > > |-> acpi_os_unmap_iomem
> > > |-> acpi_os_map_cleanup
> > > |-> synchronize_rcu_expedited	<-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> > >
> > > Now that function goes and sends IPIs, i.e., schedule_work()
> > > but this is too early - we haven't even done workqueue_init().
> > > Actually, from looking at the callstack, we do
> > > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> > > comes next.
> > >
> > > And this makes sense because the splat rIP points to __queue_work() but
> > > we haven't done that yet.
> > >
> > > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> > > should not put the table but WTH do I know?!
> > >
> > > In any case, commenting out:
> > >
> > >         acpi_put_table(ivrs_base);
> > >         ivrs_base = NULL;
> > >
> > > and the end of early_amd_iommu_init() makes the box boot again.
> >
> > So please help to comment out these 2 lines (with descriptions and do not delete them).
> > Until acpi_os_unmap_memory() is able to handle such an early case.
> 
> IMO, synchronize_rcu_expedited() should be improved:
> If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.


Hmm, I think this problem has its history.
In APEI code, NMI handlers cannot utilize spinlock.
So the developers there used RCU to synchronize NMI handlers before the register region is unmapped.
At that time, there might not be pre-map/post-unmap code prepared in the APEI drivers.
So the APEI developers relied on map/unmap logics implemented in the ACPICA register read/write APIs where the OSL map/unmap is invoked.

That's why the RCU code is in acpi_os_xxx().
If:
1. there is map/unmap/read/write operations available for APEI developers to invoke;
2. RCU synchronization is invoked before invoking the last unmap operation;
3. map/unmap/read/write order is strictly ensured inside of the APEI drivers.
Then we can remove RCU stuffs from acpi_os_xxx.
And the root cause of this issue can be fixed.

I'm not sure if this approach is possible, but can give it a try.
Before that I think all such acpi_put_table() should just be commented out.

Thanks and best regards
Lv

> 
> Thanks and best regards
> Lv
> 
> >
> > Thanks and best regards
> > Lv
> >
> > >
> > > --
> > > Regards/Gruss,
> > >     Boris.
> > >
> > > Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09  2:36                     ` Zheng, Lv
@ 2017-01-09  9:33                       ` Borislav Petkov
  2017-01-09 22:18                         ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-09  9:33 UTC (permalink / raw)
  To: Zheng, Lv, Paul E. McKenney
  Cc: Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert, J?rg R?del,
	lkml, Linux ACPI

+ Paul for comment.

Leaving in the rest for him.

On Mon, Jan 09, 2017 at 02:36:33AM +0000, Zheng, Lv wrote:
> Hi,
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> > Lv
> > Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > early_acpi_os_unmap_memory() from Linux kernel
> > 
> > Hi,
> > 
> > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> > Borislav
> > > Petkov
> > > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > > early_acpi_os_unmap_memory() from Linux kernel
> > >
> > > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> > > >  drivers/iommu/amd_iommu_init.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> > > >  	 */
> > > >  	ret = check_ivrs_checksum(ivrs_base);
> > > >  	if (ret)
> > > > -		return ret;
> > > > +		goto out;
> > > >
> > > >  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> > > >  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> > >
> > > Good catch, this one needs to be applied regardless.
> > >
> > > However, it doesn't fix my issue though.
> > >
> > > But I think I have it - I went and applied the well-proven debugging
> > > technique of sprinkling printks around. Here's what I'm seeing:
> > >
> > > early_amd_iommu_init()
> > > |-> acpi_put_table(ivrs_base);
> > > |-> acpi_tb_put_table(table_desc);
> > > |-> acpi_tb_invalidate_table(table_desc);
> > > |-> acpi_tb_release_table(...)
> > > |-> acpi_os_unmap_memory
> > > |-> acpi_os_unmap_iomem
> > > |-> acpi_os_map_cleanup
> > > |-> synchronize_rcu_expedited	<-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> > >
> > > Now that function goes and sends IPIs, i.e., schedule_work()
> > > but this is too early - we haven't even done workqueue_init().
> > > Actually, from looking at the callstack, we do
> > > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> > > comes next.
> > >
> > > And this makes sense because the splat rIP points to __queue_work() but
> > > we haven't done that yet.
> > >
> > > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> > > should not put the table but WTH do I know?!
> > >
> > > In any case, commenting out:
> > >
> > >         acpi_put_table(ivrs_base);
> > >         ivrs_base = NULL;
> > >
> > > and the end of early_amd_iommu_init() makes the box boot again.
> > 
> > So please help to comment out these 2 lines (with descriptions and do not delete them).
> > Until acpi_os_unmap_memory() is able to handle such an early case.
> 
> IMO, synchronize_rcu_expedited() should be improved:
> If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-08  2:20               ` Rafael J. Wysocki
  2017-01-08 13:03                 ` Borislav Petkov
@ 2017-01-09 10:52                 ` Jörg Rödel
  2017-01-09 22:41                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 39+ messages in thread
From: Jörg Rödel @ 2017-01-09 10:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Borislav Petkov, Rafael J. Wysocki, Lv Zheng, Bob Moore, lkml,
	Linux ACPI

Hi Rafael,

On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> ---
>  drivers/iommu/amd_iommu_init.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/iommu/amd_iommu_init.c
> ===================================================================
> --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> +++ linux-pm/drivers/iommu/amd_iommu_init.c
> @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
>  	 */
>  	ret = check_ivrs_checksum(ivrs_base);
>  	if (ret)
> -		return ret;
> +		goto out;
>  
>  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
>  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);

Yeah, good catch. Can you send me a patch for this and I am going to
send the fix upstream asap.


Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09  9:33                       ` Borislav Petkov
@ 2017-01-09 22:18                         ` Paul E. McKenney
  2017-01-09 22:25                           ` Rafael J. Wysocki
  2017-01-09 23:15                           ` Borislav Petkov
  0 siblings, 2 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-09 22:18 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Mon, Jan 09, 2017 at 10:33:29AM +0100, Borislav Petkov wrote:
> + Paul for comment.
> 
> Leaving in the rest for him.
> 
> On Mon, Jan 09, 2017 at 02:36:33AM +0000, Zheng, Lv wrote:
> > Hi,
> > 
> > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> > > Lv
> > > Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > > early_acpi_os_unmap_memory() from Linux kernel
> > > 
> > > Hi,
> > > 
> > > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> > > Borislav
> > > > Petkov
> > > > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > > > early_acpi_os_unmap_memory() from Linux kernel
> > > >
> > > > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> > > > >  drivers/iommu/amd_iommu_init.c |    2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > > > > ===================================================================
> > > > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > > > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > > > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> > > > >  	 */
> > > > >  	ret = check_ivrs_checksum(ivrs_base);
> > > > >  	if (ret)
> > > > > -		return ret;
> > > > > +		goto out;
> > > > >
> > > > >  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> > > > >  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> > > >
> > > > Good catch, this one needs to be applied regardless.
> > > >
> > > > However, it doesn't fix my issue though.
> > > >
> > > > But I think I have it - I went and applied the well-proven debugging
> > > > technique of sprinkling printks around. Here's what I'm seeing:
> > > >
> > > > early_amd_iommu_init()
> > > > |-> acpi_put_table(ivrs_base);
> > > > |-> acpi_tb_put_table(table_desc);
> > > > |-> acpi_tb_invalidate_table(table_desc);
> > > > |-> acpi_tb_release_table(...)
> > > > |-> acpi_os_unmap_memory
> > > > |-> acpi_os_unmap_iomem
> > > > |-> acpi_os_map_cleanup
> > > > |-> synchronize_rcu_expedited	<-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> > > >
> > > > Now that function goes and sends IPIs, i.e., schedule_work()
> > > > but this is too early - we haven't even done workqueue_init().
> > > > Actually, from looking at the callstack, we do
> > > > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> > > > comes next.
> > > >
> > > > And this makes sense because the splat rIP points to __queue_work() but
> > > > we haven't done that yet.
> > > >
> > > > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> > > > should not put the table but WTH do I know?!
> > > >
> > > > In any case, commenting out:
> > > >
> > > >         acpi_put_table(ivrs_base);
> > > >         ivrs_base = NULL;
> > > >
> > > > and the end of early_amd_iommu_init() makes the box boot again.
> > > 
> > > So please help to comment out these 2 lines (with descriptions and do not delete them).
> > > Until acpi_os_unmap_memory() is able to handle such an early case.
> > 
> > IMO, synchronize_rcu_expedited() should be improved:
> > If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.

Indeed it should!

Does the (untested) patch below fix things for you?

If so, does this need to go into 4.10?  (My default workflow would get
it into 4.11 or 4.12, so please speak up if you need it.)

							Thanx, Paul

------------------------------------------------------------------------

commit 1b7feb708241f1662cfd529118468c9f9c0b1449
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Jan 9 14:10:50 2017 -0800

    rcu: Make synchronize_rcu_expedited() safe for early boot
    
    The synchronize_rcu_expedited() function does not check for early-boot
    use, which can result in failures if it is invoked before the scheduler
    has started.  Given that the rcupdate.rcu_expedited kernel parameter
    causes all calls to synchronize_rcu() to be directed instead to
    synchronize_rcu_expedited(), a usage restriction does not make sense.
    
    This commit therefore adds a rcu_scheduler_active check to
    synchronize_rcu_expedited(), so that it is a no-op before the scheduler
    starts.  This behavior is correct because there is only a single CPU
    running during that time.
    
    Reported-by: Lv Zheng <lv.zheng@intel.com>
    Reported-by: Borislav Petkov <bp@alien8.de>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index dfc3ba5a429e..a6c3d86480de 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
 {
 	struct rcu_state *rsp = rcu_state_p;
 
+	if (!rcu_scheduler_active)
+		return;
 	_synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 22:18                         ` Paul E. McKenney
@ 2017-01-09 22:25                           ` Rafael J. Wysocki
  2017-01-09 23:14                             ` Paul E. McKenney
  2017-01-09 23:15                           ` Borislav Petkov
  1 sibling, 1 reply; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-09 22:25 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Borislav Petkov, Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J,
	Moore, Robert, J?rg R?del, lkml, Linux ACPI

Hi Paul,

On Mon, Jan 9, 2017 at 11:18 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Jan 09, 2017 at 10:33:29AM +0100, Borislav Petkov wrote:
>> + Paul for comment.
>>
>> Leaving in the rest for him.
>>
>> On Mon, Jan 09, 2017 at 02:36:33AM +0000, Zheng, Lv wrote:
>> > Hi,
>> >
>> > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
>> > > Lv
>> > > Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
>> > > early_acpi_os_unmap_memory() from Linux kernel
>> > >
>> > > Hi,
>> > >
>> > > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
>> > > Borislav
>> > > > Petkov
>> > > > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
>> > > > early_acpi_os_unmap_memory() from Linux kernel
>> > > >
>> > > > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
>> > > > >  drivers/iommu/amd_iommu_init.c |    2 +-
>> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > > > >
>> > > > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
>> > > > > ===================================================================
>> > > > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
>> > > > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
>> > > > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
>> > > > >        */
>> > > > >       ret = check_ivrs_checksum(ivrs_base);
>> > > > >       if (ret)
>> > > > > -             return ret;
>> > > > > +             goto out;
>> > > > >
>> > > > >       amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
>> > > > >       DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
>> > > >
>> > > > Good catch, this one needs to be applied regardless.
>> > > >
>> > > > However, it doesn't fix my issue though.
>> > > >
>> > > > But I think I have it - I went and applied the well-proven debugging
>> > > > technique of sprinkling printks around. Here's what I'm seeing:
>> > > >
>> > > > early_amd_iommu_init()
>> > > > |-> acpi_put_table(ivrs_base);
>> > > > |-> acpi_tb_put_table(table_desc);
>> > > > |-> acpi_tb_invalidate_table(table_desc);
>> > > > |-> acpi_tb_release_table(...)
>> > > > |-> acpi_os_unmap_memory
>> > > > |-> acpi_os_unmap_iomem
>> > > > |-> acpi_os_map_cleanup
>> > > > |-> synchronize_rcu_expedited   <-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
>> > > >
>> > > > Now that function goes and sends IPIs, i.e., schedule_work()
>> > > > but this is too early - we haven't even done workqueue_init().
>> > > > Actually, from looking at the callstack, we do
>> > > > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
>> > > > comes next.
>> > > >
>> > > > And this makes sense because the splat rIP points to __queue_work() but
>> > > > we haven't done that yet.
>> > > >
>> > > > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
>> > > > should not put the table but WTH do I know?!
>> > > >
>> > > > In any case, commenting out:
>> > > >
>> > > >         acpi_put_table(ivrs_base);
>> > > >         ivrs_base = NULL;
>> > > >
>> > > > and the end of early_amd_iommu_init() makes the box boot again.
>> > >
>> > > So please help to comment out these 2 lines (with descriptions and do not delete them).
>> > > Until acpi_os_unmap_memory() is able to handle such an early case.
>> >
>> > IMO, synchronize_rcu_expedited() should be improved:
>> > If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.
>
> Indeed it should!
>
> Does the (untested) patch below fix things for you?
>
> If so, does this need to go into 4.10?  (My default workflow would get
> it into 4.11 or 4.12, so please speak up if you need it.)

Yes it should go into 4.10 (if it fixes the problem) as the reported
regression was introduced in 4.10-rc1.

>
> ------------------------------------------------------------------------
>
> commit 1b7feb708241f1662cfd529118468c9f9c0b1449
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Jan 9 14:10:50 2017 -0800
>
>     rcu: Make synchronize_rcu_expedited() safe for early boot
>
>     The synchronize_rcu_expedited() function does not check for early-boot
>     use, which can result in failures if it is invoked before the scheduler
>     has started.  Given that the rcupdate.rcu_expedited kernel parameter
>     causes all calls to synchronize_rcu() to be directed instead to
>     synchronize_rcu_expedited(), a usage restriction does not make sense.
>
>     This commit therefore adds a rcu_scheduler_active check to
>     synchronize_rcu_expedited(), so that it is a no-op before the scheduler
>     starts.  This behavior is correct because there is only a single CPU
>     running during that time.
>
>     Reported-by: Lv Zheng <lv.zheng@intel.com>
>     Reported-by: Borislav Petkov <bp@alien8.de>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

When applying this, please also add:

Fixes: 174cc7187e6f ("ACPICA: Tables: Back port
acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
kernel")
Link: https://lkml.kernel.org/r/4034dde8-ffc1-18e2-f40c-00cf37471793@intel.com

>
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index dfc3ba5a429e..a6c3d86480de 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
>  {
>         struct rcu_state *rsp = rcu_state_p;
>
> +       if (!rcu_scheduler_active)
> +               return;
>         _synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
>  }
>  EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 10:52                 ` Jörg Rödel
@ 2017-01-09 22:41                   ` Rafael J. Wysocki
  2017-01-09 22:57                     ` Borislav Petkov
  2017-01-10 13:58                     ` Jörg Rödel
  0 siblings, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-09 22:41 UTC (permalink / raw)
  To: Jörg Rödel
  Cc: Rafael J. Wysocki, Borislav Petkov, Rafael J. Wysocki, Lv Zheng,
	Bob Moore, lkml, Linux ACPI

On Monday, January 09, 2017 11:52:26 AM Jörg Rödel wrote:
> Hi Rafael,
> 
> On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> > ---
> >  drivers/iommu/amd_iommu_init.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > ===================================================================
> > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> >  	 */
> >  	ret = check_ivrs_checksum(ivrs_base);
> >  	if (ret)
> > -		return ret;
> > +		goto out;
> >  
> >  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> >  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> 
> Yeah, good catch. Can you send me a patch for this and I am going to
> send the fix upstream asap.

Please find appended.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] IOMMU / AMD: Fix error code path in early_amd_iommu_init()

Prevent early_amd_iommu_init() from leaking memory mapped via
acpi_get_table() if check_ivrs_checksum() returns an error.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/iommu/amd_iommu_init.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pm/drivers/iommu/amd_iommu_init.c
===================================================================
--- linux-pm.orig/drivers/iommu/amd_iommu_init.c
+++ linux-pm/drivers/iommu/amd_iommu_init.c
@@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
 	 */
 	ret = check_ivrs_checksum(ivrs_base);
 	if (ret)
-		return ret;
+		goto out;
 
 	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
 	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 22:41                   ` Rafael J. Wysocki
@ 2017-01-09 22:57                     ` Borislav Petkov
  2017-01-10 13:58                     ` Jörg Rödel
  1 sibling, 0 replies; 39+ messages in thread
From: Borislav Petkov @ 2017-01-09 22:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jörg Rödel, Rafael J. Wysocki, Rafael J. Wysocki,
	Lv Zheng, Bob Moore, lkml, Linux ACPI

On Mon, Jan 09, 2017 at 11:41:15PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] IOMMU / AMD: Fix error code path in early_amd_iommu_init()
> 
> Prevent early_amd_iommu_init() from leaking memory mapped via
> acpi_get_table() if check_ivrs_checksum() returns an error.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/iommu/amd_iommu_init.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/iommu/amd_iommu_init.c
> ===================================================================
> --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> +++ linux-pm/drivers/iommu/amd_iommu_init.c
> @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
>  	 */
>  	ret = check_ivrs_checksum(ivrs_base);
>  	if (ret)
> -		return ret;
> +		goto out;
>  
>  	amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
>  	DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 22:25                           ` Rafael J. Wysocki
@ 2017-01-09 23:14                             ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-09 23:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Borislav Petkov, Zheng, Lv, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Mon, Jan 09, 2017 at 11:25:39PM +0100, Rafael J. Wysocki wrote:
> Hi Paul,
> 
> On Mon, Jan 9, 2017 at 11:18 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Mon, Jan 09, 2017 at 10:33:29AM +0100, Borislav Petkov wrote:
> >> + Paul for comment.
> >>
> >> Leaving in the rest for him.
> >>
> >> On Mon, Jan 09, 2017 at 02:36:33AM +0000, Zheng, Lv wrote:
> >> > Hi,
> >> >
> >> > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> >> > > Lv
> >> > > Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> >> > > early_acpi_os_unmap_memory() from Linux kernel
> >> > >
> >> > > Hi,
> >> > >
> >> > > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> >> > > Borislav
> >> > > > Petkov
> >> > > > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> >> > > > early_acpi_os_unmap_memory() from Linux kernel
> >> > > >
> >> > > > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> >> > > > >  drivers/iommu/amd_iommu_init.c |    2 +-
> >> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> > > > >
> >> > > > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> >> > > > > ===================================================================
> >> > > > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> >> > > > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> >> > > > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> >> > > > >        */
> >> > > > >       ret = check_ivrs_checksum(ivrs_base);
> >> > > > >       if (ret)
> >> > > > > -             return ret;
> >> > > > > +             goto out;
> >> > > > >
> >> > > > >       amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> >> > > > >       DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> >> > > >
> >> > > > Good catch, this one needs to be applied regardless.
> >> > > >
> >> > > > However, it doesn't fix my issue though.
> >> > > >
> >> > > > But I think I have it - I went and applied the well-proven debugging
> >> > > > technique of sprinkling printks around. Here's what I'm seeing:
> >> > > >
> >> > > > early_amd_iommu_init()
> >> > > > |-> acpi_put_table(ivrs_base);
> >> > > > |-> acpi_tb_put_table(table_desc);
> >> > > > |-> acpi_tb_invalidate_table(table_desc);
> >> > > > |-> acpi_tb_release_table(...)
> >> > > > |-> acpi_os_unmap_memory
> >> > > > |-> acpi_os_unmap_iomem
> >> > > > |-> acpi_os_map_cleanup
> >> > > > |-> synchronize_rcu_expedited   <-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> >> > > >
> >> > > > Now that function goes and sends IPIs, i.e., schedule_work()
> >> > > > but this is too early - we haven't even done workqueue_init().
> >> > > > Actually, from looking at the callstack, we do
> >> > > > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> >> > > > comes next.
> >> > > >
> >> > > > And this makes sense because the splat rIP points to __queue_work() but
> >> > > > we haven't done that yet.
> >> > > >
> >> > > > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> >> > > > should not put the table but WTH do I know?!
> >> > > >
> >> > > > In any case, commenting out:
> >> > > >
> >> > > >         acpi_put_table(ivrs_base);
> >> > > >         ivrs_base = NULL;
> >> > > >
> >> > > > and the end of early_amd_iommu_init() makes the box boot again.
> >> > >
> >> > > So please help to comment out these 2 lines (with descriptions and do not delete them).
> >> > > Until acpi_os_unmap_memory() is able to handle such an early case.
> >> >
> >> > IMO, synchronize_rcu_expedited() should be improved:
> >> > If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.
> >
> > Indeed it should!
> >
> > Does the (untested) patch below fix things for you?
> >
> > If so, does this need to go into 4.10?  (My default workflow would get
> > it into 4.11 or 4.12, so please speak up if you need it.)
> 
> Yes it should go into 4.10 (if it fixes the problem) as the reported
> regression was introduced in 4.10-rc1.

OK, I will test with rcutorture, but I also need a Tested-by for the
specific problem at hand.

> > ------------------------------------------------------------------------
> >
> > commit 1b7feb708241f1662cfd529118468c9f9c0b1449
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Mon Jan 9 14:10:50 2017 -0800
> >
> >     rcu: Make synchronize_rcu_expedited() safe for early boot
> >
> >     The synchronize_rcu_expedited() function does not check for early-boot
> >     use, which can result in failures if it is invoked before the scheduler
> >     has started.  Given that the rcupdate.rcu_expedited kernel parameter
> >     causes all calls to synchronize_rcu() to be directed instead to
> >     synchronize_rcu_expedited(), a usage restriction does not make sense.
> >
> >     This commit therefore adds a rcu_scheduler_active check to
> >     synchronize_rcu_expedited(), so that it is a no-op before the scheduler
> >     starts.  This behavior is correct because there is only a single CPU
> >     running during that time.
> >
> >     Reported-by: Lv Zheng <lv.zheng@intel.com>
> >     Reported-by: Borislav Petkov <bp@alien8.de>
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> When applying this, please also add:
> 
> Fixes: 174cc7187e6f ("ACPICA: Tables: Back port
> acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux
> kernel")
> Link: https://lkml.kernel.org/r/4034dde8-ffc1-18e2-f40c-00cf37471793@intel.com

Like this?

							Thanx, Paul

------------------------------------------------------------------------

commit 9ae87e58f7c40b39ff80737e3375672f16316d23
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Jan 9 14:10:50 2017 -0800

    rcu: Make synchronize_rcu_expedited() safe for early boot
    
    The synchronize_rcu_expedited() function does not check for early-boot
    use, which can result in failures if it is invoked before the scheduler
    has started.  Given that the rcupdate.rcu_expedited kernel parameter
    causes all calls to synchronize_rcu() to be directed instead to
    synchronize_rcu_expedited(), a usage restriction does not make sense.
    
    This commit therefore adds a rcu_scheduler_active check to
    synchronize_rcu_expedited(), so that it is a no-op before the scheduler
    starts.  This behavior is correct because there is only a single CPU
    running during that time.
    
    Reported-by: Lv Zheng <lv.zheng@intel.com>
    Reported-by: Borislav Petkov <bp@alien8.de>
    Fixes: 174cc7187e6f ("ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel")
    Link: Link: https://lkml.kernel.org/r/4034dde8-ffc1-18e2-f40c-00cf37471793@intel.com
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index dfc3ba5a429e..a6c3d86480de 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
 {
 	struct rcu_state *rsp = rcu_state_p;
 
+	if (!rcu_scheduler_active)
+		return;
 	_synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 22:18                         ` Paul E. McKenney
  2017-01-09 22:25                           ` Rafael J. Wysocki
@ 2017-01-09 23:15                           ` Borislav Petkov
  2017-01-09 23:32                             ` Paul E. McKenney
  1 sibling, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-09 23:15 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Mon, Jan 09, 2017 at 02:18:31PM -0800, Paul E. McKenney wrote:
> @@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
>  {
>  	struct rcu_state *rsp = rcu_state_p;
>  
> +	if (!rcu_scheduler_active)
> +		return;
>  	_synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
>  }
>  EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);

That doesn't work and it is because of those damn what goes before what
boot sequence issues :-\

We have:

rest_init()
|-> rcu_scheduler_starting()  ---> that sets rcu_scheduler_active = 1;
|-> kernel_thread(kernel_init, NULL, CLONE_FS);
|-> kernel_init()
|-> kernel_init_freeable()
|-> native_smp_prepare_cpus(setup_max_cpus)
|-> default_setup_apic_routing
|-> enable_IR_x2apic
|-> irq_remapping_prepare
|-> amd_iommu_prepare
|-> iommu_go_to_state
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited()

Now here we have rcu_scheduler_active already set so the test doesn't
hit and we hang.

So we must do it differently.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:15                           ` Borislav Petkov
@ 2017-01-09 23:32                             ` Paul E. McKenney
  2017-01-09 23:40                               ` Borislav Petkov
  2017-01-09 23:42                               ` Rafael J. Wysocki
  0 siblings, 2 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-09 23:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 12:15:01AM +0100, Borislav Petkov wrote:
> On Mon, Jan 09, 2017 at 02:18:31PM -0800, Paul E. McKenney wrote:
> > @@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
> >  {
> >  	struct rcu_state *rsp = rcu_state_p;
> >  
> > +	if (!rcu_scheduler_active)
> > +		return;
> >  	_synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
> >  }
> >  EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
> 
> That doesn't work and it is because of those damn what goes before what
> boot sequence issues :-\
> 
> We have:
> 
> rest_init()
> |-> rcu_scheduler_starting()  ---> that sets rcu_scheduler_active = 1;
> |-> kernel_thread(kernel_init, NULL, CLONE_FS);
> |-> kernel_init()
> |-> kernel_init_freeable()
> |-> native_smp_prepare_cpus(setup_max_cpus)
> |-> default_setup_apic_routing
> |-> enable_IR_x2apic
> |-> irq_remapping_prepare
> |-> amd_iommu_prepare
> |-> iommu_go_to_state
> |-> acpi_put_table(ivrs_base);
> |-> acpi_tb_put_table(table_desc);
> |-> acpi_tb_invalidate_table(table_desc);
> |-> acpi_tb_release_table(...)
> |-> acpi_os_unmap_memory
> |-> acpi_os_unmap_iomem
> |-> acpi_os_map_cleanup
> |-> synchronize_rcu_expedited()
> 
> Now here we have rcu_scheduler_active already set so the test doesn't
> hit and we hang.
> 
> So we must do it differently.

Yeah, there is a window just as the scheduler is starting where things don't
work.

We could move rcu_scheduler_starting() later, as long as there
is no chance of preemption or context switch before it is invoked.
Would that help in this case, or are we already context switching before
acpi_os_map_cleanup() is invoked?  (If we are already context switching,
short-circuiting synchronize_rcu_expedited() would be a bug.)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:32                             ` Paul E. McKenney
@ 2017-01-09 23:40                               ` Borislav Petkov
  2017-01-09 23:52                                 ` Paul E. McKenney
  2017-01-09 23:52                                 ` Borislav Petkov
  2017-01-09 23:42                               ` Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Borislav Petkov @ 2017-01-09 23:40 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Mon, Jan 09, 2017 at 03:32:04PM -0800, Paul E. McKenney wrote:
> We could move rcu_scheduler_starting() later, as long as there
> is no chance of preemption or context switch before it is invoked.
> Would that help in this case, or are we already context switching before
> acpi_os_map_cleanup() is invoked?  (If we are already context switching,
> short-circuiting synchronize_rcu_expedited() would be a bug.)

Hmm, how about the below?

It would still happen before

        /*
         * The boot idle thread must execute schedule()
         * at least once to get things moving:
         */
        init_idle_bootup_task(current);
        schedule_preempt_disabled();

in rest_init() and right after native_smp_prepare_cpus() which is where
we're splatting.

Lemme run it.

Even if it works, we would have to stress-test this seriously...

---
diff --git a/init/main.c b/init/main.c
index b0c9d6facef9..9be221cc87c3 100644
--- a/init/main.c
+++ b/init/main.c
@@ -385,7 +385,6 @@ static noinline void __ref rest_init(void)
 {
 	int pid;
 
-	rcu_scheduler_starting();
 	/*
 	 * We need to spawn init first so that it obtains pid 1, however
 	 * the init task will end up wanting to create kthreads, which, if
@@ -1019,6 +1018,8 @@ static noinline void __init kernel_init_freeable(void)
 
 	smp_prepare_cpus(setup_max_cpus);
 
+	rcu_scheduler_starting();
+
 	workqueue_init();
 
 	do_pre_smp_initcalls();


-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:32                             ` Paul E. McKenney
  2017-01-09 23:40                               ` Borislav Petkov
@ 2017-01-09 23:42                               ` Rafael J. Wysocki
  2017-01-10  0:21                                 ` Paul E. McKenney
  1 sibling, 1 reply; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-09 23:42 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Borislav Petkov, Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J,
	Moore, Robert, J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 12:32 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Jan 10, 2017 at 12:15:01AM +0100, Borislav Petkov wrote:
>> On Mon, Jan 09, 2017 at 02:18:31PM -0800, Paul E. McKenney wrote:
>> > @@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
>> >  {
>> >     struct rcu_state *rsp = rcu_state_p;
>> >
>> > +   if (!rcu_scheduler_active)
>> > +           return;
>> >     _synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
>> >  }
>> >  EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
>>
>> That doesn't work and it is because of those damn what goes before what
>> boot sequence issues :-\
>>
>> We have:
>>
>> rest_init()
>> |-> rcu_scheduler_starting()  ---> that sets rcu_scheduler_active = 1;
>> |-> kernel_thread(kernel_init, NULL, CLONE_FS);
>> |-> kernel_init()
>> |-> kernel_init_freeable()
>> |-> native_smp_prepare_cpus(setup_max_cpus)
>> |-> default_setup_apic_routing
>> |-> enable_IR_x2apic
>> |-> irq_remapping_prepare
>> |-> amd_iommu_prepare
>> |-> iommu_go_to_state
>> |-> acpi_put_table(ivrs_base);
>> |-> acpi_tb_put_table(table_desc);
>> |-> acpi_tb_invalidate_table(table_desc);
>> |-> acpi_tb_release_table(...)
>> |-> acpi_os_unmap_memory
>> |-> acpi_os_unmap_iomem
>> |-> acpi_os_map_cleanup
>> |-> synchronize_rcu_expedited()
>>
>> Now here we have rcu_scheduler_active already set so the test doesn't
>> hit and we hang.
>>
>> So we must do it differently.
>
> Yeah, there is a window just as the scheduler is starting where things don't
> work.
>
> We could move rcu_scheduler_starting() later, as long as there
> is no chance of preemption or context switch before it is invoked.
> Would that help in this case, or are we already context switching before
> acpi_os_map_cleanup() is invoked?

In the particular AMD IOMMU case it doesn't look like we are, but we
do in other cases.

> (If we are already context switching,
> short-circuiting synchronize_rcu_expedited() would be a bug.)

It may be easier to make the caller avoid RCU synchronization
altogether if that's not necessary and the caller should actually be
able to figure out when that's the case.

The patch from Lv at https://patchwork.kernel.org/patch/9504277/ goes
in the right direction IMO, but I'm not yet convinced that this is the
right one.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:40                               ` Borislav Petkov
@ 2017-01-09 23:52                                 ` Paul E. McKenney
  2017-01-09 23:52                                 ` Borislav Petkov
  1 sibling, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-09 23:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> On Mon, Jan 09, 2017 at 03:32:04PM -0800, Paul E. McKenney wrote:
> > We could move rcu_scheduler_starting() later, as long as there
> > is no chance of preemption or context switch before it is invoked.
> > Would that help in this case, or are we already context switching before
> > acpi_os_map_cleanup() is invoked?  (If we are already context switching,
> > short-circuiting synchronize_rcu_expedited() would be a bug.)
> 
> Hmm, how about the below?
> 
> It would still happen before
> 
>         /*
>          * The boot idle thread must execute schedule()
>          * at least once to get things moving:
>          */
>         init_idle_bootup_task(current);
>         schedule_preempt_disabled();
> 
> in rest_init() and right after native_smp_prepare_cpus() which is where
> we're splatting.
> 
> Lemme run it.
> 
> Even if it works, we would have to stress-test this seriously...

Yeah, the call to wait_for_completion() at the beginning of
kernel_init_freeable() makes me extremely nervous.  Even if it does
happen to work, this looks like an accident waiting to happen.

Is it possible to instead move the ACPI initialization to follow the
workqueue initialization?

							Thanx, Paul

> ---
> diff --git a/init/main.c b/init/main.c
> index b0c9d6facef9..9be221cc87c3 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -385,7 +385,6 @@ static noinline void __ref rest_init(void)
>  {
>  	int pid;
> 
> -	rcu_scheduler_starting();
>  	/*
>  	 * We need to spawn init first so that it obtains pid 1, however
>  	 * the init task will end up wanting to create kthreads, which, if
> @@ -1019,6 +1018,8 @@ static noinline void __init kernel_init_freeable(void)
> 
>  	smp_prepare_cpus(setup_max_cpus);
> 
> +	rcu_scheduler_starting();
> +
>  	workqueue_init();
> 
>  	do_pre_smp_initcalls();
> 
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:40                               ` Borislav Petkov
  2017-01-09 23:52                                 ` Paul E. McKenney
@ 2017-01-09 23:52                                 ` Borislav Petkov
  2017-01-10  0:44                                   ` Zheng, Lv
  2017-01-10  1:27                                   ` Rafael J. Wysocki
  1 sibling, 2 replies; 39+ messages in thread
From: Borislav Petkov @ 2017-01-09 23:52 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> Lemme run it.

Well, it boots but I get:

[    0.291447] ------------[ cut here ]------------
[    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
[    0.292107] Modules linked in:
[    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
[    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
[    0.292893] Call Trace:
[    0.293072]  ? dump_stack+0x46/0x63
[    0.293285]  ? __warn+0xec/0x110
[    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
[    0.293735]  ? kernel_init_freeable+0x58/0x19a
[    0.293976]  ? rest_init+0x80/0x80
[    0.294153]  ? kernel_init+0xa/0x100
[    0.294334]  ? ret_from_fork+0x22/0x30
[    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---

TBH, I like Rafael's suggestion in the other mail to stick with fixing
this in ACPI, especially this is an ACPI problem, not RCU. Well,
more or less: RCU could be taught to *not* do schedule_work() if
workqueue_init() hasn't happened yet but that's a tangential.

So, I'm going to bed. When I wake up, I want to see working fixes!

:-)))

Thanks dudes!

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:42                               ` Rafael J. Wysocki
@ 2017-01-10  0:21                                 ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-10  0:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Borislav Petkov, Zheng, Lv, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 12:42:47AM +0100, Rafael J. Wysocki wrote:
> On Tue, Jan 10, 2017 at 12:32 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Jan 10, 2017 at 12:15:01AM +0100, Borislav Petkov wrote:
> >> On Mon, Jan 09, 2017 at 02:18:31PM -0800, Paul E. McKenney wrote:
> >> > @@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
> >> >  {
> >> >     struct rcu_state *rsp = rcu_state_p;
> >> >
> >> > +   if (!rcu_scheduler_active)
> >> > +           return;
> >> >     _synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
> >> >  }
> >> >  EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
> >>
> >> That doesn't work and it is because of those damn what goes before what
> >> boot sequence issues :-\
> >>
> >> We have:
> >>
> >> rest_init()
> >> |-> rcu_scheduler_starting()  ---> that sets rcu_scheduler_active = 1;
> >> |-> kernel_thread(kernel_init, NULL, CLONE_FS);
> >> |-> kernel_init()
> >> |-> kernel_init_freeable()
> >> |-> native_smp_prepare_cpus(setup_max_cpus)
> >> |-> default_setup_apic_routing
> >> |-> enable_IR_x2apic
> >> |-> irq_remapping_prepare
> >> |-> amd_iommu_prepare
> >> |-> iommu_go_to_state
> >> |-> acpi_put_table(ivrs_base);
> >> |-> acpi_tb_put_table(table_desc);
> >> |-> acpi_tb_invalidate_table(table_desc);
> >> |-> acpi_tb_release_table(...)
> >> |-> acpi_os_unmap_memory
> >> |-> acpi_os_unmap_iomem
> >> |-> acpi_os_map_cleanup
> >> |-> synchronize_rcu_expedited()
> >>
> >> Now here we have rcu_scheduler_active already set so the test doesn't
> >> hit and we hang.
> >>
> >> So we must do it differently.
> >
> > Yeah, there is a window just as the scheduler is starting where things don't
> > work.
> >
> > We could move rcu_scheduler_starting() later, as long as there
> > is no chance of preemption or context switch before it is invoked.
> > Would that help in this case, or are we already context switching before
> > acpi_os_map_cleanup() is invoked?
> 
> In the particular AMD IOMMU case it doesn't look like we are, but we
> do in other cases.
> 
> > (If we are already context switching,
> > short-circuiting synchronize_rcu_expedited() would be a bug.)
> 
> It may be easier to make the caller avoid RCU synchronization
> altogether if that's not necessary and the caller should actually be
> able to figure out when that's the case.
> 
> The patch from Lv at https://patchwork.kernel.org/patch/9504277/ goes
> in the right direction IMO, but I'm not yet convinced that this is the
> right one.

>From the RCU end, I could force expedited grace periods to translate to
normal grace periods during that window of time, and then make sure that
RCU's grace-period kthreads are spawned beforehand.  Looking into this...

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:52                                 ` Borislav Petkov
@ 2017-01-10  0:44                                   ` Zheng, Lv
  2017-01-10  1:27                                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 39+ messages in thread
From: Zheng, Lv @ 2017-01-10  0:44 UTC (permalink / raw)
  To: Borislav Petkov, Paul E. McKenney
  Cc: Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert, J?rg R?del,
	lkml, Linux ACPI

[-- Attachment #1: Type: text/plain, Size: 1671 bytes --]

Hi,

Can the attached patch makes something different?

Thanks and best regards
Lv

> From: Borislav Petkov [mailto:bp@alien8.de]
> Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
> 
> On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> > Lemme run it.
> 
> Well, it boots but I get:
> 
> [    0.291447] ------------[ cut here ]------------
> [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> [    0.292107] Modules linked in:
> [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> [    0.292893] Call Trace:
> [    0.293072]  ? dump_stack+0x46/0x63
> [    0.293285]  ? __warn+0xec/0x110
> [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> [    0.293976]  ? rest_init+0x80/0x80
> [    0.294153]  ? kernel_init+0xa/0x100
> [    0.294334]  ? ret_from_fork+0x22/0x30
> [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
> 
> TBH, I like Rafael's suggestion in the other mail to stick with fixing
> this in ACPI, especially this is an ACPI problem, not RCU. Well,
> more or less: RCU could be taught to *not* do schedule_work() if
> workqueue_init() hasn't happened yet but that's a tangential.
> 
> So, I'm going to bed. When I wake up, I want to see working fixes!
> 
> :-)))
> 
> Thanks dudes!
> 
> --
> Regards/Gruss,
>     Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.

[-- Attachment #2: lv-rcu1.patch --]
[-- Type: application/octet-stream, Size: 1259 bytes --]

Index: linux-acpica/kernel/rcu/tree_exp.h
===================================================================
--- linux-acpica.orig/kernel/rcu/tree_exp.h
+++ linux-acpica/kernel/rcu/tree_exp.h
@@ -297,16 +297,22 @@ fastpath:
 	return false;
 }
 
-/* Invoked on each online non-idle CPU for expedited quiescent state. */
-static void sync_sched_exp_handler(void *data)
+static bool sync_sched_exp_is_required(struct rcu_state *rsp)
 {
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
-	struct rcu_state *rsp = data;
 
 	rdp = this_cpu_ptr(rsp->rda);
 	rnp = rdp->mynode;
-	if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) ||
+	return !!(READ_ONCE(rnp->expmask) & rdp->grpmask);
+}
+
+/* Invoked on each online non-idle CPU for expedited quiescent state. */
+static void sync_sched_exp_handler(void *data)
+{
+	struct rcu_state *rsp = data;
+
+	if (!sync_sched_exp_is_requierd(rsp) ||
 	    __this_cpu_read(rcu_sched_data.cpu_no_qs.b.exp))
 		return;
 	if (rcu_is_cpu_rrupt_from_idle()) {
@@ -610,6 +616,8 @@ void synchronize_sched_expedited(void)
 	/* If only one CPU, this is automatically a grace period. */
 	if (rcu_blocking_is_gp())
 		return;
+	if (!sync_sched_exp_is_required(rsp))
+		return;
 
 	_synchronize_rcu_expedited(rsp, sync_sched_exp_handler);
 }

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 23:52                                 ` Borislav Petkov
  2017-01-10  0:44                                   ` Zheng, Lv
@ 2017-01-10  1:27                                   ` Rafael J. Wysocki
  2017-01-10  2:23                                     ` Paul E. McKenney
  2017-01-10  9:41                                     ` Borislav Petkov
  1 sibling, 2 replies; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-10  1:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Paul E. McKenney, Zheng, Lv, Rafael J. Wysocki, Wysocki,
	Rafael J, Moore, Robert, J?rg R?del, lkml, Linux ACPI

[-- Attachment #1: Type: text/plain, Size: 1781 bytes --]

On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
>> Lemme run it.
>
> Well, it boots but I get:
>
> [    0.291447] ------------[ cut here ]------------
> [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> [    0.292107] Modules linked in:
> [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> [    0.292893] Call Trace:
> [    0.293072]  ? dump_stack+0x46/0x63
> [    0.293285]  ? __warn+0xec/0x110
> [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> [    0.293976]  ? rest_init+0x80/0x80
> [    0.294153]  ? kernel_init+0xa/0x100
> [    0.294334]  ? ret_from_fork+0x22/0x30
> [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
>
> TBH, I like Rafael's suggestion in the other mail to stick with fixing
> this in ACPI, especially this is an ACPI problem, not RCU. Well,
> more or less: RCU could be taught to *not* do schedule_work() if
> workqueue_init() hasn't happened yet but that's a tangential.
>
> So, I'm going to bed. When I wake up, I want to see working fixes!
>
> :-)))

Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
worked, the attached one should work too (please test), but it can be
justified in a slightly more convincing way.

Namely, the idea is that acpi_os_read/write_memory() should never be
used before invoking acpi_os_initialize() and since those are the only
places where the list of memory regions is walked under RCU without
extra locking, it is safe to skip the RCU synchronization until that
happens.

Thanks,
Rafael

[-- Attachment #2: acpi-mem-unmap.patch --]
[-- Type: text/x-patch, Size: 1040 bytes --]

---
 drivers/acpi/osl.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -378,7 +378,9 @@ static void acpi_os_drop_map_ref(struct
 static void acpi_os_map_cleanup(struct acpi_ioremap *map)
 {
 	if (!map->refcount) {
-		synchronize_rcu_expedited();
+		if (acpi_os_initialized)
+			synchronize_rcu_expedited();
+
 		acpi_unmap(map->phys, map->virt);
 		kfree(map);
 	}
@@ -671,6 +673,8 @@ acpi_os_read_memory(acpi_physical_addres
 	bool unmap = false;
 	u64 dummy;
 
+	WARN_ON_ONCE(!acpi_os_initialized);
+
 	rcu_read_lock();
 	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
 	if (!virt_addr) {
@@ -716,6 +720,8 @@ acpi_os_write_memory(acpi_physical_addre
 	unsigned int size = width / 8;
 	bool unmap = false;
 
+	WARN_ON_ONCE(!acpi_os_initialized);
+
 	rcu_read_lock();
 	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
 	if (!virt_addr) {

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-10  1:27                                   ` Rafael J. Wysocki
@ 2017-01-10  2:23                                     ` Paul E. McKenney
  2017-01-10  5:41                                       ` Zheng, Lv
  2017-01-10  9:41                                     ` Borislav Petkov
  1 sibling, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-10  2:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Borislav Petkov, Zheng, Lv, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
> On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> > On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> >> Lemme run it.
> >
> > Well, it boots but I get:
> >
> > [    0.291447] ------------[ cut here ]------------
> > [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> > [    0.292107] Modules linked in:
> > [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> > [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> > [    0.292893] Call Trace:
> > [    0.293072]  ? dump_stack+0x46/0x63
> > [    0.293285]  ? __warn+0xec/0x110
> > [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> > [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> > [    0.293976]  ? rest_init+0x80/0x80
> > [    0.294153]  ? kernel_init+0xa/0x100
> > [    0.294334]  ? ret_from_fork+0x22/0x30
> > [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
> >
> > TBH, I like Rafael's suggestion in the other mail to stick with fixing
> > this in ACPI, especially this is an ACPI problem, not RCU. Well,
> > more or less: RCU could be taught to *not* do schedule_work() if
> > workqueue_init() hasn't happened yet but that's a tangential.
> >
> > So, I'm going to bed. When I wake up, I want to see working fixes!
> >
> > :-)))
> 
> Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
> worked, the attached one should work too (please test), but it can be
> justified in a slightly more convincing way.
> 
> Namely, the idea is that acpi_os_read/write_memory() should never be
> used before invoking acpi_os_initialize() and since those are the only
> places where the list of memory regions is walked under RCU without
> extra locking, it is safe to skip the RCU synchronization until that
> happens.
> 
> Thanks,
> Rafael

Makes sense to me!

It looks like I can make the grace-period-free boot-time window
for CONFIG_PREEMPT=y kernels quite a bit narrower, but it does not
look like something suitable for jamming into 4.10.

							Thanx, Paul

> ---
>  drivers/acpi/osl.c |    8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/acpi/osl.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/osl.c
> +++ linux-pm/drivers/acpi/osl.c
> @@ -378,7 +378,9 @@ static void acpi_os_drop_map_ref(struct
>  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
>  {
>  	if (!map->refcount) {
> -		synchronize_rcu_expedited();
> +		if (acpi_os_initialized)
> +			synchronize_rcu_expedited();
> +
>  		acpi_unmap(map->phys, map->virt);
>  		kfree(map);
>  	}
> @@ -671,6 +673,8 @@ acpi_os_read_memory(acpi_physical_addres
>  	bool unmap = false;
>  	u64 dummy;
>  
> +	WARN_ON_ONCE(!acpi_os_initialized);
> +
>  	rcu_read_lock();
>  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
>  	if (!virt_addr) {
> @@ -716,6 +720,8 @@ acpi_os_write_memory(acpi_physical_addre
>  	unsigned int size = width / 8;
>  	bool unmap = false;
>  
> +	WARN_ON_ONCE(!acpi_os_initialized);
> +
>  	rcu_read_lock();
>  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
>  	if (!virt_addr) {

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-10  2:23                                     ` Paul E. McKenney
@ 2017-01-10  5:41                                       ` Zheng, Lv
  2017-01-10  5:51                                         ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Zheng, Lv @ 2017-01-10  5:41 UTC (permalink / raw)
  To: paulmck, Rafael J. Wysocki
  Cc: Borislav Petkov, Wysocki, Rafael J, Moore, Robert, J?rg R?del,
	lkml, Linux ACPI

Hi, Rafael and Paul

> From: Paul E. McKenney [mailto:paulmck@linux.vnet.ibm.com]
> Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
> 
> On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
> > On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> > > On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> > >> Lemme run it.
> > >
> > > Well, it boots but I get:
> > >
> > > [    0.291447] ------------[ cut here ]------------
> > > [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> > > [    0.292107] Modules linked in:
> > > [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> > > [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> > > [    0.292893] Call Trace:
> > > [    0.293072]  ? dump_stack+0x46/0x63
> > > [    0.293285]  ? __warn+0xec/0x110
> > > [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> > > [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> > > [    0.293976]  ? rest_init+0x80/0x80
> > > [    0.294153]  ? kernel_init+0xa/0x100
> > > [    0.294334]  ? ret_from_fork+0x22/0x30
> > > [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
> > >
> > > TBH, I like Rafael's suggestion in the other mail to stick with fixing
> > > this in ACPI, especially this is an ACPI problem, not RCU. Well,
> > > more or less: RCU could be taught to *not* do schedule_work() if
> > > workqueue_init() hasn't happened yet but that's a tangential.
> > >
> > > So, I'm going to bed. When I wake up, I want to see working fixes!
> > >
> > > :-)))
> >
> > Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
> > worked, the attached one should work too (please test), but it can be
> > justified in a slightly more convincing way.
> >
> > Namely, the idea is that acpi_os_read/write_memory() should never be
> > used before invoking acpi_os_initialize() and since those are the only
> > places where the list of memory regions is walked under RCU without
> > extra locking, it is safe to skip the RCU synchronization until that
> > happens.
> >
> > Thanks,
> > Rafael
> 
> Makes sense to me!

Also looks good to me.

> 
> It looks like I can make the grace-period-free boot-time window
> for CONFIG_PREEMPT=y kernels quite a bit narrower, but it does not
> look like something suitable for jamming into 4.10.

OK, we can have this fixed in ACPI layer first.

Thanks
Lv

> 
> 							Thanx, Paul
> 
> > ---
> >  drivers/acpi/osl.c |    8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > Index: linux-pm/drivers/acpi/osl.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/osl.c
> > +++ linux-pm/drivers/acpi/osl.c
> > @@ -378,7 +378,9 @@ static void acpi_os_drop_map_ref(struct
> >  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
> >  {
> >  	if (!map->refcount) {
> > -		synchronize_rcu_expedited();
> > +		if (acpi_os_initialized)
> > +			synchronize_rcu_expedited();
> > +
> >  		acpi_unmap(map->phys, map->virt);
> >  		kfree(map);
> >  	}
> > @@ -671,6 +673,8 @@ acpi_os_read_memory(acpi_physical_addres
> >  	bool unmap = false;
> >  	u64 dummy;
> >
> > +	WARN_ON_ONCE(!acpi_os_initialized);
> > +
> >  	rcu_read_lock();
> >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> >  	if (!virt_addr) {
> > @@ -716,6 +720,8 @@ acpi_os_write_memory(acpi_physical_addre
> >  	unsigned int size = width / 8;
> >  	bool unmap = false;
> >
> > +	WARN_ON_ONCE(!acpi_os_initialized);
> > +
> >  	rcu_read_lock();
> >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> >  	if (!virt_addr) {

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-10  5:41                                       ` Zheng, Lv
@ 2017-01-10  5:51                                         ` Paul E. McKenney
  2017-01-11  9:21                                           ` Borislav Petkov
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-10  5:51 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Rafael J. Wysocki, Borislav Petkov, Wysocki, Rafael J, Moore,
	Robert, J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 05:41:45AM +0000, Zheng, Lv wrote:
> Hi, Rafael and Paul
> 
> > From: Paul E. McKenney [mailto:paulmck@linux.vnet.ibm.com]
> > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > early_acpi_os_unmap_memory() from Linux kernel
> > 
> > On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
> > > On Tue, Jan 10, 2017 at 12:52 AM, Borislav Petkov <bp@alien8.de> wrote:
> > > > On Tue, Jan 10, 2017 at 12:40:39AM +0100, Borislav Petkov wrote:
> > > >> Lemme run it.
> > > >
> > > > Well, it boots but I get:
> > > >
> > > > [    0.291447] ------------[ cut here ]------------
> > > > [    0.291702] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree.c:3993 rcu_scheduler_starting+0x5c/0x70
> > > > [    0.292107] Modules linked in:
> > > > [    0.292277] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3+ #21
> > > > [    0.292540] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08 01/28/2016
> > > > [    0.292893] Call Trace:
> > > > [    0.293072]  ? dump_stack+0x46/0x63
> > > > [    0.293285]  ? __warn+0xec/0x110
> > > > [    0.293487]  ? rcu_scheduler_starting+0x5c/0x70
> > > > [    0.293735]  ? kernel_init_freeable+0x58/0x19a
> > > > [    0.293976]  ? rest_init+0x80/0x80
> > > > [    0.294153]  ? kernel_init+0xa/0x100
> > > > [    0.294334]  ? ret_from_fork+0x22/0x30
> > > > [    0.294525] ---[ end trace 4c0fe009ed4dc740 ]---
> > > >
> > > > TBH, I like Rafael's suggestion in the other mail to stick with fixing
> > > > this in ACPI, especially this is an ACPI problem, not RCU. Well,
> > > > more or less: RCU could be taught to *not* do schedule_work() if
> > > > workqueue_init() hasn't happened yet but that's a tangential.
> > > >
> > > > So, I'm going to bed. When I wake up, I want to see working fixes!
> > > >
> > > > :-)))
> > >
> > > Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
> > > worked, the attached one should work too (please test), but it can be
> > > justified in a slightly more convincing way.
> > >
> > > Namely, the idea is that acpi_os_read/write_memory() should never be
> > > used before invoking acpi_os_initialize() and since those are the only
> > > places where the list of memory regions is walked under RCU without
> > > extra locking, it is safe to skip the RCU synchronization until that
> > > happens.
> > >
> > > Thanks,
> > > Rafael
> > 
> > Makes sense to me!
> 
> Also looks good to me.
> 
> > 
> > It looks like I can make the grace-period-free boot-time window
> > for CONFIG_PREEMPT=y kernels quite a bit narrower, but it does not
> > look like something suitable for jamming into 4.10.
> 
> OK, we can have this fixed in ACPI layer first.

Definitely.

I have the RCU changes written in ink on paper and they definitely are
-not- something that goes into 4.10.  4.11 at the earliest, and if no
one asks for it in 4.11, it goes into 4.12.  Serious testing needed.  ;-)

							Thanx, Paul

> Thanks
> Lv
> 
> > 
> > 							Thanx, Paul
> > 
> > > ---
> > >  drivers/acpi/osl.c |    8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > Index: linux-pm/drivers/acpi/osl.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/acpi/osl.c
> > > +++ linux-pm/drivers/acpi/osl.c
> > > @@ -378,7 +378,9 @@ static void acpi_os_drop_map_ref(struct
> > >  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
> > >  {
> > >  	if (!map->refcount) {
> > > -		synchronize_rcu_expedited();
> > > +		if (acpi_os_initialized)
> > > +			synchronize_rcu_expedited();
> > > +
> > >  		acpi_unmap(map->phys, map->virt);
> > >  		kfree(map);
> > >  	}
> > > @@ -671,6 +673,8 @@ acpi_os_read_memory(acpi_physical_addres
> > >  	bool unmap = false;
> > >  	u64 dummy;
> > >
> > > +	WARN_ON_ONCE(!acpi_os_initialized);
> > > +
> > >  	rcu_read_lock();
> > >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> > >  	if (!virt_addr) {
> > > @@ -716,6 +720,8 @@ acpi_os_write_memory(acpi_physical_addre
> > >  	unsigned int size = width / 8;
> > >  	bool unmap = false;
> > >
> > > +	WARN_ON_ONCE(!acpi_os_initialized);
> > > +
> > >  	rcu_read_lock();
> > >  	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
> > >  	if (!virt_addr) {
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-10  1:27                                   ` Rafael J. Wysocki
  2017-01-10  2:23                                     ` Paul E. McKenney
@ 2017-01-10  9:41                                     ` Borislav Petkov
  2017-01-11  3:42                                       ` Rafael J. Wysocki
  1 sibling, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-10  9:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Paul E. McKenney, Zheng, Lv, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
> Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
> worked, the attached one should work too (please test), but it can be
> justified in a slightly more convincing way.

No workie:

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 57fb5f4..acb6118 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -378,7 +378,11 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
 static void acpi_os_map_cleanup(struct acpi_ioremap *map)
 {
        if (!map->refcount) {
-               synchronize_rcu_expedited();
+               if (acpi_os_initialized) {
+                       pr_err("%s: acpi_os_initialized\n", __func__);
+                       synchronize_rcu_expedited();
+               }
+
                acpi_unmap(map->phys, map->virt);
                kfree(map);
        }

The pr_err() gets issued before the box hangs.

Lv's version which set the bool in acpi_os_map_generic_address() did
work though.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-09 22:41                   ` Rafael J. Wysocki
  2017-01-09 22:57                     ` Borislav Petkov
@ 2017-01-10 13:58                     ` Jörg Rödel
  1 sibling, 0 replies; 39+ messages in thread
From: Jörg Rödel @ 2017-01-10 13:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Borislav Petkov, Rafael J. Wysocki, Lv Zheng,
	Bob Moore, lkml, Linux ACPI

On Mon, Jan 09, 2017 at 11:41:15PM +0100, Rafael J. Wysocki wrote:
> 
> Please find appended.

Applied, thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-10  9:41                                     ` Borislav Petkov
@ 2017-01-11  3:42                                       ` Rafael J. Wysocki
  2017-01-11  9:42                                         ` Borislav Petkov
  0 siblings, 1 reply; 39+ messages in thread
From: Rafael J. Wysocki @ 2017-01-11  3:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael J. Wysocki, Paul E. McKenney, Zheng, Lv, Wysocki,
	Rafael J, Moore, Robert, J?rg R?del, lkml, Linux ACPI

[-- Attachment #1: Type: text/plain, Size: 1651 bytes --]

On Tue, Jan 10, 2017 at 10:41 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Tue, Jan 10, 2017 at 02:27:16AM +0100, Rafael J. Wysocki wrote:
>> Well, if the https://patchwork.kernel.org/patch/9504277/ patch from Lv
>> worked, the attached one should work too (please test), but it can be
>> justified in a slightly more convincing way.
>
> No workie:
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 57fb5f4..acb6118 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -378,7 +378,11 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
>  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
>  {
>         if (!map->refcount) {
> -               synchronize_rcu_expedited();
> +               if (acpi_os_initialized) {
> +                       pr_err("%s: acpi_os_initialized\n", __func__);
> +                       synchronize_rcu_expedited();
> +               }
> +
>                 acpi_unmap(map->phys, map->virt);
>                 kfree(map);
>         }
>
> The pr_err() gets issued before the box hangs.
>
> Lv's version which set the bool in acpi_os_map_generic_address() did
> work though.

Well, it would if nothing mapped by acpi_os_map_generic_address() was
in the memory address space, for example, but then it would never use
the RCU synchronization as well (not good).

Basically, we need to find a point during the initialization such that
acpi_os_read/write_memory() is not invoked earlier and set a flag in
there.

Let's try the attached one.

BTW, I'm going to travel to the LCA starting tomorrow, so I guess I
will be a bit unresponsive during the next few days.

Thanks,
Rafael

[-- Attachment #2: acpi-debug-table-load.patch --]
[-- Type: text/x-patch, Size: 2296 bytes --]

---
 drivers/acpi/bus.c     |    6 ++++++
 drivers/acpi/osl.c     |    9 ++++++++-
 include/acpi/acpi_io.h |    1 +
 3 files changed, 15 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/bus.c
===================================================================
--- linux-pm.orig/drivers/acpi/bus.c
+++ linux-pm/drivers/acpi/bus.c
@@ -1191,6 +1191,12 @@ static int __init acpi_init(void)
 		acpi_kobj = NULL;
 	}
 
+	/*
+	 * acpi_os_read_memory()/acpi_os_write_memory() should not be invoked
+	 * before this point.
+	 */
+	acpi_sync_memory_unmap = true;
+
 	init_acpi_device_notify();
 	result = acpi_bus_init();
 	if (result) {
Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -77,6 +77,7 @@ static struct workqueue_struct *kacpi_ho
 static bool acpi_os_initialized;
 unsigned int acpi_sci_irq = INVALID_ACPI_IRQ;
 bool acpi_permanent_mmap = false;
+bool acpi_sync_memory_unmap;
 
 /*
  * This list of permanent mappings is for memory that may be accessed from
@@ -378,7 +379,9 @@ static void acpi_os_drop_map_ref(struct
 static void acpi_os_map_cleanup(struct acpi_ioremap *map)
 {
 	if (!map->refcount) {
-		synchronize_rcu_expedited();
+		if (acpi_sync_memory_unmap)
+			synchronize_rcu_expedited();
+
 		acpi_unmap(map->phys, map->virt);
 		kfree(map);
 	}
@@ -671,6 +674,8 @@ acpi_os_read_memory(acpi_physical_addres
 	bool unmap = false;
 	u64 dummy;
 
+	WARN_ON_ONCE(!acpi_sync_memory_unmap);
+
 	rcu_read_lock();
 	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
 	if (!virt_addr) {
@@ -716,6 +721,8 @@ acpi_os_write_memory(acpi_physical_addre
 	unsigned int size = width / 8;
 	bool unmap = false;
 
+	WARN_ON_ONCE(!acpi_sync_memory_unmap);
+
 	rcu_read_lock();
 	virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
 	if (!virt_addr) {
Index: linux-pm/include/acpi/acpi_io.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_io.h
+++ linux-pm/include/acpi/acpi_io.h
@@ -14,6 +14,7 @@ static inline void __iomem *acpi_os_iore
 #endif
 
 extern bool acpi_permanent_mmap;
+extern bool acpi_sync_memory_unmap;
 
 void __iomem *__ref
 acpi_os_map_iomem(acpi_physical_address phys, acpi_size size);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-10  5:51                                         ` Paul E. McKenney
@ 2017-01-11  9:21                                           ` Borislav Petkov
  2017-01-11  9:51                                             ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-11  9:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Mon, Jan 09, 2017 at 09:51:29PM -0800, Paul E. McKenney wrote:
> Definitely.

Btw, we have more breakage from RCU expedited using workqueues:
https://bugzilla.kernel.org/show_bug.cgi?id=192111

I've added you to CC but let me have other bug reporters confirm reverting

  8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")

does fix the issue for them too.

Thanks.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-11  3:42                                       ` Rafael J. Wysocki
@ 2017-01-11  9:42                                         ` Borislav Petkov
  0 siblings, 0 replies; 39+ messages in thread
From: Borislav Petkov @ 2017-01-11  9:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Paul E. McKenney, Zheng, Lv, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Wed, Jan 11, 2017 at 04:42:16AM +0100, Rafael J. Wysocki wrote:
> Let's try the attached one.

Works, thanks. Lemme give you the tags :)

Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Fixes: 174cc7187e6f ("ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel")
Link: https://lkml.kernel.org/r/4034dde8-ffc1-18e2-f40c-00cf37471793@intel.com

> BTW, I'm going to travel to the LCA starting tomorrow, so I guess I
> will be a bit unresponsive during the next few days.

Sure, we have time. Have a safe trip!

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-11  9:21                                           ` Borislav Petkov
@ 2017-01-11  9:51                                             ` Paul E. McKenney
  2017-01-11 10:03                                               ` Borislav Petkov
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-11  9:51 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Wed, Jan 11, 2017 at 10:21:06AM +0100, Borislav Petkov wrote:
> On Mon, Jan 09, 2017 at 09:51:29PM -0800, Paul E. McKenney wrote:
> > Definitely.
> 
> Btw, we have more breakage from RCU expedited using workqueues:
> https://bugzilla.kernel.org/show_bug.cgi?id=192111
> 
> I've added you to CC but let me have other bug reporters confirm reverting
> 
>   8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")
> 
> does fix the issue for them too.

Yes, you could make RCU expedited grace periods go back to using the
requesting task, and that would allow expedited grace periods to run early
in the boot process.  But that causes problems with signals and the like
unless you revert a few other patches.  The bugzilla is interesting --
it looks like ACPI was in some cases doing early-boot grace-period waits
some time back?

I have a limping prototype RCU patch that should avoid this problem.

If all goes well, I will send it out late tomorrow evening, Pacific Time.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-11  9:51                                             ` Paul E. McKenney
@ 2017-01-11 10:03                                               ` Borislav Petkov
  2017-01-11 10:22                                                 ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Borislav Petkov @ 2017-01-11 10:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Wed, Jan 11, 2017 at 01:51:56AM -0800, Paul E. McKenney wrote:
> Yes, you could make RCU expedited grace periods go back to using the
> requesting task, and that would allow expedited grace periods to run early
> in the boot process.  But that causes problems with signals and the like
> unless you revert a few other patches.  The bugzilla is interesting --
> it looks like ACPI was in some cases doing early-boot grace-period waits
> some time back?

I think this and https://bugzilla.suse.com/show_bug.cgi?id=1017783 is an
example of a bunch of toshiba schlaptops which cause the issue. So it
looks like ACPI is doing something very early on those which tickles the
issue to happen.

But this is ACPI - anything can happen!

> I have a limping prototype RCU patch that should avoid this problem.
>
> If all goes well, I will send it out late tomorrow evening, Pacific Time.

Attach it to the bugzilla too, pls, because the people there trigger the
issue.

I have the respective(?) SUSE bug and I can ask people there to run it
too.

Thanks.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
  2017-01-11 10:03                                               ` Borislav Petkov
@ 2017-01-11 10:22                                                 ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-01-11 10:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Zheng, Lv, Rafael J. Wysocki, Wysocki, Rafael J, Moore, Robert,
	J?rg R?del, lkml, Linux ACPI

On Wed, Jan 11, 2017 at 11:03:23AM +0100, Borislav Petkov wrote:
> On Wed, Jan 11, 2017 at 01:51:56AM -0800, Paul E. McKenney wrote:
> > Yes, you could make RCU expedited grace periods go back to using the
> > requesting task, and that would allow expedited grace periods to run early
> > in the boot process.  But that causes problems with signals and the like
> > unless you revert a few other patches.  The bugzilla is interesting --
> > it looks like ACPI was in some cases doing early-boot grace-period waits
> > some time back?
> 
> I think this and https://bugzilla.suse.com/show_bug.cgi?id=1017783 is an
> example of a bunch of toshiba schlaptops which cause the issue. So it
> looks like ACPI is doing something very early on those which tickles the
> issue to happen.
> 
> But this is ACPI - anything can happen!

;-) ;-) ;-)

> > I have a limping prototype RCU patch that should avoid this problem.
> >
> > If all goes well, I will send it out late tomorrow evening, Pacific Time.
> 
> Attach it to the bugzilla too, pls, because the people there trigger the
> issue.
> 
> I have the respective(?) SUSE bug and I can ask people there to run it
> too.

That would be very good!  Thinking good thoughts for the ongoing tests...

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2017-01-11 10:22 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20170107204227.bwdb5yzrjpiggkmo@pd.tnic>
2017-01-07 23:30 ` 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel Rafael J. Wysocki
2017-01-08  0:07   ` Borislav Petkov
2017-01-08  0:22     ` Rafael J. Wysocki
2017-01-08  0:37       ` Borislav Petkov
2017-01-08  0:52         ` Rafael J. Wysocki
2017-01-08  1:01           ` Borislav Petkov
2017-01-08  1:45             ` Rafael J. Wysocki
2017-01-08  2:20               ` Rafael J. Wysocki
2017-01-08 13:03                 ` Borislav Petkov
2017-01-09  1:58                   ` Zheng, Lv
2017-01-09  2:36                     ` Zheng, Lv
2017-01-09  9:33                       ` Borislav Petkov
2017-01-09 22:18                         ` Paul E. McKenney
2017-01-09 22:25                           ` Rafael J. Wysocki
2017-01-09 23:14                             ` Paul E. McKenney
2017-01-09 23:15                           ` Borislav Petkov
2017-01-09 23:32                             ` Paul E. McKenney
2017-01-09 23:40                               ` Borislav Petkov
2017-01-09 23:52                                 ` Paul E. McKenney
2017-01-09 23:52                                 ` Borislav Petkov
2017-01-10  0:44                                   ` Zheng, Lv
2017-01-10  1:27                                   ` Rafael J. Wysocki
2017-01-10  2:23                                     ` Paul E. McKenney
2017-01-10  5:41                                       ` Zheng, Lv
2017-01-10  5:51                                         ` Paul E. McKenney
2017-01-11  9:21                                           ` Borislav Petkov
2017-01-11  9:51                                             ` Paul E. McKenney
2017-01-11 10:03                                               ` Borislav Petkov
2017-01-11 10:22                                                 ` Paul E. McKenney
2017-01-10  9:41                                     ` Borislav Petkov
2017-01-11  3:42                                       ` Rafael J. Wysocki
2017-01-11  9:42                                         ` Borislav Petkov
2017-01-09 23:42                               ` Rafael J. Wysocki
2017-01-10  0:21                                 ` Paul E. McKenney
2017-01-09  5:21                     ` Zheng, Lv
2017-01-09 10:52                 ` Jörg Rödel
2017-01-09 22:41                   ` Rafael J. Wysocki
2017-01-09 22:57                     ` Borislav Petkov
2017-01-10 13:58                     ` Jörg Rödel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).