All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 13:10 Andre Przywara
  2012-05-30 13:33   ` Jan Beulich
                   ` (3 more replies)
  0 siblings, 4 replies; 63+ messages in thread
From: Andre Przywara @ 2012-05-30 13:10 UTC (permalink / raw)
  To: mingo, hpa, tglx
  Cc: Andre Przywara, jeremy, xen-devel, stable, konrad.wilk, linux-kernel

Because we are behind a family check before tweaking the topology
bit, we can use the standard rd/wrmsr variants for the CPUID feature
register.
This fixes a crash when using the kernel as a Xen Dom0 on affected
Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
yet (this will be fixed in another patch).

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Cc: stable@vger.kernel.org # 3.4+
---
 arch/x86/kernel/cpu/amd.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 146bb62..80ccd99 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
 	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
 		u64 val;
 
-		if (!rdmsrl_amd_safe(0xc0011005, &val)) {
+		if (!rdmsrl_safe(0xc0011005, &val)) {
 			val |= 1ULL << 54;
-			wrmsrl_amd_safe(0xc0011005, val);
+			checking_wrmsrl(0xc0011005, val);
 			rdmsrl(0xc0011005, val);
 			if (val & (1ULL << 54)) {
 				set_cpu_cap(c, X86_FEATURE_TOPOEXT);
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 13:10 [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Andre Przywara
@ 2012-05-30 13:33   ` Jan Beulich
  2012-05-30 14:39 ` Konrad Rzeszutek Wilk
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 13:33 UTC (permalink / raw)
  To: Andre Przywara
  Cc: mingo, jeremy, tglx, xen-devel, konrad.wilk, linux-kernel, stable, hpa

>>> On 30.05.12 at 15:10, Andre Przywara <andre.przywara@amd.com> wrote:
> Because we are behind a family check before tweaking the topology
> bit, we can use the standard rd/wrmsr variants for the CPUID feature
> register.
> This fixes a crash when using the kernel as a Xen Dom0 on affected
> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> yet (this will be fixed in another patch).

I'm not following: If the AMD variants (putting a special value into
%edi) can be freely replaced by the non-AMD variants, why did
the AMD special ones get used in the first place?

Further, I can't see how checking_wrmsrl() is being paravirtualized
any better than wrmsrl_amd_safe() - both have nothing but an
exception handling fixup attached to the wrmsr invocation. Care
to point out what actual crash it is that was seen?

Finally, I would question whether re-enabling the topology
extensions under Xen shouldn't be skipped altogether, perhaps
even on Dom0 (as the hypervisor is controlling this MSR, but in
any case on DomU - the hypervisor won't allow (read: ignore,
not fault on) the write anyway (and will log a message for each
(v)CPU that attempts this).

Jan

> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
> Cc: stable@vger.kernel.org # 3.4+
> ---
>  arch/x86/kernel/cpu/amd.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 146bb62..80ccd99 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>  	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
>  		u64 val;
>  
> -		if (!rdmsrl_amd_safe(0xc0011005, &val)) {
> +		if (!rdmsrl_safe(0xc0011005, &val)) {
>  			val |= 1ULL << 54;
> -			wrmsrl_amd_safe(0xc0011005, val);
> +			checking_wrmsrl(0xc0011005, val);
>  			rdmsrl(0xc0011005, val);
>  			if (val & (1ULL << 54)) {
>  				set_cpu_cap(c, X86_FEATURE_TOPOEXT);



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 13:33   ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 13:33 UTC (permalink / raw)
  To: Andre Przywara
  Cc: jeremy, xen-devel, stable, konrad.wilk, linux-kernel, hpa, mingo, tglx

>>> On 30.05.12 at 15:10, Andre Przywara <andre.przywara@amd.com> wrote:
> Because we are behind a family check before tweaking the topology
> bit, we can use the standard rd/wrmsr variants for the CPUID feature
> register.
> This fixes a crash when using the kernel as a Xen Dom0 on affected
> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> yet (this will be fixed in another patch).

I'm not following: If the AMD variants (putting a special value into
%edi) can be freely replaced by the non-AMD variants, why did
the AMD special ones get used in the first place?

Further, I can't see how checking_wrmsrl() is being paravirtualized
any better than wrmsrl_amd_safe() - both have nothing but an
exception handling fixup attached to the wrmsr invocation. Care
to point out what actual crash it is that was seen?

Finally, I would question whether re-enabling the topology
extensions under Xen shouldn't be skipped altogether, perhaps
even on Dom0 (as the hypervisor is controlling this MSR, but in
any case on DomU - the hypervisor won't allow (read: ignore,
not fault on) the write anyway (and will log a message for each
(v)CPU that attempts this).

Jan

> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
> Cc: stable@vger.kernel.org # 3.4+
> ---
>  arch/x86/kernel/cpu/amd.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 146bb62..80ccd99 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>  	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
>  		u64 val;
>  
> -		if (!rdmsrl_amd_safe(0xc0011005, &val)) {
> +		if (!rdmsrl_safe(0xc0011005, &val)) {
>  			val |= 1ULL << 54;
> -			wrmsrl_amd_safe(0xc0011005, val);
> +			checking_wrmsrl(0xc0011005, val);
>  			rdmsrl(0xc0011005, val);
>  			if (val & (1ULL << 54)) {
>  				set_cpu_cap(c, X86_FEATURE_TOPOEXT);

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 13:33   ` Jan Beulich
@ 2012-05-30 14:02     ` Andre Przywara
  -1 siblings, 0 replies; 63+ messages in thread
From: Andre Przywara @ 2012-05-30 14:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: mingo, jeremy, tglx, xen-devel, konrad.wilk, linux-kernel, hpa,
	Shin, Jacob

On 05/30/2012 03:33 PM, Jan Beulich wrote:
>>>> On 30.05.12 at 15:10, Andre Przywara<andre.przywara@amd.com>  wrote:
>> Because we are behind a family check before tweaking the topology
>> bit, we can use the standard rd/wrmsr variants for the CPUID feature
>> register.
>> This fixes a crash when using the kernel as a Xen Dom0 on affected
>> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
>> yet (this will be fixed in another patch).
>
> I'm not following: If the AMD variants (putting a special value into
> %edi) can be freely replaced by the non-AMD variants, why did
> the AMD special ones get used in the first place?

Older CPUs (K8) needed the AMD variants, starting with family 10h we can 
use the normal versions.

> Further, I can't see how checking_wrmsrl() is being paravirtualized
> any better than wrmsrl_amd_safe() - both have nothing but an
> exception handling fixup attached to the wrmsr invocation. Care
> to point out what actual crash it is that was seen?

AFAIK, the difference is between the "l" and the regs version for 
rd/wrmsr. We have a patch already here to fix this. Will send it out 
soon. Jacob, can you comment on this?

The crash dump:

[    1.601021] ------------[ cut here ]------------
[    1.601025] kernel BUG at 
/buildrepos/linux-2.6-stable/arch/x86/include/asm/paravirt.h:133!
[    1.601031] invalid opcode: 0000 [#1] SMP
[    1.601038] CPU 0
[    1.601041] Modules linked in:
[    1.601047]
[    1.601050] Pid: 0, comm: swapper/0 Not tainted 3.4.0 #1 AMD Virgo 
Platform/Annapurna
[    1.601061] RIP: e030:[<ffffffff8169b4fe>]  [<ffffffff8169b4fe>] 
init_amd+0x21d/0x603
[    1.601072] RSP: e02b:ffffffff81c01df8  EFLAGS: 00010246
[    1.601076] RAX: 0000000000000000 RBX: ffffffff81ca7500 RCX: 
0000000000000000
[    1.601081] RDX: 0000000000000020 RSI: 000000000000c000 RDI: 
ffffffff81c01e78
[    1.601086] RBP: ffffffff81c01ea8 R08: 0000000000004003 R09: 
00000000ffffffff
[    1.601090] R10: 0000000000000000 R11: 00000000ffffffff R12: 
ffffffff81ca7574
[    1.601095] R13: ffffffff81ca7514 R14: ffffffff81ca754c R15: 
ffffffff81ca756c
[    1.601103] FS:  0000000000000000(0000) GS:ffff8801ce600000(0000) 
knlGS:0000000000000000
[    1.601108] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.601112] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 
0000000000040660
[    1.601117] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[    1.601122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[    1.601127] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, 
task ffffffff81c14020)
[    1.601131] Stack:
[    1.601134]  ffffffff81c01ea8 ffffffff8169a7d8 0000001600000017 
0000000000000000
[    1.601146]  ffff8801c1c3ac00 ffff8801c1c002d0 ffffffff81c01e48 
ffffffff81140118
[    1.601157]  ffffffff811415cc ffff8801c1c04220 ffff8801c1c02480 
00000000000000d0
[    1.601169] Call Trace:
[    1.601175]  [<ffffffff8169a7d8>] ? get_cpu_cap+0x1f2/0x201
[    1.601183]  [<ffffffff81140118>] ? init_kmem_cache_cpus+0x3e/0x4d
[    1.601189]  [<ffffffff811415cc>] ? sysfs_slab_alias+0x60/0x8d
[    1.601195]  [<ffffffff8169aa3d>] identify_cpu+0x256/0x34d
[    1.601201]  [<ffffffff81141649>] ? kmem_cache_alloc+0x50/0xe4
[    1.601209]  [<ffffffff81cd1c51>] identify_boot_cpu+0x10/0x3c
[    1.601216]  [<ffffffff81cd2052>] check_bugs+0x9/0x2d
[    1.601222]  [<ffffffff81cc7e08>] start_kernel+0x445/0x461
[    1.601227]  [<ffffffff81cc77d6>] ? kernel_init+0x1d8/0x1d8
[    1.601233]  [<ffffffff81cc72cf>] x86_64_start_reservations+0xba/0xc1
[    1.601240]  [<ffffffff81041797>] ? xen_setup_runstate_info+0x2c/0x36
[    1.601247]  [<ffffffff81ccbdc4>] xen_start_kernel+0x58b/0x592
[    1.601251] Code: 0f 85 d3 00 00 00 31 c0 48 83 3d cd 5c 58 00 00 48 
8d 7d b0 b9 08 00 00 00 f3
ab c7 45 b4 05 10 01 c0 c7 45 cc 3a 20 5a 9c 75 04 <0f> 0b eb fe 4c 8d 
75 b0 4c 89 f7 ff 14 25 b0 11
c2 81 85 c0 8b
[    1.601374] RIP  [<ffffffff8169b4fe>] init_amd+0x21d/0x603
[    1.601381]  RSP <ffffffff81c01df8>
[    1.601391] ---[ end trace a7919e7f17c0a725 ]---
[    1.601397] Kernel panic - not syncing: Attempted to kill the idle task!

> Finally, I would question whether re-enabling the topology
> extensions under Xen shouldn't be skipped altogether, perhaps
> even on Dom0 (as the hypervisor is controlling this MSR, but in
> any case on DomU - the hypervisor won't allow (read: ignore,
> not fault on) the write anyway (and will log a message for each
> (v)CPU that attempts this).

This is probably right. Let me think about this.

Thanks for picking this up.

Regards,
Andre.

>
>> Signed-off-by: Andre Przywara<andre.przywara@amd.com>
>> Cc: stable@vger.kernel.org # 3.4+
>> ---
>>   arch/x86/kernel/cpu/amd.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index 146bb62..80ccd99 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>>   	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
>>   		u64 val;
>>
>> -		if (!rdmsrl_amd_safe(0xc0011005,&val)) {
>> +		if (!rdmsrl_safe(0xc0011005,&val)) {
>>   			val |= 1ULL<<  54;
>> -			wrmsrl_amd_safe(0xc0011005, val);
>> +			checking_wrmsrl(0xc0011005, val);
>>   			rdmsrl(0xc0011005, val);
>>   			if (val&  (1ULL<<  54)) {
>>   				set_cpu_cap(c, X86_FEATURE_TOPOEXT);
>
>
>


-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 14:02     ` Andre Przywara
  0 siblings, 0 replies; 63+ messages in thread
From: Andre Przywara @ 2012-05-30 14:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: jeremy, xen-devel, konrad.wilk, Shin, Jacob, linux-kernel, hpa,
	mingo, tglx

On 05/30/2012 03:33 PM, Jan Beulich wrote:
>>>> On 30.05.12 at 15:10, Andre Przywara<andre.przywara@amd.com>  wrote:
>> Because we are behind a family check before tweaking the topology
>> bit, we can use the standard rd/wrmsr variants for the CPUID feature
>> register.
>> This fixes a crash when using the kernel as a Xen Dom0 on affected
>> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
>> yet (this will be fixed in another patch).
>
> I'm not following: If the AMD variants (putting a special value into
> %edi) can be freely replaced by the non-AMD variants, why did
> the AMD special ones get used in the first place?

Older CPUs (K8) needed the AMD variants, starting with family 10h we can 
use the normal versions.

> Further, I can't see how checking_wrmsrl() is being paravirtualized
> any better than wrmsrl_amd_safe() - both have nothing but an
> exception handling fixup attached to the wrmsr invocation. Care
> to point out what actual crash it is that was seen?

AFAIK, the difference is between the "l" and the regs version for 
rd/wrmsr. We have a patch already here to fix this. Will send it out 
soon. Jacob, can you comment on this?

The crash dump:

[    1.601021] ------------[ cut here ]------------
[    1.601025] kernel BUG at 
/buildrepos/linux-2.6-stable/arch/x86/include/asm/paravirt.h:133!
[    1.601031] invalid opcode: 0000 [#1] SMP
[    1.601038] CPU 0
[    1.601041] Modules linked in:
[    1.601047]
[    1.601050] Pid: 0, comm: swapper/0 Not tainted 3.4.0 #1 AMD Virgo 
Platform/Annapurna
[    1.601061] RIP: e030:[<ffffffff8169b4fe>]  [<ffffffff8169b4fe>] 
init_amd+0x21d/0x603
[    1.601072] RSP: e02b:ffffffff81c01df8  EFLAGS: 00010246
[    1.601076] RAX: 0000000000000000 RBX: ffffffff81ca7500 RCX: 
0000000000000000
[    1.601081] RDX: 0000000000000020 RSI: 000000000000c000 RDI: 
ffffffff81c01e78
[    1.601086] RBP: ffffffff81c01ea8 R08: 0000000000004003 R09: 
00000000ffffffff
[    1.601090] R10: 0000000000000000 R11: 00000000ffffffff R12: 
ffffffff81ca7574
[    1.601095] R13: ffffffff81ca7514 R14: ffffffff81ca754c R15: 
ffffffff81ca756c
[    1.601103] FS:  0000000000000000(0000) GS:ffff8801ce600000(0000) 
knlGS:0000000000000000
[    1.601108] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.601112] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 
0000000000040660
[    1.601117] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[    1.601122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[    1.601127] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, 
task ffffffff81c14020)
[    1.601131] Stack:
[    1.601134]  ffffffff81c01ea8 ffffffff8169a7d8 0000001600000017 
0000000000000000
[    1.601146]  ffff8801c1c3ac00 ffff8801c1c002d0 ffffffff81c01e48 
ffffffff81140118
[    1.601157]  ffffffff811415cc ffff8801c1c04220 ffff8801c1c02480 
00000000000000d0
[    1.601169] Call Trace:
[    1.601175]  [<ffffffff8169a7d8>] ? get_cpu_cap+0x1f2/0x201
[    1.601183]  [<ffffffff81140118>] ? init_kmem_cache_cpus+0x3e/0x4d
[    1.601189]  [<ffffffff811415cc>] ? sysfs_slab_alias+0x60/0x8d
[    1.601195]  [<ffffffff8169aa3d>] identify_cpu+0x256/0x34d
[    1.601201]  [<ffffffff81141649>] ? kmem_cache_alloc+0x50/0xe4
[    1.601209]  [<ffffffff81cd1c51>] identify_boot_cpu+0x10/0x3c
[    1.601216]  [<ffffffff81cd2052>] check_bugs+0x9/0x2d
[    1.601222]  [<ffffffff81cc7e08>] start_kernel+0x445/0x461
[    1.601227]  [<ffffffff81cc77d6>] ? kernel_init+0x1d8/0x1d8
[    1.601233]  [<ffffffff81cc72cf>] x86_64_start_reservations+0xba/0xc1
[    1.601240]  [<ffffffff81041797>] ? xen_setup_runstate_info+0x2c/0x36
[    1.601247]  [<ffffffff81ccbdc4>] xen_start_kernel+0x58b/0x592
[    1.601251] Code: 0f 85 d3 00 00 00 31 c0 48 83 3d cd 5c 58 00 00 48 
8d 7d b0 b9 08 00 00 00 f3
ab c7 45 b4 05 10 01 c0 c7 45 cc 3a 20 5a 9c 75 04 <0f> 0b eb fe 4c 8d 
75 b0 4c 89 f7 ff 14 25 b0 11
c2 81 85 c0 8b
[    1.601374] RIP  [<ffffffff8169b4fe>] init_amd+0x21d/0x603
[    1.601381]  RSP <ffffffff81c01df8>
[    1.601391] ---[ end trace a7919e7f17c0a725 ]---
[    1.601397] Kernel panic - not syncing: Attempted to kill the idle task!

> Finally, I would question whether re-enabling the topology
> extensions under Xen shouldn't be skipped altogether, perhaps
> even on Dom0 (as the hypervisor is controlling this MSR, but in
> any case on DomU - the hypervisor won't allow (read: ignore,
> not fault on) the write anyway (and will log a message for each
> (v)CPU that attempts this).

This is probably right. Let me think about this.

Thanks for picking this up.

Regards,
Andre.

>
>> Signed-off-by: Andre Przywara<andre.przywara@amd.com>
>> Cc: stable@vger.kernel.org # 3.4+
>> ---
>>   arch/x86/kernel/cpu/amd.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index 146bb62..80ccd99 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>>   	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
>>   		u64 val;
>>
>> -		if (!rdmsrl_amd_safe(0xc0011005,&val)) {
>> +		if (!rdmsrl_safe(0xc0011005,&val)) {
>>   			val |= 1ULL<<  54;
>> -			wrmsrl_amd_safe(0xc0011005, val);
>> +			checking_wrmsrl(0xc0011005, val);
>>   			rdmsrl(0xc0011005, val);
>>   			if (val&  (1ULL<<  54)) {
>>   				set_cpu_cap(c, X86_FEATURE_TOPOEXT);
>
>
>


-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:02     ` Andre Przywara
@ 2012-05-30 14:23       ` Jan Beulich
  -1 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 14:23 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Jacob Shin, mingo, jeremy, tglx, xen-devel, konrad.wilk,
	linux-kernel, hpa

>>> On 30.05.12 at 16:02, Andre Przywara <andre.przywara@amd.com> wrote:
> On 05/30/2012 03:33 PM, Jan Beulich wrote:
>> Further, I can't see how checking_wrmsrl() is being paravirtualized
>> any better than wrmsrl_amd_safe() - both have nothing but an
>> exception handling fixup attached to the wrmsr invocation. Care
>> to point out what actual crash it is that was seen?
> 
> AFAIK, the difference is between the "l" and the regs version for 
> rd/wrmsr. We have a patch already here to fix this. Will send it out 
> soon. Jacob, can you comment on this?

I see - the Xen code blindly overwrites pv_cpu_ops, despite not
having initialized all members. That's an obvious oversight of the
patch that introduced the _regs variants.

Plus having secondary instances of things like rdmsrl_amd_safe()
in asm/paravirt.h seems pretty strange an approach (which was
why initially I didn't spot how a crash could happen there) - only
the lowest level functions should get re-implemented here.

>> Finally, I would question whether re-enabling the topology
>> extensions under Xen shouldn't be skipped altogether, perhaps
>> even on Dom0 (as the hypervisor is controlling this MSR, but in
>> any case on DomU - the hypervisor won't allow (read: ignore,
>> not fault on) the write anyway (and will log a message for each
>> (v)CPU that attempts this).
> 
> This is probably right. Let me think about this.

I'll submit a respective hypervisor side patch soonish.

Jan


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 14:23       ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 14:23 UTC (permalink / raw)
  To: Andre Przywara
  Cc: jeremy, xen-devel, konrad.wilk, Jacob Shin, linux-kernel, hpa,
	mingo, tglx

>>> On 30.05.12 at 16:02, Andre Przywara <andre.przywara@amd.com> wrote:
> On 05/30/2012 03:33 PM, Jan Beulich wrote:
>> Further, I can't see how checking_wrmsrl() is being paravirtualized
>> any better than wrmsrl_amd_safe() - both have nothing but an
>> exception handling fixup attached to the wrmsr invocation. Care
>> to point out what actual crash it is that was seen?
> 
> AFAIK, the difference is between the "l" and the regs version for 
> rd/wrmsr. We have a patch already here to fix this. Will send it out 
> soon. Jacob, can you comment on this?

I see - the Xen code blindly overwrites pv_cpu_ops, despite not
having initialized all members. That's an obvious oversight of the
patch that introduced the _regs variants.

Plus having secondary instances of things like rdmsrl_amd_safe()
in asm/paravirt.h seems pretty strange an approach (which was
why initially I didn't spot how a crash could happen there) - only
the lowest level functions should get re-implemented here.

>> Finally, I would question whether re-enabling the topology
>> extensions under Xen shouldn't be skipped altogether, perhaps
>> even on Dom0 (as the hypervisor is controlling this MSR, but in
>> any case on DomU - the hypervisor won't allow (read: ignore,
>> not fault on) the write anyway (and will log a message for each
>> (v)CPU that attempts this).
> 
> This is probably right. Let me think about this.

I'll submit a respective hypervisor side patch soonish.

Jan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 13:10 [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Andre Przywara
  2012-05-30 13:33   ` Jan Beulich
@ 2012-05-30 14:39 ` Konrad Rzeszutek Wilk
  2012-05-30 14:50   ` H. Peter Anvin
  2012-05-30 14:42 ` H. Peter Anvin
  2012-05-30 23:31 ` H. Peter Anvin
  3 siblings, 1 reply; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 14:39 UTC (permalink / raw)
  To: Andre Przywara; +Cc: mingo, hpa, tglx, jeremy, xen-devel, stable, linux-kernel

On Wed, May 30, 2012 at 03:10:02PM +0200, Andre Przywara wrote:
> Because we are behind a family check before tweaking the topology
> bit, we can use the standard rd/wrmsr variants for the CPUID feature
> register.
> This fixes a crash when using the kernel as a Xen Dom0 on affected
> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> yet (this will be fixed in another patch).

So with a rdmsrl_amd_safe and wrmsrl_amd_safe being implemented in
the pv_cpu_ops - would this patch even be neccessary?

> 
> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
> Cc: stable@vger.kernel.org # 3.4+
> ---
>  arch/x86/kernel/cpu/amd.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 146bb62..80ccd99 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
>  	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
>  		u64 val;
>  
> -		if (!rdmsrl_amd_safe(0xc0011005, &val)) {
> +		if (!rdmsrl_safe(0xc0011005, &val)) {
>  			val |= 1ULL << 54;
> -			wrmsrl_amd_safe(0xc0011005, val);
> +			checking_wrmsrl(0xc0011005, val);
>  			rdmsrl(0xc0011005, val);
>  			if (val & (1ULL << 54)) {
>  				set_cpu_cap(c, X86_FEATURE_TOPOEXT);
> -- 
> 1.7.4.4
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 13:10 [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Andre Przywara
  2012-05-30 13:33   ` Jan Beulich
  2012-05-30 14:39 ` Konrad Rzeszutek Wilk
@ 2012-05-30 14:42 ` H. Peter Anvin
  2012-05-30 14:55   ` Borislav Petkov
  2012-05-30 23:31 ` H. Peter Anvin
  3 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 14:42 UTC (permalink / raw)
  To: Andre Przywara
  Cc: mingo, tglx, konrad.wilk, jeremy, linux-kernel, xen-devel, stable

On 05/30/2012 06:10 AM, Andre Przywara wrote:
> Because we are behind a family check before tweaking the topology
> bit, we can use the standard rd/wrmsr variants for the CPUID feature
> register.

That is not what the *msr*_amd*() functions do.

NAK.  This is a totally bogus patch.

	-hpa


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:23       ` Jan Beulich
  (?)
@ 2012-05-30 14:42       ` H. Peter Anvin
  2012-05-30 14:49         ` Konrad Rzeszutek Wilk
  -1 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 14:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	konrad.wilk, linux-kernel

On 05/30/2012 07:23 AM, Jan Beulich wrote:
> 
> I see - the Xen code blindly overwrites pv_cpu_ops, despite not
> having initialized all members. That's an obvious oversight of the
> patch that introduced the _regs variants.
> 
> Plus having secondary instances of things like rdmsrl_amd_safe()
> in asm/paravirt.h seems pretty strange an approach (which was
> why initially I didn't spot how a crash could happen there) - only
> the lowest level functions should get re-implemented here.
> 

This kinds of things are part of why Xen makes me want to cry regularly.

	-hpa


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:02     ` Andre Przywara
@ 2012-05-30 14:48       ` Jacob Shin
  -1 siblings, 0 replies; 63+ messages in thread
From: Jacob Shin @ 2012-05-30 14:48 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Jan Beulich, mingo, jeremy, tglx, xen-devel, konrad.wilk,
	linux-kernel, hpa

On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> On 05/30/2012 03:33 PM, Jan Beulich wrote:
> >>>>On 30.05.12 at 15:10, Andre Przywara<andre.przywara@amd.com>  wrote:
> >>Because we are behind a family check before tweaking the topology
> >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> >>register.
> >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> >>yet (this will be fixed in another patch).
> >
> >I'm not following: If the AMD variants (putting a special value into
> >%edi) can be freely replaced by the non-AMD variants, why did
> >the AMD special ones get used in the first place?
> 
> Older CPUs (K8) needed the AMD variants, starting with family 10h we
> can use the normal versions.
> 
> >Further, I can't see how checking_wrmsrl() is being paravirtualized
> >any better than wrmsrl_amd_safe() - both have nothing but an
> >exception handling fixup attached to the wrmsr invocation. Care
> >to point out what actual crash it is that was seen?
> 
> AFAIK, the difference is between the "l" and the regs version for
> rd/wrmsr. We have a patch already here to fix this. Will send it out
> soon. Jacob, can you comment on this?

Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
by enlighten.

> 
> The crash dump:
> 
> [    1.601021] ------------[ cut here ]------------
> [    1.601025] kernel BUG at
> /buildrepos/linux-2.6-stable/arch/x86/include/asm/paravirt.h:133!
> [    1.601031] invalid opcode: 0000 [#1] SMP
> [    1.601038] CPU 0
> [    1.601041] Modules linked in:
> [    1.601047]
> [    1.601050] Pid: 0, comm: swapper/0 Not tainted 3.4.0 #1 AMD
> Virgo Platform/Annapurna
> [    1.601061] RIP: e030:[<ffffffff8169b4fe>]  [<ffffffff8169b4fe>]
> init_amd+0x21d/0x603
> [    1.601072] RSP: e02b:ffffffff81c01df8  EFLAGS: 00010246
> [    1.601076] RAX: 0000000000000000 RBX: ffffffff81ca7500 RCX:
> 0000000000000000
> [    1.601081] RDX: 0000000000000020 RSI: 000000000000c000 RDI:
> ffffffff81c01e78
> [    1.601086] RBP: ffffffff81c01ea8 R08: 0000000000004003 R09:
> 00000000ffffffff
> [    1.601090] R10: 0000000000000000 R11: 00000000ffffffff R12:
> ffffffff81ca7574
> [    1.601095] R13: ffffffff81ca7514 R14: ffffffff81ca754c R15:
> ffffffff81ca756c
> [    1.601103] FS:  0000000000000000(0000) GS:ffff8801ce600000(0000)
> knlGS:0000000000000000
> [    1.601108] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    1.601112] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4:
> 0000000000040660
> [    1.601117] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [    1.601122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [    1.601127] Process swapper/0 (pid: 0, threadinfo
> ffffffff81c00000, task ffffffff81c14020)
> [    1.601131] Stack:
> [    1.601134]  ffffffff81c01ea8 ffffffff8169a7d8 0000001600000017
> 0000000000000000
> [    1.601146]  ffff8801c1c3ac00 ffff8801c1c002d0 ffffffff81c01e48
> ffffffff81140118
> [    1.601157]  ffffffff811415cc ffff8801c1c04220 ffff8801c1c02480
> 00000000000000d0
> [    1.601169] Call Trace:
> [    1.601175]  [<ffffffff8169a7d8>] ? get_cpu_cap+0x1f2/0x201
> [    1.601183]  [<ffffffff81140118>] ? init_kmem_cache_cpus+0x3e/0x4d
> [    1.601189]  [<ffffffff811415cc>] ? sysfs_slab_alias+0x60/0x8d
> [    1.601195]  [<ffffffff8169aa3d>] identify_cpu+0x256/0x34d
> [    1.601201]  [<ffffffff81141649>] ? kmem_cache_alloc+0x50/0xe4
> [    1.601209]  [<ffffffff81cd1c51>] identify_boot_cpu+0x10/0x3c
> [    1.601216]  [<ffffffff81cd2052>] check_bugs+0x9/0x2d
> [    1.601222]  [<ffffffff81cc7e08>] start_kernel+0x445/0x461
> [    1.601227]  [<ffffffff81cc77d6>] ? kernel_init+0x1d8/0x1d8
> [    1.601233]  [<ffffffff81cc72cf>] x86_64_start_reservations+0xba/0xc1
> [    1.601240]  [<ffffffff81041797>] ? xen_setup_runstate_info+0x2c/0x36
> [    1.601247]  [<ffffffff81ccbdc4>] xen_start_kernel+0x58b/0x592
> [    1.601251] Code: 0f 85 d3 00 00 00 31 c0 48 83 3d cd 5c 58 00 00
> 48 8d 7d b0 b9 08 00 00 00 f3
> ab c7 45 b4 05 10 01 c0 c7 45 cc 3a 20 5a 9c 75 04 <0f> 0b eb fe 4c
> 8d 75 b0 4c 89 f7 ff 14 25 b0 11
> c2 81 85 c0 8b
> [    1.601374] RIP  [<ffffffff8169b4fe>] init_amd+0x21d/0x603
> [    1.601381]  RSP <ffffffff81c01df8>
> [    1.601391] ---[ end trace a7919e7f17c0a725 ]---
> [    1.601397] Kernel panic - not syncing: Attempted to kill the idle task!
> 
> >Finally, I would question whether re-enabling the topology
> >extensions under Xen shouldn't be skipped altogether, perhaps
> >even on Dom0 (as the hypervisor is controlling this MSR, but in
> >any case on DomU - the hypervisor won't allow (read: ignore,
> >not fault on) the write anyway (and will log a message for each
> >(v)CPU that attempts this).
> 
> This is probably right. Let me think about this.
> 
> Thanks for picking this up.
> 
> Regards,
> Andre.
> 
> >
> >>Signed-off-by: Andre Przywara<andre.przywara@amd.com>
> >>Cc: stable@vger.kernel.org # 3.4+
> >>---
> >>  arch/x86/kernel/cpu/amd.c |    4 ++--
> >>  1 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> >>index 146bb62..80ccd99 100644
> >>--- a/arch/x86/kernel/cpu/amd.c
> >>+++ b/arch/x86/kernel/cpu/amd.c
> >>@@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
> >>  	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
> >>  		u64 val;
> >>
> >>-		if (!rdmsrl_amd_safe(0xc0011005,&val)) {
> >>+		if (!rdmsrl_safe(0xc0011005,&val)) {
> >>  			val |= 1ULL<<  54;
> >>-			wrmsrl_amd_safe(0xc0011005, val);
> >>+			checking_wrmsrl(0xc0011005, val);
> >>  			rdmsrl(0xc0011005, val);
> >>  			if (val&  (1ULL<<  54)) {
> >>  				set_cpu_cap(c, X86_FEATURE_TOPOEXT);
> >
> >
> >
> 
> 
> -- 
> Andre Przywara
> AMD-Operating System Research Center (OSRC), Dresden, Germany


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 14:48       ` Jacob Shin
  0 siblings, 0 replies; 63+ messages in thread
From: Jacob Shin @ 2012-05-30 14:48 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Jan Beulich, mingo, jeremy, tglx, xen-devel, konrad.wilk,
	linux-kernel, hpa

On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> On 05/30/2012 03:33 PM, Jan Beulich wrote:
> >>>>On 30.05.12 at 15:10, Andre Przywara<andre.przywara@amd.com>  wrote:
> >>Because we are behind a family check before tweaking the topology
> >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> >>register.
> >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> >>yet (this will be fixed in another patch).
> >
> >I'm not following: If the AMD variants (putting a special value into
> >%edi) can be freely replaced by the non-AMD variants, why did
> >the AMD special ones get used in the first place?
> 
> Older CPUs (K8) needed the AMD variants, starting with family 10h we
> can use the normal versions.
> 
> >Further, I can't see how checking_wrmsrl() is being paravirtualized
> >any better than wrmsrl_amd_safe() - both have nothing but an
> >exception handling fixup attached to the wrmsr invocation. Care
> >to point out what actual crash it is that was seen?
> 
> AFAIK, the difference is between the "l" and the regs version for
> rd/wrmsr. We have a patch already here to fix this. Will send it out
> soon. Jacob, can you comment on this?

Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
by enlighten.

> 
> The crash dump:
> 
> [    1.601021] ------------[ cut here ]------------
> [    1.601025] kernel BUG at
> /buildrepos/linux-2.6-stable/arch/x86/include/asm/paravirt.h:133!
> [    1.601031] invalid opcode: 0000 [#1] SMP
> [    1.601038] CPU 0
> [    1.601041] Modules linked in:
> [    1.601047]
> [    1.601050] Pid: 0, comm: swapper/0 Not tainted 3.4.0 #1 AMD
> Virgo Platform/Annapurna
> [    1.601061] RIP: e030:[<ffffffff8169b4fe>]  [<ffffffff8169b4fe>]
> init_amd+0x21d/0x603
> [    1.601072] RSP: e02b:ffffffff81c01df8  EFLAGS: 00010246
> [    1.601076] RAX: 0000000000000000 RBX: ffffffff81ca7500 RCX:
> 0000000000000000
> [    1.601081] RDX: 0000000000000020 RSI: 000000000000c000 RDI:
> ffffffff81c01e78
> [    1.601086] RBP: ffffffff81c01ea8 R08: 0000000000004003 R09:
> 00000000ffffffff
> [    1.601090] R10: 0000000000000000 R11: 00000000ffffffff R12:
> ffffffff81ca7574
> [    1.601095] R13: ffffffff81ca7514 R14: ffffffff81ca754c R15:
> ffffffff81ca756c
> [    1.601103] FS:  0000000000000000(0000) GS:ffff8801ce600000(0000)
> knlGS:0000000000000000
> [    1.601108] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    1.601112] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4:
> 0000000000040660
> [    1.601117] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [    1.601122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [    1.601127] Process swapper/0 (pid: 0, threadinfo
> ffffffff81c00000, task ffffffff81c14020)
> [    1.601131] Stack:
> [    1.601134]  ffffffff81c01ea8 ffffffff8169a7d8 0000001600000017
> 0000000000000000
> [    1.601146]  ffff8801c1c3ac00 ffff8801c1c002d0 ffffffff81c01e48
> ffffffff81140118
> [    1.601157]  ffffffff811415cc ffff8801c1c04220 ffff8801c1c02480
> 00000000000000d0
> [    1.601169] Call Trace:
> [    1.601175]  [<ffffffff8169a7d8>] ? get_cpu_cap+0x1f2/0x201
> [    1.601183]  [<ffffffff81140118>] ? init_kmem_cache_cpus+0x3e/0x4d
> [    1.601189]  [<ffffffff811415cc>] ? sysfs_slab_alias+0x60/0x8d
> [    1.601195]  [<ffffffff8169aa3d>] identify_cpu+0x256/0x34d
> [    1.601201]  [<ffffffff81141649>] ? kmem_cache_alloc+0x50/0xe4
> [    1.601209]  [<ffffffff81cd1c51>] identify_boot_cpu+0x10/0x3c
> [    1.601216]  [<ffffffff81cd2052>] check_bugs+0x9/0x2d
> [    1.601222]  [<ffffffff81cc7e08>] start_kernel+0x445/0x461
> [    1.601227]  [<ffffffff81cc77d6>] ? kernel_init+0x1d8/0x1d8
> [    1.601233]  [<ffffffff81cc72cf>] x86_64_start_reservations+0xba/0xc1
> [    1.601240]  [<ffffffff81041797>] ? xen_setup_runstate_info+0x2c/0x36
> [    1.601247]  [<ffffffff81ccbdc4>] xen_start_kernel+0x58b/0x592
> [    1.601251] Code: 0f 85 d3 00 00 00 31 c0 48 83 3d cd 5c 58 00 00
> 48 8d 7d b0 b9 08 00 00 00 f3
> ab c7 45 b4 05 10 01 c0 c7 45 cc 3a 20 5a 9c 75 04 <0f> 0b eb fe 4c
> 8d 75 b0 4c 89 f7 ff 14 25 b0 11
> c2 81 85 c0 8b
> [    1.601374] RIP  [<ffffffff8169b4fe>] init_amd+0x21d/0x603
> [    1.601381]  RSP <ffffffff81c01df8>
> [    1.601391] ---[ end trace a7919e7f17c0a725 ]---
> [    1.601397] Kernel panic - not syncing: Attempted to kill the idle task!
> 
> >Finally, I would question whether re-enabling the topology
> >extensions under Xen shouldn't be skipped altogether, perhaps
> >even on Dom0 (as the hypervisor is controlling this MSR, but in
> >any case on DomU - the hypervisor won't allow (read: ignore,
> >not fault on) the write anyway (and will log a message for each
> >(v)CPU that attempts this).
> 
> This is probably right. Let me think about this.
> 
> Thanks for picking this up.
> 
> Regards,
> Andre.
> 
> >
> >>Signed-off-by: Andre Przywara<andre.przywara@amd.com>
> >>Cc: stable@vger.kernel.org # 3.4+
> >>---
> >>  arch/x86/kernel/cpu/amd.c |    4 ++--
> >>  1 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> >>index 146bb62..80ccd99 100644
> >>--- a/arch/x86/kernel/cpu/amd.c
> >>+++ b/arch/x86/kernel/cpu/amd.c
> >>@@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
> >>  	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
> >>  		u64 val;
> >>
> >>-		if (!rdmsrl_amd_safe(0xc0011005,&val)) {
> >>+		if (!rdmsrl_safe(0xc0011005,&val)) {
> >>  			val |= 1ULL<<  54;
> >>-			wrmsrl_amd_safe(0xc0011005, val);
> >>+			checking_wrmsrl(0xc0011005, val);
> >>  			rdmsrl(0xc0011005, val);
> >>  			if (val&  (1ULL<<  54)) {
> >>  				set_cpu_cap(c, X86_FEATURE_TOPOEXT);
> >
> >
> >
> 
> 
> -- 
> Andre Przywara
> AMD-Operating System Research Center (OSRC), Dresden, Germany

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:42       ` [Xen-devel] " H. Peter Anvin
@ 2012-05-30 14:49         ` Konrad Rzeszutek Wilk
  2012-05-30 15:12           ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 14:49 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Jan Beulich, Andre Przywara, Jacob Shin, mingo, jeremy, tglx,
	xen-devel, linux-kernel

On Wed, May 30, 2012 at 07:42:56AM -0700, H. Peter Anvin wrote:
> On 05/30/2012 07:23 AM, Jan Beulich wrote:
> > 
> > I see - the Xen code blindly overwrites pv_cpu_ops, despite not
> > having initialized all members. That's an obvious oversight of the
> > patch that introduced the _regs variants.
> > 
> > Plus having secondary instances of things like rdmsrl_amd_safe()
> > in asm/paravirt.h seems pretty strange an approach (which was
> > why initially I didn't spot how a crash could happen there) - only
> > the lowest level functions should get re-implemented here.
> > 
> 
> This kinds of things are part of why Xen makes me want to cry regularly.

It looks like an oversight by Borislav (177fed1ee8d727c39601ce9fc2299b4cb25a718e
and 132ec92f) and Yinghai (b05f78f5c713eda2c34e495d92495ee4f1c3b5e1) where
they added these wrappers way back in 2009!

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:48       ` Jacob Shin
  (?)
@ 2012-05-30 14:50       ` Konrad Rzeszutek Wilk
  2012-05-30 15:03           ` Jacob Shin
  -1 siblings, 1 reply; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 14:50 UTC (permalink / raw)
  To: Jacob Shin
  Cc: Andre Przywara, Jan Beulich, mingo, jeremy, tglx, xen-devel,
	linux-kernel, hpa

On Wed, May 30, 2012 at 09:48:51AM -0500, Jacob Shin wrote:
> On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> > On 05/30/2012 03:33 PM, Jan Beulich wrote:
> > >>>>On 30.05.12 at 15:10, Andre Przywara<andre.przywara@amd.com>  wrote:
> > >>Because we are behind a family check before tweaking the topology
> > >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> > >>register.
> > >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> > >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> > >>yet (this will be fixed in another patch).
> > >
> > >I'm not following: If the AMD variants (putting a special value into
> > >%edi) can be freely replaced by the non-AMD variants, why did
> > >the AMD special ones get used in the first place?
> > 
> > Older CPUs (K8) needed the AMD variants, starting with family 10h we
> > can use the normal versions.
> > 
> > >Further, I can't see how checking_wrmsrl() is being paravirtualized
> > >any better than wrmsrl_amd_safe() - both have nothing but an
> > >exception handling fixup attached to the wrmsr invocation. Care
> > >to point out what actual crash it is that was seen?
> > 
> > AFAIK, the difference is between the "l" and the regs version for
> > rd/wrmsr. We have a patch already here to fix this. Will send it out
> > soon. Jacob, can you comment on this?
> 
> Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
> but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
> by enlighten.

So would a patch to implements the rdmsr_regs fix this crash?

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:39 ` Konrad Rzeszutek Wilk
@ 2012-05-30 14:50   ` H. Peter Anvin
  2012-05-30 14:51     ` Konrad Rzeszutek Wilk
  2012-05-30 15:08       ` Jan Beulich
  0 siblings, 2 replies; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 14:50 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andre Przywara, mingo, tglx, jeremy, xen-devel, stable, linux-kernel

On 05/30/2012 07:39 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, May 30, 2012 at 03:10:02PM +0200, Andre Przywara wrote:
>> Because we are behind a family check before tweaking the topology
>> bit, we can use the standard rd/wrmsr variants for the CPUID feature
>> register.
>> This fixes a crash when using the kernel as a Xen Dom0 on affected
>> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
>> yet (this will be fixed in another patch).
> 
> So with a rdmsrl_amd_safe and wrmsrl_amd_safe being implemented in
> the pv_cpu_ops - would this patch even be neccessary?
> 

That is still bogus;  a better thing would be to implement the _regs
interface.  Even better would be to trap and emulate rdmsr/wrmsr!

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:50   ` H. Peter Anvin
@ 2012-05-30 14:51     ` Konrad Rzeszutek Wilk
  2012-05-30 15:08       ` Jan Beulich
  1 sibling, 0 replies; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 14:51 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andre Przywara, mingo, tglx, jeremy, xen-devel, stable, linux-kernel

On Wed, May 30, 2012 at 07:50:15AM -0700, H. Peter Anvin wrote:
> On 05/30/2012 07:39 AM, Konrad Rzeszutek Wilk wrote:
> > On Wed, May 30, 2012 at 03:10:02PM +0200, Andre Przywara wrote:
> >> Because we are behind a family check before tweaking the topology
> >> bit, we can use the standard rd/wrmsr variants for the CPUID feature
> >> register.
> >> This fixes a crash when using the kernel as a Xen Dom0 on affected
> >> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> >> yet (this will be fixed in another patch).
> > 
> > So with a rdmsrl_amd_safe and wrmsrl_amd_safe being implemented in
> > the pv_cpu_ops - would this patch even be neccessary?
> > 
> 
> That is still bogus;  a better thing would be to implement the _regs
> interface.  Even better would be to trap and emulate rdmsr/wrmsr!

That is what I meant - implement these two:

	rdmsr_regs = native_rdmsr_safe_regs,
        .wrmsr_regs = native_wrmsr_safe_regs,

Xen already traps the rdmsr/wrms - I believe it just didn't do
anything for this specific MSR.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:42 ` H. Peter Anvin
@ 2012-05-30 14:55   ` Borislav Petkov
  2012-05-30 14:58     ` H. Peter Anvin
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 14:55 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andre Przywara, mingo, tglx, konrad.wilk, jeremy, linux-kernel,
	xen-devel, stable

On Wed, May 30, 2012 at 07:42:27AM -0700, H. Peter Anvin wrote:
> On 05/30/2012 06:10 AM, Andre Przywara wrote:
> > Because we are behind a family check before tweaking the topology
> > bit, we can use the standard rd/wrmsr variants for the CPUID feature
> > register.
> 
> That is not what the *msr*_amd*() functions do.
> 
> NAK.  This is a totally bogus patch.

The *msr*_amd*() variants were used instead of the normal *msrl_safe
variants although the AMD variants weren't needed there at all.

This has no issue on baremetal but breaks xen and this is how we caught
this.

So the patch corrects the original patch so that xen is happy too.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:55   ` Borislav Petkov
@ 2012-05-30 14:58     ` H. Peter Anvin
  2012-05-30 15:00       ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 14:58 UTC (permalink / raw)
  To: Borislav Petkov, Andre Przywara, mingo, tglx, konrad.wilk,
	jeremy, linux-kernel, xen-devel, stable

On 05/30/2012 07:55 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 07:42:27AM -0700, H. Peter Anvin wrote:
>> On 05/30/2012 06:10 AM, Andre Przywara wrote:
>>> Because we are behind a family check before tweaking the topology
>>> bit, we can use the standard rd/wrmsr variants for the CPUID feature
>>> register.
>>
>> That is not what the *msr*_amd*() functions do.
>>
>> NAK.  This is a totally bogus patch.
> 
> The *msr*_amd*() variants were used instead of the normal *msrl_safe
> variants although the AMD variants weren't needed there at all.
> 
> This has no issue on baremetal but breaks xen and this is how we caught
> this.
> 
> So the patch corrects the original patch so that xen is happy too.
> 

If so, then fix the description to match reality and we can take the patch.

	-hpa


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:58     ` H. Peter Anvin
@ 2012-05-30 15:00       ` Borislav Petkov
  2012-05-30 15:01         ` H. Peter Anvin
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 15:00 UTC (permalink / raw)
  To: Andre Przywara
  Cc: H. Peter Anvin, mingo, tglx, konrad.wilk, jeremy, linux-kernel,
	xen-devel, stable

On Wed, May 30, 2012 at 07:58:23AM -0700, H. Peter Anvin wrote:
> If so, then fix the description to match reality and we can take the patch.

Andre, would you please do that?

Thanks.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:00       ` Borislav Petkov
@ 2012-05-30 15:01         ` H. Peter Anvin
  2012-05-30 15:05           ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 15:01 UTC (permalink / raw)
  To: Borislav Petkov, Andre Przywara, mingo, tglx, konrad.wilk,
	jeremy, linux-kernel, xen-devel, stable

On 05/30/2012 08:00 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 07:58:23AM -0700, H. Peter Anvin wrote:
>> If so, then fix the description to match reality and we can take the patch.
> 
> Andre, would you please do that?
> 

Ah, but now we get:

>> I'm not following: If the AMD variants (putting a special value into
>> %edi) can be freely replaced by the non-AMD variants, why did
>> the AMD special ones get used in the first place?

> Older CPUs (K8) needed the AMD variants, starting with family 10h we
> can use the normal versions.

So no, not correct.

	-hpa



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:50       ` Konrad Rzeszutek Wilk
@ 2012-05-30 15:03           ` Jacob Shin
  0 siblings, 0 replies; 63+ messages in thread
From: Jacob Shin @ 2012-05-30 15:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andre Przywara, Jan Beulich, mingo, jeremy, tglx, xen-devel,
	linux-kernel, hpa

On Wed, May 30, 2012 at 10:50:05AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, May 30, 2012 at 09:48:51AM -0500, Jacob Shin wrote:
> > On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> > > On 05/30/2012 03:33 PM, Jan Beulich wrote:
> > > >>>>On 30.05.12 at 15:10, Andre Przywara<andre.przywara@amd.com>  wrote:
> > > >>Because we are behind a family check before tweaking the topology
> > > >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> > > >>register.
> > > >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> > > >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> > > >>yet (this will be fixed in another patch).
> > > >
> > > >I'm not following: If the AMD variants (putting a special value into
> > > >%edi) can be freely replaced by the non-AMD variants, why did
> > > >the AMD special ones get used in the first place?
> > > 
> > > Older CPUs (K8) needed the AMD variants, starting with family 10h we
> > > can use the normal versions.
> > > 
> > > >Further, I can't see how checking_wrmsrl() is being paravirtualized
> > > >any better than wrmsrl_amd_safe() - both have nothing but an
> > > >exception handling fixup attached to the wrmsr invocation. Care
> > > >to point out what actual crash it is that was seen?
> > > 
> > > AFAIK, the difference is between the "l" and the regs version for
> > > rd/wrmsr. We have a patch already here to fix this. Will send it out
> > > soon. Jacob, can you comment on this?
> > 
> > Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
> > but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
> > by enlighten.
> 
> So would a patch to implements the rdmsr_regs fix this crash?

Yes, the following patch also fixed the crash for us:

Implement rdmsr_regs and wrmsr_regs for Xen pvops.

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Cc: stable@vger.kernel.org # 3.4+

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..f3ae5ec 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -945,9 +945,12 @@ static void xen_write_cr4(unsigned long cr4)
 	native_write_cr4(cr4);
 }
 
-static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
+static int xen_wrmsr_safe_regs(u32 regs[8])
 {
 	int ret;
+	unsigned int msr = regs[1];
+	unsigned low = regs[0];
+	unsigned high = regs[2];
 
 	ret = 0;
 
@@ -985,12 +988,23 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
 		break;
 
 	default:
-		ret = native_write_msr_safe(msr, low, high);
+		ret = native_wrmsr_safe_regs(regs);
 	}
 
 	return ret;
 }
 
+static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
+{
+	u32 regs[8] = { 0 };
+
+	regs[0] = low;
+	regs[1] = msr;
+	regs[2] = high;
+
+	return xen_wrmsr_safe_regs(regs);
+}
+
 void xen_setup_shared_info(void)
 {
 	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
@@ -1116,7 +1130,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.wbinvd = native_wbinvd,
 
 	.read_msr = native_read_msr_safe,
+	.rdmsr_regs = native_rdmsr_safe_regs,
 	.write_msr = xen_write_msr_safe,
+	.wrmsr_regs = xen_wrmsr_safe_regs,
 	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
 



^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 15:03           ` Jacob Shin
  0 siblings, 0 replies; 63+ messages in thread
From: Jacob Shin @ 2012-05-30 15:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andre Przywara, Jan Beulich, mingo, jeremy, tglx, xen-devel,
	linux-kernel, hpa

On Wed, May 30, 2012 at 10:50:05AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, May 30, 2012 at 09:48:51AM -0500, Jacob Shin wrote:
> > On Wed, May 30, 2012 at 04:02:48PM +0200, Andre Przywara wrote:
> > > On 05/30/2012 03:33 PM, Jan Beulich wrote:
> > > >>>>On 30.05.12 at 15:10, Andre Przywara<andre.przywara@amd.com>  wrote:
> > > >>Because we are behind a family check before tweaking the topology
> > > >>bit, we can use the standard rd/wrmsr variants for the CPUID feature
> > > >>register.
> > > >>This fixes a crash when using the kernel as a Xen Dom0 on affected
> > > >>Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> > > >>yet (this will be fixed in another patch).
> > > >
> > > >I'm not following: If the AMD variants (putting a special value into
> > > >%edi) can be freely replaced by the non-AMD variants, why did
> > > >the AMD special ones get used in the first place?
> > > 
> > > Older CPUs (K8) needed the AMD variants, starting with family 10h we
> > > can use the normal versions.
> > > 
> > > >Further, I can't see how checking_wrmsrl() is being paravirtualized
> > > >any better than wrmsrl_amd_safe() - both have nothing but an
> > > >exception handling fixup attached to the wrmsr invocation. Care
> > > >to point out what actual crash it is that was seen?
> > > 
> > > AFAIK, the difference is between the "l" and the regs version for
> > > rd/wrmsr. We have a patch already here to fix this. Will send it out
> > > soon. Jacob, can you comment on this?
> > 
> > Right, the checking_wrmsrl turns into wrmsr_safe which is paravirtualized
> > but the rdmsrl_amd_safe which turns into rdmsr_regs is not paravirtualized
> > by enlighten.
> 
> So would a patch to implements the rdmsr_regs fix this crash?

Yes, the following patch also fixed the crash for us:

Implement rdmsr_regs and wrmsr_regs for Xen pvops.

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Cc: stable@vger.kernel.org # 3.4+

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..f3ae5ec 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -945,9 +945,12 @@ static void xen_write_cr4(unsigned long cr4)
 	native_write_cr4(cr4);
 }
 
-static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
+static int xen_wrmsr_safe_regs(u32 regs[8])
 {
 	int ret;
+	unsigned int msr = regs[1];
+	unsigned low = regs[0];
+	unsigned high = regs[2];
 
 	ret = 0;
 
@@ -985,12 +988,23 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
 		break;
 
 	default:
-		ret = native_write_msr_safe(msr, low, high);
+		ret = native_wrmsr_safe_regs(regs);
 	}
 
 	return ret;
 }
 
+static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
+{
+	u32 regs[8] = { 0 };
+
+	regs[0] = low;
+	regs[1] = msr;
+	regs[2] = high;
+
+	return xen_wrmsr_safe_regs(regs);
+}
+
 void xen_setup_shared_info(void)
 {
 	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
@@ -1116,7 +1130,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.wbinvd = native_wbinvd,
 
 	.read_msr = native_read_msr_safe,
+	.rdmsr_regs = native_rdmsr_safe_regs,
 	.write_msr = xen_write_msr_safe,
+	.wrmsr_regs = xen_wrmsr_safe_regs,
 	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
 

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:01         ` H. Peter Anvin
@ 2012-05-30 15:05           ` Borislav Petkov
  0 siblings, 0 replies; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 15:05 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andre Przywara, mingo, tglx, konrad.wilk, jeremy, linux-kernel,
	xen-devel, stable

On Wed, May 30, 2012 at 08:01:19AM -0700, H. Peter Anvin wrote:
> >> I'm not following: If the AMD variants (putting a special value into
> >> %edi) can be freely replaced by the non-AMD variants, why did
> >> the AMD special ones get used in the first place?
> 
> > Older CPUs (K8) needed the AMD variants, starting with family 10h we
> > can use the normal versions.
> 
> So no, not correct.

I don't see what the problem is: the amd* variants shouldn't have been
used at all in the first place. Fullstop. The patch should've been doing
*msrl_safe from the get-go. Regardless of family.

So what's the issue?

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:50   ` H. Peter Anvin
@ 2012-05-30 15:08       ` Jan Beulich
  2012-05-30 15:08       ` Jan Beulich
  1 sibling, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 15:08 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andre Przywara, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, stable

>>> On 30.05.12 at 16:50, "H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/30/2012 07:39 AM, Konrad Rzeszutek Wilk wrote:
>> On Wed, May 30, 2012 at 03:10:02PM +0200, Andre Przywara wrote:
>>> Because we are behind a family check before tweaking the topology
>>> bit, we can use the standard rd/wrmsr variants for the CPUID feature
>>> register.
>>> This fixes a crash when using the kernel as a Xen Dom0 on affected
>>> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
>>> yet (this will be fixed in another patch).
>> 
>> So with a rdmsrl_amd_safe and wrmsrl_amd_safe being implemented in
>> the pv_cpu_ops - would this patch even be neccessary?
>> 
> 
> That is still bogus;  a better thing would be to implement the _regs
> interface.  Even better would be to trap and emulate rdmsr/wrmsr!

The crash is not on the wrmsr instruction, but on the paravirt
layer finding a NULL pointer in one of the methods. Xen does
trap and emulate (possibly just ignore) both instructions.

Jan


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 15:08       ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 15:08 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andre Przywara, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, stable

>>> On 30.05.12 at 16:50, "H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/30/2012 07:39 AM, Konrad Rzeszutek Wilk wrote:
>> On Wed, May 30, 2012 at 03:10:02PM +0200, Andre Przywara wrote:
>>> Because we are behind a family check before tweaking the topology
>>> bit, we can use the standard rd/wrmsr variants for the CPUID feature
>>> register.
>>> This fixes a crash when using the kernel as a Xen Dom0 on affected
>>> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
>>> yet (this will be fixed in another patch).
>> 
>> So with a rdmsrl_amd_safe and wrmsrl_amd_safe being implemented in
>> the pv_cpu_ops - would this patch even be neccessary?
>> 
> 
> That is still bogus;  a better thing would be to implement the _regs
> interface.  Even better would be to trap and emulate rdmsr/wrmsr!

The crash is not on the wrmsr instruction, but on the paravirt
layer finding a NULL pointer in one of the methods. Xen does
trap and emulate (possibly just ignore) both instructions.

Jan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 14:49         ` Konrad Rzeszutek Wilk
@ 2012-05-30 15:12           ` Borislav Petkov
  2012-05-30 15:40               ` Jan Beulich
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 15:12 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: H. Peter Anvin, Jan Beulich, Andre Przywara, Jacob Shin, mingo,
	jeremy, tglx, xen-devel, linux-kernel

On Wed, May 30, 2012 at 10:49:29AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, May 30, 2012 at 07:42:56AM -0700, H. Peter Anvin wrote:
> > On 05/30/2012 07:23 AM, Jan Beulich wrote:
> > > 
> > > I see - the Xen code blindly overwrites pv_cpu_ops, despite not
> > > having initialized all members. That's an obvious oversight of the
> > > patch that introduced the _regs variants.
> > > 
> > > Plus having secondary instances of things like rdmsrl_amd_safe()
> > > in asm/paravirt.h seems pretty strange an approach (which was
> > > why initially I didn't spot how a crash could happen there) - only
> > > the lowest level functions should get re-implemented here.
> > > 
> > 
> > This kinds of things are part of why Xen makes me want to cry regularly.
> 
> It looks like an oversight by Borislav (177fed1ee8d727c39601ce9fc2299b4cb25a718e
> and 132ec92f) and Yinghai (b05f78f5c713eda2c34e495d92495ee4f1c3b5e1) where
> they added these wrappers way back in 2009!

This is exactly why xen has nothing to do in arch/x86/. It is not an
oversight - I simply didn't test it on xen because I don't care about
it.

Remember our last discussion about mcelog?

This current case should be a perfect example for why xen shouldn't be
sprinkling code all over the place.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:08       ` Jan Beulich
  (?)
@ 2012-05-30 15:15       ` H. Peter Anvin
  2012-05-30 15:35           ` Jan Beulich
  -1 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 15:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andre Przywara, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, stable

On 05/30/2012 08:08 AM, Jan Beulich wrote:
> 
> Xen does trap and emulate (possibly just ignore) both instructions.
> 

Then why the fsck is there paravirt_crap on this?

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:15       ` H. Peter Anvin
@ 2012-05-30 15:35           ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 15:35 UTC (permalink / raw)
  To: jeremy, Konrad Rzeszutek Wilk, H. Peter Anvin
  Cc: Andre Przywara, mingo, tglx, xen-devel, linux-kernel, stable

>>> On 30.05.12 at 17:15, "H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/30/2012 08:08 AM, Jan Beulich wrote:
>> 
>> Xen does trap and emulate (possibly just ignore) both instructions.
>> 
> 
> Then why the fsck is there paravirt_crap on this?

I have no clue. Jeremy, Konrad?

Jan


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 15:35           ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 15:35 UTC (permalink / raw)
  To: jeremy, Konrad Rzeszutek Wilk, H. Peter Anvin
  Cc: Andre Przywara, mingo, tglx, xen-devel, linux-kernel, stable

>>> On 30.05.12 at 17:15, "H. Peter Anvin" <hpa@zytor.com> wrote:
> On 05/30/2012 08:08 AM, Jan Beulich wrote:
>> 
>> Xen does trap and emulate (possibly just ignore) both instructions.
>> 
> 
> Then why the fsck is there paravirt_crap on this?

I have no clue. Jeremy, Konrad?

Jan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:12           ` Borislav Petkov
@ 2012-05-30 15:40               ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 15:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, H. Peter Anvin

>>> On 30.05.12 at 17:12, Borislav Petkov <bp@alien8.de> wrote:
> This current case should be a perfect example for why xen shouldn't be
> sprinkling code all over the place.

Which means you're denying the benefits of para-virtualization
(at the base system level; perhaps it's less of a problem for you
when it comes to pv device drivers, which are generally
standalone entities) as that's what distinguishes Xen from all
other virtualization solutions Linux supports.

Jan


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-30 15:40               ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-30 15:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, H. Peter Anvin

>>> On 30.05.12 at 17:12, Borislav Petkov <bp@alien8.de> wrote:
> This current case should be a perfect example for why xen shouldn't be
> sprinkling code all over the place.

Which means you're denying the benefits of para-virtualization
(at the base system level; perhaps it's less of a problem for you
when it comes to pv device drivers, which are generally
standalone entities) as that's what distinguishes Xen from all
other virtualization solutions Linux supports.

Jan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:40               ` Jan Beulich
  (?)
@ 2012-05-30 15:45               ` H. Peter Anvin
  -1 siblings, 0 replies; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 15:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Borislav Petkov, Andre Przywara, Jacob Shin, mingo, jeremy, tglx,
	xen-devel, Konrad Rzeszutek Wilk, linux-kernel

On 05/30/2012 08:40 AM, Jan Beulich wrote:
>>>> On 30.05.12 at 17:12, Borislav Petkov <bp@alien8.de> wrote:
>> This current case should be a perfect example for why xen shouldn't be
>> sprinkling code all over the place.
> 
> Which means you're denying the benefits of para-virtualization
> (at the base system level; perhaps it's less of a problem for you
> when it comes to pv device drivers, which are generally
> standalone entities) as that's what distinguishes Xen from all
> other virtualization solutions Linux supports.
> 

And it also denies the just atrocious cost of Xen grabbing random kernel
internals and turning them into ABIs that we have to work around from
that point on.  This is particularly obnoxious because the cost is
largely not borne by the Xen community but by the general Linux development.

	-hpa


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:40               ` Jan Beulich
  (?)
  (?)
@ 2012-05-30 15:58               ` Borislav Petkov
  -1 siblings, 0 replies; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 15:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, H. Peter Anvin

On Wed, May 30, 2012 at 04:40:37PM +0100, Jan Beulich wrote:
> >>> On 30.05.12 at 17:12, Borislav Petkov <bp@alien8.de> wrote:
> > This current case should be a perfect example for why xen shouldn't be
> > sprinkling code all over the place.
> 
> Which means you're denying the benefits of para-virtualization

Does it really mean that?

> (at the base system level; perhaps it's less of a problem for
> you when it comes to pv device drivers, which are generally
> standalone entities) as that's what distinguishes Xen from all other
> virtualization solutions Linux supports.

All I'm saying is, xen should solve the whole deal of what it wants to
do (whatever that is) _without_ and _agnostic_ from arch/x86/. Otherwise
x86 changes break xen.

I couldn't care less about what distinguishes xen from all other
solutions.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:35           ` Jan Beulich
  (?)
@ 2012-05-30 16:48           ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 16:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: jeremy, H. Peter Anvin, Andre Przywara, xen-devel, stable,
	linux-kernel, mingo, tglx

On Wed, May 30, 2012 at 04:35:41PM +0100, Jan Beulich wrote:
> >>> On 30.05.12 at 17:15, "H. Peter Anvin" <hpa@zytor.com> wrote:
> > On 05/30/2012 08:08 AM, Jan Beulich wrote:
> >> 
> >> Xen does trap and emulate (possibly just ignore) both instructions.
> >> 
> > 
> > Then why the fsck is there paravirt_crap on this?
> 
> I have no clue. Jeremy, Konrad?

No idea. The git 132ec92f says:

Borislav Petkov <petkovbb@googlemail.com>
Date:   Mon Aug 31 09:50:09 2009 +0200

    x86, msr: Add rd/wrmsr interfaces with preset registers

    native_{rdmsr,wrmsr}_safe_regs are two new interfaces which allow
    presetting of a subset of eight x86 GPRs before executing the rd/wrmsr
    instructions. This is needed at least on AMD K8 for accessing an erratum
    workaround MSR.

    Originally based on an idea by H. Peter Anvin.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 15:03           ` Jacob Shin
  (?)
@ 2012-05-30 17:17           ` Konrad Rzeszutek Wilk
  2012-05-30 17:31             ` H. Peter Anvin
                               ` (2 more replies)
  -1 siblings, 3 replies; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 17:17 UTC (permalink / raw)
  To: Jacob Shin
  Cc: Andre Przywara, jeremy, xen-devel, linux-kernel, Jan Beulich,
	hpa, mingo, tglx

> Yes, the following patch also fixed the crash for us:
> 
> Implement rdmsr_regs and wrmsr_regs for Xen pvops.

That needs more data. Such as the reason for it, the crash
tombstone, and an analysis of the bug.

But at this point I am not sure what we are going to do.

I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
function out altogether (so altering the amd_rdmsr... to use the
.rdmsr/.wrdmsr). At which point I think this would all work
just fine?

I am tempted to write a patch that checks all the pv-cpu-ops
to see if there are any that are NULL and throw a warning so
that this does not hit us in the future - to be at least more
proactive about this sort of thing.


> 
> Signed-off-by: Jacob Shin <jacob.shin@amd.com>
> Cc: stable@vger.kernel.org # 3.4+
> 
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 75f33b2..f3ae5ec 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -945,9 +945,12 @@ static void xen_write_cr4(unsigned long cr4)
>  	native_write_cr4(cr4);
>  }
>  
> -static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
> +static int xen_wrmsr_safe_regs(u32 regs[8])
>  {
>  	int ret;
> +	unsigned int msr = regs[1];
> +	unsigned low = regs[0];
> +	unsigned high = regs[2];
>  
>  	ret = 0;
>  
> @@ -985,12 +988,23 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
>  		break;
>  
>  	default:
> -		ret = native_write_msr_safe(msr, low, high);
> +		ret = native_wrmsr_safe_regs(regs);
>  	}
>  
>  	return ret;
>  }
>  
> +static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
> +{
> +	u32 regs[8] = { 0 };
> +
> +	regs[0] = low;
> +	regs[1] = msr;
> +	regs[2] = high;
> +
> +	return xen_wrmsr_safe_regs(regs);
> +}
> +
>  void xen_setup_shared_info(void)
>  {
>  	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> @@ -1116,7 +1130,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>  	.wbinvd = native_wbinvd,
>  
>  	.read_msr = native_read_msr_safe,
> +	.rdmsr_regs = native_rdmsr_safe_regs,
>  	.write_msr = xen_write_msr_safe,
> +	.wrmsr_regs = xen_wrmsr_safe_regs,
>  	.read_tsc = native_read_tsc,
>  	.read_pmc = native_read_pmc,
>  
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:17           ` Konrad Rzeszutek Wilk
@ 2012-05-30 17:31             ` H. Peter Anvin
  2012-05-30 22:23               ` Konrad Rzeszutek Wilk
  2012-05-30 17:32             ` [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Borislav Petkov
  2012-05-31  7:17               ` Jan Beulich
  2 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 17:31 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jacob Shin, Andre Przywara, jeremy, xen-devel, linux-kernel,
	Jan Beulich, mingo, tglx

On 05/30/2012 10:17 AM, Konrad Rzeszutek Wilk wrote:
>> Yes, the following patch also fixed the crash for us:
>>
>> Implement rdmsr_regs and wrmsr_regs for Xen pvops.
> 
> That needs more data. Such as the reason for it, the crash
> tombstone, and an analysis of the bug.
> 
> But at this point I am not sure what we are going to do.
> 
> I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
> function out altogether (so altering the amd_rdmsr... to use the
> .rdmsr/.wrdmsr). At which point I think this would all work
> just fine?
> 

No, you can't just do that.  Rip them out as in trap and emulate.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:17           ` Konrad Rzeszutek Wilk
  2012-05-30 17:31             ` H. Peter Anvin
@ 2012-05-30 17:32             ` Borislav Petkov
  2012-05-30 17:47               ` [PATCH] x86, AMD: Fix " Borislav Petkov
  2012-05-30 17:47               ` [Xen-devel] [PATCH] x86/amd: fix " H. Peter Anvin
  2012-05-31  7:17               ` Jan Beulich
  2 siblings, 2 replies; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 17:32 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jacob Shin, Andre Przywara, jeremy, xen-devel, linux-kernel,
	Jan Beulich, hpa, mingo, tglx

On Wed, May 30, 2012 at 01:17:54PM -0400, Konrad Rzeszutek Wilk wrote:
> > Yes, the following patch also fixed the crash for us:
> > 
> > Implement rdmsr_regs and wrmsr_regs for Xen pvops.
> 
> That needs more data. Such as the reason for it, the crash
> tombstone, and an analysis of the bug.
> 
> But at this point I am not sure what we are going to do.
> 
> I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
> function out altogether (so altering the amd_rdmsr... to use the
> .rdmsr/.wrdmsr). At which point I think this would all work
> just fine?

I wouldn't do that. Andre's patch switches to the
rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
_regs functions for pvops.

I'll send it now with corrected commit message.

> I am tempted to write a patch that checks all the pv-cpu-ops to see if
> there are any that are NULL and throw a warning so that this does not
> hit us in the future - to be at least more proactive about this sort
> of thing.

This could be a prudent thing to do.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH] x86, AMD: Fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:32             ` [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Borislav Petkov
@ 2012-05-30 17:47               ` Borislav Petkov
  2012-05-30 17:47               ` [Xen-devel] [PATCH] x86/amd: fix " H. Peter Anvin
  1 sibling, 0 replies; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 17:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andre Przywara, jeremy, xen-devel, Konrad Rzeszutek Wilk,
	Jacob Shin, LKML, stable, Borislav Petkov, Jan Beulich,
	Thomas Gleixner, Ingo Molnar

From: Andre Przywara <andre.przywara@amd.com>

Switch to the standard {rd,wr}msr*_safe* variants which should've been
used in the first place anyway and avoided unneeded excitation with xen.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Cc: stable@vger.kernel.org # 3.4+
Link: <http://lkml.kernel.org/r/1338383402-3838-1-git-send-email-andre.przywara@amd.com>
[Boris: correct commit message]
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
---
 arch/x86/kernel/cpu/amd.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 146bb6218eec..80ccd99542e6 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -586,9 +586,9 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
 	    !cpu_has(c, X86_FEATURE_TOPOEXT)) {
 		u64 val;
 
-		if (!rdmsrl_amd_safe(0xc0011005, &val)) {
+		if (!rdmsrl_safe(0xc0011005, &val)) {
 			val |= 1ULL << 54;
-			wrmsrl_amd_safe(0xc0011005, val);
+			checking_wrmsrl(0xc0011005, val);
 			rdmsrl(0xc0011005, val);
 			if (val & (1ULL << 54)) {
 				set_cpu_cap(c, X86_FEATURE_TOPOEXT);
-- 
1.7.9.3.362.g71319

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:32             ` [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Borislav Petkov
  2012-05-30 17:47               ` [PATCH] x86, AMD: Fix " Borislav Petkov
@ 2012-05-30 17:47               ` H. Peter Anvin
  2012-05-30 17:51                 ` Borislav Petkov
  1 sibling, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 17:47 UTC (permalink / raw)
  To: Borislav Petkov, Konrad Rzeszutek Wilk, Jacob Shin,
	Andre Przywara, jeremy, xen-devel, linux-kernel, Jan Beulich,
	mingo, tglx

On 05/30/2012 10:32 AM, Borislav Petkov wrote:
> 
> I wouldn't do that. Andre's patch switches to the
> rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
> _regs functions for pvops.
> 

But we just stated - it is wrong for K8?

	-hpa


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:47               ` [Xen-devel] [PATCH] x86/amd: fix " H. Peter Anvin
@ 2012-05-30 17:51                 ` Borislav Petkov
  2012-05-30 18:00                   ` H. Peter Anvin
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 17:51 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Jacob Shin, Andre Przywara, jeremy,
	xen-devel, linux-kernel, Jan Beulich, mingo, tglx

On Wed, May 30, 2012 at 10:47:32AM -0700, H. Peter Anvin wrote:
> > I wouldn't do that. Andre's patch switches to the
> > rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
> > _regs functions for pvops.
> > 
> But we just stated - it is wrong for K8?

Not executed on K8 - only on F15h, models 0x10 - 0x1f.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:51                 ` Borislav Petkov
@ 2012-05-30 18:00                   ` H. Peter Anvin
  2012-05-30 18:17                     ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 18:00 UTC (permalink / raw)
  To: Borislav Petkov, Konrad Rzeszutek Wilk, Jacob Shin,
	Andre Przywara, jeremy, xen-devel, linux-kernel, Jan Beulich,
	mingo, tglx

On 05/30/2012 10:51 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 10:47:32AM -0700, H. Peter Anvin wrote:
>>> I wouldn't do that. Andre's patch switches to the
>>> rdmsrl_safe/checking_wrmsrl functions so you don't need to implement the
>>> _regs functions for pvops.
>>>
>> But we just stated - it is wrong for K8?
> 
> Not executed on K8 - only on F15h, models 0x10 - 0x1f.
> 

OK.  But there is still the general problem, no?

	-hpa


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 18:00                   ` H. Peter Anvin
@ 2012-05-30 18:17                     ` Borislav Petkov
  2012-05-30 18:19                       ` Borislav Petkov
                                         ` (2 more replies)
  0 siblings, 3 replies; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 18:17 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Jacob Shin, Andre Przywara, jeremy,
	xen-devel, linux-kernel, Jan Beulich, mingo, tglx

On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
> OK.  But there is still the general problem, no?

With this patch xen crashes go away because they paravirt
native_{read,write}_msr_safe.

The other place where we use the amd_safe variants is an obscure K8,
revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
so I'm assuming people haven't run xen on such boxes yet. Does it need
fixing? Probably, if we really really have to.

Now, someone probably needs to paravirt the *safe_regs variants in case
something else decides to use them. I don't know what to do here, do I
want more paravirt code in there? No. I guess if this is done carefully
and cleanly, then it should be ok but it can't be done like that - it
needs to adhere to the current pv_cpu_ops thing which is already there.

Hmm...

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 18:17                     ` Borislav Petkov
@ 2012-05-30 18:19                       ` Borislav Petkov
  2012-05-30 18:21                         ` H. Peter Anvin
  2012-05-30 18:20                       ` H. Peter Anvin
  2012-05-31  7:39                         ` Jan Beulich
  2 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 18:19 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Jacob Shin, Andre Przywara, jeremy,
	xen-devel, linux-kernel, Jan Beulich, mingo, tglx

On Wed, May 30, 2012 at 08:17:23PM +0200, Borislav Petkov wrote:
> The other place where we use the amd_safe variants is an obscure K8,
> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> so I'm assuming people haven't run xen on such boxes yet. Does it need
> fixing? Probably, if we really really have to.

I'm being told we're safe here. Those boxes didn't have virtualization
support yet (SVM) so this one doesn't need fixing.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 18:17                     ` Borislav Petkov
  2012-05-30 18:19                       ` Borislav Petkov
@ 2012-05-30 18:20                       ` H. Peter Anvin
  2012-05-30 22:33                         ` Konrad Rzeszutek Wilk
  2012-05-31  7:39                         ` Jan Beulich
  2 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 18:20 UTC (permalink / raw)
  To: Borislav Petkov, Konrad Rzeszutek Wilk, Jacob Shin,
	Andre Przywara, jeremy, xen-devel, linux-kernel, Jan Beulich,
	mingo, tglx

On 05/30/2012 11:17 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
>> OK.  But there is still the general problem, no?
> 
> With this patch xen crashes go away because they paravirt
> native_{read,write}_msr_safe.
> 
> The other place where we use the amd_safe variants is an obscure K8,
> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> so I'm assuming people haven't run xen on such boxes yet. Does it need
> fixing? Probably, if we really really have to.
> 
> Now, someone probably needs to paravirt the *safe_regs variants in case
> something else decides to use them. I don't know what to do here, do I
> want more paravirt code in there? No. I guess if this is done carefully
> and cleanly, then it should be ok but it can't be done like that - it
> needs to adhere to the current pv_cpu_ops thing which is already there.
> 

I thought I was being told that Xen would trap and emulate the
rdmsr/wrmsr instructions.  I guess they don't want to do that for the
handful of performance-sensitive MSRs there are, but those don't use the
*_regs variants.

	-hpa


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 18:19                       ` Borislav Petkov
@ 2012-05-30 18:21                         ` H. Peter Anvin
  2012-05-30 18:29                           ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 18:21 UTC (permalink / raw)
  To: Borislav Petkov, Konrad Rzeszutek Wilk, Jacob Shin,
	Andre Przywara, jeremy, xen-devel, linux-kernel, Jan Beulich,
	mingo, tglx

On 05/30/2012 11:19 AM, Borislav Petkov wrote:
> On Wed, May 30, 2012 at 08:17:23PM +0200, Borislav Petkov wrote:
>> The other place where we use the amd_safe variants is an obscure K8,
>> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
>> so I'm assuming people haven't run xen on such boxes yet. Does it need
>> fixing? Probably, if we really really have to.
> 
> I'm being told we're safe here. Those boxes didn't have virtualization
> support yet (SVM) so this one doesn't need fixing.
> 

This is for PV, not for HVM (VT-x/SVM).  HVM doesn't have this kind of
drain bamage.

	-hpa

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 18:21                         ` H. Peter Anvin
@ 2012-05-30 18:29                           ` Borislav Petkov
  0 siblings, 0 replies; 63+ messages in thread
From: Borislav Petkov @ 2012-05-30 18:29 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Jacob Shin, Andre Przywara, jeremy,
	xen-devel, linux-kernel, Jan Beulich, mingo, tglx

On Wed, May 30, 2012 at 11:21:00AM -0700, H. Peter Anvin wrote:
> This is for PV, not for HVM (VT-x/SVM).  HVM doesn't have this kind of
> drain bamage.

Frankly, I wouldn't want to fix this for K8 only - they're going away
anyway. And besides, this is a BIOS fix for a very small subset of very
early K8s. It probably shouldnt've been there in the first place but
at the time I did it I wanted to fix the whole world (I'm much more
reasonable now :-)).

IOW, I'd like to leave sleeping dogs lie if you don't mind.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:31             ` H. Peter Anvin
@ 2012-05-30 22:23               ` Konrad Rzeszutek Wilk
  2012-05-30 23:23                 ` [tip:x86/urgent] x86, amd, xen: Avoid NULL pointer paravirt references tip-bot for Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 22:23 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Andre Przywara, jeremy, xen-devel,
	Jacob Shin, linux-kernel, Jan Beulich, mingo, tglx

On Wed, May 30, 2012 at 10:31:41AM -0700, H. Peter Anvin wrote:
> On 05/30/2012 10:17 AM, Konrad Rzeszutek Wilk wrote:
> >> Yes, the following patch also fixed the crash for us:
> >>
> >> Implement rdmsr_regs and wrmsr_regs for Xen pvops.
> > 
> > That needs more data. Such as the reason for it, the crash
> > tombstone, and an analysis of the bug.
> > 
> > But at this point I am not sure what we are going to do.
> > 
> > I think Peter leans towards ripping the .rdmsr_regs/wdmsr_regs
> > function out altogether (so altering the amd_rdmsr... to use the
> > .rdmsr/.wrdmsr). At which point I think this would all work
> > just fine?
> > 
> 
> No, you can't just do that.  Rip them out as in trap and emulate.

Then this should do it:

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..e74df95 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.wbinvd = native_wbinvd,
 
 	.read_msr = native_read_msr_safe,
+	.rdmsr_regs = native_rdmsr_safe_regs,
 	.write_msr = xen_write_msr_safe,
+	.wrmsr_regs = native_wrmsr_safe_regs,
+
 	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 18:20                       ` H. Peter Anvin
@ 2012-05-30 22:33                         ` Konrad Rzeszutek Wilk
  2012-05-30 23:09                           ` H. Peter Anvin
  2012-05-31 12:24                             ` Andre Przywara
  0 siblings, 2 replies; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-30 22:33 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Borislav Petkov, Konrad Rzeszutek Wilk, Jacob Shin,
	Andre Przywara, jeremy, xen-devel, linux-kernel, Jan Beulich,
	mingo, tglx

> > Now, someone probably needs to paravirt the *safe_regs variants in case
> > something else decides to use them. I don't know what to do here, do I
> > want more paravirt code in there? No. I guess if this is done carefully
> > and cleanly, then it should be ok but it can't be done like that - it
> > needs to adhere to the current pv_cpu_ops thing which is already there.

Using the native variant seems the right thing to do.
> > 
> 
> I thought I was being told that Xen would trap and emulate the
> rdmsr/wrmsr instructions.  I guess they don't want to do that for the

It does.
> handful of performance-sensitive MSRs there are, but those don't use the
> *_regs variants.

The underlaying issue (as I understand) was that .rdmsr_regs
(and the corresponding write) was NULL and that caused the crash.
This tiny patch should do it:

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..e74df95 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.wbinvd = native_wbinvd,
 
 	.read_msr = native_read_msr_safe,
+	.rdmsr_regs = native_rdmsr_safe_regs,
 	.write_msr = xen_write_msr_safe,
+	.wrmsr_regs = native_wrmsr_safe_regs,
+
 	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
 

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 22:33                         ` Konrad Rzeszutek Wilk
@ 2012-05-30 23:09                           ` H. Peter Anvin
  2012-06-06  9:27                             ` Ingo Molnar
  2012-05-31 12:24                             ` Andre Przywara
  1 sibling, 1 reply; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 23:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Borislav Petkov, Konrad Rzeszutek Wilk, Jacob Shin,
	Andre Przywara, jeremy, xen-devel, linux-kernel, Jan Beulich,
	mingo, tglx

On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 75f33b2..e74df95 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>  	.wbinvd = native_wbinvd,
>  
>  	.read_msr = native_read_msr_safe,
> +	.rdmsr_regs = native_rdmsr_safe_regs,
>  	.write_msr = xen_write_msr_safe,
> +	.wrmsr_regs = native_wrmsr_safe_regs,
> +
>  	.read_tsc = native_read_tsc,
>  	.read_pmc = native_read_pmc,
>  

Okay, tell me again:

why do we have these hooks again?  Can we please weed out paravirt
methods that have no users?

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [tip:x86/urgent] x86, amd, xen: Avoid NULL pointer paravirt references
  2012-05-30 22:23               ` Konrad Rzeszutek Wilk
@ 2012-05-30 23:23                 ` tip-bot for Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 63+ messages in thread
From: tip-bot for Konrad Rzeszutek Wilk @ 2012-05-30 23:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, konrad, andre.przywara, stable, tglx

Commit-ID:  1ab46fd319bcf1fcd9fb6311727d532b580e4eba
Gitweb:     http://git.kernel.org/tip/1ab46fd319bcf1fcd9fb6311727d532b580e4eba
Author:     Konrad Rzeszutek Wilk <konrad@darnok.org>
AuthorDate: Wed, 30 May 2012 18:23:56 -0400
Committer:  H. Peter Anvin <hpa@zytor.com>
CommitDate: Wed, 30 May 2012 16:15:02 -0700

x86, amd, xen: Avoid NULL pointer paravirt references

Stub out MSR methods that aren't actually needed.  This fixes a crash
as Xen Dom0 on AMD Trinity systems.  A bigger patch should be added to
remove the paravirt machinery completely for the methods which
apparently have no users!

Reported-by: Andre Przywara <andre.przywara@amd.com>
Link: http://lkml.kernel.org/r/20120530222356.GA28417@andromeda.dapyr.net
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
---
 arch/x86/xen/enlighten.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 75f33b2..e74df95 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.wbinvd = native_wbinvd,
 
 	.read_msr = native_read_msr_safe,
+	.rdmsr_regs = native_rdmsr_safe_regs,
 	.write_msr = xen_write_msr_safe,
+	.wrmsr_regs = native_wrmsr_safe_regs,
+
 	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
 

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 13:10 [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Andre Przywara
                   ` (2 preceding siblings ...)
  2012-05-30 14:42 ` H. Peter Anvin
@ 2012-05-30 23:31 ` H. Peter Anvin
  3 siblings, 0 replies; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-30 23:31 UTC (permalink / raw)
  To: Andre Przywara
  Cc: mingo, tglx, konrad.wilk, jeremy, linux-kernel, xen-devel, stable

On 05/30/2012 06:10 AM, Andre Przywara wrote:
> Because we are behind a family check before tweaking the topology
> bit, we can use the standard rd/wrmsr variants for the CPUID feature
> register.
> This fixes a crash when using the kernel as a Xen Dom0 on affected
> Trinity systems. The wrmsrl_amd_safe is not properly paravirtualized
> yet (this will be fixed in another patch).
> 
> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
> Cc: stable@vger.kernel.org # 3.4+

With Konrad's patch, this is no longer relevant, correct?

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 17:17           ` Konrad Rzeszutek Wilk
@ 2012-05-31  7:17               ` Jan Beulich
  2012-05-30 17:32             ` [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Borislav Petkov
  2012-05-31  7:17               ` Jan Beulich
  2 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-31  7:17 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	linux-kernel, hpa

>>> On 30.05.12 at 19:17, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> I am tempted to write a patch that checks all the pv-cpu-ops
> to see if there are any that are NULL and throw a warning so
> that this does not hit us in the future - to be at least more
> proactive about this sort of thing.

Perhaps rather than using C99 initializers, using old-style ones
would be an alternative (assuming that the signatures of the
respective entries [or at least immediately neighboring ones]
are different), with a sentinel that is required to remain last
(i.e. adding at the very end would be prohibited)?

Or rather than doing a full structure assignment, assign
individual members directly to pv_cpu_ops (thus leaving
everything that's not explicitly overridden at its "native"
default)? After all, this is being done on __init code, so the
few extra code bytes shouldn't matter much? (All this of
course in the context of hpa's valid request that there be
no unused paravirt hooks in the first place.)

Jan


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-31  7:17               ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-31  7:17 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	linux-kernel, hpa

>>> On 30.05.12 at 19:17, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> I am tempted to write a patch that checks all the pv-cpu-ops
> to see if there are any that are NULL and throw a warning so
> that this does not hit us in the future - to be at least more
> proactive about this sort of thing.

Perhaps rather than using C99 initializers, using old-style ones
would be an alternative (assuming that the signatures of the
respective entries [or at least immediately neighboring ones]
are different), with a sentinel that is required to remain last
(i.e. adding at the very end would be prohibited)?

Or rather than doing a full structure assignment, assign
individual members directly to pv_cpu_ops (thus leaving
everything that's not explicitly overridden at its "native"
default)? After all, this is being done on __init code, so the
few extra code bytes shouldn't matter much? (All this of
course in the context of hpa's valid request that there be
no unused paravirt hooks in the first place.)

Jan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 18:17                     ` Borislav Petkov
@ 2012-05-31  7:39                         ` Jan Beulich
  2012-05-30 18:20                       ` H. Peter Anvin
  2012-05-31  7:39                         ` Jan Beulich
  2 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-31  7:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, H. Peter Anvin

>>> On 30.05.12 at 20:17, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
> The other place where we use the amd_safe variants is an obscure K8,
> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> so I'm assuming people haven't run xen on such boxes yet. Does it need
> fixing? Probably, if we really really have to.

This again is something that shouldn't even be attempted under
Xen. The hypervisor, unless really old, does this (and wouldn't
allow the write by any domain - privileged or not - anyway).

There's one more user though - the code triggered by the
"show_msr=" command line option. This one indeed requires
rdmsr_safe_regs to be functional (albeit under Xen, once
again, this won't work currently anyway for those MRS on
old CPUs where the special key in %edi is required, which the
emulation code in Xen doesn't support).

Jan


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-31  7:39                         ` Jan Beulich
  0 siblings, 0 replies; 63+ messages in thread
From: Jan Beulich @ 2012-05-31  7:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, H. Peter Anvin

>>> On 30.05.12 at 20:17, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
> The other place where we use the amd_safe variants is an obscure K8,
> revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> so I'm assuming people haven't run xen on such boxes yet. Does it need
> fixing? Probably, if we really really have to.

This again is something that shouldn't even be attempted under
Xen. The hypervisor, unless really old, does this (and wouldn't
allow the write by any domain - privileged or not - anyway).

There's one more user though - the code triggered by the
"show_msr=" command line option. This one indeed requires
rdmsr_safe_regs to be functional (albeit under Xen, once
again, this won't work currently anyway for those MRS on
old CPUs where the special key in %edi is required, which the
emulation code in Xen doesn't support).

Jan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 22:33                         ` Konrad Rzeszutek Wilk
@ 2012-05-31 12:24                             ` Andre Przywara
  2012-05-31 12:24                             ` Andre Przywara
  1 sibling, 0 replies; 63+ messages in thread
From: Andre Przywara @ 2012-05-31 12:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: H. Peter Anvin, Borislav Petkov, Konrad Rzeszutek Wilk,
	Jacob Shin, jeremy, xen-devel, linux-kernel, Jan Beulich, mingo,
	tglx

On 05/31/2012 12:33 AM, Konrad Rzeszutek Wilk wrote:
>>> Now, someone probably needs to paravirt the *safe_regs variants in case
>>> something else decides to use them. I don't know what to do here, do I
>>> want more paravirt code in there? No. I guess if this is done carefully
>>> and cleanly, then it should be ok but it can't be done like that - it
>>> needs to adhere to the current pv_cpu_ops thing which is already there.
>
> Using the native variant seems the right thing to do.
>>>
>>
>> I thought I was being told that Xen would trap and emulate the
>> rdmsr/wrmsr instructions.  I guess they don't want to do that for the
>
> It does.
>> handful of performance-sensitive MSRs there are, but those don't use the
>> *_regs variants.
>
> The underlaying issue (as I understand) was that .rdmsr_regs
> (and the corresponding write) was NULL and that caused the crash.
> This tiny patch should do it:
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 75f33b2..e74df95 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>   	.wbinvd = native_wbinvd,
>
>   	.read_msr = native_read_msr_safe,
> +	.rdmsr_regs = native_rdmsr_safe_regs,
>   	.write_msr = xen_write_msr_safe,
> +	.wrmsr_regs = native_wrmsr_safe_regs,
> +
>   	.read_tsc = native_read_tsc,
>   	.read_pmc = native_read_pmc,
>
>

Acked-by: Andre Przywara <andre.przywara@amd.com>

This works on the test machine.

Though I'd still like to have my original patch applied, because it 
makes the thing a bit cleaner.

And I made a patch to remove the {rd,wr}msr_regs hooks from paravirt_ops 
completely. Shall I send it out or do you want to make this part of 
larger patch series to clean up pvops?

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
@ 2012-05-31 12:24                             ` Andre Przywara
  0 siblings, 0 replies; 63+ messages in thread
From: Andre Przywara @ 2012-05-31 12:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: jeremy, xen-devel, Konrad Rzeszutek Wilk, Jacob Shin,
	linux-kernel, Borislav Petkov, Jan Beulich, H. Peter Anvin,
	mingo, tglx

On 05/31/2012 12:33 AM, Konrad Rzeszutek Wilk wrote:
>>> Now, someone probably needs to paravirt the *safe_regs variants in case
>>> something else decides to use them. I don't know what to do here, do I
>>> want more paravirt code in there? No. I guess if this is done carefully
>>> and cleanly, then it should be ok but it can't be done like that - it
>>> needs to adhere to the current pv_cpu_ops thing which is already there.
>
> Using the native variant seems the right thing to do.
>>>
>>
>> I thought I was being told that Xen would trap and emulate the
>> rdmsr/wrmsr instructions.  I guess they don't want to do that for the
>
> It does.
>> handful of performance-sensitive MSRs there are, but those don't use the
>> *_regs variants.
>
> The underlaying issue (as I understand) was that .rdmsr_regs
> (and the corresponding write) was NULL and that caused the crash.
> This tiny patch should do it:
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 75f33b2..e74df95 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>   	.wbinvd = native_wbinvd,
>
>   	.read_msr = native_read_msr_safe,
> +	.rdmsr_regs = native_rdmsr_safe_regs,
>   	.write_msr = xen_write_msr_safe,
> +	.wrmsr_regs = native_wrmsr_safe_regs,
> +
>   	.read_tsc = native_read_tsc,
>   	.read_pmc = native_read_pmc,
>
>

Acked-by: Andre Przywara <andre.przywara@amd.com>

This works on the test machine.

Though I'd still like to have my original patch applied, because it 
makes the thing a bit cleaner.

And I made a patch to remove the {rd,wr}msr_regs hooks from paravirt_ops 
completely. Shall I send it out or do you want to make this part of 
larger patch series to clean up pvops?

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-31 12:24                             ` Andre Przywara
  (?)
@ 2012-05-31 15:27                             ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 63+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-05-31 15:27 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Konrad Rzeszutek Wilk, jeremy, xen-devel, Jacob Shin,
	linux-kernel, Borislav Petkov, Jan Beulich, H. Peter Anvin,
	mingo, tglx

> >diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> >index 75f33b2..e74df95 100644
> >--- a/arch/x86/xen/enlighten.c
> >+++ b/arch/x86/xen/enlighten.c
> >@@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> >  	.wbinvd = native_wbinvd,
> >
> >  	.read_msr = native_read_msr_safe,
> >+	.rdmsr_regs = native_rdmsr_safe_regs,
> >  	.write_msr = xen_write_msr_safe,
> >+	.wrmsr_regs = native_wrmsr_safe_regs,
> >+
> >  	.read_tsc = native_read_tsc,
> >  	.read_pmc = native_read_pmc,
> >
> >
> 
> Acked-by: Andre Przywara <andre.przywara@amd.com>
> 
> This works on the test machine.

Great! Thanks for doing a quick test for this.
> 
> Though I'd still like to have my original patch applied, because it
> makes the thing a bit cleaner.

OK. Please re-send with an up-to-date git commit as suggested
by Peter.

> 
> And I made a patch to remove the {rd,wr}msr_regs hooks from
> paravirt_ops completely. Shall I send it out or do you want to make
> this part of larger patch series to clean up pvops?

Please do send it out. Thanks again!

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-31  7:17               ` Jan Beulich
  (?)
@ 2012-05-31 15:59               ` H. Peter Anvin
  -1 siblings, 0 replies; 63+ messages in thread
From: H. Peter Anvin @ 2012-05-31 15:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Konrad Rzeszutek Wilk, Andre Przywara, Jacob Shin, mingo, jeremy,
	tglx, xen-devel, linux-kernel

On 05/31/2012 12:17 AM, Jan Beulich wrote:
>>>> On 30.05.12 at 19:17, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>> I am tempted to write a patch that checks all the pv-cpu-ops
>> to see if there are any that are NULL and throw a warning so
>> that this does not hit us in the future - to be at least more
>> proactive about this sort of thing.
> 
> Perhaps rather than using C99 initializers, using old-style ones
> would be an alternative (assuming that the signatures of the
> respective entries [or at least immediately neighboring ones]
> are different), with a sentinel that is required to remain last
> (i.e. adding at the very end would be prohibited)?
> 
> Or rather than doing a full structure assignment, assign
> individual members directly to pv_cpu_ops (thus leaving
> everything that's not explicitly overridden at its "native"
> default)? After all, this is being done on __init code, so the
> few extra code bytes shouldn't matter much? (All this of
> course in the context of hpa's valid request that there be
> no unused paravirt hooks in the first place.)
> 

Actually there is a really easy way to do this with C99 initializers:
create a macro with all the default assignments, and put that one first.
 This is because it is legal to have more than one C99 initializer for
the same member, the last one is the one that takes effect.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-31  7:39                         ` Jan Beulich
  (?)
@ 2012-05-31 16:55                         ` Borislav Petkov
  -1 siblings, 0 replies; 63+ messages in thread
From: Borislav Petkov @ 2012-05-31 16:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andre Przywara, Jacob Shin, mingo, jeremy, tglx, xen-devel,
	Konrad Rzeszutek Wilk, linux-kernel, H. Peter Anvin, Yinghai Lu

On Thu, May 31, 2012 at 08:39:15AM +0100, Jan Beulich wrote:
> >>> On 30.05.12 at 20:17, Borislav Petkov <bp@alien8.de> wrote:
> > On Wed, May 30, 2012 at 11:00:23AM -0700, H. Peter Anvin wrote:
> > The other place where we use the amd_safe variants is an obscure K8,
> > revC and earlier fix for _some_ BIOSen and this hasn't bitten us yet
> > so I'm assuming people haven't run xen on such boxes yet. Does it need
> > fixing? Probably, if we really really have to.
> 
> This again is something that shouldn't even be attempted under
> Xen. The hypervisor, unless really old, does this (and wouldn't
> allow the write by any domain - privileged or not - anyway).
> 
> There's one more user though - the code triggered by the
> "show_msr=" command line option. This one indeed requires
> rdmsr_safe_regs to be functional (albeit under Xen, once
> again, this won't work currently anyway for those MRS on
> old CPUs where the special key in %edi is required, which the
> emulation code in Xen doesn't support).

This doesn't look right. Yinghai, why does generic x86 code use
rdmsrl_amd_safe - it should simply use rdmsrl_safe.

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-05-30 23:09                           ` H. Peter Anvin
@ 2012-06-06  9:27                             ` Ingo Molnar
  2012-06-06  9:42                               ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Ingo Molnar @ 2012-06-06  9:27 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, Borislav Petkov, Konrad Rzeszutek Wilk,
	Jacob Shin, Andre Przywara, jeremy, xen-devel, linux-kernel,
	Jan Beulich, mingo, tglx


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index 75f33b2..e74df95 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> >  	.wbinvd = native_wbinvd,
> >  
> >  	.read_msr = native_read_msr_safe,
> > +	.rdmsr_regs = native_rdmsr_safe_regs,
> >  	.write_msr = xen_write_msr_safe,
> > +	.wrmsr_regs = native_wrmsr_safe_regs,
> > +
> >  	.read_tsc = native_read_tsc,
> >  	.read_pmc = native_read_pmc,
> >  
> 
> Okay, tell me again:
> 
> why do we have these hooks again?  Can we please weed out 
> paravirt methods that have no users?

ping?

	Ingo

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-06-06  9:27                             ` Ingo Molnar
@ 2012-06-06  9:42                               ` Borislav Petkov
  2012-06-06  9:45                                 ` Ingo Molnar
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2012-06-06  9:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, Konrad Rzeszutek Wilk, Konrad Rzeszutek Wilk,
	Jacob Shin, Andre Przywara, jeremy, xen-devel, linux-kernel,
	Jan Beulich, mingo, tglx

On Wed, Jun 06, 2012 at 11:27:13AM +0200, Ingo Molnar wrote:
> 
> * H. Peter Anvin <hpa@zytor.com> wrote:
> 
> > On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > > index 75f33b2..e74df95 100644
> > > --- a/arch/x86/xen/enlighten.c
> > > +++ b/arch/x86/xen/enlighten.c
> > > @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> > >  	.wbinvd = native_wbinvd,
> > >  
> > >  	.read_msr = native_read_msr_safe,
> > > +	.rdmsr_regs = native_rdmsr_safe_regs,
> > >  	.write_msr = xen_write_msr_safe,
> > > +	.wrmsr_regs = native_wrmsr_safe_regs,
> > > +
> > >  	.read_tsc = native_read_tsc,
> > >  	.read_pmc = native_read_pmc,
> > >  
> > 
> > Okay, tell me again:
> > 
> > why do we have these hooks again?  Can we please weed out 
> > paravirt methods that have no users?
> 
> ping?

I think we solved all that and hpa wanted to queue them up shortly:

http://marc.info/?l=linux-kernel&m=133856342618027&w=2

If it's easier, I can send you a pull request later.

Thanks.

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems
  2012-06-06  9:42                               ` Borislav Petkov
@ 2012-06-06  9:45                                 ` Ingo Molnar
  0 siblings, 0 replies; 63+ messages in thread
From: Ingo Molnar @ 2012-06-06  9:45 UTC (permalink / raw)
  To: Borislav Petkov, H. Peter Anvin, Konrad Rzeszutek Wilk,
	Konrad Rzeszutek Wilk, Jacob Shin, Andre Przywara, jeremy,
	xen-devel, linux-kernel, Jan Beulich, mingo, tglx


* Borislav Petkov <bp@alien8.de> wrote:

> On Wed, Jun 06, 2012 at 11:27:13AM +0200, Ingo Molnar wrote:
> > 
> > * H. Peter Anvin <hpa@zytor.com> wrote:
> > 
> > > On 05/30/2012 03:33 PM, Konrad Rzeszutek Wilk wrote:
> > > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > > > index 75f33b2..e74df95 100644
> > > > --- a/arch/x86/xen/enlighten.c
> > > > +++ b/arch/x86/xen/enlighten.c
> > > > @@ -1116,7 +1116,10 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
> > > >  	.wbinvd = native_wbinvd,
> > > >  
> > > >  	.read_msr = native_read_msr_safe,
> > > > +	.rdmsr_regs = native_rdmsr_safe_regs,
> > > >  	.write_msr = xen_write_msr_safe,
> > > > +	.wrmsr_regs = native_wrmsr_safe_regs,
> > > > +
> > > >  	.read_tsc = native_read_tsc,
> > > >  	.read_pmc = native_read_pmc,
> > > >  
> > > 
> > > Okay, tell me again:
> > > 
> > > why do we have these hooks again?  Can we please weed out 
> > > paravirt methods that have no users?
> > 
> > ping?
> 
> I think we solved all that and hpa wanted to queue them up shortly:
> 
> http://marc.info/?l=linux-kernel&m=133856342618027&w=2

Great, will wait for hpa then.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2012-06-06  9:45 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-30 13:10 [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Andre Przywara
2012-05-30 13:33 ` [Xen-devel] " Jan Beulich
2012-05-30 13:33   ` Jan Beulich
2012-05-30 14:02   ` [Xen-devel] " Andre Przywara
2012-05-30 14:02     ` Andre Przywara
2012-05-30 14:23     ` [Xen-devel] " Jan Beulich
2012-05-30 14:23       ` Jan Beulich
2012-05-30 14:42       ` [Xen-devel] " H. Peter Anvin
2012-05-30 14:49         ` Konrad Rzeszutek Wilk
2012-05-30 15:12           ` Borislav Petkov
2012-05-30 15:40             ` Jan Beulich
2012-05-30 15:40               ` Jan Beulich
2012-05-30 15:45               ` H. Peter Anvin
2012-05-30 15:58               ` Borislav Petkov
2012-05-30 14:48     ` Jacob Shin
2012-05-30 14:48       ` Jacob Shin
2012-05-30 14:50       ` Konrad Rzeszutek Wilk
2012-05-30 15:03         ` Jacob Shin
2012-05-30 15:03           ` Jacob Shin
2012-05-30 17:17           ` Konrad Rzeszutek Wilk
2012-05-30 17:31             ` H. Peter Anvin
2012-05-30 22:23               ` Konrad Rzeszutek Wilk
2012-05-30 23:23                 ` [tip:x86/urgent] x86, amd, xen: Avoid NULL pointer paravirt references tip-bot for Konrad Rzeszutek Wilk
2012-05-30 17:32             ` [Xen-devel] [PATCH] x86/amd: fix crash as Xen Dom0 on AMD Trinity systems Borislav Petkov
2012-05-30 17:47               ` [PATCH] x86, AMD: Fix " Borislav Petkov
2012-05-30 17:47               ` [Xen-devel] [PATCH] x86/amd: fix " H. Peter Anvin
2012-05-30 17:51                 ` Borislav Petkov
2012-05-30 18:00                   ` H. Peter Anvin
2012-05-30 18:17                     ` Borislav Petkov
2012-05-30 18:19                       ` Borislav Petkov
2012-05-30 18:21                         ` H. Peter Anvin
2012-05-30 18:29                           ` Borislav Petkov
2012-05-30 18:20                       ` H. Peter Anvin
2012-05-30 22:33                         ` Konrad Rzeszutek Wilk
2012-05-30 23:09                           ` H. Peter Anvin
2012-06-06  9:27                             ` Ingo Molnar
2012-06-06  9:42                               ` Borislav Petkov
2012-06-06  9:45                                 ` Ingo Molnar
2012-05-31 12:24                           ` Andre Przywara
2012-05-31 12:24                             ` Andre Przywara
2012-05-31 15:27                             ` [Xen-devel] " Konrad Rzeszutek Wilk
2012-05-31  7:39                       ` Jan Beulich
2012-05-31  7:39                         ` Jan Beulich
2012-05-31 16:55                         ` Borislav Petkov
2012-05-31  7:17             ` Jan Beulich
2012-05-31  7:17               ` Jan Beulich
2012-05-31 15:59               ` H. Peter Anvin
2012-05-30 14:39 ` Konrad Rzeszutek Wilk
2012-05-30 14:50   ` H. Peter Anvin
2012-05-30 14:51     ` Konrad Rzeszutek Wilk
2012-05-30 15:08     ` Jan Beulich
2012-05-30 15:08       ` Jan Beulich
2012-05-30 15:15       ` H. Peter Anvin
2012-05-30 15:35         ` Jan Beulich
2012-05-30 15:35           ` Jan Beulich
2012-05-30 16:48           ` Konrad Rzeszutek Wilk
2012-05-30 14:42 ` H. Peter Anvin
2012-05-30 14:55   ` Borislav Petkov
2012-05-30 14:58     ` H. Peter Anvin
2012-05-30 15:00       ` Borislav Petkov
2012-05-30 15:01         ` H. Peter Anvin
2012-05-30 15:05           ` Borislav Petkov
2012-05-30 23:31 ` H. Peter Anvin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.