* [PATCH v3] x86/AMD: flush TLB after ucode update
@ 2019-01-28  9:51 Jan Beulich
  2019-01-28 11:40 ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2019-01-28  9:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Wei Liu, Andrew Cooper, Suravee Suthikulpanit,
	Brian Woods, Roger Pau Monne

The increased number of messages (spec_ctrl.c:print_details()) within a
certain time window made me notice some slowness of boot time screen
output. Experimentally I've narrowed the time window to be from
immediately after the early ucode update on the BSP to the PAT write in
cpu_init(), which upon further investigation has an effect because of
the full TLB flush that's implied by that write.

For that reason, as a workaround, flush the TLB of the mapping of the
page that holds the blob. Note that flushing just a single page is
sufficient: As per verify_patch_size(), the patch size can't exceed 4k, and
given the way xmalloc() works, the blob can't cross a page boundary.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Use TLB flush instead of PAT write.
v2: Re-base.

--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -218,6 +218,12 @@ static int apply_microcode(unsigned int
 
     spin_unlock_irqrestore(&microcode_update_lock, flags);
 
+    /*
+     * Experimentally this helps with performance issues on at least certain
+     * Fam15 models.
+     */
+    flush_area_local(hdr, FLUSH_TLB_GLOBAL | FLUSH_ORDER(0));
+
     /* check current patch id and patch's id for match */
     if ( hw_err || (rev != hdr->patch_id) )
    {

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [PATCH v3] x86/AMD: flush TLB after ucode update
  2019-01-28  9:51 [PATCH v3] x86/AMD: flush TLB after ucode update Jan Beulich
@ 2019-01-28 11:40 ` Andrew Cooper
  2019-01-28 14:19   ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2019-01-28 11:40 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Juergen Gross, Wei Liu, Brian Woods, Suravee Suthikulpanit,
	Roger Pau Monne

On 28/01/2019 09:51, Jan Beulich wrote:
> The increased number of messages (spec_ctrl.c:print_details()) within a
> certain time window made me notice some slowness of boot time screen
> output. Experimentally I've narrowed the time window to be from
> immediately after the early ucode update on the BSP to the PAT write in
> cpu_init(), which upon further investigation has an effect because of
> the full TLB flush that's implied by that write.
>
> For that reason, as a workaround, flush the TLB of the mapping of the
> page that holds the blob. Note that flushing just a single page is
> sufficient: As per verify_patch_size(), the patch size can't exceed 4k, and
> given the way xmalloc() works, the blob can't cross a page boundary.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I think it is worth at least noting that we are expecting a BKGD/PPR
update to this effect in due course, even if this doesn't end up in the
commit message.

> ---
> v3: Use TLB flush instead of PAT write.
> v2: Re-base.
>
> --- a/xen/arch/x86/microcode_amd.c
> +++ b/xen/arch/x86/microcode_amd.c
> @@ -218,6 +218,12 @@ static int apply_microcode(unsigned int
>  
>      spin_unlock_irqrestore(&microcode_update_lock, flags);
>  
> +    /*
> +     * Experimentally this helps with performance issues on at least certain
> +     * Fam15 models.

This is no longer experimental, now that we understand why.  How about:

"Some processors leave the ucode blob mapping as UC after the update. 
Flush the mapping to regain normal cacheability" ?

That way, it's also slightly less cryptic in the code.

~Andrew

> +     */
> +    flush_area_local(hdr, FLUSH_TLB_GLOBAL | FLUSH_ORDER(0));
> +
>      /* check current patch id and patch's id for match */
>      if ( hw_err || (rev != hdr->patch_id) )
>      {



* Re: [PATCH v3] x86/AMD: flush TLB after ucode update
  2019-01-28 11:40 ` Andrew Cooper
@ 2019-01-28 14:19   ` Jan Beulich
  2019-01-28 14:25     ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2019-01-28 14:19 UTC (permalink / raw)
  To: Brian Woods, Suravee Suthikulpanit, Andrew Cooper
  Cc: Juergen Gross, xen-devel, Wei Liu, Roger Pau Monne

>>> On 28.01.19 at 12:40, <andrew.cooper3@citrix.com> wrote:
> On 28/01/2019 09:51, Jan Beulich wrote:
>> The increased number of messages (spec_ctrl.c:print_details()) within a
>> certain time window made me notice some slowness of boot time screen
>> output. Experimentally I've narrowed the time window to be from
>> immediately after the early ucode update on the BSP to the PAT write in
>> cpu_init(), which upon further investigation has an effect because of
>> the full TLB flush that's implied by that write.
>>
>> For that reason, as a workaround, flush the TLB of the mapping of the
>> page that holds the blob. Note that flushing just a single page is
>> sufficient: As per verify_patch_size(), the patch size can't exceed 4k, and
>> given the way xmalloc() works, the blob can't cross a page boundary.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I think it is worth at least noting that we are expecting a BKGD/PPR
> update to this effect in due course, even if this doesn't end up in the
> commit message.

To be honest I wasn't sure whether I should say anything like this.

>> --- a/xen/arch/x86/microcode_amd.c
>> +++ b/xen/arch/x86/microcode_amd.c
>> @@ -218,6 +218,12 @@ static int apply_microcode(unsigned int
>>  
>>      spin_unlock_irqrestore(&microcode_update_lock, flags);
>>  
>> +    /*
>> +     * Experimentally this helps with performance issues on at least certain
>> +     * Fam15 models.
> 
> This is no longer experimental, now that we understand why.  How about:
> 
> "Some processors leave the ucode blob mapping as UC after the update. 
> Flush the mapping to regain normal cacheability" ?
> 
> That way, it's also slightly less cryptic in the code.

I did consider re-wording the comment, but decided to leave it unchanged,
since there is (for now at least) no public proof anywhere of the behavior
as you word it. I'm fine with changing the comment if I can get an explicit
go-ahead from AMD. Brian, Suravee?

Jan




* Re: [PATCH v3] x86/AMD: flush TLB after ucode update
  2019-01-28 14:19   ` Jan Beulich
@ 2019-01-28 14:25     ` Andrew Cooper
  2019-01-28 15:51       ` Woods, Brian
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2019-01-28 14:25 UTC (permalink / raw)
  To: Jan Beulich, Brian Woods, Suravee Suthikulpanit
  Cc: Juergen Gross, xen-devel, Wei Liu, Roger Pau Monne

On 28/01/2019 14:19, Jan Beulich wrote:
>>> --- a/xen/arch/x86/microcode_amd.c
>>> +++ b/xen/arch/x86/microcode_amd.c
>>> @@ -218,6 +218,12 @@ static int apply_microcode(unsigned int
>>>  
>>>      spin_unlock_irqrestore(&microcode_update_lock, flags);
>>>  
>>> +    /*
>>> +     * Experimentally this helps with performance issues on at least certain
>>> +     * Fam15 models.
>> This is no longer experimental, now that we understand why.  How about:
>>
>> "Some processors leave the ucode blob mapping as UC after the update. 
>> Flush the mapping to regain normal cacheability" ?
>>
>> That way, it's also slightly less cryptic in the code.
>> I did consider re-wording the comment, but decided to leave it unchanged,
>> since there is (for now at least) no public proof anywhere of the behavior
>> as you word it. I'm fine with changing the comment if I can get an explicit
>> go-ahead from AMD. Brian, Suravee?

Preferably with the amended wording (but ultimately, as agreed upon
with AMD), Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v3] x86/AMD: flush TLB after ucode update
  2019-01-28 14:25     ` Andrew Cooper
@ 2019-01-28 15:51       ` Woods, Brian
  0 siblings, 0 replies; 6+ messages in thread
From: Woods, Brian @ 2019-01-28 15:51 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich, Suthikulpanit, Suravee
  Cc: Juergen Gross, xen-devel, Wei Liu, Roger Pau Monne

On 1/28/19 8:25 AM, Andrew Cooper wrote:
> On 28/01/2019 14:19, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/microcode_amd.c
>>>> +++ b/xen/arch/x86/microcode_amd.c
>>>> @@ -218,6 +218,12 @@ static int apply_microcode(unsigned int
>>>>   
>>>>       spin_unlock_irqrestore(&microcode_update_lock, flags);
>>>>   
>>>> +    /*
>>>> +     * Experimentally this helps with performance issues on at least certain
>>>> +     * Fam15 models.
>>> This is no longer experimental, now that we understand why.  How about:
>>>
>>> "Some processors leave the ucode blob mapping as UC after the update.
>>> Flush the mapping to regain normal cacheability" ?
>>>
>>> That way, it's also slightly less cryptic in the code.
>> I did consider re-wording the comment, but decided to leave it unchanged,
>> since there is (for now at least) no public proof anywhere of the behavior
>> as you word it. I'm fine with changing the comment if I can get an explicit
>> go-ahead from AMD. Brian, Suravee?
> 
> Preferably with the amended wording (but ultimately, as agreed upon
> with AMD), Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
Likewise,
Reviewed-by: Brian Woods <brian.woods@amd.com>

* Re: [PATCH v3] x86/AMD: flush TLB after ucode update
       [not found] <5C4ED0A302000078002119D5@suse.com>
@ 2019-01-28 12:13 ` Juergen Gross
  0 siblings, 0 replies; 6+ messages in thread
From: Juergen Gross @ 2019-01-28 12:13 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, Brian Woods, Wei Liu, Suravee Suthikulpanit,
	Roger Pau Monne

On 28/01/2019 10:51, Jan Beulich wrote:
> The increased number of messages (spec_ctrl.c:print_details()) within a
> certain time window made me notice some slowness of boot time screen
> output. Experimentally I've narrowed the time window to be from
> immediately after the early ucode update on the BSP to the PAT write in
> cpu_init(), which upon further investigation has an effect because of
> the full TLB flush that's implied by that write.
> 
> For that reason, as a workaround, flush the TLB of the mapping of the
> page that holds the blob. Note that flushing just a single page is
> sufficient: As per verify_patch_size() patch size can't exceed 4k, and
> the way xmalloc() works the blob can't be crossing a page boundary.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen


