From: Andrew Cooper <andrew.cooper3@citrix.com>
To: speck@linutronix.de
Subject: [MODERATED] Re: [PATCH v2 5/8] MDSv2 7
Date: Wed, 12 Dec 2018 10:05:32 -0800	[thread overview]
Message-ID: <a4c71ed3-3216-8684-4ed4-3b622b287ab8@citrix.com> (raw)
In-Reply-To: <fef2282c-dd0f-5e30-d11e-fb0acb8938a7@citrix.com>


On 10/12/2018 16:33, speck for Andrew Cooper wrote:
> On 10/12/2018 17:53, speck for Andi Kleen wrote:
>> diff --git a/arch/x86/lib/clear_cpu.S b/arch/x86/lib/clear_cpu.S
>> new file mode 100644
>> index 000000000000..5af33baf5427
>> --- /dev/null
>> +++ b/arch/x86/lib/clear_cpu.S
>> @@ -0,0 +1,107 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#include <linux/linkage.h>
>> +#include <asm/alternative-asm.h>
>> +#include <asm/cpufeatures.h>
>> +
>> +/*
>> + * Clear internal CPU buffers on kernel boundaries.
>> + *
>> + * These sequences are somewhat fragile; please don't add
>> + * or change instructions in the middle of the areas marked with
>> + * start/end.
>> + *
>> + * Interrupts and NMIs are handled by re-clearing. We clear parts
>> + * of the kernel stack, which has other advantages too.
>> + *
>> + * Save all registers to make it easier to use for callers.
>> + *
>> + * This sequence is for Nehalem-IvyBridge. For Haswell we jump
>> + * to hsw_clear_buf.
>> + *
>> + * These functions need to be called on a full stack, as they may
>> + * use up to 1.5k of stack. They should also be called with
>> + * interrupts disabled. NMIs etc. are handled by letting every
>> + * NMI do its own clear sequence.
>> + */
>> +ENTRY(ivb_clear_cpu)
>> +GLOBAL(do_clear_cpu)
>> +	/*
>> +	 * objtool complains about unreachable code here,
>> +	 * which appears to be spurious.
>> +	 */
>> +	ALTERNATIVE "", "jmp hsw_clear_cpu", X86_BUG_MDS_CLEAR_CPU_HSW
>> +	push %__ASM_REG(si)
>> +	push %__ASM_REG(di)
>> +	push %__ASM_REG(cx)
>> +	mov %_ASM_SP, %__ASM_REG(si)
>> +	sub  $2*16, %_ASM_SP
>> +	and  $-16,%_ASM_SP
>> +	movdqa %xmm0, (%_ASM_SP)
>> +	movdqa %xmm1, 1*16(%_ASM_SP)
> You don't need to preserve %xmm1 here.  It is unmodified by the
> sequence, because the orpd pulls zero out of its memory operand. 
> Similarly...
>
>> +	sub  $672, %_ASM_SP
>> +	xorpd %xmm0,%xmm0
>> +	movdqa %xmm0, (%_ASM_SP)
>> +	movdqa %xmm0, 16(%_ASM_SP)
> ... this store doesn't appear to do anything useful, as that stack slot
> isn't read again, and...
>
>> +	mov %_ASM_SP, %__ASM_REG(di)
>> +	/* Clear sequence start */
>> +	movdqu %xmm0,(%__ASM_REG(di))
>> +	lfence
>> +	orpd (%__ASM_REG(di)), %xmm0
>> +	orpd (%__ASM_REG(di)), %xmm1
>> +	mfence
>> +	movl $40, %ecx
>> +	add  $32, %__ASM_REG(di)
> ... I know this was in the recommended sequence, but bytes 16-31 aren't
> used at all, and it feels fishy.
>
> Either this wants to be $16, or the second orpd wants a displacement of
> 16 (and we do need to retain the second zeroing write) so all 32 bytes
> are used.

Based on what Ronak has said in person, two back-to-back loads are
guaranteed to be scheduled on alternate load ports.

Therefore, this can be a 16-byte change rather than 32.

However, there is a separate problem with synchronising the other
threads, because a pause waitloop will be racing with this sequence for
allocation of load ports.  There doesn't appear to be a viable option to
guarantee that these two orpd hit both load ports in the core.

Given the GPR restoration later in the return-to-guest path, which is
15-ish loads in a line, one option being discussed is to do away with
the first half of the software sequence entirely.

~Andrew


