From: "Luck, Tony" <tony.luck@intel.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Joerg Roedel <joro@8bytes.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Jacob Jun Pan <jacob.jun.pan@intel.com>,
	Raj Ashok <ashok.raj@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	iommu@lists.linux-foundation.org,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
Date: Tue, 28 Sep 2021 13:28:23 -0700
Message-ID: <YVN652x14dMgyE85@agluck-desk2.amr.corp.intel.com>
In-Reply-To: <3f97b77e-a609-997b-3be7-f44ff7312b0d@intel.com>

On Tue, Sep 28, 2021 at 12:19:22PM -0700, Dave Hansen wrote:
> On 9/28/21 11:50 AM, Luck, Tony wrote:
> > On Mon, Sep 27, 2021 at 04:51:25PM -0700, Dave Hansen wrote:
> ...
> >> 1. Hide whether we need to write to real registers
> >> 2. Hide whether we need to update the in-memory image
> >> 3. Hide other FPU infrastructure like the TIF flag.
> >> 4. Make the users deal with a *whole* state in the replace API
> > 
> > Is that difference just whether you need to save the
> > state from registers to memory (for the "update" case)
> > or not (for the "replace" case ... where you can ignore
> > the current register, overwrite the whole per-feature
> > xsave area and mark it to be restored to registers).
> > 
> > If so, just a "bool full" argument might do the trick?
> 
> I want to be able to hide the complexity of where the old state comes
> from.  It might be in registers or it might be in memory or it might be
> *neither*.  It's possible we're running with stale register state and a
> current->...->xsave buffer that has XFEATURES&XFEATURE_FOO == 0.
> 
> In that case, the "old" copy might be memcpy'd out of the init_task.
> Or, for pkeys, we might build it ourselves with init_pkru_val.

So should there be an error case if there isn't an "old" state, and
the user calls:

	p = begin_update_one_xsave_feature(XFEATURE_something, false);

Maybe instead of an error, just fill it in with the init state for the feature?
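
To make the "fill it in with the init state" idea concrete, here is a
minimal user-space sketch. It is not kernel code: struct model_xsave, the
model_* names, the two features and the init-state bytes are all made up
for illustration. It only models the one decision discussed above: when
the caller wants an update and the feature's XSTATE_BV bit is clear, hand
back memory that already holds the init state rather than an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature numbering and sizes, for illustration only. */
enum model_xfeature { MODEL_FEATURE_A, MODEL_FEATURE_B, MODEL_NR_FEATURES };
#define MODEL_FEATURE_SIZE 8

struct model_xsave {
	uint64_t xstate_bv;	/* bit set: that feature's area holds valid data */
	uint8_t area[MODEL_NR_FEATURES][MODEL_FEATURE_SIZE];
};

/* Made-up init states; the real ones are defined by the architecture. */
static const uint8_t model_init_state[MODEL_NR_FEATURES][MODEL_FEATURE_SIZE] = {
	[MODEL_FEATURE_A] = { 0 },
	[MODEL_FEATURE_B] = { 0x55, 0x55, 0x55, 0x55 },
};

/*
 * full == true:  caller overwrites the whole area, so return it untouched.
 * full == false: caller modifies existing state; if there is no "old"
 *                state in the buffer, populate the init state first.
 */
void *model_begin_update(struct model_xsave *xs, enum model_xfeature f, bool full)
{
	void *addr;

	assert(f < MODEL_NR_FEATURES);	/* stands in for the BUG_ON() check */
	addr = xs->area[f];

	if (!full && !(xs->xstate_bv & (1ULL << f)))
		memcpy(addr, model_init_state[f], MODEL_FEATURE_SIZE);

	return addr;
}
```

With a buffer pre-filled with garbage and xstate_bv == 0, an update of
MODEL_FEATURE_B sees the init bytes, while a full replace of
MODEL_FEATURE_A gets the garbage back untouched, which is fine because
that caller promised to overwrite all of it.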

> > Also - you have a "tsk" argument in your pseudo code. Is
> > this needed? Are there places where we need to perform
> > these operations on something other than "current"?
> 
> Two cases come to mind:
> 1. Fork/clone where we are doing things to our child's XSAVE buffer
> 2. ptrace() where we are poking into another task's state
> 
> ptrace() goes for the *whole* buffer now.  I'm not sure it would need
> this per-feature API.  I just call it out as something that we might
> need in the future.

Ok - those seem ok ... it is up to the caller to make sure that the
target task is in some "not running, and can't suddenly start running"
state before calling these functions.

> 
> > pseudo-code:
> > 
> > void *begin_update_one_xsave_feature(enum xfeature xfeature, bool full)
> > {
> > 	void *addr;
> > 
> > 	BUG_ON(!(xsave->header.xcomp_bv & xfeature));
> > 
> > 	addr = __raw_xsave_addr(xsave, xfeature);
> > 
> > 	fpregs_lock();
> > 
> > 	if (full)
> > 		return addr;
> 
> If the feature is marked as in the init state in the buffer
> (XSTATE_BV[feature]==0), this addr *could* contain total garbage.  So,
> we'd want to make sure that the memory contents have the init state
> written before handing them back to the caller.  That's not strictly
> required if the user is writing the whole thing, but it's the nice thing
> to do.

Nice guys waste CPU cycles writing to memory that is just going to get
written again.

> 
> > 	if (xfeature registers are "live")
> > 		xsaves(xstate, 1 << xfeature);
> 
> One little note: I don't think we would necessarily need to do an XSAVES
> here.  For PKRU, for instance, we could just do a rdpkru.

Like this?

	if (tsk == current) {
		u64 msr;

		switch (xfeature) {
		case XFEATURE_PKRU:
			/* PKRU is readable directly, no XSAVES needed */
			*(u32 *)addr = rdpkru();
			break;
		case XFEATURE_PASID:
			rdmsrl(MSR_IA32_PASID, msr);
			*(u64 *)addr = msr;
			break;
		/* ... any other "easy" states ... */
		default:
			xsaves(xstate, 1 << xfeature);
			break;
		}
	}

> 
> > 	return addr;
> > }
> > 
> > void finish_update_one_xsave_feature(enum xfeature xfeature)
> > {
> > 	mark feature modified
> 
> I think we'd want to do this at the "begin" time.  Also, do you mean we
> should set XSTATE_BV[feature]?

Begin? End? It's all inside fpregs_lock(). But whatever seems best.

Yes, I think that this means set XSTATE_BV[feature] ... but I'm
relying on you as the xsave expert to help get the subtle bits right so
that Andy Lutomirski can smile at this code.

> > 	set TIF bit
> 
> Since the XSAVE buffer was updated, it now contains the canonical FPU
> state.  It may have diverged from the register state, thus we need to
> set TIF_NEED_FPU_LOAD.

Yes, that's the TIF bit my pseudo-code intended.

> It's also worth noting that we *could*:
> 
> 	xrstors(xstate, 1<<xfeature);
> 
> as well.  That would bring the registers back up to date and we could
> keep TIF_NEED_FPU_LOAD==0.

Only makes sense if "tsk == current". But does this help? The work seems
to be the same whether we do it now, or later. We don't know for sure
that we will directly return to the task. We might context switch to
another task, so loading the state into registers now would just be
wasted time.
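
As a rough illustration of the deferred-reload argument, here is a small
user-space model of the begin/finish pairing for a single feature. It is
only a sketch under stated assumptions: struct model_task and every
model_* name are hypothetical, the two booleans stand in for
fpregs_lock() and TIF_NEED_FPU_LOAD, and a zero init state is assumed.
The register is only refreshed in model_return_to_user(), so nothing is
loaded if the task never gets that far before being switched out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_FEATURE_BIT (1ULL << 0)	/* hypothetical feature bit */

struct model_task {
	uint64_t xstate_bv;	/* which features are valid in memory */
	bool need_fpu_load;	/* stands in for TIF_NEED_FPU_LOAD */
	bool fpregs_locked;	/* stands in for fpregs_lock()/unlock() */
	uint64_t mem_image;	/* the feature's slot in the xsave buffer */
	uint64_t live_reg;	/* the feature's "live" register */
};

uint64_t *model_begin_update(struct model_task *t)
{
	assert(!t->fpregs_locked);
	t->fpregs_locked = true;

	if (!t->need_fpu_load)
		t->mem_image = t->live_reg;	/* registers are live: save them */
	else if (!(t->xstate_bv & MODEL_FEATURE_BIT))
		t->mem_image = 0;		/* no old state: assumed init state */

	return &t->mem_image;
}

void model_finish_update(struct model_task *t)
{
	assert(t->fpregs_locked);
	t->xstate_bv |= MODEL_FEATURE_BIT;	/* mark feature modified */
	t->need_fpu_load = true;		/* memory is now canonical */
	t->fpregs_locked = false;
}

/* The reload happens here, and only if it is still needed. */
void model_return_to_user(struct model_task *t)
{
	if (!t->need_fpu_load)
		return;
	if (t->xstate_bv & MODEL_FEATURE_BIT)
		t->live_reg = t->mem_image;
	t->need_fpu_load = false;
}
```

After a begin/write/finish sequence the register still holds the old
value and the flag is set; the new value only reaches the register in
model_return_to_user().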

> 
> > 	fpregs_unlock();
> > }

-Tony
