All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Andy Lutomirski" <luto@kernel.org>
To: "Tony Luck" <tony.luck@intel.com>,
	"Thomas Gleixner" <tglx@linutronix.de>
Cc: "Fenghua Yu" <fenghua.yu@intel.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"Lu Baolu" <baolu.lu@linux.intel.com>,
	"Joerg Roedel" <joro@8bytes.org>,
	"Josh Poimboeuf" <jpoimboe@redhat.com>,
	"Dave Jiang" <dave.jiang@intel.com>,
	"Jacob Jun Pan" <jacob.jun.pan@intel.com>,
	"Raj Ashok" <ashok.raj@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	iommu@lists.linux-foundation.org,
	"the arch/x86 maintainers" <x86@kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
Date: Fri, 24 Sep 2021 16:03:53 -0700	[thread overview]
Message-ID: <a77ee33c-6fa7-468c-8fc0-a0a2ce725e75@www.fastmail.com> (raw)
In-Reply-To: <YU3414QT0J7EN4w9@agluck-desk2.amr.corp.intel.com>



On Fri, Sep 24, 2021, at 9:12 AM, Luck, Tony wrote:
> On Fri, Sep 24, 2021 at 03:18:12PM +0200, Thomas Gleixner wrote:
>> On Thu, Sep 23 2021 at 19:48, Thomas Gleixner wrote:
>> > On Thu, Sep 23 2021 at 09:40, Tony Luck wrote:
>> >
>> > fpu_write_task_pasid() can just grab the pasid from current->mm->pasid
>> > and be done with it.
>> >
>> > The task exit code can just call iommu_pasid_put_task_ref() from the
>> > generic code and not from within x86.
>> 
>> But OTOH why do you need a per task reference count on the PASID at all?
>> 
>> The PASID is fundamentaly tied to the mm and the mm can't go away before
>> the threads have gone away unless this magically changed after I checked
>> that ~20 years ago.
>
> It would be possible to avoid a per-task reference to the PASID by
> taking an extra reference when mm->pasid is first allocated using
> the CONFIG_PASID_TASK_REFS you proposed yesterday to define a function
> to take the extra reference, and another to drop it when the mm is
> finally freed ... with stubs to do nothing on architectures that
> don't create per-task PASID context.
>
> This solution works, but is functionally different from Fenghua's
> proposal for this case:
>
> 	Process clones a task
> 	task binds a device
> 	task accesses device using PASID
> 	task unbinds device
> 	task exits (but process lives on)
>
> Fenghua will free the PASID because the reference count goes
> back to zero. The "take an extra reference and keep until the
> mm is freed" option would needlessly hold onto the PASID.
>
> This seems like an odd usage case ... even if it does exist, a process
> that does this may spawn another task that does the same thing.
>
> If you think it is sufficiently simpler to use the "extra reference"
> option (invoking the "perfect is the enemy of good enough" rule) then we
> can change.

I think the perfect and the good are a bit confused here. If we go for "good", then we have an mm owning a PASID for its entire lifetime.  If we want "perfect", then we should actually do it right: teach the kernel to update an entire mm's PASID setting all at once.  This isn't *that* hard -- it involves two things:

1. The context switch code needs to resync PASID.  Unfortunately, this adds some overhead to every context switch, although a static_branch could minimize it for non-PASID users.
2. A change to an mm's PASID needs to sent an IPI, but that IPI can't touch FPU state.  So instead the IPI should use task_work_add() to make sure PASID gets resynced.

And there is still no per-task refcounting.

After all, the not so perfect attempt at perfect in this patch set won't actually work if a thread pool is involved.

WARNING: multiple messages have this Message-ID (diff)
From: "Andy Lutomirski" <luto@kernel.org>
To: "Tony Luck" <tony.luck@intel.com>,
	"Thomas Gleixner" <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Raj Ashok <ashok.raj@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	"Peter Zijlstra \(Intel\)" <peterz@infradead.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	iommu@lists.linux-foundation.org, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	Jacob Jun Pan <jacob.jun.pan@intel.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
Date: Fri, 24 Sep 2021 16:03:53 -0700	[thread overview]
Message-ID: <a77ee33c-6fa7-468c-8fc0-a0a2ce725e75@www.fastmail.com> (raw)
In-Reply-To: <YU3414QT0J7EN4w9@agluck-desk2.amr.corp.intel.com>



On Fri, Sep 24, 2021, at 9:12 AM, Luck, Tony wrote:
> On Fri, Sep 24, 2021 at 03:18:12PM +0200, Thomas Gleixner wrote:
>> On Thu, Sep 23 2021 at 19:48, Thomas Gleixner wrote:
>> > On Thu, Sep 23 2021 at 09:40, Tony Luck wrote:
>> >
>> > fpu_write_task_pasid() can just grab the pasid from current->mm->pasid
>> > and be done with it.
>> >
>> > The task exit code can just call iommu_pasid_put_task_ref() from the
>> > generic code and not from within x86.
>> 
>> But OTOH why do you need a per task reference count on the PASID at all?
>> 
>> The PASID is fundamentaly tied to the mm and the mm can't go away before
>> the threads have gone away unless this magically changed after I checked
>> that ~20 years ago.
>
> It would be possible to avoid a per-task reference to the PASID by
> taking an extra reference when mm->pasid is first allocated using
> the CONFIG_PASID_TASK_REFS you proposed yesterday to define a function
> to take the extra reference, and another to drop it when the mm is
> finally freed ... with stubs to do nothing on architectures that
> don't create per-task PASID context.
>
> This solution works, but is functionally different from Fenghua's
> proposal for this case:
>
> 	Process clones a task
> 	task binds a device
> 	task accesses device using PASID
> 	task unbinds device
> 	task exits (but process lives on)
>
> Fenghua will free the PASID because the reference count goes
> back to zero. The "take an extra reference and keep until the
> mm is freed" option would needlessly hold onto the PASID.
>
> This seems like an odd usage case ... even if it does exist, a process
> that does this may spawn another task that does the same thing.
>
> If you think it is sufficiently simpler to use the "extra reference"
> option (invoking the "perfect is the enemy of good enough" rule) then we
> can change.

I think the perfect and the good are a bit confused here. If we go for "good", then we have an mm owning a PASID for its entire lifetime.  If we want "perfect", then we should actually do it right: teach the kernel to update an entire mm's PASID setting all at once.  This isn't *that* hard -- it involves two things:

1. The context switch code needs to resync PASID.  Unfortunately, this adds some overhead to every context switch, although a static_branch could minimize it for non-PASID users.
2. A change to an mm's PASID needs to sent an IPI, but that IPI can't touch FPU state.  So instead the IPI should use task_work_add() to make sure PASID gets resynced.

And there is still no per-task refcounting.

After all, the not so perfect attempt at perfect in this patch set won't actually work if a thread pool is involved.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  reply	other threads:[~2021-09-24 23:04 UTC|newest]

Thread overview: 154+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-20 19:23 [PATCH 0/8] Re-enable ENQCMD and PASID MSR Fenghua Yu
2021-09-20 19:23 ` Fenghua Yu
2021-09-20 19:23 ` [PATCH 1/8] iommu/vt-d: Clean up unused PASID updating functions Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu
2021-09-29  7:34   ` Lu Baolu
2021-09-29  7:34     ` Lu Baolu
2021-09-30  0:40     ` Fenghua Yu
2021-09-30  0:40       ` Fenghua Yu
2021-09-20 19:23 ` [PATCH 2/8] x86/process: Clear PASID state for a newly forked/cloned thread Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu
2021-09-20 19:23 ` [PATCH 3/8] sched: Define and initialize a flag to identify valid PASID in the task Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu
2021-09-20 19:23 ` [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu
2021-09-22 21:07   ` Peter Zijlstra
2021-09-22 21:07     ` Peter Zijlstra
2021-09-22 21:11     ` Peter Zijlstra
2021-09-22 21:11       ` Peter Zijlstra
2021-09-22 21:26       ` Luck, Tony
2021-09-22 21:26         ` Luck, Tony
2021-09-23  7:03         ` Peter Zijlstra
2021-09-23  7:03           ` Peter Zijlstra
2021-09-22 21:33       ` Dave Hansen
2021-09-22 21:33         ` Dave Hansen
2021-09-23  7:05         ` Peter Zijlstra
2021-09-23  7:05           ` Peter Zijlstra
2021-09-22 21:36       ` Fenghua Yu
2021-09-22 21:36         ` Fenghua Yu
2021-09-22 23:39     ` Fenghua Yu
2021-09-22 23:39       ` Fenghua Yu
2021-09-23 17:14     ` Luck, Tony
2021-09-23 17:14       ` Luck, Tony
2021-09-24 13:37       ` Peter Zijlstra
2021-09-24 13:37         ` Peter Zijlstra
2021-09-24 15:39         ` Luck, Tony
2021-09-24 15:39           ` Luck, Tony
2021-09-29  9:00           ` Peter Zijlstra
2021-09-29  9:00             ` Peter Zijlstra
2021-09-23 11:31   ` Thomas Gleixner
2021-09-23 11:31     ` Thomas Gleixner
2021-09-23 23:17   ` Andy Lutomirski
2021-09-23 23:17     ` Andy Lutomirski
2021-09-24  2:56     ` Fenghua Yu
2021-09-24  2:56       ` Fenghua Yu
2021-09-24  5:12       ` Andy Lutomirski
2021-09-24  5:12         ` Andy Lutomirski
2021-09-27 21:02     ` Luck, Tony
2021-09-27 21:02       ` Luck, Tony
2021-09-27 23:51       ` Dave Hansen
2021-09-27 23:51         ` Dave Hansen
2021-09-28 18:50         ` Luck, Tony
2021-09-28 18:50           ` Luck, Tony
2021-09-28 19:19           ` Dave Hansen
2021-09-28 19:19             ` Dave Hansen
2021-09-28 20:28             ` Luck, Tony
2021-09-28 20:28               ` Luck, Tony
2021-09-28 20:55               ` Dave Hansen
2021-09-28 20:55                 ` Dave Hansen
2021-09-28 23:10                 ` Luck, Tony
2021-09-28 23:10                   ` Luck, Tony
2021-09-28 23:50                   ` Fenghua Yu
2021-09-28 23:50                     ` Fenghua Yu
2021-09-29  0:08                     ` Luck, Tony
2021-09-29  0:08                       ` Luck, Tony
2021-09-29  0:26                       ` Yu, Fenghua
2021-09-29  0:26                         ` Yu, Fenghua
2021-09-29  1:06                         ` Luck, Tony
2021-09-29  1:06                           ` Luck, Tony
2021-09-29  1:16                           ` Fenghua Yu
2021-09-29  1:16                             ` Fenghua Yu
2021-09-29  2:11                             ` Luck, Tony
2021-09-29  2:11                               ` Luck, Tony
2021-09-29  1:56                       ` Yu, Fenghua
2021-09-29  1:56                         ` Yu, Fenghua
2021-09-29  2:15                         ` Luck, Tony
2021-09-29  2:15                           ` Luck, Tony
2021-09-29 16:58                   ` Andy Lutomirski
2021-09-29 16:58                     ` Andy Lutomirski
2021-09-29 17:07                     ` Luck, Tony
2021-09-29 17:07                       ` Luck, Tony
2021-09-29 17:48                       ` Andy Lutomirski
2021-09-29 17:48                         ` Andy Lutomirski
2021-09-20 19:23 ` [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu
2021-09-23  5:43   ` Lu Baolu
2021-09-23  5:43     ` Lu Baolu
2021-09-30  0:44     ` Fenghua Yu
2021-09-30  0:44       ` Fenghua Yu
2021-09-23 14:36   ` Thomas Gleixner
2021-09-23 14:36     ` Thomas Gleixner
2021-09-23 16:40     ` Luck, Tony
2021-09-23 16:40       ` Luck, Tony
2021-09-23 17:48       ` Thomas Gleixner
2021-09-23 17:48         ` Thomas Gleixner
2021-09-24 13:18         ` Thomas Gleixner
2021-09-24 13:18           ` Thomas Gleixner
2021-09-24 16:12           ` Luck, Tony
2021-09-24 16:12             ` Luck, Tony
2021-09-24 23:03             ` Andy Lutomirski [this message]
2021-09-24 23:03               ` Andy Lutomirski
2021-09-24 23:11               ` Luck, Tony
2021-09-24 23:11                 ` Luck, Tony
2021-09-29  9:54               ` Peter Zijlstra
2021-09-29  9:54                 ` Peter Zijlstra
2021-09-29 12:28                 ` Thomas Gleixner
2021-09-29 12:28                   ` Thomas Gleixner
2021-09-29 16:51                   ` Luck, Tony
2021-09-29 16:51                     ` Luck, Tony
2021-09-29 17:07                     ` Fenghua Yu
2021-09-29 17:07                       ` Fenghua Yu
2021-09-29 16:59                   ` Andy Lutomirski
2021-09-29 16:59                     ` Andy Lutomirski
2021-09-29 17:15                     ` Thomas Gleixner
2021-09-29 17:15                       ` Thomas Gleixner
2021-09-29 17:41                       ` Luck, Tony
2021-09-29 17:41                         ` Luck, Tony
2021-09-29 17:46                         ` Andy Lutomirski
2021-09-29 17:46                           ` Andy Lutomirski
2021-09-29 18:07                         ` Fenghua Yu
2021-09-29 18:07                           ` Fenghua Yu
2021-09-29 18:31                           ` Luck, Tony
2021-09-29 18:31                             ` Luck, Tony
2021-09-29 20:07                             ` Thomas Gleixner
2021-09-29 20:07                               ` Thomas Gleixner
2021-09-24 16:12           ` Fenghua Yu
2021-09-24 16:12             ` Fenghua Yu
2021-09-25 23:13             ` Thomas Gleixner
2021-09-25 23:13               ` Thomas Gleixner
2021-09-28 16:36               ` Fenghua Yu
2021-09-28 16:36                 ` Fenghua Yu
2021-09-23 23:09   ` Andy Lutomirski
2021-09-23 23:09     ` Andy Lutomirski
2021-09-23 23:22     ` Luck, Tony
2021-09-23 23:22       ` Luck, Tony
2021-09-24  5:17       ` Andy Lutomirski
2021-09-24  5:17         ` Andy Lutomirski
2021-09-20 19:23 ` [PATCH 6/8] x86/cpufeatures: Re-enable ENQCMD Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu
2021-09-20 19:23 ` [PATCH 7/8] tools/objtool: Check for use of the ENQCMD instruction in the kernel Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu
2021-09-22 21:03   ` Peter Zijlstra
2021-09-22 21:03     ` Peter Zijlstra
2021-09-22 23:44     ` Fenghua Yu
2021-09-22 23:44       ` Fenghua Yu
2021-09-23  7:17       ` Peter Zijlstra
2021-09-23  7:17         ` Peter Zijlstra
2021-09-23 15:26         ` Fenghua Yu
2021-09-23 15:26           ` Fenghua Yu
2021-09-24  0:55           ` Josh Poimboeuf
2021-09-24  0:55             ` Josh Poimboeuf
2021-09-24  0:57             ` Fenghua Yu
2021-09-24  0:57               ` Fenghua Yu
2021-09-20 19:23 ` [PATCH 8/8] docs: x86: Change documentation for SVA (Shared Virtual Addressing) Fenghua Yu
2021-09-20 19:23   ` Fenghua Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a77ee33c-6fa7-468c-8fc0-a0a2ce725e75@www.fastmail.com \
    --to=luto@kernel.org \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=joro@8bytes.org \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.