linux-api.vger.kernel.org archive mirror
From: "Andy Lutomirski" <luto@kernel.org>
To: "Sohil Mehta" <sohil.mehta@intel.com>,
	"the arch/x86 maintainers" <x86@kernel.org>
Cc: "Tony Luck" <tony.luck@intel.com>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>, "Jens Axboe" <axboe@kernel.dk>,
	"Christian Brauner" <christian@brauner.io>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	"Shuah Khan" <shuah@kernel.org>, "Arnd Bergmann" <arnd@arndb.de>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Raj Ashok" <ashok.raj@intel.com>,
	"Jacob Pan" <jacob.jun.pan@linux.intel.com>,
	"Gayatri Kammela" <gayatri.kammela@intel.com>,
	"Zeng Guang" <guang.zeng@intel.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"Randy E Witt" <randy.e.witt@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	"Ramesh Thomas" <ramesh.thomas@intel.com>,
	"Linux API" <linux-api@vger.kernel.org>,
	linux-arch@vger.kernel.org,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	linux-kselftest@vger.kernel.org
Subject: Re: [RFC PATCH 11/13] x86/uintr: Introduce uintr_wait() syscall
Date: Thu, 30 Sep 2021 11:08:35 -0700
Message-ID: <fd54f257-fa02-4ec3-a81b-b5e60f24bf94@www.fastmail.com>
In-Reply-To: <c6e83d0e-6551-4e16-0822-0abbc4d656c4@intel.com>

On Tue, Sep 28, 2021, at 9:56 PM, Sohil Mehta wrote:
> On 9/28/2021 8:30 PM, Andy Lutomirski wrote:
>> On Mon, Sep 13, 2021, at 1:01 PM, Sohil Mehta wrote:
>>> Add a new system call to allow applications to block in the kernel and
>>> wait for user interrupts.
>>>
>> ...
>>
>>> When the application makes this syscall the notification vector is
>>> switched to a new kernel vector. Any new SENDUIPI will invoke the kernel
>>> interrupt which is then used to wake up the process.
>> Any new SENDUIPI that happens to hit the target CPU's ucode at a time when the kernel vector is enabled will deliver the interrupt.  Any new SENDUIPI that happens to hit the target CPU's ucode at a time when a different UIPI-using task is running will *not* deliver the interrupt, unless I'm missing some magic.  Which means that wakeups will be missed, which I think makes this whole idea a nonstarter.
>>
>> Am I missing something?
>
>
> The current kernel implementation reserves 2 notification vectors (NV) 
> for the 2 states of a thread (running vs blocked).
>
> NV-1 – used only for tasks that are running. (results in a user 
> interrupt or a spurious kernel interrupt)
>
> NV-2 – used only for tasks that are blocked in the kernel. (always 
> results in a kernel interrupt)
>
> The UPID.UINV bits are switched between NV-1 and NV-2 based on the state 
> of the task.

Aha, cute.  So NV-1 is only sent if the target is directly paying attention and, assuming all the atomics are done right, NV-2 will be sent for tasks that are asleep.
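
To make the "assuming all the atomics are done right" part concrete, here is a minimal user-space sketch of switching the notification vector in a UPID-like control word. It is a model, not the patch's code: `struct upid_model`, `upid_switch_nv()`, both vector numbers, and the reduced single-word layout (only an ON bit and an NV field) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vector numbers; the kernel would pick the real ones. */
#define NV_RUNNING 0xec /* "NV-1": target is running */
#define NV_BLOCKED 0xeb /* "NV-2": target is blocked waiting for uintr */

/* Simplified control word: bit 0 = notification outstanding (ON),
 * bits 23:16 = notification vector. Everything else is omitted. */
#define UPID_ON       (UINT32_C(1) << 0)
#define UPID_NV_SHIFT 16
#define UPID_NV_MASK  (UINT32_C(0xff) << UPID_NV_SHIFT)

struct upid_model {
    _Atomic uint32_t ctl;
};

/* Read the vector currently programmed into the model UPID. */
static uint8_t upid_nv(struct upid_model *u)
{
    assert(u);
    return (uint8_t)((atomic_load(&u->ctl) & UPID_NV_MASK) >> UPID_NV_SHIFT);
}

/* Atomically replace the vector while preserving all other bits.
 * Returns true if ON was already set at the moment of the switch, so a
 * caller about to sleep can notice a notification that raced with it. */
static bool upid_switch_nv(struct upid_model *u, uint8_t nv)
{
    uint32_t old, updated;

    assert(u);
    old = atomic_load(&u->ctl);
    do {
        updated = (old & ~UPID_NV_MASK) | ((uint32_t)nv << UPID_NV_SHIFT);
    } while (!atomic_compare_exchange_weak(&u->ctl, &old, updated));

    return (old & UPID_ON) != 0;
}
```

The compare-and-exchange loop is the point of the sketch: a sender may set ON concurrently, so the vector update must not overwrite that bit, and the blocking path should check the return value before actually going to sleep.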

Logically, I think these are the possible states for a receiving task:

1. Running.  SENDUIPI will actually deliver the event directly (or not if uintr is masked).  If the task just stopped running and the atomics are right, then the schedule-out code can, I think, notice.

2. Not running, but either runnable or not currently waiting for uintr (e.g. blocked in an unrelated syscall).  This is straightforward -- no IPI or other action is needed other than setting the uintr-pending bit.

3. Blocked and waiting for uintr.  For this to work right, any attempt to send with SENDUIPI (or maybe a vdso or similar clever wrapper around it) needs to result in either a fault or an IPI so that the kernel can process the wakeup.

(Note that, depending on how fancy we get with file descriptors and polling, we need to watch out for the running-and-also-waiting-for-kernel-notification state.  That one will never work right.)
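
The three states above can be summarized as a small decision table. The sketch below is only a restatement of that list in C; the enum names and the two boolean inputs are hypothetical and do not correspond to anything in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* The three receiver states from the list above. */
enum recv_state {
    RECV_RUNNING,     /* 1: on a CPU right now */
    RECV_NOT_WAITING, /* 2: off CPU, runnable or blocked on something else */
    RECV_WAITING      /* 3: blocked and waiting for uintr */
};

/* What a SENDUIPI aimed at a task in each state has to cause. */
enum send_action {
    ACT_DIRECT_DELIVERY,  /* hardware delivers (or holds it if uintr is masked) */
    ACT_SET_PENDING_ONLY, /* just set the uintr-pending bit */
    ACT_KERNEL_WAKEUP     /* a fault or an IPI so the kernel can wake the task */
};

static enum recv_state classify(bool on_cpu, bool blocked_waiting_for_uintr)
{
    /* Running and also waiting for a kernel notification is the
     * combination the note above says will never work right. */
    assert(!(on_cpu && blocked_waiting_for_uintr));

    if (on_cpu)
        return RECV_RUNNING;
    return blocked_waiting_for_uintr ? RECV_WAITING : RECV_NOT_WAITING;
}

static enum send_action action_for(enum recv_state s)
{
    switch (s) {
    case RECV_RUNNING:
        return ACT_DIRECT_DELIVERY;
    case RECV_NOT_WAITING:
        return ACT_SET_PENDING_ONLY;
    case RECV_WAITING:
        return ACT_KERNEL_WAKEUP;
    }
    return ACT_KERNEL_WAKEUP; /* not reached; keeps compilers quiet */
}
```

Only `RECV_WAITING` needs the kernel to get involved on the send path, which is why it is the case the rest of this message worries about.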

State 3 is the nasty case, and your patch makes it work with this NV-2 trick.  The trick is a bit gross for a couple of reasons.  First, it conveys no useful information to the kernel except that an unknown task did SENDUIPI and maybe that the target was most recently on a given CPU, so a big list search is needed.  Second, it hits an essentially arbitrary and possibly completely innocent victim CPU and task, and people doing any sort of task isolation workload will strongly dislike this.  For some of those users, "strongly" may mean "treat system as completely failed, fail over to something else and call expensive tech support."  So we can't do that.

I think we have three choices:

1. Use a fancy wrapper around SENDUIPI.  This is probably a bad idea.

2. Treat the NV-2 as a real interrupt and honor affinity settings.  This will be annoying and slow, I think, if it's even workable at all.

3. Handle this case with faults instead of interrupts.  We could set a reserved bit in UPID so that SENDUIPI results in #GP, decode it, and process it.  This puts the onus on the actual task causing trouble, which is nice, and it lets us find the UPID and target directly instead of walking all of them.  I don't know how well it would play with hypothetical future hardware-initiated uintrs, though.
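
A rough single-threaded model of the fault-based option, to show the control flow only. Everything here is an assumption for illustration: the choice of reserved bit, the struct layouts, and the function names are hypothetical, a real UPID update would need atomics, and the "fault" is just a function call standing in for #GP handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UPID_ON           (UINT32_C(1) << 0)
#define UPID_RSVD_BLOCKED (UINT32_C(1) << 8) /* hypothetical reserved bit */

struct task_model {
    bool asleep;
};

struct upid_model {
    uint32_t ctl;             /* control bits */
    uint64_t pending;         /* one bit per posted user vector */
    struct task_model *owner; /* receiver that owns this UPID */
};

enum send_result { SEND_POSTED, SEND_FAULTED };

/* Receiver is about to block: poison the UPID so senders fault. */
static void mark_blocked(struct upid_model *u)
{
    u->ctl |= UPID_RSVD_BLOCKED;
    u->owner->asleep = true;
}

/* Stand-in for the #GP path: the faulting sender's context already
 * identifies the UPID, so no search over all receivers is needed. */
static void fault_handler(struct upid_model *u, unsigned int vec)
{
    u->pending |= UINT64_C(1) << vec; /* post on the sender's behalf */
    u->ctl &= ~UPID_RSVD_BLOCKED;     /* later sends take the fast path */
    u->owner->asleep = false;         /* wake the receiver */
}

/* Model of a send: fault if the reserved bit is set, else post directly. */
static enum send_result senduipi_model(struct upid_model *u, unsigned int vec)
{
    assert(vec < 64);
    if (u->ctl & UPID_RSVD_BLOCKED) {
        fault_handler(u, vec);
        return SEND_FAULTED;
    }
    u->pending |= UINT64_C(1) << vec;
    u->ctl |= UPID_ON;
    return SEND_POSTED;
}
```

In this model the cost of the wakeup lands on the sending task, and the handler reaches the target through the UPID it was handed rather than by walking a list.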


Thread overview: 87+ messages
2021-09-13 20:01 [RFC PATCH 00/13] x86 User Interrupts support Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 01/13] x86/uintr/man-page: Include man pages draft for reference Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 02/13] Documentation/x86: Add documentation for User Interrupts Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 03/13] x86/cpu: Enumerate User Interrupts support Sohil Mehta
2021-09-23 22:24   ` Thomas Gleixner
2021-09-24 19:59     ` Sohil Mehta
2021-09-27 20:42     ` Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 04/13] x86/fpu/xstate: Enumerate User Interrupts supervisor state Sohil Mehta
2021-09-23 22:34   ` Thomas Gleixner
2021-09-27 22:25     ` Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 05/13] x86/irq: Reserve a user IPI notification vector Sohil Mehta
2021-09-23 23:07   ` Thomas Gleixner
2021-09-25 13:30     ` Thomas Gleixner
2021-09-26 12:39       ` Thomas Gleixner
2021-09-27 19:07         ` Sohil Mehta
2021-09-28  8:11           ` Thomas Gleixner
2021-09-27 19:26     ` Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 06/13] x86/uintr: Introduce uintr receiver syscalls Sohil Mehta
2021-09-23 12:26   ` Greg KH
2021-09-24  0:05     ` Thomas Gleixner
2021-09-27 23:20     ` Sohil Mehta
2021-09-28  4:39       ` Greg KH
2021-09-28 16:47         ` Sohil Mehta
2021-09-23 23:52   ` Thomas Gleixner
2021-09-27 23:57     ` Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 07/13] x86/process/64: Add uintr task context switch support Sohil Mehta
2021-09-24  0:41   ` Thomas Gleixner
2021-09-28  0:30     ` Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 08/13] x86/process/64: Clean up uintr task fork and exit paths Sohil Mehta
2021-09-24  1:02   ` Thomas Gleixner
2021-09-28  1:23     ` Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 09/13] x86/uintr: Introduce vector registration and uintr_fd syscall Sohil Mehta
2021-09-24 10:33   ` Thomas Gleixner
2021-09-28 20:40     ` Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 10/13] x86/uintr: Introduce user IPI sender syscalls Sohil Mehta
2021-09-23 12:28   ` Greg KH
2021-09-28 18:01     ` Sohil Mehta
2021-09-29  7:04       ` Greg KH
2021-09-29 14:27         ` Sohil Mehta
2021-09-24 10:54   ` Thomas Gleixner
2021-09-13 20:01 ` [RFC PATCH 11/13] x86/uintr: Introduce uintr_wait() syscall Sohil Mehta
2021-09-24 11:04   ` Thomas Gleixner
2021-09-25 12:08     ` Thomas Gleixner
2021-09-28 23:13       ` Sohil Mehta
2021-09-28 23:08     ` Sohil Mehta
2021-09-26 14:41   ` Thomas Gleixner
2021-09-29  1:09     ` Sohil Mehta
2021-09-29  3:30   ` Andy Lutomirski
2021-09-29  4:56     ` Sohil Mehta
2021-09-30 18:08       ` Andy Lutomirski [this message]
2021-09-30 19:29         ` Thomas Gleixner
2021-09-30 22:01           ` Andy Lutomirski
2021-10-01  0:01             ` Thomas Gleixner
2021-10-01  4:41               ` Andy Lutomirski
2021-10-01  9:56                 ` Thomas Gleixner
2021-10-01 15:13                   ` Andy Lutomirski
2021-10-01 18:04                     ` Sohil Mehta
2021-10-01 21:29                     ` Thomas Gleixner
2021-10-01 23:00                       ` Sohil Mehta
2021-10-01 23:04                       ` Andy Lutomirski
2021-09-13 20:01 ` [RFC PATCH 12/13] x86/uintr: Wire up the user interrupt syscalls Sohil Mehta
2021-09-13 20:01 ` [RFC PATCH 13/13] selftests/x86: Add basic tests for User IPI Sohil Mehta
2021-09-13 20:27 ` [RFC PATCH 00/13] x86 User Interrupts support Dave Hansen
2021-09-14 19:03   ` Mehta, Sohil
2021-09-23 12:19     ` Greg KH
2021-09-23 14:09       ` Greg KH
2021-09-23 14:46         ` Dave Hansen
2021-09-23 15:07           ` Greg KH
2021-09-23 23:24         ` Sohil Mehta
2021-09-23 23:09       ` Sohil Mehta
2021-09-24  0:17       ` Sohil Mehta
2021-09-23 14:39 ` Jens Axboe
2021-09-29  4:31 ` Andy Lutomirski
2021-09-30 16:30   ` Stefan Hajnoczi
2021-09-30 17:24     ` Sohil Mehta
2021-09-30 17:26       ` Andy Lutomirski
2021-10-01 16:35       ` Stefan Hajnoczi
2021-10-01 16:41         ` Richard Henderson
2021-09-30 16:26 ` Stefan Hajnoczi
2021-10-01  0:40   ` Sohil Mehta
2021-10-01  8:19 ` Pavel Machek
2021-11-18 22:19   ` Sohil Mehta
2021-11-16  3:49 ` Prakash Sangappa
2021-11-18 21:44   ` Sohil Mehta
2021-12-22 16:17 ` Chrisma Pakha
2022-01-07  2:08   ` Sohil Mehta
2022-01-17  1:14     ` Chrisma Pakha
