linux-arch.vger.kernel.org archive mirror
From: Sohil Mehta <sohil.mehta@intel.com>
To: Thomas Gleixner <tglx@linutronix.de>, <x86@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	"Ingo Molnar" <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Christian Brauner <christian@brauner.io>,
	Peter Zijlstra <peterz@infradead.org>,
	Shuah Khan <shuah@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Jonathan Corbet <corbet@lwn.net>, Ashok Raj <ashok.raj@intel.com>,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	"Gayatri Kammela" <gayatri.kammela@intel.com>,
	Zeng Guang <guang.zeng@intel.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	Randy E Witt <randy.e.witt@intel.com>,
	Ravi V Shankar <ravi.v.shankar@intel.com>,
	Ramesh Thomas <ramesh.thomas@intel.com>,
	<linux-api@vger.kernel.org>, <linux-arch@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-kselftest@vger.kernel.org>
Subject: Re: [RFC PATCH 11/13] x86/uintr: Introduce uintr_wait() syscall
Date: Tue, 28 Sep 2021 16:08:40 -0700
Message-ID: <2d404db6-828a-98de-f409-94ddf2c2af67@intel.com>
In-Reply-To: <87r1dedykm.ffs@tglx>

On 9/24/2021 4:04 AM, Thomas Gleixner wrote:
> On Mon, Sep 13 2021 at 13:01, Sohil Mehta wrote:
>> Currently, the task wait list is global one. To make the implementation
>> scalable there is a need to move to a distributed per-cpu wait list.
> How are per cpu wait lists going to solve the problem?


Currently, the global wait list can be concurrently accessed by multiple 
cpus. If we have per-cpu wait lists then the UPID scanning only needs to 
happen on the local cpu's wait list.

After an application calls uintr_wait(), the notification interrupt will 
be delivered only to the cpu where the task blocked. In this case, we 
can reduce the UPID search list and probably get rid of the global 
spinlock as well.

Though, I am not sure how much impact this would have vs. the problem of 
scanning the entire wait list.
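
As a rough illustration of the per-cpu idea, here is a small user-space 
model. It is not code from the patch: MODEL_NR_CPUS, struct percpu_waiter, 
percpu_block() and percpu_notify() are made-up names, and locking and task 
migration are ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical, for illustration only */

/* Hypothetical stand-in for a blocked task's UPID context. */
struct percpu_waiter {
	struct percpu_waiter *next;
	bool on;	/* models UPID.ON: a notification is outstanding */
	bool woken;
};

/* One list head per cpu instead of a single global list. */
static struct percpu_waiter *percpu_wait_list[MODEL_NR_CPUS];

/* The task blocks on 'cpu': queue it on that cpu's list only. */
static void percpu_block(struct percpu_waiter *w, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	w->woken = false;
	w->next = percpu_wait_list[cpu];
	percpu_wait_list[cpu] = w;
}

/*
 * Notification interrupt on 'cpu': scan only the local list.
 * Returns the number of entries visited, to show the reduced scan.
 */
static int percpu_notify(int cpu)
{
	struct percpu_waiter **pp;
	int visited = 0;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	pp = &percpu_wait_list[cpu];
	while (*pp) {
		struct percpu_waiter *w = *pp;

		visited++;
		if (w->on) {
			w->woken = true;
			*pp = w->next;	/* unlink; pp stays in place */
			w->next = NULL;
		} else {
			pp = &w->next;
		}
	}
	return visited;
}
```

With one waiter on each of three cpus, a notification on one cpu visits 
only that cpu's single entry. The per-list scan is still linear, so this 
only helps if waiters are spread across cpus.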

>> +
>> +/*
>> + * Handler for UINTR_KERNEL_VECTOR.
>> + */
>> +DEFINE_IDTENTRY_SYSVEC(sysvec_uintr_kernel_notification)
>> +{
>> +	/* TODO: Add entry-exit tracepoints */
>> +	ack_APIC_irq();
>> +	inc_irq_stat(uintr_kernel_notifications);
>> +
>> +	uintr_wake_up_process();
> So this interrupt happens for any of those notifications. How are they
> differentiated?


Unfortunately, there is no help from the hardware here to identify the 
intended target.

When a task blocks we:
* switch the UINV to a kernel NV.
* leave SN as 0
* leave UPID.NDST set to the current cpu
* add the task to a wait list

When the notification interrupt arrives:
* Scan the entire wait list to check if the ON bit is set for any UPID 
(very inefficient)
* Set SN to 1 for that task.
* Change the UINV to user NV.
* Remove the task from the list and make it runnable.

We could end up detecting multiple tasks that have the ON bit set. The 
notification interrupt for any task that has ON set is expected to 
arrive soon anyway. So no harm done here.
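
The block and wake-up steps above can be modelled in user space as shown 
below. This is only a sketch: struct model_upid is a simplified stand-in 
for the real UPID layout, the two vector values are placeholders, and 
there is no locking or atomicity here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder vector numbers, assumed for illustration only. */
#define MODEL_USER_NV	0xec
#define MODEL_KERNEL_NV	0xeb

/* Simplified stand-in for a UPID; not the hardware layout. */
struct model_upid {
	bool on;	/* ON: outstanding notification */
	bool sn;	/* SN: suppress notification */
	uint8_t nv;	/* notification vector */
	uint32_t ndst;	/* notification destination (cpu) */
};

struct model_task {
	struct model_upid upid;
	bool waiting;
	bool runnable;
	struct model_task *next;
};

static struct model_task *model_wait_list;

/* Task blocks: kernel NV, SN left at 0, NDST = current cpu, add to list. */
static void model_block(struct model_task *t, uint32_t cpu)
{
	t->upid.nv = MODEL_KERNEL_NV;
	t->upid.sn = false;
	t->upid.ndst = cpu;
	t->waiting = true;
	t->runnable = false;
	t->next = model_wait_list;
	model_wait_list = t;
}

/*
 * Kernel notification arrives: scan the whole list and wake every
 * task whose ON bit is set. Returns the number of tasks woken.
 */
static int model_kernel_notification(void)
{
	struct model_task **pp = &model_wait_list;
	int woken = 0;

	while (*pp) {
		struct model_task *t = *pp;

		if (t->upid.on) {
			t->upid.sn = true;		/* set SN to 1 */
			t->upid.nv = MODEL_USER_NV;	/* back to user NV */
			t->waiting = false;
			t->runnable = true;
			*pp = t->next;			/* remove from list */
			t->next = NULL;
			woken++;
		} else {
			pp = &t->next;
		}
	}
	return woken;
}
```

If two of three blocked tasks have ON set when a single interrupt arrives, 
both are woken in the same pass and the third stays on the list with the 
kernel NV.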

The main issue here is that we end up scanning the entire list for 
every interrupt. Not sure if there is any way we could optimize this?


> Again. We have proper wait primitives.

I'll use proper wait primitives next time.

>> +	return -EINTR;
>> +}
>> +
>> +/*
>> + * Runs in interrupt context.
>> + * Scan through all UPIDs to check if any interrupt is on going.
>> + */
>> +void uintr_wake_up_process(void)
>> +{
>> +	struct uintr_upid_ctx *upid_ctx, *tmp;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&uintr_wait_lock, flags);
>> +	list_for_each_entry_safe(upid_ctx, tmp, &uintr_wait_list, node) {
>> +		if (test_bit(UPID_ON, (unsigned long*)&upid_ctx->upid->nc.status)) {
>> +			set_bit(UPID_SN, (unsigned long *)&upid_ctx->upid->nc.status);
>> +			upid_ctx->upid->nc.nv = UINTR_NOTIFICATION_VECTOR;
>> +			upid_ctx->waiting = false;
>> +			wake_up_process(upid_ctx->task);
>> +			list_del(&upid_ctx->node);
> So any of these notification interrupts does a global mass wake up? How
> does that make sense?


The wake up happens only for the tasks that have a pending interrupt. 
They are going to be woken up soon anyways.

>> +/* Called when task is unregistering/exiting */
>> +static void uintr_remove_task_wait(struct task_struct *task)
>> +{
>> +	struct uintr_upid_ctx *upid_ctx, *tmp;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&uintr_wait_lock, flags);
>> +	list_for_each_entry_safe(upid_ctx, tmp, &uintr_wait_list, node) {
>> +		if (upid_ctx->task == task) {
>> +			pr_debug("wait: Removing task %d from wait\n",
>> +				 upid_ctx->task->pid);
>> +			upid_ctx->upid->nc.nv = UINTR_NOTIFICATION_VECTOR;
>> +			upid_ctx->waiting = false;
>> +			list_del(&upid_ctx->node);
>> +		}
> What? You have to do a global list walk to find the entry which you
> added yourself?

Duh! I could have gotten the upid_ctx from the task_struct itself. Will 
fix this.
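
A user-space sketch of what removal without a list walk could look like, 
assuming the task carries a pointer to its own upid_ctx. The struct 
layouts and the model_* names are hypothetical, and the spinlock is left 
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, standing in for list_head. */
struct model_node {
	struct model_node *prev, *next;
};

struct model_upid_ctx {
	struct model_node node;
	bool waiting;
};

struct model_task {
	struct model_upid_ctx *upid_ctx;	/* assumed back-pointer */
};

static struct model_node model_wait_head = {
	&model_wait_head, &model_wait_head
};

static void model_add_wait(struct model_task *t)
{
	struct model_node *n;

	assert(t && t->upid_ctx);
	n = &t->upid_ctx->node;
	n->next = model_wait_head.next;
	n->prev = &model_wait_head;
	model_wait_head.next->prev = n;
	model_wait_head.next = n;
	t->upid_ctx->waiting = true;
}

/* Unlink the task's own entry directly; no walk over the wait list. */
static void model_remove_task_wait(struct model_task *t)
{
	struct model_upid_ctx *ctx = t->upid_ctx;

	if (!ctx || !ctx->waiting)
		return;

	ctx->node.prev->next = ctx->node.next;
	ctx->node.next->prev = ctx->node.prev;
	ctx->node.next = ctx->node.prev = &ctx->node;
	ctx->waiting = false;
}
```

The waiting flag makes a second removal of the same task a no-op, which 
matters if unregister and exit can both reach this path.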

Thanks,

Sohil


