linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [RFC PATCH] ipc/mqueue: avoid sleep after wakeup
       [not found] <20210514030130.3253-1-hdanton@sina.com>
@ 2021-05-14 15:51 ` Manfred Spraul
       [not found] ` <20210515040613.12820-1-hdanton@sina.com>
  1 sibling, 0 replies; 2+ messages in thread
From: Manfred Spraul @ 2021-05-14 15:51 UTC (permalink / raw)
  To: Hillf Danton, Davidlohr Bueso
  Cc: Matthias von Faber, Varad Gautam, Christian Brauner,
	Oleg Nesterov, Eric W. Biederman, Andrew Morton, LKML

Hi Hillf,

On 5/14/21 5:01 AM, Hillf Danton wrote:
> The pipeline waker could start doing its job once waiter releases lock and
> get the work done before waiter takes a nap, so check wait condition before
> sleep to avoid waiting the wakeup that will never come, though that does not
> hurt much thanks to timer timeouts like a second.

First: The timeout could be infinity, thus the code must not rely on a 
timeout wakeup.

A wrong wait is would be a bug.


>
> Check signal for the same reason.
>
> Signed-off-by: Hillf Danton <hdanton@sina.com>
> ---
>
> --- y/ipc/mqueue.c
> +++ x/ipc/mqueue.c
> @@ -710,15 +710,24 @@ static int wq_sleep(struct mqueue_inode_
>   		__set_current_state(TASK_INTERRUPTIBLE);
>   
>   		spin_unlock(&info->lock);
> -		time = schedule_hrtimeout_range_clock(timeout, 0,
> -			HRTIMER_MODE_ABS, CLOCK_REALTIME);
>   

I do not see a bug:

We do the __set_current_state() while holding the spinlock. If there is 
a wakeup, then the wakeup will change current->state to TASK_RUNNING.

schedule() will not remove us from the run queue when current->state is 
TASK_RUNNING. The same applies if there are pending signals: schedule() 
checks for pending signals and sets current->state to TASK_RUNNING.

Since the __set_current_state() is done while we hold info->lock, and 
since the wakeup cannot happen before we have dropped the lock [because 
the task that wakes us up needs the same lock], I do not see how a 
wakeup could be lost.

Thus: Which issue do you see?

--

     Manfred


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [RFC PATCH] ipc/mqueue: avoid sleep after wakeup
       [not found] ` <20210515040613.12820-1-hdanton@sina.com>
@ 2021-05-15  5:41   ` Manfred Spraul
  0 siblings, 0 replies; 2+ messages in thread
From: Manfred Spraul @ 2021-05-15  5:41 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Davidlohr Bueso, Matthias von Faber, Varad Gautam,
	Christian Brauner, Oleg Nesterov, Eric W. Biederman,
	Andrew Morton, LKML

On 5/15/21 6:06 AM, Hillf Danton wrote:
> On Fri, 14 May 2021 17:51:47 +0200  Manfred Spraul wrote:
>> On 5/14/21 5:01 AM, Hillf Danton wrote:
>>> The pipeline waker could start doing its job once waiter releases lock and
>>> get the work done before waiter takes a nap, so check wait condition before
>>> sleep to avoid waiting the wakeup that will never come, though that does not
>>> hurt much thanks to timer timeouts like a second.
>> First: The timeout could be infinity, thus the code must not rely on a
>> timeout wakeup.
>>
>> A wrong wait is would be a bug.
>>
>>
>>> Check signal for the same reason.
>>>
>>> Signed-off-by: Hillf Danton <hdanton@sina.com>
>>> ---
>>>
>>> --- y/ipc/mqueue.c
>>> +++ x/ipc/mqueue.c
>>> @@ -710,15 +710,24 @@ static int wq_sleep(struct mqueue_inode_
>>>    		__set_current_state(TASK_INTERRUPTIBLE);
>>>    
>>>    		spin_unlock(&info->lock);
>>> -		time = schedule_hrtimeout_range_clock(timeout, 0,
>>> -			HRTIMER_MODE_ABS, CLOCK_REALTIME);
>>>    
>> I do not see a bug:
>>
>> We do the __set_current_state() while holding the spinlock. If there is
>> a wakeup, then the wakeup will change current->state to TASK_RUNNING.
> Correct.
>> schedule() will not remove us from the run queue when current->state is
>> TASK_RUNNING. The same applies if there are pending signals: schedule()
>> checks for pending signals and sets current->state to TASK_RUNNING.
>>
>> Since the __set_current_state() is done while we hold info->lock, and
>> since the wakeup cannot happen before we have dropped the lock [because
>> the task that wakes us up needs the same lock], I do not see how a
>> wakeup could be lost.
>>
>> Thus: Which issue do you see?
> 	waiter		waker
> 	----		----
> 	unlock
> 			lock
> 	irq		set STATE_READY
> 	softirq		unlock
> 			wakeup
> 	sleep a tick
> 	  schedule();
>
> No need to schedule given READY.

This is not possible to avoid:

	waiter		waker
	----		----
	unlock
	schedule();
	  calls __schedule()
	   <before rq_lock()>

  	                lock
			set STATE_READY
			unlock
			wakeup
			--> set waiter->state = TASK_RUNNING

Now the run queue will be evaluated even though there is strictly speaking no need to do that.
Changes in ipc/sem.c can't solve that: From what I see, the majority of the critical window is in kernel/sched/*.c and not in ipc/sem.c
I do not consider it as useful to add complexity just to reduce the size of a extremely rare event.

Especially: No harm is done. User space can be preempted at any time, so the kernel is always allowed to check the run queue.

--
	Manfred


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-05-15  5:41 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210514030130.3253-1-hdanton@sina.com>
2021-05-14 15:51 ` [RFC PATCH] ipc/mqueue: avoid sleep after wakeup Manfred Spraul
     [not found] ` <20210515040613.12820-1-hdanton@sina.com>
2021-05-15  5:41   ` Manfred Spraul

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).