From: "Kohli, Gaurav" <gkohli@codeaurora.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de, mpe@ellerman.id.au, mingo@kernel.org,
	bigeasy@linutronix.de, linux-kernel@vger.kernel.org,
	linux-arm-msm@vger.kernel.org,
	Neeraj Upadhyay <neeraju@codeaurora.org>,
	Will Deacon <will.deacon@arm.com>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup
Date: Tue, 1 May 2018 13:20:26 +0530
Message-ID: <3af3365b-4e3f-e388-8e90-45a3bd4120fd@codeaurora.org>
In-Reply-To: <20180430111744.GE4082@hirez.programming.kicks-ass.net>

Sorry for the spam; adding the list.

On 4/30/2018 4:47 PM, Peter Zijlstra wrote:
> On Thu, Apr 26, 2018 at 09:23:25PM +0530, Kohli, Gaurav wrote:
>> On 4/26/2018 2:27 PM, Peter Zijlstra wrote:
>>
>>> On Thu, Apr 26, 2018 at 10:41:31AM +0200, Peter Zijlstra wrote:
>>>> diff --git a/kernel/kthread.c b/kernel/kthread.c
>>>> index cd50e99202b0..4b6503c6a029 100644
>>>> --- a/kernel/kthread.c
>>>> +++ b/kernel/kthread.c
>>>> @@ -177,12 +177,13 @@ void *kthread_probe_data(struct task_struct *task)
>>>>    static void __kthread_parkme(struct kthread *self)
>>>>    {
>>>> -	__set_current_state(TASK_PARKED);
>>>> -	while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
>>>> +	for (;;) {
>>>> +		__set_task_state(TASK_PARKED);
>>> 		set_current_state(TASK_PARKED);
>>>
>>> of course..
>>
>> Hi Peter,
>>
>> Maybe I am missing something, but that race can still occur, as we don't treat TASK_PARKED as a special state.
>>
>> Controller                    Hotplug
>>
>>                               Loop
>>                               TASK_INTERRUPTIBLE
>> Set SHOULD_PARK
>> wakeup -> Proceeds
>>                               Set Running
>>                               kthread_parkme
>>                               SET TASK_PARKED
>>                               schedule
>> Set TASK_RUNNING
>>
>> Can you please correct me if I have misunderstood this?
> 
> If that could happen, all wait-loops would be broken. However,
> AFAICT that cannot happen, because ttwu_remote() and schedule()
> serialize on rq->lock. See:
> 
> 
> A						B
> 
> for (;;) {
> 	set_current_state(UNINTERRUPTIBLE);
> 
> 						cond1 = true;
> 						wake_up_process(A)
> 						  lock(A->pi_lock)
> 						  smp_mb__after_spinlock()
> 						  if (A->state & TASK_NORMAL)
> 						    A->on_rq && ttwu_remote()
> 	if (cond1) // true
> 		break;
> 	schedule();
> }
> __set_current_state(RUNNING);
> 

Hi Peter,

Sorry for the late reply; I was on leave.

Thanks for the new patches. We will apply them and test whether the issue
still reproduces.

But in our older case, where we saw the failure, the wakeup path and ftrace
output are below. The wakeup occurred and completed before the schedule()
call.

So the final state of the cpuhp thread is running, not parked. I have also
pasted the debug ftrace output that we captured while reproducing the issue.

The wakeup path for cpuhp is:

takedown_cpu -> kthread_park -> wake_up_process


39,034,311,742,395  apps (10240)  Trace Printk  cpuhp/0  (16)  [000]  39015.625000: <debug> __kthread_parkme state=512 task=ffffffcc7458e680 flags: 0x5
  -> state 5 -> state is parked inside parkme function

39,034,311,846,510  apps (10240)  Trace Printk  cpuhp/0  (16)  [000]  39015.625000: <debug> before schedule __kthread_parkme state=0 task=ffffffcc7458e680 flags: 0xd
  -> just before schedule call, state is running

static void __kthread_parkme(struct kthread *self)
{
        __set_current_state(TASK_PARKED);
        while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
                if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
                        complete(&self->parked);
                schedule();
                __set_current_state(TASK_PARKED);
        }
        clear_bit(KTHREAD_IS_PARKED, &self->flags);
        __set_current_state(TASK_RUNNING);
}

So my point here is the same: if the thread is rescheduled, it can set
TASK_PARKED again, but it seems that after the takedown_cpu call this thread
never gets a chance to run, so its final state is TASK_RUNNING.

With the current fix as well, can't we observe the same scenario, where the
final state is TASK_RUNNING?
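
To make the interleaving in question concrete, here is a minimal userspace
sketch that replays it step by step. It is not kernel code: the model_* names,
the split of the wakeup into a check step and a later store, and the flag bit
positions are assumptions chosen to match the diagram and the trace values
above (state=512/0, flags 0x5/0xd). It does not model pi_lock or rq->lock, so
it only shows what the trace would look like if that ordering were possible,
not that the kernel allows it.

```c
#include <stdbool.h>

/* State values: 512 and 0 are taken from the trace above. */
enum {
	TASK_RUNNING         = 0,
	TASK_INTERRUPTIBLE   = 1,
	TASK_UNINTERRUPTIBLE = 2,
	TASK_PARKED          = 512,
};
#define TASK_NORMAL (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* Assumed flag bits, picked so that the flags read 0x5 and then 0xd. */
enum {
	F_IS_PER_CPU  = 1u << 0,
	F_SHOULD_PARK = 1u << 2,
	F_IS_PARKED   = 1u << 3,
};

/* Hypothetical stand-in for the few task fields this replay needs. */
struct model_task {
	int state;
	unsigned int flags;
	bool on_rq;
};

struct model_result {
	int state_in_parkme;
	unsigned int flags_in_parkme;
	int state_before_schedule;
	unsigned int flags_before_schedule;
	bool descheduled;
};

/* Waker, step 1: would wake_up_process() act on this task right now? */
bool model_wake_check(const struct model_task *t)
{
	return (t->state & TASK_NORMAL) != 0;
}

/* Waker, step 2: the store of TASK_RUNNING, possibly much later. */
void model_wake_commit(struct model_task *t)
{
	t->state = TASK_RUNNING;
}

/* schedule() only takes the task off the run queue if it is not RUNNING. */
bool model_schedule(struct model_task *t)
{
	if (t->state == TASK_RUNNING)
		return false;
	t->on_rq = false;
	return true;
}

struct model_result replay_reported_race(void)
{
	/* Hotplug thread is in its wait loop. */
	struct model_task t = { TASK_INTERRUPTIBLE, F_IS_PER_CPU, true };
	struct model_result r;

	t.flags |= F_SHOULD_PARK;          /* controller: kthread_park() */
	bool wake = model_wake_check(&t);  /* controller: wakeup proceeds */

	t.state = TASK_RUNNING;            /* hotplug: leaves its wait loop */
	t.state = TASK_PARKED;             /* hotplug: __kthread_parkme() */
	r.state_in_parkme = t.state;
	r.flags_in_parkme = t.flags;

	t.flags |= F_IS_PARKED;            /* hotplug: complete(&parked) */

	if (wake)
		model_wake_commit(&t);     /* controller: late store */

	r.state_before_schedule = t.state;
	r.flags_before_schedule = t.flags;
	r.descheduled = model_schedule(&t);
	return r;
}
```

Calling replay_reported_race() gives state 512 with flags 0x5 inside parkme
and state 0 with flags 0xd just before schedule(), and the modelled schedule()
then leaves the task on the run queue.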

Regards,
Gaurav

> for (;;) {
> 	set_current_state(UNINTERRUPTIBLE);
> 	if (cond2)
> 		break;
> 
> 	schedule();
> 	  lock(rq->lock)
> 	  smp_mb__after_spinlock();
> 	  deactivate_task(A);
> 	  <sched-out>
> 	  unlock(rq->lock);
> 						      rq = __task_rq_lock(A)
> 						      if (A->on_rq) // false
> 						        A->state = TASK_RUNNING;
> 						      __task_rq_unlock(rq)
> 
> 
> Either A's schedule() must observe RUNNING (not shown) or B must
> observe !A->on_rq (shown) and not issue the store.

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, 
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
