From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756438AbcIGMfz (ORCPT <rfc822;w@1wt.eu>);
        Wed, 7 Sep 2016 08:35:55 -0400
Received: from mx1.redhat.com ([209.132.183.28]:42558 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752961AbcIGMfy (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 7 Sep 2016 08:35:54 -0400
Date: Wed, 7 Sep 2016 14:35:11 +0200
From: Oleg Nesterov <oleg@redhat.com>
To: chengchao <chengchao@kedacom.com>
Cc: mingo@kernel.org, peterz@infradead.org, tj@kernel.org,
        akpm@linux-foundation.org, chris@chris-wilson.co.uk,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched/core: simpler function for sched_exec migration
Message-ID: <20160907123511.GA1132@redhat.com>
References: <1473056403-7877-1-git-send-email-chengchao@kedacom.com> <20160905131147.GA8552@redhat.com> <db5c6fcd-ae5d-0f41-2d45-d161421cf9c4@kedacom.com> <20160906152253.GB17586@redhat.com> <89a992af-67cd-91b4-8890-a19ccb251fe6@kedacom.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <89a992af-67cd-91b4-8890-a19ccb251fe6@kedacom.com>
User-Agent: Mutt/1.5.18 (2008-05-17)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Wed, 07 Sep 2016 12:35:53 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/07, chengchao wrote:
>
> Oleg, thank you very much.
>
> on 09/06/2016 11:22 PM, Oleg Nesterov wrote:
> > On 09/06, chengchao wrote:
> >>
> >> the key point is for CONFIG_PREEMPT_NONE=y,
> >> ...
> >> it is too much overhead for one task(fork()+exec()), isn't it?
> >
> > Yes, yes, I see, this is suboptimal. Not sure we actually do care,
> > but yes, perhaps another helper which migrates the current task makes
> > sense, I dunno.
>
> for CONFIG_PREEMPT_NONE=y, this patch wants the stopper thread to migrate the current
> task successfully instead of doing nothing.

I understand the intent. But I am not sure this optimization makes
sense.

> > So you need something like
> >
> > 	void stop_one_cpu_sync(cpu_stop_fn_t fn, void *arg)
> > 	{
> > 		struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = NULL };
> >
> > 		preempt_disable();
> > 		cpu_stop_queue_work(raw_smp_processor_id(), &work);
> > 		preempt_enable_no_resched();
> > 		schedule();
> > 	}
> >
>
> > or I am totally confused. Note that it doesn't (and shouldn't) have
> > the "int cpu" argument.
> >
>
>
> if preemption happens after preempt_enable_no_resched(),

This doesn't differ from an explicit schedule() call. Either way the
stopper thread will preempt us on the same CPU.

> there is still a risk that
> stop_one_cpu_sync() returns before the stopper thread can use cpu_stop_work safely,
> as you said previously.

No.


However, there is another problem. It can race with another
stop_one_cpu(migration_cpu_stop) which comes between preempt_disable()
and cpu_stop_queue_work(). So the caller can still migrate to another
CPU right after preempt_enable_no_resched() and run before the
stopper thread completes the cpu_stop_work queued by us.
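
To make the ordering concrete, here is a toy userspace model of that
race. It is only a sketch, not kernel code: the names work_kind,
model_task and caller_can_run_early are hypothetical, and it assumes
the stopper handles its queue strictly in FIFO order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, not kernel code: every name here is hypothetical. */
enum work_kind {
	WORK_MIGRATE_CALLER,	/* migration_cpu_stop queued by someone else */
	WORK_FROM_CALLER	/* the on-stack work queued by the caller */
};

struct model_task {
	int cpu;		/* CPU the caller is currently runnable on */
};

/*
 * Replay the stopper queue of stopper_cpu in FIFO order and report whether
 * the caller is already runnable on another CPU at the moment its own work
 * item (which lives on the caller's stack) is about to be processed.
 */
static bool caller_can_run_early(const enum work_kind *queue, int n,
				 int stopper_cpu, int other_cpu)
{
	struct model_task caller = { .cpu = stopper_cpu };
	int i;

	assert(queue != NULL);

	for (i = 0; i < n; i++) {
		/* Our work is next: has the caller left this CPU already? */
		if (queue[i] == WORK_FROM_CALLER)
			return caller.cpu != stopper_cpu;
		/* The competing migration request moves the caller away. */
		caller.cpu = other_cpu;
	}
	return false;
}
```

With the queue { WORK_MIGRATE_CALLER, WORK_FROM_CALLER } the model
reports true: the caller can run on the other CPU while its work is
still pending. With only { WORK_FROM_CALLER } it reports false.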

> int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
> {
>         struct cpu_stop_done done;
>         struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = &done };
>
>         cpu_stop_init_done(&done, 1);
>         if (!cpu_stop_queue_work(cpu, &work))
>                 return -ENOENT;
>
> #if defined(CONFIG_PREEMPT_NONE)
>         /*
>          * let the stopper thread run as soon as possible,
>          * and keep current TASK_RUNNING.
>          */
>         schedule();
> #endif
>         wait_for_completion(&done.completion);
>         return done.ret;
> }

Agreed, this looks better, although I'd suggest _cond_resched().
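
One possible reading of the difference, as a toy userspace model rather
than kernel code: an unconditional schedule() always enters the
scheduler, while _cond_resched() does so only when a reschedule is
pending and preemption is not disabled. The names model_cpu,
model_schedule and model_cond_resched are hypothetical, and the model
assumes that waking the stopper on the local CPU marks the caller as
needing a reschedule while waking it on a remote CPU does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, not kernel code: all names below are hypothetical. */
struct model_cpu {
	bool need_resched;	/* stands in for a pending resched on the caller */
	int preempt_count;	/* non-zero means preemption is disabled */
	int sched_calls;	/* how many times the scheduler was entered */
};

/* Unconditional variant: always enters the scheduler. */
static void model_schedule(struct model_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->need_resched = false;
	cpu->sched_calls++;
}

/* Conditional variant: enters the scheduler only if a resched is pending. */
static int model_cond_resched(struct model_cpu *cpu)
{
	assert(cpu != NULL);
	if (cpu->need_resched && cpu->preempt_count == 0) {
		model_schedule(cpu);
		return 1;
	}
	return 0;
}
```

In this model, a caller that queued work for its own CPU (need_resched
set) yields once, and a caller that queued work for a remote CPU
(need_resched clear) does not enter the scheduler at all.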

Again, I am not sure this makes sense, I leave this to maintainers.

Oleg.