From: Hillf Danton
To: Suren Baghdasaryan
Cc: Pavan Kondeti, Johannes Weiner, linux-mm@kvack.org, linux-kernel@vger.kernel.org, quic_charante@quicinc.com
Subject: Re: PSI idle-shutoff
Date: Tue, 11 Oct 2022 19:38:18 +0800
Message-Id: <20221011113818.340-1-hdanton@sina.com>

On 10 Oct 2022 14:16:26 -0700 Suren Baghdasaryan
> On Mon, Oct 10, 2022 at 3:57 AM Hillf Danton wrote:
> > On 13 Sep 2022 19:38:17 +0530 Pavan Kondeti
> > > Hi
> > >
> > > Because psi_avgs_work()->collect_percpu_times()->get_recent_times()
> > > runs from a kworker thread, the PSI_NONIDLE condition would be observed,
> > > as there is a RUNNING task. So we would always end up re-arming the work.
> > >
> > > If the work is re-armed from psi_avgs_work() itself, the backing off
> > > logic in psi_task_change() (will be moved to psi_task_switch soon) can't
> > > help. The work is already scheduled, so we don't do anything there.
> > >
> > > Probably I am missing something here. Can you please clarify how we
> > > shut off re-arming the psi avg work?
> >
> > Instead of open coding schedule_delayed_work() in a bid to check if the timer
> > hits the idle task (see delayed_work_timer_fn()), the idle task is tracked
> > in psi_task_switch() and checked by the kworker to see if it preempted the
> > idle task.
> >
> > Only for thoughts now.
> >
> > Hillf
> >
> > +++ b/kernel/sched/psi.c
> > @@ -412,6 +412,8 @@ static u64 update_averages(struct psi_gr
> >  	return avg_next_update;
> >  }
> >
> > +static DEFINE_PER_CPU(int, prev_task_is_idle);
> > +
> >  static void psi_avgs_work(struct work_struct *work)
> >  {
> >  	struct delayed_work *dwork;
> > @@ -439,7 +441,7 @@ static void psi_avgs_work(struct work_st
> >  	if (now >= group->avg_next_update)
> >  		group->avg_next_update = update_averages(group, now);
> >
> > -	if (nonidle) {
> > +	if (nonidle && 0 == per_cpu(prev_task_is_idle, raw_smp_processor_id())) {
>
> This condition would be incorrect if nonidle was set by a cpu other
> than raw_smp_processor_id() and
> prev_task_is_idle[raw_smp_processor_id()] == 0.

Thanks for taking a look.

> IOW, if some activity happens on a non-current cpu, we would fail to
> reschedule psi_avgs_work for it.

Given activities on remote CPUs, can you specify what prevents psi_avgs_work
from being scheduled on remote CPUs if, for example, the local CPU has been
idle for a second?

> This can be fixed in collect_percpu_times() by
> considering prev_task_is_idle for all other CPUs as well. However
> Chengming's approach seems simpler to me TBH and does not require an
> additional per-cpu variable.

Good ideas are always welcome.
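To make the remote-CPU case concrete, here is a toy userspace model of the
re-arm decision. It is only a sketch, not kernel code: struct cpu_state,
NR_CPUS and the three helper functions are hypothetical names made up for
illustration, and the "all CPUs" variant is just one possible reading of
"considering prev_task_is_idle for all other CPUs as well". It follows the
"IOW" sentence above: the local CPU's kworker preempted the idle task while
another CPU did real work.

```c
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Toy stand-in for per-CPU state; not a kernel structure. */
struct cpu_state {
	bool nonidle;		/* CPU accumulated PSI_NONIDLE time this period */
	bool prev_task_is_idle;	/* task switched out on this CPU was the idle task */
};

/* Aggregate nonidle over all CPUs, as the kworker sees it after collection. */
static bool any_nonidle(const struct cpu_state *cpus)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (cpus[i].nonidle)
			return true;
	return false;
}

/* Re-arm test shaped like the posted diff: only the local CPU's flag is read. */
static bool rearm_local_only(const struct cpu_state *cpus, int this_cpu)
{
	return any_nonidle(cpus) && !cpus[this_cpu].prev_task_is_idle;
}

/* Assumed variant: re-arm if any nonidle CPU did not just leave the idle task. */
static bool rearm_all_cpus(const struct cpu_state *cpus)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (cpus[i].nonidle && !cpus[i].prev_task_is_idle)
			return true;
	return false;
}
```

With CPU 0 marked nonidle only because of the kworker (prev_task_is_idle set)
and CPU 1 marked nonidle by a real task, rearm_local_only(cpus, 0) declines to
re-arm while rearm_all_cpus(cpus) re-arms. Whether the activity on CPU 1 would
queue the work by itself is the open question asked above; the model does not
answer it.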