From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RFC PATCH 0/4] Reduce worst-case scanning of runqueues in select_idle_sibling
To: Mel Gorman, Vincent Guittot
Cc: LKML, Barry Song, Ingo Molnar, Peter Zijlstra, Juri Lelli, Valentin Schneider, Linux-ARM
References: <20201207091516.24683-1-mgorman@techsingularity.net> <20201207154216.GE3371@techsingularity.net>
From: "Li, Aubrey"
Message-ID: <895d0c8a-5039-e569-80f3-a8a6f87380bd@linux.intel.com>
Date: Tue, 8 Dec 2020 10:06:37 +0800
In-Reply-To: <20201207154216.GE3371@techsingularity.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/12/7 23:42, Mel Gorman wrote:
> On Mon, Dec 07, 2020 at 04:04:41PM +0100, Vincent Guittot wrote:
>> On Mon, 7 Dec 2020 at 10:15, Mel Gorman wrote:
>>>
>>> This is a minimal series to reduce the amount of runqueue scanning in
>>> select_idle_sibling in the worst case.
>>>
>>> Patch 1 removes SIS_AVG_CPU because it's unused.
>>>
>>> Patch 2 improves the hit rate of p->recent_used_cpu to reduce the amount
>>> of scanning. It should be relatively uncontroversial
>>>
>>> Patch 3-4 scans the runqueues in a single pass for select_idle_core()
>>> and select_idle_cpu() so runqueues are not scanned twice. It's
>>> a tradeoff because it benefits deep scans but introduces overhead
>>> for shallow scans.
>>>
>>> Even if patch 3-4 is rejected to allow more time for Aubrey's idle cpu mask
>>
>> patch 3 looks fine and doesn't collide with Aubrey's work. But I don't
>> like patch 4 which manipulates different cpumask including
>> load_balance_mask out of LB and I prefer to wait for v6 of Aubrey's
>> patchset which should fix the problem of possibly scanning twice busy
>> cpus in select_idle_core and select_idle_cpu
>>
>
> Seems fair, we can see where we stand after V6 of Aubrey's work. A lot
> of the motivation for patch 4 would go away if we managed to avoid calling
> select_idle_core() unnecessarily.
> As it stands, we can call it a lot from
> hackbench even though the chance of getting an idle core are minimal.
>

Sorry for the delay, I sent v6 out just now. Compared to v5, v6 follows
Vincent's suggestion to decouple the idle cpumask update from the stop_tick
signal; that is, the CPU is set in the idle cpumask every time the CPU enters
idle. This should address Peter's concern about the Facebook tail-latency
workload, as I didn't see any regression in the 99.0000th percentile latency
reported by schbench.

However, I also didn't see any significant benefit so far; I should probably
put more load on the system. I'll do more characterization of the uperf
workload to see if I can find anything.

Thanks,
-Aubrey
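
For readers following the thread, the idle cpumask idea discussed above can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not the kernel patch: the names IdleMask, enter_idle, mark_busy and pick_idle_cpu are hypothetical, and the point at which a busy CPU is cleared from the mask is assumed rather than taken from the series. It shows why a wakeup scan restricted to the mask never visits CPUs known to be busy.

```python
class IdleMask:
    """Toy model of an idle cpumask: bit n set means CPU n is idle.

    Hypothetical illustration only; it does not mirror kernel data structures.
    """

    def __init__(self, nr_cpus):
        self.nr_cpus = nr_cpus
        self.bits = 0

    def enter_idle(self, cpu):
        # As described for v6: the CPU is set in the mask every time it
        # enters idle, independent of whether the tick is stopped.
        self.bits |= 1 << cpu

    def mark_busy(self, cpu):
        # Assumption: some other path clears the bit once the CPU is busy.
        self.bits &= ~(1 << cpu)

    def pick_idle_cpu(self, allowed, target):
        """Return the first CPU at or after `target` (wrapping around) that is
        both in `allowed` and marked idle, or -1 if there is none."""
        candidates = allowed & self.bits
        for i in range(self.nr_cpus):
            cpu = (target + i) % self.nr_cpus
            if candidates & (1 << cpu):
                return cpu
        return -1


if __name__ == "__main__":
    mask = IdleMask(nr_cpus=8)
    mask.enter_idle(2)
    mask.enter_idle(5)
    mask.mark_busy(2)
    print(mask.pick_idle_cpu(allowed=0xFF, target=1))  # prints 5
```

In this model the scan cost depends on how many CPUs are marked idle rather than on how many exist, which is the property the thread hopes will remove the need to scan busy CPUs twice.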