Subject: Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ
To: Vincent Guittot
Cc: Qais Yousef, Joel Fernandes, linux-kernel, Paul McKenney, Frederic
 Weisbecker, Dietmar Eggeman, Ben Segall, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli, Mel Gorman, Peter Zijlstra, Steven Rostedt, "Uladzislau Rezki (Sony)", Neeraj upadhyay, Aubrey Li
References: <4aa674d9-db49-83d5-356f-a20f9e2a7935@linux.intel.com>
 <2d2294ce-f1d1-f827-754b-4541c1b43be8@linux.intel.com>
 <577b0aae-0111-97aa-0c99-c2a2fcfb5e2e@linux.intel.com>
 <20210512135955.suzvxxfilvwg33y2@e107158-lin.cambridge.arm.com>
 <729718fd-bd2c-2e0e-46f5-8027281e5821@linux.intel.com>
From: Tim Chen
Message-ID: <366aa93b-ecbf-ac0f-cd9e-3376b20d4929@linux.intel.com>
Date: Fri, 18 Jun 2021 09:14:07 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/18/21 3:28 AM, Vincent Guittot wrote:
>>
>> The current logic is when a CPU becomes idle, next_balance occur very
>> shortly (usually in the next jiffie) as get_sd_balance_interval returns
>> the next_balance in the next jiffie if the CPU is idle. However, in
>> reality, I saw most CPUs are 95% busy on average for my workload and
>> a task will wake up on an idle CPU shortly. So having frequent idle
>> balancing towards shortly idle CPUs is counter productive and simply
>> increase overhead and does not improve performance.
>
> Just to make sure that I understand your problem correctly: Your problem is:
> - that we have an ilb happening on the idle CPU and consume cycle

That's right. The cycles are consumed heavily in update_blocked_averages()
when cgroup is enabled.

> - or that the ilb will pull a task on an idle CPU on which a task will
> shortly wakeup which ends to 2 tasks competing for the same CPU.
>

Because for the OLTP workload I'm looking at, we have tasks that sleep
for a short while and wake again very shortly (i.e.
the CPU is actually ~95% busy on average), pulling tasks to such a CPU
does not really help improve overall CPU utilization in the system. So my
intuition is that for such an almost fully busy CPU, we should defer load
balancing to it (see prototype patch 3).

Tim
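
To make the intuition above concrete, here is a minimal user-space sketch in
Python. It is not prototype patch 3 and it does not use any kernel API. The
class CpuModel, the decay factor, the 0.9 threshold and the 8-jiffy deferred
interval are all hypothetical values chosen for illustration. The idea is
only this: track a decayed average of how busy a CPU has been, and when that
average is high, do not schedule the next idle balance in the very next jiffy.

```python
from dataclasses import dataclass


@dataclass
class CpuModel:
    """Toy per-CPU state. All names are hypothetical, not kernel symbols."""

    busy_avg: float = 0.0  # decayed fraction of time the CPU was busy, in [0, 1]
    decay: float = 0.9     # weight kept from the previous average per sample

    def sample(self, busy_fraction: float) -> None:
        """Fold one observation window into the decayed busy average."""
        self.busy_avg = self.decay * self.busy_avg + (1.0 - self.decay) * busy_fraction

    def should_defer_idle_balance(self, threshold: float = 0.9) -> bool:
        """Assumed policy: defer balancing toward a CPU that is almost always busy."""
        return self.busy_avg >= threshold


def simulate(busy_fraction: float, samples: int) -> CpuModel:
    """Feed a constant busy fraction for a number of observation windows."""
    cpu = CpuModel()
    for _ in range(samples):
        cpu.sample(busy_fraction)
    return cpu


def next_balance_delay_jiffies(cpu: CpuModel, deferred_jiffies: int = 8) -> int:
    """Return 1 jiffy for a genuinely idle CPU, a longer (assumed) delay otherwise."""
    return deferred_jiffies if cpu.should_defer_idle_balance() else 1


if __name__ == "__main__":
    for fraction in (0.20, 0.95):
        cpu = simulate(fraction, 100)
        print(f"busy={fraction:.2f} busy_avg={cpu.busy_avg:.3f} "
              f"delay={next_balance_delay_jiffies(cpu)} jiffies")
```

With these assumed parameters, a CPU that is busy 95% of the time ends up
above the threshold and gets the longer delay, while a CPU that is busy 20%
of the time keeps the one-jiffy behaviour. Where the threshold should sit for
a real workload is an open question that this sketch does not answer.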