From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] nohz_full: Make sched_should_stop_tick() more conservative
To: Rik van Riel, Frederic Weisbecker, Christoph Lameter, Ingo Molnar, Luiz Capitulino, Peter Zijlstra, Thomas Gleixner, Viresh Kumar
References: <1459539771-4251-1-git-send-email-cmetcalf@mellanox.com> <1459797143.6219.22.camel@redhat.com>
From: Chris Metcalf
Message-ID: <5702C126.1030904@mellanox.com>
Date: Mon, 4 Apr 2016 15:31:50 -0400
In-Reply-To: <1459797143.6219.22.camel@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/4/2016 3:12 PM, Rik van Riel wrote:
> On Fri, 2016-04-01 at 15:42 -0400, Chris Metcalf wrote:
>> On arm64, when calling enqueue_task_fair() from migration_cpu_stop(),
>> we find the nr_running value updated by add_nr_running(), but the
>> cfs.nr_running value has not always yet been updated. Accordingly,
>> the sched_can_stop_tick() false returns true when we are migrating a
>> second task onto a core.
> I don't get it.
>
> Looking at the enqueue_task_fair(), I see this:
>
>         for_each_sched_entity(se) {
>                 cfs_rq = cfs_rq_of(se);
>                 cfs_rq->h_nr_running++;
>                 ...
>         }
>
>         if (!se)
>                 add_nr_running(rq, 1);
>
> What is the difference between cfs_rq->h_nr_running,
> and rq->cfs.nr_running?
>
> Why do we have two?
> Are we simply testing against the wrong one in
> sched_can_stop_tick?

It seems that using the non-CFS one is what we want. I don't know
whether using a different CFS count instead might be more correct.
Since I'm not sure what causes the difference I see between tile
(correct) and arm64 (incorrect), it's hard for me to speculate.

>> Correct this by using rq->nr_running instead of rq->cfs.nr_running.
>> This should always be more conservative, and reverts the test to the
>> form it had before commit 76d92ac305f2 ("sched: Migrate sched to use
>> new tick dependency mask model").
> That would cause us to run the timer tick while running
> a single SCHED_RR real time task, with a single
> SCHED_OTHER task sitting in the background (which will
> not get run until the SCHED_RR task is done).

No, because in sched_can_stop_tick(), we first handle the special
cases of RR or FIFO tasks present. For example, RR:

        if (rq->rt.rr_nr_running) {
                if (rq->rt.rr_nr_running == 1)
                        return true;
                else
                        return false;
        }

Once we see there are any RR tasks running, the return value ignores
any possible SCHED_OTHER tasks. Only after the code concludes there
are no RR/FIFO tasks do we even look at the overall nr_running value.

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com
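
For readers following the thread, the check order described above can be modeled in a few lines of standalone C. This is only a sketch under stated assumptions, not the kernel source: struct model_rq, struct model_rt_rq and model_can_stop_tick() are hypothetical names, the deadline and FIFO handling is left out, and the final test uses the overall nr_running count as the patch proposes.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's run-queue counters; these are
 * NOT the real struct rq / struct rt_rq definitions. */
struct model_rt_rq {
    unsigned int rr_nr_running;   /* runnable SCHED_RR tasks */
};

struct model_rq {
    unsigned int nr_running;      /* all runnable tasks on this CPU */
    struct model_rt_rq rt;
};

/* Simplified model of the check order discussed in the thread.
 * Deadline and FIFO handling is deliberately omitted. */
static bool model_can_stop_tick(const struct model_rq *rq)
{
    /* The RR branch decides on its own: a single RR task lets the tick
     * stop, and any SCHED_OTHER tasks are never consulted. */
    if (rq->rt.rr_nr_running)
        return rq->rt.rr_nr_running == 1;

    /* No RR tasks: fall back to the overall count (assumed here to be
     * the rq->nr_running test that the patch proposes). */
    return rq->nr_running <= 1;
}
```

With one SCHED_RR task and one SCHED_OTHER task queued (nr_running of 2, rr_nr_running of 1), the model returns true from the RR branch, so the overall count never forces the tick to keep running in that case.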