From: Josh Don
Date: Wed, 17 Mar 2021 17:06:31 -0700
Subject: Re: [PATCH] sched: Warn on long periods of pending need_resched
To: Ingo Molnar
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Luis Chamberlain, Kees Cook, Iurii Zaikin, linux-kernel, linux-fsdevel@vger.kernel.org, David Rientjes, Oleg Rombakh, Paul Turner
References: <20210317045949.1584952-1-joshdon@google.com> <20210317082550.GA3881262@gmail.com>
In-Reply-To: <20210317082550.GA3881262@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 17, 2021 at 1:25 AM Ingo Molnar wrote:
>
> * Josh Don wrote:
>
> > If resched_latency_warn_ms is set to the default value, only one warning
> > will be produced per boot.
>
> Looks like a value hack, should probably be a separate flag,
> defaulting to warn-once.

Agreed, done.

> > This warning only exists under CONFIG_SCHED_DEBUG.
> > If it goes off, it is likely that there is a missing
> > cond_resched() somewhere.
>
> CONFIG_SCHED_DEBUG is default-y, so most distros have it enabled.

To avoid log spam for people who don't care, I was considering having
the feature disabled by default. Perhaps a better alternative is to
show only a single-line warning and not print the full backtrace by
default. Does the latter sound good to you?

> > +#ifdef CONFIG_KASAN
> > +#define RESCHED_DEFAULT_WARN_LATENCY_MS 101
> > +#define RESCHED_BOOT_QUIET_SEC 600
> > +#else
> > +#define RESCHED_DEFAULT_WARN_LATENCY_MS 51
> > +#define RESCHED_BOOT_QUIET_SEC 300
> > #endif
> > +int sysctl_resched_latency_warn_ms = RESCHED_DEFAULT_WARN_LATENCY_MS;
> > +#endif /* CONFIG_SCHED_DEBUG */
>
> I'd really just make this a single value - say 100 or 200 msecs.

Replacing both of these with a single value each (the more
conservative defaults of 100ms and 600s).

> > +static inline void resched_latency_warn(int cpu, u64 latency)
> > +{
> > +	static DEFINE_RATELIMIT_STATE(latency_check_ratelimit, 60 * 60 * HZ, 1);
> > +
> > +	WARN(__ratelimit(&latency_check_ratelimit),
> > +	     "CPU %d: need_resched set for > %llu ns (%d ticks) "
> > +	     "without schedule\n",
> > +	     cpu, latency, cpu_rq(cpu)->ticks_without_resched);
> > +}
>
> Could you please put the 'sched:' prefix into scheduler warnings.
> Let's have a bit of a namespace structure in new warnings.

Sounds good, done.
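
For illustration only, here is a rough userspace sketch of the behavior
agreed on above: a single 100ms threshold, a single 600s boot quiet
period, a separate warn-once flag that defaults to on, and a "sched:"
prefix on the message. This is not the follow-up patch. The struct and
function names (resched_warn_cfg, resched_warn_state,
resched_latency_should_warn, resched_latency_format) are hypothetical,
and the kernel's ratelimit and WARN() machinery is replaced here by a
plain boolean and snprintf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical knobs modeled on the thread; not the kernel's actual sysctls. */
struct resched_warn_cfg {
	unsigned int latency_warn_ms;	/* single threshold, no KASAN special case */
	unsigned int boot_quiet_sec;	/* suppress warnings this soon after boot */
	bool warn_once;			/* separate flag instead of a magic value */
};

static const struct resched_warn_cfg resched_warn_defaults = {
	.latency_warn_ms = 100,
	.boot_quiet_sec = 600,
	.warn_once = true,
};

/* Remembers whether a warning has already been emitted. */
struct resched_warn_state {
	bool warned;
};

/* Decide whether a pending need_resched of latency_ns deserves a warning. */
static bool resched_latency_should_warn(const struct resched_warn_cfg *cfg,
					struct resched_warn_state *st,
					uint64_t uptime_ns, uint64_t latency_ns)
{
	const uint64_t ns_per_ms = 1000000ULL;
	const uint64_t ns_per_sec = 1000000000ULL;

	/* Still inside the boot quiet period. */
	if (uptime_ns < (uint64_t)cfg->boot_quiet_sec * ns_per_sec)
		return false;

	/* Latency has not exceeded the threshold. */
	if (latency_ns <= (uint64_t)cfg->latency_warn_ms * ns_per_ms)
		return false;

	/* Warn-once mode: stay silent after the first warning. */
	if (cfg->warn_once && st->warned)
		return false;

	st->warned = true;
	return true;
}

/* Build the one-line message, with the "sched:" namespace prefix. */
static int resched_latency_format(char *buf, size_t len, int cpu,
				  uint64_t latency_ns, int ticks)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			"sched: CPU %d: need_resched set for > %llu ns (%d ticks) without schedule\n",
			cpu, (unsigned long long)latency_ns, ticks);
}
```

Keeping warn_once as its own field means the threshold can be changed
freely without also changing how often the warning fires, which is the
point of the "value hack" objection above.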