Date: Fri, 23 Oct 2020 17:47:02 -0400
From: Joel Fernandes
To: "Li, Aubrey"
Cc: Nishanth Aravamudan, Julien Desfossez, Peter Zijlstra, Tim Chen,
 Vineeth Pillai, Aaron Lu, Aubrey Li, Thomas Glexiner, LKML, Ingo Molnar,
 Linus Torvalds, Frederic Weisbecker, Kees Cook, Greg Kerr, Phil Auld,
 Valentin Schneider, Mel Gorman, Pawan Gupta, Paolo Bonzini,
 vineeth@bitbyteword.org, Chen Yu, Christian Brauner, Agata Gruza,
 Antonio Gomez Iglesias, graf@amazon.com, konrad.wilk@oracle.com,
 Dario Faggioli, Paul Turner, Steven Rostedt, Patrick Bellasi,
 benbjiang(蒋彪), Alexandre Chartre,
 James.Bottomley@hansenpartnership.com, OWeisse@umich.edu, Dhaval Giani,
 Junaid Shahid, Jesse Barnes, "Hyser,Chris", Vineeth Remanan Pillai,
 "Paul E.
 McKenney", Tim Chen, "Ning, Hongyu"
Subject: Re: [PATCH v8 -tip 02/26] sched: Introduce sched_class::pick_task()
Message-ID: <20201023214702.GA3603399@google.com>
References: <20201020014336.2076526-1-joel@joelfernandes.org>
 <20201020014336.2076526-3-joel@joelfernandes.org>
 <8ea1aa61-4a1c-2687-9f15-1062d37606c7@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 23, 2020 at 01:25:38PM +0800, Li, Aubrey wrote:
> >>> @@ -2517,6 +2528,7 @@ const struct sched_class dl_sched_class
> >>>
> >>>  #ifdef CONFIG_SMP
> >>>  	.balance		= balance_dl,
> >>> +	.pick_task		= pick_task_dl,
> >>>  	.select_task_rq		= select_task_rq_dl,
> >>>  	.migrate_task_rq	= migrate_task_rq_dl,
> >>>  	.set_cpus_allowed	= set_cpus_allowed_dl,
> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> index dbd9368a959d..bd6aed63f5e3 100644
> >>> --- a/kernel/sched/fair.c
> >>> +++ b/kernel/sched/fair.c
> >>> @@ -4450,7 +4450,7 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
> >>>  	 * Avoid running the skip buddy, if running something else can
> >>>  	 * be done without getting too unfair.
> >>>  	 */
> >>> -	if (cfs_rq->skip == se) {
> >>> +	if (cfs_rq->skip && cfs_rq->skip == se) {
> >>>  		struct sched_entity *second;
> >>>
> >>>  		if (se == curr) {
> >>> @@ -6976,6 +6976,35 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
> >>>  		set_last_buddy(se);
> >>>  }
> >>>
> >>> +#ifdef CONFIG_SMP
> >>> +static struct task_struct *pick_task_fair(struct rq *rq)
> >>> +{
> >>> +	struct cfs_rq *cfs_rq = &rq->cfs;
> >>> +	struct sched_entity *se;
> >>> +
> >>> +	if (!cfs_rq->nr_running)
> >>> +		return NULL;
> >>> +
> >>> +	do {
> >>> +		struct sched_entity *curr = cfs_rq->curr;
> >>> +
> >>> +		se = pick_next_entity(cfs_rq, NULL);
> >>> +
> >>> +		if (curr) {
> >>> +			if (se && curr->on_rq)
> >>> +				update_curr(cfs_rq);
> >>> +
> >>> +			if (!se || entity_before(curr, se))
> >>> +				se = curr;
> >>> +		}
> >>> +
> >>> +		cfs_rq = group_cfs_rq(se);
> >>> +	} while (cfs_rq);
> >>> ++
> >>> +	return task_of(se);
> >>> +}
> >>> +#endif
> >>
> >> One of my machines hangs when I run uperf with only one message:
> >> [ 719.034962] BUG: kernel NULL pointer dereference, address: 0000000000000050
> >>
> >> Then I replicated the problem on my another machine(no serial console),
> >> here is the stack by manual copy.
> >>
> >> Call Trace:
> >>  pick_next_entity+0xb0/0x160
> >>  pick_task_fair+0x4b/0x90
> >>  __schedule+0x59b/0x12f0
> >>  schedule_idle+0x1e/0x40
> >>  do_idle+0x193/0x2d0
> >>  cpu_startup_entry+0x19/0x20
> >>  start_secondary+0x110/0x150
> >>  secondary_startup_64_no_verify+0xa6/0xab
> >
> > Interesting. Wondering if we screwed something up in the rebase.
> >
> > Questions:
> > 1. Does the issue happen if you just apply only up until this patch,
> > or the entire series?
>
> I applied the entire series and just find a related patch to report the
> issue.

Ok.

> > 2. Do you see the issue in v7? Not much if at all has changed in this
> > part of the code from v7 -> v8 but could be something in the newer
> > kernel.
> >
>
> IIRC, I can run uperf successfully on v7.
> I'm on tip/master 2d3e8c9424c9 (origin/master) "Merge branch 'linus'."
> Please let me know if this is a problem, or you have a repo I can pull
> for testing.

Here is a repo with v8 series on top of v5.9 release:
https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/log/?h=coresched-v5.9

thanks,

 - Joel
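
For anyone following the quoted fair.c hunk, here is a minimal userspace
sketch of what the added cfs_rq->skip check changes. It is not kernel code:
the structs are simplified stand-ins and the helper names
(skip_matches_old, skip_matches_new, check_skip_guard) are made up for
illustration. The assumption is that se can be NULL when the skip test
runs, which the "if (!se || ...)" handling in the quoted pick_task_fair()
suggests. It only shows the comparison itself and makes no claim about the
cause of the reported oops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct sched_entity {
	unsigned long long vruntime;
};

struct cfs_rq {
	struct sched_entity *skip;	/* skip buddy, NULL when none is set */
};

/* Test before the hunk: a NULL candidate "matches" a NULL skip buddy. */
static bool skip_matches_old(const struct cfs_rq *cfs_rq,
			     const struct sched_entity *se)
{
	return cfs_rq->skip == se;
}

/* Test from the quoted hunk: only a real (non-NULL) skip buddy can match. */
static bool skip_matches_new(const struct cfs_rq *cfs_rq,
			     const struct sched_entity *se)
{
	return cfs_rq->skip && cfs_rq->skip == se;
}

/* Exercises both tests for the NULL and non-NULL cases. */
static void check_skip_guard(void)
{
	struct sched_entity a = { .vruntime = 100 };
	struct cfs_rq empty = { .skip = NULL };
	struct cfs_rq skipping = { .skip = &a };

	/* No skip buddy and no candidate: only the old test takes the branch. */
	assert(skip_matches_old(&empty, NULL));
	assert(!skip_matches_new(&empty, NULL));

	/* With a real skip buddy, both tests agree. */
	assert(skip_matches_old(&skipping, &a));
	assert(skip_matches_new(&skipping, &a));
	assert(!skip_matches_new(&skipping, NULL));
}
```

Calling check_skip_guard() from a small main() runs the assertions. With
the old comparison, the NULL == NULL case enters the skip-buddy branch even
though no skip buddy is set; the guarded form does not.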