Date: Tue, 6 Sep 2022 22:50:28 +0200
From: Peter Zijlstra
To: Waiman Long
Cc: Tejun Heo, Jing-Ting Wu, Mukesh Ojha, Valentin Schneider, wsd_upstream@mediatek.com, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, Jonathan.JMChen@mediatek.com, chris.redpath@arm.com, Dietmar Eggemann, Vincent Donnefort, Ingo Molnar, Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall, Mel Gorman, Christian Brauner, cgroups@vger.kernel.org, lixiong.liu@mediatek.com, wenju.xu@mediatek.com
Subject: Re: BUG: HANG_DETECT waiting for migration_cpu_stop() complete
In-Reply-To: <36a73401-7011-834a-7949-c65a2f66246c@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 06, 2022 at 04:40:03PM -0400, Waiman Long wrote:

I've not followed the earlier stuff due to being unreadable; just
reacting to this..

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 838623b68031..5d9ea1553ec0 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2794,9 +2794,9 @@ static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
>                 if (cpumask_equal(&p->cpus_mask, new_mask))
>                         goto out;
>
> -               if (WARN_ON_ONCE(p == current &&
> -                                is_migration_disabled(p) &&
> -                                !cpumask_test_cpu(task_cpu(p), new_mask))) {
> +               if (is_migration_disabled(p) &&
> +                   !cpumask_test_cpu(task_cpu(p), new_mask)) {
> +                       WARN_ON_ONCE(p == current);
>                         ret = -EBUSY;
>                         goto out;
>                 }
> @@ -2818,7 +2818,11 @@ static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
>         if (flags & SCA_USER)
>                 user_mask = clear_user_cpus_ptr(p);
>
> -       ret = affine_move_task(rq, p, rf, dest_cpu, flags);
> +       if (!is_migration_disabled(p) || (flags & SCA_MIGRATE_ENABLE)) {
> +               ret = affine_move_task(rq, p, rf, dest_cpu, flags);
> +       } else {
> +               task_rq_unlock(rq, p, rf);
> +       }

This cannot be right. There might be previous set_cpus_allowed_ptr()
callers that are blocked and waiting for the task to land on a valid CPU.
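
[Editor's illustration, not part of the original mail.] The objection can be
sketched with a toy userspace model. This is only a sketch under stated
assumptions: every toy_* name is hypothetical, the model assumes that the
affine_move_task() step is what releases earlier waiters once the task sits
on an allowed CPU, and it leaves out locking, the stopper thread, the -EBUSY
check and the SCA_MIGRATE_ENABLE case entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; every name here is hypothetical, not a kernel API. */
struct toy_task {
	int cpu;                  /* CPU the task currently runs on (0..31) */
	unsigned int cpus_mask;   /* bit n set: CPU n is allowed */
	bool migration_disabled;  /* stands in for is_migration_disabled(p) */
	int blocked_callers;      /* earlier callers waiting for a valid CPU */
};

/* True when the task's current CPU is inside its allowed mask. */
static bool toy_on_valid_cpu(const struct toy_task *t)
{
	return (t->cpus_mask >> t->cpu) & 1u;
}

/*
 * Stand-in for the role affine_move_task() plays in this model: once the
 * task sits on an allowed CPU, release every caller that was waiting.
 */
static void toy_affine_move(struct toy_task *t)
{
	if (toy_on_valid_cpu(t))
		t->blocked_callers = 0;
}

/*
 * Model of an affinity update. With skip_when_disabled set it mimics the
 * shape of the proposed hunk: a migration-disabled task never reaches
 * toy_affine_move(). Returns the number of callers still blocked.
 */
static int toy_set_cpus_allowed(struct toy_task *t, unsigned int new_mask,
				bool skip_when_disabled)
{
	assert(new_mask != 0);	/* an empty mask is invalid in this model */
	t->cpus_mask = new_mask;

	if (!skip_when_disabled || !t->migration_disabled)
		toy_affine_move(t);

	return t->blocked_callers;
}
```

Start from a migration-disabled task on CPU 1 with mask 0x1 and one blocked
caller, then widen the mask to 0x3. With skip_when_disabled false the call
returns 0, because the earlier waiter is released. With skip_when_disabled
true it returns 1: the task is now on a valid CPU, yet nobody wakes the
waiter, which is the hang the reply above warns about.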