From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751428AbdKUPdb (ORCPT ); Tue, 21 Nov 2017 10:33:31 -0500
Received: from mail-bn3nam01on0114.outbound.protection.outlook.com ([104.47.33.114]:30752
	"EHLO NAM01-BN3-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1751145AbdKUPd3 (ORCPT );
	Tue, 21 Nov 2017 10:33:29 -0500
Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=Joe.Korty@concurrent-rt.com;
Date: Tue, 21 Nov 2017 10:33:17 -0500
From: joe.korty@concurrent-rt.com
To: Steven Rostedt
Cc: Thomas Gleixner, Peter Zijlstra, Linux Kernel Mailing List
Subject: Re: [PATCH] 4.4.86-rt99: fix sync breakage between nr_cpus_allowed and cpus_allowed
Message-ID: <20171121153317.GA672@zipoli.concurrent-rt.com>
Reply-To: "Joe Korty"
References: <20171115192529.GA14158@zipoli.concurrent-rt.com>
	<20171117174851.2a253785@gandalf.local.home>
	<20171120163040.GA25993@zipoli.concurrent-rt.com>
	<20171120230207.19a4bc14@vmware.local.home>
	<20171120235751.0424cf23@vmware.local.home>
	<20171121143352.GA25941@zipoli.concurrent-rt.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171121143352.GA25941@zipoli.concurrent-rt.com>
User-Agent: Mutt/1.8.3 (2017-05-23)
X-Originating-IP: [12.220.59.2]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 21, 2017 at 09:33:52AM -0500, joe.korty@concurrent-rt.com wrote:
> On Mon, Nov 20, 2017 at 11:57:51PM -0500, Steven Rostedt wrote:
> > On Mon, 20 Nov 2017 23:02:07 -0500
> > Steven Rostedt wrote:
> >
> >
> > > Ideally, I would like to stay close to what upstream -rt does. Would
> > > you be able to backport the 4.11-rt patch?
> > >
> > > I'm currently working on releasing 4.9-rt and 4.4-rt with the latest
> > > backports. I could easily add this one too.
> >
> > Speaking of which. I just backported this patch to 4.4-rt. Is this what
> > you are talking about?
>
> Yes it is.
> Thanks for finding that!
> Joe

I spoke too fast.  You will need a variant of my one-liner fix
when you backport the 4.11.12-r16 patch:

    rt-Increase-decrease-the-nr-of-migratory-tasks-when-.patch

to 4.9-rt and 4.4-rt.  The fix of interest is the introduction of

    p->nr_cpus_allowed = cpumask_weight(&p->cpus_mask);

to migrate_enable_update_cpus_allowed().

Regards,
Joe

>
> > >From 1dc89be37874bfc7bb4a0ea7c45492d7db39f62b Mon Sep 17 00:00:00 2001
> > From: Sebastian Andrzej Siewior
> > Date: Mon, 19 Jun 2017 09:55:47 +0200
> > Subject: [PATCH] sched/migrate disable: handle updated task-mask mg-dis
> >  section
> >
> > If task's cpumask changes while in the task is in a migrate_disable()
> > section then we don't react on it after a migrate_enable(). It matters
> > however if current CPU is no longer part of the cpumask. We also miss
> > the ->set_cpus_allowed() callback.
> > This patch fixes it by setting task->migrate_disable_update once we this
> > "delayed" hook.
> > This bug was introduced while fixing unrelated issue in
> > migrate_disable() in v4.4-rt3 (update_migrate_disable() got removed
> > during that).
> >
> > Cc: stable-rt@vger.kernel.org
> > Signed-off-by: Sebastian Andrzej Siewior
> > Signed-off-by: Steven Rostedt (VMware)
> > ---
> >  include/linux/sched.h |    1
> >  kernel/sched/core.c   |   59 ++++++++++++++++++++++++++++++++++++++++++++------
> >  2 files changed, 54 insertions(+), 6 deletions(-)
> >
> > Index: stable-rt.git/include/linux/sched.h
> > ===================================================================
> > --- stable-rt.git.orig/include/linux/sched.h	2017-11-20 23:43:24.214077537 -0500
> > +++ stable-rt.git/include/linux/sched.h	2017-11-20 23:43:24.154079278 -0500
> > @@ -1438,6 +1438,7 @@ struct task_struct {
> >  	unsigned int policy;
> >  #ifdef CONFIG_PREEMPT_RT_FULL
> >  	int migrate_disable;
> > +	int migrate_disable_update;
> >  # ifdef CONFIG_SCHED_DEBUG
> >  	int migrate_disable_atomic;
> >  # endif
> > Index: stable-rt.git/kernel/sched/core.c
> > ===================================================================
> > --- stable-rt.git.orig/kernel/sched/core.c	2017-11-20 23:43:24.214077537 -0500
> > +++ stable-rt.git/kernel/sched/core.c	2017-11-20 23:56:05.071687323 -0500
> > @@ -1212,18 +1212,14 @@ void set_cpus_allowed_common(struct task
> >  	p->nr_cpus_allowed = cpumask_weight(new_mask);
> >  }
> >
> > -void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
> > +static void __do_set_cpus_allowed_tail(struct task_struct *p,
> > +				       const struct cpumask *new_mask)
> >  {
> >  	struct rq *rq = task_rq(p);
> >  	bool queued, running;
> >
> >  	lockdep_assert_held(&p->pi_lock);
> >
> > -	if (__migrate_disabled(p)) {
> > -		cpumask_copy(&p->cpus_allowed, new_mask);
> > -		return;
> > -	}
> > -
> >  	queued = task_on_rq_queued(p);
> >  	running = task_current(rq, p);
> >
> > @@ -1246,6 +1242,20 @@ void do_set_cpus_allowed(struct task_str
> >  		enqueue_task(rq, p, ENQUEUE_RESTORE);
> >  }
> >
> > +void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
> > +{
> > +	if (__migrate_disabled(p)) {
> > +		lockdep_assert_held(&p->pi_lock);
> > +
> > +		cpumask_copy(&p->cpus_allowed, new_mask);
> > +#if defined(CONFIG_PREEMPT_RT_FULL) && defined(CONFIG_SMP)
> > +		p->migrate_disable_update = 1;
> > +#endif
> > +		return;
> > +	}
> > +	__do_set_cpus_allowed_tail(p, new_mask);
> > +}
> > +
> >  static DEFINE_PER_CPU(struct cpumask, sched_cpumasks);
> >  static DEFINE_MUTEX(sched_down_mutex);
> >  static cpumask_t sched_down_cpumask;
> > @@ -3231,6 +3241,43 @@ void migrate_enable(void)
> >  	 */
> >  	p->migrate_disable = 0;
> >
> > +	if (p->migrate_disable_update) {
> > +		unsigned long flags;
> > +		struct rq *rq;
> > +
> > +		rq = task_rq_lock(p, &flags);
> > +		update_rq_clock(rq);
> > +
> > +		__do_set_cpus_allowed_tail(p, &p->cpus_allowed);
> > +		task_rq_unlock(rq, p, &flags);
> > +
> > +		p->migrate_disable_update = 0;
> > +
> > +		WARN_ON(smp_processor_id() != task_cpu(p));
> > +		if (!cpumask_test_cpu(task_cpu(p), &p->cpus_allowed)) {
> > +			const struct cpumask *cpu_valid_mask = cpu_active_mask;
> > +			struct migration_arg arg;
> > +			unsigned int dest_cpu;
> > +
> > +			if (p->flags & PF_KTHREAD) {
> > +				/*
> > +				 * Kernel threads are allowed on online && !active CPUs
> > +				 */
> > +				cpu_valid_mask = cpu_online_mask;
> > +			}
> > +			dest_cpu = cpumask_any_and(cpu_valid_mask, &p->cpus_allowed);
> > +			arg.task = p;
> > +			arg.dest_cpu = dest_cpu;
> > +
> > +			unpin_current_cpu();
> > +			preempt_lazy_enable();
> > +			preempt_enable();
> > +			stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
> > +			tlb_migrate_finish(p->mm);
> > +			return;
> > +		}
> > +	}
> > +
> >  	unpin_current_cpu();
> >  	preempt_enable();
> >  	preempt_lazy_enable();
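
To make the nr_cpus_allowed/cpus_allowed sync breakage discussed in this
thread concrete, here is a minimal user-space sketch in C. It is not kernel
code: struct task_model, mask_weight(), model_set_cpus_allowed(),
model_migrate_enable() and count_after_enable() are hypothetical names
invented for illustration, and a plain unsigned long stands in for
struct cpumask. It assumes a deferred-update path that copies the new mask
while the task is migrate-disabled and recomputes the count only when told
to, which is the kind of path the one-line cpumask_weight() fix targets.

```c
#include <assert.h>

/* Hypothetical stand-in for the task_struct fields under discussion. */
struct task_model {
	unsigned long cpus_allowed;	/* bit n set => CPU n is allowed */
	int nr_cpus_allowed;		/* should equal the weight of cpus_allowed */
	int migrate_disable;
	int migrate_disable_update;
};

/* Count the set bits in the mask (user-space analogue of cpumask_weight()). */
static int mask_weight(unsigned long mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit each pass */
		n++;
	return n;
}

/*
 * Assumed model of an affinity change: while migrate-disabled, only the
 * mask is copied and the rest of the update is deferred.
 */
static void model_set_cpus_allowed(struct task_model *p, unsigned long new_mask)
{
	p->cpus_allowed = new_mask;
	if (p->migrate_disable) {
		p->migrate_disable_update = 1;
		return;
	}
	p->nr_cpus_allowed = mask_weight(new_mask);
}

/*
 * Assumed model of the deferred update on migrate_enable(). 'resync'
 * selects whether the count is recomputed from the mask.
 */
static void model_migrate_enable(struct task_model *p, int resync)
{
	p->migrate_disable = 0;
	if (p->migrate_disable_update) {
		if (resync)
			p->nr_cpus_allowed = mask_weight(p->cpus_allowed);
		p->migrate_disable_update = 0;
	}
}

/* Run the scenario: shrink the mask from 4 CPUs to 1 while migrate-disabled. */
static int count_after_enable(int resync)
{
	struct task_model t = {
		.cpus_allowed = 0xFUL,
		.nr_cpus_allowed = 4,
		.migrate_disable = 1,
		.migrate_disable_update = 0,
	};

	model_set_cpus_allowed(&t, 0x1UL);
	model_migrate_enable(&t, resync);
	assert(t.migrate_disable_update == 0);
	return t.nr_cpus_allowed;
}
```

With these definitions, count_after_enable(0) returns the stale count of 4
even though only one CPU remains in the mask, while count_after_enable(1)
returns 1, matching the mask again.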