From: "Yu, Fenghua"
To: Thomas Gleixner
CC: Ingo Molnar, "Anvin, H Peter", "Luck, Tony", Tejun Heo, "Borislav Petkov",
    Stephane Eranian, Peter Zijlstra, Marcelo Tosatti, "David Carrillo-Cisneros",
    "Shankar, Ravi V", Vikas Shivappa, "Prakhya, Sai Praneeth", linux-kernel, x86
Subject: RE: [PATCH 24/32] Task fork and exit for rdtgroup
Date: Wed, 13 Jul 2016 21:22:52 +0000
Message-ID: <3E5A0FA7E9CA944F9D5414FEC6C712205DFD4BC8@ORSMSX106.amr.corp.intel.com>
References: <1468371785-53231-1-git-send-email-fenghua.yu@intel.com>
 <1468371785-53231-25-git-send-email-fenghua.yu@intel.com>
 <3E5A0FA7E9CA944F9D5414FEC6C712205DFD4A75@ORSMSX106.amr.corp.intel.com>

> From: Thomas Gleixner [mailto:tglx@linutronix.de]
> Sent: Wednesday, July 13, 2016 2:03 PM
>
> On Wed, 13 Jul 2016, Yu, Fenghua wrote:
> > On Wed, July 2016, Thomas Gleixner wrote:
> > > On Tue, 12 Jul 2016, Fenghua Yu wrote:
> > > > +void rdtgroup_post_fork(struct task_struct *child)
> > > > +{
> > > > +	if (!use_rdtgroup_tasks)
> > > > +		return;
> > > > +
> > > > +	spin_lock_irq(&rdtgroup_task_lock);
> > > > +	if (list_empty(&child->rg_list)) {
> > >
> > > Why would the list be non-empty after a fork?
> >
> > In this situation for a pid:
> > 1. rdtgroup_fork(): rg_list = null.
> > 2. setup_task_rg_lists(): rg_list is set up.
> > 3. rdtgroup_fork(): rg_list is not empty.
>
> Why would rdtgroup_fork() be called twice for a given thread?
>
> > This situation happens only during rscctrl mount time. Before mount,
> > post_fork() returns from !use_rdtgroup_tasks and doesn't set up
> > rg_list. After mount, rg_list is always empty in post_fork(). But we
> > need to check rg_list for the above situation.
> >
> > Does that make sense?
>
> No, that does not make any sense at all.
>
> > Any suggestion for a better solution?
>
> The problem you have is:
>
>     fork
>         list_init(rg_list);
>         write_lock(tasklist_lock);
>
>         task becomes visible
>
>         write_unlock(tasklist_lock);
>
>         rdtgroup_post_fork();
>             if (!use_rdtgroup_tasks)
>                 return;
>
>             spin_lock_irq(&rdtgroup_task_lock);
>             list_add();
>             spin_unlock_irq(&rdtgroup_task_lock);
>
> I have no idea why this lock must be taken with _irq, but that's another
> story. Let's look at the mount side:
>
>     spin_lock_irq(&rdtgroup_task_lock);
>     read_lock(&tasklist_lock);
>
>     do_each_thread(g, p) {
>         WARN_ON(More magic crap happening there)
>
>         spin_lock_irq(&p->sighand->siglock);
>         list_add();
>         spin_unlock_irq(&p->sighand->siglock);
>         ^^^^
> Great: You undo the irq disable of (&rdtgroup_task_lock) above! Oh well....
>
>     read_unlock(&tasklist_lock);
>     spin_unlock_irq(&rdtgroup_task_lock);
>
> So you need all this magic in rdtgroup_post_fork() and setup_task_rg_lists()
> just because you blindly positioned rdtgroup_post_fork() at the point where
> the cgroup_post_fork() stuff is. But you did not think a second about the
> locking rules here, otherwise they would be documented somewhere.
>
> You need a read_lock(&tasklist_lock) for the mount part anyway. So why
> don't you do the obvious:
>
>     fork
>         list_init(rg_list);
>         write_lock(tasklist_lock);
>
>         rdtgroup_post_fork();
>             if (use_rdtgroup_tasks)
>                 spin_lock(&rdtgroup_task_lock);
>                 list_add();
>                 spin_unlock(&rdtgroup_task_lock);
>
>         task becomes visible
>
>         write_unlock(tasklist_lock);
>
> And reorder the lock ordering in the mount path:
>
>     read_lock(&tasklist_lock);
>     spin_lock(&rdtgroup_task_lock);
>
> Now using rdtgroup_task_lock to protect current->rdtgroup is horrible as
> well. You need task->sighand->siglock in the mount path anyway to prevent
> exit races. So you can simplify the whole magic to:
>
>     fork
>         list_init(rg_list);
>         write_lock(tasklist_lock);
>
>         spin_lock(&current->sighand->siglock);
>
>         rdtgroup_post_fork();
>             if (use_rdtgroup_tasks)
>                 list_add();
>
>         spin_unlock(&current->sighand->siglock);
>         write_unlock(tasklist_lock);
>
> That removes an extra lock/unlock operation from the fork path because
> current->sighand->siglock is taken inside of the tasklist_lock write-locked
> section already.
>
> So you need protection for use_rdtgroup_tasks, which is a complete misnomer
> btw. (rdtgroup_active would be too obvious, right?). That protection is
> simple because you can set that flag with tasklist_lock read-locked, which
> you hold anyway for iterating all threads in the mount path.
>
> Aside of that you need to take tsk->sighand->siglock when you change
> tsk->rdtgroup, but that's a no-brainer and it gives you the extra benefit
> that you can protect such an operation against exit of the task that way by
> checking PF_EXITING under the lock. I don't see any protection against exit
> in your current implementation when a task is moved to a different partition.
>
> Please sit down and describe the complete locking and protection scheme of
> this stuff. I'm not going to figure this out from the obscure code another
> time.
>
> Thanks,
>
> 	tglx

Sure, I'll rethink the locking and protection scheme for the tasks.

Thanks.

-Fenghua
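
For reference, a minimal C sketch of the simplified fork-path and mount-path
ordering suggested above. It assumes the rg_list and use_rdtgroup_tasks names
from the discussion plus a hypothetical global rdtgroup_task_list, and elides
the rest of copy_process(); it is illustrative only, not code from the patch
series.

    /*
     * Fork path: link the child under current's siglock while tasklist_lock
     * is still write-locked, i.e. before the child becomes visible to the
     * mount-path iterator.
     */
    static void rdtgroup_post_fork_sketch(struct task_struct *child)
    {
    	INIT_LIST_HEAD(&child->rg_list);

    	write_lock_irq(&tasklist_lock);
    	spin_lock(&current->sighand->siglock);

    	if (use_rdtgroup_tasks)			/* aka rdtgroup_active */
    		list_add_tail(&child->rg_list, &rdtgroup_task_list);

    	spin_unlock(&current->sighand->siglock);

    	/* ... child is attached to the thread lists (becomes visible) ... */

    	write_unlock_irq(&tasklist_lock);
    }

    /*
     * Mount path: tasklist_lock first, then each task's siglock; skip tasks
     * that are already exiting, and set the flag while still holding
     * tasklist_lock.
     */
    static void rdtgroup_setup_task_lists_sketch(void)
    {
    	struct task_struct *g, *p;

    	read_lock(&tasklist_lock);
    	do_each_thread(g, p) {
    		spin_lock(&p->sighand->siglock);
    		if (!(p->flags & PF_EXITING))
    			list_add_tail(&p->rg_list, &rdtgroup_task_list);
    		spin_unlock(&p->sighand->siglock);
    	} while_each_thread(g, p);
    	use_rdtgroup_tasks = true;
    	read_unlock(&tasklist_lock);
    }

The point of this ordering is that list_add() runs before the child is
reachable by do_each_thread(), so the fork and mount paths cannot race over
rg_list, and checking PF_EXITING under siglock closes the exit race.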