Date: Thu, 18 Mar 2010 11:58:08 +0900
From: KAMEZAWA Hiroyuki
To: Daisuke Nishimura
Cc: balbir@linux.vnet.ibm.com, Andrea Righi, Vivek Goyal, Peter Zijlstra,
 Trond Myklebust, Suleiman Souhlal, Greg Thelen, "Kirill A. Shutemov",
 Andrew Morton, containers@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH -mmotm 1/5] memcg: disable irq at page cgroup lock
Message-Id: <20100318115808.a62a31d6.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20100318111653.92f899e6.nishimura@mxp.nes.nec.co.jp>
References: <1268609202-15581-1-git-send-email-arighi@develer.com>
 <1268609202-15581-2-git-send-email-arighi@develer.com>
 <20100317115855.GS18054@balbir.in.ibm.com>
 <20100318085411.834e1e46.kamezawa.hiroyu@jp.fujitsu.com>
 <20100318094519.cd1eed72.kamezawa.hiroyu@jp.fujitsu.com>
 <20100318111653.92f899e6.nishimura@mxp.nes.nec.co.jp>
Organization: FUJITSU Co. LTD.

On Thu, 18 Mar 2010 11:16:53 +0900
Daisuke Nishimura wrote:

> On Thu, 18 Mar 2010 09:45:19 +0900, KAMEZAWA Hiroyuki wrote:
> > On Thu, 18 Mar 2010 08:54:11 +0900
> > KAMEZAWA Hiroyuki wrote:
> >
> > > On Wed, 17 Mar 2010 17:28:55 +0530
> > > Balbir Singh wrote:
> > >
> > > > * Andrea Righi [2010-03-15 00:26:38]:
> > > >
> > > > > From: KAMEZAWA Hiroyuki
> > > > >
> > > > > Now, file-mapped is maintained. But a more generic update function
> > > > > will be needed for dirty page accounting.
> > > > >
> > > > > For accounting page status, we have to guarantee that lock_page_cgroup()
> > > > > will never be called with tree_lock held.
> > > > > To guarantee that, we use trylock when updating status.
> > > > > By this, we do fuzzy accounting, but in almost all cases, it's correct.
> > > > >
> > > >
> > > > I don't like this at all, but "in almost all cases" is not acceptable
> > > > for statistics, since decisions will be made on them and having them
> > > > incorrect is really bad. Could we do a form of deferred statistics and
> > > > fix this?
> > > >
> > >
> > > plz show your implementation which has no performance regression.
> > > For me, I don't need file_mapped accounting, at all. If we can remove that,
> > > we can add a simple migration lock.
> > > file_mapped is a feature you added. please improve it.
> > >
> >
> > BTW, I should explain how accurate this accounting is in this patch itself.
> >
> > Now, lock_page_cgroup/unlock_page_cgroup happens when
> > - charge/uncharge/migrate/move accounting
> >
> > Then, the lock contention (trylock failure) seems to occur in conflict
> > with
> > - charge, uncharge, migrate, move accounting
> >
> > About dirty accounting, charge/uncharge/migrate are operations done
> > synchronously with the radix-tree (holding tree_lock etc). Then no account leak.
> > move accounting is the only source of inaccuracy...but I don't think this move-task
> > is critical....moreover, we don't move any file pages at task-move, now.
> > (But Nishimura-san has a plan to do so.)
> > So, contention will happen only in conflict with force_empty.
> >
> > About FILE_MAPPED accounting, it's not synchronous with radix-tree operation.
> > Then, accounting-miss seems to happen when charge/uncharge/migrate/account move.
> > But
> > charge .... we don't account a page as FILE_MAPPED before it's charged.
> > uncharge .. usual operation in truncation is unmap->remove-from-radix-tree.
> > Then, it's sequential in almost all cases. The race exists when...
> > Assume there are 2 threads A and B. A truncates a file, B maps/unmaps it.
> > This is a very unusual conflict.
> > migrate.... we do try_to_unmap before migrating pages. Then, FILE_MAPPED
> > is properly handled.
> > move account .... we don't have move-account-mapped-file, yet.
> >
> FILE_MAPPED is updated under pte lock. OTOH, move account is also done under
> pte lock. page cgroup lock is held under pte lock in both cases, so move account
> is not much of a problem for FILE_MAPPED.
>
HmmHmm, thank you. then, only racy cases are truncate and force_empty.

Thanks,
-Kame
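
For readers following the thread, the trylock-based "fuzzy accounting" under
discussion can be illustrated with a small userspace sketch. This is not the
patch's code: struct page_stat, page_stat_init() and page_stat_try_update() are
hypothetical names, a pthread mutex stands in for lock_page_cgroup(), and the
missed counter is an added assumption to make skipped updates visible. The
point is only that the updater never blocks on the lock, so it is safe to call
from a context that already holds another lock (tree_lock in the thread), at
the cost of occasionally dropping an update.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for per-page accounting state.
 * This is NOT the kernel's struct page_cgroup. */
struct page_stat {
    pthread_mutex_t lock;   /* plays the role of lock_page_cgroup() */
    long file_mapped;       /* statistic, protected by lock */
    atomic_long missed;     /* updates skipped because lock was busy */
};

static void page_stat_init(struct page_stat *ps)
{
    assert(ps != NULL);
    pthread_mutex_init(&ps->lock, NULL);
    ps->file_mapped = 0;
    atomic_init(&ps->missed, 0);
}

/* Apply delta to the statistic without ever blocking.
 * Returns true if the update was accounted, false if the lock was
 * busy and the update was skipped (the "fuzzy" case). */
static bool page_stat_try_update(struct page_stat *ps, long delta)
{
    if (pthread_mutex_trylock(&ps->lock) != 0) {
        /* Lock held elsewhere: skip the update rather than wait. */
        atomic_fetch_add(&ps->missed, 1);
        return false;
    }
    ps->file_mapped += delta;
    pthread_mutex_unlock(&ps->lock);
    return true;
}
```

With no contention, page_stat_try_update() always succeeds and the counter is
exact. An update is lost only when another path (charge, uncharge, migrate or
move accounting in the thread's terms) holds the lock at that moment, which is
the window the message above narrows down to truncate and force_empty.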