From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754836AbbBKWPd (ORCPT );
	Wed, 11 Feb 2015 17:15:33 -0500
Received: from mail-lb0-f177.google.com ([209.85.217.177]:62996 "EHLO
	mail-lb0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754076AbbBKWPb (ORCPT );
	Wed, 11 Feb 2015 17:15:31 -0500
MIME-Version: 1.0
In-Reply-To: <20150211220530.GA12728@htj.duckdns.org>
References: <20150206141746.GB10580@htj.dyndns.org>
	<20150207143839.GA9926@htj.dyndns.org>
	<20150211021906.GA21356@htj.duckdns.org>
	<20150211203359.GF21356@htj.duckdns.org>
	<20150211214650.GA11920@htj.duckdns.org>
	<20150211220530.GA12728@htj.duckdns.org>
Date: Thu, 12 Feb 2015 02:15:29 +0400
Message-ID:
Subject: Re: [RFC] Making memcg track ownership per address_space or anon_vma
From: Konstantin Khlebnikov
To: Tejun Heo
Cc: Greg Thelen , Konstantin Khlebnikov , Johannes Weiner ,
	Michal Hocko , Cgroups , "linux-mm@kvack.org" ,
	"linux-kernel@vger.kernel.org" , Jan Kara , Dave Chinner ,
	Jens Axboe , Christoph Hellwig , Li Zefan , Hugh Dickins
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 12, 2015 at 1:05 AM, Tejun Heo wrote:
> On Thu, Feb 12, 2015 at 01:57:04AM +0400, Konstantin Khlebnikov wrote:
>> On Thu, Feb 12, 2015 at 12:46 AM, Tejun Heo wrote:
>> > Hello,
>> >
>> > On Thu, Feb 12, 2015 at 12:22:34AM +0300, Konstantin Khlebnikov wrote:
>> >> > Yeah, available memory to the matching memcg and the number of dirty
>> >> > pages in it. It's gonna work the same way as the global case just
>> >> > scoped to the cgroup.
>> >>
>> >> That might be a problem: all dirty pages accounted to a cgroup must be
>> >> reachable for its own personal writeback or balance-dirty-pages will be
>> >> unable to satisfy memcg dirty memory thresholds. I've done accounting
>> >
>> > Yeah, it would. Why wouldn't it?
>>
>> How do you plan to do per-memcg/blkcg writeback for balance-dirty-pages?
>> Or are you thinking only about separating the writeback flow into blkio
>> cgroups without actual inode filtering? I mean delaying inode writeback
>> and keeping dirty pages as long as possible if their cgroups are far
>> from the threshold.
>
> What? The code was already in the previous patchset. I'm just gonna
> rip out the code to handle inodes being dirtied on multiple wb's.

Well, ok. Even if shared writes are rare, they should be handled somehow
without relying on kupdate-like writeback. If a memcg has a lot of dirty
pages but their inodes accidentally belong to the wrong wb queues, then
tasks in that memcg shouldn't get stuck in balance-dirty-pages until
somebody outside accidentally writes this data. That's all I wanted to say.

> --
> tejun