From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751839AbbKIT16 (ORCPT ); Mon, 9 Nov 2015 14:27:58 -0500
Received: from mx2.parallels.com ([199.115.105.18]:55185 "EHLO
        mx2.parallels.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751417AbbKIT14 (ORCPT );
        Mon, 9 Nov 2015 14:27:56 -0500
Date: Mon, 9 Nov 2015 22:27:47 +0300
From: Vladimir Davydov
To: Tejun Heo
CC: Michal Hocko, Andrew Morton, Johannes Weiner, Greg Thelen,
        linux-mm@kvack.org, cgroups@vger.kernel.org,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/5] memcg/kmem: switch to white list policy
Message-ID: <20151109192747.GN31308@esperanza>
References: <20151109140832.GE8916@dhcp22.suse.cz>
        <20151109182840.GJ31308@esperanza>
        <20151109185401.GB28507@mtj.duckdns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20151109185401.GB28507@mtj.duckdns.org>
X-ClientProxiedBy: US-EXCH.sw.swsoft.com (10.255.249.47) To
        US-EXCH2.sw.swsoft.com (10.255.249.46)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 09, 2015 at 01:54:01PM -0500, Tejun Heo wrote:
> On Mon, Nov 09, 2015 at 09:28:40PM +0300, Vladimir Davydov wrote:
> > > I am _all_ for this semantic I am just not sure what to do with the
> > > legacy kmem controller. Can we change its semantic? If we cannot do that
> >
> > I think we can. If somebody reports a "bug" caused by this change, i.e.
> > basically notices that something that used to be accounted is not any
> > longer, it will be trivial to fix by adding __GFP_ACCOUNT where
> > appropriate. If it is not, e.g. if accounting of objects of a particular
> > type leads to intense false-sharing, we would end up disabling
> > accounting for it anyway.
>
> I agree too, if anything is meaningfully broken by the flip, it just
> indicates that the whitelist needs to be expanded; however, I wonder
> whether this would be done better at slab level rather than per
> allocation site.

I'd like to, but this is not as simple as it seems at first glance. The
problem is that slab caches of the same size are actively merged with
each other. If we just added a SLAB_ACCOUNT flag, which would be passed
to kmem_cache_create to enable accounting, we'd divide all caches into
two groups that couldn't be merged with each other even if kmem
accounting was not used at all. This would be a show stopper.

Of course, we could rework slab merging so that kmem_cache_create
returned a new dummy cache even if it was actually merged. Such a cache
would point to the real cache, which would be used for allocations. This
wouldn't limit slab merging, but it would add one more dereference to
the alloc path, which is even worse.

That's why I decided to go with marking individual allocations.

Thanks,
Vladimir
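
The merging concern above can be illustrated with a small userspace
sketch. This is a toy model, not kernel code: every toy_* name and
TOY_* flag is hypothetical and only stands in for the ideas discussed
(a per-cache accounting flag versus a per-allocation one). It assumes,
as a simplification, that two caches merge only when both their object
size and their flags are identical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags for this toy model only. They stand in for a
 * per-cache SLAB_ACCOUNT-style flag and a per-allocation
 * __GFP_ACCOUNT-style flag; they are not kernel definitions. */
#define TOY_SLAB_ACCOUNT 0x1u
#define TOY_GFP_ACCOUNT  0x1u

#define TOY_MAX_CACHES 16

struct toy_cache {
	size_t object_size;
	unsigned int flags;
	int refcount;		/* number of users sharing this cache */
};

struct toy_pool {
	struct toy_cache caches[TOY_MAX_CACHES];
	int ncaches;
};

struct toy_memcg {
	size_t charged;		/* bytes charged so far */
};

/* Reuse ("merge with") an existing cache that has the same size and
 * flags, otherwise set up a new one. */
static struct toy_cache *toy_cache_create(struct toy_pool *pool,
					  size_t size, unsigned int flags)
{
	struct toy_cache *c;
	int i;

	for (i = 0; i < pool->ncaches; i++) {
		c = &pool->caches[i];
		if (c->object_size == size && c->flags == flags) {
			c->refcount++;
			return c;
		}
	}

	assert(pool->ncaches < TOY_MAX_CACHES);
	c = &pool->caches[pool->ncaches++];
	c->object_size = size;
	c->flags = flags;
	c->refcount = 1;
	return c;
}

/* Per-allocation marking: the cache is untouched, and only calls that
 * pass TOY_GFP_ACCOUNT are charged. Returns the number of bytes charged. */
static size_t toy_alloc_charge(const struct toy_cache *c, unsigned int gfp,
			       struct toy_memcg *memcg)
{
	if (!(gfp & TOY_GFP_ACCOUNT))
		return 0;
	memcg->charged += c->object_size;
	return c->object_size;
}
```

In this model, two 64-byte caches created with identical flags share
one toy_cache, while a third created with TOY_SLAB_ACCOUNT cannot join
them, so the pool ends up with two caches even if nothing is ever
charged. Passing TOY_GFP_ACCOUNT to toy_alloc_charge instead leaves
the caches merged and charges only the marked calls.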