From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757575Ab2AROCF (ORCPT ); Wed, 18 Jan 2012 09:02:05 -0500 Received: from mail-ww0-f44.google.com ([74.125.82.44]:52426 "EHLO mail-ww0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757529Ab2AROCD (ORCPT ); Wed, 18 Jan 2012 09:02:03 -0500 MIME-Version: 1.0 In-Reply-To: <20120118134053.GD31112@tiehlicka.suse.cz> References: <20120117131601.GB14907@tiehlicka.suse.cz> <20120117140712.GC14907@tiehlicka.suse.cz> <20120118134053.GD31112@tiehlicka.suse.cz> Date: Wed, 18 Jan 2012 22:01:57 +0800 Message-ID: Subject: Re: [PATCH] mm: memcg: remove checking reclaim order in soft limit reclaim From: Hillf Danton To: Michal Hocko Cc: linux-mm@kvack.org, Johannes Weiner , KAMEZAWA Hiroyuki , Hugh Dickins , Andrew Morton , LKML , Balbir Singh Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 18, 2012 at 9:40 PM, Michal Hocko wrote: > On Wed 18-01-12 20:30:41, Hillf Danton wrote: >> On Tue, Jan 17, 2012 at 10:07 PM, Michal Hocko wrote: >> > On Tue 17-01-12 21:29:52, Hillf Danton wrote: >> >> On Tue, Jan 17, 2012 at 9:16 PM, Michal Hocko wrote: >> >> > Hi, >> >> > >> >> > On Tue 17-01-12 20:47:59, Hillf Danton wrote: >> >> >> If async order-O reclaim expected here, it is settled down when setting up scan >> >> >> control, with scan priority hacked to be zero. Other than that, deny of reclaim >> >> >> should be removed. >> >> > >> >> > Maybe I have misunderstood you but this is not right. The check is to >> >> > protect from the _global_ reclaim with order > 0 when we prevent from >> >> > memcg soft reclaim. >> >> > >> >> need to bear mm hog in this way? >> > >> > Could you be more specific? Are you trying to fix any particular >> > problem? >> > >> My thought is simple, the outcome of softlimit reclaim depends little on the >> value of reclaim order, zero or not, and only exceeding is reclaimed, so >> selective response to swapd's request is incorrect. > > OK, got your point, finally. Let's add Balbir (the proposed patch can > be found at https://lkml.org/lkml/2012/1/17/166) to the CC list because > this seems to be a design decision. > > I always thought that this is because we want non-userspace (high order) > mem pressure to be handled by the global reclaim only. And it makes some > sense to me because it is little bit strange to reclaim for order-0 > while the request is for an higher order. I guess this might lead to an > extensive and pointless reclaiming because we might end up with many > free pages which cannot satisfy higher order allocation. > > On the other hand, it is true that the documentation says that the soft > limit is considered when "the system detects memory contention or low > memory" which doesn't say that the contention comes from memcg accounted > memory. > > Anyway this changes the current behavior so it would better come with > much better justification which shows that over reclaim doesn't happen > and that we will not see higher latencies with higher order allocations. > As the function shows, the checked reclaim order is not used, but the scan control is prepared with order(= 0), which is called async order-0 reclaim in my tern, then your worries on over reclaim and higher latencies could be removed, I think 8-) Thanks Hillf unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *memcg, gfp_t gfp_mask, bool noswap, struct zone *zone, unsigned long *nr_scanned) { struct scan_control sc = { .nr_scanned = 0, .nr_to_reclaim = SWAP_CLUSTER_MAX, .may_writepage = !laptop_mode, .may_unmap = 1, .may_swap = !noswap, .order = 0, .target_mem_cgroup = memcg, }; struct mem_cgroup_zone mz = { .mem_cgroup = memcg, .zone = zone, }; sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) | (GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK); trace_mm_vmscan_memcg_softlimit_reclaim_begin(0, sc.may_writepage, sc.gfp_mask); /* * NOTE: Although we can get the priority field, using it * here is not a good idea, since it limits the pages we can scan. * if we don't reclaim here, the shrink_zone from balance_pgdat * will pick up pages from other mem cgroup's as well. We hack * the priority and make it zero. */ shrink_mem_cgroup_zone(0, &mz, &sc); trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed); *nr_scanned = sc.nr_scanned; return sc.nr_reclaimed; }