linux-kernel.vger.kernel.org archive mirror
From: Mel Gorman <mel@csn.ul.ie>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-media@vger.kernel.org, linux-mm@kvack.org,
	linaro-mm-sig@lists.linaro.org,
	"'Michal Nazarewicz'" <mina86@mina86.com>,
	"'Kyungmin Park'" <kyungmin.park@samsung.com>,
	"'Russell King'" <linux@arm.linux.org.uk>,
	"'Andrew Morton'" <akpm@linux-foundation.org>,
	"'KAMEZAWA Hiroyuki'" <kamezawa.hiroyu@jp.fujitsu.com>,
	"'Daniel Walker'" <dwalker@codeaurora.org>,
	"'Arnd Bergmann'" <arnd@arndb.de>,
	"'Jesse Barker'" <jesse.barker@linaro.org>,
	"'Jonathan Corbet'" <corbet@lwn.net>,
	"'Shariq Hasnain'" <shariq.hasnain@linaro.org>,
	"'Chunsang Jeong'" <chunsang.jeong@linaro.org>,
	"'Dave Hansen'" <dave@linux.vnet.ibm.com>,
	"'Benjamin Gaignard'" <benjamin.gaignard@linaro.org>,
	"'Rob Clark'" <rob.clark@linaro.org>,
	"'Ohad Ben-Cohen'" <ohad@wizery.com>
Subject: Re: [PATCH 11/15] mm: trigger page reclaim in alloc_contig_range() to stabilize watermarks
Date: Fri, 10 Feb 2012 11:19:13 +0000
Message-ID: <20120210111913.GP5796@csn.ul.ie>
In-Reply-To: <000001cce674$64bb67e0$2e3237a0$%szyprowski@samsung.com>

On Wed, Feb 08, 2012 at 04:14:46PM +0100, Marek Szyprowski wrote:
> > > <SNIP>
> > > +static int __reclaim_pages(struct zone *zone, gfp_t gfp_mask, int count)
> > > +{
> > > +	enum zone_type high_zoneidx = gfp_zone(gfp_mask);
> > > +	struct zonelist *zonelist = node_zonelist(0, gfp_mask);
> > > +	int did_some_progress = 0;
> > > +	int order = 1;
> > > +	unsigned long watermark;
> > > +
> > > +	/*
> > > +	 * Increase level of watermarks to force kswapd do his job
> > > +	 * to stabilize at new watermark level.
> > > +	 */
> > > +	min_free_kbytes += count * PAGE_SIZE / 1024;
> > 
> > There is a risk of overflow here although it is incredibly
> > small. Still, a potentially nicer way of doing this was
> > 
> > count << (PAGE_SHIFT - 10)
> > 
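
To illustrate the overflow point in the quoted comment, here is a rough
userspace sketch, not kernel code. It assumes 4 KiB pages and uses uint32_t
to model a 32-bit unsigned long; DEMO_PAGE_SHIFT, DEMO_PAGE_SIZE and both
function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12                              /* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  ((uint32_t)1 << DEMO_PAGE_SHIFT)

/*
 * Multiply first, then divide. The intermediate product wraps modulo 2^32
 * once count reaches 2^20 pages (4 GiB worth of 4 KiB pages).
 */
static uint32_t pages_to_kb_muldiv(uint32_t count)
{
	return (uint32_t)(count * DEMO_PAGE_SIZE) / 1024u;
}

/*
 * Shift form: there is no large intermediate value, so the result only
 * overflows when the kB figure itself no longer fits in 32 bits.
 */
static uint32_t pages_to_kb_shift(uint32_t count)
{
	assert(count <= (UINT32_MAX >> (DEMO_PAGE_SHIFT - 10)));
	return count << (DEMO_PAGE_SHIFT - 10);
}
```

For small counts the two agree. At count = 1 << 20 the multiply form wraps
to 0 while the shift form still gives 4194304 kB.
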
> > > +	setup_per_zone_wmarks();
> > > +
> > 
> > Nothing prevents two or more processes updating the wmarks at the same
> > time which is racy and unpredictable. Today it is not much of a problem
> > but CMA makes this path hotter than it was and you may see weirdness
> > if two processes are updating zonelists at the same time. Swap-over-NFS
> > actually starts with a patch that serialises setup_per_zone_wmarks()
> > 
> > You also potentially have a BIG problem here if this happens
> > 
> > min_free_kbytes = 32768
> > Process a: min_free_kbytes  += 65536
> > Process a: start direct reclaim
> > echo 16374 > /proc/sys/vm/min_free_kbytes
> > Process a: exit direct_reclaim
> > Process a: min_free_kbytes -= 65536
> > 
> > min_free_kbytes now wraps negative and the machine hangs.
> > 
> > The damage is confined to CMA though so I am not going to lose sleep
> > over it but you might want to consider at least preventing parallel
> > updates to min_free_kbytes from proc.
> 
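
The interleaving quoted above can be replayed sequentially in userspace to
see where the negative value comes from. This is only a sketch: replay_race()
and its parameters are hypothetical, and the tunable is modelled as a plain
signed int local rather than the real global.

```c
#include <assert.h>

/*
 * Replay the quoted sequence in order:
 *   1. process a boosts min_free_kbytes by boost_kb
 *   2. a sysctl write replaces the value with sysctl_kb
 *   3. process a subtracts the boost it believes is still applied
 * Returns the value left behind afterwards.
 */
static int replay_race(int initial_kb, int boost_kb, int sysctl_kb)
{
	int min_free_kbytes = initial_kb;

	assert(boost_kb > 0);

	min_free_kbytes += boost_kb;   /* process a: raise the floor */
	min_free_kbytes = sysctl_kb;   /* concurrent echo overwrites it */
	min_free_kbytes -= boost_kb;   /* process a: undo a boost that is gone */

	return min_free_kbytes;
}
```

With the numbers from the example, replay_race(32768, 65536, 16374) leaves
16374 - 65536 = -49162 in the tunable.
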
> Right. This approach was definitely too hacky. What do you think about replacing 
> it with the following code (I assume that setup_per_zone_wmarks() serialization 
> patch will be merged anyway so I skipped it here):
> 

It's part of a larger series and the rest of that series is
controversial. That single patch can be split out obviously so feel free
to add it to your series and stick your Signed-off-by on the end of it.

> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 82f4fa5..bb9ae41 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -371,6 +371,13 @@ struct zone {
>         /* see spanned/present_pages for more description */
>         seqlock_t               span_seqlock;
>  #endif
> +#ifdef CONFIG_CMA
> +       /*
> +        * CMA needs to increase watermark levels during the allocation
> +        * process to make sure that the system is not starved.
> +        */
> +       unsigned long           min_cma_pages;
> +#endif
>         struct free_area        free_area[MAX_ORDER];
> 
>  #ifndef CONFIG_SPARSEMEM
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 824fb37..1ca52f0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5044,6 +5044,11 @@ void setup_per_zone_wmarks(void)
> 
>                 zone->watermark[WMARK_LOW]  = min_wmark_pages(zone) + (tmp >> 2);
>                 zone->watermark[WMARK_HIGH] = min_wmark_pages(zone) + (tmp >> 1);
> +#ifdef CONFIG_CMA
> +               zone->watermark[WMARK_MIN] += zone->min_cma_pages;
> +               zone->watermark[WMARK_LOW] += zone->min_cma_pages;
> +               zone->watermark[WMARK_HIGH] += zone->min_cma_pages;
> +#endif
>                 setup_zone_migrate_reserve(zone);
>                 spin_unlock_irqrestore(&zone->lock, flags);
>         }

This is better in that it is not vulnerable to parallel updates of
min_free_kbytes. It would be slightly tidier to introduce something
like cma_wmark_pages() that returns min_cma_pages if CONFIG_CMA is set
and 0 otherwise. Use the helper to get rid of this ifdef CONFIG_CMA within
setup_per_zone_wmarks().
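
Roughly what I have in mind, as a standalone userspace sketch rather than
a patch. struct zone here is a cut-down stand-in, CONFIG_CMA is defined by
hand, and add_cma_wmarks() is a made-up name for the part of
setup_per_zone_wmarks() that would use the helper.

```c
#include <assert.h>

#define CONFIG_CMA 1	/* assumption for the demo; normally set by Kconfig */

enum { WMARK_MIN, WMARK_LOW, WMARK_HIGH, NR_WMARK };

/* Cut-down stand-in for the kernel's struct zone. */
struct zone {
	unsigned long watermark[NR_WMARK];
#ifdef CONFIG_CMA
	unsigned long min_cma_pages;
#endif
};

/* The only place that needs to know whether CMA is configured. */
#ifdef CONFIG_CMA
static inline unsigned long cma_wmark_pages(const struct zone *zone)
{
	return zone->min_cma_pages;
}
#else
static inline unsigned long cma_wmark_pages(const struct zone *zone)
{
	(void)zone;
	return 0;
}
#endif

/* Caller side: no ifdef, the helper evaluates to 0 without CMA. */
static void add_cma_wmarks(struct zone *zone)
{
	int i;

	assert(zone);
	for (i = 0; i < NR_WMARK; i++)
		zone->watermark[i] += cma_wmark_pages(zone);
}
```

Without CONFIG_CMA the additions become += 0 and the compiler can drop
them, so the caller stays free of conditionals.
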

You'll still have the problem of kswapd not taking CMA pages properly into
account when deciding whether to reclaim or not though.

-- 
Mel Gorman
SUSE Labs
