All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 3/3] compaction accouting fix
  2011-11-12 16:37 ` Minchan Kim
  (?)
@ 2011-08-29 16:43 ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-29 16:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

I saw the following accouting of compaction during test of the series.

compact_blocks_moved 251
compact_pages_moved 44

It's very awkward to me although it's possbile because it means we try to compact 251 blocks
but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
but doesn't increased compact_pages_moved.

This patch makes accouting of compaction works only in case of success of isolation.

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 mm/compaction.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 0e572d1..bb3a209 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -260,6 +260,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	unsigned long low_pfn, end_pfn;
	unsigned long last_pageblock_nr = 0, pageblock_nr;
	unsigned long nr_scanned = 0, nr_isolated = 0;
+	isolate_migrate_t ret = ISOLATE_NONE;
	struct list_head *migratelist = &cc->migratepages;
	isolate_mode_t mode = ISOLATE_ACTIVE| ISOLATE_INACTIVE |
				ISOLATE_UNEVICTABLE;
@@ -273,7 +274,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	/* Do not cross the free scanner or scan within a memory hole */
	if (end_pfn > cc->free_pfn || !pfn_valid(low_pfn)) {
		cc->migrate_pfn = end_pfn;
-		return ISOLATE_NONE;
+		return ret;
	}

	/*
@@ -370,14 +371,17 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
			break;
	}

-	acct_isolated(zone, cc);
+	if (cc->nr_migratepages > 0) {
+		acct_isolated(zone, cc);
+		ret = ISOLATE_SUCCESS;
+	}

	spin_unlock_irq(&zone->lru_lock);
	cc->migrate_pfn = low_pfn;

	trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated);

-	return ISOLATE_SUCCESS;
+	return ret;
 }

 /*
--
1.7.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 1/3] Correct isolate_mode_t bitwise type
  2011-11-12 16:37 ` Minchan Kim
  (?)
  (?)
@ 2011-08-29 16:43 ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-29 16:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

[c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
made a mistake on the bitwise type.

This patch corrects it.

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 include/linux/mmzone.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1ed4116..188cb2f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -166,13 +166,13 @@ static inline int is_unevictable_lru(enum lru_list l)
 #define LRU_ALL	     ((1 << NR_LRU_LISTS) - 1)

 /* Isolate inactive pages */
-#define ISOLATE_INACTIVE	((__force fmode_t)0x1)
+#define ISOLATE_INACTIVE	((__force isolate_mode_t)0x1)
 /* Isolate active pages */
-#define ISOLATE_ACTIVE		((__force fmode_t)0x2)
+#define ISOLATE_ACTIVE		((__force isolate_mode_t)0x2)
 /* Isolate clean file */
-#define ISOLATE_CLEAN		((__force fmode_t)0x4)
+#define ISOLATE_CLEAN		((__force isolate_mode_t)0x4)
 /* Isolate unmapped file */
-#define ISOLATE_UNMAPPED	((__force fmode_t)0x8)
+#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x8)

 /* LRU Isolation modes. */
 typedef unsigned __bitwise__ isolate_mode_t;
--
1.7.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 2/3] compaction: compact unevictable page
  2011-11-12 16:37 ` Minchan Kim
                   ` (2 preceding siblings ...)
  (?)
@ 2011-08-29 16:43 ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-29 16:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
which doesn't consider unevicatable page. It has been used by just lumpy so
it was pointless that it isolates unevictable page. But the situation is
changed. Compaction could handle unevictable page and it can help getting
big contiguos pages in fragment memory by many pinned page with mlock.

I tested this patch with following scenario.

1. A : allocate 80% anon pages in system
2. B : allocate 20% mlocked page in system
/* Maybe, mlocked pages are located in low pfn address */
3. kill A /* high pfn address are free */
4. echo 1 > /proc/sys/vm/compact_memory

old:

compact_blocks_moved 251
compact_pages_moved 44

new:

compact_blocks_moved 258
compact_pages_moved 412

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 include/linux/mmzone.h |    6 ++++--
 mm/compaction.c        |    3 ++-
 mm/vmscan.c            |    7 +------
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 188cb2f..82b505e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -169,10 +169,12 @@ static inline int is_unevictable_lru(enum lru_list l)
 #define ISOLATE_INACTIVE	((__force isolate_mode_t)0x1)
 /* Isolate active pages */
 #define ISOLATE_ACTIVE		((__force isolate_mode_t)0x2)
+/* Isolate unevictable pages */
+#define ISOLATE_UNEVICTABLE	((__force isolate_mode_t)0x4)
 /* Isolate clean file */
-#define ISOLATE_CLEAN		((__force isolate_mode_t)0x4)
+#define ISOLATE_CLEAN		((__force isolate_mode_t)0x8)
 /* Isolate unmapped file */
-#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x8)
+#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x10)

 /* LRU Isolation modes. */
 typedef unsigned __bitwise__ isolate_mode_t;
diff --git a/mm/compaction.c b/mm/compaction.c
index a0e4202..0e572d1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -261,7 +261,8 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	unsigned long last_pageblock_nr = 0, pageblock_nr;
	unsigned long nr_scanned = 0, nr_isolated = 0;
	struct list_head *migratelist = &cc->migratepages;
-	isolate_mode_t mode = ISOLATE_ACTIVE|ISOLATE_INACTIVE;
+	isolate_mode_t mode = ISOLATE_ACTIVE| ISOLATE_INACTIVE |
+				ISOLATE_UNEVICTABLE;

	/* Do not scan outside zone boundaries */
	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 23256e8..2300342 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1057,12 +1057,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode, int file)
	if (!all_lru_mode && !!page_is_file_cache(page) != file)
		return ret;

-	/*
-	 * When this function is being called for lumpy reclaim, we
-	 * initially look into all LRU pages, active, inactive and
-	 * unevictable; only give shrink_page_list evictable pages.
-	 */
-	if (PageUnevictable(page))
+	if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE))
		return ret;

	ret = -EBUSY;
--
1.7.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
  2011-11-12 16:37   ` Minchan Kim
@ 2011-08-30 17:51     ` Rik van Riel
  -1 siblings, 0 replies; 48+ messages in thread
From: Rik van Riel @ 2011-08-30 17:51 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Johannes Weiner

On 11/12/2011 11:37 AM, Minchan Kim wrote:
> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
> made a mistake on the bitwise type.
>
> This patch corrects it.
>
> CC: Mel Gorman<mgorman@suse.de>
> CC: Johannes Weiner<jweiner@redhat.com>
> CC: Rik van Riel<riel@redhat.com>
> Signed-off-by: Minchan Kim<minchan.kim@gmail.com>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
@ 2011-08-30 17:51     ` Rik van Riel
  0 siblings, 0 replies; 48+ messages in thread
From: Rik van Riel @ 2011-08-30 17:51 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Johannes Weiner

On 11/12/2011 11:37 AM, Minchan Kim wrote:
> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
> made a mistake on the bitwise type.
>
> This patch corrects it.
>
> CC: Mel Gorman<mgorman@suse.de>
> CC: Johannes Weiner<jweiner@redhat.com>
> CC: Rik van Riel<riel@redhat.com>
> Signed-off-by: Minchan Kim<minchan.kim@gmail.com>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
  2011-11-12 16:37   ` Minchan Kim
@ 2011-08-31  1:09     ` Rik van Riel
  -1 siblings, 0 replies; 48+ messages in thread
From: Rik van Riel @ 2011-08-31  1:09 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Johannes Weiner

On 11/12/2011 11:37 AM, Minchan Kim wrote:
> Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> which doesn't consider unevicatable page. It has been used by just lumpy so
> it was pointless that it isolates unevictable page. But the situation is
> changed. Compaction could handle unevictable page and it can help getting
> big contiguos pages in fragment memory by many pinned page with mlock.
>
> I tested this patch with following scenario.
>
> 1. A : allocate 80% anon pages in system
> 2. B : allocate 20% mlocked page in system
> /* Maybe, mlocked pages are located in low pfn address */
> 3. kill A /* high pfn address are free */
> 4. echo 1>  /proc/sys/vm/compact_memory
>
> old:
>
> compact_blocks_moved 251
> compact_pages_moved 44
>
> new:
>
> compact_blocks_moved 258
> compact_pages_moved 412
>
> CC: Mel Gorman<mgorman@suse.de>
> CC: Johannes Weiner<jweiner@redhat.com>
> CC: Rik van Riel<riel@redhat.com>
> Signed-off-by: Minchan Kim<minchan.kim@gmail.com>

Reviewed-by: Rik van Riel <riel@redhat.com>


-- 
All rights reversed

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
@ 2011-08-31  1:09     ` Rik van Riel
  0 siblings, 0 replies; 48+ messages in thread
From: Rik van Riel @ 2011-08-31  1:09 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Johannes Weiner

On 11/12/2011 11:37 AM, Minchan Kim wrote:
> Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> which doesn't consider unevicatable page. It has been used by just lumpy so
> it was pointless that it isolates unevictable page. But the situation is
> changed. Compaction could handle unevictable page and it can help getting
> big contiguos pages in fragment memory by many pinned page with mlock.
>
> I tested this patch with following scenario.
>
> 1. A : allocate 80% anon pages in system
> 2. B : allocate 20% mlocked page in system
> /* Maybe, mlocked pages are located in low pfn address */
> 3. kill A /* high pfn address are free */
> 4. echo 1>  /proc/sys/vm/compact_memory
>
> old:
>
> compact_blocks_moved 251
> compact_pages_moved 44
>
> new:
>
> compact_blocks_moved 258
> compact_pages_moved 412
>
> CC: Mel Gorman<mgorman@suse.de>
> CC: Johannes Weiner<jweiner@redhat.com>
> CC: Rik van Riel<riel@redhat.com>
> Signed-off-by: Minchan Kim<minchan.kim@gmail.com>

Reviewed-by: Rik van Riel <riel@redhat.com>


-- 
All rights reversed

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
  2011-11-12 16:37   ` Minchan Kim
@ 2011-08-31 11:13     ` Johannes Weiner
  -1 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 11:13 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:41AM +0900, Minchan Kim wrote:
> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
> made a mistake on the bitwise type.
> 
> This patch corrects it.
> 
> CC: Mel Gorman <mgorman@suse.de>
> CC: Johannes Weiner <jweiner@redhat.com>
> CC: Rik van Riel <riel@redhat.com>
> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
@ 2011-08-31 11:13     ` Johannes Weiner
  0 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 11:13 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:41AM +0900, Minchan Kim wrote:
> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
> made a mistake on the bitwise type.
> 
> This patch corrects it.
> 
> CC: Mel Gorman <mgorman@suse.de>
> CC: Johannes Weiner <jweiner@redhat.com>
> CC: Rik van Riel <riel@redhat.com>
> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
  2011-11-12 16:37   ` Minchan Kim
@ 2011-08-31 11:19     ` Johannes Weiner
  -1 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 11:19 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> which doesn't consider unevicatable page. It has been used by just lumpy so
> it was pointless that it isolates unevictable page. But the situation is
> changed. Compaction could handle unevictable page and it can help getting
> big contiguos pages in fragment memory by many pinned page with mlock.

This may result in applications unexpectedly faulting and waiting on
mlocked pages under migration.  I wonder how realtime people feel
about that?

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
@ 2011-08-31 11:19     ` Johannes Weiner
  0 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 11:19 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> which doesn't consider unevicatable page. It has been used by just lumpy so
> it was pointless that it isolates unevictable page. But the situation is
> changed. Compaction could handle unevictable page and it can help getting
> big contiguos pages in fragment memory by many pinned page with mlock.

This may result in applications unexpectedly faulting and waiting on
mlocked pages under migration.  I wonder how realtime people feel
about that?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
  2011-11-12 16:37   ` Minchan Kim
@ 2011-08-31 11:37     ` Johannes Weiner
  -1 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 11:37 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> I saw the following accouting of compaction during test of the series.
> 
> compact_blocks_moved 251
> compact_pages_moved 44
> 
> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> but doesn't increased compact_pages_moved.
> 
> This patch makes accouting of compaction works only in case of success of isolation.
> 
> CC: Mel Gorman <mgorman@suse.de>
> CC: Johannes Weiner <jweiner@redhat.com>
> CC: Rik van Riel <riel@redhat.com>
> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>

It's a teensy-bit awkward that isolate_migratepages() can return
success without actually isolating any new pages, just because there
are still some pages left from a previous run (cc->nr_migratepages is
maintained across isolation calls).

Maybe isolate_migratepages() should just return an error if compaction
should really be aborted and 0 otherwise, and have compact_zone()
always check for cc->nr_migratepages itself?

	if (isolate_migratepages(zone, cc) < 0) {
		ret = COMPACT_PARTIAL;
		goto out;
	}

	if (!cc->nr_migratepages)
		continue;

	...

Just a nit-pick, though.  If you don't agree, just leave it as is.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
@ 2011-08-31 11:37     ` Johannes Weiner
  0 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 11:37 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> I saw the following accouting of compaction during test of the series.
> 
> compact_blocks_moved 251
> compact_pages_moved 44
> 
> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> but doesn't increased compact_pages_moved.
> 
> This patch makes accouting of compaction works only in case of success of isolation.
> 
> CC: Mel Gorman <mgorman@suse.de>
> CC: Johannes Weiner <jweiner@redhat.com>
> CC: Rik van Riel <riel@redhat.com>
> Signed-off-by: Minchan Kim <minchan.kim@gmail.com>

Acked-by: Johannes Weiner <jweiner@redhat.com>

It's a teensy-bit awkward that isolate_migratepages() can return
success without actually isolating any new pages, just because there
are still some pages left from a previous run (cc->nr_migratepages is
maintained across isolation calls).

Maybe isolate_migratepages() should just return an error if compaction
should really be aborted and 0 otherwise, and have compact_zone()
always check for cc->nr_migratepages itself?

	if (isolate_migratepages(zone, cc) < 0) {
		ret = COMPACT_PARTIAL;
		goto out;
	}

	if (!cc->nr_migratepages)
		continue;

	...

Just a nit-pick, though.  If you don't agree, just leave it as is.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
  2011-08-31 11:19     ` Johannes Weiner
@ 2011-08-31 14:41       ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-31 14:41 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
> On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> > which doesn't consider unevicatable page. It has been used by just lumpy so
> > it was pointless that it isolates unevictable page. But the situation is
> > changed. Compaction could handle unevictable page and it can help getting
> > big contiguos pages in fragment memory by many pinned page with mlock.
> 
> This may result in applications unexpectedly faulting and waiting on
> mlocked pages under migration.  I wonder how realtime people feel
> about that?

I didn't consider it but it's very important point.
The migrate_page can call pageout on dirty page so RT process could wait on the
mlocked page during very long time.
I can mitigate it with isolating mlocked page in case of !sync but still we can't
guarantee the time because we can't know how many vmas point the page so that try_to_unmap
could spend lots of time.

We can think it's a trade off between high order allocation VS RT latency.
Now I am biasing toward RT latency as considering mlock man page.

Any thoughts?

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
@ 2011-08-31 14:41       ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-31 14:41 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
> On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> > which doesn't consider unevicatable page. It has been used by just lumpy so
> > it was pointless that it isolates unevictable page. But the situation is
> > changed. Compaction could handle unevictable page and it can help getting
> > big contiguos pages in fragment memory by many pinned page with mlock.
> 
> This may result in applications unexpectedly faulting and waiting on
> mlocked pages under migration.  I wonder how realtime people feel
> about that?

I didn't consider it but it's very important point.
The migrate_page can call pageout on dirty page so RT process could wait on the
mlocked page during very long time.
I can mitigate it with isolating mlocked page in case of !sync but still we can't
guarantee the time because we can't know how many vmas point the page so that try_to_unmap
could spend lots of time.

We can think it's a trade off between high order allocation VS RT latency.
Now I am biasing toward RT latency as considering mlock man page.

Any thoughts?

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
  2011-08-31 11:37     ` Johannes Weiner
@ 2011-08-31 14:56       ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-31 14:56 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Wed, Aug 31, 2011 at 01:37:10PM +0200, Johannes Weiner wrote:
> On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> > I saw the following accouting of compaction during test of the series.
> > 
> > compact_blocks_moved 251
> > compact_pages_moved 44
> > 
> > It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> > but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> > isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> > but doesn't increased compact_pages_moved.
> > 
> > This patch makes accouting of compaction works only in case of success of isolation.
> > 
> > CC: Mel Gorman <mgorman@suse.de>
> > CC: Johannes Weiner <jweiner@redhat.com>
> > CC: Rik van Riel <riel@redhat.com>
> > Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
> 
> Acked-by: Johannes Weiner <jweiner@redhat.com>

Thanks, Hannes.

> 
> It's a teensy-bit awkward that isolate_migratepages() can return
> success without actually isolating any new pages, just because there
> are still some pages left from a previous run (cc->nr_migratepages is
> maintained across isolation calls).

If migrate_pages fails, we reset cc->nr_migratepages to zero in compact_zone.
Am I missing something?

> 
> Maybe isolate_migratepages() should just return an error if compaction
> should really be aborted and 0 otherwise, and have compact_zone()
> always check for cc->nr_migratepages itself?
> 
> 	if (isolate_migratepages(zone, cc) < 0) {
> 		ret = COMPACT_PARTIAL;
> 		goto out;
> 	}
> 
> 	if (!cc->nr_migratepages)
> 		continue;
> 
> 	...
> 
> Just a nit-pick, though.  If you don't agree, just leave it as is.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
@ 2011-08-31 14:56       ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-31 14:56 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Wed, Aug 31, 2011 at 01:37:10PM +0200, Johannes Weiner wrote:
> On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> > I saw the following accouting of compaction during test of the series.
> > 
> > compact_blocks_moved 251
> > compact_pages_moved 44
> > 
> > It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> > but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> > isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> > but doesn't increased compact_pages_moved.
> > 
> > This patch makes accouting of compaction works only in case of success of isolation.
> > 
> > CC: Mel Gorman <mgorman@suse.de>
> > CC: Johannes Weiner <jweiner@redhat.com>
> > CC: Rik van Riel <riel@redhat.com>
> > Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
> 
> Acked-by: Johannes Weiner <jweiner@redhat.com>

Thanks, Hannes.

> 
> It's a teensy-bit awkward that isolate_migratepages() can return
> success without actually isolating any new pages, just because there
> are still some pages left from a previous run (cc->nr_migratepages is
> maintained across isolation calls).

If migrate_pages fails, we reset cc->nr_migratepages to zero in compact_zone.
Am I missing something?

> 
> Maybe isolate_migratepages() should just return an error if compaction
> should really be aborted and 0 otherwise, and have compact_zone()
> always check for cc->nr_migratepages itself?
> 
> 	if (isolate_migratepages(zone, cc) < 0) {
> 		ret = COMPACT_PARTIAL;
> 		goto out;
> 	}
> 
> 	if (!cc->nr_migratepages)
> 		continue;
> 
> 	...
> 
> Just a nit-pick, though.  If you don't agree, just leave it as is.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
  2011-08-31 14:56       ` Minchan Kim
@ 2011-08-31 15:03         ` Johannes Weiner
  -1 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 15:03 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Wed, Aug 31, 2011 at 11:56:51PM +0900, Minchan Kim wrote:
> On Wed, Aug 31, 2011 at 01:37:10PM +0200, Johannes Weiner wrote:
> > It's a teensy-bit awkward that isolate_migratepages() can return
> > success without actually isolating any new pages, just because there
> > are still some pages left from a previous run (cc->nr_migratepages is
> > maintained across isolation calls).
> 
> If migrate_pages fails, we reset cc->nr_migratepages to zero in compact_zone.
> Am I missing something?

Brainfart on my side, sorry.  It's all good, then :)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
@ 2011-08-31 15:03         ` Johannes Weiner
  0 siblings, 0 replies; 48+ messages in thread
From: Johannes Weiner @ 2011-08-31 15:03 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Mel Gorman, Rik van Riel

On Wed, Aug 31, 2011 at 11:56:51PM +0900, Minchan Kim wrote:
> On Wed, Aug 31, 2011 at 01:37:10PM +0200, Johannes Weiner wrote:
> > It's a teensy-bit awkward that isolate_migratepages() can return
> > success without actually isolating any new pages, just because there
> > are still some pages left from a previous run (cc->nr_migratepages is
> > maintained across isolation calls).
> 
> If migrate_pages fails, we reset cc->nr_migratepages to zero in compact_zone.
> Am I missing something?

Brainfart on my side, sorry.  It's all good, then :)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
  2011-11-12 16:37   ` Minchan Kim
@ 2011-09-01 13:05     ` Mel Gorman
  -1 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-01 13:05 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:41AM +0900, Minchan Kim wrote:
> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
> made a mistake on the bitwise type.
> 

Minor nit, commit c1e8b0ae8 does not exist anywhere. I suspect you
are looking at a git tree generated from the mmotm quilt series. It
would be easier if you had said "This patch should be merged with
mm-change-isolate-mode-from-define-to-bitwise-type.patch in mmotm".

Otherwise, looks ok.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
@ 2011-09-01 13:05     ` Mel Gorman
  0 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-01 13:05 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:41AM +0900, Minchan Kim wrote:
> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
> made a mistake on the bitwise type.
> 

Minor nit, commit c1e8b0ae8 does not exist anywhere. I suspect you
are looking at a git tree generated from the mmotm quilt series. It
would be easier if you had said "This patch should be merged with
mm-change-isolate-mode-from-define-to-bitwise-type.patch in mmotm".

Otherwise, looks ok.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
  2011-08-31 14:41       ` Minchan Kim
@ 2011-09-01 14:02         ` Mel Gorman
  -1 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-01 14:02 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Johannes Weiner, Andrew Morton, linux-mm, LKML, Rik van Riel

On Wed, Aug 31, 2011 at 11:41:50PM +0900, Minchan Kim wrote:
> On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
> > On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> > > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> > > which doesn't consider unevicatable page. It has been used by just lumpy so
> > > it was pointless that it isolates unevictable page. But the situation is
> > > changed. Compaction could handle unevictable page and it can help getting
> > > big contiguos pages in fragment memory by many pinned page with mlock.
> > 
> > This may result in applications unexpectedly faulting and waiting on
> > mlocked pages under migration.  I wonder how realtime people feel
> > about that?
> 
> I didn't consider it but it's very important point.
> The migrate_page can call pageout on dirty page so RT process could wait on the
> mlocked page during very long time.

On the plus side, the filesystem that is likely to suffer from this
is btrfs. The other important cases avoid the writeout.

> I can mitigate it with isolating mlocked page in case of !sync but still we can't
> guarantee the time because we can't know how many vmas point the page so that try_to_unmap
> could spend lots of time.
> 

This loss of guarantee arguably violates POSIX 1B as part of the
real-time extension. The wording is "The function mlock shall cause
those whole pages containing any part of the address space of the
process starting at address addr and continuing for len bytes to be
memory resident until unlocked or until the process exits or execs
another process image."

It defines locking as "memory locking guarantees the residence of
portions of the address space. It is implementation defined whether
locking memory guarantees fixed translation between virtual addresses
(as seen by the process) and physical addresses."

As it's up to the implementation whether to preserve the physical
page mapping, it's allowed for compaction to move that page. However,
as it mlock is recommended for use by time-critical applications,
I fear we would be breaking developer expectations on the behaviour
of mlock even if it is permitted by POSIX.

> We can think it's a trade off between high order allocation VS RT latency.
> Now I am biasing toward RT latency as considering mlock man page.
> 
> Any thoughts?
> 

At the very least it should not be the default behaviour. I do not have
suggestions on how it could be enabled though. It's a bit obscure to
have as a kernel parameter or even a proc tunable and it's not a perfect
for /sys/kernel/mm/transparent_hugepage/defrag either.

How big of a problem is it that mlocked pages are not compacted at the
moment?

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
@ 2011-09-01 14:02         ` Mel Gorman
  0 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-01 14:02 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Johannes Weiner, Andrew Morton, linux-mm, LKML, Rik van Riel

On Wed, Aug 31, 2011 at 11:41:50PM +0900, Minchan Kim wrote:
> On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
> > On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> > > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> > > which doesn't consider unevicatable page. It has been used by just lumpy so
> > > it was pointless that it isolates unevictable page. But the situation is
> > > changed. Compaction could handle unevictable page and it can help getting
> > > big contiguos pages in fragment memory by many pinned page with mlock.
> > 
> > This may result in applications unexpectedly faulting and waiting on
> > mlocked pages under migration.  I wonder how realtime people feel
> > about that?
> 
> I didn't consider it but it's very important point.
> The migrate_page can call pageout on dirty page so RT process could wait on the
> mlocked page during very long time.

On the plus side, the filesystem that is likely to suffer from this
is btrfs. The other important cases avoid the writeout.

> I can mitigate it with isolating mlocked page in case of !sync but still we can't
> guarantee the time because we can't know how many vmas point the page so that try_to_unmap
> could spend lots of time.
> 

This loss of guarantee arguably violates POSIX 1B as part of the
real-time extension. The wording is "The function mlock shall cause
those whole pages containing any part of the address space of the
process starting at address addr and continuing for len bytes to be
memory resident until unlocked or until the process exits or execs
another process image."

It defines locking as "memory locking guarantees the residence of
portions of the address space. It is implementation defined whether
locking memory guarantees fixed translation between virtual addresses
(as seen by the process) and physical addresses."

As it's up to the implementation whether to preserve the physical
page mapping, it's allowed for compaction to move that page. However,
as it mlock is recommended for use by time-critical applications,
I fear we would be breaking developer expectations on the behaviour
of mlock even if it is permitted by POSIX.

> We can think it's a trade off between high order allocation VS RT latency.
> Now I am biasing toward RT latency as considering mlock man page.
> 
> Any thoughts?
> 

At the very least it should not be the default behaviour. I do not have
suggestions on how it could be enabled though. It's a bit obscure to
have as a kernel parameter or even a proc tunable and it's not a perfect
for /sys/kernel/mm/transparent_hugepage/defrag either.

How big of a problem is it that mlocked pages are not compacted at the
moment?

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
  2011-11-12 16:37   ` Minchan Kim
@ 2011-09-01 14:20     ` Mel Gorman
  -1 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-01 14:20 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> I saw the following accouting of compaction during test of the series.

s/accouting/accounting/ both here and in the subject. A nicer name the
patch would have been

"mm: compaction: Only update compact_blocks_moved if compaction was successful"

> 
> compact_blocks_moved 251
> compact_pages_moved 44
> 
> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> but doesn't increased compact_pages_moved.
> 
> This patch makes accouting of compaction works only in case of success of isolation.
> 

compact_blocks_moved exists to indicate the rate compaction is
scanning pageblocks. If compact_blocks_moved and compact_pages_moved
are increasing at a similar rate for example, it could imply that
compaction is doing a lot of scanning but is not necessarily useful
work. It's not necessarily reflected by compact_fail because that
counter is only updated for pages that were isolated from the LRU.

I now recognise of course that "compact_blocks_moved" was an *awful*
choice of name for this stat.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
@ 2011-09-01 14:20     ` Mel Gorman
  0 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-01 14:20 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> I saw the following accouting of compaction during test of the series.

s/accouting/accounting/ both here and in the subject. A nicer name the
patch would have been

"mm: compaction: Only update compact_blocks_moved if compaction was successful"

> 
> compact_blocks_moved 251
> compact_pages_moved 44
> 
> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> but doesn't increased compact_pages_moved.
> 
> This patch makes accouting of compaction works only in case of success of isolation.
> 

compact_blocks_moved exists to indicate the rate compaction is
scanning pageblocks. If compact_blocks_moved and compact_pages_moved
are increasing at a similar rate for example, it could imply that
compaction is doing a lot of scanning but is not necessarily useful
work. It's not necessarily reflected by compact_fail because that
counter is only updated for pages that were isolated from the LRU.

I now recognise of course that "compact_blocks_moved" was an *awful*
choice of name for this stat.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
  2011-09-01 13:05     ` Mel Gorman
@ 2011-09-02  3:29       ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-09-02  3:29 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Thu, Sep 1, 2011 at 10:05 PM, Mel Gorman <mgorman@suse.de> wrote:
> On Sun, Nov 13, 2011 at 01:37:41AM +0900, Minchan Kim wrote:
>> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
>> made a mistake on the bitwise type.
>>
>
> Minor nit, commit c1e8b0ae8 does not exist anywhere. I suspect you
> are looking at a git tree generated from the mmotm quilt series. It
> would be easier if you had said "This patch should be merged with
> mm-change-isolate-mode-from-define-to-bitwise-type.patch in mmotm".

Right you are. I did it by habit without realizing just quilt series. :(
For preventing such bad thing, I always use [git commit id, patch
name] together these day. Fortunately, it seems to be effective. But
unfortunately, I couldn't prevent your wasted time which spend until
notice that.

I will be careful in the future.

>
> Otherwise, looks ok.

Thanks, Mel.
-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/3] Correct isolate_mode_t bitwise type
@ 2011-09-02  3:29       ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-09-02  3:29 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Thu, Sep 1, 2011 at 10:05 PM, Mel Gorman <mgorman@suse.de> wrote:
> On Sun, Nov 13, 2011 at 01:37:41AM +0900, Minchan Kim wrote:
>> [c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
>> made a mistake on the bitwise type.
>>
>
> Minor nit, commit c1e8b0ae8 does not exist anywhere. I suspect you
> are looking at a git tree generated from the mmotm quilt series. It
> would be easier if you had said "This patch should be merged with
> mm-change-isolate-mode-from-define-to-bitwise-type.patch in mmotm".

Right you are. I did it by habit without realizing just quilt series. :(
For preventing such bad thing, I always use [git commit id, patch
name] together these day. Fortunately, it seems to be effective. But
unfortunately, I couldn't prevent your wasted time which spend until
notice that.

I will be careful in the future.

>
> Otherwise, looks ok.

Thanks, Mel.
-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
  2011-09-01 14:02         ` Mel Gorman
@ 2011-09-02  4:48           ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-09-02  4:48 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Johannes Weiner, Andrew Morton, linux-mm, LKML, Rik van Riel

On Thu, Sep 1, 2011 at 11:02 PM, Mel Gorman <mgorman@suse.de> wrote:
> On Wed, Aug 31, 2011 at 11:41:50PM +0900, Minchan Kim wrote:
>> On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
>> > On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
>> > > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
>> > > which doesn't consider unevicatable page. It has been used by just lumpy so
>> > > it was pointless that it isolates unevictable page. But the situation is
>> > > changed. Compaction could handle unevictable page and it can help getting
>> > > big contiguos pages in fragment memory by many pinned page with mlock.
>> >
>> > This may result in applications unexpectedly faulting and waiting on
>> > mlocked pages under migration.  I wonder how realtime people feel
>> > about that?
>>
>> I didn't consider it but it's very important point.
>> The migrate_page can call pageout on dirty page so RT process could wait on the
>> mlocked page during very long time.
>
> On the plus side, the filesystem that is likely to suffer from this
> is btrfs. The other important cases avoid the writeout.

You mean only btrfs does write in reclaim context?
It's the oppose I have known.
Anyway, it would be mitigated if we merge your patch "Reduce
filesystem writeback from page reclaim".
But we have other problem as I said.
We can suffer from unmapping page of many ptes.

>
>> I can mitigate it with isolating mlocked page in case of !sync but still we can't
>> guarantee the time because we can't know how many vmas point the page so that try_to_unmap
>> could spend lots of time.
>>
>
> This loss of guarantee arguably violates POSIX 1B as part of the
> real-time extension. The wording is "The function mlock shall cause
> those whole pages containing any part of the address space of the
> process starting at address addr and continuing for len bytes to be
> memory resident until unlocked or until the process exits or execs
> another process image."
>
> It defines locking as "memory locking guarantees the residence of
> portions of the address space. It is implementation defined whether
> locking memory guarantees fixed translation between virtual addresses
> (as seen by the process) and physical addresses."
>
> As it's up to the implementation whether to preserve the physical
> page mapping, it's allowed for compaction to move that page. However,
> as it mlock is recommended for use by time-critical applications,
> I fear we would be breaking developer expectations on the behaviour
> of mlock even if it is permitted by POSIX.

Agree.

>
>> We can think it's a trade off between high order allocation VS RT latency.
>> Now I am biasing toward RT latency as considering mlock man page.
>>
>> Any thoughts?
>>
>
> At the very least it should not be the default behaviour. I do not have
> suggestions on how it could be enabled though. It's a bit obscure to
> have as a kernel parameter or even a proc tunable and it's not a perfect
> for /sys/kernel/mm/transparent_hugepage/defrag either.
>
> How big of a problem is it that mlocked pages are not compacted at the
> moment?

I found it by just code review and didn't see any reports about that.
But it is quite possible that someone calls mlock with small request sparsely.
And logically, compaction could be a feature to solve it if user
endures the pain.
(But still, I am not sure how many of user on mlock can bear it)

We can solve a bit that by another approach if it's really problem
with RT processes. The another approach is to separate mlocked pages
with allocation time like below pseudo patch which just show the
concept)

ex)
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 3a93f73..8ae2e60 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -175,7 +175,8 @@ static inline struct page *
 alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
                                        unsigned long vaddr)
 {
-       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
+       gfp_t gfp_flag = vma->vm_flags & VM_LCOKED ? 0 : __GFP_MOVABLE;
+       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
 }

But it's a solution about newly allocated page on mlocked vma.
Old pages in the VMA is still a problem.
We can solve it at mlock system call through migrating the pages to
UNMOVABLE block.

What we need is just VOC. Who know there are such systems which call
mlock call frequently with small pages?
If any customer doesn't require it strongly, I can drop this patch.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
@ 2011-09-02  4:48           ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-09-02  4:48 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Johannes Weiner, Andrew Morton, linux-mm, LKML, Rik van Riel

On Thu, Sep 1, 2011 at 11:02 PM, Mel Gorman <mgorman@suse.de> wrote:
> On Wed, Aug 31, 2011 at 11:41:50PM +0900, Minchan Kim wrote:
>> On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
>> > On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
>> > > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
>> > > which doesn't consider unevicatable page. It has been used by just lumpy so
>> > > it was pointless that it isolates unevictable page. But the situation is
>> > > changed. Compaction could handle unevictable page and it can help getting
>> > > big contiguos pages in fragment memory by many pinned page with mlock.
>> >
>> > This may result in applications unexpectedly faulting and waiting on
>> > mlocked pages under migration.  I wonder how realtime people feel
>> > about that?
>>
>> I didn't consider it but it's very important point.
>> The migrate_page can call pageout on dirty page so RT process could wait on the
>> mlocked page during very long time.
>
> On the plus side, the filesystem that is likely to suffer from this
> is btrfs. The other important cases avoid the writeout.

You mean only btrfs does write in reclaim context?
It's the oppose I have known.
Anyway, it would be mitigated if we merge your patch "Reduce
filesystem writeback from page reclaim".
But we have other problem as I said.
We can suffer from unmapping page of many ptes.

>
>> I can mitigate it with isolating mlocked page in case of !sync but still we can't
>> guarantee the time because we can't know how many vmas point the page so that try_to_unmap
>> could spend lots of time.
>>
>
> This loss of guarantee arguably violates POSIX 1B as part of the
> real-time extension. The wording is "The function mlock shall cause
> those whole pages containing any part of the address space of the
> process starting at address addr and continuing for len bytes to be
> memory resident until unlocked or until the process exits or execs
> another process image."
>
> It defines locking as "memory locking guarantees the residence of
> portions of the address space. It is implementation defined whether
> locking memory guarantees fixed translation between virtual addresses
> (as seen by the process) and physical addresses."
>
> As it's up to the implementation whether to preserve the physical
> page mapping, it's allowed for compaction to move that page. However,
> as it mlock is recommended for use by time-critical applications,
> I fear we would be breaking developer expectations on the behaviour
> of mlock even if it is permitted by POSIX.

Agree.

>
>> We can think it's a trade off between high order allocation VS RT latency.
>> Now I am biasing toward RT latency as considering mlock man page.
>>
>> Any thoughts?
>>
>
> At the very least it should not be the default behaviour. I do not have
> suggestions on how it could be enabled though. It's a bit obscure to
> have as a kernel parameter or even a proc tunable and it's not a perfect
> for /sys/kernel/mm/transparent_hugepage/defrag either.
>
> How big of a problem is it that mlocked pages are not compacted at the
> moment?

I found it by just code review and didn't see any reports about that.
But it is quite possible that someone calls mlock with small request sparsely.
And logically, compaction could be a feature to solve it if user
endures the pain.
(But still, I am not sure how many of user on mlock can bear it)

We can solve a bit that by another approach if it's really problem
with RT processes. The another approach is to separate mlocked pages
with allocation time like below pseudo patch which just show the
concept)

ex)
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 3a93f73..8ae2e60 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -175,7 +175,8 @@ static inline struct page *
 alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
                                        unsigned long vaddr)
 {
-       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
+       gfp_t gfp_flag = vma->vm_flags & VM_LCOKED ? 0 : __GFP_MOVABLE;
+       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
 }

But it's a solution about newly allocated page on mlocked vma.
Old pages in the VMA is still a problem.
We can solve it at mlock system call through migrating the pages to
UNMOVABLE block.

What we need is just VOC. Who know there are such systems which call
mlock call frequently with small pages?
If any customer doesn't require it strongly, I can drop this patch.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
  2011-09-01 14:20     ` Mel Gorman
@ 2011-09-02  5:09       ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-09-02  5:09 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Thu, Sep 1, 2011 at 11:20 PM, Mel Gorman <mgorman@suse.de> wrote:
> On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
>> I saw the following accouting of compaction during test of the series.
>
> s/accouting/accounting/ both here and in the subject. A nicer name the
> patch would have been
>
> "mm: compaction: Only update compact_blocks_moved if compaction was successful"

Thanks, I will fix it at next version. :)

>
>>
>> compact_blocks_moved 251
>> compact_pages_moved 44
>>
>> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
>> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
>> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
>> but doesn't increased compact_pages_moved.
>>
>> This patch makes accouting of compaction works only in case of success of isolation.
>>
>
> compact_blocks_moved exists to indicate the rate compaction is
> scanning pageblocks. If compact_blocks_moved and compact_pages_moved
> are increasing at a similar rate for example, it could imply that
> compaction is doing a lot of scanning but is not necessarily useful
> work. It's not necessarily reflected by compact_fail because that
> counter is only updated for pages that were isolated from the LRU.

You seem to say "compact_pagemigrate_failed" not "compact_fail".

>
> I now recognise of course that "compact_blocks_moved" was an *awful*
> choice of name for this stat.

I hope changing stat names as follows unless it's too late(ie, it
doesn't break ABI with any tools)

compact_blocks_moved -> compact_blocks
compact_pages_moved -> compact_pgmigrated_success
compact_pagemigrate_failed -> compact_pgmigrated_fail
compact_stall -> compact_alloc_stall
compact_fail -> compact_alloc_fail
compact_success -> compact_alloc_success


>
> --
> Mel Gorman
> SUSE Labs
>



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
@ 2011-09-02  5:09       ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-09-02  5:09 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Thu, Sep 1, 2011 at 11:20 PM, Mel Gorman <mgorman@suse.de> wrote:
> On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
>> I saw the following accouting of compaction during test of the series.
>
> s/accouting/accounting/ both here and in the subject. A nicer name the
> patch would have been
>
> "mm: compaction: Only update compact_blocks_moved if compaction was successful"

Thanks, I will fix it at next version. :)

>
>>
>> compact_blocks_moved 251
>> compact_pages_moved 44
>>
>> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
>> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
>> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
>> but doesn't increased compact_pages_moved.
>>
>> This patch makes accouting of compaction works only in case of success of isolation.
>>
>
> compact_blocks_moved exists to indicate the rate compaction is
> scanning pageblocks. If compact_blocks_moved and compact_pages_moved
> are increasing at a similar rate for example, it could imply that
> compaction is doing a lot of scanning but is not necessarily useful
> work. It's not necessarily reflected by compact_fail because that
> counter is only updated for pages that were isolated from the LRU.

You seem to say "compact_pagemigrate_failed" not "compact_fail".

>
> I now recognise of course that "compact_blocks_moved" was an *awful*
> choice of name for this stat.

I hope changing stat names as follows unless it's too late(ie, it
doesn't break ABI with any tools)

compact_blocks_moved -> compact_blocks
compact_pages_moved -> compact_pgmigrated_success
compact_pagemigrate_failed -> compact_pgmigrated_fail
compact_stall -> compact_alloc_stall
compact_fail -> compact_alloc_fail
compact_success -> compact_alloc_success


>
> --
> Mel Gorman
> SUSE Labs
>



-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
  2011-09-02  4:48           ` Minchan Kim
@ 2011-09-02 13:34             ` Mel Gorman
  -1 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-02 13:34 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Johannes Weiner, Andrew Morton, linux-mm, LKML, Rik van Riel

On Fri, Sep 02, 2011 at 01:48:54PM +0900, Minchan Kim wrote:
> On Thu, Sep 1, 2011 at 11:02 PM, Mel Gorman <mgorman@suse.de> wrote:
> > On Wed, Aug 31, 2011 at 11:41:50PM +0900, Minchan Kim wrote:
> >> On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
> >> > On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> >> > > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> >> > > which doesn't consider unevicatable page. It has been used by just lumpy so
> >> > > it was pointless that it isolates unevictable page. But the situation is
> >> > > changed. Compaction could handle unevictable page and it can help getting
> >> > > big contiguos pages in fragment memory by many pinned page with mlock.
> >> >
> >> > This may result in applications unexpectedly faulting and waiting on
> >> > mlocked pages under migration.  I wonder how realtime people feel
> >> > about that?
> >>
> >> I didn't consider it but it's very important point.
> >> The migrate_page can call pageout on dirty page so RT process could wait on the
> >> mlocked page during very long time.
> >
> > On the plus side, the filesystem that is likely to suffer from this
> > is btrfs. The other important cases avoid the writeout.
> 
> You mean only btrfs does write in reclaim context?

In compaction context. It ultimately uses fallback_migrate_page
because btrfs_extent_io_ops lacks a migratepage hook.

> >> I can mitigate it with isolating mlocked page in case of !sync but still we can't
> >> guarantee the time because we can't know how many vmas point the page so that try_to_unmap
> >> could spend lots of time.
> >>
> >
> > This loss of guarantee arguably violates POSIX 1B as part of the
> > real-time extension. The wording is "The function mlock shall cause
> > those whole pages containing any part of the address space of the
> > process starting at address addr and continuing for len bytes to be
> > memory resident until unlocked or until the process exits or execs
> > another process image."
> >
> > It defines locking as "memory locking guarantees the residence of
> > portions of the address space. It is implementation defined whether
> > locking memory guarantees fixed translation between virtual addresses
> > (as seen by the process) and physical addresses."
> >
> > As it's up to the implementation whether to preserve the physical
> > page mapping, it's allowed for compaction to move that page. However,
> > as it mlock is recommended for use by time-critical applications,
> > I fear we would be breaking developer expectations on the behaviour
> > of mlock even if it is permitted by POSIX.
> 
> Agree.
> 
> >
> >> We can think it's a trade off between high order allocation VS RT latency.
> >> Now I am biasing toward RT latency as considering mlock man page.
> >>
> >> Any thoughts?
> >>
> >
> > At the very least it should not be the default behaviour. I do not have
> > suggestions on how it could be enabled though. It's a bit obscure to
> > have as a kernel parameter or even a proc tunable and it's not a perfect
> > for /sys/kernel/mm/transparent_hugepage/defrag either.
> >
> > How big of a problem is it that mlocked pages are not compacted at the
> > moment?
> 
> I found it by just code review and didn't see any reports about that.
> But it is quite possible that someone calls mlock with small request sparsely.

This is done for security-sensitive applications to avoid any
possibility that information would leak to swap by accident. Consider
for example a gpg passphrase being written to swap. It's why users are
allowed to mlock a very small amount of memory.

I would expect these pages to only be locked for a very short time.

> And logically, compaction could be a feature to solve it if user
> endures the pain.
> (But still, I am not sure how many of user on mlock can bear it)
> 
> We can solve a bit that by another approach if it's really problem
> with RT processes. The another approach is to separate mlocked pages
> with allocation time like below pseudo patch which just show the
> concept)
> 
> ex)
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 3a93f73..8ae2e60 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -175,7 +175,8 @@ static inline struct page *
>  alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>                                         unsigned long vaddr)
>  {
> -       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
> +       gfp_t gfp_flag = vma->vm_flags & VM_LCOKED ? 0 : __GFP_MOVABLE;
> +       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
>  }
> 
> But it's a solution about newly allocated page on mlocked vma.
> Old pages in the VMA is still a problem.

Agreed, and because of this, I think it would only help a small number
of cases.

> We can solve it at mlock system call through migrating the pages to
> UNMOVABLE block.
> 

That's an interesting proposal.

> What we need is just VOC. Who know there are such systems which call
> mlock call frequently with small pages?

The security-sensitive applications are the only ones I know of that
mlock small amounts but the locking is very short-lived. I'm not aware
of other examples.

> If any customer doesn't require it strongly, I can drop this patch.
> 

I'm not aware of anyone suffering from this problem. However, in the
even we find such a case, I like your proposal of migrating pages to
UNMOVABLE blocks at mlock() time as a solution.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/3] compaction: compact unevictable page
@ 2011-09-02 13:34             ` Mel Gorman
  0 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-02 13:34 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Johannes Weiner, Andrew Morton, linux-mm, LKML, Rik van Riel

On Fri, Sep 02, 2011 at 01:48:54PM +0900, Minchan Kim wrote:
> On Thu, Sep 1, 2011 at 11:02 PM, Mel Gorman <mgorman@suse.de> wrote:
> > On Wed, Aug 31, 2011 at 11:41:50PM +0900, Minchan Kim wrote:
> >> On Wed, Aug 31, 2011 at 01:19:54PM +0200, Johannes Weiner wrote:
> >> > On Sun, Nov 13, 2011 at 01:37:42AM +0900, Minchan Kim wrote:
> >> > > Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
> >> > > which doesn't consider unevicatable page. It has been used by just lumpy so
> >> > > it was pointless that it isolates unevictable page. But the situation is
> >> > > changed. Compaction could handle unevictable page and it can help getting
> >> > > big contiguos pages in fragment memory by many pinned page with mlock.
> >> >
> >> > This may result in applications unexpectedly faulting and waiting on
> >> > mlocked pages under migration.  I wonder how realtime people feel
> >> > about that?
> >>
> >> I didn't consider it but it's very important point.
> >> The migrate_page can call pageout on dirty page so RT process could wait on the
> >> mlocked page during very long time.
> >
> > On the plus side, the filesystem that is likely to suffer from this
> > is btrfs. The other important cases avoid the writeout.
> 
> You mean only btrfs does write in reclaim context?

In compaction context. It ultimately uses fallback_migrate_page
because btrfs_extent_io_ops lacks a migratepage hook.

> >> I can mitigate it with isolating mlocked page in case of !sync but still we can't
> >> guarantee the time because we can't know how many vmas point the page so that try_to_unmap
> >> could spend lots of time.
> >>
> >
> > This loss of guarantee arguably violates POSIX 1B as part of the
> > real-time extension. The wording is "The function mlock shall cause
> > those whole pages containing any part of the address space of the
> > process starting at address addr and continuing for len bytes to be
> > memory resident until unlocked or until the process exits or execs
> > another process image."
> >
> > It defines locking as "memory locking guarantees the residence of
> > portions of the address space. It is implementation defined whether
> > locking memory guarantees fixed translation between virtual addresses
> > (as seen by the process) and physical addresses."
> >
> > As it's up to the implementation whether to preserve the physical
> > page mapping, it's allowed for compaction to move that page. However,
> > as it mlock is recommended for use by time-critical applications,
> > I fear we would be breaking developer expectations on the behaviour
> > of mlock even if it is permitted by POSIX.
> 
> Agree.
> 
> >
> >> We can think it's a trade off between high order allocation VS RT latency.
> >> Now I am biasing toward RT latency as considering mlock man page.
> >>
> >> Any thoughts?
> >>
> >
> > At the very least it should not be the default behaviour. I do not have
> > suggestions on how it could be enabled though. It's a bit obscure to
> > have as a kernel parameter or even a proc tunable and it's not a perfect
> > for /sys/kernel/mm/transparent_hugepage/defrag either.
> >
> > How big of a problem is it that mlocked pages are not compacted at the
> > moment?
> 
> I found it by just code review and didn't see any reports about that.
> But it is quite possible that someone calls mlock with small request sparsely.

This is done for security-sensitive applications to avoid any
possibility that information would leak to swap by accident. Consider
for example a gpg passphrase being written to swap. It's why users are
allowed to mlock a very small amount of memory.

I would expect these pages to only be locked for a very short time.

> And logically, compaction could be a feature to solve it if user
> endures the pain.
> (But still, I am not sure how many of user on mlock can bear it)
> 
> We can solve a bit that by another approach if it's really problem
> with RT processes. The another approach is to separate mlocked pages
> with allocation time like below pseudo patch which just show the
> concept)
> 
> ex)
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 3a93f73..8ae2e60 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -175,7 +175,8 @@ static inline struct page *
>  alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
>                                         unsigned long vaddr)
>  {
> -       return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
> +       gfp_t gfp_flag = vma->vm_flags & VM_LCOKED ? 0 : __GFP_MOVABLE;
> +       return __alloc_zeroed_user_highpage(gfp_flag, vma, vaddr);
>  }
> 
> But it's a solution about newly allocated page on mlocked vma.
> Old pages in the VMA is still a problem.

Agreed, and because of this, I think it would only help a small number
of cases.

> We can solve it at mlock system call through migrating the pages to
> UNMOVABLE block.
> 

That's an interesting proposal.

> What we need is just VOC. Who know there are such systems which call
> mlock call frequently with small pages?

The security-sensitive applications are the only ones I know of that
mlock small amounts but the locking is very short-lived. I'm not aware
of other examples.

> If any customer doesn't require it strongly, I can drop this patch.
> 

I'm not aware of anyone suffering from this problem. However, in the
even we find such a case, I like your proposal of migrating pages to
UNMOVABLE blocks at mlock() time as a solution.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
  2011-09-02  5:09       ` Minchan Kim
@ 2011-09-02 13:36         ` Mel Gorman
  -1 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-02 13:36 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Fri, Sep 02, 2011 at 02:09:55PM +0900, Minchan Kim wrote:
> On Thu, Sep 1, 2011 at 11:20 PM, Mel Gorman <mgorman@suse.de> wrote:
> > On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> >> I saw the following accouting of compaction during test of the series.
> >
> > s/accouting/accounting/ both here and in the subject. A nicer name the
> > patch would have been
> >
> > "mm: compaction: Only update compact_blocks_moved if compaction was successful"
> 
> Thanks, I will fix it at next version. :)
> 
> >
> >>
> >> compact_blocks_moved 251
> >> compact_pages_moved 44
> >>
> >> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> >> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> >> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> >> but doesn't increased compact_pages_moved.
> >>
> >> This patch makes accouting of compaction works only in case of success of isolation.
> >>
> >
> > compact_blocks_moved exists to indicate the rate compaction is
> > scanning pageblocks. If compact_blocks_moved and compact_pages_moved
> > are increasing at a similar rate for example, it could imply that
> > compaction is doing a lot of scanning but is not necessarily useful
> > work. It's not necessarily reflected by compact_fail because that
> > counter is only updated for pages that were isolated from the LRU.
> 
> You seem to say "compact_pagemigrate_failed" not "compact_fail".
> 

I did. Thanks.

> >
> > I now recognise of course that "compact_blocks_moved" was an *awful*
> > choice of name for this stat.
> 
> I hope changing stat names as follows unless it's too late(ie, it
> doesn't break ABI with any tools)
> 

I'm not aware of any tools that depend on this except my own reporting
scripts and even those do not particularly care.

> compact_blocks_moved -> compact_blocks

compact_pageblock_scanned?

> compact_pages_moved -> compact_pgmigrated_success
> compact_pagemigrate_failed -> compact_pgmigrated_fail
> compact_stall -> compact_alloc_stall
> compact_fail -> compact_alloc_fail
> compact_success -> compact_alloc_success
> 

Seems reasonable to me.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 3/3] compaction accouting fix
@ 2011-09-02 13:36         ` Mel Gorman
  0 siblings, 0 replies; 48+ messages in thread
From: Mel Gorman @ 2011-09-02 13:36 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Andrew Morton, linux-mm, LKML, Johannes Weiner, Rik van Riel

On Fri, Sep 02, 2011 at 02:09:55PM +0900, Minchan Kim wrote:
> On Thu, Sep 1, 2011 at 11:20 PM, Mel Gorman <mgorman@suse.de> wrote:
> > On Sun, Nov 13, 2011 at 01:37:43AM +0900, Minchan Kim wrote:
> >> I saw the following accouting of compaction during test of the series.
> >
> > s/accouting/accounting/ both here and in the subject. A nicer name the
> > patch would have been
> >
> > "mm: compaction: Only update compact_blocks_moved if compaction was successful"
> 
> Thanks, I will fix it at next version. :)
> 
> >
> >>
> >> compact_blocks_moved 251
> >> compact_pages_moved 44
> >>
> >> It's very awkward to me although it's possbile because it means we try to compact 251 blocks
> >> but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
> >> isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
> >> but doesn't increased compact_pages_moved.
> >>
> >> This patch makes accouting of compaction works only in case of success of isolation.
> >>
> >
> > compact_blocks_moved exists to indicate the rate compaction is
> > scanning pageblocks. If compact_blocks_moved and compact_pages_moved
> > are increasing at a similar rate for example, it could imply that
> > compaction is doing a lot of scanning but is not necessarily useful
> > work. It's not necessarily reflected by compact_fail because that
> > counter is only updated for pages that were isolated from the LRU.
> 
> You seem to say "compact_pagemigrate_failed" not "compact_fail".
> 

I did. Thanks.

> >
> > I now recognise of course that "compact_blocks_moved" was an *awful*
> > choice of name for this stat.
> 
> I hope changing stat names as follows unless it's too late(ie, it
> doesn't break ABI with any tools)
> 

I'm not aware of any tools that depend on this except my own reporting
scripts and even those do not particularly care.

> compact_blocks_moved -> compact_blocks

compact_pageblock_scanned?

> compact_pages_moved -> compact_pgmigrated_success
> compact_pagemigrate_failed -> compact_pgmigrated_fail
> compact_stall -> compact_alloc_stall
> compact_fail -> compact_alloc_fail
> compact_success -> compact_alloc_success
> 

Seems reasonable to me.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/3] Fix compaction about mlocked pages
  2011-11-12 16:37 ` Minchan Kim
@ 2011-10-06 21:54   ` Andrew Morton
  -1 siblings, 0 replies; 48+ messages in thread
From: Andrew Morton @ 2011-10-06 21:54 UTC (permalink / raw)
  To: Minchan Kim; +Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel

On Sun, 13 Nov 2011 01:37:40 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:

> This patch's goal is to enable mlocked page migration.
> The compaction can migrate mlocked page to get a contiguous memory unlike lumpy.
> 

This patch series appears to be a resend of stuff I already have.

Given the various concerns which were voiced during review of
mm-compaction-compact-unevictable-pages.patch and the uncertainty of
the overall usefulness of the feature, I'm inclined to drop

mm-compaction-compact-unevictable-pages.patch
mm-compaction-compact-unevictable-pages-checkpatch-fixes.patch
mm-compaction-accounting-fix.patch

for now, OK?

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/3] Fix compaction about mlocked pages
@ 2011-10-06 21:54   ` Andrew Morton
  0 siblings, 0 replies; 48+ messages in thread
From: Andrew Morton @ 2011-10-06 21:54 UTC (permalink / raw)
  To: Minchan Kim; +Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel

On Sun, 13 Nov 2011 01:37:40 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:

> This patch's goal is to enable mlocked page migration.
> The compaction can migrate mlocked page to get a contiguous memory unlike lumpy.
> 

This patch series appears to be a resend of stuff I already have.

Given the various concerns which were voiced during review of
mm-compaction-compact-unevictable-pages.patch and the uncertainty of
the overall usefulness of the feature, I'm inclined to drop

mm-compaction-compact-unevictable-pages.patch
mm-compaction-compact-unevictable-pages-checkpatch-fixes.patch
mm-compaction-accounting-fix.patch

for now, OK?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/3] Fix compaction about mlocked pages
  2011-10-06 21:54   ` Andrew Morton
@ 2011-10-06 23:07     ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-10-06 23:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel

Hi Andrew,

On Fri, Oct 7, 2011 at 6:54 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Sun, 13 Nov 2011 01:37:40 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
>
>> This patch's goal is to enable mlocked page migration.
>> The compaction can migrate mlocked page to get a contiguous memory unlike lumpy.
>>
>
> This patch series appears to be a resend of stuff I already have.
>
> Given the various concerns which were voiced during review of
> mm-compaction-compact-unevictable-pages.patch and the uncertainty of
> the overall usefulness of the feature, I'm inclined to drop
>
> mm-compaction-compact-unevictable-pages.patch
> mm-compaction-compact-unevictable-pages-checkpatch-fixes.patch
> mm-compaction-accounting-fix.patch
>
> for now, OK?
>

It's okay on mm-compaction-compact-unevictable-pages.patch.
On mm-compaction-accounting-fix.patch, we still need it but I need to
change title as Mel commented out and I will send further patche which
changes stat names(https://lkml.org/lkml/2011/9/2/3) with it. So let's
drop it all.
I will resend further patches after rc-1.

Thanks, Andrew.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/3] Fix compaction about mlocked pages
@ 2011-10-06 23:07     ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-10-06 23:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel

Hi Andrew,

On Fri, Oct 7, 2011 at 6:54 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Sun, 13 Nov 2011 01:37:40 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
>
>> This patch's goal is to enable mlocked page migration.
>> The compaction can migrate mlocked page to get a contiguous memory unlike lumpy.
>>
>
> This patch series appears to be a resend of stuff I already have.
>
> Given the various concerns which were voiced during review of
> mm-compaction-compact-unevictable-pages.patch and the uncertainty of
> the overall usefulness of the feature, I'm inclined to drop
>
> mm-compaction-compact-unevictable-pages.patch
> mm-compaction-compact-unevictable-pages-checkpatch-fixes.patch
> mm-compaction-accounting-fix.patch
>
> for now, OK?
>

It's okay on mm-compaction-compact-unevictable-pages.patch.
On mm-compaction-accounting-fix.patch, we still need it but I need to
change title as Mel commented out and I will send further patche which
changes stat names(https://lkml.org/lkml/2011/9/2/3) with it. So let's
drop it all.
I will resend further patches after rc-1.

Thanks, Andrew.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 0/3] Fix compaction about mlocked pages
@ 2011-11-12 16:37 ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

This patch's goal is to enable mlocked page migration.
The compaction can migrate mlocked page to get a contiguous memory unlike lumpy.

During making this patch, I found my silly bug which
[1/3] which fixes it.
[2/3] enables compaction of mlocked.
[3/3] enhance the accouting of compaction.

Frankly speaking, each patch is orthogonal but I send it by thread as I found them during
making patch on mlocked page compaction.

Minchan Kim (3):
  Correct isolate_mode_t bitwise type
  compaction: compact unevictable page
  compaction accouting fix

 include/linux/mmzone.h |   10 ++++++----
 mm/compaction.c        |   13 +++++++++----
 mm/vmscan.c            |    7 +------
 3 files changed, 16 insertions(+), 14 deletions(-)

--
1.7.6

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 0/3] Fix compaction about mlocked pages
@ 2011-11-12 16:37 ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

This patch's goal is to enable mlocked page migration.
The compaction can migrate mlocked page to get a contiguous memory unlike lumpy.

During making this patch, I found my silly bug which
[1/3] which fixes it.
[2/3] enables compaction of mlocked.
[3/3] enhance the accouting of compaction.

Frankly speaking, each patch is orthogonal but I send it by thread as I found them during
making patch on mlocked page compaction.

Minchan Kim (3):
  Correct isolate_mode_t bitwise type
  compaction: compact unevictable page
  compaction accouting fix

 include/linux/mmzone.h |   10 ++++++----
 mm/compaction.c        |   13 +++++++++----
 mm/vmscan.c            |    7 +------
 3 files changed, 16 insertions(+), 14 deletions(-)

--
1.7.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/3] Correct isolate_mode_t bitwise type
  2011-11-12 16:37 ` Minchan Kim
@ 2011-11-12 16:37   ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

[c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
made a mistake on the bitwise type.

This patch corrects it.

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 include/linux/mmzone.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1ed4116..188cb2f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -166,13 +166,13 @@ static inline int is_unevictable_lru(enum lru_list l)
 #define LRU_ALL	     ((1 << NR_LRU_LISTS) - 1)

 /* Isolate inactive pages */
-#define ISOLATE_INACTIVE	((__force fmode_t)0x1)
+#define ISOLATE_INACTIVE	((__force isolate_mode_t)0x1)
 /* Isolate active pages */
-#define ISOLATE_ACTIVE		((__force fmode_t)0x2)
+#define ISOLATE_ACTIVE		((__force isolate_mode_t)0x2)
 /* Isolate clean file */
-#define ISOLATE_CLEAN		((__force fmode_t)0x4)
+#define ISOLATE_CLEAN		((__force isolate_mode_t)0x4)
 /* Isolate unmapped file */
-#define ISOLATE_UNMAPPED	((__force fmode_t)0x8)
+#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x8)

 /* LRU Isolation modes. */
 typedef unsigned __bitwise__ isolate_mode_t;
--
1.7.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 1/3] Correct isolate_mode_t bitwise type
@ 2011-11-12 16:37   ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

[c1e8b0ae8, mm-change-isolate-mode-from-define-to-bitwise-type]
made a mistake on the bitwise type.

This patch corrects it.

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 include/linux/mmzone.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1ed4116..188cb2f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -166,13 +166,13 @@ static inline int is_unevictable_lru(enum lru_list l)
 #define LRU_ALL	     ((1 << NR_LRU_LISTS) - 1)

 /* Isolate inactive pages */
-#define ISOLATE_INACTIVE	((__force fmode_t)0x1)
+#define ISOLATE_INACTIVE	((__force isolate_mode_t)0x1)
 /* Isolate active pages */
-#define ISOLATE_ACTIVE		((__force fmode_t)0x2)
+#define ISOLATE_ACTIVE		((__force isolate_mode_t)0x2)
 /* Isolate clean file */
-#define ISOLATE_CLEAN		((__force fmode_t)0x4)
+#define ISOLATE_CLEAN		((__force isolate_mode_t)0x4)
 /* Isolate unmapped file */
-#define ISOLATE_UNMAPPED	((__force fmode_t)0x8)
+#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x8)

 /* LRU Isolation modes. */
 typedef unsigned __bitwise__ isolate_mode_t;
--
1.7.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 2/3] compaction: compact unevictable page
  2011-11-12 16:37 ` Minchan Kim
@ 2011-11-12 16:37   ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
which doesn't consider unevicatable page. It has been used by just lumpy so
it was pointless that it isolates unevictable page. But the situation is
changed. Compaction could handle unevictable page and it can help getting
big contiguos pages in fragment memory by many pinned page with mlock.

I tested this patch with following scenario.

1. A : allocate 80% anon pages in system
2. B : allocate 20% mlocked page in system
/* Maybe, mlocked pages are located in low pfn address */
3. kill A /* high pfn address are free */
4. echo 1 > /proc/sys/vm/compact_memory

old:

compact_blocks_moved 251
compact_pages_moved 44

new:

compact_blocks_moved 258
compact_pages_moved 412

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 include/linux/mmzone.h |    6 ++++--
 mm/compaction.c        |    3 ++-
 mm/vmscan.c            |    7 +------
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 188cb2f..82b505e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -169,10 +169,12 @@ static inline int is_unevictable_lru(enum lru_list l)
 #define ISOLATE_INACTIVE	((__force isolate_mode_t)0x1)
 /* Isolate active pages */
 #define ISOLATE_ACTIVE		((__force isolate_mode_t)0x2)
+/* Isolate unevictable pages */
+#define ISOLATE_UNEVICTABLE	((__force isolate_mode_t)0x4)
 /* Isolate clean file */
-#define ISOLATE_CLEAN		((__force isolate_mode_t)0x4)
+#define ISOLATE_CLEAN		((__force isolate_mode_t)0x8)
 /* Isolate unmapped file */
-#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x8)
+#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x10)

 /* LRU Isolation modes. */
 typedef unsigned __bitwise__ isolate_mode_t;
diff --git a/mm/compaction.c b/mm/compaction.c
index a0e4202..0e572d1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -261,7 +261,8 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	unsigned long last_pageblock_nr = 0, pageblock_nr;
	unsigned long nr_scanned = 0, nr_isolated = 0;
	struct list_head *migratelist = &cc->migratepages;
-	isolate_mode_t mode = ISOLATE_ACTIVE|ISOLATE_INACTIVE;
+	isolate_mode_t mode = ISOLATE_ACTIVE| ISOLATE_INACTIVE |
+				ISOLATE_UNEVICTABLE;

	/* Do not scan outside zone boundaries */
	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 23256e8..2300342 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1057,12 +1057,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode, int file)
	if (!all_lru_mode && !!page_is_file_cache(page) != file)
		return ret;

-	/*
-	 * When this function is being called for lumpy reclaim, we
-	 * initially look into all LRU pages, active, inactive and
-	 * unevictable; only give shrink_page_list evictable pages.
-	 */
-	if (PageUnevictable(page))
+	if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE))
		return ret;

	ret = -EBUSY;
--
1.7.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 2/3] compaction: compact unevictable page
@ 2011-11-12 16:37   ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

Now compaction doesn't handle mlocked page as it uses __isolate_lru_page
which doesn't consider unevicatable page. It has been used by just lumpy so
it was pointless that it isolates unevictable page. But the situation is
changed. Compaction could handle unevictable page and it can help getting
big contiguos pages in fragment memory by many pinned page with mlock.

I tested this patch with following scenario.

1. A : allocate 80% anon pages in system
2. B : allocate 20% mlocked page in system
/* Maybe, mlocked pages are located in low pfn address */
3. kill A /* high pfn address are free */
4. echo 1 > /proc/sys/vm/compact_memory

old:

compact_blocks_moved 251
compact_pages_moved 44

new:

compact_blocks_moved 258
compact_pages_moved 412

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 include/linux/mmzone.h |    6 ++++--
 mm/compaction.c        |    3 ++-
 mm/vmscan.c            |    7 +------
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 188cb2f..82b505e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -169,10 +169,12 @@ static inline int is_unevictable_lru(enum lru_list l)
 #define ISOLATE_INACTIVE	((__force isolate_mode_t)0x1)
 /* Isolate active pages */
 #define ISOLATE_ACTIVE		((__force isolate_mode_t)0x2)
+/* Isolate unevictable pages */
+#define ISOLATE_UNEVICTABLE	((__force isolate_mode_t)0x4)
 /* Isolate clean file */
-#define ISOLATE_CLEAN		((__force isolate_mode_t)0x4)
+#define ISOLATE_CLEAN		((__force isolate_mode_t)0x8)
 /* Isolate unmapped file */
-#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x8)
+#define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x10)

 /* LRU Isolation modes. */
 typedef unsigned __bitwise__ isolate_mode_t;
diff --git a/mm/compaction.c b/mm/compaction.c
index a0e4202..0e572d1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -261,7 +261,8 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	unsigned long last_pageblock_nr = 0, pageblock_nr;
	unsigned long nr_scanned = 0, nr_isolated = 0;
	struct list_head *migratelist = &cc->migratepages;
-	isolate_mode_t mode = ISOLATE_ACTIVE|ISOLATE_INACTIVE;
+	isolate_mode_t mode = ISOLATE_ACTIVE| ISOLATE_INACTIVE |
+				ISOLATE_UNEVICTABLE;

	/* Do not scan outside zone boundaries */
	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 23256e8..2300342 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1057,12 +1057,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode, int file)
	if (!all_lru_mode && !!page_is_file_cache(page) != file)
		return ret;

-	/*
-	 * When this function is being called for lumpy reclaim, we
-	 * initially look into all LRU pages, active, inactive and
-	 * unevictable; only give shrink_page_list evictable pages.
-	 */
-	if (PageUnevictable(page))
+	if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE))
		return ret;

	ret = -EBUSY;
--
1.7.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 3/3] compaction accouting fix
  2011-11-12 16:37 ` Minchan Kim
@ 2011-11-12 16:37   ` Minchan Kim
  -1 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

I saw the following accouting of compaction during test of the series.

compact_blocks_moved 251
compact_pages_moved 44

It's very awkward to me although it's possbile because it means we try to compact 251 blocks
but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
but doesn't increased compact_pages_moved.

This patch makes accouting of compaction works only in case of success of isolation.

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 mm/compaction.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 0e572d1..bb3a209 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -260,6 +260,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	unsigned long low_pfn, end_pfn;
	unsigned long last_pageblock_nr = 0, pageblock_nr;
	unsigned long nr_scanned = 0, nr_isolated = 0;
+	isolate_migrate_t ret = ISOLATE_NONE;
	struct list_head *migratelist = &cc->migratepages;
	isolate_mode_t mode = ISOLATE_ACTIVE| ISOLATE_INACTIVE |
				ISOLATE_UNEVICTABLE;
@@ -273,7 +274,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	/* Do not cross the free scanner or scan within a memory hole */
	if (end_pfn > cc->free_pfn || !pfn_valid(low_pfn)) {
		cc->migrate_pfn = end_pfn;
-		return ISOLATE_NONE;
+		return ret;
	}

	/*
@@ -370,14 +371,17 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
			break;
	}

-	acct_isolated(zone, cc);
+	if (cc->nr_migratepages > 0) {
+		acct_isolated(zone, cc);
+		ret = ISOLATE_SUCCESS;
+	}

	spin_unlock_irq(&zone->lru_lock);
	cc->migrate_pfn = low_pfn;

	trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated);

-	return ISOLATE_SUCCESS;
+	return ret;
 }

 /*
--
1.7.6

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 3/3] compaction accouting fix
@ 2011-11-12 16:37   ` Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-11-12 16:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

I saw the following accouting of compaction during test of the series.

compact_blocks_moved 251
compact_pages_moved 44

It's very awkward to me although it's possbile because it means we try to compact 251 blocks
but it just migrated 44 pages. As further investigation, I found isolate_migratepages doesn't
isolate any pages but it returns ISOLATE_SUCCESS and then, it just increases compact_blocks_moved
but doesn't increased compact_pages_moved.

This patch makes accouting of compaction works only in case of success of isolation.

CC: Mel Gorman <mgorman@suse.de>
CC: Johannes Weiner <jweiner@redhat.com>
CC: Rik van Riel <riel@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 mm/compaction.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 0e572d1..bb3a209 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -260,6 +260,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	unsigned long low_pfn, end_pfn;
	unsigned long last_pageblock_nr = 0, pageblock_nr;
	unsigned long nr_scanned = 0, nr_isolated = 0;
+	isolate_migrate_t ret = ISOLATE_NONE;
	struct list_head *migratelist = &cc->migratepages;
	isolate_mode_t mode = ISOLATE_ACTIVE| ISOLATE_INACTIVE |
				ISOLATE_UNEVICTABLE;
@@ -273,7 +274,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
	/* Do not cross the free scanner or scan within a memory hole */
	if (end_pfn > cc->free_pfn || !pfn_valid(low_pfn)) {
		cc->migrate_pfn = end_pfn;
-		return ISOLATE_NONE;
+		return ret;
	}

	/*
@@ -370,14 +371,17 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
			break;
	}

-	acct_isolated(zone, cc);
+	if (cc->nr_migratepages > 0) {
+		acct_isolated(zone, cc);
+		ret = ISOLATE_SUCCESS;
+	}

	spin_unlock_irq(&zone->lru_lock);
	cc->migrate_pfn = low_pfn;

	trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated);

-	return ISOLATE_SUCCESS;
+	return ret;
 }

 /*
--
1.7.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 0/3] Fix compaction about mlocked pages
@ 2011-08-29 16:43 Minchan Kim
  0 siblings, 0 replies; 48+ messages in thread
From: Minchan Kim @ 2011-08-29 16:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, Mel Gorman, Johannes Weiner, Rik van Riel, Minchan Kim

This patch's goal is to enable mlocked page migration.
The compaction can migrate mlocked page to get a contiguous memory unlike lumpy.

During making this patch, I found my silly bug which
[1/3] which fixes it.
[2/3] enables compaction of mlocked.
[3/3] enhance the accouting of compaction.

Frankly speaking, each patch is orthogonal but I send it by thread as I found them during
making patch on mlocked page compaction.

Minchan Kim (3):
  Correct isolate_mode_t bitwise type
  compaction: compact unevictable page
  compaction accouting fix

 include/linux/mmzone.h |   10 ++++++----
 mm/compaction.c        |   13 +++++++++----
 mm/vmscan.c            |    7 +------
 3 files changed, 16 insertions(+), 14 deletions(-)

--
1.7.6

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2011-10-06 23:08 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-12 16:37 [PATCH 0/3] Fix compaction about mlocked pages Minchan Kim
2011-11-12 16:37 ` Minchan Kim
2011-08-29 16:43 ` [PATCH 3/3] compaction accouting fix Minchan Kim
2011-08-29 16:43 ` [PATCH 1/3] Correct isolate_mode_t bitwise type Minchan Kim
2011-08-29 16:43 ` [PATCH 2/3] compaction: compact unevictable page Minchan Kim
2011-10-06 21:54 ` [PATCH 0/3] Fix compaction about mlocked pages Andrew Morton
2011-10-06 21:54   ` Andrew Morton
2011-10-06 23:07   ` Minchan Kim
2011-10-06 23:07     ` Minchan Kim
2011-11-12 16:37 ` [PATCH 1/3] Correct isolate_mode_t bitwise type Minchan Kim
2011-11-12 16:37   ` Minchan Kim
2011-08-30 17:51   ` Rik van Riel
2011-08-30 17:51     ` Rik van Riel
2011-08-31 11:13   ` Johannes Weiner
2011-08-31 11:13     ` Johannes Weiner
2011-09-01 13:05   ` Mel Gorman
2011-09-01 13:05     ` Mel Gorman
2011-09-02  3:29     ` Minchan Kim
2011-09-02  3:29       ` Minchan Kim
2011-11-12 16:37 ` [PATCH 2/3] compaction: compact unevictable page Minchan Kim
2011-11-12 16:37   ` Minchan Kim
2011-08-31  1:09   ` Rik van Riel
2011-08-31  1:09     ` Rik van Riel
2011-08-31 11:19   ` Johannes Weiner
2011-08-31 11:19     ` Johannes Weiner
2011-08-31 14:41     ` Minchan Kim
2011-08-31 14:41       ` Minchan Kim
2011-09-01 14:02       ` Mel Gorman
2011-09-01 14:02         ` Mel Gorman
2011-09-02  4:48         ` Minchan Kim
2011-09-02  4:48           ` Minchan Kim
2011-09-02 13:34           ` Mel Gorman
2011-09-02 13:34             ` Mel Gorman
2011-11-12 16:37 ` [PATCH 3/3] compaction accouting fix Minchan Kim
2011-11-12 16:37   ` Minchan Kim
2011-08-31 11:37   ` Johannes Weiner
2011-08-31 11:37     ` Johannes Weiner
2011-08-31 14:56     ` Minchan Kim
2011-08-31 14:56       ` Minchan Kim
2011-08-31 15:03       ` Johannes Weiner
2011-08-31 15:03         ` Johannes Weiner
2011-09-01 14:20   ` Mel Gorman
2011-09-01 14:20     ` Mel Gorman
2011-09-02  5:09     ` Minchan Kim
2011-09-02  5:09       ` Minchan Kim
2011-09-02 13:36       ` Mel Gorman
2011-09-02 13:36         ` Mel Gorman
  -- strict thread matches above, loose matches on Subject: below --
2011-08-29 16:43 [PATCH 0/3] Fix compaction about mlocked pages Minchan Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.