* [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
@ 2016-01-28  9:11 Anshuman Khandual
  2016-01-28  9:11 ` [PATCH 2/2] selftest/powerpc: Add memory page migration tests Anshuman Khandual
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Anshuman Khandual @ 2016-01-28  9:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar, mpe

This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
HugeTLB pages at the PMD level. It turns on the kernel configuration option
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default, which in turn makes
hugepage_migration_supported() report support during migration. After the
recent changes to the PTE format, HugeTLB page migration completes
successfully.
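
For reference, the generic helper this option gates looks roughly like the
sketch below (paraphrased from the generic hugetlb header of this era, so the
exact details may differ); with the option disabled it always returns false
and migrate_pages() refuses HugeTLB pages:

static inline bool hugepage_migration_supported(struct hstate *h)
{
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
	/* Only PMD level huge pages are considered migratable */
	return huge_page_shift(h) == PMD_SHIFT;
#else
	return false;
#endif
}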

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e4824fd..65d52a0 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
 config ARCH_HAS_DMA_SET_COHERENT_MASK
         bool
 
+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+	def_bool y
+	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
+
 config PPC
 	bool
 	default y
-- 
2.1.0


* [PATCH 2/2] selftest/powerpc: Add memory page migration tests
  2016-01-28  9:11 [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
@ 2016-01-28  9:11 ` Anshuman Khandual
  2016-01-28 11:04 ` [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
  2016-01-28 14:44 ` Aneesh Kumar K.V
  2 siblings, 0 replies; 7+ messages in thread
From: Anshuman Khandual @ 2016-01-28  9:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar, mpe

This adds two tests for memory page migration. The first covers normal page
migration and works on both 4K and 64K base page size kernels. The second
covers 16MB huge page migration and will work with 16MB huge pages on both
4K and 64K base page sizes as and when huge page migration is supported.
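
Both tests share the same core flow: mmap a buffer, fill it with a known
pattern, look up each page's PFN in /proc/self/pagemap, write the
corresponding physical address into the soft_offline_page sysfs file (which
migrates the page elsewhere), then verify the pattern. A condensed per-page
sketch of that flow is below (run as root; error handling trimmed, the helper
name soft_offline_one is just for illustration, the full version lives in
migration.h):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Condensed sketch of what migration.h does for every page in the buffer. */
static int soft_offline_one(void *addr)
{
	unsigned long entry = 0, pfn, psize = getpagesize();
	char buf[32];
	int pmap = open("/proc/self/pagemap", O_RDONLY);
	int sysfs = open("/sys/devices/system/memory/soft_offline_page", O_WRONLY);

	/* Each pagemap entry is 8 bytes; bits 0-54 hold the PFN. */
	pread(pmap, &entry, sizeof(entry), ((unsigned long)addr / psize) * 8);
	pfn = entry & ((1UL << 55) - 1);
	if (pfn) {
		/* Writing the physical address asks the kernel to migrate the page. */
		snprintf(buf, sizeof(buf), "0x%lx\n", pfn * psize);
		write(sysfs, buf, strlen(buf));
	}
	close(pmap);
	close(sysfs);
	return pfn ? 0 : -1;	/* -1: no valid PFN, the page is counted as skipped */
}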

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
Changes in V3:
- Minor changes to the code for considering skipped pages
- Enabled HugeTLB test in the script as it works now

Changes in V2:
- Changed the script to accommodate review comments from Michael
- Disabled huge page migration test till it is supported on POWER

Sample test result
==================
....................
Test HugeTLB vs THP
....................
test: hugetlb_vs_thp
tags: git_version:v4.5-rc1-30-gda30491
success: hugetlb_vs_thp
[PASS]
.........................
Test subpage protection
.........................
test: subpage_prot_anon
tags: git_version:v4.5-rc1-30-gda30491
allocated malloc block of 0x4000000 bytes at 0x0x3fff80720000
testing malloc block...
success: subpage_prot_anon
test: subpage_prot_file
tags: git_version:v4.5-rc1-30-gda30491
allocated tempfile for 0x10000 bytes at 0x0x3fff84720000
testing file map...
success: subpage_prot_file
[PASS]
...........................
Test normal page migration
...........................
test: page_migration
tags: git_version:v4.5-rc1-30-gda30491
Running on base page size 64K
64 moved 0 skipped 0 failed
1024 moved 0 skipped 0 failed
3328 moved 768 skipped 0 failed
4352 moved 3840 skipped 0 failed
8448 moved 7936 skipped 0 failed
16640 moved 16128 skipped 0 failed
success: page_migration
[PASS]
.........................
Test huge page migration
.........................
test: hugepage_migration
tags: git_version:v4.5-rc1-30-gda30491
Running on base page size 64K
1 moved 0 skipped 0 failed
16 moved 0 skipped 0 failed
32 moved 0 skipped 0 failed
success: hugepage_migration
[PASS]



 tools/testing/selftests/powerpc/mm/Makefile        |  14 +-
 .../selftests/powerpc/mm/hugepage-migration.c      |  30 +++
 tools/testing/selftests/powerpc/mm/migration.h     | 204 +++++++++++++++++++++
 .../testing/selftests/powerpc/mm/page-migration.c  |  33 ++++
 tools/testing/selftests/powerpc/mm/run_mmtests     | 104 +++++++++++
 5 files changed, 380 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/mm/hugepage-migration.c
 create mode 100644 tools/testing/selftests/powerpc/mm/migration.h
 create mode 100644 tools/testing/selftests/powerpc/mm/page-migration.c
 create mode 100755 tools/testing/selftests/powerpc/mm/run_mmtests

diff --git a/tools/testing/selftests/powerpc/mm/Makefile b/tools/testing/selftests/powerpc/mm/Makefile
index ee179e2..c482614 100644
--- a/tools/testing/selftests/powerpc/mm/Makefile
+++ b/tools/testing/selftests/powerpc/mm/Makefile
@@ -1,12 +1,16 @@
 noarg:
 	$(MAKE) -C ../
 
-TEST_PROGS := hugetlb_vs_thp_test subpage_prot
-TEST_FILES := tempfile
+TEST_PROGS := run_mmtests
+TEST_FILES := hugetlb_vs_thp_test
+TEST_FILES += subpage_prot
+TEST_FILES += tempfile
+TEST_FILES += hugepage-migration
+TEST_FILES += page-migration
 
-all: $(TEST_PROGS) $(TEST_FILES)
+all: $(TEST_FILES)
 
-$(TEST_PROGS): ../harness.c
+$(TEST_FILES): ../harness.c
 
 include ../../lib.mk
 
@@ -14,4 +18,4 @@ tempfile:
 	dd if=/dev/zero of=tempfile bs=64k count=1
 
 clean:
-	rm -f $(TEST_PROGS) tempfile
+	rm -f $(TEST_FILES)
diff --git a/tools/testing/selftests/powerpc/mm/hugepage-migration.c b/tools/testing/selftests/powerpc/mm/hugepage-migration.c
new file mode 100644
index 0000000..b60bc10
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/hugepage-migration.c
@@ -0,0 +1,30 @@
+/*
+ * Copyright (C) 2015, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+#include "migration.h"
+
+static int hugepage_migration(void)
+{
+	int ret = 0;
+
+	if ((unsigned long)getpagesize() == 0x1000)
+		printf("Running on base page size 4K\n");
+
+	if ((unsigned long)getpagesize() == 0x10000)
+		printf("Running on base page size 64K\n");
+
+	ret |= test_huge_migration(16 * MEM_MB);
+	ret |= test_huge_migration(256 * MEM_MB);
+	ret |= test_huge_migration(512 * MEM_MB);
+
+	return ret;
+}
+
+int main(void)
+{
+	return test_harness(hugepage_migration, "hugepage_migration");
+}
diff --git a/tools/testing/selftests/powerpc/mm/migration.h b/tools/testing/selftests/powerpc/mm/migration.h
new file mode 100644
index 0000000..fe35849
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/migration.h
@@ -0,0 +1,204 @@
+/*
+ * Copyright (C) 2015, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+
+#include "utils.h"
+
+#define HPAGE_OFF	0
+#define HPAGE_ON	1
+
+#define PAGE_SHIFT_4K	12
+#define PAGE_SHIFT_64K	16
+#define PAGE_SIZE_4K	0x1000
+#define PAGE_SIZE_64K	0x10000
+#define PAGE_SIZE_HUGE	(16UL * 1024 * 1024)
+
+#define MEM_GB		(1024UL * 1024 * 1024)
+#define MEM_MB		(1024UL * 1024)
+#define MEM_KB		(1024UL)
+
+#define PMAP_FILE	"/proc/self/pagemap"
+#define PMAP_PFN	0x007FFFFFFFFFFFFFUL
+#define PMAP_SIZE	8
+
+#define SOFT_OFFLINE	"/sys/devices/system/memory/soft_offline_page"
+#define HARD_OFFLINE	"/sys/devices/system/memory/hard_offline_page"
+
+#define MMAP_LENGTH	(256 * MEM_MB)
+#define MMAP_ADDR	(void *)(0x0UL)
+#define MMAP_PROT	(PROT_READ | PROT_WRITE)
+#define MMAP_FLAGS	(MAP_PRIVATE | MAP_ANONYMOUS)
+#define MMAP_FLAGS_HUGE	(MAP_SHARED)
+
+#define FILE_NAME	"huge/hugepagefile"
+
+static void write_buffer(char *addr, unsigned long length)
+{
+	unsigned long i;
+
+	for (i = 0; i < length; i++)
+		*(addr + i) = (char)i;
+}
+
+static int read_buffer(char *addr, unsigned long length)
+{
+	unsigned long i;
+
+	for (i = 0; i < length; i++) {
+		if (*(addr + i) != (char)i) {
+			printf("Data miscompare at addr[%lu]\n", i);
+			return 1;
+		}
+	}
+	return 0;
+}
+
+static unsigned long get_npages(unsigned long length, unsigned long size)
+{
+	/* Avoid truncating the 64-bit length and size to unsigned int */
+	return length / size;
+}
+
+static void soft_offline_pages(int hugepage, void *addr,
+	unsigned long npages, unsigned long *skipped, unsigned long *failed)
+{
+	unsigned long psize, offset, pfn, paddr, fail, skip, i;
+	void *tmp;
+	int fd1, fd2;
+	char buf[20];
+
+	fd1 = open(PMAP_FILE, O_RDONLY);
+	if (fd1 == -1) {
+		perror("open() failed");
+		exit(-1);
+	}
+
+	fd2 = open(SOFT_OFFLINE, O_WRONLY);
+	if (fd2 == -1) {
+		perror("open() failed");
+		exit(-1);
+	}
+
+	fail = skip = 0;
+	psize = getpagesize();
+	for (i = 0; i < npages; i++) {
+		if (hugepage)
+			tmp = addr + i * PAGE_SIZE_HUGE;
+		else
+			tmp = addr + i * psize;
+
+		offset = ((unsigned long) tmp / psize) * PMAP_SIZE;
+
+		if (lseek(fd1, offset, SEEK_SET) == -1) {
+			perror("lseek() failed");
+			exit(-1);
+		}
+
+		if (read(fd1, &pfn, sizeof(pfn)) == -1) {
+			perror("read() failed");
+			exit(-1);
+		}
+
+		/* Skip if no valid PFN */
+		pfn = pfn & PMAP_PFN;
+		if (!pfn) {
+			skip++;
+			continue;
+		}
+
+		if (psize == PAGE_SIZE_4K)
+			paddr = pfn << PAGE_SHIFT_4K;
+
+		if (psize == PAGE_SIZE_64K)
+			paddr = pfn << PAGE_SHIFT_64K;
+
+		sprintf(buf, "0x%lx\n", paddr);
+
+		if (write(fd2, buf, strlen(buf)) == -1) {
+			perror("write() failed");
+			printf("[%lu] PFN: %lx BUF: %s\n", i, pfn, buf);
+			fail++;
+		}
+
+	}
+
+	if (failed)
+		*failed = fail;
+
+	if (skipped)
+		*skipped = skip;
+
+	close(fd1);
+	close(fd2);
+}
+
+int test_migration(unsigned long length)
+{
+	unsigned long skipped, failed;
+	void *addr;
+	int ret;
+
+	addr = mmap(MMAP_ADDR, length, MMAP_PROT, MMAP_FLAGS, -1, 0);
+	if (addr == MAP_FAILED) {
+		perror("mmap() failed");
+		exit(-1);
+	}
+
+	write_buffer(addr, length);
+	soft_offline_pages(HPAGE_OFF, addr, length/getpagesize(), &skipped, &failed);
+	ret = read_buffer(addr, length);
+
+	printf("%ld moved %ld skipped %ld failed\n", (length/getpagesize() - skipped - failed), skipped, failed);
+
+	munmap(addr, length);
+	return ret;
+}
+
+int test_huge_migration(unsigned long length)
+{
+	unsigned long skipped, failed, npages;
+	void *addr;
+	int fd, ret;
+
+	fd = open(FILE_NAME, O_CREAT | O_RDWR, 0755);
+	if (fd < 0) {
+		perror("open() failed");
+		exit(-1);
+	}
+
+	addr = mmap(MMAP_ADDR, length, MMAP_PROT, MMAP_FLAGS_HUGE, fd, 0);
+	if (addr == MAP_FAILED) {
+		perror("mmap() failed");
+		unlink(FILE_NAME);
+		exit(-1);
+	}
+
+	if (mlock(addr, length) == -1) {
+		perror("mlock() failed");
+		munmap(addr, length);
+		unlink(FILE_NAME);
+		exit(-1);
+	}
+
+	write_buffer(addr, length);
+	npages = get_npages(length, PAGE_SIZE_HUGE);
+	soft_offline_pages(HPAGE_ON, addr, npages, &skipped, &failed);
+	ret = read_buffer(addr, length);
+
+	printf("%ld moved %ld skipped %ld failed\n", (npages - skipped - failed), skipped, failed);
+
+	munmap(addr, length);
+	unlink(FILE_NAME);
+	return ret;
+}
diff --git a/tools/testing/selftests/powerpc/mm/page-migration.c b/tools/testing/selftests/powerpc/mm/page-migration.c
new file mode 100644
index 0000000..fc6e472
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/page-migration.c
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2015, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+#include "migration.h"
+
+static int page_migration(void)
+{
+	int ret = 0;
+
+	if ((unsigned long)getpagesize() == 0x1000)
+		printf("Running on base page size 4K\n");
+
+	if ((unsigned long)getpagesize() == 0x10000)
+		printf("Running on base page size 64K\n");
+
+	ret |= test_migration(4 * MEM_MB);
+	ret |= test_migration(64 * MEM_MB);
+	ret |= test_migration(256 * MEM_MB);
+	ret |= test_migration(512 * MEM_MB);
+	ret |= test_migration(1 * MEM_GB);
+	ret |= test_migration(2 * MEM_GB);
+
+	return ret;
+}
+
+int main(void)
+{
+	return test_harness(page_migration, "page_migration");
+}
diff --git a/tools/testing/selftests/powerpc/mm/run_mmtests b/tools/testing/selftests/powerpc/mm/run_mmtests
new file mode 100755
index 0000000..19805ba
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/run_mmtests
@@ -0,0 +1,104 @@
+#!/bin/bash
+
+# Mostly borrowed from tools/testing/selftests/vm/run_vmtests
+
+# Please run this as root
+# Try to allocate 2GB worth of 16MB huge pages; needmem below is in kB.
+# Update needmem if the memory requirements of the test programs change.
+needmem=2097152
+mnt=./huge
+exitcode=0
+
+# Get huge pagesize and freepages from /proc/meminfo
+while read name size unit; do
+	if [ "$name" = "HugePages_Free:" ]; then
+		freepgs=$size
+	fi
+	if [ "$name" = "Hugepagesize:" ]; then
+		pgsize=$size
+	fi
+done < /proc/meminfo
+
+# Set required nr_hugepages
+if [ -n "$freepgs" ] && [ -n "$pgsize" ]; then
+	nr_hugepgs=`cat /proc/sys/vm/nr_hugepages`
+	needpgs=`expr $needmem / $pgsize`
+	tries=2
+	while [ $tries -gt 0 ] && [ $freepgs -lt $needpgs ]; do
+		lackpgs=$(( $needpgs - $freepgs ))
+		echo 3 > /proc/sys/vm/drop_caches
+		echo $(( $lackpgs + $nr_hugepgs )) > /proc/sys/vm/nr_hugepages
+		if [ $? -ne 0 ]; then
+			echo "Please run this test as root"
+		fi
+		while read name size unit; do
+			if [ "$name" = "HugePages_Free:" ]; then
+				freepgs=$size
+			fi
+		done < /proc/meminfo
+		tries=$((tries - 1))
+	done
+	if [ $freepgs -lt $needpgs ]; then
+		printf "Not enough huge pages available (%d < %d)\n" \
+		       $freepgs $needpgs
+	fi
+else
+	echo "No hugetlbfs support in kernel? Check dmesg"
+fi
+
+mkdir $mnt
+mount -t hugetlbfs none $mnt
+
+# Run the test programs
+echo "...................."
+echo "Test HugeTLB vs THP"
+echo "...................."
+./hugetlb_vs_thp_test
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+echo "........................."
+echo "Test subpage protection"
+echo "........................."
+./subpage_prot
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+echo "..........................."
+echo "Test normal page migration"
+echo "..........................."
+./page-migration
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+# Huge page migration test (requires CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION)
+
+echo "........................."
+echo "Test huge page migration"
+echo "........................."
+./hugepage-migration
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+# Huge pages cleanup
+umount $mnt
+rm -rf $mnt
+echo $nr_hugepgs > /proc/sys/vm/nr_hugepages
+
+exit $exitcode
-- 
2.1.0


* Re: [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
  2016-01-28  9:11 [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
  2016-01-28  9:11 ` [PATCH 2/2] selftest/powerpc: Add memory page migration tests Anshuman Khandual
@ 2016-01-28 11:04 ` Anshuman Khandual
  2016-01-28 14:44 ` Aneesh Kumar K.V
  2 siblings, 0 replies; 7+ messages in thread
From: Anshuman Khandual @ 2016-01-28 11:04 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar

On 01/28/2016 02:41 PM, Anshuman Khandual wrote:
> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
> HugeTLB page at the PMD level. It enables the kernel configuration option

As mentioned above, this works only for 16MB HugeTLB page migration on 64K
base pages, where the huge page is implemented directly at the PMD level as a
single PTE, but not for 16MB HugeTLB pages on 4K base pages, which are
implemented through the huge page directory. Because the generic VM migration
code does not look into the page table structure when initiating migration,
the 16MB-on-4K case will just fail without stating the reason, which can
sometimes be confusing. I can edit the commit message to capture this point
if needed.
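
To illustrate the silent failure: as far as I can tell, the generic hugepage
migration path bails out early when the architecture does not advertise
support for the given huge page size, roughly along the lines of the sketch
below (a paraphrase of the mm/migrate.c logic of this era, not verbatim):

static int unmap_and_move_huge_page(/* ... */ struct page *hpage /* ... */)
{
	/*
	 * Migratability of hugepages depends on the architecture and the
	 * huge page size. When unsupported, the page is quietly put back
	 * and the caller only ever sees -ENOSYS -- nothing is logged.
	 */
	if (!hugepage_migration_supported(page_hstate(hpage))) {
		putback_active_hugepage(hpage);
		return -ENOSYS;
	}

	/* ... actual migration work ... */
}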


* Re: [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
  2016-01-28  9:11 [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
  2016-01-28  9:11 ` [PATCH 2/2] selftest/powerpc: Add memory page migration tests Anshuman Khandual
  2016-01-28 11:04 ` [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
@ 2016-01-28 14:44 ` Aneesh Kumar K.V
  2016-01-29  9:15   ` Anshuman Khandual
  2 siblings, 1 reply; 7+ messages in thread
From: Aneesh Kumar K.V @ 2016-01-28 14:44 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
> HugeTLB page at the PMD level. It enables the kernel configuration option
> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default which turns on the function
> hugepage_migration_supported() during migration. After the recent changes
> to the PTE format, HugeTLB page migration happens successfully.
>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  arch/powerpc/Kconfig | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e4824fd..65d52a0 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
>  config ARCH_HAS_DMA_SET_COHERENT_MASK
>          bool
>
> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
> +	def_bool y
> +	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
> +
>  config PPC
>  	bool
>  	default y


Are you sure this is all that is needed ? We will get a FOLL_GET with hugetlb
migration and our follow_huge_addr will BUG_ON on that. Look at
e66f17ff71772b209eed39de35aaa99ba819c93d (" mm/hugetlb: take page table
lock in follow_huge_pmd()").

Again this doesn't work with 4K page size. So if you are taking this
route, we will need that restriction here.

I would suggest we switch 64K page size hugetlb to generic
hugetlb and then do hugetlb migration on top of that.

Till you help me understand why that FOLL_GET issue is not valid for
powerpc,

NAK.

-aneesh


* Re: [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
  2016-01-28 14:44 ` Aneesh Kumar K.V
@ 2016-01-29  9:15   ` Anshuman Khandual
  2016-02-02  6:19     ` Anshuman Khandual
  0 siblings, 1 reply; 7+ messages in thread
From: Anshuman Khandual @ 2016-01-29  9:15 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev

On 01/28/2016 08:14 PM, Aneesh Kumar K.V wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
> 
>> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
>> HugeTLB page at the PMD level. It enables the kernel configuration option
>> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default which turns on the function
>> hugepage_migration_supported() during migration. After the recent changes
>> to the PTE format, HugeTLB page migration happens successfully.
>>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/Kconfig | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index e4824fd..65d52a0 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
>>  config ARCH_HAS_DMA_SET_COHERENT_MASK
>>          bool
>>
>> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
>> +	def_bool y
>> +	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
>> +
>>  config PPC
>>  	bool
>>  	default y
> 
> 
> Are you sure this is all that is needed ? We will get a FOLL_GET with hugetlb
> migration and our follow_huge_addr will BUG_ON on that. Look at
> e66f17ff71772b209eed39de35aaa99ba819c93d (" mm/hugetlb: take page table
> lock in follow_huge_pmd()").

HugeTLB page migration was successful without any error and the data
integrity check on the migrated pages passed as well. But yes, there might be
corner cases which trigger the race condition that we have not hit yet. Will
try to understand the situation there and get back.

> 
> Again this doesn't work with 4K page size. So if you are taking this
> route, we will need that restriction here.
> 

Agreed, I had already put a comment on the thread pointing out the same.
But yes, the restriction needs to be there in the enabling config option
here as well.

> I would suggest we switch 64K page size hugetlb to generic
> hugetlb and then do hugetlb migration on top of that.

Will explore it and get back.

> 
> Till you help me understnd why that FOLL_GET issue is not valid for
> powerpc,

Sure will get back.


* Re: [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
  2016-01-29  9:15   ` Anshuman Khandual
@ 2016-02-02  6:19     ` Anshuman Khandual
  2016-02-24 12:20       ` Anshuman Khandual
  0 siblings, 1 reply; 7+ messages in thread
From: Anshuman Khandual @ 2016-02-02  6:19 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev

On 01/29/2016 02:45 PM, Anshuman Khandual wrote:
> On 01/28/2016 08:14 PM, Aneesh Kumar K.V wrote:
>> > Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
>> > 
>>> >> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
>>> >> HugeTLB page at the PMD level. It enables the kernel configuration option
>>> >> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default which turns on the function
>>> >> hugepage_migration_supported() during migration. After the recent changes
>>> >> to the PTE format, HugeTLB page migration happens successfully.
>>> >>
>>> >> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>> >> ---
>>> >>  arch/powerpc/Kconfig | 4 ++++
>>> >>  1 file changed, 4 insertions(+)
>>> >>
>>> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>> >> index e4824fd..65d52a0 100644
>>> >> --- a/arch/powerpc/Kconfig
>>> >> +++ b/arch/powerpc/Kconfig
>>> >> @@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
>>> >>  config ARCH_HAS_DMA_SET_COHERENT_MASK
>>> >>          bool
>>> >>
>>> >> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
>>> >> +	def_bool y
>>> >> +	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
>>> >> +
>>> >>  config PPC
>>> >>  	bool
>>> >>  	default y
>> > 
>> > 
>> > Are you sure this is all that is needed ? We will get a FOLL_GET with hugetlb
>> > migration and our follow_huge_addr will BUG_ON on that. Look at
>> > e66f17ff71772b209eed39de35aaa99ba819c93d (" mm/hugetlb: take page table
>> > lock in follow_huge_pmd()").
> HugeTLB page migration was successful without any error and data integrity
> check passed on them as well. But yes there might be some corner cases which
> trigger the race condition we have not faced yet. Will try to understand the
> situation there and get back.

Aneesh,

The current test which successfully moved HugeTLB pages is driven through
the soft offline sysfs interface, which takes a physical address (with the
PFN obtained from the pagemap interface) as its argument. That path more or
less calls migrate_pages() directly, handing it the page struct list. But the
sample test case from commit e66f17ff71772b ("mm/hugetlb: take page table
lock in follow_huge_pmd()") is able to crash the kernel at the BUG_ON, as you
had mentioned before.

CPU: 2 PID: 6386 Comm: movepages Not tainted 4.5.0-rc2-00002-gc3ac0a3 #3
task: c0000003e8792400 ti: c0000003f65cc000 task.ti: c0000003f65cc000
NIP: c0000000001f485c LR: c0000000001f4844 CTR: 0000000000000000
REGS: c0000003f65cfa20 TRAP: 0700   Not tainted  (4.5.0-rc2-00002-gc3ac0a3)
MSR: 800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 28044488  XER: 00000000
CFAR: c000000000050310 SOFTE: 1 
GPR00: c0000000001f4844 c0000003f65cfca0 c000000000d82e00 f000000000ec0000 
GPR04: 00003efff6000000 c0000003f65cfc74 c0000003f65cfc70 0000000000000001 
GPR08: f000000000000000 0000000000000000 fffffffffffff000 0000000000000000 
GPR12: 0000000000004400 c00000000ea70800 fffffffffffffff2 c0000003d3330000 
GPR16: 0000000000000a00 c0000003f65cfd30 0000000000000000 c0000003d3330000 
GPR20: c000000000239f50 0000000000000100 fffffffffffff000 0000000001320122 
GPR24: c0000003ed267ee8 c0000003f65cfd70 00003efff6000000 c0000003f65cfd80 
GPR28: 000000000000008c c0000003ed267e80 c0000003ed2299d0 0000000000000004 
NIP [c0000000001f485c] follow_page_mask+0x7c/0x440
LR [c0000000001f4844] follow_page_mask+0x64/0x440
Call Trace:
[c0000003f65cfca0] [c0000000001f4844] follow_page_mask+0x64/0x440 (unreliable)
[c0000003f65cfd10] [c00000000023d2d8] SyS_move_pages+0x518/0x820
[c0000003f65cfe30] [c000000000009260] system_call+0x38/0xd0
Instruction dump:
7cdb3378 7c9a2378 eba30040 91260000 7fa3eb78 4be5b9f9 60000000 3940f000 
7fa35040 41dd0044 57ff077a 7bff0020 <0b1f0000> 38210070 e8010010 eae1ffb8

In the function follow_page_mask() we have this code block right at the
front where it fails.

	page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
	if (!IS_ERR(page)) {
		BUG_ON(flags & FOLL_GET);
		return page;
	}

As you mentioned, we currently don't implement CONFIG_ARCH_WANT_GENERAL_HUGETLB,
so we define a follow_huge_addr() which returns a valid page struct for any
given address. But then I don't understand why we BUG on FOLL_GET here:
move_pages() called follow_page() with the FOLL_GET flag in the first place,
so this condition is going to be true every time. I am still digging into
this, but meanwhile, if you could throw some light on why we have the BUG_ON
for the FOLL_GET flag, it would really be helpful. I did not get many clues
from the earlier commit which added this BUG_ON.
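
Putting the pieces together, the path I am hitting looks like this (a rough
sketch reconstructed from the trace and the snippet above, not verbatim
source):

/*
 * sys_move_pages()
 *   -> follow_page(vma, addr, FOLL_GET | ...)  -- move_pages() always asks for a reference
 *     -> follow_page_mask()
 *       -> follow_huge_addr()                  -- powerpc returns a valid page for a huge page VA
 *       -> BUG_ON(flags & FOLL_GET)            -- so it fires for every hugepage passed to move_pages()
 */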


* Re: [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
  2016-02-02  6:19     ` Anshuman Khandual
@ 2016-02-24 12:20       ` Anshuman Khandual
  0 siblings, 0 replies; 7+ messages in thread
From: Anshuman Khandual @ 2016-02-24 12:20 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev

On 02/02/2016 11:49 AM, Anshuman Khandual wrote:
> On 01/29/2016 02:45 PM, Anshuman Khandual wrote:
>> > On 01/28/2016 08:14 PM, Aneesh Kumar K.V wrote:
>>>> >> > Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
>>>> >> > 
>>>>>> >>> >> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
>>>>>> >>> >> HugeTLB page at the PMD level. It enables the kernel configuration option
>>>>>> >>> >> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default which turns on the function
>>>>>> >>> >> hugepage_migration_supported() during migration. After the recent changes
>>>>>> >>> >> to the PTE format, HugeTLB page migration happens successfully.
>>>>>> >>> >>
>>>>>> >>> >> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>>>>> >>> >> ---
>>>>>> >>> >>  arch/powerpc/Kconfig | 4 ++++
>>>>>> >>> >>  1 file changed, 4 insertions(+)
>>>>>> >>> >>
>>>>>> >>> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>>>>> >>> >> index e4824fd..65d52a0 100644
>>>>>> >>> >> --- a/arch/powerpc/Kconfig
>>>>>> >>> >> +++ b/arch/powerpc/Kconfig
>>>>>> >>> >> @@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
>>>>>> >>> >>  config ARCH_HAS_DMA_SET_COHERENT_MASK
>>>>>> >>> >>          bool
>>>>>> >>> >>
>>>>>> >>> >> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
>>>>>> >>> >> +	def_bool y
>>>>>> >>> >> +	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
>>>>>> >>> >> +
>>>>>> >>> >>  config PPC
>>>>>> >>> >>  	bool
>>>>>> >>> >>  	default y
>>>> >> > 
>>>> >> > 
>>>> >> > Are you sure this is all that is needed ? We will get a FOLL_GET with hugetlb
>>>> >> > migration and our follow_huge_addr will BUG_ON on that. Look at
>>>> >> > e66f17ff71772b209eed39de35aaa99ba819c93d (" mm/hugetlb: take page table
>>>> >> > lock in follow_huge_pmd()").
>> > HugeTLB page migration was successful without any error and data integrity
>> > check passed on them as well. But yes there might be some corner cases which
>> > trigger the race condition we have not faced yet. Will try to understand the
>> > situation there and get back.
> Aneesh,
> 
> The current test which passed successfully in moving HugeTLB pages is
> driven from the soft offline sysfs interface taking PFN (from pagemap
> interface) as the argument. Its kind of directly calls migrate_pages()
> handing out the page struct list as the argument. But the sample test
> case in commit e66f17ff71772b ("mm/hugetlb: take page table lock in
> follow_huge_pmd()") is able to crash the kernel at the BUG_ON as you
> had mentioned before.
> 
> CPU: 2 PID: 6386 Comm: movepages Not tainted 4.5.0-rc2-00002-gc3ac0a3 #3
> task: c0000003e8792400 ti: c0000003f65cc000 task.ti: c0000003f65cc000
> NIP: c0000000001f485c LR: c0000000001f4844 CTR: 0000000000000000
> REGS: c0000003f65cfa20 TRAP: 0700   Not tainted  (4.5.0-rc2-00002-gc3ac0a3)
> MSR: 800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 28044488  XER: 00000000
> CFAR: c000000000050310 SOFTE: 1 
> GPR00: c0000000001f4844 c0000003f65cfca0 c000000000d82e00 f000000000ec0000 
> GPR04: 00003efff6000000 c0000003f65cfc74 c0000003f65cfc70 0000000000000001 
> GPR08: f000000000000000 0000000000000000 fffffffffffff000 0000000000000000 
> GPR12: 0000000000004400 c00000000ea70800 fffffffffffffff2 c0000003d3330000 
> GPR16: 0000000000000a00 c0000003f65cfd30 0000000000000000 c0000003d3330000 
> GPR20: c000000000239f50 0000000000000100 fffffffffffff000 0000000001320122 
> GPR24: c0000003ed267ee8 c0000003f65cfd70 00003efff6000000 c0000003f65cfd80 
> GPR28: 000000000000008c c0000003ed267e80 c0000003ed2299d0 0000000000000004 
> NIP [c0000000001f485c] follow_page_mask+0x7c/0x440
> LR [c0000000001f4844] follow_page_mask+0x64/0x440
> Call Trace:
> [c0000003f65cfca0] [c0000000001f4844] follow_page_mask+0x64/0x440 (unreliable)
> [c0000003f65cfd10] [c00000000023d2d8] SyS_move_pages+0x518/0x820
> [c0000003f65cfe30] [c000000000009260] system_call+0x38/0xd0
> Instruction dump:
> 7cdb3378 7c9a2378 eba30040 91260000 7fa3eb78 4be5b9f9 60000000 3940f000 
> 7fa35040 41dd0044 57ff077a 7bff0020 <0b1f0000> 38210070 e8010010 eae1ffb8
> 
> In the function follow_page_mask() we have this code block right at the
> front where it fails.
> 
> 	page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
> 	if (!IS_ERR(page)) {
> 		BUG_ON(flags & FOLL_GET);
> 		return page;
> 	}
> 
> As you mentioned, currently we dont implement CONFIG_ARCH_WANT_GENERAL_HUGETLB.
> So we define follow_huge_addr() function which returns a valid page struct
> for any given address. But then dont understand why we bug on for FOLL_GET.
> move_pages() had called follow_page() with FOLL_GET flag at the first place.
> So this condition is going to be true all the time. I am still digging into
> this but meanwhile if you can throw some light on why we have BUG_ON for
> FOLL_GET flag it will really be helpful. Did not get much clues looking into
> the previous commit which added this BUG_ON.

I understand that the draft patch below is just a hack, but it does point out
the problem we should address. With this change applied, the test cases of
commit e66f17ff71772b2 ("mm/hugetlb: take page table lock in
follow_huge_pmd()") have not been able to recreate the race condition as they
did before.

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 744e24b..3d600162 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -668,8 +668,18 @@ struct page *
 follow_huge_pmd(struct mm_struct *mm, unsigned long address,
                pmd_t *pmd, int write)
 {
-       BUG();
-       return NULL;
+       struct page *page;
+
+       page = follow_huge_addr(mm, address, write & FOLL_WRITE);
+       if (!IS_ERR(page)) {
+               if (page && (write & FOLL_GET)) {
+                       if (PageHead(page))
+                               get_page(page);
+                       else
+                               page = NULL;
+               }
+       }
+       return page;
 }
 
 struct page *
diff --git a/mm/gup.c b/mm/gup.c
index 7bf19ff..04a80ee0 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -225,7 +225,12 @@ struct page *follow_page_mask(struct vm_area_struct *vma,
 
        page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
        if (!IS_ERR(page)) {
-               BUG_ON(flags & FOLL_GET);
+               if (page && (flags & FOLL_GET)) {
+                       if (PageHead(page))
+                               get_page(page);
+                       else
+                               page = NULL;
+               }
                return page;
        }
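
For anyone wanting to poke at this path from user space, a minimal trigger
along the lines of the 'movepages' test case would look roughly like the
sketch below. This is not the exact reproducer from commit e66f17ff71772b;
it assumes libnuma for move_pages(2), MAP_HUGETLB support and enough
preallocated 16MB huge pages:

#include <numaif.h>		/* move_pages(2); build with -lnuma */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 16UL << 20;	/* one 16MB huge page on POWER */
	void *pages[1], *addr;
	int nodes[1] = { 0 };		/* request a move to node 0 */
	int status[1];

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(addr, 0xab, len);	/* fault the huge page in */

	pages[0] = addr;
	if (move_pages(0, 1, pages, nodes, status, MPOL_MF_MOVE))
		perror("move_pages");	/* with the BUG_ON in place this oopses instead */
	else
		printf("moved, status[0] = %d\n", status[0]);

	munmap(addr, len);
	return 0;
}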


