* [PATCH] Patch for remapping pages around the fault page
@ 2017-05-21 15:12 Sarunya Pumma
  2017-05-22 15:50 ` kbuild test robot
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Sarunya Pumma @ 2017-05-21 15:12 UTC (permalink / raw)
  To: rppt
  Cc: linux-mm, akpm, kirill.shutemov, jack, ross.zwisler, mhocko,
	aneesh.kumar, lstoakes, dave.jiang, Sarunya Pumma

After the fault handler calls __do_fault to read the faulting page, it does
not map the other pages that were read together with that page. This can
lead to a large number of minor page faults. This patch therefore remaps
pages around the fault page, aiming to map the pages that were read
synchronously or asynchronously along with the fault page.

The core of this patch is the redo_fault_around function. It computes the
start and end offsets of the pages to be mapped, decides whether remapping
is needed, remaps the pages using the map_pages function, and returns. The
start and end offsets are computed the same way as in the do_fault_around
function. To decide whether to remap, we check whether the pages around
the fault page are already mapped. If they are, no remapping is performed.
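
The offset computation described above can be sketched in user space as
follows. This is a stand-alone illustration, not the kernel code: the sk_
prefix, struct sk_window, and the hard-coded 4 KiB page size and 512 PTEs
per page table are assumptions made for the example, and nr_pages is
assumed to be a power of two.

```c
/* Assumed constants for the sketch: 4 KiB pages, 512 PTEs per page table. */
#define SK_PAGE_SHIFT   12
#define SK_PAGE_SIZE    (1UL << SK_PAGE_SHIFT)
#define SK_PAGE_MASK    (~(SK_PAGE_SIZE - 1))
#define SK_PTRS_PER_PTE 512UL

/* Hypothetical result type: first and last file offsets (in pages) to map. */
struct sk_window {
	unsigned long start_pgoff;
	unsigned long end_pgoff;
};

static unsigned long sk_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Compute the window of page offsets around a faulting address.
 * nr_pages must be a power of two for the mask arithmetic to hold.
 */
struct sk_window sk_fault_window(unsigned long address,
				 unsigned long fault_pgoff,
				 unsigned long vm_start,
				 unsigned long vm_pgoff,
				 unsigned long vma_pages,
				 unsigned long nr_pages)
{
	struct sk_window w;
	unsigned long mask = ~(nr_pages * SK_PAGE_SIZE - 1) & SK_PAGE_MASK;
	unsigned long start_addr = address & mask;
	unsigned long off;

	/* Never start before the beginning of the mapping. */
	if (start_addr < vm_start)
		start_addr = vm_start;

	/* Distance in pages from the window start to the faulting page. */
	off = ((address - start_addr) >> SK_PAGE_SHIFT) & (SK_PTRS_PER_PTE - 1);
	w.start_pgoff = fault_pgoff - off;

	/* Candidate end: last slot of the page table holding start_addr. */
	w.end_pgoff = w.start_pgoff -
		((start_addr >> SK_PAGE_SHIFT) & (SK_PTRS_PER_PTE - 1)) +
		SK_PTRS_PER_PTE - 1;

	/* Clamp to the end of the mapping and to nr_pages from the start. */
	w.end_pgoff = sk_min(w.end_pgoff, vma_pages + vm_pgoff - 1);
	w.end_pgoff = sk_min(w.end_pgoff, w.start_pgoff + nr_pages - 1);
	return w;
}
```

For a 64-page mapping at 0x400000 with a 16-page window, a fault on page 5
yields offsets 0 to 15 and a fault on page 20 yields offsets 16 to 31.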

Because checking every page can be inefficient when the number of pages to
be mapped is large, we have added a threshold called "vm_nr_remapping" that
determines whether to check the status of every page around the fault page
or only some of them. The vm_nr_remapping parameter can be adjusted via the
sysctl interface. If the number of pages to be mapped is smaller than the
vm_nr_remapping threshold, we check all pages around the fault page (within
the start and end offsets). Otherwise, we check only the adjacent pages
(left and right).
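
The threshold rule above can be sketched as a small user-space predicate.
This is only an illustration under stated assumptions: sk_needs_remap and
its parameters are hypothetical names, the page-table state is modelled as
a plain bool array, and a neighbour that falls outside the window is treated
as already mapped (a bounds check added for the sketch).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * mapped[i] says whether page i of the window already has a PTE.
 * fault_idx is the index of the faulting page inside the window.
 * Returns true when the window should be remapped.
 */
bool sk_needs_remap(const bool *mapped, size_t nr_pages,
		    size_t fault_idx, size_t threshold)
{
	bool left, right;
	size_t i;

	if (nr_pages < threshold) {
		/* Small window: scan every page, remap if any is unmapped. */
		for (i = 0; i < nr_pages; i++)
			if (!mapped[i])
				return true;
		return false;
	}

	/* Large window: look only at the left and right neighbours. */
	left = fault_idx == 0 || mapped[fault_idx - 1];
	right = fault_idx + 1 >= nr_pages || mapped[fault_idx + 1];
	return !(left && right);
}
```

The trade-off is visible in the second branch: it costs two checks instead
of nr_pages, but it can miss unmapped pages farther away from the fault.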

Page remapping is beneficial for "almost sequential" page accesses, where
pages are accessed in order but some pages are skipped.

The following is one example scenario in which we can save one page fault
every 16 pages:

Assume that we want to access pages sequentially and skip every page that
is marked as PG_readahead. Assume that the read-ahead size is 32 pages and
the number of pages to be mapped each time (fault_around_pages) is 16.

When accessing the page at offset 0, a major page fault occurs, so pages 0
to 31 are read from the disk into the page cache. Page 24 is marked as a
read-ahead page (PG_readahead). Then only page 0 is mapped into the virtual
memory space.

When accessing the page at offset 1, a minor page fault occurs and pages 0
to 15 are mapped.

We keep accessing pages up to page 31, skipping page 24.

When accessing the page at offset 32, a major page fault occurs and the
same process is repeated: the next 32 pages are read from the disk and only
page 32 is mapped. A minor page fault then occurs at the next page (page
33).

In this example, two page faults occur every 16 pages. With this patch, we
can eliminate the minor page fault in every 16 pages.

Thank you very much for taking the time to review the patch.

Signed-off-by: Sarunya Pumma <sarunya@vt.edu>
---
 include/linux/mm.h |  2 ++
 kernel/sysctl.c    |  8 +++++
 mm/memory.c        | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7cb17c6..2d533a3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -34,6 +34,8 @@ struct bdi_writeback;
 
 void init_mm_internals(void);
 
+extern unsigned long vm_nr_remapping;
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES	/* Don't use mapnrs, do it properly */
 extern unsigned long max_mapnr;
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4dfba1a..16c7efe 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1332,6 +1332,14 @@ static struct ctl_table vm_table[] = {
 		.extra1		= &zero,
 		.extra2		= &one_hundred,
 	},
+	{
+		.procname	= "nr_remapping",
+		.data		= &vm_nr_remapping,
+		.maxlen		= sizeof(vm_nr_remapping),
+		.mode		= 0644,
+		.proc_handler	= proc_doulongvec_minmax,
+		.extra1		= &zero,
+	},
 #ifdef CONFIG_HUGETLB_PAGE
 	{
 		.procname	= "nr_hugepages",
diff --git a/mm/memory.c b/mm/memory.c
index 6ff5d72..3d0dca9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -83,6 +83,9 @@
 #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid.
 #endif
 
+/* A preset threshold for considering page remapping */
+unsigned long vm_nr_remapping = 32;
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES
 /* use the per-pgdat data instead for discontigmem - mbligh */
 unsigned long max_mapnr;
@@ -3374,6 +3377,82 @@ static int do_fault_around(struct vm_fault *vmf)
 	return ret;
 }
 
+static int redo_fault_around(struct vm_fault *vmf)
+{
+	unsigned long address = vmf->address, nr_pages, mask;
+	pgoff_t start_pgoff = vmf->pgoff;
+	pgoff_t end_pgoff;
+	pte_t *lpte, *rpte;
+	int off, ret = 0, is_mapped = 0;
+
+	nr_pages = READ_ONCE(fault_around_bytes) >> PAGE_SHIFT;
+	mask = ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
+
+	vmf->address = max(address & mask, vmf->vma->vm_start);
+	off = ((address - vmf->address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+	start_pgoff -= off;
+
+	/*
+	 *  end_pgoff is either end of page table or end of vma
+	 *  or fault_around_pages() from start_pgoff, depending what is nearest.
+	 */
+	end_pgoff = start_pgoff -
+		((vmf->address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
+		PTRS_PER_PTE - 1;
+	end_pgoff = min3(end_pgoff, vma_pages(vmf->vma) + vmf->vma->vm_pgoff - 1,
+			start_pgoff + nr_pages - 1);
+
+	if (nr_pages < vm_nr_remapping) {
+		int i, start_off = 0, end_off = 0;
+
+		lpte = vmf->pte - off;
+		for (i = 0; i < nr_pages; i++) {
+			if (!pte_none(*lpte)) {
+				is_mapped++;
+			} else {
+				if (!start_off)
+					start_off = i;
+				end_off = i;
+			}
+			lpte++;
+		}
+		if (is_mapped != nr_pages) {
+			is_mapped = 0;
+			end_pgoff = start_pgoff + end_off;
+			start_pgoff += start_off;
+			vmf->pte += start_off;
+		}
+		lpte = NULL;
+	} else {
+		lpte = vmf->pte - 1;
+		rpte = vmf->pte + 1;
+		if (!pte_none(*lpte) && !pte_none(*rpte))
+			is_mapped = 1;
+		lpte = NULL;
+		rpte = NULL;
+	}
+
+	if (!is_mapped) {
+		vmf->pte -= off;
+		vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
+		vmf->pte -= (vmf->address >> PAGE_SHIFT) - (address >> PAGE_SHIFT);
+	}
+
+	/* Huge page is mapped? Page fault is solved */
+	if (pmd_trans_huge(*vmf->pmd)) {
+		ret = VM_FAULT_NOPAGE;
+		goto out;
+	}
+
+	if (vmf->pte)
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+
+out:
+	vmf->address = address;
+	vmf->pte = NULL;
+	return ret;
+}
+
 static int do_read_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
@@ -3394,6 +3473,17 @@ static int do_read_fault(struct vm_fault *vmf)
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
 		return ret;
 
+	/*
+	 * Remap pages after read
+	 */
+	if (!(vma->vm_flags & VM_RAND_READ) && vma->vm_ops->map_pages
+			&& fault_around_bytes >> PAGE_SHIFT > 1) {
+		ret |= alloc_set_pte(vmf, vmf->memcg, vmf->page);
+		unlock_page(vmf->page);
+		redo_fault_around(vmf);
+		return ret;
+	}
+
 	ret |= finish_fault(vmf);
 	unlock_page(vmf->page);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] Patch for remapping pages around the fault page
  2017-05-21 15:12 [PATCH] Patch for remapping pages around the fault page Sarunya Pumma
@ 2017-05-22 15:50 ` kbuild test robot
  2017-05-22 17:07 ` kbuild test robot
  2017-05-23 13:07 ` [PATCH] " Kirill A. Shutemov
  2 siblings, 0 replies; 6+ messages in thread
From: kbuild test robot @ 2017-05-22 15:50 UTC (permalink / raw)
  To: Sarunya Pumma
  Cc: kbuild-all, rppt, linux-mm, akpm, kirill.shutemov, jack,
	ross.zwisler, mhocko, aneesh.kumar, lstoakes, dave.jiang

[-- Attachment #1: Type: text/plain, Size: 1264 bytes --]

Hi Sarunya,

[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.12-rc2 next-20170522]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Sarunya-Pumma/Patch-for-remapping-pages-around-the-fault-page/20170522-211816
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: microblaze-nommu_defconfig (attached as .config)
compiler: microblaze-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=microblaze 

All errors (new ones prefixed by >>):

>> kernel/built-in.o:(.data+0x1b94): undefined reference to `vm_nr_remapping'
   net/built-in.o: In function `rpc_print_iostats':
   net/sunrpc/stats.c:206: undefined reference to `_GLOBAL_OFFSET_TABLE_'
   scripts/link-vmlinux.sh: line 80: 20490 Segmentation fault      ${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2} -T ${lds} ${objects}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 12704 bytes --]


* Re: [PATCH] Patch for remapping pages around the fault page
  2017-05-21 15:12 [PATCH] Patch for remapping pages around the fault page Sarunya Pumma
  2017-05-22 15:50 ` kbuild test robot
@ 2017-05-22 17:07 ` kbuild test robot
  2017-06-01 22:01   ` [PATCH v2] " Sarunya Pumma
  2017-05-23 13:07 ` [PATCH] " Kirill A. Shutemov
  2 siblings, 1 reply; 6+ messages in thread
From: kbuild test robot @ 2017-05-22 17:07 UTC (permalink / raw)
  To: Sarunya Pumma
  Cc: kbuild-all, rppt, linux-mm, akpm, kirill.shutemov, jack,
	ross.zwisler, mhocko, aneesh.kumar, lstoakes, dave.jiang

[-- Attachment #1: Type: text/plain, Size: 989 bytes --]

Hi Sarunya,

[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.12-rc2 next-20170522]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Sarunya-Pumma/Patch-for-remapping-pages-around-the-fault-page/20170522-211816
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: c6x-evmc6678_defconfig (attached as .config)
compiler: c6x-elf-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=c6x 

All errors (new ones prefixed by >>):

>> kernel/built-in.o:(.fardata+0x1b2c): undefined reference to `vm_nr_remapping'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 5427 bytes --]


* Re: [PATCH] Patch for remapping pages around the fault page
  2017-05-21 15:12 [PATCH] Patch for remapping pages around the fault page Sarunya Pumma
  2017-05-22 15:50 ` kbuild test robot
  2017-05-22 17:07 ` kbuild test robot
@ 2017-05-23 13:07 ` Kirill A. Shutemov
  2 siblings, 0 replies; 6+ messages in thread
From: Kirill A. Shutemov @ 2017-05-23 13:07 UTC (permalink / raw)
  To: Sarunya Pumma
  Cc: rppt, linux-mm, akpm, kirill.shutemov, jack, ross.zwisler,
	mhocko, aneesh.kumar, lstoakes, dave.jiang

On Sun, May 21, 2017 at 11:12:00AM -0400, Sarunya Pumma wrote:
> After the fault handler calls __do_fault to read the faulting page, it does
> not map the other pages that were read together with that page. This can
> lead to a large number of minor page faults. This patch therefore remaps
> pages around the fault page, aiming to map the pages that were read
> synchronously or asynchronously along with the fault page.
>
> The core of this patch is the redo_fault_around function. It computes the
> start and end offsets of the pages to be mapped, decides whether remapping
> is needed, remaps the pages using the map_pages function, and returns. The
> start and end offsets are computed the same way as in the do_fault_around
> function. To decide whether to remap, we check whether the pages around
> the fault page are already mapped. If they are, no remapping is performed.
>
> Because checking every page can be inefficient when the number of pages to
> be mapped is large, we have added a threshold called "vm_nr_remapping" that
> determines whether to check the status of every page around the fault page
> or only some of them. The vm_nr_remapping parameter can be adjusted via the
> sysctl interface. If the number of pages to be mapped is smaller than the
> vm_nr_remapping threshold, we check all pages around the fault page (within
> the start and end offsets). Otherwise, we check only the adjacent pages
> (left and right).
>
> Page remapping is beneficial for "almost sequential" page accesses, where
> pages are accessed in order but some pages are skipped.
>
> The following is one example scenario in which we can save one page fault
> every 16 pages:
>
> Assume that we want to access pages sequentially and skip every page that
> is marked as PG_readahead. Assume that the read-ahead size is 32 pages and
> the number of pages to be mapped each time (fault_around_pages) is 16.
>
> When accessing the page at offset 0, a major page fault occurs, so pages 0
> to 31 are read from the disk into the page cache. Page 24 is marked as a
> read-ahead page (PG_readahead). Then only page 0 is mapped into the virtual
> memory space.
>
> When accessing the page at offset 1, a minor page fault occurs and pages 0
> to 15 are mapped.
>
> We keep accessing pages up to page 31, skipping page 24.
>
> When accessing the page at offset 32, a major page fault occurs and the
> same process is repeated: the next 32 pages are read from the disk and only
> page 32 is mapped. A minor page fault then occurs at the next page (page
> 33).
>
> In this example, two page faults occur every 16 pages. With this patch, we
> can eliminate the minor page fault in every 16 pages.
>
> Thank you very much for taking the time to review the patch.
> 
> Signed-off-by: Sarunya Pumma <sarunya@vt.edu>

Still no performance numbers?

I doubt it's useful. You wouldn't get "a number of minor page faults".
The first minor page fault would take faultaround path and map these
pages.
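
The argument above can be checked against a toy user-space model of the
scenario from the patch description. The sketch below is not kernel code
and makes simplifying assumptions: a 128-page file, a 32-page read-ahead
that starts at the faulting page, a 16-page aligned fault-around window, and
an access pattern that skips page 24 of every read-ahead chunk. All names
(sim_run, SIM_PAGES, and so on) are hypothetical.

```c
#include <stdbool.h>

#define SIM_PAGES        128	/* assumed file size in pages */
#define SIM_READAHEAD    32	/* assumed read-ahead size in pages */
#define SIM_FAULT_AROUND 16	/* assumed fault-around window in pages */

struct sim_result {
	unsigned int major;
	unsigned int minor;
};

/* Map every cached page in the aligned window that contains 'page'. */
static void sim_map_window(bool *mapped, const bool *cached, unsigned int page)
{
	unsigned int start = page - page % SIM_FAULT_AROUND;
	unsigned int i;

	for (i = start; i < start + SIM_FAULT_AROUND && i < SIM_PAGES; i++)
		if (cached[i])
			mapped[i] = true;
}

/* Walk the file once; remap_after_read models the proposed behaviour. */
struct sim_result sim_run(bool remap_after_read)
{
	bool cached[SIM_PAGES] = { false };
	bool mapped[SIM_PAGES] = { false };
	struct sim_result r = { 0, 0 };
	unsigned int p, i;

	for (p = 0; p < SIM_PAGES; p++) {
		if (p % SIM_READAHEAD == 24)
			continue;	/* skipped page, never touched */
		if (mapped[p])
			continue;	/* no fault */

		if (!cached[p]) {
			/* Major fault: read a chunk into the page cache. */
			r.major++;
			for (i = p; i < p + SIM_READAHEAD && i < SIM_PAGES; i++)
				cached[i] = true;
			if (remap_after_read)
				sim_map_window(mapped, cached, p);
			else
				mapped[p] = true;
		} else {
			/* Minor fault: fault-around maps the whole window. */
			r.minor++;
			sim_map_window(mapped, cached, p);
		}
	}
	return r;
}
```

Under these assumptions the baseline sees one major and two minor faults
per 32-page chunk, and the remapping variant one major and one minor, so
the model saves a single minor fault per read-ahead chunk.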

-- 
 Kirill A. Shutemov


* [PATCH v2] Patch for remapping pages around the fault page
  2017-05-22 17:07 ` kbuild test robot
@ 2017-06-01 22:01   ` Sarunya Pumma
  2017-06-02  5:04     ` Ross Zwisler
  0 siblings, 1 reply; 6+ messages in thread
From: Sarunya Pumma @ 2017-06-01 22:01 UTC (permalink / raw)
  Cc: kbuild-all, rppt, linux-mm, akpm, kirill.shutemov, jack,
	ross.zwisler, mhocko, aneesh.kumar, lstoakes, dave.jiang,
	Sarunya Pumma

After the fault handler calls __do_fault to read the faulting page, it does
not map the other pages that were read together with that page. This can
lead to a large number of minor page faults. This patch therefore remaps
pages around the fault page, aiming to map the pages that were read
synchronously or asynchronously along with the fault page.

The core of this patch is the redo_fault_around function. It computes the
start and end offsets of the pages to be mapped, decides whether remapping
is needed, remaps the pages using the map_pages function, and returns. The
start and end offsets are computed the same way as in the do_fault_around
function. To decide whether to remap, we check whether the pages around
the fault page are already mapped. If they are, no remapping is performed.

Because checking every page can be inefficient when the number of pages to
be mapped is large, we have added a threshold called "vm_nr_remapping" that
determines whether to check the status of every page around the fault page
or only some of them. The vm_nr_remapping parameter can be adjusted via the
sysctl interface. If the number of pages to be mapped is smaller than the
vm_nr_remapping threshold, we check all pages around the fault page (within
the start and end offsets). Otherwise, we check only the adjacent pages
(left and right).

Page remapping is beneficial for "almost sequential" page accesses, where
pages are accessed in order but some pages are skipped.

The following is one example scenario in which we can save one page fault
every 16 pages:

Assume that we want to access pages sequentially and skip every page that
is marked as PG_readahead. Assume that the read-ahead size is 32 pages and
the number of pages to be mapped each time (fault_around_pages) is 16.

When accessing the page at offset 0, a major page fault occurs, so pages 0
to 31 are read from the disk into the page cache. Page 24 is marked as a
read-ahead page (PG_readahead). Then only page 0 is mapped into the virtual
memory space.

When accessing the page at offset 1, a minor page fault occurs and pages 0
to 15 are mapped.

We keep accessing pages up to page 31, skipping page 24.

When accessing the page at offset 32, a major page fault occurs and the
same process is repeated: the next 32 pages are read from the disk and only
page 32 is mapped. A minor page fault then occurs at the next page (page
33).

In this example, two page faults occur every 16 pages. With this patch, we
can eliminate the minor page fault in every 16 pages.

Thank you very much for taking the time to review the patch.

Signed-off-by: Sarunya Pumma <sarunya@vt.edu>
---
 include/linux/mm.h |  2 ++
 kernel/sysctl.c    | 10 ++++++
 mm/memory.c        | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 102 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7cb17c6..2d533a3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -34,6 +34,8 @@ struct bdi_writeback;
 
 void init_mm_internals(void);
 
+extern unsigned long vm_nr_remapping;
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES	/* Don't use mapnrs, do it properly */
 extern unsigned long max_mapnr;
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4dfba1a..dfd61c1 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1332,6 +1332,16 @@ static struct ctl_table vm_table[] = {
 		.extra1		= &zero,
 		.extra2		= &one_hundred,
 	},
+#ifdef CONFIG_MMU
+	{
+		.procname	= "nr_remapping",
+		.data		= &vm_nr_remapping,
+		.maxlen		= sizeof(vm_nr_remapping),
+		.mode		= 0644,
+		.proc_handler	= proc_doulongvec_minmax,
+		.extra1		= &zero,
+	},
+#endif
 #ifdef CONFIG_HUGETLB_PAGE
 	{
 		.procname	= "nr_hugepages",
diff --git a/mm/memory.c b/mm/memory.c
index 6ff5d72..3d0dca9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -83,6 +83,9 @@
 #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid.
 #endif
 
+/* A preset threshold for considering page remapping */
+unsigned long vm_nr_remapping = 32;
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES
 /* use the per-pgdat data instead for discontigmem - mbligh */
 unsigned long max_mapnr;
@@ -3374,6 +3377,82 @@ static int do_fault_around(struct vm_fault *vmf)
 	return ret;
 }
 
+static int redo_fault_around(struct vm_fault *vmf)
+{
+	unsigned long address = vmf->address, nr_pages, mask;
+	pgoff_t start_pgoff = vmf->pgoff;
+	pgoff_t end_pgoff;
+	pte_t *lpte, *rpte;
+	int off, ret = 0, is_mapped = 0;
+
+	nr_pages = READ_ONCE(fault_around_bytes) >> PAGE_SHIFT;
+	mask = ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
+
+	vmf->address = max(address & mask, vmf->vma->vm_start);
+	off = ((address - vmf->address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+	start_pgoff -= off;
+
+	/*
+	 *  end_pgoff is either end of page table or end of vma
+	 *  or fault_around_pages() from start_pgoff, depending what is nearest.
+	 */
+	end_pgoff = start_pgoff -
+		((vmf->address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
+		PTRS_PER_PTE - 1;
+	end_pgoff = min3(end_pgoff, vma_pages(vmf->vma) + vmf->vma->vm_pgoff - 1,
+			start_pgoff + nr_pages - 1);
+
+	if (nr_pages < vm_nr_remapping) {
+		int i, start_off = 0, end_off = 0;
+
+		lpte = vmf->pte - off;
+		for (i = 0; i < nr_pages; i++) {
+			if (!pte_none(*lpte)) {
+				is_mapped++;
+			} else {
+				if (!start_off)
+					start_off = i;
+				end_off = i;
+			}
+			lpte++;
+		}
+		if (is_mapped != nr_pages) {
+			is_mapped = 0;
+			end_pgoff = start_pgoff + end_off;
+			start_pgoff += start_off;
+			vmf->pte += start_off;
+		}
+		lpte = NULL;
+	} else {
+		lpte = vmf->pte - 1;
+		rpte = vmf->pte + 1;
+		if (!pte_none(*lpte) && !pte_none(*rpte))
+			is_mapped = 1;
+		lpte = NULL;
+		rpte = NULL;
+	}
+
+	if (!is_mapped) {
+		vmf->pte -= off;
+		vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
+		vmf->pte -= (vmf->address >> PAGE_SHIFT) - (address >> PAGE_SHIFT);
+	}
+
+	/* Huge page is mapped? Page fault is solved */
+	if (pmd_trans_huge(*vmf->pmd)) {
+		ret = VM_FAULT_NOPAGE;
+		goto out;
+	}
+
+	if (vmf->pte)
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+
+out:
+	vmf->address = address;
+	vmf->pte = NULL;
+	return ret;
+}
+
 static int do_read_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
@@ -3394,6 +3473,17 @@ static int do_read_fault(struct vm_fault *vmf)
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
 		return ret;
 
+	/*
+	 * Remap pages after read
+	 */
+	if (!(vma->vm_flags & VM_RAND_READ) && vma->vm_ops->map_pages
+			&& fault_around_bytes >> PAGE_SHIFT > 1) {
+		ret |= alloc_set_pte(vmf, vmf->memcg, vmf->page);
+		unlock_page(vmf->page);
+		redo_fault_around(vmf);
+		return ret;
+	}
+
 	ret |= finish_fault(vmf);
 	unlock_page(vmf->page);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
-- 
2.7.4


* Re: [PATCH v2] Patch for remapping pages around the fault page
  2017-06-01 22:01   ` [PATCH v2] " Sarunya Pumma
@ 2017-06-02  5:04     ` Ross Zwisler
  0 siblings, 0 replies; 6+ messages in thread
From: Ross Zwisler @ 2017-06-02  5:04 UTC (permalink / raw)
  To: Sarunya Pumma
  Cc: kbuild-all, rppt, linux-mm, akpm, kirill.shutemov, jack,
	ross.zwisler, mhocko, aneesh.kumar, lstoakes, dave.jiang

On Thu, Jun 01, 2017 at 06:01:49PM -0400, Sarunya Pumma wrote:
> After the fault handler calls __do_fault to read the faulting page, it does
> not map the other pages that were read together with that page. This can
> lead to a large number of minor page faults. This patch therefore remaps
> pages around the fault page, aiming to map the pages that were read
> synchronously or asynchronously along with the fault page.
>
> The core of this patch is the redo_fault_around function. It computes the
> start and end offsets of the pages to be mapped, decides whether remapping
> is needed, remaps the pages using the map_pages function, and returns. The
> start and end offsets are computed the same way as in the do_fault_around
> function. To decide whether to remap, we check whether the pages around
> the fault page are already mapped. If they are, no remapping is performed.
>
> Because checking every page can be inefficient when the number of pages to
> be mapped is large, we have added a threshold called "vm_nr_remapping" that
> determines whether to check the status of every page around the fault page
> or only some of them. The vm_nr_remapping parameter can be adjusted via the
> sysctl interface. If the number of pages to be mapped is smaller than the
> vm_nr_remapping threshold, we check all pages around the fault page (within
> the start and end offsets). Otherwise, we check only the adjacent pages
> (left and right).
>
> Page remapping is beneficial for "almost sequential" page accesses, where
> pages are accessed in order but some pages are skipped.
>
> The following is one example scenario in which we can save one page fault
> every 16 pages:
>
> Assume that we want to access pages sequentially and skip every page that
> is marked as PG_readahead. Assume that the read-ahead size is 32 pages and
> the number of pages to be mapped each time (fault_around_pages) is 16.
>
> When accessing the page at offset 0, a major page fault occurs, so pages 0
> to 31 are read from the disk into the page cache. Page 24 is marked as a
> read-ahead page (PG_readahead). Then only page 0 is mapped into the virtual
> memory space.
>
> When accessing the page at offset 1, a minor page fault occurs and pages 0
> to 15 are mapped.
>
> We keep accessing pages up to page 31, skipping page 24.
>
> When accessing the page at offset 32, a major page fault occurs and the
> same process is repeated: the next 32 pages are read from the disk and only
> page 32 is mapped. A minor page fault then occurs at the next page (page
> 33).
>
> In this example, two page faults occur every 16 pages. With this patch, we
> can eliminate the minor page fault in every 16 pages.
>
> Thank you very much for taking the time to review the patch.
> 
> Signed-off-by: Sarunya Pumma <sarunya@vt.edu>

Please consider Kirill's feedback:

http://www.spinics.net/lists/linux-mm/msg127597.html

Really the only reason to consider this extra complexity would be if it
provided a performance benefit.  So, the onus is on the patch author to show
that the performance benefit is worth the code.
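
One hedged way to start collecting such numbers from user space is to count
the faults a sequential walk over a file mapping takes, before and after
applying the patch. The helper below is only a sketch: measure_faults and
struct fault_counts are hypothetical names, and it relies solely on the
standard mmap() and getrusage() interfaces. For major faults to show up,
the file has to be absent from the page cache when the walk starts.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result type: faults taken while walking the mapping. */
struct fault_counts {
	long minor;
	long major;
};

/*
 * Map 'path' read-only, touch one byte per page in order, and report
 * how many minor and major faults the process took while doing so.
 * Returns 0 on success and -1 on error.
 */
int measure_faults(const char *path, struct fault_counts *out)
{
	struct rusage before, after;
	struct stat st;
	volatile unsigned char sink = 0;
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	off_t off;
	int fd;

	if (page <= 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}

	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return -1;

	getrusage(RUSAGE_SELF, &before);
	for (off = 0; off < st.st_size; off += page)
		sink += p[off];	/* one read per page triggers the faults */
	getrusage(RUSAGE_SELF, &after);

	munmap(p, (size_t)st.st_size);

	out->minor = after.ru_minflt - before.ru_minflt;
	out->major = after.ru_majflt - before.ru_majflt;
	return 0;
}
```

Comparing the two counters on a patched and an unpatched kernel, together
with wall-clock time for the walk, would give reviewers a first data point.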

Also, it's helpful to reviewers to explicitly enumerate the differences
between patch versions.  If you have a cover letter that's a great place to
do this, or if you have a short series without a cover letter you can do it
below a --- section break like this:

https://patchwork.kernel.org/patch/9741461/

The extra text below the section break will be stripped off by git am when the
patch is applied.
