All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Hugh Dickins <hughd@google.com>,
	Wu Fengguang <fengguang.wu@intel.com>, Jan Kara <jack@suse.cz>,
	Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, Andi Kleen <ak@linux.intel.com>,
	Matthew Wilcox <willy@linux.intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Hillf Danton <dhillf@gmail.com>, Dave Hansen <dave@sr71.net>,
	Ning Qu <quning@google.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH 16/23] thp, mm: handle transhuge pages in do_generic_file_read()
Date: Sun,  4 Aug 2013 05:17:18 +0300	[thread overview]
Message-ID: <1375582645-29274-17-git-send-email-kirill.shutemov@linux.intel.com> (raw)
In-Reply-To: <1375582645-29274-1-git-send-email-kirill.shutemov@linux.intel.com>

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

If a transhuge page is already in page cache (up to date and not
readahead) we go usual path: read from relevant subpage (head or tail).

If page is not cached (sparse file in ramfs case) and the mapping can
have hugepage we try allocate a new one and read it.

If a page is not up to date or in readahead, we have to move 'page' to
head page of the compound page, since it represents state of whole
transhuge page. We will switch back to relevant subpage when page is
ready to be read ('page_ok' label).

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/filemap.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index c31d296..ed65af5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1175,8 +1175,28 @@ find_page:
 					ra, filp,
 					index, last_index - index);
 			page = find_get_page(mapping, index);
-			if (unlikely(page == NULL))
-				goto no_cached_page;
+			if (unlikely(page == NULL)) {
+				if (mapping_can_have_hugepages(mapping))
+					goto no_cached_page_thp;
+				else
+					goto no_cached_page;
+			}
+		}
+		if (PageTransCompound(page)) {
+			struct page *head = compound_trans_head(page);
+
+			if (!PageReadahead(head) && PageUptodate(page))
+				goto page_ok;
+
+			/*
+			 * Switch 'page' to head page. That's needed to handle
+			 * readahead or make page uptodate.
+			 * It will be switched back to the right tail page at
+			 * the begining 'page_ok'.
+			 */
+			page_cache_get(head);
+			page_cache_release(page);
+			page = head;
 		}
 		if (PageReadahead(page)) {
 			page_cache_async_readahead(mapping,
@@ -1198,6 +1218,18 @@ find_page:
 			unlock_page(page);
 		}
 page_ok:
+		/* Switch back to relevant tail page, if needed */
+		if (PageTransCompoundCache(page) && !PageTransTail(page)) {
+			int off = index & HPAGE_CACHE_INDEX_MASK;
+			if (off){
+				page_cache_get(page + off);
+				page_cache_release(page);
+				page += off;
+			}
+		}
+
+		VM_BUG_ON(page->index != index);
+
 		/*
 		 * i_size must be checked after we know the page is Uptodate.
 		 *
@@ -1329,6 +1361,27 @@ readpage_error:
 		page_cache_release(page);
 		goto out;
 
+no_cached_page_thp:
+		page = alloc_pages(mapping_gfp_mask(mapping) | __GFP_COLD,
+				HPAGE_PMD_ORDER);
+		if (!page) {
+			count_vm_event(THP_READ_ALLOC_FAILED);
+			goto no_cached_page;
+		}
+		count_vm_event(THP_READ_ALLOC);
+
+		error = add_to_page_cache_lru(page, mapping,
+				index & ~HPAGE_CACHE_INDEX_MASK, GFP_KERNEL);
+		if (!error)
+			goto readpage;
+
+		page_cache_release(page);
+		if (error != -EEXIST && error != -ENOSPC) {
+			desc->error = error;
+			goto out;
+		}
+
+		/* Fallback to small page */
 no_cached_page:
 		/*
 		 * Ok, it wasn't cached, so we need to create a new
-- 
1.8.3.2


WARNING: multiple messages have this Message-ID (diff)
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Hugh Dickins <hughd@google.com>,
	Wu Fengguang <fengguang.wu@intel.com>, Jan Kara <jack@suse.cz>,
	Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, Andi Kleen <ak@linux.intel.com>,
	Matthew Wilcox <willy@linux.intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Hillf Danton <dhillf@gmail.com>, Dave Hansen <dave@sr71.net>,
	Ning Qu <quning@google.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH 16/23] thp, mm: handle transhuge pages in do_generic_file_read()
Date: Sun,  4 Aug 2013 05:17:18 +0300	[thread overview]
Message-ID: <1375582645-29274-17-git-send-email-kirill.shutemov@linux.intel.com> (raw)
In-Reply-To: <1375582645-29274-1-git-send-email-kirill.shutemov@linux.intel.com>

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

If a transhuge page is already in page cache (up to date and not
readahead) we go usual path: read from relevant subpage (head or tail).

If page is not cached (sparse file in ramfs case) and the mapping can
have hugepage we try allocate a new one and read it.

If a page is not up to date or in readahead, we have to move 'page' to
head page of the compound page, since it represents state of whole
transhuge page. We will switch back to relevant subpage when page is
ready to be read ('page_ok' label).

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/filemap.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index c31d296..ed65af5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1175,8 +1175,28 @@ find_page:
 					ra, filp,
 					index, last_index - index);
 			page = find_get_page(mapping, index);
-			if (unlikely(page == NULL))
-				goto no_cached_page;
+			if (unlikely(page == NULL)) {
+				if (mapping_can_have_hugepages(mapping))
+					goto no_cached_page_thp;
+				else
+					goto no_cached_page;
+			}
+		}
+		if (PageTransCompound(page)) {
+			struct page *head = compound_trans_head(page);
+
+			if (!PageReadahead(head) && PageUptodate(page))
+				goto page_ok;
+
+			/*
+			 * Switch 'page' to head page. That's needed to handle
+			 * readahead or make page uptodate.
+			 * It will be switched back to the right tail page at
+			 * the begining 'page_ok'.
+			 */
+			page_cache_get(head);
+			page_cache_release(page);
+			page = head;
 		}
 		if (PageReadahead(page)) {
 			page_cache_async_readahead(mapping,
@@ -1198,6 +1218,18 @@ find_page:
 			unlock_page(page);
 		}
 page_ok:
+		/* Switch back to relevant tail page, if needed */
+		if (PageTransCompoundCache(page) && !PageTransTail(page)) {
+			int off = index & HPAGE_CACHE_INDEX_MASK;
+			if (off){
+				page_cache_get(page + off);
+				page_cache_release(page);
+				page += off;
+			}
+		}
+
+		VM_BUG_ON(page->index != index);
+
 		/*
 		 * i_size must be checked after we know the page is Uptodate.
 		 *
@@ -1329,6 +1361,27 @@ readpage_error:
 		page_cache_release(page);
 		goto out;
 
+no_cached_page_thp:
+		page = alloc_pages(mapping_gfp_mask(mapping) | __GFP_COLD,
+				HPAGE_PMD_ORDER);
+		if (!page) {
+			count_vm_event(THP_READ_ALLOC_FAILED);
+			goto no_cached_page;
+		}
+		count_vm_event(THP_READ_ALLOC);
+
+		error = add_to_page_cache_lru(page, mapping,
+				index & ~HPAGE_CACHE_INDEX_MASK, GFP_KERNEL);
+		if (!error)
+			goto readpage;
+
+		page_cache_release(page);
+		if (error != -EEXIST && error != -ENOSPC) {
+			desc->error = error;
+			goto out;
+		}
+
+		/* Fallback to small page */
 no_cached_page:
 		/*
 		 * Ok, it wasn't cached, so we need to create a new
-- 
1.8.3.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2013-08-04  2:16 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-04  2:17 [PATCHv5 00/23] Transparent huge page cache: phase 1, everything but mmap() Kirill A. Shutemov
2013-08-04  2:17 ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 01/23] radix-tree: implement preload for multiple contiguous elements Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-05 11:17   ` Jan Kara
2013-08-05 11:17     ` Jan Kara
2013-08-06 16:34     ` Matthew Wilcox
2013-08-06 16:34       ` Matthew Wilcox
2013-08-06 20:17       ` Jan Kara
2013-08-06 20:17         ` Jan Kara
2013-08-07 16:32     ` Kirill A. Shutemov
2013-08-07 16:32       ` Kirill A. Shutemov
2013-08-07 16:32       ` Kirill A. Shutemov
2013-08-07 20:00       ` Jan Kara
2013-08-07 20:00         ` Jan Kara
2013-08-07 20:24         ` Kirill A. Shutemov
2013-08-07 20:24           ` Kirill A. Shutemov
2013-08-07 20:24           ` Kirill A. Shutemov
2013-08-07 20:36           ` Jan Kara
2013-08-07 20:36             ` Jan Kara
2013-08-07 21:37             ` Kirill A. Shutemov
2013-08-07 21:37               ` Kirill A. Shutemov
2013-08-07 21:37               ` Kirill A. Shutemov
2013-08-08  8:45               ` Kirill A. Shutemov
2013-08-08  8:45                 ` Kirill A. Shutemov
2013-08-08  8:45                 ` Kirill A. Shutemov
2013-08-08 10:04                 ` Jan Kara
2013-08-08 10:04                   ` Jan Kara
2013-08-09 11:13                   ` Kirill A. Shutemov
2013-08-09 11:13                     ` Kirill A. Shutemov
2013-08-09 11:13                     ` Kirill A. Shutemov
2013-08-09 11:36                     ` Jan Kara
2013-08-09 11:36                       ` Jan Kara
2013-08-04  2:17 ` [PATCH 02/23] memcg, thp: charge huge cache pages Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  8:25   ` Michal Hocko
2013-08-04  8:25     ` Michal Hocko
2013-08-04  2:17 ` [PATCH 03/23] thp: compile-time and sysfs knob for thp pagecache Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-09-05 21:53   ` Ning Qu
2013-09-06 11:33     ` Kirill A. Shutemov
2013-09-06 11:33       ` Kirill A. Shutemov
2013-09-06 11:33       ` Kirill A. Shutemov
2013-09-06 17:14       ` Ning Qu
2013-08-04  2:17 ` [PATCH 04/23] thp, mm: introduce mapping_can_have_hugepages() predicate Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 05/23] thp: represent file thp pages in meminfo and friends Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-30 22:16   ` Ning Qu
2013-09-02 11:36     ` Kirill A. Shutemov
2013-09-02 11:36       ` Kirill A. Shutemov
2013-09-02 20:05       ` Ning Qu
2013-08-04  2:17 ` [PATCH 06/23] thp, mm: rewrite add_to_page_cache_locked() to support huge pages Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 07/23] mm: trace filemap: dump page order Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 08/23] block: implement add_bdi_stat() Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-05 11:21   ` Jan Kara
2013-08-05 11:21     ` Jan Kara
2013-08-04  2:17 ` [PATCH 09/23] thp, mm: rewrite delete_from_page_cache() to support huge pages Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 10/23] thp, mm: warn if we try to use replace_page_cache_page() with THP Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 11/23] thp, mm: handle tail pages in page_cache_get_speculative() Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 12/23] thp, mm: add event counters for huge page alloc on file write or read Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 13/23] thp, mm: allocate huge pages in grab_cache_page_write_begin() Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 14/23] thp, mm: naive support of thp in generic_perform_write Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 15/23] mm, fs: avoid page allocation beyond i_size on read Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-05  0:29   ` NeilBrown
2013-08-05  0:29     ` NeilBrown
2013-08-04  2:17 ` Kirill A. Shutemov [this message]
2013-08-04  2:17   ` [PATCH 16/23] thp, mm: handle transhuge pages in do_generic_file_read() Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 17/23] thp, libfs: initial thp support Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 18/23] thp: libfs: introduce simple_thp_release() Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 19/23] truncate: support huge pages Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-05 13:29   ` Jan Kara
2013-08-05 13:29     ` Jan Kara
2013-08-06 20:23   ` Dave Hansen
2013-08-06 20:23     ` Dave Hansen
2013-08-06 20:57     ` Kirill A. Shutemov
2013-08-06 20:57       ` Kirill A. Shutemov
2013-08-06 21:55   ` Dave Hansen
2013-08-06 21:55     ` Dave Hansen
2013-08-09 14:39     ` Kirill A. Shutemov
2013-08-09 14:39       ` Kirill A. Shutemov
2013-08-09 14:39       ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 20/23] thp: handle file pages in split_huge_page() Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-06 19:09   ` Ning Qu
2013-08-06 21:09     ` Ning Qu
2013-08-06 21:47       ` Ning Qu
2013-08-09 14:46         ` Kirill A. Shutemov
2013-08-09 14:46           ` Kirill A. Shutemov
2013-08-09 14:46           ` Kirill A. Shutemov
2013-08-09 14:49           ` Ning Qu
2013-08-09 21:24             ` Ning Qu
2013-08-09 21:24               ` Ning Qu
2013-08-04  2:17 ` [PATCH 21/23] thp: wait_split_huge_page(): serialize over i_mmap_mutex too Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 22/23] thp, mm: split huge page on mmap file page Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov
2013-08-08 20:49   ` Khalid Aziz
2013-08-08 20:49     ` Khalid Aziz
2013-08-09 14:50     ` Kirill A. Shutemov
2013-08-09 14:50       ` Kirill A. Shutemov
2013-08-04  2:17 ` [PATCH 23/23] ramfs: enable transparent huge page cache Kirill A. Shutemov
2013-08-04  2:17   ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1375582645-29274-17-git-send-email-kirill.shutemov@linux.intel.com \
    --to=kirill.shutemov@linux.intel.com \
    --cc=aarcange@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave@sr71.net \
    --cc=dhillf@gmail.com \
    --cc=fengguang.wu@intel.com \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kirill@shutemov.name \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=quning@google.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.