* [00/37] Large Blocksize Support V4
@ 2007-06-20 18:29 clameter
  2007-06-20 18:29 ` [01/37] Define functions for page cache handling clameter
                   ` (36 more replies)
  0 siblings, 37 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

V3->V4
- It is possible to transparently make filesystems support larger
  blocksizes by simply allowing larger blocksizes in set_blocksize.
  Remove all special modifications for mmap etc. from the filesystems.
  This brings the number of disk-based filesystems that can use larger
  blocks to three (reiserfs, ext2, XFS). Are there any other useful
  ones to make work?
- Patch against 2.6.22-rc4-mm2 which allows the use of Mel's antifrag
  logic to avoid fragmentation.
- More page cache cleanup by applying the functions to filesystems.
- Disable bouncing when the gfp mask is set up.
- Disable mmap directly in mm/filemap.c to avoid filesystem changes
  while we have no mmap support for higher order pages.

RFC V2->V3
- More restructuring
- It actually works!
- Add XFS support
- Fix up UP support
- Work out the direct I/O issues
- Add CONFIG_LARGE_BLOCKSIZE. Off by default, which makes the inlines
  revert back to constants. Disabled for 32-bit and HIGHMEM configurations.
  This also allows a gradual migration to the new page cache
  inline functions. LARGE_BLOCKSIZE capabilities can be
  added gradually and if there is a problem then we can disable
  a subsystem.

RFC V1->V2
- Some ext2 support
- Some block layer, fs layer support etc.
- Better page cache macros
- Use macros to clean up code.

This patchset modifies the Linux kernel so that block sizes larger than
the page size can be supported. Larger block sizes are handled by using
compound pages of an arbitrary order for the page cache instead of
single pages of order 0.
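
To illustrate the mechanism (a sketch only, not code from this patchset;
mapping_order() is introduced in patch 01 and returns 0 until variable
page cache sizes are enabled):

	/*
	 * A page cache page for a 64k-block mapping on a 4k base page
	 * system would be a single order-4 compound page.
	 */
	int order = mapping_order(mapping);	/* e.g. 4 for 64k blocks */
	struct page *page = alloc_pages(mapping_gfp_mask(mapping) |
						__GFP_COMP, order);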

Rationales:

1. We have problems supporting devices with a blocksize larger than
   the page size. This is important, for example, for CDs and DVDs that
   can only read and write 32k or 64k blocks. We currently have a shim
   layer in there to deal with this situation, which limits I/O speed.
   Because of this deficiency the developers are currently looking for
   ways to bypass the page cache entirely.

2. 32/64k blocksize is also used in flash devices. Same issues.

3. Future hard disks will support bigger block sizes, which Linux cannot
   handle since we are limited to PAGE_SIZE. Granted, the on-board cache
   may buffer this for us, but what is the point of handling smaller
   page sizes than what the drive supports?

4. Reduce fsck times. Larger block sizes mean faster filesystem checking.
   Using a 64k block size will reduce the number of blocks to be managed
   by a factor of 16 and produce much denser and more contiguous metadata.

5. Performance. If we look at IA64 vs. x86_64 then it seems that the
   faster interrupt handling on x86_64 compensates for the speed loss due
   to a smaller page size (4k vs. 16k on IA64). Supporting larger block
   sizes on all architectures allows a significant reduction in I/O
   overhead and increases the size of I/O that the hardware can perform
   in a single request, since the number of scatter/gather entries per
   request is typically limited. This is going to become increasingly
   important for supporting ever growing memory sizes, since we may have
   to handle excessively large numbers of 4k requests for data sizes that
   may become common soon. For example, to write a 1 terabyte file the
   kernel would have to handle 256 million 4k chunks.

6. Cross-arch compatibility: It is currently not possible to mount a
   16k-blocksize ext2 filesystem created on IA64 on an x86_64 system.
   With this patchset it becomes possible. Note that some filesystems
   (ext2, XFS) are already capable of working with blocksizes of up to
   64k, which is currently only available on a select few arches. This
   patchset enables that functionality on all arches. There are no
   special modifications needed to the filesystems; the set_blocksize()
   call will simply accept a larger blocksize.

7. VM scalability
   Large block sizes mean less state keeping for the information being
   transferred. For a 1TB file one needs to handle 256 million page
   structs in the VM if one uses a 4k page size. A 64k page size reduces
   that amount to 16 million. If the limitations in existing filesystems
   are removed then even higher reductions become possible. For very
   large files like that, a page size of 2MB may be beneficial, which
   will reduce the number of page structs to handle to 512k. The variable
   nature of the block size means that the size can be tuned at
   filesystem creation time for the anticipated needs of a volume.
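
   The arithmetic behind these numbers (1TB = 2^40 bytes):

	2^40 / 2^12 (4k)  = 2^28 = 256 * 2^20	~256 million page structs
	2^40 / 2^16 (64k) = 2^24 =  16 * 2^20	 ~16 million page structs
	2^40 / 2^21 (2MB) = 2^19 = 512 * 2^10	       512k page structs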

8. IO scalability
   The IO layer will receive large blocks of contiguous memory with
   this patchset. This means that fewer scatter/gather elements are
   needed and the memory used is guaranteed to be contiguous. Instead of
   having to handle 4k chunks we can, for example, handle 64k chunks in
   one go.

   Dave Chinner measured a performance increase of 50% when going to a
   64k blocksize with XFS.
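
   A round-number illustration of the scatter/gather savings: a single
   1MB request built from 4k pages needs up to 1MB / 4k = 256 sg
   entries, while the same request built from 64k pages needs at most
   1MB / 64k = 16, leaving room for far larger requests within the same
   per-request sg limit.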

How to make this work:

1. Apply this patchset on top of 2.6.22-rc4-mm2
2. Enable LARGE_BLOCKSIZE support
3. Compile the kernel

In order to use a filesystem with a higher order, it needs to be formatted
with a larger blocksize. This is done using the mkfs.xxx tool for each
filesystem. The existing tools work without modification; they may just
warn you that the blocksize you specify is not supported on your particular
architecture. Ignore that warning, since it is no longer true after you
have applied this patchset.
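
For example (the device name is a placeholder; -b is the standard
blocksize option of both tools):

	mkfs.xfs -b size=65536 /dev/sdb1
	mke2fs -b 65536 /dev/sdb1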


Tested file systems:

Filesystem	Max Blocksize	Changes

Reiserfs	8k		Page size functions
Ext2		64k		Page size functions
XFS		64k		Page size functions / Remove PAGE_SIZE check
Ramfs		MAX_ORDER	Parameter to specify order

Todo/Issues:

- There are certainly numerous issues with this patchset. I have only
  tested copying files back and forth, volume creation, etc. Others have
  run fsx-linux on the volumes. The missing mmap support limits what can
  be done for now.

- Antifragmentation in mm does address some fragmentation issues (typically
  works up to 32k blocksize). However, large orders lead to fragmentation of
  the movable sections. Seems that we need Mel's memory compaction to support
  even larger orders. How memory compaction impacts performance still has to
  be determined.

- Support for bouncing pages.

- Remove the PAGE_CACHE_xxx constants after using the page_cache_xxx
  functions everywhere. But that will have to wait until merging becomes
  possible. For now certain subsystems (e.g. shmem) are not using these
  functions. They will only use order-0 pages.

- Support for non-harddisk-based filesystems. Remove the pktcdvd etc.
  layers that are needed because the VM currently does not support
  sufficiently large blocksizes for these devices. Look for other places
  in the kernel where we have similar issues.

- Mmap read support

  It's likely easier to do restricted read-only mmap support first. That
  would enable running executables off filesystems with a large block
  size.

- Full mmap support

--


* [01/37] Define functions for page cache handling
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [02/37] Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user clameter
                   ` (35 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_page_cache_functions --]
[-- Type: text/plain, Size: 3342 bytes --]

We use the macros PAGE_CACHE_SIZE, PAGE_CACHE_SHIFT, PAGE_CACHE_MASK
and PAGE_CACHE_ALIGN in various places in the kernel. Many times,
common operations like calculating the offset or the index are
open-coded using shifts and adds. This patch provides inline functions
to accomplish those calculations in a consistent way.

All functions take an address_space pointer. The address_space pointer
will eventually be used to support a variable-size page cache:
information reachable via the mapping may then determine the page size.

New function			Related base page constant
---------------------------------------------------
page_cache_shift(a)		PAGE_CACHE_SHIFT
page_cache_size(a)		PAGE_CACHE_SIZE
page_cache_mask(a)		PAGE_CACHE_MASK
page_cache_index(a, pos)	Calculate page number from position
page_cache_next(a, pos)		Index of the page starting on or after pos
page_cache_offset(a, pos)	Calculate offset into a page
page_cache_pos(a, index, offset)
				Form position based on page number
				and an offset.

This provides a basis that would allow the conversion of all page cache
handling in the kernel and ultimately allow the removal of the PAGE_CACHE_*
constants.
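
A minimal before/after sketch of the intended conversion (the "after"
form is what the following patches in this series apply throughout):

	/* Before: open-coded with the base page constants */
	pgoff_t index = pos >> PAGE_CACHE_SHIFT;
	unsigned offset = pos & (PAGE_CACHE_SIZE - 1);

	/* After: derived from the mapping, ready for variable page sizes */
	pgoff_t index = page_cache_index(mapping, pos);
	unsigned offset = page_cache_offset(mapping, pos);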

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/pagemap.h |   54 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

Index: vps/include/linux/pagemap.h
===================================================================
--- vps.orig/include/linux/pagemap.h	2007-06-08 10:57:49.000000000 -0700
+++ vps/include/linux/pagemap.h	2007-06-08 11:01:37.000000000 -0700
@@ -52,12 +52,66 @@ static inline void mapping_set_gfp_mask(
  * space in smaller chunks for same flexibility).
  *
  * Or rather, it _will_ be done in larger chunks.
+ *
+ * The following constants can be used if a filesystem only supports a single
+ * page size.
  */
 #define PAGE_CACHE_SHIFT	PAGE_SHIFT
 #define PAGE_CACHE_SIZE		PAGE_SIZE
 #define PAGE_CACHE_MASK		PAGE_MASK
 #define PAGE_CACHE_ALIGN(addr)	(((addr)+PAGE_CACHE_SIZE-1)&PAGE_CACHE_MASK)
 
+/*
+ * Functions that are currently set up for a fixed PAGE_SIZE. The use of
+ * these will allow a variable page size pagecache in the future.
+ */
+static inline int mapping_order(struct address_space *a)
+{
+	return 0;
+}
+
+static inline int page_cache_shift(struct address_space *a)
+{
+	return PAGE_SHIFT;
+}
+
+static inline unsigned int page_cache_size(struct address_space *a)
+{
+	return PAGE_SIZE;
+}
+
+static inline loff_t page_cache_mask(struct address_space *a)
+{
+	return (loff_t)PAGE_MASK;
+}
+
+static inline unsigned int page_cache_offset(struct address_space *a,
+		loff_t pos)
+{
+	return pos & ~PAGE_MASK;
+}
+
+static inline pgoff_t page_cache_index(struct address_space *a,
+		loff_t pos)
+{
+	return pos >> page_cache_shift(a);
+}
+
+/*
+ * Index of the page starting on or after the given position.
+ */
+static inline pgoff_t page_cache_next(struct address_space *a,
+		loff_t pos)
+{
+	return page_cache_index(a, pos + page_cache_size(a) - 1);
+}
+
+static inline loff_t page_cache_pos(struct address_space *a,
+		pgoff_t index, unsigned long offset)
+{
+	return ((loff_t)index << page_cache_shift(a)) + offset;
+}
+
 #define page_cache_get(page)		get_page(page)
 #define page_cache_release(page)	put_page(page)
 void release_pages(struct page **pages, int nr, int cold);

-- 


* [02/37] Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
  2007-06-20 18:29 ` [01/37] Define functions for page cache handling clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [03/37] Use page_cache_xxx function in mm/filemap.c clameter
                   ` (34 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: zero_user_segments --]
[-- Type: text/plain, Size: 27828 bytes --]

Simplify page cache zeroing of segments of pages through three functions:

zero_user_segments(page, start1, end1, start2, end2)

	Zeros two segments of the page. It takes the positions where the
	zeroing starts and ends, which avoids length calculations.

zero_user_segment(page, start, end)

	Same for a single segment.

zero_user(page, start, length)

	Length variant for the case where we know the length.



We remove the zero_user_page macro. Issues:

1. It's a macro. Inline functions are preferable.

2. The KM_USER0 macro is only defined for HIGHMEM.

   Having to treat this special case everywhere makes the
   code needlessly complex. The parameter for zeroing is always
   KM_USER0 except in one single case that we open-code.

Avoiding KM_USER0 means a lot of code no longer has to deal
with the special casing for HIGHMEM. Dealing with kmap is only
necessary for HIGHMEM configurations; in those configurations
we use KM_USER0 as we do for a series of other functions
defined in highmem.h.

Since KM_USER0 depends on HIGHMEM, the existing zero_user_page
could not be an inline function and had to be a macro. The
zero_user_* functions introduced here can be inline functions
because that constant is not used when they are called.

Extract the flushing of the caches so that it occurs outside of the kmap
section.
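
As a usage sketch (this is the __block_prepare_write conversion from the
patch below): zeroing the regions of a page outside [from, to) becomes a
single call under a single kmap:

	if (block_end > to || block_start < from)
		zero_user_segments(page, to, block_end,
					block_start, from);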

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/affs/file.c                           |    2 -
 fs/buffer.c                              |   47 +++++++++--------------------
 fs/direct-io.c                           |    4 +-
 fs/ecryptfs/mmap.c                       |    5 +--
 fs/ext3/inode.c                          |    4 +-
 fs/gfs2/bmap.c                           |    2 -
 fs/libfs.c                               |   19 +++---------
 fs/mpage.c                               |    7 +---
 fs/nfs/read.c                            |   10 +++---
 fs/nfs/write.c                           |    2 -
 fs/ntfs/aops.c                           |   18 ++++++-----
 fs/ntfs/file.c                           |   32 +++++++++-----------
 fs/ocfs2/aops.c                          |    2 -
 fs/reiser4/plugin/file/cryptcompress.c   |    8 +----
 fs/reiser4/plugin/file/file.c            |    2 -
 fs/reiser4/plugin/item/ctail.c           |    2 -
 fs/reiser4/plugin/item/extent_file_ops.c |    4 +-
 fs/reiser4/plugin/item/tail.c            |    3 -
 fs/reiserfs/inode.c                      |    4 +-
 fs/xfs/linux-2.6/xfs_lrw.c               |    2 -
 include/linux/highmem.h                  |   49 +++++++++++++++++++------------
 mm/filemap_xip.c                         |    2 -
 mm/truncate.c                            |    2 -
 23 files changed, 107 insertions(+), 125 deletions(-)

Index: vps/include/linux/highmem.h
===================================================================
--- vps.orig/include/linux/highmem.h	2007-06-11 22:33:01.000000000 -0700
+++ vps/include/linux/highmem.h	2007-06-11 22:33:07.000000000 -0700
@@ -124,28 +124,41 @@ static inline void clear_highpage(struct
 	kunmap_atomic(kaddr, KM_USER0);
 }
 
-/*
- * Same but also flushes aliased cache contents to RAM.
- *
- * This must be a macro because KM_USER0 and friends aren't defined if
- * !CONFIG_HIGHMEM
- */
-#define zero_user_page(page, offset, size, km_type)		\
-	do {							\
-		void *kaddr;					\
-								\
-		BUG_ON((offset) + (size) > PAGE_SIZE);		\
-								\
-		kaddr = kmap_atomic(page, km_type);		\
-		memset((char *)kaddr + (offset), 0, (size));	\
-		flush_dcache_page(page);			\
-		kunmap_atomic(kaddr, (km_type));		\
-	} while (0)
+static inline void zero_user_segments(struct page *page,
+	unsigned start1, unsigned end1,
+	unsigned start2, unsigned end2)
+{
+	void *kaddr = kmap_atomic(page, KM_USER0);
+
+	BUG_ON(end1 > PAGE_SIZE ||
+		end2 > PAGE_SIZE);
+
+	if (end1 > start1)
+		memset(kaddr + start1, 0, end1 - start1);
+
+	if (end2 > start2)
+		memset(kaddr + start2, 0, end2 - start2);
+
+	kunmap_atomic(kaddr, KM_USER0);
+	flush_dcache_page(page);
+}
+
+static inline void zero_user_segment(struct page *page,
+	unsigned start, unsigned end)
+{
+	zero_user_segments(page, start, end, 0, 0);
+}
+
+static inline void zero_user(struct page *page,
+	unsigned start, unsigned size)
+{
+	zero_user_segments(page, start, start + size, 0, 0);
+}
 
 static inline void __deprecated memclear_highpage_flush(struct page *page,
 			unsigned int offset, unsigned int size)
 {
-	zero_user_page(page, offset, size, KM_USER0);
+	zero_user(page, offset, size);
 }
 
 #ifndef __HAVE_ARCH_COPY_USER_HIGHPAGE
Index: vps/fs/buffer.c
===================================================================
--- vps.orig/fs/buffer.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/buffer.c	2007-06-11 22:49:08.000000000 -0700
@@ -1792,7 +1792,7 @@ void page_zero_new_buffers(struct page *
 					start = max(from, block_start);
 					size = min(to, block_end) - start;
 
-					zero_user_page(page, start, size, KM_USER0);
+					zero_user(page, start, size);
 					set_buffer_uptodate(bh);
 				}
 
@@ -1855,19 +1855,10 @@ static int __block_prepare_write(struct 
 					mark_buffer_dirty(bh);
 					continue;
 				}
-				if (block_end > to || block_start < from) {
-					void *kaddr;
-
-					kaddr = kmap_atomic(page, KM_USER0);
-					if (block_end > to)
-						memset(kaddr+to, 0,
-							block_end-to);
-					if (block_start < from)
-						memset(kaddr+block_start,
-							0, from-block_start);
-					flush_dcache_page(page);
-					kunmap_atomic(kaddr, KM_USER0);
-				}
+				if (block_end > to || block_start < from)
+					zero_user_segments(page,
+							to, block_end,
+							block_start, from);
 				continue;
 			}
 		}
@@ -2095,8 +2086,7 @@ int block_read_full_page(struct page *pa
 					SetPageError(page);
 			}
 			if (!buffer_mapped(bh)) {
-				zero_user_page(page, i * blocksize, blocksize,
-						KM_USER0);
+				zero_user(page, i * blocksize, blocksize);
 				if (!err)
 					set_buffer_uptodate(bh);
 				continue;
@@ -2209,7 +2199,7 @@ int cont_expand_zero(struct file *file, 
 						&page, &fsdata);
 		if (err)
 			goto out;
-		zero_user_page(page, zerofrom, len, KM_USER0);
+		zero_user(page, zerofrom, len);
 		err = pagecache_write_end(file, mapping, curpos, len, len,
 						page, fsdata);
 		if (err < 0)
@@ -2236,7 +2226,7 @@ int cont_expand_zero(struct file *file, 
 						&page, &fsdata);
 		if (err)
 			goto out;
-		zero_user_page(page, zerofrom, len, KM_USER0);
+		zero_user(page, zerofrom, len);
 		err = pagecache_write_end(file, mapping, curpos, len, len,
 						page, fsdata);
 		if (err < 0)
@@ -2350,7 +2340,6 @@ int nobh_prepare_write(struct page *page
 	unsigned block_in_page;
 	unsigned block_start;
 	sector_t block_in_file;
-	char *kaddr;
 	int nr_reads = 0;
 	int i;
 	int ret = 0;
@@ -2390,13 +2379,8 @@ int nobh_prepare_write(struct page *page
 		if (PageUptodate(page))
 			continue;
 		if (buffer_new(&map_bh) || !buffer_mapped(&map_bh)) {
-			kaddr = kmap_atomic(page, KM_USER0);
-			if (block_start < from)
-				memset(kaddr+block_start, 0, from-block_start);
-			if (block_end > to)
-				memset(kaddr + to, 0, block_end - to);
-			flush_dcache_page(page);
-			kunmap_atomic(kaddr, KM_USER0);
+			zero_user_segments(page, block_start, from,
+						to, block_end);
 			continue;
 		}
 		if (buffer_uptodate(&map_bh))
@@ -2462,7 +2446,7 @@ failed:
 	 * Error recovery is pretty slack.  Clear the page and mark it dirty
 	 * so we'll later zero out any blocks which _were_ allocated.
 	 */
-	zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+	zero_user(page, 0, PAGE_CACHE_SIZE);
 	SetPageUptodate(page);
 	set_page_dirty(page);
 	return ret;
@@ -2531,7 +2515,7 @@ int nobh_writepage(struct page *page, ge
 	 * the  page size, the remaining memory is zeroed when mapped, and
 	 * writes to that region are not written out to the file."
 	 */
-	zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, KM_USER0);
+	zero_user_segment(page, offset, PAGE_CACHE_SIZE);
 out:
 	ret = mpage_writepage(page, get_block, wbc);
 	if (ret == -EAGAIN)
@@ -2565,8 +2549,7 @@ int nobh_truncate_page(struct address_sp
 	to = (offset + blocksize) & ~(blocksize - 1);
 	ret = a_ops->prepare_write(NULL, page, offset, to);
 	if (ret == 0) {
-		zero_user_page(page, offset, PAGE_CACHE_SIZE - offset,
-				KM_USER0);
+		zero_user_segment(page, offset, PAGE_CACHE_SIZE);
 		/*
 		 * It would be more correct to call aops->commit_write()
 		 * here, but this is more efficient.
@@ -2645,7 +2628,7 @@ int block_truncate_page(struct address_s
 			goto unlock;
 	}
 
-	zero_user_page(page, offset, length, KM_USER0);
+	zero_user(page, offset, length);
 	mark_buffer_dirty(bh);
 	err = 0;
 
@@ -2691,7 +2674,7 @@ int block_write_full_page(struct page *p
 	 * the  page size, the remaining memory is zeroed when mapped, and
 	 * writes to that region are not written out to the file."
 	 */
-	zero_user_page(page, offset, PAGE_CACHE_SIZE - offset, KM_USER0);
+	zero_user_segment(page, offset, PAGE_CACHE_SIZE);
 	return __block_write_full_page(inode, page, get_block, wbc);
 }
 
Index: vps/fs/libfs.c
===================================================================
--- vps.orig/fs/libfs.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/libfs.c	2007-06-11 22:49:09.000000000 -0700
@@ -340,13 +340,10 @@ int simple_prepare_write(struct file *fi
 			unsigned from, unsigned to)
 {
 	if (!PageUptodate(page)) {
-		if (to - from != PAGE_CACHE_SIZE) {
-			void *kaddr = kmap_atomic(page, KM_USER0);
-			memset(kaddr, 0, from);
-			memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
-			flush_dcache_page(page);
-			kunmap_atomic(kaddr, KM_USER0);
-		}
+		if (to - from != PAGE_CACHE_SIZE)
+			zero_user_segments(page,
+				0, from,
+				to, PAGE_CACHE_SIZE);
 	}
 	return 0;
 }
@@ -396,12 +393,8 @@ int simple_write_end(struct file *file, 
 	unsigned from = pos & (PAGE_CACHE_SIZE - 1);
 
 	/* zero the stale part of the page if we did a short copy */
-	if (copied < len) {
-		void *kaddr = kmap_atomic(page, KM_USER0);
-		memset(kaddr + from + copied, 0, len - copied);
-		flush_dcache_page(page);
-		kunmap_atomic(kaddr, KM_USER0);
-	}
+	if (copied < len)
+		zero_user(page, from + copied, len - copied);
 
 	simple_commit_write(file, page, from, from+copied);
 
Index: vps/fs/affs/file.c
===================================================================
--- vps.orig/fs/affs/file.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/affs/file.c	2007-06-11 22:33:07.000000000 -0700
@@ -628,7 +628,7 @@ static int affs_prepare_write_ofs(struct
 			return err;
 	}
 	if (to < PAGE_CACHE_SIZE) {
-		zero_user_page(page, to, PAGE_CACHE_SIZE - to, KM_USER0);
+		zero_user_segment(page, to, PAGE_CACHE_SIZE);
 		if (size > offset + to) {
 			if (size < offset + PAGE_CACHE_SIZE)
 				tmp = size & ~PAGE_CACHE_MASK;
Index: vps/fs/mpage.c
===================================================================
--- vps.orig/fs/mpage.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/mpage.c	2007-06-11 22:49:08.000000000 -0700
@@ -284,9 +284,7 @@ do_mpage_readpage(struct bio *bio, struc
 	}
 
 	if (first_hole != blocks_per_page) {
-		zero_user_page(page, first_hole << blkbits,
-				PAGE_CACHE_SIZE - (first_hole << blkbits),
-				KM_USER0);
+		zero_user_segment(page, first_hole << blkbits, PAGE_CACHE_SIZE);
 		if (first_hole == 0) {
 			SetPageUptodate(page);
 			unlock_page(page);
@@ -579,8 +577,7 @@ page_is_mapped:
 
 		if (page->index > end_index || !offset)
 			goto confused;
-		zero_user_page(page, offset, PAGE_CACHE_SIZE - offset,
-				KM_USER0);
+		zero_user_segment(page, offset, PAGE_CACHE_SIZE);
 	}
 
 	/*
Index: vps/fs/ntfs/aops.c
===================================================================
--- vps.orig/fs/ntfs/aops.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/ntfs/aops.c	2007-06-11 22:33:07.000000000 -0700
@@ -87,13 +87,17 @@ static void ntfs_end_buffer_async_read(s
 		/* Check for the current buffer head overflowing. */
 		if (unlikely(file_ofs + bh->b_size > init_size)) {
 			int ofs;
+			void *kaddr;
 
 			ofs = 0;
 			if (file_ofs < init_size)
 				ofs = init_size - file_ofs;
 			local_irq_save(flags);
-			zero_user_page(page, bh_offset(bh) + ofs,
-					 bh->b_size - ofs, KM_BIO_SRC_IRQ);
+			kaddr = kmap_atomic(page, KM_BIO_SRC_IRQ);
+			memset(kaddr + bh_offset(bh) + ofs, 0,
+					bh->b_size - ofs);
+			flush_dcache_page(page);
+			kunmap_atomic(kaddr, KM_BIO_SRC_IRQ);
 			local_irq_restore(flags);
 		}
 	} else {
@@ -334,7 +338,7 @@ handle_hole:
 		bh->b_blocknr = -1UL;
 		clear_buffer_mapped(bh);
 handle_zblock:
-		zero_user_page(page, i * blocksize, blocksize, KM_USER0);
+		zero_user(page, i * blocksize, blocksize);
 		if (likely(!err))
 			set_buffer_uptodate(bh);
 	} while (i++, iblock++, (bh = bh->b_this_page) != head);
@@ -451,7 +455,7 @@ retry_readpage:
 	 * ok to ignore the compressed flag here.
 	 */
 	if (unlikely(page->index > 0)) {
-		zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+		zero_user(page, 0, PAGE_CACHE_SIZE);
 		goto done;
 	}
 	if (!NInoAttr(ni))
@@ -780,8 +784,7 @@ lock_retry_remap:
 		if (err == -ENOENT || lcn == LCN_ENOENT) {
 			bh->b_blocknr = -1;
 			clear_buffer_dirty(bh);
-			zero_user_page(page, bh_offset(bh), blocksize,
-					KM_USER0);
+			zero_user(page, bh_offset(bh), blocksize);
 			set_buffer_uptodate(bh);
 			err = 0;
 			continue;
@@ -1406,8 +1409,7 @@ retry_writepage:
 		if (page->index >= (i_size >> PAGE_CACHE_SHIFT)) {
 			/* The page straddles i_size. */
 			unsigned int ofs = i_size & ~PAGE_CACHE_MASK;
-			zero_user_page(page, ofs, PAGE_CACHE_SIZE - ofs,
-					KM_USER0);
+			zero_user_segment(page, ofs, PAGE_CACHE_SIZE);
 		}
 		/* Handle mst protected attributes. */
 		if (NInoMstProtected(ni))
Index: vps/fs/reiserfs/inode.c
===================================================================
--- vps.orig/fs/reiserfs/inode.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/reiserfs/inode.c	2007-06-11 22:33:07.000000000 -0700
@@ -2151,7 +2151,7 @@ int reiserfs_truncate_file(struct inode 
 		/* if we are not on a block boundary */
 		if (length) {
 			length = blocksize - length;
-			zero_user_page(page, offset, length, KM_USER0);
+			zero_user(page, offset, length);
 			if (buffer_mapped(bh) && bh->b_blocknr != 0) {
 				mark_buffer_dirty(bh);
 			}
@@ -2375,7 +2375,7 @@ static int reiserfs_write_full_page(stru
 			unlock_page(page);
 			return 0;
 		}
-		zero_user_page(page, last_offset, PAGE_CACHE_SIZE - last_offset, KM_USER0);
+		zero_user_segment(page, last_offset, PAGE_CACHE_SIZE);
 	}
 	bh = head;
 	block = page->index << (PAGE_CACHE_SHIFT - s->s_blocksize_bits);
Index: vps/mm/truncate.c
===================================================================
--- vps.orig/mm/truncate.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/mm/truncate.c	2007-06-11 22:54:27.000000000 -0700
@@ -47,7 +47,7 @@ void do_invalidatepage(struct page *page
 
 static inline void truncate_partial_page(struct page *page, unsigned partial)
 {
-	zero_user_page(page, partial, PAGE_CACHE_SIZE - partial, KM_USER0);
+	zero_user_segment(page, partial, PAGE_CACHE_SIZE);
 	if (PagePrivate(page))
 		do_invalidatepage(page, partial);
 }
Index: vps/fs/direct-io.c
===================================================================
--- vps.orig/fs/direct-io.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/direct-io.c	2007-06-11 22:33:07.000000000 -0700
@@ -887,8 +887,8 @@ do_holes:
 					page_cache_release(page);
 					goto out;
 				}
-				zero_user_page(page, block_in_page << blkbits,
-						1 << blkbits, KM_USER0);
+				zero_user(page, block_in_page << blkbits,
+						1 << blkbits);
 				dio->block_in_file++;
 				block_in_page++;
 				goto next_block;
Index: vps/mm/filemap_xip.c
===================================================================
--- vps.orig/mm/filemap_xip.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/mm/filemap_xip.c	2007-06-11 22:54:27.000000000 -0700
@@ -461,7 +461,7 @@ xip_truncate_page(struct address_space *
 		else
 			return PTR_ERR(page);
 	}
-	zero_user_page(page, offset, length, KM_USER0);
+	zero_user(page, offset, length);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(xip_truncate_page);
Index: vps/fs/ext3/inode.c
===================================================================
--- vps.orig/fs/ext3/inode.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/ext3/inode.c	2007-06-11 22:33:07.000000000 -0700
@@ -1818,7 +1818,7 @@ static int ext3_block_truncate_page(hand
 	 */
 	if (!page_has_buffers(page) && test_opt(inode->i_sb, NOBH) &&
 	     ext3_should_writeback_data(inode) && PageUptodate(page)) {
-		zero_user_page(page, offset, length, KM_USER0);
+		zero_user(page, offset, length);
 		set_page_dirty(page);
 		goto unlock;
 	}
@@ -1871,7 +1871,7 @@ static int ext3_block_truncate_page(hand
 			goto unlock;
 	}
 
-	zero_user_page(page, offset, length, KM_USER0);
+	zero_user(page, offset, length);
 	BUFFER_TRACE(bh, "zeroed end of block");
 
 	err = 0;
Index: vps/fs/ntfs/file.c
===================================================================
--- vps.orig/fs/ntfs/file.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/ntfs/file.c	2007-06-11 22:33:07.000000000 -0700
@@ -607,8 +607,8 @@ do_next_page:
 					ntfs_submit_bh_for_read(bh);
 					*wait_bh++ = bh;
 				} else {
-					zero_user_page(page, bh_offset(bh),
-							blocksize, KM_USER0);
+					zero_user(page, bh_offset(bh),
+							blocksize);
 					set_buffer_uptodate(bh);
 				}
 			}
@@ -683,9 +683,8 @@ map_buffer_cached:
 						ntfs_submit_bh_for_read(bh);
 						*wait_bh++ = bh;
 					} else {
-						zero_user_page(page,
-							bh_offset(bh),
-							blocksize, KM_USER0);
+						zero_user(page, bh_offset(bh),
+								blocksize);
 						set_buffer_uptodate(bh);
 					}
 				}
@@ -703,8 +702,8 @@ map_buffer_cached:
 			 */
 			if (bh_end <= pos || bh_pos >= end) {
 				if (!buffer_uptodate(bh)) {
-					zero_user_page(page, bh_offset(bh),
-							blocksize, KM_USER0);
+					zero_user(page, bh_offset(bh),
+							blocksize);
 					set_buffer_uptodate(bh);
 				}
 				mark_buffer_dirty(bh);
@@ -743,8 +742,7 @@ map_buffer_cached:
 				if (!buffer_uptodate(bh))
 					set_buffer_uptodate(bh);
 			} else if (!buffer_uptodate(bh)) {
-				zero_user_page(page, bh_offset(bh), blocksize,
-						KM_USER0);
+				zero_user(page, bh_offset(bh), blocksize);
 				set_buffer_uptodate(bh);
 			}
 			continue;
@@ -868,8 +866,8 @@ rl_not_mapped_enoent:
 					if (!buffer_uptodate(bh))
 						set_buffer_uptodate(bh);
 				} else if (!buffer_uptodate(bh)) {
-					zero_user_page(page, bh_offset(bh),
-							blocksize, KM_USER0);
+					zero_user(page, bh_offset(bh),
+						blocksize);
 					set_buffer_uptodate(bh);
 				}
 				continue;
@@ -1128,8 +1126,8 @@ rl_not_mapped_enoent:
 
 				if (likely(bh_pos < initialized_size))
 					ofs = initialized_size - bh_pos;
-				zero_user_page(page, bh_offset(bh) + ofs,
-						blocksize - ofs, KM_USER0);
+				zero_user_segment(page, bh_offset(bh) + ofs,
+						blocksize);
 			}
 		} else /* if (unlikely(!buffer_uptodate(bh))) */
 			err = -EIO;
@@ -1269,8 +1267,8 @@ rl_not_mapped_enoent:
 				if (PageUptodate(page))
 					set_buffer_uptodate(bh);
 				else {
-					zero_user_page(page, bh_offset(bh),
-							blocksize, KM_USER0);
+					zero_user(page, bh_offset(bh),
+							blocksize);
 					set_buffer_uptodate(bh);
 				}
 			}
@@ -1330,7 +1328,7 @@ err_out:
 		len = PAGE_CACHE_SIZE;
 		if (len > bytes)
 			len = bytes;
-		zero_user_page(*pages, 0, len, KM_USER0);
+		zero_user(*pages, 0, len);
 	}
 	goto out;
 }
@@ -1451,7 +1449,7 @@ err_out:
 		len = PAGE_CACHE_SIZE;
 		if (len > bytes)
 			len = bytes;
-		zero_user_page(*pages, 0, len, KM_USER0);
+		zero_user(*pages, 0, len);
 	}
 	goto out;
 }
Index: vps/fs/nfs/read.c
===================================================================
--- vps.orig/fs/nfs/read.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/nfs/read.c	2007-06-11 22:33:07.000000000 -0700
@@ -79,7 +79,7 @@ void nfs_readdata_release(void *data)
 static
 int nfs_return_empty_page(struct page *page)
 {
-	zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+	zero_user(page, 0, PAGE_CACHE_SIZE);
 	SetPageUptodate(page);
 	unlock_page(page);
 	return 0;
@@ -103,10 +103,10 @@ static void nfs_readpage_truncate_uninit
 	pglen = PAGE_CACHE_SIZE - base;
 	for (;;) {
 		if (remainder <= pglen) {
-			zero_user_page(*pages, base, remainder, KM_USER0);
+			zero_user(*pages, base, remainder);
 			break;
 		}
-		zero_user_page(*pages, base, pglen, KM_USER0);
+		zero_user(*pages, base, pglen);
 		pages++;
 		remainder -= pglen;
 		pglen = PAGE_CACHE_SIZE;
@@ -130,7 +130,7 @@ static int nfs_readpage_async(struct nfs
 		return PTR_ERR(new);
 	}
 	if (len < PAGE_CACHE_SIZE)
-		zero_user_page(page, len, PAGE_CACHE_SIZE - len, KM_USER0);
+		zero_user_segment(page, len, PAGE_CACHE_SIZE);
 
 	nfs_list_add_request(new, &one_request);
 	if (NFS_SERVER(inode)->rsize < PAGE_CACHE_SIZE)
@@ -538,7 +538,7 @@ readpage_async_filler(void *data, struct
 		goto out_error;
 
 	if (len < PAGE_CACHE_SIZE)
-		zero_user_page(page, len, PAGE_CACHE_SIZE - len, KM_USER0);
+		zero_user_segment(page, len, PAGE_CACHE_SIZE);
 	nfs_pageio_add_request(desc->pgio, new);
 	return 0;
 out_error:
Index: vps/fs/nfs/write.c
===================================================================
--- vps.orig/fs/nfs/write.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/nfs/write.c	2007-06-11 22:33:07.000000000 -0700
@@ -168,7 +168,7 @@ static void nfs_mark_uptodate(struct pag
 	if (count != nfs_page_length(page))
 		return;
 	if (count != PAGE_CACHE_SIZE)
-		zero_user_page(page, count, PAGE_CACHE_SIZE - count, KM_USER0);
+		zero_user_segment(page, count, PAGE_CACHE_SIZE);
 	SetPageUptodate(page);
 }
 
Index: vps/fs/xfs/linux-2.6/xfs_lrw.c
===================================================================
--- vps.orig/fs/xfs/linux-2.6/xfs_lrw.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/xfs/linux-2.6/xfs_lrw.c	2007-06-11 22:33:07.000000000 -0700
@@ -154,7 +154,7 @@ xfs_iozero(
 		if (status)
 			break;
 
-		zero_user_page(page, offset, bytes, KM_USER0);
+		zero_user(page, offset, bytes);
 
 		status = pagecache_write_end(NULL, mapping, pos, bytes, bytes,
 					page, fsdata);
Index: vps/fs/ecryptfs/mmap.c
===================================================================
--- vps.orig/fs/ecryptfs/mmap.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/ecryptfs/mmap.c	2007-06-11 22:33:07.000000000 -0700
@@ -370,8 +370,7 @@ static int fill_zeros_to_end_of_page(str
 	end_byte_in_page = i_size_read(inode) % PAGE_CACHE_SIZE;
 	if (to > end_byte_in_page)
 		end_byte_in_page = to;
-	zero_user_page(page, end_byte_in_page,
-		PAGE_CACHE_SIZE - end_byte_in_page, KM_USER0);
+	zero_user_segment(page, end_byte_in_page, PAGE_CACHE_SIZE);
 out:
 	return 0;
 }
@@ -784,7 +783,7 @@ int write_zeros(struct file *file, pgoff
 		page_cache_release(tmp_page);
 		goto out;
 	}
-	zero_user_page(tmp_page, start, num_zeros, KM_USER0);
+	zero_user(tmp_page, start, num_zeros);
 	rc = ecryptfs_commit_write(file, tmp_page, start, start + num_zeros);
 	if (rc < 0) {
 		ecryptfs_printk(KERN_ERR, "Error attempting to write zero's "
Index: vps/fs/gfs2/bmap.c
===================================================================
--- vps.orig/fs/gfs2/bmap.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/gfs2/bmap.c	2007-06-11 22:33:07.000000000 -0700
@@ -932,7 +932,7 @@ static int gfs2_block_truncate_page(stru
 	if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED || gfs2_is_jdata(ip))
 		gfs2_trans_add_bh(ip->i_gl, bh, 0);
 
-	zero_user_page(page, offset, length, KM_USER0);
+	zero_user(page, offset, length);
 
 unlock:
 	unlock_page(page);
Index: vps/fs/ocfs2/aops.c
===================================================================
--- vps.orig/fs/ocfs2/aops.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/ocfs2/aops.c	2007-06-11 22:33:07.000000000 -0700
@@ -238,7 +238,7 @@ static int ocfs2_readpage(struct file *f
 	 * XXX sys_readahead() seems to get that wrong?
 	 */
 	if (start >= i_size_read(inode)) {
-		zero_user_page(page, 0, PAGE_SIZE, KM_USER0);
+		zero_user(page, 0, PAGE_SIZE);
 		SetPageUptodate(page);
 		ret = 0;
 		goto out_alloc;
Index: vps/fs/reiser4/plugin/file/cryptcompress.c
===================================================================
--- vps.orig/fs/reiser4/plugin/file/cryptcompress.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/reiser4/plugin/file/cryptcompress.c	2007-06-11 22:33:07.000000000 -0700
@@ -1933,7 +1933,7 @@ static int write_hole(struct inode *inod
 
 		to_pg = min_count(PAGE_CACHE_SIZE - pg_off, cl_count);
 		lock_page(page);
-		zero_user_page(page, pg_off, to_pg, KM_USER0);
+		zero_user(page, pg_off, to_pg);
 		SetPageUptodate(page);
 		unlock_page(page);
 
@@ -2169,8 +2169,7 @@ static int read_some_cluster_pages(struc
 			off = off_to_pgoff(win->off+win->count+win->delta);
 			if (off) {
 				lock_page(pg);
-				zero_user_page(pg, off, PAGE_CACHE_SIZE - off,
-						KM_USER0);
+				zero_user_segment(pg, off, PAGE_CACHE_SIZE);
 				unlock_page(pg);
 			}
 		}
@@ -2217,8 +2216,7 @@ static int read_some_cluster_pages(struc
 
 			offset =
 			    off_to_pgoff(win->off + win->count + win->delta);
-			zero_user_page(pg, offset, PAGE_CACHE_SIZE - offset,
-					KM_USER0);
+			zero_user_segment(pg, offset, PAGE_CACHE_SIZE);
 			unlock_page(pg);
 			/* still not uptodate */
 			break;
Index: vps/fs/reiser4/plugin/file/file.c
===================================================================
--- vps.orig/fs/reiser4/plugin/file/file.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/reiser4/plugin/file/file.c	2007-06-11 22:33:07.000000000 -0700
@@ -538,7 +538,7 @@ static int shorten_file(struct inode *in
 
 	lock_page(page);
 	assert("vs-1066", PageLocked(page));
-	zero_user_page(page, padd_from, PAGE_CACHE_SIZE - padd_from, KM_USER0);
+	zero_user_segment(page, padd_from, PAGE_CACHE_SIZE);
 	unlock_page(page);
 	page_cache_release(page);
 	/* the below does up(sbinfo->delete_mutex). Do not get confused */
Index: vps/fs/reiser4/plugin/item/ctail.c
===================================================================
--- vps.orig/fs/reiser4/plugin/item/ctail.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/reiser4/plugin/item/ctail.c	2007-06-11 22:33:07.000000000 -0700
@@ -627,7 +627,7 @@ int do_readpage_ctail(struct inode * ino
 #endif
 	case FAKE_DISK_CLUSTER:
 		/* fill the page by zeroes */
-		zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+		zero_user(page, 0, PAGE_CACHE_SIZE);
 		SetPageUptodate(page);
 		break;
 	case PREP_DISK_CLUSTER:
Index: vps/fs/reiser4/plugin/item/extent_file_ops.c
===================================================================
--- vps.orig/fs/reiser4/plugin/item/extent_file_ops.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/reiser4/plugin/item/extent_file_ops.c	2007-06-11 22:33:07.000000000 -0700
@@ -1112,7 +1112,7 @@ int reiser4_do_readpage_extent(reiser4_e
 		 */
 		j = jfind(mapping, index);
 		if (j == NULL) {
-			zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+			zero_user(page, 0, PAGE_CACHE_SIZE);
 			SetPageUptodate(page);
 			unlock_page(page);
 			return 0;
@@ -1127,7 +1127,7 @@ int reiser4_do_readpage_extent(reiser4_e
 		block = *jnode_get_io_block(j);
 		spin_unlock_jnode(j);
 		if (block == 0) {
-			zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+			zero_user(page, 0, PAGE_CACHE_SIZE);
 			SetPageUptodate(page);
 			unlock_page(page);
 			jput(j);
Index: vps/fs/reiser4/plugin/item/tail.c
===================================================================
--- vps.orig/fs/reiser4/plugin/item/tail.c	2007-06-11 22:33:01.000000000 -0700
+++ vps/fs/reiser4/plugin/item/tail.c	2007-06-11 22:33:07.000000000 -0700
@@ -392,8 +392,7 @@ static int do_readpage_tail(uf_coord_t *
 
  done:
 	if (mapped != PAGE_CACHE_SIZE)
-		zero_user_page(page, mapped, PAGE_CACHE_SIZE - mapped,
-				KM_USER0);
+		zero_user_segment(page, mapped, PAGE_CACHE_SIZE);
 	SetPageUptodate(page);
  out_unlock_page:
 	unlock_page(page);

-- 


* [03/37] Use page_cache_xxx function in mm/filemap.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
  2007-06-20 18:29 ` [01/37] Define functions for page cache handling clameter
  2007-06-20 18:29 ` [02/37] Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [04/37] Use page_cache_xxx in mm/page-writeback.c clameter
                   ` (33 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_mm_filemap --]
[-- Type: text/plain, Size: 8613 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/filemap.c |   76 +++++++++++++++++++++++++++++------------------------------
 1 file changed, 38 insertions(+), 38 deletions(-)

Index: vps/mm/filemap.c
===================================================================
--- vps.orig/mm/filemap.c	2007-06-08 10:57:37.000000000 -0700
+++ vps/mm/filemap.c	2007-06-09 21:15:04.000000000 -0700
@@ -304,8 +304,8 @@ EXPORT_SYMBOL(add_to_page_cache_lru);
 int sync_page_range(struct inode *inode, struct address_space *mapping,
 			loff_t pos, loff_t count)
 {
-	pgoff_t start = pos >> PAGE_CACHE_SHIFT;
-	pgoff_t end = (pos + count - 1) >> PAGE_CACHE_SHIFT;
+	pgoff_t start = page_cache_index(mapping, pos);
+	pgoff_t end = page_cache_index(mapping, pos + count - 1);
 	int ret;
 
 	if (!mapping_cap_writeback_dirty(mapping) || !count)
@@ -336,8 +336,8 @@ EXPORT_SYMBOL(sync_page_range);
 int sync_page_range_nolock(struct inode *inode, struct address_space *mapping,
 			   loff_t pos, loff_t count)
 {
-	pgoff_t start = pos >> PAGE_CACHE_SHIFT;
-	pgoff_t end = (pos + count - 1) >> PAGE_CACHE_SHIFT;
+	pgoff_t start = page_cache_index(mapping, pos);
+	pgoff_t end = page_cache_index(mapping, pos + count - 1);
 	int ret;
 
 	if (!mapping_cap_writeback_dirty(mapping) || !count)
@@ -366,7 +366,7 @@ int filemap_fdatawait(struct address_spa
 		return 0;
 
 	return wait_on_page_writeback_range(mapping, 0,
-				(i_size - 1) >> PAGE_CACHE_SHIFT);
+				page_cache_index(mapping, i_size - 1));
 }
 EXPORT_SYMBOL(filemap_fdatawait);
 
@@ -414,8 +414,8 @@ int filemap_write_and_wait_range(struct 
 		/* See comment of filemap_write_and_wait() */
 		if (err != -EIO) {
 			int err2 = wait_on_page_writeback_range(mapping,
-						lstart >> PAGE_CACHE_SHIFT,
-						lend >> PAGE_CACHE_SHIFT);
+					page_cache_index(mapping, lstart),
+					page_cache_index(mapping, lend));
 			if (!err)
 				err = err2;
 		}
@@ -881,28 +881,28 @@ void do_generic_mapping_read(struct addr
 	int error;
 	struct file_ra_state ra = *_ra;
 
-	index = *ppos >> PAGE_CACHE_SHIFT;
+	index = page_cache_index(mapping, *ppos);
 	next_index = index;
 	prev_index = ra.prev_index;
 	prev_offset = ra.prev_offset;
-	last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
-	offset = *ppos & ~PAGE_CACHE_MASK;
+	last_index = page_cache_next(mapping, *ppos + desc->count);
+	offset = page_cache_offset(mapping, *ppos);
 
 	isize = i_size_read(inode);
 	if (!isize)
 		goto out;
 
-	end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
+	end_index = page_cache_index(mapping, isize - 1);
 	for (;;) {
 		struct page *page;
 		unsigned long nr, ret;
 
 		/* nr is the maximum number of bytes to copy from this page */
-		nr = PAGE_CACHE_SIZE;
+		nr = page_cache_size(mapping);
 		if (index >= end_index) {
 			if (index > end_index)
 				goto out;
-			nr = ((isize - 1) & ~PAGE_CACHE_MASK) + 1;
+			nr = page_cache_offset(mapping, isize - 1) + 1;
 			if (nr <= offset) {
 				goto out;
 			}
@@ -956,8 +956,8 @@ page_ok:
 		 */
 		ret = actor(desc, page, offset, nr);
 		offset += ret;
-		index += offset >> PAGE_CACHE_SHIFT;
-		offset &= ~PAGE_CACHE_MASK;
+		index += page_cache_index(mapping, offset);
+		offset = page_cache_offset(mapping, offset);
 		prev_offset = offset;
 		ra.prev_offset = offset;
 
@@ -1023,16 +1023,16 @@ readpage:
 		 * another truncate extends the file - this is desired though).
 		 */
 		isize = i_size_read(inode);
-		end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
+		end_index = page_cache_index(mapping, isize - 1);
 		if (unlikely(!isize || index > end_index)) {
 			page_cache_release(page);
 			goto out;
 		}
 
 		/* nr is the maximum number of bytes to copy from this page */
-		nr = PAGE_CACHE_SIZE;
+		nr = page_cache_size(mapping);
 		if (index == end_index) {
-			nr = ((isize - 1) & ~PAGE_CACHE_MASK) + 1;
+			nr = page_cache_offset(mapping, isize - 1) + 1;
 			if (nr <= offset) {
 				page_cache_release(page);
 				goto out;
@@ -1073,7 +1073,7 @@ out:
 	*_ra = ra;
 	_ra->prev_index = prev_index;
 
-	*ppos = ((loff_t) index << PAGE_CACHE_SHIFT) + offset;
+	*ppos = page_cache_pos(mapping, index, offset);
 	if (filp)
 		file_accessed(filp);
 }
@@ -1291,8 +1291,8 @@ asmlinkage ssize_t sys_readahead(int fd,
 	if (file) {
 		if (file->f_mode & FMODE_READ) {
 			struct address_space *mapping = file->f_mapping;
-			unsigned long start = offset >> PAGE_CACHE_SHIFT;
-			unsigned long end = (offset + count - 1) >> PAGE_CACHE_SHIFT;
+			unsigned long start = page_cache_index(mapping, offset);
+			unsigned long end = page_cache_index(mapping, offset + count - 1);
 			unsigned long len = end - start + 1;
 			ret = do_readahead(mapping, file, start, len);
 		}
@@ -1364,7 +1364,7 @@ struct page *filemap_fault(struct vm_are
 
 	BUG_ON(!(vma->vm_flags & VM_CAN_INVALIDATE));
 
-	size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+	size = page_cache_next(mapping, i_size_read(inode));
 	if (fdata->pgoff >= size)
 		goto outside_data_content;
 
@@ -1439,7 +1439,7 @@ retry_find:
 		goto page_not_uptodate;
 
 	/* Must recheck i_size under page lock */
-	size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+	size = page_cache_next(mapping, i_size_read(inode));
 	if (unlikely(fdata->pgoff >= size)) {
 		unlock_page(page);
 		goto outside_data_content;
@@ -1930,8 +1930,8 @@ int pagecache_write_begin(struct file *f
 							pagep, fsdata);
 	} else {
 		int ret;
-		pgoff_t index = pos >> PAGE_CACHE_SHIFT;
-		unsigned offset = pos & (PAGE_CACHE_SIZE - 1);
+		pgoff_t index = page_cache_index(mapping, pos);
+		unsigned offset = page_cache_offset(mapping, pos);
 		struct inode *inode = mapping->host;
 		struct page *page;
 again:
@@ -1984,7 +1984,7 @@ int pagecache_write_end(struct file *fil
 		ret = aops->write_end(file, mapping, pos, len, copied,
 							page, fsdata);
 	} else {
-		unsigned offset = pos & (PAGE_CACHE_SIZE - 1);
+		unsigned offset = page_cache_offset(mapping, pos);
 		struct inode *inode = mapping->host;
 
 		flush_dcache_page(page);
@@ -2089,9 +2089,9 @@ static ssize_t generic_perform_write_2co
 		unsigned long bytes;	/* Bytes to write to page */
 		size_t copied;		/* Bytes copied from user */
 
-		offset = (pos & (PAGE_CACHE_SIZE - 1));
-		index = pos >> PAGE_CACHE_SHIFT;
-		bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset,
+		offset = page_cache_offset(mapping, pos);
+		index = page_cache_index(mapping, pos);
+		bytes = min_t(unsigned long, page_cache_size(mapping) - offset,
 						iov_iter_count(i));
 
 		/*
@@ -2267,9 +2267,9 @@ static ssize_t generic_perform_write(str
 		size_t copied;		/* Bytes copied from user */
 		void *fsdata;
 
-		offset = (pos & (PAGE_CACHE_SIZE - 1));
-		index = pos >> PAGE_CACHE_SHIFT;
-		bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset,
+		offset = page_cache_offset(mapping, pos);
+		index = page_cache_index(mapping, pos);
+		bytes = min_t(unsigned long, page_cache_size(mapping) - offset,
 						iov_iter_count(i));
 
 again:
@@ -2316,7 +2316,7 @@ again:
 			 * because not all segments in the iov can be copied at
 			 * once without a pagefault.
 			 */
-			bytes = min_t(unsigned long, PAGE_CACHE_SIZE - offset,
+			bytes = min_t(unsigned long, page_cache_size(mapping) - offset,
 						iov_iter_single_seg_count(i));
 			goto again;
 		}
@@ -2459,8 +2459,8 @@ __generic_file_aio_write_nolock(struct k
 		if (err == 0) {
 			written = written_buffered;
 			invalidate_mapping_pages(mapping,
-						 pos >> PAGE_CACHE_SHIFT,
-						 endbyte >> PAGE_CACHE_SHIFT);
+						 page_cache_index(mapping, pos),
+						 page_cache_index(mapping, endbyte));
 		} else {
 			/*
 			 * We don't know how much we wrote, so just return
@@ -2547,7 +2547,7 @@ generic_file_direct_IO(int rw, struct ki
 	 */
 	if (rw == WRITE) {
 		write_len = iov_length(iov, nr_segs);
-		end = (offset + write_len - 1) >> PAGE_CACHE_SHIFT;
+		end = page_cache_index(mapping, offset + write_len - 1);
 	       	if (mapping_mapped(mapping))
 			unmap_mapping_range(mapping, offset, write_len, 0);
 	}
@@ -2564,7 +2564,7 @@ generic_file_direct_IO(int rw, struct ki
 	 */
 	if (rw == WRITE && mapping->nrpages) {
 		retval = invalidate_inode_pages2_range(mapping,
-					offset >> PAGE_CACHE_SHIFT, end);
+					page_cache_index(mapping, offset), end);
 		if (retval)
 			goto out;
 	}
@@ -2582,7 +2582,7 @@ generic_file_direct_IO(int rw, struct ki
 	 */
 	if (rw == WRITE && mapping->nrpages) {
 		int err = invalidate_inode_pages2_range(mapping,
-					      offset >> PAGE_CACHE_SHIFT, end);
+					      page_cache_index(mapping, offset), end);
 		if (err && retval >= 0)
 			retval = err;
 	}

-- 


* [04/37] Use page_cache_xxx in mm/page-writeback.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (2 preceding siblings ...)
  2007-06-20 18:29 ` [03/37] Use page_cache_xxx function in mm/filemap.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [05/37] Use page_cache_xxx in mm/truncate.c clameter
                   ` (32 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_mm_page_writeback --]
[-- Type: text/plain, Size: 1241 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/page-writeback.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: vps/mm/page-writeback.c
===================================================================
--- vps.orig/mm/page-writeback.c	2007-06-07 17:01:04.000000000 -0700
+++ vps/mm/page-writeback.c	2007-06-09 21:34:24.000000000 -0700
@@ -626,8 +626,8 @@ int write_cache_pages(struct address_spa
 		index = mapping->writeback_index; /* Start from prev offset */
 		end = -1;
 	} else {
-		index = wbc->range_start >> PAGE_CACHE_SHIFT;
-		end = wbc->range_end >> PAGE_CACHE_SHIFT;
+		index = page_cache_index(mapping, wbc->range_start);
+		end = page_cache_index(mapping, wbc->range_end);
 		if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
 			range_whole = 1;
 		scanned = 1;
@@ -829,7 +829,7 @@ int __set_page_dirty_nobuffers(struct pa
 			WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
 			if (mapping_cap_account_dirty(mapping)) {
 				__inc_zone_page_state(page, NR_FILE_DIRTY);
-				task_io_account_write(PAGE_CACHE_SIZE);
+				task_io_account_write(page_cache_size(mapping));
 			}
 			radix_tree_tag_set(&mapping->page_tree,
 				page_index(page), PAGECACHE_TAG_DIRTY);

-- 


* [05/37] Use page_cache_xxx in mm/truncate.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (3 preceding siblings ...)
  2007-06-20 18:29 ` [04/37] Use page_cache_xxx in mm/page-writeback.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [06/37] Use page_cache_xxx in mm/rmap.c clameter
                   ` (31 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_mm_truncate --]
[-- Type: text/plain, Size: 3797 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/truncate.c |   35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

Index: vps/mm/truncate.c
===================================================================
--- vps.orig/mm/truncate.c	2007-06-09 20:35:19.000000000 -0700
+++ vps/mm/truncate.c	2007-06-09 21:39:47.000000000 -0700
@@ -45,9 +45,10 @@ void do_invalidatepage(struct page *page
 		(*invalidatepage)(page, offset);
 }
 
-static inline void truncate_partial_page(struct page *page, unsigned partial)
+static inline void truncate_partial_page(struct address_space *mapping,
+			struct page *page, unsigned partial)
 {
-	zero_user_segment(page, partial, PAGE_CACHE_SIZE);
+	zero_user_segment(page, partial, page_cache_size(mapping));
 	if (PagePrivate(page))
 		do_invalidatepage(page, partial);
 }
@@ -95,7 +96,7 @@ truncate_complete_page(struct address_sp
 	if (page->mapping != mapping)
 		return;
 
-	cancel_dirty_page(page, PAGE_CACHE_SIZE);
+	cancel_dirty_page(page, page_cache_size(mapping));
 
 	if (PagePrivate(page))
 		do_invalidatepage(page, 0);
@@ -157,9 +158,9 @@ invalidate_complete_page(struct address_
 void truncate_inode_pages_range(struct address_space *mapping,
 				loff_t lstart, loff_t lend)
 {
-	const pgoff_t start = (lstart + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
+	const pgoff_t start = page_cache_next(mapping, lstart);
 	pgoff_t end;
-	const unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
+	const unsigned partial = page_cache_offset(mapping, lstart);
 	struct pagevec pvec;
 	pgoff_t next;
 	int i;
@@ -167,8 +168,9 @@ void truncate_inode_pages_range(struct a
 	if (mapping->nrpages == 0)
 		return;
 
-	BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1));
-	end = (lend >> PAGE_CACHE_SHIFT);
+	BUG_ON(page_cache_offset(mapping, lend) !=
+				page_cache_size(mapping) - 1);
+	end = page_cache_index(mapping, lend);
 
 	pagevec_init(&pvec, 0);
 	next = start;
@@ -194,8 +196,8 @@ void truncate_inode_pages_range(struct a
 			}
 			if (page_mapped(page)) {
 				unmap_mapping_range(mapping,
-				  (loff_t)page_index<<PAGE_CACHE_SHIFT,
-				  PAGE_CACHE_SIZE, 0);
+				  page_cache_pos(mapping, page_index, 0),
+				  page_cache_size(mapping), 0);
 			}
 			truncate_complete_page(mapping, page);
 			unlock_page(page);
@@ -208,7 +210,7 @@ void truncate_inode_pages_range(struct a
 		struct page *page = find_lock_page(mapping, start - 1);
 		if (page) {
 			wait_on_page_writeback(page);
-			truncate_partial_page(page, partial);
+			truncate_partial_page(mapping, page, partial);
 			unlock_page(page);
 			page_cache_release(page);
 		}
@@ -236,8 +238,8 @@ void truncate_inode_pages_range(struct a
 			wait_on_page_writeback(page);
 			if (page_mapped(page)) {
 				unmap_mapping_range(mapping,
-				  (loff_t)page->index<<PAGE_CACHE_SHIFT,
-				  PAGE_CACHE_SIZE, 0);
+				  page_cache_pos(mapping, page->index, 0),
+				  page_cache_size(mapping), 0);
 			}
 			if (page->index > next)
 				next = page->index;
@@ -421,9 +423,8 @@ int invalidate_inode_pages2_range(struct
 					 * Zap the rest of the file in one hit.
 					 */
 					unmap_mapping_range(mapping,
-					   (loff_t)page_index<<PAGE_CACHE_SHIFT,
-					   (loff_t)(end - page_index + 1)
-							<< PAGE_CACHE_SHIFT,
+					   page_cache_pos(mapping, page_index, 0),
+					   page_cache_pos(mapping, end - page_index + 1, 0),
 					    0);
 					did_range_unmap = 1;
 				} else {
@@ -431,8 +432,8 @@ int invalidate_inode_pages2_range(struct
 					 * Just zap this page
 					 */
 					unmap_mapping_range(mapping,
-					  (loff_t)page_index<<PAGE_CACHE_SHIFT,
-					  PAGE_CACHE_SIZE, 0);
+					  page_cache_pos(mapping, page_index, 0),
+					  page_cache_size(mapping), 0);
 				}
 			}
 			BUG_ON(page_mapped(page));

-- 


* [06/37] Use page_cache_xxx in mm/rmap.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (4 preceding siblings ...)
  2007-06-20 18:29 ` [05/37] Use page_cache_xxx in mm/truncate.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [07/37] Use page_cache_xxx in mm/filemap_xip.c clameter
                   ` (30 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_mm_rmap --]
[-- Type: text/plain, Size: 1925 bytes --]
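
Convert the pgoff calculations in rmap to the page cache helpers. The
vma_address() change also has to distinguish anonymous pages, whose
page->index stays in PAGE_SIZE units, from file pages, whose index counts
page cache pages of the mapping's order. A minimal sketch of the helper
relationship assumed here (illustrative only; the order field is made up
for the sketch, and the real definitions come from patch [01/37]):

	/* Illustrative sketch -- see [01/37] for the real definitions. */
	static inline int mapping_order(struct address_space *mapping)
	{
		return mapping->order;	/* 0 for the classic 4k page cache */
	}

	static inline int page_cache_shift(struct address_space *mapping)
	{
		return PAGE_SHIFT + mapping_order(mapping);
	}

At order 0, page->index << mapping_order(mapping) and
page->index << (page_cache_shift(mapping) - PAGE_SHIFT) both reduce to
the old PAGE_CACHE_SHIFT - PAGE_SHIFT arithmetic, so nothing changes for
existing users.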

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/rmap.c |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

Index: linux-2.6.22-rc4-mm2/mm/rmap.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/rmap.c	2007-06-14 10:35:45.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/rmap.c	2007-06-14 10:49:29.000000000 -0700
@@ -210,9 +210,14 @@
 static inline unsigned long
 vma_address(struct page *page, struct vm_area_struct *vma)
 {
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	pgoff_t pgoff;
 	unsigned long address;
 
+	if (PageAnon(page))
+		pgoff = page->index;
+	else
+		pgoff = page->index << mapping_order(page->mapping);
+
 	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 	if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
 		/* page should be within any vma from prio_tree_next */
@@ -357,7 +362,7 @@
 {
 	unsigned int mapcount;
 	struct address_space *mapping = page->mapping;
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	pgoff_t pgoff = page->index << (page_cache_shift(mapping) - PAGE_SHIFT);
 	struct vm_area_struct *vma;
 	struct prio_tree_iter iter;
 	int referenced = 0;
@@ -469,7 +474,7 @@
 
 static int page_mkclean_file(struct address_space *mapping, struct page *page)
 {
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	pgoff_t pgoff = page->index << (page_cache_shift(mapping) - PAGE_SHIFT);
 	struct vm_area_struct *vma;
 	struct prio_tree_iter iter;
 	int ret = 0;
@@ -885,7 +890,7 @@
 static int try_to_unmap_file(struct page *page, int migration)
 {
 	struct address_space *mapping = page->mapping;
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	pgoff_t pgoff = page->index << (page_cache_shift(mapping) - PAGE_SHIFT);
 	struct vm_area_struct *vma;
 	struct prio_tree_iter iter;
 	int ret = SWAP_AGAIN;

-- 


* [07/37] Use page_cache_xxx in mm/filemap_xip.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (5 preceding siblings ...)
  2007-06-20 18:29 ` [06/37] Use page_cache_xxx in mm/rmap.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [08/37] Use page_cache_xxx in mm/migrate.c clameter
                   ` (29 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_mm_filemap_xip --]
[-- Type: text/plain, Size: 2873 bytes --]
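
Convert the file offset arithmetic in the XIP code. The conversions in
this and the following patches all rely on the same identities, roughly
(illustrative sketch; the authoritative definitions are in patch
[01/37]):

	/* Illustrative sketch mirroring the old PAGE_CACHE_xxx arithmetic. */
	static inline pgoff_t page_cache_index(struct address_space *mapping,
						loff_t pos)
	{
		return pos >> page_cache_shift(mapping);	/* which page */
	}

	static inline unsigned page_cache_offset(struct address_space *mapping,
						loff_t pos)
	{
		return pos & (page_cache_size(mapping) - 1);	/* byte in page */
	}

	static inline loff_t page_cache_pos(struct address_space *mapping,
						pgoff_t index, unsigned long offset)
	{
		return ((loff_t)index << page_cache_shift(mapping)) + offset;
	}

so splitting a position with page_cache_index()/page_cache_offset() and
recombining it with page_cache_pos() is an identity for any file offset.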

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/filemap_xip.c |   28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

Index: vps/mm/filemap_xip.c
===================================================================
--- vps.orig/mm/filemap_xip.c	2007-06-09 21:52:40.000000000 -0700
+++ vps/mm/filemap_xip.c	2007-06-09 21:58:11.000000000 -0700
@@ -60,24 +60,24 @@ do_xip_mapping_read(struct address_space
 
 	BUG_ON(!mapping->a_ops->get_xip_page);
 
-	index = *ppos >> PAGE_CACHE_SHIFT;
-	offset = *ppos & ~PAGE_CACHE_MASK;
+	index = page_cache_index(mapping, *ppos);
+	offset = page_cache_offset(mapping, *ppos);
 
 	isize = i_size_read(inode);
 	if (!isize)
 		goto out;
 
-	end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
+	end_index = page_cache_index(mapping, isize - 1);
 	for (;;) {
 		struct page *page;
 		unsigned long nr, ret;
 
 		/* nr is the maximum number of bytes to copy from this page */
-		nr = PAGE_CACHE_SIZE;
+		nr = page_cache_size(mapping);
 		if (index >= end_index) {
 			if (index > end_index)
 				goto out;
-			nr = ((isize - 1) & ~PAGE_CACHE_MASK) + 1;
+			nr = page_cache_offset(mapping, isize - 1) + 1;
 			if (nr <= offset) {
 				goto out;
 			}
@@ -116,8 +116,8 @@ do_xip_mapping_read(struct address_space
 		 */
 		ret = actor(desc, page, offset, nr);
 		offset += ret;
-		index += offset >> PAGE_CACHE_SHIFT;
-		offset &= ~PAGE_CACHE_MASK;
+		index += page_cache_index(mapping, offset);
+		offset = page_cache_offset(mapping, offset);
 
 		if (ret == nr && desc->count)
 			continue;
@@ -130,7 +130,7 @@ no_xip_page:
 	}
 
 out:
-	*ppos = ((loff_t) index << PAGE_CACHE_SHIFT) + offset;
+	*ppos = page_cache_pos(mapping, index, offset);
 	if (filp)
 		file_accessed(filp);
 }
@@ -242,7 +242,7 @@ static struct page *xip_file_fault(struc
 
 	/* XXX: are VM_FAULT_ codes OK? */
 
-	size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+	size = page_cache_next(mapping, i_size_read(inode));
 	if (fdata->pgoff >= size) {
 		fdata->type = VM_FAULT_SIGBUS;
 		return NULL;
@@ -320,9 +320,9 @@ __xip_file_write(struct file *filp, cons
 		size_t copied;
 		char *kaddr;
 
-		offset = (pos & (PAGE_CACHE_SIZE -1)); /* Within page */
-		index = pos >> PAGE_CACHE_SHIFT;
-		bytes = PAGE_CACHE_SIZE - offset;
+		offset = page_cache_offset(mapping, pos); /* Within page */
+		index = page_cache_index(mapping, pos);
+		bytes = page_cache_size(mapping) - offset;
 		if (bytes > count)
 			bytes = count;
 
@@ -433,8 +433,8 @@ EXPORT_SYMBOL_GPL(xip_file_write);
 int
 xip_truncate_page(struct address_space *mapping, loff_t from)
 {
-	pgoff_t index = from >> PAGE_CACHE_SHIFT;
-	unsigned offset = from & (PAGE_CACHE_SIZE-1);
+	pgoff_t index = page_cache_index(mapping, from);
+	unsigned offset = page_cache_offset(mapping, from);
 	unsigned blocksize;
 	unsigned length;
 	struct page *page;

-- 


* [08/37] Use page_cache_xxx in mm/migrate.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (6 preceding siblings ...)
  2007-06-20 18:29 ` [07/37] Use page_cache_xxx in mm/filemap_xip.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [09/37] Use page_cache_xxx in fs/libfs.c clameter
                   ` (28 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_mm_migrate --]
[-- Type: text/plain, Size: 669 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/migrate.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: vps/mm/migrate.c
===================================================================
--- vps.orig/mm/migrate.c	2007-06-11 15:56:37.000000000 -0700
+++ vps/mm/migrate.c	2007-06-11 22:05:16.000000000 -0700
@@ -196,7 +196,7 @@ static void remove_file_migration_ptes(s
 	struct vm_area_struct *vma;
 	struct address_space *mapping = page_mapping(new);
 	struct prio_tree_iter iter;
-	pgoff_t pgoff = new->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	pgoff_t pgoff = new->index << mapping_order(mapping);
 
 	if (!mapping)
 		return;

-- 


* [09/37] Use page_cache_xxx in fs/libfs.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (7 preceding siblings ...)
  2007-06-20 18:29 ` [08/37] Use page_cache_xxx in mm/migrate.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [10/37] Use page_cache_xxx in fs/sync clameter
                   ` (27 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_libfs --]
[-- Type: text/plain, Size: 2134 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/libfs.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

Index: vps/fs/libfs.c
===================================================================
--- vps.orig/fs/libfs.c	2007-06-11 21:39:09.000000000 -0700
+++ vps/fs/libfs.c	2007-06-11 22:08:13.000000000 -0700
@@ -16,7 +16,8 @@ int simple_getattr(struct vfsmount *mnt,
 {
 	struct inode *inode = dentry->d_inode;
 	generic_fillattr(inode, stat);
-	stat->blocks = inode->i_mapping->nrpages << (PAGE_CACHE_SHIFT - 9);
+	stat->blocks = inode->i_mapping->nrpages <<
+				(page_cache_shift(inode->i_mapping) - 9);
 	return 0;
 }
 
@@ -340,10 +341,10 @@ int simple_prepare_write(struct file *fi
 			unsigned from, unsigned to)
 {
 	if (!PageUptodate(page)) {
-		if (to - from != PAGE_CACHE_SIZE)
+		if (to - from != page_cache_size(file->f_mapping))
 			zero_user_segments(page,
 				0, from,
-				to, PAGE_CACHE_SIZE);
+				to, page_cache_size(file->f_mapping));
 	}
 	return 0;
 }
@@ -356,8 +357,8 @@ int simple_write_begin(struct file *file
 	pgoff_t index;
 	unsigned from;
 
-	index = pos >> PAGE_CACHE_SHIFT;
-	from = pos & (PAGE_CACHE_SIZE - 1);
+	index = page_cache_index(mapping, pos);
+	from = page_cache_offset(mapping, pos);
 
 	page = __grab_cache_page(mapping, index);
 	if (!page)
@@ -371,8 +372,9 @@ int simple_write_begin(struct file *file
 int simple_commit_write(struct file *file, struct page *page,
 			unsigned from, unsigned to)
 {
-	struct inode *inode = page->mapping->host;
-	loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
+	struct address_space *mapping = page->mapping;
+	struct inode *inode = mapping->host;
+	loff_t pos = page_cache_pos(mapping, page->index, to);
 
 	if (!PageUptodate(page))
 		SetPageUptodate(page);
@@ -390,7 +392,7 @@ int simple_write_end(struct file *file, 
 			loff_t pos, unsigned len, unsigned copied,
 			struct page *page, void *fsdata)
 {
-	unsigned from = pos & (PAGE_CACHE_SIZE - 1);
+	unsigned from = page_cache_offset(mapping, pos);
 
 	/* zero the stale part of the page if we did a short copy */
 	if (copied < len)

-- 


* [10/37] Use page_cache_xxx in fs/sync.
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (8 preceding siblings ...)
  2007-06-20 18:29 ` [09/37] Use page_cache_xxx in fs/libfs.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [11/37] Use page_cache_xxx in fs/buffer.c clameter
                   ` (26 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_sync --]
[-- Type: text/plain, Size: 1025 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/sync.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: vps/fs/sync.c
===================================================================
--- vps.orig/fs/sync.c	2007-06-04 17:57:25.000000000 -0700
+++ vps/fs/sync.c	2007-06-09 21:17:45.000000000 -0700
@@ -252,8 +252,8 @@ int do_sync_mapping_range(struct address
 	ret = 0;
 	if (flags & SYNC_FILE_RANGE_WAIT_BEFORE) {
 		ret = wait_on_page_writeback_range(mapping,
-					offset >> PAGE_CACHE_SHIFT,
-					endbyte >> PAGE_CACHE_SHIFT);
+					page_cache_index(mapping, offset),
+					page_cache_index(mapping, endbyte));
 		if (ret < 0)
 			goto out;
 	}
@@ -267,8 +267,8 @@ int do_sync_mapping_range(struct address
 
 	if (flags & SYNC_FILE_RANGE_WAIT_AFTER) {
 		ret = wait_on_page_writeback_range(mapping,
-					offset >> PAGE_CACHE_SHIFT,
-					endbyte >> PAGE_CACHE_SHIFT);
+					page_cache_index(mapping, offset),
+					page_cache_index(mapping, endbyte));
 	}
 out:
 	return ret;

-- 


* [11/37] Use page_cache_xxx in fs/buffer.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (9 preceding siblings ...)
  2007-06-20 18:29 ` [10/37] Use page_cache_xxx in fs/sync clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [12/37] Use page_cache_xxx in mm/mpage.c clameter
                   ` (25 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_buffer --]
[-- Type: text/plain, Size: 12581 bytes --]
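
Most conversions in fs/buffer.c take one of two recurring forms: a file
position is split into page index and in-page offset with the helpers,
or a block number is related to a page index via the difference between
the page cache shift and the inode's block shift. A sketch of the block
idiom as it appears below (illustrative, not part of the patch):

	/* blocks per page cache page; assumes blocksize <= page cache size */
	unsigned blocks_per_page = page_cache_size(mapping) >> inode->i_blkbits;

	/* first block covered by page cache page "index" */
	sector_t block = (sector_t)index <<
			(page_cache_shift(mapping) - inode->i_blkbits);

A larger page cache page simply covers proportionally more blocks; the
per-block loops themselves are unchanged.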

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/buffer.c |   99 +++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 56 insertions(+), 43 deletions(-)

Index: vps/fs/buffer.c
===================================================================
--- vps.orig/fs/buffer.c	2007-06-11 22:33:07.000000000 -0700
+++ vps/fs/buffer.c	2007-06-11 22:34:34.000000000 -0700
@@ -265,7 +265,7 @@ __find_get_block_slow(struct block_devic
 	struct page *page;
 	int all_mapped = 1;
 
-	index = block >> (PAGE_CACHE_SHIFT - bd_inode->i_blkbits);
+	index = block >> (page_cache_shift(bd_mapping) - bd_inode->i_blkbits);
 	page = find_get_page(bd_mapping, index);
 	if (!page)
 		goto out;
@@ -705,7 +705,7 @@ static int __set_page_dirty(struct page 
 
 		if (mapping_cap_account_dirty(mapping)) {
 			__inc_zone_page_state(page, NR_FILE_DIRTY);
-			task_io_account_write(PAGE_CACHE_SIZE);
+			task_io_account_write(page_cache_size(mapping));
 		}
 		radix_tree_tag_set(&mapping->page_tree,
 				page_index(page), PAGECACHE_TAG_DIRTY);
@@ -899,10 +899,11 @@ struct buffer_head *alloc_page_buffers(s
 {
 	struct buffer_head *bh, *head;
 	long offset;
+	unsigned int page_size = page_cache_size(page->mapping);
 
 try_again:
 	head = NULL;
-	offset = PAGE_SIZE;
+	offset = page_size;
 	while ((offset -= size) >= 0) {
 		bh = alloc_buffer_head(GFP_NOFS);
 		if (!bh)
@@ -1434,7 +1435,7 @@ void set_bh_page(struct buffer_head *bh,
 		struct page *page, unsigned long offset)
 {
 	bh->b_page = page;
-	BUG_ON(offset >= PAGE_SIZE);
+	BUG_ON(offset >= page_cache_size(page->mapping));
 	if (PageHighMem(page))
 		/*
 		 * This catches illegal uses and preserves the offset:
@@ -1613,6 +1614,7 @@ static int __block_write_full_page(struc
 	struct buffer_head *bh, *head;
 	const unsigned blocksize = 1 << inode->i_blkbits;
 	int nr_underway = 0;
+	struct address_space *mapping = inode->i_mapping;
 
 	BUG_ON(!PageLocked(page));
 
@@ -1633,7 +1635,8 @@ static int __block_write_full_page(struc
 	 * handle that here by just cleaning them.
 	 */
 
-	block = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
+	block = (sector_t)page->index <<
+		(page_cache_shift(mapping) - inode->i_blkbits);
 	head = page_buffers(page);
 	bh = head;
 
@@ -1750,7 +1753,7 @@ recover:
 	} while ((bh = bh->b_this_page) != head);
 	SetPageError(page);
 	BUG_ON(PageWriteback(page));
-	mapping_set_error(page->mapping, err);
+	mapping_set_error(mapping, err);
 	set_page_writeback(page);
 	do {
 		struct buffer_head *next = bh->b_this_page;
@@ -1817,8 +1820,8 @@ static int __block_prepare_write(struct 
 	struct buffer_head *bh, *head, *wait[2], **wait_bh=wait;
 
 	BUG_ON(!PageLocked(page));
-	BUG_ON(from > PAGE_CACHE_SIZE);
-	BUG_ON(to > PAGE_CACHE_SIZE);
+	BUG_ON(from > page_cache_size(inode->i_mapping));
+	BUG_ON(to > page_cache_size(inode->i_mapping));
 	BUG_ON(from > to);
 
 	blocksize = 1 << inode->i_blkbits;
@@ -1827,7 +1830,8 @@ static int __block_prepare_write(struct 
 	head = page_buffers(page);
 
 	bbits = inode->i_blkbits;
-	block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits);
+	block = (sector_t)page->index <<
+		(page_cache_shift(inode->i_mapping) - bbits);
 
 	for(bh = head, block_start = 0; bh != head || !block_start;
 	    block++, block_start=block_end, bh = bh->b_this_page) {
@@ -1942,8 +1946,8 @@ int block_write_begin(struct file *file,
 	unsigned start, end;
 	int ownpage = 0;
 
-	index = pos >> PAGE_CACHE_SHIFT;
-	start = pos & (PAGE_CACHE_SIZE - 1);
+	index = page_cache_index(mapping, pos);
+	start = page_cache_offset(mapping, pos);
 	end = start + len;
 
 	page = *pagep;
@@ -1989,7 +1993,7 @@ int block_write_end(struct file *file, s
 	struct inode *inode = mapping->host;
 	unsigned start;
 
-	start = pos & (PAGE_CACHE_SIZE - 1);
+	start = page_cache_offset(mapping, pos);
 
 	if (unlikely(copied < len)) {
 		/*
@@ -2065,7 +2069,8 @@ int block_read_full_page(struct page *pa
 		create_empty_buffers(page, blocksize, 0);
 	head = page_buffers(page);
 
-	iblock = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
+	iblock = (sector_t)page->index <<
+		(page_cache_shift(page->mapping) - inode->i_blkbits);
 	lblock = (i_size_read(inode)+blocksize-1) >> inode->i_blkbits;
 	bh = head;
 	nr = 0;
@@ -2183,16 +2188,17 @@ int cont_expand_zero(struct file *file, 
 	unsigned zerofrom, offset, len;
 	int err = 0;
 
-	index = pos >> PAGE_CACHE_SHIFT;
-	offset = pos & ~PAGE_CACHE_MASK;
+	index = page_cache_index(mapping, pos);
+	offset = page_cache_offset(mapping, pos);
 
-	while (index > (curidx = (curpos = *bytes)>>PAGE_CACHE_SHIFT)) {
-		zerofrom = curpos & ~PAGE_CACHE_MASK;
+	while (index > (curidx = page_cache_index(mapping,
+					(curpos = *bytes)))) {
+		zerofrom = page_cache_offset(mapping, curpos);
 		if (zerofrom & (blocksize-1)) {
 			*bytes |= (blocksize-1);
 			(*bytes)++;
 		}
-		len = PAGE_CACHE_SIZE - zerofrom;
+		len = page_cache_size(mapping) - zerofrom;
 
 		err = pagecache_write_begin(file, mapping, curpos, len,
 						AOP_FLAG_UNINTERRUPTIBLE,
@@ -2210,7 +2216,7 @@ int cont_expand_zero(struct file *file, 
 
 	/* page covers the boundary, find the boundary offset */
 	if (index == curidx) {
-		zerofrom = curpos & ~PAGE_CACHE_MASK;
+		zerofrom = page_cache_offset(mapping, curpos);
 		/* if we will expand the thing last block will be filled */
 		if (offset <= zerofrom) {
 			goto out;
@@ -2256,7 +2262,7 @@ int cont_write_begin(struct file *file, 
 	if (err)
 		goto out;
 
-	zerofrom = *bytes & ~PAGE_CACHE_MASK;
+	zerofrom = page_cache_offset(mapping, *bytes);
 	if (pos+len > *bytes && zerofrom & (blocksize-1)) {
 		*bytes |= (blocksize-1);
 		(*bytes)++;
@@ -2289,8 +2295,9 @@ int block_commit_write(struct page *page
 int generic_commit_write(struct file *file, struct page *page,
 		unsigned from, unsigned to)
 {
-	struct inode *inode = page->mapping->host;
-	loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
+	struct address_space *mapping = page->mapping;
+	struct inode *inode = mapping->host;
+	loff_t pos = page_cache_pos(mapping, page->index, to);
 	__block_commit_write(inode,page,from,to);
 	/*
 	 * No need to use i_size_read() here, the i_size
@@ -2332,6 +2339,7 @@ static void end_buffer_read_nobh(struct 
 int nobh_prepare_write(struct page *page, unsigned from, unsigned to,
 			get_block_t *get_block)
 {
+	struct address_space *mapping = page->mapping;
 	struct inode *inode = page->mapping->host;
 	const unsigned blkbits = inode->i_blkbits;
 	const unsigned blocksize = 1 << blkbits;
@@ -2339,6 +2347,7 @@ int nobh_prepare_write(struct page *page
 	struct buffer_head *read_bh[MAX_BUF_PER_PAGE];
 	unsigned block_in_page;
 	unsigned block_start;
+	unsigned page_size = page_cache_size(mapping);
 	sector_t block_in_file;
 	int nr_reads = 0;
 	int i;
@@ -2348,7 +2357,8 @@ int nobh_prepare_write(struct page *page
 	if (PageMappedToDisk(page))
 		return 0;
 
-	block_in_file = (sector_t)page->index << (PAGE_CACHE_SHIFT - blkbits);
+	block_in_file = (sector_t)page->index <<
+			(page_cache_shift(mapping) - blkbits);
 	map_bh.b_page = page;
 
 	/*
@@ -2357,7 +2367,7 @@ int nobh_prepare_write(struct page *page
 	 * page is fully mapped-to-disk.
 	 */
 	for (block_start = 0, block_in_page = 0;
-		  block_start < PAGE_CACHE_SIZE;
+		  block_start < page_size;
 		  block_in_page++, block_start += blocksize) {
 		unsigned block_end = block_start + blocksize;
 		int create;
@@ -2446,7 +2456,7 @@ failed:
 	 * Error recovery is pretty slack.  Clear the page and mark it dirty
 	 * so we'll later zero out any blocks which _were_ allocated.
 	 */
-	zero_user(page, 0, PAGE_CACHE_SIZE);
+	zero_user(page, 0, page_size);
 	SetPageUptodate(page);
 	set_page_dirty(page);
 	return ret;
@@ -2460,8 +2470,9 @@ EXPORT_SYMBOL(nobh_prepare_write);
 int nobh_commit_write(struct file *file, struct page *page,
 		unsigned from, unsigned to)
 {
-	struct inode *inode = page->mapping->host;
-	loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
+	struct address_space *mapping = page->mapping;
+	struct inode *inode = mapping->host;
+	loff_t pos = page_cache_pos(mapping, page->index, to);
 
 	SetPageUptodate(page);
 	set_page_dirty(page);
@@ -2481,9 +2492,10 @@ EXPORT_SYMBOL(nobh_commit_write);
 int nobh_writepage(struct page *page, get_block_t *get_block,
 			struct writeback_control *wbc)
 {
-	struct inode * const inode = page->mapping->host;
+	struct address_space *mapping = page->mapping;
+	struct inode * const inode = mapping->host;
 	loff_t i_size = i_size_read(inode);
-	const pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT;
+	const pgoff_t end_index = page_cache_index(mapping, i_size);
 	unsigned offset;
 	int ret;
 
@@ -2492,7 +2504,7 @@ int nobh_writepage(struct page *page, ge
 		goto out;
 
 	/* Is the page fully outside i_size? (truncate in progress) */
-	offset = i_size & (PAGE_CACHE_SIZE-1);
+	offset = page_cache_offset(mapping, i_size);
 	if (page->index >= end_index+1 || !offset) {
 		/*
 		 * The page may have dirty, unmapped buffers.  For example,
@@ -2515,7 +2527,7 @@ int nobh_writepage(struct page *page, ge
 	 * the  page size, the remaining memory is zeroed when mapped, and
 	 * writes to that region are not written out to the file."
 	 */
-	zero_user_segment(page, offset, PAGE_CACHE_SIZE);
+	zero_user_segment(page, offset, page_cache_size(mapping));
 out:
 	ret = mpage_writepage(page, get_block, wbc);
 	if (ret == -EAGAIN)
@@ -2531,8 +2543,8 @@ int nobh_truncate_page(struct address_sp
 {
 	struct inode *inode = mapping->host;
 	unsigned blocksize = 1 << inode->i_blkbits;
-	pgoff_t index = from >> PAGE_CACHE_SHIFT;
-	unsigned offset = from & (PAGE_CACHE_SIZE-1);
+	pgoff_t index = page_cache_index(mapping, from);
+	unsigned offset = page_cache_offset(mapping, from);
 	unsigned to;
 	struct page *page;
 	const struct address_space_operations *a_ops = mapping->a_ops;
@@ -2549,7 +2561,7 @@ int nobh_truncate_page(struct address_sp
 	to = (offset + blocksize) & ~(blocksize - 1);
 	ret = a_ops->prepare_write(NULL, page, offset, to);
 	if (ret == 0) {
-		zero_user_segment(page, offset, PAGE_CACHE_SIZE);
+		zero_user_segment(page, offset, page_cache_size(mapping));
 		/*
 		 * It would be more correct to call aops->commit_write()
 		 * here, but this is more efficient.
@@ -2567,8 +2579,8 @@ EXPORT_SYMBOL(nobh_truncate_page);
 int block_truncate_page(struct address_space *mapping,
 			loff_t from, get_block_t *get_block)
 {
-	pgoff_t index = from >> PAGE_CACHE_SHIFT;
-	unsigned offset = from & (PAGE_CACHE_SIZE-1);
+	pgoff_t index = page_cache_index(mapping, from);
+	unsigned offset = page_cache_offset(mapping, from);
 	unsigned blocksize;
 	sector_t iblock;
 	unsigned length, pos;
@@ -2585,8 +2597,8 @@ int block_truncate_page(struct address_s
 		return 0;
 
 	length = blocksize - length;
-	iblock = (sector_t)index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
-	
+	iblock = (sector_t)index <<
+			(page_cache_shift(mapping) - inode->i_blkbits);
 	page = grab_cache_page(mapping, index);
 	err = -ENOMEM;
 	if (!page)
@@ -2645,9 +2657,10 @@ out:
 int block_write_full_page(struct page *page, get_block_t *get_block,
 			struct writeback_control *wbc)
 {
-	struct inode * const inode = page->mapping->host;
+	struct address_space *mapping = page->mapping;
+	struct inode * const inode = mapping->host;
 	loff_t i_size = i_size_read(inode);
-	const pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT;
+	const pgoff_t end_index = page_cache_index(mapping, i_size);
 	unsigned offset;
 
 	/* Is the page fully inside i_size? */
@@ -2655,7 +2668,7 @@ int block_write_full_page(struct page *p
 		return __block_write_full_page(inode, page, get_block, wbc);
 
 	/* Is the page fully outside i_size? (truncate in progress) */
-	offset = i_size & (PAGE_CACHE_SIZE-1);
+	offset = page_cache_offset(mapping, i_size);
 	if (page->index >= end_index+1 || !offset) {
 		/*
 		 * The page may have dirty, unmapped buffers.  For example,
@@ -2674,7 +2687,7 @@ int block_write_full_page(struct page *p
 	 * the  page size, the remaining memory is zeroed when mapped, and
 	 * writes to that region are not written out to the file."
 	 */
-	zero_user_segment(page, offset, PAGE_CACHE_SIZE);
+	zero_user_segment(page, offset, page_cache_size(mapping));
 	return __block_write_full_page(inode, page, get_block, wbc);
 }
 
@@ -2928,7 +2941,7 @@ int try_to_free_buffers(struct page *pag
 	 * dirty bit from being lost.
 	 */
 	if (ret)
-		cancel_dirty_page(page, PAGE_CACHE_SIZE);
+		cancel_dirty_page(page, page_cache_size(mapping));
 	spin_unlock(&mapping->private_lock);
 out:
 	if (buffers_to_free) {

-- 


* [12/37] Use page_cache_xxx in mm/mpage.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (10 preceding siblings ...)
  2007-06-20 18:29 ` [11/37] Use page_cache_xxx in fs/buffer.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [13/37] Use page_cache_xxx in mm/fadvise.c clameter
                   ` (24 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_mpage --]
[-- Type: text/plain, Size: 4096 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/mpage.c |   28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

Index: vps/fs/mpage.c
===================================================================
--- vps.orig/fs/mpage.c	2007-06-11 22:33:07.000000000 -0700
+++ vps/fs/mpage.c	2007-06-11 22:37:24.000000000 -0700
@@ -133,7 +133,8 @@ mpage_alloc(struct block_device *bdev,
 static void 
 map_buffer_to_page(struct page *page, struct buffer_head *bh, int page_block) 
 {
-	struct inode *inode = page->mapping->host;
+	struct address_space *mapping = page->mapping;
+	struct inode *inode = mapping->host;
 	struct buffer_head *page_bh, *head;
 	int block = 0;
 
@@ -142,9 +143,9 @@ map_buffer_to_page(struct page *page, st
 		 * don't make any buffers if there is only one buffer on
 		 * the page and the page just needs to be set up to date
 		 */
-		if (inode->i_blkbits == PAGE_CACHE_SHIFT && 
+		if (inode->i_blkbits == page_cache_shift(mapping) &&
 		    buffer_uptodate(bh)) {
-			SetPageUptodate(page);    
+			SetPageUptodate(page);
 			return;
 		}
 		create_empty_buffers(page, 1 << inode->i_blkbits, 0);
@@ -177,9 +178,10 @@ do_mpage_readpage(struct bio *bio, struc
 		sector_t *last_block_in_bio, struct buffer_head *map_bh,
 		unsigned long *first_logical_block, get_block_t get_block)
 {
-	struct inode *inode = page->mapping->host;
+	struct address_space *mapping = page->mapping;
+	struct inode *inode = mapping->host;
 	const unsigned blkbits = inode->i_blkbits;
-	const unsigned blocks_per_page = PAGE_CACHE_SIZE >> blkbits;
+	const unsigned blocks_per_page = page_cache_size(mapping) >> blkbits;
 	const unsigned blocksize = 1 << blkbits;
 	sector_t block_in_file;
 	sector_t last_block;
@@ -196,7 +198,7 @@ do_mpage_readpage(struct bio *bio, struc
 	if (page_has_buffers(page))
 		goto confused;
 
-	block_in_file = (sector_t)page->index << (PAGE_CACHE_SHIFT - blkbits);
+	block_in_file = (sector_t)page->index << (page_cache_shift(mapping) - blkbits);
 	last_block = block_in_file + nr_pages * blocks_per_page;
 	last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits;
 	if (last_block > last_block_in_file)
@@ -284,7 +286,8 @@ do_mpage_readpage(struct bio *bio, struc
 	}
 
 	if (first_hole != blocks_per_page) {
-		zero_user_segment(page, first_hole << blkbits, PAGE_CACHE_SIZE);
+		zero_user_segment(page, first_hole << blkbits,
+					page_cache_size(mapping));
 		if (first_hole == 0) {
 			SetPageUptodate(page);
 			unlock_page(page);
@@ -462,7 +465,7 @@ static int __mpage_writepage(struct page
 	struct inode *inode = page->mapping->host;
 	const unsigned blkbits = inode->i_blkbits;
 	unsigned long end_index;
-	const unsigned blocks_per_page = PAGE_CACHE_SIZE >> blkbits;
+	const unsigned blocks_per_page = page_cache_size(mapping) >> blkbits;
 	sector_t last_block;
 	sector_t block_in_file;
 	sector_t blocks[MAX_BUF_PER_PAGE];
@@ -531,7 +534,8 @@ static int __mpage_writepage(struct page
 	 * The page has no buffers: map it to disk
 	 */
 	BUG_ON(!PageUptodate(page));
-	block_in_file = (sector_t)page->index << (PAGE_CACHE_SHIFT - blkbits);
+	block_in_file = (sector_t)page->index <<
+			(page_cache_shift(mapping) - blkbits);
 	last_block = (i_size - 1) >> blkbits;
 	map_bh.b_page = page;
 	for (page_block = 0; page_block < blocks_per_page; ) {
@@ -563,7 +567,7 @@ static int __mpage_writepage(struct page
 	first_unmapped = page_block;
 
 page_is_mapped:
-	end_index = i_size >> PAGE_CACHE_SHIFT;
+	end_index = page_cache_index(mapping, i_size);
 	if (page->index >= end_index) {
 		/*
 		 * The page straddles i_size.  It must be zeroed out on each
@@ -573,11 +577,11 @@ page_is_mapped:
 		 * is zeroed when mapped, and writes to that region are not
 		 * written out to the file."
 		 */
-		unsigned offset = i_size & (PAGE_CACHE_SIZE - 1);
+		unsigned offset = page_cache_offset(mapping, i_size);
 
 		if (page->index > end_index || !offset)
 			goto confused;
-		zero_user_segment(page, offset, PAGE_CACHE_SIZE);
+		zero_user_segment(page, offset, page_cache_size(mapping));
 	}
 
 	/*

-- 


* [13/37] Use page_cache_xxx in mm/fadvise.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (11 preceding siblings ...)
  2007-06-20 18:29 ` [12/37] Use page_cache_xxx in mm/mpage.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [14/37] Use page_cache_xxx in fs/splice.c clameter
                   ` (23 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_fadvise --]
[-- Type: text/plain, Size: 1164 bytes --]
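
The rounding direction is what matters in this conversion:
page_cache_index() truncates a byte offset to the page containing it,
while page_cache_next() rounds up to the first page boundary at or above
it, roughly (illustrative sketch):

	static inline pgoff_t page_cache_next(struct address_space *mapping,
						loff_t pos)
	{
		return page_cache_index(mapping,
				pos + page_cache_size(mapping) - 1);
	}

So the writeback range keeps plain truncation for the first and last
partial pages, while the invalidation range rounds its start up so that
only fully covered pages are dropped.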

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/fadvise.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: vps/mm/fadvise.c
===================================================================
--- vps.orig/mm/fadvise.c	2007-06-04 17:57:25.000000000 -0700
+++ vps/mm/fadvise.c	2007-06-09 21:32:46.000000000 -0700
@@ -79,8 +79,8 @@ asmlinkage long sys_fadvise64_64(int fd,
 		}
 
 		/* First and last PARTIAL page! */
-		start_index = offset >> PAGE_CACHE_SHIFT;
-		end_index = endbyte >> PAGE_CACHE_SHIFT;
+		start_index = page_cache_index(mapping, offset);
+		end_index = page_cache_index(mapping, endbyte);
 
 		/* Careful about overflow on the "+1" */
 		nrpages = end_index - start_index + 1;
@@ -100,8 +100,8 @@ asmlinkage long sys_fadvise64_64(int fd,
 			filemap_flush(mapping);
 
 		/* First and last FULL page! */
-		start_index = (offset+(PAGE_CACHE_SIZE-1)) >> PAGE_CACHE_SHIFT;
-		end_index = (endbyte >> PAGE_CACHE_SHIFT);
+		start_index = page_cache_next(mapping, offset);
+		end_index = page_cache_index(mapping, endbyte);
 
 		if (end_index >= start_index)
 			invalidate_mapping_pages(mapping, start_index,

-- 


* [14/37] Use page_cache_xxx in fs/splice.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (12 preceding siblings ...)
  2007-06-20 18:29 ` [13/37] Use page_cache_xxx in mm/fadvise.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [15/37] Use page_cache_xxx functions in fs/ext2 clameter
                   ` (22 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_splice --]
[-- Type: text/plain, Size: 2822 bytes --]
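
Convert fs/splice.c. The one non-obvious conversion is nr_pages:
page_cache_next(mapping, len + loff) yields the number of page cache
pages needed to cover the request, matching the old
(len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT round-up. For
example (illustrative, order-0 mapping with 4k pages):

	/* a 10000 byte request starting 100 bytes into the first page */
	nr_pages = page_cache_next(mapping, 10000 + 100);	/* == 3 */

pipe_to_file() also caches page_cache_size() in a local because it is
needed twice per pipe buffer.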

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/splice.c |   23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

Index: vps/fs/splice.c
===================================================================
--- vps.orig/fs/splice.c	2007-06-09 22:18:02.000000000 -0700
+++ vps/fs/splice.c	2007-06-09 22:22:08.000000000 -0700
@@ -282,9 +282,9 @@ __generic_file_splice_read(struct file *
 		.ops = &page_cache_pipe_buf_ops,
 	};
 
-	index = *ppos >> PAGE_CACHE_SHIFT;
-	loff = *ppos & ~PAGE_CACHE_MASK;
-	nr_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+	index = page_cache_index(mapping, *ppos);
+	loff = page_cache_offset(mapping, *ppos);
+	nr_pages = page_cache_next(mapping, len + loff);
 
 	if (nr_pages > PIPE_BUFFERS)
 		nr_pages = PIPE_BUFFERS;
@@ -345,7 +345,7 @@ __generic_file_splice_read(struct file *
 	 * Now loop over the map and see if we need to start IO on any
 	 * pages, fill in the partial map, etc.
 	 */
-	index = *ppos >> PAGE_CACHE_SHIFT;
+	index = page_cache_index(mapping, *ppos);
 	nr_pages = spd.nr_pages;
 	spd.nr_pages = 0;
 	for (page_nr = 0; page_nr < nr_pages; page_nr++) {
@@ -357,7 +357,8 @@ __generic_file_splice_read(struct file *
 		/*
 		 * this_len is the max we'll use from this page
 		 */
-		this_len = min_t(unsigned long, len, PAGE_CACHE_SIZE - loff);
+		this_len = min_t(unsigned long, len,
+					page_cache_size(mapping) - loff);
 		page = pages[page_nr];
 
 		if (PageReadahead(page))
@@ -416,7 +417,7 @@ __generic_file_splice_read(struct file *
 			 * i_size must be checked after ->readpage().
 			 */
 			isize = i_size_read(mapping->host);
-			end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
+			end_index = page_cache_index(mapping, isize - 1);
 			if (unlikely(!isize || index > end_index))
 				break;
 
@@ -425,7 +426,8 @@ __generic_file_splice_read(struct file *
 			 * the length and stop
 			 */
 			if (end_index == index) {
-				loff = PAGE_CACHE_SIZE - (isize & ~PAGE_CACHE_MASK);
+				loff = page_cache_size(mapping)
+					- page_cache_offset(mapping, isize);
 				if (total_len + loff > isize)
 					break;
 				/*
@@ -557,6 +559,7 @@ static int pipe_to_file(struct pipe_inod
 	struct page *page;
 	void *fsdata;
 	int ret;
+	int pagesize = page_cache_size(mapping);
 
 	/*
 	 * make sure the data in this buffer is uptodate
@@ -565,11 +568,11 @@ static int pipe_to_file(struct pipe_inod
 	if (unlikely(ret))
 		return ret;
 
-	offset = sd->pos & ~PAGE_CACHE_MASK;
+	offset = page_cache_offset(mapping, sd->pos);
 
 	this_len = sd->len;
-	if (this_len + offset > PAGE_CACHE_SIZE)
-		this_len = PAGE_CACHE_SIZE - offset;
+	if (this_len + offset > pagesize)
+		this_len = pagesize - offset;
 
 	ret = pagecache_write_begin(file, mapping, sd->pos, sd->len,
 				AOP_FLAG_UNINTERRUPTIBLE, &page, &fsdata);

-- 


* [15/37] Use page_cache_xxx functions in fs/ext2
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (13 preceding siblings ...)
  2007-06-20 18:29 ` [14/37] Use page_cache_xxx in fs/splice.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [16/37] Use page_cache_xxx in fs/ext3 clameter
                   ` (21 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_ext2 --]
[-- Type: text/plain, Size: 5655 bytes --]
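
Convert the ext2 directory code. Note that dir_pages() open-codes the
same round-up that page_cache_next() provides, i.e. it could
equivalently be written as (illustrative):

	static inline unsigned long dir_pages(struct inode *inode)
	{
		/* page cache pages needed to hold i_size bytes */
		return page_cache_next(inode->i_mapping, inode->i_size);
	}

ext2_last_byte() likewise clamps the length of the final directory page
to the mapping's page size instead of PAGE_CACHE_SIZE.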

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/ext2/dir.c |   40 +++++++++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 17 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/ext2/dir.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ext2/dir.c	2007-06-15 17:35:32.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ext2/dir.c	2007-06-18 18:57:34.000000000 -0700
@@ -45,7 +45,8 @@ static inline void ext2_put_page(struct 
 
 static inline unsigned long dir_pages(struct inode *inode)
 {
-	return (inode->i_size+PAGE_CACHE_SIZE-1)>>PAGE_CACHE_SHIFT;
+	return (inode->i_size+page_cache_size(inode->i_mapping)-1)>>
+			page_cache_shift(inode->i_mapping);
 }
 
 /*
@@ -56,10 +57,11 @@ static unsigned
 ext2_last_byte(struct inode *inode, unsigned long page_nr)
 {
 	unsigned last_byte = inode->i_size;
+	struct address_space *mapping = inode->i_mapping;
 
-	last_byte -= page_nr << PAGE_CACHE_SHIFT;
-	if (last_byte > PAGE_CACHE_SIZE)
-		last_byte = PAGE_CACHE_SIZE;
+	last_byte -= page_nr << page_cache_shift(mapping);
+	if (last_byte > page_cache_size(mapping))
+		last_byte = page_cache_size(mapping);
 	return last_byte;
 }
 
@@ -88,18 +90,19 @@ static int ext2_commit_chunk(struct page
 
 static void ext2_check_page(struct page *page)
 {
-	struct inode *dir = page->mapping->host;
+	struct address_space *mapping = page->mapping;
+	struct inode *dir = mapping->host;
 	struct super_block *sb = dir->i_sb;
 	unsigned chunk_size = ext2_chunk_size(dir);
 	char *kaddr = page_address(page);
 	u32 max_inumber = le32_to_cpu(EXT2_SB(sb)->s_es->s_inodes_count);
 	unsigned offs, rec_len;
-	unsigned limit = PAGE_CACHE_SIZE;
+	unsigned limit = page_cache_size(mapping);
 	ext2_dirent *p;
 	char *error;
 
-	if ((dir->i_size >> PAGE_CACHE_SHIFT) == page->index) {
-		limit = dir->i_size & ~PAGE_CACHE_MASK;
+	if (page_cache_index(mapping, dir->i_size) == page->index) {
+		limit = page_cache_offset(mapping, dir->i_size);
 		if (limit & (chunk_size - 1))
 			goto Ebadsize;
 		if (!limit)
@@ -151,7 +154,7 @@ Einumber:
 bad_entry:
 	ext2_error (sb, "ext2_check_page", "bad entry in directory #%lu: %s - "
 		"offset=%lu, inode=%lu, rec_len=%d, name_len=%d",
-		dir->i_ino, error, (page->index<<PAGE_CACHE_SHIFT)+offs,
+		dir->i_ino, error, page_cache_pos(mapping, page->index, offs),
 		(unsigned long) le32_to_cpu(p->inode),
 		rec_len, p->name_len);
 	goto fail;
@@ -160,7 +163,7 @@ Eend:
 	ext2_error (sb, "ext2_check_page",
 		"entry in directory #%lu spans the page boundary"
 		"offset=%lu, inode=%lu",
-		dir->i_ino, (page->index<<PAGE_CACHE_SHIFT)+offs,
+		dir->i_ino, page_cache_pos(mapping, page->index, offs),
 		(unsigned long) le32_to_cpu(p->inode));
 fail:
 	SetPageChecked(page);
@@ -258,8 +261,9 @@ ext2_readdir (struct file * filp, void *
 	loff_t pos = filp->f_pos;
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	struct super_block *sb = inode->i_sb;
-	unsigned int offset = pos & ~PAGE_CACHE_MASK;
-	unsigned long n = pos >> PAGE_CACHE_SHIFT;
+	struct address_space *mapping = inode->i_mapping;
+	unsigned int offset = page_cache_offset(mapping, pos);
+	unsigned long n = page_cache_index(mapping, pos);
 	unsigned long npages = dir_pages(inode);
 	unsigned chunk_mask = ~(ext2_chunk_size(inode)-1);
 	unsigned char *types = NULL;
@@ -280,14 +284,14 @@ ext2_readdir (struct file * filp, void *
 			ext2_error(sb, __FUNCTION__,
 				   "bad page in #%lu",
 				   inode->i_ino);
-			filp->f_pos += PAGE_CACHE_SIZE - offset;
+			filp->f_pos += page_cache_size(mapping) - offset;
 			return -EIO;
 		}
 		kaddr = page_address(page);
 		if (unlikely(need_revalidate)) {
 			if (offset) {
 				offset = ext2_validate_entry(kaddr, offset, chunk_mask);
-				filp->f_pos = (n<<PAGE_CACHE_SHIFT) + offset;
+				filp->f_pos = page_cache_pos(mapping, n, offset);
 			}
 			filp->f_version = inode->i_version;
 			need_revalidate = 0;
@@ -310,7 +314,7 @@ ext2_readdir (struct file * filp, void *
 
 				offset = (char *)de - kaddr;
 				over = filldir(dirent, de->name, de->name_len,
-						(n<<PAGE_CACHE_SHIFT) | offset,
+						page_cache_pos(mapping, n, offset),
 						le32_to_cpu(de->inode), d_type);
 				if (over) {
 					ext2_put_page(page);
@@ -336,6 +340,7 @@ struct ext2_dir_entry_2 * ext2_find_entr
 			struct dentry *dentry, struct page ** res_page)
 {
 	const char *name = dentry->d_name.name;
+	struct address_space *mapping = dir->i_mapping;
 	int namelen = dentry->d_name.len;
 	unsigned reclen = EXT2_DIR_REC_LEN(namelen);
 	unsigned long start, n;
@@ -377,7 +382,7 @@ struct ext2_dir_entry_2 * ext2_find_entr
 		if (++n >= npages)
 			n = 0;
 		/* next page is past the blocks we've got */
-		if (unlikely(n > (dir->i_blocks >> (PAGE_CACHE_SHIFT - 9)))) {
+		if (unlikely(n > (dir->i_blocks >> (page_cache_shift(mapping) - 9)))) {
 			ext2_error(dir->i_sb, __FUNCTION__,
 				"dir %lu size %lld exceeds block count %llu",
 				dir->i_ino, dir->i_size,
@@ -448,6 +453,7 @@ void ext2_set_link(struct inode *dir, st
 int ext2_add_link (struct dentry *dentry, struct inode *inode)
 {
 	struct inode *dir = dentry->d_parent->d_inode;
+	struct address_space *mapping = dir->i_mapping;
 	const char *name = dentry->d_name.name;
 	int namelen = dentry->d_name.len;
 	unsigned chunk_size = ext2_chunk_size(dir);
@@ -477,7 +483,7 @@ int ext2_add_link (struct dentry *dentry
 		kaddr = page_address(page);
 		dir_end = kaddr + ext2_last_byte(dir, n);
 		de = (ext2_dirent *)kaddr;
-		kaddr += PAGE_CACHE_SIZE - reclen;
+		kaddr += page_cache_size(mapping) - reclen;
 		while ((char *)de <= kaddr) {
 			if ((char *)de == dir_end) {
 				/* We hit i_size */

-- 


* [16/37] Use page_cache_xxx in fs/ext3
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (14 preceding siblings ...)
  2007-06-20 18:29 ` [15/37] Use page_cache_xxx functions in fs/ext2 clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [17/37] Use page_cache_xxx in fs/ext4 clameter
                   ` (20 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_ext3 --]
[-- Type: text/plain, Size: 5109 bytes --]
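
Convert fs/ext3. Where a function uses the page size repeatedly (the
ordered and journalled writepage paths call walk_page_buffers() several
times over the whole page), the value is fetched once into a local,
illustrative of the pattern below:

	int pagesize = page_cache_size(inode->i_mapping);

	walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bget_one);

instead of re-evaluating page_cache_size() at every call site.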

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/ext3/dir.c   |    3 ++-
 fs/ext3/inode.c |   36 ++++++++++++++++++------------------
 2 files changed, 20 insertions(+), 19 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/ext3/dir.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ext3/dir.c	2007-06-15 17:35:33.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ext3/dir.c	2007-06-18 18:59:38.000000000 -0700
@@ -137,7 +137,8 @@ static int ext3_readdir(struct file * fi
 						&map_bh, 0, 0);
 		if (err > 0) {
 			pgoff_t index = map_bh.b_blocknr >>
-					(PAGE_CACHE_SHIFT - inode->i_blkbits);
+				(page_cache_shift(inode->i_mapping)
+					- inode->i_blkbits);
 			if (!ra_has_index(&filp->f_ra, index))
 				page_cache_readahead_ondemand(
 					sb->s_bdev->bd_inode->i_mapping,
Index: linux-2.6.22-rc4-mm2/fs/ext3/inode.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ext3/inode.c	2007-06-18 18:42:45.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ext3/inode.c	2007-06-18 18:59:38.000000000 -0700
@@ -1159,8 +1159,8 @@ static int ext3_write_begin(struct file 
 	pgoff_t index;
 	unsigned from, to;
 
-	index = pos >> PAGE_CACHE_SHIFT;
-	from = pos & (PAGE_CACHE_SIZE - 1);
+	index = page_cache_index(mapping, pos);
+	from = page_cache_offset(mapping, pos);
 	to = from + len;
 
 retry:
@@ -1233,7 +1233,7 @@ static int ext3_ordered_write_end(struct
 	unsigned from, to;
 	int ret = 0, ret2;
 
-	from = pos & (PAGE_CACHE_SIZE - 1);
+	from = page_cache_offset(mapping, pos);
 	to = from + len;
 
 	ret = walk_page_buffers(handle, page_buffers(page),
@@ -1300,7 +1300,7 @@ static int ext3_journalled_write_end(str
 	int partial = 0;
 	unsigned from, to;
 
-	from = pos & (PAGE_CACHE_SIZE - 1);
+	from = page_cache_offset(mapping, pos);
 	to = from + len;
 
 	if (copied < len) {
@@ -1462,6 +1462,7 @@ static int ext3_ordered_writepage(struct
 	handle_t *handle = NULL;
 	int ret = 0;
 	int err;
+	int pagesize = page_cache_size(inode->i_mapping);
 
 	J_ASSERT(PageLocked(page));
 
@@ -1484,8 +1485,7 @@ static int ext3_ordered_writepage(struct
 				(1 << BH_Dirty)|(1 << BH_Uptodate));
 	}
 	page_bufs = page_buffers(page);
-	walk_page_buffers(handle, page_bufs, 0,
-			PAGE_CACHE_SIZE, NULL, bget_one);
+	walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bget_one);
 
 	ret = block_write_full_page(page, ext3_get_block, wbc);
 
@@ -1502,13 +1502,12 @@ static int ext3_ordered_writepage(struct
 	 * and generally junk.
 	 */
 	if (ret == 0) {
-		err = walk_page_buffers(handle, page_bufs, 0, PAGE_CACHE_SIZE,
-					NULL, journal_dirty_data_fn);
+		err = walk_page_buffers(handle, page_bufs, 0, pagesize,
+			NULL, journal_dirty_data_fn);
 		if (!ret)
 			ret = err;
 	}
-	walk_page_buffers(handle, page_bufs, 0,
-			PAGE_CACHE_SIZE, NULL, bput_one);
+	walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bput_one);
 	err = ext3_journal_stop(handle);
 	if (!ret)
 		ret = err;
@@ -1560,6 +1559,7 @@ static int ext3_journalled_writepage(str
 	handle_t *handle = NULL;
 	int ret = 0;
 	int err;
+	int pagesize = page_cache_size(inode->i_mapping);
 
 	if (ext3_journal_current_handle())
 		goto no_write;
@@ -1576,17 +1576,16 @@ static int ext3_journalled_writepage(str
 		 * doesn't seem much point in redirtying the page here.
 		 */
 		ClearPageChecked(page);
-		ret = block_prepare_write(page, 0, PAGE_CACHE_SIZE,
-					ext3_get_block);
+		ret = block_prepare_write(page, 0, pagesize, ext3_get_block);
 		if (ret != 0) {
 			ext3_journal_stop(handle);
 			goto out_unlock;
 		}
 		ret = walk_page_buffers(handle, page_buffers(page), 0,
-			PAGE_CACHE_SIZE, NULL, do_journal_get_write_access);
+			pagesize, NULL, do_journal_get_write_access);
 
 		err = walk_page_buffers(handle, page_buffers(page), 0,
-				PAGE_CACHE_SIZE, NULL, write_end_fn);
+				pagesize, NULL, write_end_fn);
 		if (ret == 0)
 			ret = err;
 		EXT3_I(inode)->i_state |= EXT3_STATE_JDATA;
@@ -1801,8 +1800,8 @@ void ext3_set_aops(struct inode *inode)
 static int ext3_block_truncate_page(handle_t *handle, struct page *page,
 		struct address_space *mapping, loff_t from)
 {
-	ext3_fsblk_t index = from >> PAGE_CACHE_SHIFT;
-	unsigned offset = from & (PAGE_CACHE_SIZE-1);
+	ext3_fsblk_t index = page_cache_index(mapping, from);
+	unsigned offset = page_cache_offset(mapping, from);
 	unsigned blocksize, iblock, length, pos;
 	struct inode *inode = mapping->host;
 	struct buffer_head *bh;
@@ -1810,7 +1809,8 @@ static int ext3_block_truncate_page(hand
 
 	blocksize = inode->i_sb->s_blocksize;
 	length = blocksize - (offset & (blocksize - 1));
-	iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);
+	iblock = index <<
+		(page_cache_shift(mapping) - inode->i_sb->s_blocksize_bits);
 
 	/*
 	 * For "nobh" option,  we can only work if we don't need to
@@ -2289,7 +2289,7 @@ void ext3_truncate(struct inode *inode)
 		page = NULL;
 	} else {
 		page = grab_cache_page(mapping,
-				inode->i_size >> PAGE_CACHE_SHIFT);
+				page_cache_index(mapping, inode->i_size));
 		if (!page)
 			return;
 	}

-- 


* [17/37] Use page_cache_xxx in fs/ext4
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (15 preceding siblings ...)
  2007-06-20 18:29 ` [16/37] Use page_cache_xxx in fs/ext3 clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [18/37] Use page_cache_xxx in fs/reiserfs clameter
                   ` (19 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_ext4 --]
[-- Type: text/plain, Size: 10218 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/ext4/dir.c       |    3 ++-
 fs/ext4/inode.c     |   34 +++++++++++++++++-----------------
 fs/ext4/writeback.c |   35 +++++++++++++++++++----------------
 3 files changed, 38 insertions(+), 34 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/ext4/dir.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ext4/dir.c	2007-06-18 19:01:00.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ext4/dir.c	2007-06-18 19:01:06.000000000 -0700
@@ -136,7 +136,8 @@ static int ext4_readdir(struct file * fi
 		err = ext4_get_blocks_wrap(NULL, inode, blk, 1, &map_bh, 0, 0);
 		if (err > 0) {
 			pgoff_t index = map_bh.b_blocknr >>
-					(PAGE_CACHE_SHIFT - inode->i_blkbits);
+				(page_cache_shift(inode->i_mapping)
+					- inode->i_blkbits);
 			if (!ra_has_index(&filp->f_ra, index))
 				page_cache_readahead_ondemand(
 					sb->s_bdev->bd_inode->i_mapping,
Index: linux-2.6.22-rc4-mm2/fs/ext4/inode.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ext4/inode.c	2007-06-18 19:01:00.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ext4/inode.c	2007-06-18 19:01:06.000000000 -0700
@@ -1158,8 +1158,8 @@ static int ext4_write_begin(struct file 
  	pgoff_t index;
  	unsigned from, to;
 
- 	index = pos >> PAGE_CACHE_SHIFT;
- 	from = pos & (PAGE_CACHE_SIZE - 1);
+ 	index = page_cache_index(mapping, pos);
+ 	from = page_cache_offset(mapping, pos);
  	to = from + len;
 
 retry:
@@ -1231,7 +1231,7 @@ static int ext4_ordered_write_end(struct
 	unsigned from, to;
 	int ret = 0, ret2;
 
-	from = pos & (PAGE_CACHE_SIZE - 1);
+	from = page_cache_offset(mapping, pos);
 	to = from + len;
 
 	ret = walk_page_buffers(handle, page_buffers(page),
@@ -1298,7 +1298,7 @@ static int ext4_journalled_write_end(str
 	int partial = 0;
 	unsigned from, to;
 
-	from = pos & (PAGE_CACHE_SIZE - 1);
+	from = page_cache_offset(mapping, pos);
 	to = from + len;
 
 	if (copied < len) {
@@ -1460,6 +1460,7 @@ static int ext4_ordered_writepage(struct
 	handle_t *handle = NULL;
 	int ret = 0;
 	int err;
+	int pagesize = page_cache_size(inode->i_mapping);
 
 	J_ASSERT(PageLocked(page));
 
@@ -1482,8 +1483,7 @@ static int ext4_ordered_writepage(struct
 				(1 << BH_Dirty)|(1 << BH_Uptodate));
 	}
 	page_bufs = page_buffers(page);
-	walk_page_buffers(handle, page_bufs, 0,
-			PAGE_CACHE_SIZE, NULL, bget_one);
+	walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bget_one);
 
 	ret = block_write_full_page(page, ext4_get_block, wbc);
 
@@ -1500,13 +1500,12 @@ static int ext4_ordered_writepage(struct
 	 * and generally junk.
 	 */
 	if (ret == 0) {
-		err = walk_page_buffers(handle, page_bufs, 0, PAGE_CACHE_SIZE,
+		err = walk_page_buffers(handle, page_bufs, 0, pagesize,
 					NULL, jbd2_journal_dirty_data_fn);
 		if (!ret)
 			ret = err;
 	}
-	walk_page_buffers(handle, page_bufs, 0,
-			PAGE_CACHE_SIZE, NULL, bput_one);
+	walk_page_buffers(handle, page_bufs, 0, pagesize, NULL, bput_one);
 	err = ext4_journal_stop(handle);
 	if (!ret)
 		ret = err;
@@ -1558,6 +1557,7 @@ static int ext4_journalled_writepage(str
 	handle_t *handle = NULL;
 	int ret = 0;
 	int err;
+	int pagesize = page_cache_size(inode->i_mapping);
 
 	if (ext4_journal_current_handle())
 		goto no_write;
@@ -1574,17 +1574,16 @@ static int ext4_journalled_writepage(str
 		 * doesn't seem much point in redirtying the page here.
 		 */
 		ClearPageChecked(page);
-		ret = block_prepare_write(page, 0, PAGE_CACHE_SIZE,
-					ext4_get_block);
+		ret = block_prepare_write(page, 0, pagesize, ext4_get_block);
 		if (ret != 0) {
 			ext4_journal_stop(handle);
 			goto out_unlock;
 		}
 		ret = walk_page_buffers(handle, page_buffers(page), 0,
-			PAGE_CACHE_SIZE, NULL, do_journal_get_write_access);
+			pagesize, NULL, do_journal_get_write_access);
 
 		err = walk_page_buffers(handle, page_buffers(page), 0,
-				PAGE_CACHE_SIZE, NULL, write_end_fn);
+			pagesize, NULL, write_end_fn);
 		if (ret == 0)
 			ret = err;
 		EXT4_I(inode)->i_state |= EXT4_STATE_JDATA;
@@ -1824,8 +1823,8 @@ void ext4_set_aops(struct inode *inode)
 int ext4_block_truncate_page(handle_t *handle, struct page *page,
 		struct address_space *mapping, loff_t from)
 {
-	ext4_fsblk_t index = from >> PAGE_CACHE_SHIFT;
-	unsigned offset = from & (PAGE_CACHE_SIZE-1);
+	ext4_fsblk_t index = page_cache_index(mapping, from);
+	unsigned offset = page_cache_offset(mapping, from);
 	unsigned blocksize, iblock, length, pos;
 	struct inode *inode = mapping->host;
 	struct buffer_head *bh;
@@ -1839,7 +1838,8 @@ int ext4_block_truncate_page(handle_t *h
 
 	blocksize = inode->i_sb->s_blocksize;
 	length = blocksize - (offset & (blocksize - 1));
-	iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);
+	iblock = index <<
+		(page_cache_shift(mapping) - inode->i_sb->s_blocksize_bits);
 
 	/*
 	 * For "nobh" option,  we can only work if we don't need to
@@ -2325,7 +2325,7 @@ void ext4_truncate(struct inode *inode)
 		page = NULL;
 	} else {
 		page = grab_cache_page(mapping,
-				inode->i_size >> PAGE_CACHE_SHIFT);
+				page_cache_index(mapping, inode->i_size));
 		if (!page)
 			return;
 	}
Index: linux-2.6.22-rc4-mm2/fs/ext4/writeback.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ext4/writeback.c	2007-06-18 19:01:00.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ext4/writeback.c	2007-06-18 19:01:06.000000000 -0700
@@ -21,7 +21,7 @@
  *   MUST:
  *     - flush dirty pages in -ENOSPC case in order to free reserved blocks
  *     - direct I/O support
- *     - blocksize != PAGE_CACHE_SIZE support
+ *     - blocksize != PAGE_SIZE support
  *     - store last unwritten page in ext4_wb_writepages() and
  *       continue from it in a next run
  *   WISH:
@@ -285,7 +285,7 @@ static int ext4_wb_submit_extent(struct 
 	 * let's cook bios from them and start real I/O
 	 */
 
-	BUG_ON(PAGE_CACHE_SHIFT < blkbits);
+	BUG_ON(page_cache_shift(inode->i_mapping) < blkbits);
 	BUG_ON(list_empty(&wc->list));
 
 	wb_debug("cook and submit bios for %u/%u/%u for %lu/%u\n",
@@ -301,8 +301,9 @@ static int ext4_wb_submit_extent(struct 
 		if (page == NULL)
 			break;
 
-		pstart = page->index << (PAGE_CACHE_SHIFT - blkbits);
-		plen = PAGE_SIZE >> blkbits;
+		pstart = page->index <<
+				(page_cache_shift(inode->i_mapping) - blkbits);
+		plen = page_cache_size(inode->i_mapping) >> blkbits;
 		if (pstart > blk) {
 			/* probably extent covers long space and page
 			 * to be written in the middle of it */
@@ -333,7 +334,8 @@ alloc_new_bio:
 			/* +2 because head/tail may belong to different pages */
 			nr_pages = (le16_to_cpu(ex->ee_len) -
 				    (blk - le32_to_cpu(ex->ee_block)));
-			nr_pages = (nr_pages >> (PAGE_CACHE_SHIFT - blkbits));
+			nr_pages = nr_pages >>
+					(page_cache_shift(inode->i_mapping) - blkbits);
 			off = le32_to_cpu(ex->ee_start) +
 			      (blk - le32_to_cpu(ex->ee_block));
 			off |= (ext4_fsblk_t)
@@ -832,11 +834,12 @@ int ext4_wb_writepages(struct address_sp
 static void ext4_wb_clear_page(struct page *page, int from, int to)
 {
 	void *kaddr;
+	struct address_space *mapping = page_mapping(page);
 
-	if (to < PAGE_CACHE_SIZE || from > 0) {
+	if (to < page_cache_size(mapping) || from > 0) {
 		kaddr = kmap_atomic(page, KM_USER0);
-		if (PAGE_CACHE_SIZE > to)
-			memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
+		if (page_cache_size(mapping) > to)
+			memset(kaddr + to, 0, page_cache_size(mapping) - to);
 		if (0 < from)
 			memset(kaddr, 0, from);
 		flush_dcache_page(page);
@@ -878,7 +881,7 @@ int ext4_wb_prepare_write(struct file *f
 		} else {
 			/* block is already mapped, so no need to reserve */
 			BUG_ON(PagePrivate(page));
-			if (to - from < PAGE_CACHE_SIZE) {
+			if (to - from < page_cache_size(inode->i_mapping)) {
 				wb_debug("read block %u\n",
 						(unsigned) bhw->b_blocknr);
 				set_bh_page(bhw, page, 0);
@@ -905,7 +908,7 @@ int ext4_wb_prepare_write(struct file *f
 int ext4_wb_commit_write(struct file *file, struct page *page,
 			     unsigned from, unsigned to)
 {
-	loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
+	loff_t pos = page_cache_pos(file->f_mapping, page->index, to);
 	struct inode *inode = page->mapping->host;
 	int err = 0;
 
@@ -984,7 +987,7 @@ int ext4_wb_writepage(struct page *page,
 {
 	struct inode *inode = page->mapping->host;
 	loff_t i_size = i_size_read(inode);
-	pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT;
+	pgoff_t end_index = page_cache_index(inode->i_mapping, i_size);
 	unsigned offset;
 	void *kaddr;
 
@@ -1017,7 +1020,7 @@ int ext4_wb_writepage(struct page *page,
 		return ext4_wb_write_single_page(page, wbc);
 
 	/* Is the page fully outside i_size? (truncate in progress) */
-	offset = i_size & (PAGE_CACHE_SIZE-1);
+	offset = page_cache_offset(inode->i_mapping, i_size);
 	if (page->index >= end_index + 1 || !offset) {
 		/*
 		 * The page may have dirty, unmapped buffers.  For example,
@@ -1037,7 +1040,7 @@ int ext4_wb_writepage(struct page *page,
 	 * writes to that region are not written out to the file."
 	 */
 	kaddr = kmap_atomic(page, KM_USER0);
-	memset(kaddr + offset, 0, PAGE_CACHE_SIZE - offset);
+	memset(kaddr + offset, 0, page_cache_size(inode->i_mapping) - offset);
 	flush_dcache_page(page);
 	kunmap_atomic(kaddr, KM_USER0);
 	return ext4_wb_write_single_page(page, wbc);
@@ -1086,7 +1089,7 @@ void ext4_wb_invalidatepage(struct page 
 int ext4_wb_block_truncate_page(handle_t *handle, struct page *page,
 				struct address_space *mapping, loff_t from)
 {
-	unsigned offset = from & (PAGE_CACHE_SIZE-1);
+	unsigned offset = page_cache_offset(mapping, from);
 	struct inode *inode = mapping->host;
 	struct buffer_head bh, *bhw = &bh;
 	unsigned blocksize, length;
@@ -1147,9 +1150,9 @@ void ext4_wb_init(struct super_block *sb
 	if (!test_opt(sb, DELAYED_ALLOC))
 		return;
 
-	if (PAGE_CACHE_SHIFT != sb->s_blocksize_bits) {
+	if (PAGE_SHIFT != sb->s_blocksize_bits) {
 		printk(KERN_ERR "EXT4-fs: delayed allocation isn't"
-			"supported for PAGE_CACHE_SIZE != blocksize yet\n");
+			"supported for PAGE_SIZE != blocksize yet\n");
 		clear_opt (EXT4_SB(sb)->s_mount_opt, DELAYED_ALLOC);
 		return;
 	}

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [18/37] Use page_cache_xxx in fs/reiserfs
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (16 preceding siblings ...)
  2007-06-20 18:29 ` [17/37] Use page_cache_xxx in fs/ext4 clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [19/37] Use page_cache_xxx for fs/xfs clameter
                   ` (18 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_reiserfs --]
[-- Type: text/plain, Size: 12222 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/reiserfs/bitmap.c          |    7 ++++++-
 fs/reiserfs/file.c            |    5 +++--
 fs/reiserfs/inode.c           |   37 ++++++++++++++++++++++---------------
 fs/reiserfs/ioctl.c           |    2 +-
 fs/reiserfs/stree.c           |    8 +++++---
 fs/reiserfs/tail_conversion.c |    5 +++--
 fs/reiserfs/xattr.c           |   19 ++++++++++---------
 7 files changed, 50 insertions(+), 33 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/reiserfs/ioctl.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/ioctl.c	2007-06-18 19:04:34.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/ioctl.c	2007-06-18 19:04:38.000000000 -0700
@@ -173,8 +173,8 @@ static int reiserfs_unpack(struct inode 
 	 ** reiserfs_prepare_write on that page.  This will force a 
 	 ** reiserfs_get_block to unpack the tail for us.
 	 */
-	index = inode->i_size >> PAGE_CACHE_SHIFT;
 	mapping = inode->i_mapping;
+	index = page_cache_index(mapping, inode->i_size);
 	page = grab_cache_page(mapping, index);
 	retval = -ENOMEM;
 	if (!page) {
Index: linux-2.6.22-rc4-mm2/fs/reiserfs/stree.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/stree.c	2007-06-18 19:04:34.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/stree.c	2007-06-18 19:04:38.000000000 -0700
@@ -1282,7 +1282,8 @@ int reiserfs_delete_item(struct reiserfs
 		 */
 
 		data = kmap_atomic(p_s_un_bh->b_page, KM_USER0);
-		off = ((le_ih_k_offset(&s_ih) - 1) & (PAGE_CACHE_SIZE - 1));
+		off = page_cache_offset(p_s_inode->i_mapping,
+					le_ih_k_offset(&s_ih) - 1);
 		memcpy(data + off,
 		       B_I_PITEM(PATH_PLAST_BUFFER(p_s_path), &s_ih),
 		       n_ret_value);
@@ -1438,7 +1439,7 @@ static void unmap_buffers(struct page *p
 
 	if (page) {
 		if (page_has_buffers(page)) {
-			tail_index = pos & (PAGE_CACHE_SIZE - 1);
+			tail_index = page_cache_offset(page_mapping(page), pos);
 			cur_index = 0;
 			head = page_buffers(page);
 			bh = head;
@@ -1458,7 +1459,8 @@ static void unmap_buffers(struct page *p
 				bh = next;
 			} while (bh != head);
 			if (PAGE_SIZE == bh->b_size) {
-				cancel_dirty_page(page, PAGE_CACHE_SIZE);
+				cancel_dirty_page(page,
+					page_cache_size(page_mapping(page)));
 			}
 		}
 	}
Index: linux-2.6.22-rc4-mm2/fs/reiserfs/tail_conversion.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/tail_conversion.c	2007-06-18 19:04:34.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/tail_conversion.c	2007-06-18 19:21:18.000000000 -0700
@@ -128,7 +128,8 @@ int direct2indirect(struct reiserfs_tran
 	 */
 	if (up_to_date_bh) {
 		unsigned pgoff =
-		    (tail_offset + total_tail - 1) & (PAGE_CACHE_SIZE - 1);
+		    	page_cache_offset(inode->i_mapping,
+		    			tail_offset + total_tail - 1);
 		char *kaddr = kmap_atomic(up_to_date_bh->b_page, KM_USER0);
 		memset(kaddr + pgoff, 0, n_blk_size - total_tail);
 		kunmap_atomic(kaddr, KM_USER0);
@@ -238,7 +239,7 @@ int indirect2direct(struct reiserfs_tran
 	 ** the page was locked and this part of the page was up to date when
 	 ** indirect2direct was called, so we know the bytes are still valid
 	 */
-	tail = tail + (pos & (PAGE_CACHE_SIZE - 1));
+	tail = tail + page_cache_offset(p_s_inode->i_mapping, pos);
 
 	PATH_LAST_POSITION(p_s_path)++;
 
Index: linux-2.6.22-rc4-mm2/fs/reiserfs/file.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/file.c	2007-06-18 19:04:34.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/file.c	2007-06-18 19:04:38.000000000 -0700
@@ -161,11 +161,12 @@ int reiserfs_commit_page(struct inode *i
 	int partial = 0;
 	unsigned blocksize;
 	struct buffer_head *bh, *head;
-	unsigned long i_size_index = inode->i_size >> PAGE_CACHE_SHIFT;
+	unsigned long i_size_index =
+		page_cache_index(inode->i_mapping, inode->i_size);
 	int new;
 	int logit = reiserfs_file_data_log(inode);
 	struct super_block *s = inode->i_sb;
-	int bh_per_page = PAGE_CACHE_SIZE / s->s_blocksize;
+	int bh_per_page = page_cache_size(inode->i_mapping) / s->s_blocksize;
 	struct reiserfs_transaction_handle th;
 	int ret = 0;
 
Index: linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/xattr.c	2007-06-18 19:04:34.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c	2007-06-18 21:53:15.000000000 -0700
@@ -493,13 +493,13 @@ reiserfs_xattr_set(struct inode *inode, 
 	while (buffer_pos < buffer_size || buffer_pos == 0) {
 		size_t chunk;
 		size_t skip = 0;
-		size_t page_offset = (file_pos & (PAGE_CACHE_SIZE - 1));
-		if (buffer_size - buffer_pos > PAGE_CACHE_SIZE)
-			chunk = PAGE_CACHE_SIZE;
+		size_t page_offset = page_cache_offset(mapping, file_pos);
+		if (buffer_size - buffer_pos > page_cache_size(mapping))
+			chunk = page_cache_size(mapping);
 		else
 			chunk = buffer_size - buffer_pos;
 
-		page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT);
+		page = reiserfs_get_page(xinode, page_cache_index(mapping, file_pos));
 		if (IS_ERR(page)) {
 			err = PTR_ERR(page);
 			goto out_filp;
@@ -511,8 +511,8 @@ reiserfs_xattr_set(struct inode *inode, 
 		if (file_pos == 0) {
 			struct reiserfs_xattr_header *rxh;
 			skip = file_pos = sizeof(struct reiserfs_xattr_header);
-			if (chunk + skip > PAGE_CACHE_SIZE)
-				chunk = PAGE_CACHE_SIZE - skip;
+			if (chunk + skip > page_cache_size(mapping))
+				chunk = page_cache_size(mapping) - skip;
 			rxh = (struct reiserfs_xattr_header *)data;
 			rxh->h_magic = cpu_to_le32(REISERFS_XATTR_MAGIC);
 			rxh->h_hash = cpu_to_le32(xahash);
@@ -603,12 +603,13 @@ reiserfs_xattr_get(const struct inode *i
 		size_t chunk;
 		char *data;
 		size_t skip = 0;
-		if (isize - file_pos > PAGE_CACHE_SIZE)
-			chunk = PAGE_CACHE_SIZE;
+		if (isize - file_pos > page_cache_size(xinode->i_mapping))
+			chunk = page_cache_size(xinode->i_mapping);
 		else
 			chunk = isize - file_pos;
 
-		page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT);
+		page = reiserfs_get_page(xinode,
+				page_cache_index(xinode->i_mapping, file_pos));
 		if (IS_ERR(page)) {
 			err = PTR_ERR(page);
 			goto out_dput;
Index: linux-2.6.22-rc4-mm2/fs/reiserfs/inode.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/inode.c	2007-06-18 19:04:34.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/inode.c	2007-06-18 19:20:38.000000000 -0700
@@ -337,7 +337,8 @@ static int _get_block_create_0(struct in
 		goto finished;
 	}
 	// read file tail into part of page
-	offset = (cpu_key_k_offset(&key) - 1) & (PAGE_CACHE_SIZE - 1);
+	offset = page_cache_offset(inode->i_mapping,
+				cpu_key_k_offset(&key) - 1);
 	fs_gen = get_generation(inode->i_sb);
 	copy_item_head(&tmp_ih, ih);
 
@@ -523,10 +524,10 @@ static int convert_tail_for_hole(struct 
 		return -EIO;
 
 	/* always try to read until the end of the block */
-	tail_start = tail_offset & (PAGE_CACHE_SIZE - 1);
+	tail_start = page_cache_offset(inode->i_mapping, tail_offset);
 	tail_end = (tail_start | (bh_result->b_size - 1)) + 1;
 
-	index = tail_offset >> PAGE_CACHE_SHIFT;
+	index = page_cache_index(inode->i_mapping, tail_offset);
 	/* hole_page can be zero in case of direct_io, we are sure
 	   that we cannot get here if we write with O_DIRECT into
 	   tail page */
@@ -2008,11 +2009,13 @@ static int grab_tail_page(struct inode *
 	/* we want the page with the last byte in the file,
 	 ** not the page that will hold the next byte for appending
 	 */
-	unsigned long index = (p_s_inode->i_size - 1) >> PAGE_CACHE_SHIFT;
+	unsigned long index = page_cache_index(p_s_inode->i_mapping,
+						p_s_inode->i_size - 1);
 	unsigned long pos = 0;
 	unsigned long start = 0;
 	unsigned long blocksize = p_s_inode->i_sb->s_blocksize;
-	unsigned long offset = (p_s_inode->i_size) & (PAGE_CACHE_SIZE - 1);
+	unsigned long offset = page_cache_offset(p_s_inode->i_mapping,
+							p_s_inode->i_size);
 	struct buffer_head *bh;
 	struct buffer_head *head;
 	struct page *page;
@@ -2084,7 +2087,8 @@ int reiserfs_truncate_file(struct inode 
 {
 	struct reiserfs_transaction_handle th;
 	/* we want the offset for the first byte after the end of the file */
-	unsigned long offset = p_s_inode->i_size & (PAGE_CACHE_SIZE - 1);
+	unsigned long offset = page_cache_offset(p_s_inode->i_mapping,
+							p_s_inode->i_size);
 	unsigned blocksize = p_s_inode->i_sb->s_blocksize;
 	unsigned length;
 	struct page *page = NULL;
@@ -2233,7 +2237,7 @@ static int map_block_for_writepage(struc
 	} else if (is_direct_le_ih(ih)) {
 		char *p;
 		p = page_address(bh_result->b_page);
-		p += (byte_offset - 1) & (PAGE_CACHE_SIZE - 1);
+		p += page_cache_offset(inode->i_mapping, byte_offset - 1);
 		copy_size = ih_item_len(ih) - pos_in_item;
 
 		fs_gen = get_generation(inode->i_sb);
@@ -2332,7 +2336,8 @@ static int reiserfs_write_full_page(stru
 				    struct writeback_control *wbc)
 {
 	struct inode *inode = page->mapping->host;
-	unsigned long end_index = inode->i_size >> PAGE_CACHE_SHIFT;
+	unsigned long end_index = page_cache_index(inode->i_mapping,
+							inode->i_size);
 	int error = 0;
 	unsigned long block;
 	sector_t last_block;
@@ -2342,7 +2347,7 @@ static int reiserfs_write_full_page(stru
 	int checked = PageChecked(page);
 	struct reiserfs_transaction_handle th;
 	struct super_block *s = inode->i_sb;
-	int bh_per_page = PAGE_CACHE_SIZE / s->s_blocksize;
+	int bh_per_page = page_cache_size(inode->i_mapping) / s->s_blocksize;
 	th.t_trans_id = 0;
 
 	/* no logging allowed when nonblocking or from PF_MEMALLOC */
@@ -2369,16 +2374,18 @@ static int reiserfs_write_full_page(stru
 	if (page->index >= end_index) {
 		unsigned last_offset;
 
-		last_offset = inode->i_size & (PAGE_CACHE_SIZE - 1);
+		last_offset = page_cache_offset(inode->i_mapping, inode->i_size);
 		/* no file contents in this page */
 		if (page->index >= end_index + 1 || !last_offset) {
 			unlock_page(page);
 			return 0;
 		}
-		zero_user_segment(page, last_offset, PAGE_CACHE_SIZE);
+		zero_user_segment(page, last_offset,
+				page_cache_size(inode->i_mapping));
 	}
 	bh = head;
-	block = page->index << (PAGE_CACHE_SHIFT - s->s_blocksize_bits);
+	block = page->index << (page_cache_shift(inode->i_mapping)
+						- s->s_blocksize_bits);
 	last_block = (i_size_read(inode) - 1) >> inode->i_blkbits;
 	/* first map all the buffers, logging any direct items we find */
 	do {
@@ -2570,7 +2577,7 @@ static int reiserfs_write_begin(struct f
 		*fsdata = (void *)(unsigned long)flags;
 	}
 
-	index = pos >> PAGE_CACHE_SHIFT;
+	index = page_cache_index(mapping, pos);
 	page = __grab_cache_page(mapping, index);
 	if (!page)
 		return -ENOMEM;
@@ -2694,7 +2701,7 @@ static int reiserfs_write_end(struct fil
 	else
 		th = NULL;
 
-	start = pos & (PAGE_CACHE_SIZE - 1);
+	start = page_cache_offset(mapping, pos);
 	if (unlikely(copied < len)) {
 		if (!PageUptodate(page))
 			copied = 0;
@@ -2774,7 +2781,7 @@ int reiserfs_commit_write(struct file *f
 			  unsigned from, unsigned to)
 {
 	struct inode *inode = page->mapping->host;
-	loff_t pos = ((loff_t) page->index << PAGE_CACHE_SHIFT) + to;
+	loff_t pos = page_cache_pos(inode->i_mapping, page->index, to);
 	int ret = 0;
 	int update_sd = 0;
 	struct reiserfs_transaction_handle *th = NULL;
Index: linux-2.6.22-rc4-mm2/fs/reiserfs/bitmap.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/bitmap.c	2007-06-18 19:04:34.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/bitmap.c	2007-06-18 19:04:38.000000000 -0700
@@ -1249,9 +1249,14 @@ int reiserfs_can_fit_pages(struct super_
 	int space;
 
 	spin_lock(&REISERFS_SB(sb)->bitmap_lock);
+
+	/*
+	 * Note the PAGE_SHIFT here. This means that the superblock
+	 * and metadata are restricted to page size.
+	 */
 	space =
 	    (SB_FREE_BLOCKS(sb) -
-	     REISERFS_SB(sb)->reserved_blocks) >> (PAGE_CACHE_SHIFT -
+	     REISERFS_SB(sb)->reserved_blocks) >> (PAGE_SHIFT -
 						   sb->s_blocksize_bits);
 	spin_unlock(&REISERFS_SB(sb)->bitmap_lock);
 

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [19/37] Use page_cache_xxx for fs/xfs
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (17 preceding siblings ...)
  2007-06-20 18:29 ` [18/37] Use page_cache_xxx in fs/reiserfs clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [20/37] Fix PAGE SIZE assumption in miscellaneous places clameter
                   ` (17 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_fs_xfs --]
[-- Type: text/plain, Size: 6938 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/xfs/linux-2.6/xfs_aops.c |   55 +++++++++++++++++++++++---------------------
 fs/xfs/linux-2.6/xfs_lrw.c  |    4 +--
 2 files changed, 31 insertions(+), 28 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/xfs/linux-2.6/xfs_aops.c	2007-06-18 19:05:21.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_aops.c	2007-06-18 19:07:15.000000000 -0700
@@ -74,7 +74,7 @@ xfs_page_trace(
 	xfs_inode_t	*ip;
 	bhv_vnode_t	*vp = vn_from_inode(inode);
 	loff_t		isize = i_size_read(inode);
-	loff_t		offset = page_offset(page);
+	loff_t		offset = page_cache_pos(page->mapping, page->index, 0);
 	int		delalloc = -1, unmapped = -1, unwritten = -1;
 
 	if (page_has_buffers(page))
@@ -610,7 +610,7 @@ xfs_probe_page(
 					break;
 			} while ((bh = bh->b_this_page) != head);
 		} else
-			ret = mapped ? 0 : PAGE_CACHE_SIZE;
+			ret = mapped ? 0 : page_cache_size(page->mapping);
 	}
 
 	return ret;
@@ -637,7 +637,7 @@ xfs_probe_cluster(
 	} while ((bh = bh->b_this_page) != head);
 
 	/* if we reached the end of the page, sum forwards in following pages */
-	tlast = i_size_read(inode) >> PAGE_CACHE_SHIFT;
+	tlast = page_cache_index(inode->i_mapping, i_size_read(inode));
 	tindex = startpage->index + 1;
 
 	/* Prune this back to avoid pathological behavior */
@@ -655,14 +655,14 @@ xfs_probe_cluster(
 			size_t pg_offset, len = 0;
 
 			if (tindex == tlast) {
-				pg_offset =
-				    i_size_read(inode) & (PAGE_CACHE_SIZE - 1);
+				pg_offset = page_cache_offset(inode->i_mapping,
+							i_size_read(inode));
 				if (!pg_offset) {
 					done = 1;
 					break;
 				}
 			} else
-				pg_offset = PAGE_CACHE_SIZE;
+				pg_offset = page_cache_size(inode->i_mapping);
 
 			if (page->index == tindex && !TestSetPageLocked(page)) {
 				len = xfs_probe_page(page, pg_offset, mapped);
@@ -744,7 +744,8 @@ xfs_convert_page(
 	int			bbits = inode->i_blkbits;
 	int			len, page_dirty;
 	int			count = 0, done = 0, uptodate = 1;
- 	xfs_off_t		offset = page_offset(page);
+	struct address_space	*map = inode->i_mapping;
+	xfs_off_t		offset = page_cache_pos(map, page->index, 0);
 
 	if (page->index != tindex)
 		goto fail;
@@ -752,7 +753,7 @@ xfs_convert_page(
 		goto fail;
 	if (PageWriteback(page))
 		goto fail_unlock_page;
-	if (page->mapping != inode->i_mapping)
+	if (page->mapping != map)
 		goto fail_unlock_page;
 	if (!xfs_is_delayed_page(page, (*ioendp)->io_type))
 		goto fail_unlock_page;
@@ -764,20 +765,20 @@ xfs_convert_page(
 	 * Derivation:
 	 *
 	 * End offset is the highest offset that this page should represent.
-	 * If we are on the last page, (end_offset & (PAGE_CACHE_SIZE - 1))
-	 * will evaluate non-zero and be less than PAGE_CACHE_SIZE and
+	 * If we are on the last page, (page_cache_offset(map, end_offset))
+	 * will evaluate non-zero and be less than page_cache_size(map) and
 	 * hence give us the correct page_dirty count. On any other page,
 	 * it will be zero and in that case we need page_dirty to be the
 	 * count of buffers on the page.
 	 */
 	end_offset = min_t(unsigned long long,
-			(xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT,
+			(xfs_off_t)(page->index + 1) << page_cache_shift(map),
 			i_size_read(inode));
 
 	len = 1 << inode->i_blkbits;
-	p_offset = min_t(unsigned long, end_offset & (PAGE_CACHE_SIZE - 1),
-					PAGE_CACHE_SIZE);
-	p_offset = p_offset ? roundup(p_offset, len) : PAGE_CACHE_SIZE;
+	p_offset = min_t(unsigned long, page_cache_offset(map, end_offset),
+					page_cache_size(map));
+	p_offset = p_offset ? roundup(p_offset, len) : page_cache_size(map);
 	page_dirty = p_offset / len;
 
 	bh = head = page_buffers(page);
@@ -933,6 +934,8 @@ xfs_page_state_convert(
 	int			page_dirty, count = 0;
 	int			trylock = 0;
 	int			all_bh = unmapped;
+	struct address_space	*map = inode->i_mapping;
+	int			pagesize = page_cache_size(map);
 
 	if (startio) {
 		if (wbc->sync_mode == WB_SYNC_NONE && wbc->nonblocking)
@@ -941,11 +944,11 @@ xfs_page_state_convert(
 
 	/* Is this page beyond the end of the file? */
 	offset = i_size_read(inode);
-	end_index = offset >> PAGE_CACHE_SHIFT;
-	last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
+	end_index = page_cache_index(map, offset);
+	last_index = page_cache_index(map, (offset - 1));
 	if (page->index >= end_index) {
 		if ((page->index >= end_index + 1) ||
-		    !(i_size_read(inode) & (PAGE_CACHE_SIZE - 1))) {
+		    !(page_cache_offset(map, i_size_read(inode)))) {
 			if (startio)
 				unlock_page(page);
 			return 0;
@@ -959,22 +962,22 @@ xfs_page_state_convert(
 	 * Derivation:
 	 *
 	 * End offset is the highest offset that this page should represent.
-	 * If we are on the last page, (end_offset & (PAGE_CACHE_SIZE - 1))
-	 * will evaluate non-zero and be less than PAGE_CACHE_SIZE and
-	 * hence give us the correct page_dirty count. On any other page,
+	 * If we are on the last page, (page_cache_offset(mapping, end_offset))
+	 * will evaluate non-zero and be less than page_cache_size(mapping)
+	 * and hence give us the correct page_dirty count. On any other page,
 	 * it will be zero and in that case we need page_dirty to be the
 	 * count of buffers on the page.
  	 */
 	end_offset = min_t(unsigned long long,
-			(xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT, offset);
+			(xfs_off_t)page_cache_pos(map, page->index + 1, 0), offset);
 	len = 1 << inode->i_blkbits;
-	p_offset = min_t(unsigned long, end_offset & (PAGE_CACHE_SIZE - 1),
-					PAGE_CACHE_SIZE);
-	p_offset = p_offset ? roundup(p_offset, len) : PAGE_CACHE_SIZE;
+	p_offset = min_t(unsigned long, page_cache_offset(map, end_offset),
+					pagesize);
+	p_offset = p_offset ? roundup(p_offset, len) : pagesize;
 	page_dirty = p_offset / len;
 
 	bh = head = page_buffers(page);
-	offset = page_offset(page);
+	offset = page_cache_pos(map, page->index, 0);
 	flags = BMAPI_READ;
 	type = IOMAP_NEW;
 
@@ -1111,7 +1114,7 @@ xfs_page_state_convert(
 
 	if (ioend && iomap_valid) {
 		offset = (iomap.iomap_offset + iomap.iomap_bsize - 1) >>
-					PAGE_CACHE_SHIFT;
+					page_cache_shift(map);
 		tlast = min_t(pgoff_t, offset, last_index);
 		xfs_cluster_write(inode, page->index + 1, &iomap, &ioend,
 					wbc, startio, all_bh, tlast);
Index: linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_lrw.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/xfs/linux-2.6/xfs_lrw.c	2007-06-18 19:05:21.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_lrw.c	2007-06-18 19:07:15.000000000 -0700
@@ -143,8 +143,8 @@ xfs_iozero(
 		unsigned offset, bytes;
 		void *fsdata;
 
-		offset = (pos & (PAGE_CACHE_SIZE -1)); /* Within page */
-		bytes = PAGE_CACHE_SIZE - offset;
+		offset = page_cache_offset(mapping, pos); /* Within page */
+		bytes = page_cache_size(mapping) - offset;
 		if (bytes > count)
 			bytes = count;
 

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [20/37] Fix PAGE SIZE assumption in miscellaneous places.
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (18 preceding siblings ...)
  2007-06-20 18:29 ` [19/37] Use page_cache_xxx for fs/xfs clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [21/37] Use page_cache_xxx in drivers/block/loop.c clameter
                   ` (16 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_kernel --]
[-- Type: text/plain, Size: 1283 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 kernel/container.c |    4 ++--
 kernel/futex.c     |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

Index: vps/kernel/futex.c
===================================================================
--- vps.orig/kernel/futex.c	2007-06-14 21:52:20.000000000 -0700
+++ vps/kernel/futex.c	2007-06-14 21:53:43.000000000 -0700
@@ -255,7 +255,7 @@ int get_futex_key(u32 __user *uaddr, str
 	err = get_user_pages(current, mm, address, 1, 0, 0, &page, NULL);
 	if (err >= 0) {
 		key->shared.pgoff =
-			page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+			page->index << compound_order(page);
 		put_page(page);
 		return 0;
 	}
Index: vps/kernel/container.c
===================================================================
--- vps.orig/kernel/container.c	2007-06-14 21:53:57.000000000 -0700
+++ vps/kernel/container.c	2007-06-14 21:54:13.000000000 -0700
@@ -840,8 +840,8 @@ static int container_fill_super(struct s
 	struct dentry *root;
 	struct containerfs_root *hroot = options;
 
-	sb->s_blocksize = PAGE_CACHE_SIZE;
-	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
+	sb->s_blocksize = PAGE_SIZE;
+	sb->s_blocksize_bits = PAGE_SHIFT;
 	sb->s_magic = CONTAINER_SUPER_MAGIC;
 	sb->s_op = &container_ops;
 

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [21/37] Use page_cache_xxx in drivers/block/loop.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (19 preceding siblings ...)
  2007-06-20 18:29 ` [20/37] Fix PAGE SIZE assumption in miscellaneous places clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [22/37] Use page_cache_xxx in drivers/block/rd.c clameter
                   ` (15 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_drivers_block_loop --]
[-- Type: text/plain, Size: 1500 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 drivers/block/loop.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Index: linux-2.6.22-rc4-mm2/drivers/block/loop.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/drivers/block/loop.c	2007-06-19 20:50:30.000000000 -0700
+++ linux-2.6.22-rc4-mm2/drivers/block/loop.c	2007-06-20 00:34:46.000000000 -0700
@@ -215,8 +215,8 @@ static int do_lo_send_aops(struct loop_d
 	int len, ret;
 
 	mutex_lock(&mapping->host->i_mutex);
-	index = pos >> PAGE_CACHE_SHIFT;
-	offset = pos & ((pgoff_t)PAGE_CACHE_SIZE - 1);
+	index = page_cache_index(mapping, pos);
+	offset = page_cache_offset(mapping, pos);
 	bv_offs = bvec->bv_offset;
 	len = bvec->bv_len;
 	while (len > 0) {
@@ -226,8 +226,9 @@ static int do_lo_send_aops(struct loop_d
 		struct page *page;
 		void *fsdata;
 
-		IV = ((sector_t)index << (PAGE_CACHE_SHIFT - 9))+(offset >> 9);
-		size = PAGE_CACHE_SIZE - offset;
+		IV = ((sector_t)index << (page_cache_shift(mapping) - 9))
+							+ (offset >> 9);
+		size = page_cache_size(mapping) - offset;
 		if (size > len)
 			size = len;
 
@@ -393,7 +394,9 @@ lo_read_actor(read_descriptor_t *desc, s
 	struct loop_device *lo = p->lo;
 	sector_t IV;
 
-	IV = ((sector_t) page->index << (PAGE_CACHE_SHIFT - 9))+(offset >> 9);
+	IV = ((sector_t) page->index <<
+			(page_cache_shift(page_mapping(page)) - 9))
+			+ (offset >> 9);
 
 	if (size > count)
 		size = count;

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [22/37] Use page_cache_xxx in drivers/block/rd.c
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (20 preceding siblings ...)
  2007-06-20 18:29 ` [21/37] Use page_cache_xxx in drivers/block/loop.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [23/37] compound pages: PageHead/PageTail instead of PageCompound clameter
                   ` (14 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_drivers_block_rd --]
[-- Type: text/plain, Size: 1396 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 drivers/block/rd.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: vps/drivers/block/rd.c
===================================================================
--- vps.orig/drivers/block/rd.c	2007-06-14 21:49:09.000000000 -0700
+++ vps/drivers/block/rd.c	2007-06-14 21:50:28.000000000 -0700
@@ -121,7 +121,7 @@ static void make_page_uptodate(struct pa
 			}
 		} while ((bh = bh->b_this_page) != head);
 	} else {
-		memset(page_address(page), 0, PAGE_CACHE_SIZE);
+		memset(page_address(page), 0, page_cache_size(page_mapping(page)));
 	}
 	flush_dcache_page(page);
 	SetPageUptodate(page);
@@ -201,9 +201,9 @@ static const struct address_space_operat
 static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector,
 				struct address_space *mapping)
 {
-	pgoff_t index = sector >> (PAGE_CACHE_SHIFT - 9);
+	pgoff_t index = sector >> (page_cache_shift(mapping) - 9);
 	unsigned int vec_offset = vec->bv_offset;
-	int offset = (sector << 9) & ~PAGE_CACHE_MASK;
+	int offset = page_cache_offset(mapping, (sector << 9));
 	int size = vec->bv_len;
 	int err = 0;
 
@@ -213,7 +213,7 @@ static int rd_blkdev_pagecache_IO(int rw
 		char *src;
 		char *dst;
 
-		count = PAGE_CACHE_SIZE - offset;
+		count = page_cache_size(mapping) - offset;
 		if (count > size)
 			count = size;
 		size -= count;

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [23/37] compound pages: PageHead/PageTail instead of PageCompound
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (21 preceding siblings ...)
  2007-06-20 18:29 ` [22/37] Use page_cache_xxx in drivers/block/rd.c clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [24/37] compound pages: Add new support functions clameter
                   ` (13 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: compound_headtail --]
[-- Type: text/plain, Size: 6516 bytes --]

This patch enhances the handling of compound pages in the VM. It may also
be important for the antifrag patches, which need to manage a set of
higher order free pages, and for other uses of compound pages.

For now it simplifies accounting for SLUB pages but the groundwork here is
important for the large block size patches and for allowing page migration
of larger pages. With this framework we may be able to get to a point where
compound pages keep their flags while they are free and Mel may avoid having
special functions for determining the page order of higher order freed pages.
If we can avoid the setup and teardown of higher order pages then allocation
and release of compound pages will be faster.

Looking at the handling of compound pages we see that the mere fact that a
page is part of a higher order page is not that interesting. What matters
is the distinction between head pages and tail pages. Head pages usually
need special handling to accommodate the larger size, while encountering a
tail page is usually an error, or else it must be treated like a PAGE_SIZE
page. So a compound flag in the page flags is not what we
need. Instead we introduce a flag for the head page and another for the tail
page. The PageCompound test is preserved for backward compatibility and
will test if either PageTail or PageHead has been set.
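
To make the new semantics concrete, here is a minimal sketch (illustrative
only; it assumes an order-2 allocation with __GFP_COMP so that the compound
page setup code marks the head and tail pages):

	static void compound_flags_demo(void)
	{
		struct page *head = alloc_pages(GFP_KERNEL | __GFP_COMP, 2);
		struct page *tail;

		if (!head)
			return;
		tail = head + 1;		/* any of the 3 tail pages */

		VM_BUG_ON(!PageHead(head));	/* PG_head on the head page */
		VM_BUG_ON(!PageTail(tail));	/* PG_tail on each tail page */
		VM_BUG_ON(PageTail(head));	/* the two flags are disjoint */
		VM_BUG_ON(!PageCompound(head));	/* PageHead || PageTail */
		VM_BUG_ON(!PageCompound(tail));

		__free_pages(head, 2);
	}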

After this patchset the uses of PageCompound() will be reduced significantly
in the core VM. The I/O layer will still use PageCompound() for direct I/O.
However, if we at some point convert direct I/O to also support compound
pages as a single unit then PageCompound() there may become unnecessary, as
may the leftover check in mm/swap.c. We may end up mostly with checks
for PageTail and PageHead.

This patch:

Use two separate page flags for the head and tail of compound pages.
PageHead() and PageTail() become more efficient.

PageCompound then becomes a check for PageTail || PageHead. Over time
it is expected that PageCompound will mostly go away since the head page
processing will be different from tail page processing in most situations.

We can remove the compound page check from set_page_refcounted since
PG_reclaim is no longer overloaded.

Also the check in _free_one_page can only be for PageHead. We cannot
free a tail page.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/page-flags.h |   43 ++++++++++++-------------------------------
 mm/internal.h              |    2 +-
 mm/page_alloc.c            |    2 +-
 3 files changed, 14 insertions(+), 33 deletions(-)

Index: linux-2.6.22-rc4-mm2/include/linux/page-flags.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/include/linux/page-flags.h	2007-06-15 17:35:33.000000000 -0700
+++ linux-2.6.22-rc4-mm2/include/linux/page-flags.h	2007-06-18 19:13:03.000000000 -0700
@@ -83,7 +83,6 @@
 #define PG_private		11	/* If pagecache, has fs-private data */
 
 #define PG_writeback		12	/* Page is under writeback */
-#define PG_compound		14	/* Part of a compound page */
 #define PG_swapcache		15	/* Swap page: swp_entry_t in private */
 
 #define PG_mappedtodisk		16	/* Has blocks allocated on-disk */
@@ -91,6 +90,9 @@
 #define PG_buddy		19	/* Page is free, on buddy lists */
 #define PG_booked		20	/* Has blocks reserved on-disk */
 
+#define PG_head			21	/* Page is head of a compound page */
+#define PG_tail			22	/* Page is tail of a compound page */
+
 /* PG_readahead is only used for file reads; PG_reclaim is only for writes */
 #define PG_readahead		PG_reclaim /* Reminder to do async read-ahead */
 
@@ -221,37 +223,16 @@ static inline void SetPageUptodate(struc
 #define ClearPageReclaim(page)	clear_bit(PG_reclaim, &(page)->flags)
 #define TestClearPageReclaim(page) test_and_clear_bit(PG_reclaim, &(page)->flags)
 
-#define PageCompound(page)	test_bit(PG_compound, &(page)->flags)
-#define __SetPageCompound(page)	__set_bit(PG_compound, &(page)->flags)
-#define __ClearPageCompound(page) __clear_bit(PG_compound, &(page)->flags)
-
-/*
- * PG_reclaim is used in combination with PG_compound to mark the
- * head and tail of a compound page
- *
- * PG_compound & PG_reclaim	=> Tail page
- * PG_compound & ~PG_reclaim	=> Head page
- */
-
-#define PG_head_tail_mask ((1L << PG_compound) | (1L << PG_reclaim))
-
-#define PageTail(page)	((page->flags & PG_head_tail_mask) \
-				== PG_head_tail_mask)
-
-static inline void __SetPageTail(struct page *page)
-{
-	page->flags |= PG_head_tail_mask;
-}
-
-static inline void __ClearPageTail(struct page *page)
-{
-	page->flags &= ~PG_head_tail_mask;
-}
+#define PageHead(page)		test_bit(PG_head, &(page)->flags)
+#define __SetPageHead(page)	__set_bit(PG_head, &(page)->flags)
+#define __ClearPageHead(page)	__clear_bit(PG_head, &(page)->flags)
+
+#define PageTail(page)		test_bit(PG_tail, &(page)->flags)
+#define __SetPageTail(page)	__set_bit(PG_tail, &(page)->flags)
+#define __ClearPageTail(page)	__clear_bit(PG_tail, &(page)->flags)
 
-#define PageHead(page)	((page->flags & PG_head_tail_mask) \
-				== (1L << PG_compound))
-#define __SetPageHead(page)	__SetPageCompound(page)
-#define __ClearPageHead(page)	__ClearPageCompound(page)
+#define PageCompound(page)	((page)->flags & \
+				((1L << PG_head) | (1L << PG_tail)))
 
 #ifdef CONFIG_SWAP
 #define PageSwapCache(page)	test_bit(PG_swapcache, &(page)->flags)
Index: linux-2.6.22-rc4-mm2/mm/internal.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/internal.h	2007-06-15 17:35:33.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/internal.h	2007-06-18 19:13:03.000000000 -0700
@@ -24,7 +24,7 @@ static inline void set_page_count(struct
  */
 static inline void set_page_refcounted(struct page *page)
 {
-	VM_BUG_ON(PageCompound(page) && PageTail(page));
+	VM_BUG_ON(PageTail(page));
 	VM_BUG_ON(atomic_read(&page->_count));
 	set_page_count(page, 1);
 }
Index: linux-2.6.22-rc4-mm2/mm/page_alloc.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/page_alloc.c	2007-06-18 18:42:45.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/page_alloc.c	2007-06-18 19:13:03.000000000 -0700
@@ -428,7 +428,7 @@ static inline void __free_one_page(struc
 	int order_size = 1 << order;
 	int migratetype = get_pageblock_migratetype(page);
 
-	if (unlikely(PageCompound(page)))
+	if (unlikely(PageHead(page)))
 		destroy_compound_page(page, order);
 
 	page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [24/37] compound pages: Add new support functions
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (22 preceding siblings ...)
  2007-06-20 18:29 ` [23/37] compound pages: PageHead/PageTail instead of PageCompound clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [25/37] compound pages: vmstat support clameter
                   ` (12 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: compound_functions --]
[-- Type: text/plain, Size: 1115 bytes --]

compound_pages(page)	-> Determine the number of base pages of a compound page

compound_shift(page)	-> Determine the page shift of a compound page

compound_size(page)	-> Determine the size of a compound page
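
As a quick illustration of how the helpers relate (a hypothetical debug
function, not part of the patch): for the head page of an order-2 compound
page on a 4k-page machine this prints order=2 pages=4 shift=14 size=16384,
and order=0 pages=1 shift=12 size=4096 for an ordinary page.

	static void show_compound_geometry(struct page *page)
	{
		printk(KERN_DEBUG "order=%d pages=%d shift=%d size=%d\n",
			(int)compound_order(page), compound_pages(page),
			compound_shift(page), compound_size(page));
	}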

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/mm.h |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

Index: vps/include/linux/mm.h
===================================================================
--- vps.orig/include/linux/mm.h	2007-06-11 15:56:37.000000000 -0700
+++ vps/include/linux/mm.h	2007-06-12 19:06:28.000000000 -0700
@@ -365,6 +365,21 @@ static inline void set_compound_order(st
 	page[1].lru.prev = (void *)order;
 }
 
+static inline int compound_pages(struct page *page)
+{
+	return 1 << compound_order(page);
+}
+
+static inline int compound_shift(struct page *page)
+{
+	return PAGE_SHIFT + compound_order(page);
+}
+
+static inline int compound_size(struct page *page)
+{
+	return PAGE_SIZE << compound_order(page);
+}
+
 /*
  * Multiple processes may "see" the same page. E.g. for untouched
  * mappings of /dev/null, all processes see the same page full of

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [25/37] compound pages: vmstat support
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (23 preceding siblings ...)
  2007-06-20 18:29 ` [24/37] compound pages: Add new support functions clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [26/37] compound pages: Use new compound vmstat functions in SLUB clameter
                   ` (11 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: compound_vmstat --]
[-- Type: text/plain, Size: 2610 bytes --]

Add support for compound pages so that

inc_xxx and dec_xxx

will adjust the ZVCs by the number of base pages in the compound page.
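
For instance (illustrative; assume page is the head of an order-3 compound
page and the usual locking rules for the __xxx variants are observed):

	__inc_zone_page_state(page, NR_FILE_PAGES);
	/* the ZVC went up by compound_pages(page) == 8, not by 1 */
	__dec_zone_page_state(page, NR_FILE_PAGES);
	/* and back down by 8 again */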

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/vmstat.h |    5 ++---
 mm/vmstat.c            |   18 +++++++++++++-----
 2 files changed, 15 insertions(+), 8 deletions(-)

Index: vps/include/linux/vmstat.h
===================================================================
--- vps.orig/include/linux/vmstat.h	2007-06-11 15:56:37.000000000 -0700
+++ vps/include/linux/vmstat.h	2007-06-12 19:06:32.000000000 -0700
@@ -234,7 +234,7 @@ static inline void __inc_zone_state(stru
 static inline void __inc_zone_page_state(struct page *page,
 			enum zone_stat_item item)
 {
-	__inc_zone_state(page_zone(page), item);
+	__mod_zone_page_state(page_zone(page), item, compound_pages(page));
 }
 
 static inline void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
@@ -246,8 +246,7 @@ static inline void __dec_zone_state(stru
 static inline void __dec_zone_page_state(struct page *page,
 			enum zone_stat_item item)
 {
-	atomic_long_dec(&page_zone(page)->vm_stat[item]);
-	atomic_long_dec(&vm_stat[item]);
+	__mod_zone_page_state(page_zone(page), item, -compound_pages(page));
 }
 
 /*
Index: vps/mm/vmstat.c
===================================================================
--- vps.orig/mm/vmstat.c	2007-06-11 15:56:37.000000000 -0700
+++ vps/mm/vmstat.c	2007-06-12 19:06:32.000000000 -0700
@@ -225,7 +225,12 @@ void __inc_zone_state(struct zone *zone,
 
 void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
 {
-	__inc_zone_state(page_zone(page), item);
+	struct zone *z = page_zone(page);
+
+	if (likely(!PageHead(page)))
+		__inc_zone_state(z, item);
+	else
+		__mod_zone_page_state(z, item, compound_pages(page));
 }
 EXPORT_SYMBOL(__inc_zone_page_state);
 
@@ -246,7 +251,12 @@ void __dec_zone_state(struct zone *zone,
 
 void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
 {
-	__dec_zone_state(page_zone(page), item);
+	struct zone *z = page_zone(page);
+
+	if (likely(!PageHead(page)))
+		__dec_zone_state(z, item);
+	else
+		__mod_zone_page_state(z, item, -compound_pages(page));
 }
 EXPORT_SYMBOL(__dec_zone_page_state);
 
@@ -262,11 +272,9 @@ void inc_zone_state(struct zone *zone, e
 void inc_zone_page_state(struct page *page, enum zone_stat_item item)
 {
 	unsigned long flags;
-	struct zone *zone;
 
-	zone = page_zone(page);
 	local_irq_save(flags);
-	__inc_zone_state(zone, item);
+	__inc_zone_page_state(page, item);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL(inc_zone_page_state);

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [26/37] compound pages: Use new compound vmstat functions in SLUB
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (24 preceding siblings ...)
  2007-06-20 18:29 ` [25/37] compound pages: vmstat support clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [27/37] compound pages: Allow use of get_page_unless_zero with compound pages clameter
                   ` (10 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: compound_vmstat_slub --]
[-- Type: text/plain, Size: 1627 bytes --]

Use the new dec/inc functions to simplify SLUB's accounting
of pages.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/slub.c |   13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

Index: linux-2.6.22-rc4-mm2/mm/slub.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/slub.c	2007-06-18 18:42:45.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/slub.c	2007-06-18 19:13:26.000000000 -0700
@@ -1052,7 +1052,6 @@ static inline void kmem_cache_open_debug
 static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page * page;
-	int pages = 1 << s->order;
 
 	if (s->order)
 		flags |= __GFP_COMP;
@@ -1071,10 +1070,9 @@ static struct page *allocate_slab(struct
 	if (!page)
 		return NULL;
 
-	mod_zone_page_state(page_zone(page),
+	inc_zone_page_state(page,
 		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
-		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
-		pages);
+		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE);
 
 	return page;
 }
@@ -1149,8 +1147,6 @@ static struct page *new_slab(struct kmem
 
 static void __free_slab(struct kmem_cache *s, struct page *page)
 {
-	int pages = 1 << s->order;
-
 	if (unlikely(SlabDebug(page))) {
 		void *p;
 
@@ -1159,10 +1155,9 @@ static void __free_slab(struct kmem_cach
 			check_object(s, page, p, 0);
 	}
 
-	mod_zone_page_state(page_zone(page),
+	dec_zone_page_state(page,
 		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
-		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
-		- pages);
+		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE);
 
 	page->mapping = NULL;
 	__free_pages(page, s->order);

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [27/37] compound pages: Allow use of get_page_unless_zero with compound pages
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (25 preceding siblings ...)
  2007-06-20 18:29 ` [26/37] compound pages: Use new compound vmstat functions in SLUB clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [28/37] compound pages: Allow freeing of compound pages via pagevec clameter
                   ` (9 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: compound_get_one_unless --]
[-- Type: text/plain, Size: 925 bytes --]

This will be needed by targeted slab reclaim in order to ensure that a
compound page allocated by SLUB will not go away under us.

It also may be needed if Mel starts to implement defragmentation. The
moving of compound pages may require the establishment of a reference
before the use of page migration functions.
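
A sketch of the intended use (hypothetical caller; the page passed in must
be a head page or an order-0 page, never a tail page):

	static int try_pin_compound(struct page *head)
	{
		if (!get_page_unless_zero(head))
			return 0;	/* already being freed, leave it */

		/*
		 * The compound page cannot go away under us now. It is
		 * safe to inspect it or hand it to page migration.
		 */
		put_page(head);
		return 1;
	}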

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/mm.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: vps/include/linux/mm.h
===================================================================
--- vps.orig/include/linux/mm.h	2007-06-12 19:06:28.000000000 -0700
+++ vps/include/linux/mm.h	2007-06-12 19:06:41.000000000 -0700
@@ -292,7 +292,7 @@ static inline int put_page_testzero(stru
  */
 static inline int get_page_unless_zero(struct page *page)
 {
-	VM_BUG_ON(PageCompound(page));
+	VM_BUG_ON(PageTail(page));
 	return atomic_inc_not_zero(&page->_count);
 }
 

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [28/37] compound pages: Allow freeing of compound pages via pagevec
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (26 preceding siblings ...)
  2007-06-20 18:29 ` [27/37] compound pages: Allow use of get_page_unless_zero with compound pages clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [29/37] Large blocksize support: Fix up reclaim counters clameter
                   ` (8 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: compound_free_via_pagevec --]
[-- Type: text/plain, Size: 2508 bytes --]

Allow the freeing of compound pages via pagevec.

In release_pages() we currently special case compound pages in order to
make sure we always decrement the page count of the head page and not the
tail page. However, that redirection to the head page is only necessary
for tail pages. So use PageTail instead of PageCompound. There is
therefore no change in the handling of tail pages.

The head page of a compound page now represents the whole large page as a
single unit. We do the usual processing, including checking whether it is
on the LRU and removing it (not useful right now, but this will work later
once compound pages are on the LRU). Then we add the compound page to the
pagevec. Only head pages end up on the pagevec, never tail pages.

In __pagevec_free() we then check if we are freeing a head page and if
so call the destructor for the compound page.
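
For callers nothing changes; a batch may now mix order-0 pages and compound
head pages. A sketch, assuming tail pages are never added to the batch:

	static void free_page_batch(struct page **pages, int nr)
	{
		struct pagevec pvec;
		int i;

		pagevec_init(&pvec, 0);
		for (i = 0; i < nr; i++) {
			if (!pagevec_add(&pvec, pages[i])) {
				/* pagevec full: heads go through their
				 * destructor, order-0 pages are freed
				 * hot/cold as before */
				__pagevec_free(&pvec);
				pagevec_reinit(&pvec);
			}
		}
		pagevec_free(&pvec);	/* free the remainder */
	}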

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/page_alloc.c |   13 +++++++++++--
 mm/swap.c       |    8 +++++++-
 2 files changed, 18 insertions(+), 3 deletions(-)

Index: linux-2.6.22-rc4-mm2/mm/page_alloc.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/page_alloc.c	2007-06-18 19:13:03.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/page_alloc.c	2007-06-18 19:14:03.000000000 -0700
@@ -1746,8 +1746,17 @@ void __pagevec_free(struct pagevec *pvec
 {
 	int i = pagevec_count(pvec);
 
-	while (--i >= 0)
-		free_hot_cold_page(pvec->pages[i], pvec->cold);
+	while (--i >= 0) {
+		struct page *page = pvec->pages[i];
+
+		if (PageHead(page)) {
+			compound_page_dtor *dtor;
+
+			dtor = get_compound_page_dtor(page);
+			(*dtor)(page);
+		} else
+			free_hot_cold_page(page, pvec->cold);
+	}
 }
 
 fastcall void __free_pages(struct page *page, unsigned int order)
Index: linux-2.6.22-rc4-mm2/mm/swap.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/swap.c	2007-06-15 17:35:33.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/swap.c	2007-06-18 19:14:03.000000000 -0700
@@ -293,7 +293,13 @@ void release_pages(struct page **pages, 
 	for (i = 0; i < nr; i++) {
 		struct page *page = pages[i];
 
-		if (unlikely(PageCompound(page))) {
+		/*
+		 * If we encounter a tail page then we need to
+		 * decrement the page count of its head page. There
+		 * is no LRU handling to do since tail pages are
+		 * never on the LRU.
+		 */
+		if (unlikely(PageTail(page))) {
 			if (zone) {
 				spin_unlock_irq(&zone->lru_lock);
 				zone = NULL;

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [29/37] Large blocksize support: Fix up reclaim counters
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (27 preceding siblings ...)
  2007-06-20 18:29 ` [28/37] compound pages: Allow freeing of compound pages via pagevec clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [30/37] Add VM_BUG_ONs to check for correct page order clameter
                   ` (7 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_higher_order_reclaim --]
[-- Type: text/plain, Size: 5324 bytes --]

We now have to reclaim compound pages of arbitrary order.

Adjust the counting in vmscan.c to count the number of base
pages.

Also change the active and inactive accounting to do the same.
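
All reclaim bookkeeping is thus in units of base pages. The convention as
a sketch (hypothetical helper, not part of the patch):

	/* How many base pages does an LRU list hold when the entries
	 * may be compound head pages? */
	static unsigned long nr_base_pages(struct list_head *lru_list)
	{
		struct page *page;
		unsigned long nr = 0;

		list_for_each_entry(page, lru_list, lru)
			nr += compound_pages(page);	/* 1 if order 0 */
		return nr;
	}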

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/mm_inline.h |   41 +++++++++++++++++++++++++++++++----------
 mm/vmscan.c               |   22 ++++++++++++----------
 2 files changed, 43 insertions(+), 20 deletions(-)

Index: linux-2.6.22-rc4-mm2/mm/vmscan.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/vmscan.c	2007-06-19 23:27:02.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/vmscan.c	2007-06-19 23:27:29.000000000 -0700
@@ -474,14 +474,14 @@ static unsigned long shrink_page_list(st
 
 		VM_BUG_ON(PageActive(page));
 
-		sc->nr_scanned++;
+		sc->nr_scanned += compound_pages(page);
 
 		if (!sc->may_swap && page_mapped(page))
 			goto keep_locked;
 
 		/* Double the slab pressure for mapped and swapcache pages */
 		if (page_mapped(page) || PageSwapCache(page))
-			sc->nr_scanned++;
+			sc->nr_scanned += compound_pages(page);
 
 		if (PageWriteback(page))
 			goto keep_locked;
@@ -585,7 +585,7 @@ static unsigned long shrink_page_list(st
 
 free_it:
 		unlock_page(page);
-		nr_reclaimed++;
+		nr_reclaimed += compound_pages(page);
 		if (!pagevec_add(&freed_pvec, page))
 			__pagevec_release_nonlru(&freed_pvec);
 		continue;
@@ -677,22 +677,23 @@ static unsigned long isolate_lru_pages(u
 	unsigned long nr_taken = 0;
 	unsigned long scan;
 
-	for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
+	for (scan = 0; scan < nr_to_scan && !list_empty(src); ) {
 		struct page *page;
 		unsigned long pfn;
 		unsigned long end_pfn;
 		unsigned long page_pfn;
+		int pages;
 		int zone_id;
 
 		page = lru_to_page(src);
 		prefetchw_prev_lru_page(page, src, flags);
-
+		pages = compound_pages(page);
 		VM_BUG_ON(!PageLRU(page));
 
 		switch (__isolate_lru_page(page, mode)) {
 		case 0:
 			list_move(&page->lru, dst);
-			nr_taken++;
+			nr_taken += pages;
 			break;
 
 		case -EBUSY:
@@ -738,8 +739,8 @@ static unsigned long isolate_lru_pages(u
 			switch (__isolate_lru_page(cursor_page, mode)) {
 			case 0:
 				list_move(&cursor_page->lru, dst);
-				nr_taken++;
-				scan++;
+				nr_taken += compound_pages(cursor_page);
+				scan += compound_pages(cursor_page);
 				break;
 
 			case -EBUSY:
@@ -749,6 +750,7 @@ static unsigned long isolate_lru_pages(u
 				break;
 			}
 		}
+		scan += pages;
 	}
 
 	*scanned = scan;
@@ -985,7 +987,7 @@ force_reclaim_mapped:
 		ClearPageActive(page);
 
 		list_move(&page->lru, &zone->inactive_list);
-		pgmoved++;
+		pgmoved += compound_pages(page);
 		if (!pagevec_add(&pvec, page)) {
 			__mod_zone_page_state(zone, NR_INACTIVE, pgmoved);
 			spin_unlock_irq(&zone->lru_lock);
@@ -1013,7 +1015,7 @@ force_reclaim_mapped:
 		SetPageLRU(page);
 		VM_BUG_ON(!PageActive(page));
 		list_move(&page->lru, &zone->active_list);
-		pgmoved++;
+		pgmoved += compound_pages(page);
 		if (!pagevec_add(&pvec, page)) {
 			__mod_zone_page_state(zone, NR_ACTIVE, pgmoved);
 			pgmoved = 0;
Index: linux-2.6.22-rc4-mm2/include/linux/mm_inline.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/include/linux/mm_inline.h	2007-06-19 23:27:02.000000000 -0700
+++ linux-2.6.22-rc4-mm2/include/linux/mm_inline.h	2007-06-20 00:22:16.000000000 -0700
@@ -2,46 +2,67 @@ static inline void
 add_page_to_active_list(struct zone *zone, struct page *page)
 {
 	list_add(&page->lru, &zone->active_list);
-	__inc_zone_state(zone, NR_ACTIVE);
+	if (!PageHead(page))
+		__inc_zone_state(zone, NR_ACTIVE);
+	else
+		__inc_zone_page_state(page, NR_ACTIVE);
 }
 
 static inline void
 add_page_to_inactive_list(struct zone *zone, struct page *page)
 {
 	list_add(&page->lru, &zone->inactive_list);
-	__inc_zone_state(zone, NR_INACTIVE);
+	if (!PageHead(page))
+		__inc_zone_state(zone, NR_INACTIVE);
+	else
+		__inc_zone_page_state(page, NR_INACTIVE);
 }
 
 static inline void
 add_page_to_inactive_list_tail(struct zone *zone, struct page *page)
 {
 	list_add_tail(&page->lru, &zone->inactive_list);
-	__inc_zone_state(zone, NR_INACTIVE);
+	if (!PageHead(page))
+		__inc_zone_state(zone, NR_INACTIVE);
+	else
+		__inc_zone_page_state(page, NR_INACTIVE);
 }
 
 static inline void
 del_page_from_active_list(struct zone *zone, struct page *page)
 {
 	list_del(&page->lru);
-	__dec_zone_state(zone, NR_ACTIVE);
+	if (!PageHead(page))
+		__dec_zone_state(zone, NR_ACTIVE);
+	else
+		__dec_zone_page_state(page, NR_ACTIVE);
 }
 
 static inline void
 del_page_from_inactive_list(struct zone *zone, struct page *page)
 {
 	list_del(&page->lru);
-	__dec_zone_state(zone, NR_INACTIVE);
+	if (!PageHead(page))
+		__dec_zone_state(zone, NR_INACTIVE);
+	else
+		__dec_zone_page_state(page, NR_INACTIVE);
 }
 
 static inline void
 del_page_from_lru(struct zone *zone, struct page *page)
 {
+	enum zone_stat_item counter = NR_ACTIVE;
+
 	list_del(&page->lru);
-	if (PageActive(page)) {
+	if (PageActive(page))
 		__ClearPageActive(page);
-		__dec_zone_state(zone, NR_ACTIVE);
-	} else {
-		__dec_zone_state(zone, NR_INACTIVE);
-	}
+	else
+		counter = NR_INACTIVE;
+
+	if (!PageHead(page))
+		__dec_zone_state(zone, counter);
+	else
+		__dec_zone_page_state(page, counter);
 }
 
+

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [30/37] Add VM_BUG_ONs to check for correct page order
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (28 preceding siblings ...)
  2007-06-20 18:29 ` [29/37] Large blocksize support: Fix up reclaim counters clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [31/37] Large blocksize support: Core piece clameter
                   ` (6 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_safety_checks --]
[-- Type: text/plain, Size: 3999 bytes --]

Before we start allowing different page orders we had better get checkpoints
in at various places in the VM. The checkpoints will help debugging whenever
a page of the wrong order shows up in a mapping. This will be helpful for
converting new filesystems to utilize larger pages.
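
All of the checkpoints assert the same invariant, shown here wrapped in a
hypothetical helper for illustration (mapping_order() comes from the page
cache patches earlier in this series):

	static inline void check_mapping_page_order(struct address_space *mapping,
						struct page *page)
	{
		/* Any page found in a mapping must have the order that
		 * the mapping was set up with. */
		VM_BUG_ON(mapping_order(mapping) != compound_order(page));
	}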

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/buffer.c  |    1 +
 mm/filemap.c |   18 +++++++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

Index: linux-2.6.22-rc4-mm2/mm/filemap.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/filemap.c	2007-06-18 23:09:36.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/filemap.c	2007-06-19 19:20:29.000000000 -0700
@@ -128,6 +128,7 @@ void remove_from_page_cache(struct page 
 	struct address_space *mapping = page->mapping;
 
 	BUG_ON(!PageLocked(page));
+	VM_BUG_ON(mapping_order(mapping) != compound_order(page));
 
 	write_lock_irq(&mapping->tree_lock);
 	__remove_from_page_cache(page);
@@ -269,6 +270,7 @@ int wait_on_page_writeback_range(struct 
 			if (page->index > end)
 				continue;
 
+			VM_BUG_ON(mapping_order(mapping) != compound_order(page));
 			wait_on_page_writeback(page);
 			if (PageError(page))
 				ret = -EIO;
@@ -441,6 +443,7 @@ int add_to_page_cache(struct page *page,
 {
 	int error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
 
+	VM_BUG_ON(mapping_order(mapping) != compound_order(page));
 	if (error == 0) {
 		write_lock_irq(&mapping->tree_lock);
 		error = radix_tree_insert(&mapping->page_tree, offset, page);
@@ -600,8 +603,10 @@ struct page * find_get_page(struct addre
 
 	read_lock_irq(&mapping->tree_lock);
 	page = radix_tree_lookup(&mapping->page_tree, offset);
-	if (page)
+	if (page) {
+		VM_BUG_ON(mapping_order(mapping) != compound_order(page));
 		page_cache_get(page);
+	}
 	read_unlock_irq(&mapping->tree_lock);
 	return page;
 }
@@ -626,6 +631,7 @@ struct page *find_lock_page(struct addre
 repeat:
 	page = radix_tree_lookup(&mapping->page_tree, offset);
 	if (page) {
+		VM_BUG_ON(mapping_order(mapping) != compound_order(page));
 		page_cache_get(page);
 		if (TestSetPageLocked(page)) {
 			read_unlock_irq(&mapping->tree_lock);
@@ -711,8 +717,10 @@ unsigned find_get_pages(struct address_s
 	read_lock_irq(&mapping->tree_lock);
 	ret = radix_tree_gang_lookup(&mapping->page_tree,
 				(void **)pages, start, nr_pages);
-	for (i = 0; i < ret; i++)
+	for (i = 0; i < ret; i++) {
+		VM_BUG_ON(mapping_order(mapping) != compound_order(pages[i]));
 		page_cache_get(pages[i]);
+	}
 	read_unlock_irq(&mapping->tree_lock);
 	return ret;
 }
@@ -743,6 +751,7 @@ unsigned find_get_pages_contig(struct ad
 		if (pages[i]->mapping == NULL || pages[i]->index != index)
 			break;
 
+		VM_BUG_ON(mapping_order(mapping) != compound_order(pages[i]));
 		page_cache_get(pages[i]);
 		index++;
 	}
@@ -771,8 +780,10 @@ unsigned find_get_pages_tag(struct addre
 	read_lock_irq(&mapping->tree_lock);
 	ret = radix_tree_gang_lookup_tag(&mapping->page_tree,
 				(void **)pages, *index, nr_pages, tag);
-	for (i = 0; i < ret; i++)
+	for (i = 0; i < ret; i++) {
+		VM_BUG_ON(mapping_order(mapping) != compound_order(pages[i]));
 		page_cache_get(pages[i]);
+	}
 	if (ret)
 		*index = pages[ret - 1]->index + 1;
 	read_unlock_irq(&mapping->tree_lock);
@@ -2610,6 +2621,7 @@ int try_to_release_page(struct page *pag
 	struct address_space * const mapping = page->mapping;
 
 	BUG_ON(!PageLocked(page));
+	VM_BUG_ON(mapping_order(mapping) != compound_order(page));
 	if (PageWriteback(page))
 		return 0;
 
Index: linux-2.6.22-rc4-mm2/fs/buffer.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/buffer.c	2007-06-19 19:20:29.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/buffer.c	2007-06-19 19:24:12.000000000 -0700
@@ -901,6 +901,7 @@ struct buffer_head *alloc_page_buffers(s
 	long offset;
 	unsigned page_size = page_cache_size(page->mapping);
 
+	BUG_ON(size > page_size);
 try_again:
 	head = NULL;
 	offset = page_size;

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [31/37] Large blocksize support: Core piece
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (29 preceding siblings ...)
  2007-06-20 18:29 ` [30/37] Add VM_BUG_ONs to check for correct page order clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-21  0:20   ` Bob Picco
  2007-06-20 18:29 ` [32/37] Readahead changes to support large blocksize clameter
                   ` (5 subsequent siblings)
  36 siblings, 1 reply; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_large_page_size --]
[-- Type: text/plain, Size: 16862 bytes --]

Provide an alternate definition for the page_cache_xxx(mapping, ...)
functions that can determine the current page size from the mapping
and generate the appropriate shifts, sizes and mask for the page cache
operations. Change the basic functions that allocate pages for the
page cache to be able to handle higher order allocations.
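
As an illustration (a sketch, not part of the patch): a call site that
hard-codes the page cache geometry converts along these lines:

	/* Before: assumes an order 0 page cache */
	index = pos >> PAGE_CACHE_SHIFT;
	offset = pos & ~PAGE_CACHE_MASK;

	/* After: shift and mask are derived from the mapping */
	index = page_cache_index(mapping, pos);
	offset = page_cache_offset(mapping, pos);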

Provide a new function

mapping_setup(struct address_space *, gfp_t mask, int order)

that allows the setup of a mapping of any compound page order.

mapping_set_gfp_mask() is still provided but it sets mappings to order 0.
Calls to mapping_set_gfp_mask() must be converted to mapping_setup() in
order for the filesystem to be able to use larger pages. For some key block
devices and filesystems the conversion is done here.
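
The conversion is mechanical; a sketch of the pattern used for the
conversions below (GFP_NOFS stands in for whatever mask was passed
before):

	/* Before: implies an order 0 mapping */
	mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);

	/* After: page order derived from the inode's blocksize */
	mapping_setup(inode->i_mapping, GFP_NOFS,
		page_cache_blkbits_to_order(inode->i_blkbits));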

mapping_setup() for higher order is only allowed if the mapping does not
use DMA mappings or HIGHMEM since we do not support bouncing at the moment.
Thus we currently BUG() on DMA mappings and clear the highmem bit of higher
order mappings.

Modify the set_blocksize() function so that an arbitrary blocksize can be set.
Blocksizes up to PAGE_SIZE << MAX_ORDER can be set. This is typically 8MB on
many platforms (order 11 with 4k base pages). Typically file systems are not
only limited by the core VM but also by their internal data structures. The
core VM limitations fall away with this patch. The functionality provided here
can do nothing about the internal limitations of filesystems.
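
A filesystem that wants 64k blocks on a 4k page platform could then
simply do something like:

	/* sb_set_blocksize() returns 0 if the size cannot be set */
	if (!sb_set_blocksize(sb, 65536))
		return -EINVAL;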

Known internal limitations:

Ext2		64k
XFS		64k
Reiserfs	8k
Ext3		4k
Ext4		4k

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 block/Kconfig               |   17 ++++++
 drivers/block/rd.c          |    6 +-
 fs/block_dev.c              |   29 +++++++----
 fs/buffer.c                 |    2 
 fs/inode.c                  |    7 +-
 fs/xfs/linux-2.6/xfs_buf.c  |    3 -
 include/linux/buffer_head.h |   12 ++++
 include/linux/fs.h          |    5 +
 include/linux/pagemap.h     |  116 +++++++++++++++++++++++++++++++++++++++++---
 mm/filemap.c                |   17 ++++--
 10 files changed, 186 insertions(+), 28 deletions(-)

Index: linux-2.6.22-rc4-mm2/include/linux/pagemap.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/include/linux/pagemap.h	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/include/linux/pagemap.h	2007-06-19 23:50:55.000000000 -0700
@@ -39,10 +39,30 @@ static inline gfp_t mapping_gfp_mask(str
  * This is non-atomic.  Only to be used before the mapping is activated.
  * Probably needs a barrier...
  */
-static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
+static inline void mapping_setup(struct address_space *m,
+					gfp_t mask, int order)
 {
 	m->flags = (m->flags & ~(__force unsigned long)__GFP_BITS_MASK) |
 				(__force unsigned long)mask;
+
+#ifdef CONFIG_LARGE_BLOCKSIZE
+	m->order = order;
+	m->shift = order + PAGE_SHIFT;
+	m->offset_mask = (PAGE_SIZE << order) - 1;
+	if (order) {
+		/*
+		 * Bouncing is not supported. Requests for DMA
+		 * memory will not work
+		 */
+		BUG_ON(m->flags & (__GFP_DMA|__GFP_DMA32));
+		/*
+		 * Bouncing not supported. We cannot use HIGHMEM
+		 */
+		m->flags &= ~__GFP_HIGHMEM;
+		m->flags |= __GFP_COMP;
+		raise_kswapd_order(order);
+	}
+#endif
 }
 
 /*
@@ -62,6 +82,78 @@ static inline void mapping_set_gfp_mask(
 #define PAGE_CACHE_ALIGN(addr)	(((addr)+PAGE_CACHE_SIZE-1)&PAGE_CACHE_MASK)
 
 /*
+ * The next set of functions allow to write code that is capable of dealing
+ * with multiple page sizes.
+ */
+#ifdef CONFIG_LARGE_BLOCKSIZE
+/*
+ * Determine page order from the blkbits in the inode structure
+ */
+static inline int page_cache_blkbits_to_order(int shift)
+{
+	BUG_ON(shift < 9);
+
+	if (shift < PAGE_SHIFT)
+		return 0;
+
+	return shift - PAGE_SHIFT;
+}
+
+/*
+ * Determine page order from a given blocksize
+ */
+static inline int page_cache_blocksize_to_order(unsigned long size)
+{
+	return page_cache_blkbits_to_order(ilog2(size));
+}
+
+static inline int mapping_order(struct address_space *a)
+{
+	return a->order;
+}
+
+static inline int page_cache_shift(struct address_space *a)
+{
+	return a->shift;
+}
+
+static inline unsigned int page_cache_size(struct address_space *a)
+{
+	return a->offset_mask + 1;
+}
+
+static inline loff_t page_cache_mask(struct address_space *a)
+{
+	return ~a->offset_mask;
+}
+
+static inline unsigned int page_cache_offset(struct address_space *a,
+		loff_t pos)
+{
+	return pos & a->offset_mask;
+}
+#else
+/*
+ * Kernel configured for a fixed PAGE_SIZEd page cache
+ */
+static inline int page_cache_blkbits_to_order(int shift)
+{
+	if (shift < 9)
+		return -EINVAL;
+	if (shift > PAGE_SHIFT)
+		return -EINVAL;
+	return 0;
+}
+
+static inline int page_cache_blocksize_to_order(unsigned long size)
+{
+	if (size >= 512 && size <= PAGE_SIZE)
+		return 0;
+
+	return -EINVAL;
+}
+
+/*
  * Functions that are currently setup for a fixed PAGE_SIZEd. The use of
  * these will allow a variable page size pagecache in the future.
  */
@@ -90,6 +182,7 @@ static inline unsigned int page_cache_of
 {
 	return pos & ~PAGE_MASK;
 }
+#endif
 
 static inline pgoff_t page_cache_index(struct address_space *a,
 		loff_t pos)
@@ -112,27 +205,38 @@ static inline loff_t page_cache_pos(stru
 	return ((loff_t)index << page_cache_shift(a)) + offset;
 }
 
+/*
+ * Legacy function. Only supports order 0 pages.
+ */
+static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
+{
+	if (mapping_order(m))
+		printk(KERN_ERR "mapping_setup(%p, %x, %d)\n", m, mask, mapping_order(m));
+	mapping_setup(m, mask, 0);
+}
+
 #define page_cache_get(page)		get_page(page)
 #define page_cache_release(page)	put_page(page)
 void release_pages(struct page **pages, int nr, int cold);
 
 #ifdef CONFIG_NUMA
-extern struct page *__page_cache_alloc(gfp_t gfp);
+extern struct page *__page_cache_alloc(gfp_t gfp, int);
 #else
-static inline struct page *__page_cache_alloc(gfp_t gfp)
+static inline struct page *__page_cache_alloc(gfp_t gfp, int order)
 {
-	return alloc_pages(gfp, 0);
+	return alloc_pages(gfp, order);
 }
 #endif
 
 static inline struct page *page_cache_alloc(struct address_space *x)
 {
-	return __page_cache_alloc(mapping_gfp_mask(x));
+	return __page_cache_alloc(mapping_gfp_mask(x), mapping_order(x));
 }
 
 static inline struct page *page_cache_alloc_cold(struct address_space *x)
 {
-	return __page_cache_alloc(mapping_gfp_mask(x)|__GFP_COLD);
+	return __page_cache_alloc(mapping_gfp_mask(x)|__GFP_COLD,
+				mapping_order(x));
 }
 
 typedef int filler_t(void *, struct page *);
Index: linux-2.6.22-rc4-mm2/include/linux/fs.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/include/linux/fs.h	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/include/linux/fs.h	2007-06-19 23:33:45.000000000 -0700
@@ -519,6 +519,11 @@ struct address_space {
 	spinlock_t		i_mmap_lock;	/* protect tree, count, list */
 	unsigned int		truncate_count;	/* Cover race condition with truncate */
 	unsigned long		nrpages;	/* number of total pages */
+#ifdef CONFIG_LARGE_BLOCKSIZE
+	loff_t			offset_mask;	/* Mask to get to offset bits */
+	unsigned int		order;		/* Page order of the pages in here */
+	unsigned int		shift;		/* Shift of index */
+#endif
 	pgoff_t			writeback_index;/* writeback starts here */
 	const struct address_space_operations *a_ops;	/* methods */
 	unsigned long		flags;		/* error bits/gfp mask */
Index: linux-2.6.22-rc4-mm2/mm/filemap.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/filemap.c	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/filemap.c	2007-06-20 00:45:36.000000000 -0700
@@ -472,13 +472,13 @@ int add_to_page_cache_lru(struct page *p
 }
 
 #ifdef CONFIG_NUMA
-struct page *__page_cache_alloc(gfp_t gfp)
+struct page *__page_cache_alloc(gfp_t gfp, int order)
 {
 	if (cpuset_do_page_mem_spread()) {
 		int n = cpuset_mem_spread_node();
-		return alloc_pages_node(n, gfp, 0);
+		return alloc_pages_node(n, gfp, order);
 	}
-	return alloc_pages(gfp, 0);
+	return alloc_pages(gfp, order);
 }
 EXPORT_SYMBOL(__page_cache_alloc);
 #endif
@@ -677,7 +677,7 @@ struct page *find_or_create_page(struct 
 repeat:
 	page = find_lock_page(mapping, index);
 	if (!page) {
-		page = __page_cache_alloc(gfp_mask);
+		page = __page_cache_alloc(gfp_mask, mapping_order(mapping));
 		if (!page)
 			return NULL;
 		err = add_to_page_cache_lru(page, mapping, index, gfp_mask);
@@ -815,7 +815,8 @@ grab_cache_page_nowait(struct address_sp
 		page_cache_release(page);
 		return NULL;
 	}
-	page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS);
+	page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS,
+				mapping_order(mapping));
 	if (page && add_to_page_cache_lru(page, mapping, index, GFP_KERNEL)) {
 		page_cache_release(page);
 		page = NULL;
@@ -1536,6 +1537,12 @@ int generic_file_mmap(struct file * file
 {
 	struct address_space *mapping = file->f_mapping;
 
+	/*
+	 * Forbid mmap access to higher order mappings.
+	 */
+	if (mapping_order(mapping))
+		return -ENOSYS;
+
 	if (!mapping->a_ops->readpage)
 		return -ENOEXEC;
 	file_accessed(file);
Index: linux-2.6.22-rc4-mm2/block/Kconfig
===================================================================
--- linux-2.6.22-rc4-mm2.orig/block/Kconfig	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/block/Kconfig	2007-06-19 23:33:45.000000000 -0700
@@ -49,6 +49,23 @@ config LSF
 
 	  If unsure, say Y.
 
+#
+# The functions to switch on larger pages in a filesystem will return an error
+# if the gfp flags for a mapping require only DMA pages. Highmem will always
+# be switched off for higher order mappings.
+#
+config LARGE_BLOCKSIZE
+	bool "Support blocksizes larger than page size"
+	default n
+	depends on EXPERIMENTAL
+	help
+	  Allows the page cache to support higher orders of pages. Higher
+	  order page cache pages may be useful to support special devices
+	  like CDs, DVDs and flash, and also to increase I/O performance.
+	  WARNING: This functionality may have significant memory
+	  requirements. It is not advisable to enable this in configurations
+	  where ZONE_NORMAL is smaller than 1 Gigabyte.
+
 endif # BLOCK
 
 source block/Kconfig.iosched
Index: linux-2.6.22-rc4-mm2/fs/block_dev.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/block_dev.c	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/block_dev.c	2007-06-19 23:50:01.000000000 -0700
@@ -65,36 +65,46 @@ static void kill_bdev(struct block_devic
 		return;
 	invalidate_bh_lrus();
 	truncate_inode_pages(bdev->bd_inode->i_mapping, 0);
-}	
+}
 
 int set_blocksize(struct block_device *bdev, int size)
 {
-	/* Size must be a power of two, and between 512 and PAGE_SIZE */
-	if (size > PAGE_SIZE || size < 512 || !is_power_of_2(size))
+	int order;
+
+	if (size > (PAGE_SIZE << MAX_ORDER) || size < 512 ||
+						!is_power_of_2(size))
 		return -EINVAL;
 
 	/* Size cannot be smaller than the size supported by the device */
 	if (size < bdev_hardsect_size(bdev))
 		return -EINVAL;
 
+	order = page_cache_blocksize_to_order(size);
+
 	/* Don't change the size if it is same as current */
 	if (bdev->bd_block_size != size) {
+		int bits = blksize_bits(size);
+		struct address_space *mapping =
+			bdev->bd_inode->i_mapping;
+
 		sync_blockdev(bdev);
-		bdev->bd_block_size = size;
-		bdev->bd_inode->i_blkbits = blksize_bits(size);
 		kill_bdev(bdev);
+		bdev->bd_block_size = size;
+		bdev->bd_inode->i_blkbits = bits;
+		mapping_setup(mapping, GFP_NOFS, order);
 	}
 	return 0;
 }
-
 EXPORT_SYMBOL(set_blocksize);
 
 int sb_set_blocksize(struct super_block *sb, int size)
 {
 	if (set_blocksize(sb->s_bdev, size))
 		return 0;
-	/* If we get here, we know size is power of two
-	 * and it's value is between 512 and PAGE_SIZE */
+	/*
+	 * If we get here, we know size is power of two
+	 * and it's value is valid for the page cache
+	 */
 	sb->s_blocksize = size;
 	sb->s_blocksize_bits = blksize_bits(size);
 	return sb->s_blocksize;
@@ -588,7 +598,8 @@ struct block_device *bdget(dev_t dev)
 		inode->i_rdev = dev;
 		inode->i_bdev = bdev;
 		inode->i_data.a_ops = &def_blk_aops;
-		mapping_set_gfp_mask(&inode->i_data, GFP_USER);
+		mapping_setup(&inode->i_data, GFP_USER,
+			page_cache_blkbits_to_order(inode->i_blkbits));
 		inode->i_data.backing_dev_info = &default_backing_dev_info;
 		spin_lock(&bdev_lock);
 		list_add(&bdev->bd_list, &all_bdevs);
Index: linux-2.6.22-rc4-mm2/fs/buffer.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/buffer.c	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/buffer.c	2007-06-20 00:16:41.000000000 -0700
@@ -1098,7 +1098,7 @@ __getblk_slow(struct block_device *bdev,
 {
 	/* Size must be multiple of hard sectorsize */
 	if (unlikely(size & (bdev_hardsect_size(bdev)-1) ||
-			(size < 512 || size > PAGE_SIZE))) {
+		size < 512 || size > (PAGE_SIZE << MAX_ORDER))) {
 		printk(KERN_ERR "getblk(): invalid block size %d requested\n",
 					size);
 		printk(KERN_ERR "hardsect size: %d\n",
Index: linux-2.6.22-rc4-mm2/fs/inode.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/inode.c	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/inode.c	2007-06-19 23:53:56.000000000 -0700
@@ -145,7 +145,8 @@ static struct inode *alloc_inode(struct 
 		mapping->a_ops = &empty_aops;
  		mapping->host = inode;
 		mapping->flags = 0;
-		mapping_set_gfp_mask(mapping, GFP_HIGHUSER_PAGECACHE);
+		mapping_setup(mapping, GFP_HIGHUSER_PAGECACHE,
+				page_cache_blkbits_to_order(inode->i_blkbits));
 		mapping->assoc_mapping = NULL;
 		mapping->backing_dev_info = &default_backing_dev_info;
 
@@ -243,7 +244,7 @@ void clear_inode(struct inode *inode)
 {
 	might_sleep();
 	invalidate_inode_buffers(inode);
-       
+
 	BUG_ON(inode->i_data.nrpages);
 	BUG_ON(!(inode->i_state & I_FREEING));
 	BUG_ON(inode->i_state & I_CLEAR);
@@ -528,7 +529,7 @@ repeat:
  *	for allocations related to inode->i_mapping is GFP_HIGHUSER_PAGECACHE.
  *	If HIGHMEM pages are unsuitable or it is known that pages allocated
  *	for the page cache are not reclaimable or migratable,
- *	mapping_set_gfp_mask() must be called with suitable flags on the
+ *	mapping_setup() must be called with suitable flags and bits on the
  *	newly created inode's mapping
  *
  */
Index: linux-2.6.22-rc4-mm2/drivers/block/rd.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/drivers/block/rd.c	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/drivers/block/rd.c	2007-06-20 00:35:55.000000000 -0700
@@ -121,7 +121,8 @@ static void make_page_uptodate(struct pa
 			}
 		} while ((bh = bh->b_this_page) != head);
 	} else {
-		memset(page_address(page), 0, page_cache_size(page_mapping(page)));
+		memset(page_address(page), 0,
+			page_cache_size(page_mapping(page)));
 	}
 	flush_dcache_page(page);
 	SetPageUptodate(page);
@@ -380,7 +381,8 @@ static int rd_open(struct inode *inode, 
 		gfp_mask = mapping_gfp_mask(mapping);
 		gfp_mask &= ~(__GFP_FS|__GFP_IO);
 		gfp_mask |= __GFP_HIGH;
-		mapping_set_gfp_mask(mapping, gfp_mask);
+		mapping_setup(mapping, gfp_mask,
+			page_cache_blkbits_to_order(inode->i_blkbits));
 	}
 
 	return 0;
Index: linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/xfs/linux-2.6/xfs_buf.c	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_buf.c	2007-06-19 23:33:45.000000000 -0700
@@ -1558,7 +1558,8 @@ xfs_mapping_buftarg(
 	mapping = &inode->i_data;
 	mapping->a_ops = &mapping_aops;
 	mapping->backing_dev_info = bdi;
-	mapping_set_gfp_mask(mapping, GFP_NOFS);
+	mapping_setup(mapping, GFP_NOFS,
+		page_cache_blkbits_to_order(inode->i_blkbits));
 	btp->bt_mapping = mapping;
 	return 0;
 }
Index: linux-2.6.22-rc4-mm2/include/linux/buffer_head.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/include/linux/buffer_head.h	2007-06-19 23:33:44.000000000 -0700
+++ linux-2.6.22-rc4-mm2/include/linux/buffer_head.h	2007-06-19 23:33:45.000000000 -0700
@@ -129,7 +129,17 @@ BUFFER_FNS(Ordered, ordered)
 BUFFER_FNS(Eopnotsupp, eopnotsupp)
 BUFFER_FNS(Unwritten, unwritten)
 
-#define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
+static inline unsigned long bh_offset(struct buffer_head *bh)
+{
+	/*
+	 * No mapping available. Use page struct to obtain
+	 * order.
+	 */
+	unsigned long mask = compound_size(bh->b_page) - 1;
+
+	return (unsigned long)bh->b_data & mask;
+}
+
 #define touch_buffer(bh)	mark_page_accessed(bh->b_page)
 
 /* If we *know* page->private refers to buffer_heads */

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [32/37] Readahead changes to support large blocksize.
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (30 preceding siblings ...)
  2007-06-20 18:29 ` [31/37] Large blocksize support: Core piece clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [33/37] Large blocksize: Compound page zeroing and flushing clameter
                   ` (4 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, Fengguang Wu,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps-readahead.patch --]
[-- Type: text/plain, Size: 5537 bytes --]

Fix up readahead for large I/O operations.

Only calculate the readahead up to the 2M boundary, then fall back to
one page.
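
For illustration (not part of the patch): with a 64k page cache on a 4k
base page platform, each 2M readahead chunk now covers 32 higher order
pages instead of 512 base pages:

	unsigned long this_chunk = DIV_ROUND_UP(2 * 1024 * 1024,
					page_cache_size(mapping));
	/* 2M / 64k = 32 pages of order 4 */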

Signed-off-by: Fengguang Wu <fengguang.wu@gmail.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>

===================================================================
---
 include/linux/mm.h |    2 +-
 mm/fadvise.c       |    4 ++--
 mm/filemap.c       |    5 ++---
 mm/madvise.c       |    2 +-
 mm/readahead.c     |   22 ++++++++++++++--------
 5 files changed, 20 insertions(+), 15 deletions(-)

Index: linux-2.6.22-rc4-mm2/mm/fadvise.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/fadvise.c	2007-06-18 23:09:37.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/fadvise.c	2007-06-19 20:01:53.000000000 -0700
@@ -86,10 +86,10 @@ asmlinkage long sys_fadvise64_64(int fd,
 		nrpages = end_index - start_index + 1;
 		if (!nrpages)
 			nrpages = ~0UL;
-		
+
 		ret = force_page_cache_readahead(mapping, file,
 				start_index,
-				max_sane_readahead(nrpages));
+				nrpages);
 		if (ret > 0)
 			ret = 0;
 		break;
Index: linux-2.6.22-rc4-mm2/mm/filemap.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/filemap.c	2007-06-19 19:28:15.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/filemap.c	2007-06-19 20:01:53.000000000 -0700
@@ -1288,8 +1288,7 @@ do_readahead(struct address_space *mappi
 	if (!mapping || !mapping->a_ops || !mapping->a_ops->readpage)
 		return -EINVAL;
 
-	force_page_cache_readahead(mapping, filp, index,
-					max_sane_readahead(nr));
+	force_page_cache_readahead(mapping, filp, index, nr);
 	return 0;
 }
 
@@ -1427,7 +1426,7 @@ retry_find:
 			count_vm_event(PGMAJFAULT);
 		}
 		did_readaround = 1;
-		ra_pages = max_sane_readahead(file->f_ra.ra_pages);
+		ra_pages = file->f_ra.ra_pages;
 		if (ra_pages) {
 			pgoff_t start = 0;
 
Index: linux-2.6.22-rc4-mm2/mm/madvise.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/madvise.c	2007-06-04 17:57:25.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/madvise.c	2007-06-19 20:01:53.000000000 -0700
@@ -124,7 +124,7 @@ static long madvise_willneed(struct vm_a
 	end = ((end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 
 	force_page_cache_readahead(file->f_mapping,
-			file, start, max_sane_readahead(end - start));
+			file, start, end - start);
 	return 0;
 }
 
Index: linux-2.6.22-rc4-mm2/mm/readahead.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/readahead.c	2007-06-15 17:35:33.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/readahead.c	2007-06-19 20:01:53.000000000 -0700
@@ -44,7 +44,8 @@ EXPORT_SYMBOL_GPL(default_backing_dev_in
 void
 file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
 {
-	ra->ra_pages = mapping->backing_dev_info->ra_pages;
+	ra->ra_pages = DIV_ROUND_UP(mapping->backing_dev_info->ra_pages,
+				    page_cache_size(mapping));
 	ra->prev_index = -1;
 }
 EXPORT_SYMBOL_GPL(file_ra_state_init);
@@ -82,7 +83,7 @@ int read_cache_pages(struct address_spac
 			put_pages_list(pages);
 			break;
 		}
-		task_io_account_read(PAGE_CACHE_SIZE);
+		task_io_account_read(page_cache_size(mapping));
 	}
 	return ret;
 }
@@ -143,7 +144,7 @@ __do_page_cache_readahead(struct address
 	if (isize == 0)
 		goto out;
 
-	end_index = ((isize - 1) >> PAGE_CACHE_SHIFT);
+	end_index = page_cache_index(mapping, isize - 1);
 
 	/*
 	 * Preallocate as many pages as we will need.
@@ -196,10 +197,12 @@ int force_page_cache_readahead(struct ad
 	if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readpages))
 		return -EINVAL;
 
+	nr_to_read = max_sane_readahead(nr_to_read, mapping_order(mapping));
 	while (nr_to_read) {
 		int err;
 
-		unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_CACHE_SIZE;
+		unsigned long this_chunk = DIV_ROUND_UP(2 * 1024 * 1024,
+						page_cache_size(mapping));
 
 		if (this_chunk > nr_to_read)
 			this_chunk = nr_to_read;
@@ -229,17 +232,20 @@ int do_page_cache_readahead(struct addre
 	if (bdi_read_congested(mapping->backing_dev_info))
 		return -1;
 
+	nr_to_read = max_sane_readahead(nr_to_read, mapping_order(mapping));
 	return __do_page_cache_readahead(mapping, filp, offset, nr_to_read, 0);
 }
 
 /*
- * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
+ * Given a desired number of page order readahead pages, return a
  * sensible upper limit.
  */
-unsigned long max_sane_readahead(unsigned long nr)
+unsigned long max_sane_readahead(unsigned long nr, int order)
 {
-	return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE)
-		+ node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
+	unsigned long base_pages = node_page_state(numa_node_id(), NR_INACTIVE)
+			+ node_page_state(numa_node_id(), NR_FREE_PAGES);
+
+	return min(nr, (base_pages / 2) >> order);
 }
 
 /*
Index: linux-2.6.22-rc4-mm2/include/linux/mm.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/include/linux/mm.h	2007-06-18 23:09:37.000000000 -0700
+++ linux-2.6.22-rc4-mm2/include/linux/mm.h	2007-06-19 20:01:53.000000000 -0700
@@ -1167,7 +1167,7 @@ unsigned long page_cache_readahead_ondem
 			  struct page *page,
 			  pgoff_t offset,
 			  unsigned long size);
-unsigned long max_sane_readahead(unsigned long nr);
+unsigned long max_sane_readahead(unsigned long nr, int order);
 
 /* Do stack extension */
 extern int expand_stack(struct vm_area_struct *vma, unsigned long address);

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [33/37] Large blocksize: Compound page zeroing and flushing
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (31 preceding siblings ...)
  2007-06-20 18:29 ` [32/37] Readahead changes to support large blocksize clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [34/37] Large blocksize support in ramfs clameter
                   ` (3 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_flush_compound_page --]
[-- Type: text/plain, Size: 4691 bytes --]

We may now have to zero and flush higher order pages. Implement
clear_mapping_page and flush_mapping_page to do that job. Replace
the flushing and clearing at some key locations for the pagecache.
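
The conversion pattern at those sites is uniform, e.g. in a readpage
implementation:

	/* Before: clears and flushes a single base page */
	clear_highpage(page);
	flush_dcache_page(page);

	/* After: walks all base pages of the compound page */
	clear_mapping_page(page);
	flush_mapping_page(page);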

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/libfs.c              |    4 ++--
 include/linux/highmem.h |   31 +++++++++++++++++++++++++++++--
 include/linux/pagemap.h |    1 +
 mm/filemap.c            |    8 ++++----
 mm/filemap_xip.c        |    4 ++--
 5 files changed, 38 insertions(+), 10 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/libfs.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/libfs.c	2007-06-19 20:10:05.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/libfs.c	2007-06-19 20:10:45.000000000 -0700
@@ -330,8 +330,8 @@ int simple_rename(struct inode *old_dir,
 
 int simple_readpage(struct file *file, struct page *page)
 {
-	clear_highpage(page);
-	flush_dcache_page(page);
+	clear_mapping_page(page);
+	flush_mapping_page(page);
 	SetPageUptodate(page);
 	unlock_page(page);
 	return 0;
Index: linux-2.6.22-rc4-mm2/include/linux/highmem.h
===================================================================
--- linux-2.6.22-rc4-mm2.orig/include/linux/highmem.h	2007-06-19 20:06:06.000000000 -0700
+++ linux-2.6.22-rc4-mm2/include/linux/highmem.h	2007-06-19 20:30:06.000000000 -0700
@@ -124,14 +124,41 @@ static inline void clear_highpage(struct
 	kunmap_atomic(kaddr, KM_USER0);
 }
 
+/*
+ * Clear a higher order page
+ */
+static inline void clear_mapping_page(struct page *page)
+{
+	int nr_pages = compound_pages(page);
+	int i;
+
+	for (i = 0; i < nr_pages; i++)
+		clear_highpage(page + i);
+}
+
+/*
+ * Primitive support for flushing higher order pages.
+ *
+ * A bit stupid: On many platforms flushing the first page
+ * will flush any TLB starting there
+ */
+static inline void flush_mapping_page(struct page *page)
+{
+	int nr_pages = compound_pages(page);
+	int i;
+
+	for (i = 0; i < nr_pages; i++)
+		flush_dcache_page(page + i);
+}
+
 static inline void zero_user_segments(struct page *page,
 	unsigned start1, unsigned end1,
 	unsigned start2, unsigned end2)
 {
 	void *kaddr = kmap_atomic(page, KM_USER0);
 
-	BUG_ON(end1 > PAGE_SIZE ||
-		end2 > PAGE_SIZE);
+	BUG_ON(end1 > compound_size(page) ||
+		end2 > compound_size(page));
 
 	if (end1 > start1)
 		memset(kaddr + start1, 0, end1 - start1);
Index: linux-2.6.22-rc4-mm2/mm/filemap.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/filemap.c	2007-06-19 20:10:52.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/filemap.c	2007-06-19 20:11:44.000000000 -0700
@@ -946,7 +946,7 @@ page_ok:
 		 * before reading the page on the kernel side.
 		 */
 		if (mapping_writably_mapped(mapping))
-			flush_dcache_page(page);
+			flush_mapping_page(page);
 
 		/*
 		 * When a sequential read accesses a page several times,
@@ -2004,7 +2004,7 @@ int pagecache_write_end(struct file *fil
 		unsigned offset = page_cache_offset(mapping, pos);
 		struct inode *inode = mapping->host;
 
-		flush_dcache_page(page);
+		flush_mapping_page(page);
 		ret = aops->commit_write(file, page, offset, offset+len);
 		unlock_page(page);
 		page_cache_release(page);
@@ -2216,7 +2216,7 @@ static ssize_t generic_perform_write_2co
 			kunmap_atomic(src, KM_USER0);
 			copied = bytes;
 		}
-		flush_dcache_page(page);
+		flush_mapping_page(page);
 
 		status = a_ops->commit_write(file, page, offset, offset+bytes);
 		if (unlikely(status < 0 || status == AOP_TRUNCATED_PAGE))
@@ -2314,7 +2314,7 @@ again:
 		pagefault_disable();
 		copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
 		pagefault_enable();
-		flush_dcache_page(page);
+		flush_mapping_page(page);
 
 		status = a_ops->write_end(file, mapping, pos, bytes, copied,
 						page, fsdata);
Index: linux-2.6.22-rc4-mm2/mm/filemap_xip.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/mm/filemap_xip.c	2007-06-19 20:12:10.000000000 -0700
+++ linux-2.6.22-rc4-mm2/mm/filemap_xip.c	2007-06-19 20:12:46.000000000 -0700
@@ -103,7 +103,7 @@ do_xip_mapping_read(struct address_space
 		 * before reading the page on the kernel side.
 		 */
 		if (mapping_writably_mapped(mapping))
-			flush_dcache_page(page);
+			flush_mapping_page(page);
 
 		/*
 		 * Ok, we have the page, so now we can copy it to user space...
@@ -347,7 +347,7 @@ __xip_file_write(struct file *filp, cons
 		copied = bytes -
 			__copy_from_user_inatomic_nocache(kaddr, buf, bytes);
 		kunmap_atomic(kaddr, KM_USER0);
-		flush_dcache_page(page);
+		flush_mapping_page(page);
 
 		if (likely(copied > 0)) {
 			status = copied;

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [34/37] Large blocksize support in ramfs
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (32 preceding siblings ...)
  2007-06-20 18:29 ` [33/37] Large blocksize: Compound page zeroing and flushing clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 20:50   ` Andreas Dilger
  2007-06-20 18:29 ` [35/37] Large blocksize support in XFS clameter
                   ` (2 subsequent siblings)
  36 siblings, 1 reply; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_filesystem_ramfs --]
[-- Type: text/plain, Size: 2058 bytes --]

The simplest file system to use for large blocksize support is ramfs.
Add a mount parameter that specifies the page order of the pages
that ramfs should use.

Note that ramfs does not use the lower layers (buffer I/O etc) so this
case is useful for initial testing of changes to large blocksize
support if one just wants to exercise the higher layers.

If you apply this patch then you can e.g. try this:

	mount -tramfs -o10 none /media

Mounts a ramfs filesystem with order 10 pages (4 MB each).

	cp linux-2.6.21-rc7.tar.gz /media

Populates the ramfs. Note that we allocate 14 pages of 4M each
instead of 13508 order 0 pages.

	umount /media

Gets rid of the large pages again

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/ramfs/inode.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/ramfs/inode.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ramfs/inode.c	2007-06-19 19:34:10.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ramfs/inode.c	2007-06-19 20:01:04.000000000 -0700
@@ -60,7 +60,8 @@ struct inode *ramfs_get_inode(struct sup
 		inode->i_blocks = 0;
 		inode->i_mapping->a_ops = &ramfs_aops;
 		inode->i_mapping->backing_dev_info = &ramfs_backing_dev_info;
-		mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+		mapping_setup(inode->i_mapping, GFP_HIGHUSER,
+				sb->s_blocksize_bits - PAGE_SHIFT);
 		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
 		switch (mode & S_IFMT) {
 		default:
@@ -164,10 +165,15 @@ static int ramfs_fill_super(struct super
 {
 	struct inode * inode;
 	struct dentry * root;
+	int order = 0;
+	char *options = data;
+
+	if (options && *options)
+		order = simple_strtoul(options, NULL, 10);
 
 	sb->s_maxbytes = MAX_LFS_FILESIZE;
-	sb->s_blocksize = PAGE_CACHE_SIZE;
-	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
+	sb->s_blocksize = PAGE_CACHE_SIZE << order;
+	sb->s_blocksize_bits = order + PAGE_CACHE_SHIFT;
 	sb->s_magic = RAMFS_MAGIC;
 	sb->s_op = &ramfs_ops;
 	sb->s_time_gran = 1;

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [35/37] Large blocksize support in XFS
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (33 preceding siblings ...)
  2007-06-20 18:29 ` [34/37] Large blocksize support in ramfs clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 18:29 ` [36/37] Large blocksize support for ext2 clameter
  2007-06-20 18:29 ` [37/37] Reiserfs: Fix up for mapping_set_gfp_mask clameter
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, Dave Chinner,
	William Lee Irwin III, Jens Axboe, Badari Pulavarty,
	Maxim Levitsky

[-- Attachment #1: vps_filesystem_xfs --]
[-- Type: text/plain, Size: 1072 bytes --]

From: David Chinner <dgc@sgi.com>

The only thing that needs to change to enable Large Block I/O is to remove
the check for a too large blocksize ;-)

Signed-off-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/xfs/xfs_mount.c |   13 -------------
 1 file changed, 13 deletions(-)

Index: linux-2.6.22-rc4-mm2/fs/xfs/xfs_mount.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/xfs/xfs_mount.c	2007-06-18 19:05:21.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/xfs/xfs_mount.c	2007-06-19 19:45:33.000000000 -0700
@@ -326,19 +326,6 @@ xfs_mount_validate_sb(
 		return XFS_ERROR(ENOSYS);
 	}
 
-	/*
-	 * Until this is fixed only page-sized or smaller data blocks work.
-	 */
-	if (unlikely(sbp->sb_blocksize > PAGE_SIZE)) {
-		xfs_fs_mount_cmn_err(flags,
-			"file system with blocksize %d bytes",
-			sbp->sb_blocksize);
-		xfs_fs_mount_cmn_err(flags,
-			"only pagesize (%ld) or less will currently work.",
-			PAGE_SIZE);
-		return XFS_ERROR(ENOSYS);
-	}
-
 	return 0;
 }
 

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [36/37] Large blocksize support for ext2
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (34 preceding siblings ...)
  2007-06-20 18:29 ` [35/37] Large blocksize support in XFS clameter
@ 2007-06-20 18:29 ` clameter
  2007-06-20 20:56   ` Andreas Dilger
  2007-06-20 18:29 ` [37/37] Reiserfs: Fix up for mapping_set_gfp_mask clameter
  36 siblings, 1 reply; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_filesystem_ext2 --]
[-- Type: text/plain, Size: 1195 bytes --]

This adds support for a block size of up to 64k on any platform.
It enables mounting filesystems that have a larger blocksize
than the page size.

F.e. the following is possible on x86_64 and i386 that have only a 4k page
size.

mke2fs -b 16384 /dev/hdd2	<Ignore warning about too large block size>

mount /dev/hdd2 /media
ls -l /media

.... Do more things with the volume, which uses a 16k page cache size on
a platform with a 4k page size.

Hmmm... Actually there is nothing additional to be done after the earlier
cleanup of the macros. So just modify copyright.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/ext2/inode.c |    3 +++
 1 file changed, 3 insertions(+)


Index: linux-2.6.22-rc4-mm2/fs/ext2/inode.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/ext2/inode.c	2007-06-19 19:40:56.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/ext2/inode.c	2007-06-19 19:41:56.000000000 -0700
@@ -20,6 +20,9 @@
  * 	(jj@sunsite.ms.mff.cuni.cz)
  *
  *  Assorted race fixes, rewrite of ext2_get_block() by Al Viro, 2000
+ *
+ *  (C) 2007 SGI.
+ *  Large blocksize support by Christoph Lameter
  */
 
 #include <linux/smp_lock.h>

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [37/37] Reiserfs: Fix up for mapping_set_gfp_mask
  2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
                   ` (35 preceding siblings ...)
  2007-06-20 18:29 ` [36/37] Large blocksize support for ext2 clameter
@ 2007-06-20 18:29 ` clameter
  36 siblings, 0 replies; 45+ messages in thread
From: clameter @ 2007-06-20 18:29 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel
  Cc: Christoph Hellwig, Mel Gorman, William Lee Irwin III,
	David Chinner, Jens Axboe, Badari Pulavarty, Maxim Levitsky

[-- Attachment #1: vps_filesystem_reiserfs --]
[-- Type: text/plain, Size: 1055 bytes --]

mapping_set_gfp_mask only works on order 0 page cache operations. Reiserfs
can use 8k pages (order 1). Replace the mapping_set_gfp_mask with
mapping_setup to make this work properly.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/reiserfs/xattr.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c
===================================================================
--- linux-2.6.22-rc4-mm2.orig/fs/reiserfs/xattr.c	2007-06-19 23:54:38.000000000 -0700
+++ linux-2.6.22-rc4-mm2/fs/reiserfs/xattr.c	2007-06-19 23:56:40.000000000 -0700
@@ -405,9 +405,10 @@ static struct page *reiserfs_get_page(st
 {
 	struct address_space *mapping = dir->i_mapping;
 	struct page *page;
+
 	/* We can deadlock if we try to free dentries,
 	   and an unlink/rmdir has just occured - GFP_NOFS avoids this */
-	mapping_set_gfp_mask(mapping, GFP_NOFS);
+	mapping_setup(mapping, GFP_NOFS, mapping_order(mapping));
 	page = read_mapping_page(mapping, n, NULL);
 	if (!IS_ERR(page)) {
 		kmap(page);

-- 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [34/37] Large blocksize support in ramfs
  2007-06-20 18:29 ` [34/37] Large blocksize support in ramfs clameter
@ 2007-06-20 20:50   ` Andreas Dilger
  2007-06-20 21:29     ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Andreas Dilger @ 2007-06-20 20:50 UTC (permalink / raw)
  To: clameter
  Cc: linux-fsdevel, linux-kernel, Christoph Hellwig, Mel Gorman,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky

On Jun 20, 2007  11:29 -0700, clameter@sgi.com wrote:
> If you apply this patch and then you can f.e. try this:
> 
> 	mount -tramfs -o10 none /media

> @@ -164,10 +165,15 @@ static int ramfs_fill_super(struct super
> +	if (options && *options)
> +		order = simple_strtoul(options, NULL, 10);

This is probably a bad name for a mount option.  What about "order=10"?
Otherwise you prevent any other option from being used in the future.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [36/37] Large blocksize support for ext2
  2007-06-20 18:29 ` [36/37] Large blocksize support for ext2 clameter
@ 2007-06-20 20:56   ` Andreas Dilger
  2007-06-20 21:27     ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Andreas Dilger @ 2007-06-20 20:56 UTC (permalink / raw)
  To: clameter
  Cc: linux-fsdevel, linux-kernel, Christoph Hellwig, Mel Gorman,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky, linux-ext4

On Jun 20, 2007  11:29 -0700, clameter@sgi.com wrote:
> This adds support for a block size of up to 64k on any platform.
> It enables the mounting filesystems that have a larger blocksize
> than the page size.

Might have been good to CC the ext2/3/4 maintainers here?  I definitely
have been waiting for a patch like this for ages (so definitely no
objection from me), but there are a few caveats before this will work
on ext2/3/4.

> Hmmm... Actually there is nothing additional to be done after the earlier
> cleanup of the macros. So just modify copyright.

It is NOT possible to have 64kB blocksize on ext2/3/4 without some small
changes to the directory handling code.  The reason is that an empty 64kB
directory block would have a rec_len == (__u16)2^16 == 0, and this would
cause an error to be hit in the filesystem.  What is needed is to put
2 empty records in such a directory, or to special-case an impossible
value like rec_len = 0xffff to handle this.
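
As an illustration of the wrap: the on-disk record length is a 16-bit
field, so a single empty record spanning a 64kB block cannot be stored:

	__u16 rec_len = 65536;	/* truncates to (__u16)0 */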

There was a patch to fix the 64kB blocksize directory problem, but it
hasn't been merged anywhere yet seeing as there wasn't previously a
patch to allow larger blocksize...

Having 32kB blocksize has no problems that I'm aware of.  Also, I'm not
sure how it happened, but ext2 SHOULD have an explicit check (as
ext3/4 does) limiting it to EXT2_MAX_BLOCK_SIZE.  Otherwise it appears
that there would be no error reported if the superblock reports e.g. 16MB
blocksize, and all kinds of things would break.

There shouldn't be a problem with increasing EXT{2,3,4}_MAX_BLOCK_SIZE to
32kB (AFAIK), but I haven't looked into this in a while.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [36/37] Large blocksize support for ext2
  2007-06-20 20:56   ` Andreas Dilger
@ 2007-06-20 21:27     ` Christoph Lameter
  2007-06-20 22:19       ` Andreas Dilger
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2007-06-20 21:27 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: linux-fsdevel, linux-kernel, Christoph Hellwig, Mel Gorman,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky, linux-ext4

On Wed, 20 Jun 2007, Andreas Dilger wrote:

> On Jun 20, 2007  11:29 -0700, clameter@sgi.com wrote:
> > This adds support for a block size of up to 64k on any platform.
> > It enables the mounting filesystems that have a larger blocksize
> > than the page size.
> 
> Might have been good to CC the ext2/3/4 maintainers here?  I definitely
> have been waiting for a patch like this for ages (so definitely no
> objection from me), but there are a few caveats before this will work
> on ext2/3/4.

The CC list is already big so I thought they would be monitoring
linux-fsdevel.

> > Hmmm... Actually there is nothing additional to be done after the earlier
> > cleanup of the macros. So just modify copyright.
> 
> It is NOT possible to have 64kB blocksize on ext2/3/4 without some small
> changes to the directory handling code.  The reason is that an empty 64kB
> directory block would have a rec_len == (__u16)2^16 == 0, and this would
> cause an error to be hit in the filesystem.  What is needed is to put
> 2 empty records in such a directory, or to special-case an impossible
> value like rec_len = 0xffff to handle this.
> 
> There was a patch to fix the 64kB blocksize directory problem, but it
> hasn't been merged anywhere yet seeing as there wasn't previously a
> patch to allow larger blocksize...

mke2fs allows specifying a 64kb blocksize and IA64 can run with a 64kb
PAGE_SIZE. So this is a bug in ext2fs that needs to be fixed regardless.

> Having 32kB blocksize has no problems that I'm aware of.  Also, I'm not
> sure how it happened, but ext2 SHOULD have an explicit check (as
> ext3/4 does) limiting it to EXT2_MAX_BLOCK_SIZE.  Otherwise it appears
> that there would be no error reported if the superblock reports e.g. 16MB
> blocksize, and all kinds of things would break.

mke2fs fails for blocksizes > 64k so you are safe there. I'd like to see 
that limit lifted?

> There shouldn't be a problem with increasing EXT{2,3,4}_MAX_BLOCK_SIZE to
> 32kB (AFAIK), but I haven't looked into this in a while.

I'd love to see such a patch. That is also useful for arches that have 
PAGE_SIZE > 4kb without this patchset.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [34/37] Large blocksize support in ramfs
  2007-06-20 20:50   ` Andreas Dilger
@ 2007-06-20 21:29     ` Christoph Lameter
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Lameter @ 2007-06-20 21:29 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: linux-fsdevel, linux-kernel, Christoph Hellwig, Mel Gorman,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky

On Wed, 20 Jun 2007, Andreas Dilger wrote:

> On Jun 20, 2007  11:29 -0700, clameter@sgi.com wrote:
> > If you apply this patch and then you can f.e. try this:
> > 
> > 	mount -tramfs -o10 none /media
> 
> > @@ -164,10 +165,15 @@ static int ramfs_fill_super(struct super
> > +	if (options && *options)
> > +		order = simple_strtoul(options, NULL, 10);
> 
> This is probably a bad name for a mount option.  What about "order=10"?
> Otherwise you prevent any other option from being used in the future.

I tried to make it as simple as possible. The patch is primarily useful as
a debugging aid since it eliminates the lower layers from the game. I
think ramfs should be left as is since it is intended as a minimal
implementation that should stay simple.

If we really want such an option for good then it may be best added to
the shmem or ramdisk drivers?


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [36/37] Large blocksize support for ext2
  2007-06-20 21:27     ` Christoph Lameter
@ 2007-06-20 22:19       ` Andreas Dilger
  0 siblings, 0 replies; 45+ messages in thread
From: Andreas Dilger @ 2007-06-20 22:19 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-fsdevel, linux-kernel, Christoph Hellwig, Mel Gorman,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky, linux-ext4

On Jun 20, 2007  14:27 -0700, Christoph Lameter wrote:
> > > Hmmm... Actually there is nothing additional to be done after the earlier
> > > cleanup of the macros. So just modify copyright.
> > 
> > It is NOT possible to have 64kB blocksize on ext2/3/4 without some small
> > changes to the directory handling code.  The reason is that an empty 64kB
> > directory block would have a rec_len == (__u16)2^16 == 0, and this would
> > cause an error to be hit in the filesystem.  What is needed is to put
> > 2 empty records in such a directory, or to special-case an impossible
> > value like rec_len = 0xffff to handle this.
> > 
> > There was a patch to fix the 64kB blocksize directory problem, but it
> > hasn't been merged anywhere yet seeing as there wasn't previously a
> > patch to allow larger blocksize...
> 
> mke2fs allows specifying a 64kb blocksize and IA64 can run with a 64kb
> PAGE_SIZE. So this is a bug in ext2fs that needs to be fixed regardless.

True.  I had increased the e2fsprogs blocksize to 16kB after testing it,
and it seems Ted increased it to 64kB after that.  The 64kB
directory problem only came out recently.

> > Having 32kB blocksize has no problems that I'm aware of.  Also, I'm not
> > sure how it happened, but ext2 SHOULD have an explicit check (as
> > ext3/4 does) limiting it to EXT2_MAX_BLOCK_SIZE.  Otherwise it appears
> > that there would be no error reported if the superblock reports e.g. 16MB
> > blocksize, and all kinds of things would break.
> 
> mke2fs fails for blocksizes > 64k so you are safe there. I'd like to see 
> that limit lifted?

I don't think extN can go past a 64kB blocksize in any case.

> > There shouldn't be a problem with increasing EXT{2,3,4}_MAX_BLOCK_SIZE to
> > 32kB (AFAIK), but I haven't looked into this in a while.
> 
> I'd love to see such a patch. That is also useful for arches that have 
> PAGE_SIZE > 4kb without this patchset.

Definitely, which is why we had been working on this originally.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [31/37] Large blocksize support: Core piece
  2007-06-20 18:29 ` [31/37] Large blocksize support: Core piece clameter
@ 2007-06-21  0:20   ` Bob Picco
  2007-06-21  5:26     ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Bob Picco @ 2007-06-21  0:20 UTC (permalink / raw)
  To: clameter
  Cc: linux-fsdevel, linux-kernel, Christoph Hellwig, Mel Gorman,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky

Christoph Lameter wrote:	[Wed Jun 20 2007, 02:29:38PM EDT]
> Provide an alternate definition for the page_cache_xxx(mapping, ...)
> functions that can determine the current page size from the mapping
> and generate the appropriate shifts, sizes and mask for the page cache
> operations. Change the basic functions that allocate pages for the
> page cache to be able to handle higher order allocations.
> 
> Provide a new function
> 
> mapping_setup(struct address_space *, gfp_t mask, int order)
> 
> that allows the setup of a mapping of any compound page order.
> 
> mapping_set_gfp_mask() is still provided but it sets mappings to order 0.
> Calls to mapping_set_gfp_mask() must be converted to mapping_setup() in
> order for the filesystem to be able to use larger pages. For some key block
> devices and filesystems the conversion is done here.
> 
> mapping_setup() for higher order is only allowed if the mapping does not
> use DMA mappings or HIGHMEM since we do not support bouncing at the moment.
> Thus we currently BUG() on DMA mappings and clear the highmem bit of higher
> order mappings.
> 
> Modify the set_blocksize() function so that an arbitrary blocksize can be set.
> Blocksizes up to PAGE_SIZE << MAX_ORDER can be set. This is typically 8MB on
> many platforms (order 11 with 4k base pages). Typically file systems are not
> only limited by the core VM but also by their internal data structures. The
> core VM limitations fall away with this patch. The functionality provided here
> can do nothing about the internal limitations of filesystems.
> 
> Known internal limitations:
> 
> Ext2		64k
> XFS		64k
> Reiserfs	8k
> Ext3		4k
> Ext4		4k
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
> 
> ---
>  block/Kconfig               |   17 ++++++
>  drivers/block/rd.c          |    6 +-
>  fs/block_dev.c              |   29 +++++++----
>  fs/buffer.c                 |    2 
>  fs/inode.c                  |    7 +-
>  fs/xfs/linux-2.6/xfs_buf.c  |    3 -
>  include/linux/buffer_head.h |   12 ++++
>  include/linux/fs.h          |    5 +
>  include/linux/pagemap.h     |  116 +++++++++++++++++++++++++++++++++++++++++---
>  mm/filemap.c                |   17 ++++--
>  10 files changed, 186 insertions(+), 28 deletions(-)
> 
> Index: linux-2.6.22-rc4-mm2/include/linux/pagemap.h
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/include/linux/pagemap.h	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/include/linux/pagemap.h	2007-06-19 23:50:55.000000000 -0700
> @@ -39,10 +39,30 @@ static inline gfp_t mapping_gfp_mask(str
>   * This is non-atomic.  Only to be used before the mapping is activated.
>   * Probably needs a barrier...
>   */
> -static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
> +static inline void mapping_setup(struct address_space *m,
> +					gfp_t mask, int order)
>  {
>  	m->flags = (m->flags & ~(__force unsigned long)__GFP_BITS_MASK) |
>  				(__force unsigned long)mask;
> +
> +#ifdef CONFIG_LARGE_BLOCKSIZE
> +	m->order = order;
> +	m->shift = order + PAGE_SHIFT;
> +	m->offset_mask = (PAGE_SIZE << order) - 1;
> +	if (order) {
> +		/*
> +		 * Bouncing is not supported. Requests for DMA
> +		 * memory will not work
> +		 */
> +		BUG_ON(m->flags & (__GFP_DMA|__GFP_DMA32));
> +		/*
> +		 * Bouncing not supported. We cannot use HIGHMEM
> +		 */
> +		m->flags &= ~__GFP_HIGHMEM;
> +		m->flags |= __GFP_COMP;
> +		raise_kswapd_order(order);
> +	}
> +#endif
>  }
>  
>  /*
> @@ -62,6 +82,78 @@ static inline void mapping_set_gfp_mask(
>  #define PAGE_CACHE_ALIGN(addr)	(((addr)+PAGE_CACHE_SIZE-1)&PAGE_CACHE_MASK)
>  
>  /*
> + * The next set of functions allow to write code that is capable of dealing
> + * with multiple page sizes.
> + */
> +#ifdef CONFIG_LARGE_BLOCKSIZE
> +/*
> + * Determine page order from the blkbits in the inode structure
> + */
> +static inline int page_cache_blkbits_to_order(int shift)
> +{
> +	BUG_ON(shift < 9);
> +
> +	if (shift < PAGE_SHIFT)
> +		return 0;
> +
> +	return shift - PAGE_SHIFT;
> +}
> +
> +/*
> + * Determine page order from a given blocksize
> + */
> +static inline int page_cache_blocksize_to_order(unsigned long size)
> +{
> +	return page_cache_blkbits_to_order(ilog2(size));
> +}
> +
> +static inline int mapping_order(struct address_space *a)
> +{
> +	return a->order;
> +}
> +
> +static inline int page_cache_shift(struct address_space *a)
> +{
> +	return a->shift;
> +}
> +
> +static inline unsigned int page_cache_size(struct address_space *a)
> +{
> +	return a->offset_mask + 1;
> +}
> +
> +static inline loff_t page_cache_mask(struct address_space *a)
> +{
> +	return ~a->offset_mask;
> +}
> +
> +static inline unsigned int page_cache_offset(struct address_space *a,
> +		loff_t pos)
> +{
> +	return pos & a->offset_mask;
> +}
> +#else
> +/*
> + * Kernel configured for a fixed PAGE_SIZEd page cache
> + */
> +static inline int page_cache_blkbits_to_order(int shift)
> +{
> +	if (shift < 9)
> +		return -EINVAL;
> +	if (shift > PAGE_SHIFT)
> +		return -EINVAL;
> +	return 0;
> +}
> +
> +static inline int page_cache_blocksize_to_order(unsigned long size)
> +{
> +	if (size >= 512 && size <= PAGE_SIZE)
> +		return 0;
> +
> +	return -EINVAL;
> +}
> +
> +/*
>   * Functions that are currently setup for a fixed PAGE_SIZEd. The use of
>   * these will allow a variable page size pagecache in the future.
>   */
> @@ -90,6 +182,7 @@ static inline unsigned int page_cache_of
>  {
>  	return pos & ~PAGE_MASK;
>  }
> +#endif
>  
>  static inline pgoff_t page_cache_index(struct address_space *a,
>  		loff_t pos)
> @@ -112,27 +205,38 @@ static inline loff_t page_cache_pos(stru
>  	return ((loff_t)index << page_cache_shift(a)) + offset;
>  }
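With the order-4 example (shift 16), index, offset and pos round-trip as
expected. A quick user-space check of the arithmetic, assuming a 64k mapping:

	#include <stdio.h>

	int main(void)
	{
		unsigned int shift = 16;		/* assumed 64k mapping */
		unsigned long long mask = (1ULL << shift) - 1;
		unsigned long long pos = 200000;	/* arbitrary file offset */
		unsigned long index = pos >> shift;	/* page_cache_index: 3 */
		unsigned int offset = pos & mask;	/* page_cache_offset: 3392 */

		/* page_cache_pos reassembles the original offset: 200000 */
		printf("%llu\n", ((unsigned long long)index << shift) + offset);
		return 0;
	}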
>  
> +/*
> + * Legacy function. Only supports order 0 pages.
> + */
> +static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
> +{
> +	if (mapping_order(m))
> +		printk(KERN_ERR "mapping_setup(%p, %x, %d)\n", m, mask, mapping_order(m));
> +	mapping_setup(m, mask, 0);
> +}
> +
>  #define page_cache_get(page)		get_page(page)
>  #define page_cache_release(page)	put_page(page)
>  void release_pages(struct page **pages, int nr, int cold);
>  
>  #ifdef CONFIG_NUMA
> -extern struct page *__page_cache_alloc(gfp_t gfp);
> +extern struct page *__page_cache_alloc(gfp_t gfp, int);
>  #else
> -static inline struct page *__page_cache_alloc(gfp_t gfp)
> +static inline struct page *__page_cache_alloc(gfp_t gfp, int order)
>  {
> -	return alloc_pages(gfp, 0);
> +	return alloc_pages(gfp, order);
>  }
>  #endif
>  
>  static inline struct page *page_cache_alloc(struct address_space *x)
>  {
> -	return __page_cache_alloc(mapping_gfp_mask(x));
> +	return __page_cache_alloc(mapping_gfp_mask(x), mapping_order(x));
>  }
>  
>  static inline struct page *page_cache_alloc_cold(struct address_space *x)
>  {
> -	return __page_cache_alloc(mapping_gfp_mask(x)|__GFP_COLD);
> +	return __page_cache_alloc(mapping_gfp_mask(x)|__GFP_COLD,
> +				mapping_order(x));
>  }
>  
>  typedef int filler_t(void *, struct page *);
> Index: linux-2.6.22-rc4-mm2/include/linux/fs.h
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/include/linux/fs.h	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/include/linux/fs.h	2007-06-19 23:33:45.000000000 -0700
> @@ -519,6 +519,11 @@ struct address_space {
>  	spinlock_t		i_mmap_lock;	/* protect tree, count, list */
>  	unsigned int		truncate_count;	/* Cover race condition with truncate */
>  	unsigned long		nrpages;	/* number of total pages */
> +#ifdef CONFIG_LARGE_BLOCKSIZE
> +	loff_t			offset_mask;	/* Mask to get to offset bits */
> +	unsigned int		order;		/* Page order of the pages in here */
> +	unsigned int		shift;		/* Shift of index */
> +#endif
>  	pgoff_t			writeback_index;/* writeback starts here */
>  	const struct address_space_operations *a_ops;	/* methods */
>  	unsigned long		flags;		/* error bits/gfp mask */
> Index: linux-2.6.22-rc4-mm2/mm/filemap.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/mm/filemap.c	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/mm/filemap.c	2007-06-20 00:45:36.000000000 -0700
> @@ -472,13 +472,13 @@ int add_to_page_cache_lru(struct page *p
>  }
>  
>  #ifdef CONFIG_NUMA
> -struct page *__page_cache_alloc(gfp_t gfp)
> +struct page *__page_cache_alloc(gfp_t gfp, int order)
>  {
>  	if (cpuset_do_page_mem_spread()) {
>  		int n = cpuset_mem_spread_node();
> -		return alloc_pages_node(n, gfp, 0);
> +		return alloc_pages_node(n, gfp, order);
>  	}
> -	return alloc_pages(gfp, 0);
> +	return alloc_pages(gfp, order);
>  }
>  EXPORT_SYMBOL(__page_cache_alloc);
>  #endif
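The only behavioral change in this hunk is that the mapping's order now
reaches the buddy allocator, so an order-4 mapping gets one contiguous 64k
compound page per page-cache slot. Sketch of the assumed call path
(annotation, not from the patch):

	/*
	 * page_cache_alloc(mapping)
	 *   -> __page_cache_alloc(mapping_gfp_mask(mapping),
	 *                         mapping_order(mapping))
	 *        -> alloc_pages(gfp, order)
	 *
	 * For order > 0 the gfp mask already carries __GFP_COMP,
	 * since mapping_setup() set it in the mapping flags.
	 */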
> @@ -677,7 +677,7 @@ struct page *find_or_create_page(struct 
>  repeat:
>  	page = find_lock_page(mapping, index);
>  	if (!page) {
> -		page = __page_cache_alloc(gfp_mask);
> +		page = __page_cache_alloc(gfp_mask, mapping_order(mapping));
>  		if (!page)
>  			return NULL;
>  		err = add_to_page_cache_lru(page, mapping, index, gfp_mask);
> @@ -815,7 +815,8 @@ grab_cache_page_nowait(struct address_sp
>  		page_cache_release(page);
>  		return NULL;
>  	}
> -	page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS);
> +	page = __page_cache_alloc(mapping_gfp_mask(mapping) & ~__GFP_FS,
> +				mapping_order(mapping));
>  	if (page && add_to_page_cache_lru(page, mapping, index, GFP_KERNEL)) {
>  		page_cache_release(page);
>  		page = NULL;
> @@ -1536,6 +1537,12 @@ int generic_file_mmap(struct file * file
>  {
>  	struct address_space *mapping = file->f_mapping;
>  
> +	/*
> +	 * Forbid mmap access to higher order mappings.
> +	 */
> +	if (mapping_order(mapping))
> +		return -ENOSYS;
> +
>  	if (!mapping->a_ops->readpage)
>  		return -ENOEXEC;
>  	file_accessed(file);
> Index: linux-2.6.22-rc4-mm2/block/Kconfig
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/block/Kconfig	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/block/Kconfig	2007-06-19 23:33:45.000000000 -0700
> @@ -49,6 +49,23 @@ config LSF
>  
>  	  If unsure, say Y.
>  
> +#
> +# The functions to switch on larger pages in a filesystem will return an error
> +# if the gfp flags for a mapping require only DMA pages. Highmem will always
> +# be switched off for higher order mappings.
> +#
> +config LARGE_BLOCKSIZE
> +	bool "Support blocksizes larger than page size"
> +	default n
> +	depends on EXPERIMENTAL
> +	help
> +	  Allows the page cache to support higher orders of pages. Higher
> +	  order page cache pages may be useful to support special devices
> +	  like CDs, DVDs and flash, and to increase I/O performance.
> +	  WARNING: This functionality may have significant memory
> +	  requirements. It is not advisable to enable this in configurations
> +	  where ZONE_NORMAL is smaller than 1 Gigabyte.
> +
>  endif # BLOCK
>  
>  source block/Kconfig.iosched
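For anyone trying the patchset, enabling the option would presumably look
like any other experimental Kconfig switch. A hypothetical .config fragment:

	CONFIG_EXPERIMENTAL=y
	CONFIG_BLOCK=y
	CONFIG_LARGE_BLOCKSIZE=y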
> Index: linux-2.6.22-rc4-mm2/fs/block_dev.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/fs/block_dev.c	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/fs/block_dev.c	2007-06-19 23:50:01.000000000 -0700
> @@ -65,36 +65,46 @@ static void kill_bdev(struct block_devic
>  		return;
>  	invalidate_bh_lrus();
>  	truncate_inode_pages(bdev->bd_inode->i_mapping, 0);
> -}	
> +}
>  
>  int set_blocksize(struct block_device *bdev, int size)
>  {
> -	/* Size must be a power of two, and between 512 and PAGE_SIZE */
> -	if (size > PAGE_SIZE || size < 512 || !is_power_of_2(size))
> +	int order;
> +
> +	if (size > (PAGE_SIZE << MAX_ORDER) || size < 512 ||
> +						!is_power_of_2(size))
I think this should be:
	if (size > (MAX_ORDER_NR_PAGES << PAGE_SHIFT) ... 
	or
	if (size > (PAGE_SIZE << (MAX_ORDER - 1)) ...
bob
>  		return -EINVAL;
>  
>  	/* Size cannot be smaller than the size supported by the device */
>  	if (size < bdev_hardsect_size(bdev))
>  		return -EINVAL;
>  
> +	order = page_cache_blocksize_to_order(size);
> +
>  	/* Don't change the size if it is same as current */
>  	if (bdev->bd_block_size != size) {
> +		int bits = blksize_bits(size);
> +		struct address_space *mapping =
> +			bdev->bd_inode->i_mapping;
> +
>  		sync_blockdev(bdev);
> -		bdev->bd_block_size = size;
> -		bdev->bd_inode->i_blkbits = blksize_bits(size);
>  		kill_bdev(bdev);
> +		bdev->bd_block_size = size;
> +		bdev->bd_inode->i_blkbits = bits;
> +		mapping_setup(mapping, GFP_NOFS, order);
>  	}
>  	return 0;
>  }
> -
>  EXPORT_SYMBOL(set_blocksize);
>  
>  int sb_set_blocksize(struct super_block *sb, int size)
>  {
>  	if (set_blocksize(sb->s_bdev, size))
>  		return 0;
> -	/* If we get here, we know size is power of two
> -	 * and it's value is between 512 and PAGE_SIZE */
> +	/*
> +	 * If we get here, we know size is a power of two
> +	 * and its value is valid for the page cache
> +	 */
>  	sb->s_blocksize = size;
>  	sb->s_blocksize_bits = blksize_bits(size);
>  	return sb->s_blocksize;
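As a usage sketch: with this change a filesystem's mount path can request a
large blocksize exactly as it requests a small one today. Hypothetical
fill_super fragment (names and error handling abbreviated):

	/*
	 * Ask the page cache for 64k blocks; sb_set_blocksize()
	 * returns 0 if the device or the page cache cannot do it.
	 */
	if (!sb_set_blocksize(sb, 65536))
		return -EINVAL;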
> @@ -588,7 +598,8 @@ struct block_device *bdget(dev_t dev)
>  		inode->i_rdev = dev;
>  		inode->i_bdev = bdev;
>  		inode->i_data.a_ops = &def_blk_aops;
> -		mapping_set_gfp_mask(&inode->i_data, GFP_USER);
> +		mapping_setup(&inode->i_data, GFP_USER,
> +			page_cache_blkbits_to_order(inode->i_blkbits));
>  		inode->i_data.backing_dev_info = &default_backing_dev_info;
>  		spin_lock(&bdev_lock);
>  		list_add(&bdev->bd_list, &all_bdevs);
> Index: linux-2.6.22-rc4-mm2/fs/buffer.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/fs/buffer.c	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/fs/buffer.c	2007-06-20 00:16:41.000000000 -0700
> @@ -1098,7 +1098,7 @@ __getblk_slow(struct block_device *bdev,
>  {
>  	/* Size must be multiple of hard sectorsize */
>  	if (unlikely(size & (bdev_hardsect_size(bdev)-1) ||
> -			(size < 512 || size > PAGE_SIZE))) {
> +		size < 512 || size > (PAGE_SIZE << MAX_ORDER))) {
>  		printk(KERN_ERR "getblk(): invalid block size %d requested\n",
>  					size);
>  		printk(KERN_ERR "hardsect size: %d\n",
> Index: linux-2.6.22-rc4-mm2/fs/inode.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/fs/inode.c	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/fs/inode.c	2007-06-19 23:53:56.000000000 -0700
> @@ -145,7 +145,8 @@ static struct inode *alloc_inode(struct 
>  		mapping->a_ops = &empty_aops;
>   		mapping->host = inode;
>  		mapping->flags = 0;
> -		mapping_set_gfp_mask(mapping, GFP_HIGHUSER_PAGECACHE);
> +		mapping_setup(mapping, GFP_HIGHUSER_PAGECACHE,
> +				page_cache_blkbits_to_order(inode->i_blkbits));
>  		mapping->assoc_mapping = NULL;
>  		mapping->backing_dev_info = &default_backing_dev_info;
>  
> @@ -243,7 +244,7 @@ void clear_inode(struct inode *inode)
>  {
>  	might_sleep();
>  	invalidate_inode_buffers(inode);
> -       
> +
>  	BUG_ON(inode->i_data.nrpages);
>  	BUG_ON(!(inode->i_state & I_FREEING));
>  	BUG_ON(inode->i_state & I_CLEAR);
> @@ -528,7 +529,7 @@ repeat:
>   *	for allocations related to inode->i_mapping is GFP_HIGHUSER_PAGECACHE.
>   *	If HIGHMEM pages are unsuitable or it is known that pages allocated
>   *	for the page cache are not reclaimable or migratable,
> - *	mapping_set_gfp_mask() must be called with suitable flags on the
> + *	mapping_setup() must be called with a suitable gfp mask and order
>   *	newly created inode's mapping
>   *
>   */
> Index: linux-2.6.22-rc4-mm2/drivers/block/rd.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/drivers/block/rd.c	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/drivers/block/rd.c	2007-06-20 00:35:55.000000000 -0700
> @@ -121,7 +121,8 @@ static void make_page_uptodate(struct pa
>  			}
>  		} while ((bh = bh->b_this_page) != head);
>  	} else {
> -		memset(page_address(page), 0, page_cache_size(page_mapping(page)));
> +		memset(page_address(page), 0,
> +			page_cache_size(page_mapping(page)));
>  	}
>  	flush_dcache_page(page);
>  	SetPageUptodate(page);
> @@ -380,7 +381,8 @@ static int rd_open(struct inode *inode, 
>  		gfp_mask = mapping_gfp_mask(mapping);
>  		gfp_mask &= ~(__GFP_FS|__GFP_IO);
>  		gfp_mask |= __GFP_HIGH;
> -		mapping_set_gfp_mask(mapping, gfp_mask);
> +		mapping_setup(mapping, gfp_mask,
> +			page_cache_blkbits_to_order(inode->i_blkbits));
>  	}
>  
>  	return 0;
> Index: linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_buf.c
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/fs/xfs/linux-2.6/xfs_buf.c	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/fs/xfs/linux-2.6/xfs_buf.c	2007-06-19 23:33:45.000000000 -0700
> @@ -1558,7 +1558,8 @@ xfs_mapping_buftarg(
>  	mapping = &inode->i_data;
>  	mapping->a_ops = &mapping_aops;
>  	mapping->backing_dev_info = bdi;
> -	mapping_set_gfp_mask(mapping, GFP_NOFS);
> +	mapping_setup(mapping, GFP_NOFS,
> +		page_cache_blkbits_to_order(inode->i_blkbits));
>  	btp->bt_mapping = mapping;
>  	return 0;
>  }
> Index: linux-2.6.22-rc4-mm2/include/linux/buffer_head.h
> ===================================================================
> --- linux-2.6.22-rc4-mm2.orig/include/linux/buffer_head.h	2007-06-19 23:33:44.000000000 -0700
> +++ linux-2.6.22-rc4-mm2/include/linux/buffer_head.h	2007-06-19 23:33:45.000000000 -0700
> @@ -129,7 +129,17 @@ BUFFER_FNS(Ordered, ordered)
>  BUFFER_FNS(Eopnotsupp, eopnotsupp)
>  BUFFER_FNS(Unwritten, unwritten)
>  
> -#define bh_offset(bh)		((unsigned long)(bh)->b_data & ~PAGE_MASK)
> +static inline unsigned long bh_offset(struct buffer_head *bh)
> +{
> +	/*
> +	 * No mapping available. Use page struct to obtain
> +	 * order.
> +	 */
> +	unsigned long mask = compound_size(bh->b_page) - 1;
> +
> +	return (unsigned long)bh->b_data & mask;
> +}
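The switch from a fixed ~PAGE_MASK to a mask derived from compound_size()
matters once b_data can sit more than PAGE_SIZE into a compound page. Worked
example, assuming 4k base pages and an order-2 (16k) compound page with a
buffer at byte 5000:

	old: 5000 & ~PAGE_MASK  = 5000 & 0x0fff = 904    (wrong)
	new: 5000 & (16384 - 1) = 5000 & 0x3fff = 5000   (right)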
> +
>  #define touch_buffer(bh)	mark_page_accessed(bh->b_page)
>  
>  /* If we *know* page->private refers to buffer_heads */
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [31/37] Large blocksize support: Core piece
  2007-06-21  0:20   ` Bob Picco
@ 2007-06-21  5:26     ` Christoph Lameter
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Lameter @ 2007-06-21  5:26 UTC (permalink / raw)
  To: Bob Picco
  Cc: linux-fsdevel, linux-kernel, Christoph Hellwig, Mel Gorman,
	William Lee Irwin III, David Chinner, Jens Axboe,
	Badari Pulavarty, Maxim Levitsky

On Wed, 20 Jun 2007, Bob Picco wrote:

> > +	if (size > (PAGE_SIZE << MAX_ORDER) || size < 512 ||
> > +						!is_power_of_2(size))
> I think this should be:
> 	if (size > (MAX_ORDER_NR_PAGES << PAGE_SHIFT) ... 
> 	or
> 	if (size > (PAGE_SIZE << (MAX_ORDER - 1)) ...
> bob
> >  		return -EINVAL;


Right.... MAX_ORDER is the first illegal order, not the last usable one.
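Spelled out for 4k pages with MAX_ORDER = 11:

	PAGE_SIZE << MAX_ORDER       = 4096 << 11 = 8MB   (first illegal size)
	PAGE_SIZE << (MAX_ORDER - 1) = 4096 << 10 = 4MB   (largest legal size)

so the posted check is off by one order and would accept an 8MB blocksize.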


^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2007-06-21  5:26 UTC | newest]

Thread overview: 45+ messages
2007-06-20 18:29 [00/37] Large Blocksize Support V4 clameter
2007-06-20 18:29 ` [01/37] Define functions for page cache handling clameter
2007-06-20 18:29 ` [02/37] Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user clameter
2007-06-20 18:29 ` [03/37] Use page_cache_xxx function in mm/filemap.c clameter
2007-06-20 18:29 ` [04/37] Use page_cache_xxx in mm/page-writeback.c clameter
2007-06-20 18:29 ` [05/37] Use page_cache_xxx in mm/truncate.c clameter
2007-06-20 18:29 ` [06/37] Use page_cache_xxx in mm/rmap.c clameter
2007-06-20 18:29 ` [07/37] Use page_cache_xxx in mm/filemap_xip.c clameter
2007-06-20 18:29 ` [08/37] Use page_cache_xxx in mm/migrate.c clameter
2007-06-20 18:29 ` [09/37] Use page_cache_xxx in fs/libfs.c clameter
2007-06-20 18:29 ` [10/37] Use page_cache_xxx in fs/sync clameter
2007-06-20 18:29 ` [11/37] Use page_cache_xxx in fs/buffer.c clameter
2007-06-20 18:29 ` [12/37] Use page_cache_xxx in mm/mpage.c clameter
2007-06-20 18:29 ` [13/37] Use page_cache_xxx in mm/fadvise.c clameter
2007-06-20 18:29 ` [14/37] Use page_cache_xxx in fs/splice.c clameter
2007-06-20 18:29 ` [15/37] Use page_cache_xxx functions in fs/ext2 clameter
2007-06-20 18:29 ` [16/37] Use page_cache_xxx in fs/ext3 clameter
2007-06-20 18:29 ` [17/37] Use page_cache_xxx in fs/ext4 clameter
2007-06-20 18:29 ` [18/37] Use page_cache_xxx in fs/reiserfs clameter
2007-06-20 18:29 ` [19/37] Use page_cache_xxx for fs/xfs clameter
2007-06-20 18:29 ` [20/37] Fix PAGE SIZE assumption in miscellaneous places clameter
2007-06-20 18:29 ` [21/37] Use page_cache_xxx in drivers/block/loop.c clameter
2007-06-20 18:29 ` [22/37] Use page_cache_xxx in drivers/block/rd.c clameter
2007-06-20 18:29 ` [23/37] compound pages: PageHead/PageTail instead of PageCompound clameter
2007-06-20 18:29 ` [24/37] compound pages: Add new support functions clameter
2007-06-20 18:29 ` [25/37] compound pages: vmstat support clameter
2007-06-20 18:29 ` [26/37] compound pages: Use new compound vmstat functions in SLUB clameter
2007-06-20 18:29 ` [27/37] compound pages: Allow use of get_page_unless_zero with compound pages clameter
2007-06-20 18:29 ` [28/37] compound pages: Allow freeing of compound pages via pagevec clameter
2007-06-20 18:29 ` [29/37] Large blocksize support: Fix up reclaim counters clameter
2007-06-20 18:29 ` [30/37] Add VM_BUG_ONs to check for correct page order clameter
2007-06-20 18:29 ` [31/37] Large blocksize support: Core piece clameter
2007-06-21  0:20   ` Bob Picco
2007-06-21  5:26     ` Christoph Lameter
2007-06-20 18:29 ` [32/37] Readahead changes to support large blocksize clameter
2007-06-20 18:29 ` [33/37] Large blocksize: Compound page zeroing and flushing clameter
2007-06-20 18:29 ` [34/37] Large blocksize support in ramfs clameter
2007-06-20 20:50   ` Andreas Dilger
2007-06-20 21:29     ` Christoph Lameter
2007-06-20 18:29 ` [35/37] Large blocksize support in XFS clameter
2007-06-20 18:29 ` [36/37] Large blocksize support for ext2 clameter
2007-06-20 20:56   ` Andreas Dilger
2007-06-20 21:27     ` Christoph Lameter
2007-06-20 22:19       ` Andreas Dilger
2007-06-20 18:29 ` [37/37] Reiserfs: Fix up for mapping_set_gfp_mask clameter
