From: Matthew Wilcox <matthew.r.wilcox@intel.com>
To: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>, willy@linux.intel.com
Subject: [PATCH v8 04/22] Change direct_access calling convention
Date: Tue, 22 Jul 2014 15:47:52 -0400
Message-ID: <b78b33d94b669a5fbd02e06f2493b43dd5d77698.1406058387.git.matthew.r.wilcox@intel.com>
In-Reply-To: <cover.1406058387.git.matthew.r.wilcox@intel.com>

In order to support accesses to larger chunks of memory, pass in a
'size' parameter (counted in bytes), and return the amount available at
that address.

Support partitioning the underlying block device through a new helper
function, bdev_direct_access(), since partition handling should be done
in the block layer, not in the filesystem or the device driver.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
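Illustration only, not part of the patch: a minimal userspace sketch of
the calling convention described above. Every name prefixed with model_
is hypothetical, the pfn argument is left out for brevity, and the device
size is assumed to be a multiple of the 512-byte sector size. The driver
method returns min(size, bytes remaining) or a negative errno, the
block-layer wrapper adds the partition start, and the caller loops so
that a short return is handled correctly.

```c
#include <errno.h>
#include <string.h>

#define MODEL_SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical stand-in for a memory-backed device with one partition. */
struct model_bdev {
	char *base;		/* start of the whole device's memory */
	long dev_size;		/* size of the whole device, in bytes */
	long part_start;	/* first sector of the partition */
};

/* Driver side: 'sector' is relative to the whole device. */
static long model_direct_access(struct model_bdev *bdev, long sector,
				void **kaddr, long size)
{
	long offset = sector << MODEL_SECTOR_SHIFT;
	long avail;

	if (offset >= bdev->dev_size)
		return -ERANGE;
	*kaddr = bdev->base + offset;
	/* Report what is available, but never more than was requested. */
	avail = bdev->dev_size - offset;
	return size < avail ? size : avail;
}

/* Block-layer side: translate a partition-relative sector. */
static long model_bdev_direct_access(struct model_bdev *bdev, long sector,
				     void **kaddr, long size)
{
	return model_direct_access(bdev, sector + bdev->part_start,
				   kaddr, size);
}

/* Caller side: zero 'len' bytes, tolerating short returns. */
static long model_clear_range(struct model_bdev *bdev, long sector, long len)
{
	while (len > 0) {
		void *kaddr;
		long avail = model_bdev_direct_access(bdev, sector,
						      &kaddr, len);

		if (avail < 0)
			return avail;	/* propagate the negative errno */
		memset(kaddr, 0, avail);
		sector += avail >> MODEL_SECTOR_SHIFT;
		len -= avail;
	}
	return 0;
}
```

In this model a request that runs past the end of the device first gets
a short return and then -ERANGE on the following call, so the caller
never touches memory it was not told about.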
 Documentation/filesystems/xip.txt | 15 +++++++++------
 arch/powerpc/sysdev/axonram.c     | 12 ++++--------
 drivers/block/brd.c               |  8 +++++---
 drivers/s390/block/dcssblk.c      | 19 ++++++++++---------
 fs/block_dev.c                    | 28 ++++++++++++++++++++++++++++
 fs/ext2/xip.c                     | 31 +++++++++++++------------------
 include/linux/blkdev.h            |  6 ++++--
 7 files changed, 73 insertions(+), 46 deletions(-)

diff --git a/Documentation/filesystems/xip.txt b/Documentation/filesystems/xip.txt
index 0466ee5..b62eabf 100644
--- a/Documentation/filesystems/xip.txt
+++ b/Documentation/filesystems/xip.txt
@@ -28,12 +28,15 @@ Implementation
 Execute-in-place is implemented in three steps: block device operation,
 address space operation, and file operations.
 
-A block device operation named direct_access is used to retrieve a
-reference (pointer) to a block on-disk. The reference is supposed to be
-cpu-addressable, physical address and remain valid until the release operation
-is performed. A struct block_device reference is used to address the device,
-and a sector_t argument is used to identify the individual block. As an
-alternative, memory technology devices can be used for this.
+A block device operation named direct_access is used to translate the
+block device sector number to a page frame number (pfn) that identifies
+the physical page for the memory.  It also returns a kernel virtual
+address that can be used to access the memory.
+
+The direct_access method takes a 'size' parameter that indicates the
+number of bytes being requested.  The function should return the number
+of bytes that it can provide, although it must not exceed the number of
+bytes requested.  It may also return a negative errno if an error occurs.
 
 The block device operation is optional, these block devices support it as of
 today:
diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
index 830edc8..3ee1c08 100644
--- a/arch/powerpc/sysdev/axonram.c
+++ b/arch/powerpc/sysdev/axonram.c
@@ -139,17 +139,13 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio)
  * axon_ram_direct_access - direct_access() method for block device
  * @device, @sector, @data: see block_device_operations method
  */
-static int
+static long
 axon_ram_direct_access(struct block_device *device, sector_t sector,
-		       void **kaddr, unsigned long *pfn)
+		       void **kaddr, unsigned long *pfn, long size)
 {
 	struct axon_ram_bank *bank = device->bd_disk->private_data;
-	loff_t offset;
+	loff_t offset = (loff_t)sector << AXON_RAM_SECTOR_SHIFT;
 
-	offset = sector;
-	if (device->bd_part != NULL)
-		offset += device->bd_part->start_sect;
-	offset <<= AXON_RAM_SECTOR_SHIFT;
 	if (offset >= bank->size) {
 		dev_err(&bank->device->dev, "Access outside of address space\n");
 		return -ERANGE;
@@ -158,7 +154,7 @@ axon_ram_direct_access(struct block_device *device, sector_t sector,
 	*kaddr = (void *)(bank->ph_addr + offset);
 	*pfn = virt_to_phys(*kaddr) >> PAGE_SHIFT;
 
-	return 0;
+	return min_t(long, size, bank->size - offset);
 }
 
 static const struct block_device_operations axon_ram_devops = {
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index c7d138e..96e4c96 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -370,8 +370,8 @@ static int brd_rw_page(struct block_device *bdev, sector_t sector,
 }
 
 #ifdef CONFIG_BLK_DEV_XIP
-static int brd_direct_access(struct block_device *bdev, sector_t sector,
-			void **kaddr, unsigned long *pfn)
+static long brd_direct_access(struct block_device *bdev, sector_t sector,
+			void **kaddr, unsigned long *pfn, long size)
 {
 	struct brd_device *brd = bdev->bd_disk->private_data;
 	struct page *page;
@@ -388,7 +388,9 @@ static int brd_direct_access(struct block_device *bdev, sector_t sector,
 	*kaddr = page_address(page);
 	*pfn = page_to_pfn(page);
 
-	return 0;
+	/* Could optimistically check to see if the next page in the
+	 * file is mapped to the next page of physical RAM */
+	return min_t(long, PAGE_SIZE, size);
 }
 #endif
 
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 0f47175..58958d1 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -28,8 +28,8 @@
 static int dcssblk_open(struct block_device *bdev, fmode_t mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
 static void dcssblk_make_request(struct request_queue *q, struct bio *bio);
-static int dcssblk_direct_access(struct block_device *bdev, sector_t secnum,
-				 void **kaddr, unsigned long *pfn);
+static long dcssblk_direct_access(struct block_device *bdev, sector_t secnum,
+				 void **kaddr, unsigned long *pfn, long size);
 
 static char dcssblk_segments[DCSSBLK_PARM_LEN] = "\0";
 
@@ -866,25 +866,26 @@ fail:
 	bio_io_error(bio);
 }
 
-static int
+static long
 dcssblk_direct_access (struct block_device *bdev, sector_t secnum,
-			void **kaddr, unsigned long *pfn)
+			void **kaddr, unsigned long *pfn, long size)
 {
 	struct dcssblk_dev_info *dev_info;
-	unsigned long pgoff;
+	unsigned long offset, dev_sz;
 
 	dev_info = bdev->bd_disk->private_data;
 	if (!dev_info)
 		return -ENODEV;
+	dev_sz = dev_info->end - dev_info->start;
 	if (secnum % (PAGE_SIZE/512))
 		return -EINVAL;
-	pgoff = secnum / (PAGE_SIZE / 512);
-	if ((pgoff+1)*PAGE_SIZE-1 > dev_info->end - dev_info->start)
+	offset = secnum * 512;
+	if (offset > dev_sz)
 		return -ERANGE;
-	*kaddr = (void *) (dev_info->start+pgoff*PAGE_SIZE);
+	*kaddr = (void *) (dev_info->start + offset);
 	*pfn = virt_to_phys(*kaddr) >> PAGE_SHIFT;
 
-	return 0;
+	return min_t(long, size, dev_sz - offset);
 }
 
 static void
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6d72746..f1a158e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -427,6 +427,34 @@ int bdev_write_page(struct block_device *bdev, sector_t sector,
 }
 EXPORT_SYMBOL_GPL(bdev_write_page);
 
+/**
+ * bdev_direct_access() - Get the address for directly-accessible memory
+ * @bdev: The device containing the memory
+ * @sector: The offset within the device
+ * @addr: Where to put the address of the memory
+ * @pfn: The Page Frame Number for the memory
+ * @size: The number of bytes requested
+ *
+ * If a block device is made up of directly addressable memory, this function
+ * will tell the caller the PFN and the address of the memory.  The address
+ * may be directly dereferenced within the kernel without the need to call
+ * ioremap(), kmap() or similar.  The PFN is suitable for inserting into
+ * page tables.
+ *
+ * Return: negative errno if an error occurs, otherwise the number of bytes
+ * accessible at this address.
+ */
+long bdev_direct_access(struct block_device *bdev, sector_t sector,
+			void **addr, unsigned long *pfn, long size)
+{
+	const struct block_device_operations *ops = bdev->bd_disk->fops;
+	if (!ops->direct_access)
+		return -EOPNOTSUPP;
+	return ops->direct_access(bdev, sector + get_start_sect(bdev), addr,
+					pfn, size);
+}
+EXPORT_SYMBOL_GPL(bdev_direct_access);
+
 /*
  * pseudo-fs
  */
diff --git a/fs/ext2/xip.c b/fs/ext2/xip.c
index e98171a..bbc5fec 100644
--- a/fs/ext2/xip.c
+++ b/fs/ext2/xip.c
@@ -13,18 +13,12 @@
 #include "ext2.h"
 #include "xip.h"
 
-static inline int
-__inode_direct_access(struct inode *inode, sector_t block,
-		      void **kaddr, unsigned long *pfn)
+static inline long __inode_direct_access(struct inode *inode, sector_t block,
+				void **kaddr, unsigned long *pfn, long size)
 {
 	struct block_device *bdev = inode->i_sb->s_bdev;
-	const struct block_device_operations *ops = bdev->bd_disk->fops;
-	sector_t sector;
-
-	sector = block * (PAGE_SIZE / 512); /* ext2 block to bdev sector */
-
-	BUG_ON(!ops->direct_access);
-	return ops->direct_access(bdev, sector, kaddr, pfn);
+	sector_t sector = block * (PAGE_SIZE / 512);
+	return bdev_direct_access(bdev, sector, kaddr, pfn, size);
 }
 
 static inline int
@@ -53,12 +47,13 @@ ext2_clear_xip_target(struct inode *inode, sector_t block)
 {
 	void *kaddr;
 	unsigned long pfn;
-	int rc;
+	long size;
 
-	rc = __inode_direct_access(inode, block, &kaddr, &pfn);
-	if (!rc)
-		clear_page(kaddr);
-	return rc;
+	size = __inode_direct_access(inode, block, &kaddr, &pfn, PAGE_SIZE);
+	if (size < 0)
+		return size;
+	clear_page(kaddr);
+	return 0;
 }
 
 void ext2_xip_verify_sb(struct super_block *sb)
@@ -77,7 +72,7 @@ void ext2_xip_verify_sb(struct super_block *sb)
 int ext2_get_xip_mem(struct address_space *mapping, pgoff_t pgoff, int create,
 				void **kmem, unsigned long *pfn)
 {
-	int rc;
+	long rc;
 	sector_t block;
 
 	/* first, retrieve the sector number */
@@ -86,6 +81,6 @@ int ext2_get_xip_mem(struct address_space *mapping, pgoff_t pgoff, int create,
 		return rc;
 
 	/* retrieve address of the target data */
-	rc = __inode_direct_access(mapping->host, block, kmem, pfn);
-	return rc;
+	rc = __inode_direct_access(mapping->host, block, kmem, pfn, PAGE_SIZE);
+	return (rc < 0) ? rc : 0;
 }
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8699bcf..bc5ea9e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1613,8 +1613,8 @@ struct block_device_operations {
 	int (*rw_page)(struct block_device *, sector_t, struct page *, int rw);
 	int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
 	int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
-	int (*direct_access) (struct block_device *, sector_t,
-						void **, unsigned long *);
+	long (*direct_access) (struct block_device *, sector_t,
+					void **, unsigned long *pfn, long size);
 	unsigned int (*check_events) (struct gendisk *disk,
 				      unsigned int clearing);
 	/* ->media_changed() is DEPRECATED, use ->check_events() instead */
@@ -1632,6 +1632,8 @@ extern int __blkdev_driver_ioctl(struct block_device *, fmode_t, unsigned int,
 extern int bdev_read_page(struct block_device *, sector_t, struct page *);
 extern int bdev_write_page(struct block_device *, sector_t, struct page *,
 						struct writeback_control *);
+extern long bdev_direct_access(struct block_device *, sector_t, void **addr,
+						unsigned long *pfn, long size);
 #else /* CONFIG_BLOCK */
 
 struct block_device;
-- 
2.0.0

