linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface
@ 2020-01-09 13:30 Carlos Maiolino
  2020-01-09 13:30 ` [PATCH 1/5] fs: Enable bmap() function to properly return errors Carlos Maiolino
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Carlos Maiolino @ 2020-01-09 13:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, viro

Hi,

This series refactor the internal structure of FIBMAP so that the filesystem can
properly report errors back to VFS, and also simplifies its usage by
standardizing all ->bmap() method usage via bmap() function.

The last patch is a bug fix for ioctl_fibmap() calls with negative block values.


Viro spotted a mistake in patch 4/5 on the previous version, where bmap()
return value was not being propagated back to userland, breaking its ABI.

So, this new version, only has a change on patch 4/5 to fix this problem.


Changelog:

V8:
	- Rebased over linux-next
	- Fix an error in patch 4/5, which led to the wrong value being
	  returned by ioctl_fibmap()
V7:

        - Rebased over linux-next
        - Add parameters names to function declarations
          Reported-by: kbuild test robot <lkp@intel.com>
        - Remove changelog entries from patches's commit logs

V6:

        - Add a dummy bmap() definition so build does not break if
          CONFIG_BLOCK is not set
          Reported-by: kbuild test robot <lkp@intel.com>

        - ASSERT only if filesystem does not support bmap() to
          match the original logic

        - Fix bmap() doc function
          Reported-by: kbuild test robot <lkp@intel.com>

V5:

        - Rebasing against 5.3 required changes to the f2fs
          check_swap_activate() function

V4:

        - Ensure ioctl_fibmap() returns 0 in case of error returned from
          bmap(). Otherwise we'll be changing the user interface (which
          returns 0 in case of error)
V3:
        - Rename usr_blk to ur_block

V2:
        - Use a local sector_t variable to asign the block number
          instead of using direct casting.


Carlos Maiolino (5):
  fs: Enable bmap() function to properly return errors
  cachefiles: drop direct usage of ->bmap method.
  ecryptfs: drop direct calls to ->bmap
  fibmap: Use bmap instead of ->bmap method in ioctl_fibmap
  fibmap: Reject negative block numbers

 drivers/md/md-bitmap.c | 16 ++++++++++------
 fs/cachefiles/rdwr.c   | 27 ++++++++++++++-------------
 fs/ecryptfs/mmap.c     | 16 ++++++----------
 fs/f2fs/data.c         | 16 +++++++++++-----
 fs/inode.c             | 30 +++++++++++++++++-------------
 fs/ioctl.c             | 33 +++++++++++++++++++++++----------
 fs/jbd2/journal.c      | 22 +++++++++++++++-------
 include/linux/fs.h     |  9 ++++++++-
 mm/page_io.c           | 11 +++++++----
 9 files changed, 111 insertions(+), 69 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] fs: Enable bmap() function to properly return errors
  2020-01-09 13:30 [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
@ 2020-01-09 13:30 ` Carlos Maiolino
  2020-01-09 13:30 ` [PATCH 2/5] cachefiles: drop direct usage of ->bmap method Carlos Maiolino
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Carlos Maiolino @ 2020-01-09 13:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, viro

By now, bmap() will either return the physical block number related to
the requested file offset or 0 in case of error or the requested offset
maps into a hole.
This patch makes the needed changes to enable bmap() to proper return
errors, using the return value as an error return, and now, a pointer
must be passed to bmap() to be filled with the mapped physical block.

It will change the behavior of bmap() on return:

- negative value in case of error
- zero on success or map fell into a hole

In case of a hole, the *block will be zero too

Since this is a prep patch, by now, the only error return is -EINVAL if
->bmap doesn't exist.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 drivers/md/md-bitmap.c | 16 ++++++++++------
 fs/f2fs/data.c         | 16 +++++++++++-----
 fs/inode.c             | 30 +++++++++++++++++-------------
 fs/jbd2/journal.c      | 22 +++++++++++++++-------
 include/linux/fs.h     |  2 +-
 mm/page_io.c           | 11 +++++++----
 6 files changed, 61 insertions(+), 36 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 3ad18246fcb3..92d3b515252d 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -364,7 +364,7 @@ static int read_page(struct file *file, unsigned long index,
 	int ret = 0;
 	struct inode *inode = file_inode(file);
 	struct buffer_head *bh;
-	sector_t block;
+	sector_t block, blk_cur;
 
 	pr_debug("read bitmap file (%dB @ %llu)\n", (int)PAGE_SIZE,
 		 (unsigned long long)index << PAGE_SHIFT);
@@ -375,17 +375,21 @@ static int read_page(struct file *file, unsigned long index,
 		goto out;
 	}
 	attach_page_buffers(page, bh);
-	block = index << (PAGE_SHIFT - inode->i_blkbits);
+	blk_cur = index << (PAGE_SHIFT - inode->i_blkbits);
 	while (bh) {
+		block = blk_cur;
+
 		if (count == 0)
 			bh->b_blocknr = 0;
 		else {
-			bh->b_blocknr = bmap(inode, block);
-			if (bh->b_blocknr == 0) {
-				/* Cannot use this file! */
+			ret = bmap(inode, &block);
+			if (ret || !block) {
 				ret = -EINVAL;
+				bh->b_blocknr = 0;
 				goto out;
 			}
+
+			bh->b_blocknr = block;
 			bh->b_bdev = inode->i_sb->s_bdev;
 			if (count < (1<<inode->i_blkbits))
 				count = 0;
@@ -399,7 +403,7 @@ static int read_page(struct file *file, unsigned long index,
 			set_buffer_mapped(bh);
 			submit_bh(REQ_OP_READ, 0, bh);
 		}
-		block++;
+		blk_cur++;
 		bh = bh->b_this_page;
 	}
 	page->index = index;
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 356642e8c3b3..3bc3d137c9ff 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3647,12 +3647,16 @@ static int check_swap_activate(struct swap_info_struct *sis,
 			page_no < sis->max) {
 		unsigned block_in_page;
 		sector_t first_block;
+		sector_t block = 0;
+		int	 err = 0;
 
 		cond_resched();
 
-		first_block = bmap(inode, probe_block);
-		if (first_block == 0)
+		block = probe_block;
+		err = bmap(inode, &block);
+		if (err || !block)
 			goto bad_bmap;
+		first_block = block;
 
 		/*
 		 * It must be PAGE_SIZE aligned on-disk
@@ -3664,11 +3668,13 @@ static int check_swap_activate(struct swap_info_struct *sis,
 
 		for (block_in_page = 1; block_in_page < blocks_per_page;
 					block_in_page++) {
-			sector_t block;
 
-			block = bmap(inode, probe_block + block_in_page);
-			if (block == 0)
+			block = probe_block + block_in_page;
+			err = bmap(inode, &block);
+
+			if (err || !block)
 				goto bad_bmap;
+
 			if (block != first_block + block_in_page) {
 				/* Discontiguity */
 				probe_block++;
diff --git a/fs/inode.c b/fs/inode.c
index 2b0f51161918..9f894b25af2b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1600,21 +1600,25 @@ EXPORT_SYMBOL(iput);
 
 /**
  *	bmap	- find a block number in a file
- *	@inode: inode of file
- *	@block: block to find
- *
- *	Returns the block number on the device holding the inode that
- *	is the disk block number for the block of the file requested.
- *	That is, asked for block 4 of inode 1 the function will return the
- *	disk block relative to the disk start that holds that block of the
- *	file.
+ *	@inode:  inode owning the block number being requested
+ *	@block: pointer containing the block to find
+ *
+ *	Replaces the value in *block with the block number on the device holding
+ *	corresponding to the requested block number in the file.
+ *	That is, asked for block 4 of inode 1 the function will replace the
+ *	4 in *block, with disk block relative to the disk start that holds that
+ *	block of the file.
+ *
+ *	Returns -EINVAL in case of error, 0 otherwise. If mapping falls into a
+ *	hole, returns 0 and *block is also set to 0.
  */
-sector_t bmap(struct inode *inode, sector_t block)
+int bmap(struct inode *inode, sector_t *block)
 {
-	sector_t res = 0;
-	if (inode->i_mapping->a_ops->bmap)
-		res = inode->i_mapping->a_ops->bmap(inode->i_mapping, block);
-	return res;
+	if (!inode->i_mapping->a_ops->bmap)
+		return -EINVAL;
+
+	*block = inode->i_mapping->a_ops->bmap(inode->i_mapping, *block);
+	return 0;
 }
 EXPORT_SYMBOL(bmap);
 
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index fa3a9232caa7..8e2756e2252f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -795,18 +795,23 @@ int jbd2_journal_bmap(journal_t *journal, unsigned long blocknr,
 {
 	int err = 0;
 	unsigned long long ret;
+	sector_t block = 0;
 
 	if (journal->j_inode) {
-		ret = bmap(journal->j_inode, blocknr);
-		if (ret)
-			*retp = ret;
-		else {
+		block = blocknr;
+		ret = bmap(journal->j_inode, &block);
+
+		if (ret || !block) {
 			printk(KERN_ALERT "%s: journal block not found "
 					"at offset %lu on %s\n",
 			       __func__, blocknr, journal->j_devname);
 			err = -EIO;
 			__journal_abort_soft(journal, err);
+
+		} else {
+			*retp = block;
 		}
+
 	} else {
 		*retp = blocknr; /* +journal->j_blk_offset */
 	}
@@ -1243,11 +1248,14 @@ journal_t *jbd2_journal_init_dev(struct block_device *bdev,
 journal_t *jbd2_journal_init_inode(struct inode *inode)
 {
 	journal_t *journal;
+	sector_t blocknr;
 	char *p;
-	unsigned long long blocknr;
+	int err = 0;
+
+	blocknr = 0;
+	err = bmap(inode, &blocknr);
 
-	blocknr = bmap(inode, 0);
-	if (!blocknr) {
+	if (err || !blocknr) {
 		pr_err("%s: Cannot locate journal superblock\n",
 			__func__);
 		return NULL;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d7584bcef5d3..092699e99faa 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2865,7 +2865,7 @@ static inline ssize_t generic_write_sync(struct kiocb *iocb, ssize_t count)
 extern void emergency_sync(void);
 extern void emergency_remount(void);
 #ifdef CONFIG_BLOCK
-extern sector_t bmap(struct inode *, sector_t);
+extern int bmap(struct inode *, sector_t *);
 #endif
 extern int notify_change(struct dentry *, struct iattr *, struct inode **);
 extern int inode_permission(struct inode *, int);
diff --git a/mm/page_io.c b/mm/page_io.c
index 3a198deb8bb1..76965be1d40e 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -177,8 +177,9 @@ int generic_swapfile_activate(struct swap_info_struct *sis,
 
 		cond_resched();
 
-		first_block = bmap(inode, probe_block);
-		if (first_block == 0)
+		first_block = probe_block;
+		ret = bmap(inode, &first_block);
+		if (ret || !first_block)
 			goto bad_bmap;
 
 		/*
@@ -193,9 +194,11 @@ int generic_swapfile_activate(struct swap_info_struct *sis,
 					block_in_page++) {
 			sector_t block;
 
-			block = bmap(inode, probe_block + block_in_page);
-			if (block == 0)
+			block = probe_block + block_in_page;
+			ret = bmap(inode, &block);
+			if (ret || !block)
 				goto bad_bmap;
+
 			if (block != first_block + block_in_page) {
 				/* Discontiguity */
 				probe_block++;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] cachefiles: drop direct usage of ->bmap method.
  2020-01-09 13:30 [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
  2020-01-09 13:30 ` [PATCH 1/5] fs: Enable bmap() function to properly return errors Carlos Maiolino
@ 2020-01-09 13:30 ` Carlos Maiolino
  2020-04-09  0:32   ` NeilBrown
  2020-01-09 13:30 ` [PATCH 3/5] ecryptfs: drop direct calls to ->bmap Carlos Maiolino
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Carlos Maiolino @ 2020-01-09 13:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, viro

Replace the direct usage of ->bmap method by a bmap() call.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/cachefiles/rdwr.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 44a3ce1e4ce4..1dc97f2d6201 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -396,7 +396,7 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
 	struct cachefiles_object *object;
 	struct cachefiles_cache *cache;
 	struct inode *inode;
-	sector_t block0, block;
+	sector_t block;
 	unsigned shift;
 	int ret;
 
@@ -412,7 +412,6 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
 
 	inode = d_backing_inode(object->backer);
 	ASSERT(S_ISREG(inode->i_mode));
-	ASSERT(inode->i_mapping->a_ops->bmap);
 	ASSERT(inode->i_mapping->a_ops->readpages);
 
 	/* calculate the shift required to use bmap */
@@ -428,12 +427,14 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
 	 *   enough for this as it doesn't indicate errors, but it's all we've
 	 *   got for the moment
 	 */
-	block0 = page->index;
-	block0 <<= shift;
+	block = page->index;
+	block <<= shift;
+
+	ret = bmap(inode, &block);
+	ASSERT(ret < 0);
 
-	block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
 	_debug("%llx -> %llx",
-	       (unsigned long long) block0,
+	       (unsigned long long) (page->index << shift),
 	       (unsigned long long) block);
 
 	if (block) {
@@ -711,7 +712,6 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
 
 	inode = d_backing_inode(object->backer);
 	ASSERT(S_ISREG(inode->i_mode));
-	ASSERT(inode->i_mapping->a_ops->bmap);
 	ASSERT(inode->i_mapping->a_ops->readpages);
 
 	/* calculate the shift required to use bmap */
@@ -728,7 +728,7 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
 
 	ret = space ? -ENODATA : -ENOBUFS;
 	list_for_each_entry_safe(page, _n, pages, lru) {
-		sector_t block0, block;
+		sector_t block;
 
 		/* we assume the absence or presence of the first block is a
 		 * good enough indication for the page as a whole
@@ -736,13 +736,14 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
 		 *   good enough for this as it doesn't indicate errors, but
 		 *   it's all we've got for the moment
 		 */
-		block0 = page->index;
-		block0 <<= shift;
+		block = page->index;
+		block <<= shift;
+
+		ret = bmap(inode, &block);
+		ASSERT(!ret);
 
-		block = inode->i_mapping->a_ops->bmap(inode->i_mapping,
-						      block0);
 		_debug("%llx -> %llx",
-		       (unsigned long long) block0,
+		       (unsigned long long) (page->index << shift),
 		       (unsigned long long) block);
 
 		if (block) {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] ecryptfs: drop direct calls to ->bmap
  2020-01-09 13:30 [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
  2020-01-09 13:30 ` [PATCH 1/5] fs: Enable bmap() function to properly return errors Carlos Maiolino
  2020-01-09 13:30 ` [PATCH 2/5] cachefiles: drop direct usage of ->bmap method Carlos Maiolino
@ 2020-01-09 13:30 ` Carlos Maiolino
  2020-01-09 13:30 ` [PATCH 4/5] fibmap: Use bmap instead of ->bmap method in ioctl_fibmap Carlos Maiolino
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Carlos Maiolino @ 2020-01-09 13:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, viro

Replace direct ->bmap calls by bmap() method.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/ecryptfs/mmap.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
index cffa0c1ec829..019572c6b39a 100644
--- a/fs/ecryptfs/mmap.c
+++ b/fs/ecryptfs/mmap.c
@@ -524,16 +524,12 @@ static int ecryptfs_write_end(struct file *file,
 
 static sector_t ecryptfs_bmap(struct address_space *mapping, sector_t block)
 {
-	int rc = 0;
-	struct inode *inode;
-	struct inode *lower_inode;
-
-	inode = (struct inode *)mapping->host;
-	lower_inode = ecryptfs_inode_to_lower(inode);
-	if (lower_inode->i_mapping->a_ops->bmap)
-		rc = lower_inode->i_mapping->a_ops->bmap(lower_inode->i_mapping,
-							 block);
-	return rc;
+	struct inode *lower_inode = ecryptfs_inode_to_lower(mapping->host);
+	int ret = bmap(lower_inode, &block);
+
+	if (ret)
+		return 0;
+	return block;
 }
 
 const struct address_space_operations ecryptfs_aops = {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] fibmap: Use bmap instead of ->bmap method in ioctl_fibmap
  2020-01-09 13:30 [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
                   ` (2 preceding siblings ...)
  2020-01-09 13:30 ` [PATCH 3/5] ecryptfs: drop direct calls to ->bmap Carlos Maiolino
@ 2020-01-09 13:30 ` Carlos Maiolino
  2020-01-09 13:30 ` [PATCH 5/5] fibmap: Reject negative block numbers Carlos Maiolino
  2020-01-09 15:05 ` [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Al Viro
  5 siblings, 0 replies; 12+ messages in thread
From: Carlos Maiolino @ 2020-01-09 13:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, viro

Now we have the possibility of proper error return in bmap, use bmap()
function in ioctl_fibmap() instead of calling ->bmap method directly.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/ioctl.c         | 30 ++++++++++++++++++++----------
 include/linux/fs.h |  9 ++++++++-
 2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 7c9a5df5a597..0ed5fb2d6c19 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -54,19 +54,29 @@ EXPORT_SYMBOL(vfs_ioctl);
 
 static int ioctl_fibmap(struct file *filp, int __user *p)
 {
-	struct address_space *mapping = filp->f_mapping;
-	int res, block;
+	struct inode *inode = file_inode(filp);
+	int error, ur_block;
+	sector_t block;
 
-	/* do we support this mess? */
-	if (!mapping->a_ops->bmap)
-		return -EINVAL;
 	if (!capable(CAP_SYS_RAWIO))
 		return -EPERM;
-	res = get_user(block, p);
-	if (res)
-		return res;
-	res = mapping->a_ops->bmap(mapping, block);
-	return put_user(res, p);
+
+	error = get_user(ur_block, p);
+	if (error)
+		return error;
+
+	block = ur_block;
+	error = bmap(inode, &block);
+
+	if (error)
+		ur_block = 0;
+	else
+		ur_block = block;
+
+	if (put_user(ur_block, p))
+		error = -EFAULT;
+
+	return error;
 }
 
 /**
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 092699e99faa..90a72f7185b7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2864,9 +2864,16 @@ static inline ssize_t generic_write_sync(struct kiocb *iocb, ssize_t count)
 
 extern void emergency_sync(void);
 extern void emergency_remount(void);
+
 #ifdef CONFIG_BLOCK
-extern int bmap(struct inode *, sector_t *);
+extern int bmap(struct inode *inode, sector_t *block);
+#else
+static inline int bmap(struct inode *inode,  sector_t *block)
+{
+	return -EINVAL;
+}
 #endif
+
 extern int notify_change(struct dentry *, struct iattr *, struct inode **);
 extern int inode_permission(struct inode *, int);
 extern int generic_permission(struct inode *, int);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] fibmap: Reject negative block numbers
  2020-01-09 13:30 [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
                   ` (3 preceding siblings ...)
  2020-01-09 13:30 ` [PATCH 4/5] fibmap: Use bmap instead of ->bmap method in ioctl_fibmap Carlos Maiolino
@ 2020-01-09 13:30 ` Carlos Maiolino
  2020-01-09 15:05 ` [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Al Viro
  5 siblings, 0 replies; 12+ messages in thread
From: Carlos Maiolino @ 2020-01-09 13:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, viro

FIBMAP receives an integer from userspace which is then implicitly converted
into sector_t to be passed to bmap(). No check is made to ensure userspace
didn't send a negative block number, which can end up in an underflow, and
returning to userspace a corrupted block address.

As a side-effect, the underflow caused by a negative block here, will
trigger the WARN() in iomap_bmap_actor(), which is how this issue was
first discovered.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/ioctl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 0ed5fb2d6c19..72d6848fb6ad 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -65,6 +65,9 @@ static int ioctl_fibmap(struct file *filp, int __user *p)
 	if (error)
 		return error;
 
+	if (ur_block < 0)
+		return -EINVAL;
+
 	block = ur_block;
 	error = bmap(inode, &block);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface
  2020-01-09 13:30 [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
                   ` (4 preceding siblings ...)
  2020-01-09 13:30 ` [PATCH 5/5] fibmap: Reject negative block numbers Carlos Maiolino
@ 2020-01-09 15:05 ` Al Viro
  5 siblings, 0 replies; 12+ messages in thread
From: Al Viro @ 2020-01-09 15:05 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-fsdevel, hch

On Thu, Jan 09, 2020 at 02:30:40PM +0100, Carlos Maiolino wrote:
> Hi,
> 
> This series refactor the internal structure of FIBMAP so that the filesystem can
> properly report errors back to VFS, and also simplifies its usage by
> standardizing all ->bmap() method usage via bmap() function.
> 
> The last patch is a bug fix for ioctl_fibmap() calls with negative block values.
> 
> 
> Viro spotted a mistake in patch 4/5 on the previous version, where bmap()
> return value was not being propagated back to userland, breaking its ABI.
> 
> So, this new version, only has a change on patch 4/5 to fix this problem.

Applied and pushed (#work.misc, #for-next)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/5] cachefiles: drop direct usage of ->bmap method.
  2020-01-09 13:30 ` [PATCH 2/5] cachefiles: drop direct usage of ->bmap method Carlos Maiolino
@ 2020-04-09  0:32   ` NeilBrown
  2020-04-16 14:03     ` David Wysochanski
  0 siblings, 1 reply; 12+ messages in thread
From: NeilBrown @ 2020-04-09  0:32 UTC (permalink / raw)
  To: Carlos Maiolino, linux-fsdevel; +Cc: hch, viro, David Howells

[-- Attachment #1: Type: text/plain, Size: 3575 bytes --]

On Thu, Jan 09 2020, Carlos Maiolino wrote:

> Replace the direct usage of ->bmap method by a bmap() call.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
>  fs/cachefiles/rdwr.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
> index 44a3ce1e4ce4..1dc97f2d6201 100644
> --- a/fs/cachefiles/rdwr.c
> +++ b/fs/cachefiles/rdwr.c
> @@ -396,7 +396,7 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
>  	struct cachefiles_object *object;
>  	struct cachefiles_cache *cache;
>  	struct inode *inode;
> -	sector_t block0, block;
> +	sector_t block;
>  	unsigned shift;
>  	int ret;
>  
> @@ -412,7 +412,6 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
>  
>  	inode = d_backing_inode(object->backer);
>  	ASSERT(S_ISREG(inode->i_mode));
> -	ASSERT(inode->i_mapping->a_ops->bmap);
>  	ASSERT(inode->i_mapping->a_ops->readpages);
>  
>  	/* calculate the shift required to use bmap */
> @@ -428,12 +427,14 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
>  	 *   enough for this as it doesn't indicate errors, but it's all we've
>  	 *   got for the moment
>  	 */
> -	block0 = page->index;
> -	block0 <<= shift;
> +	block = page->index;
> +	block <<= shift;
> +
> +	ret = bmap(inode, &block);
> +	ASSERT(ret < 0);
>  
> -	block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
>  	_debug("%llx -> %llx",
> -	       (unsigned long long) block0,
> +	       (unsigned long long) (page->index << shift),
>  	       (unsigned long long) block);
>  
>  	if (block) {
> @@ -711,7 +712,6 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
>  
>  	inode = d_backing_inode(object->backer);
>  	ASSERT(S_ISREG(inode->i_mode));
> -	ASSERT(inode->i_mapping->a_ops->bmap);
>  	ASSERT(inode->i_mapping->a_ops->readpages);
>  
>  	/* calculate the shift required to use bmap */
> @@ -728,7 +728,7 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
>  
>  	ret = space ? -ENODATA : -ENOBUFS;
>  	list_for_each_entry_safe(page, _n, pages, lru) {
> -		sector_t block0, block;
> +		sector_t block;
>  
>  		/* we assume the absence or presence of the first block is a
>  		 * good enough indication for the page as a whole
> @@ -736,13 +736,14 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
>  		 *   good enough for this as it doesn't indicate errors, but
>  		 *   it's all we've got for the moment
>  		 */
> -		block0 = page->index;
> -		block0 <<= shift;
> +		block = page->index;
> +		block <<= shift;
> +
> +		ret = bmap(inode, &block);
> +		ASSERT(!ret);

Hi,
 this change corrupts 'ret'.
 Some paths from here down change ret, but some paths to do not.
 So it is possible that this function would previously return -ENODATA
 or -ENOBUFS, but now it returns 0.

 This gets caught by a BUG_ON in __nfs_readpages_from_fscache().

 Maybe a new variable should be introduced?
 or maybe ASSERT(!bmap(..));
 Or maybe if (bmap() != 0) ASSET(0);

??

https://bugzilla.opensuse.org/show_bug.cgi?id=1168841

NeilBrown


>  
> -		block = inode->i_mapping->a_ops->bmap(inode->i_mapping,
> -						      block0);
>  		_debug("%llx -> %llx",
> -		       (unsigned long long) block0,
> +		       (unsigned long long) (page->index << shift),
>  		       (unsigned long long) block);
>  
>  		if (block) {
> -- 
> 2.23.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/5] cachefiles: drop direct usage of ->bmap method.
  2020-04-09  0:32   ` NeilBrown
@ 2020-04-16 14:03     ` David Wysochanski
  0 siblings, 0 replies; 12+ messages in thread
From: David Wysochanski @ 2020-04-16 14:03 UTC (permalink / raw)
  To: NeilBrown; +Cc: Carlos Maiolino, linux-fsdevel, hch, viro, David Howells

On Wed, Apr 8, 2020 at 8:32 PM NeilBrown <neilb@suse.de> wrote:
>
> On Thu, Jan 09 2020, Carlos Maiolino wrote:
>
> > Replace the direct usage of ->bmap method by a bmap() call.
> >
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> >  fs/cachefiles/rdwr.c | 27 ++++++++++++++-------------
> >  1 file changed, 14 insertions(+), 13 deletions(-)
> >
> > diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
> > index 44a3ce1e4ce4..1dc97f2d6201 100644
> > --- a/fs/cachefiles/rdwr.c
> > +++ b/fs/cachefiles/rdwr.c
> > @@ -396,7 +396,7 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
> >       struct cachefiles_object *object;
> >       struct cachefiles_cache *cache;
> >       struct inode *inode;
> > -     sector_t block0, block;
> > +     sector_t block;
> >       unsigned shift;
> >       int ret;
> >
> > @@ -412,7 +412,6 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
> >
> >       inode = d_backing_inode(object->backer);
> >       ASSERT(S_ISREG(inode->i_mode));
> > -     ASSERT(inode->i_mapping->a_ops->bmap);
> >       ASSERT(inode->i_mapping->a_ops->readpages);
> >
> >       /* calculate the shift required to use bmap */
> > @@ -428,12 +427,14 @@ int cachefiles_read_or_alloc_page(struct fscache_retrieval *op,
> >        *   enough for this as it doesn't indicate errors, but it's all we've
> >        *   got for the moment
> >        */
> > -     block0 = page->index;
> > -     block0 <<= shift;
> > +     block = page->index;
> > +     block <<= shift;
> > +
> > +     ret = bmap(inode, &block);
> > +     ASSERT(ret < 0);
> >

I too am hitting this assert with a very simple test (see below) and I
saw ret == *block == 0
This used to fall through and be handled in 'else if / else'

        if (block) {
                /* submit the apparently valid page to the backing fs to be
                 * read from disk */
                ret = cachefiles_read_backing_file_one(object, op, page);
        } else if (cachefiles_has_space(cache, 0, 1) == 0) {
                /* there's space in the cache we can use */
                fscache_mark_page_cached(op, page);
                fscache_retrieval_complete(op, 1);
                ret = -ENODATA;
        } else {
                goto enobufs;
        }


Test:
systemctl start cachefilesd
mount -o vers=3,fsc 127.0.0.1:/export /mnt
dd if=/dev/zero of=/mnt/file1.bin bs=1M count=1
echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/file1.bin of=/dev/null
echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/file1.bin of=/dev/null

FWIW, the below is 5.7-rc1 kernel plus the 3 patches I posted to linux-nfs

[ 1613.017827] readahead_oops. (8973): drop_caches: 3
[ 1613.100768] readahead_oops. (8973): drop_caches: 3
[ 1613.126040] CacheFiles:
[ 1613.127317] CacheFiles: Assertion failed
[ 1613.129130] ------------[ cut here ]------------
[ 1613.130901] kernel BUG at fs/cachefiles/rdwr.c:434!
[ 1613.132856] invalid opcode: 0000 [#1] SMP PTI
[ 1613.134590] CPU: 3 PID: 9010 Comm: dd Kdump: loaded Not tainted
5.7.0-rc1-bz1790933-bz1793560-fix3+ #13
[ 1613.138250] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 1613.140544] RIP:
0010:cachefiles_read_or_alloc_page.cold+0x25c/0x361 [cachefiles]
[ 1613.143466] Code: 4c 89 c1 e8 05 05 97 c0 4c 8b 44 24 08 e9 b0 bb
ff ff 48 c7 c7 09 57 7d c0 e8 ef 04 97 c0 48 c7 c7 60 6c 7d c0 e8 e3
04 97 c0 <0f> 0b 48 c7 c7 09 57 7d c0 e8 d5 04 97 c0 48 c7 c7 60 6c 7d
c0 e8
[ 1613.188264] RSP: 0018:ffffa13d00527c40 EFLAGS: 00010246
[ 1613.190334] RAX: 000000000000001c RBX: ffff8e8533290600 RCX: 0000000000000000
[ 1613.193104] RDX: 0000000000000000 RSI: ffff8e8537cd9c28 RDI: ffff8e8537cd9c28
[ 1613.195875] RBP: ffff8e852c42ea80 R08: 000000000000028c R09: 000000000000002b
[ 1613.198698] R10: ffffa13d00527af0 R11: 0000000000000000 R12: 0000000000000000
[ 1613.201513] R13: ffffdc5b05673700 R14: ffff8e8526636c00 R15: ffff8e8533290668
[ 1613.204311] FS:  00007f45a30c7580(0000) GS:ffff8e8537cc0000(0000)
knlGS:0000000000000000
[ 1613.207470] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1613.209724] CR2: 000055c636d58000 CR3: 00000002331d2000 CR4: 00000000000406e0
[ 1613.212513] Call Trace:
[ 1613.213624]  __fscache_read_or_alloc_page+0x2a5/0x320 [fscache]
[ 1613.215999]  __nfs_readpage_from_fscache+0x49/0xf0 [nfs]
[ 1613.218128]  nfs_readpage+0xf8/0x260 [nfs]
[ 1613.219805]  generic_file_read_iter+0x6f3/0xe40
[ 1613.221639]  ? nfs3_proc_commit_setup+0x10/0x10 [nfsv3]
[ 1613.223726]  ? nfs_check_cache_invalid+0x33/0x90 [nfs]
[ 1613.225784]  nfs_file_read+0x6d/0xa0 [nfs]
[ 1613.227464]  new_sync_read+0x12a/0x1c0
[ 1613.228981]  vfs_read+0x9d/0x150
[ 1613.230295]  ksys_read+0x5f/0xe0
[ 1613.231624]  do_syscall_64+0x5b/0x1c0
[ 1613.233180]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1613.235191] RIP: 0033:0x7f45a2fef3e2





> > -     block = inode->i_mapping->a_ops->bmap(inode->i_mapping, block0);
> >       _debug("%llx -> %llx",
> > -            (unsigned long long) block0,
> > +            (unsigned long long) (page->index << shift),
> >              (unsigned long long) block);
> >
> >       if (block) {
> > @@ -711,7 +712,6 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
> >
> >       inode = d_backing_inode(object->backer);
> >       ASSERT(S_ISREG(inode->i_mode));
> > -     ASSERT(inode->i_mapping->a_ops->bmap);
> >       ASSERT(inode->i_mapping->a_ops->readpages);
> >
> >       /* calculate the shift required to use bmap */
> > @@ -728,7 +728,7 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
> >
> >       ret = space ? -ENODATA : -ENOBUFS;
> >       list_for_each_entry_safe(page, _n, pages, lru) {
> > -             sector_t block0, block;
> > +             sector_t block;
> >
> >               /* we assume the absence or presence of the first block is a
> >                * good enough indication for the page as a whole
> > @@ -736,13 +736,14 @@ int cachefiles_read_or_alloc_pages(struct fscache_retrieval *op,
> >                *   good enough for this as it doesn't indicate errors, but
> >                *   it's all we've got for the moment
> >                */
> > -             block0 = page->index;
> > -             block0 <<= shift;
> > +             block = page->index;
> > +             block <<= shift;
> > +
> > +             ret = bmap(inode, &block);
> > +             ASSERT(!ret);
>
> Hi,
>  this change corrupts 'ret'.
>  Some paths from here down change ret, but some paths to do not.
>  So it is possible that this function would previously return -ENODATA
>  or -ENOBUFS, but now it returns 0.
>
>  This gets caught by a BUG_ON in __nfs_readpages_from_fscache().
>
>  Maybe a new variable should be introduced?
>  or maybe ASSERT(!bmap(..));
>  Or maybe if (bmap() != 0) ASSET(0);
>
> ??
>
> https://bugzilla.opensuse.org/show_bug.cgi?id=1168841
>
> NeilBrown
>
>
> >
> > -             block = inode->i_mapping->a_ops->bmap(inode->i_mapping,
> > -                                                   block0);
> >               _debug("%llx -> %llx",
> > -                    (unsigned long long) block0,
> > +                    (unsigned long long) (page->index << shift),
> >                      (unsigned long long) block);
> >
> >               if (block) {
> > --
> > 2.23.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 5/5] fibmap: Reject negative block numbers
  2019-12-10 15:03 [PATCH " Carlos Maiolino
@ 2019-12-10 15:03 ` Carlos Maiolino
  0 siblings, 0 replies; 12+ messages in thread
From: Carlos Maiolino @ 2019-12-10 15:03 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch

FIBMAP receives an integer from userspace which is then implicitly converted
into sector_t to be passed to bmap(). No check is made to ensure userspace
didn't send a negative block number, which can end up in an underflow, and
returning to userspace a corrupted block address.

As a side-effect, the underflow caused by a negative block here, will
trigger the WARN() in iomap_bmap_actor(), which is how this issue was
first discovered.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/ioctl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 83f36feaf781..79468b5a6391 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -65,6 +65,9 @@ static int ioctl_fibmap(struct file *filp, int __user *p)
 	if (error)
 		return error;
 
+	if (ur_block < 0)
+		return -EINVAL;
+
 	block = ur_block;
 	error = bmap(inode, &block);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 5/5] fibmap: Reject negative block numbers
  2019-11-22  8:53 ` [PATCH 5/5] fibmap: Reject negative block numbers Carlos Maiolino
@ 2019-11-22 13:38   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2019-11-22 13:38 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-fsdevel, hch, darrick.wong, sandeen

On Fri, Nov 22, 2019 at 09:53:20AM +0100, Carlos Maiolino wrote:
> FIBMAP receives an integer from userspace which is then implicitly converted
> into sector_t to be passed to bmap(). No check is made to ensure userspace
> didn't send a negative block number, which can end up in an underflow, and
> returning to userspace a corrupted block address.
> 
> As a side-effect, the underflow caused by a negative block here, will
> trigger the WARN() in iomap_bmap_actor(), which is how this issue was
> first discovered.
> 
> This is essentially a V2 of a patch I sent a while ago, reworded and
> refactored to fit into this patchset.

That last sentence should probably be removed.

Otherwise:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 5/5] fibmap: Reject negative block numbers
  2019-11-22  8:53 [PATCH 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
@ 2019-11-22  8:53 ` Carlos Maiolino
  2019-11-22 13:38   ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Carlos Maiolino @ 2019-11-22  8:53 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: hch, darrick.wong, sandeen

FIBMAP receives an integer from userspace which is then implicitly converted
into sector_t to be passed to bmap(). No check is made to ensure userspace
didn't send a negative block number, which can end up in an underflow, and
returning to userspace a corrupted block address.

As a side-effect, the underflow caused by a negative block here, will
trigger the WARN() in iomap_bmap_actor(), which is how this issue was
first discovered.

This is essentially a V2 of a patch I sent a while ago, reworded and
refactored to fit into this patchset.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/ioctl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 6b589c873bc2..6b365b3eff70 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -64,6 +64,9 @@ static int ioctl_fibmap(struct file *filp, int __user *p)
 	if (error)
 		return error;
 
+	if (ur_block < 0)
+		return -EINVAL;
+
 	block = ur_block;
 	error = bmap(inode, &block);
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-04-16 14:38 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-09 13:30 [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
2020-01-09 13:30 ` [PATCH 1/5] fs: Enable bmap() function to properly return errors Carlos Maiolino
2020-01-09 13:30 ` [PATCH 2/5] cachefiles: drop direct usage of ->bmap method Carlos Maiolino
2020-04-09  0:32   ` NeilBrown
2020-04-16 14:03     ` David Wysochanski
2020-01-09 13:30 ` [PATCH 3/5] ecryptfs: drop direct calls to ->bmap Carlos Maiolino
2020-01-09 13:30 ` [PATCH 4/5] fibmap: Use bmap instead of ->bmap method in ioctl_fibmap Carlos Maiolino
2020-01-09 13:30 ` [PATCH 5/5] fibmap: Reject negative block numbers Carlos Maiolino
2020-01-09 15:05 ` [PATCH V8 0/5] Refactor ioctl_fibmap() internal interface Al Viro
  -- strict thread matches above, loose matches on Subject: below --
2019-12-10 15:03 [PATCH " Carlos Maiolino
2019-12-10 15:03 ` [PATCH 5/5] fibmap: Reject negative block numbers Carlos Maiolino
2019-11-22  8:53 [PATCH 0/5] Refactor ioctl_fibmap() internal interface Carlos Maiolino
2019-11-22  8:53 ` [PATCH 5/5] fibmap: Reject negative block numbers Carlos Maiolino
2019-11-22 13:38   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).