* [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O
@ 2018-05-11 6:30 Omar Sandoval
2018-05-11 6:30 ` [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO() Omar Sandoval
` (4 more replies)
0 siblings, 5 replies; 15+ messages in thread
From: Omar Sandoval @ 2018-05-11 6:30 UTC (permalink / raw)
To: linux-btrfs, linux-fsdevel; +Cc: kernel-team
From: Omar Sandoval <osandov@fb.com>
Hi, everyone,
Btrfs currently abuses current->journal_info in btrfs_direct_IO() in
order to pass around some state to get_block() and submit_io(). This
hack is ugly and unnecessary because the data we pass around is only
used in one call frame. Robbie Ko also pointed out [1] that it could
potentially cause a crash if we happen to end up in start_transaction()
(e.g., from memory reclaim calling into btrfs_evict_inode(), which can
start a transaction). I'm not convinced that Robbie's case can happen in
practice since we are using GFP_NOFS for allocations during direct I/O,
but either way it's fragile and nasty.
This series stops using current->journal_info and instead adds some
extra arguments to the generic direct I/O code so that we can pass
things around like sane people.
Based on Linus' master.
Thanks!
1: https://patchwork.kernel.org/patch/10389077/
Omar Sandoval (3):
fs: add initial bh_result->b_private value to __blockdev_direct_IO()
fs: add private argument to dio_submit_t
Btrfs: stop abusing current->journal_info in btrfs_direct_IO()
fs/btrfs/inode.c | 39 ++++++++++-----------------------------
fs/direct-io.c | 12 +++++++-----
fs/ext4/inode.c | 6 +++---
fs/gfs2/aops.c | 2 +-
fs/ocfs2/aops.c | 5 ++---
include/linux/fs.h | 10 +++++-----
6 files changed, 28 insertions(+), 46 deletions(-)
--
2.17.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-05-11 6:30 [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O Omar Sandoval
@ 2018-05-11 6:30 ` Omar Sandoval
2018-05-11 20:05 ` Al Viro
2018-05-11 6:30 ` [PATCH 2/3] fs: add private argument to dio_submit_t Omar Sandoval
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Omar Sandoval @ 2018-05-11 6:30 UTC (permalink / raw)
To: linux-btrfs, linux-fsdevel; +Cc: kernel-team
From: Omar Sandoval <osandov@fb.com>
Btrfs abuses current->journal_info in btrfs_direct_IO() in order to pass
around some state to get_block() and submit_io(). The generic DIO code
already provides bh_result->b_private as a way to pass data between
calls to get_block() and end_io(), but it is always initialized to NULL.
Let's allow an initial value to be passed in from the caller of
__blockdev_direct_IO().
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/btrfs/inode.c | 6 +++---
fs/direct-io.c | 9 +++++----
fs/ext4/inode.c | 6 +++---
fs/gfs2/aops.c | 2 +-
fs/ocfs2/aops.c | 5 ++---
include/linux/fs.h | 8 ++++----
6 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d241285a0d2a..d8b0a7eacd19 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8618,9 +8618,9 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
}
ret = __blockdev_direct_IO(iocb, inode,
- fs_info->fs_devices->latest_bdev,
- iter, btrfs_get_blocks_direct, NULL,
- btrfs_submit_direct, flags);
+ fs_info->fs_devices->latest_bdev, iter,
+ btrfs_get_blocks_direct, NULL,
+ btrfs_submit_direct, flags, NULL);
if (iov_iter_rw(iter) == WRITE) {
up_read(&BTRFS_I(inode)->dio_sem);
current->journal_info = NULL;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 874607bb6e02..9bb33d9c206c 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -1171,7 +1171,7 @@ static inline ssize_t
do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
struct block_device *bdev, struct iov_iter *iter,
get_block_t get_block, dio_iodone_t end_io,
- dio_submit_t submit_io, int flags)
+ dio_submit_t submit_io, int flags, void *private)
{
unsigned i_blkbits = READ_ONCE(inode->i_blkbits);
unsigned blkbits = i_blkbits;
@@ -1182,7 +1182,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
const loff_t end = offset + count;
struct dio *dio;
struct dio_submit sdio = { 0, };
- struct buffer_head map_bh = { 0, };
+ struct buffer_head map_bh = { .b_private = private, };
struct blk_plug plug;
unsigned long align = offset | iov_iter_alignment(iter);
@@ -1304,6 +1304,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
sdio.get_block = get_block;
dio->end_io = end_io;
+ dio->private = map_bh.b_private;
sdio.submit_io = submit_io;
sdio.final_block_in_bio = -1;
sdio.next_block_for_io = -1;
@@ -1400,7 +1401,7 @@ ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
struct block_device *bdev, struct iov_iter *iter,
get_block_t get_block,
dio_iodone_t end_io, dio_submit_t submit_io,
- int flags)
+ int flags, void *private)
{
/*
* The block device state is needed in the end to finally
@@ -1415,7 +1416,7 @@ ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
prefetch((char *)bdev->bd_queue + SMP_CACHE_BYTES);
return do_blockdev_direct_IO(iocb, inode, bdev, iter, get_block,
- end_io, submit_io, flags);
+ end_io, submit_io, flags, private);
}
EXPORT_SYMBOL(__blockdev_direct_IO);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1e50c5efae67..e453e7529a3e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3739,7 +3739,7 @@ static ssize_t ext4_direct_IO_write(struct kiocb *iocb, struct iov_iter *iter)
}
ret = __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev, iter,
get_block_func, ext4_end_io_dio, NULL,
- dio_flags);
+ dio_flags, NULL);
if (ret > 0 && !overwrite && ext4_test_inode_state(inode,
EXT4_STATE_DIO_UNWRITTEN)) {
@@ -3830,8 +3830,8 @@ static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter)
iocb->ki_pos + count - 1);
if (ret)
goto out_unlock;
- ret = __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev,
- iter, ext4_dio_get_block, NULL, NULL, 0);
+ ret = __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev, iter,
+ ext4_dio_get_block, NULL, NULL, 0, NULL);
out_unlock:
inode_unlock_shared(inode);
return ret;
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index f58716567972..d42e68543baa 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -1113,7 +1113,7 @@ static ssize_t gfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
}
rv = __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev, iter,
- gfs2_get_block_direct, NULL, NULL, 0);
+ gfs2_get_block_direct, NULL, NULL, 0, NULL);
out:
gfs2_glock_dq(&gh);
out_uninit:
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 302cd7caa4a7..58b74d9da931 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2446,9 +2446,8 @@ static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
else
get_block = ocfs2_dio_wr_get_block;
- return __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev,
- iter, get_block,
- ocfs2_dio_end_io, NULL, 0);
+ return __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev, iter,
+ get_block, ocfs2_dio_end_io, NULL, 0, NULL);
}
const struct address_space_operations ocfs2_aops = {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 760d8da1b6c7..b9ef2fb1224b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2993,9 +2993,8 @@ void dio_warn_stale_pagecache(struct file *filp);
ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
struct block_device *bdev, struct iov_iter *iter,
- get_block_t get_block,
- dio_iodone_t end_io, dio_submit_t submit_io,
- int flags);
+ get_block_t get_block, dio_iodone_t end_io,
+ dio_submit_t submit_io, int flags, void *private);
static inline ssize_t blockdev_direct_IO(struct kiocb *iocb,
struct inode *inode,
@@ -3003,7 +3002,8 @@ static inline ssize_t blockdev_direct_IO(struct kiocb *iocb,
get_block_t get_block)
{
return __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev, iter,
- get_block, NULL, NULL, DIO_LOCKING | DIO_SKIP_HOLES);
+ get_block, NULL, NULL,
+ DIO_LOCKING | DIO_SKIP_HOLES, NULL);
}
#endif
--
2.17.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/3] fs: add private argument to dio_submit_t
2018-05-11 6:30 [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O Omar Sandoval
2018-05-11 6:30 ` [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO() Omar Sandoval
@ 2018-05-11 6:30 ` Omar Sandoval
2018-05-11 6:30 ` [PATCH 3/3] Btrfs: stop abusing current->journal_info in btrfs_direct_IO() Omar Sandoval
` (2 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Omar Sandoval @ 2018-05-11 6:30 UTC (permalink / raw)
To: linux-btrfs, linux-fsdevel; +Cc: kernel-team
From: Omar Sandoval <osandov@fb.com>
get_block() already has access to this through bh_result->b_private, and
it gets passed to end_io() as private, so do the same for submit_io().
Along with the previous change, this will allow us to get rid Btrfs'
current->journal_info abuse in btrfs_direct_IO().
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/btrfs/inode.c | 2 +-
fs/direct-io.c | 3 ++-
include/linux/fs.h | 2 +-
3 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d8b0a7eacd19..46e3d366df8b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8418,7 +8418,7 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
}
static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
- loff_t file_offset)
+ loff_t file_offset, void *private)
{
struct btrfs_dio_private *dip = NULL;
struct bio *bio = NULL;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 9bb33d9c206c..3f3aa2d95ca1 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -475,7 +475,8 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
dio->bio_disk = bio->bi_disk;
if (sdio->submit_io) {
- sdio->submit_io(bio, dio->inode, sdio->logical_offset_in_bio);
+ sdio->submit_io(bio, dio->inode, sdio->logical_offset_in_bio,
+ dio->private);
dio->bio_cookie = BLK_QC_T_NONE;
} else
dio->bio_cookie = submit_bio(bio);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b9ef2fb1224b..9e5e6ea1c144 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2978,7 +2978,7 @@ extern int nonseekable_open(struct inode * inode, struct file * filp);
#ifdef CONFIG_BLOCK
typedef void (dio_submit_t)(struct bio *bio, struct inode *inode,
- loff_t file_offset);
+ loff_t file_offset, void *private);
enum {
/* need locking between buffered and direct access */
--
2.17.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/3] Btrfs: stop abusing current->journal_info in btrfs_direct_IO()
2018-05-11 6:30 [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O Omar Sandoval
2018-05-11 6:30 ` [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO() Omar Sandoval
2018-05-11 6:30 ` [PATCH 2/3] fs: add private argument to dio_submit_t Omar Sandoval
@ 2018-05-11 6:30 ` Omar Sandoval
2018-05-11 9:26 ` [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O David Sterba
2018-05-11 9:53 ` Nikolay Borisov
4 siblings, 0 replies; 15+ messages in thread
From: Omar Sandoval @ 2018-05-11 6:30 UTC (permalink / raw)
To: linux-btrfs, linux-fsdevel; +Cc: kernel-team
From: Omar Sandoval <osandov@fb.com>
Now that we can pass around the struct btrfs_dio_data through the
different callbacks generically, we don't need to shove it in
current->journal_info.
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/btrfs/inode.c | 33 +++++++--------------------------
1 file changed, 7 insertions(+), 26 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 46e3d366df8b..3673c0420509 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7574,7 +7574,7 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
struct extent_map *em;
struct extent_state *cached_state = NULL;
- struct btrfs_dio_data *dio_data = NULL;
+ struct btrfs_dio_data *dio_data = bh_result->b_private;
u64 start = iblock << inode->i_blkbits;
u64 lockstart, lockend;
u64 len = bh_result->b_size;
@@ -7589,25 +7589,13 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
lockstart = start;
lockend = start + len - 1;
- if (current->journal_info) {
- /*
- * Need to pull our outstanding extents and set journal_info to NULL so
- * that anything that needs to check if there's a transaction doesn't get
- * confused.
- */
- dio_data = current->journal_info;
- current->journal_info = NULL;
- }
-
/*
* If this errors out it's because we couldn't invalidate pagecache for
* this range and we need to fallback to buffered.
*/
if (lock_extent_direct(inode, lockstart, lockend, &cached_state,
- create)) {
- ret = -ENOTBLK;
- goto err;
- }
+ create))
+ return -ENOTBLK;
em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, len, 0);
if (IS_ERR(em)) {
@@ -7732,7 +7720,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
WARN_ON(dio_data->reserve < len);
dio_data->reserve -= len;
dio_data->unsubmitted_oe_range_end = start + len;
- current->journal_info = dio_data;
}
/*
@@ -7755,9 +7742,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
unlock_err:
clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
unlock_bits, 1, 0, &cached_state);
-err:
- if (dio_data)
- current->journal_info = dio_data;
return ret;
}
@@ -8460,7 +8444,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
* time by btrfs_direct_IO().
*/
if (write) {
- struct btrfs_dio_data *dio_data = current->journal_info;
+ struct btrfs_dio_data *dio_data = private;
dio_data->unsubmitted_oe_range_end = dip->logical_offset +
dip->bytes;
@@ -8602,13 +8586,11 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
/*
* We need to know how many extents we reserved so that we can
* do the accounting properly if we go over the number we
- * originally calculated. Abuse current->journal_info for this.
+ * originally calculated.
*/
- dio_data.reserve = round_up(count,
- fs_info->sectorsize);
+ dio_data.reserve = round_up(count, fs_info->sectorsize);
dio_data.unsubmitted_oe_range_start = (u64)offset;
dio_data.unsubmitted_oe_range_end = (u64)offset;
- current->journal_info = &dio_data;
down_read(&BTRFS_I(inode)->dio_sem);
} else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
&BTRFS_I(inode)->runtime_flags)) {
@@ -8620,10 +8602,9 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
ret = __blockdev_direct_IO(iocb, inode,
fs_info->fs_devices->latest_bdev, iter,
btrfs_get_blocks_direct, NULL,
- btrfs_submit_direct, flags, NULL);
+ btrfs_submit_direct, flags, &dio_data);
if (iov_iter_rw(iter) == WRITE) {
up_read(&BTRFS_I(inode)->dio_sem);
- current->journal_info = NULL;
if (ret < 0 && ret != -EIOCBQUEUED) {
if (dio_data.reserve)
btrfs_delalloc_release_space(inode, data_reserved,
--
2.17.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O
2018-05-11 6:30 [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O Omar Sandoval
` (2 preceding siblings ...)
2018-05-11 6:30 ` [PATCH 3/3] Btrfs: stop abusing current->journal_info in btrfs_direct_IO() Omar Sandoval
@ 2018-05-11 9:26 ` David Sterba
2018-05-11 9:53 ` Nikolay Borisov
4 siblings, 0 replies; 15+ messages in thread
From: David Sterba @ 2018-05-11 9:26 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-btrfs, linux-fsdevel, kernel-team
On Thu, May 10, 2018 at 11:30:09PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Hi, everyone,
>
> Btrfs currently abuses current->journal_info in btrfs_direct_IO() in
> order to pass around some state to get_block() and submit_io(). This
> hack is ugly and unnecessary because the data we pass around is only
> used in one call frame.
I'd very much like to get rid of the journal_info hack. The changes to
ther filesystems are minimal.
The 3 patches look good to me, you can add my reviewed-by for btrfs
and ack for the rest. I'm going to do a fstests round too.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O
2018-05-11 6:30 [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O Omar Sandoval
` (3 preceding siblings ...)
2018-05-11 9:26 ` [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O David Sterba
@ 2018-05-11 9:53 ` Nikolay Borisov
2018-05-11 10:24 ` David Sterba
2018-05-11 17:28 ` Omar Sandoval
4 siblings, 2 replies; 15+ messages in thread
From: Nikolay Borisov @ 2018-05-11 9:53 UTC (permalink / raw)
To: Omar Sandoval, linux-btrfs, linux-fsdevel; +Cc: kernel-team
On 11.05.2018 09:30, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Hi, everyone,
>
> Btrfs currently abuses current->journal_info in btrfs_direct_IO() in
> order to pass around some state to get_block() and submit_io(). This
> hack is ugly and unnecessary because the data we pass around is only
> used in one call frame. Robbie Ko also pointed out [1] that it could
> potentially cause a crash if we happen to end up in start_transaction()
> (e.g., from memory reclaim calling into btrfs_evict_inode(), which can
> start a transaction). I'm not convinced that Robbie's case can happen in
> practice since we are using GFP_NOFS for allocations during direct I/O,
> but either way it's fragile and nasty.
When I worked initially on btrfs-over-swap I managed to hit a case where
ext4 stacked on top of btrfs would crash since btrfs will overwrite
journal_info which was set by ext4. So this change is indeed welcome :)
>
> This series stops using current->journal_info and instead adds some
> extra arguments to the generic direct I/O code so that we can pass
> things around like sane people.
>
> Based on Linus' master.
>
> Thanks!
>
> 1: https://patchwork.kernel.org/patch/10389077/
>
> Omar Sandoval (3):
> fs: add initial bh_result->b_private value to __blockdev_direct_IO()
> fs: add private argument to dio_submit_t
> Btrfs: stop abusing current->journal_info in btrfs_direct_IO()
>
> fs/btrfs/inode.c | 39 ++++++++++-----------------------------
> fs/direct-io.c | 12 +++++++-----
> fs/ext4/inode.c | 6 +++---
> fs/gfs2/aops.c | 2 +-
> fs/ocfs2/aops.c | 5 ++---
> include/linux/fs.h | 10 +++++-----
> 6 files changed, 28 insertions(+), 46 deletions(-)
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O
2018-05-11 9:53 ` Nikolay Borisov
@ 2018-05-11 10:24 ` David Sterba
2018-05-11 17:28 ` Omar Sandoval
1 sibling, 0 replies; 15+ messages in thread
From: David Sterba @ 2018-05-11 10:24 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: Omar Sandoval, linux-btrfs, linux-fsdevel, kernel-team
On Fri, May 11, 2018 at 12:53:36PM +0300, Nikolay Borisov wrote:
>
>
> On 11.05.2018 09:30, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > Hi, everyone,
> >
> > Btrfs currently abuses current->journal_info in btrfs_direct_IO() in
> > order to pass around some state to get_block() and submit_io(). This
> > hack is ugly and unnecessary because the data we pass around is only
> > used in one call frame. Robbie Ko also pointed out [1] that it could
> > potentially cause a crash if we happen to end up in start_transaction()
> > (e.g., from memory reclaim calling into btrfs_evict_inode(), which can
> > start a transaction). I'm not convinced that Robbie's case can happen in
> > practice since we are using GFP_NOFS for allocations during direct I/O,
> > but either way it's fragile and nasty.
>
> When I worked initially on btrfs-over-swap I managed to hit a case where
> ext4 stacked on top of btrfs would crash since btrfs will overwrite
> journal_info which was set by ext4. So this change is indeed welcome :)
And also this, https://lkml.org/lkml/2017/12/14/165.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O
2018-05-11 9:53 ` Nikolay Borisov
2018-05-11 10:24 ` David Sterba
@ 2018-05-11 17:28 ` Omar Sandoval
1 sibling, 0 replies; 15+ messages in thread
From: Omar Sandoval @ 2018-05-11 17:28 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: linux-btrfs, linux-fsdevel, kernel-team
On Fri, May 11, 2018 at 12:53:36PM +0300, Nikolay Borisov wrote:
>
>
> On 11.05.2018 09:30, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > Hi, everyone,
> >
> > Btrfs currently abuses current->journal_info in btrfs_direct_IO() in
> > order to pass around some state to get_block() and submit_io(). This
> > hack is ugly and unnecessary because the data we pass around is only
> > used in one call frame. Robbie Ko also pointed out [1] that it could
> > potentially cause a crash if we happen to end up in start_transaction()
> > (e.g., from memory reclaim calling into btrfs_evict_inode(), which can
> > start a transaction). I'm not convinced that Robbie's case can happen in
> > practice since we are using GFP_NOFS for allocations during direct I/O,
> > but either way it's fragile and nasty.
>
> When I worked initially on btrfs-over-swap I managed to hit a case where
> ext4 stacked on top of btrfs would crash since btrfs will overwrite
> journal_info which was set by ext4. So this change is indeed welcome :)
Yup, that's what I originally made these patches for. Although my latest
idea for swap is to do something along the lines of Darrick's
iomap_swap_activate(): https://patchwork.kernel.org/patch/10376435/,
I'll be getting back to that soon.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-05-11 6:30 ` [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO() Omar Sandoval
@ 2018-05-11 20:05 ` Al Viro
2018-05-11 20:30 ` Omar Sandoval
0 siblings, 1 reply; 15+ messages in thread
From: Al Viro @ 2018-05-11 20:05 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-btrfs, linux-fsdevel, kernel-team
On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> struct block_device *bdev, struct iov_iter *iter,
> get_block_t get_block, dio_iodone_t end_io,
> - dio_submit_t submit_io, int flags)
> + dio_submit_t submit_io, int flags, void *private)
Oh, dear... That's what, 9 arguments? I agree that the hack in question
is obscene, but so is this ;-/
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-05-11 20:05 ` Al Viro
@ 2018-05-11 20:30 ` Omar Sandoval
2018-05-11 20:32 ` Al Viro
2018-05-14 16:35 ` David Sterba
0 siblings, 2 replies; 15+ messages in thread
From: Omar Sandoval @ 2018-05-11 20:30 UTC (permalink / raw)
To: Al Viro; +Cc: linux-btrfs, linux-fsdevel, kernel-team
On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote:
> On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > struct block_device *bdev, struct iov_iter *iter,
> > get_block_t get_block, dio_iodone_t end_io,
> > - dio_submit_t submit_io, int flags)
> > + dio_submit_t submit_io, int flags, void *private)
>
> Oh, dear... That's what, 9 arguments? I agree that the hack in question
> is obscene, but so is this ;-/
So looking at these one by one, obviously needed:
- iocb
- inode
- iter
bdev is almost always inode->i_sb->s_bdev, except for Btrfs :(
These could _maybe_ go in struct kiocb:
- flags could maybe be folded into ki_flags
- private could maybe go in iocb->private, but I haven't yet read
through to figure out if we're already using iocb->private for direct
I/O
That leaves the callbacks, get_block, end_io, and submit_io. Perhaps we
can add those to inode_operations?
Thoughts?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-05-11 20:30 ` Omar Sandoval
@ 2018-05-11 20:32 ` Al Viro
2018-05-11 20:41 ` Omar Sandoval
2018-05-14 16:35 ` David Sterba
1 sibling, 1 reply; 15+ messages in thread
From: Al Viro @ 2018-05-11 20:32 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-btrfs, linux-fsdevel, kernel-team
On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote:
> On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote:
> > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> > > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > > struct block_device *bdev, struct iov_iter *iter,
> > > get_block_t get_block, dio_iodone_t end_io,
> > > - dio_submit_t submit_io, int flags)
> > > + dio_submit_t submit_io, int flags, void *private)
> >
> > Oh, dear... That's what, 9 arguments? I agree that the hack in question
> > is obscene, but so is this ;-/
>
> So looking at these one by one, obviously needed:
>
> - iocb
> - inode
> - iter
>
> bdev is almost always inode->i_sb->s_bdev, except for Btrfs :(
>
> These could _maybe_ go in struct kiocb:
>
> - flags could maybe be folded into ki_flags
> - private could maybe go in iocb->private, but I haven't yet read
> through to figure out if we're already using iocb->private for direct
> I/O
>
> That leaves the callbacks, get_block, end_io, and submit_io. Perhaps we
> can add those to inode_operations?
Or, perhaps, btrfs shouldn't be using the common helper? The question
is not where to stash the bits and pieces - it's how unreadable the callers
are and how much boilerplate/hidden information is involved...
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-05-11 20:32 ` Al Viro
@ 2018-05-11 20:41 ` Omar Sandoval
0 siblings, 0 replies; 15+ messages in thread
From: Omar Sandoval @ 2018-05-11 20:41 UTC (permalink / raw)
To: Al Viro; +Cc: linux-btrfs, linux-fsdevel, kernel-team
On Fri, May 11, 2018 at 09:32:28PM +0100, Al Viro wrote:
> On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote:
> > On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote:
> > > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> > > > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > > > struct block_device *bdev, struct iov_iter *iter,
> > > > get_block_t get_block, dio_iodone_t end_io,
> > > > - dio_submit_t submit_io, int flags)
> > > > + dio_submit_t submit_io, int flags, void *private)
> > >
> > > Oh, dear... That's what, 9 arguments? I agree that the hack in question
> > > is obscene, but so is this ;-/
> >
> > So looking at these one by one, obviously needed:
> >
> > - iocb
> > - inode
> > - iter
> >
> > bdev is almost always inode->i_sb->s_bdev, except for Btrfs :(
> >
> > These could _maybe_ go in struct kiocb:
> >
> > - flags could maybe be folded into ki_flags
> > - private could maybe go in iocb->private, but I haven't yet read
> > through to figure out if we're already using iocb->private for direct
> > I/O
Modifying kiocb isn't going to pan out, it's constructed way up in the
stack so that'd be a mess.
> > That leaves the callbacks, get_block, end_io, and submit_io. Perhaps we
> > can add those to inode_operations?
>
> Or, perhaps, btrfs shouldn't be using the common helper? The question
> is not where to stash the bits and pieces - it's how unreadable the callers
> are and how much boilerplate/hidden information is involved...
I need to call through to do_blockdev_direct_IO() eventually, I'm sure
no one wants me to reimplement the 200 lines in there :) so I'd be happy
to add a separate helper that only Btrfs uses, but if we're going to
call do_blockdev_direct_IO() eventually then we still need the 9
arguments in some form. Am I misunderstanding?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-05-11 20:30 ` Omar Sandoval
2018-05-11 20:32 ` Al Viro
@ 2018-05-14 16:35 ` David Sterba
2018-06-25 17:16 ` David Sterba
1 sibling, 1 reply; 15+ messages in thread
From: David Sterba @ 2018-05-14 16:35 UTC (permalink / raw)
To: Omar Sandoval; +Cc: Al Viro, linux-btrfs, linux-fsdevel, kernel-team
On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote:
> On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote:
> > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> > > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > > struct block_device *bdev, struct iov_iter *iter,
> > > get_block_t get_block, dio_iodone_t end_io,
> > > - dio_submit_t submit_io, int flags)
> > > + dio_submit_t submit_io, int flags, void *private)
> >
> > Oh, dear... That's what, 9 arguments? I agree that the hack in question
> > is obscene, but so is this ;-/
>
> So looking at these one by one, obviously needed:
>
> - iocb
> - inode
> - iter
>
> bdev is almost always inode->i_sb->s_bdev, except for Btrfs :(
>
> These could _maybe_ go in struct kiocb:
>
> - flags could maybe be folded into ki_flags
> - private could maybe go in iocb->private, but I haven't yet read
> through to figure out if we're already using iocb->private for direct
> I/O
I think the kiocb::private can be used for the purpose. There's only one
user, ext4, that also passes some DIO data around so it would in line
with the interface AFAICS.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-05-14 16:35 ` David Sterba
@ 2018-06-25 17:16 ` David Sterba
2018-06-29 7:02 ` Omar Sandoval
0 siblings, 1 reply; 15+ messages in thread
From: David Sterba @ 2018-06-25 17:16 UTC (permalink / raw)
To: dsterba, Omar Sandoval, Al Viro, linux-btrfs, linux-fsdevel, kernel-team
On Mon, May 14, 2018 at 06:35:48PM +0200, David Sterba wrote:
> On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote:
> > On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote:
> > > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> > > > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > > > struct block_device *bdev, struct iov_iter *iter,
> > > > get_block_t get_block, dio_iodone_t end_io,
> > > > - dio_submit_t submit_io, int flags)
> > > > + dio_submit_t submit_io, int flags, void *private)
> > >
> > > Oh, dear... That's what, 9 arguments? I agree that the hack in question
> > > is obscene, but so is this ;-/
> >
> > So looking at these one by one, obviously needed:
> >
> > - iocb
> > - inode
> > - iter
> >
> > bdev is almost always inode->i_sb->s_bdev, except for Btrfs :(
> >
> > These could _maybe_ go in struct kiocb:
> >
> > - flags could maybe be folded into ki_flags
> > - private could maybe go in iocb->private, but I haven't yet read
> > through to figure out if we're already using iocb->private for direct
> > I/O
>
> I think the kiocb::private can be used for the purpose. There's only one
> user, ext4, that also passes some DIO data around so it would in line
> with the interface AFAICS.
Omar, do you have an update to the patchset? Thanks.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()
2018-06-25 17:16 ` David Sterba
@ 2018-06-29 7:02 ` Omar Sandoval
0 siblings, 0 replies; 15+ messages in thread
From: Omar Sandoval @ 2018-06-29 7:02 UTC (permalink / raw)
To: dsterba, Al Viro, linux-btrfs, linux-fsdevel, kernel-team
On Mon, Jun 25, 2018 at 07:16:38PM +0200, David Sterba wrote:
> On Mon, May 14, 2018 at 06:35:48PM +0200, David Sterba wrote:
> > On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote:
> > > On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote:
> > > > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> > > > > do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > > > > struct block_device *bdev, struct iov_iter *iter,
> > > > > get_block_t get_block, dio_iodone_t end_io,
> > > > > - dio_submit_t submit_io, int flags)
> > > > > + dio_submit_t submit_io, int flags, void *private)
> > > >
> > > > Oh, dear... That's what, 9 arguments? I agree that the hack in question
> > > > is obscene, but so is this ;-/
> > >
> > > So looking at these one by one, obviously needed:
> > >
> > > - iocb
> > > - inode
> > > - iter
> > >
> > > bdev is almost always inode->i_sb->s_bdev, except for Btrfs :(
> > >
> > > These could _maybe_ go in struct kiocb:
> > >
> > > - flags could maybe be folded into ki_flags
> > > - private could maybe go in iocb->private, but I haven't yet read
> > > through to figure out if we're already using iocb->private for direct
> > > I/O
> >
> > I think the kiocb::private can be used for the purpose. There's only one
> > user, ext4, that also passes some DIO data around so it would in line
> > with the interface AFAICS.
>
> Omar, do you have an update to the patchset? Thanks.
Al, what do you think of changing all users of map_bh->b_private to use
iocb->private? We'd have to pass the iocb to get_block() and
submit_io(), but we could get rid of dio->private.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-06-29 7:02 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-11 6:30 [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O Omar Sandoval
2018-05-11 6:30 ` [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO() Omar Sandoval
2018-05-11 20:05 ` Al Viro
2018-05-11 20:30 ` Omar Sandoval
2018-05-11 20:32 ` Al Viro
2018-05-11 20:41 ` Omar Sandoval
2018-05-14 16:35 ` David Sterba
2018-06-25 17:16 ` David Sterba
2018-06-29 7:02 ` Omar Sandoval
2018-05-11 6:30 ` [PATCH 2/3] fs: add private argument to dio_submit_t Omar Sandoval
2018-05-11 6:30 ` [PATCH 3/3] Btrfs: stop abusing current->journal_info in btrfs_direct_IO() Omar Sandoval
2018-05-11 9:26 ` [PATCH 0/3] Btrfs: stop abusing current->journal_info for direct I/O David Sterba
2018-05-11 9:53 ` Nikolay Borisov
2018-05-11 10:24 ` David Sterba
2018-05-11 17:28 ` Omar Sandoval
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).