On Mon, Jul 20, 2020 at 10:21:18PM +0900, Johannes Thumshirn wrote: >If we get an async I/O iocb with an O_APPEND or RWF_APPEND flag set, >submit it using REQ_OP_ZONE_APPEND to the block layer. > >As an REQ_OP_ZONE_APPEND bio must not be split, this does come with an >additional constraint, namely the buffer submitted to zonefs must not be >bigger than the max zone append size of the underlying device. For >synchronous I/O we don't care about this constraint as we can return short >writes, for AIO we need to return an error on too big buffers. I wonder what part of the patch implements that constraint on large buffer and avoids short-write. Existing code seems to trim iov_iter in the outset. max = queue_max_zone_append_sectors(bdev_get_queue(bdev)); max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize); iov_iter_truncate(from, max); This will prevent large-buffer seeing that error, and will lead to partial write. >On a successful completion, the position the data is written to is >returned via AIO's res2 field to the calling application. > >Signed-off-by: Johannes Thumshirn >--- > fs/zonefs/super.c | 143 +++++++++++++++++++++++++++++++++++++++------ > fs/zonefs/zonefs.h | 3 + > 2 files changed, 128 insertions(+), 18 deletions(-) > >diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c >index 5832e9f69268..f155a658675b 100644 >--- a/fs/zonefs/super.c >+++ b/fs/zonefs/super.c >@@ -24,6 +24,8 @@ > > #include "zonefs.h" > >+static struct bio_set zonefs_dio_bio_set; >+ > static inline int zonefs_zone_mgmt(struct zonefs_inode_info *zi, > enum req_opf op) > { >@@ -700,16 +702,71 @@ static const struct iomap_dio_ops zonefs_write_dio_ops = { > .end_io = zonefs_file_write_dio_end_io, > }; > >+struct zonefs_dio { >+ struct kiocb *iocb; >+ struct task_struct *waiter; >+ int error; >+ struct work_struct work; >+ size_t size; >+ u64 sector; >+ struct completion completion; >+ struct bio bio; >+}; How about this (will save 32 bytes) - +struct zonefs_dio { + struct kiocb *iocb; + struct task_struct *waiter; + int error; + union { + struct work_struct work; //only for async IO + struct completion completion; //only for sync IO + }; + size_t size; + u64 sector; + struct bio bio; +}; And dio->error field is not required. I see it being used at one place - + ret = zonefs_file_write_dio_end_io(iocb, dio->size, + dio->error, 0); Here error-code can be picked from dio->bio.