* [PATCH 00/10] iov_iter: Add new iters and use with AFS
@ 2018-08-06 13:16 David Howells
2018-08-06 13:16 ` [PATCH 01/10] iov_iter: Separate type from direction and use accessor functions David Howells
` (9 more replies)
0 siblings, 10 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:16 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Hi Al,
Here's a set of patches that adds two new iov_iter types and then makes AFS
use them to do I/O. The iov_iter changes are:
(1) Separate the type from the direction in the iov_iter struct and
provide accessor functions to wrap type checking.
(2) Renumber the type constants to be contiguous small unsigned integers,
starting from 0 and then use switch-statements rather than if-else
chains using bit-testing.
Note that the compiler can optimise this better by using CMP rather
than AND/TEST, say, as comparing integers requires fewer CMP
instructions or can use jump tables.
(3) Change count and iov_offset from size_t to loff_t. This allows
iov_offset to be then used as a byte offset with ITER_MAPPING and
allows 4GiB and larger reads and writes to be proposed.
This makes no difference on a 64-bit system, but does make a 32-bit
compilation a bit larger. Possibly only iov_offset really needs to be
made loff_t and size >= 4GiB should just be disallowed on 32-bit.
(4) Add an ITER_MAPPING iterator type. This provides an iterator that
directly accesses an address_space, and assumes that the target pages
are in some way locked (eg. PG_lock or PG_writeback).
(5) Add an ITER_DISCARD iterator type. This provides an iterator that
simply discards anything written to it. It cannot be used as a data
source.
The afs changes are:
(1) Use a single ITER_MAPPING iterator to cover all the data read by a
single FetchData RPC op. This involves switching the operation to an
ITER_KVEC iterator to read the status record that's transmitted after
the data.
(2) Use an ITER_DISCARD iterator to discard any extra data the server may
have included that we don't want.
(3) Use a single ITER_MAPPING iterator to cover all the data written by a
single StoreData RPC op.
(4) Add synchronous O_DIRECT read support.
The patches can be found here also:
http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=afs-next
David
---
David Howells (10):
iov_iter: Separate type from direction and use accessor functions
iov_iter: Renumber the ITER_* constants in uio.h
iov_iter: Make count and iov_offset loff_t not size_t
iov_iter: Add mapping and discard iterator types
afs: Better tracing of protocol errors
afs: Set up the iov_iter before calling afs_extract_data()
afs: Use ITER_MAPPING for writing
afs: Add O_DIRECT read support
afs: Add a couple of tracepoints to log I/O errors
afs: Don't invoke the server to read data beyond EOF
block/bio.c | 2
drivers/block/drbd/drbd_main.c | 2
drivers/block/drbd/drbd_receiver.c | 2
drivers/block/loop.c | 9
drivers/block/nbd.c | 12 -
drivers/isdn/mISDN/l1oip_core.c | 3
drivers/misc/vmw_vmci/vmci_queue_pair.c | 6
drivers/nvme/target/io-cmd-file.c | 2
drivers/target/iscsi/iscsi_target_util.c | 6
drivers/target/target_core_file.c | 6
drivers/usb/usbip/usbip_common.c | 2
drivers/xen/pvcalls-back.c | 8
fs/9p/vfs_addr.c | 4
fs/9p/vfs_dir.c | 2
fs/9p/vfs_file.c | 2
fs/9p/xattr.c | 4
fs/afs/cmservice.c | 56 +--
fs/afs/dir.c | 210 ++++++++---
fs/afs/file.c | 263 ++++++++++----
fs/afs/fsclient.c | 409 ++++++++-------------
fs/afs/inode.c | 2
fs/afs/internal.h | 81 +++-
fs/afs/mntpt.c | 2
fs/afs/rxrpc.c | 156 ++------
fs/afs/server.c | 4
fs/afs/vlclient.c | 134 +++----
fs/afs/volume.c | 2
fs/afs/write.c | 112 ++++--
fs/block_dev.c | 2
fs/btrfs/file.c | 7
fs/ceph/file.c | 7
fs/cifs/connect.c | 4
fs/cifs/file.c | 4
fs/cifs/misc.c | 4
fs/cifs/smb2ops.c | 4
fs/cifs/smbdirect.c | 17 +
fs/cifs/transport.c | 8
fs/direct-io.c | 2
fs/dlm/lowcomms.c | 2
fs/fuse/file.c | 2
fs/iomap.c | 2
fs/nfs/direct.c | 2
fs/nfs/file.c | 4
fs/nfsd/vfs.c | 4
fs/ocfs2/cluster/tcp.c | 2
fs/orangefs/inode.c | 2
fs/splice.c | 7
include/linux/fscache.h | 31 ++
include/linux/uio.h | 71 ++--
include/trace/events/afs.h | 202 ++++++++---
lib/iov_iter.c | 572 ++++++++++++++++++++++++------
mm/filemap.c | 2
mm/page_io.c | 2
net/9p/client.c | 4
net/9p/trans_virtio.c | 2
net/bluetooth/6lowpan.c | 2
net/bluetooth/a2mp.c | 2
net/bluetooth/smp.c | 2
net/ceph/messenger.c | 6
net/netfilter/ipvs/ip_vs_sync.c | 2
net/rxrpc/recvmsg.c | 4
net/smc/smc_clc.c | 4
net/socket.c | 6
net/sunrpc/svcsock.c | 2
net/tipc/topsrv.c | 2
net/tls/tls_device.c | 4
net/tls/tls_sw.c | 4
67 files changed, 1555 insertions(+), 960 deletions(-)
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 01/10] iov_iter: Separate type from direction and use accessor functions
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
@ 2018-08-06 13:16 ` David Howells
2018-08-06 13:16 ` [PATCH 02/10] iov_iter: Renumber the ITER_* constants in uio.h David Howells
` (8 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:16 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
In the iov_iter struct, separate the iterator type from the iterator
direction and use accessor functions to access them in most places.
Convert a bunch of places to use switch-statements to access them rather
then chains of bitwise-AND statements. This makes it easier to add further
iterator types. Also, this can be more efficient as to implement a switch
of small contiguous integers, the compiler can use ~50% fewer compare
instructions than it has to use bitwise-and instructions.
Further, cease passing the iterator type into the iterator setup function.
The iterator function can set that itself. Only the direction is required.
Signed-off-by: David Howells <dhowells@redhat.com>
---
block/bio.c | 2
drivers/block/drbd/drbd_main.c | 2
drivers/block/drbd/drbd_receiver.c | 2
drivers/block/loop.c | 9 +-
drivers/block/nbd.c | 12 +-
drivers/isdn/mISDN/l1oip_core.c | 3 -
drivers/misc/vmw_vmci/vmci_queue_pair.c | 6 +
drivers/nvme/target/io-cmd-file.c | 2
drivers/target/iscsi/iscsi_target_util.c | 6 -
drivers/target/target_core_file.c | 6 +
drivers/usb/usbip/usbip_common.c | 2
drivers/xen/pvcalls-back.c | 8 +-
fs/9p/vfs_addr.c | 4 -
fs/9p/vfs_dir.c | 2
fs/9p/xattr.c | 4 -
fs/afs/rxrpc.c | 15 +--
fs/block_dev.c | 2
fs/ceph/file.c | 7 +
fs/cifs/connect.c | 4 -
fs/cifs/file.c | 4 -
fs/cifs/misc.c | 4 -
fs/cifs/smb2ops.c | 4 -
fs/cifs/smbdirect.c | 17 +++
fs/cifs/transport.c | 8 +-
fs/direct-io.c | 2
fs/dlm/lowcomms.c | 2
fs/fuse/file.c | 2
fs/iomap.c | 2
fs/nfsd/vfs.c | 4 -
fs/ocfs2/cluster/tcp.c | 2
fs/orangefs/inode.c | 2
fs/splice.c | 7 +
include/linux/uio.h | 49 ++++++----
lib/iov_iter.c | 154 ++++++++++++++++++------------
mm/filemap.c | 2
mm/page_io.c | 2
net/9p/client.c | 2
net/9p/trans_virtio.c | 2
net/bluetooth/6lowpan.c | 2
net/bluetooth/a2mp.c | 2
net/bluetooth/smp.c | 2
net/ceph/messenger.c | 6 +
net/netfilter/ipvs/ip_vs_sync.c | 2
net/smc/smc_clc.c | 4 -
net/socket.c | 6 +
net/sunrpc/svcsock.c | 2
net/tipc/topsrv.c | 2
net/tls/tls_device.c | 4 -
net/tls/tls_sw.c | 4 -
49 files changed, 223 insertions(+), 182 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 047c5dca6d90..df999480c0f4 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1330,7 +1330,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
/*
* success
*/
- if (((iter->type & WRITE) && (!map_data || !map_data->null_mapped)) ||
+ if ((iov_iter_rw(iter) == WRITE && (!map_data || !map_data->null_mapped)) ||
(map_data && map_data->from_user)) {
ret = bio_copy_from_iter(bio, iter);
if (ret)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index a80809bd3057..53d840ab1df2 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1856,7 +1856,7 @@ int drbd_send(struct drbd_connection *connection, struct socket *sock,
/* THINK if (signal_pending) return ... ? */
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg.msg_iter, WRITE, &iov, 1, size);
if (sock == connection->data.socket) {
rcu_read_lock();
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index be9450f5ad1c..d1b05e124325 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -516,7 +516,7 @@ static int drbd_recv_short(struct socket *sock, void *buf, size_t size, int flag
struct msghdr msg = {
.msg_flags = (flags ? flags : MSG_WAITALL | MSG_NOSIGNAL)
};
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, size);
return sock_recvmsg(sock, &msg, msg.msg_flags);
}
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 4cb1d1be3cfb..13598884a787 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -268,7 +268,7 @@ static int lo_write_bvec(struct file *file, struct bio_vec *bvec, loff_t *ppos)
struct iov_iter i;
ssize_t bw;
- iov_iter_bvec(&i, ITER_BVEC | WRITE, bvec, 1, bvec->bv_len);
+ iov_iter_bvec(&i, WRITE, bvec, 1, bvec->bv_len);
file_start_write(file);
bw = vfs_iter_write(file, &i, ppos, 0);
@@ -346,7 +346,7 @@ static int lo_read_simple(struct loop_device *lo, struct request *rq,
ssize_t len;
rq_for_each_segment(bvec, rq, iter) {
- iov_iter_bvec(&i, ITER_BVEC, &bvec, 1, bvec.bv_len);
+ iov_iter_bvec(&i, READ, &bvec, 1, bvec.bv_len);
len = vfs_iter_read(lo->lo_backing_file, &i, &pos, 0);
if (len < 0)
return len;
@@ -387,7 +387,7 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq,
b.bv_offset = 0;
b.bv_len = bvec.bv_len;
- iov_iter_bvec(&i, ITER_BVEC, &b, 1, b.bv_len);
+ iov_iter_bvec(&i, READ, &b, 1, b.bv_len);
len = vfs_iter_read(lo->lo_backing_file, &i, &pos, 0);
if (len < 0) {
ret = len;
@@ -554,8 +554,7 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
}
atomic_set(&cmd->ref, 2);
- iov_iter_bvec(&iter, ITER_BVEC | rw, bvec,
- segments, blk_rq_bytes(rq));
+ iov_iter_bvec(&iter, rw, bvec, segments, blk_rq_bytes(rq));
iter.iov_offset = offset;
cmd->iocb.ki_pos = pos;
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 3863c00372bb..3e5fb3decdce 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -473,7 +473,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
u32 nbd_cmd_flags = 0;
int sent = nsock->sent, skip = 0;
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &iov, 1, sizeof(request));
+ iov_iter_kvec(&from, WRITE, &iov, 1, sizeof(request));
switch (req_op(req)) {
case REQ_OP_DISCARD:
@@ -564,8 +564,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
dev_dbg(nbd_to_dev(nbd), "request %p: sending %d bytes data\n",
req, bvec.bv_len);
- iov_iter_bvec(&from, ITER_BVEC | WRITE,
- &bvec, 1, bvec.bv_len);
+ iov_iter_bvec(&from, WRITE, &bvec, 1, bvec.bv_len);
if (skip) {
if (skip >= iov_iter_count(&from)) {
skip -= iov_iter_count(&from);
@@ -624,7 +623,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
int ret = 0;
reply.magic = 0;
- iov_iter_kvec(&to, READ | ITER_KVEC, &iov, 1, sizeof(reply));
+ iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
if (result <= 0) {
if (!nbd_disconnected(config))
@@ -678,8 +677,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
struct bio_vec bvec;
rq_for_each_segment(bvec, req, iter) {
- iov_iter_bvec(&to, ITER_BVEC | READ,
- &bvec, 1, bvec.bv_len);
+ iov_iter_bvec(&to, READ, &bvec, 1, bvec.bv_len);
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
if (result <= 0) {
dev_err(disk_to_dev(nbd->disk), "Receive data failed (result %d)\n",
@@ -1073,7 +1071,7 @@ static void send_disconnects(struct nbd_device *nbd)
for (i = 0; i < config->num_connections; i++) {
struct nbd_sock *nsock = config->socks[i];
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &iov, 1, sizeof(request));
+ iov_iter_kvec(&from, WRITE, &iov, 1, sizeof(request));
mutex_lock(&nsock->tx_lock);
ret = sock_xmit(nbd, i, 1, &from, 0, NULL);
if (ret <= 0)
diff --git a/drivers/isdn/mISDN/l1oip_core.c b/drivers/isdn/mISDN/l1oip_core.c
index b05022f94f18..072bb5e36c18 100644
--- a/drivers/isdn/mISDN/l1oip_core.c
+++ b/drivers/isdn/mISDN/l1oip_core.c
@@ -718,8 +718,7 @@ l1oip_socket_thread(void *data)
printk(KERN_DEBUG "%s: socket created and open\n",
__func__);
while (!signal_pending(current)) {
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1,
- recvbuf_size);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, recvbuf_size);
recvlen = sock_recvmsg(socket, &msg, 0);
if (recvlen > 0) {
l1oip_socket_parse(hc, &sin_rx, recvbuf, recvlen);
diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index b4d7774cfe07..a7fb26246385 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -3035,7 +3035,7 @@ ssize_t vmci_qpair_enqueue(struct vmci_qp *qpair,
if (!qpair || !buf)
return VMCI_ERROR_INVALID_ARGS;
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &v, 1, buf_size);
+ iov_iter_kvec(&from, WRITE, &v, 1, buf_size);
qp_lock(qpair);
@@ -3079,7 +3079,7 @@ ssize_t vmci_qpair_dequeue(struct vmci_qp *qpair,
if (!qpair || !buf)
return VMCI_ERROR_INVALID_ARGS;
- iov_iter_kvec(&to, READ | ITER_KVEC, &v, 1, buf_size);
+ iov_iter_kvec(&to, READ, &v, 1, buf_size);
qp_lock(qpair);
@@ -3124,7 +3124,7 @@ ssize_t vmci_qpair_peek(struct vmci_qp *qpair,
if (!qpair || !buf)
return VMCI_ERROR_INVALID_ARGS;
- iov_iter_kvec(&to, READ | ITER_KVEC, &v, 1, buf_size);
+ iov_iter_kvec(&to, READ, &v, 1, buf_size);
qp_lock(qpair);
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 8c42b3a8c420..ba05278b21b2 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -96,7 +96,7 @@ static ssize_t nvmet_file_submit_bvec(struct nvmet_req *req, loff_t pos,
rw = READ;
}
- iov_iter_bvec(&iter, ITER_BVEC | rw, req->f.bvec, nr_segs, count);
+ iov_iter_bvec(&iter, rw, req->f.bvec, nr_segs, count);
iocb->ki_pos = pos;
iocb->ki_filp = req->ns->file;
diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index 4435bf374d2d..c86039c0fbc1 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -1249,8 +1249,7 @@ static int iscsit_do_rx_data(
return -1;
memset(&msg, 0, sizeof(struct msghdr));
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC,
- count->iov, count->iov_count, data);
+ iov_iter_kvec(&msg.msg_iter, READ, count->iov, count->iov_count, data);
while (msg_data_left(&msg)) {
rx_loop = sock_recvmsg(conn->sock, &msg, MSG_WAITALL);
@@ -1306,8 +1305,7 @@ int tx_data(
memset(&msg, 0, sizeof(struct msghdr));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC,
- iov, iov_count, data);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iov, iov_count, data);
while (msg_data_left(&msg)) {
int tx_loop = sock_sendmsg(conn->sock, &msg);
diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index 16751ae55d7b..49b110d1b972 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -303,7 +303,7 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
len += sg->length;
}
- iov_iter_bvec(&iter, ITER_BVEC | is_write, bvec, sgl_nents, len);
+ iov_iter_bvec(&iter, is_write, bvec, sgl_nents, len);
aio_cmd->cmd = cmd;
aio_cmd->len = len;
@@ -353,7 +353,7 @@ static int fd_do_rw(struct se_cmd *cmd, struct file *fd,
len += sg->length;
}
- iov_iter_bvec(&iter, ITER_BVEC, bvec, sgl_nents, len);
+ iov_iter_bvec(&iter, READ, bvec, sgl_nents, len);
if (is_write)
ret = vfs_iter_write(fd, &iter, &pos, 0);
else
@@ -490,7 +490,7 @@ fd_execute_write_same(struct se_cmd *cmd)
len += se_dev->dev_attrib.block_size;
}
- iov_iter_bvec(&iter, ITER_BVEC, bvec, nolb, len);
+ iov_iter_bvec(&iter, READ, bvec, nolb, len);
ret = vfs_iter_write(fd_dev->fd_file, &iter, &pos, 0);
kfree(bvec);
diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c
index 9756752c0681..45da3e01c7b0 100644
--- a/drivers/usb/usbip/usbip_common.c
+++ b/drivers/usb/usbip/usbip_common.c
@@ -309,7 +309,7 @@ int usbip_recv(struct socket *sock, void *buf, int size)
if (!sock || !buf || !size)
return -EINVAL;
- iov_iter_kvec(&msg.msg_iter, READ|ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, size);
usbip_dbg_xmit("enter\n");
diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index b1092fbefa63..2e5d845b5091 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -137,13 +137,13 @@ static void pvcalls_conn_back_read(void *opaque)
if (masked_prod < masked_cons) {
vec[0].iov_base = data->in + masked_prod;
vec[0].iov_len = wanted;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|WRITE, vec, 1, wanted);
+ iov_iter_kvec(&msg.msg_iter, WRITE, vec, 1, wanted);
} else {
vec[0].iov_base = data->in + masked_prod;
vec[0].iov_len = array_size - masked_prod;
vec[1].iov_base = data->in;
vec[1].iov_len = wanted - vec[0].iov_len;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|WRITE, vec, 2, wanted);
+ iov_iter_kvec(&msg.msg_iter, WRITE, vec, 2, wanted);
}
atomic_set(&map->read, 0);
@@ -195,13 +195,13 @@ static void pvcalls_conn_back_write(struct sock_mapping *map)
if (pvcalls_mask(prod, array_size) > pvcalls_mask(cons, array_size)) {
vec[0].iov_base = data->out + pvcalls_mask(cons, array_size);
vec[0].iov_len = size;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|READ, vec, 1, size);
+ iov_iter_kvec(&msg.msg_iter, READ, vec, 1, size);
} else {
vec[0].iov_base = data->out + pvcalls_mask(cons, array_size);
vec[0].iov_len = array_size - pvcalls_mask(cons, array_size);
vec[1].iov_base = data->out;
vec[1].iov_len = size - vec[0].iov_len;
- iov_iter_kvec(&msg.msg_iter, ITER_KVEC|READ, vec, 2, size);
+ iov_iter_kvec(&msg.msg_iter, READ, vec, 2, size);
}
atomic_set(&map->write, 0);
diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
index e1cbdfdb7c68..0bcbcc20f769 100644
--- a/fs/9p/vfs_addr.c
+++ b/fs/9p/vfs_addr.c
@@ -65,7 +65,7 @@ static int v9fs_fid_readpage(struct p9_fid *fid, struct page *page)
if (retval == 0)
return retval;
- iov_iter_bvec(&to, ITER_BVEC | READ, &bvec, 1, PAGE_SIZE);
+ iov_iter_bvec(&to, READ, &bvec, 1, PAGE_SIZE);
retval = p9_client_read(fid, page_offset(page), &to, &err);
if (err) {
@@ -175,7 +175,7 @@ static int v9fs_vfs_writepage_locked(struct page *page)
bvec.bv_page = page;
bvec.bv_offset = 0;
bvec.bv_len = len;
- iov_iter_bvec(&from, ITER_BVEC | WRITE, &bvec, 1, len);
+ iov_iter_bvec(&from, WRITE, &bvec, 1, len);
/* We should have writeback_fid always set */
BUG_ON(!v9inode->writeback_fid);
diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index b0405d6aac85..d5db3c968a03 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -133,7 +133,7 @@ static int v9fs_dir_readdir(struct file *file, struct dir_context *ctx)
if (rdir->tail == rdir->head) {
struct iov_iter to;
int n;
- iov_iter_kvec(&to, READ | ITER_KVEC, &kvec, 1, buflen);
+ iov_iter_kvec(&to, READ, &kvec, 1, buflen);
n = p9_client_read(file->private_data, ctx->pos, &to,
&err);
if (err)
diff --git a/fs/9p/xattr.c b/fs/9p/xattr.c
index f329eee6dc93..92ed5ecfafd1 100644
--- a/fs/9p/xattr.c
+++ b/fs/9p/xattr.c
@@ -32,7 +32,7 @@ ssize_t v9fs_fid_xattr_get(struct p9_fid *fid, const char *name,
struct iov_iter to;
int err;
- iov_iter_kvec(&to, READ | ITER_KVEC, &kvec, 1, buffer_size);
+ iov_iter_kvec(&to, READ, &kvec, 1, buffer_size);
attr_fid = p9_client_xattrwalk(fid, name, &attr_size);
if (IS_ERR(attr_fid)) {
@@ -107,7 +107,7 @@ int v9fs_fid_xattr_set(struct p9_fid *fid, const char *name,
struct iov_iter from;
int retval;
- iov_iter_kvec(&from, WRITE | ITER_KVEC, &kvec, 1, value_len);
+ iov_iter_kvec(&from, WRITE, &kvec, 1, value_len);
p9_debug(P9_DEBUG_VFS, "name = %s value_len = %zu flags = %d\n",
name, value_len, flags);
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 19db5f672a9d..55441af2de53 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -286,7 +286,7 @@ static void afs_load_bvec(struct afs_call *call, struct msghdr *msg,
offset = 0;
}
- iov_iter_bvec(&msg->msg_iter, WRITE | ITER_BVEC, bv, nr, bytes);
+ iov_iter_bvec(&msg->msg_iter, WRITE, bv, nr, bytes);
}
/*
@@ -401,8 +401,7 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
msg.msg_name = NULL;
msg.msg_namelen = 0;
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iov, 1,
- call->request_size);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, call->request_size);
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = MSG_WAITALL | (call->send_pages ? MSG_MORE : 0);
@@ -432,7 +431,7 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
rxrpc_kernel_abort_call(call->net->socket, rxcall,
RX_USER_ABORT, ret, "KSD");
} else {
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&msg.msg_iter, READ, NULL, 0, 0);
rxrpc_kernel_recv_data(call->net->socket, rxcall,
&msg.msg_iter, false,
&call->abort_code, &call->service_id);
@@ -468,7 +467,7 @@ static void afs_deliver_to_call(struct afs_call *call)
if (state == AFS_CALL_SV_AWAIT_ACK) {
struct iov_iter iter;
- iov_iter_kvec(&iter, READ | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&iter, READ, NULL, 0, 0);
ret = rxrpc_kernel_recv_data(call->net->socket,
call->rxcall, &iter, false,
&remote_abort,
@@ -827,7 +826,7 @@ void afs_send_empty_reply(struct afs_call *call)
msg.msg_name = NULL;
msg.msg_namelen = 0;
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&msg.msg_iter, WRITE, NULL, 0, 0);
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = 0;
@@ -866,7 +865,7 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
iov[0].iov_len = len;
msg.msg_name = NULL;
msg.msg_namelen = 0;
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iov, 1, len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = 0;
@@ -907,7 +906,7 @@ int afs_extract_data(struct afs_call *call, void *buf, size_t count,
iov.iov_base = buf + call->offset;
iov.iov_len = count - call->offset;
- iov_iter_kvec(&iter, ITER_KVEC | READ, &iov, 1, count - call->offset);
+ iov_iter_kvec(&iter, READ, &iov, 1, count - call->offset);
ret = rxrpc_kernel_recv_data(net->socket, call->rxcall, &iter,
want_more, &remote_abort,
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 7dc135204722..aa54c91df73e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -349,7 +349,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
dio->size = 0;
dio->multi_bio = false;
- dio->should_dirty = is_read && (iter->type == ITER_IOVEC);
+ dio->should_dirty = is_read && iter_is_iovec(iter);
blk_start_plug(&plug);
for (;;) {
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index ad0bed99b1d5..f0ddb0dc4f16 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -659,7 +659,7 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to,
if (ret < 0)
return ret;
- if (unlikely(to->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(to))) {
size_t page_off;
ret = iov_iter_get_pages_alloc(to, &pages, len,
&page_off);
@@ -822,7 +822,7 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
aio_req->total_len = rc + zlen;
}
- iov_iter_bvec(&i, ITER_BVEC, osd_data->bvec_pos.bvecs,
+ iov_iter_bvec(&i, READ, osd_data->bvec_pos.bvecs,
osd_data->num_bvecs,
osd_data->bvec_pos.iter.bi_size);
iov_iter_advance(&i, rc);
@@ -1045,8 +1045,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
int zlen = min_t(size_t, len - ret,
size - pos - ret);
- iov_iter_bvec(&i, ITER_BVEC, bvecs, num_pages,
- len);
+ iov_iter_bvec(&i, READ, bvecs, num_pages, len);
iov_iter_advance(&i, ret);
iov_iter_zero(zlen, &i);
ret += zlen;
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5df2c0698cda..b3feb2c6db5b 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -589,7 +589,7 @@ cifs_read_from_socket(struct TCP_Server_Info *server, char *buf,
{
struct msghdr smb_msg;
struct kvec iov = {.iov_base = buf, .iov_len = to_read};
- iov_iter_kvec(&smb_msg.msg_iter, READ | ITER_KVEC, &iov, 1, to_read);
+ iov_iter_kvec(&smb_msg.msg_iter, READ, &iov, 1, to_read);
return cifs_readv_from_socket(server, &smb_msg);
}
@@ -601,7 +601,7 @@ cifs_read_page_from_socket(struct TCP_Server_Info *server, struct page *page,
struct msghdr smb_msg;
struct bio_vec bv = {
.bv_page = page, .bv_len = to_read, .bv_offset = page_offset};
- iov_iter_bvec(&smb_msg.msg_iter, READ | ITER_BVEC, &bv, 1, to_read);
+ iov_iter_bvec(&smb_msg.msg_iter, READ, &bv, 1, to_read);
return cifs_readv_from_socket(server, &smb_msg);
}
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 8d41ca7bfcf1..dcdbcb6f09f8 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2990,7 +2990,7 @@ cifs_readdata_to_iov(struct cifs_readdata *rdata, struct iov_iter *iter)
size_t copy = min_t(size_t, remaining, PAGE_SIZE);
size_t written;
- if (unlikely(iter->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(iter))) {
void *addr = kmap_atomic(page);
written = copy_to_iter(addr, copy, iter);
@@ -3302,7 +3302,7 @@ ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to)
if (!is_sync_kiocb(iocb))
ctx->iocb = iocb;
- if (to->type == ITER_IOVEC)
+ if (iter_is_iovec(to))
ctx->should_dirty = true;
rc = setup_aio_ctx_iter(ctx, to, READ);
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index 53e8362cbc4a..65540d33993e 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -780,7 +780,7 @@ setup_aio_ctx_iter(struct cifs_aio_ctx *ctx, struct iov_iter *iter, int rw)
struct page **pages = NULL;
struct bio_vec *bv = NULL;
- if (iter->type & ITER_KVEC) {
+ if (iov_iter_type(iter) == ITER_KVEC) {
memcpy(&ctx->iter, iter, sizeof(struct iov_iter));
ctx->len = count;
iov_iter_advance(iter, count);
@@ -851,7 +851,7 @@ setup_aio_ctx_iter(struct cifs_aio_ctx *ctx, struct iov_iter *iter, int rw)
ctx->bv = bv;
ctx->len = saved_len - count;
ctx->npages = npages;
- iov_iter_bvec(&ctx->iter, ITER_BVEC | rw, ctx->bv, npages, ctx->len);
+ iov_iter_bvec(&ctx->iter, rw, ctx->bv, npages, ctx->len);
return 0;
}
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index ea92a38b2f08..3e6fe3943591 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -2769,13 +2769,13 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,
return 0;
}
- iov_iter_bvec(&iter, WRITE | ITER_BVEC, bvec, npages, data_len);
+ iov_iter_bvec(&iter, WRITE, bvec, npages, data_len);
} else if (buf_len >= data_offset + data_len) {
/* read response payload is in buf */
WARN_ONCE(npages > 0, "read data can be either in buf or in pages");
iov.iov_base = buf + data_offset;
iov.iov_len = data_len;
- iov_iter_kvec(&iter, WRITE | ITER_KVEC, &iov, 1, data_len);
+ iov_iter_kvec(&iter, WRITE, &iov, 1, data_len);
} else {
/* read response payload cannot be in both buf and pages */
WARN_ONCE(1, "buf can not contain only a part of read data");
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index c55ea4e6201b..10307b46222d 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -2047,14 +2047,22 @@ int smbd_recv(struct smbd_connection *info, struct msghdr *msg)
info->smbd_recv_pending++;
- switch (msg->msg_iter.type) {
- case READ | ITER_KVEC:
+ if (iov_iter_rw(&msg->msg_iter) == WRITE) {
+ /* It's a bug in upper layer to get there */
+ cifs_dbg(VFS, "CIFS: invalid msg iter dir %u\n",
+ iov_iter_rw(&msg->msg_iter));
+ rc = -EINVAL;
+ goto out;
+ }
+
+ switch (iov_iter_type(&msg->msg_iter)) {
+ case ITER_KVEC:
buf = msg->msg_iter.kvec->iov_base;
to_read = msg->msg_iter.kvec->iov_len;
rc = smbd_recv_buf(info, buf, to_read);
break;
- case READ | ITER_BVEC:
+ case ITER_BVEC:
page = msg->msg_iter.bvec->bv_page;
page_offset = msg->msg_iter.bvec->bv_offset;
to_read = msg->msg_iter.bvec->bv_len;
@@ -2064,10 +2072,11 @@ int smbd_recv(struct smbd_connection *info, struct msghdr *msg)
default:
/* It's a bug in upper layer to get there */
cifs_dbg(VFS, "CIFS: invalid msg type %d\n",
- msg->msg_iter.type);
+ iov_iter_type(&msg->msg_iter));
rc = -EINVAL;
}
+out:
info->smbd_recv_pending--;
wake_up(&info->wait_smbd_recv_pending);
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index a341ec839c83..4781de558ab1 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -297,8 +297,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
.iov_base = &rfc1002_marker,
.iov_len = 4
};
- iov_iter_kvec(&smb_msg.msg_iter, WRITE | ITER_KVEC, &hiov,
- 1, 4);
+ iov_iter_kvec(&smb_msg.msg_iter, WRITE, &hiov, 1, 4);
rc = smb_send_kvec(server, &smb_msg, &sent);
if (rc < 0)
goto uncork;
@@ -319,8 +318,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
size += iov[i].iov_len;
}
- iov_iter_kvec(&smb_msg.msg_iter, WRITE | ITER_KVEC,
- iov, n_vec, size);
+ iov_iter_kvec(&smb_msg.msg_iter, WRITE, iov, n_vec, size);
rc = smb_send_kvec(server, &smb_msg, &sent);
if (rc < 0)
@@ -336,7 +334,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
rqst_page_get_length(&rqst[j], i, &bvec.bv_len,
&bvec.bv_offset);
- iov_iter_bvec(&smb_msg.msg_iter, WRITE | ITER_BVEC,
+ iov_iter_bvec(&smb_msg.msg_iter, WRITE,
&bvec, 1, bvec.bv_len);
rc = smb_send_kvec(server, &smb_msg, &sent);
if (rc < 0)
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 093fb54cd316..989f30cc0d09 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -1313,7 +1313,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
spin_lock_init(&dio->bio_lock);
dio->refcount = 1;
- dio->should_dirty = (iter->type == ITER_IOVEC);
+ dio->should_dirty = iter_is_iovec(iter);
sdio.iter = iter;
sdio.final_block_in_request = end >> blkbits;
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index a5e4a221435c..76976d6e50f9 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -674,7 +674,7 @@ static int receive_from_sock(struct connection *con)
nvec = 2;
}
len = iov[0].iov_len + iov[1].iov_len;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, iov, nvec, len);
+ iov_iter_kvec(&msg.msg_iter, READ, iov, nvec, len);
r = ret = sock_recvmsg(con->sock, &msg, MSG_DONTWAIT | MSG_NOSIGNAL);
if (ret <= 0)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index a201fb0ac64f..fd01af7664e8 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1269,7 +1269,7 @@ static int fuse_get_user_pages(struct fuse_req *req, struct iov_iter *ii,
ssize_t ret = 0;
/* Special case for kernel I/O: can copy directly into the buffer */
- if (ii->type & ITER_KVEC) {
+ if (iov_iter_type(ii) == ITER_KVEC) {
unsigned long user_addr = fuse_get_user_addr(ii);
size_t frag_size = fuse_get_frag_size(ii, *nbytesp);
diff --git a/fs/iomap.c b/fs/iomap.c
index 0d0bd8845586..ad374c9530c3 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1143,7 +1143,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
if (pos >= dio->i_size)
goto out_free_dio;
- if (iter->type == ITER_IOVEC)
+ if (iter_is_iovec(iter))
dio->flags |= IOMAP_DIO_DIRTY;
} else {
flags |= IOMAP_WRITE;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index b0555d7d8200..ad61520c8078 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -922,7 +922,7 @@ __be32 nfsd_readv(struct svc_rqst *rqstp, struct svc_fh *fhp,
int host_err;
trace_nfsd_read_vector(rqstp, fhp, offset, *count);
- iov_iter_kvec(&iter, READ | ITER_KVEC, vec, vlen, *count);
+ iov_iter_kvec(&iter, READ, vec, vlen, *count);
host_err = vfs_iter_read(file, &iter, &offset, 0);
return nfsd_finish_read(rqstp, fhp, file, offset, count, host_err);
}
@@ -998,7 +998,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
if (stable && !use_wgather)
flags |= RWF_SYNC;
- iov_iter_kvec(&iter, WRITE | ITER_KVEC, vec, vlen, *cnt);
+ iov_iter_kvec(&iter, WRITE, vec, vlen, *cnt);
host_err = vfs_iter_write(file, &iter, &pos, flags);
if (host_err < 0)
goto out_nfserr;
diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c
index 1296f78ae966..04b651c8309c 100644
--- a/fs/ocfs2/cluster/tcp.c
+++ b/fs/ocfs2/cluster/tcp.c
@@ -918,7 +918,7 @@ static int o2net_recv_tcp_msg(struct socket *sock, void *data, size_t len)
{
struct kvec vec = { .iov_len = len, .iov_base = data, };
struct msghdr msg = { .msg_flags = MSG_DONTWAIT, };
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &vec, 1, len);
+ iov_iter_kvec(&msg.msg_iter, READ, &vec, 1, len);
return sock_recvmsg(sock, &msg, MSG_DONTWAIT);
}
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 6e4d2af8f5bc..90815360075a 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -25,7 +25,7 @@ static int read_one_page(struct page *page)
struct iov_iter to;
struct bio_vec bv = {.bv_page = page, .bv_len = PAGE_SIZE};
- iov_iter_bvec(&to, ITER_BVEC | READ, &bv, 1, PAGE_SIZE);
+ iov_iter_bvec(&to, READ, &bv, 1, PAGE_SIZE);
gossip_debug(GOSSIP_INODE_DEBUG,
"orangefs_readpage called with page %p\n",
diff --git a/fs/splice.c b/fs/splice.c
index b3daa971f597..3553f1956508 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -301,7 +301,7 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
struct kiocb kiocb;
int idx, ret;
- iov_iter_pipe(&to, ITER_PIPE | READ, pipe, len);
+ iov_iter_pipe(&to, READ, pipe, len);
idx = to.idx;
init_sync_kiocb(&kiocb, in);
kiocb.ki_pos = *ppos;
@@ -386,7 +386,7 @@ static ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
*/
offset = *ppos & ~PAGE_MASK;
- iov_iter_pipe(&to, ITER_PIPE | READ, pipe, len + offset);
+ iov_iter_pipe(&to, READ, pipe, len + offset);
res = iov_iter_get_pages_alloc(&to, &pages, len + offset, &base);
if (res <= 0)
@@ -745,8 +745,7 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
left -= this_len;
}
- iov_iter_bvec(&from, ITER_BVEC | WRITE, array, n,
- sd.total_len - left);
+ iov_iter_bvec(&from, WRITE, array, n, sd.total_len - left);
ret = vfs_iter_write(out, &from, &sd.pos, 0);
if (ret <= 0)
break;
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 409c845d4cd3..48e7fa36f923 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -21,7 +21,7 @@ struct kvec {
size_t iov_len;
};
-enum {
+enum iter_type {
ITER_IOVEC = 0,
ITER_KVEC = 2,
ITER_BVEC = 4,
@@ -29,7 +29,8 @@ enum {
};
struct iov_iter {
- int type;
+ enum iter_type iter_type:8;
+ u8 iter_dir;
size_t iov_offset;
size_t count;
union {
@@ -47,6 +48,26 @@ struct iov_iter {
};
};
+static inline enum iter_type iov_iter_type(const struct iov_iter *i)
+{
+ return i->iter_type;
+}
+
+static inline bool iov_iter_is_pipe(const struct iov_iter *i)
+{
+ return iov_iter_type(i) == ITER_PIPE;
+}
+
+static inline bool iter_is_iovec(const struct iov_iter *i)
+{
+ return iov_iter_type(i) == ITER_IOVEC;
+}
+
+static inline unsigned char iov_iter_rw(const struct iov_iter *i)
+{
+ return i->iter_dir;
+}
+
/*
* Total number of bytes covered by an iovec.
*
@@ -74,7 +95,8 @@ static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
}
#define iov_for_each(iov, iter, start) \
- if (!((start).type & (ITER_BVEC | ITER_PIPE))) \
+ if (iov_iter_type(start) == ITER_IOVEC || \
+ iov_iter_type(start) == ITER_KVEC) \
for (iter = (start); \
(iter).count && \
((iov = iov_iter_iovec(&(iter))), 1); \
@@ -181,13 +203,13 @@ size_t copy_to_iter_mcsafe(void *addr, size_t bytes, struct iov_iter *i)
size_t iov_iter_zero(size_t bytes, struct iov_iter *);
unsigned long iov_iter_alignment(const struct iov_iter *i);
unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
-void iov_iter_init(struct iov_iter *i, int direction, const struct iovec *iov,
+void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
unsigned long nr_segs, size_t count);
-void iov_iter_kvec(struct iov_iter *i, int direction, const struct kvec *kvec,
+void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec *kvec,
unsigned long nr_segs, size_t count);
-void iov_iter_bvec(struct iov_iter *i, int direction, const struct bio_vec *bvec,
+void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
unsigned long nr_segs, size_t count);
-void iov_iter_pipe(struct iov_iter *i, int direction, struct pipe_inode_info *pipe,
+void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
size_t count);
ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
size_t maxsize, unsigned maxpages, size_t *start);
@@ -202,19 +224,6 @@ static inline size_t iov_iter_count(const struct iov_iter *i)
return i->count;
}
-static inline bool iter_is_iovec(const struct iov_iter *i)
-{
- return !(i->type & (ITER_BVEC | ITER_KVEC | ITER_PIPE));
-}
-
-/*
- * Get one of READ or WRITE out of iter->type without any other flags OR'd in
- * with it.
- *
- * The ?: is just for type safety.
- */
-#define iov_iter_rw(i) ((0 ? (struct iov_iter *)0 : (i))->type & (READ | WRITE))
-
/*
* Cap the iov_iter by given limit; note that the second argument is
* *not* the new size - it's upper limit for such. Passing it a value
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 8be175df3075..bd828591afb0 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -75,11 +75,11 @@
#define iterate_all_kinds(i, n, v, I, B, K) { \
if (likely(n)) { \
size_t skip = i->iov_offset; \
- if (unlikely(i->type & ITER_BVEC)) { \
+ if (unlikely(i->iter_type & ITER_BVEC)) { \
struct bio_vec v; \
struct bvec_iter __bi; \
iterate_bvec(i, n, v, __bi, skip, (B)) \
- } else if (unlikely(i->type & ITER_KVEC)) { \
+ } else if (unlikely(i->iter_type & ITER_KVEC)) { \
const struct kvec *kvec; \
struct kvec v; \
iterate_kvec(i, n, v, kvec, skip, (K)) \
@@ -96,7 +96,7 @@
n = i->count; \
if (i->count) { \
size_t skip = i->iov_offset; \
- if (unlikely(i->type & ITER_BVEC)) { \
+ if (unlikely(i->iter_type & ITER_BVEC)) { \
const struct bio_vec *bvec = i->bvec; \
struct bio_vec v; \
struct bvec_iter __bi; \
@@ -104,7 +104,7 @@
i->bvec = __bvec_iter_bvec(i->bvec, __bi); \
i->nr_segs -= i->bvec - bvec; \
skip = __bi.bi_bvec_done; \
- } else if (unlikely(i->type & ITER_KVEC)) { \
+ } else if (unlikely(i->iter_type & ITER_KVEC)) { \
const struct kvec *kvec; \
struct kvec v; \
iterate_kvec(i, n, v, kvec, skip, (K)) \
@@ -417,28 +417,35 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
int err;
struct iovec v;
- if (!(i->type & (ITER_BVEC|ITER_KVEC))) {
+ switch (iov_iter_type(i)) {
+ case ITER_IOVEC:
+ case ITER_PIPE:
iterate_iovec(i, bytes, v, iov, skip, ({
err = fault_in_pages_readable(v.iov_base, v.iov_len);
if (unlikely(err))
return err;
0;}))
+ break;
+ case ITER_KVEC:
+ case ITER_BVEC:
+ break;
}
return 0;
}
EXPORT_SYMBOL(iov_iter_fault_in_readable);
-void iov_iter_init(struct iov_iter *i, int direction,
+void iov_iter_init(struct iov_iter *i, unsigned int direction,
const struct iovec *iov, unsigned long nr_segs,
size_t count)
{
/* It will get better. Eventually... */
if (uaccess_kernel()) {
- direction |= ITER_KVEC;
- i->type = direction;
+ i->iter_type = ITER_KVEC;
+ i->iter_dir = direction;
i->kvec = (struct kvec *)iov;
} else {
- i->type = direction;
+ i->iter_type = ITER_IOVEC;
+ i->iter_dir = direction;
i->iov = iov;
}
i->nr_segs = nr_segs;
@@ -558,7 +565,7 @@ static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
{
const char *from = addr;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return copy_pipe_to_iter(addr, bytes, i);
if (iter_is_iovec(i))
might_fault();
@@ -658,7 +665,7 @@ size_t _copy_to_iter_mcsafe(const void *addr, size_t bytes, struct iov_iter *i)
const char *from = addr;
unsigned long rem, curr_addr, s_addr = (unsigned long) addr;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return copy_pipe_to_iter_mcsafe(addr, bytes, i);
if (iter_is_iovec(i))
might_fault();
@@ -692,7 +699,7 @@ EXPORT_SYMBOL_GPL(_copy_to_iter_mcsafe);
size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return 0;
}
@@ -712,7 +719,7 @@ EXPORT_SYMBOL(_copy_from_iter);
bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return false;
}
@@ -739,7 +746,7 @@ EXPORT_SYMBOL(_copy_from_iter_full);
size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return 0;
}
@@ -773,7 +780,7 @@ EXPORT_SYMBOL(_copy_from_iter_nocache);
size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return 0;
}
@@ -794,7 +801,7 @@ EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
{
char *to = addr;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return false;
}
@@ -831,15 +838,20 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
{
if (unlikely(!page_copy_sane(page, offset, bytes)))
return 0;
- if (i->type & (ITER_BVEC|ITER_KVEC)) {
+ switch (iov_iter_type(i)) {
+ case ITER_BVEC:
+ case ITER_KVEC: {
void *kaddr = kmap_atomic(page);
size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
kunmap_atomic(kaddr);
return wanted;
- } else if (likely(!(i->type & ITER_PIPE)))
+ }
+ case ITER_IOVEC:
return copy_page_to_iter_iovec(page, offset, bytes, i);
- else
+ case ITER_PIPE:
return copy_page_to_iter_pipe(page, offset, bytes, i);
+ }
+ BUG();
}
EXPORT_SYMBOL(copy_page_to_iter);
@@ -848,17 +860,21 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
{
if (unlikely(!page_copy_sane(page, offset, bytes)))
return 0;
- if (unlikely(i->type & ITER_PIPE)) {
- WARN_ON(1);
- return 0;
- }
- if (i->type & (ITER_BVEC|ITER_KVEC)) {
+ switch (iov_iter_type(i)) {
+ case ITER_PIPE:
+ break;
+ case ITER_BVEC:
+ case ITER_KVEC: {
void *kaddr = kmap_atomic(page);
size_t wanted = _copy_from_iter(kaddr + offset, bytes, i);
kunmap_atomic(kaddr);
return wanted;
- } else
+ }
+ case ITER_IOVEC:
return copy_page_from_iter_iovec(page, offset, bytes, i);
+ }
+ WARN_ON(1);
+ return 0;
}
EXPORT_SYMBOL(copy_page_from_iter);
@@ -888,7 +904,7 @@ static size_t pipe_zero(size_t bytes, struct iov_iter *i)
size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
{
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return pipe_zero(bytes, i);
iterate_and_advance(i, bytes, v,
clear_user(v.iov_base, v.iov_len),
@@ -908,7 +924,7 @@ size_t iov_iter_copy_from_user_atomic(struct page *page,
kunmap_atomic(kaddr);
return 0;
}
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
kunmap_atomic(kaddr);
WARN_ON(1);
return 0;
@@ -972,7 +988,7 @@ static void pipe_advance(struct iov_iter *i, size_t size)
void iov_iter_advance(struct iov_iter *i, size_t size)
{
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
pipe_advance(i, size);
return;
}
@@ -987,7 +1003,7 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
if (WARN_ON(unroll > MAX_RW_COUNT))
return;
i->count += unroll;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
struct pipe_inode_info *pipe = i->pipe;
int idx = i->idx;
size_t off = i->iov_offset;
@@ -1016,7 +1032,8 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
return;
}
unroll -= i->iov_offset;
- if (i->type & ITER_BVEC) {
+ switch (iov_iter_type(i)) {
+ case ITER_BVEC: {
const struct bio_vec *bvec = i->bvec;
while (1) {
size_t n = (--bvec)->bv_len;
@@ -1028,7 +1045,10 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
}
unroll -= n;
}
- } else { /* same logics for iovec and kvec */
+ }
+ case ITER_IOVEC:
+ case ITER_KVEC: {
+ /* same logics for iovec and kvec */
const struct iovec *iov = i->iov;
while (1) {
size_t n = (--iov)->iov_len;
@@ -1041,6 +1061,9 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
unroll -= n;
}
}
+ case ITER_PIPE:
+ BUG();
+ }
}
EXPORT_SYMBOL(iov_iter_revert);
@@ -1049,23 +1072,28 @@ EXPORT_SYMBOL(iov_iter_revert);
*/
size_t iov_iter_single_seg_count(const struct iov_iter *i)
{
- if (unlikely(i->type & ITER_PIPE))
- return i->count; // it is a silly place, anyway
if (i->nr_segs == 1)
return i->count;
- else if (i->type & ITER_BVEC)
+ switch (iov_iter_type(i)) {
+ case ITER_PIPE:
+ return i->count; // it is a silly place, anyway
+ case ITER_BVEC:
return min(i->count, i->bvec->bv_len - i->iov_offset);
- else
+ case ITER_KVEC:
+ case ITER_IOVEC:
return min(i->count, i->iov->iov_len - i->iov_offset);
+ }
+ BUG();
}
EXPORT_SYMBOL(iov_iter_single_seg_count);
-void iov_iter_kvec(struct iov_iter *i, int direction,
+void iov_iter_kvec(struct iov_iter *i, unsigned int direction,
const struct kvec *kvec, unsigned long nr_segs,
size_t count)
{
- BUG_ON(!(direction & ITER_KVEC));
- i->type = direction;
+ BUG_ON(direction & ~1);
+ i->iter_dir = direction;
+ i->iter_type = ITER_KVEC;
i->kvec = kvec;
i->nr_segs = nr_segs;
i->iov_offset = 0;
@@ -1073,12 +1101,13 @@ void iov_iter_kvec(struct iov_iter *i, int direction,
}
EXPORT_SYMBOL(iov_iter_kvec);
-void iov_iter_bvec(struct iov_iter *i, int direction,
+void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
const struct bio_vec *bvec, unsigned long nr_segs,
size_t count)
{
- BUG_ON(!(direction & ITER_BVEC));
- i->type = direction;
+ BUG_ON(direction & ~1);
+ i->iter_dir = direction;
+ i->iter_type = ITER_BVEC;
i->bvec = bvec;
i->nr_segs = nr_segs;
i->iov_offset = 0;
@@ -1086,13 +1115,14 @@ void iov_iter_bvec(struct iov_iter *i, int direction,
}
EXPORT_SYMBOL(iov_iter_bvec);
-void iov_iter_pipe(struct iov_iter *i, int direction,
+void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
struct pipe_inode_info *pipe,
size_t count)
{
- BUG_ON(direction != ITER_PIPE);
+ BUG_ON(direction != READ);
WARN_ON(pipe->nrbufs == pipe->buffers);
- i->type = direction;
+ i->iter_dir = READ;
+ i->iter_type = ITER_PIPE;
i->pipe = pipe;
i->idx = (pipe->curbuf + pipe->nrbufs) & (pipe->buffers - 1);
i->iov_offset = 0;
@@ -1106,7 +1136,7 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
unsigned long res = 0;
size_t size = i->count;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
if (size && i->iov_offset && allocated(&i->pipe->bufs[i->idx]))
return size | i->iov_offset;
return size;
@@ -1125,7 +1155,7 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
unsigned long res = 0;
size_t size = i->count;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return ~0U;
}
@@ -1193,7 +1223,7 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
if (maxsize > i->count)
maxsize = i->count;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return pipe_get_pages(i, pages, maxsize, maxpages, start);
iterate_all_kinds(i, maxsize, v, ({
unsigned long addr = (unsigned long)v.iov_base;
@@ -1205,7 +1235,7 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
len = maxpages * PAGE_SIZE;
addr &= ~(PAGE_SIZE - 1);
n = DIV_ROUND_UP(len, PAGE_SIZE);
- res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, pages);
+ res = get_user_pages_fast(addr, n, iov_iter_rw(i) != WRITE, pages);
if (unlikely(res < 0))
return res;
return (res == n ? len : res * PAGE_SIZE) - *start;
@@ -1270,7 +1300,7 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
if (maxsize > i->count)
maxsize = i->count;
- if (unlikely(i->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(i)))
return pipe_get_pages_alloc(i, pages, maxsize, start);
iterate_all_kinds(i, maxsize, v, ({
unsigned long addr = (unsigned long)v.iov_base;
@@ -1283,7 +1313,7 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
p = get_pages_array(n);
if (!p)
return -ENOMEM;
- res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, p);
+ res = get_user_pages_fast(addr, n, iov_iter_rw(i) != WRITE, p);
if (unlikely(res < 0)) {
kvfree(p);
return res;
@@ -1313,7 +1343,7 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return 0;
}
@@ -1355,7 +1385,7 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1);
return false;
}
@@ -1400,7 +1430,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, __wsum *csum,
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
WARN_ON(1); /* for now */
return 0;
}
@@ -1443,7 +1473,7 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
if (!size)
return 0;
- if (unlikely(i->type & ITER_PIPE)) {
+ if (unlikely(iov_iter_is_pipe(i))) {
struct pipe_inode_info *pipe = i->pipe;
size_t off;
int idx;
@@ -1481,19 +1511,23 @@ EXPORT_SYMBOL(iov_iter_npages);
const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
{
*new = *old;
- if (unlikely(new->type & ITER_PIPE)) {
- WARN_ON(1);
- return NULL;
- }
- if (new->type & ITER_BVEC)
+ switch (iov_iter_type(new)) {
+ case ITER_PIPE:
+ break;
+ case ITER_BVEC:
return new->bvec = kmemdup(new->bvec,
new->nr_segs * sizeof(struct bio_vec),
flags);
- else
+ case ITER_IOVEC:
+ case ITER_KVEC:
/* iovec and kvec have identical layout */
return new->iov = kmemdup(new->iov,
new->nr_segs * sizeof(struct iovec),
flags);
+ }
+
+ WARN_ON(1);
+ return NULL;
}
EXPORT_SYMBOL(dup_iter);
diff --git a/mm/filemap.c b/mm/filemap.c
index 52517f28e6f4..bdeee0168ea7 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2122,7 +2122,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
!mapping->a_ops->is_partially_uptodate)
goto page_not_up_to_date;
/* pipes can't handle partially uptodate pages */
- if (unlikely(iter->type & ITER_PIPE))
+ if (unlikely(iov_iter_is_pipe(iter)))
goto page_not_up_to_date;
if (!trylock_page(page))
goto page_not_up_to_date;
diff --git a/mm/page_io.c b/mm/page_io.c
index b41cf9644585..23c5756142bb 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -294,7 +294,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
};
struct iov_iter from;
- iov_iter_bvec(&from, ITER_BVEC | WRITE, &bv, 1, PAGE_SIZE);
+ iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
init_sync_kiocb(&kiocb, swap_file);
kiocb.ki_pos = page_file_offset(page);
diff --git a/net/9p/client.c b/net/9p/client.c
index 5c1343195292..1e853e2a62bb 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -2089,7 +2089,7 @@ int p9_client_readdir(struct p9_fid *fid, char *data, u32 count, u64 offset)
struct kvec kv = {.iov_base = data, .iov_len = count};
struct iov_iter to;
- iov_iter_kvec(&to, READ | ITER_KVEC, &kv, 1, count);
+ iov_iter_kvec(&to, READ, &kv, 1, count);
p9_debug(P9_DEBUG_9P, ">>> TREADDIR fid %d offset %llu count %d\n",
fid->fid, (unsigned long long) offset, count);
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 05006cbb3361..457cd9fcdd4f 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -320,7 +320,7 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
if (!iov_iter_count(data))
return 0;
- if (!(data->type & ITER_KVEC)) {
+ if (iov_iter_type(data) != ITER_KVEC) {
int n;
/*
* We allow only p9_max_pages pinned. We wait for the
diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index 4e2576fc0c59..828e87fe8027 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -467,7 +467,7 @@ static int send_pkt(struct l2cap_chan *chan, struct sk_buff *skb,
iv.iov_len = skb->len;
memset(&msg, 0, sizeof(msg));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, skb->len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, &iv, 1, skb->len);
err = l2cap_chan_send(chan, &msg, skb->len);
if (err > 0) {
diff --git a/net/bluetooth/a2mp.c b/net/bluetooth/a2mp.c
index 51c2cf2d8923..58fc6333d412 100644
--- a/net/bluetooth/a2mp.c
+++ b/net/bluetooth/a2mp.c
@@ -63,7 +63,7 @@ static void a2mp_send(struct amp_mgr *mgr, u8 code, u8 ident, u16 len, void *dat
memset(&msg, 0, sizeof(msg));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, total_len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, &iv, 1, total_len);
l2cap_chan_send(chan, &msg, total_len);
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index ae91e2d40056..5f4c9f8333fd 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -622,7 +622,7 @@ static void smp_send_cmd(struct l2cap_conn *conn, u8 code, u16 len, void *data)
memset(&msg, 0, sizeof(msg));
- iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iv, 2, 1 + len);
+ iov_iter_kvec(&msg.msg_iter, WRITE, iv, 2, 1 + len);
l2cap_chan_send(chan, &msg, 1 + len);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index c6413c360771..fcbe49f5b704 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -526,7 +526,7 @@ static int ceph_tcp_recvmsg(struct socket *sock, void *buf, size_t len)
if (!buf)
msg.msg_flags |= MSG_TRUNC;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, len);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, len);
r = sock_recvmsg(sock, &msg, msg.msg_flags);
if (r == -EAGAIN)
r = 0;
@@ -545,7 +545,7 @@ static int ceph_tcp_recvpage(struct socket *sock, struct page *page,
int r;
BUG_ON(page_offset + length > PAGE_SIZE);
- iov_iter_bvec(&msg.msg_iter, READ | ITER_BVEC, &bvec, 1, length);
+ iov_iter_bvec(&msg.msg_iter, READ, &bvec, 1, length);
r = sock_recvmsg(sock, &msg, msg.msg_flags);
if (r == -EAGAIN)
r = 0;
@@ -607,7 +607,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
else
msg.msg_flags |= MSG_EOR; /* superfluous, but what the hell */
- iov_iter_bvec(&msg.msg_iter, WRITE | ITER_BVEC, &bvec, 1, size);
+ iov_iter_bvec(&msg.msg_iter, WRITE, &bvec, 1, size);
ret = sock_sendmsg(sock, &msg);
if (ret == -EAGAIN)
ret = 0;
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index d4020c5e831d..2526be6b3d90 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1616,7 +1616,7 @@ ip_vs_receive(struct socket *sock, char *buffer, const size_t buflen)
EnterFunction(7);
/* Receive a packet */
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, buflen);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, buflen);
len = sock_recvmsg(sock, &msg, MSG_DONTWAIT);
if (len < 0)
return len;
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index 83aba9ade060..f27a298f3498 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -286,7 +286,7 @@ int smc_clc_wait_msg(struct smc_sock *smc, void *buf, int buflen,
*/
krflags = MSG_PEEK | MSG_WAITALL;
smc->clcsock->sk->sk_rcvtimeo = CLC_WAIT_TIME;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &vec, 1,
+ iov_iter_kvec(&msg.msg_iter, READ, &vec, 1,
sizeof(struct smc_clc_msg_hdr));
len = sock_recvmsg(smc->clcsock, &msg, krflags);
if (signal_pending(current)) {
@@ -325,7 +325,7 @@ int smc_clc_wait_msg(struct smc_sock *smc, void *buf, int buflen,
/* receive the complete CLC message */
memset(&msg, 0, sizeof(struct msghdr));
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &vec, 1, datlen);
+ iov_iter_kvec(&msg.msg_iter, READ, &vec, 1, datlen);
krflags = MSG_WAITALL;
len = sock_recvmsg(smc->clcsock, &msg, krflags);
if (len < datlen || !smc_clc_msg_hdr_valid(clcm)) {
diff --git a/net/socket.c b/net/socket.c
index 0d8cfc0c9b2a..52ea48a2e601 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -655,7 +655,7 @@ EXPORT_SYMBOL(sock_sendmsg);
int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
struct kvec *vec, size_t num, size_t size)
{
- iov_iter_kvec(&msg->msg_iter, WRITE | ITER_KVEC, vec, num, size);
+ iov_iter_kvec(&msg->msg_iter, WRITE, vec, num, size);
return sock_sendmsg(sock, msg);
}
EXPORT_SYMBOL(kernel_sendmsg);
@@ -668,7 +668,7 @@ int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
if (!sock->ops->sendmsg_locked)
return sock_no_sendmsg_locked(sk, msg, size);
- iov_iter_kvec(&msg->msg_iter, WRITE | ITER_KVEC, vec, num, size);
+ iov_iter_kvec(&msg->msg_iter, WRITE, vec, num, size);
return sock->ops->sendmsg_locked(sk, msg, msg_data_left(msg));
}
@@ -843,7 +843,7 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
mm_segment_t oldfs = get_fs();
int result;
- iov_iter_kvec(&msg->msg_iter, READ | ITER_KVEC, vec, num, size);
+ iov_iter_kvec(&msg->msg_iter, READ, vec, num, size);
set_fs(KERNEL_DS);
result = sock_recvmsg(sock, msg, flags);
set_fs(oldfs);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 5445145e639c..0b46ec0bf74e 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -338,7 +338,7 @@ static int svc_recvfrom(struct svc_rqst *rqstp, struct kvec *iov, int nr,
rqstp->rq_xprt_hlen = 0;
clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, iov, nr, buflen);
+ iov_iter_kvec(&msg.msg_iter, READ, iov, nr, buflen);
len = sock_recvmsg(svsk->sk_sock, &msg, msg.msg_flags);
/* If we read a full record, then assume there may be more
* data to read (stream based sockets only!)
diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c
index c8e34ef22c30..7e0f8df6b79d 100644
--- a/net/tipc/topsrv.c
+++ b/net/tipc/topsrv.c
@@ -400,7 +400,7 @@ static int tipc_conn_rcv_from_sock(struct tipc_conn *con)
iov.iov_base = &s;
iov.iov_len = sizeof(s);
msg.msg_name = NULL;
- iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, iov.iov_len);
+ iov_iter_kvec(&msg.msg_iter, READ, &iov, 1, iov.iov_len);
ret = sock_recvmsg(con->sock, &msg, MSG_DONTWAIT);
if (ret == -EWOULDBLOCK)
return -EWOULDBLOCK;
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 292742e50bfa..fcf50375cd05 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -489,7 +489,7 @@ int tls_device_sendpage(struct sock *sk, struct page *page,
iov.iov_base = kaddr + offset;
iov.iov_len = size;
- iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
+ iov_iter_kvec(&msg_iter, WRITE, &iov, 1, size);
rc = tls_push_data(sk, &msg_iter, size,
flags, TLS_RECORD_TYPE_DATA);
kunmap(page);
@@ -538,7 +538,7 @@ static int tls_device_push_pending_record(struct sock *sk, int flags)
{
struct iov_iter msg_iter;
- iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
+ iov_iter_kvec(&msg_iter, WRITE, NULL, 0, 0);
return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA);
}
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 83d67df33f0c..0047e75ac4db 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -365,7 +365,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
int record_room;
bool full_record;
int orig_size;
- bool is_kvec = msg->msg_iter.type & ITER_KVEC;
+ bool is_kvec = iov_iter_type(&msg->msg_iter) == ITER_KVEC;
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL))
return -ENOTSUPP;
@@ -778,7 +778,7 @@ int tls_sw_recvmsg(struct sock *sk,
bool cmsg = false;
int target, err = 0;
long timeo;
- bool is_kvec = msg->msg_iter.type & ITER_KVEC;
+ bool is_kvec = iov_iter_type(&msg->msg_iter) == ITER_KVEC;
flags |= nonblock;
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 02/10] iov_iter: Renumber the ITER_* constants in uio.h
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
2018-08-06 13:16 ` [PATCH 01/10] iov_iter: Separate type from direction and use accessor functions David Howells
@ 2018-08-06 13:16 ` David Howells
2018-08-06 13:16 ` [PATCH 03/10] iov_iter: Make count and iov_offset loff_t not size_t David Howells
` (7 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:16 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Renumber the ITER_* constants in uio.h to be contiguous to make comparing
them more efficient in a switch-statement.
Signed-off-by: David Howells <dhowelsl@redhat.com>
---
include/linux/uio.h | 8 +++---
lib/iov_iter.c | 69 +++++++++++++++++++++++++++++++++++++++++----------
2 files changed, 60 insertions(+), 17 deletions(-)
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 48e7fa36f923..d5f8755bf778 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -22,10 +22,10 @@ struct kvec {
};
enum iter_type {
- ITER_IOVEC = 0,
- ITER_KVEC = 2,
- ITER_BVEC = 4,
- ITER_PIPE = 8,
+ ITER_IOVEC,
+ ITER_KVEC,
+ ITER_BVEC,
+ ITER_PIPE,
};
struct iov_iter {
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index bd828591afb0..f30ecd263d6e 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -75,18 +75,28 @@
#define iterate_all_kinds(i, n, v, I, B, K) { \
if (likely(n)) { \
size_t skip = i->iov_offset; \
- if (unlikely(i->iter_type & ITER_BVEC)) { \
+ switch (iov_iter_type(i)) { \
+ case ITER_BVEC: { \
struct bio_vec v; \
struct bvec_iter __bi; \
- iterate_bvec(i, n, v, __bi, skip, (B)) \
- } else if (unlikely(i->iter_type & ITER_KVEC)) { \
+ iterate_bvec(i, n, v, __bi, skip, (B)); \
+ break; \
+ } \
+ case ITER_KVEC: { \
const struct kvec *kvec; \
struct kvec v; \
- iterate_kvec(i, n, v, kvec, skip, (K)) \
- } else { \
+ iterate_kvec(i, n, v, kvec, skip, (K)); \
+ break; \
+ } \
+ case ITER_PIPE: { \
+ break; \
+ } \
+ case ITER_IOVEC: { \
const struct iovec *iov; \
struct iovec v; \
- iterate_iovec(i, n, v, iov, skip, (I)) \
+ iterate_iovec(i, n, v, iov, skip, (I)); \
+ break; \
+ } \
} \
} \
}
@@ -96,7 +106,8 @@
n = i->count; \
if (i->count) { \
size_t skip = i->iov_offset; \
- if (unlikely(i->iter_type & ITER_BVEC)) { \
+ switch (iov_iter_type(i)) { \
+ case ITER_BVEC: { \
const struct bio_vec *bvec = i->bvec; \
struct bio_vec v; \
struct bvec_iter __bi; \
@@ -104,7 +115,9 @@
i->bvec = __bvec_iter_bvec(i->bvec, __bi); \
i->nr_segs -= i->bvec - bvec; \
skip = __bi.bi_bvec_done; \
- } else if (unlikely(i->iter_type & ITER_KVEC)) { \
+ break; \
+ } \
+ case ITER_KVEC: { \
const struct kvec *kvec; \
struct kvec v; \
iterate_kvec(i, n, v, kvec, skip, (K)) \
@@ -114,7 +127,9 @@
} \
i->nr_segs -= kvec - i->kvec; \
i->kvec = kvec; \
- } else { \
+ break; \
+ } \
+ case ITER_IOVEC: { \
const struct iovec *iov; \
struct iovec v; \
iterate_iovec(i, n, v, iov, skip, (I)) \
@@ -124,6 +139,11 @@
} \
i->nr_segs -= iov - i->iov; \
i->iov = iov; \
+ break; \
+ } \
+ case ITER_PIPE: { \
+ break; \
+ } \
} \
i->count -= n; \
i->iov_offset = skip; \
@@ -873,6 +893,7 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
case ITER_IOVEC:
return copy_page_from_iter_iovec(page, offset, bytes, i);
}
+
WARN_ON(1);
return 0;
}
@@ -988,11 +1009,17 @@ static void pipe_advance(struct iov_iter *i, size_t size)
void iov_iter_advance(struct iov_iter *i, size_t size)
{
- if (unlikely(iov_iter_is_pipe(i))) {
+ switch (iov_iter_type(i)) {
+ case ITER_PIPE:
pipe_advance(i, size);
return;
+ case ITER_IOVEC:
+ case ITER_KVEC:
+ case ITER_BVEC:
+ iterate_and_advance(i, size, v, 0, 0, 0);
+ return;
}
- iterate_and_advance(i, size, v, 0, 0, 0)
+ BUG();
}
EXPORT_SYMBOL(iov_iter_advance);
@@ -1223,8 +1250,16 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
if (maxsize > i->count)
maxsize = i->count;
- if (unlikely(iov_iter_is_pipe(i)))
+ switch (iov_iter_type(i)) {
+ case ITER_PIPE:
return pipe_get_pages(i, pages, maxsize, maxpages, start);
+ case ITER_KVEC:
+ return -EFAULT;
+ case ITER_IOVEC:
+ case ITER_BVEC:
+ break;
+ }
+
iterate_all_kinds(i, maxsize, v, ({
unsigned long addr = (unsigned long)v.iov_base;
size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
@@ -1300,8 +1335,16 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
if (maxsize > i->count)
maxsize = i->count;
- if (unlikely(iov_iter_is_pipe(i)))
+ switch (iov_iter_type(i)) {
+ case ITER_PIPE:
return pipe_get_pages_alloc(i, pages, maxsize, start);
+ case ITER_KVEC:
+ return -EFAULT;
+ case ITER_IOVEC:
+ case ITER_BVEC:
+ break;
+ }
+
iterate_all_kinds(i, maxsize, v, ({
unsigned long addr = (unsigned long)v.iov_base;
size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 03/10] iov_iter: Make count and iov_offset loff_t not size_t
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
2018-08-06 13:16 ` [PATCH 01/10] iov_iter: Separate type from direction and use accessor functions David Howells
2018-08-06 13:16 ` [PATCH 02/10] iov_iter: Renumber the ITER_* constants in uio.h David Howells
@ 2018-08-06 13:16 ` David Howells
2018-08-06 13:17 ` [PATCH 04/10] iov_iter: Add mapping and discard iterator types David Howells
` (6 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:16 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Make count and iov_offset loff_t not size_t so that they can handle
transactions larger than 4GiB in size and starting at 4GiB or more into a
file on a 32-bit system.
On a 64-bit system, this should make no difference.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/9p/vfs_file.c | 2 +-
fs/btrfs/file.c | 7 ++++---
fs/nfs/direct.c | 2 +-
fs/nfs/file.c | 4 ++--
include/linux/uio.h | 6 +++---
lib/iov_iter.c | 14 +++++++-------
net/9p/client.c | 2 +-
net/rxrpc/recvmsg.c | 4 ++--
8 files changed, 21 insertions(+), 20 deletions(-)
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 03c9e325bfbc..a83b81ea677f 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -384,7 +384,7 @@ v9fs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
struct p9_fid *fid = iocb->ki_filp->private_data;
int ret, err = 0;
- p9_debug(P9_DEBUG_VFS, "count %zu offset %lld\n",
+ p9_debug(P9_DEBUG_VFS, "count %llu offset %lld\n",
iov_iter_count(to), iocb->ki_pos);
ret = p9_client_read(fid, iocb->ki_pos, to, &err);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 51e77d72068a..48bf2b0af7b5 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1599,9 +1599,10 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
while (iov_iter_count(i) > 0) {
size_t offset = pos & (PAGE_SIZE - 1);
size_t sector_offset;
- size_t write_bytes = min(iov_iter_count(i),
- nrptrs * (size_t)PAGE_SIZE -
- offset);
+ size_t write_bytes = min_t(size_t,
+ iov_iter_count(i),
+ nrptrs * (size_t)PAGE_SIZE -
+ offset);
size_t num_pages = DIV_ROUND_UP(write_bytes + offset,
PAGE_SIZE);
size_t reserve_bytes;
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 621c517b325c..3d438aa2650b 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -974,7 +974,7 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
struct nfs_lock_context *l_ctx;
loff_t pos, end;
- dfprintk(FILE, "NFS: direct write(%pD2, %zd@%Ld)\n",
+ dfprintk(FILE, "NFS: direct write(%pD2, %lld@%lld)\n",
file, iov_iter_count(iter), (long long) iocb->ki_pos);
result = generic_write_checks(iocb, iter);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 81cca49a8375..ef4fd7124ca1 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -159,7 +159,7 @@ nfs_file_read(struct kiocb *iocb, struct iov_iter *to)
if (iocb->ki_flags & IOCB_DIRECT)
return nfs_file_direct_read(iocb, to);
- dprintk("NFS: read(%pD2, %zu@%lu)\n",
+ dprintk("NFS: read(%pD2, %llu@%lu)\n",
iocb->ki_filp,
iov_iter_count(to), (unsigned long) iocb->ki_pos);
@@ -608,7 +608,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
if (iocb->ki_flags & IOCB_DIRECT)
return nfs_file_direct_write(iocb, from);
- dprintk("NFS: write(%pD2, %zu@%Ld)\n",
+ dprintk("NFS: write(%pD2, %llu@%Ld)\n",
file, iov_iter_count(from), (long long) iocb->ki_pos);
if (IS_SWAPFILE(inode))
diff --git a/include/linux/uio.h b/include/linux/uio.h
index d5f8755bf778..d0655b74fd7e 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -31,7 +31,7 @@ enum iter_type {
struct iov_iter {
enum iter_type iter_type:8;
u8 iter_dir;
- size_t iov_offset;
+ loff_t iov_offset;
size_t count;
union {
const struct iovec *iov;
@@ -89,7 +89,7 @@ static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
{
return (struct iovec) {
.iov_base = iter->iov->iov_base + iter->iov_offset,
- .iov_len = min(iter->count,
+ .iov_len = min_t(size_t, iter->count,
iter->iov->iov_len - iter->iov_offset),
};
}
@@ -219,7 +219,7 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages);
const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
-static inline size_t iov_iter_count(const struct iov_iter *i)
+static inline loff_t iov_iter_count(const struct iov_iter *i)
{
return i->count;
}
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index f30ecd263d6e..8231f0e38f20 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -13,7 +13,7 @@
size_t left; \
size_t wanted = n; \
__p = i->iov; \
- __v.iov_len = min(n, __p->iov_len - skip); \
+ __v.iov_len = min_t(size_t, n, __p->iov_len - skip); \
if (likely(__v.iov_len)) { \
__v.iov_base = __p->iov_base + skip; \
left = (STEP); \
@@ -40,7 +40,7 @@
#define iterate_kvec(i, n, __v, __p, skip, STEP) { \
size_t wanted = n; \
__p = i->kvec; \
- __v.iov_len = min(n, __p->iov_len - skip); \
+ __v.iov_len = min_t(size_t, n, __p->iov_len - skip); \
if (likely(__v.iov_len)) { \
__v.iov_base = __p->iov_base + skip; \
(void)(STEP); \
@@ -74,7 +74,7 @@
#define iterate_all_kinds(i, n, v, I, B, K) { \
if (likely(n)) { \
- size_t skip = i->iov_offset; \
+ loff_t skip = i->iov_offset; \
switch (iov_iter_type(i)) { \
case ITER_BVEC: { \
struct bio_vec v; \
@@ -105,7 +105,7 @@
if (unlikely(i->count < n)) \
n = i->count; \
if (i->count) { \
- size_t skip = i->iov_offset; \
+ loff_t skip = i->iov_offset; \
switch (iov_iter_type(i)) { \
case ITER_BVEC: { \
const struct bio_vec *bvec = i->bvec; \
@@ -358,7 +358,7 @@ static bool sanity(const struct iov_iter *i)
}
return true;
Bad:
- printk(KERN_ERR "idx = %d, offset = %zd\n", i->idx, i->iov_offset);
+ printk(KERN_ERR "idx = %d, offset = %lld\n", i->idx, i->iov_offset);
printk(KERN_ERR "curbuf = %d, nrbufs = %d, buffers = %d\n",
pipe->curbuf, pipe->nrbufs, pipe->buffers);
for (idx = 0; idx < pipe->buffers; idx++)
@@ -1105,10 +1105,10 @@ size_t iov_iter_single_seg_count(const struct iov_iter *i)
case ITER_PIPE:
return i->count; // it is a silly place, anyway
case ITER_BVEC:
- return min(i->count, i->bvec->bv_len - i->iov_offset);
+ return min_t(size_t, i->count, i->bvec->bv_len - i->iov_offset);
case ITER_KVEC:
case ITER_IOVEC:
- return min(i->count, i->iov->iov_len - i->iov_offset);
+ return min_t(size_t, i->count, i->iov->iov_len - i->iov_offset);
}
BUG();
}
diff --git a/net/9p/client.c b/net/9p/client.c
index 1e853e2a62bb..19b045714206 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -1649,7 +1649,7 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
int total = 0;
*err = 0;
- p9_debug(P9_DEBUG_9P, ">>> TWRITE fid %d offset %llu count %zd\n",
+ p9_debug(P9_DEBUG_9P, ">>> TWRITE fid %d offset %llu count %llu\n",
fid->fid, (unsigned long long) offset,
iov_iter_count(from));
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index 816b19a78809..920f505baedd 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -633,7 +633,7 @@ int rxrpc_kernel_recv_data(struct socket *sock, struct rxrpc_call *call,
size_t offset = 0;
int ret;
- _enter("{%d,%s},%zu,%d",
+ _enter("{%d,%s},%llu,%d",
call->debug_id, rxrpc_call_states[call->state],
iov_iter_count(iter), want_more);
@@ -693,7 +693,7 @@ int rxrpc_kernel_recv_data(struct socket *sock, struct rxrpc_call *call,
if (_service)
*_service = call->service_id;
mutex_unlock(&call->user_mutex);
- _leave(" = %d [%zu,%d]", ret, iov_iter_count(iter), *_abort);
+ _leave(" = %d [%llu,%d]", ret, iov_iter_count(iter), *_abort);
return ret;
short_data:
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 04/10] iov_iter: Add mapping and discard iterator types
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
` (2 preceding siblings ...)
2018-08-06 13:16 ` [PATCH 03/10] iov_iter: Make count and iov_offset loff_t not size_t David Howells
@ 2018-08-06 13:17 ` David Howells
2018-08-06 13:17 ` [PATCH 05/10] afs: Better tracing of protocol errors David Howells
` (5 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:17 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Add two new iterator types to iov_iter:
(1) ITER_MAPPING
This walks through a set of pages attached to an address_space that
are pinned or locked, starting at a given page and offset and walking
for the specified amount of space. A facility to get a callback each
time a page is entirely processed is provided.
This is useful for copying data from socket buffers to inodes in
network filesystems.
(2) ITER_DISCARD
This is a sink iterator that can only be used in READ mode and just
discards any data copied to it.
This is useful in a network filesystem for discarding any unwanted
data sent by a server.
Signed-off-by: David Howells <dhowells@redhat.com>
---
include/linux/uio.h | 8 +
lib/iov_iter.c | 365 +++++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 330 insertions(+), 43 deletions(-)
diff --git a/include/linux/uio.h b/include/linux/uio.h
index d0655b74fd7e..24f6901472e9 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -14,6 +14,7 @@
#include <uapi/linux/uio.h>
struct page;
+struct address_space;
struct pipe_inode_info;
struct kvec {
@@ -26,6 +27,8 @@ enum iter_type {
ITER_KVEC,
ITER_BVEC,
ITER_PIPE,
+ ITER_DISCARD,
+ ITER_MAPPING,
};
struct iov_iter {
@@ -37,6 +40,7 @@ struct iov_iter {
const struct iovec *iov;
const struct kvec *kvec;
const struct bio_vec *bvec;
+ struct address_space *mapping;
struct pipe_inode_info *pipe;
};
union {
@@ -45,6 +49,7 @@ struct iov_iter {
int idx;
int start_idx;
};
+ void (*page_done)(const struct iov_iter *, const struct bio_vec *);
};
};
@@ -211,6 +216,9 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_
unsigned long nr_segs, size_t count);
void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
size_t count);
+void iov_iter_mapping(struct iov_iter *i, unsigned int direction, struct address_space *mapping,
+ loff_t start, size_t count);
+void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
size_t maxsize, unsigned maxpages, size_t *start);
ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 8231f0e38f20..22b35464891b 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -72,7 +72,35 @@
} \
}
-#define iterate_all_kinds(i, n, v, I, B, K) { \
+#define iterate_mapping(i, n, __v, do_done, skip, STEP) { \
+ struct radix_tree_iter cursor; \
+ size_t wanted = n, seg, offset; \
+ pgoff_t index = skip >> PAGE_SHIFT; \
+ void __rcu **slot; \
+ \
+ rcu_read_lock(); \
+ radix_tree_for_each_contig(slot, &i->mapping->i_pages, \
+ &cursor, index) { \
+ if (!n) \
+ break; \
+ __v.bv_page = radix_tree_deref_slot(slot); \
+ if (!__v.bv_page) \
+ break; \
+ offset = skip & ~PAGE_MASK; \
+ seg = PAGE_SIZE - offset; \
+ __v.bv_offset = offset; \
+ __v.bv_len = min(n, seg); \
+ (void)(STEP); \
+ if (do_done && __v.bv_offset + __v.bv_len == PAGE_SIZE) \
+ i->page_done(i, &__v); \
+ n -= __v.bv_len; \
+ skip += __v.bv_len; \
+ } \
+ rcu_read_unlock(); \
+ n = wanted - n; \
+}
+
+#define iterate_all_kinds(i, n, v, I, B, K, M) { \
if (likely(n)) { \
loff_t skip = i->iov_offset; \
switch (iov_iter_type(i)) { \
@@ -91,6 +119,14 @@
case ITER_PIPE: { \
break; \
} \
+ case ITER_MAPPING: { \
+ struct bio_vec v; \
+ iterate_mapping(i, n, v, false, skip, (M)); \
+ break; \
+ } \
+ case ITER_DISCARD: { \
+ break; \
+ } \
case ITER_IOVEC: { \
const struct iovec *iov; \
struct iovec v; \
@@ -101,7 +137,7 @@
} \
}
-#define iterate_and_advance(i, n, v, I, B, K) { \
+#define iterate_and_advance(i, n, v, I, B, K, M) { \
if (unlikely(i->count < n)) \
n = i->count; \
if (i->count) { \
@@ -129,6 +165,11 @@
i->kvec = kvec; \
break; \
} \
+ case ITER_MAPPING: { \
+ struct bio_vec v; \
+ iterate_mapping(i, n, v, i->page_done, skip, (M)) \
+ break; \
+ } \
case ITER_IOVEC: { \
const struct iovec *iov; \
struct iovec v; \
@@ -144,6 +185,10 @@
case ITER_PIPE: { \
break; \
} \
+ case ITER_DISCARD: { \
+ skip += n; \
+ break; \
+ } \
} \
i->count -= n; \
i->iov_offset = skip; \
@@ -448,6 +493,8 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
break;
case ITER_KVEC:
case ITER_BVEC:
+ case ITER_MAPPING:
+ case ITER_DISCARD:
break;
}
return 0;
@@ -593,7 +640,9 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
copyout(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len),
memcpy_to_page(v.bv_page, v.bv_offset,
(from += v.bv_len) - v.bv_len, v.bv_len),
- memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len)
+ memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len),
+ memcpy_to_page(v.bv_page, v.bv_offset,
+ (from += v.bv_len) - v.bv_len, v.bv_len)
)
return bytes;
@@ -708,6 +757,15 @@ size_t _copy_to_iter_mcsafe(const void *addr, size_t bytes, struct iov_iter *i)
bytes = curr_addr - s_addr - rem;
return bytes;
}
+ }),
+ ({
+ rem = memcpy_mcsafe_to_page(v.bv_page, v.bv_offset,
+ (from += v.bv_len) - v.bv_len, v.bv_len);
+ if (rem) {
+ curr_addr = (unsigned long) from;
+ bytes = curr_addr - s_addr - rem;
+ return bytes;
+ }
})
)
@@ -729,7 +787,9 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
copyin((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
v.bv_offset, v.bv_len),
- memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
+ memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
+ memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
+ v.bv_offset, v.bv_len)
)
return bytes;
@@ -755,7 +815,9 @@ bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
0;}),
memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
v.bv_offset, v.bv_len),
- memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
+ memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
+ memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
+ v.bv_offset, v.bv_len)
)
iov_iter_advance(i, bytes);
@@ -775,7 +837,9 @@ size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
v.iov_base, v.iov_len),
memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
v.bv_offset, v.bv_len),
- memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
+ memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
+ memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
+ v.bv_offset, v.bv_len)
)
return bytes;
@@ -810,7 +874,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
memcpy_page_flushcache((to += v.bv_len) - v.bv_len, v.bv_page,
v.bv_offset, v.bv_len),
memcpy_flushcache((to += v.iov_len) - v.iov_len, v.iov_base,
- v.iov_len)
+ v.iov_len),
+ memcpy_page_flushcache((to += v.bv_len) - v.bv_len, v.bv_page,
+ v.bv_offset, v.bv_len)
)
return bytes;
@@ -834,7 +900,9 @@ bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
0;}),
memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
v.bv_offset, v.bv_len),
- memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
+ memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
+ memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
+ v.bv_offset, v.bv_len)
)
iov_iter_advance(i, bytes);
@@ -860,7 +928,8 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
return 0;
switch (iov_iter_type(i)) {
case ITER_BVEC:
- case ITER_KVEC: {
+ case ITER_KVEC:
+ case ITER_MAPPING: {
void *kaddr = kmap_atomic(page);
size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
kunmap_atomic(kaddr);
@@ -870,6 +939,8 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
return copy_page_to_iter_iovec(page, offset, bytes, i);
case ITER_PIPE:
return copy_page_to_iter_pipe(page, offset, bytes, i);
+ case ITER_DISCARD:
+ return bytes;
}
BUG();
}
@@ -882,7 +953,9 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
return 0;
switch (iov_iter_type(i)) {
case ITER_PIPE:
+ case ITER_DISCARD:
break;
+ case ITER_MAPPING:
case ITER_BVEC:
case ITER_KVEC: {
void *kaddr = kmap_atomic(page);
@@ -930,7 +1003,8 @@ size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
iterate_and_advance(i, bytes, v,
clear_user(v.iov_base, v.iov_len),
memzero_page(v.bv_page, v.bv_offset, v.bv_len),
- memset(v.iov_base, 0, v.iov_len)
+ memset(v.iov_base, 0, v.iov_len),
+ memzero_page(v.bv_page, v.bv_offset, v.bv_len)
)
return bytes;
@@ -945,7 +1019,7 @@ size_t iov_iter_copy_from_user_atomic(struct page *page,
kunmap_atomic(kaddr);
return 0;
}
- if (unlikely(iov_iter_is_pipe(i))) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_type(i) == ITER_DISCARD)) {
kunmap_atomic(kaddr);
WARN_ON(1);
return 0;
@@ -954,7 +1028,9 @@ size_t iov_iter_copy_from_user_atomic(struct page *page,
copyin((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
memcpy_from_page((p += v.bv_len) - v.bv_len, v.bv_page,
v.bv_offset, v.bv_len),
- memcpy((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
+ memcpy((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
+ memcpy_from_page((p += v.bv_len) - v.bv_len, v.bv_page,
+ v.bv_offset, v.bv_len)
)
kunmap_atomic(kaddr);
return bytes;
@@ -1016,7 +1092,14 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
case ITER_IOVEC:
case ITER_KVEC:
case ITER_BVEC:
- iterate_and_advance(i, size, v, 0, 0, 0);
+ iterate_and_advance(i, size, v, 0, 0, 0, 0);
+ return;
+ case ITER_MAPPING:
+ /* We really don't want to fetch pages is we can avoid it */
+ i->iov_offset += size;
+ /* Fall through */
+ case ITER_DISCARD:
+ i->count -= size;
return;
}
BUG();
@@ -1060,6 +1143,14 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
}
unroll -= i->iov_offset;
switch (iov_iter_type(i)) {
+ case ITER_MAPPING:
+ BUG(); /* We should never go beyond the start of the mapping
+ * since iov_offset includes that page number as well as
+ * the in-page offset.
+ */
+ case ITER_DISCARD:
+ i->iov_offset = 0;
+ return;
case ITER_BVEC: {
const struct bio_vec *bvec = i->bvec;
while (1) {
@@ -1103,6 +1194,8 @@ size_t iov_iter_single_seg_count(const struct iov_iter *i)
return i->count;
switch (iov_iter_type(i)) {
case ITER_PIPE:
+ case ITER_DISCARD:
+ case ITER_MAPPING:
return i->count; // it is a silly place, anyway
case ITER_BVEC:
return min_t(size_t, i->count, i->bvec->bv_len - i->iov_offset);
@@ -1158,6 +1251,52 @@ void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
}
EXPORT_SYMBOL(iov_iter_pipe);
+/**
+ * iov_iter_mapping - Initialise an I/O iterator to use the pages in a mapping
+ * @i: The iterator to initialise.
+ * @direction: The direction of the transfer.
+ * @mapping: The mapping to access.
+ * @start: The start file position.
+ * @count: The size of the I/O buffer in bytes.
+ *
+ * Set up an I/O iterator to either draw data out of the pages attached to an
+ * inode or to inject data into those pages. The pages *must* be prevented
+ * from evaporation, either by taking a ref on them or locking them by the
+ * caller.
+ */
+void iov_iter_mapping(struct iov_iter *i, unsigned int direction,
+ struct address_space *mapping,
+ loff_t start, size_t count)
+{
+ BUG_ON(direction & ~1);
+ i->iter_dir = direction;
+ i->iter_type = ITER_MAPPING;
+ i->mapping = mapping;
+ i->count = count;
+ i->iov_offset = start;
+ i->page_done = NULL;
+}
+EXPORT_SYMBOL(iov_iter_mapping);
+
+/**
+ * iov_iter_discard - Initialise an I/O iterator that discards data
+ * @i: The iterator to initialise.
+ * @direction: The direction of the transfer.
+ * @count: The size of the I/O buffer in bytes.
+ *
+ * Set up an I/O iterator that just discards everything that's written to it.
+ * It's only available as a READ iterator.
+ */
+void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
+{
+ BUG_ON(direction != READ);
+ i->iter_dir = READ;
+ i->iter_type = ITER_DISCARD;
+ i->count = count;
+ i->iov_offset = 0;
+}
+EXPORT_SYMBOL(iov_iter_discard);
+
unsigned long iov_iter_alignment(const struct iov_iter *i)
{
unsigned long res = 0;
@@ -1171,7 +1310,8 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
iterate_all_kinds(i, size, v,
(res |= (unsigned long)v.iov_base | v.iov_len, 0),
res |= v.bv_offset | v.bv_len,
- res |= (unsigned long)v.iov_base | v.iov_len
+ res |= (unsigned long)v.iov_base | v.iov_len,
+ res |= v.bv_offset | v.bv_len
)
return res;
}
@@ -1182,7 +1322,7 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
unsigned long res = 0;
size_t size = i->count;
- if (unlikely(iov_iter_is_pipe(i))) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_type(i) == ITER_DISCARD)) {
WARN_ON(1);
return ~0U;
}
@@ -1193,7 +1333,9 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
(res |= (!res ? 0 : (unsigned long)v.bv_offset) |
(size != v.bv_len ? size : 0)),
(res |= (!res ? 0 : (unsigned long)v.iov_base) |
- (size != v.iov_len ? size : 0))
+ (size != v.iov_len ? size : 0)),
+ (res |= (!res ? 0 : (unsigned long)v.bv_offset) |
+ (size != v.bv_len ? size : 0))
);
return res;
}
@@ -1243,6 +1385,43 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
return __pipe_get_pages(i, min(maxsize, capacity), pages, idx, start);
}
+static ssize_t iter_mapping_get_pages(struct iov_iter *i,
+ struct page **pages, size_t maxsize,
+ unsigned maxpages, size_t *start)
+{
+ unsigned nr, offset;
+ pgoff_t index, count;
+ size_t size = maxsize;
+
+ if (!size || !maxpages)
+ return 0;
+
+ index = i->iov_offset >> PAGE_SHIFT;
+ offset = i->iov_offset & ~PAGE_MASK;
+ *start = offset;
+
+ count = 1;
+ if (size > PAGE_SIZE - offset) {
+ size -= PAGE_SIZE - offset;
+ count += size >> PAGE_SHIFT;
+ size &= ~PAGE_MASK;
+ if (size)
+ count++;
+ }
+
+ if (count > maxpages)
+ count = maxpages;
+
+ nr = find_get_pages_contig(i->mapping, index, count, pages);
+ if (nr == count)
+ return maxsize;
+ if (nr == 0)
+ return 0;
+ if (nr == 1)
+ return PAGE_SIZE - offset;
+ return (PAGE_SIZE - offset) + count * PAGE_SIZE;
+}
+
ssize_t iov_iter_get_pages(struct iov_iter *i,
struct page **pages, size_t maxsize, unsigned maxpages,
size_t *start)
@@ -1253,6 +1432,9 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
switch (iov_iter_type(i)) {
case ITER_PIPE:
return pipe_get_pages(i, pages, maxsize, maxpages, start);
+ case ITER_MAPPING:
+ return iter_mapping_get_pages(i, pages, maxsize, maxpages, start);
+ case ITER_DISCARD:
case ITER_KVEC:
return -EFAULT;
case ITER_IOVEC:
@@ -1279,9 +1461,7 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
*start = v.bv_offset;
get_page(*pages = v.bv_page);
return v.bv_len;
- }),({
- return -EFAULT;
- })
+ }), 0, 0
)
return 0;
}
@@ -1326,6 +1506,48 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
return n;
}
+static ssize_t iter_mapping_get_pages_alloc(struct iov_iter *i,
+ struct page ***pages, size_t maxsize,
+ size_t *start)
+{
+ struct page **p;
+ unsigned nr, offset;
+ pgoff_t index, count;
+ size_t size = maxsize;
+
+ if (!size)
+ return 0;
+
+ index = i->iov_offset >> PAGE_SHIFT;
+ offset = i->iov_offset & ~PAGE_MASK;
+ *start = offset;
+
+ count = 1;
+ if (size > PAGE_SIZE - offset) {
+ size -= PAGE_SIZE - offset;
+ count += size >> PAGE_SHIFT;
+ size &= ~PAGE_MASK;
+ if (size)
+ count++;
+ }
+
+ p = get_pages_array(count);
+ if (!p)
+ return -ENOMEM;
+ *pages = p;
+
+ nr = find_get_pages_contig(i->mapping, index, count, p);
+ if (nr == count)
+ return maxsize;
+ if (nr == 0) {
+ kvfree(p);
+ return 0;
+ }
+ if (nr == 1)
+ return PAGE_SIZE - offset;
+ return (PAGE_SIZE - offset) + count * PAGE_SIZE;
+}
+
ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
struct page ***pages, size_t maxsize,
size_t *start)
@@ -1338,6 +1560,9 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
switch (iov_iter_type(i)) {
case ITER_PIPE:
return pipe_get_pages_alloc(i, pages, maxsize, start);
+ case ITER_MAPPING:
+ return iter_mapping_get_pages_alloc(i, pages, maxsize, start);
+ case ITER_DISCARD:
case ITER_KVEC:
return -EFAULT;
case ITER_IOVEC:
@@ -1371,9 +1596,7 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
return -ENOMEM;
get_page(*p = v.bv_page);
return v.bv_len;
- }),({
- return -EFAULT;
- })
+ }), 0, 0
)
return 0;
}
@@ -1386,7 +1609,7 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(iov_iter_is_pipe(i))) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_type(i) == ITER_DISCARD)) {
WARN_ON(1);
return 0;
}
@@ -1414,6 +1637,14 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
v.iov_len, 0);
sum = csum_block_add(sum, next, off);
off += v.iov_len;
+ }), ({
+ char *p = kmap_atomic(v.bv_page);
+ next = csum_partial_copy_nocheck(p + v.bv_offset,
+ (to += v.bv_len) - v.bv_len,
+ v.bv_len, 0);
+ kunmap_atomic(p);
+ sum = csum_block_add(sum, next, off);
+ off += v.bv_len;
})
)
*csum = sum;
@@ -1428,7 +1659,7 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(iov_iter_is_pipe(i))) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_type(i) == ITER_DISCARD)) {
WARN_ON(1);
return false;
}
@@ -1458,6 +1689,14 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
v.iov_len, 0);
sum = csum_block_add(sum, next, off);
off += v.iov_len;
+ }), ({
+ char *p = kmap_atomic(v.bv_page);
+ next = csum_partial_copy_nocheck(p + v.bv_offset,
+ (to += v.bv_len) - v.bv_len,
+ v.bv_len, 0);
+ kunmap_atomic(p);
+ sum = csum_block_add(sum, next, off);
+ off += v.bv_len;
})
)
*csum = sum;
@@ -1473,7 +1712,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, __wsum *csum,
__wsum sum, next;
size_t off = 0;
sum = *csum;
- if (unlikely(iov_iter_is_pipe(i))) {
+ if (unlikely(iov_iter_is_pipe(i) || iov_iter_type(i) == ITER_DISCARD)) {
WARN_ON(1); /* for now */
return 0;
}
@@ -1501,6 +1740,14 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, __wsum *csum,
v.iov_len, 0);
sum = csum_block_add(sum, next, off);
off += v.iov_len;
+ }), ({
+ char *p = kmap_atomic(v.bv_page);
+ next = csum_partial_copy_nocheck((from += v.bv_len) - v.bv_len,
+ p + v.bv_offset,
+ v.bv_len, 0);
+ kunmap_atomic(p);
+ sum = csum_block_add(sum, next, off);
+ off += v.bv_len;
})
)
*csum = sum;
@@ -1516,7 +1763,8 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
if (!size)
return 0;
- if (unlikely(iov_iter_is_pipe(i))) {
+ switch (iov_iter_type(i)) {
+ case ITER_PIPE: {
struct pipe_inode_info *pipe = i->pipe;
size_t off;
int idx;
@@ -1529,24 +1777,47 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
npages = ((pipe->curbuf - idx - 1) & (pipe->buffers - 1)) + 1;
if (npages >= maxpages)
return maxpages;
- } else iterate_all_kinds(i, size, v, ({
- unsigned long p = (unsigned long)v.iov_base;
- npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
- - p / PAGE_SIZE;
- if (npages >= maxpages)
- return maxpages;
- 0;}),({
- npages++;
- if (npages >= maxpages)
- return maxpages;
- }),({
- unsigned long p = (unsigned long)v.iov_base;
- npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
- - p / PAGE_SIZE;
+ }
+ case ITER_MAPPING: {
+ unsigned offset;
+
+ offset = i->iov_offset & ~PAGE_MASK;
+
+ npages = 1;
+ if (size > PAGE_SIZE - offset) {
+ size -= PAGE_SIZE - offset;
+ npages += size >> PAGE_SHIFT;
+ size &= ~PAGE_MASK;
+ if (size)
+ npages++;
+ }
if (npages >= maxpages)
return maxpages;
- })
- )
+ }
+ case ITER_DISCARD:
+ return 0;
+
+ default:
+ iterate_all_kinds(i, size, v, ({
+ unsigned long p = (unsigned long)v.iov_base;
+ npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
+ - p / PAGE_SIZE;
+ if (npages >= maxpages)
+ return maxpages;
+ 0;}),({
+ npages++;
+ if (npages >= maxpages)
+ return maxpages;
+ }),({
+ unsigned long p = (unsigned long)v.iov_base;
+ npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
+ - p / PAGE_SIZE;
+ if (npages >= maxpages)
+ return maxpages;
+ }),
+ 0
+ )
+ }
return npages;
}
EXPORT_SYMBOL(iov_iter_npages);
@@ -1567,6 +1838,9 @@ const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
return new->iov = kmemdup(new->iov,
new->nr_segs * sizeof(struct iovec),
flags);
+ case ITER_MAPPING:
+ case ITER_DISCARD:
+ return NULL;
}
WARN_ON(1);
@@ -1670,7 +1944,12 @@ int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
kunmap(v.bv_page);
err;}), ({
w = v;
- err = f(&w, context);})
+ err = f(&w, context);}), ({
+ w.iov_base = kmap(v.bv_page) + v.bv_offset;
+ w.iov_len = v.bv_len;
+ err = f(&w, context);
+ kunmap(v.bv_page);
+ err;})
)
return err;
}
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 05/10] afs: Better tracing of protocol errors
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
` (3 preceding siblings ...)
2018-08-06 13:17 ` [PATCH 04/10] iov_iter: Add mapping and discard iterator types David Howells
@ 2018-08-06 13:17 ` David Howells
2018-08-06 13:17 ` [PATCH 06/10] afs: Set up the iov_iter before calling afs_extract_data() David Howells
` (4 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:17 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Include the site of detection of AFS protocol errors in trace lines to
better be able to determine what went wrong.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/afs/cmservice.c | 6 ++
fs/afs/fsclient.c | 117 +++++++++++++++++++++++++++-----------------
fs/afs/inode.c | 2 -
fs/afs/internal.h | 2 -
fs/afs/rxrpc.c | 5 +-
fs/afs/vlclient.c | 30 ++++++++---
include/trace/events/afs.h | 54 +++++++++++++++++---
7 files changed, 146 insertions(+), 70 deletions(-)
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 9e51d6fe7e8f..58f79301a716 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -189,7 +189,8 @@ static int afs_deliver_cb_callback(struct afs_call *call)
call->count = ntohl(call->tmp);
_debug("FID count: %u", call->count);
if (call->count > AFSCBMAX)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_cb_fid_count);
call->buffer = kmalloc(array3_size(call->count, 3, 4),
GFP_KERNEL);
@@ -234,7 +235,8 @@ static int afs_deliver_cb_callback(struct afs_call *call)
call->count2 = ntohl(call->tmp);
_debug("CB count: %u", call->count2);
if (call->count2 != call->count && call->count2 != 0)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_cb_count);
call->offset = 0;
call->unmarshall++;
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 50929cb91732..d9a5815945dc 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -233,7 +233,7 @@ static int xdr_decode_AFSFetchStatus(struct afs_call *call,
bad:
xdr_dump_bad(*_bp);
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG, afs_eproto_bad_status);
}
/*
@@ -399,9 +399,10 @@ static int afs_deliver_fs_fetch_status_vnode(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
xdr_decode_AFSCallBack(call, vnode, &bp);
if (call->reply[1])
xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -580,9 +581,10 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
return ret;
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &vnode->status.data_version, req) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &vnode->status.data_version, req);
+ if (ret < 0)
+ return ret;
xdr_decode_AFSCallBack(call, vnode, &bp);
if (call->reply[1])
xdr_decode_AFSVolSync(&bp, call->reply[1]);
@@ -733,10 +735,13 @@ static int afs_deliver_fs_create_vnode(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
xdr_decode_AFSFid(&bp, call->reply[1]);
- if (afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL) < 0 ||
- afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
xdr_decode_AFSCallBack_raw(&bp, call->reply[3]);
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
@@ -839,9 +844,10 @@ static int afs_deliver_fs_remove(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
@@ -929,10 +935,13 @@ static int afs_deliver_fs_link(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode, NULL, NULL) < 0 ||
- afs_decode_status(call, &bp, &dvnode->status, dvnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = afs_decode_status(call, &bp, &dvnode->status, dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
@@ -1016,10 +1025,13 @@ static int afs_deliver_fs_symlink(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
xdr_decode_AFSFid(&bp, call->reply[1]);
- if (afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL) ||
- afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, call->reply[2], NULL, NULL, NULL);
+ if (ret < 0)
+ return ret;
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
@@ -1122,13 +1134,16 @@ static int afs_deliver_fs_rename(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &orig_dvnode->status, orig_dvnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
- if (new_dvnode != orig_dvnode &&
- afs_decode_status(call, &bp, &new_dvnode->status, new_dvnode,
- &call->expected_version_2, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &orig_dvnode->status, orig_dvnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
+ if (new_dvnode != orig_dvnode) {
+ ret = afs_decode_status(call, &bp, &new_dvnode->status, new_dvnode,
+ &call->expected_version_2, NULL);
+ if (ret < 0)
+ return ret;
+ }
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
@@ -1231,9 +1246,10 @@ static int afs_deliver_fs_store_data(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
afs_pages_written_back(vnode, call);
@@ -1407,9 +1423,10 @@ static int afs_deliver_fs_store_status(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- if (afs_decode_status(call, &bp, &vnode->status, vnode,
- &call->expected_version, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &vnode->status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
/* xdr_decode_AFSVolSync(&bp, call->reply[X]); */
_leave(" = 0 [done]");
@@ -1612,7 +1629,8 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
call->count = ntohl(call->tmp);
_debug("volname length: %u", call->count);
if (call->count >= AFSNAMEMAX)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_volname_len);
call->offset = 0;
call->unmarshall++;
@@ -1659,7 +1677,8 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
call->count = ntohl(call->tmp);
_debug("offline msg length: %u", call->count);
if (call->count >= AFSNAMEMAX)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_offline_msg_len);
call->offset = 0;
call->unmarshall++;
@@ -1706,7 +1725,8 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
call->count = ntohl(call->tmp);
_debug("motd length: %u", call->count);
if (call->count >= AFSNAMEMAX)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_motd_len);
call->offset = 0;
call->unmarshall++;
@@ -2109,8 +2129,10 @@ static int afs_deliver_fs_fetch_status(struct afs_call *call)
/* unmarshall the reply once we've received all of it */
bp = call->buffer;
- afs_decode_status(call, &bp, status, vnode,
- &call->expected_version, NULL);
+ ret = afs_decode_status(call, &bp, status, vnode,
+ &call->expected_version, NULL);
+ if (ret < 0)
+ return ret;
callback[call->count].version = ntohl(bp[0]);
callback[call->count].expiry = ntohl(bp[1]);
callback[call->count].type = ntohl(bp[2]);
@@ -2206,7 +2228,8 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
tmp = ntohl(call->tmp);
_debug("status count: %u/%u", tmp, call->count2);
if (tmp != call->count2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_ibulkst_count);
call->count = 0;
call->unmarshall++;
@@ -2221,10 +2244,11 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
bp = call->buffer;
statuses = call->reply[1];
- if (afs_decode_status(call, &bp, &statuses[call->count],
- call->count == 0 ? vnode : NULL,
- NULL, NULL) < 0)
- return afs_protocol_error(call, -EBADMSG);
+ ret = afs_decode_status(call, &bp, &statuses[call->count],
+ call->count == 0 ? vnode : NULL,
+ NULL, NULL);
+ if (ret < 0)
+ return ret;
call->count++;
if (call->count < call->count2)
@@ -2244,7 +2268,8 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
tmp = ntohl(call->tmp);
_debug("CB count: %u", tmp);
if (tmp != call->count2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_ibulkst_cb_count);
call->count = 0;
call->unmarshall++;
more_cbs:
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 479b7fdda124..ab4e7a15c205 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -82,7 +82,7 @@ static int afs_inode_init_from_status(struct afs_vnode *vnode, struct key *key)
default:
printk("kAFS: AFS vnode with undefined type\n");
read_sequnlock_excl(&vnode->cb_lock);
- return afs_protocol_error(NULL, -EBADMSG);
+ return afs_protocol_error(NULL, -EBADMSG, afs_eproto_file_type);
}
inode->i_blocks = 0;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 35aa4bd79145..852f68a400b8 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -928,7 +928,7 @@ extern void afs_flat_call_destructor(struct afs_call *);
extern void afs_send_empty_reply(struct afs_call *);
extern void afs_send_simple_reply(struct afs_call *, const void *, size_t);
extern int afs_extract_data(struct afs_call *, void *, size_t, bool);
-extern int afs_protocol_error(struct afs_call *, int);
+extern int afs_protocol_error(struct afs_call *, int, enum afs_eproto_cause);
static inline int afs_transfer_reply(struct afs_call *call)
{
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 55441af2de53..37811901a5c9 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -941,8 +941,9 @@ int afs_extract_data(struct afs_call *call, void *buf, size_t count,
/*
* Log protocol error production.
*/
-noinline int afs_protocol_error(struct afs_call *call, int error)
+noinline int afs_protocol_error(struct afs_call *call, int error,
+ enum afs_eproto_cause cause)
{
- trace_afs_protocol_error(call, error, __builtin_return_address(0));
+ trace_afs_protocol_error(call, error, cause);
return error;
}
diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c
index c3b740813fc7..d0f95c4ab05e 100644
--- a/fs/afs/vlclient.c
+++ b/fs/afs/vlclient.c
@@ -451,7 +451,8 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
call->count2 = ntohl(*bp); /* Type or next count */
if (call->count > YFS_MAXENDPOINTS)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt_num);
alist = afs_alloc_addrlist(call->count, FS_SERVICE, AFS_FS_PORT);
if (!alist)
@@ -475,7 +476,8 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
size = sizeof(__be32) * (1 + 4 + 1);
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt_type);
}
size += sizeof(__be32);
@@ -488,18 +490,21 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
if (ntohl(bp[0]) != sizeof(__be32) * 2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt4_len);
afs_merge_fs_addr4(alist, bp[1], ntohl(bp[2]));
bp += 3;
break;
case YFS_ENDPOINT_IPV6:
if (ntohl(bp[0]) != sizeof(__be32) * 5)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt6_len);
afs_merge_fs_addr6(alist, bp + 1, ntohl(bp[5]));
bp += 6;
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_fsendpt_type);
}
/* Got either the type of the next entry or the count of
@@ -518,7 +523,8 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
if (!call->count)
goto end;
if (call->count > YFS_MAXENDPOINTS)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt_type);
call->unmarshall = 3;
@@ -546,7 +552,8 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
size = sizeof(__be32) * (1 + 4 + 1);
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt_type);
}
if (call->count > 1)
@@ -559,16 +566,19 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
if (ntohl(bp[0]) != sizeof(__be32) * 2)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt4_len);
bp += 3;
break;
case YFS_ENDPOINT_IPV6:
if (ntohl(bp[0]) != sizeof(__be32) * 5)
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt6_len);
bp += 6;
break;
default:
- return afs_protocol_error(call, -EBADMSG);
+ return afs_protocol_error(call, -EBADMSG,
+ afs_eproto_yvl_vlendpt_type);
}
/* Got either the type of the next entry or the count of
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index d0a341bc4540..5c60ade2c7d8 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -84,6 +84,25 @@ enum afs_edit_dir_reason {
afs_edit_dir_for_unlink,
};
+enum afs_eproto_cause {
+ afs_eproto_bad_status,
+ afs_eproto_cb_count,
+ afs_eproto_cb_fid_count,
+ afs_eproto_file_type,
+ afs_eproto_ibulkst_cb_count,
+ afs_eproto_ibulkst_count,
+ afs_eproto_motd_len,
+ afs_eproto_offline_msg_len,
+ afs_eproto_volname_len,
+ afs_eproto_yvl_fsendpt4_len,
+ afs_eproto_yvl_fsendpt6_len,
+ afs_eproto_yvl_fsendpt_num,
+ afs_eproto_yvl_fsendpt_type,
+ afs_eproto_yvl_vlendpt4_len,
+ afs_eproto_yvl_vlendpt6_len,
+ afs_eproto_yvl_vlendpt_type,
+};
+
#endif /* end __AFS_DECLARE_TRACE_ENUMS_ONCE_ONLY */
/*
@@ -146,6 +165,24 @@ enum afs_edit_dir_reason {
EM(afs_edit_dir_for_symlink, "Symlnk") \
E_(afs_edit_dir_for_unlink, "Unlink")
+#define afs_eproto_causes \
+ EM(afs_eproto_bad_status, "BadStatus") \
+ EM(afs_eproto_cb_count, "CbCount") \
+ EM(afs_eproto_cb_fid_count, "CbFidCount") \
+ EM(afs_eproto_file_type, "FileTYpe") \
+ EM(afs_eproto_ibulkst_cb_count, "IBS.CbCount") \
+ EM(afs_eproto_ibulkst_count, "IBS.FidCount") \
+ EM(afs_eproto_motd_len, "MotdLen") \
+ EM(afs_eproto_offline_msg_len, "OfflineMsgLen") \
+ EM(afs_eproto_volname_len, "VolNameLen") \
+ EM(afs_eproto_yvl_fsendpt4_len, "YVL.FsEnd4Len") \
+ EM(afs_eproto_yvl_fsendpt6_len, "YVL.FsEnd6Len") \
+ EM(afs_eproto_yvl_fsendpt_num, "YVL.FsEndCount") \
+ EM(afs_eproto_yvl_fsendpt_type, "YVL.FsEndType") \
+ EM(afs_eproto_yvl_vlendpt4_len, "YVL.VlEnd4Len") \
+ EM(afs_eproto_yvl_vlendpt6_len, "YVL.VlEnd6Len") \
+ E_(afs_eproto_yvl_vlendpt_type, "YVL.VlEndType")
+
/*
* Export enum symbols via userspace.
@@ -555,24 +592,25 @@ TRACE_EVENT(afs_edit_dir,
);
TRACE_EVENT(afs_protocol_error,
- TP_PROTO(struct afs_call *call, int error, const void *where),
+ TP_PROTO(struct afs_call *call, int error, enum afs_eproto_cause cause),
- TP_ARGS(call, error, where),
+ TP_ARGS(call, error, cause),
TP_STRUCT__entry(
- __field(unsigned int, call )
- __field(int, error )
- __field(const void *, where )
+ __field(unsigned int, call )
+ __field(int, error )
+ __field(enum afs_eproto_cause, cause )
),
TP_fast_assign(
__entry->call = call ? call->debug_id : 0;
__entry->error = error;
- __entry->where = where;
+ __entry->cause = cause;
),
- TP_printk("c=%08x r=%d sp=%pSR",
- __entry->call, __entry->error, __entry->where)
+ TP_printk("c=%08x r=%d %s",
+ __entry->call, __entry->error,
+ __print_symbolic(__entry->cause, afs_eproto_causes))
);
TRACE_EVENT(afs_cm_no_server,
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 06/10] afs: Set up the iov_iter before calling afs_extract_data()
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
` (4 preceding siblings ...)
2018-08-06 13:17 ` [PATCH 05/10] afs: Better tracing of protocol errors David Howells
@ 2018-08-06 13:17 ` David Howells
2018-08-06 13:17 ` [PATCH 07/10] afs: Use ITER_MAPPING for writing David Howells
` (3 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:17 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
afs_extract_data sets up a temporary iov_iter and passes it to AF_RXRPC
each time it is called to describe the remaining buffer to be filled.
Instead:
(1) Put an iterator in the afs_call struct.
(2) Set the iterator for each marshalling stage to load data into the
appropriate places. A number of convenience functions are provided to
this end (eg. afs_extract_to_buf()).
This iterator is then passed to afs_extract_data().
(3) Use the new ITER_MAPPING iterator when reading data to load directly
into the inode's pages without needing to create a list of them. This
comes with a page-done callback that can be used to unlock pages as
they are filled.
(4) Use the new ITER_DISCARD iterator to discard any excess data provided
by FetchData.
This will allow O_DIRECT calls to be supported in future patches.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/afs/cmservice.c | 40 +++----
fs/afs/dir.c | 193 +++++++++++++++++++++++-----------
fs/afs/file.c | 200 +++++++++++++++++++++++------------
fs/afs/fsclient.c | 252 ++++++++++++--------------------------------
fs/afs/internal.h | 52 ++++++++-
fs/afs/rxrpc.c | 41 ++-----
fs/afs/vlclient.c | 104 ++++++++----------
fs/afs/write.c | 8 +
include/linux/fscache.h | 31 +++++
include/trace/events/afs.h | 22 ++--
10 files changed, 502 insertions(+), 441 deletions(-)
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 58f79301a716..4db62ae8dc1a 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -176,13 +176,13 @@ static int afs_deliver_cb_callback(struct afs_call *call)
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* extract the FID array and its count in two steps */
case 1:
_debug("extract FID count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -196,13 +196,12 @@ static int afs_deliver_cb_callback(struct afs_call *call)
GFP_KERNEL);
if (!call->buffer)
return -ENOMEM;
- call->offset = 0;
+ afs_extract_to_buf(call, call->count * 3 * 4);
call->unmarshall++;
case 2:
_debug("extract FID array");
- ret = afs_extract_data(call, call->buffer,
- call->count * 3 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -222,13 +221,13 @@ static int afs_deliver_cb_callback(struct afs_call *call)
cb->cb.type = AFSCM_CB_UNTYPED;
}
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* extract the callback array and its count in two steps */
case 3:
_debug("extract CB count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -237,13 +236,12 @@ static int afs_deliver_cb_callback(struct afs_call *call)
if (call->count2 != call->count && call->count2 != 0)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_cb_count);
- call->offset = 0;
+ afs_extract_to_buf(call, call->count2 * 3 * 4);
call->unmarshall++;
case 4:
_debug("extract CB array");
- ret = afs_extract_data(call, call->buffer,
- call->count2 * 3 * 4, false);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
@@ -256,7 +254,6 @@ static int afs_deliver_cb_callback(struct afs_call *call)
cb->cb.type = ntohl(*bp++);
}
- call->offset = 0;
call->unmarshall++;
case 5:
break;
@@ -303,7 +300,8 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *call)
rxrpc_kernel_get_peer(call->net->socket, call->rxcall, &srx);
- ret = afs_extract_data(call, NULL, 0, false);
+ afs_extract_discard(call, 0);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
@@ -332,16 +330,15 @@ static int afs_deliver_cb_init_call_back_state3(struct afs_call *call)
switch (call->unmarshall) {
case 0:
- call->offset = 0;
call->buffer = kmalloc_array(11, sizeof(__be32), GFP_KERNEL);
if (!call->buffer)
return -ENOMEM;
+ afs_extract_to_buf(call, 11 * sizeof(__be32));
call->unmarshall++;
case 1:
_debug("extract UUID");
- ret = afs_extract_data(call, call->buffer,
- 11 * sizeof(__be32), false);
+ ret = afs_extract_data(call, false);
switch (ret) {
case 0: break;
case -EAGAIN: return 0;
@@ -364,7 +361,6 @@ static int afs_deliver_cb_init_call_back_state3(struct afs_call *call)
for (loop = 0; loop < 6; loop++)
r->node[loop] = ntohl(b[loop + 5]);
- call->offset = 0;
call->unmarshall++;
case 2:
@@ -407,7 +403,8 @@ static int afs_deliver_cb_probe(struct afs_call *call)
_enter("");
- ret = afs_extract_data(call, NULL, 0, false);
+ afs_extract_discard(call, 0);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
@@ -455,16 +452,15 @@ static int afs_deliver_cb_probe_uuid(struct afs_call *call)
switch (call->unmarshall) {
case 0:
- call->offset = 0;
call->buffer = kmalloc_array(11, sizeof(__be32), GFP_KERNEL);
if (!call->buffer)
return -ENOMEM;
+ afs_extract_to_buf(call, 11 * sizeof(__be32));
call->unmarshall++;
case 1:
_debug("extract UUID");
- ret = afs_extract_data(call, call->buffer,
- 11 * sizeof(__be32), false);
+ ret = afs_extract_data(call, false);
switch (ret) {
case 0: break;
case -EAGAIN: return 0;
@@ -487,7 +483,6 @@ static int afs_deliver_cb_probe_uuid(struct afs_call *call)
for (loop = 0; loop < 6; loop++)
r->node[loop] = ntohl(b[loop + 5]);
- call->offset = 0;
call->unmarshall++;
case 2:
@@ -572,7 +567,8 @@ static int afs_deliver_cb_tell_me_about_yourself(struct afs_call *call)
_enter("");
- ret = afs_extract_data(call, NULL, 0, false);
+ afs_extract_discard(call, 0);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 7d623008157f..1357a10804bb 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -105,6 +105,41 @@ struct afs_lookup_cookie {
struct afs_fid fids[50];
};
+/*
+ * Drop the refs that we're holding on the pages we were reading into. We've
+ * got refs on the first nr_pages pages.
+ */
+static void afs_dir_read_cleanup(struct afs_read *req)
+{
+ struct radix_tree_iter iter;
+ struct address_space *mapping = req->iter.mapping;
+ struct page *page;
+ pgoff_t index = req->pos >> PAGE_SHIFT;
+ void __rcu **slot;
+
+ if (unlikely(!req->nr_pages))
+ return;
+
+ rcu_read_lock();
+ radix_tree_for_each_contig(slot, &mapping->i_pages, &iter, index) {
+ page = radix_tree_deref_slot(slot);
+ if (unlikely(!page))
+ continue;
+
+ BUG_ON(radix_tree_exception(page));
+ BUG_ON(PageCompound(page));
+ BUG_ON(page->mapping != req->iter.mapping);
+ BUG_ON(page_to_pgoff(page) != iter.index);
+
+ put_page(page);
+ req->nr_pages--;
+ if (req->nr_pages == 0)
+ break;
+ }
+
+ rcu_read_unlock();
+}
+
/*
* check that a directory page is valid
*/
@@ -130,7 +165,7 @@ static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
qty /= sizeof(union afs_xdr_dir_block);
/* check them */
- dbuf = kmap(page);
+ dbuf = kmap_atomic(page);
for (tmp = 0; tmp < qty; tmp++) {
if (dbuf->blocks[tmp].hdr.magic != AFS_DIR_MAGIC) {
printk("kAFS: %s(%lx): bad magic %d/%d is %04hx\n",
@@ -148,7 +183,7 @@ static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
((u8 *)&dbuf->blocks[tmp])[AFS_DIR_BLOCK_SIZE - 1] = 0;
}
- kunmap(page);
+ kunmap_atomic(dbuf);
checked:
afs_stat_v(dvnode, n_read_dir);
@@ -158,6 +193,46 @@ static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
return false;
}
+/*
+ * Check all the pages in a directory. All the pages are held pinned.
+ */
+static int afs_dir_check(struct afs_vnode *dvnode, unsigned int nr_pages,
+ loff_t i_size)
+{
+ struct radix_tree_iter iter;
+ struct address_space *mapping = dvnode->vfs_inode.i_mapping;
+ struct page *page;
+ void __rcu **slot;
+ int ret = 0;
+
+ if (unlikely(!nr_pages))
+ return 0;
+
+ rcu_read_lock();
+ radix_tree_for_each_contig(slot, &mapping->i_pages, &iter, 0) {
+ page = radix_tree_deref_slot(slot);
+ if (unlikely(!page)) {
+ pr_warn("kAFS: Missing page in dircheck\n");
+ ret = -EIO;
+ break;
+ }
+ if (page->index >= nr_pages)
+ break;
+
+ BUG_ON(radix_tree_exception(page));
+ BUG_ON(PageCompound(page));
+ BUG_ON(page->mapping != mapping);
+ BUG_ON(page_to_pgoff(page) != iter.index);
+
+ ret = afs_dir_check_page(dvnode, page, i_size);
+ if (ret < 0)
+ break;
+ }
+
+ rcu_read_unlock();
+ return ret;
+}
+
/*
* open an AFS directory file
*/
@@ -184,56 +259,49 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
{
struct afs_read *req;
loff_t i_size;
- int nr_pages, nr_inline, i, n;
- int ret = -ENOMEM;
+ int nr_pages, i, n;
+ int ret;
+
+ _enter("");
+
+ req = kzalloc(sizeof(*req), GFP_KERNEL);
+ if (!req)
+ return ERR_PTR(-ENOMEM);
+
+ refcount_set(&req->usage, 1);
+ req->cleanup = afs_dir_read_cleanup;
-retry:
+expand:
i_size = i_size_read(&dvnode->vfs_inode);
+ ret = -EIO;
if (i_size < 2048)
- return ERR_PTR(-EIO);
+ goto error;
+ ret = -EFBIG;
if (i_size > 2048 * 1024)
- return ERR_PTR(-EFBIG);
-
- _enter("%llu", i_size);
+ goto error;
- /* Get a request record to hold the page list. We want to hold it
- * inline if we can, but we don't want to make an order 1 allocation.
- */
nr_pages = (i_size + PAGE_SIZE - 1) / PAGE_SIZE;
- nr_inline = nr_pages;
- if (nr_inline > (PAGE_SIZE - sizeof(*req)) / sizeof(struct page *))
- nr_inline = 0;
- req = kzalloc(sizeof(*req) + sizeof(struct page *) * nr_inline,
- GFP_KERNEL);
- if (!req)
- return ERR_PTR(-ENOMEM);
-
- refcount_set(&req->usage, 1);
- req->nr_pages = nr_pages;
req->actual_len = i_size; /* May change */
req->len = nr_pages * PAGE_SIZE; /* We can ask for more than there is */
req->data_version = dvnode->status.data_version; /* May change */
- if (nr_inline > 0) {
- req->pages = req->array;
- } else {
- req->pages = kcalloc(nr_pages, sizeof(struct page *),
- GFP_KERNEL);
- if (!req->pages)
- goto error;
- }
+ iov_iter_mapping(&req->iter, READ, dvnode->vfs_inode.i_mapping,
+ 0, i_size);
- /* Get a list of all the pages that hold or will hold the directory
- * content. We need to fill in any gaps that we might find where the
- * memory reclaimer has been at work. If there are any gaps, we will
+ /* Fill in any gaps that we might find where the memory reclaimer has
+ * been at work and pin all the pages. If there are any gaps, we will
* need to reread the entire directory contents.
*/
- i = 0;
- do {
+ i = req->nr_pages;
+ while (i < nr_pages) {
+ struct page *pages[8], *page;
+
n = find_get_pages_contig(dvnode->vfs_inode.i_mapping, i,
- req->nr_pages - i,
- req->pages + i);
- _debug("find %u at %u/%u", n, i, req->nr_pages);
+ min_t(unsigned int, nr_pages - i,
+ ARRAY_SIZE(pages)),
+ pages);
+ _debug("find %u at %u/%u", n, i, nr_pages);
+
if (n == 0) {
gfp_t gfp = dvnode->vfs_inode.i_mapping->gfp_mask;
@@ -241,23 +309,25 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
afs_stat_v(dvnode, n_inval);
ret = -ENOMEM;
- req->pages[i] = __page_cache_alloc(gfp);
- if (!req->pages[i])
+ page = __page_cache_alloc(gfp);
+ if (!page)
goto error;
- ret = add_to_page_cache_lru(req->pages[i],
+ ret = add_to_page_cache_lru(page,
dvnode->vfs_inode.i_mapping,
i, gfp);
if (ret < 0)
goto error;
- set_page_private(req->pages[i], 1);
- SetPagePrivate(req->pages[i]);
- unlock_page(req->pages[i]);
+ set_page_private(page, 1);
+ SetPagePrivate(page);
+ unlock_page(page);
+ req->nr_pages++;
i++;
} else {
+ req->nr_pages += n;
i += n;
}
- } while (i < req->nr_pages);
+ }
/* If we're going to reload, we need to lock all the pages to prevent
* races.
@@ -280,15 +350,18 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
task_io_account_read(PAGE_SIZE * req->nr_pages);
- if (req->len < req->file_size)
- goto content_has_grown;
+ if (req->len < req->file_size) {
+ /* The content has grown, so we need to expand the
+ * buffer.
+ */
+ up_write(&dvnode->validate_lock);
+ goto expand;
+ }
/* Validate the data we just read. */
- ret = -EIO;
- for (i = 0; i < req->nr_pages; i++)
- if (!afs_dir_check_page(dvnode, req->pages[i],
- req->actual_len))
- goto error_unlock;
+ ret = afs_dir_check(dvnode, req->nr_pages, req->actual_len);
+ if (ret < 0)
+ goto error_unlock;
// TODO: Trim excess pages
@@ -305,11 +378,6 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
afs_put_read(req);
_leave(" = %d", ret);
return ERR_PTR(ret);
-
-content_has_grown:
- up_write(&dvnode->validate_lock);
- afs_put_read(req);
- goto retry;
}
/*
@@ -415,6 +483,7 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
struct afs_read *req;
struct page *page;
unsigned blkoff, limit;
+ void __rcu **slot;
int ret;
_enter("{%lu},%u,,", dir->i_ino, (unsigned)ctx->pos);
@@ -438,9 +507,15 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
blkoff = ctx->pos & ~(sizeof(union afs_xdr_dir_block) - 1);
/* Fetch the appropriate page from the directory and re-add it
- * to the LRU.
+ * to the LRU. We have all the pages pinned with an extra ref.
*/
- page = req->pages[blkoff / PAGE_SIZE];
+ rcu_read_lock();
+ page = NULL;
+ slot = radix_tree_lookup_slot(&dvnode->vfs_inode.i_mapping->i_pages,
+ blkoff / PAGE_SIZE);
+ if (slot)
+ page = radix_tree_deref_slot(slot);
+ rcu_read_unlock();
if (!page) {
ret = -EIO;
break;
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 7d4f26198573..6d55dabccd79 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -148,7 +148,7 @@ int afs_open(struct inode *inode, struct file *file)
if (file->f_flags & O_TRUNC)
set_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
-
+
file->private_data = af;
_leave(" = 0");
return 0;
@@ -185,24 +185,80 @@ int afs_release(struct inode *inode, struct file *file)
return 0;
}
+/*
+ * Make pages available as they're filled. This function may not sleep.
+ */
+static void afs_readpages_page_done(const struct iov_iter *iter,
+ const struct bio_vec *bv)
+{
+ struct page *page = bv->bv_page;
+ struct afs_vnode *vnode = AFS_FS_I(page->mapping->host);
+ struct afs_read *req = container_of(iter, struct afs_read, iter);
+
+ SetPageUptodate(page);
+
+ if (0 && afs_vnode_cache(vnode))
+ SetPageFsCache(page);
+ unlock_page(page);
+ put_page(page);
+ req->done_pages++;
+}
+
+/*
+ * Unlock the pages we were reading into. We've got locks and refs on the
+ * first nr_pages pages.
+ */
+static void afs_file_read_cleanup(struct afs_read *req)
+{
+ struct radix_tree_iter iter;
+ struct address_space *mapping = req->iter.mapping;
+ struct page *page;
+ pgoff_t index = req->pos >> PAGE_SHIFT;
+ void **slot;
+
+ _enter("%lu,%u,%u,%llu",
+ index, req->done_pages, req->nr_pages, iov_iter_count(&req->iter));
+
+ if (likely(req->done_pages >= req->nr_pages))
+ return;
+
+ rcu_read_lock();
+ radix_tree_for_each_contig(slot, &mapping->i_pages, &iter, index) {
+ page = radix_tree_deref_slot(slot);
+ if (unlikely(!page))
+ continue;
+
+ BUG_ON(radix_tree_exception(page));
+ BUG_ON(PageCompound(page));
+ BUG_ON(page != *slot);
+ BUG_ON(page->mapping != req->iter.mapping);
+ BUG_ON(page_to_pgoff(page) != iter.index);
+
+ if (req->error)
+ SetPageError(page);
+ unlock_page(page);
+ put_page(page);
+ req->done_pages++;
+ if (req->done_pages >= req->nr_pages)
+ break;
+ }
+
+ rcu_read_unlock();
+}
+
/*
* Dispose of a ref to a read record.
*/
void afs_put_read(struct afs_read *req)
{
- int i;
-
if (refcount_dec_and_test(&req->usage)) {
- for (i = 0; i < req->nr_pages; i++)
- if (req->pages[i])
- put_page(req->pages[i]);
- if (req->pages != req->array)
- kfree(req->pages);
+ if (req->cleanup)
+ req->cleanup(req);
kfree(req);
}
}
-#ifdef CONFIG_AFS_FSCACHE
+#if 0 //def CONFIG_AFS_FSCACHE
/*
* deal with notification that a page was read from the cache
*/
@@ -257,6 +313,22 @@ int afs_fetch_data(struct afs_vnode *vnode, struct key *key, struct afs_read *de
return ret;
}
+/*
+ * Clear the trailer after a short read.
+ */
+static void afs_clear_after_read(struct afs_vnode *vnode, struct afs_read *req,
+ bool catch_page_done)
+{
+ if (req->actual_len >= req->len)
+ return;
+ iov_iter_mapping(&req->iter, READ, vnode->vfs_inode.i_mapping,
+ req->pos + req->actual_len,
+ req->len - req->actual_len);
+ if (catch_page_done)
+ req->iter.page_done = afs_readpages_page_done;
+ iov_iter_zero(req->len - req->actual_len, &req->iter);
+}
+
/*
* read page from file, directory or symlink, given a key to use
*/
@@ -277,7 +349,7 @@ int afs_page_filler(void *data, struct page *page)
goto error;
/* is it cached? */
-#ifdef CONFIG_AFS_FSCACHE
+#if 0 //def CONFIG_AFS_FSCACHE
ret = fscache_read_or_alloc_page(vnode->cache,
page,
afs_file_readpage_read_complete,
@@ -301,8 +373,7 @@ int afs_page_filler(void *data, struct page *page)
_debug("cache said ENOBUFS");
default:
go_on:
- req = kzalloc(sizeof(struct afs_read) + sizeof(struct page *),
- GFP_KERNEL);
+ req = kzalloc(sizeof(struct afs_read), GFP_KERNEL);
if (!req)
goto enomem;
@@ -314,10 +385,11 @@ int afs_page_filler(void *data, struct page *page)
req->pos = (loff_t)page->index << PAGE_SHIFT;
req->len = PAGE_SIZE;
req->nr_pages = 1;
- req->pages = req->array;
- req->pages[0] = page;
get_page(page);
+ iov_iter_mapping(&req->iter, READ, page->mapping,
+ (loff_t)page->index << PAGE_SHIFT, PAGE_SIZE);
+
/* read the contents of the file from the server into the
* page */
ret = afs_fetch_data(vnode, key, req);
@@ -331,11 +403,6 @@ int afs_page_filler(void *data, struct page *page)
ret = -ESTALE;
}
-#ifdef CONFIG_AFS_FSCACHE
- fscache_uncache_page(vnode->cache, page);
-#endif
- BUG_ON(PageFsCache(page));
-
if (ret == -EINTR ||
ret == -ENOMEM ||
ret == -ERESTARTSYS ||
@@ -344,10 +411,11 @@ int afs_page_filler(void *data, struct page *page)
goto io_error;
}
+ afs_clear_after_read(vnode, req, false);
SetPageUptodate(page);
/* send the page to the cache */
-#ifdef CONFIG_AFS_FSCACHE
+#if 0 //def CONFIG_AFS_FSCACHE
if (PageFsCache(page) &&
fscache_write_page(vnode->cache, page, vnode->status.size,
GFP_KERNEL) != 0) {
@@ -398,31 +466,39 @@ static int afs_readpage(struct file *file, struct page *page)
return ret;
}
+#if 0
/*
- * Make pages available as they're filled.
+ * Allow writing to a page to take place. This function may not sleep.
*/
-static void afs_readpages_page_done(struct afs_call *call, struct afs_read *req)
+static void afs_clear_page_fscache_mark(const struct iov_iter *iter,
+ struct page *page)
{
-#ifdef CONFIG_AFS_FSCACHE
- struct afs_vnode *vnode = call->reply[0];
-#endif
- struct page *page = req->pages[req->index];
+ ClearPageFsCache(page);
+}
- req->pages[req->index] = NULL;
- SetPageUptodate(page);
+static void afs_fscache_write_done(struct fscache_cookie *cookie,
+ struct iov_iter *iter)
+{
+ struct afs_read *req = container_of(iter, struct afs_read, iter);
+
+ afs_put_read(req);
+}
+
+/*
+ * Write the read data to the cache.
+ */
+static void afs_readpages_write_to_cache(struct afs_read *req)
+{
+ struct afs_vnode *vnode = AFS_FS_I(req->iter.mapping->host);
- /* send the page to the cache */
-#ifdef CONFIG_AFS_FSCACHE
- if (PageFsCache(page) &&
- fscache_write_page(vnode->cache, page, vnode->status.size,
- GFP_KERNEL) != 0) {
- fscache_uncache_page(vnode->cache, page);
- BUG_ON(PageFsCache(page));
+ if (afs_vnode_cache(vnode)) {
+ req->iter.page_done = afs_clear_page_fscache_mark;
+ fscache_write(vnode->cache, &req->iter, req->pos,
+ req->file_size, GFP_KERNEL,
+ afs_fscache_write_done);
}
-#endif
- unlock_page(page);
- put_page(page);
}
+#endif
/*
* Read a contiguous set of pages.
@@ -436,7 +512,7 @@ static int afs_readpages_one(struct file *file, struct address_space *mapping,
struct page *first, *page;
struct key *key = afs_file_key(file);
pgoff_t index;
- int ret, n, i;
+ int ret, n;
/* Count the number of contiguous pages at the front of the list. Note
* that the list goes prev-wards rather than next-wards.
@@ -452,20 +528,17 @@ static int afs_readpages_one(struct file *file, struct address_space *mapping,
n++;
}
- req = kzalloc(sizeof(struct afs_read) + sizeof(struct page *) * n,
- GFP_NOFS);
+ req = kzalloc(sizeof(struct afs_read), GFP_NOFS);
if (!req)
return -ENOMEM;
refcount_set(&req->usage, 1);
- req->page_done = afs_readpages_page_done;
+ req->cleanup = afs_file_read_cleanup;
req->pos = first->index;
req->pos <<= PAGE_SHIFT;
- req->pages = req->array;
- /* Transfer the pages to the request. We add them in until one fails
- * to add to the LRU and then we stop (as that'll make a hole in the
- * contiguous run.
+ /* Add pages to the LRU until it fails. We keep the pages ref'd and
+ * locked until the read is complete.
*
* Note that it's possible for the file size to change whilst we're
* doing this, but we rely on the server returning less than we asked
@@ -478,15 +551,11 @@ static int afs_readpages_one(struct file *file, struct address_space *mapping,
index = page->index;
if (add_to_page_cache_lru(page, mapping, index,
readahead_gfp_mask(mapping))) {
-#ifdef CONFIG_AFS_FSCACHE
- fscache_uncache_page(vnode->cache, page);
-#endif
put_page(page);
break;
}
- req->pages[req->nr_pages++] = page;
- req->len += PAGE_SIZE;
+ req->nr_pages++;
} while (req->nr_pages < n);
if (req->nr_pages == 0) {
@@ -494,33 +563,26 @@ static int afs_readpages_one(struct file *file, struct address_space *mapping,
return 0;
}
+ req->len = req->nr_pages * PAGE_SIZE;
+ iov_iter_mapping(&req->iter, READ, file->f_mapping, req->pos, req->len);
+ req->iter.page_done = afs_readpages_page_done;
+
ret = afs_fetch_data(vnode, key, req);
if (ret < 0)
goto error;
- task_io_account_read(PAGE_SIZE * req->nr_pages);
- afs_put_read(req);
+ afs_clear_after_read(vnode, req, true);
+ task_io_account_read(req->len);
return 0;
error:
if (ret == -ENOENT) {
- _debug("got NOENT from server"
- " - marking file deleted and stale");
+ _debug("got NOENT from server - marking file deleted and stale");
set_bit(AFS_VNODE_DELETED, &vnode->flags);
ret = -ESTALE;
}
- for (i = 0; i < req->nr_pages; i++) {
- page = req->pages[i];
- if (page) {
-#ifdef CONFIG_AFS_FSCACHE
- fscache_uncache_page(vnode->cache, page);
-#endif
- SetPageError(page);
- unlock_page(page);
- }
- }
-
+ req->error = true;
afs_put_read(req);
return ret;
}
@@ -547,7 +609,7 @@ static int afs_readpages(struct file *file, struct address_space *mapping,
}
/* attempt to read as many of the pages as possible */
-#ifdef CONFIG_AFS_FSCACHE
+#if 0 //def CONFIG_AFS_FSCACHE
ret = fscache_read_or_alloc_pages(vnode->cache,
mapping,
pages,
@@ -605,7 +667,7 @@ static void afs_invalidatepage(struct page *page, unsigned int offset,
/* we clean up only if the entire page is being invalidated */
if (offset == 0 && length == PAGE_SIZE) {
-#ifdef CONFIG_AFS_FSCACHE
+#if 0 //def CONFIG_AFS_FSCACHE
if (PageFsCache(page)) {
struct afs_vnode *vnode = AFS_FS_I(page->mapping->host);
fscache_wait_on_page_write(vnode->cache, page);
@@ -640,7 +702,7 @@ static int afs_releasepage(struct page *page, gfp_t gfp_flags)
/* deny if page is being written to the cache and the caller hasn't
* elected to wait */
-#ifdef CONFIG_AFS_FSCACHE
+#if 0 //def CONFIG_AFS_FSCACHE
if (!fscache_maybe_release_page(vnode->cache, page, gfp_flags)) {
_leave(" = F [cache busy]");
return 0;
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index d9a5815945dc..eb60c570dca2 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -20,12 +20,6 @@
static const struct afs_fid afs_zero_fid;
-/*
- * We need somewhere to discard into in case the server helpfully returns more
- * than we asked for in FS.FetchData{,64}.
- */
-static u8 afs_discard_buffer[64];
-
static inline void afs_use_fs_server(struct afs_call *call, struct afs_cb_interest *cbi)
{
call->cbi = afs_get_cb_interest(cbi);
@@ -468,115 +462,82 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
struct afs_vnode *vnode = call->reply[0];
struct afs_read *req = call->reply[2];
const __be32 *bp;
- unsigned int size;
- void *buffer;
int ret;
- _enter("{%u,%zu/%u;%llu/%llu}",
- call->unmarshall, call->offset, call->count,
- req->remain, req->actual_len);
+ _enter("{%u,%llu/%llu}",
+ call->unmarshall, iov_iter_count(&call->iter), req->actual_len);
switch (call->unmarshall) {
case 0:
req->actual_len = 0;
- call->offset = 0;
call->unmarshall++;
if (call->operation_ID != FSFETCHDATA64) {
call->unmarshall++;
goto no_msw;
}
+ afs_extract_to_tmp(call);
/* extract the upper part of the returned data length of an
- * FSFETCHDATA64 op (which should always be 0 using this
- * client) */
+ * FSFETCHDATA64 op.
+ */
case 1:
_debug("extract data length (MSW)");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
req->actual_len = ntohl(call->tmp);
req->actual_len <<= 32;
- call->offset = 0;
call->unmarshall++;
-
no_msw:
+ afs_extract_to_tmp(call);
+
/* extract the returned data length */
case 2:
_debug("extract data length");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
req->actual_len |= ntohl(call->tmp);
_debug("DATA length: %llu", req->actual_len);
- req->remain = req->actual_len;
- call->offset = req->pos & (PAGE_SIZE - 1);
- req->index = 0;
if (req->actual_len == 0)
goto no_more_data;
call->unmarshall++;
-
- begin_page:
- ASSERTCMP(req->index, <, req->nr_pages);
- if (req->remain > PAGE_SIZE - call->offset)
- size = PAGE_SIZE - call->offset;
- else
- size = req->remain;
- call->count = call->offset + size;
- ASSERTCMP(call->count, <=, PAGE_SIZE);
- req->remain -= size;
+ call->_iter = &req->iter;
+ iov_iter_truncate(&req->iter, req->actual_len);
/* extract the returned data */
case 3:
- _debug("extract data %llu/%llu %zu/%u",
- req->remain, req->actual_len, call->offset, call->count);
+ _debug("extract data %llu/%llu",
+ iov_iter_count(&call->iter), req->actual_len);
- buffer = kmap(req->pages[req->index]);
- ret = afs_extract_data(call, buffer, call->count, true);
- kunmap(req->pages[req->index]);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
- if (call->offset == PAGE_SIZE) {
- if (req->page_done)
- req->page_done(call, req);
- req->index++;
- if (req->remain > 0) {
- call->offset = 0;
- if (req->index >= req->nr_pages) {
- call->unmarshall = 4;
- goto begin_discard;
- }
- goto begin_page;
- }
- }
- goto no_more_data;
+
+ call->_iter = &call->iter;
+ if (req->actual_len <= req->len)
+ goto no_more_data;
/* Discard any excess data the server gave us */
- begin_discard:
+ iov_iter_discard(&call->iter, READ, req->actual_len - req->len);
case 4:
- size = min_t(loff_t, sizeof(afs_discard_buffer), req->remain);
- call->count = size;
- _debug("extract discard %llu/%llu %zu/%u",
- req->remain, req->actual_len, call->offset, call->count);
-
- call->offset = 0;
- ret = afs_extract_data(call, afs_discard_buffer, call->count, true);
- req->remain -= call->offset;
+ _debug("extract discard %llu/%llu",
+ iov_iter_count(&call->iter), req->actual_len - req->len);
+
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
- if (req->remain > 0)
- goto begin_discard;
no_more_data:
- call->offset = 0;
call->unmarshall = 5;
+ afs_extract_to_buf(call, (21 + 3 + 6) * 4);
/* extract the metadata */
case 5:
- ret = afs_extract_data(call, call->buffer,
- (21 + 3 + 6) * 4, false);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
@@ -589,22 +550,12 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
if (call->reply[1])
xdr_decode_AFSVolSync(&bp, call->reply[1]);
- call->offset = 0;
call->unmarshall++;
case 6:
break;
}
- for (; req->index < req->nr_pages; req->index++) {
- if (call->count < PAGE_SIZE)
- zero_user_segment(req->pages[req->index],
- call->count, PAGE_SIZE);
- if (req->page_done)
- req->page_done(call, req);
- call->count = 0;
- }
-
_leave(" = 0 [done]");
return 0;
}
@@ -700,6 +651,7 @@ int afs_fs_fetch_data(struct afs_fs_cursor *fc, struct afs_read *req)
call->reply[1] = NULL; /* volsync */
call->reply[2] = req;
call->expected_version = vnode->status.data_version;
+ req->call_debug_id = call->debug_id;
/* marshall the parameters */
bp = call->request;
@@ -1598,31 +1550,31 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
{
const __be32 *bp;
char *p;
+ u32 size;
int ret;
_enter("{%u}", call->unmarshall);
switch (call->unmarshall) {
case 0:
- call->offset = 0;
call->unmarshall++;
+ afs_extract_to_buf(call, 12 * 4);
/* extract the returned status record */
case 1:
_debug("extract status");
- ret = afs_extract_data(call, call->buffer,
- 12 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
bp = call->buffer;
xdr_decode_AFSFetchVolumeStatus(&bp, call->reply[1]);
- call->offset = 0;
call->unmarshall++;
+ afs_extract_to_tmp(call);
/* extract the volume name length */
case 2:
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -1631,46 +1583,26 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
if (call->count >= AFSNAMEMAX)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_volname_len);
- call->offset = 0;
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
call->unmarshall++;
/* extract the volume name */
case 3:
_debug("extract volname");
- if (call->count > 0) {
- ret = afs_extract_data(call, call->reply[2],
- call->count, true);
- if (ret < 0)
- return ret;
- }
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
p = call->reply[2];
p[call->count] = 0;
_debug("volname '%s'", p);
-
- call->offset = 0;
- call->unmarshall++;
-
- /* extract the volume name padding */
- if ((call->count & 3) == 0) {
- call->unmarshall++;
- goto no_volname_padding;
- }
- call->count = 4 - (call->count & 3);
-
- case 4:
- ret = afs_extract_data(call, call->buffer,
- call->count, true);
- if (ret < 0)
- return ret;
-
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
- no_volname_padding:
/* extract the offline message length */
- case 5:
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ case 4:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -1679,46 +1611,27 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
if (call->count >= AFSNAMEMAX)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_offline_msg_len);
- call->offset = 0;
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
call->unmarshall++;
/* extract the offline message */
- case 6:
+ case 5:
_debug("extract offline");
- if (call->count > 0) {
- ret = afs_extract_data(call, call->reply[2],
- call->count, true);
- if (ret < 0)
- return ret;
- }
+ ret = afs_extract_data(call, true);
+ if (ret < 0)
+ return ret;
p = call->reply[2];
p[call->count] = 0;
_debug("offline '%s'", p);
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
- /* extract the offline message padding */
- if ((call->count & 3) == 0) {
- call->unmarshall++;
- goto no_offline_padding;
- }
- call->count = 4 - (call->count & 3);
-
- case 7:
- ret = afs_extract_data(call, call->buffer,
- call->count, true);
- if (ret < 0)
- return ret;
-
- call->offset = 0;
- call->unmarshall++;
- no_offline_padding:
-
/* extract the message of the day length */
- case 8:
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ case 6:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -1727,38 +1640,24 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
if (call->count >= AFSNAMEMAX)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_motd_len);
- call->offset = 0;
+ size = (call->count + 3) & ~3; /* It's padded */
+ afs_extract_begin(call, call->reply[2], size);
call->unmarshall++;
/* extract the message of the day */
- case 9:
+ case 7:
_debug("extract motd");
- if (call->count > 0) {
- ret = afs_extract_data(call, call->reply[2],
- call->count, true);
- if (ret < 0)
- return ret;
- }
+ ret = afs_extract_data(call, false);
+ if (ret < 0)
+ return ret;
p = call->reply[2];
p[call->count] = 0;
_debug("motd '%s'", p);
- call->offset = 0;
call->unmarshall++;
- /* extract the message of the day padding */
- call->count = (4 - (call->count & 3)) & 3;
-
- case 10:
- ret = afs_extract_data(call, call->buffer,
- call->count, false);
- if (ret < 0)
- return ret;
-
- call->offset = 0;
- call->unmarshall++;
- case 11:
+ case 8:
break;
}
@@ -2024,19 +1923,16 @@ static int afs_deliver_fs_get_capabilities(struct afs_call *call)
u32 count;
int ret;
- _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+ _enter("{%u,%llu}", call->unmarshall, iov_iter_count(&call->iter));
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* Extract the capabilities word count */
case 1:
- ret = afs_extract_data(call, &call->tmp,
- 1 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -2044,24 +1940,17 @@ static int afs_deliver_fs_get_capabilities(struct afs_call *call)
call->count = count;
call->count2 = count;
- call->offset = 0;
+ iov_iter_discard(&call->iter, READ, count * sizeof(__be32));
call->unmarshall++;
/* Extract capabilities words */
case 2:
- count = min(call->count, 16U);
- ret = afs_extract_data(call, call->buffer,
- count * sizeof(__be32),
- call->count > 16);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
/* TODO: Examine capabilities */
- call->count -= count;
- if (call->count > 0)
- goto again;
- call->offset = 0;
call->unmarshall++;
break;
}
@@ -2215,13 +2104,13 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* Extract the file status count and array in two steps */
case 1:
_debug("extract status count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -2234,11 +2123,11 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
call->count = 0;
call->unmarshall++;
more_counts:
- call->offset = 0;
+ afs_extract_to_buf(call, 21 * sizeof(__be32));
case 2:
_debug("extract status array %u", call->count);
- ret = afs_extract_data(call, call->buffer, 21 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -2256,12 +2145,12 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
call->count = 0;
call->unmarshall++;
- call->offset = 0;
+ afs_extract_to_tmp(call);
/* Extract the callback count and array in two steps */
case 3:
_debug("extract CB count");
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -2273,11 +2162,11 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
call->count = 0;
call->unmarshall++;
more_cbs:
- call->offset = 0;
+ afs_extract_to_buf(call, 3 * sizeof(__be32));
case 4:
_debug("extract CB array");
- ret = afs_extract_data(call, call->buffer, 3 * 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -2294,11 +2183,11 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
if (call->count < call->count2)
goto more_cbs;
- call->offset = 0;
+ afs_extract_to_buf(call, 6 * sizeof(__be32));
call->unmarshall++;
case 5:
- ret = afs_extract_data(call, call->buffer, 6 * 4, false);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
@@ -2306,7 +2195,6 @@ static int afs_deliver_fs_inline_bulk_status(struct afs_call *call)
if (call->reply[3])
xdr_decode_AFSVolSync(&bp, call->reply[3]);
- call->offset = 0;
call->unmarshall++;
case 6:
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 852f68a400b8..8e248051afde 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -94,11 +94,16 @@ struct afs_call {
struct afs_cb_interest *cbi; /* Callback interest for server used */
void *request; /* request data (first part) */
struct address_space *mapping; /* Pages being written from */
+ struct iov_iter iter; /* Buffer iterator */
+ struct iov_iter *_iter; /* Iterator currently in use */
+ union { /* Convenience for ->iter */
+ struct kvec kvec[1];
+ struct bio_vec bvec[1];
+ };
void *buffer; /* reply receive buffer */
void *reply[4]; /* Where to put the reply */
pgoff_t first; /* first page in mapping to deal with */
pgoff_t last; /* last page in mapping to deal with */
- size_t offset; /* offset into received data store */
atomic_t usage;
enum afs_call_state state;
spinlock_t state_lock;
@@ -175,15 +180,15 @@ struct afs_read {
loff_t pos; /* Where to start reading */
loff_t len; /* How much we're asking for */
loff_t actual_len; /* How much we're actually getting */
- loff_t remain; /* Amount remaining */
loff_t file_size; /* File size returned by server */
afs_dataversion_t data_version; /* Version number returned by server */
refcount_t usage;
- unsigned int index; /* Which page we're reading into */
unsigned int nr_pages;
- void (*page_done)(struct afs_call *, struct afs_read *);
- struct page **pages;
- struct page *array[];
+ unsigned int done_pages;
+ bool error;
+ unsigned int call_debug_id;
+ void (*cleanup)(struct afs_read *);
+ struct iov_iter iter; /* Buffer */
};
/*
@@ -547,6 +552,15 @@ struct afs_vnode {
afs_callback_type_t cb_type; /* type of callback */
};
+static inline struct fscache_cookie *afs_vnode_cache(struct afs_vnode *vnode)
+{
+#ifdef CONFIG_AFS_FSCACHE
+ return vnode->cache;
+#else
+ return NULL;
+#endif
+}
+
/*
* cached security record for one user's attempt to access a vnode
*/
@@ -927,12 +941,34 @@ extern struct afs_call *afs_alloc_flat_call(struct afs_net *,
extern void afs_flat_call_destructor(struct afs_call *);
extern void afs_send_empty_reply(struct afs_call *);
extern void afs_send_simple_reply(struct afs_call *, const void *, size_t);
-extern int afs_extract_data(struct afs_call *, void *, size_t, bool);
+extern int afs_extract_data(struct afs_call *, bool);
extern int afs_protocol_error(struct afs_call *, int, enum afs_eproto_cause);
+static inline void afs_extract_begin(struct afs_call *call, void *buf, size_t size)
+{
+ call->kvec[0].iov_base = buf;
+ call->kvec[0].iov_len = size;
+ iov_iter_kvec(&call->iter, READ, call->kvec, 1, size);
+}
+
+static inline void afs_extract_to_tmp(struct afs_call *call)
+{
+ afs_extract_begin(call, &call->tmp, sizeof(call->tmp));
+}
+
+static inline void afs_extract_discard(struct afs_call *call, size_t size)
+{
+ iov_iter_discard(&call->iter, READ, size);
+}
+
+static inline void afs_extract_to_buf(struct afs_call *call, size_t size)
+{
+ afs_extract_begin(call, call->buffer, size);
+}
+
static inline int afs_transfer_reply(struct afs_call *call)
{
- return afs_extract_data(call, call->buffer, call->reply_max, false);
+ return afs_extract_data(call, false);
}
static inline bool afs_check_call_state(struct afs_call *call,
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 37811901a5c9..09479677f11f 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -143,6 +143,7 @@ static struct afs_call *afs_alloc_call(struct afs_net *net,
INIT_WORK(&call->async_work, afs_process_async_call);
init_waitqueue_head(&call->waitq);
spin_lock_init(&call->state_lock);
+ call->_iter = &call->iter;
o = atomic_inc_return(&net->nr_outstanding_calls);
trace_afs_call(call, afs_call_trace_alloc, 1, o,
@@ -233,6 +234,7 @@ struct afs_call *afs_alloc_flat_call(struct afs_net *net,
goto nomem_free;
}
+ afs_extract_to_buf(call, call->reply_max);
call->operation_ID = type->op;
init_waitqueue_head(&call->waitq);
return call;
@@ -465,14 +467,12 @@ static void afs_deliver_to_call(struct afs_call *call)
state == AFS_CALL_SV_AWAIT_ACK
) {
if (state == AFS_CALL_SV_AWAIT_ACK) {
- struct iov_iter iter;
-
- iov_iter_kvec(&iter, READ, NULL, 0, 0);
+ iov_iter_kvec(&call->iter, READ, NULL, 0, 0);
ret = rxrpc_kernel_recv_data(call->net->socket,
- call->rxcall, &iter, false,
- &remote_abort,
+ call->rxcall, &call->iter,
+ false, &remote_abort,
&call->service_id);
- trace_afs_recv_data(call, 0, 0, false, ret);
+ trace_afs_receive_data(call, &call->iter, false, ret);
if (ret == -EINPROGRESS || ret == -EAGAIN)
return;
@@ -516,7 +516,7 @@ static void afs_deliver_to_call(struct afs_call *call)
if (state != AFS_CALL_CL_AWAIT_REPLY)
abort_code = RXGEN_SS_UNMARSHAL;
rxrpc_kernel_abort_call(call->net->socket, call->rxcall,
- abort_code, -EBADMSG, "KUM");
+ abort_code, ret, "KUM");
goto local_abort;
}
}
@@ -729,6 +729,7 @@ void afs_charge_preallocation(struct work_struct *work)
call->async = true;
call->state = AFS_CALL_SV_AWAIT_OP_ID;
init_waitqueue_head(&call->waitq);
+ afs_extract_to_tmp(call);
}
if (rxrpc_kernel_charge_accept(net->socket,
@@ -774,18 +775,15 @@ static int afs_deliver_cm_op_id(struct afs_call *call)
{
int ret;
- _enter("{%zu}", call->offset);
-
- ASSERTCMP(call->offset, <, 4);
+ _enter("{%llu}", iov_iter_count(call->_iter));
/* the operation ID forms the first four bytes of the request data */
- ret = afs_extract_data(call, &call->tmp, 4, true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
call->operation_ID = ntohl(call->tmp);
afs_set_call_state(call, AFS_CALL_SV_AWAIT_OP_ID, AFS_CALL_SV_AWAIT_REQUEST);
- call->offset = 0;
/* ask the cache manager to route the call (it'll change the call type
* if successful) */
@@ -889,30 +887,19 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
/*
* Extract a piece of data from the received data socket buffers.
*/
-int afs_extract_data(struct afs_call *call, void *buf, size_t count,
- bool want_more)
+int afs_extract_data(struct afs_call *call, bool want_more)
{
struct afs_net *net = call->net;
- struct iov_iter iter;
- struct kvec iov;
+ struct iov_iter *iter = call->_iter;
enum afs_call_state state;
u32 remote_abort = 0;
int ret;
- _enter("{%s,%zu},,%zu,%d",
- call->type->name, call->offset, count, want_more);
-
- ASSERTCMP(call->offset, <=, count);
-
- iov.iov_base = buf + call->offset;
- iov.iov_len = count - call->offset;
- iov_iter_kvec(&iter, READ, &iov, 1, count - call->offset);
+ _enter("{%s,%llu},%d", call->type->name, iov_iter_count(iter), want_more);
- ret = rxrpc_kernel_recv_data(net->socket, call->rxcall, &iter,
+ ret = rxrpc_kernel_recv_data(net->socket, call->rxcall, iter,
want_more, &remote_abort,
&call->service_id);
- call->offset += (count - call->offset) - iov_iter_count(&iter);
- trace_afs_recv_data(call, count, call->offset, want_more, ret);
if (ret == 0 || ret == -EAGAIN)
return ret;
diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c
index d0f95c4ab05e..37656ba14d54 100644
--- a/fs/afs/vlclient.c
+++ b/fs/afs/vlclient.c
@@ -187,19 +187,18 @@ static int afs_deliver_vl_get_addrs_u(struct afs_call *call)
u32 uniquifier, nentries, count;
int i, ret;
- _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+ _enter("{%u,%llu/%u}",
+ call->unmarshall, iov_iter_count(call->_iter), call->count);
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_buf(call,
+ sizeof(struct afs_uuid__xdr) + 3 * sizeof(__be32));
call->unmarshall++;
/* Extract the returned uuid, uniquifier, nentries and blkaddrs size */
case 1:
- ret = afs_extract_data(call, call->buffer,
- sizeof(struct afs_uuid__xdr) + 3 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -216,28 +215,28 @@ static int afs_deliver_vl_get_addrs_u(struct afs_call *call)
call->reply[0] = alist;
call->count = count;
call->count2 = nentries;
- call->offset = 0;
call->unmarshall++;
+ more_entries:
+ count = min(call->count, 4U);
+ afs_extract_to_buf(call, count * sizeof(__be32));
+
/* Extract entries */
case 2:
- count = min(call->count, 4U);
- ret = afs_extract_data(call, call->buffer,
- count * sizeof(__be32),
- call->count > 4);
+ ret = afs_extract_data(call, call->count > 4);
if (ret < 0)
return ret;
alist = call->reply[0];
bp = call->buffer;
+ count = min(call->count, 4U);
for (i = 0; i < count; i++)
if (alist->nr_addrs < call->count2)
afs_merge_fs_addr4(alist, *bp++, AFS_FS_PORT);
call->count -= count;
if (call->count > 0)
- goto again;
- call->offset = 0;
+ goto more_entries;
call->unmarshall++;
break;
}
@@ -318,44 +317,35 @@ static int afs_deliver_vl_get_capabilities(struct afs_call *call)
u32 count;
int ret;
- _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+ _enter("{%u,%llu/%u}",
+ call->unmarshall, iov_iter_count(call->_iter), call->count);
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_tmp(call);
call->unmarshall++;
/* Extract the capabilities word count */
case 1:
- ret = afs_extract_data(call, &call->tmp,
- 1 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
count = ntohl(call->tmp);
-
call->count = count;
call->count2 = count;
- call->offset = 0;
+
call->unmarshall++;
+ afs_extract_discard(call, count * sizeof(__be32));
/* Extract capabilities words */
case 2:
- count = min(call->count, 16U);
- ret = afs_extract_data(call, call->buffer,
- count * sizeof(__be32),
- call->count > 16);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
/* TODO: Examine capabilities */
- call->count -= count;
- if (call->count > 0)
- goto again;
- call->offset = 0;
call->unmarshall++;
break;
}
@@ -426,22 +416,19 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
u32 uniquifier, size;
int ret;
- _enter("{%u,%zu/%u,%u}", call->unmarshall, call->offset, call->count, call->count2);
+ _enter("{%u,%llu,%u}",
+ call->unmarshall, iov_iter_count(call->_iter), call->count2);
-again:
switch (call->unmarshall) {
case 0:
- call->offset = 0;
+ afs_extract_to_buf(call, sizeof(uuid_t) + 3 * sizeof(__be32));
call->unmarshall = 1;
/* Extract the returned uuid, uniquifier, fsEndpoints count and
* either the first fsEndpoint type or the volEndpoints
* count if there are no fsEndpoints. */
case 1:
- ret = afs_extract_data(call, call->buffer,
- sizeof(uuid_t) +
- 3 * sizeof(__be32),
- true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -459,15 +446,11 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
return -ENOMEM;
alist->version = uniquifier;
call->reply[0] = alist;
- call->offset = 0;
if (call->count == 0)
goto extract_volendpoints;
- call->unmarshall = 2;
-
- /* Extract fsEndpoints[] entries */
- case 2:
+ next_fsendpoint:
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
size = sizeof(__be32) * (1 + 1 + 1);
@@ -481,7 +464,12 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
}
size += sizeof(__be32);
- ret = afs_extract_data(call, call->buffer, size, true);
+ afs_extract_to_buf(call, size);
+ call->unmarshall = 2;
+
+ /* Extract fsEndpoints[] entries */
+ case 2:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -512,10 +500,9 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
*/
call->count2 = ntohl(*bp++);
- call->offset = 0;
call->count--;
if (call->count > 0)
- goto again;
+ goto next_fsendpoint;
extract_volendpoints:
/* Extract the list of volEndpoints. */
@@ -526,6 +513,7 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
return afs_protocol_error(call, -EBADMSG,
afs_eproto_yvl_vlendpt_type);
+ afs_extract_to_buf(call, 1 * sizeof(__be32));
call->unmarshall = 3;
/* Extract the type of volEndpoints[0]. Normally we would
@@ -533,17 +521,14 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
* data of the current one, but this is the first...
*/
case 3:
- ret = afs_extract_data(call, call->buffer, sizeof(__be32), true);
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
bp = call->buffer;
- call->count2 = ntohl(*bp++);
- call->offset = 0;
- call->unmarshall = 4;
- /* Extract volEndpoints[] entries */
- case 4:
+ next_volendpoint:
+ call->count2 = ntohl(*bp++);
switch (call->count2) {
case YFS_ENDPOINT_IPV4:
size = sizeof(__be32) * (1 + 1 + 1);
@@ -557,8 +542,13 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
}
if (call->count > 1)
- size += sizeof(__be32);
- ret = afs_extract_data(call, call->buffer, size, true);
+ size += sizeof(__be32); /* Get next type too */
+ afs_extract_to_buf(call, size);
+ call->unmarshall = 4;
+
+ /* Extract volEndpoints[] entries */
+ case 4:
+ ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -584,19 +574,17 @@ static int afs_deliver_yfsvl_get_endpoints(struct afs_call *call)
/* Got either the type of the next entry or the count of
* volEndpoints if no more fsEndpoints.
*/
- call->offset = 0;
call->count--;
- if (call->count > 0) {
- call->count2 = ntohl(*bp++);
- goto again;
- }
+ if (call->count > 0)
+ goto next_volendpoint;
end:
+ afs_extract_discard(call, 0);
call->unmarshall = 5;
/* Done */
case 5:
- ret = afs_extract_data(call, call->buffer, 0, false);
+ ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
call->unmarshall = 6;
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 8b39e6ebb40b..ad3d706aa9e3 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -37,8 +37,7 @@ static int afs_fill_page(struct afs_vnode *vnode, struct key *key,
_enter(",,%llu", (unsigned long long)pos);
- req = kzalloc(sizeof(struct afs_read) + sizeof(struct page *),
- GFP_KERNEL);
+ req = kzalloc(sizeof(struct afs_read), GFP_KERNEL);
if (!req)
return -ENOMEM;
@@ -46,9 +45,8 @@ static int afs_fill_page(struct afs_vnode *vnode, struct key *key,
req->pos = pos;
req->len = len;
req->nr_pages = 1;
- req->pages = req->array;
- req->pages[0] = page;
- get_page(page);
+ iov_iter_mapping(&req->iter, READ, vnode->vfs_inode.i_mapping,
+ pos, len);
ret = afs_fetch_data(vnode, key, req);
afs_put_read(req);
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 84b90a79d75a..d99294c75e9a 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -218,6 +218,9 @@ extern int __fscache_read_or_alloc_pages(struct fscache_cookie *,
gfp_t);
extern int __fscache_alloc_page(struct fscache_cookie *, struct page *, gfp_t);
extern int __fscache_write_page(struct fscache_cookie *, struct page *, loff_t, gfp_t);
+extern int __fscache_write(struct fscache_cookie *, struct iov_iter *,
+ loff_t, loff_t, gfp_t,
+ void (*)(struct fscache_cookie *, struct iov_iter *));
extern void __fscache_uncache_page(struct fscache_cookie *, struct page *);
extern bool __fscache_check_page_write(struct fscache_cookie *, struct page *);
extern void __fscache_wait_on_page_write(struct fscache_cookie *, struct page *);
@@ -655,6 +658,34 @@ void fscache_readpages_cancel(struct fscache_cookie *cookie,
__fscache_readpages_cancel(cookie, pages);
}
+/**
+ * fscache_write - Request storage of data in the cache
+ * @cookie: The cookie representing the cache object
+ * @iter: The data to store
+ * @pos: The position in the cached data
+ * @object_size: Updated size of object
+ * @gfp: The conditions under which memory allocation should be made
+ * @done: Called upon operation completion
+ *
+ * Request the data described by the iterator be written into the cache. This
+ * request may be ignored if insufficient space exists in the cache, in which
+ * case -ENOBUFS will be returned.
+ *
+ * See Documentation/filesystems/caching/netfs-api.txt for a complete
+ * description.
+ */
+static inline
+int fscache_write(struct fscache_cookie *cookie, struct iov_iter *iter,
+ loff_t pos, loff_t object_size, gfp_t gfp,
+ void (*done)(struct fscache_cookie *cookie,
+ struct iov_iter *iter))
+{
+ if (fscache_cookie_valid(cookie))
+ return __fscache_write(cookie, iter, pos, object_size, gfp, done);
+ else
+ return -ENOBUFS;
+}
+
/**
* fscache_write_page - Request storage of a page in the cache
* @cookie: The cookie representing the cache object
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index 5c60ade2c7d8..5e0f8dcede26 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -207,17 +207,16 @@ afs_edit_dir_reasons;
#define EM(a, b) { a, b },
#define E_(a, b) { a, b }
-TRACE_EVENT(afs_recv_data,
- TP_PROTO(struct afs_call *call, unsigned count, unsigned offset,
+TRACE_EVENT(afs_receive_data,
+ TP_PROTO(struct afs_call *call, struct iov_iter *iter,
bool want_more, int ret),
- TP_ARGS(call, count, offset, want_more, ret),
+ TP_ARGS(call, iter, want_more, ret),
TP_STRUCT__entry(
+ __field(loff_t, remain )
__field(unsigned int, call )
__field(enum afs_call_state, state )
- __field(unsigned int, count )
- __field(unsigned int, offset )
__field(unsigned short, unmarshall )
__field(bool, want_more )
__field(int, ret )
@@ -227,17 +226,18 @@ TRACE_EVENT(afs_recv_data,
__entry->call = call->debug_id;
__entry->state = call->state;
__entry->unmarshall = call->unmarshall;
- __entry->count = count;
- __entry->offset = offset;
+ __entry->remain = iov_iter_count(iter);
__entry->want_more = want_more;
__entry->ret = ret;
),
- TP_printk("c=%08x s=%u u=%u %u/%u wm=%u ret=%d",
+ TP_printk("c=%08x r=%llu u=%u w=%u s=%u ret=%d",
__entry->call,
- __entry->state, __entry->unmarshall,
- __entry->offset, __entry->count,
- __entry->want_more, __entry->ret)
+ __entry->remain,
+ __entry->unmarshall,
+ __entry->want_more,
+ __entry->state,
+ __entry->ret)
);
TRACE_EVENT(afs_notify_call,
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 07/10] afs: Use ITER_MAPPING for writing
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
` (5 preceding siblings ...)
2018-08-06 13:17 ` [PATCH 06/10] afs: Set up the iov_iter before calling afs_extract_data() David Howells
@ 2018-08-06 13:17 ` David Howells
2018-08-06 13:17 ` [PATCH 08/10] afs: Add O_DIRECT read support David Howells
` (2 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:17 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Use a single ITER_MAPPING iterator to describe the portion of a file to be
transmitted to the server rather than generating a series of small
ITER_BVEC iterators on the fly. This will make it easier to implement AIO
in afs.
In theory we could maybe use one giant ITER_BVEC, but that means
potentially allocating a huge array of bio_vec structs (max 256 per page)
when in fact the pagecache already has a structure listing all the relevant
pages (radix_tree/xarray) that can be walked over.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/afs/file.c | 4 --
fs/afs/fsclient.c | 40 +++++-------------
fs/afs/internal.h | 15 ++-----
fs/afs/rxrpc.c | 99 +++++++-------------------------------------
fs/afs/write.c | 84 +++++++++++++++++++++----------------
include/trace/events/afs.h | 51 ++++++++---------------
6 files changed, 100 insertions(+), 193 deletions(-)
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 6d55dabccd79..7d65efe69929 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -195,11 +195,9 @@ static void afs_readpages_page_done(const struct iov_iter *iter,
struct afs_vnode *vnode = AFS_FS_I(page->mapping->host);
struct afs_read *req = container_of(iter, struct afs_read, iter);
- SetPageUptodate(page);
-
if (0 && afs_vnode_cache(vnode))
SetPageFsCache(page);
- unlock_page(page);
+ page_endio(page, false, 0);
put_page(page);
req->done_pages++;
}
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index eb60c570dca2..4c6997d91552 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -1230,10 +1230,7 @@ static const struct afs_call_type afs_RXFSStoreData64 = {
/*
* store a set of pages to a very large file
*/
-static int afs_fs_store_data64(struct afs_fs_cursor *fc,
- struct address_space *mapping,
- pgoff_t first, pgoff_t last,
- unsigned offset, unsigned to,
+static int afs_fs_store_data64(struct afs_fs_cursor *fc, struct iov_iter *iter,
loff_t size, loff_t pos, loff_t i_size)
{
struct afs_vnode *vnode = fc->vnode;
@@ -1251,13 +1248,10 @@ static int afs_fs_store_data64(struct afs_fs_cursor *fc,
return -ENOMEM;
call->key = fc->key;
- call->mapping = mapping;
+ call->write_iter = iter;
+ call->write_first = pos >> PAGE_SHIFT;
+ call->write_last = (pos + size - 1) >> PAGE_SHIFT;
call->reply[0] = vnode;
- call->first = first;
- call->last = last;
- call->first_offset = offset;
- call->last_to = to;
- call->send_pages = true;
call->expected_version = vnode->status.data_version + 1;
/* marshall the parameters */
@@ -1286,26 +1280,20 @@ static int afs_fs_store_data64(struct afs_fs_cursor *fc,
}
/*
- * store a set of pages
+ * Write data to a file on the server.
*/
-int afs_fs_store_data(struct afs_fs_cursor *fc, struct address_space *mapping,
- pgoff_t first, pgoff_t last,
- unsigned offset, unsigned to)
+int afs_fs_store_data(struct afs_fs_cursor *fc, struct iov_iter *iter, loff_t pos)
{
struct afs_vnode *vnode = fc->vnode;
struct afs_call *call;
struct afs_net *net = afs_v2net(vnode);
- loff_t size, pos, i_size;
+ loff_t size, i_size;
__be32 *bp;
_enter(",%x,{%x:%u},,",
key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
- size = (loff_t)to - (loff_t)offset;
- if (first != last)
- size += (loff_t)(last - first) << PAGE_SHIFT;
- pos = (loff_t)first << PAGE_SHIFT;
- pos += offset;
+ size = iov_iter_count(iter);
i_size = i_size_read(&vnode->vfs_inode);
if (pos + size > i_size)
@@ -1316,8 +1304,7 @@ int afs_fs_store_data(struct afs_fs_cursor *fc, struct address_space *mapping,
(unsigned long long) i_size);
if (pos >> 32 || i_size >> 32 || size >> 32 || (pos + size) >> 32)
- return afs_fs_store_data64(fc, mapping, first, last, offset, to,
- size, pos, i_size);
+ return afs_fs_store_data64(fc, iter, size, pos, i_size);
call = afs_alloc_flat_call(net, &afs_RXFSStoreData,
(4 + 6 + 3) * 4,
@@ -1326,13 +1313,10 @@ int afs_fs_store_data(struct afs_fs_cursor *fc, struct address_space *mapping,
return -ENOMEM;
call->key = fc->key;
- call->mapping = mapping;
+ call->write_iter = iter;
+ call->write_first = pos >> PAGE_SHIFT;
+ call->write_last = (pos + size - 1) >> PAGE_SHIFT;
call->reply[0] = vnode;
- call->first = first;
- call->last = last;
- call->first_offset = offset;
- call->last_to = to;
- call->send_pages = true;
call->expected_version = vnode->status.data_version + 1;
/* marshall the parameters */
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 8e248051afde..b12566b640f6 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -96,14 +96,15 @@ struct afs_call {
struct address_space *mapping; /* Pages being written from */
struct iov_iter iter; /* Buffer iterator */
struct iov_iter *_iter; /* Iterator currently in use */
+ struct iov_iter *write_iter; /* Iterator defining write to be made */
union { /* Convenience for ->iter */
struct kvec kvec[1];
struct bio_vec bvec[1];
};
void *buffer; /* reply receive buffer */
void *reply[4]; /* Where to put the reply */
- pgoff_t first; /* first page in mapping to deal with */
- pgoff_t last; /* last page in mapping to deal with */
+ pgoff_t write_first; /* First page of write */
+ pgoff_t write_last; /* Last page of write */
atomic_t usage;
enum afs_call_state state;
spinlock_t state_lock;
@@ -111,15 +112,10 @@ struct afs_call {
u32 abort_code; /* Remote abort ID or 0 */
unsigned request_size; /* size of request data */
unsigned reply_max; /* maximum size of reply */
- unsigned first_offset; /* offset into mapping[first] */
unsigned int cb_break; /* cb_break + cb_s_break before the call */
- union {
- unsigned last_to; /* amount of mapping[last] */
- unsigned count2; /* count used in unmarshalling */
- };
+ unsigned count2; /* count used in unmarshalling */
unsigned char unmarshall; /* unmarshalling phase */
bool incoming; /* T if incoming call */
- bool send_pages; /* T if data from mapping should be sent */
bool need_attention; /* T if RxRPC poked us */
bool async; /* T if asynchronous */
bool ret_reply0; /* T if should return reply[0] on success */
@@ -798,8 +794,7 @@ extern int afs_fs_symlink(struct afs_fs_cursor *, const char *, const char *, u6
struct afs_fid *, struct afs_file_status *);
extern int afs_fs_rename(struct afs_fs_cursor *, const char *,
struct afs_vnode *, const char *, u64, u64);
-extern int afs_fs_store_data(struct afs_fs_cursor *, struct address_space *,
- pgoff_t, pgoff_t, unsigned, unsigned);
+extern int afs_fs_store_data(struct afs_fs_cursor *, struct iov_iter *, loff_t);
extern int afs_fs_setattr(struct afs_fs_cursor *, struct iattr *);
extern int afs_fs_get_volume_status(struct afs_fs_cursor *, struct afs_volume_status *);
extern int afs_fs_set_lock(struct afs_fs_cursor *, afs_lock_type_t);
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 09479677f11f..fa22739cbf4f 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -258,39 +258,6 @@ void afs_flat_call_destructor(struct afs_call *call)
call->buffer = NULL;
}
-#define AFS_BVEC_MAX 8
-
-/*
- * Load the given bvec with the next few pages.
- */
-static void afs_load_bvec(struct afs_call *call, struct msghdr *msg,
- struct bio_vec *bv, pgoff_t first, pgoff_t last,
- unsigned offset)
-{
- struct page *pages[AFS_BVEC_MAX];
- unsigned int nr, n, i, to, bytes = 0;
-
- nr = min_t(pgoff_t, last - first + 1, AFS_BVEC_MAX);
- n = find_get_pages_contig(call->mapping, first, nr, pages);
- ASSERTCMP(n, ==, nr);
-
- msg->msg_flags |= MSG_MORE;
- for (i = 0; i < nr; i++) {
- to = PAGE_SIZE;
- if (first + i >= last) {
- to = call->last_to;
- msg->msg_flags &= ~MSG_MORE;
- }
- bv[i].bv_page = pages[i];
- bv[i].bv_len = to - offset;
- bv[i].bv_offset = offset;
- bytes += to - offset;
- offset = 0;
- }
-
- iov_iter_bvec(&msg->msg_iter, WRITE, bv, nr, bytes);
-}
-
/*
* Advance the AFS call state when the RxRPC call ends the transmit phase.
*/
@@ -303,41 +270,6 @@ static void afs_notify_end_request_tx(struct sock *sock,
afs_set_call_state(call, AFS_CALL_CL_REQUESTING, AFS_CALL_CL_AWAIT_REPLY);
}
-/*
- * attach the data from a bunch of pages on an inode to a call
- */
-static int afs_send_pages(struct afs_call *call, struct msghdr *msg)
-{
- struct bio_vec bv[AFS_BVEC_MAX];
- unsigned int bytes, nr, loop, offset;
- pgoff_t first = call->first, last = call->last;
- int ret;
-
- offset = call->first_offset;
- call->first_offset = 0;
-
- do {
- afs_load_bvec(call, msg, bv, first, last, offset);
- trace_afs_send_pages(call, msg, first, last, offset);
-
- offset = 0;
- bytes = msg->msg_iter.count;
- nr = msg->msg_iter.nr_segs;
-
- ret = rxrpc_kernel_send_data(call->net->socket, call->rxcall, msg,
- bytes, afs_notify_end_request_tx);
- for (loop = 0; loop < nr; loop++)
- put_page(bv[loop].bv_page);
- if (ret < 0)
- break;
-
- first += nr;
- } while (first <= last);
-
- trace_afs_sent_pages(call, call->first, last, first, ret);
- return ret;
-}
-
/*
* initiate a call
*/
@@ -367,19 +299,8 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
* after the initial fixed part.
*/
tx_total_len = call->request_size;
- if (call->send_pages) {
- if (call->last == call->first) {
- tx_total_len += call->last_to - call->first_offset;
- } else {
- /* It looks mathematically like you should be able to
- * combine the following lines with the ones above, but
- * unsigned arithmetic is fun when it wraps...
- */
- tx_total_len += PAGE_SIZE - call->first_offset;
- tx_total_len += call->last_to;
- tx_total_len += (call->last - call->first - 1) * PAGE_SIZE;
- }
- }
+ if (call->write_iter)
+ tx_total_len += iov_iter_count(call->write_iter);
/* create a call */
rxcall = rxrpc_kernel_begin_call(call->net->socket, srx, call->key,
@@ -406,7 +327,7 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, call->request_size);
msg.msg_control = NULL;
msg.msg_controllen = 0;
- msg.msg_flags = MSG_WAITALL | (call->send_pages ? MSG_MORE : 0);
+ msg.msg_flags = MSG_WAITALL | (call->write_iter ? MSG_MORE : 0);
ret = rxrpc_kernel_send_data(call->net->socket, rxcall,
&msg, call->request_size,
@@ -414,8 +335,18 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
if (ret < 0)
goto error_do_abort;
- if (call->send_pages) {
- ret = afs_send_pages(call, &msg);
+ if (call->write_iter) {
+ msg.msg_iter = *call->write_iter;
+ msg.msg_flags &= ~MSG_MORE;
+ trace_afs_send_data(call, &msg);
+
+ ret = rxrpc_kernel_send_data(call->net->socket,
+ call->rxcall, &msg,
+ iov_iter_count(&msg.msg_iter),
+ afs_notify_end_request_tx);
+ *call->write_iter = msg.msg_iter;
+
+ trace_afs_sent_data(call, &msg, ret);
if (ret < 0)
goto error_do_abort;
}
diff --git a/fs/afs/write.c b/fs/afs/write.c
index ad3d706aa9e3..8ce5142f0f08 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -302,22 +302,21 @@ static void afs_redirty_pages(struct writeback_control *wbc,
/*
* write to a file
*/
-static int afs_store_data(struct address_space *mapping,
- pgoff_t first, pgoff_t last,
- unsigned offset, unsigned to)
+static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter,
+ loff_t pos)
{
- struct afs_vnode *vnode = AFS_FS_I(mapping->host);
struct afs_fs_cursor fc;
struct afs_wb_key *wbk = NULL;
struct list_head *p;
+ loff_t count = iov_iter_count(iter);
int ret = -ENOKEY, ret2;
- _enter("%s{%x:%u.%u},%lx,%lx,%x,%x",
+ _enter("%s{%x:%u.%u},%llx,%llx",
vnode->volume->name,
vnode->fid.vid,
vnode->fid.vnode,
vnode->fid.unique,
- first, last, offset, to);
+ count, pos);
spin_lock(&vnode->wb_lock);
p = vnode->wb_keys.next;
@@ -350,7 +349,7 @@ static int afs_store_data(struct address_space *mapping,
if (afs_begin_vnode_operation(&fc, vnode, wbk->key)) {
while (afs_select_fileserver(&fc)) {
fc.cb_break = afs_calc_vnode_cb_break(vnode);
- afs_fs_store_data(&fc, mapping, first, last, offset, to);
+ afs_fs_store_data(&fc, iter, pos);
}
afs_check_for_remote_deletion(&fc, fc.vnode);
@@ -361,9 +360,7 @@ static int afs_store_data(struct address_space *mapping,
switch (ret) {
case 0:
afs_stat_v(vnode, n_stores);
- atomic_long_add((last * PAGE_SIZE + to) -
- (first * PAGE_SIZE + offset),
- &afs_v2net(vnode)->n_store_bytes);
+ atomic_long_add(count, &afs_v2net(vnode)->n_store_bytes);
break;
case -EACCES:
case -EPERM:
@@ -393,10 +390,12 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
pgoff_t final_page)
{
struct afs_vnode *vnode = AFS_FS_I(mapping->host);
+ struct iov_iter iter;
struct page *pages[8], *page;
unsigned long count, priv;
unsigned n, offset, to, f, t;
pgoff_t start, first, last;
+ loff_t a, b;
int loop, ret;
_enter(",%lx", primary_page->index);
@@ -496,10 +495,17 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
first = primary_page->index;
last = first + count - 1;
-
_debug("write back %lx[%u..] to %lx[..%u]", first, offset, last, to);
- ret = afs_store_data(mapping, first, last, offset, to);
+ a = first;
+ a <<= PAGE_SHIFT;
+ a += offset;
+ b = last;
+ b <<= PAGE_SHIFT;
+ b += to;
+ iov_iter_mapping(&iter, WRITE, mapping, a, b - a);
+
+ ret = afs_store_data(vnode, &iter, a);
switch (ret) {
case 0:
ret = count;
@@ -668,36 +674,35 @@ int afs_writepages(struct address_space *mapping,
*/
void afs_pages_written_back(struct afs_vnode *vnode, struct afs_call *call)
{
- struct pagevec pv;
+ struct radix_tree_iter iter;
+ struct address_space *mapping = vnode->vfs_inode.i_mapping;
unsigned long priv;
- unsigned count, loop;
- pgoff_t first = call->first, last = call->last;
+ pgoff_t cursor = call->write_first, last = call->write_last;
+ void __rcu **slot;
_enter("{%x:%u},{%lx-%lx}",
- vnode->fid.vid, vnode->fid.vnode, first, last);
+ vnode->fid.vid, vnode->fid.vnode, cursor, last);
- pagevec_init(&pv);
+ rcu_read_lock();
- do {
- _debug("done %lx-%lx", first, last);
+ radix_tree_for_each_contig(slot, &mapping->i_pages, &iter, cursor) {
+ struct page *page = radix_tree_deref_slot(slot);
- count = last - first + 1;
- if (count > PAGEVEC_SIZE)
- count = PAGEVEC_SIZE;
- pv.nr = find_get_pages_contig(vnode->vfs_inode.i_mapping,
- first, count, pv.pages);
- ASSERTCMP(pv.nr, ==, count);
+ ASSERT(page);
+ ASSERT(PageWriteback(page));
- for (loop = 0; loop < count; loop++) {
- priv = page_private(pv.pages[loop]);
- trace_afs_page_dirty(vnode, tracepoint_string("clear"),
- pv.pages[loop]->index, priv);
- set_page_private(pv.pages[loop], 0);
- end_page_writeback(pv.pages[loop]);
- }
- first += count;
- __pagevec_release(&pv);
- } while (first <= last);
+ priv = page_private(page);
+ trace_afs_page_dirty(vnode, tracepoint_string("clear"),
+ page->index, priv);
+ set_page_private(page, 0);
+ page_endio(page, true, 0);
+
+ if (cursor >= last)
+ break;
+ cursor++;
+ }
+
+ rcu_read_unlock();
afs_prune_wb_keys(vnode);
_leave("");
@@ -829,6 +834,8 @@ int afs_launder_page(struct page *page)
{
struct address_space *mapping = page->mapping;
struct afs_vnode *vnode = AFS_FS_I(mapping->host);
+ struct iov_iter iter;
+ struct bio_vec bv[1];
unsigned long priv;
unsigned int f, t;
int ret = 0;
@@ -844,9 +851,14 @@ int afs_launder_page(struct page *page)
t = priv >> AFS_PRIV_SHIFT;
}
+ bv[0].bv_page = page;
+ bv[0].bv_offset = f;
+ bv[0].bv_len = t - f;
+ iov_iter_bvec(&iter, WRITE, bv, 1, bv[0].bv_len);
+
trace_afs_page_dirty(vnode, tracepoint_string("launder"),
page->index, priv);
- ret = afs_store_data(mapping, page->index, page->index, t, f);
+ ret = afs_store_data(vnode, &iter, (loff_t)page->index << PAGE_SHIFT);
}
trace_afs_page_dirty(vnode, tracepoint_string("laundered"),
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index 5e0f8dcede26..9c44245987b3 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -392,65 +392,52 @@ TRACE_EVENT(afs_call_done,
__entry->rx_call)
);
-TRACE_EVENT(afs_send_pages,
- TP_PROTO(struct afs_call *call, struct msghdr *msg,
- pgoff_t first, pgoff_t last, unsigned int offset),
+TRACE_EVENT(afs_send_data,
+ TP_PROTO(struct afs_call *call, struct msghdr *msg),
- TP_ARGS(call, msg, first, last, offset),
+ TP_ARGS(call, msg),
TP_STRUCT__entry(
__field(unsigned int, call )
- __field(pgoff_t, first )
- __field(pgoff_t, last )
- __field(unsigned int, nr )
- __field(unsigned int, bytes )
- __field(unsigned int, offset )
__field(unsigned int, flags )
+ __field(loff_t, offset )
+ __field(loff_t, count )
),
TP_fast_assign(
__entry->call = call->debug_id;
- __entry->first = first;
- __entry->last = last;
- __entry->nr = msg->msg_iter.nr_segs;
- __entry->bytes = msg->msg_iter.count;
- __entry->offset = offset;
__entry->flags = msg->msg_flags;
+ __entry->offset = msg->msg_iter.iov_offset;
+ __entry->count = iov_iter_count(&msg->msg_iter);
),
- TP_printk(" c=%08x %lx-%lx-%lx b=%x o=%x f=%x",
- __entry->call,
- __entry->first, __entry->first + __entry->nr - 1, __entry->last,
- __entry->bytes, __entry->offset,
+ TP_printk(" c=%08x o=%llx c=%llx f=%x",
+ __entry->call, __entry->offset, __entry->count,
__entry->flags)
);
-TRACE_EVENT(afs_sent_pages,
- TP_PROTO(struct afs_call *call, pgoff_t first, pgoff_t last,
- pgoff_t cursor, int ret),
+TRACE_EVENT(afs_sent_data,
+ TP_PROTO(struct afs_call *call, struct msghdr *msg, int ret),
- TP_ARGS(call, first, last, cursor, ret),
+ TP_ARGS(call, msg, ret),
TP_STRUCT__entry(
__field(unsigned int, call )
- __field(pgoff_t, first )
- __field(pgoff_t, last )
- __field(pgoff_t, cursor )
__field(int, ret )
+ __field(loff_t, offset )
+ __field(loff_t, count )
),
TP_fast_assign(
__entry->call = call->debug_id;
- __entry->first = first;
- __entry->last = last;
- __entry->cursor = cursor;
__entry->ret = ret;
+ __entry->offset = msg->msg_iter.iov_offset;
+ __entry->count = iov_iter_count(&msg->msg_iter);
),
- TP_printk(" c=%08x %lx-%lx c=%lx r=%d",
- __entry->call,
- __entry->first, __entry->last,
- __entry->cursor, __entry->ret)
+ TP_printk(" c=%08x o=%llx c=%llx r=%d",
+ __entry->call, __entry->offset, __entry->count,
+ __entry->ret)
);
TRACE_EVENT(afs_dir_check_failed,
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 08/10] afs: Add O_DIRECT read support
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
` (6 preceding siblings ...)
2018-08-06 13:17 ` [PATCH 07/10] afs: Use ITER_MAPPING for writing David Howells
@ 2018-08-06 13:17 ` David Howells
2018-08-06 13:17 ` [PATCH 09/10] afs: Add a couple of tracepoints to log I/O errors David Howells
2018-08-06 13:17 ` [PATCH 10/10] afs: Don't invoke the server to read data beyond EOF David Howells
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:17 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Add synchronous O_DIRECT read support to AFS (no AIO yet). It can
theoretically handle reads up to the maximum size describable by loff_t -
and given an iterator with sufficiently capacity to handle that and given
support on the server.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/afs/file.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/afs/internal.h | 1 +
fs/afs/write.c | 8 +++++++
3 files changed, 70 insertions(+)
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 7d65efe69929..095ef4b4a0e7 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -27,6 +27,7 @@ static int afs_releasepage(struct page *page, gfp_t gfp_flags);
static int afs_readpages(struct file *filp, struct address_space *mapping,
struct list_head *pages, unsigned nr_pages);
+static ssize_t afs_direct_IO(struct kiocb *iocb, struct iov_iter *iter);
const struct file_operations afs_file_operations = {
.open = afs_open,
@@ -55,6 +56,7 @@ const struct address_space_operations afs_fs_aops = {
.launder_page = afs_launder_page,
.releasepage = afs_releasepage,
.invalidatepage = afs_invalidatepage,
+ .direct_IO = afs_direct_IO,
.write_begin = afs_write_begin,
.write_end = afs_write_end,
.writepage = afs_writepage,
@@ -732,3 +734,62 @@ static int afs_file_mmap(struct file *file, struct vm_area_struct *vma)
vma->vm_ops = &afs_vm_ops;
return ret;
}
+
+/*
+ * Direct file read operation for an AFS file.
+ */
+static ssize_t afs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter)
+{
+ struct file *file = iocb->ki_filp;
+ struct address_space *mapping = file->f_mapping;
+ struct afs_vnode *vnode = AFS_FS_I(mapping->host);
+ struct afs_read *req;
+ struct key *key = afs_file_key(file);
+ ssize_t ret;
+ size_t count = iov_iter_count(iter), transferred = 0;
+
+ if (!count)
+ return 0;
+ if (!is_sync_kiocb(iocb))
+ return -EOPNOTSUPP;
+
+ req = kzalloc(sizeof(struct afs_read), GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ refcount_set(&req->usage, 1);
+ req->pos = iocb->ki_pos;
+ req->len = count;
+ req->iter = *iter;
+
+ task_io_account_read(count);
+
+
+ // TODO afs_start_io_direct(inode);
+ ret = afs_fetch_data(vnode, key, req);
+ if (ret == 0)
+ transferred = req->actual_len;
+ *iter = req->iter;
+ afs_put_read(req);
+
+ // TODO afs_end_io_direct(inode);
+
+ BUG_ON(ret == -EIOCBQUEUED); // TODO
+
+ if (ret == 0)
+ ret = transferred;
+
+ return ret;
+}
+
+/*
+ * Do direct I/O.
+ */
+static ssize_t afs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
+ VM_BUG_ON(iov_iter_count(iter) != PAGE_SIZE);
+
+ if (iov_iter_rw(iter) == READ)
+ return afs_file_direct_read(iocb, iter);
+ return afs_file_direct_write(iocb, iter);
+}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index b12566b640f6..ac763e651c42 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1110,6 +1110,7 @@ extern int afs_fsync(struct file *, loff_t, loff_t, int);
extern int afs_page_mkwrite(struct vm_fault *);
extern void afs_prune_wb_keys(struct afs_vnode *);
extern int afs_launder_page(struct page *);
+extern ssize_t afs_file_direct_write(struct kiocb *, struct iov_iter *);
/*
* xattr.c
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 8ce5142f0f08..18b836b473f2 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -874,3 +874,11 @@ int afs_launder_page(struct page *page)
#endif
return ret;
}
+
+/*
+ * Direct file write operation for an AFS file.
+ */
+ssize_t afs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
+{
+ return -EOPNOTSUPP;
+}
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 09/10] afs: Add a couple of tracepoints to log I/O errors
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
` (7 preceding siblings ...)
2018-08-06 13:17 ` [PATCH 08/10] afs: Add O_DIRECT read support David Howells
@ 2018-08-06 13:17 ` David Howells
2018-08-06 13:17 ` [PATCH 10/10] afs: Don't invoke the server to read data beyond EOF David Howells
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:17 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
Add a couple of tracepoints to log the production of I/O errors within the AFS
filesystem.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/afs/cmservice.c | 10 +++--
fs/afs/dir.c | 23 ++++++++----
fs/afs/internal.h | 11 ++++++
fs/afs/mntpt.c | 2 +
fs/afs/rxrpc.c | 2 +
fs/afs/server.c | 4 +-
fs/afs/volume.c | 2 +
fs/afs/write.c | 1 +
include/trace/events/afs.h | 81 ++++++++++++++++++++++++++++++++++++++++++++
9 files changed, 117 insertions(+), 19 deletions(-)
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 4db62ae8dc1a..186f621f8722 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -260,7 +260,7 @@ static int afs_deliver_cb_callback(struct afs_call *call)
}
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
/* we'll need the file server record as that tells us which set of
* vnodes to operate upon */
@@ -368,7 +368,7 @@ static int afs_deliver_cb_init_call_back_state3(struct afs_call *call)
}
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
/* we'll need the file server record as that tells us which set of
* vnodes to operate upon */
@@ -409,7 +409,7 @@ static int afs_deliver_cb_probe(struct afs_call *call)
return ret;
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
return afs_queue_call_work(call);
}
@@ -490,7 +490,7 @@ static int afs_deliver_cb_probe_uuid(struct afs_call *call)
}
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
return afs_queue_call_work(call);
}
@@ -573,7 +573,7 @@ static int afs_deliver_cb_tell_me_about_yourself(struct afs_call *call)
return ret;
if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING))
- return -EIO;
+ return afs_io_error(call, afs_io_error_cm_reply);
return afs_queue_call_work(call);
}
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 1357a10804bb..300514cb33d8 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -173,6 +173,7 @@ static bool afs_dir_check_page(struct afs_vnode *dvnode, struct page *page,
ntohs(dbuf->blocks[tmp].hdr.magic));
trace_afs_dir_check_failed(dvnode, off, i_size);
kunmap(page);
+ trace_afs_file_error(dvnode, -EIO, afs_file_error_dir_bad_magic);
goto error;
}
@@ -273,12 +274,15 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
expand:
i_size = i_size_read(&dvnode->vfs_inode);
- ret = -EIO;
- if (i_size < 2048)
+ if (i_size < 2048) {
+ ret = afs_bad(dvnode, afs_file_error_dir_small);
goto error;
- ret = -EFBIG;
- if (i_size > 2048 * 1024)
+ }
+ if (i_size > 2048 * 1024) {
+ trace_afs_file_error(dvnode, -EFBIG, afs_file_error_dir_big);
+ ret = -EFBIG;
goto error;
+ }
nr_pages = (i_size + PAGE_SIZE - 1) / PAGE_SIZE;
@@ -383,7 +387,8 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
/*
* deal with one block in an AFS directory
*/
-static int afs_dir_iterate_block(struct dir_context *ctx,
+static int afs_dir_iterate_block(struct afs_vnode *dvnode,
+ struct dir_context *ctx,
union afs_xdr_dir_block *block,
unsigned blkoff)
{
@@ -433,7 +438,7 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
" (len %u/%zu)",
blkoff / sizeof(union afs_xdr_dir_block),
offset, next, tmp, nlen);
- return -EIO;
+ return afs_bad(dvnode, afs_file_error_dir_over_end);
}
if (!(block->hdr.bitmap[next / 8] &
(1 << (next % 8)))) {
@@ -441,7 +446,7 @@ static int afs_dir_iterate_block(struct dir_context *ctx,
" %u unmarked extension (len %u/%zu)",
blkoff / sizeof(union afs_xdr_dir_block),
offset, next, tmp, nlen);
- return -EIO;
+ return afs_bad(dvnode, afs_file_error_dir_unmarked_ext);
}
_debug("ENT[%zu.%u]: ext %u/%zu",
@@ -517,7 +522,7 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
page = radix_tree_deref_slot(slot);
rcu_read_unlock();
if (!page) {
- ret = -EIO;
+ ret = afs_bad(dvnode, afs_file_error_dir_missing_page);
break;
}
mark_page_accessed(page);
@@ -530,7 +535,7 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx,
do {
dblock = &dbuf->blocks[(blkoff % PAGE_SIZE) /
sizeof(union afs_xdr_dir_block)];
- ret = afs_dir_iterate_block(ctx, dblock, blkoff);
+ ret = afs_dir_iterate_block(dvnode, ctx, dblock, blkoff);
if (ret != 1) {
kunmap(page);
goto out;
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index ac763e651c42..8e2941cafc36 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1149,6 +1149,17 @@ static inline void afs_check_for_remote_deletion(struct afs_fs_cursor *fc,
}
}
+static inline int afs_io_error(struct afs_call *call, enum afs_io_error where)
+{
+ trace_afs_io_error(call->debug_id, -EIO, where);
+ return -EIO;
+}
+
+static inline int afs_bad(struct afs_vnode *vnode, enum afs_file_error where)
+{
+ trace_afs_file_error(vnode, -EIO, where);
+ return -EIO;
+}
/*****************************************************************************/
/*
diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
index c8a7f05b9f12..822d62790a74 100644
--- a/fs/afs/mntpt.c
+++ b/fs/afs/mntpt.c
@@ -122,7 +122,7 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt)
if (PageError(page)) {
put_page(page);
- return -EIO;
+ return afs_bad(AFS_FS_I(d_inode(mntpt)), afs_file_error_mntpt);
}
buf = kmap(page);
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index fa22739cbf4f..36363c33097b 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -845,7 +845,7 @@ int afs_extract_data(struct afs_call *call, bool want_more)
break;
case AFS_CALL_COMPLETE:
kdebug("prem complete %d", call->error);
- return -EIO;
+ return afs_io_error(call, afs_io_error_extract);
default:
break;
}
diff --git a/fs/afs/server.c b/fs/afs/server.c
index 1d329e6981d5..b7874bf9402e 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -274,7 +274,7 @@ static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell,
case -ECONNREFUSED:
break;
default:
- ac.error = -EIO;
+ ac.error = afs_io_error(NULL, afs_io_error_vl_lookup_fail);
goto error;
}
}
@@ -560,7 +560,7 @@ static bool afs_do_probe_fileserver(struct afs_fs_cursor *fc)
case -ETIME:
break;
default:
- fc->ac.error = -EIO;
+ fc->ac.error = afs_io_error(NULL, afs_io_error_fs_probe_fail);
goto error;
}
}
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 7adcddf02e66..23dde4624d06 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -116,7 +116,7 @@ static struct afs_vldb_entry *afs_vl_lookup_vldb(struct afs_cell *cell,
case -ECONNREFUSED:
break;
default:
- ac.error = -EIO;
+ ac.error = afs_io_error(NULL, afs_io_error_vl_lookup_fail);
goto error;
}
}
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 18b836b473f2..6ff0f25836de 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -537,6 +537,7 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
case -ENOENT:
case -ENOMEDIUM:
case -ENXIO:
+ trace_afs_file_error(vnode, ret, afs_file_error_writeback_fail);
afs_kill_pages(mapping, first, last);
mapping_set_error(mapping, ret);
break;
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index 9c44245987b3..d9c99c423fd9 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -103,6 +103,24 @@ enum afs_eproto_cause {
afs_eproto_yvl_vlendpt_type,
};
+enum afs_io_error {
+ afs_io_error_cm_reply,
+ afs_io_error_extract,
+ afs_io_error_fs_probe_fail,
+ afs_io_error_vl_lookup_fail,
+};
+
+enum afs_file_error {
+ afs_file_error_dir_bad_magic,
+ afs_file_error_dir_big,
+ afs_file_error_dir_missing_page,
+ afs_file_error_dir_over_end,
+ afs_file_error_dir_small,
+ afs_file_error_dir_unmarked_ext,
+ afs_file_error_mntpt,
+ afs_file_error_writeback_fail,
+};
+
#endif /* end __AFS_DECLARE_TRACE_ENUMS_ONCE_ONLY */
/*
@@ -183,6 +201,21 @@ enum afs_eproto_cause {
EM(afs_eproto_yvl_vlendpt6_len, "YVL.VlEnd6Len") \
E_(afs_eproto_yvl_vlendpt_type, "YVL.VlEndType")
+#define afs_io_errors \
+ EM(afs_io_error_cm_reply, "CM_REPLY") \
+ EM(afs_io_error_extract, "EXTRACT") \
+ EM(afs_io_error_fs_probe_fail, "FS_PROBE_FAIL") \
+ E_(afs_io_error_vl_lookup_fail, "VL_LOOKUP_FAIL")
+
+#define afs_file_errors \
+ EM(afs_file_error_dir_bad_magic, "DIR_BAD_MAGIC") \
+ EM(afs_file_error_dir_big, "DIR_BIG") \
+ EM(afs_file_error_dir_missing_page, "DIR_MISSING_PAGE") \
+ EM(afs_file_error_dir_over_end, "DIR_ENT_OVER_END") \
+ EM(afs_file_error_dir_small, "DIR_SMALL") \
+ EM(afs_file_error_dir_unmarked_ext, "DIR_UNMARKED_EXT") \
+ EM(afs_file_error_mntpt, "MNTPT_READ_FAILED") \
+ E_(afs_file_error_writeback_fail, "WRITEBACK_FAILED")
/*
* Export enum symbols via userspace.
@@ -197,6 +230,9 @@ afs_fs_operations;
afs_vl_operations;
afs_edit_dir_ops;
afs_edit_dir_reasons;
+afs_eproto_causes;
+afs_io_errors;
+afs_file_errors;
/*
* Now redefine the EM() and E_() macros to map the enums to the strings that
@@ -600,6 +636,51 @@ TRACE_EVENT(afs_protocol_error,
__print_symbolic(__entry->cause, afs_eproto_causes))
);
+TRACE_EVENT(afs_io_error,
+ TP_PROTO(unsigned int call, int error, enum afs_io_error where),
+
+ TP_ARGS(call, error, where),
+
+ TP_STRUCT__entry(
+ __field(unsigned int, call )
+ __field(int, error )
+ __field(enum afs_io_error, where )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->error = error;
+ __entry->where = where;
+ ),
+
+ TP_printk("c=%08x r=%d %s",
+ __entry->call, __entry->error,
+ __print_symbolic(__entry->where, afs_io_errors))
+ );
+
+TRACE_EVENT(afs_file_error,
+ TP_PROTO(struct afs_vnode *vnode, int error, enum afs_file_error where),
+
+ TP_ARGS(vnode, error, where),
+
+ TP_STRUCT__entry(
+ __field_struct(struct afs_fid, fid )
+ __field(int, error )
+ __field(enum afs_file_error, where )
+ ),
+
+ TP_fast_assign(
+ __entry->fid = vnode->fid;
+ __entry->error = error;
+ __entry->where = where;
+ ),
+
+ TP_printk("%x:%x:%x r=%d %s",
+ __entry->fid.vid, __entry->fid.vnode, __entry->fid.unique,
+ __entry->error,
+ __print_symbolic(__entry->where, afs_file_errors))
+ );
+
TRACE_EVENT(afs_cm_no_server,
TP_PROTO(struct afs_call *call, struct sockaddr_rxrpc *srx),
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 10/10] afs: Don't invoke the server to read data beyond EOF
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
` (8 preceding siblings ...)
2018-08-06 13:17 ` [PATCH 09/10] afs: Add a couple of tracepoints to log I/O errors David Howells
@ 2018-08-06 13:17 ` David Howells
9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-08-06 13:17 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-afs, linux-fsdevel, linux-kernel, matthew
When writing a new page, clear space in the page rather than attempting to
load it from the server if the space is beyond the EOF.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/afs/write.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 6ff0f25836de..7c7a03310f16 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -33,10 +33,21 @@ static int afs_fill_page(struct afs_vnode *vnode, struct key *key,
loff_t pos, unsigned int len, struct page *page)
{
struct afs_read *req;
+ size_t p;
+ void *data;
int ret;
_enter(",,%llu", (unsigned long long)pos);
+ if (pos >= vnode->vfs_inode.i_size) {
+ p = pos & ~PAGE_MASK;
+ ASSERTCMP(p + len, <=, PAGE_SIZE);
+ data = kmap(page);
+ memset(data + p, 0, len);
+ kunmap(page);
+ return 0;
+ }
+
req = kzalloc(sizeof(struct afs_read), GFP_KERNEL);
if (!req)
return -ENOMEM;
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 08/10] afs: Add O_DIRECT read support
2018-09-13 15:51 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
@ 2018-09-13 15:52 ` David Howells
0 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2018-09-13 15:52 UTC (permalink / raw)
To: viro; +Cc: dhowells, linux-fsdevel, linux-afs, linux-kernel
Add synchronous O_DIRECT read support to AFS (no AIO yet). It can
theoretically handle reads up to the maximum size describable by loff_t -
and given an iterator with sufficiently capacity to handle that and given
support on the server.
Signed-off-by: David Howells <dhowells@redhat.com>
---
fs/afs/file.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/afs/internal.h | 1 +
fs/afs/write.c | 8 +++++++
3 files changed, 70 insertions(+)
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 36ba263db749..60f9896e1426 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -27,6 +27,7 @@ static int afs_releasepage(struct page *page, gfp_t gfp_flags);
static int afs_readpages(struct file *filp, struct address_space *mapping,
struct list_head *pages, unsigned nr_pages);
+static ssize_t afs_direct_IO(struct kiocb *iocb, struct iov_iter *iter);
const struct file_operations afs_file_operations = {
.open = afs_open,
@@ -55,6 +56,7 @@ const struct address_space_operations afs_fs_aops = {
.launder_page = afs_launder_page,
.releasepage = afs_releasepage,
.invalidatepage = afs_invalidatepage,
+ .direct_IO = afs_direct_IO,
.write_begin = afs_write_begin,
.write_end = afs_write_end,
.writepage = afs_writepage,
@@ -731,3 +733,62 @@ static int afs_file_mmap(struct file *file, struct vm_area_struct *vma)
vma->vm_ops = &afs_vm_ops;
return ret;
}
+
+/*
+ * Direct file read operation for an AFS file.
+ */
+static ssize_t afs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter)
+{
+ struct file *file = iocb->ki_filp;
+ struct address_space *mapping = file->f_mapping;
+ struct afs_vnode *vnode = AFS_FS_I(mapping->host);
+ struct afs_read *req;
+ struct key *key = afs_file_key(file);
+ ssize_t ret;
+ size_t count = iov_iter_count(iter), transferred = 0;
+
+ if (!count)
+ return 0;
+ if (!is_sync_kiocb(iocb))
+ return -EOPNOTSUPP;
+
+ req = kzalloc(sizeof(struct afs_read), GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ refcount_set(&req->usage, 1);
+ req->pos = iocb->ki_pos;
+ req->len = count;
+ req->iter = *iter;
+
+ task_io_account_read(count);
+
+
+ // TODO afs_start_io_direct(inode);
+ ret = afs_fetch_data(vnode, key, req);
+ if (ret == 0)
+ transferred = req->actual_len;
+ *iter = req->iter;
+ afs_put_read(req);
+
+ // TODO afs_end_io_direct(inode);
+
+ BUG_ON(ret == -EIOCBQUEUED); // TODO
+
+ if (ret == 0)
+ ret = transferred;
+
+ return ret;
+}
+
+/*
+ * Do direct I/O.
+ */
+static ssize_t afs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
+ VM_BUG_ON(iov_iter_count(iter) != PAGE_SIZE);
+
+ if (iov_iter_rw(iter) == READ)
+ return afs_file_direct_read(iocb, iter);
+ return afs_file_direct_write(iocb, iter);
+}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 7c7598856b96..38b48382db52 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1111,6 +1111,7 @@ extern int afs_fsync(struct file *, loff_t, loff_t, int);
extern vm_fault_t afs_page_mkwrite(struct vm_fault *vmf);
extern void afs_prune_wb_keys(struct afs_vnode *);
extern int afs_launder_page(struct page *);
+extern ssize_t afs_file_direct_write(struct kiocb *, struct iov_iter *);
/*
* xattr.c
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 9f035386b9c7..46a3a911c682 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -874,3 +874,11 @@ int afs_launder_page(struct page *page)
#endif
return ret;
}
+
+/*
+ * Direct file write operation for an AFS file.
+ */
+ssize_t afs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter)
+{
+ return -EOPNOTSUPP;
+}
^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-09-13 17:35 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-06 13:16 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
2018-08-06 13:16 ` [PATCH 01/10] iov_iter: Separate type from direction and use accessor functions David Howells
2018-08-06 13:16 ` [PATCH 02/10] iov_iter: Renumber the ITER_* constants in uio.h David Howells
2018-08-06 13:16 ` [PATCH 03/10] iov_iter: Make count and iov_offset loff_t not size_t David Howells
2018-08-06 13:17 ` [PATCH 04/10] iov_iter: Add mapping and discard iterator types David Howells
2018-08-06 13:17 ` [PATCH 05/10] afs: Better tracing of protocol errors David Howells
2018-08-06 13:17 ` [PATCH 06/10] afs: Set up the iov_iter before calling afs_extract_data() David Howells
2018-08-06 13:17 ` [PATCH 07/10] afs: Use ITER_MAPPING for writing David Howells
2018-08-06 13:17 ` [PATCH 08/10] afs: Add O_DIRECT read support David Howells
2018-08-06 13:17 ` [PATCH 09/10] afs: Add a couple of tracepoints to log I/O errors David Howells
2018-08-06 13:17 ` [PATCH 10/10] afs: Don't invoke the server to read data beyond EOF David Howells
2018-09-13 15:51 [PATCH 00/10] iov_iter: Add new iters and use with AFS David Howells
2018-09-13 15:52 ` [PATCH 08/10] afs: Add O_DIRECT read support David Howells
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).