* [Patch v3 00/16] CIFS: add support for direct I/O
@ 2018-09-08 2:13 Long Li
2018-09-08 2:13 ` [Patch v3 01/16] CIFS: Add support for direct pages in rdata Long Li
` (16 more replies)
0 siblings, 17 replies; 19+ messages in thread
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
This patch set implements direct I/O.
In the normal code path (even with cache=none), CIFS copies I/O data from
user space to kernel space for security reasons, since the protocol may
require signing and encryption of user data.
With this patch set, CIFS passes the I/O data directly from the user-space
buffer to the transport layer when the file system is mounted with
"cache=none".
Patch v2 addressed comments from Christoph Hellwig <hch@lst.de> and
Tom Talpey <ttalpey@microsoft.com> to implement direct I/O for both
socket and RDMA.
Patch v3 added support for kernel AIO.
Long Li (16):
CIFS: Add support for direct pages in rdata
CIFS: Use offset when reading pages
CIFS: Add support for direct pages in wdata
CIFS: pass page offset when issuing SMB write
CIFS: Calculate the correct request length based on page offset and
tail size
CIFS: Introduce helper function to get page offset and length in
smb_rqst
CIFS: When sending data on socket, pass the correct page offset
CIFS: SMBD: Support page offset in RDMA send
CIFS: SMBD: Support page offset in RDMA recv
CIFS: SMBD: Do not call ib_dereg_mr on invalidated memory registration
CIFS: SMBD: Support page offset in memory registration
CIFS: Pass page offset for calculating signature
CIFS: Pass page offset for encrypting
CIFS: Add support for direct I/O read
CIFS: Add support for direct I/O write
CIFS: Add direct I/O functions to file_operations
fs/cifs/cifsencrypt.c | 9 +-
fs/cifs/cifsfs.c | 10 +-
fs/cifs/cifsfs.h | 2 +
fs/cifs/cifsglob.h | 11 +-
fs/cifs/cifsproto.h | 9 +-
fs/cifs/cifssmb.c | 19 +-
fs/cifs/connect.c | 5 +-
fs/cifs/file.c | 477 ++++++++++++++++++++++++++++++++++++++++++--------
fs/cifs/misc.c | 17 ++
fs/cifs/smb2ops.c | 22 ++-
fs/cifs/smb2pdu.c | 20 ++-
fs/cifs/smbdirect.c | 156 ++++++++++-------
fs/cifs/smbdirect.h | 2 +-
fs/cifs/transport.c | 34 ++--
14 files changed, 606 insertions(+), 187 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 19+ messages in thread
* [Patch v3 01/16] CIFS: Add support for direct pages in rdata
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
Add a function to allocate rdata without allocating pages for data
transfer. This gives the caller an option to pass a number of pages
that point to the data buffer.
rdata is responsible for freeing those pages after it's done.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsglob.h | 3 +--
fs/cifs/file.c | 23 ++++++++++++++++++++---
2 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 80a34ce..166e140 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1173,14 +1173,13 @@ struct cifs_readdata {
struct kvec iov[2];
#ifdef CONFIG_CIFS_SMB_DIRECT
struct smbd_mr *mr;
- struct page **direct_pages;
#endif
unsigned int pagesz;
unsigned int page_offset;
unsigned int tailsz;
unsigned int credits;
unsigned int nr_pages;
- struct page *pages[];
+ struct page **pages;
};
struct cifs_writedata;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 23fd430..1c98293 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2880,13 +2880,13 @@ cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from)
}
static struct cifs_readdata *
-cifs_readdata_alloc(unsigned int nr_pages, work_func_t complete)
+cifs_readdata_direct_alloc(struct page **pages, work_func_t complete)
{
struct cifs_readdata *rdata;
- rdata = kzalloc(sizeof(*rdata) + (sizeof(struct page *) * nr_pages),
- GFP_KERNEL);
+ rdata = kzalloc(sizeof(*rdata), GFP_KERNEL);
if (rdata != NULL) {
+ rdata->pages = pages;
kref_init(&rdata->refcount);
INIT_LIST_HEAD(&rdata->list);
init_completion(&rdata->done);
@@ -2896,6 +2896,22 @@ cifs_readdata_alloc(unsigned int nr_pages, work_func_t complete)
return rdata;
}
+static struct cifs_readdata *
+cifs_readdata_alloc(unsigned int nr_pages, work_func_t complete)
+{
+ struct page **pages =
+ kzalloc(sizeof(struct page *) * nr_pages, GFP_KERNEL);
+ struct cifs_readdata *ret = NULL;
+
+ if (pages) {
+ ret = cifs_readdata_direct_alloc(pages, complete);
+ if (!ret)
+ kfree(pages);
+ }
+
+ return ret;
+}
+
void
cifs_readdata_release(struct kref *refcount)
{
@@ -2910,6 +2926,7 @@ cifs_readdata_release(struct kref *refcount)
if (rdata->cfile)
cifsFileInfo_put(rdata->cfile);
+ kvfree(rdata->pages);
kfree(rdata);
}
--
2.7.4
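The allocation split in this patch can be sketched in user space as follows.
This is only an illustration under stated assumptions: rdata_stub, page_stub,
rdata_direct_alloc(), rdata_alloc() and rdata_release() are hypothetical
stand-ins, and calloc()/free() replace kzalloc()/kfree()/kvfree(). The point
it tries to show is the ownership rule: the "direct" variant adopts a
caller-supplied page array, the plain variant allocates the array itself and
frees it if the second allocation fails, and release frees both.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct page and struct cifs_readdata. */
struct page_stub { int unused; };

struct rdata_stub {
	struct page_stub **pages;	/* separately allocated, owned by rdata */
};

/* Adopt a caller-supplied page array; ownership moves to rdata on success. */
static struct rdata_stub *rdata_direct_alloc(struct page_stub **pages)
{
	struct rdata_stub *rdata = calloc(1, sizeof(*rdata));

	if (rdata)
		rdata->pages = pages;
	return rdata;
}

/* Allocate the page array first, then defer to the direct variant. */
static struct rdata_stub *rdata_alloc(unsigned int nr_pages)
{
	struct page_stub **pages = calloc(nr_pages, sizeof(*pages));
	struct rdata_stub *rdata = NULL;

	if (pages) {
		rdata = rdata_direct_alloc(pages);
		if (!rdata)
			free(pages);	/* do not leak the array on failure */
	}
	return rdata;
}

/* Release frees the page array together with rdata itself. */
static void rdata_release(struct rdata_stub *rdata)
{
	if (!rdata)
		return;
	free(rdata->pages);
	free(rdata);
}
```

A caller that already holds a page array would call rdata_direct_alloc() and
must not free the array itself once the call has succeeded.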
* [Patch v3 02/16] CIFS: Use offset when reading pages
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
With an offset defined in rdata, the transport functions need to honor this
offset so that data is read into the correct place in the pages.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsproto.h | 4 +++-
fs/cifs/cifssmb.c | 1 +
fs/cifs/connect.c | 5 +++--
fs/cifs/file.c | 52 +++++++++++++++++++++++++++++++++++++---------------
fs/cifs/smb2ops.c | 2 +-
fs/cifs/smb2pdu.c | 1 +
6 files changed, 46 insertions(+), 19 deletions(-)
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index dc80f84..1f27d8e 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -203,7 +203,9 @@ extern void dequeue_mid(struct mid_q_entry *mid, bool malformed);
extern int cifs_read_from_socket(struct TCP_Server_Info *server, char *buf,
unsigned int to_read);
extern int cifs_read_page_from_socket(struct TCP_Server_Info *server,
- struct page *page, unsigned int to_read);
+ struct page *page,
+ unsigned int page_offset,
+ unsigned int to_read);
extern int cifs_setup_cifs_sb(struct smb_vol *pvolume_info,
struct cifs_sb_info *cifs_sb);
extern int cifs_match_super(struct super_block *, void *);
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index c8e4278..a1af258 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -1596,6 +1596,7 @@ cifs_readv_callback(struct mid_q_entry *mid)
struct smb_rqst rqst = { .rq_iov = rdata->iov,
.rq_nvec = 2,
.rq_pages = rdata->pages,
+ .rq_offset = rdata->page_offset,
.rq_npages = rdata->nr_pages,
.rq_pagesz = rdata->pagesz,
.rq_tailsz = rdata->tailsz };
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 83b0234..8501da0 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -594,10 +594,11 @@ cifs_read_from_socket(struct TCP_Server_Info *server, char *buf,
int
cifs_read_page_from_socket(struct TCP_Server_Info *server, struct page *page,
- unsigned int to_read)
+ unsigned int page_offset, unsigned int to_read)
{
struct msghdr smb_msg;
- struct bio_vec bv = {.bv_page = page, .bv_len = to_read};
+ struct bio_vec bv = {
+ .bv_page = page, .bv_len = to_read, .bv_offset = page_offset};
iov_iter_bvec(&smb_msg.msg_iter, READ | ITER_BVEC, &bv, 1, to_read);
return cifs_readv_from_socket(server, &smb_msg);
}
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 1c98293..87eece6 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -3026,12 +3026,20 @@ uncached_fill_pages(struct TCP_Server_Info *server,
int result = 0;
unsigned int i;
unsigned int nr_pages = rdata->nr_pages;
+ unsigned int page_offset = rdata->page_offset;
rdata->got_bytes = 0;
rdata->tailsz = PAGE_SIZE;
for (i = 0; i < nr_pages; i++) {
struct page *page = rdata->pages[i];
size_t n;
+ unsigned int segment_size = rdata->pagesz;
+
+ if (i == 0)
+ segment_size -= page_offset;
+ else
+ page_offset = 0;
+
if (len <= 0) {
/* no need to hold page hostage */
@@ -3040,24 +3048,25 @@ uncached_fill_pages(struct TCP_Server_Info *server,
put_page(page);
continue;
}
+
n = len;
- if (len >= PAGE_SIZE) {
+ if (len >= segment_size)
/* enough data to fill the page */
- n = PAGE_SIZE;
- len -= n;
- } else {
- zero_user(page, len, PAGE_SIZE - len);
+ n = segment_size;
+ else
rdata->tailsz = len;
- len = 0;
- }
+ len -= n;
+
if (iter)
- result = copy_page_from_iter(page, 0, n, iter);
+ result = copy_page_from_iter(
+ page, page_offset, n, iter);
#ifdef CONFIG_CIFS_SMB_DIRECT
else if (rdata->mr)
result = n;
#endif
else
- result = cifs_read_page_from_socket(server, page, n);
+ result = cifs_read_page_from_socket(
+ server, page, page_offset, n);
if (result < 0)
break;
@@ -3130,6 +3139,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
rdata->bytes = cur_len;
rdata->pid = pid;
rdata->pagesz = PAGE_SIZE;
+ rdata->tailsz = PAGE_SIZE;
rdata->read_into_pages = cifs_uncached_read_into_pages;
rdata->copy_into_pages = cifs_uncached_copy_into_pages;
rdata->credits = credits;
@@ -3574,6 +3584,7 @@ readpages_fill_pages(struct TCP_Server_Info *server,
u64 eof;
pgoff_t eof_index;
unsigned int nr_pages = rdata->nr_pages;
+ unsigned int page_offset = rdata->page_offset;
/* determine the eof that the server (probably) has */
eof = CIFS_I(rdata->mapping->host)->server_eof;
@@ -3584,13 +3595,21 @@ readpages_fill_pages(struct TCP_Server_Info *server,
rdata->tailsz = PAGE_SIZE;
for (i = 0; i < nr_pages; i++) {
struct page *page = rdata->pages[i];
- size_t n = PAGE_SIZE;
+ unsigned int to_read = rdata->pagesz;
+ size_t n;
+
+ if (i == 0)
+ to_read -= page_offset;
+ else
+ page_offset = 0;
+
+ n = to_read;
- if (len >= PAGE_SIZE) {
- len -= PAGE_SIZE;
+ if (len >= to_read) {
+ len -= to_read;
} else if (len > 0) {
/* enough for partial page, fill and zero the rest */
- zero_user(page, len, PAGE_SIZE - len);
+ zero_user(page, len + page_offset, to_read - len);
n = rdata->tailsz = len;
len = 0;
} else if (page->index > eof_index) {
@@ -3622,13 +3641,15 @@ readpages_fill_pages(struct TCP_Server_Info *server,
}
if (iter)
- result = copy_page_from_iter(page, 0, n, iter);
+ result = copy_page_from_iter(
+ page, page_offset, n, iter);
#ifdef CONFIG_CIFS_SMB_DIRECT
else if (rdata->mr)
result = n;
#endif
else
- result = cifs_read_page_from_socket(server, page, n);
+ result = cifs_read_page_from_socket(
+ server, page, page_offset, n);
if (result < 0)
break;
@@ -3807,6 +3828,7 @@ static int cifs_readpages(struct file *file, struct address_space *mapping,
rdata->bytes = bytes;
rdata->pid = pid;
rdata->pagesz = PAGE_SIZE;
+ rdata->tailsz = PAGE_SIZE;
rdata->read_into_pages = cifs_readpages_read_into_pages;
rdata->copy_into_pages = cifs_readpages_copy_into_pages;
rdata->credits = credits;
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 7c0edd2..1fa1c29 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -2467,7 +2467,7 @@ read_data_into_pages(struct TCP_Server_Info *server, struct page **pages,
zero_user(page, len, PAGE_SIZE - len);
len = 0;
}
- length = cifs_read_page_from_socket(server, page, n);
+ length = cifs_read_page_from_socket(server, page, 0, n);
if (length < 0)
return length;
server->total_read += length;
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 47d5331..6c22da8 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2683,6 +2683,7 @@ smb2_readv_callback(struct mid_q_entry *mid)
struct smb_rqst rqst = { .rq_iov = rdata->iov,
.rq_nvec = 2,
.rq_pages = rdata->pages,
+ .rq_offset = rdata->page_offset,
.rq_npages = rdata->nr_pages,
.rq_pagesz = rdata->pagesz,
.rq_tailsz = rdata->tailsz };
--
2.7.4
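The per-page arithmetic that this patch adds to uncached_fill_pages() can be
sketched as a standalone function. This is a simplified user-space model, not
the kernel code: SKETCH_PAGE_SIZE, struct segment and split_into_pages() are
hypothetical names, and the sketch simply stops when the data runs out instead
of releasing the unused pages as the kernel loop does.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */

struct segment {
	unsigned int offset;	/* where in the page the data starts */
	size_t len;		/* how many bytes land in this page */
};

/*
 * Spread len bytes over up to nr_pages pages. Only the first page starts at
 * page_offset; all later pages start at 0. Returns the number of pages that
 * receive data and stores the size of the last, partial segment in *tailsz
 * (SKETCH_PAGE_SIZE if no segment was partial).
 */
static unsigned int split_into_pages(size_t len, unsigned int page_offset,
				     unsigned int nr_pages,
				     struct segment *segs,
				     unsigned int *tailsz)
{
	unsigned int i, used = 0;

	*tailsz = SKETCH_PAGE_SIZE;
	for (i = 0; i < nr_pages && len > 0; i++) {
		unsigned int segment_size = SKETCH_PAGE_SIZE;
		size_t n = len;

		if (i == 0)
			segment_size -= page_offset;
		else
			page_offset = 0;

		if (len >= segment_size)
			n = segment_size;	/* enough data to fill the segment */
		else
			*tailsz = (unsigned int)len;
		len -= n;

		segs[i].offset = page_offset;
		segs[i].len = n;
		used++;
	}
	return used;
}
```

For example, 5000 bytes starting at offset 1000 occupy 3096 bytes of the first
page and 1904 bytes of the second, so the tail size becomes 1904.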
* [Patch v3 03/16] CIFS: Add support for direct pages in wdata
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
Add a function to allocate wdata without allocating pages for data
transfer. This gives the caller an option to pass a number of pages that
point to the data buffer to write to.
wdata is responsible for freeing those pages after it's done.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsglob.h | 3 +--
fs/cifs/cifsproto.h | 2 ++
fs/cifs/cifssmb.c | 17 ++++++++++++++---
3 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 166e140..7f62c98 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1199,14 +1199,13 @@ struct cifs_writedata {
int result;
#ifdef CONFIG_CIFS_SMB_DIRECT
struct smbd_mr *mr;
- struct page **direct_pages;
#endif
unsigned int pagesz;
unsigned int page_offset;
unsigned int tailsz;
unsigned int credits;
unsigned int nr_pages;
- struct page *pages[];
+ struct page **pages;
};
/*
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 1f27d8e..7933c5f 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -533,6 +533,8 @@ int cifs_async_writev(struct cifs_writedata *wdata,
void cifs_writev_complete(struct work_struct *work);
struct cifs_writedata *cifs_writedata_alloc(unsigned int nr_pages,
work_func_t complete);
+struct cifs_writedata *cifs_writedata_direct_alloc(struct page **pages,
+ work_func_t complete);
void cifs_writedata_release(struct kref *refcount);
int cifs_query_mf_symlink(unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb,
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index a1af258..503e0ed 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -1953,6 +1953,7 @@ cifs_writedata_release(struct kref *refcount)
if (wdata->cfile)
cifsFileInfo_put(wdata->cfile);
+ kvfree(wdata->pages);
kfree(wdata);
}
@@ -2076,12 +2077,22 @@ cifs_writev_complete(struct work_struct *work)
struct cifs_writedata *
cifs_writedata_alloc(unsigned int nr_pages, work_func_t complete)
{
+ struct page **pages =
+ kzalloc(sizeof(struct page *) * nr_pages, GFP_NOFS);
+ if (pages)
+ return cifs_writedata_direct_alloc(pages, complete);
+
+ return NULL;
+}
+
+struct cifs_writedata *
+cifs_writedata_direct_alloc(struct page **pages, work_func_t complete)
+{
struct cifs_writedata *wdata;
- /* writedata + number of page pointers */
- wdata = kzalloc(sizeof(*wdata) +
- sizeof(struct page *) * nr_pages, GFP_NOFS);
+ wdata = kzalloc(sizeof(*wdata), GFP_NOFS);
if (wdata != NULL) {
+ wdata->pages = pages;
kref_init(&wdata->refcount);
INIT_LIST_HEAD(&wdata->list);
init_completion(&wdata->done);
--
2.7.4
* [Patch v3 04/16] CIFS: pass page offset when issuing SMB write
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
When issuing SMB writes, pass the page offset of the write data along to the
transport.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifssmb.c | 1 +
fs/cifs/smb2pdu.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index 503e0ed..0a57c61 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -2200,6 +2200,7 @@ cifs_async_writev(struct cifs_writedata *wdata,
rqst.rq_iov = iov;
rqst.rq_nvec = 2;
rqst.rq_pages = wdata->pages;
+ rqst.rq_offset = wdata->page_offset;
rqst.rq_npages = wdata->nr_pages;
rqst.rq_pagesz = wdata->pagesz;
rqst.rq_tailsz = wdata->tailsz;
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 6c22da8..f603fbe 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -3046,6 +3046,7 @@ smb2_async_writev(struct cifs_writedata *wdata,
rqst.rq_iov = iov;
rqst.rq_nvec = 2;
rqst.rq_pages = wdata->pages;
+ rqst.rq_offset = wdata->page_offset;
rqst.rq_npages = wdata->nr_pages;
rqst.rq_pagesz = wdata->pagesz;
rqst.rq_tailsz = wdata->tailsz;
--
2.7.4
* [Patch v3 05/16] CIFS: Calculate the correct request length based on page offset and tail size
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
The page offset of the pages in a request may be non-zero. Change rqst_len()
to calculate the correct data buffer length in that case.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/transport.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 927226a..d6b5523 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -212,10 +212,24 @@ rqst_len(struct smb_rqst *rqst)
for (i = 0; i < rqst->rq_nvec; i++)
buflen += iov[i].iov_len;
- /* add in the page array if there is one */
+ /*
+ * Add in the page array if there is one. The caller needs to make
+ * sure rq_offset and rq_tailsz are set correctly. If a buffer of
+ * multiple pages ends at page boundary, rq_tailsz needs to be set to
+ * PAGE_SIZE.
+ */
if (rqst->rq_npages) {
- buflen += rqst->rq_pagesz * (rqst->rq_npages - 1);
- buflen += rqst->rq_tailsz;
+ if (rqst->rq_npages == 1)
+ buflen += rqst->rq_tailsz;
+ else {
+ /*
+ * If there is more than one page, calculate the
+ * buffer length based on rq_offset and rq_tailsz
+ */
+ buflen += rqst->rq_pagesz * (rqst->rq_npages - 1) -
+ rqst->rq_offset;
+ buflen += rqst->rq_tailsz;
+ }
}
return buflen;
--
2.7.4
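The page-array part of the length calculation above can be checked with a
small user-space model. struct rqst_stub and rqst_pages_len() are hypothetical
names that mirror only the four smb_rqst fields used here; the kvec part of
the length is left out. As in the patch, the caller is assumed to set
rq_tailsz to the full page size when a multi-page buffer ends on a page
boundary.

```c
#include <assert.h>

/* Hypothetical subset of struct smb_rqst, enough for the length math. */
struct rqst_stub {
	unsigned int rq_npages;
	unsigned int rq_pagesz;
	unsigned int rq_offset;
	unsigned int rq_tailsz;
};

/* Number of data bytes carried by the page array of a request. */
static unsigned long rqst_pages_len(const struct rqst_stub *rqst)
{
	unsigned long buflen = 0;

	if (rqst->rq_npages == 1) {
		/* a single page carries exactly rq_tailsz bytes */
		buflen = rqst->rq_tailsz;
	} else if (rqst->rq_npages > 1) {
		/* all pages but the last, minus the unused head of page 0 */
		buflen = (unsigned long)rqst->rq_pagesz *
			 (rqst->rq_npages - 1) - rqst->rq_offset;
		buflen += rqst->rq_tailsz;
	}
	return buflen;
}
```

With two 4096-byte pages, rq_offset = 1000 and rq_tailsz = 1904, this gives
4096 - 1000 + 1904 = 5000 bytes. The pre-patch formula would have reported
6000 bytes for the same request, because it ignored the offset.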
* [Patch v3 06/16] CIFS: Introduce helper function to get page offset and length in smb_rqst
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
Introduce a helper, rqst_page_get_length(), that returns the page offset and
length for a given page in an smb_rqst. The following patches use this
helper.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsproto.h | 3 +++
fs/cifs/misc.c | 17 +++++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 7933c5f..89dda14 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -557,4 +557,7 @@ int cifs_alloc_hash(const char *name, struct crypto_shash **shash,
struct sdesc **sdesc);
void cifs_free_hash(struct crypto_shash **shash, struct sdesc **sdesc);
+extern void rqst_page_get_length(struct smb_rqst *rqst, unsigned int page,
+ unsigned int *len, unsigned int *offset);
+
#endif /* _CIFSPROTO_H */
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index 96849b5..e951417 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -905,3 +905,20 @@ cifs_free_hash(struct crypto_shash **shash, struct sdesc **sdesc)
crypto_free_shash(*shash);
*shash = NULL;
}
+
+/**
+ * rqst_page_get_length - obtain the length and offset for a page in smb_rqst
+ * Input: rqst - a smb_rqst, page - a page index for rqst
+ * Output: *len - the length for this page, *offset - the offset for this page
+ */
+void rqst_page_get_length(struct smb_rqst *rqst, unsigned int page,
+ unsigned int *len, unsigned int *offset)
+{
+ *len = rqst->rq_pagesz;
+ *offset = (page == 0) ? rqst->rq_offset : 0;
+
+ if (rqst->rq_npages == 1 || page == rqst->rq_npages-1)
+ *len = rqst->rq_tailsz;
+ else if (page == 0)
+ *len = rqst->rq_pagesz - rqst->rq_offset;
+}
--
2.7.4
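To see how the helper behaves across a whole request, the sketch below
re-implements its logic in user space and walks every page. struct rqst_stub,
page_get_length() and sum_page_lengths() are hypothetical names for
illustration; only the branch structure is taken from the patch.

```c
#include <assert.h>

/* Hypothetical subset of struct smb_rqst used by the helper. */
struct rqst_stub {
	unsigned int rq_npages;
	unsigned int rq_pagesz;
	unsigned int rq_offset;
	unsigned int rq_tailsz;
};

/* User-space copy of the helper's logic: length and offset of one page. */
static void page_get_length(const struct rqst_stub *rqst, unsigned int page,
			    unsigned int *len, unsigned int *offset)
{
	*len = rqst->rq_pagesz;
	*offset = (page == 0) ? rqst->rq_offset : 0;

	if (rqst->rq_npages == 1 || page == rqst->rq_npages - 1)
		*len = rqst->rq_tailsz;		/* last (or only) page */
	else if (page == 0)
		*len = rqst->rq_pagesz - rqst->rq_offset; /* first of several */
}

/* Walk all pages and add up the per-page lengths. */
static unsigned long sum_page_lengths(const struct rqst_stub *rqst)
{
	unsigned long total = 0;
	unsigned int i, len, offset;

	for (i = 0; i < rqst->rq_npages; i++) {
		page_get_length(rqst, i, &len, &offset);
		total += len;
	}
	return total;
}
```

For a request with three 4096-byte pages, rq_offset = 100 and
rq_tailsz = 4096, the pages yield 3996, 4096 and 4096 bytes, and only the
first page has a non-zero offset.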
* [Patch v3 07/16] CIFS: When sending data on socket, pass the correct page offset
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
The offset in the page to send may be non-zero. Change the function to pass
this offset to the socket.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/transport.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index d6b5523..5c96ee8 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -288,15 +288,13 @@ __smb_send_rqst(struct TCP_Server_Info *server, struct smb_rqst *rqst)
/* now walk the page array and send each page in it */
for (i = 0; i < rqst->rq_npages; i++) {
- size_t len = i == rqst->rq_npages - 1
- ? rqst->rq_tailsz
- : rqst->rq_pagesz;
- struct bio_vec bvec = {
- .bv_page = rqst->rq_pages[i],
- .bv_len = len
- };
+ struct bio_vec bvec;
+
+ bvec.bv_page = rqst->rq_pages[i];
+ rqst_page_get_length(rqst, i, &bvec.bv_len, &bvec.bv_offset);
+
iov_iter_bvec(&smb_msg.msg_iter, WRITE | ITER_BVEC,
- &bvec, 1, len);
+ &bvec, 1, bvec.bv_len);
rc = smb_send_kvec(server, &smb_msg, &sent);
if (rc < 0)
break;
--
2.7.4
* [Patch v3 08/16] CIFS: SMBD: Support page offset in RDMA send
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
The RDMA send function needs to look at the offset in the request pages and
send data starting from there.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/smbdirect.c | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index c62f7c9..6141e3c 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -17,6 +17,7 @@
#include <linux/highmem.h>
#include "smbdirect.h"
#include "cifs_debug.h"
+#include "cifsproto.h"
static struct smbd_response *get_empty_queue_buffer(
struct smbd_connection *info);
@@ -2082,7 +2083,7 @@ int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst)
struct kvec vec;
int nvecs;
int size;
- int buflen = 0, remaining_data_length;
+ unsigned int buflen = 0, remaining_data_length;
int start, i, j;
int max_iov_size =
info->max_send_size - sizeof(struct smbd_data_transfer);
@@ -2113,10 +2114,17 @@ int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst)
buflen += iov[i].iov_len;
}
- /* add in the page array if there is one */
+ /*
+ * Add in the page array if there is one. The caller needs to set
+ * rq_tailsz to PAGE_SIZE when the buffer has multiple pages and
+ * ends at page boundary
+ */
if (rqst->rq_npages) {
- buflen += rqst->rq_pagesz * (rqst->rq_npages - 1);
- buflen += rqst->rq_tailsz;
+ if (rqst->rq_npages == 1)
+ buflen += rqst->rq_tailsz;
+ else
+ buflen += rqst->rq_pagesz * (rqst->rq_npages - 1) -
+ rqst->rq_offset + rqst->rq_tailsz;
}
if (buflen + sizeof(struct smbd_data_transfer) >
@@ -2213,8 +2221,9 @@ int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst)
/* now sending pages if there are any */
for (i = 0; i < rqst->rq_npages; i++) {
- buflen = (i == rqst->rq_npages-1) ?
- rqst->rq_tailsz : rqst->rq_pagesz;
+ unsigned int offset;
+
+ rqst_page_get_length(rqst, i, &buflen, &offset);
nvecs = (buflen + max_iov_size - 1) / max_iov_size;
log_write(INFO, "sending pages buflen=%d nvecs=%d\n",
buflen, nvecs);
@@ -2225,9 +2234,11 @@ int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst)
remaining_data_length -= size;
log_write(INFO, "sending pages i=%d offset=%d size=%d"
" remaining_data_length=%d\n",
- i, j*max_iov_size, size, remaining_data_length);
+ i, j*max_iov_size+offset, size,
+ remaining_data_length);
rc = smbd_post_send_page(
- info, rqst->rq_pages[i], j*max_iov_size,
+ info, rqst->rq_pages[i],
+ j*max_iov_size + offset,
size, remaining_data_length);
if (rc)
goto done;
--
2.7.4
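The chunking of one page segment into sends of at most max_iov_size bytes can
be modelled as below. struct chunk, nr_chunks() and chunk_at() are
hypothetical names. The rule that only the last chunk is shorter than
max_iov_size is an assumption about the surrounding loop, which the hunks
above do not show in full.

```c
#include <assert.h>

struct chunk {
	unsigned int offset;	/* byte offset inside the page */
	unsigned int size;	/* bytes sent in this chunk */
};

/* Number of sends needed for buflen bytes, rounding up. */
static unsigned int nr_chunks(unsigned int buflen, unsigned int max_iov_size)
{
	return (buflen + max_iov_size - 1) / max_iov_size;
}

/*
 * Describe the j-th chunk of a page segment that holds buflen bytes and
 * starts at page_offset within its page.
 */
static struct chunk chunk_at(unsigned int buflen, unsigned int page_offset,
			     unsigned int max_iov_size, unsigned int j)
{
	struct chunk c;
	unsigned int nvecs = nr_chunks(buflen, max_iov_size);

	c.size = max_iov_size;
	if (j == nvecs - 1)
		c.size = buflen - j * max_iov_size;	/* last chunk: remainder */
	c.offset = j * max_iov_size + page_offset;
	return c;
}
```

With buflen = 3996, page_offset = 100 and max_iov_size = 1364, the three
chunks start at offsets 100, 1464 and 2828, and the last one ends exactly at
byte 4096 of the page.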
* [Patch v3 09/16] CIFS: SMBD: Support page offset in RDMA recv
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
The RDMA recv function needs to place data at the correct location in the
page, starting at the page offset.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/smbdirect.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index 6141e3c..ba53c52 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -2004,10 +2004,12 @@ static int smbd_recv_buf(struct smbd_connection *info, char *buf,
* return value: actual data read
*/
static int smbd_recv_page(struct smbd_connection *info,
- struct page *page, unsigned int to_read)
+ struct page *page, unsigned int page_offset,
+ unsigned int to_read)
{
int ret;
char *to_address;
+ void *page_address;
/* make sure we have the page ready for read */
ret = wait_event_interruptible(
@@ -2015,16 +2017,17 @@ static int smbd_recv_page(struct smbd_connection *info,
info->reassembly_data_length >= to_read ||
info->transport_status != SMBD_CONNECTED);
if (ret)
- return 0;
+ return ret;
/* now we can read from reassembly queue and not sleep */
- to_address = kmap_atomic(page);
+ page_address = kmap_atomic(page);
+ to_address = (char *) page_address + page_offset;
log_read(INFO, "reading from page=%p address=%p to_read=%d\n",
page, to_address, to_read);
ret = smbd_recv_buf(info, to_address, to_read);
- kunmap_atomic(to_address);
+ kunmap_atomic(page_address);
return ret;
}
@@ -2038,7 +2041,7 @@ int smbd_recv(struct smbd_connection *info, struct msghdr *msg)
{
char *buf;
struct page *page;
- unsigned int to_read;
+ unsigned int to_read, page_offset;
int rc;
info->smbd_recv_pending++;
@@ -2052,15 +2055,16 @@ int smbd_recv(struct smbd_connection *info, struct msghdr *msg)
case READ | ITER_BVEC:
page = msg->msg_iter.bvec->bv_page;
+ page_offset = msg->msg_iter.bvec->bv_offset;
to_read = msg->msg_iter.bvec->bv_len;
- rc = smbd_recv_page(info, page, to_read);
+ rc = smbd_recv_page(info, page, page_offset, to_read);
break;
default:
/* It's a bug in upper layer to get there */
cifs_dbg(VFS, "CIFS: invalid msg type %d\n",
msg->msg_iter.type);
- rc = -EIO;
+ rc = -EINVAL;
}
info->smbd_recv_pending--;
--
2.7.4
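The address handling in smbd_recv_page() follows a common pattern: keep the
base address of the mapping for the later unmap, and derive the destination
from it by adding the page offset. A user-space sketch of that pattern is
shown below. recv_into_page() and SKETCH_PAGE_SIZE are hypothetical, memcpy()
stands in for reading from the reassembly queue, and the bounds check is an
addition for the sketch that the patch itself does not contain.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/*
 * Copy to_read bytes from src into a page-sized buffer, starting at
 * page_offset. page_address stays untouched so the caller can still hand
 * the base address to whatever releases the mapping.
 * Returns the number of bytes copied, or -1 if the range does not fit.
 */
static int recv_into_page(unsigned char *page_address,
			  unsigned int page_offset,
			  const unsigned char *src, unsigned int to_read)
{
	unsigned char *to_address;

	if (page_offset > SKETCH_PAGE_SIZE ||
	    to_read > SKETCH_PAGE_SIZE - page_offset)
		return -1;

	to_address = page_address + page_offset;
	memcpy(to_address, src, to_read);
	return (int)to_read;
}
```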
* [Patch v3 10/16] CIFS: SMBD: Do not call ib_dereg_mr on invalidated memory registration
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
It is not necessary to deregister a memory registration after it has been
successfully invalidated.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/smbdirect.c | 82 ++++++++++++++++++++++++++---------------------------
1 file changed, 41 insertions(+), 41 deletions(-)
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index ba53c52..b470cd0 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -2296,50 +2296,50 @@ static void smbd_mr_recovery_work(struct work_struct *work)
int rc;
list_for_each_entry(smbdirect_mr, &info->mr_list, list) {
- if (smbdirect_mr->state == MR_INVALIDATED ||
- smbdirect_mr->state == MR_ERROR) {
-
- if (smbdirect_mr->state == MR_INVALIDATED) {
- ib_dma_unmap_sg(
- info->id->device, smbdirect_mr->sgl,
- smbdirect_mr->sgl_count,
- smbdirect_mr->dir);
- smbdirect_mr->state = MR_READY;
- } else if (smbdirect_mr->state == MR_ERROR) {
-
- /* recover this MR entry */
- rc = ib_dereg_mr(smbdirect_mr->mr);
- if (rc) {
- log_rdma_mr(ERR,
- "ib_dereg_mr failed rc=%x\n",
- rc);
- smbd_disconnect_rdma_connection(info);
- }
+ if (smbdirect_mr->state == MR_INVALIDATED)
+ ib_dma_unmap_sg(
+ info->id->device, smbdirect_mr->sgl,
+ smbdirect_mr->sgl_count,
+ smbdirect_mr->dir);
+ else if (smbdirect_mr->state == MR_ERROR) {
+
+ /* recover this MR entry */
+ rc = ib_dereg_mr(smbdirect_mr->mr);
+ if (rc) {
+ log_rdma_mr(ERR,
+ "ib_dereg_mr failed rc=%x\n",
+ rc);
+ smbd_disconnect_rdma_connection(info);
+ continue;
+ }
- smbdirect_mr->mr = ib_alloc_mr(
- info->pd, info->mr_type,
+ smbdirect_mr->mr = ib_alloc_mr(
+ info->pd, info->mr_type,
+ info->max_frmr_depth);
+ if (IS_ERR(smbdirect_mr->mr)) {
+ log_rdma_mr(ERR,
+ "ib_alloc_mr failed mr_type=%x "
+ "max_frmr_depth=%x\n",
+ info->mr_type,
info->max_frmr_depth);
- if (IS_ERR(smbdirect_mr->mr)) {
- log_rdma_mr(ERR,
- "ib_alloc_mr failed mr_type=%x "
- "max_frmr_depth=%x\n",
- info->mr_type,
- info->max_frmr_depth);
- smbd_disconnect_rdma_connection(info);
- }
-
- smbdirect_mr->state = MR_READY;
+ smbd_disconnect_rdma_connection(info);
+ continue;
}
- /* smbdirect_mr->state is updated by this function
- * and is read and updated by I/O issuing CPUs trying
- * to get a MR, the call to atomic_inc_return
- * implicates a memory barrier and guarantees this
- * value is updated before waking up any calls to
- * get_mr() from the I/O issuing CPUs
- */
- if (atomic_inc_return(&info->mr_ready_count) == 1)
- wake_up_interruptible(&info->wait_mr);
- }
+ } else
+ /* This MR is being used, don't recover it */
+ continue;
+
+ smbdirect_mr->state = MR_READY;
+
+ /* smbdirect_mr->state is updated by this function
+ * and is read and updated by I/O issuing CPUs trying
+ * to get a MR, the call to atomic_inc_return
+ * implicates a memory barrier and guarantees this
+ * value is updated before waking up any calls to
+ * get_mr() from the I/O issuing CPUs
+ */
+ if (atomic_inc_return(&info->mr_ready_count) == 1)
+ wake_up_interruptible(&info->wait_mr);
}
}
--
2.7.4
* [Patch v3 11/16] CIFS: SMBD: Support page offset in memory registration
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
` (9 preceding siblings ...)
2018-09-08 2:13 ` [Patch v3 10/16] CIFS: SMBD: Do not call ib_dereg_mr on invalidated memory registration Long Li
@ 2018-09-08 2:13 ` Long Li
2018-09-08 2:13 ` [Patch v3 12/16] CIFS: Pass page offset for calculating signature Long Li
` (5 subsequent siblings)
16 siblings, 0 replies; 19+ messages in thread
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
Change code to pass the correct page offset during memory registration for
RDMA read/write.
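A minimal user-space sketch of the per-page layout this implies, assuming a
4096-byte page: the first page starts at the offset, middle pages are full, and
the last page holds the tail. TOY_PAGE_SIZE, struct toy_seg, and
toy_fill_segs() are hypothetical names used only for illustration; the real
code fills a scatterlist with sg_set_page().

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for this sketch */

struct toy_seg {
	unsigned int len;	/* bytes used in this page */
	unsigned int offset;	/* starting byte within this page */
};

/*
 * Describe a buffer that starts `offset` bytes into the first page and
 * ends `tailsz` bytes into the last page. Returns the total byte count.
 */
static size_t toy_fill_segs(struct toy_seg *segs, unsigned int num_pages,
			    unsigned int offset, unsigned int tailsz)
{
	size_t total;
	unsigned int i;

	if (num_pages == 0)
		return 0;

	if (num_pages == 1) {
		/* Single page: tailsz is the whole length. */
		segs[0].len = tailsz;
		segs[0].offset = offset;
		return tailsz;
	}

	/* First page runs from the offset to the end of the page. */
	segs[0].len = TOY_PAGE_SIZE - offset;
	segs[0].offset = offset;
	total = segs[0].len;

	/* Middle pages are used in full. */
	for (i = 1; i < num_pages - 1; i++) {
		segs[i].len = TOY_PAGE_SIZE;
		segs[i].offset = 0;
		total += TOY_PAGE_SIZE;
	}

	/* Last page holds the tail. */
	segs[i].len = tailsz;
	segs[i].offset = 0;
	total += tailsz;

	return total;
}
```

For more than one page the total works out to
(num_pages - 1) * PAGE_SIZE - offset + tailsz, which is the same quantity the
write path reports in RemainingBytes.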
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/smb2pdu.c | 18 ++++++++++++------
fs/cifs/smbdirect.c | 29 +++++++++++++++++++++--------
fs/cifs/smbdirect.h | 2 +-
3 files changed, 34 insertions(+), 15 deletions(-)
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index f603fbe..fc30774 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2623,8 +2623,8 @@ smb2_new_read_req(void **buf, unsigned int *total_len,
rdata->mr = smbd_register_mr(
server->smbd_conn, rdata->pages,
- rdata->nr_pages, rdata->tailsz,
- true, need_invalidate);
+ rdata->nr_pages, rdata->page_offset,
+ rdata->tailsz, true, need_invalidate);
if (!rdata->mr)
return -ENOBUFS;
@@ -3013,16 +3013,22 @@ smb2_async_writev(struct cifs_writedata *wdata,
wdata->mr = smbd_register_mr(
server->smbd_conn, wdata->pages,
- wdata->nr_pages, wdata->tailsz,
- false, need_invalidate);
+ wdata->nr_pages, wdata->page_offset,
+ wdata->tailsz, false, need_invalidate);
if (!wdata->mr) {
rc = -ENOBUFS;
goto async_writev_out;
}
req->Length = 0;
req->DataOffset = 0;
- req->RemainingBytes =
- cpu_to_le32((wdata->nr_pages-1)*PAGE_SIZE + wdata->tailsz);
+ if (wdata->nr_pages > 1)
+ req->RemainingBytes =
+ cpu_to_le32(
+ (wdata->nr_pages - 1) * wdata->pagesz -
+ wdata->page_offset + wdata->tailsz
+ );
+ else
+ req->RemainingBytes = cpu_to_le32(wdata->tailsz);
req->Channel = SMB2_CHANNEL_RDMA_V1_INVALIDATE;
if (need_invalidate)
req->Channel = SMB2_CHANNEL_RDMA_V1;
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index b470cd0..f61daa9 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -2475,7 +2475,7 @@ static struct smbd_mr *get_mr(struct smbd_connection *info)
*/
struct smbd_mr *smbd_register_mr(
struct smbd_connection *info, struct page *pages[], int num_pages,
- int tailsz, bool writing, bool need_invalidate)
+ int offset, int tailsz, bool writing, bool need_invalidate)
{
struct smbd_mr *smbdirect_mr;
int rc, i;
@@ -2498,17 +2498,30 @@ struct smbd_mr *smbd_register_mr(
smbdirect_mr->sgl_count = num_pages;
sg_init_table(smbdirect_mr->sgl, num_pages);
- for (i = 0; i < num_pages - 1; i++)
- sg_set_page(&smbdirect_mr->sgl[i], pages[i], PAGE_SIZE, 0);
+ log_rdma_mr(INFO, "num_pages=0x%x offset=0x%x tailsz=0x%x\n",
+ num_pages, offset, tailsz);
+
+ if (num_pages == 1) {
+ sg_set_page(&smbdirect_mr->sgl[0], pages[0], tailsz, offset);
+ goto skip_multiple_pages;
+ }
- sg_set_page(&smbdirect_mr->sgl[i], pages[i],
- tailsz ? tailsz : PAGE_SIZE, 0);
+ /* We have at least two pages to register */
+ sg_set_page(
+ &smbdirect_mr->sgl[0], pages[0], PAGE_SIZE - offset, offset);
+ i = 1;
+ while (i < num_pages - 1) {
+ sg_set_page(&smbdirect_mr->sgl[i], pages[i], PAGE_SIZE, 0);
+ i++;
+ }
+ sg_set_page(&smbdirect_mr->sgl[i], pages[i], tailsz, 0);
+skip_multiple_pages:
dir = writing ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
smbdirect_mr->dir = dir;
rc = ib_dma_map_sg(info->id->device, smbdirect_mr->sgl, num_pages, dir);
if (!rc) {
- log_rdma_mr(INFO, "ib_dma_map_sg num_pages=%x dir=%x rc=%x\n",
+ log_rdma_mr(ERR, "ib_dma_map_sg num_pages=%x dir=%x rc=%x\n",
num_pages, dir, rc);
goto dma_map_error;
}
@@ -2516,8 +2529,8 @@ struct smbd_mr *smbd_register_mr(
rc = ib_map_mr_sg(smbdirect_mr->mr, smbdirect_mr->sgl, num_pages,
NULL, PAGE_SIZE);
if (rc != num_pages) {
- log_rdma_mr(INFO,
- "ib_map_mr_sg failed rc = %x num_pages = %x\n",
+ log_rdma_mr(ERR,
+ "ib_map_mr_sg failed rc = %d num_pages = %x\n",
rc, num_pages);
goto map_mr_error;
}
diff --git a/fs/cifs/smbdirect.h b/fs/cifs/smbdirect.h
index f9038da..1e419c2 100644
--- a/fs/cifs/smbdirect.h
+++ b/fs/cifs/smbdirect.h
@@ -321,7 +321,7 @@ struct smbd_mr {
/* Interfaces to register and deregister MR for RDMA read/write */
struct smbd_mr *smbd_register_mr(
struct smbd_connection *info, struct page *pages[], int num_pages,
- int tailsz, bool writing, bool need_invalidate);
+ int offset, int tailsz, bool writing, bool need_invalidate);
int smbd_deregister_mr(struct smbd_mr *mr);
#else
--
2.7.4
* [Patch v3 12/16] CIFS: Pass page offset for calculating signature
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
` (10 preceding siblings ...)
2018-09-08 2:13 ` [Patch v3 11/16] CIFS: SMBD: Support page offset in " Long Li
@ 2018-09-08 2:13 ` Long Li
2018-09-08 2:13 ` [Patch v3 13/16] CIFS: Pass page offset for encrypting Long Li
` (4 subsequent siblings)
16 siblings, 0 replies; 19+ messages in thread
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
When calculating the signature for a packet, the hash needs to start reading
at the correct page offset of the data.
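A small sketch of the idea, under stated assumptions: the per-page lookup below
models what a helper like rqst_page_get_length() is expected to return (that
helper is added earlier in the series and its body is not shown in this patch),
and a plain byte sum stands in for crypto_shash_update(). struct toy_rqst and
both toy_* functions are hypothetical.

```c
#include <stddef.h>

struct toy_rqst {
	unsigned char **pages;	/* page buffers */
	unsigned int npages;
	unsigned int pagesz;	/* size of each page */
	unsigned int offset;	/* data start within the first page */
	unsigned int tailsz;	/* bytes used in the last page */
};

/* Report the data length and starting offset for one page. */
static void toy_page_get_length(const struct toy_rqst *rq, unsigned int page,
				unsigned int *len, unsigned int *offset)
{
	*len = rq->pagesz;
	*offset = (page == 0) ? rq->offset : 0;

	if (rq->npages == 1 || page == rq->npages - 1)
		*len = rq->tailsz;
	else if (page == 0)
		*len = rq->pagesz - rq->offset;
}

/* Toy "signature": sum of the data bytes only, skipping unused bytes. */
static unsigned long toy_signature(const struct toy_rqst *rq)
{
	unsigned long sum = 0;
	unsigned int i, j, len, offset;

	for (i = 0; i < rq->npages; i++) {
		toy_page_get_length(rq, i, &len, &offset);
		for (j = 0; j < len; j++)
			sum += rq->pages[i][offset + j];
	}
	return sum;
}
```

Bytes before the offset in the first page and after the tail in the last page
never reach the hash, which is what keeps the signature correct when the data
does not start on a page boundary.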
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsencrypt.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
index a6ef088..e88303c 100644
--- a/fs/cifs/cifsencrypt.c
+++ b/fs/cifs/cifsencrypt.c
@@ -68,11 +68,12 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
/* now hash over the rq_pages array */
for (i = 0; i < rqst->rq_npages; i++) {
- void *kaddr = kmap(rqst->rq_pages[i]);
- size_t len = rqst->rq_pagesz;
+ void *kaddr;
+ unsigned int len, offset;
- if (i == rqst->rq_npages - 1)
- len = rqst->rq_tailsz;
+ rqst_page_get_length(rqst, i, &len, &offset);
+
+ kaddr = (char *) kmap(rqst->rq_pages[i]) + offset;
crypto_shash_update(shash, kaddr, len);
--
2.7.4
* [Patch v3 13/16] CIFS: Pass page offset for encrypting
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
` (11 preceding siblings ...)
2018-09-08 2:13 ` [Patch v3 12/16] CIFS: Pass page offset for calculating signature Long Li
@ 2018-09-08 2:13 ` Long Li
2018-09-08 2:13 ` [Patch v3 14/16] CIFS: Add support for direct I/O read Long Li
` (3 subsequent siblings)
16 siblings, 0 replies; 19+ messages in thread
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
The encryption function needs to read data starting at the page offset of the
input buffer.
This doesn't affect the decryption path, since it allocates its own page
buffers.
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/smb2ops.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 1fa1c29..38d19b6 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -2189,9 +2189,10 @@ init_sg(struct smb_rqst *rqst, u8 *sign)
smb2_sg_set_buf(&sg[i], rqst->rq_iov[i].iov_base,
rqst->rq_iov[i].iov_len);
for (j = 0; i < sg_len - 1; i++, j++) {
- unsigned int len = (j < rqst->rq_npages - 1) ? rqst->rq_pagesz
- : rqst->rq_tailsz;
- sg_set_page(&sg[i], rqst->rq_pages[j], len, 0);
+ unsigned int len, offset;
+
+ rqst_page_get_length(rqst, j, &len, &offset);
+ sg_set_page(&sg[i], rqst->rq_pages[j], len, offset);
}
smb2_sg_set_buf(&sg[sg_len - 1], sign, SMB2_SIGNATURE_SIZE);
return sg;
@@ -2332,6 +2333,7 @@ smb3_init_transform_rq(struct TCP_Server_Info *server, struct smb_rqst *new_rq,
return rc;
new_rq->rq_pages = pages;
+ new_rq->rq_offset = old_rq->rq_offset;
new_rq->rq_npages = old_rq->rq_npages;
new_rq->rq_pagesz = old_rq->rq_pagesz;
new_rq->rq_tailsz = old_rq->rq_tailsz;
@@ -2363,10 +2365,14 @@ smb3_init_transform_rq(struct TCP_Server_Info *server, struct smb_rqst *new_rq,
/* copy pages form the old */
for (i = 0; i < npages; i++) {
- char *dst = kmap(new_rq->rq_pages[i]);
- char *src = kmap(old_rq->rq_pages[i]);
- unsigned int len = (i < npages - 1) ? new_rq->rq_pagesz :
- new_rq->rq_tailsz;
+ char *dst, *src;
+ unsigned int offset, len;
+
+ rqst_page_get_length(new_rq, i, &len, &offset);
+
+ dst = (char *) kmap(new_rq->rq_pages[i]) + offset;
+ src = (char *) kmap(old_rq->rq_pages[i]) + offset;
+
memcpy(dst, src, len);
kunmap(new_rq->rq_pages[i]);
kunmap(old_rq->rq_pages[i]);
--
2.7.4
* [Patch v3 14/16] CIFS: Add support for direct I/O read
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
` (12 preceding siblings ...)
2018-09-08 2:13 ` [Patch v3 13/16] CIFS: Pass page offset for encrypting Long Li
@ 2018-09-08 2:13 ` Long Li
2018-09-08 2:13 ` [Patch v3 15/16] CIFS: Add support for direct I/O write Long Li
` (2 subsequent siblings)
16 siblings, 0 replies; 19+ messages in thread
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
With direct I/O read, we transfer the data directly from the transport layer
to the user data buffer.
Change in v3: added support for kernel AIO
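Since the user buffer need not start on a page boundary, the page count and
tail size have to account for the starting offset. A worked sketch of that
arithmetic, assuming a 4096-byte page (TOY_PAGE_SIZE, struct toy_layout, and
toy_direct_layout() are hypothetical names; cur_len and start play the same
roles as in the patch):

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for this sketch */

struct toy_layout {
	size_t npages;	/* pages spanned by the pinned user buffer */
	size_t tailsz;	/* bytes used in the last page */
};

/*
 * cur_len: bytes covered by this request
 * start:   offset of the first byte within the first page
 */
static struct toy_layout toy_direct_layout(size_t cur_len, size_t start)
{
	struct toy_layout l;

	l.npages = (cur_len + start + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
	if (l.npages > 1)
		/* Subtract the first page's share and all full middle pages. */
		l.tailsz = cur_len - (TOY_PAGE_SIZE - start) -
			   (l.npages - 2) * TOY_PAGE_SIZE;
	else
		l.tailsz = cur_len;

	return l;
}
```

For example, cur_len = 10000 and start = 100 span 3 pages: 3996 bytes in the
first page, 4096 in the second, and a tail of 1908.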
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsfs.h | 1 +
fs/cifs/cifsglob.h | 5 ++
fs/cifs/file.c | 209 +++++++++++++++++++++++++++++++++++++++++++++--------
3 files changed, 186 insertions(+), 29 deletions(-)
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 5f02318..7fba9aa 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -102,6 +102,7 @@ extern int cifs_open(struct inode *inode, struct file *file);
extern int cifs_close(struct inode *inode, struct file *file);
extern int cifs_closedir(struct inode *inode, struct file *file);
extern ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to);
+extern ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from);
extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 7f62c98..52248dd 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1146,6 +1146,11 @@ struct cifs_aio_ctx {
unsigned int len;
unsigned int total_len;
bool should_dirty;
+ /*
+ * Indicates if this aio_ctx is for direct_io,
+ * If yes, iter is a copy of the user passed iov_iter
+ */
+ bool direct_io;
};
struct cifs_readdata;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 87eece6..476b2a1 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2965,7 +2965,6 @@ cifs_uncached_readdata_release(struct kref *refcount)
kref_put(&rdata->ctx->refcount, cifs_aio_ctx_release);
for (i = 0; i < rdata->nr_pages; i++) {
put_page(rdata->pages[i]);
- rdata->pages[i] = NULL;
}
cifs_readdata_release(refcount);
}
@@ -3004,7 +3003,7 @@ cifs_readdata_to_iov(struct cifs_readdata *rdata, struct iov_iter *iter)
return remaining ? -EFAULT : 0;
}
-static void collect_uncached_read_data(struct cifs_aio_ctx *ctx);
+static void collect_uncached_read_data(struct cifs_readdata *rdata, struct cifs_aio_ctx *ctx);
static void
cifs_uncached_readv_complete(struct work_struct *work)
@@ -3013,7 +3012,7 @@ cifs_uncached_readv_complete(struct work_struct *work)
struct cifs_readdata, work);
complete(&rdata->done);
- collect_uncached_read_data(rdata->ctx);
+ collect_uncached_read_data(rdata, rdata->ctx);
/* the below call can possibly free the last ref to aio ctx */
kref_put(&rdata->refcount, cifs_uncached_readdata_release);
}
@@ -3103,6 +3102,9 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
int rc;
pid_t pid;
struct TCP_Server_Info *server;
+ struct page **pagevec;
+ size_t start;
+ struct iov_iter direct_iov = ctx->iter;
server = tlink_tcon(open_file->tlink)->ses->server;
@@ -3111,6 +3113,9 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
else
pid = current->tgid;
+ if (ctx->direct_io)
+ iov_iter_advance(&direct_iov, offset - ctx->pos);
+
do {
rc = server->ops->wait_mtu_credits(server, cifs_sb->rsize,
&rsize, &credits);
@@ -3118,20 +3123,56 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
break;
cur_len = min_t(const size_t, len, rsize);
- npages = DIV_ROUND_UP(cur_len, PAGE_SIZE);
- /* allocate a readdata struct */
- rdata = cifs_readdata_alloc(npages,
+ if (ctx->direct_io) {
+
+ cur_len = iov_iter_get_pages_alloc(
+ &direct_iov, &pagevec,
+ cur_len, &start);
+ if (cur_len < 0) {
+ cifs_dbg(VFS,
+ "couldn't get user pages (cur_len=%zd)"
+ " iter type %d"
+ " iov_offset %lu count %lu\n",
+ cur_len, direct_iov.type, direct_iov.iov_offset,
+ direct_iov.count);
+ dump_stack();
+ break;
+ }
+ iov_iter_advance(&direct_iov, cur_len);
+
+ rdata = cifs_readdata_direct_alloc(
+ pagevec, cifs_uncached_readv_complete);
+ if (!rdata) {
+ add_credits_and_wake_if(server, credits, 0);
+ rc = -ENOMEM;
+ break;
+ }
+
+ npages = (cur_len + start + PAGE_SIZE-1) / PAGE_SIZE;
+ rdata->page_offset = start;
+ rdata->tailsz = npages > 1 ?
+ cur_len-(PAGE_SIZE-start)-(npages-2)*PAGE_SIZE :
+ cur_len;
+
+ } else {
+
+ npages = DIV_ROUND_UP(cur_len, PAGE_SIZE);
+ /* allocate a readdata struct */
+ rdata = cifs_readdata_alloc(npages,
cifs_uncached_readv_complete);
- if (!rdata) {
- add_credits_and_wake_if(server, credits, 0);
- rc = -ENOMEM;
- break;
- }
+ if (!rdata) {
+ add_credits_and_wake_if(server, credits, 0);
+ rc = -ENOMEM;
+ break;
+ }
- rc = cifs_read_allocate_pages(rdata, npages);
- if (rc)
- goto error;
+ rc = cifs_read_allocate_pages(rdata, npages);
+ if (rc)
+ goto error;
+
+ rdata->tailsz = PAGE_SIZE;
+ }
rdata->cfile = cifsFileInfo_get(open_file);
rdata->nr_pages = npages;
@@ -3139,7 +3180,6 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
rdata->bytes = cur_len;
rdata->pid = pid;
rdata->pagesz = PAGE_SIZE;
- rdata->tailsz = PAGE_SIZE;
rdata->read_into_pages = cifs_uncached_read_into_pages;
rdata->copy_into_pages = cifs_uncached_copy_into_pages;
rdata->credits = credits;
@@ -3153,13 +3193,17 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
if (rc) {
add_credits_and_wake_if(server, rdata->credits, 0);
kref_put(&rdata->refcount,
- cifs_uncached_readdata_release);
- if (rc == -EAGAIN)
+ cifs_uncached_readdata_release);
+ if (rc == -EAGAIN) {
+ iov_iter_revert(&direct_iov, cur_len);
continue;
+ }
break;
}
- list_add_tail(&rdata->list, rdata_list);
+ /* Add to aio pending list if it's not there */
+ if (rdata_list)
+ list_add_tail(&rdata->list, rdata_list);
offset += cur_len;
len -= cur_len;
} while (len > 0);
@@ -3168,7 +3212,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
}
static void
-collect_uncached_read_data(struct cifs_aio_ctx *ctx)
+collect_uncached_read_data(struct cifs_readdata *uncached_rdata, struct cifs_aio_ctx *ctx)
{
struct cifs_readdata *rdata, *tmp;
struct iov_iter *to = &ctx->iter;
@@ -3211,10 +3255,12 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
* reading.
*/
if (got_bytes && got_bytes < rdata->bytes) {
- rc = cifs_readdata_to_iov(rdata, to);
+ rc = 0;
+ if (!ctx->direct_io)
+ rc = cifs_readdata_to_iov(rdata, to);
if (rc) {
kref_put(&rdata->refcount,
- cifs_uncached_readdata_release);
+ cifs_uncached_readdata_release);
continue;
}
}
@@ -3228,28 +3274,32 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
list_splice(&tmp_list, &ctx->list);
kref_put(&rdata->refcount,
- cifs_uncached_readdata_release);
+ cifs_uncached_readdata_release);
goto again;
} else if (rdata->result)
rc = rdata->result;
- else
+ else if (!ctx->direct_io)
rc = cifs_readdata_to_iov(rdata, to);
/* if there was a short read -- discard anything left */
if (rdata->got_bytes && rdata->got_bytes < rdata->bytes)
rc = -ENODATA;
+
+ ctx->total_len += rdata->got_bytes;
}
list_del_init(&rdata->list);
kref_put(&rdata->refcount, cifs_uncached_readdata_release);
}
- for (i = 0; i < ctx->npages; i++) {
- if (ctx->should_dirty)
- set_page_dirty(ctx->bv[i].bv_page);
- put_page(ctx->bv[i].bv_page);
- }
+ if (!ctx->direct_io) {
+ for (i = 0; i < ctx->npages; i++) {
+ if (ctx->should_dirty)
+ set_page_dirty(ctx->bv[i].bv_page);
+ put_page(ctx->bv[i].bv_page);
+ }
- ctx->total_len = ctx->len - iov_iter_count(to);
+ ctx->total_len = ctx->len - iov_iter_count(to);
+ }
cifs_stats_bytes_read(tcon, ctx->total_len);
@@ -3267,6 +3317,107 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx)
complete(&ctx->done);
}
+ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter *to)
+{
+ size_t len;
+ struct file *file;
+ struct cifs_sb_info *cifs_sb;
+ struct cifsFileInfo *cfile;
+ struct cifs_tcon *tcon;
+ ssize_t rc, total_read = 0;
+ struct TCP_Server_Info *server;
+ loff_t offset = iocb->ki_pos;
+ pid_t pid;
+ struct cifs_aio_ctx *ctx;
+
+ /*
+ * iov_iter_get_pages_alloc() doesn't work with ITER_KVEC,
+ * fall back to data copy read path
+ */
+ if (to->type & ITER_KVEC) {
+ cifs_dbg(FYI, "use non-direct cifs_user_readv for kvec I/O\n");
+ return cifs_user_readv(iocb, to);
+ }
+
+ len = iov_iter_count(to);
+ if (!len)
+ return 0;
+
+ file = iocb->ki_filp;
+ cifs_sb = CIFS_FILE_SB(file);
+ cfile = file->private_data;
+ tcon = tlink_tcon(cfile->tlink);
+ server = tcon->ses->server;
+
+ if (!server->ops->async_readv)
+ return -ENOSYS;
+
+ if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_RWPIDFORWARD)
+ pid = cfile->pid;
+ else
+ pid = current->tgid;
+
+ if ((file->f_flags & O_ACCMODE) == O_WRONLY)
+ cifs_dbg(FYI, "attempting read on write only file instance\n");
+
+ ctx = cifs_aio_ctx_alloc();
+ if (!ctx)
+ return -ENOMEM;
+
+ ctx->cfile = cifsFileInfo_get(cfile);
+
+ if (!is_sync_kiocb(iocb))
+ ctx->iocb = iocb;
+
+ if (to->type == ITER_IOVEC)
+ ctx->should_dirty = true;
+
+ ctx->pos = offset;
+ ctx->direct_io = true;
+ ctx->iter = *to;
+ ctx->len = len;
+
+ /* grab a lock here due to read response handlers can access ctx */
+ mutex_lock(&ctx->aio_mutex);
+
+ rc = cifs_send_async_read(offset, len, cfile, cifs_sb, &ctx->list, ctx);
+
+ /* if at least one read request send succeeded, then reset rc */
+ if (!list_empty(&ctx->list))
+ rc = 0;
+
+ mutex_unlock(&ctx->aio_mutex);
+
+ if (rc) {
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+ return rc;
+ }
+
+ if (!is_sync_kiocb(iocb)) {
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+ return -EIOCBQUEUED;
+ }
+
+ rc = wait_for_completion_killable(&ctx->done);
+ if (rc) {
+ mutex_lock(&ctx->aio_mutex);
+ ctx->rc = rc = -EINTR;
+ total_read = ctx->total_len;
+ mutex_unlock(&ctx->aio_mutex);
+ } else {
+ rc = ctx->rc;
+ total_read = ctx->total_len;
+ }
+
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+
+ if (total_read) {
+ iocb->ki_pos += total_read;
+ return total_read;
+ }
+ return rc;
+}
+
ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to)
{
struct file *file = iocb->ki_filp;
--
2.7.4
* [Patch v3 15/16] CIFS: Add support for direct I/O write
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
` (13 preceding siblings ...)
2018-09-08 2:13 ` [Patch v3 14/16] CIFS: Add support for direct I/O read Long Li
@ 2018-09-08 2:13 ` Long Li
2018-09-08 2:13 ` [Patch v3 16/16] CIFS: Add direct I/O functions to file_operations Long Li
2018-09-15 9:28 ` [Patch v3 00/16] CIFS: add support for direct I/O Steve French
16 siblings, 0 replies; 19+ messages in thread
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
With direct I/O write, user-supplied buffers are pinned in memory and data
is transferred directly from the user buffers to the transport layer.
Change in v3: added support for kernel AIO
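A hedged sketch of how the return value is chosen once at least one request
has been issued, for both the synchronous and the kernel AIO case.
toy_direct_io_result() and TOY_EIOCBQUEUED are hypothetical names; the value
529 is assumed to match EIOCBQUEUED in the kernel's include/linux/errno.h.

```c
#include <stdbool.h>

/* Assumed to match the kernel-internal EIOCBQUEUED value. */
#define TOY_EIOCBQUEUED 529

/*
 * is_sync: true for a synchronous kiocb
 * rc:      status recorded at completion (0 or a negative errno)
 * total:   bytes transferred so far
 */
static long toy_direct_io_result(bool is_sync, long rc, long total)
{
	if (!is_sync)
		return -TOY_EIOCBQUEUED;	/* completion is reported later */
	if (total > 0)
		return total;			/* partial progress wins over rc */
	return rc;
}
```

An error is only surfaced to a synchronous caller when nothing at all was
transferred; otherwise the byte count is returned and the file position
advances by that amount.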
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsfs.h | 1 +
fs/cifs/file.c | 195 ++++++++++++++++++++++++++++++++++++++++++++++---------
2 files changed, 165 insertions(+), 31 deletions(-)
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 7fba9aa..e9c5103 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -105,6 +105,7 @@ extern ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from);
+extern ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *from);
extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
extern int cifs_lock(struct file *, int, struct file_lock *);
extern int cifs_fsync(struct file *, loff_t, loff_t, int);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 476b2a1..76e0266 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2537,6 +2537,8 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
loff_t saved_offset = offset;
pid_t pid;
struct TCP_Server_Info *server;
+ struct page **pagevec;
+ size_t start;
if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_RWPIDFORWARD)
pid = open_file->pid;
@@ -2553,38 +2555,74 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
if (rc)
break;
- nr_pages = get_numpages(wsize, len, &cur_len);
- wdata = cifs_writedata_alloc(nr_pages,
+ if (ctx->direct_io) {
+ cur_len = iov_iter_get_pages_alloc(
+ from, &pagevec, wsize, &start);
+ if (cur_len < 0) {
+ cifs_dbg(VFS,
+ "direct_writev couldn't get user pages "
+ "(rc=%zd) iter type %d iov_offset %lu count"
+ " %lu\n",
+ cur_len, from->type,
+ from->iov_offset, from->count);
+ dump_stack();
+ break;
+ }
+ iov_iter_advance(from, cur_len);
+
+ nr_pages = (cur_len + start + PAGE_SIZE - 1) / PAGE_SIZE;
+
+ wdata = cifs_writedata_direct_alloc(pagevec,
cifs_uncached_writev_complete);
- if (!wdata) {
- rc = -ENOMEM;
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
+ if (!wdata) {
+ rc = -ENOMEM;
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
- rc = cifs_write_allocate_pages(wdata->pages, nr_pages);
- if (rc) {
- kfree(wdata);
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
- num_pages = nr_pages;
- rc = wdata_fill_from_iovec(wdata, from, &cur_len, &num_pages);
- if (rc) {
- for (i = 0; i < nr_pages; i++)
- put_page(wdata->pages[i]);
- kfree(wdata);
- add_credits_and_wake_if(server, credits, 0);
- break;
- }
+ wdata->page_offset = start;
+ wdata->tailsz =
+ nr_pages > 1 ?
+ cur_len - (PAGE_SIZE - start) -
+ (nr_pages - 2) * PAGE_SIZE :
+ cur_len;
+ } else {
+ nr_pages = get_numpages(wsize, len, &cur_len);
+ wdata = cifs_writedata_alloc(nr_pages,
+ cifs_uncached_writev_complete);
+ if (!wdata) {
+ rc = -ENOMEM;
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
- /*
- * Bring nr_pages down to the number of pages we actually used,
- * and free any pages that we didn't use.
- */
- for ( ; nr_pages > num_pages; nr_pages--)
- put_page(wdata->pages[nr_pages - 1]);
+ rc = cifs_write_allocate_pages(wdata->pages, nr_pages);
+ if (rc) {
+ kfree(wdata);
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
+
+ num_pages = nr_pages;
+ rc = wdata_fill_from_iovec(wdata, from, &cur_len, &num_pages);
+ if (rc) {
+ for (i = 0; i < nr_pages; i++)
+ put_page(wdata->pages[i]);
+ kfree(wdata);
+ add_credits_and_wake_if(server, credits, 0);
+ break;
+ }
+
+ /*
+ * Bring nr_pages down to the number of pages we actually used,
+ * and free any pages that we didn't use.
+ */
+ for ( ; nr_pages > num_pages; nr_pages--)
+ put_page(wdata->pages[nr_pages - 1]);
+
+ wdata->tailsz = cur_len - ((nr_pages - 1) * PAGE_SIZE);
+ }
wdata->sync_mode = WB_SYNC_ALL;
wdata->nr_pages = nr_pages;
@@ -2593,7 +2631,6 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
wdata->pid = pid;
wdata->bytes = cur_len;
wdata->pagesz = PAGE_SIZE;
- wdata->tailsz = cur_len - ((nr_pages - 1) * PAGE_SIZE);
wdata->credits = credits;
wdata->ctx = ctx;
kref_get(&ctx->refcount);
@@ -2687,8 +2724,9 @@ static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
kref_put(&wdata->refcount, cifs_uncached_writedata_release);
}
- for (i = 0; i < ctx->npages; i++)
- put_page(ctx->bv[i].bv_page);
+ if (!ctx->direct_io)
+ for (i = 0; i < ctx->npages; i++)
+ put_page(ctx->bv[i].bv_page);
cifs_stats_bytes_written(tcon, ctx->total_len);
set_bit(CIFS_INO_INVALID_MAPPING, &CIFS_I(dentry->d_inode)->flags);
@@ -2703,6 +2741,101 @@ static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
complete(&ctx->done);
}
+ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *from)
+{
+ struct file *file = iocb->ki_filp;
+ ssize_t total_written = 0;
+ struct cifsFileInfo *cfile;
+ struct cifs_tcon *tcon;
+ struct cifs_sb_info *cifs_sb;
+ struct TCP_Server_Info *server;
+ size_t len = iov_iter_count(from);
+ int rc;
+ struct cifs_aio_ctx *ctx;
+
+ /*
+ * iov_iter_get_pages_alloc doesn't work with ITER_KVEC.
+ * In this case, fall back to non-direct write function.
+ */
+ if (from->type & ITER_KVEC) {
+ cifs_dbg(FYI, "use non-direct cifs_user_writev for kvec I/O\n");
+ return cifs_user_writev(iocb, from);
+ }
+
+ rc = generic_write_checks(iocb, from);
+ if (rc <= 0)
+ return rc;
+
+ cifs_sb = CIFS_FILE_SB(file);
+ cfile = file->private_data;
+ tcon = tlink_tcon(cfile->tlink);
+ server = tcon->ses->server;
+
+ if (!server->ops->async_writev)
+ return -ENOSYS;
+
+ ctx = cifs_aio_ctx_alloc();
+ if (!ctx)
+ return -ENOMEM;
+
+ ctx->cfile = cifsFileInfo_get(cfile);
+
+ if (!is_sync_kiocb(iocb))
+ ctx->iocb = iocb;
+
+ ctx->pos = iocb->ki_pos;
+
+ ctx->direct_io = true;
+ ctx->iter = *from;
+ ctx->len = len;
+
+ /* grab a lock here because write response handlers can access ctx */
+ mutex_lock(&ctx->aio_mutex);
+
+ rc = cifs_write_from_iter(iocb->ki_pos, ctx->len, from,
+ cfile, cifs_sb, &ctx->list, ctx);
+
+ /*
+ * If at least one write was successfully sent, then discard any rc
+ * value from the later writes. If the other write succeeds, then
+ * we'll end up returning whatever was written. If it fails, then
+ * we'll get a new rc value from that.
+ */
+ if (!list_empty(&ctx->list))
+ rc = 0;
+
+ mutex_unlock(&ctx->aio_mutex);
+
+ if (rc) {
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+ return rc;
+ }
+
+ if (!is_sync_kiocb(iocb)) {
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+ return -EIOCBQUEUED;
+ }
+
+ rc = wait_for_completion_killable(&ctx->done);
+ if (rc) {
+ mutex_lock(&ctx->aio_mutex);
+ ctx->rc = rc = -EINTR;
+ total_written = ctx->total_len;
+ mutex_unlock(&ctx->aio_mutex);
+ } else {
+ rc = ctx->rc;
+ total_written = ctx->total_len;
+ }
+
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+
+ if (unlikely(!total_written))
+ return rc;
+
+ iocb->ki_pos += total_written;
+ return total_written;
+}
+
ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from)
{
struct file *file = iocb->ki_filp;
--
2.7.4
* [Patch v3 16/16] CIFS: Add direct I/O functions to file_operations
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
` (14 preceding siblings ...)
2018-09-08 2:13 ` [Patch v3 15/16] CIFS: Add support for direct I/O write Long Li
@ 2018-09-08 2:13 ` Long Li
2018-09-15 9:28 ` [Patch v3 00/16] CIFS: add support for direct I/O Steve French
16 siblings, 0 replies; 19+ messages in thread
From: Long Li @ 2018-09-08 2:13 UTC (permalink / raw)
To: Steve French, linux-cifs, samba-technical, linux-kernel, linux-rdma
Cc: Long Li
From: Long Li <longli@microsoft.com>
With direct read/write functions implemented, add them to file_operations.
Direct I/O is used under two conditions:
1. When mounting with "cache=none", CIFS uses direct I/O for all user file
data transfer.
2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data
transfer on this file.
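The two conditions above amount to a simple either/or check. The sketch below
only models that decision; how the kernel wires the file_operations table at
mount or open time is not shown in this patch, and enum toy_io_path and
toy_select_io_path() are hypothetical names.

```c
#include <stdbool.h>

enum toy_io_path { TOY_IO_CACHED, TOY_IO_DIRECT };

/*
 * mounted_cache_none: file system was mounted with "cache=none"
 * opened_o_direct:    this file was opened with O_DIRECT
 */
static enum toy_io_path toy_select_io_path(bool mounted_cache_none,
					   bool opened_o_direct)
{
	if (mounted_cache_none || opened_o_direct)
		return TOY_IO_DIRECT;
	return TOY_IO_CACHED;
}
```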
Signed-off-by: Long Li <longli@microsoft.com>
---
fs/cifs/cifsfs.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 62f1662..f18091b 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1113,9 +1113,8 @@ const struct file_operations cifs_file_strict_ops = {
};
const struct file_operations cifs_file_direct_ops = {
- /* BB reevaluate whether they can be done with directio, no cache */
- .read_iter = cifs_user_readv,
- .write_iter = cifs_user_writev,
+ .read_iter = cifs_direct_readv,
+ .write_iter = cifs_direct_writev,
.open = cifs_open,
.release = cifs_close,
.lock = cifs_lock,
@@ -1169,9 +1168,8 @@ const struct file_operations cifs_file_strict_nobrl_ops = {
};
const struct file_operations cifs_file_direct_nobrl_ops = {
- /* BB reevaluate whether they can be done with directio, no cache */
- .read_iter = cifs_user_readv,
- .write_iter = cifs_user_writev,
+ .read_iter = cifs_direct_readv,
+ .write_iter = cifs_direct_writev,
.open = cifs_open,
.release = cifs_close,
.fsync = cifs_fsync,
--
2.7.4
* Re: [Patch v3 00/16] CIFS: add support for direct I/O
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
` (15 preceding siblings ...)
2018-09-08 2:13 ` [Patch v3 16/16] CIFS: Add direct I/O functions to file_operations Long Li
@ 2018-09-15 9:28 ` Steve French
2018-09-15 20:57 ` Long Li
16 siblings, 1 reply; 19+ messages in thread
From: Steve French @ 2018-09-15 9:28 UTC (permalink / raw)
To: Long Li; +Cc: Steve French, CIFS, samba-technical, LKML, linux-rdma
could you rebase these, patch 1 was merged quite a while ago, and
patch 2 etc. doesn't apply cleanly
On Fri, Sep 7, 2018 at 9:18 PM Long Li <longli@linuxonhyperv.com> wrote:
>
> From: Long Li <longli@microsoft.com>
>
> This patch set implements direct I/O.
>
> In normal code path (even with cache=none), CIFS copies I/O data from
> user-space to kernel-space for security reasons of possible protocol
> required signing and encryption on user data.
>
> With this patch set, CIFS passes the I/O data directly from user-space
> buffer to the transport layer, when file system is mounted with
> "cache=none".
>
> Patch v2 addressed comments from Christoph Hellwig <hch@lst.de> and
> Tom Talpey <ttalpey@microsoft.com> to implement direct I/O for both
> socket and RDMA.
>
> Patch v3 added support for kernel AIO.
>
>
> Long Li (16):
> CIFS: Add support for direct pages in rdata
> CIFS: Use offset when reading pages
> CIFS: Add support for direct pages in wdata
> CIFS: pass page offset when issuing SMB write
> CIFS: Calculate the correct request length based on page offset and
> tail size
> CIFS: Introduce helper function to get page offset and length in
> smb_rqst
> CIFS: When sending data on socket, pass the correct page offset
> CIFS: SMBD: Support page offset in RDMA send
> CIFS: SMBD: Support page offset in RDMA recv
> CIFS: SMBD: Do not call ib_dereg_mr on invalidated memory registration
> CIFS: SMBD: Support page offset in memory registration
> CIFS: Pass page offset for calculating signature
> CIFS: Pass page offset for encrypting
> CIFS: Add support for direct I/O read
> CIFS: Add support for direct I/O write
> CIFS: Add direct I/O functions to file_operations
>
> fs/cifs/cifsencrypt.c | 9 +-
> fs/cifs/cifsfs.c | 10 +-
> fs/cifs/cifsfs.h | 2 +
> fs/cifs/cifsglob.h | 11 +-
> fs/cifs/cifsproto.h | 9 +-
> fs/cifs/cifssmb.c | 19 +-
> fs/cifs/connect.c | 5 +-
> fs/cifs/file.c | 477 ++++++++++++++++++++++++++++++++++++++++++--------
> fs/cifs/misc.c | 17 ++
> fs/cifs/smb2ops.c | 22 ++-
> fs/cifs/smb2pdu.c | 20 ++-
> fs/cifs/smbdirect.c | 156 ++++++++++-------
> fs/cifs/smbdirect.h | 2 +-
> fs/cifs/transport.c | 34 ++--
> 14 files changed, 606 insertions(+), 187 deletions(-)
>
> --
> 2.7.4
>
--
Thanks,
Steve
* RE: [Patch v3 00/16] CIFS: add support for direct I/O
2018-09-15 9:28 ` [Patch v3 00/16] CIFS: add support for direct I/O Steve French
@ 2018-09-15 20:57 ` Long Li
0 siblings, 0 replies; 19+ messages in thread
From: Long Li @ 2018-09-15 20:57 UTC (permalink / raw)
To: Steve French; +Cc: Steve French, CIFS, samba-technical, LKML, linux-rdma
> From: Steve French <smfrench@gmail.com>
> Sent: Saturday, September 15, 2018 2:28 AM
> To: Long Li <longli@microsoft.com>
> Cc: Steve French <sfrench@samba.org>; CIFS <linux-cifs@vger.kernel.org>;
> samba-technical <samba-technical@lists.samba.org>;
> LKML <linux-kernel@vger.kernel.org>; linux-rdma@vger.kernel.org
> Subject: Re: [Patch v3 00/16] CIFS: add support for direct I/O
>
> Could you rebase these? Patch 1 was merged quite a while ago, and patch 2
> and later don't apply cleanly.
Sorry, I will rebase and resend.
> On Fri, Sep 7, 2018 at 9:18 PM Long Li <longli@linuxonhyperv.com> wrote:
> >
> > From: Long Li <longli@microsoft.com>
> >
> > This patch set implements direct I/O.
> >
> > In the normal code path (even with cache=none), CIFS copies I/O data from
> > user space to kernel space for security reasons, because the protocol may
> > require signing and encryption of user data.
> >
> > With this patch set, CIFS passes the I/O data directly from the user-space
> > buffer to the transport layer when the file system is mounted with
> > "cache=none".
> >
> > Patch v2 addressed comments from Christoph Hellwig <hch@lst.de> and
> > Tom Talpey <ttalpey@microsoft.com> to implement direct I/O for both
> > socket and RDMA.
> >
> > Patch v3 added support for kernel AIO.
> >
> >
> > Long Li (16):
> > CIFS: Add support for direct pages in rdata
> > CIFS: Use offset when reading pages
> > CIFS: Add support for direct pages in wdata
> > CIFS: pass page offset when issuing SMB write
> > CIFS: Calculate the correct request length based on page offset and
> > tail size
> > CIFS: Introduce helper function to get page offset and length in
> > smb_rqst
> > CIFS: When sending data on socket, pass the correct page offset
> > CIFS: SMBD: Support page offset in RDMA send
> > CIFS: SMBD: Support page offset in RDMA recv
> > CIFS: SMBD: Do not call ib_dereg_mr on invalidated memory registration
> > CIFS: SMBD: Support page offset in memory registration
> > CIFS: Pass page offset for calculating signature
> > CIFS: Pass page offset for encrypting
> > CIFS: Add support for direct I/O read
> > CIFS: Add support for direct I/O write
> > CIFS: Add direct I/O functions to file_operations
> >
> > fs/cifs/cifsencrypt.c | 9 +-
> > fs/cifs/cifsfs.c | 10 +-
> > fs/cifs/cifsfs.h | 2 +
> > fs/cifs/cifsglob.h | 11 +-
> > fs/cifs/cifsproto.h | 9 +-
> > fs/cifs/cifssmb.c | 19 +-
> > fs/cifs/connect.c | 5 +-
> > fs/cifs/file.c | 477 ++++++++++++++++++++++++++++++++++++++++++--------
> > fs/cifs/misc.c | 17 ++
> > fs/cifs/smb2ops.c | 22 ++-
> > fs/cifs/smb2pdu.c | 20 ++-
> > fs/cifs/smbdirect.c | 156 ++++++++++-------
> > fs/cifs/smbdirect.h | 2 +-
> > fs/cifs/transport.c | 34 ++--
> > 14 files changed, 606 insertions(+), 187 deletions(-)
> >
> > --
> > 2.7.4
> >
>
>
> --
> Thanks,
>
> Steve
Thread overview: 19+ messages
2018-09-08 2:13 [Patch v3 00/16] CIFS: add support for direct I/O Long Li
2018-09-08 2:13 ` [Patch v3 01/16] CIFS: Add support for direct pages in rdata Long Li
2018-09-08 2:13 ` [Patch v3 02/16] CIFS: Use offset when reading pages Long Li
2018-09-08 2:13 ` [Patch v3 03/16] CIFS: Add support for direct pages in wdata Long Li
2018-09-08 2:13 ` [Patch v3 04/16] CIFS: pass page offset when issuing SMB write Long Li
2018-09-08 2:13 ` [Patch v3 05/16] CIFS: Calculate the correct request length based on page offset and tail size Long Li
2018-09-08 2:13 ` [Patch v3 06/16] CIFS: Introduce helper function to get page offset and length in smb_rqst Long Li
2018-09-08 2:13 ` [Patch v3 07/16] CIFS: When sending data on socket, pass the correct page offset Long Li
2018-09-08 2:13 ` [Patch v3 08/16] CIFS: SMBD: Support page offset in RDMA send Long Li
2018-09-08 2:13 ` [Patch v3 09/16] CIFS: SMBD: Support page offset in RDMA recv Long Li
2018-09-08 2:13 ` [Patch v3 10/16] CIFS: SMBD: Do not call ib_dereg_mr on invalidated memory registration Long Li
2018-09-08 2:13 ` [Patch v3 11/16] CIFS: SMBD: Support page offset in " Long Li
2018-09-08 2:13 ` [Patch v3 12/16] CIFS: Pass page offset for calculating signature Long Li
2018-09-08 2:13 ` [Patch v3 13/16] CIFS: Pass page offset for encrypting Long Li
2018-09-08 2:13 ` [Patch v3 14/16] CIFS: Add support for direct I/O read Long Li
2018-09-08 2:13 ` [Patch v3 15/16] CIFS: Add support for direct I/O write Long Li
2018-09-08 2:13 ` [Patch v3 16/16] CIFS: Add direct I/O functions to file_operations Long Li
2018-09-15 9:28 ` [Patch v3 00/16] CIFS: add support for direct I/O Steve French
2018-09-15 20:57 ` Long Li