All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] libceph: miscellaneous cleanups
@ 2013-03-09 15:11 Alex Elder
  2013-03-09 15:13 ` [PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN Alex Elder
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:11 UTC (permalink / raw)
  To: ceph-devel

This series contains a bunch of small cleanups.  All of
them I've created while working on other things, and most
of them were done in order to allow upcoming work to be
easier, or easier to understand.  I've moved them all to
the front of my stack of patches and now I'm posting them
as a set for review.  They apply on top of the ceph-client
"testing" branch as of a couple of days ago:
    2f60d302  libceph: fix decoding of pgids

					-Alex

[PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN
[PATCH 2/8] libceph: minor byte order problems in
[PATCH 3/8] libceph: change type of ceph_tcp_sendpage() "more"
[PATCH 4/8] libceph: kill args in read_partial_message_bio()
[PATCH 5/8] libceph: define and use in_msg_pos_next()
[PATCH 6/8] libceph: advance pagelist with list_rotate_left()
[PATCH 7/8] libceph: simplify new message initialization
[PATCH 8/8] libceph: record byte count not page count

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
@ 2013-03-09 15:13 ` Alex Elder
  2013-03-09 15:13 ` [PATCH 2/8] libceph: minor byte order problems in Alex Elder
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:13 UTC (permalink / raw)
  To: ceph-devel

This is probably unnecessary but the code read as if it were wrong
in read_partial_message().

Signed-off-by: Alex Elder <elder@inktank.com>
---
 include/linux/ceph/libceph.h |    1 +
 net/ceph/messenger.c         |    2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
index 29818fc..5493d7b 100644
--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -66,6 +66,7 @@ struct ceph_options {
 #define CEPH_OSD_IDLE_TTL_DEFAULT    60

 #define CEPH_MSG_MAX_FRONT_LEN	(16*1024*1024)
+#define CEPH_MSG_MAX_MIDDLE_LEN	(16*1024*1024)
 #define CEPH_MSG_MAX_DATA_LEN	(16*1024*1024)

 #define CEPH_AUTH_NAME_DEFAULT   "guest"
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index af0c35d..b8d0da5 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1887,7 +1887,7 @@ static int read_partial_message(struct
ceph_connection *con)
 	if (front_len > CEPH_MSG_MAX_FRONT_LEN)
 		return -EIO;
 	middle_len = le32_to_cpu(con->in_hdr.middle_len);
-	if (middle_len > CEPH_MSG_MAX_DATA_LEN)
+	if (middle_len > CEPH_MSG_MAX_MIDDLE_LEN)
 		return -EIO;
 	data_len = le32_to_cpu(con->in_hdr.data_len);
 	if (data_len > CEPH_MSG_MAX_DATA_LEN)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/8] libceph: minor byte order problems in
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
  2013-03-09 15:13 ` [PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN Alex Elder
@ 2013-03-09 15:13 ` Alex Elder
  2013-03-09 15:14 ` [PATCH 3/8] libceph: change type of ceph_tcp_sendpage() "more" Alex Elder
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:13 UTC (permalink / raw)
  To: ceph-devel

Some values printed are not (necessarily) in CPU order.  We already
have a copy of the converted versions, so use them.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/messenger.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index b8d0da5..d9ace97 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1916,7 +1916,7 @@ static int read_partial_message(struct
ceph_connection *con)
 		int skip = 0;

 		dout("got hdr type %d front %d data %d\n", con->in_hdr.type,
-		     con->in_hdr.front_len, con->in_hdr.data_len);
+		     front_len, data_len);
 		ret = ceph_con_in_msg_alloc(con, &skip);
 		if (ret < 0)
 			return ret;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/8] libceph: change type of ceph_tcp_sendpage() "more"
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
  2013-03-09 15:13 ` [PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN Alex Elder
  2013-03-09 15:13 ` [PATCH 2/8] libceph: minor byte order problems in Alex Elder
@ 2013-03-09 15:14 ` Alex Elder
  2013-03-09 15:14 ` [PATCH 4/8] libceph: kill args in read_partial_message_bio() Alex Elder
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:14 UTC (permalink / raw)
  To: ceph-devel

Change the type of the "more" parameter from int to bool.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/messenger.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index d9ace97..962b2cd 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -493,7 +493,7 @@ static int ceph_tcp_sendmsg(struct socket *sock,
struct kvec *iov,
 }

 static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
-		     int offset, size_t size, int more)
+		     int offset, size_t size, bool more)
 {
 	int flags = MSG_DONTWAIT | MSG_NOSIGNAL | (more ? MSG_MORE : MSG_EOR);
 	int ret;
@@ -1132,7 +1132,7 @@ static int write_partial_msg_pages(struct
ceph_connection *con)
 		}
 		ret = ceph_tcp_sendpage(con->sock, page,
 				      con->out_msg_pos.page_pos + bio_offset,
-				      len, 1);
+				      len, true);
 		if (ret <= 0)
 			goto out;

@@ -1161,7 +1161,7 @@ static int write_partial_skip(struct
ceph_connection *con)
 	while (con->out_skip > 0) {
 		size_t size = min(con->out_skip, (int) PAGE_CACHE_SIZE);

-		ret = ceph_tcp_sendpage(con->sock, zero_page, 0, size, 1);
+		ret = ceph_tcp_sendpage(con->sock, zero_page, 0, size, true);
 		if (ret <= 0)
 			goto out;
 		con->out_skip -= ret;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/8] libceph: kill args in read_partial_message_bio()
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
                   ` (2 preceding siblings ...)
  2013-03-09 15:14 ` [PATCH 3/8] libceph: change type of ceph_tcp_sendpage() "more" Alex Elder
@ 2013-03-09 15:14 ` Alex Elder
  2013-03-09 15:14 ` [PATCH 5/8] libceph: define and use in_msg_pos_next() Alex Elder
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:14 UTC (permalink / raw)
  To: ceph-devel

There is only one caller for read_partial_message_bio(), and it
always passes &msg->bio_iter and &bio_seg as the second and third
arguments.  Furthermore, the message in question is always the
connection's in_msg, and we can get that inside the called function.

So drop those two parameters and use their derived equivalents.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/messenger.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 962b2cd..2017b88 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1819,14 +1819,16 @@ static int read_partial_message_pages(struct
ceph_connection *con,

 #ifdef CONFIG_BLOCK
 static int read_partial_message_bio(struct ceph_connection *con,
-				    struct bio **bio_iter,
-				    unsigned int *bio_seg,
 				    unsigned int data_len, bool do_datacrc)
 {
-	struct bio_vec *bv = bio_iovec_idx(*bio_iter, *bio_seg);
+	struct ceph_msg *msg = con->in_msg;
+	struct bio_vec *bv;
 	void *p;
 	int ret, left;

+	BUG_ON(!msg);
+	BUG_ON(!msg->bio_iter);
+	bv = bio_iovec_idx(msg->bio_iter, msg->bio_seg);
 	left = min((int)(data_len - con->in_msg_pos.data_pos),
 		   (int)(bv->bv_len - con->in_msg_pos.page_pos));

@@ -1845,7 +1847,7 @@ static int read_partial_message_bio(struct
ceph_connection *con,
 	con->in_msg_pos.page_pos += ret;
 	if (con->in_msg_pos.page_pos == bv->bv_len) {
 		con->in_msg_pos.page_pos = 0;
-		iter_bio_next(bio_iter, bio_seg);
+		iter_bio_next(&msg->bio_iter, &msg->bio_seg);
 	}

 	return ret;
@@ -1975,9 +1977,7 @@ static int read_partial_message(struct
ceph_connection *con)
 				return ret;
 #ifdef CONFIG_BLOCK
 		} else if (m->bio) {
-			BUG_ON(!m->bio_iter);
 			ret = read_partial_message_bio(con,
-						 &m->bio_iter, &m->bio_seg,
 						 data_len, do_datacrc);
 			if (ret <= 0)
 				return ret;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/8] libceph: define and use in_msg_pos_next()
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
                   ` (3 preceding siblings ...)
  2013-03-09 15:14 ` [PATCH 4/8] libceph: kill args in read_partial_message_bio() Alex Elder
@ 2013-03-09 15:14 ` Alex Elder
  2013-03-11 18:57   ` Josh Durgin
  2013-03-09 15:15 ` [PATCH 6/8] libceph: advance pagelist with list_rotate_left() Alex Elder
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:14 UTC (permalink / raw)
  To: ceph-devel

Define a new function in_msg_pos_next() to match out_msg_pos_next(),
and use it in place of code at the end of read_partial_message_pages()
and read_partial_message_bio().

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/messenger.c |   57
++++++++++++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 20 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 2017b88..fb5f6e7 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1052,6 +1052,28 @@ static void out_msg_pos_next(struct
ceph_connection *con, struct page *page,
 #endif
 }

+static void in_msg_pos_next(struct ceph_connection *con, size_t len,
+				size_t received)
+{
+	struct ceph_msg *msg = con->in_msg;
+
+	BUG_ON(!msg);
+	BUG_ON(!received);
+
+	con->in_msg_pos.data_pos += received;
+	con->in_msg_pos.page_pos += received;
+	if (received < len)
+		return;
+
+	BUG_ON(received != len);
+	con->in_msg_pos.page_pos = 0;
+	con->in_msg_pos.page++;
+#ifdef CONFIG_BLOCK
+	if (msg->bio)
+		iter_bio_next(&msg->bio_iter, &msg->bio_seg);
+#endif /* CONFIG_BLOCK */
+}
+
 /*
  * Write as much message data payload as we can.  If we finish, queue
  * up the footer.
@@ -1789,6 +1811,7 @@ static int read_partial_message_pages(struct
ceph_connection *con,
 				      struct page **pages,
 				      unsigned int data_len, bool do_datacrc)
 {
+	struct page *page;
 	void *p;
 	int ret;
 	int left;
@@ -1797,22 +1820,18 @@ static int read_partial_message_pages(struct
ceph_connection *con,
 		   (int)(PAGE_SIZE - con->in_msg_pos.page_pos));
 	/* (page) data */
 	BUG_ON(pages == NULL);
-	p = kmap(pages[con->in_msg_pos.page]);
-	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
-			       left);
+	page = pages[con->in_msg_pos.page];
+	p = kmap(page);
+	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos, left);
 	if (ret > 0 && do_datacrc)
 		con->in_data_crc =
 			crc32c(con->in_data_crc,
 				  p + con->in_msg_pos.page_pos, ret);
-	kunmap(pages[con->in_msg_pos.page]);
+	kunmap(page);
 	if (ret <= 0)
 		return ret;
-	con->in_msg_pos.data_pos += ret;
-	con->in_msg_pos.page_pos += ret;
-	if (con->in_msg_pos.page_pos == PAGE_SIZE) {
-		con->in_msg_pos.page_pos = 0;
-		con->in_msg_pos.page++;
-	}
+
+	in_msg_pos_next(con, left, ret);

 	return ret;
 }
@@ -1823,32 +1842,30 @@ static int read_partial_message_bio(struct
ceph_connection *con,
 {
 	struct ceph_msg *msg = con->in_msg;
 	struct bio_vec *bv;
+	struct page *page;
 	void *p;
 	int ret, left;

 	BUG_ON(!msg);
 	BUG_ON(!msg->bio_iter);
 	bv = bio_iovec_idx(msg->bio_iter, msg->bio_seg);
+
 	left = min((int)(data_len - con->in_msg_pos.data_pos),
 		   (int)(bv->bv_len - con->in_msg_pos.page_pos));

-	p = kmap(bv->bv_page) + bv->bv_offset;
+	page = bv->bv_page;
+	p = kmap(page) + bv->bv_offset;

-	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
-			       left);
+	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos, left);
 	if (ret > 0 && do_datacrc)
 		con->in_data_crc =
 			crc32c(con->in_data_crc,
 				  p + con->in_msg_pos.page_pos, ret);
-	kunmap(bv->bv_page);
+	kunmap(page);
 	if (ret <= 0)
 		return ret;
-	con->in_msg_pos.data_pos += ret;
-	con->in_msg_pos.page_pos += ret;
-	if (con->in_msg_pos.page_pos == bv->bv_len) {
-		con->in_msg_pos.page_pos = 0;
-		iter_bio_next(&msg->bio_iter, &msg->bio_seg);
-	}
+
+	in_msg_pos_next(con, left, ret);

 	return ret;
 }
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/8] libceph: advance pagelist with list_rotate_left()
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
                   ` (4 preceding siblings ...)
  2013-03-09 15:14 ` [PATCH 5/8] libceph: define and use in_msg_pos_next() Alex Elder
@ 2013-03-09 15:15 ` Alex Elder
  2013-03-09 15:15 ` [PATCH 7/8] libceph: simplify new message initialization Alex Elder
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:15 UTC (permalink / raw)
  To: ceph-devel

While processing an outgoing pagelist (either the data pagelist or
trail) in a ceph message, the messenger cycles through each of the
pages on the list.  This is accomplished in out_msg_pos_next(), if
the end of the first page on the list is reached, the first page is
moved to the end of the list.

There is a list operation, list_rotate_left(), which performs
exactly this operation, and by using it, what's really going on
becomes more obvious.

So replace these two list_move_tail() calls with list_rotate_left().

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/messenger.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index fb5f6e7..2734d03 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1041,11 +1041,9 @@ static void out_msg_pos_next(struct
ceph_connection *con, struct page *page,
 	con->out_msg_pos.page++;
 	con->out_msg_pos.did_page_crc = false;
 	if (in_trail)
-		list_move_tail(&page->lru,
-			       &msg->trail->head);
+		list_rotate_left(&msg->trail->head);
 	else if (msg->pagelist)
-		list_move_tail(&page->lru,
-			       &msg->pagelist->head);
+		list_rotate_left(&msg->pagelist->head);
 #ifdef CONFIG_BLOCK
 	else if (msg->bio)
 		iter_bio_next(&msg->bio_iter, &msg->bio_seg);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/8] libceph: simplify new message initialization
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
                   ` (5 preceding siblings ...)
  2013-03-09 15:15 ` [PATCH 6/8] libceph: advance pagelist with list_rotate_left() Alex Elder
@ 2013-03-09 15:15 ` Alex Elder
  2013-03-09 15:15 ` [PATCH 8/8] libceph: record byte count not page count Alex Elder
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:15 UTC (permalink / raw)
  To: ceph-devel

Rather than explicitly initializing many fields to 0, NULL, or false
in a newly-allocated message, just use kzalloc() for allocating new
messages.  This will become a much more convenient way of doing
things anyway for upcoming patches that abstract the data field.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/messenger.c |   38 ++++----------------------------------
 1 file changed, 4 insertions(+), 34 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 2734d03..ce1669f 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2699,49 +2699,19 @@ struct ceph_msg *ceph_msg_new(int type, int
front_len, gfp_t flags,
 {
 	struct ceph_msg *m;

-	m = kmalloc(sizeof(*m), flags);
+	m = kzalloc(sizeof(*m), flags);
 	if (m == NULL)
 		goto out;
-	kref_init(&m->kref);
-
-	m->con = NULL;
-	INIT_LIST_HEAD(&m->list_head);

-	m->hdr.tid = 0;
 	m->hdr.type = cpu_to_le16(type);
 	m->hdr.priority = cpu_to_le16(CEPH_MSG_PRIO_DEFAULT);
-	m->hdr.version = 0;
 	m->hdr.front_len = cpu_to_le32(front_len);
-	m->hdr.middle_len = 0;
-	m->hdr.data_len = 0;
-	m->hdr.data_off = 0;
-	m->hdr.reserved = 0;
-	m->footer.front_crc = 0;
-	m->footer.middle_crc = 0;
-	m->footer.data_crc = 0;
-	m->footer.flags = 0;
-	m->front_max = front_len;
-	m->front_is_vmalloc = false;
-	m->more_to_follow = false;
-	m->ack_stamp = 0;
-	m->pool = NULL;

-	/* middle */
-	m->middle = NULL;
-
-	/* data */
-	m->page_count = 0;
-	m->page_alignment = 0;
-	m->pages = NULL;
-	m->pagelist = NULL;
-#ifdef	CONFIG_BLOCK
-	m->bio = NULL;
-	m->bio_iter = NULL;
-	m->bio_seg = 0;
-#endif	/* CONFIG_BLOCK */
-	m->trail = NULL;
+	INIT_LIST_HEAD(&m->list_head);
+	kref_init(&m->kref);

 	/* front */
+	m->front_max = front_len;
 	if (front_len) {
 		if (front_len > PAGE_CACHE_SIZE) {
 			m->front.iov_base = __vmalloc(front_len, flags,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 8/8] libceph: record byte count not page count
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
                   ` (6 preceding siblings ...)
  2013-03-09 15:15 ` [PATCH 7/8] libceph: simplify new message initialization Alex Elder
@ 2013-03-09 15:15 ` Alex Elder
  2013-03-09 15:21 ` [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
  2013-03-11 18:57 ` Josh Durgin
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:15 UTC (permalink / raw)
  To: ceph-devel

Record the byte count for an osd request rather than the page count.
The number of pages can always be derived from the byte count (and
alignment/offset) but the reverse is not true.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c             |    2 +-
 fs/ceph/addr.c                  |   33 ++++++++++++++++----------
 fs/ceph/file.c                  |    2 +-
 include/linux/ceph/osd_client.h |    2 +-
 net/ceph/osd_client.c           |   50
+++++++++++++++++++++++----------------
 5 files changed, 52 insertions(+), 37 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index a0a6182..ae6b976 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1420,7 +1420,7 @@ static struct ceph_osd_request *rbd_osd_req_create(
 	case OBJ_REQUEST_PAGES:
 		osd_data->type = CEPH_OSD_DATA_TYPE_PAGES;
 		osd_data->pages = obj_request->pages;
-		osd_data->num_pages = obj_request->page_count;
+		osd_data->length = obj_request->length;
 		osd_data->alignment = offset & ~PAGE_MASK;
 		osd_data->pages_from_pool = false;
 		osd_data->own_pages = false;
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index ceb2829..67d4965 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -238,13 +238,16 @@ static void finish_read(struct ceph_osd_request
*req, struct ceph_msg *msg)
 	struct inode *inode = req->r_inode;
 	int rc = req->r_result;
 	int bytes = le32_to_cpu(msg->hdr.data_len);
+	int num_pages;
 	int i;

 	dout("finish_read %p req %p rc %d bytes %d\n", inode, req, rc, bytes);

 	/* unlock all pages, zeroing any data we didn't read */
 	BUG_ON(req->r_data_in.type != CEPH_OSD_DATA_TYPE_PAGES);
-	for (i = 0; i < req->r_data_in.num_pages; i++) {
+	num_pages = calc_pages_for((u64)req->r_data_in.alignment,
+					(u64)req->r_data_in.length);
+	for (i = 0; i < num_pages; i++) {
 		struct page *page = req->r_data_in.pages[i];

 		if (bytes < (int)PAGE_CACHE_SIZE) {
@@ -340,7 +343,7 @@ static int start_read(struct inode *inode, struct
list_head *page_list, int max)
 	}
 	req->r_data_in.type = CEPH_OSD_DATA_TYPE_PAGES;
 	req->r_data_in.pages = pages;
-	req->r_data_in.num_pages = nr_pages;
+	req->r_data_in.length = len;
 	req->r_data_in.alignment = 0;
 	req->r_callback = finish_read;
 	req->r_inode = inode;
@@ -555,6 +558,7 @@ static void writepages_finish(struct
ceph_osd_request *req,
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	unsigned wrote;
 	struct page *page;
+	int num_pages;
 	int i;
 	struct ceph_snap_context *snapc = req->r_snapc;
 	struct address_space *mapping = inode->i_mapping;
@@ -565,6 +569,8 @@ static void writepages_finish(struct
ceph_osd_request *req,
 	unsigned issued = ceph_caps_issued(ci);

 	BUG_ON(req->r_data_out.type != CEPH_OSD_DATA_TYPE_PAGES);
+	num_pages = calc_pages_for((u64)req->r_data_out.alignment,
+					(u64)req->r_data_out.length);
 	if (rc >= 0) {
 		/*
 		 * Assume we wrote the pages we originally sent.  The
@@ -572,7 +578,7 @@ static void writepages_finish(struct
ceph_osd_request *req,
 		 * raced with a truncation and was adjusted at the osd,
 		 * so don't believe the reply.
 		 */
-		wrote = req->r_data_out.num_pages;
+		wrote = num_pages;
 	} else {
 		wrote = 0;
 		mapping_set_error(mapping, rc);
@@ -581,7 +587,7 @@ static void writepages_finish(struct
ceph_osd_request *req,
 	     inode, rc, bytes, wrote);

 	/* clean all pages */
-	for (i = 0; i < req->r_data_out.num_pages; i++) {
+	for (i = 0; i < num_pages; i++) {
 		page = req->r_data_out.pages[i];
 		BUG_ON(!page);
 		WARN_ON(!PageUptodate(page));
@@ -611,9 +617,9 @@ static void writepages_finish(struct
ceph_osd_request *req,
 		unlock_page(page);
 	}
 	dout("%p wrote+cleaned %d pages\n", inode, wrote);
-	ceph_put_wrbuffer_cap_refs(ci, req->r_data_out.num_pages, snapc);
+	ceph_put_wrbuffer_cap_refs(ci, num_pages, snapc);

-	ceph_release_pages(req->r_data_out.pages, req->r_data_out.num_pages);
+	ceph_release_pages(req->r_data_out.pages, num_pages);
 	if (req->r_data_out.pages_from_pool)
 		mempool_free(req->r_data_out.pages,
 			     ceph_sb_to_client(inode->i_sb)->wb_pagevec_pool);
@@ -624,15 +630,18 @@ static void writepages_finish(struct
ceph_osd_request *req,

 /*
  * allocate a page vec, either directly, or if necessary, via a the
- * mempool.  we avoid the mempool if we can because
req->r_data_out.num_pages
+ * mempool.  we avoid the mempool if we can because req->r_data_out.length
  * may be less than the maximum write size.
  */
 static void alloc_page_vec(struct ceph_fs_client *fsc,
 			   struct ceph_osd_request *req)
 {
 	size_t size;
+	int num_pages;

-	size = sizeof (struct page *) * req->r_data_out.num_pages;
+	num_pages = calc_pages_for((u64)req->r_data_out.alignment,
+					(u64)req->r_data_out.length);
+	size = sizeof (struct page *) * num_pages;
 	req->r_data_out.pages = kmalloc(size, GFP_NOFS);
 	if (!req->r_data_out.pages) {
 		req->r_data_out.pages = mempool_alloc(fsc->wb_pagevec_pool,
@@ -838,11 +847,9 @@ get_more_pages:
 				}

 				req->r_data_out.type = CEPH_OSD_DATA_TYPE_PAGES;
-				req->r_data_out.num_pages =
-						calc_pages_for(0, len);
+				req->r_data_out.length = len;
 				req->r_data_out.alignment = 0;
-				max_pages = req->r_data_out.num_pages;
-
+				max_pages = calc_pages_for(0, (u64)len);
 				alloc_page_vec(fsc, req);
 				req->r_callback = writepages_finish;
 				req->r_inode = inode;
@@ -900,7 +907,7 @@ get_more_pages:
 		     locked_pages, offset, len);

 		/* revise final length, page count */
-		req->r_data_out.num_pages = locked_pages;
+		req->r_data_out.length = len;
 		req->r_request_ops[0].extent.length = cpu_to_le64(len);
 		req->r_request_ops[0].payload_len = cpu_to_le32(len);
 		req->r_request->hdr.data_len = cpu_to_le32(len);
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 3e0a6da..1cd009a 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -573,7 +573,7 @@ more:
 	}
 	req->r_data_out.type = CEPH_OSD_DATA_TYPE_PAGES;
 	req->r_data_out.pages = pages;
-	req->r_data_out.num_pages = num_pages;
+	req->r_data_out.length = len;
 	req->r_data_out.alignment = page_align;
 	req->r_inode = inode;

diff --git a/include/linux/ceph/osd_client.h
b/include/linux/ceph/osd_client.h
index 40e0260..a8016df 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -63,7 +63,7 @@ struct ceph_osd_data {
 	union {
 		struct {
 			struct page	**pages;
-			u32		num_pages;
+			u64		length;
 			u32		alignment;
 			bool		pages_from_pool;
 			bool		own_pages;
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index f9cf445..202af14 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -107,6 +107,7 @@ static int calc_layout(struct ceph_file_layout
*layout, u64 off, u64 *plen,
  */
 void ceph_osdc_release_request(struct kref *kref)
 {
+	int num_pages;
 	struct ceph_osd_request *req = container_of(kref,
 						    struct ceph_osd_request,
 						    r_kref);
@@ -124,13 +125,17 @@ void ceph_osdc_release_request(struct kref *kref)
 		ceph_msg_put(req->r_reply);

 	if (req->r_data_in.type == CEPH_OSD_DATA_TYPE_PAGES &&
-			req->r_data_in.own_pages)
-		ceph_release_page_vector(req->r_data_in.pages,
-					 req->r_data_in.num_pages);
+			req->r_data_in.own_pages) {
+		num_pages = calc_pages_for((u64)req->r_data_in.alignment,
+						(u64)req->r_data_in.length);
+		ceph_release_page_vector(req->r_data_in.pages, num_pages);
+	}
 	if (req->r_data_out.type == CEPH_OSD_DATA_TYPE_PAGES &&
-			req->r_data_out.own_pages)
-		ceph_release_page_vector(req->r_data_out.pages,
-					 req->r_data_out.num_pages);
+			req->r_data_out.own_pages) {
+		num_pages = calc_pages_for((u64)req->r_data_out.alignment,
+						(u64)req->r_data_out.length);
+		ceph_release_page_vector(req->r_data_out.pages, num_pages);
+	}

 	ceph_put_snap_context(req->r_snapc);
 	ceph_pagelist_release(&req->r_trail);
@@ -1753,8 +1758,12 @@ int ceph_osdc_start_request(struct
ceph_osd_client *osdc,

 	osd_data = &req->r_data_out;
 	if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
+		unsigned int page_count;
+
 		req->r_request->pages = osd_data->pages;
-		req->r_request->page_count = osd_data->num_pages;
+		page_count = calc_pages_for((u64)osd_data->alignment,
+						(u64)osd_data->length);
+		req->r_request->page_count = page_count;
 		req->r_request->page_alignment = osd_data->alignment;
 #ifdef CONFIG_BLOCK
 	} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
@@ -1967,11 +1976,11 @@ int ceph_osdc_readpages(struct ceph_osd_client
*osdc,
 	osd_data = &req->r_data_in;
 	osd_data->type = CEPH_OSD_DATA_TYPE_PAGES;
 	osd_data->pages = pages;
-	osd_data->num_pages = calc_pages_for(page_align, *plen);
+	osd_data->length = *plen;
 	osd_data->alignment = page_align;

-	dout("readpages  final extent is %llu~%llu (%d pages align %d)\n",
-	     off, *plen, osd_data->num_pages, page_align);
+	dout("readpages  final extent is %llu~%llu (%llu bytes align %d)\n",
+	     off, *plen, osd_data->length, page_align);

 	rc = ceph_osdc_start_request(osdc, req, false);
 	if (!rc)
@@ -2013,10 +2022,9 @@ int ceph_osdc_writepages(struct ceph_osd_client
*osdc, struct ceph_vino vino,
 	osd_data = &req->r_data_out;
 	osd_data->type = CEPH_OSD_DATA_TYPE_PAGES;
 	osd_data->pages = pages;
-	osd_data->num_pages = calc_pages_for(page_align, len);
+	osd_data->length = len;
 	osd_data->alignment = page_align;
-	dout("writepages %llu~%llu (%d pages)\n", off, len,
-		osd_data->num_pages);
+	dout("writepages %llu~%llu (%llu bytes)\n", off, len, osd_data->length);

 	rc = ceph_osdc_start_request(osdc, req, true);
 	if (!rc)
@@ -2112,23 +2120,23 @@ static struct ceph_msg *get_reply(struct
ceph_connection *con,
 		struct ceph_osd_data *osd_data = &req->r_data_in;

 		if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
-			int want;
+			unsigned int page_count;

-			want = calc_pages_for(osd_data->alignment, data_len);
 			if (osd_data->pages &&
-				unlikely(osd_data->num_pages < want)) {
+				unlikely(osd_data->length < data_len)) {

-				pr_warning("tid %lld reply has %d bytes %d "
-					"pages, we had only %d pages ready\n",
-					tid, data_len, want,
-					osd_data->num_pages);
+				pr_warning("tid %lld reply has %d bytes "
+					"we had only %llu bytes ready\n",
+					tid, data_len, osd_data->length);
 				*skip = 1;
 				ceph_msg_put(m);
 				m = NULL;
 				goto out;
 			}
+			page_count = calc_pages_for((u64)osd_data->alignment,
+							(u64)osd_data->length);
 			m->pages = osd_data->pages;
-			m->page_count = osd_data->num_pages;
+			m->page_count = page_count;
 			m->page_alignment = osd_data->alignment;
 #ifdef CONFIG_BLOCK
 		} else if (osd_data->type == CEPH_OSD_DATA_TYPE_BIO) {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8] libceph: miscellaneous cleanups
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
                   ` (7 preceding siblings ...)
  2013-03-09 15:15 ` [PATCH 8/8] libceph: record byte count not page count Alex Elder
@ 2013-03-09 15:21 ` Alex Elder
  2013-03-11 18:57 ` Josh Durgin
  9 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2013-03-09 15:21 UTC (permalink / raw)
  To: ceph-devel

On 03/09/2013 09:11 AM, Alex Elder wrote:
> This series contains a bunch of small cleanups.  All of
> them I've created while working on other things, and most
> of them were done in order to allow upcoming work to be
> easier, or easier to understand.  I've moved them all to
> the front of my stack of patches and now I'm posting them
> as a set for review.  They apply on top of the ceph-client
> "testing" branch as of a couple of days ago:
>     2f60d302  libceph: fix decoding of pgids

I forgot to say, these are available on the "review/wip-cleanup"
branch of the ceph-client repository.

					-Alex


> 					-Alex
> 
> [PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN
> [PATCH 2/8] libceph: minor byte order problems in
> [PATCH 3/8] libceph: change type of ceph_tcp_sendpage() "more"
> [PATCH 4/8] libceph: kill args in read_partial_message_bio()
> [PATCH 5/8] libceph: define and use in_msg_pos_next()
> [PATCH 6/8] libceph: advance pagelist with list_rotate_left()
> [PATCH 7/8] libceph: simplify new message initialization
> [PATCH 8/8] libceph: record byte count not page count
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/8] libceph: define and use in_msg_pos_next()
  2013-03-09 15:14 ` [PATCH 5/8] libceph: define and use in_msg_pos_next() Alex Elder
@ 2013-03-11 18:57   ` Josh Durgin
  2013-03-11 19:16     ` Alex Elder
  0 siblings, 1 reply; 14+ messages in thread
From: Josh Durgin @ 2013-03-11 18:57 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On 03/09/2013 07:14 AM, Alex Elder wrote:
> Define a new function in_msg_pos_next() to match out_msg_pos_next(),
> and use it in place of code at the end of read_partial_message_pages()
> and read_partial_message_bio().
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   net/ceph/messenger.c |   57
> ++++++++++++++++++++++++++++++++------------------
>   1 file changed, 37 insertions(+), 20 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 2017b88..fb5f6e7 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -1052,6 +1052,28 @@ static void out_msg_pos_next(struct
> ceph_connection *con, struct page *page,
>   #endif
>   }
>
> +static void in_msg_pos_next(struct ceph_connection *con, size_t len,
> +				size_t received)
> +{
> +	struct ceph_msg *msg = con->in_msg;
> +
> +	BUG_ON(!msg);
> +	BUG_ON(!received);
> +
> +	con->in_msg_pos.data_pos += received;
> +	con->in_msg_pos.page_pos += received;
> +	if (received < len)
> +		return;

This is different from the condition in read_partial_message_pages()
case, which only increments in_msg_pos.page and resets page_pos when
page_pos == PAGE_SIZE. Maybe this is equivalent, but if so it's not
obvious.

> +
> +	BUG_ON(received != len);
> +	con->in_msg_pos.page_pos = 0;
> +	con->in_msg_pos.page++;
> +#ifdef CONFIG_BLOCK
> +	if (msg->bio)
> +		iter_bio_next(&msg->bio_iter, &msg->bio_seg);
> +#endif /* CONFIG_BLOCK */
> +}
> +
>   /*
>    * Write as much message data payload as we can.  If we finish, queue
>    * up the footer.
> @@ -1789,6 +1811,7 @@ static int read_partial_message_pages(struct
> ceph_connection *con,
>   				      struct page **pages,
>   				      unsigned int data_len, bool do_datacrc)
>   {
> +	struct page *page;
>   	void *p;
>   	int ret;
>   	int left;
> @@ -1797,22 +1820,18 @@ static int read_partial_message_pages(struct
> ceph_connection *con,
>   		   (int)(PAGE_SIZE - con->in_msg_pos.page_pos));
>   	/* (page) data */
>   	BUG_ON(pages == NULL);
> -	p = kmap(pages[con->in_msg_pos.page]);
> -	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
> -			       left);
> +	page = pages[con->in_msg_pos.page];
> +	p = kmap(page);
> +	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos, left);
>   	if (ret > 0 && do_datacrc)
>   		con->in_data_crc =
>   			crc32c(con->in_data_crc,
>   				  p + con->in_msg_pos.page_pos, ret);
> -	kunmap(pages[con->in_msg_pos.page]);
> +	kunmap(page);
>   	if (ret <= 0)
>   		return ret;
> -	con->in_msg_pos.data_pos += ret;
> -	con->in_msg_pos.page_pos += ret;
> -	if (con->in_msg_pos.page_pos == PAGE_SIZE) {
> -		con->in_msg_pos.page_pos = 0;
> -		con->in_msg_pos.page++;
> -	}
> +
> +	in_msg_pos_next(con, left, ret);
>
>   	return ret;
>   }
> @@ -1823,32 +1842,30 @@ static int read_partial_message_bio(struct
> ceph_connection *con,
>   {
>   	struct ceph_msg *msg = con->in_msg;
>   	struct bio_vec *bv;
> +	struct page *page;
>   	void *p;
>   	int ret, left;
>
>   	BUG_ON(!msg);
>   	BUG_ON(!msg->bio_iter);
>   	bv = bio_iovec_idx(msg->bio_iter, msg->bio_seg);
> +
>   	left = min((int)(data_len - con->in_msg_pos.data_pos),
>   		   (int)(bv->bv_len - con->in_msg_pos.page_pos));
>
> -	p = kmap(bv->bv_page) + bv->bv_offset;
> +	page = bv->bv_page;
> +	p = kmap(page) + bv->bv_offset;
>
> -	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
> -			       left);
> +	ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos, left);
>   	if (ret > 0 && do_datacrc)
>   		con->in_data_crc =
>   			crc32c(con->in_data_crc,
>   				  p + con->in_msg_pos.page_pos, ret);
> -	kunmap(bv->bv_page);
> +	kunmap(page);
>   	if (ret <= 0)
>   		return ret;
> -	con->in_msg_pos.data_pos += ret;
> -	con->in_msg_pos.page_pos += ret;
> -	if (con->in_msg_pos.page_pos == bv->bv_len) {
> -		con->in_msg_pos.page_pos = 0;
> -		iter_bio_next(&msg->bio_iter, &msg->bio_seg);
> -	}
> +
> +	in_msg_pos_next(con, left, ret);
>
>   	return ret;
>   }
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8] libceph: miscellaneous cleanups
  2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
                   ` (8 preceding siblings ...)
  2013-03-09 15:21 ` [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
@ 2013-03-11 18:57 ` Josh Durgin
  9 siblings, 0 replies; 14+ messages in thread
From: Josh Durgin @ 2013-03-11 18:57 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On 03/09/2013 07:11 AM, Alex Elder wrote:
> This series contains a bunch of small cleanups.  All of
> them I've created while working on other things, and most
> of them were done in order to allow upcoming work to be
> easier, or easier to understand.  I've moved them all to
> the front of my stack of patches and now I'm posting them
> as a set for review.  They apply on top of the ceph-client
> "testing" branch as of a couple of days ago:
>      2f60d302  libceph: fix decoding of pgids
>
> 					-Alex
>
> [PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN
> [PATCH 2/8] libceph: minor byte order problems in
> [PATCH 3/8] libceph: change type of ceph_tcp_sendpage() "more"
> [PATCH 4/8] libceph: kill args in read_partial_message_bio()
> [PATCH 5/8] libceph: define and use in_msg_pos_next()
> [PATCH 6/8] libceph: advance pagelist with list_rotate_left()
> [PATCH 7/8] libceph: simplify new message initialization
> [PATCH 8/8] libceph: record byte count not page count

These look good except for 5/8.
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/8] libceph: define and use in_msg_pos_next()
  2013-03-11 18:57   ` Josh Durgin
@ 2013-03-11 19:16     ` Alex Elder
  2013-03-11 19:28       ` Josh Durgin
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Elder @ 2013-03-11 19:16 UTC (permalink / raw)
  To: Josh Durgin; +Cc: ceph-devel

On 03/11/2013 01:57 PM, Josh Durgin wrote:
> On 03/09/2013 07:14 AM, Alex Elder wrote:
>> Define a new function in_msg_pos_next() to match out_msg_pos_next(),
>> and use it in place of code at the end of read_partial_message_pages()
>> and read_partial_message_bio().
>>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>>   net/ceph/messenger.c |   57
>> ++++++++++++++++++++++++++++++++------------------
>>   1 file changed, 37 insertions(+), 20 deletions(-)
>>
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index 2017b88..fb5f6e7 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -1052,6 +1052,28 @@ static void out_msg_pos_next(struct
>> ceph_connection *con, struct page *page,
>>   #endif
>>   }
>>
>> +static void in_msg_pos_next(struct ceph_connection *con, size_t len,
>> +                size_t received)
>> +{
>> +    struct ceph_msg *msg = con->in_msg;
>> +
>> +    BUG_ON(!msg);
>> +    BUG_ON(!received);
>> +
>> +    con->in_msg_pos.data_pos += received;
>> +    con->in_msg_pos.page_pos += received;
>> +    if (received < len)
>> +        return;
> 
> This is different from the condition in read_partial_message_pages()
> case, which only increments in_msg_pos.page and resets page_pos when
> page_pos == PAGE_SIZE. Maybe this is equivalent, but if so it's not
> obvious.

You are correct.  And, despite my usual thoroughness and
verbosity, I didn't explain this one.

This message position structure (in_msg_pos) is tracking
how much of the message has been consumed.  Each time a
an incoming message is going to arrive, we find out how
much room is left--not surpassing the current page--and
provide that as the "len" (number of bytes) to receive:

      left = min((int)(data_len - con->in_msg_pos.data_pos),
             (int)(bv->bv_len - con->in_msg_pos.page_pos));

This is saying the amount we'll use is the lesser of:
- all that's left of the entire request
- all that's left in the current page

The number of bytes received is then used, along with
the "len" value, to determine how to advance the position.
If we received exactly how many were requested, we either
reached the end of the request or the end of the page.
In the first case, we're done, in the second, we move
onto the next page in the array.

In all cases but (possibly) on the last page, after adding
the number of bytes received, page_pos == PAGE_SIZE.  On the
last page, it doesn't really matter whether we increment
the page number and reset the page position, because we're
done and we won't come back here again.  The code previously
skipped over that last case, basically.  The new code
handles that case the same as the others, incrementing
and resetting.

So the short answer is, they are still equivalent.  And
I agree, it is not obvious.

If you are satisfied, I can add the above (or maybe an
abbreviated form of it) to the commit message.

					-Alex

>> +
>> +    BUG_ON(received != len);
>> +    con->in_msg_pos.page_pos = 0;
>> +    con->in_msg_pos.page++;
>> +#ifdef CONFIG_BLOCK
>> +    if (msg->bio)
>> +        iter_bio_next(&msg->bio_iter, &msg->bio_seg);
>> +#endif /* CONFIG_BLOCK */
>> +}
>> +
>>   /*
>>    * Write as much message data payload as we can.  If we finish, queue
>>    * up the footer.
>> @@ -1789,6 +1811,7 @@ static int read_partial_message_pages(struct
>> ceph_connection *con,
>>                         struct page **pages,
>>                         unsigned int data_len, bool do_datacrc)
>>   {
>> +    struct page *page;
>>       void *p;
>>       int ret;
>>       int left;
>> @@ -1797,22 +1820,18 @@ static int read_partial_message_pages(struct
>> ceph_connection *con,
>>              (int)(PAGE_SIZE - con->in_msg_pos.page_pos));
>>       /* (page) data */
>>       BUG_ON(pages == NULL);
>> -    p = kmap(pages[con->in_msg_pos.page]);
>> -    ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
>> -                   left);
>> +    page = pages[con->in_msg_pos.page];
>> +    p = kmap(page);
>> +    ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
>> left);
>>       if (ret > 0 && do_datacrc)
>>           con->in_data_crc =
>>               crc32c(con->in_data_crc,
>>                     p + con->in_msg_pos.page_pos, ret);
>> -    kunmap(pages[con->in_msg_pos.page]);
>> +    kunmap(page);
>>       if (ret <= 0)
>>           return ret;
>> -    con->in_msg_pos.data_pos += ret;
>> -    con->in_msg_pos.page_pos += ret;
>> -    if (con->in_msg_pos.page_pos == PAGE_SIZE) {
>> -        con->in_msg_pos.page_pos = 0;
>> -        con->in_msg_pos.page++;
>> -    }
>> +
>> +    in_msg_pos_next(con, left, ret);
>>
>>       return ret;
>>   }
>> @@ -1823,32 +1842,30 @@ static int read_partial_message_bio(struct
>> ceph_connection *con,
>>   {
>>       struct ceph_msg *msg = con->in_msg;
>>       struct bio_vec *bv;
>> +    struct page *page;
>>       void *p;
>>       int ret, left;
>>
>>       BUG_ON(!msg);
>>       BUG_ON(!msg->bio_iter);
>>       bv = bio_iovec_idx(msg->bio_iter, msg->bio_seg);
>> +
>>       left = min((int)(data_len - con->in_msg_pos.data_pos),
>>              (int)(bv->bv_len - con->in_msg_pos.page_pos));
>>
>> -    p = kmap(bv->bv_page) + bv->bv_offset;
>> +    page = bv->bv_page;
>> +    p = kmap(page) + bv->bv_offset;
>>
>> -    ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
>> -                   left);
>> +    ret = ceph_tcp_recvmsg(con->sock, p + con->in_msg_pos.page_pos,
>> left);
>>       if (ret > 0 && do_datacrc)
>>           con->in_data_crc =
>>               crc32c(con->in_data_crc,
>>                     p + con->in_msg_pos.page_pos, ret);
>> -    kunmap(bv->bv_page);
>> +    kunmap(page);
>>       if (ret <= 0)
>>           return ret;
>> -    con->in_msg_pos.data_pos += ret;
>> -    con->in_msg_pos.page_pos += ret;
>> -    if (con->in_msg_pos.page_pos == bv->bv_len) {
>> -        con->in_msg_pos.page_pos = 0;
>> -        iter_bio_next(&msg->bio_iter, &msg->bio_seg);
>> -    }
>> +
>> +    in_msg_pos_next(con, left, ret);
>>
>>       return ret;
>>   }
>>
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/8] libceph: define and use in_msg_pos_next()
  2013-03-11 19:16     ` Alex Elder
@ 2013-03-11 19:28       ` Josh Durgin
  0 siblings, 0 replies; 14+ messages in thread
From: Josh Durgin @ 2013-03-11 19:28 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On 03/11/2013 12:16 PM, Alex Elder wrote:
> On 03/11/2013 01:57 PM, Josh Durgin wrote:
>> On 03/09/2013 07:14 AM, Alex Elder wrote:
>>> Define a new function in_msg_pos_next() to match out_msg_pos_next(),
>>> and use it in place of code at the end of read_partial_message_pages()
>>> and read_partial_message_bio().
>>>
>>> Signed-off-by: Alex Elder <elder@inktank.com>
>>> ---
>>>    net/ceph/messenger.c |   57
>>> ++++++++++++++++++++++++++++++++------------------
>>>    1 file changed, 37 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>>> index 2017b88..fb5f6e7 100644
>>> --- a/net/ceph/messenger.c
>>> +++ b/net/ceph/messenger.c
>>> @@ -1052,6 +1052,28 @@ static void out_msg_pos_next(struct
>>> ceph_connection *con, struct page *page,
>>>    #endif
>>>    }
>>>
>>> +static void in_msg_pos_next(struct ceph_connection *con, size_t len,
>>> +                size_t received)
>>> +{
>>> +    struct ceph_msg *msg = con->in_msg;
>>> +
>>> +    BUG_ON(!msg);
>>> +    BUG_ON(!received);
>>> +
>>> +    con->in_msg_pos.data_pos += received;
>>> +    con->in_msg_pos.page_pos += received;
>>> +    if (received < len)
>>> +        return;
>>
>> This is different from the condition in read_partial_message_pages()
>> case, which only increments in_msg_pos.page and resets page_pos when
>> page_pos == PAGE_SIZE. Maybe this is equivalent, but if so it's not
>> obvious.
>
> You are correct.  And, despite my usual thoroughness and
> verbosity, I didn't explain this one.
>
> This message position structure (in_msg_pos) is tracking
> how much of the message has been consumed.  Each time a
> an incoming message is going to arrive, we find out how
> much room is left--not surpassing the current page--and
> provide that as the "len" (number of bytes) to receive:
>
>        left = min((int)(data_len - con->in_msg_pos.data_pos),
>               (int)(bv->bv_len - con->in_msg_pos.page_pos));
>
> This is saying the amount we'll use is the lesser of:
> - all that's left of the entire request
> - all that's left in the current page
>
> The number of bytes received is then used, along with
> the "len" value, to determine how to advance the position.
> If we received exactly how many were requested, we either
> reached the end of the request or the end of the page.
> In the first case, we're done, in the second, we move
> onto the next page in the array.
>
> In all cases but (possibly) on the last page, after adding
> the number of bytes received, page_pos == PAGE_SIZE.  On the
> last page, it doesn't really matter whether we increment
> the page number and reset the page position, because we're
> done and we won't come back here again.  The code previously
> skipped over that last case, basically.  The new code
> handles that case the same as the others, incrementing
> and resetting.
>
> So the short answer is, they are still equivalent.  And
> I agree, it is not obvious.
>
> If you are satisfied, I can add the above (or maybe an
> abbreviated form of it) to the commit message.

Thanks for the explanation, that makes sense now.
It'd be nice to have some form of it in the commit message.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-03-11 19:29 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-09 15:11 [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
2013-03-09 15:13 ` [PATCH 1/8] libceph: define CEPH_MSG_MAX_MIDDLE_LEN Alex Elder
2013-03-09 15:13 ` [PATCH 2/8] libceph: minor byte order problems in Alex Elder
2013-03-09 15:14 ` [PATCH 3/8] libceph: change type of ceph_tcp_sendpage() "more" Alex Elder
2013-03-09 15:14 ` [PATCH 4/8] libceph: kill args in read_partial_message_bio() Alex Elder
2013-03-09 15:14 ` [PATCH 5/8] libceph: define and use in_msg_pos_next() Alex Elder
2013-03-11 18:57   ` Josh Durgin
2013-03-11 19:16     ` Alex Elder
2013-03-11 19:28       ` Josh Durgin
2013-03-09 15:15 ` [PATCH 6/8] libceph: advance pagelist with list_rotate_left() Alex Elder
2013-03-09 15:15 ` [PATCH 7/8] libceph: simplify new message initialization Alex Elder
2013-03-09 15:15 ` [PATCH 8/8] libceph: record byte count not page count Alex Elder
2013-03-09 15:21 ` [PATCH 0/8] libceph: miscellaneous cleanups Alex Elder
2013-03-11 18:57 ` Josh Durgin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.