From: Jeff Layton <jlayton@kernel.org>
To: xiubli@redhat.com, ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, mchangir@redhat.com
Subject: Re: [PATCH v4 3/3] libceph: just wait for more data to be available on the socket
Date: Thu, 18 Jan 2024 13:24:47 -0500
Message-ID: <ca7f6ba894524474d513807a165f02f4ad50a506.camel@kernel.org>
In-Reply-To: <20240118105047.792879-4-xiubli@redhat.com>

On Thu, 2024-01-18 at 18:50 +0800, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> The messages from ceph may be split into multiple socket packets
> and we just need to wait for all the data to be available on the
> socket.
> 
> This will add 'sr_total_resid' to record the total length for all
> data items for sparse-read message and 'sr_resid_elen' to record
> the current extent total length.
> 
> URL: https://tracker.ceph.com/issues/63586
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  include/linux/ceph/messenger.h |  1 +
>  net/ceph/messenger_v1.c        | 32 +++++++++++++++++++++-----------
>  2 files changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 2eaaabbe98cb..ca6f82abed62 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -231,6 +231,7 @@ struct ceph_msg_data {
>  
>  struct ceph_msg_data_cursor {
>  	size_t			total_resid;	/* across all data items */
> +	size_t			sr_total_resid;	/* across all data items for sparse-read */
>  
>  	struct ceph_msg_data	*data;		/* current data item */
>  	size_t			resid;		/* bytes not yet consumed */
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 4cb60bacf5f5..2733da891688 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -160,7 +160,9 @@ static size_t sizeof_footer(struct ceph_connection *con)
>  static void prepare_message_data(struct ceph_msg *msg, u32 data_len)
>  {
>  	/* Initialize data cursor if it's not a sparse read */
> -	if (!msg->sparse_read)
> +	if (msg->sparse_read)
> +		msg->cursor.sr_total_resid = data_len;
> +	else
>  		ceph_msg_data_cursor_init(&msg->cursor, msg, data_len);
>  }
>  
> @@ -1032,35 +1034,43 @@ static int read_partial_sparse_msg_data(struct ceph_connection *con)
>  	bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC);
>  	u32 crc = 0;
>  	int ret = 1;
> +	int len;
>  
>  	if (do_datacrc)
>  		crc = con->in_data_crc;
>  
> -	do {
> -		if (con->v1.in_sr_kvec.iov_base)
> +	while (cursor->sr_total_resid) {
> +		len = 0;
> +		if (con->v1.in_sr_kvec.iov_base) {
> +			len = con->v1.in_sr_kvec.iov_len;
>  			ret = read_partial_message_chunk(con,
>  							 &con->v1.in_sr_kvec,
>  							 con->v1.in_sr_len,
>  							 &crc);
> -		else if (cursor->sr_resid > 0)
> +			len = con->v1.in_sr_kvec.iov_len - len;
> +		} else if (cursor->sr_resid > 0) {
> +			len = cursor->sr_resid;
>  			ret = read_partial_sparse_msg_extent(con, &crc);
> -
> -		if (ret <= 0) {
> -			if (do_datacrc)
> -				con->in_data_crc = crc;
> -			return ret;
> +			len -= cursor->sr_resid;
>  		}
> +		cursor->sr_total_resid -= len;
> +		if (ret <= 0)
> +			break;
>  
>  		memset(&con->v1.in_sr_kvec, 0, sizeof(con->v1.in_sr_kvec));
>  		ret = con->ops->sparse_read(con, cursor,
>  				(char **)&con->v1.in_sr_kvec.iov_base);
> +		if (ret <= 0) {
> +			ret = ret ? : 1; /* must return > 0 to indicate success */
> +			break;
> +		}
>  		con->v1.in_sr_len = ret;
> -	} while (ret > 0);
> +	}
>  
>  	if (do_datacrc)
>  		con->in_data_crc = crc;
>  
> -	return ret < 0 ? ret : 1;  /* must return > 0 to indicate success */
> +	return ret;
>  }
>  
>  static int read_partial_msg_data(struct ceph_connection *con)

Looking back over this code...

The way it works today, once we determine it's a sparse read, we call
read_sparse_msg_data. At that point we call either
read_partial_message_chunk (to read into the kvec) or
read_sparse_msg_extent if sr_resid is already set (indicating that we're
receiving an extent).

read_sparse_msg_extent calls ceph_tcp_recvpage in a loop until
cursor->sr_resid bytes have been received. The exception is when
ceph_tcp_recvpage returns <= 0, which ends the loop early.
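
To make that loop concrete, here is a minimal user-space model of it. This
is only a sketch, not the kernel code: struct fake_sock, fake_recv() and
read_extent() are hypothetical names standing in for the socket,
ceph_tcp_recvpage and read_sparse_msg_extent, and the residual count is a
plain size_t rather than cursor->sr_resid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a socket: only `avail` bytes are readable now. */
struct fake_sock {
	const char *data;	/* next byte to hand out */
	size_t avail;		/* bytes currently available */
};

/* Returns the number of bytes copied, or 0 when nothing is available. */
static int fake_recv(struct fake_sock *s, char *buf, size_t len)
{
	size_t n = len < s->avail ? len : s->avail;

	memcpy(buf, s->data, n);
	s->data += n;
	s->avail -= n;
	return (int)n;
}

/*
 * Keep receiving until *resid reaches zero. Returns 1 once the whole
 * extent has arrived, or the receive result (<= 0) if the loop had to
 * stop early. *resid keeps the progress made so far, so the caller can
 * resume later.
 */
static int read_extent(struct fake_sock *s, char *buf, size_t *resid)
{
	assert(s && buf && resid);

	while (*resid > 0) {
		int ret = fake_recv(s, buf, *resid);

		if (ret <= 0)
			return ret;
		buf += ret;
		*resid -= (size_t)ret;
	}
	return 1;
}
```

With 3 of 8 bytes available, the first call returns 0 with 5 bytes still
outstanding; a second call after the rest arrives returns 1.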

ceph_tcp_recvpage returns 0 if sock_recvmsg returns -EAGAIN (maybe also
in other cases). So it sounds like the client just timed out on a read
from the socket or caught a signal or something?
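
For illustration, the -EAGAIN-to-0 mapping looks roughly like this in user
space. This is a sketch under assumptions, not the libceph helper:
recv_or_zero() is a hypothetical name, and it uses recv(2) with
MSG_DONTWAIT in place of sock_recvmsg. Note that with recv(2) a return of 0
also means the peer closed the connection, so this sketch cannot tell the
two cases apart.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical user-space analogue: receive without blocking and report
 * "no data right now" as 0 instead of an error. Other failures are
 * returned as a negative errno value.
 */
static int recv_or_zero(int fd, void *buf, size_t len)
{
	ssize_t r;

	assert(buf && len > 0);

	r = recv(fd, buf, len, MSG_DONTWAIT);
	if (r >= 0)
		return (int)r;
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return 0;	/* nothing to read yet, caller should retry */
	return -errno;
}
```

A caller looping on this sees 0 whenever the socket is simply empty, which
is why knowing the actual return value matters here.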

If that's correct, then do we know what ceph_tcp_recvpage returned when
the problem happened?
-- 
Jeff Layton <jlayton@kernel.org>
