All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xiubo Li <xiubli@redhat.com>
To: Jeff Layton <jlayton@kernel.org>, ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, mchangir@redhat.com
Subject: Re: [PATCH v3 1/3] libceph: fail the sparse-read if there still has data in socket
Date: Mon, 18 Dec 2023 09:23:13 +0800	[thread overview]
Message-ID: <2d23c041-c329-422c-bb6b-98b6dc4813b7@redhat.com> (raw)
In-Reply-To: <9f7d560bdc3eb80dd03b1fe500a78da0959cab0b.camel@kernel.org>


On 12/16/23 01:06, Jeff Layton wrote:
> On Fri, 2023-12-15 at 08:20 +0800, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> Once this happens that means there have bugs.
>>
>> URL: https://tracker.ceph.com/issues/63586
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>   net/ceph/osd_client.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
>> index 5753036d1957..848ef19055a0 100644
>> --- a/net/ceph/osd_client.c
>> +++ b/net/ceph/osd_client.c
>> @@ -5912,10 +5912,12 @@ static int osd_sparse_read(struct ceph_connection *con,
>>   		fallthrough;
>>   	case CEPH_SPARSE_READ_DATA:
>>   		if (sr->sr_index >= count) {
>> -			if (sr->sr_datalen && count)
>> +			if (sr->sr_datalen) {
>>   				pr_warn_ratelimited("sr_datalen %u sr_index %d count %u\n",
>>   						    sr->sr_datalen, sr->sr_index,
>>   						    count);
>> +				return -EREMOTEIO;
>> +			}
>>   
>>   			sr->sr_state = CEPH_SPARSE_READ_HDR;
>>   			goto next_op;
> Do you really need to fail the read in this case? Would it not be better
> to just advance past the extra junk? Or is this problem more indicative
> of a malformed frame?
>
> It'd be nice to have some specific explanation of the problem this is
> fixing and how it was triggered. If this due to a misbehaving server?
> Bad hw?

Hi Jeff,

Why I added this check before was that when I was debugging the 
sparse-read issue last time I found even when the extent array 'count == 
0', sometimes the 'sr_datalen != 0'. I just thought the ceph side's 
logic allowed it.

But this time from going through and debugging more in the ceph side 
code, I didn't get where was doing this. So I just suspected it should 
be an issue somewhere in kclient side and it also possibly would trigger 
the sparse-read bug.  So I just removed the 'count' check and finally 
found the root case for both these two issues.

IMO normally this shouldn't happen anyway. Once this happens in kclient 
side it will 100% corrupt the following msgs and reset the connection 
from my tests. Actually when the 'count ==0' the 'sr_datalen' will be a 
random number, sometimes very large, so just advancing past the extra 
junk makes no any sense.

Thanks

- Xiubo



  reply	other threads:[~2023-12-18  1:23 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-15  0:20 [PATCH v3 0/3] libceph: fix sparse-read failure bug xiubli
2023-12-15  0:20 ` [PATCH v3 1/3] libceph: fail the sparse-read if there still has data in socket xiubli
2023-12-15 17:06   ` Jeff Layton
2023-12-18  1:23     ` Xiubo Li [this message]
2023-12-15  0:20 ` [PATCH v3 2/3] libceph: rename read_sparse_msg_XX to read_partial_sparse_msg_XX xiubli
2023-12-15  0:20 ` [PATCH v3 3/3] libceph: just wait for more data to be available on the socket xiubli
2024-01-15 22:38   ` Ilya Dryomov
     [not found]     ` <a1d6e998-f496-4408-9d76-3671ee73e054@redhat.com>
2024-01-16 10:00       ` Ilya Dryomov
     [not found]         ` <35849fda-29b2-47ad-bf49-f2715efc7b8c@redhat.com>
2024-01-16 12:06           ` Ilya Dryomov
     [not found]         ` <68e4cf5a-f64f-4545-87b0-762ab920d9ba@redhat.com>
2024-01-22 19:13           ` Ilya Dryomov
2024-01-22 21:26             ` Ilya Dryomov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2d23c041-c329-422c-bb6b-98b6dc4813b7@redhat.com \
    --to=xiubli@redhat.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=idryomov@gmail.com \
    --cc=jlayton@kernel.org \
    --cc=mchangir@redhat.com \
    --cc=vshankar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.