From: Xiubo Li <xiubli@redhat.com>
To: Ilya Dryomov <idryomov@gmail.com>
Cc: ceph-devel@vger.kernel.org, jlayton@kernel.org,
	vshankar@redhat.com, mchangir@redhat.com
Subject: Re: [PATCH v2 2/2] libceph: just wait for more data to be available on the socket
Date: Wed, 13 Dec 2023 19:03:09 +0800
Message-ID: <008fe687-9df0-45d2-929c-168a10222b2f@redhat.com>
In-Reply-To: <CAOi1vP9EzGZM=U1jDzAnTwFvWD6fpZ+qMedgOQuK79qOodU+NQ@mail.gmail.com>


On 12/13/23 18:31, Ilya Dryomov wrote:
> On Wed, Dec 13, 2023 at 2:02 AM Xiubo Li <xiubli@redhat.com> wrote:
>>
>> On 12/13/23 00:31, Ilya Dryomov wrote:
>>> On Fri, Dec 8, 2023 at 5:08 PM <xiubli@redhat.com> wrote:
>>>> From: Xiubo Li <xiubli@redhat.com>
>>>>
>>>> The messages from ceph may be split into multiple socket packets,
>>>> and we just need to wait for all the data to be available on the
>>>> socket.
>>>>
>>>> This will add a new _FINISH state for the sparse-read to mark that
>>>> the current sparse-read has succeeded. Otherwise, when the socket
>>>> receives the last piece of the osd request reply message, it would
>>>> be treated as a new sparse-read, and the later cancel_request()
>>>> will help reinitialize the sparse-read context.
>>>>
>>>> URL: https://tracker.ceph.com/issues/63586
>>>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>>>> ---
>>>>    include/linux/ceph/osd_client.h | 1 +
>>>>    net/ceph/osd_client.c           | 6 +++---
>>>>    2 files changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
>>>> index 493de3496cd3..00d98e13100f 100644
>>>> --- a/include/linux/ceph/osd_client.h
>>>> +++ b/include/linux/ceph/osd_client.h
>>>> @@ -47,6 +47,7 @@ enum ceph_sparse_read_state {
>>>>           CEPH_SPARSE_READ_DATA_LEN,
>>>>           CEPH_SPARSE_READ_DATA_PRE,
>>>>           CEPH_SPARSE_READ_DATA,
>>>> +       CEPH_SPARSE_READ_FINISH,
>>>>    };
>>>>
>>>>    /*
>>>> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
>>>> index 848ef19055a0..f1705b4f19eb 100644
>>>> --- a/net/ceph/osd_client.c
>>>> +++ b/net/ceph/osd_client.c
>>>> @@ -5802,8 +5802,6 @@ static int prep_next_sparse_read(struct ceph_connection *con,
>>>>                           advance_cursor(cursor, sr->sr_req_len - end, false);
>>>>           }
>>>>
>>>> -       ceph_init_sparse_read(sr);
>>>> -
>>>>           /* find next op in this request (if any) */
>>>>           while (++o->o_sparse_op_idx < req->r_num_ops) {
>>>>                   op = &req->r_ops[o->o_sparse_op_idx];
>>>> @@ -5919,7 +5917,7 @@ static int osd_sparse_read(struct ceph_connection *con,
>>>>                                   return -EREMOTEIO;
>>>>                           }
>>>>
>>>> -                       sr->sr_state = CEPH_SPARSE_READ_HDR;
>>>> +                       sr->sr_state = CEPH_SPARSE_READ_FINISH;
>>>>                           goto next_op;
>>> Hi Xiubo,
>>>
>>> This code appears to be set up to handle multiple (sparse-read) ops in
>>> a single OSD request.  Wouldn't this break that case?
>> Yeah, it will break it. I just missed it.
>>
>>> I think the bug is in read_sparse_msg_data().  It shouldn't be calling
>>> con->ops->sparse_read() after the message has progressed to the footer.
>>> osd_sparse_read() is most likely fine as is.
>> Yes, it is. IMO we cannot tell exactly whether it has progressed to the
>> footer, such as in the case 'con->v1.in_base_pos ==
> Hi Xiubo,
>
> I don't buy this.  If the messenger is trying to read the footer, it
> _has_ progressed to the footer.  This is definitive and irreversible,
> not a "maybe".
>
>> sizeof(con->v1.in_hdr)': the socket data may break off just after the
>> sparse-read finishes and before the footer is read, or somewhere in the
>> middle of the sparse-read. In the latter case we should continue the
>> sparse read.
> The size of the data section of the message is always known.  The
> contract is that read_partial_msg_data*() returns 1 and does nothing
> else after the data section is consumed.  This is how the messenger is
> told to move on to the footer.
>
> read_partial_sparse_msg_data() doesn't adhere to this contract and
> should be fixed.
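
For reference, a minimal sketch of such a check, assuming
read_partial_sparse_msg_data() can look at cursor->total_resid the same
way read_partial_msg_data() does (just an illustration of the contract,
not the actual patch):

static int read_partial_sparse_msg_data(struct ceph_connection *con)
{
        struct ceph_msg_data_cursor *cursor = &con->in_msg->cursor;

        /* Data section already fully consumed: report success so the
         * messenger moves on to the footer, instead of invoking
         * con->ops->sparse_read() again. */
        if (!cursor->total_resid)
                return 1;

        /* ... existing sparse-read handling ... */
}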
>
>>
>>> Notice how read_partial_msg_data() and read_partial_msg_data_bounce()
>>> behave: if called at that point (i.e. after the data section has been
>>> processed, meaning that cursor->total_resid == 0), they do nothing.
>>> read_sparse_msg_data() should have a similar condition and basically
>>> no-op itself.
>> Correct, this is what I want to do in the sparse-read code.
> No, this needs to be done on the messenger side.  sparse-read code
> should not be invoked after the messenger has moved on to the footer.

From my reading, even after the messenger has moved on to the 'footer', it
will always try to parse the 'header' again:

read_partial(con, end, size, &con->v1.in_hdr);

But that call will do nothing and just return.
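
IOW read_partial() is a no-op once 'con->v1.in_base_pos' has already
advanced past 'end'. Roughly (paraphrasing net/ceph/messenger_v1.c from
memory, so details may differ):

static int read_partial(struct ceph_connection *con,
                        int end, int size, void *object)
{
        /* If 'in_base_pos' is already at or past 'end', this section
         * has been read before, so the loop body is skipped and we
         * just report success again. */
        while (con->v1.in_base_pos < end) {
                int left = end - con->v1.in_base_pos;
                int have = size - left;
                int ret = ceph_tcp_recvmsg(con->sock, object + have, left);

                if (ret <= 0)
                        return ret;
                con->v1.in_base_pos += ret;
        }
        return 1;
}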

And the same goes for 'front', 'middle' and '(page) data', and finally
for the 'footer'.

Did I miss something?

Thanks

- Xiubo

> Thanks,
>
>                  Ilya
>

