All of lore.kernel.org
* [Virtio-fs] Few queries about virtiofsd read implementation
@ 2021-05-07 18:59 Vivek Goyal
  2021-05-07 21:14 ` Edward McClanahan
  2021-05-10 17:23 ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 6+ messages in thread
From: Vivek Goyal @ 2021-05-07 18:59 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Stefan Hajnoczi; +Cc: virtio-fs-list

Hi David/Stefan,

I am browsing through the read request (FUSE_READ) code in virtiofsd
(and in virtiofs) and I have a few questions. You folks probably know
the answers.

1. virtio_send_data_iov() reads the data from the file into the scatter
  list. Some of the code looks strange.

  We seem to retry the read if we read fewer bytes than the client
  asked for. I am wondering whether this should really be our
  responsibility, or whether the client should deal with it. I am
  assuming that the client should be ready to deal with a short read.

  So what was the thought process behind retrying?

          if (ret < len && ret) {
            fuse_log(FUSE_LOG_DEBUG, "%s: ret < len\n", __func__);
            /* Skip over this much next time around */
            skip_size = ret;
            buf->buf[0].pos += ret;
            len -= ret;

            /* Lets do another read */
            continue;
        }

- After this we have code where, if the number of bytes read is not
  what we expect, we return EIO.

          if (ret != len) {
            fuse_log(FUSE_LOG_DEBUG, "%s: ret!=len\n", __func__);
            ret = EIO;
            free(in_sg_cpy);
            goto err;
        }

  When do we hit this? IIUC, preadv() will return:

  A. The number of bytes we expected (no issues).
  B. 0 in case of EOF (we break out of the loop and just return to the
     client the number of bytes we have read so far).
  C. < 0 (this is the error case and we return an error to the client).
  D. X bytes, which is less than len.

To handle D we have the retry code. So when do we hit the above if
condition where ret != len? Is this dead code, or did I miss something?

2. When the client sends FUSE_READ, we put pointers to pages into the
   sglist. IIUC, we put a pointer to the "struct page *" and not the
   actual page. So who converts these struct page pointers into the
   memory belonging to the pages?

   sg_init_fuse_pages() {
   	sg_set_page(&sg[i], pages[i], this_len, page_descs[i].offset);
   }

3. Who converts a guest memory address (and when) into a qemu process
   address which is accessible by virtiofsd?

Thanks
Vivek


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Virtio-fs] Few queries about virtiofsd read implementation
  2021-05-07 18:59 [Virtio-fs] Few queries about virtiofsd read implementation Vivek Goyal
@ 2021-05-07 21:14 ` Edward McClanahan
  2021-05-10 13:10   ` Vivek Goyal
  2021-05-10 17:23 ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 6+ messages in thread
From: Edward McClanahan @ 2021-05-07 21:14 UTC (permalink / raw)
  To: Vivek Goyal, Dr. David Alan Gilbert, Stefan Hajnoczi; +Cc: virtio-fs-list


I concur... Client usermode code is required to deal with ret <
requested length; in this case, the requested length is the sum of the
lengths of the iov segments.

There are two cases where this can occur:
1) offset + len > file size, and
2) the read was interrupted by a signal (EINTR) before all requested
   data became available.

For file systems, we don't deal with #2; we just deal with #1.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Virtio-fs] Few queries about virtiofsd read implementation
  2021-05-07 21:14 ` Edward McClanahan
@ 2021-05-10 13:10   ` Vivek Goyal
  0 siblings, 0 replies; 6+ messages in thread
From: Vivek Goyal @ 2021-05-10 13:10 UTC (permalink / raw)
  To: Edward McClanahan; +Cc: virtio-fs-list

On Fri, May 07, 2021 at 02:14:54PM -0700, Edward McClanahan wrote:
> I concur... Client usermode code is required to deal with ret < requested length... In this case, the requested length being the sum of the lengths of each iov segment.

Apart from client user mode code, the guest kernel also uses this path
to read pages into the page cache. If the number of bytes read is less
than the number of bytes requested, it assumes EOF and truncates the
file (fuse_short_read()).

So that probably means we should handle EINTR in the server and retry;
otherwise there should be no need to retry in the file server.

I am not sure if we need to handle EINTR explicitly, or if it is
already taken care of by the library/kernel restarting the system call.

Thanks
Vivek



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Virtio-fs] Few queries about virtiofsd read implementation
  2021-05-07 18:59 [Virtio-fs] Few queries about virtiofsd read implementation Vivek Goyal
  2021-05-07 21:14 ` Edward McClanahan
@ 2021-05-10 17:23 ` Dr. David Alan Gilbert
  2021-05-10 21:40   ` Vivek Goyal
  1 sibling, 1 reply; 6+ messages in thread
From: Dr. David Alan Gilbert @ 2021-05-10 17:23 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: virtio-fs-list

* Vivek Goyal (vgoyal@redhat.com) wrote:
> [...]
> 
> - After this we have code where, if the number of bytes read is not
>   what we expect, we return EIO.
> 
>           if (ret != len) {
>             fuse_log(FUSE_LOG_DEBUG, "%s: ret!=len\n", __func__);
>             ret = EIO;
>             free(in_sg_cpy);
>             goto err;
>         }
> 
>   When do we hit this? IIUC, preadv() will return:
> 
>   A. The number of bytes we expected (no issues).
>   B. 0 in case of EOF (we break out of the loop and just return to the
>      client the number of bytes we have read so far).
>   C. < 0 (this is the error case and we return an error to the client).
>   D. X bytes, which is less than len.
> 
> To handle D we have the retry code. So when do we hit the above if
> condition where ret != len? Is this dead code, or did I miss something?

I think you're right, that's dead.
And you're probably also right that we could just take it easy and
return less data to the client if preadv() gives us just part of it.

> 2. When the client sends FUSE_READ, we put pointers to pages into the
>    sglist. IIUC, we put a pointer to the "struct page *" and not the
>    actual page. So who converts these struct page pointers into the
>    memory belonging to the pages?
> 
>    sg_init_fuse_pages() {
>    	sg_set_page(&sg[i], pages[i], this_len, page_descs[i].offset);
>    }

Hmm, I don't know.

> 3. Who converts a guest memory address (and when) into a qemu process
>    address which is accessible by virtiofsd?

I think that's libvhost-user doing that for us; in vu_queue_pop it calls
vu_queue_map_desc.

Dave

> Thanks
> Vivek
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Virtio-fs] Few queries about virtiofsd read implementation
  2021-05-10 17:23 ` Dr. David Alan Gilbert
@ 2021-05-10 21:40   ` Vivek Goyal
  2021-05-10 21:45     ` Vivek Goyal
  0 siblings, 1 reply; 6+ messages in thread
From: Vivek Goyal @ 2021-05-10 21:40 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: virtio-fs-list

On Mon, May 10, 2021 at 06:23:48PM +0100, Dr. David Alan Gilbert wrote:
> * Vivek Goyal (vgoyal@redhat.com) wrote:
> > [...]
> 
> I think you're right, that's dead.
> And you're probably also right that we could just take it easy and
> return less data to the client if preadv() gives us just part of it.

Also, it looks like we never return an error to the client.
virtio_send_data_iov() only sends a reply back if preadv() reads the
requested bytes or reads less due to EOF. If preadv() returns an error,
then we return to the caller with the error, and I don't see anybody
propagating that error back to the client.

lo_read()
  fuse_reply_data()
    fuse_send_data_iov()
      fuse_send_data_iov_fallback()
        virtio_send_data_iov()

fuse_reply_data() returns an error only if ret > 0, and that's not the
case here. In fact that seems to be another piece of dead code for
virtiofs.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Virtio-fs] Few queries about virtiofsd read implementation
  2021-05-10 21:40   ` Vivek Goyal
@ 2021-05-10 21:45     ` Vivek Goyal
  0 siblings, 0 replies; 6+ messages in thread
From: Vivek Goyal @ 2021-05-10 21:45 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: virtio-fs-list

On Mon, May 10, 2021 at 05:40:05PM -0400, Vivek Goyal wrote:
> On Mon, May 10, 2021 at 06:23:48PM +0100, Dr. David Alan Gilbert wrote:
> > * Vivek Goyal (vgoyal@redhat.com) wrote:
> > > [...]
> > 
> > I think you're right, that's dead.
> > And you're probably also right that we could just take it easy and
> > return less data to the client if preadv() gives us just part of it.
> 
> Also, it looks like we never return an error to the client.
> virtio_send_data_iov() only sends a reply back if preadv() reads the
> requested bytes or reads less due to EOF. If preadv() returns an error,
> then we return to the caller with the error, and I don't see anybody
> propagating that error back to the client.
> 
> lo_read()
>   fuse_reply_data()
>     fuse_send_data_iov()
>       fuse_send_data_iov_fallback()
>         virtio_send_data_iov()
> 
> fuse_reply_data() returns an error only if ret > 0, and that's not the
> case here. In fact that seems to be another piece of dead code for
> virtiofs.

Actually, I might have misread the code. In case of an error,
virtio_send_data_iov() returns the errno, which probably is positive,
and fuse_reply_data() then returns the error back to the client.

    res = fuse_send_data_iov(req->se, req->ch, iov, 1, bufv);
    if (res <= 0) {
        fuse_free_req(req);
        return res;
    } else {
        return fuse_reply_err(req, res);
    }

This is confusing...

Vivek


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-05-10 21:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-07 18:59 [Virtio-fs] Few queries about virtiofsd read implementation Vivek Goyal
2021-05-07 21:14 ` Edward McClanahan
2021-05-10 13:10   ` Vivek Goyal
2021-05-10 17:23 ` Dr. David Alan Gilbert
2021-05-10 21:40   ` Vivek Goyal
2021-05-10 21:45     ` Vivek Goyal
