All of lore.kernel.org
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [PATCH v2 03/20] fuse: Implement standard FUSE operations
Date: Thu, 15 Oct 2020 18:04:37 +0200	[thread overview]
Message-ID: <1768b35f-8615-1f0f-b008-6628882c6eaa@redhat.com> (raw)
In-Reply-To: <20201015155800.GL4610@merkur.fritz.box>


[-- Attachment #1.1: Type: text/plain, Size: 5046 bytes --]

On 15.10.20 17:58, Kevin Wolf wrote:
> Am 15.10.2020 um 17:18 hat Max Reitz geschrieben:
>> On 15.10.20 11:46, Kevin Wolf wrote:
>>> Am 22.09.2020 um 12:49 hat Max Reitz geschrieben:
>>>> This makes the export actually useful instead of only producing errors
>>>> whenever it is accessed.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
>>>> +/**
>>>> + * Handle client reads from the exported image.
>>>> + */
>>>> +static void fuse_read(fuse_req_t req, fuse_ino_t inode,
>>>> +                      size_t size, off_t offset, struct fuse_file_info *fi)
>>>> +{
>>>> +    FuseExport *exp = fuse_req_userdata(req);
>>>> +    int64_t length;
>>>> +    void *buf;
>>>> +    int ret;
>>>> +
>>>> +    /**
>>>> +     * Clients will expect short reads at EOF, so we have to limit
>>>> +     * offset+size to the image length.
>>>> +     */
>>>> +    length = blk_getlength(exp->common.blk);
>>>> +    if (length < 0) {
>>>> +        fuse_reply_err(req, -length);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    size = MIN(size, FUSE_MAX_BOUNCE_BYTES);
>>>
>>> "Read should send exactly the number of bytes requested except on EOF or
>>> error, otherwise the rest of the data will be substituted with zeroes."
>>
>> :(
>>
>>> Do we corrupt huge reads with this, so that read() succeeds, but the
>>> buffer is zeroed above 64M instead of containing the correct data? Maybe
>>> we should return an error instead?
>>
>> Hm.  It looks like there is a max_read option obeyed by the kernel
>> driver, and it appears it’s set by implementing .init() and setting
>> fuse_conn_info.max_read (also, “for the time being” it has to also set
>> in the mount options passed to fuse_session_new()).
>>
>> I’m not sure whether that does anything, though.  It appears that
>> whenever I do a cached read, it caps out at 128k (which is what
>> cuse_prep_data() in libfuse sets; though increasing that number there
>> doesn’t change anything, so I think that’s just a coincidence), and with
>> O_DIRECT, it caps out at 1M.
>>
>> But at least that would be grounds to return an error for >64M requests.
>>  (Limiting max_read to 64k does limit all read requests to 64k.)
> 
> Yes, setting max_read and making larger requests an error seems like a
> good solution.
> 
>> Further investigation is needed.
> 
> If you want :-)

Not really, but 128k per request is a bit sad.

[...]

>>>> +/**
>>>> + * Let clients flush the exported image.
>>>> + */
>>>> +static void fuse_flush(fuse_req_t req, fuse_ino_t inode,
>>>> +                       struct fuse_file_info *fi)
>>>> +{
>>>> +    FuseExport *exp = fuse_req_userdata(req);
>>>> +    int ret;
>>>> +
>>>> +    ret = blk_flush(exp->common.blk);
>>>> +    fuse_reply_err(req, ret < 0 ? -ret : 0);
>>>> +}
>>>
>>> This seems to be an implementation for .fsync rather than for .flush.
>>
>> Wouldn’t fsync entail a drain?
> 
> Hm, the libfuse documentation doesn't say anything about draining. I
> suppose this is because if requests need to be drained, it will do so in
> the kernel. But I haven't checked the code.

Hmm, well, yeah...  A sync doesn’t necessarily mean settling all
in-flight requests.

> So I expect that calling fsync() on the lower layer does everything that
> is needed. Which is bdrv_flush() in QEMU.
> 
>> Or is it the opposite, flush should just drain and not invoke
>> blk_flush()?  (Sorry, this all gets me confused every time.)
> 
> I'm just relying on the libfuse documentation there for flush:
> 
> "This is called on each close() of the opened file."

Ah, OK.

> and
> 
> "NOTE: the name of the method is misleading, since (unlike fsync) the
> filesystem is not forced to flush pending writes. One reason to flush
> data is if the filesystem wants to return write errors during close.
> However, such use is non-portable because POSIX does not require close
> to wait for delayed I/O to complete."
> 
>> (Though I do think .fsync should both flush and drain.)
>>
>>> Hmm, or maybe actually for both? We usually do bdrv_flush() during
>>> close, so it would be consistent to do the same here. It's our last
>>> chance to report an error to the user before the file is closed.
>>
>> I don’t understand what you mean.  What is “the same”?  Closing the
>> image?  Or indeed having .flush() be implemented with blk_flush()?
> 
> Implementing .flush(), which will be called when the image is closed,
> with blk_flush().
> 
> And still doing blk_flush() for .fsync, of course.

OK.

>> Do you mean that other parties may do the same as qemu does, i.e.
>> flush files before they are closed, which is why we should anticipate
>> the same and give users a chance to see errors?
> 
> I'm not exactly sure what failure of .flush() would look like for users.
> Probably close() failing, so they don't have to do anything special,
> just check their return values.

Checking return values on close()?  Sounds special to me. O:)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2020-10-15 16:10 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-22 10:49 [PATCH v2 00/20] block/export: Allow exporting BDSs via FUSE Max Reitz
2020-09-22 10:49 ` [PATCH v2 01/20] configure: Detect libfuse Max Reitz
2020-09-22 11:14   ` Thomas Huth
2020-09-22 11:21     ` Paolo Bonzini
2020-09-22 11:46     ` Max Reitz
2020-09-22 15:37     ` Max Reitz
2020-09-22 15:45       ` Paolo Bonzini
2020-09-22 10:49 ` [PATCH v2 02/20] fuse: Allow exporting BDSs via FUSE Max Reitz
2020-10-15  8:57   ` Kevin Wolf
2020-10-15 14:46     ` Max Reitz
2020-10-15 15:41       ` Kevin Wolf
2020-10-15 15:59         ` Max Reitz
2020-10-15 17:01           ` Kevin Wolf
2020-09-22 10:49 ` [PATCH v2 03/20] fuse: Implement standard FUSE operations Max Reitz
2020-10-15  9:46   ` Kevin Wolf
2020-10-15 15:18     ` Max Reitz
2020-10-15 15:58       ` Kevin Wolf
2020-10-15 16:04         ` Max Reitz [this message]
2020-09-22 10:49 ` [PATCH v2 04/20] fuse: Allow growable exports Max Reitz
2020-10-15 10:41   ` Kevin Wolf
2020-10-15 15:20     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 05/20] fuse: (Partially) implement fallocate() Max Reitz
2020-09-22 10:49 ` [PATCH v2 06/20] fuse: Implement hole detection through lseek Max Reitz
2020-09-22 10:49 ` [PATCH v2 07/20] iotests: Do not needlessly filter _make_test_img Max Reitz
2020-09-22 10:49 ` [PATCH v2 08/20] iotests: Do not pipe _make_test_img Max Reitz
2020-09-22 10:49 ` [PATCH v2 09/20] iotests: Use convert -n in some cases Max Reitz
2020-09-22 10:49 ` [PATCH v2 10/20] iotests/046: Avoid renaming images Max Reitz
2020-09-22 10:49 ` [PATCH v2 11/20] iotests: Derive image names from $TEST_IMG Max Reitz
2020-09-22 10:49 ` [PATCH v2 12/20] iotests/091: Use _cleanup_qemu instad of "wait" Max Reitz
2020-09-22 10:49 ` [PATCH v2 13/20] iotests: Restrict some Python tests to file Max Reitz
2020-09-22 10:49 ` [PATCH v2 14/20] iotests: Let _make_test_img guess $TEST_IMG_FILE Max Reitz
2020-09-22 10:49 ` [PATCH v2 15/20] iotests/287: Clean up subshell test image Max Reitz
2020-09-22 10:49 ` [PATCH v2 16/20] storage-daemon: Call bdrv_close_all() on exit Max Reitz
2020-09-22 10:49 ` [PATCH v2 17/20] iotests: Give access to the qemu-storage-daemon Max Reitz
2020-10-15 11:27   ` Kevin Wolf
2020-10-15 15:22     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 18/20] iotests: Allow testing FUSE exports Max Reitz
2020-10-15 11:43   ` Kevin Wolf
2020-10-15 15:27     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 19/20] iotests: Enable fuse for many tests Max Reitz
2020-09-22 10:49 ` [PATCH v2 20/20] iotests/308: Add test for FUSE exports Max Reitz
2020-09-22 15:58 ` [PATCH v2 00/20] block/export: Allow exporting BDSs via FUSE Daniel P. Berrangé
2020-09-23  7:21   ` Max Reitz
2020-09-23  9:08   ` Stefan Hajnoczi
2020-10-15 12:01 ` Kevin Wolf
2020-10-15 16:47   ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1768b35f-8615-1f0f-b008-6628882c6eaa@redhat.com \
    --to=mreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.