All of lore.kernel.org
 help / color / mirror / Atom feed
* cifs fallocate doesn't flush writes?
@ 2022-01-13 13:18 David Howells
  2022-01-13 15:20 ` Incorrect fallocate behaviour in cifs or samba? David Howells
  0 siblings, 1 reply; 5+ messages in thread
From: David Howells @ 2022-01-13 13:18 UTC (permalink / raw)
  To: smfrench; +Cc: dhowells, Shyam Prasad N, jlayton, linux-cifs

Hi Steve,

Should cifs fallocate be flushing writes before doing the falloc?

If I do the following:

	mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0
	/usr/sbin/xfs_io -f -t \
		-c "pwrite -S 0x41 0 4096"
		-c "pwrite -S 0x42 4096 4096"
		-c "fzero 0 4096" \
		-c "pread 0 8192" \
		/xfstest.test/008.7067

The packet trace shows:

   25 0.289099336  192.168.6.2 -> 192.168.6.1  SMB2 414 Create Request File: ;GetInfo Request FS_INFO/FileFsFullSizeInformation;Close Request
   26 0.292164924  192.168.6.1 -> 192.168.6.2  SMB2 510 Create Response File: ;GetInfo Response;Close Response
   27 0.292444124  192.168.6.2 -> 192.168.6.1  SMB2 174 GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO File: 008.7067
   28 0.292716736  192.168.6.1 -> 192.168.6.2  SMB2 260 GetInfo Response
   29 0.293018017  192.168.6.2 -> 192.168.6.1  SMB2 414 Create Request File: ;GetInfo Request FS_INFO/FileFsFullSizeInformation;Close Request
   30 0.295730538  192.168.6.1 -> 192.168.6.2  SMB2 510 Create Response File: ;GetInfo Response;Close Response
   31 0.321638749  192.168.6.2 -> 192.168.6.1  SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067
   32 0.321862497  192.168.6.1 -> 192.168.6.2  SMB2 182 Ioctl Response FSCTL_SET_ZERO_DATA File: 008.7067
   33 0.334313914  192.168.6.2 -> 192.168.6.1  SMB2 183 Read Request Len:4096 Off:0 File: 008.7067
   34 0.334772295  192.168.6.1 -> 192.168.6.2  SMB2 143 Read Response, Error: STATUS_END_OF_FILE
   35 0.357622873  192.168.6.2 -> 192.168.6.1  SMB2 183 Read Request Len:4096 Off:0 File: 008.7067
   36 0.358040997  192.168.6.1 -> 192.168.6.2  SMB2 143 Read Response, Error: STATUS_END_OF_FILE
   38 0.373614076  192.168.6.2 -> 192.168.6.1  SMB2 1382 Write Request Len:4096 Off:4096 File: 008.7067
   40 0.374142468  192.168.6.1 -> 192.168.6.2  SMB2 150 Write Response
   41 0.374485805  192.168.6.2 -> 192.168.6.1  SMB2 158 Close Request File: 008.7067
   42 0.375222020  192.168.6.1 -> 192.168.6.2  SMB2 194 Close Response

The first page read (which overlaps with the writes and fzero) returns an EOF
error.  Shouldn't the FSCTL_SET_ZERO_DATA have extended the file to 4096?
Also, I wonder what happened to the pagecache: the second read shouldn't have
happened because fallocate() shouldn't have touched that page.

Also, the writeback happens last - I wonder if fallocate should flush.  Not
that it should be necessary if it's just modifying pages already in the
pagecache.

David


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Incorrect fallocate behaviour in cifs or samba?
  2022-01-13 13:18 cifs fallocate doesn't flush writes? David Howells
@ 2022-01-13 15:20 ` David Howells
  2022-01-13 17:43   ` Jeremy Allison
  0 siblings, 1 reply; 5+ messages in thread
From: David Howells @ 2022-01-13 15:20 UTC (permalink / raw)
  To: smfrench; +Cc: dhowells, Shyam Prasad N, jlayton, linux-cifs

David Howells <dhowells@redhat.com> wrote:

> If I do the following:
> 
> 	mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0
> 	/usr/sbin/xfs_io -f -t \
> 		-c "pwrite -S 0x41 0 4096"
> 		-c "pwrite -S 0x42 4096 4096"
> 		-c "fzero 0 4096" \
> 		-c "pread 0 8192" \
> 		/xfstest.test/008.7067
> ...
>    31 0.321638749  192.168.6.2 -> 192.168.6.1  SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067

So what I see is that Samba does:

	fallocate(24, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 4096) = 0

for this... but that's not what cifs was asked to do.  Should Samba be using
FALLOC_FL_ZERO_RANGE instead?

Also cifs wasn't given FALLOC_FL_KEEP_SIZE, so the caller expected the file to
be extended - smb3_zero_range(), however, doesn't seem necessarily to get this
right.  It seems to allow for KEEP_SIZE not being set by calling
SMB2_set_eof() - but only if the range ends beyond the *local* i_size (which
the server doesn't know about yet).

However, we did two pwrites first, which moved the local i_size over.  This
means the server's EOF isn't extended - probably correctly, since we haven't
saved the data, but the subsequent reads then fail.

In this case, I wonder if the right thing to do is one of the following
options:

 (1) Flush the pagecache before doing proceeding with the fallocate.

 (2) Modify the local pagecache and don't talk to the server at all unless
     there are gaps in the pagecache.

 (3) Keep separate track of where we think the server's EOF is and skip any
     read that's beyond that.

 (4) Treat a read ending in EOF as a read of blank data if it's below the
     local EOF.  I guess that's what we do now.

David


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Incorrect fallocate behaviour in cifs or samba?
  2022-01-13 15:20 ` Incorrect fallocate behaviour in cifs or samba? David Howells
@ 2022-01-13 17:43   ` Jeremy Allison
  2022-01-13 18:16     ` David Disseldorp
  0 siblings, 1 reply; 5+ messages in thread
From: Jeremy Allison @ 2022-01-13 17:43 UTC (permalink / raw)
  To: David Howells; +Cc: smfrench, Shyam Prasad N, jlayton, linux-cifs

On Thu, Jan 13, 2022 at 03:20:32PM +0000, David Howells wrote:
>David Howells <dhowells@redhat.com> wrote:
>
>> If I do the following:
>>
>> 	mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0
>> 	/usr/sbin/xfs_io -f -t \
>> 		-c "pwrite -S 0x41 0 4096"
>> 		-c "pwrite -S 0x42 4096 4096"
>> 		-c "fzero 0 4096" \
>> 		-c "pread 0 8192" \
>> 		/xfstest.test/008.7067
>> ...
>>    31 0.321638749  192.168.6.2 -> 192.168.6.1  SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067
>
>So what I see is that Samba does:
>
>	fallocate(24, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 4096) = 0
>
>for this... but that's not what cifs was asked to do.  Should Samba be using
>FALLOC_FL_ZERO_RANGE instead?

This is from fsctl_zero_data() in Samba. We have:

         /*
          * MS-FSCC <58> Section 2.3.67
          * This FSCTL sets the range of bytes to zero (0) without extending the
          * file size.
          *
          * The VFS_FALLOCATE_FL_KEEP_SIZE flag is used to satisfy this
          * constraint.
          */

         mode = VFS_FALLOCATE_FL_PUNCH_HOLE | VFS_FALLOCATE_FL_KEEP_SIZE;
         ret = SMB_VFS_FALLOCATE(fsp, mode, zdata_info.file_off, len);
         if (ret == -1)  {
                 status = map_nt_error_from_unix_common(errno);
                 DEBUG(2, ("zero-data fallocate(0x%x) failed: %s\n", mode,
                       strerror(errno)));
                 return status;
         }

         if (!fsp->fsp_flags.is_sparse && lp_strict_allocate(SNUM(fsp->conn))) {
                 /*
                  * File marked non-sparse and "strict allocate" is enabled -
                  * allocate the range that we just punched out.
                  * In future FALLOC_FL_ZERO_RANGE could be used exclusively for
                  * this, but it's currently only supported on XFS and ext4.
                  *
                  * The newly allocated range still won't be found by SEEK_DATA
                  * for QAR, but stat.st_blocks will reflect it.
                  */
                 ret = SMB_VFS_FALLOCATE(fsp, VFS_FALLOCATE_FL_KEEP_SIZE,
                                         zdata_info.file_off, len);

Note the "currently only supported on XFS and ext4" problem
with FALLOC_FL_ZERO_RANGE.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Incorrect fallocate behaviour in cifs or samba?
  2022-01-13 17:43   ` Jeremy Allison
@ 2022-01-13 18:16     ` David Disseldorp
  2022-01-13 18:22       ` Jeremy Allison
  0 siblings, 1 reply; 5+ messages in thread
From: David Disseldorp @ 2022-01-13 18:16 UTC (permalink / raw)
  To: Jeremy Allison, David Howells
  Cc: smfrench, Shyam Prasad N, jlayton, linux-cifs

Hi Jeremy and David,

On Thu, 13 Jan 2022 09:43:32 -0800, Jeremy Allison wrote:

> On Thu, Jan 13, 2022 at 03:20:32PM +0000, David Howells wrote:
> >David Howells <dhowells@redhat.com> wrote:
> >  
> >> If I do the following:
> >>
> >> 	mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0
> >> 	/usr/sbin/xfs_io -f -t \
> >> 		-c "pwrite -S 0x41 0 4096"
> >> 		-c "pwrite -S 0x42 4096 4096"
> >> 		-c "fzero 0 4096" \
> >> 		-c "pread 0 8192" \
> >> 		/xfstest.test/008.7067
> >> ...
> >>    31 0.321638749  192.168.6.2 -> 192.168.6.1  SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067  
> >
> >So what I see is that Samba does:
> >
> >	fallocate(24, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 4096) = 0
> >
> >for this... but that's not what cifs was asked to do.  Should Samba be using
> >FALLOC_FL_ZERO_RANGE instead?  
> 
> This is from fsctl_zero_data() in Samba. We have:
> 
>          /*
>           * MS-FSCC <58> Section 2.3.67
>           * This FSCTL sets the range of bytes to zero (0) without extending the
>           * file size.
>           *
>           * The VFS_FALLOCATE_FL_KEEP_SIZE flag is used to satisfy this
>           * constraint.
>           */
> 
>          mode = VFS_FALLOCATE_FL_PUNCH_HOLE | VFS_FALLOCATE_FL_KEEP_SIZE;
>          ret = SMB_VFS_FALLOCATE(fsp, mode, zdata_info.file_off, len);
>          if (ret == -1)  {
>                  status = map_nt_error_from_unix_common(errno);
>                  DEBUG(2, ("zero-data fallocate(0x%x) failed: %s\n", mode,
>                        strerror(errno)));
>                  return status;
>          }
> 
>          if (!fsp->fsp_flags.is_sparse && lp_strict_allocate(SNUM(fsp->conn))) {
>                  /*
>                   * File marked non-sparse and "strict allocate" is enabled -
>                   * allocate the range that we just punched out.
>                   * In future FALLOC_FL_ZERO_RANGE could be used exclusively for
>                   * this, but it's currently only supported on XFS and ext4.
>                   *
>                   * The newly allocated range still won't be found by SEEK_DATA
>                   * for QAR, but stat.st_blocks will reflect it.
>                   */
>                  ret = SMB_VFS_FALLOCATE(fsp, VFS_FALLOCATE_FL_KEEP_SIZE,
>                                          zdata_info.file_off, len);
> 
> Note the "currently only supported on XFS and ext4" problem
> with FALLOC_FL_ZERO_RANGE.

FWIW, Samba's fsctl_zero_data semantics are based on observed Windows
server behaviour and the MS specs, which state(d at the time):
/*
 * 2.3.57 FSCTL_SET_ZERO_DATA Request
 *
 * How an implementation zeros data within a file is implementation-dependent.
 * A file system MAY choose to deallocate regions of disk space that have been
 * zeroed.<50>
 * <50>
 * ... NTFS might deallocate disk space in the file if the file is stored on an
 * NTFS volume, and the file is sparse or compressed. It will free any allocated
 * space in chunks of 64 kilobytes that begin at an offset that is a multiple of
 * 64 kilobytes. Other bytes in the file (prior to the first freed 64-kilobyte
 * chunk and after the last freed 64-kilobyte chunk) will be zeroed but not
 * deallocated.
 */

IIRC while implementing this I observed Windows deallocation behaviour
using FSCTL_QUERY_ALLOCATED_RANGES (referred to as QAR in the previous
code snippit).

Cheers, David

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Incorrect fallocate behaviour in cifs or samba?
  2022-01-13 18:16     ` David Disseldorp
@ 2022-01-13 18:22       ` Jeremy Allison
  0 siblings, 0 replies; 5+ messages in thread
From: Jeremy Allison @ 2022-01-13 18:22 UTC (permalink / raw)
  To: David Disseldorp
  Cc: David Howells, smfrench, Shyam Prasad N, jlayton, linux-cifs

On Thu, Jan 13, 2022 at 07:16:07PM +0100, David Disseldorp wrote:
>FWIW, Samba's fsctl_zero_data semantics are based on observed Windows
>server behaviour and the MS specs, which state(d at the time):
>/*
> * 2.3.57 FSCTL_SET_ZERO_DATA Request
> *
> * How an implementation zeros data within a file is implementation-dependent.
> * A file system MAY choose to deallocate regions of disk space that have been
> * zeroed.<50>
> * <50>
> * ... NTFS might deallocate disk space in the file if the file is stored on an
> * NTFS volume, and the file is sparse or compressed. It will free any allocated
> * space in chunks of 64 kilobytes that begin at an offset that is a multiple of
> * 64 kilobytes. Other bytes in the file (prior to the first freed 64-kilobyte
> * chunk and after the last freed 64-kilobyte chunk) will be zeroed but not
> * deallocated.
> */
>
>IIRC while implementing this I observed Windows deallocation behaviour
>using FSCTL_QUERY_ALLOCATED_RANGES (referred to as QAR in the previous
>code snippit).

Oh thanks so much for clarifying David. It's always nice
to hear things from the person who actually wrote the
code :-).

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-01-13 18:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-13 13:18 cifs fallocate doesn't flush writes? David Howells
2022-01-13 15:20 ` Incorrect fallocate behaviour in cifs or samba? David Howells
2022-01-13 17:43   ` Jeremy Allison
2022-01-13 18:16     ` David Disseldorp
2022-01-13 18:22       ` Jeremy Allison

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.