* cifs fallocate doesn't flush writes? @ 2022-01-13 13:18 David Howells 2022-01-13 15:20 ` Incorrect fallocate behaviour in cifs or samba? David Howells 0 siblings, 1 reply; 5+ messages in thread From: David Howells @ 2022-01-13 13:18 UTC (permalink / raw) To: smfrench; +Cc: dhowells, Shyam Prasad N, jlayton, linux-cifs Hi Steve, Should cifs fallocate be flushing writes before doing the falloc? If I do the following: mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0 /usr/sbin/xfs_io -f -t \ -c "pwrite -S 0x41 0 4096" -c "pwrite -S 0x42 4096 4096" -c "fzero 0 4096" \ -c "pread 0 8192" \ /xfstest.test/008.7067 The packet trace shows: 25 0.289099336 192.168.6.2 -> 192.168.6.1 SMB2 414 Create Request File: ;GetInfo Request FS_INFO/FileFsFullSizeInformation;Close Request 26 0.292164924 192.168.6.1 -> 192.168.6.2 SMB2 510 Create Response File: ;GetInfo Response;Close Response 27 0.292444124 192.168.6.2 -> 192.168.6.1 SMB2 174 GetInfo Request FILE_INFO/SMB2_FILE_ALL_INFO File: 008.7067 28 0.292716736 192.168.6.1 -> 192.168.6.2 SMB2 260 GetInfo Response 29 0.293018017 192.168.6.2 -> 192.168.6.1 SMB2 414 Create Request File: ;GetInfo Request FS_INFO/FileFsFullSizeInformation;Close Request 30 0.295730538 192.168.6.1 -> 192.168.6.2 SMB2 510 Create Response File: ;GetInfo Response;Close Response 31 0.321638749 192.168.6.2 -> 192.168.6.1 SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067 32 0.321862497 192.168.6.1 -> 192.168.6.2 SMB2 182 Ioctl Response FSCTL_SET_ZERO_DATA File: 008.7067 33 0.334313914 192.168.6.2 -> 192.168.6.1 SMB2 183 Read Request Len:4096 Off:0 File: 008.7067 34 0.334772295 192.168.6.1 -> 192.168.6.2 SMB2 143 Read Response, Error: STATUS_END_OF_FILE 35 0.357622873 192.168.6.2 -> 192.168.6.1 SMB2 183 Read Request Len:4096 Off:0 File: 008.7067 36 0.358040997 192.168.6.1 -> 192.168.6.2 SMB2 143 Read Response, Error: STATUS_END_OF_FILE 38 0.373614076 192.168.6.2 -> 192.168.6.1 SMB2 1382 Write Request Len:4096 Off:4096 File: 008.7067 40 0.374142468 192.168.6.1 -> 192.168.6.2 SMB2 150 Write Response 41 0.374485805 192.168.6.2 -> 192.168.6.1 SMB2 158 Close Request File: 008.7067 42 0.375222020 192.168.6.1 -> 192.168.6.2 SMB2 194 Close Response The first page read (which overlaps with the writes and fzero) returns an EOF error. Shouldn't the FSCTL_SET_ZERO_DATA have extended the file to 4096? Also, I wonder what happened to the pagecache: the second read shouldn't have happened because fallocate() shouldn't have touched that page. Also, the writeback happens last - I wonder if fallocate should flush. Not that it should be necessary if it's just modifying pages already in the pagecache. David ^ permalink raw reply [flat|nested] 5+ messages in thread
* Incorrect fallocate behaviour in cifs or samba? 2022-01-13 13:18 cifs fallocate doesn't flush writes? David Howells @ 2022-01-13 15:20 ` David Howells 2022-01-13 17:43 ` Jeremy Allison 0 siblings, 1 reply; 5+ messages in thread From: David Howells @ 2022-01-13 15:20 UTC (permalink / raw) To: smfrench; +Cc: dhowells, Shyam Prasad N, jlayton, linux-cifs David Howells <dhowells@redhat.com> wrote: > If I do the following: > > mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0 > /usr/sbin/xfs_io -f -t \ > -c "pwrite -S 0x41 0 4096" > -c "pwrite -S 0x42 4096 4096" > -c "fzero 0 4096" \ > -c "pread 0 8192" \ > /xfstest.test/008.7067 > ... > 31 0.321638749 192.168.6.2 -> 192.168.6.1 SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067 So what I see is that Samba does: fallocate(24, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 4096) = 0 for this... but that's not what cifs was asked to do. Should Samba be using FALLOC_FL_ZERO_RANGE instead? Also cifs wasn't given FALLOC_FL_KEEP_SIZE, so the caller expected the file to be extended - smb3_zero_range(), however, doesn't seem necessarily to get this right. It seems to allow for KEEP_SIZE not being set by calling SMB2_set_eof() - but only if the range ends beyond the *local* i_size (which the server doesn't know about yet). However, we did two pwrites first, which moved the local i_size over. This means the server's EOF isn't extended - probably correctly, since we haven't saved the data, but the subsequent reads then fail. In this case, I wonder if the right thing to do is one of the following options: (1) Flush the pagecache before doing proceeding with the fallocate. (2) Modify the local pagecache and don't talk to the server at all unless there are gaps in the pagecache. (3) Keep separate track of where we think the server's EOF is and skip any read that's beyond that. (4) Treat a read ending in EOF as a read of blank data if it's below the local EOF. I guess that's what we do now. David ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Incorrect fallocate behaviour in cifs or samba? 2022-01-13 15:20 ` Incorrect fallocate behaviour in cifs or samba? David Howells @ 2022-01-13 17:43 ` Jeremy Allison 2022-01-13 18:16 ` David Disseldorp 0 siblings, 1 reply; 5+ messages in thread From: Jeremy Allison @ 2022-01-13 17:43 UTC (permalink / raw) To: David Howells; +Cc: smfrench, Shyam Prasad N, jlayton, linux-cifs On Thu, Jan 13, 2022 at 03:20:32PM +0000, David Howells wrote: >David Howells <dhowells@redhat.com> wrote: > >> If I do the following: >> >> mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0 >> /usr/sbin/xfs_io -f -t \ >> -c "pwrite -S 0x41 0 4096" >> -c "pwrite -S 0x42 4096 4096" >> -c "fzero 0 4096" \ >> -c "pread 0 8192" \ >> /xfstest.test/008.7067 >> ... >> 31 0.321638749 192.168.6.2 -> 192.168.6.1 SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067 > >So what I see is that Samba does: > > fallocate(24, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 4096) = 0 > >for this... but that's not what cifs was asked to do. Should Samba be using >FALLOC_FL_ZERO_RANGE instead? This is from fsctl_zero_data() in Samba. We have: /* * MS-FSCC <58> Section 2.3.67 * This FSCTL sets the range of bytes to zero (0) without extending the * file size. * * The VFS_FALLOCATE_FL_KEEP_SIZE flag is used to satisfy this * constraint. */ mode = VFS_FALLOCATE_FL_PUNCH_HOLE | VFS_FALLOCATE_FL_KEEP_SIZE; ret = SMB_VFS_FALLOCATE(fsp, mode, zdata_info.file_off, len); if (ret == -1) { status = map_nt_error_from_unix_common(errno); DEBUG(2, ("zero-data fallocate(0x%x) failed: %s\n", mode, strerror(errno))); return status; } if (!fsp->fsp_flags.is_sparse && lp_strict_allocate(SNUM(fsp->conn))) { /* * File marked non-sparse and "strict allocate" is enabled - * allocate the range that we just punched out. * In future FALLOC_FL_ZERO_RANGE could be used exclusively for * this, but it's currently only supported on XFS and ext4. * * The newly allocated range still won't be found by SEEK_DATA * for QAR, but stat.st_blocks will reflect it. */ ret = SMB_VFS_FALLOCATE(fsp, VFS_FALLOCATE_FL_KEEP_SIZE, zdata_info.file_off, len); Note the "currently only supported on XFS and ext4" problem with FALLOC_FL_ZERO_RANGE. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Incorrect fallocate behaviour in cifs or samba? 2022-01-13 17:43 ` Jeremy Allison @ 2022-01-13 18:16 ` David Disseldorp 2022-01-13 18:22 ` Jeremy Allison 0 siblings, 1 reply; 5+ messages in thread From: David Disseldorp @ 2022-01-13 18:16 UTC (permalink / raw) To: Jeremy Allison, David Howells Cc: smfrench, Shyam Prasad N, jlayton, linux-cifs Hi Jeremy and David, On Thu, 13 Jan 2022 09:43:32 -0800, Jeremy Allison wrote: > On Thu, Jan 13, 2022 at 03:20:32PM +0000, David Howells wrote: > >David Howells <dhowells@redhat.com> wrote: > > > >> If I do the following: > >> > >> mount //carina/test /xfstest.test -o user=shares,pass=foobar,noperm,vers=3.0,mfsymlinks,actimeo=0 > >> /usr/sbin/xfs_io -f -t \ > >> -c "pwrite -S 0x41 0 4096" > >> -c "pwrite -S 0x42 4096 4096" > >> -c "fzero 0 4096" \ > >> -c "pread 0 8192" \ > >> /xfstest.test/008.7067 > >> ... > >> 31 0.321638749 192.168.6.2 -> 192.168.6.1 SMB2 206 Ioctl Request FSCTL_SET_ZERO_DATA File: 008.7067 > > > >So what I see is that Samba does: > > > > fallocate(24, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 0, 4096) = 0 > > > >for this... but that's not what cifs was asked to do. Should Samba be using > >FALLOC_FL_ZERO_RANGE instead? > > This is from fsctl_zero_data() in Samba. We have: > > /* > * MS-FSCC <58> Section 2.3.67 > * This FSCTL sets the range of bytes to zero (0) without extending the > * file size. > * > * The VFS_FALLOCATE_FL_KEEP_SIZE flag is used to satisfy this > * constraint. > */ > > mode = VFS_FALLOCATE_FL_PUNCH_HOLE | VFS_FALLOCATE_FL_KEEP_SIZE; > ret = SMB_VFS_FALLOCATE(fsp, mode, zdata_info.file_off, len); > if (ret == -1) { > status = map_nt_error_from_unix_common(errno); > DEBUG(2, ("zero-data fallocate(0x%x) failed: %s\n", mode, > strerror(errno))); > return status; > } > > if (!fsp->fsp_flags.is_sparse && lp_strict_allocate(SNUM(fsp->conn))) { > /* > * File marked non-sparse and "strict allocate" is enabled - > * allocate the range that we just punched out. > * In future FALLOC_FL_ZERO_RANGE could be used exclusively for > * this, but it's currently only supported on XFS and ext4. > * > * The newly allocated range still won't be found by SEEK_DATA > * for QAR, but stat.st_blocks will reflect it. > */ > ret = SMB_VFS_FALLOCATE(fsp, VFS_FALLOCATE_FL_KEEP_SIZE, > zdata_info.file_off, len); > > Note the "currently only supported on XFS and ext4" problem > with FALLOC_FL_ZERO_RANGE. FWIW, Samba's fsctl_zero_data semantics are based on observed Windows server behaviour and the MS specs, which state(d at the time): /* * 2.3.57 FSCTL_SET_ZERO_DATA Request * * How an implementation zeros data within a file is implementation-dependent. * A file system MAY choose to deallocate regions of disk space that have been * zeroed.<50> * <50> * ... NTFS might deallocate disk space in the file if the file is stored on an * NTFS volume, and the file is sparse or compressed. It will free any allocated * space in chunks of 64 kilobytes that begin at an offset that is a multiple of * 64 kilobytes. Other bytes in the file (prior to the first freed 64-kilobyte * chunk and after the last freed 64-kilobyte chunk) will be zeroed but not * deallocated. */ IIRC while implementing this I observed Windows deallocation behaviour using FSCTL_QUERY_ALLOCATED_RANGES (referred to as QAR in the previous code snippit). Cheers, David ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Incorrect fallocate behaviour in cifs or samba? 2022-01-13 18:16 ` David Disseldorp @ 2022-01-13 18:22 ` Jeremy Allison 0 siblings, 0 replies; 5+ messages in thread From: Jeremy Allison @ 2022-01-13 18:22 UTC (permalink / raw) To: David Disseldorp Cc: David Howells, smfrench, Shyam Prasad N, jlayton, linux-cifs On Thu, Jan 13, 2022 at 07:16:07PM +0100, David Disseldorp wrote: >FWIW, Samba's fsctl_zero_data semantics are based on observed Windows >server behaviour and the MS specs, which state(d at the time): >/* > * 2.3.57 FSCTL_SET_ZERO_DATA Request > * > * How an implementation zeros data within a file is implementation-dependent. > * A file system MAY choose to deallocate regions of disk space that have been > * zeroed.<50> > * <50> > * ... NTFS might deallocate disk space in the file if the file is stored on an > * NTFS volume, and the file is sparse or compressed. It will free any allocated > * space in chunks of 64 kilobytes that begin at an offset that is a multiple of > * 64 kilobytes. Other bytes in the file (prior to the first freed 64-kilobyte > * chunk and after the last freed 64-kilobyte chunk) will be zeroed but not > * deallocated. > */ > >IIRC while implementing this I observed Windows deallocation behaviour >using FSCTL_QUERY_ALLOCATED_RANGES (referred to as QAR in the previous >code snippit). Oh thanks so much for clarifying David. It's always nice to hear things from the person who actually wrote the code :-). ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-01-13 18:22 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-01-13 13:18 cifs fallocate doesn't flush writes? David Howells 2022-01-13 15:20 ` Incorrect fallocate behaviour in cifs or samba? David Howells 2022-01-13 17:43 ` Jeremy Allison 2022-01-13 18:16 ` David Disseldorp 2022-01-13 18:22 ` Jeremy Allison
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.