* generic/430 COPY/delegation caching regression
@ 2021-04-13 23:19 J. Bruce Fields
  2021-04-14  3:09 ` Trond Myklebust
  2021-04-14  3:30 ` Kornievskaia, Olga
  0 siblings, 2 replies; 5+ messages in thread
From: J. Bruce Fields @ 2021-04-13 23:19 UTC
  To: Trond Myklebust; +Cc: linux-nfs, Olga Kornievskaia

generic/430 started failing in 5.12-rc3, as of 7c1d1dcc24b3 "nfsd: grant
read delegations to clients holding writes".

Looks like that reintroduced the problem fixed by 16abd2a0c124 "NFSv4.2:
fix client's attribute cache management for copy_file_range": the client
needs to invalidate its cache of the destination of a copy even when it
holds a delegation.

--b.


* Re: generic/430 COPY/delegation caching regression
  2021-04-13 23:19 generic/430 COPY/delegation caching regression J. Bruce Fields
@ 2021-04-14  3:09 ` Trond Myklebust
  2021-04-14 14:04   ` bfields
  2021-04-14  3:30 ` Kornievskaia, Olga
  1 sibling, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2021-04-14  3:09 UTC
  To: bfields; +Cc: linux-nfs, kolga

On Tue, 2021-04-13 at 19:19 -0400, J. Bruce Fields wrote:
> generic/430 started failing in 5.12-rc3, as of 7c1d1dcc24b3 "nfsd:
> grant
> read delegations to clients holding writes".
> 
> Looks like that reintroduced the problem fixed by 16abd2a0c124
> "NFSv4.2:
> fix client's attribute cache management for copy_file_range": the
> client
> needs to invalidate its cache of the destination of a copy even when it
> holds a delegation.
> 
> --b.

Hmm.. The only thing I see that could be causing an issue is the fact
that we're relying on cache invalidation to change the file size. 

        nfs_set_cache_invalid(
                dst_inode, NFS_INO_REVAL_PAGECACHE | NFS_INO_REVAL_FORCED |
                                   NFS_INO_INVALID_SIZE | NFS_INO_INVALID_ATTR |
                                   NFS_INO_INVALID_DATA);

The only problem there is that nfs_set_cache_invalid() will clobber the
NFS_INO_INVALID_SIZE because if we hold a delegation, then our client
is the sole authority for the size attribute (hence we don't allow it
to be invalidated). We therefore expect a call to i_size_write(), if
the file size grew.

Otherwise, the setting of NFS_INO_INVALID_DATA should be redundant
because we've already punched a hole with truncate_pagecache_range().
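
A minimal user-space sketch of the behaviour described above, for
illustration only. It is not the kernel code: the struct, the sk_* names
and the flag values are all invented here, and the model assumes only what
the text states, namely that the size-invalid flag is dropped while a
delegation is held and that the size is then expected to be updated
explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits. The values are made up for this sketch and do
 * not match the kernel's NFS_INO_* definitions. */
enum {
    SK_INVALID_ATTR = 1 << 0,
    SK_INVALID_DATA = 1 << 1,
    SK_INVALID_SIZE = 1 << 2
};

/* Hypothetical stand-in for the client's per-inode state. */
struct sk_inode {
    bool     has_delegation; /* client holds a delegation on the file */
    uint32_t cache_validity; /* pending invalidation flags */
    uint64_t size;           /* cached file size */
};

/* Model of the masking: with a delegation held, the client is treated as
 * authoritative for the size, so the size-invalid bit is dropped instead
 * of being recorded. */
void sk_set_cache_invalid(struct sk_inode *ino, uint32_t flags)
{
    if (ino->has_delegation)
        flags &= ~(uint32_t)SK_INVALID_SIZE;
    ino->cache_validity |= flags;
}

/* Model of the explicit update that is expected instead: grow the cached
 * size when the copied range ends past the current end of file (the
 * analogue of calling i_size_write()), then mark the cache invalid. */
void sk_copy_done(struct sk_inode *ino, uint64_t pos, uint64_t count)
{
    uint64_t end = pos + count;

    if (end > ino->size)
        ino->size = end;
    sk_set_cache_invalid(ino, SK_INVALID_ATTR | SK_INVALID_DATA |
                              SK_INVALID_SIZE);
}
```

In this model, with has_delegation set, sk_set_cache_invalid() on its own
leaves both the size and the size-invalid bit untouched, so only the
explicit update in sk_copy_done() makes the new length visible.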

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: generic/430 COPY/delegation caching regression
  2021-04-13 23:19 generic/430 COPY/delegation caching regression J. Bruce Fields
  2021-04-14  3:09 ` Trond Myklebust
@ 2021-04-14  3:30 ` Kornievskaia, Olga
  2021-04-14 13:49   ` J. Bruce Fields
  1 sibling, 1 reply; 5+ messages in thread
From: Kornievskaia, Olga @ 2021-04-14  3:30 UTC
  To: J. Bruce Fields, Trond Myklebust; +Cc: linux-nfs



On 4/13/21, 7:20 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:

    generic/430 started failing in 5.12-rc3, as of 7c1d1dcc24b3 "nfsd: grant
    read delegations to clients holding writes".

    Looks like that reintroduced the problem fixed by 16abd2a0c124 "NFSv4.2:
    fix client's attribute cache management for copy_file_range": the client
    needs to invalidate its cache of the destination of a copy even when it
    holds a delegation.

[olga] I'm confused: what client version are you testing, and against what server? I haven't seen generic/430 fail while testing upstream clients against upstream servers. What should I try (that is, which client version against which server version) to reproduce the failure?

    --b.



* Re: generic/430 COPY/delegation caching regression
  2021-04-14  3:30 ` Kornievskaia, Olga
@ 2021-04-14 13:49   ` J. Bruce Fields
  0 siblings, 0 replies; 5+ messages in thread
From: J. Bruce Fields @ 2021-04-14 13:49 UTC
  To: Kornievskaia, Olga; +Cc: Trond Myklebust, linux-nfs

On Wed, Apr 14, 2021 at 03:30:19AM +0000, Kornievskaia, Olga wrote:
> On 4/13/21, 7:20 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
>     generic/430 started failing in 5.12-rc3, as of 7c1d1dcc24b3 "nfsd: grant
>     read delegations to clients holding writes".
> 
>     Looks like that reintroduced the problem fixed by 16abd2a0c124 "NFSv4.2:
>     fix client's attribute cache management for copy_file_range": the client
>     needs to invalidate its cache of the destination of a copy even when it
>     holds a delegation.
> 
> [olga] I'm confused: what client version are you testing, and against what server? I haven't seen generic/430 fail while testing upstream clients against upstream servers. What should I try (that is, which client version against which server version) to reproduce the failure?

You can reproduce it with client and server both on rc3.

(In more detail: you need a server with 7c1d1dcc24b3, but one that
doesn't yet have 6ee65a773096 "Revert "nfsd4: a client's own opens
needn't prevent delegation"".

I have a patch that will restore the server's ability to grant
delegations to clients with write opens, but this regression was one of
the problems I ran across in testing....)

--b.


* Re: generic/430 COPY/delegation caching regression
  2021-04-14  3:09 ` Trond Myklebust
@ 2021-04-14 14:04   ` bfields
  0 siblings, 0 replies; 5+ messages in thread
From: bfields @ 2021-04-14 14:04 UTC
  To: Trond Myklebust; +Cc: linux-nfs, kolga

On Wed, Apr 14, 2021 at 03:09:18AM +0000, Trond Myklebust wrote:
> On Tue, 2021-04-13 at 19:19 -0400, J. Bruce Fields wrote:
> > generic/430 started failing in 5.12-rc3, as of 7c1d1dcc24b3 "nfsd:
> > grant
> > read delegations to clients holding writes".
> > 
> > Looks like that reintroduced the problem fixed by 16abd2a0c124
> > "NFSv4.2:
> > fix client's attribute cache management for copy_file_range": the
> > client
> > needs to invalidate its cache of the destination of a copy even when it
> > holds a delegation.
> > 
> > --b.
> 
> Hmm.. The only thing I see that could be causing an issue is the fact
> that we're relying on cache invalidation to change the file size. 
> 
>         nfs_set_cache_invalid(
>                 dst_inode, NFS_INO_REVAL_PAGECACHE | NFS_INO_REVAL_FORCED |
>                                    NFS_INO_INVALID_SIZE | NFS_INO_INVALID_ATTR |
>                                    NFS_INO_INVALID_DATA);
> 
> The only problem there is that nfs_set_cache_invalid() will clobber the
> NFS_INO_INVALID_SIZE because if we hold a delegation, then our client
> is the sole authority for the size attribute (hence we don't allow it
> to be invalidated). We therefore expect a call to i_size_write(), if
> the file size grew.
> 
> Otherwise, the setting of NFS_INO_INVALID_DATA should be redundant
> because we've already punched a hole with truncate_pagecache_range().

Looks like it's just copying a file and finding the destination still empty;
expected/actual output diff from xfstests is:

     e11fbace556cba26bf0076e74cab90a3  TEST_DIR/test-430/file
     e11fbace556cba26bf0076e74cab90a3  TEST_DIR/test-430/copy
     Copy beginning of original file
    +cmp: EOF on /mnt/test-430/beginning which is empty
     md5sums after copying beginning:
     e11fbace556cba26bf0076e74cab90a3  TEST_DIR/test-430/file
    -cabe45dcc9ae5b66ba86600cca6b8ba8  TEST_DIR/test-430/beginning

The test script there is:

echo "Create the original file and then copy"
$XFS_IO_PROG -f -c 'pwrite -S 0x61 0    1000' $testdir/file >> $seqres.full 2>&1
$XFS_IO_PROG -f -c 'pwrite -S 0x62 1000 1000' $testdir/file >> $seqres.full 2>&1
$XFS_IO_PROG -f -c 'pwrite -S 0x63 2000 1000' $testdir/file >> $seqres.full 2>&1
$XFS_IO_PROG -f -c 'pwrite -S 0x64 3000 1000' $testdir/file >> $seqres.full 2>&1
$XFS_IO_PROG -f -c 'pwrite -S 0x65 4000 1000' $testdir/file >> $seqres.full 2>&1
$XFS_IO_PROG -f -c "copy_range $testdir/file" "$testdir/copy"
cmp $testdir/file $testdir/copy
echo "Original md5sums:"
md5sum $testdir/{file,copy} | _filter_test_dir

echo "Copy beginning of original file"
$XFS_IO_PROG -f -c "copy_range -l 1000 $testdir/file" "$testdir/beginning"
cmp -n 1000 $testdir/file $testdir/beginning

If the client is just failing to notice when a newly created file's size is
grown as the result of a COPY, then I wonder why the first copy (of "file" to
"copy") didn't also fail.

--b.
