All of lore.kernel.org
 help / color / mirror / Atom feed
From: Olga Kornievskaia <aglo@umich.edu>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: commit 7c2dad99d6 "Don't let the ctime override attribute barriers"
Date: Thu, 31 Mar 2016 10:36:18 -0400	[thread overview]
Message-ID: <CAN-5tyE4OWJrBuP73tMab8NAy0_NzfZvVpSZQeSmpQ+qoDpcjQ@mail.gmail.com> (raw)
In-Reply-To: <CAHQdGtQjV2u+RrXiM6iKtz9in4zdDWDWi2gnj47-+WSdG=JKzw@mail.gmail.com>

On Thu, Mar 31, 2016 at 10:21 AM, Trond Myklebust
<trond.myklebust@primarydata.com> wrote:
> On Thu, Mar 31, 2016 at 10:15 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>> On Wed, Mar 30, 2016 at 7:32 PM, Trond Myklebust
>> <trond.myklebust@primarydata.com> wrote:
>>> On Wed, Mar 30, 2016 at 5:51 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>> On Wed, Mar 30, 2016 at 5:45 PM, Trond Myklebust
>>>> <trond.myklebust@primarydata.com> wrote:
>>>>> On Wed, Mar 30, 2016 at 5:02 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>> On Tue, Mar 29, 2016 at 3:47 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>> I think that patch introduces a problem. Since the checking for the
>>>>>>> change in ctime was removed by the commit it leads to (improper) cache
>>>>>>> invalidation in NFSv3.
>>>>>>>
>>>>>>> Test is write 10240bytes to the server then read it. Expectation is
>>>>>>> not to see read on the wire. In the test the write is spread over
>>>>>>> 3rpcs.
>>>>>>>
>>>>>>> On the 1nd reply
>>>>>>> fattr->gencount=33 nfsi->gencount=32 generation_counter=35
>>>>>>> On the 2nd reply
>>>>>>> fattr->gencount=34 nfsi->gencount=36 generation_counter=36
>>>>>>>
>>>>>>> In the code when processing 2nd reply,
>>>>>>> nfs_post_op_update_inode_force_wcc_locked() calls into
>>>>>>> nfs_inode_attrs_need_update() it determines that it doesn't need to
>>>>>>> update them (even though the size and the time have changed). so it
>>>>>>> doesn't call nfs_wcc_update_inode() so the inode->i_version doesn't
>>>>>>> get set to the ctime that was received in the 2nd reply.
>>>>>>>
>>>>>>> On the 3rd reply
>>>>>>> fattr->gencount=37 nfsi->gencount=36 generation_counter=37
>>>>>>>
>>>>>>> It leads to nfs_inode_attrs_need_update() returns 1 and in the
>>>>>>> nfs_update_inode() the difference in the ctimes leads to invalidation.
>>>>>>> fattr->gencount was update from nfs_writeback_update_node() ->
>>>>>>> nfs_post_op_update_inode_force_wcc() calling nfs_fattr_set_barrier().
>>>>>>>
>>>>>>> I'm not sure what appropriate values for "gencount" should have been.
>>>>>>> But if the check for nfs_ctime_need_update() was still there in
>>>>>>> nfs_inode_attrs_need_update() then the 2nd reply would have
>>>>>>> appropriately updated the i_version and not lead to invalidation.
>>>>>>
>>>>>> Would like to add that this problem is not seen against the Linux
>>>>>> server because it doesn't send "before" attributes. So code doesn't
>>>>>> set the "pre_change_attr" which later doesn't make what's stored in
>>>>>> inode->i_version.
>>>>>>
>>>>>> The problem also not seen for v4 because pre_change_attr is not gotten
>>>>>> from the "before" attributes but instead from the previous value in
>>>>>> inode->i_version which is then compared to the itself.
>>>>>>
>>>>>> If reverting the problematic commit is not the solution, then how
>>>>>> about ignoring the "before" ctime attributes sent by the server. This
>>>>>> also helps with the out-of-order RPCs.
>>>>>
>>>>> Why bother doing that on the client? These attributes aren't mandatory
>>>>> to send...
>>>>>
>>>>
>>>> Leads to poor client performances. Every large enough read invalidates
>>>> the cache so all the reads go to the server always.
>>>
>>> I'm saying why not just turn off the WCC functionality on the server then.
>>
>> One reasoning could be that providing cache consistency is client's
>> responsibility. Server only provides needed information.
>
> Sure, but the server is the single point of control for all clients.

What about other client implementations that might use before attributes?

>> While I can see that handling out-of-order RPCs might not be worth it
>> because hopefully it doesn't happen often but handling in order RPCs
>> and always invalidating the cache is a bug.
>
> We don't invalidate on in-order RPCs.

Yes we do and that was the point of my initial post. The commit
introduced a regression.

  reply	other threads:[~2016-03-31 14:36 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-29 19:47 commit 7c2dad99d6 "Don't let the ctime override attribute barriers" Olga Kornievskaia
2016-03-30 21:02 ` Olga Kornievskaia
2016-03-30 21:45   ` Trond Myklebust
2016-03-30 21:51     ` Olga Kornievskaia
2016-03-30 23:32       ` Trond Myklebust
2016-03-31 14:15         ` Olga Kornievskaia
2016-03-31 14:21           ` Trond Myklebust
2016-03-31 14:36             ` Olga Kornievskaia [this message]
2016-03-31 15:16               ` Trond Myklebust
2016-03-31 15:27                 ` Olga Kornievskaia
2016-04-01 19:38                   ` Olga Kornievskaia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAN-5tyE4OWJrBuP73tMab8NAy0_NzfZvVpSZQeSmpQ+qoDpcjQ@mail.gmail.com \
    --to=aglo@umich.edu \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trond.myklebust@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.