From: Jiufei Xue <jiufei.xue@linux.alibaba.com>
To: Trond Myklebust <trondmy@hammerspace.com>,
	"aglo@umich.edu" <aglo@umich.edu>
Cc: "bfields@fieldses.org" <bfields@fieldses.org>,
	"Anna.Schumaker@netapp.com" <Anna.Schumaker@netapp.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"joseph.qi@linux.alibaba.com" <joseph.qi@linux.alibaba.com>
Subject: Re: [bug report] task hang while testing xfstests generic/323
Date: Tue, 5 Mar 2019 13:09:12 +0800
Message-ID: <1417d6c8-3602-d895-a036-44031a7c1e98@linux.alibaba.com>
In-Reply-To: <740c1ad6c5b2afad196af34f48dda7097b5102b7.camel@hammerspace.com>



On 2019/3/4 11:50 PM, Trond Myklebust wrote:
> On Mon, 2019-03-04 at 23:20 +0800, Jiufei Xue wrote:
>> Hi Trond,
>>
>>> On 2019/3/1 9:08 PM, Trond Myklebust wrote:
>>> On Fri, 2019-03-01 at 16:49 +0800, Jiufei Xue wrote:
>>>> On 2019/3/1 1:08 PM, Jiufei Xue wrote:
>>>>> Hi Olga,
>>>>>
>>>>>> On 2019/3/1 6:26 AM, Olga Kornievskaia wrote:
>>>>>> On Thu, Feb 28, 2019 at 5:11 AM Jiufei Xue <
>>>>>> jiufei.xue@linux.alibaba.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> When I tested xfstests generic/323 with NFSv4.1 and v4.2, the
>>>>>>> task occasionally changed to a zombie while a thread was hanging
>>>>>>> with the following stack:
>>>>>>>
>>>>>>> [<0>] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc]
>>>>>>> [<0>] nfs4_do_close+0x21b/0x2c0 [nfsv4]
>>>>>>> [<0>] __put_nfs_open_context+0xa2/0x110 [nfs]
>>>>>>> [<0>] nfs_file_release+0x35/0x50 [nfs]
>>>>>>> [<0>] __fput+0xa2/0x1c0
>>>>>>> [<0>] task_work_run+0x82/0xa0
>>>>>>> [<0>] do_exit+0x2ac/0xc20
>>>>>>> [<0>] do_group_exit+0x39/0xa0
>>>>>>> [<0>] get_signal+0x1ce/0x5d0
>>>>>>> [<0>] do_signal+0x36/0x620
>>>>>>> [<0>] exit_to_usermode_loop+0x5e/0xc2
>>>>>>> [<0>] do_syscall_64+0x16c/0x190
>>>>>>> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>>>>> [<0>] 0xffffffffffffffff
>>>>>>>
>>>>>>> Since commit 12f275cdd163 ("NFSv4: Retry CLOSE and DELEGRETURN
>>>>>>> on NFS4ERR_OLD_STATEID"), the client retries the CLOSE when the
>>>>>>> stateid generation number held by the client is lower than the
>>>>>>> server's.
>>>>>>>
>>>>>>> The original intention of this commit was to retry the operation
>>>>>>> while racing with an OPEN. However, in this case the stateid
>>>>>>> generations remain mismatched forever.
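>>>>>>>
>>>>>>> To illustrate the check involved: the server compares the seqid
>>>>>>> (generation number) carried in the stateid we send against its
>>>>>>> current seqid for that state, and an older value gets
>>>>>>> NFS4ERR_OLD_STATEID (10024). A minimal userspace sketch of that
>>>>>>> comparison (purely illustrative; the names are made up, this is
>>>>>>> not the nfsd code):
>>>>>>>
>>>>>>> #include <stdint.h>
>>>>>>>
>>>>>>> #define NFS4_OK             0
>>>>>>> #define NFS4ERR_OLD_STATEID 10024
>>>>>>> #define NFS4ERR_BAD_STATEID 10025
>>>>>>>
>>>>>>> struct stateid4 {
>>>>>>>         uint32_t seqid;         /* generation number */
>>>>>>>         char other[12];         /* opaque part, not compared here */
>>>>>>> };
>>>>>>>
>>>>>>> /* Server-side view: 'in' is the stateid the client sent with
>>>>>>>  * CLOSE, 'cur' is the server's current stateid for that open. */
>>>>>>> static int seqid_check(const struct stateid4 *in,
>>>>>>>                        const struct stateid4 *cur)
>>>>>>> {
>>>>>>>         if (in->seqid == cur->seqid)
>>>>>>>                 return NFS4_OK;
>>>>>>>         if (in->seqid < cur->seqid)
>>>>>>>                 /* client missed a later OPEN: refresh and retry */
>>>>>>>                 return NFS4ERR_OLD_STATEID;
>>>>>>>         return NFS4ERR_BAD_STATEID;
>>>>>>> }
>>>>>>>
>>>>>>> int main(void)
>>>>>>> {
>>>>>>>         struct stateid4 in  = { .seqid = 1 };
>>>>>>>         struct stateid4 cur = { .seqid = 2 };
>>>>>>>
>>>>>>>         /* stale generation -> server answers NFS4ERR_OLD_STATEID */
>>>>>>>         return seqid_check(&in, &cur) == NFS4ERR_OLD_STATEID ? 0 : 1;
>>>>>>> }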
>>>>>>>
>>>>>>> Any suggestions?
>>>>>>
>>>>>> Can you include a network trace of the failure? Is it possible
>>>>>> that the server has crashed on reply to the close and that's why
>>>>>> the task is hung? What server are you testing against?
>>>>>>
>>>>> I am testing against the local server; the test case is attached below.
>>>>>
>>>>> # uname -r
>>>>> 5.0.0-rc8
>>>>>
>>>>> # cat /etc/exports
>>>>> /mnt/vdd *(rw,no_subtree_check,no_root_squash)
>>>>>
>>>>> # cat test.sh
>>>>> mount /dev/vdd /mnt/vdd
>>>>> systemctl start nfs
>>>>> mount -t nfs -o rw,relatime,vers=4.1 127.0.0.1:/mnt/vdd /mnt/nfsscratch
>>>>>
>>>>> for ((i=0;i<10000;i++))
>>>>> do
>>>>>         echo "loop $i started"
>>>>>         /usr/local/xfstests/src/aio-dio-regress/aio-last-ref-held-by-io 1 100 /mnt/nfsscratch/aio-testfile
>>>>>         sleep 5
>>>>> done
>>>>>
>>>>> The aio-last-ref-held-by-io command from xfstests spawns 100
>>>>> threads. Each thread repeatedly does
>>>>> open/io_submit/close/io_getevents/io_destroy until the parent
>>>>> exits because of a timeout.
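>>>>>
>>>>> Roughly, each worker thread does something like the sketch below
>>>>> (a simplified model assuming libaio; the I/O size, open flags and
>>>>> thread handling are illustrative, not the actual xfstests source):
>>>>>
>>>>> /* build with: gcc -O2 worker.c -laio -lpthread */
>>>>> #define _GNU_SOURCE
>>>>> #include <libaio.h>
>>>>> #include <fcntl.h>
>>>>> #include <pthread.h>
>>>>> #include <stdlib.h>
>>>>> #include <unistd.h>
>>>>>
>>>>> static void *worker(void *path)
>>>>> {
>>>>>         struct iocb cb, *cbs[1] = { &cb };
>>>>>         struct io_event ev;
>>>>>         void *buf;
>>>>>
>>>>>         if (posix_memalign(&buf, 4096, 4096))
>>>>>                 return NULL;
>>>>>         for (;;) {      /* the real test loops until the parent times out */
>>>>>                 io_context_t ctx = 0;
>>>>>                 int fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0644);
>>>>>
>>>>>                 if (fd < 0 || io_setup(1, &ctx))
>>>>>                         break;
>>>>>                 io_prep_pwrite(&cb, fd, buf, 4096, 0);
>>>>>                 io_submit(ctx, 1, cbs);
>>>>>                 /* close() while the aio is still in flight, so the
>>>>>                  * completion may hold the last reference to the file */
>>>>>                 close(fd);
>>>>>                 io_getevents(ctx, 1, 1, &ev, NULL);
>>>>>                 io_destroy(ctx);
>>>>>         }
>>>>>         free(buf);
>>>>>         return NULL;
>>>>> }
>>>>>
>>>>> int main(int argc, char **argv)
>>>>> {
>>>>>         pthread_t t[100];
>>>>>
>>>>>         /* the real test spawns 100 such threads and the parent
>>>>>          * exits on timeout */
>>>>>         for (int i = 0; i < 100; i++)
>>>>>                 pthread_create(&t[i], NULL, worker,
>>>>>                                argc > 1 ? argv[1] : "/mnt/nfsscratch/aio-testfile");
>>>>>         sleep(10);
>>>>>         return 0;
>>>>> }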
>>>>>
>>>>>
>>>>> Here is the log when the task hung.
>>>>>
>>>>> [10841.121523] nfs4_close_prepare: begin!
>>>>> [10841.121526] --> nfs4_alloc_slot used_slots=0000
>>>>> highest_used=4294967295 max_slots=10
>>>>> [10841.121528] <-- nfs4_alloc_slot used_slots=0001
>>>>> highest_used=0
>>>>> slotid=0
>>>>> [10841.121530] nfs4_close_prepare: done!
>>>>> [10841.121536] encode_sequence:
>>>>> sessionid=1551405749:3822331530:1:0
>>>>> seqid=59495845 slotid=0 max_slotid=0 cache_this=1
>>>>> [10841.121583] nfsd_dispatch: vers 4 proc 1
>>>>> [10841.121588] __find_in_sessionid_hashtbl:
>>>>> 1551405749:3822331530:1:0
>>>>> [10841.121590] nfsd4_sequence: slotid 0
>>>>> [10841.121592] check_slot_seqid enter. seqid 59495845
>>>>> slot_seqid
>>>>> 59495844
>>>>> [10841.121596] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9
>>>>> 61e12d9e c03beba6 0000000b)
>>>>> [10841.121607] nfsd: fh_verify(28: 01060001 0c66c6db d74ea4c9
>>>>> 61e12d9e c03beba6 0000000b)
>>>>> [10841.121613] NFSD: nfsd4_close on file aio-testfile
>>>>> [10841.121615] NFSD: nfs4_preprocess_seqid_op: seqid=0 stateid
>>>>> =
>>>>> (5c7892b5/e3d4268a/000ab905/0000746b)
>>>>> [10841.121619] nfsv4 compound returned 10024
>>>>> [10841.121621] --> nfsd4_store_cache_entry slot
>>>>> 00000000f540ddd9
>>>>> [10841.121625] renewing client (clientid 5c7892b5/e3d4268a)
>>>>> [10841.122526] decode_attr_type: type=00
>>>>> ......
>>>>> [10841.122558] nfs4_close_done: begin!
>>>>> [10841.122561] --> nfs4_alloc_slot used_slots=0001
>>>>> highest_used=0
>>>>> max_slots=10
>>>>> [10841.122564] <-- nfs4_alloc_slot used_slots=0003
>>>>> highest_used=1
>>>>> slotid=1
>>>>> [10841.122566] nfs4_free_slot: slotid 1 highest_used_slotid 0
>>>>> [10841.122568] nfs41_sequence_process: Error 0 free the slot
>>>>> [10841.122571] nfs4_free_slot: slotid 0 highest_used_slotid
>>>>> 4294967295
>>>>> [10841.122575] nfs4_close_done: done, ret = 0!
>>>>> [10841.122584] nfs4_close_prepare: begin!
>>>>> ......
>>>>> The client keeps retrying the CLOSE.
>>>>>
>>>>>> I have seen a trace where CLOSE would get ERR_OLD_STATEID and
>>>>>> would still retry with the same open state until it got a reply
>>>>>> to the OPEN which changed the state; when the client received
>>>>>> that reply, it would retry the CLOSE with the updated stateid.
>>>>>>
>>>>>
>>>>> Yes, you are right. I have also noticed that. But what happens if
>>>>> the OPEN task exits after receiving a signal? How can the client
>>>>> update the stateid?
>>>>>
>>>
>>> static void nfs4_open_release(void *calldata)
>>> {
>>>         struct nfs4_opendata *data = calldata;
>>>         struct nfs4_state *state = NULL;
>>>
>>>         /* If this request hasn't been cancelled, do nothing */
>>>         if (!data->cancelled)
>>>                 goto out_free;
>>>         /* In case of error, no cleanup! */
>>>         if (data->rpc_status != 0 || !data->rpc_done)
>>>                 goto out_free;
>>>         /* In case we need an open_confirm, no cleanup! */
>>>         if (data->o_res.rflags & NFS4_OPEN_RESULT_CONFIRM)
>>>                 goto out_free;
>>>         state = nfs4_opendata_to_nfs4_state(data);
>>>         if (!IS_ERR(state))
>>>                 nfs4_close_state(state, data->o_arg.fmode);
>>> out_free:
>>>         nfs4_opendata_put(data);
>>> }
>>>
>>> Note what happens when the 'data->cancelled' flag is set. This is
>>> why we always ensure that stateful operations such as OPEN, LOCK, ...
>>> are run as asynchronous RPC tasks.
>>
>> I noticed that when a signal is received, rpc_status is set to
>> -ERESTARTSYS in rpc_exit->rpc_exit_task->nfs4_open_done, which keeps
>> the stateid from being updated in nfs4_open_release().
>>
>> So is it possible that the stateids on the client and the server
>> remain mismatched if the OPEN task receives the signal right after
>> transmitting the message or receiving the response, but before
>> calling rpc_exit_task normally?
> 
> If the rpc_task could be signalled, yes. However I don't see how that
> can happen to an async RPC call. Have you found such a case?
>

I haven't found such a case, just a guess :)

But now I have found something new that may be helpful. When the problem
happened, rpc_status was set to -121 (-EREMOTEIO), which was returned from
nfs4_xdr_dec_open->decode_putfh. I think this is why the client cannot
update the stateid.
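
To make the failure concrete, here is a toy model of the loop this
seems to leave us in (purely illustrative, not the kernel code): the
OPEN reply that would have carried the new seqid is discarded because
of the decode error, so the cached open stateid keeps the old
generation and every CLOSE retry presents the same stale value:

/* Toy model of the CLOSE livelock. The seqids match the trace below:
 * the server is at 40290 after the OPEN whose reply we discarded,
 * while the client still caches 40289. */
#include <stdio.h>
#include <stdint.h>

#define NFS4ERR_OLD_STATEID 10024

static uint32_t server_seqid = 40290;   /* bumped by the lost OPEN reply */
static uint32_t client_seqid = 40289;   /* cached open stateid, never refreshed */

static int server_close(uint32_t seqid)
{
        return seqid < server_seqid ? NFS4ERR_OLD_STATEID : 0;
}

int main(void)
{
        int tries = 0;

        /* the client keeps restarting the CLOSE while the generations
         * differ; without a successful OPEN reply to refresh
         * client_seqid this never ends (capped here so the demo stops) */
        while (server_close(client_seqid) == NFS4ERR_OLD_STATEID && ++tries < 5)
                printf("CLOSE retry %d with stale seqid %u (server at %u)\n",
                       tries, client_seqid, server_seqid);
        return 0;
}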

    kworker/u4:0-7     [001] ....   559.920542: nfs4_xdr_dec_open: status=0 openstateid=40289:0xd4bdfeb7
     kworker/1:0-2929  [001] ....   559.920550: nfs4_open_stateid_update: error=0 (OK) fileid=00:2b:12 fhandle=0x1e4566cb stateid=40289:0xd4bdfeb7
     kworker/1:0-2929  [001] ....   559.920551: nfs4_open_release: cancel=1 rpc_status=0 rpc_done=1 openstateid=40289:0xd4bdfeb7

            nfsd-2741  [001] ....   559.920571: nfsd_compound_status: op=1/6 OP_SEQUENCE status=0
            nfsd-2741  [001] ....   559.920575: nfsd_compound_status: op=2/6 OP_PUTFH status=0
            nfsd-2741  [001] ....   559.920581: nfsd_inc_and_copy_stateid: client 5c7df867:90134041 stateid 00031b7d:00009d62 // 40290
            nfsd-2741  [001] ....   559.920584: nfsd_compound_status: op=3/6 OP_OPEN status=0
            nfsd-2741  [001] ....   559.920584: nfsd_compound_status: op=4/6 OP_GETFH status=0
            nfsd-2741  [001] ....   559.920585: nfsd_compound_status: op=5/6 OP_ACCESS status=0
            nfsd-2741  [001] ....   559.920588: nfsd_compound_status: op=6/6 OP_GETATTR status=0
    kworker/u4:0-7     [001] ....   559.920622: nfs4_xdr_dec_open: status=-121 openstateid=0:0x7bd5c66f // added for debug
     kworker/1:0-2929  [001] ....   559.920628: nfs4_open_release: cancel=1 rpc_status=-121 rpc_done=1 openstateid=0:0x7bd5c66f // added for debug

And we got this error in dmesg:
[  559.834808] nfs: Server returned operation 53 but we issued a request for 22
[  559.835741] nfs: Server returned operation 53 but we issued a request for 22
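
For reference, op 53 is OP_SEQUENCE and op 22 is OP_PUTFH, and -121 is
-EREMOTEIO, which is apparently what the decoder hands back when the
operation number in the reply does not match the one it expected, so
the whole OPEN reply (including the new stateid) is discarded. A
simplified userspace sketch of the kind of check decode_putfh performs
on the op header (illustrative, not the actual nfs4xdr.c code):

#include <stdio.h>
#include <errno.h>

#define OP_PUTFH    22
#define OP_SEQUENCE 53

static int decode_op_hdr(unsigned int op_in_reply, unsigned int expected)
{
        if (op_in_reply != expected) {
                fprintf(stderr,
                        "nfs: Server returned operation %u but we issued a request for %u\n",
                        op_in_reply, expected);
                return -EREMOTEIO;      /* -121: reply unusable, stateid not updated */
        }
        return 0;
}

int main(void)
{
        /* decode_putfh expected OP_PUTFH (op 2 of the compound) but found
         * another SEQUENCE header, so nfs4_xdr_dec_open fails with -121 */
        return decode_op_hdr(OP_SEQUENCE, OP_PUTFH) ? 1 : 0;
}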

However, I don't know why the client received two OP_SEQUENCE responses.
Any ideas?

Thanks,
Jiufei


Thread overview: 26+ messages
2019-02-28 10:10 [bug report] task hang while testing xfstests generic/323 Jiufei Xue
2019-02-28 22:26 ` Olga Kornievskaia
2019-02-28 23:56   ` Trond Myklebust
2019-03-01  5:19     ` Jiufei Xue
2019-03-01  5:08   ` Jiufei Xue
2019-03-01  8:49     ` Jiufei Xue
2019-03-01 13:08       ` Trond Myklebust
2019-03-02 16:34         ` Jiufei Xue
2019-03-04 15:20         ` Jiufei Xue
2019-03-04 15:50           ` Trond Myklebust
2019-03-05  5:09             ` Jiufei Xue [this message]
2019-03-05 14:45               ` Trond Myklebust
2019-03-06  9:59                 ` Jiufei Xue
2019-03-06 16:09                   ` bfields
2019-03-10 22:20                     ` Olga Kornievskaia
2019-03-11 14:30                       ` Trond Myklebust
2019-03-11 15:07                         ` Olga Kornievskaia
2019-03-11 15:13                           ` Olga Kornievskaia
2019-03-15  6:30                             ` Jiufei Xue
2019-03-15 20:33                               ` Olga Kornievskaia
2019-03-15 20:55                                 ` Trond Myklebust
2019-03-16 14:11                                 ` Jiufei Xue
2019-03-19 15:33                                   ` Olga Kornievskaia
2019-03-11 15:12                         ` Trond Myklebust
2019-03-11 15:14                           ` Olga Kornievskaia
2019-03-11 15:28                             ` Trond Myklebust
