From: Chao Yu <yuchao0@huawei.com>
To: Sahitya Tummala <stummala@codeaurora.org>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>,
linux-kernel@vger.kernel.org,
linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: fix long latency due to discard during umount
Date: Tue, 31 Mar 2020 09:46:30 +0800
Message-ID: <d65e7548-205d-ef28-e9fc-041ae1571cfd@huawei.com>
In-Reply-To: <20200330105122.GV20234@codeaurora.org>
Hi Sahitya,
On 2020/3/30 18:51, Sahitya Tummala wrote:
> Hi Chao,
>
> On Mon, Mar 30, 2020 at 06:16:40PM +0800, Chao Yu wrote:
>> On 2020/3/30 16:38, Chao Yu wrote:
>>> Hi Sahitya,
>>>
>>> Bad news, :( I guess we didn't catch the root cause: after applying v3,
>>> I can still reproduce this issue:
>>>
>>> generic/003 10s ... 30s
>>
>> I use zram as the backend device for fstests:
>>
>> Call Trace:
>> dump_stack+0x66/0x8b
>> f2fs_submit_discard_endio+0x88/0xa0 [f2fs]
>> generic_make_request_checks+0x70/0x5f0
>> generic_make_request+0x3e/0x2e0
>> submit_bio+0x72/0x140
>> __submit_discard_cmd.isra.50+0x4a8/0x710 [f2fs]
>> __issue_discard_cmd+0x171/0x3a0 [f2fs]
>>
>> Does this mean zram uses a single queue, so we may always fail to submit
>> 'nowait' IO due to the condition below:
>>
>> /*
>> * Non-mq queues do not honor REQ_NOWAIT, so complete a bio
>> * with BLK_STS_AGAIN status in order to catch -EAGAIN and
>> * to give a chance to the caller to repeat request gracefully.
>> */
>> if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q)) {
>> status = BLK_STS_AGAIN;
>> goto end_io;
>> }
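
For readers without the block layer source at hand, the check quoted above
can be modeled in user space. This is only a sketch under assumptions:
`struct fake_queue`, `struct fake_bio`, `FAKE_REQ_NOWAIT` and `fake_submit()`
are hypothetical stand-ins, not kernel APIs, and only the quoted condition is
reproduced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not real kernel definitions. */
#define FAKE_REQ_NOWAIT (1u << 0)

struct fake_queue {
	bool is_mq;		/* models the result of queue_is_mq(q) */
};

struct fake_bio {
	unsigned int opf;	/* models bio->bi_opf */
	int status;		/* 0 on success, negative errno on failure */
};

/*
 * Models the quoted condition: a NOWAIT bio sent to a non-mq queue is
 * completed right away with -EAGAIN, however often the caller retries.
 */
static int fake_submit(const struct fake_queue *q, struct fake_bio *bio)
{
	assert(q != NULL && bio != NULL);

	if ((bio->opf & FAKE_REQ_NOWAIT) && !q->is_mq) {
		bio->status = -EAGAIN;
		return bio->status;
	}

	bio->status = 0;
	return 0;
}
```

In this model, retrying the same NOWAIT bio against a non-mq queue can never
succeed, which would be consistent with the repeated err:-11 lines in the
log quoted further down.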
>>
>
> Yes, I have also just figured that out as the reason. But most real block
> device drivers support MQ. Can we fix this case by checking for MQ status
> before enabling REQ_NOWAIT, as below? Please share your comments.
>
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index cda7935..e7e2ffe 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1131,7 +1131,9 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
>
> flag = dpolicy->sync ? REQ_SYNC : 0;
> - flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> +
> + if (sbi->sb->s_bdev->bd_queue && queue_is_mq(sbi->sb->s_bdev->bd_queue))
> + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
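
The flag selection proposed in the hunk above can be sketched in user space
as follows. This is an illustration only: every `sketch_`/`SKETCH_` name is
hypothetical and merely mirrors the shape of the f2fs code; it is not the
kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real REQ_* values live in the kernel. */
#define SKETCH_REQ_SYNC   (1u << 0)
#define SKETCH_REQ_NOWAIT (1u << 1)

enum sketch_policy_type { SKETCH_DPOLICY_BG, SKETCH_DPOLICY_UMOUNT };

struct sketch_queue {
	bool is_mq;			/* models queue_is_mq(q) */
};

struct sketch_policy {
	enum sketch_policy_type type;	/* models dpolicy->type */
	bool sync;			/* models dpolicy->sync */
};

/*
 * Request NOWAIT only for the umount policy, and only when a queue exists
 * and is multi-queue; otherwise fall back to a blocking submission.
 */
static unsigned int sketch_discard_flags(const struct sketch_policy *p,
					 const struct sketch_queue *q)
{
	unsigned int flag;

	assert(p != NULL);

	flag = p->sync ? SKETCH_REQ_SYNC : 0;
	if (q != NULL && q->is_mq && p->type == SKETCH_DPOLICY_UMOUNT)
		flag |= SKETCH_REQ_NOWAIT;

	return flag;
}
```

With a non-mq queue (or no queue at all), the sketch never sets the NOWAIT
bit, so the umount path would submit blocking discards as it did before the
patch.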
IMO, this couples f2fs too tightly to block layer logic. However, I don't
have a better solution in mind.

Anyway, I guess we can Cc Jan and the block mailing list for comments, to see
whether there is a better solution.

Thoughts?

Thanks,
>
> if (dc->state != D_PREP)
> return 0;
>
> Thanks,
>
>>
>>
>>>
>>> Thanks,
>>>
>>> On 2020/3/30 14:53, Sahitya Tummala wrote:
>>>> Hi Chao,
>>>>
>>>> On Fri, Mar 27, 2020 at 08:35:42AM +0530, Sahitya Tummala wrote:
>>>>> On Fri, Mar 27, 2020 at 09:51:43AM +0800, Chao Yu wrote:
>>>>>>
>>>>>> With this patch, most xfstests cases take 5 * n seconds longer than before.
>>>>>>
>>>>>> E.g. in generic/003, during umount(), we loop retrying a single bio
>>>>>> submission.
>>>>>>
>>>>>> [61279.829724] F2FS-fs (zram1): Found nat_bits in checkpoint
>>>>>> [61279.885337] F2FS-fs (zram1): Mounted with checkpoint version = 5cf3cb8e
>>>>>> [61281.912832] submit discard bio start [23555,1]
>>>>>> [61281.912835] f2fs_submit_discard_endio [23555,1] err:-11
>>>>>> [61281.912836] submit discard bio end [23555,1]
>>>>>> [61281.912836] move dc to retry list [23555,1]
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> [61286.881212] submit discard bio start [23555,1]
>>>>>> [61286.881217] f2fs_submit_discard_endio [23555,1] err:-11
>>>>>> [61286.881223] submit discard bio end [23555,1]
>>>>>> [61286.881224] move dc to retry list [23555,1]
>>>>>> [61286.905198] submit discard bio start [23555,1]
>>>>>> [61286.905203] f2fs_submit_discard_endio [23555,1] err:-11
>>>>>> [61286.905205] submit discard bio end [23555,1]
>>>>>> [61286.905206] move dc to retry list [23555,1]
>>>>>> [61286.929157] F2FS-fs (zram1): Issue discard(23555, 23555, 1) failed, ret: -11
>>>>>>
>>>>>> Could you take a look at this issue?
>>>>>
>>>>> Let me check and get back on this.
>>>>
>>>> I found the issue. A dc with multiple bios is requeued again and again if
>>>> one of its bios gets an -EAGAIN error. Even the successfully completed bios
>>>> are requeued, resulting in long latency. I have fixed it by splitting the
>>>> dc in that case, so that only the leftover bios are requeued into a new dc
>>>> and retried later within the 5 sec timeout.
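
The splitting idea described above can be sketched in user space as follows.
This is a simplified model under stated assumptions and does not claim to
match the v3 patch: `struct sketch_dc`, `SKETCH_MAX_BIOS` and
`sketch_split_dc()` are hypothetical, and each bio is reduced to its
completion status.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_BIOS 8	/* hypothetical upper bound for the model */

/* Hypothetical model of a discard command: one status slot per bio. */
struct sketch_dc {
	size_t nr_bios;
	int bio_err[SKETCH_MAX_BIOS];	/* 0 = completed, -EAGAIN = retry */
};

/*
 * Move only the bios that failed with -EAGAIN from dc into retry.
 * Completed bios stay in dc, so they are not submitted a second time.
 * Returns the number of bios moved to retry.
 */
static size_t sketch_split_dc(struct sketch_dc *dc, struct sketch_dc *retry)
{
	size_t i, kept = 0;

	assert(dc != NULL && retry != NULL);
	assert(dc->nr_bios <= SKETCH_MAX_BIOS);

	retry->nr_bios = 0;
	for (i = 0; i < dc->nr_bios; i++) {
		if (dc->bio_err[i] == -EAGAIN)
			/* Clear the status so the bio can be resubmitted. */
			retry->bio_err[retry->nr_bios++] = 0;
		else
			/* Compact the remaining bios in place. */
			dc->bio_err[kept++] = dc->bio_err[i];
	}
	dc->nr_bios = kept;

	return retry->nr_bios;
}
```

Requeuing only the `retry` command keeps the retry cost proportional to the
bios that actually failed, rather than to the size of the original command.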
>>>>
>>>> Please review the posted v3. If it looks good, could you test the earlier
>>>> regression scenario with it and check the result again?
>>>>
>>>> thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>> + break;
>>>>>>>>> + }
>>>>>>>>> + }
>>>>>>>>>
>>>>>>>>> atomic_inc(&dcc->issued_discard);
>>>>>>>>>
>>>>>>>>> @@ -1463,6 +1477,40 @@ static unsigned int __issue_discard_cmd_orderly(struct f2fs_sb_info *sbi,
>>>>>>>>> return issued;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> +static bool __should_discard_retry(struct f2fs_sb_info *sbi,
>>>>>>>>> + struct discard_policy *dpolicy)
>>>>>>>>> +{
>>>>>>>>> + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
>>>>>>>>> + struct discard_cmd *dc, *tmp;
>>>>>>>>> + bool retry = false;
>>>>>>>>> + unsigned long flags;
>>>>>>>>> +
>>>>>>>>> + if (dpolicy->type != DPOLICY_UMOUNT)
>>>>>>>>> + f2fs_bug_on(sbi, 1);
>>>>>>>>> +
>>>>>>>>> + mutex_lock(&dcc->cmd_lock);
>>>>>>>>> + list_for_each_entry_safe(dc, tmp, &(dcc->retry_list), list) {
>>>>>>>>> + if (dpolicy->timeout != 0 &&
>>>>>>>>> + f2fs_time_over(sbi, dpolicy->timeout)) {
>>>>>>>>> + retry = false;
>>>>>>>>> + break;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + spin_lock_irqsave(&dc->lock, flags);
>>>>>>>>> + if (!dc->bio_ref) {
>>>>>>>>> + dc->state = D_PREP;
>>>>>>>>> + dc->error = 0;
>>>>>>>>> + reinit_completion(&dc->wait);
>>>>>>>>> + __relocate_discard_cmd(dcc, dc);
>>>>>>>>> + retry = true;
>>>>>>>>> + }
>>>>>>>>> + spin_unlock_irqrestore(&dc->lock, flags);
>>>>>>>>> + }
>>>>>>>>> + mutex_unlock(&dcc->cmd_lock);
>>>>>>>>> +
>>>>>>>>> + return retry;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>>>>>>>> struct discard_policy *dpolicy)
>>>>>>>>> {
>>>>>>>>> @@ -1470,12 +1518,13 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>>>>>>>> struct list_head *pend_list;
>>>>>>>>> struct discard_cmd *dc, *tmp;
>>>>>>>>> struct blk_plug plug;
>>>>>>>>> - int i, issued = 0;
>>>>>>>>> + int i, err, issued = 0;
>>>>>>>>> bool io_interrupted = false;
>>>>>>>>>
>>>>>>>>> if (dpolicy->timeout != 0)
>>>>>>>>> f2fs_update_time(sbi, dpolicy->timeout);
>>>>>>>>>
>>>>>>>>> +retry:
>>>>>>>>> for (i = MAX_PLIST_NUM - 1; i >= 0; i--) {
>>>>>>>>> if (dpolicy->timeout != 0 &&
>>>>>>>>> f2fs_time_over(sbi, dpolicy->timeout))
>>>>>>>>> @@ -1509,7 +1558,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>>>>>>>> break;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> - __submit_discard_cmd(sbi, dpolicy, dc, &issued);
>>>>>>>>> + err = __submit_discard_cmd(sbi, dpolicy, dc, &issued);
>>>>>>>>> + if (err == -EAGAIN)
>>>>>>>>> + congestion_wait(BLK_RW_ASYNC,
>>>>>>>>> + DEFAULT_IO_TIMEOUT);
>>>>>>>>>
>>>>>>>>> if (issued >= dpolicy->max_requests)
>>>>>>>>> break;
>>>>>>>>> @@ -1522,6 +1574,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>>>>>>>> break;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> + if (!list_empty(&dcc->retry_list) &&
>>>>>>>>> + __should_discard_retry(sbi, dpolicy))
>>>>>>>>> + goto retry;
>>>>>>>>> +
>>>>>>>>> if (!issued && io_interrupted)
>>>>>>>>> issued = -1;
>>>>>>>>>
>>>>>>>>> @@ -1613,6 +1669,12 @@ static unsigned int __wait_discard_cmd_range(struct f2fs_sb_info *sbi,
>>>>>>>>> goto next;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> + if (dpolicy->type == DPOLICY_UMOUNT &&
>>>>>>>>> + !list_empty(&dcc->retry_list)) {
>>>>>>>>> + wait_list = &dcc->retry_list;
>>>>>>>>> + goto next;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> return trimmed;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> @@ -2051,6 +2113,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
>>>>>>>>> for (i = 0; i < MAX_PLIST_NUM; i++)
>>>>>>>>> INIT_LIST_HEAD(&dcc->pend_list[i]);
>>>>>>>>> INIT_LIST_HEAD(&dcc->wait_list);
>>>>>>>>> + INIT_LIST_HEAD(&dcc->retry_list);
>>>>>>>>> INIT_LIST_HEAD(&dcc->fstrim_list);
>>>>>>>>> mutex_init(&dcc->cmd_lock);
>>>>>>>>> atomic_set(&dcc->issued_discard, 0);
>>>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> --
>>>>> Sent by a consultant of the Qualcomm Innovation Center, Inc.
>>>>> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
>>>>
>>>
>>>
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>> .
>>>
>