From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yufen Yu
To: Jack Wang
CC: Jens Axboe, Ming Lei, Christoph Hellwig, Keith Busch, Bart Van Assche, stable
Subject: Re: [PATCH v4] block: fix null pointer dereference in blk_mq_rq_timed_out()
Date: Fri, 11 Oct 2019 11:16:08 +0800
Message-ID: <9f99de42-9edb-d7df-df8c-e994ada6613c@huawei.com>
References: <20190925122025.31246-1-yuyufen@huawei.com>
X-Mailing-List: linux-block@vger.kernel.org

On 2019/10/9 16:26, Jack Wang wrote:
>
>> We got a null pointer deference BUG_ON in blk_mq_rq_timed_out()
>> as following:
>>
>> [ 108.825472] BUG: kernel NULL pointer dereference, address: 0000000000000040
>> [ 108.827059] PGD 0 P4D 0
>> [ 108.827313] Oops: 0000 [#1] SMP PTI
>> [ 108.827657] CPU: 6 PID: 198 Comm: kworker/6:1H Not tainted 5.3.0-rc8+ #431
>> [ 108.829503] Workqueue: kblockd blk_mq_timeout_work
>> [ 108.829913] RIP: 0010:blk_mq_check_expired+0x258/0x330
>> [ 108.838191] Call Trace:
>> [ 108.838406] bt_iter+0x74/0x80
>> [ 108.838665] blk_mq_queue_tag_busy_iter+0x204/0x450
>> [ 108.839074] ? __switch_to_asm+0x34/0x70
>> [ 108.839405] ? blk_mq_stop_hw_queue+0x40/0x40
>> [ 108.839823] ? blk_mq_stop_hw_queue+0x40/0x40
>> [ 108.840273] ? syscall_return_via_sysret+0xf/0x7f
>> [ 108.840732] blk_mq_timeout_work+0x74/0x200
>> [ 108.841151] process_one_work+0x297/0x680
>> [ 108.841550] worker_thread+0x29c/0x6f0
>> [ 108.841926] ? rescuer_thread+0x580/0x580
>> [ 108.842344] kthread+0x16a/0x1a0
>> [ 108.842666] ? kthread_flush_work+0x170/0x170
>> [ 108.843100] ret_from_fork+0x35/0x40
>>
>> The bug is caused by the race between timeout handle and completion for
>> flush request.
>>
>> When timeout handle function blk_mq_rq_timed_out() try to read
>> 'req->q->mq_ops', the 'req' have completed and reinitiated by next
>> flush request, which would call blk_rq_init() to clear 'req' as 0.
>>
>> After commit 12f5b93145 ("blk-mq: Remove generation seqeunce"),
>> normal requests lifetime are protected by refcount. Until 'rq->ref'
>> drop to zero, the request can really be free. Thus, these requests
>> cannot been reused before timeout handle finish.
>>
>> However, flush request has defined .end_io and rq->end_io() is still
>> called even if 'rq->ref' doesn't drop to zero.
>> After that, the 'flush_rq'
>> can be reused by the next flush request handle, resulting in null
>> pointer deference BUG ON.
>>
>> We fix this problem by covering flush request with 'rq->ref'.
>> If the refcount is not zero, flush_end_io() return and wait the
>> last holder recall it. To record the request status, we add a new
>> entry 'rq_status', which will be used in flush_end_io().
>>
>> Cc: Ming Lei
>> Cc: Christoph Hellwig
>> Cc: Keith Busch
>> Cc: Bart Van Assche
>> Cc: stable@vger.kernel.org # v4.18+
>> Signed-off-by: Yufen Yu
>>
> Hi Yufen,
>
> Can you share your reproducer, I think the bug was there for long
> time, we hit it in kernel 4.4.
> We also need to fix it for older LTS kernel.
>
> Do you have an idea, how should we fix it for older LTS kernel?
>

I reproduced the bug by adding extra delay in two places: after the
memset() in blk_rq_init(), and before the call to blk_mq_rq_timed_out()
in blk_mq_check_expired(). To make sure the request times out, I also
added a delay for flush requests after blk_mq_start_request() in
virtio_queue_rq() for my virtio disk. Then we just issue a flush
request to the disk with fio, and the BUG_ON is triggered.

For LTS 4.4 and older kernels, the race between timeout handling and
completion for normal requests has not been resolved yet. So, IMO, we
should fix that race first by merging commit 12f5b93145 ("blk-mq: Remove
generation seqeunce") and its related patches. After that, this patch
can also be merged.

Thanks,
Yufen
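
P.S. For anyone following the thread without the patch at hand, the
refcount rule from the quoted description ("if the refcount is not
zero, flush_end_io() return and wait the last holder recall it") can be
modelled in userspace. This is only a rough sketch under assumptions,
not the code of the patch: struct model_flush_rq, model_rq_init(),
model_timeout_get(), model_flush_end_io() and model_race() are
hypothetical names, and C11 atomics stand in for the kernel's refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the flush request; NOT the kernel's struct request. */
struct model_flush_rq {
	atomic_int ref;    /* models 'rq->ref' */
	int rq_status;     /* models the new 'rq_status' entry (0 means OK) */
	int final_status;  /* status reported when the request really ends */
	bool ended;        /* true once the last holder has ended the request */
};

static void model_rq_init(struct model_flush_rq *rq)
{
	atomic_init(&rq->ref, 1);  /* reference owned by the in-flight request */
	rq->rq_status = 0;
	rq->final_status = 0;
	rq->ended = false;
}

/* The timeout handler takes its own reference before looking at the request. */
static void model_timeout_get(struct model_flush_rq *rq)
{
	atomic_fetch_add(&rq->ref, 1);
}

/*
 * Model of a refcount-guarded end_io. Every holder calls it when done.
 * A caller that is not the last holder only records the status and
 * returns; the last holder ends the request, so the request cannot be
 * reused while anyone still holds a reference.
 * Returns true if this call ended the request.
 *
 * rq_status is a plain int because this model is single-threaded; a
 * concurrent version would need a lock around it.
 */
static bool model_flush_end_io(struct model_flush_rq *rq, int error)
{
	/* atomic_fetch_sub() returns the old value, so 1 means we were last. */
	if (atomic_fetch_sub(&rq->ref, 1) != 1) {
		rq->rq_status = error;
		return false;
	}
	if (rq->rq_status != 0)
		error = rq->rq_status;
	rq->final_status = error;
	rq->ended = true;
	return true;
}

/*
 * Single-threaded replay of the race: the completion arrives while the
 * timeout handler still holds a reference. Returns the final status.
 */
int model_race(int completion_error)
{
	struct model_flush_rq rq;
	bool ended;

	model_rq_init(&rq);
	model_timeout_get(&rq);

	ended = model_flush_end_io(&rq, completion_error); /* completion path */
	assert(!ended && !rq.ended);  /* timeout handler still holds a ref */

	ended = model_flush_end_io(&rq, 0);  /* timeout handler drops its ref */
	assert(ended && rq.ended);
	return rq.final_status;
}
```

Calling model_race() with a non-zero error shows why the status has to
be saved: the completion path returns early, and the error it carried
is only reported once the timeout handler makes the final call.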