From: Ming Lei
To: Jens Axboe, Christoph Hellwig
Cc: linux-block@vger.kernel.org, Ming Lei, Yi Zhang, Bart Van Assche
Subject: [PATCH V2 1/2] block: fix race between adding/removing rq qos and normal IO
Date: Tue, 8 Jun 2021 15:19:02 +0800
Message-Id: <20210608071903.431195-2-ming.lei@redhat.com>
In-Reply-To: <20210608071903.431195-1-ming.lei@redhat.com>
References: <20210608071903.431195-1-ming.lei@redhat.com>

Yi reported several kernel panics:

[16687.001777] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
...
[16687.163549] pc : __rq_qos_track+0x38/0x60

or

[  997.690455] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
[  997.850347] pc : __rq_qos_done+0x2c/0x50

These turn out to be caused by a race between adding rq qos (wbt) and
normal IO, because rq_qos_add() can run while IO is being submitted. Fix
the issue by freezing the queue before adding rq qos to, or deleting it
from, the queue.

rq_qos_exit() needn't freeze the queue because it is called after the
queue has been frozen.

iolatency calls rq_qos_add() while the queue is being allocated, so
freezing won't add delay because the queue usage refcount works in atomic
mode at that time.

iocost calls rq_qos_add() when the cgroup attribute file is written. It
is fine to freeze the queue at that time, since we usually freeze the
queue when storing to a queue sysfs attribute; meanwhile, iocost only
exists on the root cgroup.

wbt_init() calls rq_qos_add() when invoked from blk_register_queue() and
from the queue sysfs attribute store (queue_wb_lat_store(), on the first
write in the !BLK_WBT_MQ case); the following patch will speed up the
queue freezing in wbt_init().

Reported-by: Yi Zhang
Cc: Bart Van Assche
Signed-off-by: Ming Lei
---
 block/blk-rq-qos.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h
index 2bc43e94f4c4..c9dccb344312 100644
--- a/block/blk-rq-qos.h
+++ b/block/blk-rq-qos.h
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 
 #include "blk-mq-debugfs.h"
 
@@ -99,8 +100,14 @@ static inline void rq_wait_init(struct rq_wait *rq_wait)
 
 static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
 {
+	/*
+	 * No IO can be in-flight when adding rqos, so freeze queue, which
+	 * is fine since we only support rq_qos for blk-mq queue
+	 */
+	blk_mq_freeze_queue(q);
 	rqos->next = q->rq_qos;
 	q->rq_qos = rqos;
+	blk_mq_unfreeze_queue(q);
 
 	if (rqos->ops->debugfs_attrs)
 		blk_mq_debugfs_register_rqos(rqos);
@@ -110,12 +117,18 @@ static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos)
 {
 	struct rq_qos **cur;
 
+	/*
+	 * No IO can be in-flight when removing rqos, so freeze queue,
+	 * which is fine since we only support rq_qos for blk-mq queue
+	 */
+	blk_mq_freeze_queue(q);
 	for (cur = &q->rq_qos; *cur; cur = &(*cur)->next) {
 		if (*cur == rqos) {
 			*cur = rqos->next;
 			break;
 		}
 	}
+	blk_mq_unfreeze_queue(q);
 
 	blk_mq_debugfs_unregister_rqos(rqos);
 }
-- 
2.31.1
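
For readers who want the freeze-before-modify idea in isolation, here is a rough single-threaded user-space model. It is only a sketch and not kernel code: struct queue, struct policy, queue_enter(), queue_exit(), queue_freeze(), queue_unfreeze(), policy_add() and policy_del() are all hypothetical names made up for illustration. The real blk_mq_freeze_queue() sleeps until in-flight IO drains; this model cannot sleep, so it refuses the change instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct rq_qos and struct request_queue. */
struct policy {
	struct policy *next;
};

struct queue {
	int usage;		/* number of in-flight IOs holding a reference */
	bool frozen;		/* set while the policy chain is being changed */
	struct policy *head;	/* singly linked chain, like q->rq_qos */
};

/* IO path: take a usage reference; refused while the queue is frozen. */
static bool queue_enter(struct queue *q)
{
	if (q->frozen)
		return false;
	q->usage++;
	return true;
}

/* IO path: drop the usage reference taken by queue_enter(). */
static void queue_exit(struct queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/*
 * Assumption of this model: instead of waiting for in-flight IO to
 * drain, report failure if any IO still holds a reference.
 */
static bool queue_freeze(struct queue *q)
{
	if (q->usage != 0)
		return false;
	q->frozen = true;
	return true;
}

static void queue_unfreeze(struct queue *q)
{
	q->frozen = false;
}

/* Link a policy at the head of the chain, only while frozen. */
static bool policy_add(struct queue *q, struct policy *p)
{
	if (!queue_freeze(q))
		return false;
	p->next = q->head;
	q->head = p;
	queue_unfreeze(q);
	return true;
}

/* Unlink a policy from the chain, only while frozen. */
static bool policy_del(struct queue *q, struct policy *p)
{
	struct policy **cur;

	if (!queue_freeze(q))
		return false;
	for (cur = &q->head; *cur; cur = &(*cur)->next) {
		if (*cur == p) {
			*cur = p->next;
			break;
		}
	}
	queue_unfreeze(q);
	return true;
}
```

In this model, policy_add() fails while an IO sits between queue_enter() and queue_exit(), and succeeds once the reference is dropped. That mirrors the invariant the patch relies on: the chain is never changed while a submitter may be walking it.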