From: Song Liu
Cc: "Guilherme G. Piccoli", Ming Lei, Song Liu, Tetsuo Handa
Subject: [PATCH 2/2] md/raid0: Do not bypass blocking queue entered for raid0 bios
Date: Thu, 23 May 2019 10:23:45 -0700
Message-ID: <20190523172345.1861077-2-songliubraving@fb.com>
In-Reply-To: <20190523172345.1861077-1-songliubraving@fb.com>
References: <20190523172345.1861077-1-songliubraving@fb.com>
X-Mailer: git-send-email 2.17.1
X-Mailing-List: stable@vger.kernel.org

From: "Guilherme G. Piccoli"

Commit cd4a4ae4683d ("block: don't use blocking queue entered for recursive
bio submits") introduced the flag BIO_QUEUE_ENTERED so that split bios
bypass the blocking queue entering routine and use the live non-blocking
version.
It was the result of an extensive discussion in a linux-block thread[0],
and the purpose of the change was to prevent a hung task waiting on a
reference to drop.

It happens that md raid0 splits bios all the time and, more importantly,
changes their underlying device to the raid member. Since the change
introduced by this flag's usage, we experience various crashes if a raid0
member is removed during a large write. This happens because the bio
reaches the live queue entering function when the queue of the raid0
member is dying.

A simple reproducer of this behavior is presented below:
a) Build kernel v5.2-rc1 with CONFIG_BLK_DEV_THROTTLING=y.

b) Create a raid0 md array with 2 NVMe devices as members, and mount it
with an ext4 filesystem.

c) Run the following oneliner (supposing the raid0 is mounted in /mnt):
(dd of=/mnt/tmp if=/dev/zero bs=1M count=999 &); sleep 0.3;
echo 1 > /sys/block/nvme0n1/device/device/remove
(where nvme0n1 is the 2nd array member)

This will trigger the following warning/oops:

------------[ cut here ]------------
no blkg associated for bio on block-device: nvme0n1
WARNING: CPU: 9 PID: 184 at ./include/linux/blk-cgroup.h:785
generic_make_request_checks+0x4dd/0x690
[...]
BUG: unable to handle kernel NULL pointer dereference at 0000000000000155
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
RIP: 0010:blk_throtl_bio+0x45/0x970
[...]
Call Trace:
 generic_make_request_checks+0x1bf/0x690
 generic_make_request+0x64/0x3f0
 raid0_make_request+0x184/0x620 [raid0]
 ? raid0_make_request+0x184/0x620 [raid0]
 ? blk_queue_split+0x384/0x6d0
 md_handle_request+0x126/0x1a0
 md_make_request+0x7b/0x180
 generic_make_request+0x19e/0x3f0
 submit_bio+0x73/0x140
[...]

This patch changes the raid0 driver to fall back to the "old" blocking
queue entering procedure by clearing BIO_QUEUE_ENTERED from raid0 bios.
This prevents the crashes and restores the regular behavior of raid0
arrays when a member is removed during a large write.
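
For illustration only, the effect of the flag described above can be
modeled in userspace C. This is a simplified sketch, not kernel code:
every toy_* name is a hypothetical stand-in, and the checked entry path
here only rejects a dying queue (it does not model waiting on a frozen
queue). It assumes the submit path picks the unchecked "live" entry when
the flag is set and the checked entry otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
enum { TOY_BIO_QUEUE_ENTERED = 1u << 0 };

enum toy_result { TOY_SUBMITTED, TOY_IO_ERROR };

struct toy_queue {
	bool dying;	/* the member device is being removed */
	int refs;	/* usage references taken on the queue */
};

struct toy_bio {
	unsigned int flags;
	struct toy_queue *q;
};

static void toy_bio_clear_flag(struct toy_bio *bio, unsigned int flag)
{
	bio->flags &= ~flag;
}

/* "Live" entry: takes a reference without looking at the queue state. */
static void toy_queue_enter_live(struct toy_queue *q)
{
	q->refs++;
}

/* Checked entry: refuses to take a reference on a dying queue. */
static int toy_queue_enter(struct toy_queue *q)
{
	if (q->dying)
		return -1;
	q->refs++;
	return 0;
}

/* Toy submit path: the flag selects which entry routine is used. */
static enum toy_result toy_make_request(struct toy_bio *bio)
{
	if (bio->flags & TOY_BIO_QUEUE_ENTERED)
		toy_queue_enter_live(bio->q);
	else if (toy_queue_enter(bio->q) < 0)
		return TOY_IO_ERROR;
	return TOY_SUBMITTED;
}

/* Exercises both paths against a queue that is already dying. */
static void toy_demo(void)
{
	struct toy_queue member = { .dying = true, .refs = 0 };
	struct toy_bio bio = { .flags = TOY_BIO_QUEUE_ENTERED, .q = &member };

	/* Flag set: the bio proceeds into the dying queue. */
	assert(toy_make_request(&bio) == TOY_SUBMITTED);

	/* Flag cleared: the checked entry fails the bio instead. */
	toy_bio_clear_flag(&bio, TOY_BIO_QUEUE_ENTERED);
	assert(toy_make_request(&bio) == TOY_IO_ERROR);
}
```

Calling toy_demo() from a small main() runs both cases. In this model,
clearing the flag before submission is what turns a submission into a
dying queue into a plain I/O error.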

[0] https://marc.info/?l=linux-block&m=152638475806811

Cc: Jens Axboe
Cc: Ming Lei
Cc: Song Liu
Cc: Tetsuo Handa
Cc: stable@vger.kernel.org # v4.18
Fixes: cd4a4ae4683d ("block: don't use blocking queue entered for recursive bio submits")
Signed-off-by: Guilherme G. Piccoli
Signed-off-by: Song Liu
---
 drivers/md/raid0.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index f3fb5bb8c82a..d5bdc79e0835 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -547,6 +547,7 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
 			trace_block_bio_remap(bdev_get_queue(rdev->bdev),
 				discard_bio, disk_devt(mddev->gendisk),
 				bio->bi_iter.bi_sector);
+		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
 		generic_make_request(discard_bio);
 	}
 	bio_endio(bio);
@@ -602,6 +603,7 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
 				disk_devt(mddev->gendisk), bio_sector);
 	mddev_check_writesame(mddev, bio);
 	mddev_check_write_zeroes(mddev, bio);
+	bio_clear_flag(bio, BIO_QUEUE_ENTERED);
 	generic_make_request(bio);
 	return true;
 }
-- 
2.17.1