Date: Wed, 12 Aug 2020 10:52:58 +0800
From: Ming Lei <ming.lei@redhat.com>
To: Baolin Wang
Cc: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig,
    Changpeng Liu, Daniel Verkamp, "Michael S. Tsirkin",
    Stefan Hajnoczi, Stefano Garzarella
Subject: Re: [PATCH V2 2/3] block: virtio_blk: fix handling single range discard request
Message-ID: <20200812025258.GA2304706@T590>
In-Reply-To: <20200812020706.GA69794@VM20190228-100.tbsite.net>
References: <20200811234420.2297137-1-ming.lei@redhat.com>
 <20200811234420.2297137-3-ming.lei@redhat.com>
 <20200812020706.GA69794@VM20190228-100.tbsite.net>

On Wed, Aug 12, 2020 at 10:07:06AM +0800, Baolin Wang wrote:
> Hi Ming,
>
> On Wed, Aug 12, 2020 at 07:44:19AM +0800, Ming Lei wrote:
> > 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> > starts to support multi-range discard for virtio-blk.
> > However, the virtio-blk disk may report the max discard segment
> > limit as 1; at least that is exactly what qemu does.
> >
> > So far, the block layer switches to normal request merge if the max
> > discard segment limit is 1, and multiple bios can be merged into a
> > single segment. This can cause memory corruption in
> > virtblk_setup_discard_write_zeroes().
> >
> > Fix the issue by handling the single max discard segment case in a
> > straightforward way.
> >
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> > Cc: Christoph Hellwig
> > Cc: Changpeng Liu
> > Cc: Daniel Verkamp
> > Cc: Michael S. Tsirkin
> > Cc: Stefan Hajnoczi
> > Cc: Stefano Garzarella
> > ---
> >  drivers/block/virtio_blk.c | 31 +++++++++++++++++++++++--------
> >  1 file changed, 23 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > index 63b213e00b37..b2e48dac1ebd 100644
> > --- a/drivers/block/virtio_blk.c
> > +++ b/drivers/block/virtio_blk.c
> > @@ -126,16 +126,31 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap)
> >  	if (!range)
> >  		return -ENOMEM;
> >
> > -	__rq_for_each_bio(bio, req) {
> > -		u64 sector = bio->bi_iter.bi_sector;
> > -		u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> > -
> > -		range[n].flags = cpu_to_le32(flags);
> > -		range[n].num_sectors = cpu_to_le32(num_sectors);
> > -		range[n].sector = cpu_to_le64(sector);
> > -		n++;
> > +	/*
> > +	 * Single max discard segment means multi-range discard isn't
> > +	 * supported, and block layer only runs contiguity merge like
> > +	 * normal RW request. So we can't rely on bio for retrieving
> > +	 * each range info.
> > +	 */
> > +	if (queue_max_discard_segments(req->q) == 1) {
> > +		range[0].flags = cpu_to_le32(flags);
> > +		range[0].num_sectors = cpu_to_le32(blk_rq_sectors(req));
> > +		range[0].sector = cpu_to_le64(blk_rq_pos(req));
> > +		n = 1;
> > +	} else {
> > +		__rq_for_each_bio(bio, req) {
> > +			u64 sector = bio->bi_iter.bi_sector;
> > +			u32 num_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
> > +
> > +			range[n].flags = cpu_to_le32(flags);
> > +			range[n].num_sectors = cpu_to_le32(num_sectors);
> > +			range[n].sector = cpu_to_le64(sector);
> > +			n++;
> > +		}
> >  	}
> >
> > +	WARN_ON_ONCE(n != segments);
>
> I wonder if we should return an error when the discard segments are
> incorrect, like NVMe does[1], in case the DMA does some serious
> damage here.

It is an unlikely case:

1) if queue_max_discard_segments() is 1, the warning can't be triggered;

2) otherwise, ELEVATOR_DISCARD_MERGE is always handled in
bio_attempt_discard_merge(), and the segment number is exactly the
number of bios in the request.

If the warning is ever triggered, it simply means there is a serious
bug in the block layer.

BTW, suppose the warning is triggered:

1) if n < segments, it is merely a warning;

2) if n > segments, serious memory corruption has already happened, no
matter whether something like nvme_setup_discard() is done afterwards.

So there is little point in handling it in nvme's style.
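
To make that concrete, here is a minimal sketch of the pre-patch path,
with the allocation that sits just above the quoted hunk filled in (a
sketch only, not the literal driver code):

	/* range[] is sized for the merged segment count, which is 1 here */
	unsigned short segments = blk_rq_nr_discard_segments(req);
	unsigned int n = 0;
	u32 flags = 0;
	struct bio *bio;

	range = kmalloc_array(segments, sizeof(*range), GFP_ATOMIC);
	if (!range)
		return -ENOMEM;

	__rq_for_each_bio(bio, req) {
		/*
		 * With queue_max_discard_segments(req->q) == 1, the block
		 * layer merges discard bios like normal RW requests, so the
		 * request can carry several bios while segments stays 1;
		 * the second iteration writes range[1] past the end of the
		 * one-element allocation.
		 */
		range[n].flags = cpu_to_le32(flags);
		range[n].num_sectors = cpu_to_le32(bio->bi_iter.bi_size >> SECTOR_SHIFT);
		range[n].sector = cpu_to_le64(bio->bi_iter.bi_sector);
		n++;
	}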
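
And an nvme_setup_discard()-style check could only fail the request
after the fact; roughly (again a sketch only, following this function's
negative-errno convention):

	WARN_ON_ONCE(n != segments);
	if (unlikely(n != segments)) {
		/*
		 * Refuse to hand an inconsistent payload to the device,
		 * but note that if n > segments the out-of-bounds writes
		 * above have already corrupted memory by this point.
		 */
		kfree(range);
		return -EIO;
	}

Thanks,
Ming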