From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Xie Yongji <xieyongji@bytedance.com>,
	jasowang@redhat.com, axboe@kernel.dk, hch@infradead.org,
	virtualization@lists.linux-foundation.org,
	linux-block@vger.kernel.org
Subject: Re: [PATCH v2] virtio-blk: Remove BUG_ON() in virtio_queue_rq()
Date: Wed, 2 Mar 2022 15:45:10 +0200
Message-ID: <808fbd57-588d-03e3-2904-513f4bdcceaf@nvidia.com>
In-Reply-To: <20220302083112-mutt-send-email-mst@kernel.org>


On 3/2/2022 3:33 PM, Michael S. Tsirkin wrote:
> On Wed, Mar 02, 2022 at 03:24:51PM +0200, Max Gurtovoy wrote:
>> On 3/2/2022 3:17 PM, Michael S. Tsirkin wrote:
>>> On Wed, Mar 02, 2022 at 11:51:27AM +0200, Max Gurtovoy wrote:
>>>> On 3/1/2022 5:43 PM, Michael S. Tsirkin wrote:
>>>>> On Mon, Feb 28, 2022 at 02:57:20PM +0800, Xie Yongji wrote:
>>>>>> Currently we have a BUG_ON() to make sure the number of sg
>>>>>> entries does not exceed queue_max_segments() in virtio_queue_rq().
>>>>>> However, the block layer uses queue_max_discard_segments()
>>>>>> instead of queue_max_segments() to limit the sg list for
>>>>>> discard requests. So the BUG_ON() might be triggered if
>>>>>> virtio-blk device reports a larger value for max discard
>>>>>> segment than queue_max_segments().
>>>>> Hmm the spec does not say what should happen if max_discard_seg
>>>>> exceeds seg_max. Is this the config you have in mind? how do you
>>>>> create it?
>>>> I don't think it's hard to create it. Just change some registers in the
>>>> device.
>>>>
>>>> But with the dynamic sgl allocation that I added recently, there is no
>>>> problem with this scenario.
>>> Well the problem is device says it can't handle such large descriptors,
>>> I guess it works anyway, but it seems scary.
>> I don't follow.
>>
>> The only problem this patch solves is when a virtio blk device reports
>> larger value for max_discard_segments than max_segments.
>>
> No, the problem reported is when a virtio blk device reports
> max_segments < 256 but not max_discard_segments.

You mean the code will work in case the device reports max_discard_segments > max_segments?

I don't think so.

This is exactly what Xie Yongji mentioned in the commit message and what I 
was seeing.

But the code will work if VIRTIO_BLK_F_DISCARD is not supported by the 
device (even if max_segments < 256), since the block layer sets 
queue_max_discard_segments = 1 during initialization.

And the virtio-blk driver won't change it unless VIRTIO_BLK_F_DISCARD is 
supported.
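
To make the failure mode concrete, here is a minimal sketch (illustration
only, with simplified names, not the driver code) of the condition the
removed BUG_ON() effectively checked. The block layer splits discard
requests against queue_max_discard_segments(), not queue_max_segments(),
so a discard request may legitimately carry more segments than the old
check allowed for:

#include <linux/blkdev.h>
#include <linux/blk-mq.h>

/*
 * Illustration only: returns true when a request could have tripped the
 * old check, i.e. its segment count exceeds queue_max_segments() even
 * though the block layer considers the request valid.
 */
static bool sketch_old_bug_on_would_trip(struct request *req)
{
	struct request_queue *q = req->q;
	unsigned int nsegs = blk_rq_nr_phys_segments(req);

	if (req_op(req) == REQ_OP_DISCARD)
		/* discards are only limited by queue_max_discard_segments() */
		return nsegs > queue_max_segments(q);

	/* reads/writes are already split against queue_max_segments() */
	return false;
}

Without VIRTIO_BLK_F_DISCARD the block layer's default of
queue_max_discard_segments() == 1 keeps nsegs within bounds, which is the
case described above.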

> I would expect discard to follow max_segments restrictions then.
>
>> Probably no such devices, but we need to be prepared.
> Right, question is how to handle this.
>
>>>> This commit looks good to me, thanks Xie Yongji.
>>>>
>>>> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>>>
>>>>>> To fix it, let's simply
>>>>>> remove the BUG_ON() which has become unnecessary after commit
>>>>>> 02746e26c39e("virtio-blk: avoid preallocating big SGL for data").
>>>>>> And the unused vblk->sg_elems can also be removed together.
>>>>>>
>>>>>> Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
>>>>>> Suggested-by: Christoph Hellwig <hch@infradead.org>
>>>>>> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
>>>>>> ---
>>>>>>     drivers/block/virtio_blk.c | 10 +---------
>>>>>>     1 file changed, 1 insertion(+), 9 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>>>>>> index c443cd64fc9b..a43eb1813cec 100644
>>>>>> --- a/drivers/block/virtio_blk.c
>>>>>> +++ b/drivers/block/virtio_blk.c
>>>>>> @@ -76,9 +76,6 @@ struct virtio_blk {
>>>>>>     	 */
>>>>>>     	refcount_t refs;
>>>>>> -	/* What host tells us, plus 2 for header & tailer. */
>>>>>> -	unsigned int sg_elems;
>>>>>> -
>>>>>>     	/* Ida index - used to track minor number allocations. */
>>>>>>     	int index;
>>>>>> @@ -322,8 +319,6 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
>>>>>>     	blk_status_t status;
>>>>>>     	int err;
>>>>>> -	BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
>>>>>> -
>>>>>>     	status = virtblk_setup_cmd(vblk->vdev, req, vbr);
>>>>>>     	if (unlikely(status))
>>>>>>     		return status;
>>>>>> @@ -783,8 +778,6 @@ static int virtblk_probe(struct virtio_device *vdev)
>>>>>>     	/* Prevent integer overflows and honor max vq size */
>>>>>>     	sg_elems = min_t(u32, sg_elems, VIRTIO_BLK_MAX_SG_ELEMS - 2);
>>>>>> -	/* We need extra sg elements at head and tail. */
>>>>>> -	sg_elems += 2;
>>>>>>     	vdev->priv = vblk = kmalloc(sizeof(*vblk), GFP_KERNEL);
>>>>>>     	if (!vblk) {
>>>>>>     		err = -ENOMEM;
>>>>>> @@ -796,7 +789,6 @@ static int virtblk_probe(struct virtio_device *vdev)
>>>>>>     	mutex_init(&vblk->vdev_mutex);
>>>>>>     	vblk->vdev = vdev;
>>>>>> -	vblk->sg_elems = sg_elems;
>>>>>>     	INIT_WORK(&vblk->config_work, virtblk_config_changed_work);
>>>>>> @@ -853,7 +845,7 @@ static int virtblk_probe(struct virtio_device *vdev)
>>>>>>     		set_disk_ro(vblk->disk, 1);
>>>>>>     	/* We can handle whatever the host told us to handle. */
>>>>>> -	blk_queue_max_segments(q, vblk->sg_elems-2);
>>>>>> +	blk_queue_max_segments(q, sg_elems);
>>>>>>     	/* No real sector limit. */
>>>>>>     	blk_queue_max_hw_sectors(q, -1U);
>>>>>> -- 
>>>>>> 2.20.1
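
As a side note on the dynamic SGL allocation mentioned earlier (commit
02746e26c39e, "virtio-blk: avoid preallocating big SGL for data"), the
idea is roughly the following; the sketch below uses hypothetical names
and is not the actual driver code. The scatterlist is sized per request
from blk_rq_nr_phys_segments() instead of a preallocated worst-case
array, so an oversized segment count cannot overrun a fixed buffer:

#include <linux/blk-mq.h>
#include <linux/scatterlist.h>
#include <linux/errno.h>

#define SKETCH_INLINE_SG_CNT	2	/* hypothetical inline chunk size */

struct sketch_req {
	struct sg_table sg_table;
	struct scatterlist inline_sg[SKETCH_INLINE_SG_CNT];
};

/* Rough sketch: allocate exactly as many sg entries as the request needs. */
static int sketch_map_data(struct request *req, struct sketch_req *vbr)
{
	int nsegs = blk_rq_nr_phys_segments(req);

	if (!nsegs)
		return 0;

	if (sg_alloc_table_chained(&vbr->sg_table, nsegs,
				   vbr->inline_sg, SKETCH_INLINE_SG_CNT))
		return -ENOMEM;

	return blk_rq_map_sg(req->q, req, vbr->sg_table.sgl);
}

The table would then be released with sg_free_table_chained() when the
request completes.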

