Subject: Re: [PATCH v2 2/3] virito_pci: add timeout to reset device operation
From: Jason Wang
To: Max Gurtovoy, mst@redhat.com, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Cc: oren@nvidia.com, nitzanc@nvidia.com, cohuck@redhat.com
Date: Thu, 8 Apr 2021 20:45:41 +0800
Message-ID: <93221213-8fc3-96ef-7e89-b7c03bea5322@redhat.com>
References: <20210408081109.56537-1-mgurtovoy@nvidia.com> <20210408081109.56537-2-mgurtovoy@nvidia.com> <2bead2b3-fa23-dc1e-3200-ddfa24944b75@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

On 2021/4/8 5:44 PM, Max Gurtovoy wrote:
>
> On 4/8/2021 12:01 PM, Jason Wang wrote:
>>
>> On 2021/4/8 4:11 PM, Max Gurtovoy wrote:
>>> According to the spec, after writing 0 to device_status, the driver MUST
>>> wait for a read of device_status to return 0 before reinitializing the
>>> device. In case we have a device that won't return 0, the reset
>>> operation will loop forever and cause the host/vm to get stuck. Set a
>>> timeout of 3 minutes before giving up on the device.
>>>
>>> Signed-off-by: Max Gurtovoy
>>> ---
>>>   drivers/virtio/virtio_pci_modern.c | 10 +++++++++-
>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
>>> index cc3412a96a17..dcee616e8d21 100644
>>> --- a/drivers/virtio/virtio_pci_modern.c
>>> +++ b/drivers/virtio/virtio_pci_modern.c
>>> @@ -162,6 +162,7 @@ static int vp_reset(struct virtio_device *vdev)
>>>   {
>>>       struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>>>       struct virtio_pci_modern_device *mdev = &vp_dev->mdev;
>>> +    unsigned long timeout = jiffies + msecs_to_jiffies(180000);
>>>
>>>       /* 0 status means a reset. */
>>>       vp_modern_set_status(mdev, 0);
>>> @@ -169,9 +170,16 @@ static int vp_reset(struct virtio_device *vdev)
>>>        * device_status to return 0 before reinitializing the device.
>>>        * This will flush out the status write, and flush in device writes,
>>>        * including MSI-X interrupts, if any.
>>> +     * Set a timeout before giving up on the device.
>>>        */
>>> -    while (vp_modern_get_status(mdev))
>>> +    while (vp_modern_get_status(mdev)) {
>>> +        if (time_after(jiffies, timeout)) {
>>
>>
>> What happens if the device finishes the reset after the timeout?
>
>
> The driver will set VIRTIO_CONFIG_S_FAILED and one can re-probe it
> later on (e.g. by re-scanning the PCI bus).


Ok, so do we need the flush through vp_synchronize_vectors() here?

Thanks


>
>
>>
>> Thanks
>>
>>
>>> +            dev_err(&vdev->dev, "virtio: device not ready. "
>>> +                "Aborting. Try again later\n");
>>> +            return -EAGAIN;
>>> +        }
>>>           msleep(1);
>>> +    }
>>>       /* Flush pending VQ/configuration callbacks. */
>>>       vp_synchronize_vectors(vdev);
>>>       return 0;
>>
>
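
To make the question about the timeout path concrete, here is a minimal user-space sketch of the bounded polling loop from the quoted patch. It is not kernel code: struct fake_device, fake_get_status(), fake_msleep() and fake_reset() are hypothetical stand-ins, the clock is simulated rather than jiffies-based, and the "flushed" flag only marks where vp_synchronize_vectors() would run. The sketch assumes, purely for illustration, that the flush runs on both the success and the timeout path; whether the driver should actually do that is the open question in this thread.

```c
#include <errno.h>

/* Hypothetical stand-in for a virtio device; not a kernel structure. */
struct fake_device {
    unsigned int status_reads_until_zero; /* nonzero reads left before status is 0 */
    unsigned long now_ms;                 /* simulated clock, advanced by fake_msleep() */
    int flushed;                          /* set where vp_synchronize_vectors() would run */
};

/* Stand-in for vp_modern_get_status(): nonzero until the device "finishes" reset. */
static unsigned char fake_get_status(struct fake_device *dev)
{
    if (dev->status_reads_until_zero == 0)
        return 0;
    dev->status_reads_until_zero--;
    return 1;
}

/* Stand-in for msleep(): only advances the simulated clock. */
static void fake_msleep(struct fake_device *dev, unsigned long ms)
{
    dev->now_ms += ms;
}

/*
 * Poll the status until it reads 0 or the deadline passes.
 * Returns 0 on success and -EAGAIN on timeout, as in the quoted patch.
 * Assumption for illustration: the flush happens on both paths.
 */
static int fake_reset(struct fake_device *dev, unsigned long timeout_ms)
{
    unsigned long deadline = dev->now_ms + timeout_ms;
    int ret = 0;

    while (fake_get_status(dev)) {
        if (dev->now_ms > deadline) {
            ret = -EAGAIN;
            break;
        }
        fake_msleep(dev, 1);
    }

    dev->flushed = 1;
    return ret;
}
```

With this shape, a device that clears its status after a few reads returns 0, and one that never does returns -EAGAIN once the simulated deadline has passed. In both cases the flush marker is set before returning, which is the behaviour the question above is asking about.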