From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 8 Dec 2023 09:32:58 +0100
From: Kevin Wolf
To: Fiona Ebner
Cc: qemu-block@nongnu.org, Stefan Hajnoczi, richard.henderson@linaro.org,
    qemu-devel@nongnu.org, Thomas Lamprecht, Hanna Reitz, Paolo Bonzini,
    Fam Zheng, "Michael S. Tsirkin"
Tsirkin" Subject: Re: [PULL 29/32] virtio-blk: implement BlockDevOps->drained_begin() Message-ID: References: <20230530163239.576632-1-kwolf@redhat.com> <20230530163239.576632-30-kwolf@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.8 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Am 07.12.2023 um 16:22 hat Fiona Ebner geschrieben: > Am 03.11.23 um 14:12 schrieb Fiona Ebner: > > Hi, > > > > I ran into a strange issue where guest IO would get completely stuck > > during certain block jobs a while ago and finally managed to find a > > small reproducer [0]. I'm using a VM with virtio-blk-pci (or > > virtio-scsi-pci) with an iothread and running > > > > fio --name=file --size=100M --direct=1 --rw=randwrite --bs=4k > > --ioengine=psync --numjobs=5 --runtime=1200 --time_based > > > > in the guest. Then I'm issuing the QMP command with the reproducer in a > > loop. Usually, the guest IO will get stuck after about 1-3 minutes, > > sometimes fio can manage to continue with a lower speed for a while (but > > trying to Ctrl+C it or doing other IO in the guest will already be > > broken), which I guess could be a hint that it's an issue with notifiers? > > > > Bisecting (to declare a commit good, I waited 10 minutes) led me to this > > patch, i.e. commit 1665d9326f ("virtio-blk: implement > > BlockDevOps->drained_begin()") and for SCSI, I verified that the issue > > similarly starts happening after 766aa2de0f ("virtio-scsi: implement > > BlockDevOps->drained_begin()"). > > > > Both issues are still present on current master (i.e. 1c98a821a2 > > ("tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device")) > > > > Happy to provide more information and hints about how to debug the issue > > further. > > > > I think I was finally able to get to the bottom of this and have a > plausible-sounding pet theory now. It involves the VirtIO notifier > optimization during poll mode. > > Let's step through some debug prints I added. First number is always the > thread ID (I'm sorry that I used warn_report rather than proper tracing): > > > 247050 nodefd 29 poll_set_started 1 > > The iothread starts poll mode for the node with fd 29 which is the > virtio host notifier. > > > 247050 0x55e515185270 poll begin for vq > > 247050 0x55e515185270 setting notification for vq 0 > > virtio_queue_set_notification is called to disable notification. > > > 247050 nodefd 29 poll_set_started 1 done > > 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 1 > > 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 4 > > virtio-blk handling some requests, note that suppress_notifications is 0 > because we are in poll mode. 
> > 247048 nodefd 29 addr 0x55e51496ed70 marking as deleted
>
> The main thread marks the node for deletion when beginning drain, i.e.
> it detaches the host notifier.
>
> > 247048 nodefd 29 addr 0x55e513cdcd20 is_new 1 adding node
>
> The main thread adds a new node when ending drain, i.e. it attaches the
> host notifier.
>
> > 247050 nodefd 29 addr 0x55e51496ed70 remove deleted handler
>
> The iothread removes the handler marked for removal, in particular from
> the node_poll list: QLIST_SAFE_REMOVE(node, node_poll);
>
> > 247050 disabling poll mode before fdmon_ops->wait
>
> This is just before the call to
> poll_set_started(ctx, &ready_list, false)
>
> Whoops!! Nobody ends poll mode for the node with fd 29, because the old
> node was already deleted from the node_poll list and the new node is
> not part of it, i.e. nobody has started poll mode for the new node.
>
> > 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 0
>
> fdmon_ops->wait() returns one last time (not sure why), but with no
> actual requests.
>
> > 247050 disabling poll mode before fdmon_ops->wait
>
> After this, fdmon_ops->wait() (fdmon_poll_wait in my case) will just
> wait forever (or until QMP 'stop' and 'cont' restart the dataplane).
>
>
> A minimal workaround seems to be either calling
>     event_notifier_set(virtio_queue_get_host_notifier(vq));
> or
>     virtio_queue_set_notification(vq, true);
> in drained_end (for both VirtIO SCSI/block).
>
> But is this an actual issue with the AIO interface/implementation? Or
> should it rather be considered a bug in the VirtIO SCSI/block drain
> implementation, because of the notification optimization?

I'm not involved in it myself, but this kind of theme reminds me of a
downstream bug that Hanna analysed recently:

https://issues.redhat.com/browse/RHEL-3934

Does it look like the same root cause to you?

Kevin
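For reference, the minimal workaround described above would look
roughly like this for virtio-blk (a sketch only: the surrounding
function is paraphrased from hw/block/virtio-blk.c and details such as
field names may not match exactly; virtio-scsi would get the equivalent
change in its drained_end callback):

    /* Re-attach the host notifiers when drain ends. The added
     * virtio_queue_set_notification() call is the workaround: it clears
     * a possibly stale "notifications disabled" state left behind when
     * the poll-mode end callback never ran for the re-added node. */
    static void virtio_blk_drained_end(void *opaque)
    {
        VirtIOBlock *s = opaque;
        VirtIODevice *vdev = VIRTIO_DEVICE(s);
        AioContext *ctx = blk_get_aio_context(s->conf.conf.blk);
        uint16_t i;

        for (i = 0; i < s->conf.num_queues; i++) {
            VirtQueue *vq = virtio_get_queue(vdev, i);

            /* Workaround (the alternative mentioned above would be
             * event_notifier_set() on the host notifier, forcing one
             * more pass over the ring). */
            virtio_queue_set_notification(vq, true);

            virtio_queue_aio_attach_host_notifier(vq, ctx);
        }
    }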