Subject: Re: [PATCH v1 2/2] virtio-net: virtio_net_flush_tx() check for per-queue reset
Date: Tue, 31 Jan 2023 15:17:02 +0800
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: Jason Wang
Cc: qemu-devel@nongnu.org, Alexander Bulekov, "Michael S. Tsirkin"
Message-ID: <1675149422.4129353-1-xuanzhuo@linux.alibaba.com>
References: <20230129025150.119972-1-xuanzhuo@linux.alibaba.com>
 <20230129025150.119972-3-xuanzhuo@linux.alibaba.com>
 <20230129021402-mutt-send-email-mst@kernel.org>
 <1674977308.9335406-2-xuanzhuo@linux.alibaba.com>
 <20230129025950-mutt-send-email-mst@kernel.org>
 <1674980588.489446-5-xuanzhuo@linux.alibaba.com>
 <20230129065705-mutt-send-email-mst@kernel.org>
 <1674993822.7782302-1-xuanzhuo@linux.alibaba.com>
 <20230129071154-mutt-send-email-mst@kernel.org>
 <1675044912.9269125-1-xuanzhuo@linux.alibaba.com>
 <20230130003158-mutt-send-email-mst@kernel.org>
 <1675065225.6382265-1-xuanzhuo@linux.alibaba.com>
 <1675074276.8940918-1-xuanzhuo@linux.alibaba.com>

On Tue, 31 Jan 2023 11:27:42 +0800, Jason Wang wrote:
> On Mon, Jan 30, 2023 at 6:26 PM Xuan Zhuo wrote:
> >
> > On Mon, 30 Jan 2023 16:40:08 +0800, Jason Wang wrote:
> > > On Mon, Jan 30, 2023 at 4:03 PM Xuan Zhuo wrote:
> > > >
> > > > On Mon, 30 Jan 2023 15:49:36 +0800, Jason Wang wrote:
> > > > > On Mon, Jan 30, 2023 at 1:32 PM Michael S. Tsirkin wrote:
> > > > > >
> > > > > > On Mon, Jan 30, 2023 at 10:15:12AM +0800, Xuan Zhuo wrote:
> > > > > > > On Sun, 29 Jan 2023 07:15:47 -0500, "Michael S. Tsirkin" wrote:
> > > > > > > > On Sun, Jan 29, 2023 at 08:03:42PM +0800, Xuan Zhuo wrote:
> > > > > > > > > On Sun, 29 Jan 2023 06:57:29 -0500, "Michael S. Tsirkin" wrote:
> > > > > > > > > > On Sun, Jan 29, 2023 at 04:23:08PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > On Sun, 29 Jan 2023 03:12:12 -0500, "Michael S. Tsirkin" wrote:
> > > > > > > > > > > > On Sun, Jan 29, 2023 at 03:28:28PM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > On Sun, 29 Jan 2023 02:25:43 -0500, "Michael S. Tsirkin" wrote:
> > > > > > > > > > > > > > On Sun, Jan 29, 2023 at 10:51:50AM +0800, Xuan Zhuo wrote:
> > > > > > > > > > > > > > > Check whether it is per-queue reset state in virtio_net_flush_tx().
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Before per-queue reset, we need to recover async tx resources. At this
> > > > > > > > > > > > > > > time, virtio_net_flush_tx() is called, but we should not try to send
> > > > > > > > > > > > > > > new packets, so virtio_net_flush_tx() should check the current
> > > > > > > > > > > > > > > per-queue reset state.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What does "at this time" mean here?
> > > > > > > > > > > > > > Do you in fact mean it's called from flush_or_purge_queued_packets?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes
> > > > > > > > > > > > >
> > > > > > > > > > > > > virtio_queue_reset
> > > > > > > > > > > > >     k->queue_reset
> > > > > > > > > > > > >         virtio_net_queue_reset
> > > > > > > > > > > > >             flush_or_purge_queued_packets
> > > > > > > > > > > > >                 qemu_flush_or_purge_queued_packets
> > > > > > > > > > > > >                 .....
> > > > > > > > > > > > >                 (callback) virtio_net_tx_complete
> > > > > > > > > > > > >                     virtio_net_flush_tx <-- here we send new packets. We need to stop it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Because it is inside the callback, I can't pass information through the stack. I
> > > > > > > > > > > > > originally thought it was a general situation, so I wanted to put it in
> > > > > > > > > > > > > struct VirtQueue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If it is not very suitable, it may be better to put it in VirtIONetQueue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks.
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm maybe. Another idea: isn't virtio_net_tx_complete called
> > > > > > > > > > > > with length 0 here? Are there other cases where length is 0?
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > What does the call stack look like?
> > > > > > > > > > > > >
> > > > > > > > > > > > > > If yes introducing a vq state just so virtio_net_flush_tx
> > > > > > > > > > > > > > knows we are in the process of reset would be a bad idea.
> > > > > > > > > > > > > > We want something much more local, ideally on stack even ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Fixes: 7dc6be52 ("virtio-net: support queue reset")
> > > > > > > > > > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1451
> > > > > > > > > > > > > > > Reported-by: Alexander Bulekov
> > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > >  hw/net/virtio-net.c | 3 ++-
> > > > > > > > > > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > > > > > > > > > > > index 3ae909041a..fba6451a50 100644
> > > > > > > > > > > > > > > --- a/hw/net/virtio-net.c
> > > > > > > > > > > > > > > +++ b/hw/net/virtio-net.c
> > > > > > > > > > > > > > > @@ -2627,7 +2627,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
> > > > > > > > > > > > > > >      VirtQueueElement *elem;
> > > > > > > > > > > > > > >      int32_t num_packets = 0;
> > > > > > > > > > > > > > >      int queue_index = vq2q(virtio_get_queue_index(q->tx_vq));
> > > > > > > > > > > > > > > -    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > > > > > > > > > > > > +    if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) ||
> > > > > > > > > > > > > > > +        virtio_queue_reset_state(q->tx_vq)) {
> > > > > > > > > > > >
> > > > > > > > > > > > btw this sounds like you are asking it to reset some state.
> > > > > > > > > > > >
> > > > > > > > > > > > > > >          return num_packets;
> > > > > > > > > > > >
> > > > > > > > > > > > and then
> > > > > > > > > > > >
> > > > > > > > > > > >     ret = virtio_net_flush_tx(q);
> > > > > > > > > > > >     if (ret >= n->tx_burst)
> > > > > > > > > > > >
> > > > > > > > > > > > will reschedule automatically won't it?
> > > > > > > > > > > >
> > > > > > > > > > > > also why check in virtio_net_flush_tx and not virtio_net_tx_complete?
> > > > > > > > > > >
> > > > > > > > > > > virtio_net_flush_tx may be called by a timer.
> > > > > > > > > >
> > > > > > > > > > We stop timer/bh during device reset, do we need to do the same with vq reset?
> > > > > > > > > > >
> > > > > > > > > > > Thanks.
> > > > > > > > > >
> > > > > > > > > > timer won't run while flush_or_purge_queued_packets is in progress.
> > > > > > > > >
> > > > > > > > > Is the timer not executed during the VMEXIT process? Otherwise, we still have
> > > > > > > > > to consider the window after flush_or_purge_queued_packets but before the
> > > > > > > > > structure is cleared.
> > > > > > > >
> > > > > > > > void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > > > > > > > {
> > > > > > > >     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > > > >
> > > > > > > >     if (k->queue_reset) {
> > > > > > > >         k->queue_reset(vdev, queue_index);
> > > > > > > >     }
> > > > > > > >
> > > > > > > >     __virtio_queue_reset(vdev, queue_index);
> > > > > > > > }
> > > > > > > >
> > > > > > > > No, timers do not run between k->queue_reset and __virtio_queue_reset.
> > > > > > > >
> > > > > > > > > Even if it can be processed in virtio_net_tx_complete, is there any good way?
> > > > > > > > > This is a callback, so it is not convenient to pass the parameters.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > > How about checking that length is 0?
> > > > > > >
> > > > > > > I think that checking the length is not a good way. This modifies the semantics
> > > > > > > of 0.
> > > > > >
> > > > > > 0 seems to mean "purge" and
> > > > > >
> > > > > > > It is
> > > > > > > not friendly to future maintenance. On the other hand, qemu_net_queue_purge()
> > > > > > > will pass 0, and this function is called from many places.
> > > > >
> > > > > That's exactly what we want actually, when we do a purge we don't need a flush?
> > > >
> > > > Yes, but I'm not sure. If we stop the flush, there may be other effects.
> > >
> > > So we did:
> > >
> > > virtio_net_queue_reset():
> > >     nc = qemu_get_subqueue(n->nic, vq2q(queue_index));
> > >     flush_or_purge_queued_packets(nc);
> > >         qemu_flush_or_purge_queued_packets(nc->peer, true); // [1]
> > >             if (qemu_net_queue_flush(nc->incoming_queue)) {
> > >                 ....
> > >             } else if (purge) {
> > >                 qemu_net_queue_purge(nc->incoming_queue, nc->peer);
> > >                     packet->send_cb()
> > >                         virtio_net_tx_complete()
> > >                             virtio_net_flush_tx()
> > >                                 qemu_sendv_packet_async() // [2]
> > >             }
> > >
> > > We try to flush the tap's incoming queue, and if we fail we will purge
> > > in [1]. But the sent_cb() tries to send more packets, which could be
> > > queued to the tap incoming queue [2]. This breaks the semantics of
> > > qemu_flush_or_purge_queued_packets().
> >
> > Sounds like good news, and I think so too.
> >
> > > >
> > > > On the other hand, if we use "0" as a judgment condition, do you mean only the
> > > > implementation of the purge in flush_or_purge_queued_packets()?
> > >
> > > It should be all the users of qemu_net_queue_purge(). The rest of the users
> > > all seem fine:
> > >
> > > virtio_net_vhost_status(): if we do a flush, it may end up touching the
> > > vring while vhost is running.
> > > filters: all do a flush before.
> > >
> > > > > > > How about we add an api in queue.c to replace the sent_cb callback on the queue?
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > OK I guess. Jason?
> > > > >
> > > > > Not sure, anything different from adding a check in
> > > > > virtio_net_tx_complete()? (assuming bh and timer are cancelled or
> > > > > deleted).
> > > >
> > > > We replaced the sent_cb with a function without flush.
> > >
> > > I meant it won't be different from adding a
> > >
> > > if (virtio_queue_is_reset())
> > >
> > > somewhere in virtio_net_tx_complete()?
> >
> > Only modify on the stack, without using a variable like disabled_by_reset.
>
> Ok, but per the discussion above, it looks to me we can simply check len
> against zero, which seems simpler. (And it may fix other possible bugs)

Yes, if '0' means purge and we should not flush new packets for a purge, that
is a good way.

I will post the patch soon.

Thanks.

>
> Thanks
>
> >
> > Thanks.
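[Editor's note: the direction agreed above — treat a completion length of 0 in virtio_net_tx_complete() as "purge in progress" and skip the flush — can be sketched outside QEMU with a toy model. Everything below is invented for illustration (struct names, counters, the flush stub); only the len == 0 guard mirrors the proposal, and this is not QEMU's actual code.]

```c
#include <stddef.h>

/* Toy stand-in for VirtIONetQueue state; not QEMU's real structure. */
struct model_txq {
    int queued;   /* packets still sitting in the device tx queue */
    int sent;     /* packets handed to the backend so far */
};

/* Models virtio_net_flush_tx(): push out everything that is queued. */
static int model_flush_tx(struct model_txq *q)
{
    int n = q->queued;
    q->sent += n;
    q->queued = 0;
    return n;
}

/* Models virtio_net_tx_complete() with the proposed guard: a completion
 * length of 0 means the packet was purged (per-queue reset in progress),
 * so no new packets may be pushed toward the backend. */
static void model_tx_complete(struct model_txq *q, long len)
{
    if (len == 0) {
        return;              /* purge: leave the queue untouched */
    }
    model_flush_tx(q);       /* normal completion: keep traffic flowing */
}
```

The point of the guard is purely local, as requested earlier in the thread: the decision is made from the callback's own argument, with no extra state in struct VirtQueue.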
> > > > >
> > > > > Thanks
> > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > > Thanks
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > 2.32.0.3.g01195cf9f
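[Editor's note: the breakage Jason labels [1]/[2] — a purge whose sent_cb re-enqueues packets mid-purge — can also be modeled standalone. All names below are invented; only the behavior under discussion is modeled, namely that a purge reports length 0 to each packet's callback, the way qemu_net_queue_purge() is described above.]

```c
#include <stdbool.h>

#define TOYQ_LEN 16

/* Toy model of a pending-packet queue in the style of net/queue.c. */
typedef void (*toy_sent_cb)(void *opaque, long len);

struct toy_packet { toy_sent_cb cb; void *opaque; };

struct toy_queue {
    struct toy_packet pkt[TOYQ_LEN];
    int n;
    int enqueued;   /* total enqueue calls, to observe re-entry */
};

static void toy_enqueue(struct toy_queue *q, toy_sent_cb cb, void *opaque)
{
    if (q->n < TOYQ_LEN) {
        q->pkt[q->n].cb = cb;
        q->pkt[q->n].opaque = opaque;
        q->n++;
        q->enqueued++;
    }
}

/* Purge: drop every pending packet; its callback sees len == 0. */
static void toy_purge(struct toy_queue *q)
{
    while (q->n > 0) {
        struct toy_packet p = q->pkt[--q->n];
        p.cb(p.opaque, 0);
    }
}

/* Completion callback in the style of virtio_net_tx_complete(): unless
 * it honors len == 0, it "flushes" by queueing more packets — exactly
 * the re-entry during purge that breaks the semantics in [1]/[2]. */
struct toy_dev { struct toy_queue *q; int pending; bool guard; };

static void toy_tx_complete(void *opaque, long len)
{
    struct toy_dev *d = opaque;
    if (d->guard && len == 0) {
        return;                          /* purged: send nothing new */
    }
    while (d->pending > 0) {             /* model of the tx flush */
        d->pending--;
        toy_enqueue(d->q, toy_tx_complete, d);
    }
}
```

With the guard, a purge touches only the packets that were already pending; without it, the callback enqueues more work while the purge is draining the queue, which is the invariant violation discussed above.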