From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CBF0C432BE for ; Tue, 31 Aug 2021 07:11:37 +0000 (UTC) Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by mail.kernel.org (Postfix) with ESMTP id 91F8560FED for ; Tue, 31 Aug 2021 07:11:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 91F8560FED Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=dpdk.org Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A63D44068A; Tue, 31 Aug 2021 09:11:35 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by mails.dpdk.org (Postfix) with ESMTP id F237440687 for ; Tue, 31 Aug 2021 09:11:33 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630393893; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5fheDravHr8bTmx+G3xCyyCUrF18+sB+zAHxEOVyec4=; b=MVvB53PHWY+irM1iPVLxj6NH67R3rpWdkXJlp4pwbfTIX7jB0jAp3KsKzql2TN5npvZvI9 1hGexM7AhOvmZyZWFeDmODCcQtq99UZrT2xLNjkhiVe8q2aN55mZFxAcg+Fcd6iwrfQEfV ReVS9qoVuBquSyVQ6ydJ517k4usmp/Y= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-539-ReBXOsQrPj2aaYV0WIDF6w-1; Tue, 31 Aug 2021 03:11:31 -0400 X-MC-Unique: ReBXOsQrPj2aaYV0WIDF6w-1 Received: by mail-qt1-f198.google.com with SMTP id 13-20020ac8560d000000b0029f69548889so58750qtr.3 for ; Tue, 31 Aug 2021 00:11:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=5fheDravHr8bTmx+G3xCyyCUrF18+sB+zAHxEOVyec4=; b=dQ7nzFOKctjHKziWi2hkHdMU7HH99MXCwQ+1Izz0tDjA75VEPH59wEmGfKWgZ1kF2+ 5jS2YsUqWo0S15N3RqmRWtDAD4RFQVxq4Pff+bE2gjKRBSGyYNng0AysRWPPpIuqkqAd cJjSRX81bPYiCothOOtB4huzS8aDoDDHBDbAmPuJxij/7W1ScY06rG5qBDs3wbAdJ5Ga DkaGgr5kXcH+mG/3JeMqfWyHWFOwZWdizKoChIXlAUBNY1J537wWlYqNzK6Lmg4RK3jw KVge+E3LV0zDb9mc4zpjkTx5TvDBC2cMUdOaa0noihUlrtSHr1aZV6Zeg0ssbcNiekAg lltg== X-Gm-Message-State: AOAM533fKoTvWZHHwqoSKC74SS1HoU1KQdvebGDj09LZqRT39snT66Vp QZCC4fCRjEdJP5j1C0D5x9yVU2IX834myAXLr7RXIIHDjlS6aZHWqqXkMHa6N1AshNvBGPQvds4 DaNw3EA8nc7MH7UFTsgc= X-Received: by 2002:a05:620a:1598:: with SMTP id d24mr1593450qkk.484.1630393891361; Tue, 31 Aug 2021 00:11:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxjNsBVh0z53fHnQn7pIr406EoBf38BZRcO+JjdK0zSEoIZPv93khwPlDXi6CbAyzyC6QGuBvS+hn1cji/k2/8= X-Received: by 2002:a05:620a:1598:: with SMTP id d24mr1593434qkk.484.1630393891139; Tue, 31 Aug 2021 00:11:31 -0700 (PDT) MIME-Version: 1.0 References: <20210827161231.579968-1-eperezma@redhat.com> In-Reply-To: From: Eugenio Perez Martin Date: Tue, 31 Aug 2021 09:10:54 +0200 Message-ID: To: "Xia, Chenbo" Cc: Maxime Coquelin , "dev@dpdk.org" , Pei Zhang , Jason Wang Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eperezma@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [PATCH] vhost: Clean iotlb cache on vring stop X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Tue, Aug 31, 2021 at 4:02 AM Xia, Chenbo wrote: > > Hi Eugenio, > > > -----Original Message----- > > From: Eugenio Perez Martin > > Sent: Tuesday, August 31, 2021 2:10 AM > > To: Xia, Chenbo > > Cc: Maxime Coquelin ; dev@dpdk.org; Pei Zha= ng > > ; Jason Wang > > Subject: Re: [PATCH] vhost: Clean iotlb cache on vring stop > > > > On Mon, Aug 30, 2021 at 1:58 PM Xia, Chenbo wrot= e: > > > > > > Hi Eugenio, > > > > > > > Hi Chenbo, > > > > > > -----Original Message----- > > > > From: Eugenio P=C3=A9rez > > > > Sent: Saturday, August 28, 2021 12:13 AM > > > > To: Maxime Coquelin ; Xia, Chenbo > > > > > > > > Cc: dev@dpdk.org; Pei Zhang ; Jason Wang > > > > > > > > Subject: [PATCH] vhost: Clean iotlb cache on vring stop > > > > > > Clean -> clean > > > > > > > Is that something I need to send a new revision for, or is it ok to > > apply on the maintainer side? > > > > > > > > > > Old IOVA cache entries are left when there is a change on virtio dr= iver > > > > in VM. In case that all these old entries have iova addresses lesse= r > > > > than new iova entries, vhost code will need to iterate all the cach= e to > > > > find the new ones. In case of just a new iova entry needed for the = new > > > > translations, this condition will last forever. > > > > > > > > This has been observed in virtio-net to testpmd's vfio-pci driver > > > > transition, reducing the performance from more than 10Mpps to less = than > > > > 0.07Mpps if the hugepage address was higher than the networking > > > > buffers. Since all new buffers are contained in this new gigantic p= age, > > > > vhost needs to scan IOTLB_CACHE_SIZE - 1 for each translation at wo= rst. > > > > > > I'm curious why QEMU will not invalidate iotlb when virtio-net driver= is > > removed > > > (dma region should be unmapped). > > > > > > > I'm going to investigate this more, but qemu iommu notifier callback > > (vhost_iommu_unmap_notify) is never called through all the test. Also, > > guest kernel code calls dma_unmap_page for each buffer and vqs, but it > > never generates an iotlb flush. > > > > Or do you mean that qemu should also flush all iotlb entries on vhost > > device stop? > > I think as you said, the driver unmapped all entries, so theoretically th= e iotlb > entries should all be invalidated. The iotlb invalidation should be trigg= ered by > iommu entry unmapping. Make sense to you? > It totally makes sense. I just wanted to make sure at what level you meant the unmap should happen: If you meant qemu, then we need more code. If you meant the virtio-net/ring kernel driver, then there is a bug/misunderstanding on what is happening. I think the iotlb unmapping is not happening because the entries are still valid, although not needed with vfio/guest's testpmd. If that were not true we would see way more errors with virtio-net+iommu because of invalid translations: * virtio-net adds a buffer to virtio_ring. virtio-ring dma map does not generate a iotlb map: Qemu will wait for each device to request the missing pages. * The device will miss the translation: It requests it, and saves it in the list for the future. * virtio-ring is done with the buffer: It unmaps them. After a while, kernel needs to dma_map that page again. This new mapping does not generate a page miss as far as I see. * Since this new mapping is already on the device's iotlb cache, the device will not issue a page miss. That means the page must be mapped to the same GPA as before: If not, the network would fail. With this assumption, it is useless to iotlb-unmap them: The device will need to request them again if they're cleared from its cache. But all of this is my guess and I could be wrong :). > > > > > And since the perf drop is huge, why not cc to stable and add fix tag= ? > > > > > > > I was not sure if it was worth it to backport, but I would say that > > the issue can be reproduced with enough bad luck. Since translations > > have always been saved in a linked list: > > > > Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions") > > > > Same question as before, if no changes to the code are needed for the > > patch, do I need to send a second revision? > > I think it sounds a bug to me. So no matter how often it could be reprodu= ced, it's > worth to backport. I suggest to send a v2 with the title fixed and Fixes = & cc > stable tag added so that it will show in stable mailing list and save Max= ime's effort > when applying. > Sure, I will do it. Thanks! > Thanks, > Chenbo > > > > > Thanks! > > > > > > > Thanks, > > > Chenbo > > > > > > > > > > > Signed-off-by: Eugenio P=C3=A9rez > > > > Reported-by: Pei Zhang > > > > --- > > > > lib/vhost/vhost_user.c | 2 ++ > > > > 1 file changed, 2 insertions(+) > > > > > > > > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c > > > > index 29a4c9af60..7de48f5333 100644 > > > > --- a/lib/vhost/vhost_user.c > > > > +++ b/lib/vhost/vhost_user.c > > > > @@ -2113,6 +2113,8 @@ vhost_user_get_vring_base(struct virtio_net *= *pdev, > > > > msg->size =3D sizeof(msg->payload.state); > > > > msg->fd_num =3D 0; > > > > > > > > + vhost_user_iotlb_flush_all(vq); > > > > + > > > > vring_invalidate(dev, vq); > > > > > > > > return RTE_VHOST_MSG_RESULT_REPLY; > > > > -- > > > > 2.27.0 > > > >