From: Jason Wang
Date: Thu, 28 Jul 2022 15:36:13 +0800
Subject: Re: [PATCH V3 4/6] vDPA: !FEATURES_OK should not block querying device config space
To: Si-Wei Liu
Cc: "Zhu, Lingshan", Parav Pandit, "mst@redhat.com", Eli Cohen, "netdev@vger.kernel.org", "xieyongji@bytedance.com", "gautam.dawar@amd.com", "virtualization@lists.linux-foundation.org"
References: <20220701132826.8132-1-lingshan.zhu@intel.com> <20220701132826.8132-5-lingshan.zhu@intel.com> <00889067-50ac-d2cd-675f-748f171e5c83@oracle.com> <63242254-ba84-6810-dad8-34f900b97f2f@intel.com> <8002554a-a77c-7b25-8f99-8d68248a741d@oracle.com>
X-Mailing-List: netdev@vger.kernel.org

On Thu, Jul 28, 2022 at 3:09 PM Si-Wei Liu wrote:
>
> On 7/27/2022 7:06 PM, Jason Wang wrote:
> >
> > On 2022/7/28 08:56, Si-Wei Liu wrote:
> >>
> >> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
> >>>
> >>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
> >>>> Sorry to chime in late in the game. For some reason I couldn't get
> >>>> to most emails for this discussion (I only subscribed to the
> >>>> virtualization list) while I was away over the past few weeks.
> >>>>
> >>>> It looks to me like this patch is incomplete. Note that further
> >>>> down, in vdpa_dev_net_config_fill(), we have the following:
> >>>>     features = vdev->config->get_driver_features(vdev);
> >>>>     if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES,
> >>>>                           features, VDPA_ATTR_PAD))
> >>>>         return -EMSGSIZE;
> >>>>
> >>>> Calling .get_driver_features() doesn't make sense when feature
> >>>> negotiation isn't complete. Neither should negotiated_features be
> >>>> presented to userspace before negotiation is done.
> >>>>
> >>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably
> >>>> should not be shown before negotiation is done, since it depends
> >>>> on the negotiated driver features.
> >>> I have another patch in this series that introduces device_features
> >>> and will report device_features to userspace even if feature
> >>> negotiation is not done, because the spec says we should allow the
> >>> driver to access the config space before FEATURES_OK.
> >> That the config space can be accessed by the guest before features_ok
> >> doesn't necessarily mean the value is valid.
> >
> > It's valid as long as the device offers the feature:
> >
> > "The device MUST allow reading of any device-specific configuration
> > field before FEATURES_OK is set by the driver. This includes fields
> > which are conditional on feature bits, as long as those feature bits
> > are offered by the device."
> I guess this statement only conveys that the field in config space can
> be read before FEATURES_OK is set, though it does not *explicitly*
> state the validity of the field.

My understanding is that it should be valid as long as the device
offers the feature. For example, if _MQ is offered by the device,
max_virtqueue_pairs is always valid and can be read by the driver no
matter whether _MQ is negotiated.

>
> And looking at:
>
> "The mac address field always exists (though is only valid if
> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS
> is set."
>
> It appears to me there's a border line set between "exist" and "valid".
> If I understand the spec wording correctly, a spec-conforming device
> implementation may or may not offer a valid status value in the config
> space when VIRTIO_NET_F_STATUS is offered, but before the feature is
> negotiated.

That's not what I read, maybe Michael can clarify this.

> On the other hand, config space should contain a valid mac
> address the moment the VIRTIO_NET_F_MAC feature is offered, regardless
> of whether it is negotiated or not.

I agree here.

> By that, there seems to be leeway for the device
> implementation to decide when a config space field may become valid,
> though for most of QEMU's software virtio devices, a valid value is
> present in config space from the very first moment the feature is
> offered.
>
> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
> entry indicates the “physical” address of the network card, otherwise
> the driver would typically generate a random local MAC address."
> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status
> comes from the bottom bit of status. Otherwise, the driver assumes it’s
> active."

This is mostly how drivers that don't support _F_STATUS work.
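To make the offered-versus-negotiated distinction concrete, here is a
minimal sketch. It is not taken from any driver: struct
sketch_net_config and the sketch_*() helpers are made-up names for
illustration, and only the feature bit numbers and the LINK_UP bit
come from the virtio spec. It assumes the reading argued above, namely
that max_virtqueue_pairs is usable once _MQ is offered, while link
status follows the spec text that keys it on _F_STATUS being
negotiated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit numbers as defined by the virtio spec for the network device. */
#define VIRTIO_NET_F_STATUS   16
#define VIRTIO_NET_F_MQ       22
#define VIRTIO_NET_S_LINK_UP  1

/* Hypothetical, simplified view of the net config space (illustration only). */
struct sketch_net_config {
	uint16_t status;
	uint16_t max_virtqueue_pairs;
};

/* True if feature bit number 'bit' is set in the 64-bit feature word. */
static bool sketch_has_feature(uint64_t features, unsigned int bit)
{
	return (features & (1ULL << bit)) != 0;
}

/*
 * Link state as a driver would see it: if _F_STATUS was not negotiated,
 * the driver assumes the link is active; otherwise it uses the bottom
 * bit of the status field.
 */
static bool sketch_link_up(const struct sketch_net_config *cfg,
			   uint64_t driver_features)
{
	assert(cfg != NULL);
	if (!sketch_has_feature(driver_features, VIRTIO_NET_F_STATUS))
		return true;
	return (cfg->status & VIRTIO_NET_S_LINK_UP) != 0;
}

/*
 * Assumption: max_virtqueue_pairs is treated as valid as soon as the
 * device offers _MQ, independent of negotiation. Without _MQ the field
 * is ignored and a single queue pair is reported.
 */
static uint16_t sketch_max_vq_pairs(const struct sketch_net_config *cfg,
				    uint64_t device_features)
{
	assert(cfg != NULL);
	if (!sketch_has_feature(device_features, VIRTIO_NET_F_MQ))
		return 1;
	return cfg->max_virtqueue_pairs;
}
```

Note the two helpers take different feature words on purpose: one is
keyed on what the driver negotiated, the other on what the device
offered.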
>
> And also there are special cases where the read of a specific
> configuration space field MUST be deferred until FEATURES_OK is set:
>
> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode
> can be read or set through the writeback field. 0 corresponds to a
> writethrough cache, 1 to a writeback cache. The cache mode after reset
> can be either writeback or writethrough. The actual mode can be
> determined by reading writeback after feature negotiation."
> "The driver MUST NOT read writeback before setting the FEATURES_OK
> device status bit."

This seems to conflict with the normative statements I quoted above,
and I don't get why we need this.

> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not,
> the device MUST initialize writeback to 0."
>
> Since the spec doesn't explicitly mandate the validity of each config
> space field when the feature of concern is offered, to be safer we'd
> have to live with odd device implementations. I know for sure QEMU
> software devices won't for 99% of these cases, but that's not what is
> currently defined in the spec.
>
> >
> >> You may want to double check with Michael for what he quoted earlier:
> >>> Nope:
> >>>
> >>> 2.5.1 Driver Requirements: Device Configuration Space
> >>>
> >>> ...
> >>>
> >>> For optional configuration space fields, the driver MUST check that
> >>> the corresponding feature is offered
> >>> before accessing that part of the configuration space.
> >>
> >> and how many driver bugs come from wrong assumptions about the
> >> validity of config space fields without features_ok. I am not sure
> >> what use case you want to expose config register values for before
> >> features_ok; if it's mostly for live migration I guess it's probably
> >> heading in the wrong direction.
> >
> > I guess it's not for migration.
> Then what's the possible use case, other than live migration, for
> exposing config space values?
> Troubleshooting config space discrepancy between vDPA and the emulated
> virtio device in userspace? Or tracking changes in config space across
> feature negotiation, but for what? It'd be beneficial to the interface
> design if the specific use case can be clearly described...

Monitoring or debugging I guess.

Thanks

> >
> > For migration, a provision with the correct features/capability would
> > be sufficient.
> Right, that's what I thought too. It doesn't need to expose config space
> values; simply exporting all attributes for vdpa device creation will do
> the work.
>
> -Siwei
>
> >
> > Thanks
> >
> >>>>
> >>>> Last but not least, this "vdpa dev config" command was not
> >>>> designed to display the real config space register values in the
> >>>> first place. Quoting the vdpa-dev(8) man page:
> >>>>
> >>>>> vdpa dev config show - Show configuration of specific device or
> >>>>> all devices.
> >>>>> DEV - specifies the vdpa device to show its configuration. If this
> >>>>> argument is omitted all devices configuration is listed.
> >>>> It doesn't say anything about configuration space or register
> >>>> values in config space. As long as it can convey the config
> >>>> attributes used when instantiating the vDPA device instance, and
> >>>> more importantly, the config can be easily imported from or
> >>>> exported to userspace tools when trying to reconstruct the vdpa
> >>>> instance intact on the destination host for live migration, IMHO
> >>>> in my personal interpretation it doesn't matter what the config
> >>>> space may present. It may be worthwhile adding a new debug command
> >>>> to expose the real register values, but that's another story.
> >>> I am not sure I get your point.
> >>> vDPA now reports device feature bits (device_features) and
> >>> negotiated feature bits (driver_features), and yes, the driver
> >>> features can be a subset of the device features; and the vDPA
> >>> device features can be a subset of the management device features.
> >> What I said is that after unblocking the conditional check, you'd
> >> have to handle the case for each vdpa attribute when feature
> >> negotiation is not yet done: basically the register values you get
> >> from config space via the vdpa_get_config_unlocked() call are not
> >> considered valid before features_ok (per spec). Although in some
> >> cases you may get a sane value, such behavior is generally
> >> undefined. If you desire to show just the device_features alone,
> >> which the device had advertised *before feature negotiation is
> >> complete*, without any config space field, that'll be fine. But it
> >> looks to me like this is not how the patch has been implemented.
> >> Probably needs some more work?
> >>
> >> Regards,
> >> -Siwei
> >>
> >>>>
> >>>> Having said that, please consider dropping the Fixes tag, as it
> >>>> appears to me you're proposing a new feature rather than fixing a
> >>>> real issue.
> >>> It's a new feature to report the device feature bits rather than
> >>> only the negotiated features; however, this patch is a must, or it
> >>> will block the device feature bits reporting. But I agree, the
> >>> Fixes tag is not a must.
> >>>>
> >>>> Thanks,
> >>>> -Siwei
> >>>>
> >>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
> >>>>>> From: Zhu Lingshan
> >>>>>> Sent: Friday, July 1, 2022 9:28 AM
> >>>>>>
> >>>>>> Users may want to query the config space of a vDPA device, to
> >>>>>> choose an appropriate one for a certain guest. This means the
> >>>>>> users need to read the config space before FEATURES_OK, and the
> >>>>>> existence of config space contents does not depend on
> >>>>>> FEATURES_OK.
> >>>>>>
> >>>>>> The spec says:
> >>>>>> The device MUST allow reading of any device-specific
> >>>>>> configuration field before FEATURES_OK is set by the driver.
> >>>>>> This includes fields which are conditional on feature bits, as
> >>>>>> long as those feature bits are offered by the device.
> >>>>>>
> >>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
> >>>>> Fix is fine, but the Fixes tag needs the correction described below.
> >>>>>
> >>>>> The commit id above is 13 characters; it should be 12.
> >>>>> And it should be in the format:
> >>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
> >>>>>
> >>>>> Please use the checkpatch.pl script before posting the patches to
> >>>>> catch these errors.
> >>>>> There is a bot that looks at the Fixes tag and identifies the
> >>>>> right kernel version to apply this fix.
> >>>>>
> >>>>>> Signed-off-by: Zhu Lingshan
> >>>>>> ---
> >>>>>>  drivers/vdpa/vdpa.c | 8 --------
> >>>>>>  1 file changed, 8 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> >>>>>> index 9b0e39b2f022..d76b22b2f7ae 100644
> >>>>>> --- a/drivers/vdpa/vdpa.c
> >>>>>> +++ b/drivers/vdpa/vdpa.c
> >>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
> >>>>>>  {
> >>>>>>      u32 device_id;
> >>>>>>      void *hdr;
> >>>>>> -    u8 status;
> >>>>>>      int err;
> >>>>>>
> >>>>>>      down_read(&vdev->cf_lock);
> >>>>>> -    status = vdev->config->get_status(vdev);
> >>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> >>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
> >>>>>> -        err = -EAGAIN;
> >>>>>> -        goto out;
> >>>>>> -    }
> >>>>>> -
> >>>>>>      hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> >>>>>>                        VDPA_CMD_DEV_CONFIG_GET);
> >>>>>>      if (!hdr) {
> >>>>>> --
> >>>>>> 2.31.1
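For what it's worth, the per-attribute handling discussed in this
thread could be sketched as below. This is only an illustration under
stated assumptions: struct sketch_dev, struct sketch_report and
sketch_fill_report() are hypothetical names, not the code in
drivers/vdpa/vdpa.c; only VIRTIO_CONFIG_S_FEATURES_OK (8) and
VIRTIO_NET_F_MQ (bit 22) are taken from the virtio spec. Gating
max_vqp on the offered _MQ bit reflects one reading argued in the
thread; the other reading would gate it on the negotiated features
instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_CONFIG_S_FEATURES_OK 8   /* device status bit, per the virtio spec */
#define VIRTIO_NET_F_MQ             22  /* net feature bit number, per the virtio spec */

/* Hypothetical device state (illustration only). */
struct sketch_dev {
	uint8_t  status;
	uint64_t device_features;      /* what the device offers */
	uint64_t driver_features;      /* what the driver has written so far */
	uint16_t max_virtqueue_pairs;  /* config space field */
};

/* Hypothetical report handed to userspace; has_* marks attributes that are present. */
struct sketch_report {
	uint64_t device_features;
	bool     has_negotiated;
	uint64_t negotiated_features;
	bool     has_max_vqp;
	uint16_t max_vqp;
};

static void sketch_fill_report(const struct sketch_dev *dev,
			       struct sketch_report *out)
{
	assert(dev != NULL && out != NULL);

	/* Offered features are meaningful regardless of negotiation state. */
	out->device_features = dev->device_features;

	/* Negotiated features are reported only once FEATURES_OK is set. */
	out->has_negotiated = (dev->status & VIRTIO_CONFIG_S_FEATURES_OK) != 0;
	out->negotiated_features = out->has_negotiated ? dev->driver_features : 0;

	/* Assumption: the field is reportable once the device offers _MQ. */
	out->has_max_vqp = (dev->device_features & (1ULL << VIRTIO_NET_F_MQ)) != 0;
	out->max_vqp = out->has_max_vqp ? dev->max_virtqueue_pairs : 0;
}
```

The point of the has_* flags is that an attribute can simply be left
out of the reply, rather than failing the whole query with -EAGAIN as
the removed check did.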