All of lore.kernel.org
 help / color / mirror / Atom feed
From: Si-Wei Liu <si-wei.liu@oracle.com>
To: "Zhu, Lingshan" <lingshan.zhu@intel.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Cc: kvm@vger.kernel.org, netdev@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	xieyongji@bytedance.com, gautam.dawar@amd.com
Subject: Re: [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
Date: Thu, 18 Aug 2022 17:04:52 -0700	[thread overview]
Message-ID: <5fe9d595-f073-5709-edf6-386bda8a55fd@oracle.com> (raw)
In-Reply-To: <b2430d3b-4ca2-a19c-da39-da91da732c02@intel.com>


[-- Attachment #1.1: Type: text/plain, Size: 13164 bytes --]



On 8/17/2022 11:52 PM, Zhu, Lingshan wrote:
>
>
> On 8/18/2022 2:48 AM, Si-Wei Liu wrote:
>>
>>
>> On 8/16/2022 11:23 PM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 8/17/2022 2:14 PM, Michael S. Tsirkin wrote:
>>>> On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote:
>>>>>
>>>>> On 8/17/2022 6:48 AM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>>
>>>>>      On 8/16/2022 1:29 AM, Zhu, Lingshan wrote:
>>>>>
>>>>>
>>>>>
>>>>>          On 8/16/2022 3:41 PM, Si-Wei Liu wrote:
>>>>>
>>>>>              Hi Michael,
>>>>>
>>>>>              I just noticed this patch got pulled to linux-next 
>>>>> prematurely
>>>>>              without getting consensus on code review, am not sure 
>>>>> why. Hope it
>>>>>              was just an oversight.
>>>>>
>>>>>              Unfortunately this introduced functionality 
>>>>> regression to at least
>>>>>              two cases so far as I see:
>>>>>
>>>>>              1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are 
>>>>> inadvertently
>>>>>              exposed and displayed in "vdpa dev config show" 
>>>>> before feature
>>>>>              negotiation is done. Noted the corresponding features 
>>>>> name shown in
>>>>>              vdpa tool is called "negotiated_features" rather than
>>>>>              "driver_features". I see in no way the intended 
>>>>> change of the patch
>>>>>              should break this user level expectation regardless 
>>>>> of any spec
>>>>>              requirement. Do you agree on this point?
>>>>>
>>>>>          I will post a patch for iptour2, doing:
>>>>>          1) if iprout2 does not get driver_features from the 
>>>>> kernel, then don't
>>>>>          show negotiated features in the command output
>>>>>
>>>>>      This won't work as the vdpa userspace tool won't know *when* 
>>>>> features are
>>>>>      negotiated. There's no guarantee in the kernel to assume 0 
>>>>> will be returned
>>>>>      from vendor driver during negotiation. On the other hand, 
>>>>> with the supposed
>>>>>      change, userspace can't tell if there's really none of 
>>>>> features negotiated,
>>>>>      or the feature negotiation is over. Before the change the 
>>>>> userspace either
>>>>>      gets all the attributes when feature negotiation is over, or 
>>>>> it gets
>>>>>      nothing when it's ongoing, so there was a distinction.This 
>>>>> expectation of
>>>>>      what "negotiated_features" represents is established from day 
>>>>> one, I see no
>>>>>      reason the intended kernel change to show other attributes 
>>>>> should break
>>>>>      userspace behavior and user's expectation.
>>>>>
>>>>> User space can only read valid *driver_features* after the 
>>>>> features negotiation
>>>>> is done, *device_features* does not require the negotiation.
>>>>>
>>>>> If you want to prevent random values read from driver_features, 
>>>>> here I propose
>>>>> a fix: only read driver_features when the negotiation is done, 
>>>>> this means to
>>>>> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the
>>>>> driver_features.
>>>>> Sounds good?
>>>>>
>>>>> @MST, if this is OK, I can include this change in my next version 
>>>>> patch series.
>>>>>
>>>>> Thanks,
>>>>> Zhu Lingshan
>>>> Sorry I don't get it. Is there going to be a new version? Do you 
>>>> want me
>>>> to revert this one and then apply a new one? It's ok if yes.
>>> Not a new version, it is a new patch, though I still didn't get the 
>>> race condition, but I believe it
>>> is reasonable to block reading the *driver_features* before 
>>> FEATURES_OK.
>>>
>>> So, I added code to check whether _FEATURES_OK is set:
>>>
>>>  861         /* only read driver features after the feature 
>>> negotiation is done */
>>>  862         status = vdev->config->get_status(vdev);
>>>  863         if (status & VIRTIO_CONFIG_S_FEATURES_OK) {
>>>  864                 features_driver = 
>>> vdev->config->get_driver_features(vdev);
>>>  865                 if (nla_put_u64_64bit(msg, 
>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
>>>  866                                       VDPA_ATTR_PAD))
>>>  867                 return -EMSGSIZE;
>>>  868         }
>>>
>>> If this solution looks good, I will add this patch in my V2 series.
>> This solves the 1st issue in my report, but without a fix for the 2nd 
>> issue ( vdpa_dev_net_config_fill and vdpa_set_features race) 
>> addressed I don't think it covers all incurred regressions.
>>
>> I've replied the way to reproduce the race. For me it's very obvious 
>> to see in my setup.
> Though I still did not see any errors in my run, but I guess you mean 
> the race condition in set_features(), right?
Yes.
>
> The spec says: The device MUST allow reading of any device-specific 
> configuration field before FEATURES_OK is set by the driver.
>
> So there is no need to check whether driver_features is set in 
> vdpa_get_config_unlocked(). We should remove the code checks
> feature_valid and the code set_features to zero. This conflicts with 
> the spec. And so no race conditions.
I don't disagree with the removal, you can try once more. This proposal 
had been attempted but rejected.

-Siwei



>
> Thanks,
> Zhu Lingshan
>>>
>>> > So what is the race?
>>> You'll see the race if you keep 'vdpa dev config show ...' running 
>>> in a tight loop while launching a VM with the vDPA device under query.
>>>
>> -Siwei
>>
>>>
>>> Thanks
>>> Zhu Lingshan
>>>
>>>>
>>>>
>>>>>          2) process and decoding the device features.
>>>>>
>>>>>
>>>>>              2. There was also another implicit assumption that is 
>>>>> broken by
>>>>>              this patch. There could be a vdpa tool query of 
>>>>> config via
>>>>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races
>>>>>              with the first vdpa_set_features() call from VMM e.g. 
>>>>> QEMU. Since
>>>>>              the S_FEATURES_OK blocking condition is removed, if 
>>>>> the vdpa tool
>>>>>              query occurs earlier than the first 
>>>>> set_driver_features() call from
>>>>>              VMM, the following code will treat the guest as 
>>>>> legacy and then
>>>>>              trigger an erroneous vdpa_set_features_unlocked(... , 
>>>>> 0) call to
>>>>>              the vdpa driver:
>>>>>
>>>>>               374         /*
>>>>>               375          * Config accesses aren't supposed to 
>>>>> trigger before
>>>>>              features are set.
>>>>>               376          * If it does happen we assume a legacy 
>>>>> guest.
>>>>>               377          */
>>>>>               378         if (!vdev->features_valid)
>>>>>               379 vdpa_set_features_unlocked(vdev, 0);
>>>>>               380         ops->get_config(vdev, offset, buf, len);
>>>>>
>>>>>              Depending on vendor driver's implementation, L380 may 
>>>>> either return
>>>>>              invalid config data (or invalid endianness if on BE) 
>>>>> or only config
>>>>>              fields that are valid in legacy layout. What's more 
>>>>> severe is that,
>>>>>              vdpa tool query in theory shouldn't affect feature 
>>>>> negotiation at
>>>>>              all by making confusing calls to the device, but now 
>>>>> it is possible
>>>>>              with the patch. Fixing this would require more 
>>>>> delicate work on the
>>>>>              other paths involving the cf_lock reader/write 
>>>>> semaphore.
>>>>>
>>>>>              Not sure what you plan to do next, post the fixes for 
>>>>> both issues
>>>>>              and get the community review? Or simply revert the 
>>>>> patch in
>>>>>              question? Let us know.
>>>>>
>>>>>          The spec says:
>>>>>          The device MUST allow reading of any device-specific 
>>>>> configuration
>>>>>          field before FEATURES_OK is set by
>>>>>          the driver. This includes fields which are conditional on 
>>>>> feature bits,
>>>>>          as long as those feature bits are offered
>>>>>          by the device.
>>>>>
>>>>>          so whether FEATURES_OK should not block reading the 
>>>>> device config
>>>>>          space. vdpa_get_config_unlocked() will read the features, 
>>>>> I don't know
>>>>>          why it has a comment:
>>>>>                  /*
>>>>>                   * Config accesses aren't supposed to trigger 
>>>>> before features
>>>>>          are set.
>>>>>                   * If it does happen we assume a legacy guest.
>>>>>                   */
>>>>>
>>>>>          This conflicts with the spec.
>>>>>
>>>>>          vdpa_get_config_unlocked() checks vdev->features_valid, 
>>>>> if not valid,
>>>>>          it will set the drivers_features 0, I think this intends 
>>>>> to prevent
>>>>>          reading random driver_features. This function does not 
>>>>> hold any locks,
>>>>>          and didn't change anything.
>>>>>
>>>>>          So what is the race?
>>>>>          You'll see the race if you keep 'vdpa dev config show 
>>>>> ...' running in a
>>>>>      tight loop while launching a VM with the vDPA device under 
>>>>> query.
>>>>>
>>>>>      -Siwei
>>>>>
>>>>>
>>>>>
>>>>>                  Thanks
>>>>>
>>>>>
>>>>>              Thanks,
>>>>>              -Siwei
>>>>>
>>>>>
>>>>>              On 8/12/2022 3:44 AM, Zhu Lingshan wrote:
>>>>>
>>>>>                  Users may want to query the config space of a 
>>>>> vDPA device,
>>>>>                  to choose a appropriate one for a certain guest. 
>>>>> This means the
>>>>>                  users need to read the config space before 
>>>>> FEATURES_OK, and
>>>>>                  the existence of config space contents does not 
>>>>> depend on
>>>>>                  FEATURES_OK.
>>>>>
>>>>>                  The spec says:
>>>>>                  The device MUST allow reading of any device-specific
>>>>>                  configuration
>>>>>                  field before FEATURES_OK is set by the driver. 
>>>>> This includes
>>>>>                  fields which are conditional on feature bits, as 
>>>>> long as those
>>>>>                  feature bits are offered by the device.
>>>>>
>>>>>                  Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>                  ---
>>>>>                    drivers/vdpa/vdpa.c | 8 --------
>>>>>                    1 file changed, 8 deletions(-)
>>>>>
>>>>>                  diff --git a/drivers/vdpa/vdpa.c 
>>>>> b/drivers/vdpa/vdpa.c
>>>>>                  index 6eb3d972d802..bf312d9c59ab 100644
>>>>>                  --- a/drivers/vdpa/vdpa.c
>>>>>                  +++ b/drivers/vdpa/vdpa.c
>>>>>                  @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct 
>>>>> vdpa_device
>>>>>                  *vdev, struct sk_buff *msg, u32 portid,
>>>>>                    {
>>>>>                        u32 device_id;
>>>>>                        void *hdr;
>>>>>                  -    u8 status;
>>>>>                        int err;
>>>>> down_read(&vdev->cf_lock);
>>>>>                  -    status = vdev->config->get_status(vdev);
>>>>>                  -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>                  -        NL_SET_ERR_MSG_MOD(extack, "Features 
>>>>> negotiation not
>>>>>                  completed");
>>>>>                  -        err = -EAGAIN;
>>>>>                  -        goto out;
>>>>>                  -    }
>>>>>                  -
>>>>>                        hdr = genlmsg_put(msg, portid, seq, 
>>>>> &vdpa_nl_family,
>>>>>                  flags,
>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>                        if (!hdr) {
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>

[-- Attachment #1.2: Type: text/html, Size: 28638 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

  parent reply	other threads:[~2022-08-19  0:05 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-12 10:44 [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 1/6] vDPA/ifcvf: get_config_size should return a value no greater than dev implementation Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 2/6] vDPA/ifcvf: support userspace to query features and MQ of a management device Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 3/6] vDPA: allow userspace to query features of a vDPA device Zhu Lingshan
2022-08-12 10:44 ` [PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space Zhu Lingshan
2022-08-16  7:41   ` Si-Wei Liu
2022-08-16  7:41     ` Si-Wei Liu
2022-08-16  8:23     ` Michael S. Tsirkin
2022-08-16  8:23       ` Michael S. Tsirkin
     [not found]     ` <f2864c96-cddd-129e-7dd8-a3743fe7e0d0@intel.com>
2022-08-16  8:41       ` Michael S. Tsirkin
2022-08-16  8:41         ` Michael S. Tsirkin
2022-08-16  8:46         ` Zhu, Lingshan
2022-08-16 22:48       ` Si-Wei Liu
     [not found]         ` <a488a17a-b716-52aa-cc31-2e51f8f972d2@intel.com>
2022-08-17  6:14           ` Michael S. Tsirkin
2022-08-17  6:14             ` Michael S. Tsirkin
2022-08-17  6:23             ` Zhu, Lingshan
2022-08-17 18:48               ` Si-Wei Liu
     [not found]                 ` <b2430d3b-4ca2-a19c-da39-da91da732c02@intel.com>
2022-08-19  0:04                   ` Si-Wei Liu [this message]
2022-08-16 21:13     ` Michael S. Tsirkin
2022-08-16 21:13       ` Michael S. Tsirkin
2022-08-16 22:09       ` Si-Wei Liu
2022-08-16 22:09         ` Si-Wei Liu
2022-08-12 10:44 ` [PATCH V5 5/6] vDPA: Conditionally read fields in virtio-net dev " Zhu Lingshan
2022-08-12 10:45 ` [PATCH V5 6/6] fix 'cast to restricted le16' warnings in vdpa.c Zhu Lingshan
2022-08-12 11:14 ` [PATCH V5 0/6] ifcvf/vDPA: support query device config space through netlink Michael S. Tsirkin
2022-08-12 11:14   ` Michael S. Tsirkin
2022-08-12 11:17   ` Michael S. Tsirkin
2022-08-12 11:17     ` Michael S. Tsirkin
2022-08-12 11:41     ` Zhu, Lingshan
2022-08-15  9:36       ` Zhu, Lingshan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5fe9d595-f073-5709-edf6-386bda8a55fd@oracle.com \
    --to=si-wei.liu@oracle.com \
    --cc=gautam.dawar@amd.com \
    --cc=kvm@vger.kernel.org \
    --cc=lingshan.zhu@intel.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xieyongji@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.