kvm.vger.kernel.org archive mirror
From: Yishai Hadas <yishaih@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>, "Zeng, Xin" <xin.zeng@intel.com>
Cc: "herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"shameerali.kolothum.thodi@huawei.com"
	<shameerali.kolothum.thodi@huawei.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	qat-linux <qat-linux@intel.com>,
	"Cao, Yahui" <yahui.cao@intel.com>
Subject: Re: [PATCH 10/10] vfio/qat: Add vfio_pci driver for Intel QAT VF devices
Date: Wed, 21 Feb 2024 17:11:10 +0200
Message-ID: <461b50e9-c539-46c0-96ad-d379da581d8c@nvidia.com>
In-Reply-To: <20240221131856.GS13330@nvidia.com>

On 21/02/2024 15:18, Jason Gunthorpe wrote:
> On Wed, Feb 21, 2024 at 08:44:31AM +0000, Zeng, Xin wrote:
>> On Wednesday, February 21, 2024 1:03 AM, Jason Gunthorpe wrote:
>>> On Tue, Feb 20, 2024 at 03:53:08PM +0000, Zeng, Xin wrote:
>>>> On Tuesday, February 20, 2024 9:25 PM, Jason Gunthorpe wrote:
>>>>>
>>>>> On Sat, Feb 17, 2024 at 04:20:20PM +0000, Zeng, Xin wrote:
>>>>>
>>>>>> Thanks for this information, but this flow is not clear to me why it
>>>>>> cause deadlock. From this flow, CPU0 is not waiting for any resource
>>>>>> held by CPU1, so after CPU0 releases mmap_lock, CPU1 can continue
>>>>>> to run. Am I missing something?
>>>>>
>>>>> At some point it was calling copy_to_user() under the state
>>>>> mutex. These days it doesn't.
>>>>>
>>>>> copy_to_user() would nest the mm_lock under the state mutex which is
>>>>> a locking inversion.
>>>>>
>>>>> So I wonder if we still have this problem now that the copy_to_user()
>>>>> is not under the mutex?
>>>>
>>>> In protocol v2, we still have the scenario in precopy_ioctl where
>>>> copy_to_user is called under state_mutex.
>>>
>>> Why? Does mlx5 do that? It looked Ok to me:
>>>
>>>          mlx5vf_state_mutex_unlock(mvdev);
>>>          if (copy_to_user((void __user *)arg, &info, minsz))
>>>                  return -EFAULT;
>>
>> Indeed, thanks, Jason. BTW, is there any reason why was "deferred_reset" mode
>> still implemented in mlx5 driver given this deadlock condition has been avoided
>> with migration protocol v2 implementation.
> 
> I do not remember. Yishai?
> 

A long time has passed, and I also don't fully remember whether this was
the only potential problem here; it may well have been.

My plan is to prepare a cleanup patch for mlx5 and run it in our
regression for a while. If all goes well, I may send it for the next
kernel cycle.

By the way, there are other drivers (e.g. hisi and mtty) that still call
copy_to_user() under the state mutex and might hit the problem without
'deferred_reset'; see [1] and [2].

If we conclude that the only reason for that mechanism was
copy_to_user() under the state_mutex, those drivers can easily change
their code and also send a patch to clean up 'deferred_reset'.
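
To illustrate the kind of change meant here, below is a minimal userspace
sketch, not actual driver code. It assumes a pthread mutex as a stand-in
for the state mutex and a memcpy()-based helper as a stand-in for
copy_to_user(); every demo_* name is hypothetical. The point is only the
ordering: snapshot the data into a local under the mutex, drop the mutex,
and only then do the copy that may fault and take the mm lock.

```c
/* Userspace analogue only: all demo_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

struct demo_precopy_info {
	unsigned long initial_bytes;
	unsigned long dirty_bytes;
};

struct demo_dev {
	pthread_mutex_t state_mutex;	/* stands in for the driver's state mutex */
	unsigned long initial_bytes;
	unsigned long dirty_bytes;
};

/*
 * Stand-in for copy_to_user(): returns the number of bytes NOT copied.
 * A NULL destination models a faulting user pointer.
 */
static unsigned long demo_copy_to_user(void *to, const void *from,
				       unsigned long n)
{
	if (!to)
		return n;
	memcpy(to, from, n);
	return 0;
}

static int demo_precopy_ioctl(struct demo_dev *dev, void *user_arg)
{
	struct demo_precopy_info info;

	assert(dev != NULL);

	/* Snapshot the device state into a local while holding the mutex. */
	pthread_mutex_lock(&dev->state_mutex);
	info.initial_bytes = dev->initial_bytes;
	info.dirty_bytes = dev->dirty_bytes;
	pthread_mutex_unlock(&dev->state_mutex);

	/* The copy runs only after the mutex is dropped, so nothing nests. */
	if (demo_copy_to_user(user_arg, &info, sizeof(info)))
		return -EFAULT;
	return 0;
}
```

This is the same ordering as the mlx5 snippet quoted above (unlock first,
then copy_to_user()), at the cost of one small on-stack copy of the info
structure.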

[1] https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c#L808
[2] https://elixir.bootlin.com/linux/v6.8-rc5/source/samples/vfio-mdev/mtty.c#L878

Yishai





