linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: luoben <luoben@linux.alibaba.com>
To: Yunsheng Lin <linyunsheng@huawei.com>,
	tglx@linutronix.de, alex.williamson@redhat.com,
	linux-kernel@vger.kernel.org
Cc: tao.ma@linux.alibaba.com, gerry@linux.alibaba.com
Subject: Re: [PATCH 2/2] vfio_pci: make use of update_irq_devid and optimize irq ops
Date: Mon, 12 Aug 2019 14:37:57 +0800	[thread overview]
Message-ID: <8988bbb2-b089-5a18-e8d0-1e7339832364@linux.alibaba.com> (raw)
In-Reply-To: <6b11b0fa-06a9-fd92-084c-faaca116dc74@huawei.com>


在 2019/8/12 下午1:44, Yunsheng Lin 写道:
> On 2019/8/8 20:07, Ben Luo wrote:
>> When userspace (e.g. qemu) triggers a switch between KVM
>> irqfd and userspace eventfd, only dev_id of irq action
>> (i.e. the "trigger" in this patch's context) will be
>> changed, but a free-then-request-irq action is taken in
>> current code. And, irq affinity setting in VM will also
>> trigger a free-then-request-irq action, which actully
>> changes nothing, but only fires a producer re-registration
>> to update irte in case that posted-interrupt is in use.
>>
>> This patch makes use of update_irq_devid() and optimize
>> both cases above, which reduces the risk of losing interrupt
>> and also cuts some overhead.
>>
>> Signed-off-by: Ben Luo <luoben@linux.alibaba.com>
>> Reviewed-by: Liu Jiang <gerry@linux.alibaba.com>
>> ---
>>   drivers/vfio/pci/vfio_pci_intrs.c | 100 +++++++++++++++++++++++---------------
>>   1 file changed, 62 insertions(+), 38 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
>> index 3fa3f72..1323dc8 100644
>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>> @@ -285,69 +285,93 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
>>   				      int vector, int fd, bool msix)
>>   {
>>   	struct pci_dev *pdev = vdev->pdev;
>> -	struct eventfd_ctx *trigger;
>> +	struct eventfd_ctx *trigger = NULL;
> struct eventfd_ctx *trigger = NULL;
> struct pci_dev *pdev = vdev->pdev;
>
> to maintain reverse christmas tree?
ok,  keep reverse christmas tree in v2
>
>>   	int irq, ret;
>>   
>>   	if (vector < 0 || vector >= vdev->num_ctx)
>>   		return -EINVAL;
>>   
>> +	if (fd >= 0) {
>> +		trigger = eventfd_ctx_fdget(fd);
>> +		if (IS_ERR(trigger))
>> +			return PTR_ERR(trigger);
> It seems vdev->ctx[vector].trigger is freed even if  fd < 0 before
> this patch. If it return here, vdev->ctx[vector].trigger is not free?

if fd < 0, it won't enter here

if fd >= 0,  I think it's better to return and leave everything as it 
was, and

let the caller to deal with this bad fd case and disable msi to free the 
resouces if it wants

>
>> +	}
>> +
>>   	irq = pci_irq_vector(pdev, vector);
>>   
>>   	if (vdev->ctx[vector].trigger) {
>> -		free_irq(irq, vdev->ctx[vector].trigger);
>> -		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
>> -		kfree(vdev->ctx[vector].name);
>> -		eventfd_ctx_put(vdev->ctx[vector].trigger);
>> -		vdev->ctx[vector].trigger = NULL;
>> +		if (vdev->ctx[vector].trigger != trigger) {
>> +			if (trigger) {
>> +				ret = update_irq_devid(irq,
>> +						vdev->ctx[vector].trigger, trigger);
>> +				if (unlikely(ret)) {
>> +					dev_info(&pdev->dev,
>> +							"update_irq_devid %d (token %p) fails: %d\n",
>> +							irq, vdev->ctx[vector].trigger, ret);
>> +					eventfd_ctx_put(trigger);
>> +					return ret;
>> +				}
>> +				irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
>> +				eventfd_ctx_put(vdev->ctx[vector].trigger);
>> +				vdev->ctx[vector].producer.token = trigger;
>> +				vdev->ctx[vector].trigger = trigger;
>> +			} else {
>> +				free_irq(irq, vdev->ctx[vector].trigger);
>> +				irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
>> +				kfree(vdev->ctx[vector].name);
>> +				eventfd_ctx_put(vdev->ctx[vector].trigger);
>> +				vdev->ctx[vector].trigger = NULL;
>> +			}
>> +		} else
>> +			irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
>>   	}
> Maybe adjust it a litte to reduce indent and to improve readability?
>
> 	if (vdev->ctx[vector].trigger && vdev->ctx[vector].trigger == trigger) {
> 		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
> 	} else if (vdev->ctx[vector].trigger && !trigger) {
> 		free_irq(irq, vdev->ctx[vector].trigger);
> 		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
> 		kfree(vdev->ctx[vector].name);
> 		eventfd_ctx_put(vdev->ctx[vector].trigger);
> 		vdev->ctx[vector].trigger = NULL;
> 	} else if (vdev->ctx[vector].trigger) {
> 		ret = update_irq_devid(irq, vdev->ctx[vector].trigger, trigger);
> 		if (unlikely(ret)) {
> 			dev_info(&pdev->dev,
> 				 "update_irq_devid %d (token %p) fails: %d\n",
> 				 irq, vdev->ctx[vector].trigger, ret);
> 				 eventfd_ctx_put(trigger);
> 				 return ret;
> 		}
> 		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
> 		eventfd_ctx_put(vdev->ctx[vector].trigger);
> 		vdev->ctx[vector].producer.token = trigger;
> 		vdev->ctx[vector].trigger = trigger;
> 	}
>
I will reformat this chunk in v2
>>   
>>   	if (fd < 0)
>>   		return 0;
>>   
>> -	vdev->ctx[vector].name = kasprintf(GFP_KERNEL, "vfio-msi%s[%d](%s)",
>> -					   msix ? "x" : "", vector,
>> -					   pci_name(pdev));
>> -	if (!vdev->ctx[vector].name)
>> -		return -ENOMEM;
>> +	if (vdev->ctx[vector].trigger == NULL) {
> It may be common to use the below check to do NULL checking:
> If (!vdev->ctx[vector].trigger)
ok, make it this way in v2
>
>> +		vdev->ctx[vector].name = kasprintf(GFP_KERNEL, "vfio-msi%s[%d](%s)",
>> +						   msix ? "x" : "", vector,
>> +						   pci_name(pdev));
>> +		if (!vdev->ctx[vector].name) {
>> +			eventfd_ctx_put(trigger);
>> +			return -ENOMEM;
>> +		}
>>   
>> -	trigger = eventfd_ctx_fdget(fd);
>> -	if (IS_ERR(trigger)) {
>> -		kfree(vdev->ctx[vector].name);
>> -		return PTR_ERR(trigger);
>> -	}
>> +		/*
>> +		 * The MSIx vector table resides in device memory which may be cleared
>> +		 * via backdoor resets. We don't allow direct access to the vector
>> +		 * table so even if a userspace driver attempts to save/restore around
>> +		 * such a reset it would be unsuccessful. To avoid this, restore the
>> +		 * cached value of the message prior to enabling.
>> +		 */
>> +		if (msix) {
>> +			struct msi_msg msg;
>>   
>> -	/*
>> -	 * The MSIx vector table resides in device memory which may be cleared
>> -	 * via backdoor resets. We don't allow direct access to the vector
>> -	 * table so even if a userspace driver attempts to save/restore around
>> -	 * such a reset it would be unsuccessful. To avoid this, restore the
>> -	 * cached value of the message prior to enabling.
>> -	 */
>> -	if (msix) {
>> -		struct msi_msg msg;
>> +			get_cached_msi_msg(irq, &msg);
>> +			pci_write_msi_msg(irq, &msg);
>> +		}
>>   
>> -		get_cached_msi_msg(irq, &msg);
>> -		pci_write_msi_msg(irq, &msg);
>> -	}
>> +		ret = request_irq(irq, vfio_msihandler, 0,
>> +				  vdev->ctx[vector].name, trigger);
>> +		if (ret) {
>> +			kfree(vdev->ctx[vector].name);
>> +			eventfd_ctx_put(trigger);
>> +			return ret;
>> +		}
>>   
>> -	ret = request_irq(irq, vfio_msihandler, 0,
>> -			  vdev->ctx[vector].name, trigger);
>> -	if (ret) {
>> -		kfree(vdev->ctx[vector].name);
>> -		eventfd_ctx_put(trigger);
>> -		return ret;
>> +		vdev->ctx[vector].producer.token = trigger;
>> +		vdev->ctx[vector].producer.irq = irq;
>> +		vdev->ctx[vector].trigger = trigger;
>>   	}
>>   
>> -	vdev->ctx[vector].producer.token = trigger;
>> -	vdev->ctx[vector].producer.irq = irq;
>> +	/* always update irte for posted mode */
>>   	ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
>>   	if (unlikely(ret))
>>   		dev_info(&pdev->dev,
>>   		"irq bypass producer (token %p) registration fails: %d\n",
>>   		vdev->ctx[vector].producer.token, ret);
>>   
>> -	vdev->ctx[vector].trigger = trigger;
>> -
>>   	return 0;
>>   }
>>   

Thanks,

     Ben


  reply	other threads:[~2019-08-12  6:38 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-08 12:07 [PATCH 0/2] introduce update_irq_devid and optimize VFIO irq ops Ben Luo
2019-08-08 12:07 ` [PATCH 1/2] genirq: introduce update_irq_devid() Ben Luo
2019-08-08 19:56   ` Thomas Gleixner
2019-08-12  2:59     ` luoben
2019-08-08 12:07 ` [PATCH 2/2] vfio_pci: make use of update_irq_devid and optimize irq ops Ben Luo
2019-08-12  5:44   ` Yunsheng Lin
2019-08-12  6:37     ` luoben [this message]
2019-08-12  8:52 ` [PATCH v2 0/3] introduce update_irq_devid and optimize VFIO " Ben Luo
2019-08-12  8:52 ` [PATCH v2 1/3] genirq: enhance error recovery code in free irq Ben Luo
2019-08-12  8:52 ` [PATCH v2 2/3] genirq: introduce update_irq_devid() Ben Luo
2019-08-12  8:52 ` [PATCH v2 3/3] vfio_pci: make use of update_irq_devid and optimize irq ops Ben Luo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8988bbb2-b089-5a18-e8d0-1e7339832364@linux.alibaba.com \
    --to=luoben@linux.alibaba.com \
    --cc=alex.williamson@redhat.com \
    --cc=gerry@linux.alibaba.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linyunsheng@huawei.com \
    --cc=tao.ma@linux.alibaba.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).