All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zheng Xiang <zhengxiang9@huawei.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-kernel@vger.kernel.org>, <tglx@linutronix.de>,
	<jason@lakedaemon.net>, <wanghaibin.wang@huawei.com>
Subject: Re: [PATCH] irqchip/gic-v3-its: Lock its device list during find and create its device
Date: Sat, 2 Feb 2019 09:51:32 +0800	[thread overview]
Message-ID: <e3608521-db59-6047-695a-514528a0a2a9@huawei.com> (raw)
In-Reply-To: <23664397-eb28-4362-b292-091ba190be5e@arm.com>


On 2019/2/1 17:28, Marc Zyngier wrote:
> On 01/02/2019 06:41, Zheng Xiang wrote:
>>
>> On 2019/1/31 23:12, Marc Zyngier wrote:
>>> Hi Zeng,
>>>
>>> On 31/01/2019 14:47, Zheng Xiang wrote:
>>>> Hi Marc,
>>>>
>>>> On 2019/1/29 13:42, Zheng Xiang wrote:
>>>>> On 2019/1/28 21:51, Marc Zyngier wrote:
>>>>>> On 28/01/2019 07:13, Zheng Xiang wrote:
>>>>>>> Hi Marc,
>>>>>>>
>>>>>>> Thanks for your review.
>>>>>>>
>>>>>>> On 2019/1/26 19:38, Marc Zyngier wrote:
>>>>>>>> Hi Zheng,
>>>>>>>>
>>>>>>>> On Sat, 26 Jan 2019 06:16:24 +0000,
>>>>>>>> Zheng Xiang <zhengxiang9@huawei.com> wrote:
>>>>>>>>>
>>>>>>>>> Currently each PCI device under a PCI Bridge shares the same device id
>>>>>>>>> and ITS device. Assume there are two PCI devices call its_msi_prepare
>>>>>>>>> concurrently and they are both going to find and create their ITS
>>>>>>>>> device. There is a chance that the later one couldn't find ITS device
>>>>>>>>> before the other one creating the ITS device. It will cause the later
>>>>>>>>> one to create a different ITS device even if they have the same
>>>>>>>>> device_id.
>>>>>>>>
>>>>>>>> Interesting finding. Is this something you've actually seen in practice
>>>>>>>> with two devices being probed in parallel? Or something that you found
>>>>>>>> by inspection?
>>>>>>>
>>>>>>> Yes, I find this problem after analyzing the reason of VM hung. At last, I
>>>>>>> find that the virtio-gpu cannot receive the MSI interrupts due to sharing
>>>>>>> a same event_id as virtio-serial.
>>>>>>>
>>>>>>> See https://lkml.org/lkml/2019/1/10/299 for the bug report.
>>>>>>>
>>>>>>> This problem can be reproducted with high probability by booting a Qemu/KVM
>>>>>>> VM with a virtio-serial controller and a virtio-gpu adding to a PCI Bridge
>>>>>>> and also adding some delay before creating ITS device.
>>>>>>
>>>>>> Fair enough. Do you mind sharing your QEMU command line? It'd be useful
>>>>>> if I could reproduce it here (and would give me a way to check that it
>>>>>> doesn't regress).
>>>>>
>>>>
>>>> Have you reproduced it with my QEMU command line?
>>>>
>>>> If so, should I send a V2 patch with your suggestion?
>>>
>>> I've queued the following, much more complete patch:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=irq/irqchip-next&id=9791ec7df0e7b4d80706ccea8f24b6542f6059e9
>>>
>>> Can you check that it works for you? I didn't manage to get the right
>>> timing conditions, but I also had issues getting virtio-gpu running on
>>> my TX2, so one might explain the other.
>>>
>>
>> It works for my case, but I worried about the below lines which may
>> cause memory leak.
>>
>> @@ -2627,8 +2640,14 @@ static void its_irq_domain_free(struct irq_domain *domain, unsigned int virq,
>>  		irq_domain_reset_irq_data(data);
>>  	}
>>
>> -	/* If all interrupts have been freed, start mopping the floor */
>> -	if (bitmap_empty(its_dev->event_map.lpi_map,
>> +	mutex_lock(&its->dev_alloc_lock);
>> +
>> +	/*
>> +	 * If all interrupts have been freed, start mopping the
>> +	 * floor. This is conditionned on the device not being shared.
>> +	 */
>> +	if (!its_dev->shared &&
>> +	    bitmap_empty(its_dev->event_map.lpi_map,
>>  			 its_dev->event_map.nr_lpis)) {
>>  		its_lpi_free(its_dev->event_map.lpi_map,
>>  			     its_dev->event_map.lpi_base,
>>
>> It seems that the shared its_dev would never be freed since the value of
>> its_dev->shared is always *true*.
> 
> Yes, and that is on purpose. As we don't refcount the number of
> interrupts that have been requested in the prepare phase, there is a
> race between free and alloc. We can have the following situation:
> 
> CPU0:               CPU1:
> 
> msi_prepare:
> mutex_lock()
> find device()
>   -> found
> store its_dev
> mutex_unlock()
> 
>                     its_irq_domain_free:
>                     mutex_lock()
>                     free_device()
>                     mutex_unlock()
> 
> its_irq_domain_alloc:
> use its_dev -> boom.
> 
> 
> So the trick is not to free the its_dev structure if it shares a devid.
> It is not really a leak, as the next device sharing the same devid will
> pick up the same structure.
> 
> Does it make sense?

Yes, Thanks a lot!


-- 

Thanks,
Xiang



      reply	other threads:[~2019-02-02  1:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-26  6:16 [PATCH] irqchip/gic-v3-its: Lock its device list during find and create its device Zheng Xiang
2019-01-26 11:38 ` Marc Zyngier
2019-01-28  7:13   ` Zheng Xiang
2019-01-28 13:51     ` Marc Zyngier
2019-01-29  5:42       ` Zheng Xiang
2019-01-31 14:47         ` Zheng Xiang
2019-01-31 15:12           ` Marc Zyngier
2019-02-01  6:41             ` Zheng Xiang
2019-02-01  9:28               ` Marc Zyngier
2019-02-02  1:51                 ` Zheng Xiang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e3608521-db59-6047-695a-514528a0a2a9@huawei.com \
    --to=zhengxiang9@huawei.com \
    --cc=jason@lakedaemon.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marc.zyngier@arm.com \
    --cc=tglx@linutronix.de \
    --cc=wanghaibin.wang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.