linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Cao jin <caoj.fnst@cn.fujitsu.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <linux-kernel@vger.kernel.org>, <kvm@vger.kernel.org>,
	<izumi.taku@jp.fujitsu.com>, <alex.williamson@redhat.com>
Subject: Re: [PATCH] vfio/pci: Support error recovery
Date: Thu, 1 Dec 2016 21:38:31 +0800	[thread overview]
Message-ID: <584027D7.7080802@cn.fujitsu.com> (raw)
In-Reply-To: <20161130033533-mutt-send-email-mst@kernel.org>



On 11/30/2016 09:46 AM, Michael S. Tsirkin wrote:
> On Mon, Nov 28, 2016 at 05:32:15PM +0800, Cao jin wrote:
>>
>>

>>>
>>>> +	if (severity == AER_FATAL && strcmp(dev->driver->name, "vfio-pci")) {
>>>
>>> You really want some flag in the device, or something similar.
>>> Also, how do we know driver is not going away at this point?
>>>
>>
>> I didn't think of this condition, and I don't quite follow how would driver
>> go away?(device has error happened, then is removed?)
> 
> Yes - hotplug request detected. Does something prevent this?
> 

I wasn't realized there would be possible to have hotplug happened
during error recovery. But, it seems is the aer driver's thing to
consider it?

>>>>   		result = reset_link(dev);
>>>>   		if (result != PCI_ERS_RESULT_RECOVERED)
>>>>   			goto failed;
>>
>>>> @@ -1187,10 +1200,30 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
>>>>   		return PCI_ERS_RESULT_DISCONNECT;
>>>>   	}
>>>>
>>>> +	/* get device's uncorrectable error status as soon as possible,
>>>> +	 * and signal it to user space. The later we read it, the possibility
>>>> +	 * the register value is mangled grows. */
>>>> +	aer_cap_offset = pci_find_ext_capability(vdev->pdev, PCI_EXT_CAP_ID_ERR);
>>>> +	ret = pci_read_config_dword(vdev->pdev, aer_cap_offset +
>>>> +                                    PCI_ERR_UNCOR_STATUS, &uncor_status);
>>>> +        if (ret)
>>>> +                return PCI_ERS_RESULT_DISCONNECT;
>>>> +
>>>> +	pr_err("device %d got AER detect notification. uncorrectable error status = 0x%x\n", pdev->devfn, uncor_status);//to be removed
>>>>   	mutex_lock(&vdev->igate);
>>>> +
>>>> +	vdev->aer_recovering = true;
>>>> +	reinit_completion(&vdev->aer_error_completion);
>>>> +
>>>> +	/* suspend config space access from user space,
>>>> +	 * when vfio-pci's error recovery process is on */
>>>
>>> what about access to memory etc? Do you need to suspend this as well?
>>>
>>
>> Yes, this question came into my mind a little bit, but I didn't see some
>> existing APIs like pci_cfg_access_xxx which can help to do this.(I am still
>> not familiar with kernel)
> 
> This isn't easy to do at all.
> 

I guess we can use completion in vfio_pci_rw to block all kinds of
access during vfio's error recovery. But from another perspective, now
that we have disabled reset_link in host, I think the access to device
could be performed normally. To me, the benefit of using pci_cfg_access
is: we won't bother to do "wait 3s to do guest's link reset"

> 
>>>> +	pci_cfg_access_trylock(vdev->pdev);
>>>
>>> If you trylock, you need to handle failure.
>>
>> try lock returns 0 if access is already locked, 1 otherwise. Is it necessary
>> to check its return value?
> 
> Locked by whom? You blissfully access as if it's locked by you.
> 
>>
>> -- 
>> Sincerely,
>> Cao jin
>>
> 
> 
> .
> 

-- 
Sincerely,
Cao jin

  reply	other threads:[~2016-12-01 13:34 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-27 11:34 [PATCH] vfio/pci: Support error recovery Cao jin
2016-11-28  3:00 ` Michael S. Tsirkin
2016-11-28  9:32   ` Cao jin
2016-11-30  1:46     ` Michael S. Tsirkin
2016-12-01 13:38       ` Cao jin [this message]
2016-12-01  4:04 ` Alex Williamson
2016-12-01  4:51   ` Michael S. Tsirkin
2016-12-01 13:40     ` Cao jin
2016-12-06  3:46       ` Michael S. Tsirkin
2016-12-06  6:47         ` Cao jin
2016-12-01 13:40   ` Cao jin
2016-12-01 14:55     ` Alex Williamson
2016-12-04 12:16       ` Cao jin
2016-12-04 15:30         ` Alex Williamson
2016-12-05  5:52           ` Cao jin
2016-12-05 16:17             ` Alex Williamson
2016-12-06  3:55               ` Michael S. Tsirkin
2016-12-06  4:59                 ` Alex Williamson
2016-12-06 10:46                   ` Cao jin
2016-12-06 15:35                     ` Alex Williamson
2016-12-07  2:49                       ` Cao jin
2016-12-08 14:46                       ` Cao jin
2016-12-08 16:30                         ` Michael S. Tsirkin
2016-12-09  3:40                           ` Cao jin
2016-12-09  3:40                         ` Cao jin
2016-12-06  6:11               ` Cao jin
2016-12-06 15:25                 ` Alex Williamson
2016-12-07  2:58                   ` Cao jin
2016-12-12 13:49 ` Cao jin
2016-12-12 19:12   ` Alex Williamson
2016-12-12 22:29     ` Michael S. Tsirkin
2016-12-12 22:43       ` Alex Williamson
2016-12-13  3:15         ` Michael S. Tsirkin
2016-12-13  3:39           ` Alex Williamson
2016-12-13 16:12             ` Michael S. Tsirkin
2016-12-13 16:27               ` Alex Williamson
2016-12-14  1:58                 ` Michael S. Tsirkin
2016-12-14  3:00                   ` Alex Williamson
2016-12-14 22:20                     ` Michael S. Tsirkin
2016-12-14 22:47                       ` Alex Williamson
2016-12-14 23:00                         ` Michael S. Tsirkin
2016-12-14 23:32                           ` Alex Williamson
2016-12-14 10:24     ` Cao jin
2016-12-14 22:16       ` Alex Williamson
2016-12-14 22:25         ` Michael S. Tsirkin
2016-12-14 22:49           ` Alex Williamson
2016-12-15 13:56         ` Cao jin
2016-12-15 14:50           ` Michael S. Tsirkin
2016-12-15 22:01             ` Alex Williamson
2016-12-16 10:15               ` Cao jin
2016-12-16 10:15             ` Cao jin
2016-12-15 17:02           ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=584027D7.7080802@cn.fujitsu.com \
    --to=caoj.fnst@cn.fujitsu.com \
    --cc=alex.williamson@redhat.com \
    --cc=izumi.taku@jp.fujitsu.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).