All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhou Jie <zhoujie2011@cn.fujitsu.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: izumi.taku@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com,
	Chen Fan <fan.chen@easystack.cn>,
	qemu-devel@nongnu.org, mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume
Date: Wed, 6 Jul 2016 10:01:28 +0800	[thread overview]
Message-ID: <24876827-07fc-6f69-6949-7e358720c914@cn.fujitsu.com> (raw)
In-Reply-To: <20160705110300.029985a6@t450s.home>

Hi Alex,

> Due to weekend and holiday in my country, there were zero regular
> working hours between your emails.
I wish you had a good time.

>>> The following code will be modified.
>>> 1. vfio_pci_ioctl
>>>    add a flag in vfio_device_info for workable_state support
>>>    return workable_state in "struct vfio_pci_device" when user get info
>
> Seems like two flags are required, one to indicate the presence of this
> feature and another to indicate the state.  I'd prefer something like
> access_blocked.
User can get the state by state flag.
And also maybe blocked by ioctl or write functons.
User has choice to invoke which functions.

>>> 2. vfio_pci_ioctl
>>>    During err occurs and resume:
>>>    if (cmd == VFIO_DEVICE_SET_IRQS || VFIO_DEVICE_RESET
>>>    || VFIO_DEVICE_GET_PCI_HOT_RESET_INFO || VFIO_DEVICE_PCI_HOT_RESET)
>>>    block for workable_state clearing
>>> 3. vfio_pci_write
>>>    During err occurs and resume:
>>>    ignore and return 0
>
> This is contradictory to your comment "Do nothing for bar regions".
> ISTR that returning 0 for read/write calls is an easy way to break
> users since we've return neither the desired bytes nor an error code.
No, there is not change for read.
Just return 0 for write.
Return 0 mean that there is no byte has been written.
Consider that the aer has occurred,
it is better to not to write any thing to device.
User can still read/write bar regions by mmap address,
this may generate some date errors,
but it doesn't matter as device is going to be reset.

>>> 4. vfio_pci_aer_err_detected
>>>    Set workable_state to false in "struct vfio_pci_device"
>>>    Disable INTx:
>>>      If Disable INTx is support
>>>        disable by PCI_COMMAND
>>>      else
>>>        disable by disable_irq function
>>>    Disable MSI:
>>>        disable by clearing the "Bus Master Enable" bit of PCI_COMMAND
>
> I've suggested repeatedly to properly teardown these interrupts.  I
> disagree with your proposed approach here.  If the device is intended to
> be in a state that requires re-initialization then the interrupt setup
> should be part of that.
I have traced the INTx functions.
-vfio_pci_intx_unmask_handler
-vfio_pci_intx_mask
-vfio_intx_set_signal

They are invoked by User.
-vfio_pci_write
-vfio_pci_ioctl

During err occurs and resume above functions are blocked.
So, User cann't set the INTx.
I will disable the INTx in vfio_pci_aer_err_detected.
And reset the INTx in vfio_pci_aer_resume
according the original user setting(vdev->ctx[0].masked).

>>> 5. vfio_pci_aer_resume
>>>    Set workable_state to true in "struct vfio_pci_device"
>>>    About INTx:
>>>      According to the value of "vdev->ctx[0].masked"
>>>      to decide whether to enable INTx
>>>    About MSI:
>>>      After reset the "Bus Master Enable" bit is default to 0.
>>>      So user should process this after reset.
>
> Again, I think this is error prone, teardown the interrupts and define
> that the device state, including interrupts, needs to be reinitialized
> after error.  Why are you not incorporating this feedback?  Thanks,
The reinitialization depend on user.
For VFIO driver the process is following.
1. aer occurs
2. disable the following functions of the device
    write(except the bar regions), ioctl and interrupt
3. aer driver reset the device
4. renable the device for user
5. user process the aer event
    Maybe reset the device and reinitialization

What I do is make sure the following points.
1. Host can reset the device between step 2 and 4.
2. The user settings is the same at step 1 and 5.

Sincerely
Zhou Jie

  reply	other threads:[~2016-07-06  2:01 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-27  2:12 [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume Zhou Jie
2016-05-27 16:06 ` Alex Williamson
2016-06-12  2:38   ` Zhou Jie
2016-06-20  7:41     ` Zhou Jie
2016-06-20 16:32       ` Alex Williamson
2016-06-21  2:16         ` Zhou Jie
2016-06-21  3:13           ` Alex Williamson
2016-06-21 12:41             ` Chen Fan
2016-06-21 14:44               ` Alex Williamson
2016-06-22  3:28                 ` Zhou Jie
2016-06-22  3:56                   ` Alex Williamson
2016-06-22  5:45                     ` Zhou Jie
2016-06-22  7:49                       ` Zhou Jie
2016-06-22 15:42                         ` Alex Williamson
2016-06-25  1:24                           ` Zhou Jie
2016-06-27 15:54                             ` Alex Williamson
2016-06-28  3:26                               ` Zhou Jie
2016-06-28  3:58                                 ` Alex Williamson
2016-06-28  5:27                                   ` Zhou Jie
2016-06-28 14:40                                     ` Alex Williamson
2016-06-29  8:54                                       ` Zhou Jie
2016-06-29 18:22                                         ` Alex Williamson
2016-06-30  1:45                                           ` Zhou Jie
2016-07-03  4:00                                             ` Zhou Jie
2016-07-05  1:36                                               ` Zhou Jie
2016-07-05 17:03                                                 ` Alex Williamson
2016-07-06  2:01                                                   ` Zhou Jie [this message]
2016-07-07 19:04                                                     ` Alex Williamson
2016-07-08  1:38                                                       ` Zhou Jie
2016-07-08 17:33                                                         ` Alex Williamson
2016-07-10  1:28                                                           ` Zhou Jie
2016-07-11 16:24                                                             ` Alex Williamson
2016-07-12  1:42                                                               ` Zhou Jie
2016-07-12 15:45                                                                 ` Alex Williamson
2016-07-13  1:04                                                                   ` Zhou Jie
2016-07-13  2:54                                                                     ` Alex Williamson
2016-07-13  3:33                                                                       ` Zhou Jie
2016-06-22 15:25                       ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=24876827-07fc-6f69-6949-7e358720c914@cn.fujitsu.com \
    --to=zhoujie2011@cn.fujitsu.com \
    --cc=alex.williamson@redhat.com \
    --cc=caoj.fnst@cn.fujitsu.com \
    --cc=fan.chen@easystack.cn \
    --cc=izumi.taku@jp.fujitsu.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.