From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Oleksandr Tyshchenko" <Oleksandr_Tyshchenko@epam.com>,
"Volodymyr Babchuk" <Volodymyr_Babchuk@epam.com>,
"Artem Mygaiev" <Artem_Mygaiev@epam.com>,
"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
"george.dunlap@citrix.com" <george.dunlap@citrix.com>,
"paul@xen.org" <paul@xen.org>,
"Bertrand Marquis" <bertrand.marquis@arm.com>,
"Roger Pau Monné" <roger.pau@citrix.com>,
"julien@xen.org" <julien@xen.org>,
"Rahul Singh" <rahul.singh@arm.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"Stefano Stabellini" <sstabellini@kernel.org>
Subject: Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal
Date: Fri, 19 Nov 2021 13:34:51 +0000 [thread overview]
Message-ID: <4e74d5e1-f54e-3888-ada2-a38545571744@epam.com> (raw)
In-Reply-To: <fbc8a807-ea73-760f-8c75-df0930a5139e@suse.com>
On 19.11.21 15:25, Jan Beulich wrote:
> On 19.11.2021 14:16, Oleksandr Andrushchenko wrote:
>> On 19.11.21 15:00, Jan Beulich wrote:
>>> On 19.11.2021 13:34, Oleksandr Andrushchenko wrote:
>>>> Possible locking and other work needed:
>>>> =======================================
>>>>
>>>> 1. pcidevs_{lock|unlock} is too heavy and is per-host
>>>> 2. pdev->vpci->lock cannot be used as vpci is freed by vpci_remove_device
>>>> 3. We may want a dedicated per-domain rw lock to be implemented:
>>>>
>>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>>> index 28146ee404e6..ebf071893b21 100644
>>>> --- a/xen/include/xen/sched.h
>>>> +++ b/xen/include/xen/sched.h
>>>> @@ -444,6 +444,7 @@ struct domain
>>>>
>>>> #ifdef CONFIG_HAS_PCI
>>>> struct list_head pdev_list;
>>>> + rwlock_t vpci_rwlock;
>>>> + bool vpci_terminating; <- atomic?
>>>> #endif
>>>> then vpci_remove_device is a writer (cold path) and vpci_process_pending and
>>>> vpci_mmio_{read|write} are readers (hot path).
>>> Right - you need such a lock for other purposes anyway, as per the
>>> discussion with Julien.
>> What about bool vpci_terminating? Do you see it as an atomic type or just bool?
> Having seen only ...
>
>>>> do_physdev_op(PHYSDEVOP_pci_device_remove) will need hypercall_create_continuation
>>>> to be implemented, so when re-start removal if need be:
>>>>
>>>> vpci_remove_device()
>>>> {
>>>> d->vpci_terminating = true;
> ... this use so far, I can't tell yet. But at a first glance a boolean
> looks to be what you need.
>
>>>> remove vPCI register handlers <- this will cut off PCI_COMMAND emulation among others
>>>> if ( !write_trylock(d->vpci_rwlock) )
>>>> return -ERESTART;
>>>> xfree(pdev->vpci);
>>>> pdev->vpci = NULL;
>>>> }
>>>>
>>>> Then this d->vpci_rwlock becomes a dedicated vpci per-domain lock for
>>>> other operations which may require it, e.g. virtual bus topology can
>>>> use it when assigning vSBDF etc.
>>>>
>>>> 4. vpci_remove_device needs to be removed from vpci_process_pending
>>>> and do nothing for Dom0 and crash DomU otherwise:
>>> Why is this? I'm not outright opposed, but I don't immediately see why
>>> trying to remove the problematic device wouldn't be a reasonable course
>>> of action anymore. vpci_remove_device() may need to become more careful
>>> as to not crashing,
>> vpci_remove_device does not crash, vpci_process_pending does
>>> though.
>> Assume we are in an error state in vpci_process_pending *on one of the vCPUs*
>> and we call vpci_remove_device. vpci_remove_device tries to acquire the
>> lock and it can't just because there are some other vpci code is running on other vCPU.
>> Then what do we do here? We are in SoftIRQ context now and we can't spin
>> trying to acquire d->vpci_rwlock forever. Neither we can blindly free vpci
>> structure because it is seen by all vCPUs and may crash them.
>>
>> If vpci_remove_device is in hypercall context it just returns -ERESTART and
>> hypercall continuation helps here. But not in SoftIRQ context.
> Maybe then you want to invoke this cleanup from RCU context (whether
> vpci_remove_device() itself or a suitable clone there of is TBD)? (I
> will admit though that I didn't check whether that would satisfy all
> constraints.)
>
> Then again it also hasn't become clear to me why you use write_trylock()
> there. The lock contention you describe doesn't, on the surface, look
> any different from situations elsewhere.
I use write_trylock in vpci_remove_device because if we can't
acquire the lock then we defer device removal. This would work
well if called from a hypercall which will employ hypercall continuation.
But SoftIRQ getting -ERESTART is something that we can't probably
handle by restarting as hypercall can, thus I only see that vpci_process_pending
will need to spin and wait until vpci_remove_device succeeds.
>
> Jan
>
Thank you,
Oleksandr
next prev parent reply other threads:[~2021-11-19 13:35 UTC|newest]
Thread overview: 101+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-05 6:56 [PATCH v4 00/11] PCI devices passthrough on Arm, part 3 Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 01/11] vpci: fix function attributes for vpci_process_pending Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal Oleksandr Andrushchenko
2021-11-15 16:56 ` Jan Beulich
2021-11-16 7:32 ` Oleksandr Andrushchenko
2021-11-16 8:01 ` Jan Beulich
2021-11-16 8:23 ` Oleksandr Andrushchenko
2021-11-16 11:38 ` Jan Beulich
2021-11-16 13:27 ` Oleksandr Andrushchenko
2021-11-16 14:11 ` Jan Beulich
2021-11-16 13:41 ` Oleksandr Andrushchenko
2021-11-16 14:12 ` Jan Beulich
2021-11-16 14:24 ` Oleksandr Andrushchenko
2021-11-16 14:37 ` Oleksandr Andrushchenko
2021-11-16 16:09 ` Jan Beulich
2021-11-16 18:02 ` Julien Grall
2021-11-18 12:57 ` Oleksandr Andrushchenko
2021-11-17 8:28 ` Jan Beulich
2021-11-18 7:49 ` Oleksandr Andrushchenko
2021-11-18 8:36 ` Jan Beulich
2021-11-18 8:54 ` Oleksandr Andrushchenko
2021-11-18 9:15 ` Jan Beulich
2021-11-18 9:32 ` Oleksandr Andrushchenko
2021-11-18 13:25 ` Jan Beulich
2021-11-18 13:48 ` Oleksandr Andrushchenko
2021-11-18 14:04 ` Roger Pau Monné
2021-11-18 14:14 ` Oleksandr Andrushchenko
2021-11-18 14:35 ` Jan Beulich
2021-11-18 15:11 ` Oleksandr Andrushchenko
2021-11-18 15:16 ` Jan Beulich
2021-11-18 15:21 ` Oleksandr Andrushchenko
2021-11-18 15:41 ` Jan Beulich
2021-11-18 15:46 ` Oleksandr Andrushchenko
2021-11-18 15:53 ` Jan Beulich
2021-11-19 12:34 ` Oleksandr Andrushchenko
2021-11-19 13:00 ` Jan Beulich
2021-11-19 13:16 ` Oleksandr Andrushchenko
2021-11-19 13:25 ` Jan Beulich
2021-11-19 13:34 ` Oleksandr Andrushchenko [this message]
2021-11-22 14:21 ` Oleksandr Andrushchenko
2021-11-22 14:37 ` Jan Beulich
2021-11-22 14:45 ` Oleksandr Andrushchenko
2021-11-22 14:57 ` Jan Beulich
2021-11-22 15:02 ` Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 03/11] vpci: make vpci registers removal a dedicated function Oleksandr Andrushchenko
2021-11-15 16:57 ` Jan Beulich
2021-11-16 8:02 ` Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 04/11] vpci: add hooks for PCI device assign/de-assign Oleksandr Andrushchenko
2021-11-15 17:06 ` Jan Beulich
2021-11-16 9:38 ` Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 05/11] vpci/header: implement guest BAR register handlers Oleksandr Andrushchenko
2021-11-19 11:58 ` Jan Beulich
2021-11-19 12:10 ` Oleksandr Andrushchenko
2021-11-19 12:37 ` Jan Beulich
2021-11-19 12:46 ` Oleksandr Andrushchenko
2021-11-19 12:49 ` Jan Beulich
2021-11-19 12:54 ` Oleksandr Andrushchenko
2021-11-19 13:02 ` Jan Beulich
2021-11-19 13:17 ` Oleksandr Andrushchenko
2021-11-23 15:14 ` Oleksandr Andrushchenko
2021-11-24 12:32 ` Roger Pau Monné
2021-11-24 12:36 ` Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 06/11] vpci/header: handle p2m range sets per BAR Oleksandr Andrushchenko
2021-11-19 12:05 ` Jan Beulich
2021-11-19 12:13 ` Oleksandr Andrushchenko
2021-11-19 12:45 ` Jan Beulich
2021-11-19 12:50 ` Oleksandr Andrushchenko
2021-11-19 13:06 ` Jan Beulich
2021-11-19 13:19 ` Oleksandr Andrushchenko
2021-11-19 13:29 ` Jan Beulich
2021-11-19 13:38 ` Oleksandr Andrushchenko
2021-11-19 13:16 ` Jan Beulich
2021-11-19 13:41 ` Oleksandr Andrushchenko
2021-11-19 13:57 ` Jan Beulich
2021-11-19 14:09 ` Oleksandr Andrushchenko
2021-11-22 8:24 ` Jan Beulich
2021-11-22 8:31 ` Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 07/11] vpci/header: program p2m with guest BAR view Oleksandr Andrushchenko
2021-11-19 12:33 ` Jan Beulich
2021-11-19 12:44 ` Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 08/11] vpci/header: emulate PCI_COMMAND register for guests Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 09/11] vpci/header: reset the command register when adding devices Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 10/11] vpci: add initial support for virtual PCI bus topology Oleksandr Andrushchenko
2021-11-18 16:45 ` Jan Beulich
2021-11-24 11:28 ` Oleksandr Andrushchenko
2021-11-24 12:36 ` Roger Pau Monné
2021-11-24 12:43 ` Oleksandr Andrushchenko
2021-11-05 6:56 ` [PATCH v4 11/11] xen/arm: translate virtual PCI bus topology for guests Oleksandr Andrushchenko
2021-11-08 11:10 ` Jan Beulich
2021-11-08 11:16 ` Oleksandr Andrushchenko
2021-11-08 14:23 ` Roger Pau Monné
2021-11-08 15:28 ` Oleksandr Andrushchenko
2021-11-24 11:31 ` Oleksandr Andrushchenko
2021-11-19 13:56 ` [PATCH v4 00/11] PCI devices passthrough on Arm, part 3 Jan Beulich
2021-11-19 14:06 ` Oleksandr Andrushchenko
2021-11-19 14:23 ` Roger Pau Monné
2021-11-19 14:26 ` Oleksandr Andrushchenko
2021-11-20 9:47 ` Roger Pau Monné
2021-11-22 8:22 ` Jan Beulich
2021-11-22 8:34 ` Oleksandr Andrushchenko
2021-11-22 8:44 ` Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4e74d5e1-f54e-3888-ada2-a38545571744@epam.com \
--to=oleksandr_andrushchenko@epam.com \
--cc=Artem_Mygaiev@epam.com \
--cc=Oleksandr_Tyshchenko@epam.com \
--cc=Volodymyr_Babchuk@epam.com \
--cc=andrew.cooper3@citrix.com \
--cc=bertrand.marquis@arm.com \
--cc=george.dunlap@citrix.com \
--cc=jbeulich@suse.com \
--cc=julien@xen.org \
--cc=paul@xen.org \
--cc=rahul.singh@arm.com \
--cc=roger.pau@citrix.com \
--cc=sstabellini@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).