From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com>
To: "Jan Beulich" <jbeulich@suse.com>,
"Roger Pau Monné" <roger.pau@citrix.com>
Cc: "julien@xen.org" <julien@xen.org>,
"sstabellini@kernel.org" <sstabellini@kernel.org>,
Oleksandr Tyshchenko <Oleksandr_Tyshchenko@epam.com>,
Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>,
Artem Mygaiev <Artem_Mygaiev@epam.com>,
"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
"george.dunlap@citrix.com" <george.dunlap@citrix.com>,
"paul@xen.org" <paul@xen.org>,
Bertrand Marquis <bertrand.marquis@arm.com>,
Rahul Singh <rahul.singh@arm.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v6 03/13] vpci: move lock outside of struct vpci
Date: Tue, 8 Feb 2022 07:35:34 +0000 [thread overview]
Message-ID: <e91965c5-0802-adf8-0c17-522f86ebf231@epam.com> (raw)
In-Reply-To: <be3f4222-6580-4c89-a202-c003b6feb9b4@epam.com>
On 07.02.22 18:44, Oleksandr Andrushchenko wrote:
>
> On 07.02.22 18:37, Jan Beulich wrote:
>> On 07.02.2022 17:21, Oleksandr Andrushchenko wrote:
>>> On 07.02.22 18:15, Jan Beulich wrote:
>>>> On 07.02.2022 17:07, Oleksandr Andrushchenko wrote:
>>>>> On 07.02.22 17:26, Jan Beulich wrote:
>>>>>> 1b. Make vpci_write use write lock for writes to command register and BARs
>>>>>> only; keep using the read lock for all other writes.
>>>>> I am not quite sure how to do that. Do you mean something like:
>>>>> void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>>>>> uint32_t data)
>>>>> [snip]
>>>>> list_for_each_entry ( r, &pdev->vpci->handlers, node )
>>>>> {
>>>>> [snip]
>>>>> if ( r->needs_write_lock)
>>>>> write_lock(d->vpci_lock)
>>>>> else
>>>>> read_lock(d->vpci_lock)
>>>>> ....
>>>>>
>>>>> And provide rw as an argument to:
>>>>>
>>>>> int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler,
>>>>> vpci_write_t *write_handler, unsigned int offset,
>>>>> unsigned int size, void *data, --->>> bool write_path <<<-----)
>>>>>
>>>>> Is this what you mean?
>>>> This sounds overly complicated. You can derive locally in vpci_write(),
>>>> from just its "reg" and "size" parameters, whether the lock needs taking
>>>> in write mode.
>>> Yes, I started writing a reply with that. So, the summary (ROM
>>> position depends on header type):
>>> if ( (reg == PCI_COMMAND) || (reg == ROM) )
>>> {
>>> read PCI_COMMAND and see if memory or IO decoding are enabled.
>>> if ( enabled )
>>> write_lock(d->vpci_lock)
>>> else
>>> read_lock(d->vpci_lock)
>>> }
>> Hmm, yes, you can actually get away without using "size", since both
>> command register and ROM BAR are 32-bit aligned registers, and 64-bit
>> accesses get split in vpci_ecam_write().
> But, OS may want reading a single byte of ROM BAR, so I think
> I'll need to check if reg+size fall into PCI_COMAND and ROM BAR
> ranges
>> For the command register the memory- / IO-decoding-enabled check may
>> end up a little more complicated, as the value to be written also
>> matters. Maybe read the command register only for the ROM BAR write,
>> using the write lock uniformly for all command register writes?
> Sounds good for the start.
> Another concern is that if we go with a read_lock and then in the
> underlying code we disable memory decoding and try doing
> something and calling cmd_write handler for any reason then....
>
> I mean that the check in the vpci_write is somewhat we can tolerate,
> but then it is must be considered that no code in the read path
> is allowed to perform write path functions. Which brings a pretty
> valid use-case: say in read mode we detect an unrecoverable error
> and need to remove the device:
> vpci_process_pending -> ERROR -> vpci_remove_device or similar.
>
> What do we do then? It is all going to be fragile...
I have tried to summarize the options we have wrt locking
and would love to hear from @Roger and @Jan.
In every variant there is a task of dealing with the overlap
detection in modify_bars, so this is the only place as of now
which needs special treatment.
Existing limitations: there is no way to upgrade a read lock to a write
lock, so paths which may require write lock protection need to use
write lock from the very beginning. Workarounds can be applied.
1. Per-domain rw lock, aka d->vpci_lock
==============================================================
Note: with per-domain rw lock it is possible to do without introducing
per-device locks, so pdev->vpci->lock can be removed and no pdev->vpci_lock
should be required.
This is only going to work in case if vpci_write always takes the write lock
and vpci_read takes a read lock and no path in vpci_read is allowed to
perform write path operations.
vpci_process_pending uses write lock as it have vpci_remove_device in its
error path.
Pros:
- no per-device vpci lock is needed?
- solves overlap code ABBA in modify_bars
Cons:
- all writes are serialized
- need to carefully select read paths, so they are guaranteed not to lead
to lock upgrade use-cases
1.1. Semi read lock upgrade in modify bars
--------------------------------------------------------------
In this case both vpci_read and vpci_write take a read lock and when it comes
to modify_bars:
1. read_unlock(d->vpci_lock)
2. write_lock(d->vpci_lock)
3. Check that pdev->vpci is still available and is the same object:
if (pdev->vpci && (pdev->vpci == old_vpci) )
{
/* vpci structure is valid and can be used. */
}
else
{
/* vpci has gone, return an error. */
}
Pros:
- no per-device vpci lock is needed?
- solves overlap code ABBA in modify_bars
- readers and writers are NOT serialized
- NO need to carefully select read paths, so they are guaranteed not to lead
to lock upgrade use-cases
Cons:
- ???
2. per-device lock (pdev->vpci_lock) + d->overlap_chk_lock
==============================================================
In order to solve overlap ABBA, we introduce a per-domain helper
lock to protect the overlapping code in modify_bars:
old_vpci = pdev->vpci;
spin_unlock(pdev->vpci_lock);
spin_lock(pdev->domain->overlap_chk_lock);
spin_lock(pdev->vpci_lock);
if ( pdev->vpci && (pdev->vpci == old_vpci) )
for_each_pdev ( pdev->domain, tmp )
{
if ( tmp != pdev )
{
spin_lock(tmp->vpci_lock);
if ( tmp->vpci )
...
}
}
Pros:
- all accesses are independent, only the same device access is serialized
- no need to care about readers and writers wrt read lock upgrade issues
Cons:
- helper spin lock
3. Move overlap detection into process pending
==============================================================
There is a Roger's patch [1] which adds a possibility for vpci_process_pending
to perform different tasks rather than just map/unmap. With this patch extended
in a way that it can hold a request queue it is possible to delay execution
of the overlap code until no pdev->vpci_lock is held, but before returning to
a guest after vpci_{read|write} or similar.
Pros:
- no need to emulate read lock upgrade
- fully parallel read/write
- queue in the vpci_process_pending will later on be used by SR-IOV,
so this is going to help the future code
Cons:
- ???
4. Re-write overlap detection code
==============================================================
It is possible to re-write overlap detection code, so the information about the
mapped/unmapped regions is not read from vpci->header->bars[i] of each device,
but instead there is a per-domain structure which holds the regions and
implements reference counting.
Pros:
- solves ABBA
Cons:
- very complex code is expected
5. You name it
==============================================================
From all the above I would recommend we go with option 2 which seems to reliably
solve ABBA and does not bring cons of the other approaches.
Thank you in advance,
Oleksandr
[1] https://lore.kernel.org/all/5BABA6EF02000078001EC452@prv1-mh.provo.novell.com/T/#m231fb0586007725bfd8538bb97ff1777a36842cf
next prev parent reply other threads:[~2022-02-08 7:36 UTC|newest]
Thread overview: 138+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-04 6:34 [PATCH v6 00/13] PCI devices passthrough on Arm, part 3 Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 01/13] xen/pci: arm: add stub for is_memory_hole Oleksandr Andrushchenko
2022-02-04 8:51 ` Julien Grall
2022-02-04 9:01 ` Oleksandr Andrushchenko
2022-02-04 9:41 ` Julien Grall
2022-02-04 9:47 ` Oleksandr Andrushchenko
2022-02-04 9:57 ` Julien Grall
2022-02-04 10:35 ` Oleksandr Andrushchenko
2022-02-04 11:00 ` Julien Grall
2022-02-04 11:25 ` Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 02/13] rangeset: add RANGESETF_no_print flag Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 03/13] vpci: move lock outside of struct vpci Oleksandr Andrushchenko
2022-02-04 7:52 ` Jan Beulich
2022-02-04 8:13 ` Oleksandr Andrushchenko
2022-02-04 8:36 ` Jan Beulich
2022-02-04 8:58 ` Oleksandr Andrushchenko
2022-02-04 9:15 ` Jan Beulich
2022-02-04 10:12 ` Oleksandr Andrushchenko
2022-02-04 10:49 ` Jan Beulich
2022-02-04 11:13 ` Roger Pau Monné
2022-02-04 11:37 ` Jan Beulich
2022-02-04 12:37 ` Oleksandr Andrushchenko
2022-02-04 12:47 ` Jan Beulich
2022-02-04 12:53 ` Oleksandr Andrushchenko
2022-02-04 13:03 ` Jan Beulich
2022-02-04 13:06 ` Roger Pau Monné
2022-02-04 14:43 ` Oleksandr Andrushchenko
2022-02-04 14:57 ` Roger Pau Monné
2022-02-07 11:08 ` Oleksandr Andrushchenko
2022-02-07 12:34 ` Jan Beulich
2022-02-07 12:57 ` Oleksandr Andrushchenko
2022-02-07 13:02 ` Jan Beulich
2022-02-07 12:46 ` Roger Pau Monné
2022-02-07 13:53 ` Oleksandr Andrushchenko
2022-02-07 14:11 ` Jan Beulich
2022-02-07 14:27 ` Roger Pau Monné
2022-02-07 14:33 ` Jan Beulich
2022-02-07 14:35 ` Oleksandr Andrushchenko
2022-02-07 15:11 ` Oleksandr Andrushchenko
2022-02-07 15:26 ` Jan Beulich
2022-02-07 16:07 ` Oleksandr Andrushchenko
2022-02-07 16:15 ` Jan Beulich
2022-02-07 16:21 ` Oleksandr Andrushchenko
2022-02-07 16:37 ` Jan Beulich
2022-02-07 16:44 ` Oleksandr Andrushchenko
2022-02-08 7:35 ` Oleksandr Andrushchenko [this message]
2022-02-08 8:57 ` Jan Beulich
2022-02-08 9:03 ` Oleksandr Andrushchenko
2022-02-08 10:50 ` Roger Pau Monné
2022-02-08 11:13 ` Oleksandr Andrushchenko
2022-02-08 13:38 ` Roger Pau Monné
2022-02-08 13:52 ` Oleksandr Andrushchenko
2022-02-08 8:53 ` Jan Beulich
2022-02-08 9:00 ` Oleksandr Andrushchenko
2022-02-08 10:11 ` Roger Pau Monné
2022-02-08 10:32 ` Oleksandr Andrushchenko
2022-02-07 16:08 ` Roger Pau Monné
2022-02-07 16:12 ` Jan Beulich
2022-02-07 14:28 ` Oleksandr Andrushchenko
2022-02-07 14:19 ` Roger Pau Monné
2022-02-07 14:27 ` Oleksandr Andrushchenko
2022-02-04 11:37 ` Oleksandr Andrushchenko
2022-02-04 12:15 ` Roger Pau Monné
2022-02-04 10:57 ` Roger Pau Monné
2022-02-04 6:34 ` [PATCH v6 04/13] vpci: restrict unhandled read/write operations for guests Oleksandr Andrushchenko
2022-02-04 14:11 ` Jan Beulich
2022-02-04 14:24 ` Oleksandr Andrushchenko
2022-02-08 8:00 ` Oleksandr Andrushchenko
2022-02-08 9:04 ` Jan Beulich
2022-02-08 9:09 ` Oleksandr Andrushchenko
2022-02-08 9:05 ` Roger Pau Monné
2022-02-08 9:10 ` Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 05/13] vpci: add hooks for PCI device assign/de-assign Oleksandr Andrushchenko
2022-02-07 16:28 ` Jan Beulich
2022-02-08 8:32 ` Oleksandr Andrushchenko
2022-02-08 9:13 ` Jan Beulich
2022-02-08 9:27 ` Oleksandr Andrushchenko
2022-02-08 9:44 ` Jan Beulich
2022-02-08 9:55 ` Oleksandr Andrushchenko
2022-02-08 10:09 ` Jan Beulich
2022-02-08 10:22 ` Oleksandr Andrushchenko
2022-02-08 10:29 ` Jan Beulich
2022-02-08 10:52 ` Oleksandr Andrushchenko
2022-02-08 11:00 ` Jan Beulich
2022-02-08 11:25 ` Oleksandr Andrushchenko
2022-02-10 8:21 ` Oleksandr Andrushchenko
2022-02-10 9:22 ` Jan Beulich
2022-02-10 9:33 ` Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 06/13] vpci/header: implement guest BAR register handlers Oleksandr Andrushchenko
2022-02-07 17:06 ` Jan Beulich
2022-02-08 8:06 ` Oleksandr Andrushchenko
2022-02-08 9:16 ` Jan Beulich
2022-02-08 9:29 ` Roger Pau Monné
2022-02-08 9:25 ` Roger Pau Monné
2022-02-08 9:31 ` Oleksandr Andrushchenko
2022-02-08 9:48 ` Jan Beulich
2022-02-08 9:57 ` Oleksandr Andrushchenko
2022-02-08 10:15 ` Jan Beulich
2022-02-08 10:29 ` Oleksandr Andrushchenko
2022-02-08 13:58 ` Roger Pau Monné
2022-02-04 6:34 ` [PATCH v6 07/13] vpci/header: handle p2m range sets per BAR Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 08/13] vpci/header: program p2m with guest BAR view Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 09/13] vpci/header: emulate PCI_COMMAND register for guests Oleksandr Andrushchenko
2022-02-04 14:25 ` Jan Beulich
2022-02-08 8:13 ` Oleksandr Andrushchenko
2022-02-08 9:33 ` Jan Beulich
2022-02-08 9:38 ` Oleksandr Andrushchenko
2022-02-08 9:52 ` Jan Beulich
2022-02-08 9:58 ` Oleksandr Andrushchenko
2022-02-08 11:11 ` Roger Pau Monné
2022-02-08 11:29 ` Oleksandr Andrushchenko
2022-02-08 14:09 ` Roger Pau Monné
2022-02-08 14:13 ` Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 10/13] vpci/header: reset the command register when adding devices Oleksandr Andrushchenko
2022-02-04 14:30 ` Jan Beulich
2022-02-04 14:37 ` Oleksandr Andrushchenko
2022-02-07 7:29 ` Jan Beulich
2022-02-07 11:27 ` Oleksandr Andrushchenko
2022-02-07 12:38 ` Jan Beulich
2022-02-07 12:51 ` Oleksandr Andrushchenko
2022-02-07 12:54 ` Jan Beulich
2022-02-07 14:17 ` Oleksandr Andrushchenko
2022-02-07 14:31 ` Jan Beulich
2022-02-07 14:46 ` Oleksandr Andrushchenko
2022-02-07 15:05 ` Jan Beulich
2022-02-07 15:14 ` Oleksandr Andrushchenko
2022-02-07 15:28 ` Jan Beulich
2022-02-07 15:59 ` Oleksandr Andrushchenko
2022-02-10 12:54 ` Oleksandr Andrushchenko
2022-02-10 13:36 ` Jan Beulich
2022-02-10 13:56 ` Oleksandr Andrushchenko
2022-02-10 12:59 ` Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 11/13] vpci: add initial support for virtual PCI bus topology Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 12/13] xen/arm: translate virtual PCI bus topology for guests Oleksandr Andrushchenko
2022-02-04 7:56 ` Jan Beulich
2022-02-04 8:18 ` Oleksandr Andrushchenko
2022-02-04 6:34 ` [PATCH v6 13/13] xen/arm: account IO handlers for emulated PCI MSI-X Oleksandr Andrushchenko
2022-02-11 15:28 ` Julien Grall
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e91965c5-0802-adf8-0c17-522f86ebf231@epam.com \
--to=oleksandr_andrushchenko@epam.com \
--cc=Artem_Mygaiev@epam.com \
--cc=Oleksandr_Tyshchenko@epam.com \
--cc=Volodymyr_Babchuk@epam.com \
--cc=andrew.cooper3@citrix.com \
--cc=bertrand.marquis@arm.com \
--cc=george.dunlap@citrix.com \
--cc=jbeulich@suse.com \
--cc=julien@xen.org \
--cc=paul@xen.org \
--cc=rahul.singh@arm.com \
--cc=roger.pau@citrix.com \
--cc=sstabellini@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).