From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Elena Afanasova <eafanasova@gmail.com>,
kvm@vger.kernel.org, jag.raman@oracle.com,
elena.ufimtseva@oracle.com
Subject: Re: [RFC 1/2] KVM: add initial support for KVM_SET_IOREGION
Date: Fri, 15 Jan 2021 11:41:54 +0800
Message-ID: <1f15d3a1-2ea5-3b42-fa02-1d21de3e04a0@redhat.com>
In-Reply-To: <20210114161651.GG292902@stefanha-x1.localdomain>
On 2021/1/15 12:16 AM, Stefan Hajnoczi wrote:
> On Thu, Jan 14, 2021 at 12:05:00PM +0800, Jason Wang wrote:
>> On 2021/1/13 11:52 PM, Stefan Hajnoczi wrote:
>>> On Wed, Jan 13, 2021 at 10:38:29AM +0800, Jason Wang wrote:
>>>> On 2021/1/8 1:53 AM, Stefan Hajnoczi wrote:
>>>>> On Thu, Jan 07, 2021 at 11:30:47AM +0800, Jason Wang wrote:
>>>>>> On 2021/1/6 11:05 PM, Stefan Hajnoczi wrote:
>>>>>>> On Wed, Jan 06, 2021 at 01:21:43PM +0800, Jason Wang wrote:
>>>>>>>> On 2021/1/5 6:25 PM, Stefan Hajnoczi wrote:
>>>>>>>>> On Tue, Jan 05, 2021 at 11:53:01AM +0800, Jason Wang wrote:
>>>>>>>>>> On 2021/1/5 8:02 AM, Elena Afanasova wrote:
>>>>>>>>>>> On Mon, 2021-01-04 at 13:34 +0800, Jason Wang wrote:
>>>>>>>>>>>> On 2021/1/4 4:32 AM, Elena Afanasova wrote:
>>>>>>>>>>>>> On Thu, 2020-12-31 at 11:45 +0800, Jason Wang wrote:
>>>>>>>>>>>>>> On 2020/12/29 6:02 PM, Elena Afanasova wrote:
>>>>> 2. If separate userspace threads process the virtqueues, then set up the
>>>>> virtio-pci capabilities so the virtqueues have separate notification
>>>>> registers:
>>>>> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1150004
>>>> Right. But this works only when the PCI transport is used and the queue
>>>> index can be deduced from the register address (separate doorbells).
>>>>
>>>> If we use MMIO, or share one doorbell register among all the virtqueues
>>>> (the multiplier is zero in the above case), it can't work without datamatch.
>>> True. Can you think of an application that needs to dispatch a shared
>>> doorbell register to several threads?
>>
>> I think it depends on the semantics of the doorbell register. I guess one
>> example is a virtio-mmio multiqueue device.
> Good point. virtio-mmio really needs datamatch if virtqueues are handled
> by different threads.
>
>>> If this is a case that real-world applications need then we should
>>> tackle it. This is where eBPF would be appropriate. I guess the
>>> interface would be something like:
>>>
>>> /*
>>>  * A custom demultiplexer function that returns the index of the <wfd,
>>>  * rfd> pair to use or -1 to produce a KVM_EXIT_IOREGION_FAILURE that
>>>  * userspace must handle.
>>>  */
>>> int demux(const struct ioregionfd_cmd *cmd);
>>>
>>> Userspace can install an eBPF demux function as well as an array of
>>> <wfd, rfd> fd pairs. The demux function gets to look at the cmd in order
>>> to decide which fd pair it is sent to.
>>>
>>> This is how I think eBPF datamatch could work. It's not as general as in
>>> our original discussion where we also talked about custom protocols
>>> (instead of struct ioregionfd_cmd/struct ioregionfd_resp).
>>
>> Actually they do not conflict. We can make it an eBPF ioregion; then it's
>> the eBPF program that decides:
>>
>> 1) whether or not it needs to do datamatch
>> 2) how many file descriptors it wants to use (storing the fds in a map)
>> 3) what the protocol looks like
>>
>> But as discussed, it could be an add-on on top of the hard logic of ioregion,
>> since there could be cases where eBPF is not allowed or not supported. So
>> adding simple datamatch support as a start might be a good choice.
> Let's go further. Can you share pseudo-code for the eBPF program's
> function signature (inputs/outputs)?

It could be something like this:

1) The eBPF program context could be defined as ioregion_ctx:

struct ioregion_ctx {
	gpa_t addr;
	int len;
	void *val;
};

2) The eBPF program's return value could be 0 (IOREGION_OK), meaning that
the program can handle this I/O request; anything else is a failure
(IOREGION_FAIL).

So to implement datamatch, userspace is required to store the file
descriptors used for doorbell dispatching in a map (dispatch_map). For a
virtio-style doorbell, we can simply:

- find the fd via a bpf map lookup
- build the protocol
- use an eBPF helper to send the command (I haven't checked, but I guess we
  need to invent new eBPF helpers for reading from and writing to a file)

Like:

SEC("datamatch")
int datamatch_prog(struct ioregion_ctx *ctx)
{
	struct customized_protocol protocol;
	int *fd, ret;

	/* The value written to the doorbell is the key into dispatch_map. */
	fd = bpf_map_lookup_elem(&dispatch_map, ctx->val);
	if (!fd)
		return IOREGION_FAIL;

	build_protocol(ctx, &protocol);

	/* bpf_fd_write() does not exist today; it stands for a new helper. */
	ret = bpf_fd_write(*fd, &protocol, sizeof(protocol));
	if (ret != sizeof(protocol))
		return IOREGION_FAIL;

	return IOREGION_OK;
}
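
To make the dispatch flow easier to experiment with, here is a rough
userspace sketch of the same logic in plain C. It is not eBPF and not part of
the proposal. Everything beyond struct ioregion_ctx and the IOREGION_OK /
IOREGION_FAIL names is an assumption for illustration: the layout of
struct customized_protocol, the plain array standing in for the bpf map, the
dispatch_map_set() helper, the choice of 1 for IOREGION_FAIL, and the idea
that the doorbell write is a 32-bit queue index (as with the virtio-mmio
QueueNotify register). write(2) stands in for the not-yet-existing helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define IOREGION_OK   0
#define IOREGION_FAIL 1   /* assumed value; the thread only says "otherwise" */
#define MAX_QUEUES    4   /* arbitrary size for this sketch */

/* Mirrors the proposed context; gpa_t is replaced by uint64_t here. */
struct ioregion_ctx {
	uint64_t addr;
	int len;
	void *val;
};

/* Hypothetical wire format; the thread leaves the protocol to the program. */
struct customized_protocol {
	uint64_t addr;
	uint32_t len;
	uint32_t queue;
};

/* Stand-in for the bpf map: queue index -> fd, -1 means "no entry". */
static int dispatch_map[MAX_QUEUES] = { -1, -1, -1, -1 };

/* What userspace would do when populating the map. */
static void dispatch_map_set(uint32_t queue, int fd)
{
	assert(queue < MAX_QUEUES);
	dispatch_map[queue] = fd;
}

/* Look up the fd by the written value, build the message, send it. */
static int datamatch(const struct ioregion_ctx *ctx)
{
	struct customized_protocol msg;
	uint32_t queue;

	/* Only 32-bit doorbell writes are handled in this sketch. */
	if (ctx->len != (int)sizeof(queue))
		return IOREGION_FAIL;
	memcpy(&queue, ctx->val, sizeof(queue));

	if (queue >= MAX_QUEUES || dispatch_map[queue] < 0)
		return IOREGION_FAIL;

	memset(&msg, 0, sizeof(msg));
	msg.addr = ctx->addr;
	msg.len = (uint32_t)ctx->len;
	msg.queue = queue;

	if (write(dispatch_map[queue], &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		return IOREGION_FAIL;

	return IOREGION_OK;
}
```

A caller would create one pipe (or socketpair) per virtqueue thread, register
the write ends with dispatch_map_set(), and call datamatch() for each doorbell
write. A write whose value has no entry in the map returns IOREGION_FAIL.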
Thanks
>
> Stefan