kvm.vger.kernel.org archive mirror
From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Elena Afanasova <eafanasova@gmail.com>,
	kvm@vger.kernel.org, jag.raman@oracle.com,
	elena.ufimtseva@oracle.com
Subject: Re: [RFC 1/2] KVM: add initial support for KVM_SET_IOREGION
Date: Fri, 15 Jan 2021 11:41:54 +0800
Message-ID: <1f15d3a1-2ea5-3b42-fa02-1d21de3e04a0@redhat.com>
In-Reply-To: <20210114161651.GG292902@stefanha-x1.localdomain>


On 2021/1/15 12:16 AM, Stefan Hajnoczi wrote:
> On Thu, Jan 14, 2021 at 12:05:00PM +0800, Jason Wang wrote:
>> On 2021/1/13 11:52 PM, Stefan Hajnoczi wrote:
>>> On Wed, Jan 13, 2021 at 10:38:29AM +0800, Jason Wang wrote:
>>>> On 2021/1/8 1:53 AM, Stefan Hajnoczi wrote:
>>>>> On Thu, Jan 07, 2021 at 11:30:47AM +0800, Jason Wang wrote:
>>>>>> On 2021/1/6 11:05 PM, Stefan Hajnoczi wrote:
>>>>>>> On Wed, Jan 06, 2021 at 01:21:43PM +0800, Jason Wang wrote:
>>>>>>>> On 2021/1/5 6:25 PM, Stefan Hajnoczi wrote:
>>>>>>>>> On Tue, Jan 05, 2021 at 11:53:01AM +0800, Jason Wang wrote:
>>>>>>>>>> On 2021/1/5 8:02 AM, Elena Afanasova wrote:
>>>>>>>>>>> On Mon, 2021-01-04 at 13:34 +0800, Jason Wang wrote:
>>>>>>>>>>>> On 2021/1/4 4:32 AM, Elena Afanasova wrote:
>>>>>>>>>>>>> On Thu, 2020-12-31 at 11:45 +0800, Jason Wang wrote:
>>>>>>>>>>>>>> On 2020/12/29 6:02 PM, Elena Afanasova wrote:
>>>>> 2. If separate userspace threads process the virtqueues, then set up the
>>>>>       virtio-pci capabilities so the virtqueues have separate notification
>>>>>       registers:
>>>>>       https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1150004
>>>> Right. But this works only when the PCI transport is used and the queue
>>>> index can be deduced from the register address (separate doorbells).
>>>>
>>>> If we use MMIO, or share one doorbell register among all the virtqueues
>>>> (notify_off_multiplier is zero in the above case), it can't work without
>>>> datamatch.
>>> True. Can you think of an application that needs to dispatch a shared
>>> doorbell register to several threads?
>>
>> I think it depends on the semantics of the doorbell register. I guess one
>> example is the virtio-mmio multiqueue device.
> Good point. virtio-mmio really needs datamatch if virtqueues are handled
> by different threads.
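
To make the difference between the two transports concrete, here is a
minimal sketch of the address arithmetic. It assumes the virtio 1.x
virtio-pci rule (doorbell offset = cap.offset + queue_notify_off *
notify_off_multiplier) and the virtio-mmio QueueNotify register at offset
0x050. The helper names pci_notify_offset(), needs_datamatch() and
doorbell_examples(), and the 0x3000 base used in the checks, are made up
for illustration and are not part of any kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* virtio-mmio: offset of the single QueueNotify register shared by all queues. */
#define VIRTIO_MMIO_QUEUE_NOTIFY 0x050

/* Illustrative helper (not a kernel API): doorbell offset of one virtqueue
 * inside the virtio-pci notification region. */
static uint64_t pci_notify_offset(uint64_t cap_offset,
                                  uint16_t queue_notify_off,
                                  uint32_t notify_off_multiplier)
{
    return cap_offset + (uint64_t)queue_notify_off * notify_off_multiplier;
}

/* Illustrative helper: with a zero multiplier every queue lands on the same
 * address, so only the written value can tell the queues apart. */
static int needs_datamatch(uint32_t notify_off_multiplier)
{
    return notify_off_multiplier == 0;
}

/* Self-check of the two cases: separate doorbells versus a shared one. */
static void doorbell_examples(void)
{
    assert(pci_notify_offset(0x3000, 1, 4) == 0x3004);
    assert(pci_notify_offset(0x3000, 1, 0) == pci_notify_offset(0x3000, 0, 0));
    assert(needs_datamatch(0));
}
```

With a non-zero multiplier the address alone selects the queue; with a zero
multiplier, or with virtio-mmio, the queue index is only visible in the
value written to the register.
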
>
>>> If this is a case that real-world applications need then we should
>>> tackle it. This is where eBPF would be appropriate. I guess the
>>> interface would be something like:
>>>
>>>     /*
>>>      * A custom demultiplexer function that returns the index of the <wfd,
>>>      * rfd> pair to use or -1 to produce a KVM_EXIT_IOREGION_FAILURE that
>>>      * userspace must handle.
>>>      */
>>>     int demux(const struct ioregionfd_cmd *cmd);
>>>
>>> Userspace can install an eBPF demux function as well as an array of
>>> <wfd, rfd> fd pairs. The demux function gets to look at the cmd in order
>>> to decide which fd pair it is sent to.
>>>
>>> This is how I think eBPF datamatch could work. It's not as general as in
>>> our original discussion where we also talked about custom protocols
>>> (instead of struct ioregionfd_cmd/struct ioregionfd_resp).
>>
>> Actually they do not conflict. We can make it an eBPF ioregion, so that it's
>> the eBPF program that decides:
>>
>> 1) whether or not it needs to do datamatch
>> 2) how many file descriptors it wants to use (storing the fds in a map)
>> 3) what the protocol looks like
>>
>> But as discussed, it could be an add-on on top of the hard-coded logic of
>> ioregion, since there could be cases where eBPF is not allowed or not
>> supported. So adding simple datamatch support as a start might be a good choice.
> Let's go further. Can you share pseudo-code for the eBPF program's
> function signature (inputs/outputs)?


It could be something like this:

1) The eBPF program context could be defined as ioregion_ctx:

struct ioregion_ctx {
     gpa_t addr;
     int len;
     void *val;
};

2) The eBPF program return value could be 0 (IOREGION_OK), meaning that
the program can handle this I/O request, or a failure (IOREGION_FAIL)
otherwise.

So to implement datamatch, userspace is required to store the file
descriptors for doorbell dispatching in a map (dispatch_map). For a
virtio-style doorbell, we can simply:

- find the fd via a bpf map lookup
- build the protocol
- use an eBPF helper to send the command (I haven't checked, but I guess
we need to invent new eBPF helpers for reading from and writing to a file)

Like:

SEC("datamatch")
int datamatch_prog(struct ioregion_ctx *ctx)
{
     int *fd, ret;
     struct customized_protocol protocol;

     /* The value written to the doorbell is the key into dispatch_map. */
     fd = bpf_map_lookup_elem(&dispatch_map, ctx->val);
     if (!fd)
         return IOREGION_FAIL;
     /* build_protocol() and bpf_fd_write() do not exist yet. */
     build_protocol(ctx, &protocol);
     ret = bpf_fd_write(*fd, &protocol, sizeof(protocol));
     if (ret != sizeof(protocol))
         return IOREGION_FAIL;
     return IOREGION_OK;
}
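
Since the helpers for writing to a file from eBPF do not exist, the same
flow (look up the fd, build the protocol, send the command) can be tried
out in plain userspace C first. The following is only a sketch under
stated assumptions: dispatch_map is an ordinary array standing in for the
BPF map, struct demo_protocol is an invented wire format, gpa_t is
replaced by uint64_t, and the doorbell write is assumed to be exactly
32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

enum { IOREGION_OK = 0, IOREGION_FAIL = 1 };

#define MAX_QUEUES 8

/* Mirrors the ioregion_ctx proposed above; gpa_t is replaced by uint64_t. */
struct ioregion_ctx {
    uint64_t addr;
    int len;
    const void *val;
};

/* Invented wire format, for illustration only. */
struct demo_protocol {
    uint64_t addr;
    uint32_t len;
    uint32_t data;
};

/* Stand-in for the BPF map: doorbell value -> fd, -1 means "no entry". */
static int dispatch_map[MAX_QUEUES];

static void dispatch_map_init(void)
{
    for (int i = 0; i < MAX_QUEUES; i++)
        dispatch_map[i] = -1;
}

/* Look up the fd by the written value, build the message, send it. */
static int datamatch_dispatch(const struct ioregion_ctx *ctx)
{
    uint32_t key;
    struct demo_protocol msg;

    assert(ctx != NULL);
    if (ctx->len != (int)sizeof(key))       /* only 32-bit doorbell writes */
        return IOREGION_FAIL;
    memcpy(&key, ctx->val, sizeof(key));
    if (key >= MAX_QUEUES || dispatch_map[key] < 0)
        return IOREGION_FAIL;               /* no fd registered for this value */

    memset(&msg, 0, sizeof(msg));
    msg.addr = ctx->addr;
    msg.len = (uint32_t)ctx->len;
    msg.data = key;
    if (write(dispatch_map[key], &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return IOREGION_FAIL;
    return IOREGION_OK;
}
```

A caller would run dispatch_map_init(), store the write end of a pipe or
socket at the index of each queue it serves, and then pass every doorbell
write to datamatch_dispatch(). A value without a registered fd returns
IOREGION_FAIL, which corresponds to the case where userspace has to handle
the access itself.
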

Thanks


>
> Stefan

