* [RFC] synchronised multi-machine time-travel APIs
@ 2019-09-16  9:27 Johannes Berg
  2019-09-16  9:31 ` Johannes Berg
  2019-09-16 13:01 ` Johannes Berg
  0 siblings, 2 replies; 3+ messages in thread
From: Johannes Berg @ 2019-09-16  9:27 UTC
  To: linux-um

Hi all,

I now have a working PoC for synchronized multi-machine time travel, in
which e.g. I can run the following test (output from one machine):

https://p.sipsolutions.net/d000cb894950dd02.txt


Note how the UML instance(s) observed ~10 minutes of time, while less
than 2 seconds passed on the outside; note also that there's a simulated
network delay of exactly 50ms in my test.


Now, there's one reasonably big question I'm scratching my head over.

As you can see ("virtio_uml.device=/tmp/clock:1234 time-travel=virtio")
I've implemented the time synchronization as a virtio device (currently
with the ID 1234 just for experimentation).

This has some problems:

1) I'd need to get an ID assigned for properly publishing this; this is
   not really a problem in itself, it's just a question of effort.

2) The virtio/vhost-user framework is really not suited for synchronous
   calls; I have to jump through quite a few hoops to make this work,
   such as polling the return virtqueue's notifications and keeping a
   lot of buffers and SG structures around.

3) There are some virtio hooks for general devices like virtio_net to
   make the simulation work (e.g. IRQs need to be deferred to the
   simulation time, not be done when signalled), and we want to use
   poll() instead of SIGIO in this case etc. This is all necessary,
   *however*, the "simtime" device that is responsible for the time
   synchronization needs to be *exempted* from all this handling, which
   again makes the code more complex than needed.

I'm not really afraid of (1) though it raises some thorny questions, but
(2) and (3) really make me question the value of this now that I
actually have it working.


What would you say if, instead of using a virtio device for the time
synchronization, I were to add a (unix domain) socket-based protocol?
Either way it's a custom protocol; it's just a question of the
transport used (and of the documentation being in the virtio spec vs.
being part of the Linux kernel).

I can trivially adapt my code to that, and it should result in some
significant cleanup. This might even just let me go back to using SIGIO
for the virtio devices, since I don't really care if I don't need to use
it for the time scheduler.


Even the virtio device protocol I defined is as simple as having a few
messages containing a u64 op and a u64 time value, and the operations
are basically:

- REQUEST: update my calendar entry for the given time
- WAIT: wait for my next calendar entry and call back with RUN
- RUN: callback after WAIT, so I can continue running
- GET: return the current simulation clock to me
- UPDATE: send my current time to the simulation clock
- FREE_UNTIL: tells me that there's nothing else scheduled in the
              simulation until the time given, so I don't need to call
              REQUEST/WAIT for events - this is an optimisation, and the
              reason for the "UPDATE" message

That's basically it. A very simple message protocol that also carries an
"ACK" message back as a response (this was done kind of implicitly in
virtio) would be entirely sufficient, and seems much easier to
implement.
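
To make the message layout concrete, here is a minimal sketch of how
such a 16-byte message (u64 op, u64 time) could be encoded. The numeric
op values, the little-endian byte order and the nanosecond time unit are
all assumptions made for illustration; the mail only names the
operations and the two u64 fields.

```python
import struct
from enum import IntEnum
from typing import Tuple


class Op(IntEnum):
    # Hypothetical numbering: the mail names the ops but assigns no values.
    ACK = 0
    REQUEST = 1
    WAIT = 2
    RUN = 3
    GET = 4
    UPDATE = 5
    FREE_UNTIL = 6


# Two unsigned 64-bit fields: op, then time. Byte order is an assumption.
MSG = struct.Struct("<QQ")


def pack_msg(op: Op, time_ns: int) -> bytes:
    """Encode one message as 16 bytes."""
    return MSG.pack(op, time_ns)


def unpack_msg(data: bytes) -> Tuple[Op, int]:
    """Decode 16 bytes back into (op, time)."""
    op, time_ns = MSG.unpack(data)
    return Op(op), time_ns


# Example: request a calendar entry 50 ms (in assumed nanoseconds) from zero.
wire = pack_msg(Op.REQUEST, 50_000_000)
```

A fixed-size record like this keeps framing trivial on a stream socket:
the reader always consumes exactly MSG.size bytes per message.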

Thoughts?

johannes


_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



* Re: [RFC] synchronised multi-machine time-travel APIs
  2019-09-16  9:27 [RFC] synchronised multi-machine time-travel APIs Johannes Berg
@ 2019-09-16  9:31 ` Johannes Berg
  2019-09-16 13:01 ` Johannes Berg
  1 sibling, 0 replies; 3+ messages in thread
From: Johannes Berg @ 2019-09-16  9:31 UTC
  To: linux-um

On Mon, 2019-09-16 at 11:27 +0200, Johannes Berg wrote:
> 
> 2) The virtio/vhost-user framework is really not suited for synchronous
>    calls; I have to jump through quite a few hoops to make this work,
>    such as polling the return virtqueue's notifications and keeping a
>    lot of buffers and SG structures around.

Some more thought on that: virtqueues always pass buffers back and
forth, and the UML system needs to provide buffers even for the tiny
messages, since we cannot communicate otherwise. This is great for bulk
transfers in virtio, but for the tiny 16-byte messages we're passing
it's kind of inefficient.

As a result, I didn't want to have many of them submitted for "RX" at
the same time and used only one. But then, if the simulation calendar
has to send multiple messages, I ended up resorting to polling there
until a buffer is available, which is again a bit ugly.

If we simply have a (blocking) socket between them, we can write() the
small messages there and read() them out on the other side, without
having to worry about buffer allocations etc.
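
As a rough illustration of that idea, the sketch below sends one request
over a blocking unix-domain socket pair and reads the reply on the other
end. The op numbers, the byte order, the helper names and the choice to
carry the clock value in the ACK are assumptions for the example, not
part of the proposal.

```python
import socket
import struct

MSG = struct.Struct("<QQ")   # u64 op, u64 time; byte order is an assumption
OP_ACK, OP_GET = 0, 4        # hypothetical op numbers


def send_msg(sock, op, time):
    """Write one fixed-size message; no buffers to pre-post."""
    sock.sendall(MSG.pack(op, time))


def recv_msg(sock):
    """Block until one complete 16-byte message has arrived."""
    data = sock.recv(MSG.size, socket.MSG_WAITALL)
    if len(data) != MSG.size:
        raise ConnectionError("peer closed the socket")
    return MSG.unpack(data)


# One end stands in for the UML instance, the other for the calendar.
uml, calendar = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

send_msg(uml, OP_GET, 0)            # UML asks for the simulation clock
op, _ = recv_msg(calendar)          # calendar reads the request
send_msg(calendar, OP_ACK, 1234)    # replies; 1234 is a made-up clock value
reply = recv_msg(uml)

uml.close()
calendar.close()
```

Compared to the virtqueue variant, neither side has to provide receive
buffers in advance; the kernel's socket buffer absorbs the small
messages.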

johannes



* Re: [RFC] synchronised multi-machine time-travel APIs
  2019-09-16  9:27 [RFC] synchronised multi-machine time-travel APIs Johannes Berg
  2019-09-16  9:31 ` Johannes Berg
@ 2019-09-16 13:01 ` Johannes Berg
  1 sibling, 0 replies; 3+ messages in thread
From: Johannes Berg @ 2019-09-16 13:01 UTC
  To: linux-um

On Mon, 2019-09-16 at 11:27 +0200, Johannes Berg wrote:
> 
> 3) There are some virtio hooks for general devices like virtio_net to
>    make the simulation work (e.g. IRQs need to be deferred to the
>    simulation time, not be done when signalled), and we want to use
>    poll() instead of SIGIO in this case etc. This is all necessary,
>    *however*, the "simtime" device that is responsible for the time
>    synchronization needs to be *exempted* from all this handling, which
>    again makes the code more complex than needed.

Thinking about this some more ...

We can currently handle only one IRQ at a time, of course, but e.g. with
a simulated network we basically end up doing:

virtio_net RX interrupt
-> request runtime for IRQ handling via simtime device

However, to request runtime we send a message out on the simtime device,
but then we have to also handle incoming messages while we wait for a
response to this, since the request might change things around ... all
while in interrupt context, so the normal message handling doesn't work.
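
With the socket transport proposed earlier in the thread, such a wait
could look roughly like the sketch below: send REQUEST, then keep
reading and dispatching whatever arrives until the ACK shows up. Which
messages can actually arrive in that window is an assumption here
(FREE_UNTIL is used as the example), as are the op numbers and the
function names.

```python
import socket
import struct

MSG = struct.Struct("<QQ")                    # u64 op, u64 time (assumed LE)
OP_ACK, OP_REQUEST, OP_FREE_UNTIL = 0, 1, 6   # hypothetical op numbers


def recv_msg(sock):
    """Block until one complete 16-byte message has arrived."""
    data = sock.recv(MSG.size, socket.MSG_WAITALL)
    if len(data) != MSG.size:
        raise ConnectionError("peer closed the socket")
    return MSG.unpack(data)


def request_runtime(sock, when, state):
    """Send REQUEST and block for its ACK, handling other messages meanwhile."""
    sock.sendall(MSG.pack(OP_REQUEST, when))
    while True:
        op, time = recv_msg(sock)
        if op == OP_ACK:
            return
        if op == OP_FREE_UNTIL:
            # The calendar changed its mind while we were waiting.
            state["free_until"] = time
        else:
            raise ValueError(f"unexpected op {op}")


uml, calendar = socket.socketpair()

# Simulate the calendar side: an unsolicited FREE_UNTIL, then the ACK.
calendar.sendall(MSG.pack(OP_FREE_UNTIL, 2000))
calendar.sendall(MSG.pack(OP_ACK, 0))

state = {}
request_runtime(uml, 1500, state)
request_seen = recv_msg(calendar)

uml.close()
calendar.close()
```

Because the read is a plain blocking call on one fd, this loop does not
depend on the normal IRQ-driven message handling.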

This required adding a whole new poll() abstraction to UML, which I
never really liked.

I'll try this a bit, but it seems that if we make this all go back down
to the existing epoll abstraction, we just have to tag the "time" irq/fd
as having special treatment, not everything else.
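
One possible reading of "tagging" the time fd, sketched in userspace
with Python's selectors module (which uses epoll on Linux): every fd
goes through the same poll loop, but the one tagged "time" is serviced
immediately while all others are only queued for later. The tag strings
and the deferred/handled lists are invented for the example and do not
reflect UML's actual IRQ code.

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # epoll-backed on Linux

time_a, time_b = socket.socketpair()   # stands in for the time socket
net_a, net_b = socket.socketpair()     # stands in for an ordinary device

# The "data" argument is the tag that marks the special fd.
sel.register(time_a, selectors.EVENT_READ, data="time")
sel.register(net_a, selectors.EVENT_READ, data="device")

handled = []    # time messages processed right away
deferred = []   # device events parked until simulation time allows them


def poll_once(timeout=1.0):
    """Service the tagged time fd immediately; defer everything else."""
    for key, _events in sel.select(timeout):
        if key.data == "time":
            handled.append(key.fileobj.recv(16))
        else:
            deferred.append(key.fileobj)


time_b.sendall(b"T" * 16)   # a 16-byte time message (placeholder payload)
net_b.sendall(b"packet")    # some device traffic
poll_once()

sel.close()
for s in (time_a, time_b, net_a, net_b):
    s.close()
```

Only the time fd needs a special case in the loop; ordinary devices all
share the default deferred path.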

johannes

