On Thu, Mar 18, 2021 at 12:58:46PM +0100, Christian Schoenebeck wrote:
> On Wednesday, 17 March 2021 13:57:47 CET Jiachen Zhang wrote:
> > On Wed, Mar 17, 2021 at 7:50 PM Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > On Wednesday, 17 March 2021 11:05:32 CET Stefan Hajnoczi wrote:
> > > > On Fri, Dec 18, 2020 at 05:39:34PM +0800, Jiachen Zhang wrote:
> > > > > Thanks for the suggestions. Actually, we choose to save all state
> > > > > information to QEMU because a virtiofsd has the same lifecycle as its
> > > > > QEMU master. However, saving things to a file does avoid communication
> > > > > with QEMU, and we no longer need to increase the complexity of the
> > > > > vhost-user protocol. The suggestion to save fds to systemd is also very
> > > > > reasonable if we don't consider the lifecycle issues; we will try it.
> > > > 
> > > > Hi,
> > > > We recently discussed crash recovery in the virtio-fs bi-weekly call and
> > > > I read some of this email thread because it's a topic I'm interested in.
> > > 
> > > I just had a quick fly over the patches so far. Shouldn't there be some kind
> > > of constraint for an automatic reconnection feature after a crash to prevent
> > > this being exploited by ROP brute force attacks?
> > > 
> > > E.g. adding some (maybe continuously increasing) delay and/or limiting the
> > > number of reconnects within a certain time frame would come to my mind.
> > > 
> > > Best regards,
> > > Christian Schoenebeck
> > 
> > Thanks, Christian. I am still trying to figure out the details of the ROP
> > attacks.
> > 
> > However, QEMU's vhost-user reconnection is based on chardev socket
> > reconnection. Socket reconnection can be enabled with the "--chardev
> > socket,...,reconnect=N" QEMU command-line option, where N means QEMU will
> > try to reconnect the disconnected socket every N seconds. We can increase N
> > to increase the reconnect delay. If we want to change the reconnect delay
> > dynamically, I think we should change the chardev socket reconnection code,
> > which is a more generic mechanism than vhost-user-fs or the vhost-user
> > backends.
> > 
> > By the way, I also considered the socket reconnection delay from the
> > performance side. As the reconnection delay increases, an application
> > doing I/O in the guest will suffer larger tail latency. And for now, the
> > smallest delay is 1 second, which is rather large for high-performance
> > virtual I/O devices today. I think a more performant and safer reconnect
> > delay adjustment mechanism should be considered in the future. What are
> > your thoughts?
> 
> So with N=1 an attacker could e.g. bypass a 16-bit PAC by brute force in ~18
> hours (e.g. on Arm if PAC + MTE was enabled). With 24-bit PAC (no MTE) it
> would be ~194 days. Independent of which architecture and defense mechanism
> is used, though, there is always the possibility that some kind of
> side-channel attack exists that requires far fewer attempts. So in an
> untrusted environment I would personally limit the number of automatic
> reconnects and rather accept downtime for further investigation if a
> suspiciously high number of crashes happened.
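
For reference, those figures follow from simple arithmetic. A minimal
sketch, assuming the attacker gets exactly one guess per reconnect interval
and has to exhaust the whole PAC space in the worst case (the function name
is only for illustration):

```python
def worst_case_seconds(pac_bits, delay_s=1.0):
    """Time to try every possible PAC value at one guess per reconnect.

    Assumption: each crash/reconnect cycle yields exactly one guess and
    takes delay_s seconds (delay_s=1.0 corresponds to reconnect=1).
    """
    return (2 ** pac_bits) * delay_s


hours_16 = worst_case_seconds(16) / 3600    # 16-bit PAC, result in hours
days_24 = worst_case_seconds(24) / 86400    # 24-bit PAC, result in days

print(f"16-bit PAC: {hours_16:.1f} hours")
print(f"24-bit PAC: {days_24:.1f} days")
```

On average an attacker would need about half of that. Doubling the delay
doubles the time, so a fixed larger N only scales the cost linearly.
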
> 
> And yes, if a dynamic delay scheme were deployed in the future, then starting
> with a value smaller than 1 second would make sense.
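
To make the two ideas from this thread concrete (an increasing delay plus a
limit on reconnects within a time frame), here is a rough sketch. This is
not QEMU code and not what the chardev layer does today; ReconnectPolicy
and all of its parameters and default values are hypothetical.

```python
from collections import deque


class ReconnectPolicy:
    """Hypothetical policy: exponential backoff plus a reconnect cap.

    All names and defaults here are assumptions for illustration only.
    """

    def __init__(self, base_delay=0.1, max_delay=60.0,
                 max_reconnects=5, window=600.0):
        self.base_delay = base_delay          # first delay, below 1 second
        self.max_delay = max_delay            # upper bound for the backoff
        self.max_reconnects = max_reconnects  # allowed reconnects per window
        self.window = window                  # sliding window in seconds
        self._crashes = deque()               # timestamps of recent crashes

    def next_delay(self, now):
        """Return seconds to wait before reconnecting, or None to give up."""
        # Forget crashes that fell out of the sliding window.
        while self._crashes and now - self._crashes[0] > self.window:
            self._crashes.popleft()
        # Too many recent crashes: stop and leave it to the administrator.
        if len(self._crashes) >= self.max_reconnects:
            return None
        # Double the delay for every crash still inside the window.
        delay = min(self.base_delay * 2 ** len(self._crashes), self.max_delay)
        self._crashes.append(now)
        return delay
```

The caller would pass a monotonic timestamp on every disconnect. A None
result means "do not reconnect automatically", which matches the idea of
accepting downtime once the crash rate looks suspicious.
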

If we're talking about repeatedly crashing the process to find out its
memory map, shouldn't each process have a different randomized memory
layout?

Stefan