On Thu, Mar 18, 2021 at 12:58:46PM +0100, Christian Schoenebeck wrote:
> On Wednesday, 17 March 2021 13:57:47 CET Jiachen Zhang wrote:
> > On Wed, Mar 17, 2021 at 7:50 PM Christian Schoenebeck
> > <qemu_oss@crudebyte.com> wrote:
> > > On Wednesday, 17 March 2021 11:05:32 CET Stefan Hajnoczi wrote:
> > > > On Fri, Dec 18, 2020 at 05:39:34PM +0800, Jiachen Zhang wrote:
> > > > > Thanks for the suggestions. Actually, we chose to save all state
> > > > > information to QEMU because a virtiofsd has the same lifecycle as
> > > > > its QEMU master. However, saving things to a file does avoid
> > > > > communication with QEMU, and we no longer need to increase the
> > > > > complexity of the vhost-user protocol. The suggestion to save fds
> > > > > to systemd is also very reasonable if we don't consider the
> > > > > lifecycle issues; we will try it.
> > > >
> > > > Hi,
> > > > We recently discussed crash recovery in the virtio-fs bi-weekly call
> > > > and I read some of this email thread because it's a topic I'm
> > > > interested in.
> > >
> > > I just had a quick look over the patches so far. Shouldn't there be
> > > some kind of constraint for an automatic reconnection feature after a
> > > crash to prevent this being exploited by ROP brute force attacks?
> > >
> > > E.g. adding some (maybe continuously increasing) delay and/or limiting
> > > the number of reconnects within a certain time frame would come to my
> > > mind.
> > >
> > > Best regards,
> > > Christian Schoenebeck
> >
> > Thanks, Christian. I am still trying to figure out the details of the ROP
> > attacks.
> >
> > However, QEMU's vhost-user reconnection is based on chardev socket
> > reconnection. The socket reconnection can be enabled with "--chardev
> > socket,...,reconnect=N" on the QEMU command line, where N means QEMU will
> > try to reconnect the disconnected socket every N seconds. We can increase
> > N to increase the reconnect delay.
> > If we want to change the reconnect delay dynamically, I think we should
> > change the chardev socket reconnection code. It is a more generic
> > mechanism than vhost-user-fs and the vhost-user backend.
> >
> > By the way, I also considered the socket reconnection delay from a
> > performance perspective. As the reconnection delay increases, an
> > application doing I/O in the guest will suffer larger tail latency. And
> > for now, the smallest delay is 1 second, which is rather large for
> > high-performance virtual I/O devices today. I think maybe a more
> > performant and safer reconnect delay adjustment mechanism should be
> > considered in the future. What are your thoughts?
>
> So with N=1 an attacker could e.g. bypass a 16-bit PAC by brute force in
> ~18 hours (e.g. on Arm if PAC + MTE was enabled). With 24-bit PAC (no MTE)
> it would be ~194 days. Independent of what architecture and defense
> mechanism is used, there is always the possibility though that some kind
> of side channel attack exists that might require a much lower number of
> attempts. So in an untrusted environment I would personally limit the
> number of automatic reconnects and rather accept downtime for further
> investigation if a suspiciously high number of crashes happened.
>
> And yes, if a dynamic delay scheme was deployed in future then starting
> with a value smaller than 1 second would make sense.

If we're talking about repeatedly crashing the process to find out its
memory map, shouldn't each process have a different randomized memory
layout?

Stefan
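
For reference, the ~18 hour and ~194 day figures quoted above can be
reproduced with a rough back-of-the-envelope sketch. It assumes one guess
per reconnect interval and that the attacker has to walk the whole PAC
space in the worst case. The function names and the capped-doubling
backoff are hypothetical illustrations, not existing QEMU behavior or
options, and the sketch says nothing about the memory layout
randomization question.

```python
def worst_case_seconds(pac_bits: int, delay_s: float = 1.0) -> float:
    """Worst-case time to try every PAC value, one try per reconnect.

    Assumption: each wrong guess crashes the daemon and costs exactly one
    fixed reconnect delay of delay_s seconds (reconnect=N with N=delay_s).
    """
    return (2 ** pac_bits) * delay_s


def backoff_total_seconds(attempts: int, initial_s: float,
                          factor: float, cap_s: float) -> float:
    """Total wait for `attempts` reconnects under a capped growing delay.

    Hypothetical scheme: the delay starts at initial_s, is multiplied by
    factor after every reconnect, and never exceeds cap_s.
    """
    if initial_s <= 0 or factor <= 1:
        raise ValueError("need initial_s > 0 and factor > 1")
    total = 0.0
    delay = initial_s
    done = 0
    # Sum the growing delays until the cap is reached.
    while done < attempts and delay < cap_s:
        total += delay
        delay *= factor
        done += 1
    # Every remaining reconnect waits the capped delay.
    total += (attempts - done) * cap_s
    return total


if __name__ == "__main__":
    print(f"16-bit PAC, N=1: {worst_case_seconds(16) / 3600:.1f} hours")
    print(f"24-bit PAC, N=1: {worst_case_seconds(24) / 86400:.1f} days")
    # Hypothetical backoff: start at 0.1 s, double each time, cap at 60 s.
    days = backoff_total_seconds(2 ** 16, 0.1, 2.0, 60.0) / 86400
    print(f"16-bit PAC, capped backoff: {days:.1f} days")
```

Under these assumptions a capped backoff keeps the first few reconnects
fast, which addresses the tail latency concern, while making a sustained
brute-force run much slower than a fixed 1 second delay.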