* [Qemu-devel] QEMU leaves pidfile behind on exit
@ 2018-02-09 19:12 Shaun Reitan
2018-02-13 16:28 ` Daniel P. Berrangé
0 siblings, 1 reply; 4+ messages in thread
From: Shaun Reitan @ 2018-02-09 19:12 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
QEMU leaves the pidfile behind on a clean exit when using the option
-pidfile /var/run/qemu.pid.
Should QEMU leave it behind or should it clean up after itself?
I'm willing to take a crack at a patch to fix the issue, but before I
do, I want to make sure that leaving the pidfile behind was not
intentional?
--
Shaun Reitan
NDCHost.com
* Re: [Qemu-devel] QEMU leaves pidfile behind on exit
From: Daniel P. Berrangé @ 2018-02-13 16:28 UTC (permalink / raw)
To: Shaun Reitan; +Cc: qemu-devel, pbonzini
On Fri, Feb 09, 2018 at 07:12:59PM +0000, Shaun Reitan wrote:
> QEMU leaves the pidfile behind on a clean exit when using the option
> -pidfile /var/run/qemu.pid.
>
> Should QEMU leave it behind or should it clean up after itself?
>
> I'm willing to take a crack at a patch to fix the issue, but before I do, I
> want to make sure that leaving the pidfile behind was not intentional?
If QEMU deletes the pidfile on exit then, with the current pidfile
acquisition logic, a race condition is possible.

To acquire the pidfile we do

  1. fd = open()
  2. lockf(fd)

If the first QEMU, which currently owns the pidfile, unlinks it while
a second QEMU is between steps 1 & 2, the second QEMU will
acquire the pidfile successfully (which is fine), but the pidfile
is now unlinked. This is not fine, because a third QEMU can now come
along and try to acquire the pidfile (by creating a new one) and succeed,
despite the second QEMU still owning the (now unlinked) pidfile.
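
The heart of the race is that lockf() works on the open file, not on
its name. The following is a minimal single-process sketch of that
effect, not QEMU code: the helper name lock_after_unlink() is
hypothetical, and the unlink() call stands in for the first QEMU
deleting the pidfile at exit.

```c
#define _XOPEN_SOURCE 700   /* exposes lockf() under strict -std=c11 */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if a lock was taken on a file that no
 * longer has a name, 0 if the name still exists, -1 on any error. */
int lock_after_unlink(const char *path)
{
    struct stat st;

    assert(path != NULL);

    int fd = open(path, O_CREAT | O_RDWR, 0644);    /* step 1 */
    if (fd < 0) {
        return -1;
    }

    /* Stand-in for the first QEMU removing the pidfile on exit. */
    if (unlink(path) < 0) {
        close(fd);
        return -1;
    }

    /* Step 2: the lock is granted, but on an inode without a name. */
    if (lockf(fd, F_TLOCK, 0) < 0) {
        close(fd);
        return -1;
    }

    /* Nothing is left at 'path', so a later open(O_CREAT) by another
     * process would create and lock a brand new file. */
    int orphaned = (stat(path, &st) < 0);

    close(fd);
    return orphaned;
}
```

Called on a scratch path, the helper should report that the lock was
granted even though the path no longer exists, which is the state the
second QEMU ends up in.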
It is possible to deal with this race by making qemu_create_pidfile
more intelligent [1]. It would have to do

  1. fd = open(filename)
  2. fstat(fd)
  3. lockf(fd)
  4. stat(filename)

It must then compare the results of steps 2 and 4 to ensure the pidfile it
acquired is the same as the one on disk. With this change, it would
be safe for QEMU to delete the pidfile on exit.
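
The four steps can be sketched roughly as follows. This is an
illustration under assumptions, not QEMU's qemu_create_pidfile() and
not the libvirt code: the function name acquire_pidfile(), the retry
loop, and the choice to compare both st_dev and st_ino are made up for
the example.

```c
#define _XOPEN_SOURCE 700   /* exposes lockf() under strict -std=c11 */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper. Returns a locked fd on success, or -1 if another
 * process holds the pidfile or an error occurs. */
int acquire_pidfile(const char *path)
{
    assert(path != NULL);

    for (;;) {
        struct stat a, b;

        int fd = open(path, O_CREAT | O_RDWR, 0644);     /* step 1 */
        if (fd < 0) {
            return -1;
        }
        if (fstat(fd, &a) < 0 ||                          /* step 2 */
            lockf(fd, F_TLOCK, 0) < 0) {                  /* step 3 */
            close(fd);
            return -1;
        }
        if (stat(path, &b) == 0 &&                        /* step 4 */
            a.st_dev == b.st_dev && a.st_ino == b.st_ino) {
            /* The locked file is the one on disk: record our PID. */
            char buf[32];
            int len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());

            if (ftruncate(fd, 0) < 0 || write(fd, buf, len) != len) {
                close(fd);
                return -1;
            }
            return fd;
        }
        /* The file was unlinked or replaced in between: drop the lock
         * on the orphaned file and start again. */
        close(fd);
    }
}
```

The caller keeps the returned fd open for the lifetime of the process,
since closing it releases the lock. Only a process that holds the lock
and has passed the step 4 check should unlink the file on exit.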
Regards,
Daniel
[1] See the equivalent libvirt logic for pidfile acquisition in
https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virpidfile.c;h=58ab29f77f2cfb8583447112dae77a07446bc627;hb=HEAD#l384
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [Qemu-devel] QEMU leaves pidfile behind on exit
From: Laszlo Ersek @ 2018-02-13 19:35 UTC (permalink / raw)
To: Daniel P. Berrangé, Shaun Reitan; +Cc: pbonzini, qemu-devel
On 02/13/18 17:28, Daniel P. Berrangé wrote:
> On Fri, Feb 09, 2018 at 07:12:59PM +0000, Shaun Reitan wrote:
>> QEMU leaves the pidfile behind on a clean exit when using the option
>> -pidfile /var/run/qemu.pid.
>>
>> Should QEMU leave it behind or should it clean up after itself?
>>
>> I'm willing to take a crack at a patch to fix the issue, but before I do, I
>> want to make sure that leaving the pidfile behind was not intentional?
>
> If QEMU deletes the pidfile on exit then, with the current pidfile
> acquisition logic, there's a race condition possible:
>
> To acquire we do
>
> 1. fd = open()
> 2. lockf(fd)
>
> If the first QEMU that currently owns the pidfile unlinks it, while
> a second qemu is in between steps 1 & 2, the second QEMU will
> acquire the pidfile successfully (which is fine) but the pidfile
> is now unlinked. This is not fine, because a 3rd qemu can now come
> and try to acquire the pidfile (by creating a new one) and succeed,
> despite the second qemu still owning the (now unlinked) pidfile.
>
> It is possible to deal with this race by making qemu_create_pidfile
> more intelligent [1]. It would have to do
>
> 1. fd = open(filename)
> 2. fstat(fd)
> 3. lockf(fd)
> 4. stat(filename)
>
> It must then compare the results of 2 + 4 to ensure the pidfile it
> acquired is the same as the one on disk. With this change, it would
> be safe for QEMU to delete the pidfile on exit.
Why don't we just open the pidfile with (O_CREAT | O_EXCL)? O_EXCL is
supposed to be atomic.
... The open(2) manual on Linux says,

    On NFS, O_EXCL is supported only when using NFSv3 or
    later on kernel 2.6 or later. In NFS environments where
    O_EXCL support is not provided, programs that rely on it
    for performing locking tasks will contain a race
    condition. [...]

Sigh.
> [1] See the equiv libvirt logic for pidfile acquisition in
> https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virpidfile.c;h=58ab29f77f2cfb8583447112dae77a07446bc627;hb=HEAD#l384
>
To my knowledge, "same file" should be checked with:
a.st_dev == b.st_dev && a.st_ino == b.st_ino
Example:

- "filename" is "/var/run/qemu.pid"
- "/var/run" is originally a symbolic link to "/mnt/fs1/"
- between steps #1 and #4, "/var/run" is re-created as a symbolic link
  to "/mnt/fs2/" -- a different filesystem from fs1
- "/mnt/fs2/qemu.pid" happens to have the same inode number as
  "/mnt/fs1/qemu.pid"

Comparing st_ino alone would then wrongly conclude that the two are the
same file; st_dev is what tells them apart.
Thanks,
Laszlo
* Re: [Qemu-devel] QEMU leaves pidfile behind on exit
From: Daniel P. Berrangé @ 2018-02-14 8:46 UTC (permalink / raw)
To: Laszlo Ersek; +Cc: Shaun Reitan, pbonzini, qemu-devel
On Tue, Feb 13, 2018 at 08:35:23PM +0100, Laszlo Ersek wrote:
> On 02/13/18 17:28, Daniel P. Berrangé wrote:
> > On Fri, Feb 09, 2018 at 07:12:59PM +0000, Shaun Reitan wrote:
> >> QEMU leaves the pidfile behind on a clean exit when using the option
> >> -pidfile /var/run/qemu.pid.
> >>
> >> Should QEMU leave it behind or should it clean up after itself?
> >>
> >> I'm willing to take a crack at a patch to fix the issue, but before I do, I
> >> want to make sure that leaving the pidfile behind was not intentional?
> >
> > If QEMU deletes the pidfile on exit then, with the current pidfile
> > acquisition logic, there's a race condition possible:
> >
> > To acquire we do
> >
> > 1. fd = open()
> > 2. lockf(fd)
> >
> > If the first QEMU that currently owns the pidfile unlinks it, while
> > a second qemu is in between steps 1 & 2, the second QEMU will
> > acquire the pidfile successfully (which is fine) but the pidfile
> > is now unlinked. This is not fine, because a 3rd qemu can now come
> > and try to acquire the pidfile (by creating a new one) and succeed,
> > despite the second qemu still owning the (now unlinked) pidfile.
> >
> > It is possible to deal with this race by making qemu_create_pidfile
> > more intelligent [1]. It would have to do
> >
> > 1. fd = open(filename)
> > 2. fstat(fd)
> > 3. lockf(fd)
> > 4. stat(filename)
> >
> > It must then compare the results of 2 + 4 to ensure the pidfile it
> > acquired is the same as the one on disk. With this change, it would
> > be safe for QEMU to delete the pidfile on exit.
>
> Why don't we just open the pidfile with (O_CREAT | O_EXCL)? O_EXCL is
> supposed to be atomic.
O_EXCL isn't a good idea because if QEMU crashes without cleaning up,
you have a stale pidfile, and O_EXCL will turn that into a failure to
acquire the pidfile. The key point of using lockf() is to ensure we can
cope reliably with stale pidfiles.
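
The difference can be shown with a small sketch, not taken from QEMU.
The helper name stale_pidfile_demo() is hypothetical, and the leftover
file is simulated by creating it and closing it without holding a lock.

```c
#define _XOPEN_SOURCE 700   /* exposes lockf() under strict -std=c11 */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo. Returns 1 if a leftover pidfile blocks O_EXCL with
 * EEXIST while open() + lockf() can still claim it, 0 otherwise, and
 * -1 if the setup itself fails. */
int stale_pidfile_demo(const char *path)
{
    assert(path != NULL);

    /* Simulate a crashed QEMU: the file exists, nobody holds a lock. */
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0) {
        return -1;
    }
    close(fd);

    /* O_EXCL refuses because the name exists, stale or not. */
    int excl = open(path, O_CREAT | O_EXCL | O_RDWR, 0644);
    int excl_blocked = (excl < 0 && errno == EEXIST);
    if (excl >= 0) {
        close(excl);
    }

    /* lockf() only fails if another process currently holds the lock. */
    fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0) {
        unlink(path);
        return -1;
    }
    int lock_ok = (lockf(fd, F_TLOCK, 0) == 0);

    close(fd);
    unlink(path);
    return excl_blocked && lock_ok;
}
```

With O_EXCL, recovering from the leftover file would need some extra
rule for deciding when it is safe to delete it, which is the problem
the lock already solves.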
>
> ... The open(2) manual on Linux says,
>
>     On NFS, O_EXCL is supported only when using NFSv3 or
>     later on kernel 2.6 or later. In NFS environments where
>     O_EXCL support is not provided, programs that rely on it
>     for performing locking tasks will contain a race
>     condition. [...]
>
> Sigh.
>
> > [1] See the equiv libvirt logic for pidfile acquisition in
> > https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/util/virpidfile.c;h=58ab29f77f2cfb8583447112dae77a07446bc627;hb=HEAD#l384
> >
>
> To my knowledge, "same file" should be checked with:
>
> a.st_dev == b.st_dev && a.st_ino == b.st_ino
>
> Example:
> - "filename" is "/var/run/qemu.pid"
> - "/var/run" is originally a symbolic link to "/mnt/fs1/"
> - between steps #1 and #4, "/var/run" is re-created as a symbolic link
> to "/mnt/fs2/" -- a different filesystem from fs1
> - "/mnt/fs2/qemu.pid" happens to have the same inode number as
> "/mnt/fs1/qemu.pid"
I don't really think we need to worry about the admin changing symlinks
like this while QEMU is in the middle of acquiring the pidfile.
Regards,
Daniel