Re: [Xen-users] "xl restore" leaks a file descriptor?

From: Ian Campbell @ 2015-08-11 15:48 UTC
  To: Andrew Armenia, xen-devel, Wei Liu, Ian Jackson; +Cc: xen-users

On Tue, 2015-08-11 at 11:13 -0400, Andrew Armenia wrote:
> It's the checkpoint file - i.e. the command line argument to xl
> restore - that is being leaked.

Thanks.

[...]
> So the checkpoint file is clearly being leaked.

Indeed. I confirmed this with the current development version as well, using
ls -l /proc/<pid>/fd, which shows an fd still open on a deleted file:

# ps aux| grep xl
root     20465  0.0  0.2 106036   984 ?        SLsl 15:42   0:00 xl restore save
# ls -l /proc/20465/fd
[...]
lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save
[...]
# rm /root/save
# ls -l /proc/20465/fd
[...]
lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save (deleted)
[...]
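
The same check can be scripted. What follows is only a minimal sketch, assuming
a Linux /proc and permission to read the target process's fd directory. The
helper name deleted_fds is hypothetical and is not part of xl or libxl. It
relies on the kernel appending " (deleted)" to the link target, so a file whose
real name ends in that string would be a false positive.

```python
import os


def deleted_fds(pid: int) -> dict:
    """Map fd number -> link target for descriptors whose file was unlinked.

    Hypothetical helper: it mirrors `ls -l /proc/<pid>/fd` and keeps only
    the entries the kernel marks with a trailing " (deleted)".
    """
    fd_dir = f"/proc/{pid}/fd"
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.endswith(" (deleted)"):
            found[int(name)] = target
    return found
```

Calling deleted_fds(20465) against the process above would be expected to
report fd 7 once /root/save has been removed.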

>  Its space is not freed
> until the 'xl restore' process is ended by shutting down the domain:
[...]
> 
> It seems like xl restore should close the checkpoint file as soon as
> it's done restoring the domain, allowing the space to be freed, but
> that's clearly not happening.

Right. In fact xl sets the file to be close-on-exec right after opening it,
which is before the daemonisation step, so it ought to be closed
automatically, but isn't for some reason.

My working theory is that something in the machinery which spawns the save
helper is defeating the use of CLOEXEC, perhaps by dup2() or perhaps by
unsetting CLOEXEC.
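
To illustrate the dup2() half of that theory (not the actual libxl code path,
which has not been identified here), the sketch below shows that a descriptor
created by dup2() does not carry FD_CLOEXEC even when the original does. The
temporary file stands in for the checkpoint file, and the names has_cloexec and
demo are made up for the example.

```python
import fcntl
import os
import tempfile


def has_cloexec(fd: int) -> bool:
    """Return True if FD_CLOEXEC is set on fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def demo() -> dict:
    # Stand-in for the checkpoint file; no real save file is involved.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.close()

    fd = os.open(tmp.name, os.O_RDONLY)
    # Set close-on-exec explicitly, right after opening the file.
    fcntl.fcntl(fd, fcntl.F_SETFD, fcntl.FD_CLOEXEC)

    # Reserve a free descriptor number to use as the dup2() target.
    target = os.open(os.devnull, os.O_RDONLY)
    # inheritable=True requests plain dup2() semantics: the new descriptor
    # refers to the same open file but starts with FD_CLOEXEC cleared.
    os.dup2(fd, target, inheritable=True)

    result = {"original": has_cloexec(fd), "duplicate": has_cloexec(target)}

    os.close(fd)
    os.close(target)
    os.unlink(tmp.name)
    return result
```

Here demo() reports the flag as set on the original descriptor and clear on the
duplicate, so a duplicate made this way would survive an exec and keep the file
open.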

Anyway, thanks for reporting. I've copied the devel list and the 4.6 RM. Wei,
this probably ought to be a blocker for 4.6 (and the fix ought ultimately
to be backported to 4.4 onwards at least).

NB: This leak seems to be independent of the switch to migration v2.

Ian.

> -Andrew
> 
> On Aug 11, 2015 04:55, "Ian Campbell" <ian.campbell@citrix.com> wrote:
> > 
> > On Fri, 2015-08-07 at 12:50 -0400, Andrew Armenia wrote:
> > > The issue appears to occur with any state file - not just one in
> > > particular.
> > 
> > Please give some specific examples e.g. paths to some of the files to
> > which a fd has been leaked. I'm trying to determine which state files I
> > should be investigating, since there are several things which an end user
> > might consider a "state file".
> > 
> > Ian.
