All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
	Cyrill Gorcunov
	<gorcunov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Nathan Lynch <ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>,
	Linux Containers
	<containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>,
	Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
	Daniel Lezcano <dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
Subject: Re: [RFC][PATCH 0/7 + tools] Checkpoint/restore mostly in the userspace
Date: Wed, 27 Jul 2011 01:46:57 +0200	[thread overview]
Message-ID: <20110726234657.GD28497@mtj.dyndns.org> (raw)
In-Reply-To: <20110726225911.GF14808-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>

Hello,

On Tue, Jul 26, 2011 at 03:59:11PM -0700, Matt Helsley wrote:
> > /proc/PID/fd already provides access to deleted files perfectly well
> > as most avid p0rn watchers would know (you can run mplayer on flash's
> > deleted temp files). ;)
> 
> Yup, access to the unlinked file contents. This is an example where
> things appear simple and complete in /proc yet it is insufficient.
> Here's what you'll need:
> 
> The string "(deleted)" in a file name is, strictly speaking, ambiguous --
> it does not mean the file is unlinked. You also can't infer that it is
> unlinked by stat()'ing that path since a different file could have
> been created in the same spot. For something unambiguous you'll
> have to add that information to /proc somewhere. fdinfo doesn't seem
> to be the right place since fds aren't unlinked -- files are. 

Hmm... but wouldn't fstat() after open reveal the original inode?  ie.

  $ cat fstat.c
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <unistd.h>
  #include <stdio.h>
  #include <fcntl.h>
  #include <assert.h>

  int main(int argc, char **argv)
  {
	  int fd;
	  struct stat st = {};

	  assert((fd = open(argv[1], O_RDONLY)) >= 0);
	  assert(!fstat(fd, &st));
	  printf("ino=%lu nlink=%lu\n",
		 (unsigned long)st.st_ino, (unsigned long)st.st_nlink);
	  return 0;
  }
  $ gcc -Wall -o fstat fstat.c
  $ cat > asdf &
  [7] 31908
  $ ./fstat asdf
  ino=9180912 nlink=1
  $ ./fstat /proc/31908/fd/1
  ino=9180912 nlink=1
  $ rm -f asdf
  $ ./fstat /proc/31908/fd/1
  ino=9180912 nlink=0
  $ touch asdf
  $ $ ./fstat asdf
  ino=9180915 nlink=1

I don't think anything is ambiguous.

> Then you've got to detect when they're the same unlinked file and share
> the copy upon restart. Or they could be different unlinked files
> with the same path in which case you should not share the copy. I suppose
> you'll have to check the device and inode and then see if any other task
> being checkpointed has it open... once for each of potentially thousands
> of fds being checkpointed.

Just build a hash table w/ fstat results.  It's O(nr_open_files)
whether you do that or not.

> Then there's the case where you've got one unlinked dentry for the
> file but a hardlink elsewhere. The /proc/PID/fd path won't point to the
> hardlinked location. So in order for those to be the same file upon
> restart you need to find the file somehow during checkpoint and/or
> restart.

You can determine whether search for another hardlink is necessary by
looking at nlink.  Hmm... I wonder whether open_by_handle_at() can be
used for this instead of scanning filesystem for matching inode
number.  Screening by nlink should eliminate most cases but if
open_by_handle_at() can deal with actual cases, it would be much
better.

> Finally these files often can be huge. Copying them elsewhere is a huge IO
> burden compared to careful relinking of the file. IO that could be better
> spent doing actual work.
>
> We solved all that with "relinking". It's possible to make a relink()
> syscall. The code I posted some time ago to containers@ can be easily
> adapted for that -- I did so for my testing of those patches. I'm not
> exactly sure how it would be done from userspace but I suspect it could
> be done.

Yeah, something like flink (like fstat for stat) should do it.  FS
methods operate on dentries anyway so it can be added in the vfs layer
proper if necessary.

Thanks.

-- 
tejun

  parent reply	other threads:[~2011-07-26 23:46 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-15 13:45 [RFC][PATCH 0/7 + tools] Checkpoint/restore mostly in the userspace Pavel Emelyanov
     [not found] ` <4E204466.8010204-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-15 13:45   ` [PATCH 0/1] proc: Introduce the /proc/<pid>/mfd/ directory Pavel Emelyanov
     [not found]     ` <4E20448A.5010207-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-21  7:21       ` Tejun Heo
2011-07-15 13:46   ` [PATCH 2/7] vfs: Introduce the fd closing helper Pavel Emelyanov
     [not found]     ` <4E2044A7.4030103-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-21 15:47       ` Serge E. Hallyn
2011-07-15 13:46   ` [PATCH 3/7] proc: Introduce the Children: line in /proc/<pid>/status Pavel Emelyanov
     [not found]     ` <4E2044C3.7050506-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-21  6:54       ` Tejun Heo
     [not found]         ` <20110721065436.GT3455-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-07-23  8:06           ` Pavel Emelyanov
     [not found]             ` <4E2A8116.1040309-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-23  8:41               ` Tejun Heo
     [not found]                 ` <20110723084110.GG21089-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-23  8:45                   ` Pavel Emelyanov
     [not found]                     ` <4E2A8A0E.5030208-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-23  8:50                       ` Tejun Heo
     [not found]                         ` <20110723085014.GI21089-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-23  8:51                           ` Pavel Emelyanov
2011-07-21 15:54       ` Serge E. Hallyn
2011-07-15 13:47   ` [PATCH 4/7] vfs: Add ->statfs callback for pipefs Pavel Emelyanov
     [not found]     ` <4E2044D6.3060205-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-21  6:59       ` Tejun Heo
2011-07-21 15:59       ` Serge E. Hallyn
2011-07-15 13:47   ` [PATCH 5/7] clone: Introduce the CLONE_CHILD_USEPID functionality Pavel Emelyanov
     [not found]     ` <4E2044EB.20001-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-21 16:04       ` Serge E. Hallyn
     [not found]         ` <20110721160459.GD19012-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2011-07-22 23:08           ` Matt Helsley
     [not found]             ` <20110722230848.GB16940-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-23  8:09               ` Pavel Emelyanov
2011-07-15 13:47   ` [PATCH 6/7] proc: Introduce the /proc/<pid>/dump file Pavel Emelyanov
     [not found]     ` <4E204500.6040800-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-16 22:57       ` Kirill A. Shutemov
     [not found]         ` <20110716225709.GA25606-oKw7cIdHH8eLwutG50LtGA@public.gmane.org>
2011-07-17  8:06           ` Cyrill Gorcunov
2011-07-21  6:44       ` Tejun Heo
     [not found]         ` <20110721064408.GR3455-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-07-23  8:11           ` Pavel Emelyanov
     [not found]             ` <4E2A8239.5060908-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-23  8:37               ` Tejun Heo
     [not found]                 ` <20110723083711.GF21089-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-23  8:49                   ` Pavel Emelyanov
     [not found]                     ` <4E2A8B12.4010709-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-23  8:58                       ` Tejun Heo
2011-07-15 13:48   ` [PATCH 7/7] binfmt: Introduce the binfmt_img exec handler Pavel Emelyanov
     [not found]     ` <4E204519.3040804-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-21  6:51       ` Tejun Heo
     [not found]         ` <20110721065127.GS3455-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-07-22 22:46           ` Matt Helsley
     [not found]             ` <20110722224617.GA16940-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-23  8:17               ` Pavel Emelyanov
     [not found]                 ` <4E2A83AC.6090504-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-23  8:45                   ` Tejun Heo
     [not found]                     ` <20110723084529.GH21089-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-23  8:51                       ` Pavel Emelyanov
     [not found]                         ` <4E2A8B7D.8010807-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-23  9:04                           ` Tejun Heo
2011-07-15 13:49   ` [TOOLS] To make use of the patches Pavel Emelyanov
     [not found]     ` <4E204554.6040901-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-22 23:45       ` Matt Helsley
     [not found]         ` <20110722234558.GD16940-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-23  8:32           ` Pavel Emelyanov
     [not found]             ` <4E2A8704.3030306-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2011-07-27 23:00               ` Matt Helsley
     [not found]                 ` <20110727230003.GE15501-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-28  8:23                   ` James Bottomley
2011-07-23  0:40       ` Reply #2: " Matt Helsley
     [not found]         ` <20110723004045.GC21563-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-23  8:33           ` Pavel Emelyanov
2011-07-15 15:01   ` [RFC][PATCH 0/7 + tools] Checkpoint/restore mostly in the userspace Tejun Heo
2011-07-18 13:27   ` Serge E. Hallyn
     [not found]     ` <20110718132759.GB8127-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2011-07-23  8:43       ` Pavel Emelyanov
2011-07-23  0:25   ` Matt Helsley
     [not found]     ` <20110723002558.GE16940-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-23  3:29       ` Matt Helsley
     [not found]         ` <20110723032945.GD21563-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-23  4:58           ` Tejun Heo
     [not found]             ` <20110723045842.GD21089-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-26 18:11               ` Matt Helsley
     [not found]                 ` <20110726181128.GD14808-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-26 22:45                   ` Tejun Heo
     [not found]                     ` <20110726224525.GC28497-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-26 23:07                       ` Matt Helsley
2011-07-23  3:53       ` Tejun Heo
     [not found]         ` <CAOS58YPqLSYi2xECUk4O5GG3s6aokT=VykmkL6UnAOzyHXNAgQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-07-26 22:59           ` Matt Helsley
     [not found]             ` <20110726225911.GF14808-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-26 23:46               ` Tejun Heo [this message]
     [not found]                 ` <20110726234657.GD28497-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-27  0:53                   ` Matt Helsley
     [not found]                     ` <20110727005341.GB15501-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-27 10:12                       ` Tejun Heo
     [not found]                         ` <20110727101228.GY2622-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-07-27 22:26                           ` Matt Helsley
2011-07-23  5:10       ` Tejun Heo
     [not found]         ` <20110723051005.GE21089-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-26 22:02           ` Matt Helsley
     [not found]             ` <20110726220215.GE14808-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-26 22:21               ` Tejun Heo
     [not found]                 ` <20110726222109.GB28497-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2011-07-27  0:06                   ` Matt Helsley
     [not found]                     ` <20110727000651.GA15501-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-27 12:01                       ` Tejun Heo
     [not found]                         ` <20110727120114.GZ2622-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-07-27 21:35                           ` Matt Helsley
     [not found]                             ` <20110727213510.GC15501-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
2011-07-28  7:21                               ` Tejun Heo
     [not found]                                 ` <20110728072141.GB2622-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2011-07-28  7:23                                   ` Tejun Heo
2011-07-28  8:37                                   ` James Bottomley
2011-07-28  9:10                                     ` Tejun Heo
2011-07-23  8:39       ` Pavel Emelyanov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110726234657.GD28497@mtj.dyndns.org \
    --to=tj-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
    --cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
    --cc=dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org \
    --cc=gorcunov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    --cc=ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org \
    --cc=serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    --cc=xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.