linux-kernel.vger.kernel.org archive mirror
From: Helge Hafting <helgehaf@aitel.hist.no>
To: Stephan von Krawczynski <skraw@ithnet.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: FS: hardlinks on directories
Date: Tue, 05 Aug 2003 14:51:46 +0200
Message-ID: <3F2FA862.2070401@aitel.hist.no>
In-Reply-To: 20030805003210.2c7f75f6.skraw@ithnet.com

Stephan von Krawczynski wrote:
> On Mon, 4 Aug 2003 16:09:28 -0500
> Jesse Pollard <jesse@cats-chateau.net> wrote:
> 
> 
>>>tar --dereference loops on symlinks _today_, to name an example.
>>>All you have to do is to provide a way to find out if a directory is a
>>>hardlink, nothing more. And that should be easy.
>>
>>Yup - because a symlink is NOT a hard link. it is a file.
>>
>>If you use hard links then there is no way to determine which is the "proper"
>>link.
> 
> 
> Where does it say that an fs is not allowed to give you this information in
> some way?

Because there is no such thing as the "proper" link.

Every filename and directoryname is a "hard link" to
some inode.  The "ln" command lets you make more
links to the same inode, there is _no_ difference
at all between the "first" and the "second" (or third) hard link.
There is no information recorded anywhere about which one
was "first".  You can remove the "first" and still
use the file through the second link.
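
A small sketch of the point above, in Python rather than shell. The
file names and the helper demo_hard_links() are made up for
illustration; os.link() is the same link(2) operation that "ln" uses.
It assumes a filesystem that supports hard links on regular files.

```python
import os
import tempfile

def demo_hard_links():
    """Create a file, add a second hard link, then remove the 'first' name."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")    # hypothetical names
        second = os.path.join(d, "second")
        with open(first, "w") as f:
            f.write("data")
        os.link(first, second)              # what "ln first second" does

        s1, s2 = os.stat(first), os.stat(second)
        # Both names resolve to the same inode; stat() gives no hint
        # about which name was created earlier.
        same_inode = (s1.st_dev, s1.st_ino) == (s2.st_dev, s2.st_ino)
        nlink_both = s1.st_nlink

        os.unlink(first)                    # remove the "first" link
        with open(second) as f:
            content = f.read()              # data is still reachable
        nlink_after = os.stat(second).st_nlink
        return same_inode, nlink_both, content, nlink_after
```

The only thing stat() reports is the count of links, never their order
or origin.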


> 
>>>>It was also done in one of the "popular" code management systems under
>>>>unix. (it allowed a "mount" of the system root to be under the CVS
>>>>repository to detect unauthorized modifications...). Unfortunately,
>>>>the system could not be backed up anymore. 1. A dump of the CVS
>>>>filesystem turned into a dump of the entire system... 2. You could not
>>>>restore the backups... The dumps failed (bru at the time) because the
>>>>pathnames got too long, the restore failed since it ran out of disk space
>>>>due to the multiple copies of the tree being created.
>>>
>>>And they never heard of "--exclude" in tar, did they?
>>
>>Doesn't work. Remember - you have to --exclude EVERY possible loop. And 
>>unless you know ahead of time, you can't exclude it. The only way we found
>>to reliably do the backup was to dismount the CVS.
> 
> 
> And if you completely run out of ideas in your wild-mounts-throughout-the-tree
> problem you should simply use "tar -l".
> 
> And in just the same way fs could provide a mode bit saying "hi, I am a
> hardlink", and tar can then easily provide option number 1345 saying "stay away
> from hardlinks".
> 
Then you stay away from each and every file on the system, because
every file is a hard link.  This is useless.

Making up a new bit in the directory saying "this directory entry
was created with 'ln' as opposed to 'open'" is indeed possible,
but wouldn't solve your problems.  Consider this:

A file is created in the normal way by a user.  (link count=1)
Someone else finds it useful and creates a link to it. (link count=2)
The first user doesn't need the file anymore and deletes it. (link count=1)
The file still exists, because disk blocks are only released when
the link count goes to zero.
The second user can still use the file through his link, and thinks
it is safe.  But the file is no longer backed up because your
tar avoids the directory entry created with 'ln', and there is no
longer any directory entry created the normal way.  Similar
problems arise for all other utilities you might want to modify
to use this sort of linking flag.

The problem doesn't arise with symlinks because
the file really disappears when deleted, leaving invalid
symlinks.  (Everybody knows that, and nobody thinks they
can keep a file by making a symlink, as you can with a hardlink.)

The number of hard links to some inode is a reference count,
the number of symlinks is not.
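
To illustrate the difference, here is a minimal sketch (again Python;
the names "file" and "soft" and the helper are only for illustration).
A symlink does not raise the link count of its target, and it is left
dangling when the target is deleted.

```python
import os
import tempfile

def demo_symlink_vs_hardlink():
    """Show that a symlink is not counted in st_nlink and can dangle."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "file")   # hypothetical names
        soft = os.path.join(d, "soft")
        with open(target, "w") as f:
            f.write("data")
        os.symlink(target, soft)

        # The symlink is a separate small file holding a path;
        # the target's reference count is unchanged.
        nlink_with_symlink = os.stat(target).st_nlink

        os.unlink(target)                  # link count drops to zero
        # The symlink itself still exists, but it points at nothing.
        dangling = os.path.islink(soft) and not os.path.exists(soft)
        return nlink_with_symlink, dangling
```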

Another post of yours:
 > tar --dereference loops on symlinks _today_, to name an example.
 > All you have to do is to provide a way to find out if a directory is a
 > hardlink, nothing more. And that should be easy.

There is currently no way to find out, as explained above.  And a
"flag" solution is useless, as you then will get files (and directories)
that exist only as links when the "original" link is deleted.

Another post:

 >> Things would break badly if the hierarchy became an arbitrary graph.
 > Care to name one? What exactly is the rule you see broken? Sure you can build
 > loops,

Loops are funny in more than one way.  Tools can be made that detect
them via inode numbers, and handle them properly. This can be costly in
time and memory, but backing up or otherwise traversing a generic graph
is possible.
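
A rough sketch of such a tool, assuming Python and using symlinks to
build the loop (directory hard links cannot be created on Linux, so
this only stands in for them). The function names are invented for
this example. The "seen" set is where the memory cost comes from: it
grows with every directory visited.

```python
import os
import tempfile

def walk_once(root):
    """Yield every directory under root once, following links.

    Directories are identified by (st_dev, st_ino), so a loop is
    detected when the same inode shows up a second time.
    """
    seen = set()
    stack = [root]
    while stack:
        path = stack.pop()
        st = os.stat(path)                 # follows symlinks
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue                       # already visited: a loop or a second link
        seen.add(key)
        yield path
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=True):
                    stack.append(entry.path)

def demo_loop():
    """Build top/A/B/C where C links back to A, then count directories visited."""
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "A", "B"))
        os.symlink(os.path.join(top, "A"), os.path.join(top, "A", "B", "C"))
        return len(list(walk_once(top)))
```

demo_loop() visits the top directory, A and B, and skips C because it
resolves to the inode already recorded for A.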

Even more fun is when you have a directory loop like this:

mkdir A
cd A
mkdir B
cd B
make hard link C back to A

cd ../..
rmdir A

You have now removed A from your home directory, but the
directory itself did not disappear because it had
another hard link from C in B.

Expected behaviour, but there is no longer a path
from _anywhere_ to the B and C directories.  That
means they occupy disk space but cannot be deleted,
accessed or used in any way short of garbage collecting
the entire directory structure. That can be a big job.
The space usage can be big too, for the inaccessible B or C
directories might hold some large files.
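
Since the loop cannot be built on a real filesystem, here is a toy
in-memory model of it (Python; the Dir class and the helpers are
purely hypothetical, and "." and ".." entries are ignored). It shows
the link counts staying above zero while nothing is reachable from
the top any more.

```python
class Dir:
    """Toy directory: named entries plus a count of links pointing at it."""
    def __init__(self, name):
        self.name = name
        self.entries = {}   # entry name -> Dir
        self.nlink = 0      # number of directory entries referring to this Dir

def link(parent, name, child):
    parent.entries[name] = child
    child.nlink += 1

def unlink(parent, name):
    child = parent.entries.pop(name)
    child.nlink -= 1        # space would be freed only at zero
    return child

def reachable(root):
    """Return the ids of all directories reachable from root."""
    seen, stack = set(), [root]
    while stack:
        d = stack.pop()
        if id(d) in seen:
            continue
        seen.add(id(d))
        stack.extend(d.entries.values())
    return seen

def demo_orphaned_loop():
    home, a, b = Dir("home"), Dir("A"), Dir("B")
    link(home, "A", a)      # mkdir A
    link(a, "B", b)         # mkdir B inside A
    link(b, "C", a)         # the hypothetical hard link C back to A
    unlink(home, "A")       # remove A from the home directory
    live = reachable(home)
    return a.nlink, b.nlink, id(a) in live, id(b) in live
```

demo_orphaned_loop() returns (1, 1, False, False): both directories
keep a nonzero link count, so reference counting never frees them, and
only a full reachability scan like reachable() can find them.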

This isn't easily avoidable by "not doing stupid things"
either: interactions between several users who don't
know what the others do can easily form loops by
linking and moving some directories around.

Helge Hafting

