linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* fgetattr/fsetattr file operation
@ 2009-06-03 17:32 Brian Molnar
  2009-06-04 10:27 ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Molnar @ 2009-06-03 17:32 UTC (permalink / raw)
  To: linux-fsdevel

There was some talk on this list a year or two ago about adding
f_op->fgetattr and f_op->fsetattr methods to the VFS file_operations
struct. For some reason or another, that was dropped on the floor, but
I'd like to pick that back up and discus it once again.

In the file-system I'm working on, there is a distinction between
stat(2)s done against a pathname and fstat(2)s done against an open
file descriptor. The results of the two are expected to differ in some
circumstances. Moreover, I'm building this file-system in FUSE, and
while I've managed to hack the underlying fuse code to achieve this
behavior, it does so at a substantial performance cost. The more
'correct' solution here would be to export an fgetattr/fsetattr
interface at VFS layer, then percolate that change up to the FUSE
interface.

Cheers,
- Brian Molnar

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: fgetattr/fsetattr file operation
  2009-06-03 17:32 fgetattr/fsetattr file operation Brian Molnar
@ 2009-06-04 10:27 ` Miklos Szeredi
  2009-06-04 17:42   ` Brian Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2009-06-04 10:27 UTC (permalink / raw)
  To: brian.molnar; +Cc: linux-fsdevel

On Wed, 3 Jun 2009, Brian Molnar wrote:
> There was some talk on this list a year or two ago about adding
> f_op->fgetattr and f_op->fsetattr methods to the VFS file_operations
> struct. For some reason or another, that was dropped on the floor, but
> I'd like to pick that back up and discus it once again.
> 
> In the file-system I'm working on, there is a distinction between
> stat(2)s done against a pathname and fstat(2)s done against an open
> file descriptor. The results of the two are expected to differ in some
> circumstances. Moreover, I'm building this file-system in FUSE, and
> while I've managed to hack the underlying fuse code to achieve this
> behavior, it does so at a substantial performance cost. The more
> 'correct' solution here would be to export an fgetattr/fsetattr
> interface at VFS layer, then percolate that change up to the FUSE
> interface.

Can you tell us a bit more about your use case, and how you worked
around the limitation?

Attaching an untested patch that adds f_op->fgetattr().  Testing and
comments are welcome.

Thanks,
Miklos


Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h	2009-06-02 15:55:18.000000000 +0200
+++ linux-2.6/include/linux/fs.h	2009-06-04 11:56:04.000000000 +0200
@@ -1508,6 +1508,7 @@ struct file_operations {
 	ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int);
 	ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int);
 	int (*setlease)(struct file *, long, struct file_lock **);
+	int (*fgetattr)(struct file *, struct kstat *);
 };
 
 struct inode_operations {
Index: linux-2.6/fs/fuse/file.c
===================================================================
--- linux-2.6.orig/fs/fuse/file.c	2009-06-04 11:42:10.000000000 +0200
+++ linux-2.6/fs/fuse/file.c	2009-06-04 12:02:48.000000000 +0200
@@ -1449,6 +1449,17 @@ static int fuse_file_flock(struct file *
 	return err;
 }
 
+static int fuse_file_fgetattr(struct file *file, struct kstat *stat)
+{
+	struct inode *inode = file->f_dentry->d_inode;
+	struct fuse_conn *fc = get_fuse_conn(inode);
+
+	if (!fuse_allow_task(fc, current))
+		return -EACCES;
+
+	return fuse_update_attributes(inode, stat, file, NULL);
+}
+
 static sector_t fuse_bmap(struct address_space *mapping, sector_t block)
 {
 	struct inode *inode = mapping->host;
@@ -1927,6 +1938,7 @@ static const struct file_operations fuse
 	.fsync		= fuse_fsync,
 	.lock		= fuse_file_lock,
 	.flock		= fuse_file_flock,
+	.fgetattr	= fuse_file_fgetattr,
 	.splice_read	= generic_file_splice_read,
 	.unlocked_ioctl	= fuse_file_ioctl,
 	.compat_ioctl	= fuse_file_compat_ioctl,
@@ -1944,6 +1956,7 @@ static const struct file_operations fuse
 	.fsync		= fuse_fsync,
 	.lock		= fuse_file_lock,
 	.flock		= fuse_file_flock,
+	.fgetattr	= fuse_file_fgetattr,
 	.unlocked_ioctl	= fuse_file_ioctl,
 	.compat_ioctl	= fuse_file_compat_ioctl,
 	.poll		= fuse_file_poll,
Index: linux-2.6/fs/stat.c
===================================================================
--- linux-2.6.orig/fs/stat.c	2009-05-20 14:11:59.000000000 +0200
+++ linux-2.6/fs/stat.c	2009-06-04 12:08:07.000000000 +0200
@@ -57,13 +57,33 @@ EXPORT_SYMBOL(vfs_getattr);
 
 int vfs_fstat(unsigned int fd, struct kstat *stat)
 {
-	struct file *f = fget(fd);
-	int error = -EBADF;
+	struct inode *inode;
+	struct vfsmount *mnt;
+	struct dentry *dentry;
+	struct file *f;
+	int error;
 
-	if (f) {
-		error = vfs_getattr(f->f_path.mnt, f->f_path.dentry, stat);
-		fput(f);
+	f = fget(fd);
+	if (!f)
+		return -EBADF;
+
+	mnt = f->f_path.mnt;
+	dentry = f->f_path.dentry;
+	inode = dentry->d_inode;
+
+	error = security_inode_getattr(mnt, dentry);
+	if (!error) {
+		if (f->f_op && f->f_op->fgetattr) {
+			error = f->f_op->fgetattr(f, stat);
+		} else {
+			if (inode->i_op->getattr)
+				error = inode->i_op->getattr(mnt, dentry, stat);
+			else
+				generic_fillattr(inode, stat);
+		}
 	}
+	fput(f);
+
 	return error;
 }
 EXPORT_SYMBOL(vfs_fstat);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: fgetattr/fsetattr file operation
  2009-06-04 10:27 ` Miklos Szeredi
@ 2009-06-04 17:42   ` Brian Molnar
  2009-06-04 19:04     ` Andreas Dilger
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Molnar @ 2009-06-04 17:42 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

> On Thu, Jun 4, 2009 at 3:27 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> Can you tell us a bit more about your use case, and how you worked
> around the limitation?

In this FS, when a process opens a file for write mode access, it
maintains its own private copy of the file until it comes time to
close. Upon close, this private copy is "committed" and replaces the
file on-disk. Meanwhile, while the file is open for writing (before
close) the file on-disk remains untouched (both data and metadata).
Thus, any metadata-changing operations that take place against the
file descriptor must only be seen to the process(es) using that file
descriptor. The intended behavior is that an fstat(2) against the file
descriptor will return the attributes corresponding to the
uncommitted, private copy and a path-based stat(2) will return the
attributes of the file on-disk.

To achieve this behavior, the change I made to the FUSE fs code is
such that when a lookup is performed during a file open(2), the file's
dentry is immediately invalidated, so that we essentially get a
'private' dentry and inode, and subsequent path-walks won't find it.
Then when the file is opened, I stuff the file struct pointer into
dentry->d_fsdata. Then when the getattr is called, I check if the
dentry is disconnected and contains a non-null d_fsdata, and if so,
forward it to an fgetattr handler using the file pointer.

The code is deliciously hacktastic, and adds the cost of all these
unnecessary dentry invalidations and lookups. Plus, I haven't even
tested it thoroughly, so I'm sure it's gonna screw up some other
things. The desired approach here is to just cut to the chase and be
able to handle stats made directly to the file descriptor with an
fgetattr operation using the underlying file struct.

Cheers,
- Brian

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: fgetattr/fsetattr file operation
  2009-06-04 17:42   ` Brian Molnar
@ 2009-06-04 19:04     ` Andreas Dilger
  2009-06-04 19:46       ` Brian Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Dilger @ 2009-06-04 19:04 UTC (permalink / raw)
  To: Brian Molnar; +Cc: Miklos Szeredi, linux-fsdevel

On Jun 04, 2009  10:42 -0700, Brian Molnar wrote:
> > On Thu, Jun 4, 2009 at 3:27 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > Can you tell us a bit more about your use case, and how you worked
> > around the limitation?
> 
> In this FS, when a process opens a file for write mode access, it
> maintains its own private copy of the file until it comes time to
> close. Upon close, this private copy is "committed" and replaces the
> file on-disk. Meanwhile, while the file is open for writing (before
> close) the file on-disk remains untouched (both data and metadata).
> Thus, any metadata-changing operations that take place against the
> file descriptor must only be seen to the process(es) using that file
> descriptor. The intended behavior is that an fstat(2) against the file
> descriptor will return the attributes corresponding to the
> uncommitted, private copy and a path-based stat(2) will return the
> attributes of the file on-disk.

This sounds very strange - different processes (or even the same
process using different file handles) will see different results
for fstat() which are yet different from stat().  I can't imagine
that applications would like this at all.

> To achieve this behavior,

Maybe you should go back and explain why this behaviour is useful,
before we burden the kernel with it.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: fgetattr/fsetattr file operation
  2009-06-04 19:04     ` Andreas Dilger
@ 2009-06-04 19:46       ` Brian Molnar
  2009-06-05 10:35         ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Molnar @ 2009-06-04 19:46 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Miklos Szeredi, linux-fsdevel

> On Thu, Jun 4, 2009 at 12:04 PM, Andreas Dilger <adilger@sun.com> wrote:
> This sounds very strange - different processes (or even the same
> process using different file handles) will see different results
> for fstat() which are yet different from stat().  I can't imagine
> that applications would like this at all.

Ya, no doubt a good deal of applications would get very confused if
they didn't know the difference. And it is not the goal of this FS to
support the use of general applications.

But, because a process has a 'private copy' of the file when it opens
it for write-mode (this being an intended feature of this
file-system), it's imperative that the application be given the
ability to access the data and metadata of the private copy as well as
the on-disk version. The on-disk file's metadata would be retrieved
via a path-based stat(2) (or by opening another instance of the file
read-only and fstat'ing the fd), and the private copy's metadata would
be retrieved via fstat(2).

> Maybe you should go back and explain why this behaviour is useful,
> before we burden the kernel with it.

Hardly a burden. It's an interface addition that is completely
optional to each filesystem. In Miklos' patch, if the fgetattr method
is not set in the file_operations struct, then the behavior is to
fallback to the original getattr function.


Cheers,
- Brian
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: fgetattr/fsetattr file operation
  2009-06-04 19:46       ` Brian Molnar
@ 2009-06-05 10:35         ` Miklos Szeredi
  2009-06-08  1:59           ` Jamie Lokier
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2009-06-05 10:35 UTC (permalink / raw)
  To: brian.molnar; +Cc: adilger, miklos, linux-fsdevel

On Thu, 4 Jun 2009, Brian Molnar wrote:
> > On Thu, Jun 4, 2009 at 12:04 PM, Andreas Dilger <adilger@sun.com> wrote:
> > This sounds very strange - different processes (or even the same
> > process using different file handles) will see different results
> > for fstat() which are yet different from stat().  I can't imagine
> > that applications would like this at all.
> 
> Ya, no doubt a good deal of applications would get very confused if
> they didn't know the difference. And it is not the goal of this FS to
> support the use of general applications.

Actually, well written applications shouldn't assume that
fd=open(path);fstat(fd) is equivalent to fd=open(path);stat(path).
Why?  Because the second one is racy wrt. remove and rename.

Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: fgetattr/fsetattr file operation
  2009-06-05 10:35         ` Miklos Szeredi
@ 2009-06-08  1:59           ` Jamie Lokier
  2009-06-08 22:26             ` Brian Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Jamie Lokier @ 2009-06-08  1:59 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: brian.molnar, adilger, linux-fsdevel

Miklos Szeredi wrote:
> On Thu, 4 Jun 2009, Brian Molnar wrote:
> > > On Thu, Jun 4, 2009 at 12:04 PM, Andreas Dilger <adilger@sun.com> wrote:
> > > This sounds very strange - different processes (or even the same
> > > process using different file handles) will see different results
> > > for fstat() which are yet different from stat().  I can't imagine
> > > that applications would like this at all.
> > 
> > Ya, no doubt a good deal of applications would get very confused if
> > they didn't know the difference. And it is not the goal of this FS to
> > support the use of general applications.
> 
> Actually, well written applications shouldn't assume that
> fd=open(path);fstat(fd) is equivalent to fd=open(path);stat(path).
> Why?  Because the second one is racy wrt. remove and rename.

No, but well written applications can and do assume this:

    st1=stat(path)

    ...time passes, maybe do some filesystem operations, talk on the network...

    fd=open(path)
    st2=fstat(fd)

    if (st1!=st2)
        abort or loop "Race happened, something changed under me..."

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: fgetattr/fsetattr file operation
  2009-06-08  1:59           ` Jamie Lokier
@ 2009-06-08 22:26             ` Brian Molnar
  0 siblings, 0 replies; 8+ messages in thread
From: Brian Molnar @ 2009-06-08 22:26 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Miklos Szeredi, adilger, linux-fsdevel

> On Sun, Jun 7, 2009 at 6:59 PM, Jamie Lokier<jamie@shareable.org> wrote:
> No, but well written applications can and do assume this:
>
>    st1=stat(path)
>
>    ...time passes, maybe do some filesystem operations, talk on the network...
>
>    fd=open(path)
>    st2=fstat(fd)
>
>    if (st1!=st2)
>        abort or loop "Race happened, something changed under me..."
>
> -- Jamie
>

Well certainly applications can still make *that* assumption on this
file system, but generally, due to the special semantics of certain
types of file access (more specifically, when it comes to writing and
appending to files), it's expected that some applications may simply
not work with this FS. Applications utilizing this filesystem must be
aware of these cases, with the expectation that some can use them to
their advantage.

So far, the VFS interface (and by extension, the FUSE interface) has
managed to facilitate most of this functionality, but the missing
piece is in the metadata. To achieve the desired behavior, it is
necessary to associate a private set of metadata with an opened file
descriptor. And to do so requires access directly to private data
stored in the underlying file struct during the getattr operation when
the fstat(2) is called.

Cheers,
- Brian
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-06-08 22:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-03 17:32 fgetattr/fsetattr file operation Brian Molnar
2009-06-04 10:27 ` Miklos Szeredi
2009-06-04 17:42   ` Brian Molnar
2009-06-04 19:04     ` Andreas Dilger
2009-06-04 19:46       ` Brian Molnar
2009-06-05 10:35         ` Miklos Szeredi
2009-06-08  1:59           ` Jamie Lokier
2009-06-08 22:26             ` Brian Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).