All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] debugfs: remove inc_nlink in debugfs_create_automount
@ 2018-12-22  8:45 yangerkun
  2018-12-22 11:25 ` Greg KH
  2018-12-23  1:13 ` Al Viro
  0 siblings, 2 replies; 4+ messages in thread
From: yangerkun @ 2018-12-22  8:45 UTC (permalink / raw)
  To: gregkh, rafael; +Cc: yangerkun, linux-kernel

Remove inc_nlink in debugfs_create_automount, or this inode will never
be free.

Signed-off-by: yangerkun <yangerkun@huawei.com>
---
 fs/debugfs/inode.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 13b01351dd1c..9294238e364f 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -551,12 +551,11 @@ struct dentry *debugfs_create_automount(const char *name,
 	if (unlikely(!inode))
 		return failed_creating(dentry);
 
+	/* directory inodes start off with i_nlink == 2 (for "." entry) */
 	make_empty_dir_inode(inode);
 	inode->i_flags |= S_AUTOMOUNT;
 	inode->i_private = data;
 	dentry->d_fsdata = (void *)f;
-	/* directory inodes start off with i_nlink == 2 (for "." entry) */
-	inc_nlink(inode);
 	d_instantiate(dentry, inode);
 	inc_nlink(d_inode(dentry->d_parent));
 	fsnotify_mkdir(d_inode(dentry->d_parent), dentry);
-- 
2.14.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] debugfs: remove inc_nlink in debugfs_create_automount
  2018-12-22  8:45 [PATCH] debugfs: remove inc_nlink in debugfs_create_automount yangerkun
@ 2018-12-22 11:25 ` Greg KH
  2018-12-24  3:47   ` yangerkun
  2018-12-23  1:13 ` Al Viro
  1 sibling, 1 reply; 4+ messages in thread
From: Greg KH @ 2018-12-22 11:25 UTC (permalink / raw)
  To: yangerkun; +Cc: rafael, linux-kernel

On Sat, Dec 22, 2018 at 04:45:36PM +0800, yangerkun wrote:
> Remove inc_nlink in debugfs_create_automount, or this inode will never
> be free.
> 
> Signed-off-by: yangerkun <yangerkun@huawei.com>
> ---
>  fs/debugfs/inode.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
> index 13b01351dd1c..9294238e364f 100644
> --- a/fs/debugfs/inode.c
> +++ b/fs/debugfs/inode.c
> @@ -551,12 +551,11 @@ struct dentry *debugfs_create_automount(const char *name,
>  	if (unlikely(!inode))
>  		return failed_creating(dentry);
>  
> +	/* directory inodes start off with i_nlink == 2 (for "." entry) */
>  	make_empty_dir_inode(inode);
>  	inode->i_flags |= S_AUTOMOUNT;
>  	inode->i_private = data;
>  	dentry->d_fsdata = (void *)f;
> -	/* directory inodes start off with i_nlink == 2 (for "." entry) */
> -	inc_nlink(inode);

Again, have you tested this and how?  How did you find this issue?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] debugfs: remove inc_nlink in debugfs_create_automount
  2018-12-22  8:45 [PATCH] debugfs: remove inc_nlink in debugfs_create_automount yangerkun
  2018-12-22 11:25 ` Greg KH
@ 2018-12-23  1:13 ` Al Viro
  1 sibling, 0 replies; 4+ messages in thread
From: Al Viro @ 2018-12-23  1:13 UTC (permalink / raw)
  To: yangerkun; +Cc: gregkh, rafael, linux-kernel

On Sat, Dec 22, 2018 at 04:45:36PM +0800, yangerkun wrote:
> Remove inc_nlink in debugfs_create_automount, or this inode will never
> be free.

Explain, please.  What exactly would care about i_nlink in debugfs?
It does *NOT* free any kind of backing store on inode eviction, for
a good and simple reason - there is no backing store at all.

And as for the icache retention, debugfs inodes are
	* never looked up in icache
	* never hashed
	* ... and thus never retained in icache past the final
iput()

i_nlink serves as a refcount - for on-disk inodes on filesystems that
allow hardlinks and need to decide if the on-disk inode needs to
follow an in-core one into oblivion.

The lifetime of in-core inode is *NOT* controlled by i_nlink.  They
can very well outlive i_nlink dropping to 0, for starters.  Consider
e.g. the following:

cat >/tmp/a.sh <<'EOF'
echo still not freed >/tmp/a
(sleep 5 && date && stat - && cat) </tmp/a &
date; rm /tmp/a && echo /tmp/a removed
EOF
sh /tmp/a.sh

Output will be
$ Sat Dec 22 19:50:27 EST 2018
/tmp/a removed
$ Sat Dec 22 19:50:32 EST 2018
  File: -
  Size: 12              Blocks: 8          IO Block: 4096   regular file
Device: 808h/2056d      Inode: 13          Links: 0
Access: (0644/-rw-r--r--)  Uid: ( 1000/      al)   Gid: ( 1000/      al)
Access: 2018-12-22 19:50:27.266694223 -0500
Modify: 2018-12-22 19:50:27.266694223 -0500
Change: 2018-12-22 19:50:27.274694262 -0500
 Birth: -
still not freed

Note that this is on a normal filesystem (ext2, in fact), so nothing
special is involved.  The in-core inode had remained alive until
all children with stdin redirected from /tmp/a exited.  As you can
see, link count is zero - that's what fstat(2) reported in ->st_nlink
and that came straight from ->i_nlink of the (very much alive) in-core
inode.

And of course, in-core inodes do get freed just fine without i_nlink
reaching zero.

It's used for 4 things:
	1) deciding whether it makes sense to evict in-core inode
as soon as we have no more (in-core) references pinning them (i.e.
when ->i_count reaches zero).  If there's a chance that somebody
will do an icache lookup finding that one, we might want to keep
it around until memory pressure kicks it out.  And since for
something like normal Unix filesystem such icache lookup can
happen as long as there are links to the (on-disk) inode left
in some directories, default policy is "try to keep it around if
i_nlink is positive *AND* it is reachable from icache in the
first place".  Filesystems might override that, but it's all moot
if the in-core inode is *not* reachable from icache in the first
place.  Which is the case for debugfs and similar beasts.
	2) deciding whether we want to free the on-disk inode
when an in-core one gets evicted.  Note that such freeing can not
happen as long as in-core inode is around - unlinking an open
file does *not* free it; it's still open and IO on such descriptor
works just fine.  There the normal rules are "if we are evicting
an in-core inode and we know that it has no links left, it's
time to free the on-disk counterpart".  Up to individual
filesystems, not applicable to debugfs for obvious reasons.
	3) for the same filesystems, if the link count is
maintained in on-disk inode we'll need to update it on unlink
et.al.  ->i_nlink of in-core inode is handy for keeping track
of that.  Again, not applicable in debugfs
	4) reporting st_nlink to userland on stat/fstat/etc.
That *is* applicable in debugfs and, in fact, it is the only
use of ->i_nlink there.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] debugfs: remove inc_nlink in debugfs_create_automount
  2018-12-22 11:25 ` Greg KH
@ 2018-12-24  3:47   ` yangerkun
  0 siblings, 0 replies; 4+ messages in thread
From: yangerkun @ 2018-12-24  3:47 UTC (permalink / raw)
  To: Greg KH, Al Viro; +Cc: rafael, linux-kernel



Greg KH wrote on 2018/12/22 19:25:
> On Sat, Dec 22, 2018 at 04:45:36PM +0800, yangerkun wrote:
>> Remove inc_nlink in debugfs_create_automount, or this inode will never
>> be free.
>>
>> Signed-off-by: yangerkun <yangerkun@huawei.com>
>> ---
>>   fs/debugfs/inode.c | 3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
>> index 13b01351dd1c..9294238e364f 100644
>> --- a/fs/debugfs/inode.c
>> +++ b/fs/debugfs/inode.c
>> @@ -551,12 +551,11 @@ struct dentry *debugfs_create_automount(const char *name,
>>   	if (unlikely(!inode))
>>   		return failed_creating(dentry);
>>   
>> +	/* directory inodes start off with i_nlink == 2 (for "." entry) */
>>   	make_empty_dir_inode(inode);
>>   	inode->i_flags |= S_AUTOMOUNT;
>>   	inode->i_private = data;
>>   	dentry->d_fsdata = (void *)f;
>> -	/* directory inodes start off with i_nlink == 2 (for "." entry) */
>> -	inc_nlink(inode);
> 
> Again, have you tested this and how?  How did you find this issue?
> 
> thanks,
> 
> greg k-h
> 
> .
> 
Thanks a lot for Al Viro's explain.

I was confusing about i_nlink and i_count before, so after insert and 
remove module show as below, i think the inode can still exist with 
nlink will be 1.

So the only problem is that the nlink should be 2 for a empty dir, and 
for debugfs,the influence is that the stat for the dir. Right?

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/debugfs.h>

static char *name = "test-dir";
module_param(name, charp, 0644);
MODULE_PARM_DESC(name, "Get an string(char *) value from user...\n");

struct inode *inode;

int init_module(void)
{
     struct dentry *dentry;

     dentry = debugfs_create_automount(name, NULL, NULL, NULL);
     inode = dentry->d_inode;
     debugfs_remove(dentry);
     return 0;
}

void cleanup_module(void)
{
     printk("%d\n", inode->i_nlink);
}

MODULE_LICENSE("GPL");

Thanks,
Kun.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-12-24  3:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-22  8:45 [PATCH] debugfs: remove inc_nlink in debugfs_create_automount yangerkun
2018-12-22 11:25 ` Greg KH
2018-12-24  3:47   ` yangerkun
2018-12-23  1:13 ` Al Viro

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.