linux-kernel.vger.kernel.org archive mirror
From: Alan Jenkins <alan.christopher.jenkins@gmail.com>
To: David Howells <dhowells@redhat.com>, viro@zeniv.linux.org.uk
Cc: torvalds@linux-foundation.org, ebiederm@xmission.com,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	mszeredi@redhat.com
Subject: Re: [PATCH 03/34] teach move_mount(2) to work with OPEN_TREE_CLONE [ver #12]
Date: Sun, 7 Oct 2018 11:48:37 +0100
Message-ID: <862e36a2-2a6f-4e26-3228-8cab4b4cf230@gmail.com>
In-Reply-To: <de050902-f3d0-b2db-2627-983a36c87b3c@gmail.com>

On 05/10/2018 19:24, Alan Jenkins wrote:
> On 21/09/2018 17:30, David Howells wrote:
>> From: Al Viro <viro@zeniv.linux.org.uk>
>>
>> Allow a detached tree created by open_tree(..., OPEN_TREE_CLONE) to be
>> attached by move_mount(2).
>>
>> If by the time of final fput() of OPEN_TREE_CLONE-opened file its tree is
>> not detached anymore, it won't be dissolved.  move_mount(2) is adjusted
>> to handle detached source.
>>
>> That gives us equivalents of mount --bind and mount --rbind.
>>
>> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>> Signed-off-by: David Howells <dhowells@redhat.com>
>> ---
>>
>>   fs/namespace.c |   26 ++++++++++++++++++++------
>>   1 file changed, 20 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/namespace.c b/fs/namespace.c
>> index dd38141b1723..caf5c55ef555 100644
>> --- a/fs/namespace.c
>> +++ b/fs/namespace.c
>> @@ -1785,8 +1785,10 @@ void dissolve_on_fput(struct vfsmount *mnt)
>>   {
>>       namespace_lock();
>>       lock_mount_hash();
>> -    mntget(mnt);
>> -    umount_tree(real_mount(mnt), UMOUNT_CONNECTED);
>> +    if (!real_mount(mnt)->mnt_ns) {
>> +        mntget(mnt);
>> +        umount_tree(real_mount(mnt), UMOUNT_CONNECTED);
>> +    }
>>       unlock_mount_hash();
>>       namespace_unlock();
>>   }
>> @@ -2393,6 +2395,7 @@ static int do_move_mount(struct path *old_path, struct path *new_path)
>>       struct mount *old;
>>       struct mountpoint *mp;
>>       int err;
>> +    bool attached;
>>         mp = lock_mount(new_path);
>>       err = PTR_ERR(mp);
>> @@ -2403,10 +2406,19 @@ static int do_move_mount(struct path *old_path, struct path *new_path)
>>       p = real_mount(new_path->mnt);
>>         err = -EINVAL;
>> -    if (!check_mnt(p) || !check_mnt(old))
>> +    /* The mountpoint must be in our namespace. */
>> +    if (!check_mnt(p))
>> +        goto out1;
>> +    /* The thing moved should be either ours or completely unattached. */
>> +    if (old->mnt_ns && !check_mnt(old))
>>           goto out1;
>>   -    if (!mnt_has_parent(old))
>> +    attached = mnt_has_parent(old);
>> +    /*
>> +     * We need to allow open_tree(OPEN_TREE_CLONE) followed by
>> +     * move_mount(), but mustn't allow "/" to be moved.
>> +     */
>> +    if (old->mnt_ns && !attached)
>>           goto out1;
>>         if (old->mnt.mnt_flags & MNT_LOCKED)
>
> Hi
>
> I replied last time to wonder about the MNT_UMOUNT mnt_flag. So I've 
> tested it now :-), on David's current tree (commit 5581f4935add).
>
> The modified do_move_mount() allows re-attaching something that was 
> lazy-unmounted. But the lazy unmount sets MNT_UMOUNT. And this flag is 
> not cleared when the mount is re-attached.
>
> I wasn't sure what effect this would have. Luckily it showed up 
> straight away, when I tried to unmount again. It causes a soft lockup.
>
> Debug printk:
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 4dfe7e23b7ee..ac8de9191cfe 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2472,6 +2472,10 @@ static int do_move_mount(struct path *old_path, struct path *new_path)
>      if (old->mnt.mnt_flags & MNT_LOCKED)
>          goto out1;
>
> +    pr_info("mnt_flags=%x umount=%x\n",
> +            (unsigned) old->mnt.mnt_flags,
> +            (unsigned) !!(old->mnt.mnt_flags & MNT_UMOUNT));
> +
>      if (old_path->dentry != old_path->mnt->mnt_root)
>          goto out1;

The lockup seems to be a general problem with the cleanup code. It happens 
even if I use this as advertised, i.e. for a simple bind mount.

(I was suspicious that being able to pass around detached trees as an 
FD, and re-attach them in any namespace, allows leaking memory by 
creating a namespace loop.  I.e. maybe it gives you enough rope to skip 
the test in mnt_ns_loop().  But I didn't get that far).

I converted test-fsmount.c for my own purposes:

diff --git a/samples/vfs/test-fsmount.c b/samples/vfs/test-fsmount.c
index 74124025ade0..da6e3fbf0513 100644
--- a/samples/vfs/test-fsmount.c
+++ b/samples/vfs/test-fsmount.c
@@ -83,6 +83,11 @@ static inline int move_mount(int from_dfd, const char *from_pathname,
  		       to_dfd, to_pathname, flags);
  }
  
+static inline int open_tree(int dfd, const char *pathname, unsigned flags)
+{
+	return syscall(__NR_open_tree, dfd, pathname, flags);
+}
+
  #define E_fsconfig(fd, cmd, key, val, aux)				\
  	do {								\
  		if (fsconfig(fd, cmd, key, val, aux) == -1)		\
@@ -93,6 +98,7 @@ int main(int argc, char *argv[])
  {
  	int fsfd, mfd;
  
+#if 0
  	/* Mount a publically available AFS filesystem */
  	fsfd = fsopen("afs", 0);
  	if (fsfd == -1) {
@@ -115,4 +121,9 @@ int main(int argc, char *argv[])
  
  	E(close(mfd));
  	exit(0);
+#endif
+
+	E( mfd = open_tree(-1, "/mnt", OPEN_TREE_CLONE) );
+	E( fchdir(mfd) );
+	E( execl("/bin/bash", "/bin/bash", NULL) );
  }

If I close() the mount FD "mfd", and then do "mount --move . /mnt", my 
printk() shows MNT_UMOUNT has been set. (I guess fchdir() works more 
like openat(..., O_PATH) than dup().) Then unmounting /mnt hangs, as I 
would expect from my previous test.

If I instead do the mount+unmount first, and close the FD as a second 
step, I think there's a lockup in the close().  The lockup happens in 
the same place as the unmount lockup from before. (Except there's a line 
"Code: Bad RIP value"; I don't know why that happens.)

# unshare --mount
# test-fsmount
# mount --move . /mnt
[  270.859542] umount=0 mnt_flags=20

Check the flags are still the same:

# mount --move /mnt /mnt
[  305./mnt: mount(2) system call failed: Too many levels of symbolic links.
[  313.737030] umount=0 mnt_flags=20

Clean up the bind mount, and then the inherited mount FD.

# cd
# umount /mnt
# exit

[  351.898629] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [bash:1483]
[  351.899841] Modules linked in: xt_CHECKSUM(E) ipt_MASQUERADE(E) tun(E) bridge(E) stp(E) llc(E) ip6t_rpfilter(E) ip6t_REJECT(E) nf_reject_ipv6(E) xt_conntrack(E) ip6table_nat(E) nf_nat_ipv6(E) devlink(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) libcrc32c(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip6table_filter(E) ip6_tables(E) snd_hda_codec_generic(E) snd_hda_intel(E) snd_hda_codec(E) snd_hwdep(E) snd_hda_core(E) snd_seq(E) snd_seq_device(E) snd_pcm(E) joydev(E) crc32_pclmul(E) snd_timer(E) ghash_clmulni_intel(E) snd(E) crct10dif_pclmul(E) virtio_balloon(E) serio_raw(E) soundcore(E) crc32c_intel(E) qxl(E) drm_kms_helper(E) virtio_console(E) ttm(E) virtio_net(E) net_failover(E)
[  351.912077]  failover(E) drm(E) qemu_fw_cfg(E) pata_acpi(E) ata_generic(E)
[  351.912888] CPU: 0 PID: 1483 Comm: bash Tainted: G            E     4.19.0-rc3+ #7
[  351.914221] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
[  351.916582] RIP: 0010:pin_kill+0x128/0x140
[  351.917369] Code: f2 5a 00 48 8b 44 24 20 48 39 c5 0f 84 6f ff ff ff 48 89 df e8 e9 4a 5b 00 8b 43 18 85 c0 7e b3 c6 03 00 fb 66 0f 1f 44 00 00 <e9> 51 ff ff ff e8 be 11 dd ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00
[  351.920729] RSP: 0018:ffffa1b381be3d88 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
[  351.921801] RAX: 0000000000000000 RBX: ffff909cf2ea68b0 RCX: dead000000000200
[  351.922807] RDX: 0000000000000001 RSI: ffffa1b381be3d28 RDI: ffff909cf2ea68b0
[  351.923811] RBP: ffffa1b381be3da8 R08: ffff909d59621760 R09: 0000000000000000
[  351.924813] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000010000000
[  351.925818] R13: ffff909cf5db9a38 R14: ffff909cf2ea67a0 R15: ffff909cedc07300
[  351.926824] FS:  00007f1eb90ac740(0000) GS:ffff909d59600000(0000) knlGS:0000000000000000
[  351.927957] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  351.928772] CR2: 00007f1eabedb180 CR3: 000000000f20a003 CR4: 00000000003606f0
[  351.929779] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  351.930785] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  351.931791] Call Trace:
[  351.932160]  ? finish_wait+0x80/0x80
[  351.932684]  group_pin_kill+0x1a/0x30
[  351.933207]  namespace_unlock+0x6f/0x80
[  351.933766]  __fput+0x239/0x240
[  351.934217]  task_work_run+0x84/0xa0
[  351.934743]  do_exit+0x2d3/0xae0
[  351.935206]  ? __do_page_fault+0x263/0x4e0
[  351.935799]  do_group_exit+0x3a/0xa0
[  351.936307]  __x64_sys_exit_group+0x14/0x20
[  351.936911]  do_syscall_64+0x5b/0x160
[  351.937436]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  351.938164] RIP: 0033:0x7f1eb877adb6
[  351.938688] Code: Bad RIP value.
[  351.939149] RSP: 002b:00007ffd56e019d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[  351.940216] RAX: ffffffffffffffda RBX: 00007f1eb8a69740 RCX: 00007f1eb877adb6
[  351.941222] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[  351.942229] RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff80
[  351.943236] R10: 00007ffd56e0188a R11: 0000000000000246 R12: 00007f1eb8a69740
[  351.944242] R13: 0000000000000001 R14: 00007f1eb8a72708 R15: 0000000000000000


