* CGroup Namespaces (v10)
From: serge.hallyn @ 2016-01-29 8:54 UTC
To: linux-kernel
Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes

Hi,

following is a revised set of the CGroup Namespace patchset which Aditya
Kali has previously sent. The code can also be found in the cgroupns.v10
branch of

https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/

To summarize the semantics:

1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED

2. unsharing a cgroup namespace makes all your current cgroups your new
   cgroup root.

3. /proc/pid/cgroup always shows cgroup paths relative to the reader's
   cgroup namespace root. A task outside of your cgroup looks like

	8:memory:/../../..

4. when a task mounts a cgroupfs, the cgroup which shows up as root depends
   on the mounting task's cgroup namespace.

5. setns to a cgroup namespace switches your cgroup namespace but not
   your cgroups.

With this, using github.com/hallyn/lxc #2015-11-09/cgns (and
github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full
proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts.

This is completely backward compatible and will be completely invisible
to any existing cgroup users (except for those running inside a cgroup
namespace and looking at /proc/pid/cgroup of tasks outside their
namespace.)

Changes from V9:
1. Update to latest Linus tree
2. A few locking fixes

Changes from V8:
1. Incorporate updated documentation from tj.
2. Put lookup_one_len() under inode lock
3. Make cgroup_path non-namespaced, so only calls to cgroup_path_ns() are
   namespaced.
4. Make cgroup_path{,_ns} take the needed locks, since external callers
   cannot do so.
5. Fix the bisectability problem of to_cg_ns() being defined after use

Changes from V7:
1. Rework kernfs_path_from_node_locked to return the string length
2. Rename and reorder args to kernfs_path_from_node
3. cgroup.c: undo accidental conversions to inline
4. cgroup.h: move ns declarations to bottom.
5. Rework the documentation to fit the style of the rest of cgroup.txt

Changes from V6:
1. Switch to some WARN_ONs to provide stack traces
2. Rename kernfs_node_distance to kernfs_depth
3. Make sure kernfs_common_ancestor() nodes are from same root
4. Split kernfs changes for cgroup_mount into separate patch
5. Rename kernfs_obtain_root to kernfs_node_dentry
(And more, see patch changelogs)

Changes from V5:
1. To get a root dentry for cgroup namespace mount, walk the path from the
   kernfs root dentry.

Changes from V4:
1. Move the FS_USERNS_MOUNT flag to last patch
2. Rebase onto cgroup/for-4.5
3. Don't allow non-init user namespaces to bind new subsystems when mounting.
4. Address feedback from Tejun (thanks). Specifically, not addressed:
   . kernfs_obtain_root - walking dentry from kernfs root.
   (I think that's the only piece)
5. Dropped unused get_task_cgroup fn/patch.
6. Reworked kernfs_path_from_node_locked() to try to simplify the logic.
   It now finds a common ancestor, walks from the source to it, then back
   up to the target.

Changes from V3:
1. Rebased onto latest cgroup changes. In particular switch to
   css_set_lock and ns_common.
2. Support all hierarchies.

Changes from V2:
1. Added documentation in Documentation/cgroups/namespace.txt
2. Fixed a bug that caused crash
3. Incorporated some other suggestions from last patchset:
   - removed use of threadgroup_lock() while creating new cgroupns
   - use task_lock() instead of rcu_read_lock() while accessing
     task->nsproxy
   - optimized setns() to own cgroupns
   - simplified code around sane-behavior mount option parsing
4. Restored ACKs from Serge Hallyn from v1 on few patches that have
   not changed since then.

Changes from V1:
1. No pinning of processes within cgroupns. Tasks can be freely moved
   across cgroups even outside of their cgroupns-root. Usual DAC/MAC policies
   apply as before.
2. Path in /proc/<pid>/cgroup is now always shown and is relative to
   cgroupns-root. So path can contain '/..' strings depending on cgroupns-root
   of the reader and cgroup of <pid>.
3. setns() does not require the process to first move under target
   cgroupns-root.

Changes from RFC (V0):
1. setns support for cgroupns
2. 'mount -t cgroup cgroup <mntpt>' from inside a cgroupns now
   mounts the cgroup hierarchy with cgroupns-root as the filesystem root.
3. writes to cgroup files outside of cgroupns-root are not allowed
4. visibility of /proc/<pid>/cgroup is further restricted by not showing
   anything if the <pid> is in a sibling cgroupns and its cgroup falls outside
   your cgroupns-root.
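
Point 3 of the semantics above can be illustrated from userspace. The following is only a sketch under stated assumptions: the helper name cgroup_levels_above_root() is hypothetical and not part of the patchset, and it assumes the usual "id:controllers:path" layout of a /proc/<pid>/cgroup line. It counts the leading "/.." components, which tells the reader how far above its own cgroup namespace root the other task's cgroup sits.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of the patchset: given one line of
 * /proc/<pid>/cgroup in the "id:controllers:path" form, return the number
 * of leading "/.." components in the path, or -1 if the line is malformed.
 * A result greater than 0 means the task's cgroup lies outside the
 * reader's cgroup namespace root.
 */
static int cgroup_levels_above_root(const char *line)
{
	const char *p = strchr(line, ':');
	int levels = 0;

	if (!p)
		return -1;
	p = strchr(p + 1, ':');
	if (!p)
		return -1;
	p++;			/* p now points at the path field */
	if (*p != '/')
		return -1;

	/* Only count whole ".." components, not names such as "..foo". */
	while (strncmp(p, "/..", 3) == 0 &&
	       (p[3] == '/' || p[3] == '\0' || p[3] == '\n')) {
		levels++;
		p += 3;
	}
	return levels;
}
```

For the example line quoted above, "8:memory:/../../..", the helper reports three levels; for a path inside the reader's root, such as "8:memory:/a/b", it reports zero.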
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path
From: serge.hallyn @ 2016-01-29 8:54 UTC
To: linux-kernel
Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn

From: Aditya Kali <adityakali@google.com>

The new function kernfs_path_from_node() generates and returns kernfs
path of a given kernfs_node relative to a given parent kernfs_node.

Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Changelog 20151125:
  - Fully-wing multiline comments
  - Rework kernfs_path_from_node_locked() logic
  - Replace BUG_ONs with returning NULL
  - Use a const char* for /.. and precalculate its size
Changelog 20151130:
  - Update kernfs_path_from_node_locked comment
Changelog 20151208:
  - kernfs_node_distance:
    * Remove BUG_ON(NULL)s
    * Rename kernfs_node_distance to kernfs_depth
  - kernfs_common_ancestor:
    * Remove useless checks for depth == 0
    * Add check to ensure nodes are from same root
  - kernfs_path_from_node_locked:
    * Remove needless __must_check
    * Put p;len on its own decl line.
    * Fix wrong WARN_ONCE usage
Changelog 20151209:
  - kernfs_path_from_node: change arguments to 'to' and 'from', and
    change their order.
Changelog 20151222:
  - kernfs_path_from_node{,_locked}: return the string length.
    kernfs_path is gpl-exported, so changing their return value
    seemed ill-advised, but if no one minds I can update it too.
Changelog 20151223:
  - don't allocate memory pr_cont_kernfs_path() under spinlock
---
 fs/kernfs/dir.c        |  192 ++++++++++++++++++++++++++++++++++++++++--------
 include/linux/kernfs.h |    9 ++-
 2 files changed, 166 insertions(+), 35 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 996b774..38fa03a 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -44,28 +44,123 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
 	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
 }
 
-static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+/* kernfs_node_depth - compute depth from @from to @to */
+static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to)
 {
-	char *p = buf + buflen;
-	int len;
+	size_t depth = 0;
 
-	*--p = '\0';
+	while (to->parent && to != from) {
+		depth++;
+		to = to->parent;
+	}
+	return depth;
+}
 
-	do {
-		len = strlen(kn->name);
-		if (p - buf < len + 1) {
-			buf[0] = '\0';
-			p = NULL;
-			break;
-		}
-		p -= len;
-		memcpy(p, kn->name, len);
-		*--p = '/';
-		kn = kn->parent;
-	} while (kn && kn->parent);
+static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a,
+						  struct kernfs_node *b)
+{
+	size_t da, db;
+	struct kernfs_root *ra = kernfs_root(a), *rb = kernfs_root(b);
 
-	return p;
+	if (ra != rb)
+		return NULL;
+
+	da = kernfs_depth(ra->kn, a);
+	db = kernfs_depth(rb->kn, b);
+
+	while (da > db) {
+		a = a->parent;
+		da--;
+	}
+	while (db > da) {
+		b = b->parent;
+		db--;
+	}
+
+	/* worst case b and a will be the same at root */
+	while (b != a) {
+		b = b->parent;
+		a = a->parent;
+	}
+
+	return a;
+}
+
+/**
+ * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to,
+ * where kn_from is treated as root of the path.
+ * @kn_from: kernfs node which should be treated as root for the path
+ * @kn_to: kernfs node to which path is needed
+ * @buf: buffer to copy the path into
+ * @buflen: size of @buf
+ *
+ * We need to handle couple of scenarios here:
+ * [1] when @kn_from is an ancestor of @kn_to at some level
+ * kn_from: /n1/n2/n3
+ * kn_to:   /n1/n2/n3/n4/n5
+ * result:  /n4/n5
+ *
+ * [2] when @kn_from is on a different hierarchy and we need to find common
+ * ancestor between @kn_from and @kn_to.
+ * kn_from: /n1/n2/n3/n4
+ * kn_to:   /n1/n2/n5
+ * result:  /../../n5
+ * OR
+ * kn_from: /n1/n2/n3/n4/n5   [depth=5]
+ * kn_to:   /n1/n2/n3         [depth=3]
+ * result:  /../..
+ *
+ * return value: length of the string.  If greater than buflen,
+ * then contents of buf are undefined.  On error, -1 is returned.
+ */
+static int
+kernfs_path_from_node_locked(struct kernfs_node *kn_to,
+			     struct kernfs_node *kn_from, char *buf,
+			     size_t buflen)
+{
+	struct kernfs_node *kn, *common;
+	const char parent_str[] = "/..";
+	size_t depth_from, depth_to, len = 0, nlen = 0;
+	char *p;
+	int i;
+
+	if (!kn_from)
+		kn_from = kernfs_root(kn_to)->kn;
+
+	if (kn_from == kn_to)
+		return strlcpy(buf, "/", buflen);
+
+	common = kernfs_common_ancestor(kn_from, kn_to);
+	if (WARN_ON(!common))
+		return -1;
+
+	depth_to = kernfs_depth(common, kn_to);
+	depth_from = kernfs_depth(common, kn_from);
+
+	if (buf)
+		buf[0] = '\0';
+
+	for (i = 0; i < depth_from; i++)
+		len += strlcpy(buf + len, parent_str,
+			       len < buflen ? buflen - len : 0);
+
+	/* Calculate how many bytes we need for the rest */
+	for (kn = kn_to; kn != common; kn = kn->parent)
+		nlen += strlen(kn->name) + 1;
+
+	if (len + nlen >= buflen)
+		return len + nlen;
+
+	p = buf + len + nlen;
+	*p = '\0';
+	for (kn = kn_to; kn != common; kn = kn->parent) {
+		nlen = strlen(kn->name);
+		p -= nlen;
+		memcpy(p, kn->name, nlen);
+		*(--p) = '/';
+	}
+
+	return len + nlen;
 }
 
 /**
@@ -115,6 +210,34 @@ size_t kernfs_path_len(struct kernfs_node *kn)
 }
 
 /**
+ * kernfs_path_from_node - build path of node @to relative to @from.
+ * @from: parent kernfs_node relative to which we need to build the path
+ * @to: kernfs_node of interest
+ * @buf: buffer to copy @to's path into
+ * @buflen: size of @buf
+ *
+ * Builds @to's path relative to @from in @buf. @from and @to must
+ * be on the same kernfs-root. If @from is not parent of @to, then a relative
+ * path (which includes '..'s) as needed to reach from @from to @to is
+ * returned.
+ *
+ * If @buf isn't long enough, the return value will be greater than @buflen
+ * and @buf contents are undefined.
+ */
+int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
+			  char *buf, size_t buflen)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	ret = kernfs_path_from_node_locked(to, from, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kernfs_path_from_node);
+
+/**
  * kernfs_path - build full path of a given node
  * @kn: kernfs_node of interest
  * @buf: buffer to copy @kn's name into
@@ -127,13 +250,12 @@ size_t kernfs_path_len(struct kernfs_node *kn)
  */
 char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
 {
-	unsigned long flags;
-	char *p;
+	int ret;
 
-	spin_lock_irqsave(&kernfs_rename_lock, flags);
-	p = kernfs_path_locked(kn, buf, buflen);
-	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
-	return p;
+	ret = kernfs_path_from_node(kn, NULL, buf, buflen);
+	if (ret < 0 || ret >= buflen)
+		return NULL;
+	return buf;
 }
 EXPORT_SYMBOL_GPL(kernfs_path);
 
@@ -164,17 +286,25 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
 void pr_cont_kernfs_path(struct kernfs_node *kn)
 {
 	unsigned long flags;
-	char *p;
+	int sz;
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
 
-	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
-			       sizeof(kernfs_pr_cont_buf));
-	if (p)
-		pr_cont("%s", p);
-	else
-		pr_cont("<name too long>");
+	sz = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
+					  sizeof(kernfs_pr_cont_buf));
+	if (sz < 0) {
+		pr_cont("(error)");
+		goto out;
+	}
+
+	if (sz >= sizeof(kernfs_pr_cont_buf)) {
+		pr_cont("(name too long)");
+		goto out;
+	}
+
+	pr_cont("%s", kernfs_pr_cont_buf);
 
+out:
 	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
 }
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index af51df3..716bfde 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -267,8 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 
 int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
 size_t kernfs_path_len(struct kernfs_node *kn);
-char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-				size_t buflen);
+int kernfs_path_from_node(struct kernfs_node *root_kn, struct kernfs_node *kn,
+			  char *buf, size_t buflen);
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen);
 void pr_cont_kernfs_name(struct kernfs_node *kn);
 void pr_cont_kernfs_path(struct kernfs_node *kn);
 struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn);
@@ -338,8 +339,8 @@ static inline int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
 static inline size_t kernfs_path_len(struct kernfs_node *kn)
 { return 0; }
 
-static inline char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+static inline char *kernfs_path(struct kernfs_node *kn, char *buf,
+				size_t buflen)
 { return NULL; }
 
 static inline void pr_cont_kernfs_name(struct kernfs_node *kn) { }
-- 
1.7.9.5
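
The path construction in this patch (find the common ancestor, emit one "/.." per level from @from up to it, then append the components down to @to) can be tried outside the kernel. What follows is only a userspace sketch, not the kernel code: the function name rel_path_from() is hypothetical, it works on path strings instead of kernfs_node trees, and it assumes both inputs are normalized absolute paths without trailing slashes. It keeps the same return convention as kernfs_path_from_node(): the length of the result, where a value >= buflen means the buffer was too small and its contents should not be used.

```c
#include <string.h>

/*
 * Userspace sketch with a hypothetical name: treat @from as the root and
 * write the pseudo-absolute path to @to into @buf.  Both inputs must be
 * normalized absolute paths such as "/n1/n2" ("/" for the root itself).
 * Returns the length of the result; a value >= buflen means @buf was too
 * small and was left untouched.  Returns -1 if an input is not absolute.
 */
static int rel_path_from(const char *to, const char *from,
			 char *buf, size_t buflen)
{
	const char *p, *rest;
	size_t i = 0, common = 0, ups = 0, total;

	if (to[0] != '/' || from[0] != '/')
		return -1;
	/* Treat the root "/" as an empty list of components. */
	if (strcmp(to, "/") == 0)
		to = "";
	if (strcmp(from, "/") == 0)
		from = "";

	/* Longest common prefix that ends on a component boundary. */
	while (to[i] && to[i] == from[i]) {
		i++;
		if ((to[i] == '/' || to[i] == '\0') &&
		    (from[i] == '/' || from[i] == '\0'))
			common = i;
	}

	/* One "/.." for every component of @from below the common ancestor. */
	for (p = from + common; *p; p++)
		if (*p == '/')
			ups++;

	rest = to + common;
	total = 3 * ups + strlen(rest);
	if (total == 0) {		/* @to is @from itself */
		rest = "/";
		total = 1;
	}
	if (total >= buflen)
		return (int)total;

	buf[0] = '\0';
	for (i = 0; i < ups; i++)
		strcat(buf, "/..");
	strcat(buf, rest);
	return (int)total;
}
```

With the inputs from the kernel-doc comment, to = "/n1/n2/n5" and from = "/n1/n2/n3/n4" produce "/../../n5", and to = "/n1/n2/n3/n4/n5" with from = "/n1/n2/n3" produces "/n4/n5".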
* [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace
From: serge.hallyn @ 2016-01-29 8:54 UTC
To: linux-kernel
Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge Hallyn

From: Aditya Kali <adityakali@google.com>

CLONE_NEWCGROUP will be used to create a new cgroup namespace.

Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
---
 include/uapi/linux/sched.h |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index cc89dde..5f0fe01 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -21,8 +21,7 @@
 #define CLONE_DETACHED		0x00400000	/* Unused, ignored */
 #define CLONE_UNTRACED		0x00800000	/* set if the tracing process can't force CLONE_PTRACE on this clone */
 #define CLONE_CHILD_SETTID	0x01000000	/* set the TID in the child */
-/* 0x02000000 was previously the unused CLONE_STOPPED (Start in stopped state)
-   and is now available for re-use. */
+#define CLONE_NEWCGROUP		0x02000000	/* New cgroup namespace */
 #define CLONE_NEWUTS		0x04000000	/* New utsname namespace */
 #define CLONE_NEWIPC		0x08000000	/* New ipc namespace */
 #define CLONE_NEWUSER		0x10000000	/* New user namespace */
-- 
1.7.9.5
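
As a rough illustration of how userspace might use the new flag, here is a minimal sketch. The helper name enter_new_cgroupns() is hypothetical, and the fallback #define is an assumption for libc headers that do not yet know the flag; its value is the one introduced by this patch. The sketch uses the raw unshare system call so that it does not depend on a libc wrapper being declared.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for headers that predate the flag; value taken from this patch. */
#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

/*
 * Hypothetical helper: try to move the calling process into a new cgroup
 * namespace.  Returns 0 on success or a negative errno value, typically
 * -EPERM when the caller lacks the required privilege and -EINVAL on
 * kernels that do not accept the flag.
 */
static int enter_new_cgroupns(void)
{
	if (syscall(SYS_unshare, CLONE_NEWCGROUP) == -1)
		return -errno;
	return 0;
}
```

A caller would check the return value and fall back to its previous cgroup setup (for example bind mounts) when the kernel rejects the flag.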
* [PATCH 3/8] cgroup: introduce cgroup namespaces
       [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
@ 2016-01-29  8:54   ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA
  2016-01-29  8:54   ` [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 108+ messages in thread
From: serge.hallyn @ 2016-01-29  8:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm,
	ebiederm, gregkh, lizefan, hannes, Serge Hallyn

From: Aditya Kali <adityakali@google.com>

Introduce the ability to create a new cgroup namespace. The newly created
cgroup namespace remembers the cgroup of the process at the point
of creation of the cgroup namespace (referred to as cgroupns-root).
The main purpose of the cgroup namespace is to virtualize the contents
of the /proc/self/cgroup file. Processes inside a cgroup namespace
are only able to see paths relative to their namespace root
(unless they are moved outside of their cgroupns-root, at which point
they will see a relative path from their cgroupns-root).
For a correctly set up container this enables container tools
(like libcontainer, lxc, lmctfy, etc.) to create completely virtualized
containers without leaking the system-level cgroup hierarchy to the task.
This patch only implements the 'unshare' part of the cgroupns.

Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
---
Changelog: 2015-11-24
- move cgroup_namespace.c into cgroup.c (and .h)
- reformatting
- make get_cgroup_ns return void
- rename ns->root_cgrps to root_cset.

Changelog: 2015-12-08
- Move init_cgroup_ns to other variable declarations
- Remove accidental conversion of put_css_set to inline
- Drop BUG_ON(NULL)
- Remove unneeded pre declaration of struct cgroupns_operations.
- cgroup.h: collect common ns declarations
Changelog: 2015-12-09
- cgroup.h: move ns declarations to bottom
- cgroup.c: undo all accidental conversions to inline
Changelog: 2015-12-22
- update for new kernfs_path_from_node() return value.  Since
  cgroup_path was already gpl-exported, I abstained from updating
  its return value.
Changelog: 2015-12-23
- cgroup_path(): use init_cgroup_ns when in interrupt context.
Changelog: 2015-01-02
- move to_cg_ns definition forward in patch series
- cgroup_release_agent: grab css_set_lock around cgroup_path()
- leave cgroup_path non-namespaced, use cgroup_path_ns when namespaced
  path is desired.
Changelog: 2015-01-04
- cgroup_path: continue to use kernfs_path.  Since cgroup_path is
  non-namespaced, use the old version.
- make cgroup_path_ns_locked() static.
Changelog: 2015-01-05
- don't namespace the path printed in debugfs.
Changelog: 2015-01-27
- remove unneeded NULL check before put_cgroup_ns()
Changelog: 2015-01-28
- lock around task_css_set in copy_cgroup_ns, and don't take rcu lock
  around copy_cgroup_ns call in cpuset.
--- fs/proc/namespaces.c | 3 + include/linux/cgroup.h | 49 +++++++++++++ include/linux/nsproxy.h | 2 + include/linux/proc_ns.h | 4 ++ kernel/cgroup.c | 176 ++++++++++++++++++++++++++++++++++++++++++++++- kernel/cpuset.c | 8 +-- kernel/fork.c | 2 +- kernel/nsproxy.c | 19 ++++- 8 files changed, 253 insertions(+), 10 deletions(-) diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c index 276f124..72cb26f 100644 --- a/fs/proc/namespaces.c +++ b/fs/proc/namespaces.c @@ -28,6 +28,9 @@ static const struct proc_ns_operations *ns_entries[] = { &userns_operations, #endif &mntns_operations, +#ifdef CONFIG_CGROUPS + &cgroupns_operations, +#endif }; static const char *proc_ns_get_link(struct dentry *dentry, diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index 2162dca..1773af0 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -17,6 +17,11 @@ #include <linux/seq_file.h> #include <linux/kernfs.h> #include <linux/jump_label.h> +#include <linux/nsproxy.h> +#include <linux/types.h> +#include <linux/ns_common.h> +#include <linux/nsproxy.h> +#include <linux/user_namespace.h> #include <linux/cgroup-defs.h> @@ -611,4 +616,48 @@ static inline void cgroup_sk_free(struct sock_cgroup_data *skcd) {} #endif /* CONFIG_CGROUP_DATA */ +struct cgroup_namespace { + atomic_t count; + struct ns_common ns; + struct user_namespace *user_ns; + struct css_set *root_cset; +}; + +extern struct cgroup_namespace init_cgroup_ns; + +#ifdef CONFIG_CGROUPS + +void free_cgroup_ns(struct cgroup_namespace *ns); + +struct cgroup_namespace * +copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, + struct cgroup_namespace *old_ns); + +char *cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen, + struct cgroup_namespace *ns); + +#else /* !CONFIG_CGROUPS */ + +static inline void free_cgroup_ns(struct cgroup_namespace *ns) { } +static inline struct cgroup_namespace * +copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, + struct cgroup_namespace 
*old_ns) +{ + return old_ns; +} + +#endif /* !CONFIG_CGROUPS */ + +static inline void get_cgroup_ns(struct cgroup_namespace *ns) +{ + if (ns) + atomic_inc(&ns->count); +} + +static inline void put_cgroup_ns(struct cgroup_namespace *ns) +{ + if (ns && atomic_dec_and_test(&ns->count)) + free_cgroup_ns(ns); +} + #endif /* _LINUX_CGROUP_H */ diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h index 35fa08f..ac0d65b 100644 --- a/include/linux/nsproxy.h +++ b/include/linux/nsproxy.h @@ -8,6 +8,7 @@ struct mnt_namespace; struct uts_namespace; struct ipc_namespace; struct pid_namespace; +struct cgroup_namespace; struct fs_struct; /* @@ -33,6 +34,7 @@ struct nsproxy { struct mnt_namespace *mnt_ns; struct pid_namespace *pid_ns_for_children; struct net *net_ns; + struct cgroup_namespace *cgroup_ns; }; extern struct nsproxy init_nsproxy; diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h index 42dfc61..de0e771 100644 --- a/include/linux/proc_ns.h +++ b/include/linux/proc_ns.h @@ -9,6 +9,8 @@ struct pid_namespace; struct nsproxy; struct path; +struct task_struct; +struct inode; struct proc_ns_operations { const char *name; @@ -24,6 +26,7 @@ extern const struct proc_ns_operations ipcns_operations; extern const struct proc_ns_operations pidns_operations; extern const struct proc_ns_operations userns_operations; extern const struct proc_ns_operations mntns_operations; +extern const struct proc_ns_operations cgroupns_operations; /* * We always define these enumerators @@ -34,6 +37,7 @@ enum { PROC_UTS_INIT_INO = 0xEFFFFFFEU, PROC_USER_INIT_INO = 0xEFFFFFFDU, PROC_PID_INIT_INO = 0xEFFFFFFCU, + PROC_CGROUP_INIT_INO = 0xEFFFFFFBU, }; #ifdef CONFIG_PROC_FS diff --git a/kernel/cgroup.c b/kernel/cgroup.c index c03a640..d828e1f 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -58,6 +58,9 @@ #include <linux/kthread.h> #include <linux/delay.h> #include <linux/atomic.h> +#include <linux/proc_ns.h> +#include <linux/nsproxy.h> +#include <linux/proc_ns.h> #include 
<net/sock.h> /* @@ -208,6 +211,15 @@ static unsigned long have_fork_callback __read_mostly; static unsigned long have_exit_callback __read_mostly; static unsigned long have_free_callback __read_mostly; +/* Cgroup namespace for init task */ +struct cgroup_namespace init_cgroup_ns = { + .count = { .counter = 2, }, + .user_ns = &init_user_ns, + .ns.ops = &cgroupns_operations, + .ns.inum = PROC_CGROUP_INIT_INO, + .root_cset = &init_css_set, +}; + /* Ditto for the can_fork callback. */ static unsigned long have_canfork_callback __read_mostly; @@ -2166,6 +2178,36 @@ static struct file_system_type cgroup2_fs_type = { .kill_sb = cgroup_kill_sb, }; +static char * +cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen, + struct cgroup_namespace *ns) +{ + int ret; + struct cgroup *root = cset_cgroup_from_root(ns->root_cset, cgrp->root); + + ret = kernfs_path_from_node(cgrp->kn, root->kn, buf, buflen); + if (ret < 0 || ret >= buflen) + return NULL; + return buf; +} + +char *cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen, + struct cgroup_namespace *ns) +{ + char *ret; + + mutex_lock(&cgroup_mutex); + spin_lock_bh(&css_set_lock); + + ret = cgroup_path_ns_locked(cgrp, buf, buflen, ns); + + spin_unlock_bh(&css_set_lock); + mutex_unlock(&cgroup_mutex); + + return ret; +} +EXPORT_SYMBOL_GPL(cgroup_path_ns); + /** * task_cgroup_path - cgroup path of a task in the first cgroup hierarchy * @task: target task @@ -2193,7 +2235,8 @@ char *task_cgroup_path(struct task_struct *task, char *buf, size_t buflen) if (root) { cgrp = task_cgroup_from_root(task, root); - path = cgroup_path(cgrp, buf, buflen); + path = cgroup_path_ns_locked(cgrp, buf, buflen, + &init_cgroup_ns); } else { /* if no hierarchy exists, everyone is in "/" */ if (strlcpy(buf, "/", buflen) < buflen) @@ -5272,6 +5315,8 @@ int __init cgroup_init(void) BUG_ON(cgroup_init_cftypes(NULL, cgroup_dfl_base_files)); BUG_ON(cgroup_init_cftypes(NULL, cgroup_legacy_base_files)); + 
get_user_ns(init_cgroup_ns.user_ns); + mutex_lock(&cgroup_mutex); /* Add init_css_set to the hash table */ @@ -5409,7 +5454,8 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns, * " (deleted)" is appended to the cgroup path. */ if (cgroup_on_dfl(cgrp) || !(tsk->flags & PF_EXITING)) { - path = cgroup_path(cgrp, buf, PATH_MAX); + path = cgroup_path_ns_locked(cgrp, buf, PATH_MAX, + current->nsproxy->cgroup_ns); if (!path) { retval = -ENAMETOOLONG; goto out_unlock; @@ -5691,7 +5737,10 @@ static void cgroup_release_agent(struct work_struct *work) if (!pathbuf || !agentbuf) goto out; - path = cgroup_path(cgrp, pathbuf, PATH_MAX); + spin_lock_bh(&css_set_lock); + path = cgroup_path_ns_locked(cgrp, pathbuf, PATH_MAX, + &init_cgroup_ns); + spin_unlock_bh(&css_set_lock); if (!path) goto out; @@ -5875,6 +5924,127 @@ void cgroup_sk_free(struct sock_cgroup_data *skcd) #endif /* CONFIG_SOCK_CGROUP_DATA */ +/* cgroup namespaces */ + +static struct cgroup_namespace *alloc_cgroup_ns(void) +{ + struct cgroup_namespace *new_ns; + int ret; + + new_ns = kzalloc(sizeof(struct cgroup_namespace), GFP_KERNEL); + if (!new_ns) + return ERR_PTR(-ENOMEM); + ret = ns_alloc_inum(&new_ns->ns); + if (ret) { + kfree(new_ns); + return ERR_PTR(ret); + } + atomic_set(&new_ns->count, 1); + new_ns->ns.ops = &cgroupns_operations; + return new_ns; +} + +void free_cgroup_ns(struct cgroup_namespace *ns) +{ + put_css_set(ns->root_cset); + put_user_ns(ns->user_ns); + ns_free_inum(&ns->ns); + kfree(ns); +} +EXPORT_SYMBOL(free_cgroup_ns); + +struct cgroup_namespace * +copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, + struct cgroup_namespace *old_ns) +{ + struct cgroup_namespace *new_ns = NULL; + struct css_set *cset = NULL; + int err; + + BUG_ON(!old_ns); + + if (!(flags & CLONE_NEWCGROUP)) { + get_cgroup_ns(old_ns); + return old_ns; + } + + /* Allow only sysadmin to create cgroup namespace. 
*/ + err = -EPERM; + if (!ns_capable(user_ns, CAP_SYS_ADMIN)) + goto err_out; + + mutex_lock(&cgroup_mutex); + spin_lock_bh(&css_set_lock); + + cset = task_css_set(current); + get_css_set(cset); + + spin_unlock_bh(&css_set_lock); + mutex_unlock(&cgroup_mutex); + + err = -ENOMEM; + new_ns = alloc_cgroup_ns(); + if (!new_ns) + goto err_out; + + new_ns->user_ns = get_user_ns(user_ns); + new_ns->root_cset = cset; + + return new_ns; + +err_out: + if (cset) + put_css_set(cset); + kfree(new_ns); + return ERR_PTR(err); +} + +static inline struct cgroup_namespace *to_cg_ns(struct ns_common *ns) +{ + return container_of(ns, struct cgroup_namespace, ns); +} + +static int cgroupns_install(struct nsproxy *nsproxy, void *ns) +{ + pr_info("setns not supported for cgroup namespace"); + return -EINVAL; +} + +static struct ns_common *cgroupns_get(struct task_struct *task) +{ + struct cgroup_namespace *ns = NULL; + struct nsproxy *nsproxy; + + task_lock(task); + nsproxy = task->nsproxy; + if (nsproxy) { + ns = nsproxy->cgroup_ns; + get_cgroup_ns(ns); + } + task_unlock(task); + + return ns ? 
&ns->ns : NULL; +} + +static void cgroupns_put(struct ns_common *ns) +{ + put_cgroup_ns(to_cg_ns(ns)); +} + +const struct proc_ns_operations cgroupns_operations = { + .name = "cgroup", + .type = CLONE_NEWCGROUP, + .get = cgroupns_get, + .put = cgroupns_put, + .install = cgroupns_install, +}; + +static __init int cgroup_namespaces_init(void) +{ + return 0; +} +subsys_initcall(cgroup_namespaces_init); + #ifdef CONFIG_CGROUP_DEBUG static struct cgroup_subsys_state * debug_css_alloc(struct cgroup_subsys_state *parent_css) diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 3e945fc..62d6108 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -2687,10 +2687,10 @@ int proc_cpuset_show(struct seq_file *m, struct pid_namespace *ns, goto out; retval = -ENAMETOOLONG; - rcu_read_lock(); - css = task_css(tsk, cpuset_cgrp_id); - p = cgroup_path(css->cgroup, buf, PATH_MAX); - rcu_read_unlock(); + css = task_get_css(tsk, cpuset_cgrp_id); + p = cgroup_path_ns(css->cgroup, buf, PATH_MAX, + current->nsproxy->cgroup_ns); + css_put(css); if (!p) goto out_free; seq_puts(m, p); diff --git a/kernel/fork.c b/kernel/fork.c index 2e391c7..6611a62 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1884,7 +1884,7 @@ static int check_unshare_flags(unsigned long unshare_flags) if (unshare_flags & ~(CLONE_THREAD|CLONE_FS|CLONE_NEWNS|CLONE_SIGHAND| CLONE_VM|CLONE_FILES|CLONE_SYSVSEM| CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWNET| - CLONE_NEWUSER|CLONE_NEWPID)) + CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWCGROUP)) return -EINVAL; /* * Not implemented, but pretend it works if there is nothing diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index 49746c8..782102e 100644 --- a/kernel/nsproxy.c +++ b/kernel/nsproxy.c @@ -25,6 +25,7 @@ #include <linux/proc_ns.h> #include <linux/file.h> #include <linux/syscalls.h> +#include <linux/cgroup.h> static struct kmem_cache *nsproxy_cachep; @@ -39,6 +40,9 @@ struct nsproxy init_nsproxy = { #ifdef CONFIG_NET .net_ns = &init_net, #endif +#ifdef CONFIG_CGROUPS + .cgroup_ns = 
&init_cgroup_ns, +#endif }; static inline struct nsproxy *create_nsproxy(void) @@ -92,6 +96,13 @@ static struct nsproxy *create_new_namespaces(unsigned long flags, goto out_pid; } + new_nsp->cgroup_ns = copy_cgroup_ns(flags, user_ns, + tsk->nsproxy->cgroup_ns); + if (IS_ERR(new_nsp->cgroup_ns)) { + err = PTR_ERR(new_nsp->cgroup_ns); + goto out_cgroup; + } + new_nsp->net_ns = copy_net_ns(flags, user_ns, tsk->nsproxy->net_ns); if (IS_ERR(new_nsp->net_ns)) { err = PTR_ERR(new_nsp->net_ns); @@ -101,6 +112,8 @@ static struct nsproxy *create_new_namespaces(unsigned long flags, return new_nsp; out_net: + put_cgroup_ns(new_nsp->cgroup_ns); +out_cgroup: if (new_nsp->pid_ns_for_children) put_pid_ns(new_nsp->pid_ns_for_children); out_pid: @@ -128,7 +141,8 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk) struct nsproxy *new_ns; if (likely(!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | - CLONE_NEWPID | CLONE_NEWNET)))) { + CLONE_NEWPID | CLONE_NEWNET | + CLONE_NEWCGROUP)))) { get_nsproxy(old_ns); return 0; } @@ -165,6 +179,7 @@ void free_nsproxy(struct nsproxy *ns) put_ipc_ns(ns->ipc_ns); if (ns->pid_ns_for_children) put_pid_ns(ns->pid_ns_for_children); + put_cgroup_ns(ns->cgroup_ns); put_net(ns->net_ns); kmem_cache_free(nsproxy_cachep, ns); } @@ -180,7 +195,7 @@ int unshare_nsproxy_namespaces(unsigned long unshare_flags, int err = 0; if (!(unshare_flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | - CLONE_NEWNET | CLONE_NEWPID))) + CLONE_NEWNET | CLONE_NEWPID | CLONE_NEWCGROUP))) return 0; user_ns = new_cred ? new_cred->user_ns : current_user_ns(); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
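The namespaced paths that cgroup_path_ns_locked() returns come from kernfs_path_from_node(), which walks from the namespace root up to a common ancestor and then down to the target. The following user-space sketch illustrates that idea on plain path strings. It is an illustration under stated assumptions, not the kernel implementation: rel_path_from() and its helpers are hypothetical names, inputs are assumed to be normalized absolute paths (leading '/', no trailing or doubled slashes), and the return convention only loosely mirrors the kernel one (string length, or -1 on bad input).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length of the longest prefix of a and b made of whole, equal components. */
static size_t common_len(const char *a, const char *b)
{
	size_t i = 0, last = 0;

	for (;;) {
		int a_end = (a[i] == '/' || a[i] == '\0');
		int b_end = (b[i] == '/' || b[i] == '\0');

		if (a_end && b_end)
			last = i;	/* both strings sit on a component boundary */
		if (a[i] == '\0' || a[i] != b[i])
			break;
		i++;
	}
	return last;
}

/* Number of components in p: "" and "/" have none, "/a/b" has two. */
static size_t depth_of(const char *p)
{
	size_t d = 0;

	for (; *p; p++)
		if (*p == '/' && p[1] != '\0')
			d++;
	return d;
}

/* Append s at offset len if it still fits; always report strlen(s). */
static size_t append(char *buf, size_t buflen, size_t len, const char *s)
{
	size_t n = strlen(s);

	if (len + n < buflen)
		memcpy(buf + len, s, n + 1);	/* copies the trailing NUL too */
	return n;
}

/*
 * Write the path of @to as seen from @from, treating @from as the root.
 * Returns the length of the full result; a value >= buflen means the
 * buffer was too small.  Returns -1 if an input is not absolute.
 */
int rel_path_from(const char *to, const char *from, char *buf, size_t buflen)
{
	size_t common, up, i, len = 0;
	const char *rest;

	assert(to && from);
	if (to[0] != '/' || from[0] != '/')
		return -1;
	if (strcmp(to, from) == 0)
		return snprintf(buf, buflen, "/");

	common = common_len(from, to);
	up = depth_of(from + common);	/* levels to climb with "/.." */
	rest = to + common;		/* what remains below the ancestor */
	if (strcmp(rest, "/") == 0)
		rest = "";

	if (buflen)
		buf[0] = '\0';
	for (i = 0; i < up; i++)
		len += append(buf, buflen, len, "/..");
	len += append(buf, buflen, len, rest);
	return (int)len;
}
```

Calling rel_path_from("/n1/n2/n5", "/n1/n2/n3/n4", buf, sizeof(buf)) fills buf with "/../../n5", which is the shape of path a task sees in /proc/pid/cgroup for a cgroup outside its own cgroupns-root.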
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA ` (9 subsequent siblings) 10 siblings, 0 replies; 108+ messages in thread From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b From: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> The new function kernfs_path_from_node() generates and returns kernfs path of a given kernfs_node relative to a given parent kernfs_node. Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Signed-off-by: Serge E. Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org> --- Changelog 20151125: - Fully-wing multilinecomments - Rework kernfs_path_from_node_locked() logic - Replace BUG_ONs with returning NULL - Use a const char* for /.. and precalculate its size Changelog 20151130: - Update kernfs_path_from_node_locked comment Changelog 20151208: - kernfs_node_distance: * Remove BUG_ON(NULL)s * Rename kernfs_node_distance to kernfs_depth - kernfs_common-ancestor: * Remove useless checks for depth == 0 * Add check to ensure nodes are from same root - kernfs_path_from_node_locked: * Remove needless __must_check * Put p;len on its own decl line. 
* Fix wrong WARN_ONCE usage Changelog 20151209: - kernfs_path_from_node: change arguments to 'to' and 'from', and change their order. Changelog 20151222: - kernfs_path_from_node{,_locked}: return the string length. kernfs_path is gpl-exported, so changing its return value seemed ill-advised, but if no one minds I can update it too. Changelog 20151223: - don't allocate memory in pr_cont_kernfs_path() under spinlock --- fs/kernfs/dir.c | 192 ++++++++++++++++++++++++++++++++++++++++-------- include/linux/kernfs.h | 9 ++- 2 files changed, 166 insertions(+), 35 deletions(-) diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 996b774..38fa03a 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -44,28 +44,123 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen) return strlcpy(buf, kn->parent ? kn->name : "/", buflen); } -static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf, - size_t buflen) +/* kernfs_node_depth - compute depth from @from to @to */ +static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to) { - char *p = buf + buflen; - int len; + size_t depth = 0; - *--p = '\0'; + while (to->parent && to != from) { + depth++; + to = to->parent; + } + return depth; +} - do { - len = strlen(kn->name); - if (p - buf < len + 1) { - buf[0] = '\0'; - p = NULL; - break; - } - p -= len; - memcpy(p, kn->name, len); - *--p = '/'; - kn = kn->parent; - } while (kn && kn->parent); +static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a, + struct kernfs_node *b) +{ + size_t da, db; + struct kernfs_root *ra = kernfs_root(a), *rb = kernfs_root(b); - return p; + if (ra != rb) + return NULL; + + da = kernfs_depth(ra->kn, a); + db = kernfs_depth(rb->kn, b); + + while (da > db) { + a = a->parent; + da--; + } + while (db > da) { + b = b->parent; + db--; + } + + /* worst case b and a will be the same at root */ + while (b != a) { + b = b->parent; + a = a->parent; + } + + return a; +} + +/** + * 
kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to, + * where kn_from is treated as root of the path. + * @kn_from: kernfs node which should be treated as root for the path + * @kn_to: kernfs node to which path is needed + * @buf: buffer to copy the path into + * @buflen: size of @buf + * + * We need to handle couple of scenarios here: + * [1] when @kn_from is an ancestor of @kn_to at some level + * kn_from: /n1/n2/n3 + * kn_to: /n1/n2/n3/n4/n5 + * result: /n4/n5 + * + * [2] when @kn_from is on a different hierarchy and we need to find common + * ancestor between @kn_from and @kn_to. + * kn_from: /n1/n2/n3/n4 + * kn_to: /n1/n2/n5 + * result: /../../n5 + * OR + * kn_from: /n1/n2/n3/n4/n5 [depth=5] + * kn_to: /n1/n2/n3 [depth=3] + * result: /../.. + * + * return value: length of the string. If greater than buflen, + * then contents of buf are undefined. On error, -1 is returned. + */ +static int +kernfs_path_from_node_locked(struct kernfs_node *kn_to, + struct kernfs_node *kn_from, char *buf, + size_t buflen) +{ + struct kernfs_node *kn, *common; + const char parent_str[] = "/.."; + size_t depth_from, depth_to, len = 0, nlen = 0; + char *p; + int i; + + if (!kn_from) + kn_from = kernfs_root(kn_to)->kn; + + if (kn_from == kn_to) + return strlcpy(buf, "/", buflen); + + common = kernfs_common_ancestor(kn_from, kn_to); + if (WARN_ON(!common)) + return -1; + + depth_to = kernfs_depth(common, kn_to); + depth_from = kernfs_depth(common, kn_from); + + if (buf) + buf[0] = '\0'; + + for (i = 0; i < depth_from; i++) + len += strlcpy(buf + len, parent_str, + len < buflen ? 
buflen - len : 0); + + /* Calculate how many bytes we need for the rest */ + for (kn = kn_to; kn != common; kn = kn->parent) + nlen += strlen(kn->name) + 1; + + if (len + nlen >= buflen) + return len + nlen; + + p = buf + len + nlen; + *p = '\0'; + for (kn = kn_to; kn != common; kn = kn->parent) { + nlen = strlen(kn->name); + p -= nlen; + memcpy(p, kn->name, nlen); + *(--p) = '/'; + } + + return len + nlen; } /** @@ -115,6 +210,34 @@ size_t kernfs_path_len(struct kernfs_node *kn) } /** + * kernfs_path_from_node - build path of node @to relative to @from. + * @from: parent kernfs_node relative to which we need to build the path + * @to: kernfs_node of interest + * @buf: buffer to copy @to's path into + * @buflen: size of @buf + * + * Builds @to's path relative to @from in @buf. @from and @to must + * be on the same kernfs-root. If @from is not parent of @to, then a relative + * path (which includes '..'s) as needed to reach from @from to @to is + * returned. + * + * If @buf isn't long enough, the return value will be greater than @buflen + * and @buf contents are undefined. 
+ */ +int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from, + char *buf, size_t buflen) +{ + unsigned long flags; + int ret; + + spin_lock_irqsave(&kernfs_rename_lock, flags); + ret = kernfs_path_from_node_locked(to, from, buf, buflen); + spin_unlock_irqrestore(&kernfs_rename_lock, flags); + return ret; +} +EXPORT_SYMBOL_GPL(kernfs_path_from_node); + +/** * kernfs_path - build full path of a given node * @kn: kernfs_node of interest * @buf: buffer to copy @kn's name into @@ -127,13 +250,12 @@ size_t kernfs_path_len(struct kernfs_node *kn) */ char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) { - unsigned long flags; - char *p; + int ret; - spin_lock_irqsave(&kernfs_rename_lock, flags); - p = kernfs_path_locked(kn, buf, buflen); - spin_unlock_irqrestore(&kernfs_rename_lock, flags); - return p; + ret = kernfs_path_from_node(kn, NULL, buf, buflen); + if (ret < 0 || ret >= buflen) + return NULL; + return buf; } EXPORT_SYMBOL_GPL(kernfs_path); @@ -164,17 +286,25 @@ void pr_cont_kernfs_name(struct kernfs_node *kn) void pr_cont_kernfs_path(struct kernfs_node *kn) { unsigned long flags; - char *p; + int sz; spin_lock_irqsave(&kernfs_rename_lock, flags); - p = kernfs_path_locked(kn, kernfs_pr_cont_buf, - sizeof(kernfs_pr_cont_buf)); - if (p) - pr_cont("%s", p); - else - pr_cont("<name too long>"); + sz = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf, + sizeof(kernfs_pr_cont_buf)); + if (sz < 0) { + pr_cont("(error)"); + goto out; + } + + if (sz >= sizeof(kernfs_pr_cont_buf)) { + pr_cont("(name too long)"); + goto out; + } + + pr_cont("%s", kernfs_pr_cont_buf); +out: spin_unlock_irqrestore(&kernfs_rename_lock, flags); } diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index af51df3..716bfde 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -267,8 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn) int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen); size_t 
kernfs_path_len(struct kernfs_node *kn); -char * __must_check kernfs_path(struct kernfs_node *kn, char *buf, - size_t buflen); +int kernfs_path_from_node(struct kernfs_node *root_kn, struct kernfs_node *kn, + char *buf, size_t buflen); +char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen); void pr_cont_kernfs_name(struct kernfs_node *kn); void pr_cont_kernfs_path(struct kernfs_node *kn); struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn); @@ -338,8 +339,8 @@ static inline int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen) static inline size_t kernfs_path_len(struct kernfs_node *kn) { return 0; } -static inline char * __must_check kernfs_path(struct kernfs_node *kn, char *buf, - size_t buflen) +static inline char *kernfs_path(struct kernfs_node *kn, char *buf, + size_t buflen) { return NULL; } static inline void pr_cont_kernfs_name(struct kernfs_node *kn) { } -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
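For readers who want to experiment with the relative-path logic outside the kernel, here is a rough userspace sketch of the same idea. It is not the kernfs code: struct node, node_depth(), common_ancestor() and path_from_node() are hypothetical stand-ins for kernfs_node and the helpers in the patch above, there is no locking, and both nodes are assumed to be in the same tree.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-in for struct kernfs_node: only a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;
};

/* Number of parent hops from @to up to @from (or up to the root). */
static size_t node_depth(const struct node *from, const struct node *to)
{
	size_t depth = 0;

	while (to->parent && to != from) {
		depth++;
		to = to->parent;
	}
	return depth;
}

/* Deepest node that is an ancestor of (or equal to) both @a and @b. */
static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	size_t da = node_depth(NULL, a), db = node_depth(NULL, b);

	while (da > db) {
		a = a->parent;
		da--;
	}
	while (db > da) {
		b = b->parent;
		db--;
	}
	while (a != b) {
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/*
 * Write the path of @to as seen from @from into @buf.  Returns the
 * length of the path; if that is >= @buflen, @buf is left untouched.
 */
static int path_from_node(const struct node *to, const struct node *from,
			  char *buf, size_t buflen)
{
	const struct node *common, *n;
	size_t up, tail = 0, len, i;
	char *p;

	if (from == to)
		return snprintf(buf, buflen, "/");

	common = common_ancestor(from, to);
	up = node_depth(common, from);	/* number of "/.." components */

	for (n = to; n != common; n = n->parent)
		tail += strlen(n->name) + 1;

	len = 3 * up + tail;
	if (len >= buflen)
		return (int)len;

	for (i = 0; i < up; i++)
		memcpy(buf + 3 * i, "/..", 3);

	/* Fill in the names back to front, from @to up to the ancestor. */
	p = buf + len;
	*p = '\0';
	for (n = to; n != common; n = n->parent) {
		size_t nlen = strlen(n->name);

		p -= nlen;
		memcpy(p, n->name, nlen);
		*(--p) = '/';
	}
	return (int)len;
}
```

With a chain root/n1/n2/n3/n4/n5 this reproduces the three scenarios from the kernfs_path_from_node_locked() comment: an ancestor as @from gives /n4/n5, a sibling subtree gives /../../n5, and a descendant as @from gives /../..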
* [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> 2016-01-29 8:54 ` [PATCH 1/8] kernfs: Add API to generate relative kernfs path serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` [PATCH 3/8] cgroup: introduce cgroup namespaces serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA ` (8 subsequent siblings) 10 siblings, 0 replies; 108+ messages in thread From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b From: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> CLONE_NEWCGROUP will be used to create new cgroup namespace. Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- include/uapi/linux/sched.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h index cc89dde..5f0fe01 100644 --- a/include/uapi/linux/sched.h +++ b/include/uapi/linux/sched.h @@ -21,8 +21,7 @@ #define CLONE_DETACHED 0x00400000 /* Unused, ignored */ #define CLONE_UNTRACED 0x00800000 /* set if the tracing process can't force CLONE_PTRACE on this clone */ #define CLONE_CHILD_SETTID 0x01000000 /* set the TID in the child */ -/* 0x02000000 was previously the unused CLONE_STOPPED (Start in stopped state) - and is now available for re-use. 
*/ +#define CLONE_NEWCGROUP 0x02000000 /* New cgroup namespace */ #define CLONE_NEWUTS 0x04000000 /* New utsname namespace */ #define CLONE_NEWIPC 0x08000000 /* New ipc namespace */ #define CLONE_NEWUSER 0x10000000 /* New user namespace */ -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
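From userspace the new flag would be passed to unshare(2) like any other namespace flag. A minimal sketch, assuming libc headers that may not define CLONE_NEWCGROUP yet (hence the fallback define with the value from this patch); enter_new_cgroup_ns() is a hypothetical helper name, and the call is expected to fail with EPERM without CAP_SYS_ADMIN or EINVAL on kernels without this series.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000	/* value taken from this patch */
#endif

/*
 * Move the calling task into a new cgroup namespace rooted at its
 * current cgroups.  Returns 0 on success or a negative errno value.
 */
static int enter_new_cgroup_ns(void)
{
	if (unshare(CLONE_NEWCGROUP) == -1)
		return -errno;
	return 0;
}
```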
* [PATCH 3/8] cgroup: introduce cgroup namespaces [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> 2016-01-29 8:54 ` [PATCH 1/8] kernfs: Add API to generate relative kernfs path serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` serge.hallyn ` (7 subsequent siblings) 10 siblings, 0 replies; 108+ messages in thread From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b From: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Introduce the ability to create a new cgroup namespace. The newly created cgroup namespace remembers the cgroup of the process at the point of creation of the cgroup namespace (referred to as cgroupns-root). The main purpose of a cgroup namespace is to virtualize the contents of the /proc/self/cgroup file. Processes inside a cgroup namespace are only able to see paths relative to their namespace root (unless they are moved outside of their cgroupns-root, at which point they will see a relative path from their cgroupns-root). For a correctly set up container this enables container-tools (like libcontainer, lxc, lmctfy, etc.) to create completely virtualized containers without leaking the system-level cgroup hierarchy to the task. This patch only implements the 'unshare' part of the cgroupns.
Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- Changelog: 2015-11-24 - move cgroup_namespace.c into cgroup.c (and .h) - reformatting - make get_cgroup_ns return void - rename ns->root_cgrps to root_cset. Changelog: 2015-12-08 - Move init_cgroup_ns to other variable declarations - Remove accidental conversion of put_css_set to inline - Drop BUG_ON(NULL) - Remove unneeded pre declaration of struct cgroupns_operations. - cgroup.h: collect common ns declarations Changelog: 2015-12-09 - cgroup.h: move ns declarations to bottom - cgroup.c: undo all accidental conversions to inline Changelog: 2015-12-22 - update for new kernfs_path_from_node() return value. Since cgroup_path was already gpl-exported, I abstained from updating its return value. Changelog: 2015-12-23 - cgroup_path(): use init_cgroup_ns when in interrupt context. Changelog: 2015-01-02 - move to_cg_ns definition forward in patch series - cgroup_release_agent: grab css_set_lock around cgroup_path() - leave cgroup_path non-namespaced, use cgroup_path_ns when namespaced path is desired. Changelog: 2015-01-04 - cgroup_path: continue to use kernfs_path. Since cgroup_path is non-namespaced, use the old version. - make cgroup_path_ns_locked() static. Changelog: 2015-01-05 - don't namespace the path printed in debugfs. Changelog: 2015-01-27 - remove unneeded NULL check before put_cgroup_ns() Changelog: 2015-01-28 - lock around task_css_set in copy_cgroup_ns, and don't take rcu lock around copy_cgroup_ns call in cpuset.
--- fs/proc/namespaces.c | 3 + include/linux/cgroup.h | 49 +++++++++++++ include/linux/nsproxy.h | 2 + include/linux/proc_ns.h | 4 ++ kernel/cgroup.c | 176 ++++++++++++++++++++++++++++++++++++++++++++++- kernel/cpuset.c | 8 +-- kernel/fork.c | 2 +- kernel/nsproxy.c | 19 ++++- 8 files changed, 253 insertions(+), 10 deletions(-) diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c index 276f124..72cb26f 100644 --- a/fs/proc/namespaces.c +++ b/fs/proc/namespaces.c @@ -28,6 +28,9 @@ static const struct proc_ns_operations *ns_entries[] = { &userns_operations, #endif &mntns_operations, +#ifdef CONFIG_CGROUPS + &cgroupns_operations, +#endif }; static const char *proc_ns_get_link(struct dentry *dentry, diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index 2162dca..1773af0 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -17,6 +17,11 @@ #include <linux/seq_file.h> #include <linux/kernfs.h> #include <linux/jump_label.h> +#include <linux/nsproxy.h> +#include <linux/types.h> +#include <linux/ns_common.h> +#include <linux/nsproxy.h> +#include <linux/user_namespace.h> #include <linux/cgroup-defs.h> @@ -611,4 +616,48 @@ static inline void cgroup_sk_free(struct sock_cgroup_data *skcd) {} #endif /* CONFIG_CGROUP_DATA */ +struct cgroup_namespace { + atomic_t count; + struct ns_common ns; + struct user_namespace *user_ns; + struct css_set *root_cset; +}; + +extern struct cgroup_namespace init_cgroup_ns; + +#ifdef CONFIG_CGROUPS + +void free_cgroup_ns(struct cgroup_namespace *ns); + +struct cgroup_namespace * +copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, + struct cgroup_namespace *old_ns); + +char *cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen, + struct cgroup_namespace *ns); + +#else /* !CONFIG_CGROUPS */ + +static inline void free_cgroup_ns(struct cgroup_namespace *ns) { } +static inline struct cgroup_namespace * +copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, + struct cgroup_namespace 
*old_ns) +{ + return old_ns; +} + +#endif /* !CONFIG_CGROUPS */ + +static inline void get_cgroup_ns(struct cgroup_namespace *ns) +{ + if (ns) + atomic_inc(&ns->count); +} + +static inline void put_cgroup_ns(struct cgroup_namespace *ns) +{ + if (ns && atomic_dec_and_test(&ns->count)) + free_cgroup_ns(ns); +} + #endif /* _LINUX_CGROUP_H */ diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h index 35fa08f..ac0d65b 100644 --- a/include/linux/nsproxy.h +++ b/include/linux/nsproxy.h @@ -8,6 +8,7 @@ struct mnt_namespace; struct uts_namespace; struct ipc_namespace; struct pid_namespace; +struct cgroup_namespace; struct fs_struct; /* @@ -33,6 +34,7 @@ struct nsproxy { struct mnt_namespace *mnt_ns; struct pid_namespace *pid_ns_for_children; struct net *net_ns; + struct cgroup_namespace *cgroup_ns; }; extern struct nsproxy init_nsproxy; diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h index 42dfc61..de0e771 100644 --- a/include/linux/proc_ns.h +++ b/include/linux/proc_ns.h @@ -9,6 +9,8 @@ struct pid_namespace; struct nsproxy; struct path; +struct task_struct; +struct inode; struct proc_ns_operations { const char *name; @@ -24,6 +26,7 @@ extern const struct proc_ns_operations ipcns_operations; extern const struct proc_ns_operations pidns_operations; extern const struct proc_ns_operations userns_operations; extern const struct proc_ns_operations mntns_operations; +extern const struct proc_ns_operations cgroupns_operations; /* * We always define these enumerators @@ -34,6 +37,7 @@ enum { PROC_UTS_INIT_INO = 0xEFFFFFFEU, PROC_USER_INIT_INO = 0xEFFFFFFDU, PROC_PID_INIT_INO = 0xEFFFFFFCU, + PROC_CGROUP_INIT_INO = 0xEFFFFFFBU, }; #ifdef CONFIG_PROC_FS diff --git a/kernel/cgroup.c b/kernel/cgroup.c index c03a640..d828e1f 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -58,6 +58,9 @@ #include <linux/kthread.h> #include <linux/delay.h> #include <linux/atomic.h> +#include <linux/proc_ns.h> +#include <linux/nsproxy.h> +#include <linux/proc_ns.h> #include 
<net/sock.h> /* @@ -208,6 +211,15 @@ static unsigned long have_fork_callback __read_mostly; static unsigned long have_exit_callback __read_mostly; static unsigned long have_free_callback __read_mostly; +/* Cgroup namespace for init task */ +struct cgroup_namespace init_cgroup_ns = { + .count = { .counter = 2, }, + .user_ns = &init_user_ns, + .ns.ops = &cgroupns_operations, + .ns.inum = PROC_CGROUP_INIT_INO, + .root_cset = &init_css_set, +}; + /* Ditto for the can_fork callback. */ static unsigned long have_canfork_callback __read_mostly; @@ -2166,6 +2178,36 @@ static struct file_system_type cgroup2_fs_type = { .kill_sb = cgroup_kill_sb, }; +static char * +cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen, + struct cgroup_namespace *ns) +{ + int ret; + struct cgroup *root = cset_cgroup_from_root(ns->root_cset, cgrp->root); + + ret = kernfs_path_from_node(cgrp->kn, root->kn, buf, buflen); + if (ret < 0 || ret >= buflen) + return NULL; + return buf; +} + +char *cgroup_path_ns(struct cgroup *cgrp, char *buf, size_t buflen, + struct cgroup_namespace *ns) +{ + char *ret; + + mutex_lock(&cgroup_mutex); + spin_lock_bh(&css_set_lock); + + ret = cgroup_path_ns_locked(cgrp, buf, buflen, ns); + + spin_unlock_bh(&css_set_lock); + mutex_unlock(&cgroup_mutex); + + return ret; +} +EXPORT_SYMBOL_GPL(cgroup_path_ns); + /** * task_cgroup_path - cgroup path of a task in the first cgroup hierarchy * @task: target task @@ -2193,7 +2235,8 @@ char *task_cgroup_path(struct task_struct *task, char *buf, size_t buflen) if (root) { cgrp = task_cgroup_from_root(task, root); - path = cgroup_path(cgrp, buf, buflen); + path = cgroup_path_ns_locked(cgrp, buf, buflen, + &init_cgroup_ns); } else { /* if no hierarchy exists, everyone is in "/" */ if (strlcpy(buf, "/", buflen) < buflen) @@ -5272,6 +5315,8 @@ int __init cgroup_init(void) BUG_ON(cgroup_init_cftypes(NULL, cgroup_dfl_base_files)); BUG_ON(cgroup_init_cftypes(NULL, cgroup_legacy_base_files)); + 
get_user_ns(init_cgroup_ns.user_ns); + mutex_lock(&cgroup_mutex); /* Add init_css_set to the hash table */ @@ -5409,7 +5454,8 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns, * " (deleted)" is appended to the cgroup path. */ if (cgroup_on_dfl(cgrp) || !(tsk->flags & PF_EXITING)) { - path = cgroup_path(cgrp, buf, PATH_MAX); + path = cgroup_path_ns_locked(cgrp, buf, PATH_MAX, + current->nsproxy->cgroup_ns); if (!path) { retval = -ENAMETOOLONG; goto out_unlock; @@ -5691,7 +5737,10 @@ static void cgroup_release_agent(struct work_struct *work) if (!pathbuf || !agentbuf) goto out; - path = cgroup_path(cgrp, pathbuf, PATH_MAX); + spin_lock_bh(&css_set_lock); + path = cgroup_path_ns_locked(cgrp, pathbuf, PATH_MAX, + &init_cgroup_ns); + spin_unlock_bh(&css_set_lock); if (!path) goto out; @@ -5875,6 +5924,127 @@ void cgroup_sk_free(struct sock_cgroup_data *skcd) #endif /* CONFIG_SOCK_CGROUP_DATA */ +/* cgroup namespaces */ + +static struct cgroup_namespace *alloc_cgroup_ns(void) +{ + struct cgroup_namespace *new_ns; + int ret; + + new_ns = kzalloc(sizeof(struct cgroup_namespace), GFP_KERNEL); + if (!new_ns) + return ERR_PTR(-ENOMEM); + ret = ns_alloc_inum(&new_ns->ns); + if (ret) { + kfree(new_ns); + return ERR_PTR(ret); + } + atomic_set(&new_ns->count, 1); + new_ns->ns.ops = &cgroupns_operations; + return new_ns; +} + +void free_cgroup_ns(struct cgroup_namespace *ns) +{ + put_css_set(ns->root_cset); + put_user_ns(ns->user_ns); + ns_free_inum(&ns->ns); + kfree(ns); +} +EXPORT_SYMBOL(free_cgroup_ns); + +struct cgroup_namespace * +copy_cgroup_ns(unsigned long flags, struct user_namespace *user_ns, + struct cgroup_namespace *old_ns) +{ + struct cgroup_namespace *new_ns = NULL; + struct css_set *cset = NULL; + int err; + + BUG_ON(!old_ns); + + if (!(flags & CLONE_NEWCGROUP)) { + get_cgroup_ns(old_ns); + return old_ns; + } + + /* Allow only sysadmin to create cgroup namespace. 
*/ + err = -EPERM; + if (!ns_capable(user_ns, CAP_SYS_ADMIN)) + goto err_out; + + mutex_lock(&cgroup_mutex); + spin_lock_bh(&css_set_lock); + + cset = task_css_set(current); + get_css_set(cset); + + spin_unlock_bh(&css_set_lock); + mutex_unlock(&cgroup_mutex); + + err = -ENOMEM; + new_ns = alloc_cgroup_ns(); + if (!new_ns) + goto err_out; + + new_ns->user_ns = get_user_ns(user_ns); + new_ns->root_cset = cset; + + return new_ns; + +err_out: + if (cset) + put_css_set(cset); + kfree(new_ns); + return ERR_PTR(err); +} + +static inline struct cgroup_namespace *to_cg_ns(struct ns_common *ns) +{ + return container_of(ns, struct cgroup_namespace, ns); +} + +static int cgroupns_install(struct nsproxy *nsproxy, void *ns) +{ + pr_info("setns not supported for cgroup namespace"); + return -EINVAL; +} + +static struct ns_common *cgroupns_get(struct task_struct *task) +{ + struct cgroup_namespace *ns = NULL; + struct nsproxy *nsproxy; + + task_lock(task); + nsproxy = task->nsproxy; + if (nsproxy) { + ns = nsproxy->cgroup_ns; + get_cgroup_ns(ns); + } + task_unlock(task); + + return ns ? 
&ns->ns : NULL; +} + +static void cgroupns_put(struct ns_common *ns) +{ + put_cgroup_ns(to_cg_ns(ns)); +} + +const struct proc_ns_operations cgroupns_operations = { + .name = "cgroup", + .type = CLONE_NEWCGROUP, + .get = cgroupns_get, + .put = cgroupns_put, + .install = cgroupns_install, +}; + +static __init int cgroup_namespaces_init(void) +{ + return 0; +} +subsys_initcall(cgroup_namespaces_init); + #ifdef CONFIG_CGROUP_DEBUG static struct cgroup_subsys_state * debug_css_alloc(struct cgroup_subsys_state *parent_css) diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 3e945fc..62d6108 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -2687,10 +2687,10 @@ int proc_cpuset_show(struct seq_file *m, struct pid_namespace *ns, goto out; retval = -ENAMETOOLONG; - rcu_read_lock(); - css = task_css(tsk, cpuset_cgrp_id); - p = cgroup_path(css->cgroup, buf, PATH_MAX); - rcu_read_unlock(); + css = task_get_css(tsk, cpuset_cgrp_id); + p = cgroup_path_ns(css->cgroup, buf, PATH_MAX, + current->nsproxy->cgroup_ns); + css_put(css); if (!p) goto out_free; seq_puts(m, p); diff --git a/kernel/fork.c b/kernel/fork.c index 2e391c7..6611a62 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1884,7 +1884,7 @@ static int check_unshare_flags(unsigned long unshare_flags) if (unshare_flags & ~(CLONE_THREAD|CLONE_FS|CLONE_NEWNS|CLONE_SIGHAND| CLONE_VM|CLONE_FILES|CLONE_SYSVSEM| CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWNET| - CLONE_NEWUSER|CLONE_NEWPID)) + CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWCGROUP)) return -EINVAL; /* * Not implemented, but pretend it works if there is nothing diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index 49746c8..782102e 100644 --- a/kernel/nsproxy.c +++ b/kernel/nsproxy.c @@ -25,6 +25,7 @@ #include <linux/proc_ns.h> #include <linux/file.h> #include <linux/syscalls.h> +#include <linux/cgroup.h> static struct kmem_cache *nsproxy_cachep; @@ -39,6 +40,9 @@ struct nsproxy init_nsproxy = { #ifdef CONFIG_NET .net_ns = &init_net, #endif +#ifdef CONFIG_CGROUPS + .cgroup_ns = 
&init_cgroup_ns, +#endif }; static inline struct nsproxy *create_nsproxy(void) @@ -92,6 +96,13 @@ static struct nsproxy *create_new_namespaces(unsigned long flags, goto out_pid; } + new_nsp->cgroup_ns = copy_cgroup_ns(flags, user_ns, + tsk->nsproxy->cgroup_ns); + if (IS_ERR(new_nsp->cgroup_ns)) { + err = PTR_ERR(new_nsp->cgroup_ns); + goto out_cgroup; + } + new_nsp->net_ns = copy_net_ns(flags, user_ns, tsk->nsproxy->net_ns); if (IS_ERR(new_nsp->net_ns)) { err = PTR_ERR(new_nsp->net_ns); @@ -101,6 +112,8 @@ static struct nsproxy *create_new_namespaces(unsigned long flags, return new_nsp; out_net: + put_cgroup_ns(new_nsp->cgroup_ns); +out_cgroup: if (new_nsp->pid_ns_for_children) put_pid_ns(new_nsp->pid_ns_for_children); out_pid: @@ -128,7 +141,8 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk) struct nsproxy *new_ns; if (likely(!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | - CLONE_NEWPID | CLONE_NEWNET)))) { + CLONE_NEWPID | CLONE_NEWNET | + CLONE_NEWCGROUP)))) { get_nsproxy(old_ns); return 0; } @@ -165,6 +179,7 @@ void free_nsproxy(struct nsproxy *ns) put_ipc_ns(ns->ipc_ns); if (ns->pid_ns_for_children) put_pid_ns(ns->pid_ns_for_children); + put_cgroup_ns(ns->cgroup_ns); put_net(ns->net_ns); kmem_cache_free(nsproxy_cachep, ns); } @@ -180,7 +195,7 @@ int unshare_nsproxy_namespaces(unsigned long unshare_flags, int err = 0; if (!(unshare_flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | - CLONE_NEWNET | CLONE_NEWPID))) + CLONE_NEWNET | CLONE_NEWPID | CLONE_NEWCGROUP))) return 0; user_ns = new_cred ? new_cred->user_ns : current_user_ns(); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
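The user-visible effect of this patch is on the path column of /proc/<pid>/cgroup, whose lines have the form hierarchy-ID:controller-list:path (for example 8:memory:/../../.. for a task outside the reader's cgroupns root, as in the cover letter). A small userspace sketch of how a tool might split such a line and notice that the path climbs above its namespace root; struct cgroup_line, parse_cgroup_line() and path_is_outside_ns_root() are hypothetical names, and the buffer sizes are arbitrary.

```c
#include <stdlib.h>
#include <string.h>

/* One parsed line of /proc/<pid>/cgroup. */
struct cgroup_line {
	int hierarchy_id;
	char controllers[64];
	char path[256];
};

/*
 * Parse one "ID:controllers:path" line, with or without a trailing
 * newline.  Returns 0 on success, -1 if the line is malformed or a
 * field does not fit.
 */
static int parse_cgroup_line(const char *line, struct cgroup_line *out)
{
	const char *c1 = strchr(line, ':');
	const char *c2 = c1 ? strchr(c1 + 1, ':') : NULL;
	size_t clen, plen;

	if (!c2)
		return -1;

	clen = (size_t)(c2 - c1 - 1);
	plen = strcspn(c2 + 1, "\n");
	if (clen >= sizeof(out->controllers) || plen >= sizeof(out->path))
		return -1;

	out->hierarchy_id = atoi(line);
	memcpy(out->controllers, c1 + 1, clen);
	out->controllers[clen] = '\0';
	memcpy(out->path, c2 + 1, plen);
	out->path[plen] = '\0';
	return 0;
}

/* Nonzero if @path starts with a "/.." component, i.e. leaves the ns root. */
static int path_is_outside_ns_root(const char *path)
{
	return strncmp(path, "/..", 3) == 0 &&
	       (path[3] == '/' || path[3] == '\0');
}
```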
* [PATCH 4/8] cgroup: cgroup namespace setns support 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 ` serge.hallyn -1 siblings, 0 replies; 108+ messages in thread From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b From: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> setns on a cgroup namespace is allowed only if the task has CAP_SYS_ADMIN in its current user namespace and over the user namespace associated with the target cgroupns. Attaching to another cgroupns causes no implicit cgroup changes. It is expected that someone moves the attaching process under the target cgroupns-root. Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Signed-off-by: Serge E.
Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- kernel/cgroup.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index d828e1f..96e3dab 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -6004,10 +6004,23 @@ static inline struct cgroup_namespace *to_cg_ns(struct ns_common *ns) return container_of(ns, struct cgroup_namespace, ns); } -static int cgroupns_install(struct nsproxy *nsproxy, void *ns) +static int cgroupns_install(struct nsproxy *nsproxy, struct ns_common *ns) { - pr_info("setns not supported for cgroup namespace"); - return -EINVAL; + struct cgroup_namespace *cgroup_ns = to_cg_ns(ns); + + if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN) || + !ns_capable(cgroup_ns->user_ns, CAP_SYS_ADMIN)) + return -EPERM; + + /* Don't need to do anything if we are attaching to our own cgroupns. */ + if (cgroup_ns == nsproxy->cgroup_ns) + return 0; + + get_cgroup_ns(cgroup_ns); + put_cgroup_ns(nsproxy->cgroup_ns); + nsproxy->cgroup_ns = cgroup_ns; + + return 0; } static struct ns_common *cgroupns_get(struct task_struct *task) -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
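For illustration, attaching to another task's cgroup namespace from userspace might look roughly like the sketch below. It assumes the namespace file is exposed as /proc/<pid>/ns/cgroup (following the .name = "cgroup" entry in cgroupns_operations) and defines CLONE_NEWCGROUP itself if the libc headers lack it; cgroup_ns_path() and join_cgroup_ns() are hypothetical helper names. As the commit message says, this switches only the namespace, so the caller's cgroups are left unchanged.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000	/* value taken from patch 2/8 */
#endif

/* Format the cgroup namespace file path of @pid; returns snprintf()'s result. */
static int cgroup_ns_path(pid_t pid, char *buf, size_t buflen)
{
	return snprintf(buf, buflen, "/proc/%ld/ns/cgroup", (long)pid);
}

/*
 * Join the cgroup namespace of @pid.  Returns 0 on success or a
 * negative errno value (e.g. -EPERM without CAP_SYS_ADMIN, -ENOENT on
 * kernels without cgroup namespaces).
 */
static int join_cgroup_ns(pid_t pid)
{
	char path[64];
	int fd, ret = 0;

	cgroup_ns_path(pid, path, sizeof(path));
	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd == -1)
		return -errno;

	if (setns(fd, CLONE_NEWCGROUP) == -1)
		ret = -errno;

	close(fd);
	return ret;
}
```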
* [PATCH 5/8] kernfs: define kernfs_node_dentry [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> ` (3 preceding siblings ...) 2016-01-29 8:54 ` serge.hallyn @ 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` [PATCH 6/8] cgroup: mount cgroupns-root when inside non-init cgroupns serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA ` (5 subsequent siblings) 10 siblings, 0 replies; 108+ messages in thread From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b From: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Add a new kernfs API to look up the dentry for a particular kernfs path. Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Signed-off-by: Serge E. Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org> --- Changelog: 20151116 - Don't allow user namespaces to bind new subsystems 20151118 - postpone the FS_USERNS_MOUNT flag until the last patch, until we can convince ourselves it is safe. 20151207 - Switch to walking up the kernfs path from kn root.
20151208 - Split out the kernfs change
         - Style changes
         - Switch from pr_crit to WARN_ON
         - Reorder arguments to kernfs_obtain_root
         - rename kernfs_obtain_root to kernfs_node_dentry
20160104 - kernfs_node_dentry: lock inode for lookup_one_len()
---
 fs/kernfs/mount.c      | 69 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/kernfs.h |  2 ++
 2 files changed, 71 insertions(+)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 8eaf417..074bb8b 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -14,6 +14,7 @@
 #include <linux/magic.h>
 #include <linux/slab.h>
 #include <linux/pagemap.h>
+#include <linux/namei.h>
 
 #include "kernfs-internal.h"
 
@@ -62,6 +63,74 @@ struct kernfs_root *kernfs_root_from_sb(struct super_block *sb)
 	return NULL;
 }
 
+/*
+ * find the next ancestor in the path down to @child, where @parent was the
+ * ancestor whose descendant we want to find.
+ *
+ * Say the path is /a/b/c/d.  @child is d, @parent is NULL.  We return the root
+ * node.  If @parent is b, then we return the node for c.
+ * Passing in d as @parent is not ok.
+ */
+static struct kernfs_node *
+find_next_ancestor(struct kernfs_node *child, struct kernfs_node *parent)
+{
+	if (child == parent) {
+		pr_crit_once("BUG in find_next_ancestor: called with parent == child");
+		return NULL;
+	}
+
+	while (child->parent != parent) {
+		if (!child->parent)
+			return NULL;
+		child = child->parent;
+	}
+
+	return child;
+}
+
+/**
+ * kernfs_node_dentry - get a dentry for the given kernfs_node
+ * @kn: kernfs_node for which a dentry is needed
+ * @sb: the kernfs super_block
+ */
+struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
+				  struct super_block *sb)
+{
+	struct dentry *dentry;
+	struct kernfs_node *knparent = NULL;
+
+	BUG_ON(sb->s_op != &kernfs_sops);
+
+	dentry = dget(sb->s_root);
+
+	/* Check if this is the root kernfs_node */
+	if (!kn->parent)
+		return dentry;
+
+	knparent = find_next_ancestor(kn, NULL);
+	if (WARN_ON(!knparent))
+		return ERR_PTR(-EINVAL);
+
+	do {
+		struct dentry *dtmp;
+		struct kernfs_node *kntmp;
+
+		if (kn == knparent)
+			return dentry;
+		kntmp = find_next_ancestor(kn, knparent);
+		if (WARN_ON(!kntmp))
+			return ERR_PTR(-EINVAL);
+		mutex_lock(&d_inode(dentry)->i_mutex);
+		dtmp = lookup_one_len(kntmp->name, dentry, strlen(kntmp->name));
+		mutex_unlock(&d_inode(dentry)->i_mutex);
+		dput(dentry);
+		if (IS_ERR(dtmp))
+			return dtmp;
+		knparent = kntmp;
+		dentry = dtmp;
+	} while (1);
+}
+
 static int kernfs_fill_super(struct super_block *sb, unsigned long magic)
 {
 	struct kernfs_super_info *info = kernfs_info(sb);
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 716bfde..c06c442 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -284,6 +284,8 @@ struct kernfs_node *kernfs_node_from_dentry(struct dentry *dentry);
 struct kernfs_root *kernfs_root_from_sb(struct super_block *sb);
 struct inode *kernfs_get_inode(struct super_block *sb, struct kernfs_node *kn);
 
+struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
+				  struct super_block *sb);
 struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 				       unsigned int flags, void *priv);
 void kernfs_destroy_root(struct kernfs_root *root);
-- 
1.7.9.5
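The top-down walk done by find_next_ancestor() and kernfs_node_dentry() may be easier to follow in a small userspace model. The sketch below is not kernel code: struct node, next_ancestor() and path_from_root() are hypothetical stand-ins for struct kernfs_node, find_next_ancestor() and the lookup loop, and the lookup_one_len() step is replaced by appending the component name to a string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct kernfs_node. */
struct node {
	const char *name;
	struct node *parent;	/* NULL for the root */
};

/*
 * Same walk as find_next_ancestor() in the patch: return the ancestor of
 * @child (or @child itself) whose parent is @parent, or NULL if @parent
 * is not above @child.
 */
static struct node *next_ancestor(struct node *child, struct node *parent)
{
	assert(child != NULL);
	if (child == parent)
		return NULL;
	while (child->parent != parent) {
		if (!child->parent)
			return NULL;
		child = child->parent;
	}
	return child;
}

/*
 * Model of the loop in kernfs_node_dentry(): start at the root and step
 * down one component at a time until @target is reached.  Where the
 * kernel looks up a child dentry, this sketch appends "/<name>" to @buf.
 * Returns 0 on success, -1 on a bad walk or a too-small buffer.
 */
static int path_from_root(struct node *target, char *buf, size_t len)
{
	struct node *cur = next_ancestor(target, NULL);	/* the root */
	size_t used = 0;

	if (!cur || len < 2)
		return -1;
	if (cur == target) {
		memcpy(buf, "/", 2);
		return 0;
	}
	while (cur != target) {
		int n;

		cur = next_ancestor(target, cur);
		if (!cur)
			return -1;
		n = snprintf(buf + used, len - used, "/%s", cur->name);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Each step costs a walk up from the target, so the model is quadratic in the depth of the path, as is the loop in the patch; for the shallow trees cgroups usually form this is unlikely to matter.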
* [PATCH 6/8] cgroup: mount cgroupns-root when inside non-init cgroupns
From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: Serge Hallyn, linux-api-u79uwXL29TY76Z2rM5mHXA,
    containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
    hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
    lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
    gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A,
    cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

From: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>

This patch enables cgroup mounting inside userns when a process has
appropriate privileges. The cgroup filesystem mounted is rooted at the
cgroupns-root. Thus, in a container-setup, only the hierarchy under the
cgroupns-root is exposed inside the container. This allows container
management tools to run inside the containers without depending on any
global state.

Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
Changelog:
20151116 - Don't allow user namespaces to bind new subsystems
20151118 - postpone the FS_USERNS_MOUNT flag until the last patch,
           until we can convince ourselves it is safe.
20151207 - Switch to walking up the kernfs path from kn root.
         - Group initialized variables
         - Explain the capable(CAP_SYS_ADMIN) check
         - Style fixes
20160104 - kernfs_node_dentry: lock inode for lookup_one_len()
20160128 - grab needed lock in mount

Signed-off-by: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
---
 kernel/cgroup.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 96e3dab..3e04df0 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1983,6 +1983,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 {
 	bool is_v2 = fs_type == &cgroup2_fs_type;
 	struct super_block *pinned_sb = NULL;
+	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
 	struct cgroup_subsys *ss;
 	struct cgroup_root *root;
 	struct cgroup_sb_opts opts;
@@ -1991,6 +1992,14 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 	int i;
 	bool new_sb;
 
+	get_cgroup_ns(ns);
+
+	/* Check if the caller has permission to mount. */
+	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) {
+		put_cgroup_ns(ns);
+		return ERR_PTR(-EPERM);
+	}
+
 	/*
 	 * The first time anyone tries to mount a cgroup, enable the list
 	 * linking each css_set to its tasks and fix up all existing tasks.
@@ -2001,6 +2010,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 	if (is_v2) {
 		if (data) {
 			pr_err("cgroup2: unknown option \"%s\"\n", (char *)data);
+			put_cgroup_ns(ns);
 			return ERR_PTR(-EINVAL);
 		}
 		cgrp_dfl_root_visible = true;
@@ -2106,6 +2116,16 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 		goto out_unlock;
 	}
 
+	/*
+	 * We know this subsystem has not yet been bound.  Users in a non-init
+	 * user namespace may only mount hierarchies with no bound subsystems,
+	 * i.e. 'none,name=user1'
+	 */
+	if (!opts.none && !capable(CAP_SYS_ADMIN)) {
+		ret = -EPERM;
+		goto out_unlock;
+	}
+
 	root = kzalloc(sizeof(*root), GFP_KERNEL);
 	if (!root) {
 		ret = -ENOMEM;
@@ -2124,12 +2144,37 @@ out_free:
 	kfree(opts.release_agent);
 	kfree(opts.name);
 
-	if (ret)
+	if (ret) {
+		put_cgroup_ns(ns);
 		return ERR_PTR(ret);
+	}
 out_mount:
 	dentry = kernfs_mount(fs_type, flags, root->kf_root,
 			      is_v2 ? CGROUP2_SUPER_MAGIC : CGROUP_SUPER_MAGIC,
 			      &new_sb);
+
+	/*
+	 * In non-init cgroup namespace, instead of root cgroup's
+	 * dentry, we return the dentry corresponding to the
+	 * cgroupns->root_cgrp.
+	 */
+	if (!IS_ERR(dentry) && ns != &init_cgroup_ns) {
+		struct dentry *nsdentry;
+		struct cgroup *cgrp;
+
+		mutex_lock(&cgroup_mutex);
+		spin_lock_bh(&css_set_lock);
+
+		cgrp = cset_cgroup_from_root(ns->root_cset, root);
+
+		spin_unlock_bh(&css_set_lock);
+		mutex_unlock(&cgroup_mutex);
+
+		nsdentry = kernfs_node_dentry(cgrp->kn, dentry->d_sb);
+		dput(dentry);
+		dentry = nsdentry;
+	}
+
 	if (IS_ERR(dentry) || !new_sb)
 		cgroup_put(&root->cgrp);
 
@@ -2142,6 +2187,7 @@ out_mount:
 		deactivate_super(pinned_sb);
 	}
 
+	put_cgroup_ns(ns);
 	return dentry;
 }
 
-- 
1.7.9.5
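The two permission checks this patch adds to cgroup_mount() can be summarized as a small decision function. This is only a model under stated assumptions, not the kernel code: struct mount_request and cgroup_mount_check() are hypothetical names, the booleans stand in for the results of ns_capable(ns->user_ns, CAP_SYS_ADMIN), capable(CAP_SYS_ADMIN) and opts.none, and needs_new_root models the legacy (v1) path where no existing hierarchy matched the mount options.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the inputs to the checks in cgroup_mount(). */
struct mount_request {
	bool admin_in_cgroupns_userns;	/* ns_capable(ns->user_ns, CAP_SYS_ADMIN) */
	bool admin_in_init_userns;	/* capable(CAP_SYS_ADMIN) */
	bool needs_new_root;		/* no existing hierarchy matched (v1 path) */
	bool opts_none;			/* opts.none, e.g. "none,name=user1" */
};

/* Returns 0 if the modeled checks pass, -EPERM otherwise. */
static int cgroup_mount_check(const struct mount_request *req)
{
	assert(req != NULL);

	/* The caller must be privileged over the userns owning its cgroupns. */
	if (!req->admin_in_cgroupns_userns)
		return -EPERM;

	/*
	 * Binding a not-yet-bound subsystem creates a new root; only
	 * CAP_SYS_ADMIN in the initial user namespace may do that.  Named
	 * hierarchies without subsystems ("none") are exempt.
	 */
	if (req->needs_new_root && !req->opts_none && !req->admin_in_init_userns)
		return -EPERM;

	return 0;
}
```

In this model, root inside a container can remount a hierarchy the host already set up, or create a subsystem-less named hierarchy, but cannot bind a controller the host left unbound.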
* [PATCH 7/8] cgroup: Add documentation for cgroup namespaces
From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: Serge Hallyn, linux-api-u79uwXL29TY76Z2rM5mHXA,
    containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
    hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
    lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
    gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A,
    cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

From: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>

Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
Changelog (2015-12-08): Merge into Documentation/cgroup.txt
Changelog (2015-12-22): Reformat to try to follow the style of the rest
                        of the cgroup.txt file.
Changelog (2015-12-22): tj: Reorganized to better fit the documentation.
---
 Documentation/cgroup-v2.txt | 147 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 147 insertions(+)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 65b3eac..eee9012 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -47,6 +47,11 @@ CONTENTS
   5-3. IO
     5-3-1. IO Interface Files
     5-3-2. Writeback
+6. Namespace
+  6-1. Basics
+  6-2. The Root and Views
+  6-3. Migration and setns(2)
+  6-4. Interaction with Other Namespaces
 P. Information on Kernel Programming
   P-1. Filesystem Support for Writeback
 D. Deprecated v1 Core Features
@@ -1085,6 +1090,148 @@ writeback as follows.
 	vm.dirty[_background]_ratio.
 
 
+6. Namespace
+
+6-1. Basics
+
+cgroup namespace provides a mechanism to virtualize the view of the
+"/proc/$PID/cgroup" file and cgroup mounts.  The CLONE_NEWCGROUP clone
+flag can be used with clone(2) and unshare(2) to create a new cgroup
+namespace.  The process running inside the cgroup namespace will have
+its "/proc/$PID/cgroup" output restricted to cgroupns root.  The
+cgroupns root is the cgroup of the process at the time of creation of
+the cgroup namespace.
+
+Without cgroup namespace, the "/proc/$PID/cgroup" file shows the
+complete path of the cgroup of a process.  In a container setup where
+a set of cgroups and namespaces are intended to isolate processes the
+"/proc/$PID/cgroup" file may leak potential system level information
+to the isolated processes.  For example:
+
+  # cat /proc/self/cgroup
+  0::/batchjobs/container_id1
+
+The path '/batchjobs/container_id1' can be considered as system-data
+and undesirable to expose to the isolated processes.  cgroup namespace
+can be used to restrict visibility of this path.  For example, before
+creating a cgroup namespace, one would see:
+
+  # ls -l /proc/self/ns/cgroup
+  lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835]
+  # cat /proc/self/cgroup
+  0::/batchjobs/container_id1
+
+After unsharing a new namespace, the view changes.
+
+  # ls -l /proc/self/ns/cgroup
+  lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183]
+  # cat /proc/self/cgroup
+  0::/
+
+When some thread from a multi-threaded process unshares its cgroup
+namespace, the new cgroupns gets applied to the entire process (all
+the threads).  This is natural for the v2 hierarchy; however, for the
+legacy hierarchies, this may be unexpected.
+
+A cgroup namespace is alive as long as there are processes inside or
+mounts pinning it.  When the last usage goes away, the cgroup
+namespace is destroyed.  The cgroupns root and the actual cgroups
+remain.
+
+
+6-2. The Root and Views
+
+The 'cgroupns root' for a cgroup namespace is the cgroup in which the
+process calling unshare(2) is running.  For example, if a process in
+/batchjobs/container_id1 cgroup calls unshare, cgroup
+/batchjobs/container_id1 becomes the cgroupns root.  For the
+init_cgroup_ns, this is the real root ('/') cgroup.
+
+The cgroupns root cgroup does not change even if the namespace creator
+process later moves to a different cgroup.
+
+  # ~/unshare -c # unshare cgroupns in some cgroup
+  # cat /proc/self/cgroup
+  0::/
+  # mkdir sub_cgrp_1
+  # echo 0 > sub_cgrp_1/cgroup.procs
+  # cat /proc/self/cgroup
+  0::/sub_cgrp_1
+
+Each process gets its namespace-specific view of "/proc/$PID/cgroup".
+
+Processes running inside the cgroup namespace will be able to see
+cgroup paths (in /proc/self/cgroup) only inside their root cgroup.
+From within an unshared cgroupns:
+
+  # sleep 100000 &
+  [1] 7353
+  # echo 7353 > sub_cgrp_1/cgroup.procs
+  # cat /proc/7353/cgroup
+  0::/sub_cgrp_1
+
+From the initial cgroup namespace, the real cgroup path will be
+visible:
+
+  $ cat /proc/7353/cgroup
+  0::/batchjobs/container_id1/sub_cgrp_1
+
+From a sibling cgroup namespace (that is, a namespace rooted at a
+different cgroup), the cgroup path relative to its own cgroup
+namespace root will be shown.  For instance, if the reader's cgroup
+namespace root is at '/batchjobs/container_id2', then it will see
+
+  # cat /proc/7353/cgroup
+  0::/../container_id1/sub_cgrp_1
+
+Note that the relative path always starts with '/' to indicate that
+it is relative to the cgroup namespace root of the caller.
+
+
+6-3. Migration and setns(2)
+
+Processes inside a cgroup namespace can move into and out of the
+namespace root if they have proper access to external cgroups.  For
+example, from inside a namespace with cgroupns root at
+/batchjobs/container_id1, and assuming that the global hierarchy is
+still accessible inside cgroupns:
+
+  # cat /proc/7353/cgroup
+  0::/sub_cgrp_1
+  # echo 7353 > batchjobs/container_id2/cgroup.procs
+  # cat /proc/7353/cgroup
+  0::/../container_id2
+
+Note that this kind of setup is not encouraged.  A task inside cgroup
+namespace should only be exposed to its own cgroupns hierarchy.
+
+setns(2) to another cgroup namespace is allowed when:
+
+(a) the process has CAP_SYS_ADMIN against its current user namespace
+(b) the process has CAP_SYS_ADMIN against the target cgroup
+    namespace's userns
+
+No implicit cgroup changes happen with attaching to another cgroup
+namespace.  It is expected that someone moves the attaching
+process under the target cgroup namespace root.
+
+
+6-4. Interaction with Other Namespaces
+
+Namespace specific cgroup hierarchy can be mounted by a process
+running inside a non-init cgroup namespace.
+
+  # mount -t cgroup2 none $MOUNT_POINT
+
+This will mount the unified cgroup hierarchy with cgroupns root as the
+filesystem root.  The process needs CAP_SYS_ADMIN against its user and
+mount namespaces.
+
+The virtualization of /proc/self/cgroup file combined with restricting
+the view of cgroup hierarchy by namespace-private cgroupfs mount
+provides a properly isolated cgroup view inside the container.
+
+
 P. Information on Kernel Programming
 
 This section contains kernel programming information in the areas
-- 
1.7.9.5
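The relative paths described in this documentation (a leading '/', then one ".." per level the reader's cgroupns root sits below the common ancestor, then the rest of the target path) can be reproduced with a short userspace function. This is a sketch for illustration only: cgroup_path_rel() and append() are hypothetical names, not the kernel's cgroup_path_ns() or kernfs_path_from_node(), and the inputs are assumed to be absolute paths with no trailing or doubled slashes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append @s to @buf at *@used; returns 0 on success, -1 if it does not fit. */
static int append(char *buf, size_t len, size_t *used, const char *s)
{
	int n = snprintf(buf + *used, len - *used, "%s", s);

	if (n < 0 || (size_t)n >= len - *used)
		return -1;
	*used += (size_t)n;
	return 0;
}

/*
 * Write into @buf the path of cgroup @target as seen by a reader whose
 * cgroup namespace is rooted at @ns_root.  Returns 0 on success, -1 on
 * invalid input or a too-small buffer.
 */
static int cgroup_path_rel(const char *ns_root, const char *target,
			   char *buf, size_t len)
{
	const char *r = ns_root, *t = target;
	size_t used = 0;

	assert(ns_root != NULL && target != NULL && buf != NULL);
	if (r[0] != '/' || t[0] != '/' || len == 0)
		return -1;
	buf[0] = '\0';

	/* Skip the leading components that both paths share. */
	while (r[0] == '/' && r[1] != '\0' && t[0] == '/' && t[1] != '\0') {
		size_t rl = strcspn(r + 1, "/");
		size_t tl = strcspn(t + 1, "/");

		if (rl != tl || strncmp(r + 1, t + 1, rl) != 0)
			break;
		r += 1 + rl;
		t += 1 + tl;
	}

	/* One "/.." for each component of the namespace root left over. */
	while (r[0] == '/' && r[1] != '\0') {
		if (append(buf, len, &used, "/..") != 0)
			return -1;
		r += 1 + strcspn(r + 1, "/");
	}

	/* Then whatever part of the target lies below the common ancestor. */
	if (t[0] == '/' && t[1] != '\0' && append(buf, len, &used, t) != 0)
		return -1;

	/* Reader root and target are the same cgroup: the view is "/". */
	if (used == 0 && append(buf, len, &used, "/") != 0)
		return -1;
	return 0;
}
```

With ns_root "/batchjobs/container_id1", a target of "/batchjobs/container_id1/sub_cgrp_1" comes out as "/sub_cgrp_1" and a target of "/batchjobs/container_id2" as "/../container_id2", matching the migration example in section 6-3.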
* [PATCH 8/8] Add FS_USERNS_FLAG to cgroup fs
From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2016-01-29 8:54 UTC
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: Serge Hallyn, linux-api-u79uwXL29TY76Z2rM5mHXA,
    containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
    hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
    lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
    gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A,
    cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

From: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>

Set FS_USERNS_MOUNT on the cgroup and cgroup2 filesystem types, allowing
root in a non-init user namespace to mount them. This should now be
safe, because

1. non-init-root cannot mount a previously unbound subsystem
2. the task doing the mount must be privileged with respect to the
   user namespace owning the cgroup namespace
3. the mounted subsystem will have its current cgroup as the root
   dentry. The permissions will be unchanged, so tasks will receive
   no new privilege over the cgroups which they did not have on the
   original mounts.

Signed-off-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
 kernel/cgroup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3e04df0..7a58749 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2216,12 +2216,14 @@ static struct file_system_type cgroup_fs_type = {
 	.name = "cgroup",
 	.mount = cgroup_mount,
 	.kill_sb = cgroup_kill_sb,
+	.fs_flags = FS_USERNS_MOUNT,
 };
 
 static struct file_system_type cgroup2_fs_type = {
 	.name = "cgroup2",
 	.mount = cgroup_mount,
 	.kill_sb = cgroup_kill_sb,
+	.fs_flags = FS_USERNS_MOUNT,
 };
 
 static char *
-- 
1.7.9.5
* [PATCH] selftests/cgroupns: new test for cgroup namespaces
From: Alban Crequy @ 2016-01-31 17:48 UTC
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
    linux-api-u79uwXL29TY76Z2rM5mHXA,
    containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
    iago-lYLaGTFnO9sWenYVfaLwtA, Alban Crequy,
    lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
    hannes-druUgvl0LCNAfugRpC6u6w, tj-DgEjT+Ai2ygdnm+yROfE0A,
    cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

From: Alban Crequy <alban-lYLaGTFnO9sWenYVfaLwtA@public.gmane.org>

This adds the selftest "cgroupns_test" in order to test the CGroup
Namespace patchset.

cgroupns_test creates two child processes. They perform a list of
actions defined by the array cgroupns_tests. This array can easily be
extended to more scenarios without adding much code. They are
synchronized with eventfds to ensure only one action is performed at a
time.

The memory is shared between the processes (CLONE_VM) so each child
process can know the pid of their siblings without extra IPC.

The output explains the scenario being played. Short extract:

> current cgroup: /user.slice/user-0.slice/session-1.scope
> child process #0: check that process #self (pid=482) has cgroup /user.slice/user-0.slice/session-1.scope
> child process #0: unshare cgroupns
> child process #0: move process #self (pid=482) to cgroup cgroup-a/subcgroup-a
> child process #0: join parent cgroupns

The test does not change the mount namespace and does not mount any
new cgroup2 filesystem. Therefore this does not test that the cgroup2
mount is correctly rooted to the cgroupns root at mount time.

Signed-off-by: Alban Crequy <alban-lYLaGTFnO9sWenYVfaLwtA@public.gmane.org>
Acked-by: Serge E. Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
---
Changelog:
20160131 - rebase on sergeh/cgroupns.v10 and fix conflicts
20160115 - Detect where cgroup2 is mounted, don't assume
           /sys/fs/cgroup (suggested by sergeh)
         - Check more error conditions (from krnowak's review)
         - Coding style (from krnowak's review)
         - Update error message for Linux >= 4.5 (from krnowak's
           review)
20160104 - Fix coding style (from sergeh's review)
         - Fix printf formatting (from sergeh's review)
         - Fix parsing of /proc/pid/cgroup (from sergeh's review)
         - Fix concatenation of cgroup paths
20151219 - First version

This patch is available in the cgroupns.v10-tests branch of
https://github.com/kinvolk/linux.git
It is rebased on top of Serge Hallyn's cgroupns.v10 branch of
https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/

Test results:
- SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=1
- SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=0
---
 tools/testing/selftests/Makefile                 |   1 +
 tools/testing/selftests/cgroupns/Makefile        |  11 +
 tools/testing/selftests/cgroupns/cgroupns_test.c | 445 +++++++++++++++++++++++
 3 files changed, 457 insertions(+)
 create mode 100644 tools/testing/selftests/cgroupns/Makefile
 create mode 100644 tools/testing/selftests/cgroupns/cgroupns_test.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index b04afc3..b373135 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -1,5 +1,6 @@
 TARGETS = breakpoints
 TARGETS += capabilities
+TARGETS += cgroupns
 TARGETS += cpu-hotplug
 TARGETS += efivarfs
 TARGETS += exec
diff --git a/tools/testing/selftests/cgroupns/Makefile b/tools/testing/selftests/cgroupns/Makefile
new file mode 100644
index 0000000..0fdbe0a
--- /dev/null
+++ b/tools/testing/selftests/cgroupns/Makefile
@@ -0,0 +1,11 @@
+CFLAGS += -I../../../../usr/include/
+CFLAGS += -I../../../../include/uapi/
+
+all: cgroupns_test
+
+TEST_PROGS := cgroupns_test + +include ../lib.mk + +clean: + $(RM) cgroupns_test diff --git a/tools/testing/selftests/cgroupns/cgroupns_test.c b/tools/testing/selftests/cgroupns/cgroupns_test.c new file mode 100644 index 0000000..71e2336 --- /dev/null +++ b/tools/testing/selftests/cgroupns/cgroupns_test.c @@ -0,0 +1,445 @@ +#define _GNU_SOURCE + +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <string.h> +#include <sys/statfs.h> +#include <inttypes.h> +#include <sched.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/socket.h> +#include <sys/wait.h> +#include <sys/eventfd.h> +#include <signal.h> +#include <fcntl.h> + +#include <linux/magic.h> +#include <linux/sched.h> + +#include "../kselftest.h" + +#define STACK_SIZE 65536 + +static char cgroup_mountpoint[4096]; +static char root_cgroup[4096]; + +#define CHILDREN_COUNT 2 +typedef struct { + pid_t pid; + uint8_t *stack; + int start_semfd; + int end_semfd; +} cgroupns_child_t; +cgroupns_child_t children[CHILDREN_COUNT]; + +typedef enum { + UNSHARE_CGROUPNS, + JOIN_CGROUPNS, + CHECK_CGROUP, + CHECK_CGROUP_WITH_ROOT_PREFIX, + MOVE_CGROUP, + MOVE_CGROUP_WITH_ROOT_PREFIX, +} cgroupns_action_t; + +static const struct { + int actor_id; + cgroupns_action_t action; + int target_id; + char *path; +} cgroupns_tests[] = { + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, + + { 0, UNSHARE_CGROUPNS, -1, NULL}, + + { 0, CHECK_CGROUP, -1, "/"}, + { 0, CHECK_CGROUP, 0, "/"}, + { 0, CHECK_CGROUP, 1, "/"}, + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, + + { 1, UNSHARE_CGROUPNS, -1, NULL}, + + { 0, CHECK_CGROUP, -1, "/"}, + { 0, CHECK_CGROUP, 0, "/"}, + { 0, CHECK_CGROUP, 
1, "/"}, + { 1, CHECK_CGROUP, -1, "/"}, + { 1, CHECK_CGROUP, 0, "/"}, + { 1, CHECK_CGROUP, 1, "/"}, + + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a"}, + { 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b"}, + + { 0, CHECK_CGROUP, -1, "/cgroup-a"}, + { 0, CHECK_CGROUP, 0, "/cgroup-a"}, + { 0, CHECK_CGROUP, 1, "/cgroup-b"}, + { 1, CHECK_CGROUP, -1, "/cgroup-b"}, + { 1, CHECK_CGROUP, 0, "/cgroup-a"}, + { 1, CHECK_CGROUP, 1, "/cgroup-b"}, + + { 0, UNSHARE_CGROUPNS, -1, NULL}, + { 1, UNSHARE_CGROUPNS, -1, NULL}, + + { 0, CHECK_CGROUP, -1, "/"}, + { 0, CHECK_CGROUP, 0, "/"}, + { 0, CHECK_CGROUP, 1, "/../cgroup-b"}, + { 1, CHECK_CGROUP, -1, "/"}, + { 1, CHECK_CGROUP, 0, "/../cgroup-a"}, + { 1, CHECK_CGROUP, 1, "/"}, + + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a"}, + { 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b/sub1-b"}, + + { 0, CHECK_CGROUP, 0, "/sub1-a"}, + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a"}, + { 1, CHECK_CGROUP, 1, "/sub1-b"}, + + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a"}, + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a"}, + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a"}, + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a"}, + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, + + { 1, UNSHARE_CGROUPNS, -1, NULL}, + { 1, CHECK_CGROUP, 0, "/../../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, + { 0, UNSHARE_CGROUPNS, -1, NULL}, + { 0, CHECK_CGROUP, 1, "/../../../../../cgroup-b/sub1-b"}, + + { 0, JOIN_CGROUPNS, -1, NULL}, + { 1, JOIN_CGROUPNS, -1, NULL}, + + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"}, + { 
1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"}, +}; +#define cgroupns_tests_len (sizeof(cgroupns_tests) / sizeof(cgroupns_tests[0])) + +static void +get_cgroup_mountpoint(char *path, size_t len) +{ + char line[4096]; + char dummy[4096]; + char mountpoint[4096]; + FILE *f; + + f = fopen("/proc/self/mountinfo", "r"); + if (!f) { + printf("FAIL: cannot open mountinfo\n"); + ksft_exit_fail(); + } + + for (;;) { + if (!fgets(line, sizeof(line), f)) { + if (ferror(f)) { + printf("FAIL: cannot read mountinfo\n"); + ksft_exit_fail(); + } + printf("FAIL: cannot find cgroup2 mount in mountinfo\n"); + ksft_exit_fail(); + } + + line[strcspn(line, "\n")] = 0; + /* 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue + * (1)(2)(3) (4) (5) (6) (7) (8) (9) (10) (11) + */ + if (strstr(line, " - cgroup2 ") == NULL) /* (9)=cgroup2 */ + continue; + + if (sscanf(line, "%4095s %4095s %4095s %4095s %4095s", dummy, dummy, dummy, dummy, mountpoint) != 5) + continue; + + strncpy(path, mountpoint, len); + path[len-1] = '\0'; + break; + } + + fclose(f); +} + +static void +get_cgroup(pid_t pid, char *path, size_t len) +{ + char proc_path[4096]; + char line[4096]; + FILE *f; + + if (pid > 0) { + sprintf(proc_path, "/proc/%d/cgroup", pid); + } else { + sprintf(proc_path, "/proc/self/cgroup"); + } + + f = fopen(proc_path, "r"); + if (!f) { + printf("FAIL: cannot open %s\n", proc_path); + ksft_exit_fail(); + } + + for (;;) { + if (!fgets(line, sizeof(line), f)) { + if (ferror(f)) { + printf("FAIL: cannot read %s\n", proc_path); + ksft_exit_fail(); + } + printf("FAIL: could not parse %s\n", proc_path); + ksft_exit_fail(); + } + + line[strcspn(line, "\n")] = 0; + if (strncmp(line, "0::", 3) == 0) { + strncpy(path, line+3, len); + path[len-1] = '\0'; + break; + } + } + + fclose(f); +} + +static void +move_cgroup(pid_t target_pid, int prefix, char *cgroup) +{ + char knob_dir[4096]; + 
char knob_path[4096]; + char buf[128]; + FILE *f; + int ret; + + if (prefix) { + sprintf(knob_dir, "%s/%s/%s", cgroup_mountpoint, root_cgroup, cgroup); + sprintf(knob_path, "%s/cgroup.procs", knob_dir, cgroup); + } else { + sprintf(knob_dir, "%s/%s", cgroup_mountpoint, cgroup); + sprintf(knob_path, "%s/cgroup.procs", knob_dir); + } + + mkdir(knob_dir, 0755); + + sprintf(buf, "%d\n", target_pid); + + f = fopen(knob_path, "w"); + ret = fwrite(buf, strlen(buf), 1, f); + if (ret != 1) { + printf("FAIL: cannot write to %s (ret=%d)\n", knob_path, ret); + ksft_exit_fail(); + } + fclose(f); +} + +static int +child_func(void *arg) +{ + uintptr_t id = (uintptr_t) arg; + char child_cgroup[4096]; + char expected_cgroup[4096]; + char process_name[128]; + char proc_path[128]; + int step; + int ret; + int nsfd; + + for (step = 0; step < cgroupns_tests_len; step++) { + uint64_t counter = 0; + pid_t target_pid; + + /* wait a signal from the parent process before starting this step */ + ret = read(children[id].start_semfd, &counter, sizeof(counter)); + if (ret != sizeof(counter)) { + printf("FAIL: cannot read semaphore\n"); + ksft_exit_fail(); + } + + /* only one process will do this step */ + if (cgroupns_tests[step].actor_id == id) { + switch (cgroupns_tests[step].action) { + case UNSHARE_CGROUPNS: + printf("child process #%lu: unshare cgroupns\n", id); + ret = unshare(CLONE_NEWCGROUP); + if (ret != 0) { + printf("FAIL: cannot unshare cgroupns\n"); + ksft_exit_fail(); + } + break; + + case JOIN_CGROUPNS: + printf("child process #%lu: join parent cgroupns\n", id); + + sprintf(proc_path, "/proc/%d/ns/cgroup", getppid()); + nsfd = open(proc_path, 0); + ret = setns(nsfd, CLONE_NEWCGROUP); + if (ret != 0) { + printf("FAIL: cannot join cgroupns\n"); + ksft_exit_fail(); + } + close(nsfd); + break; + + case CHECK_CGROUP: + case CHECK_CGROUP_WITH_ROOT_PREFIX: + if (cgroupns_tests[step].action == CHECK_CGROUP || strcmp(root_cgroup, "/") == 0) + sprintf(expected_cgroup, "%s", 
cgroupns_tests[step].path); + else if (strcmp(cgroupns_tests[step].path, "/") == 0) + sprintf(expected_cgroup, "%s", root_cgroup); + else + sprintf(expected_cgroup, "%s%s", root_cgroup, cgroupns_tests[step].path); + + if (cgroupns_tests[step].target_id >= 0) { + target_pid = children[cgroupns_tests[step].target_id].pid; + sprintf(process_name, "#%d (pid=%d)", + cgroupns_tests[step].target_id, target_pid); + } else { + target_pid = 0; + sprintf(process_name, "#self (pid=%d)", getpid()); + } + + printf("child process #%lu: check that process %s has cgroup %s\n", + id, process_name, expected_cgroup); + + get_cgroup(target_pid, child_cgroup, sizeof(child_cgroup)); + + if (strcmp(child_cgroup, expected_cgroup) != 0) { + printf("FAIL: child has cgroup %s\n", child_cgroup); + ksft_exit_fail(); + } + + break; + + case MOVE_CGROUP: + case MOVE_CGROUP_WITH_ROOT_PREFIX: + if (cgroupns_tests[step].target_id >= 0) { + target_pid = children[cgroupns_tests[step].target_id].pid; + sprintf(process_name, "#%d (pid=%d)", + cgroupns_tests[step].target_id, target_pid); + } else { + target_pid = children[id].pid; + sprintf(process_name, "#self (pid=%d)", target_pid); + } + + printf("child process #%lu: move process %s to cgroup %s\n", + id, process_name, cgroupns_tests[step].path); + + move_cgroup(target_pid, + cgroupns_tests[step].action == MOVE_CGROUP_WITH_ROOT_PREFIX, + cgroupns_tests[step].path); + break; + + default: + printf("FAIL: invalid action\n"); + ksft_exit_fail(); + } + } + + + /* signal the parent process we've finished this step */ + counter = 1; + ret = write(children[id].end_semfd, &counter, sizeof(counter)); + if (ret != sizeof(counter)) { + printf("FAIL: cannot write semaphore\n"); + ksft_exit_fail(); + } + } + + return 0; +} + +int +main(int argc, char **argv) +{ + struct statfs fs; + char child_cgroup[4096]; + int ret; + int status; + uintptr_t i; + int step; + + get_cgroup_mountpoint(cgroup_mountpoint, sizeof(cgroup_mountpoint)); + printf("cgroup2 mounted on: 
%s\n", cgroup_mountpoint); + + if (statfs(cgroup_mountpoint, &fs) < 0) { + printf("FAIL: statfs\n"); + ksft_exit_fail(); + } + + if (fs.f_type != (typeof(fs.f_type)) CGROUP2_SUPER_MAGIC) { + printf("FAIL: this test is for Linux >= 4.5 with cgroup2 mounted\n"); + ksft_exit_fail(); + } + + get_cgroup(0, root_cgroup, sizeof(root_cgroup)); + printf("current cgroup: %s\n", root_cgroup); + + for (i = 0; i < CHILDREN_COUNT; i++) { + children[i].start_semfd = eventfd(0, EFD_SEMAPHORE); + if (children[i].start_semfd == -1) { + printf("FAIL: cannot create eventfd\n"); + ksft_exit_fail(); + } + + children[i].end_semfd = eventfd(0, EFD_SEMAPHORE); + if (children[i].end_semfd == -1) { + printf("FAIL: cannot create eventfd\n"); + ksft_exit_fail(); + } + + children[i].stack = malloc(STACK_SIZE); + if (!children[i].stack) { + printf("FAIL: cannot allocate stack\n"); + ksft_exit_fail(); + } + } + + for (i = 0; i < CHILDREN_COUNT; i++) { + children[i].pid = clone(child_func, children[i].stack + STACK_SIZE, + SIGCHLD|CLONE_VM|CLONE_FILES, (void *)i); + if (children[i].pid == -1) { + printf("FAIL: cannot clone\n"); + ksft_exit_fail(); + } + } + + for (step = 0; step < cgroupns_tests_len; step++) { + uint64_t counter = 1; + + /* signal the child processes they can start the current step */ + for (i = 0; i < CHILDREN_COUNT; i++) { + ret = write(children[i].start_semfd, &counter, sizeof(counter)); + if (ret != sizeof(counter)) { + printf("FAIL: cannot write semaphore\n"); + ksft_exit_fail(); + } + } + + /* wait until all child processes finished the current step */ + for (i = 0; i < CHILDREN_COUNT; i++) { + ret = read(children[i].end_semfd, &counter, sizeof(counter)); + if (ret != sizeof(counter)) { + printf("FAIL: cannot read semaphore\n"); + ksft_exit_fail(); + } + } + } + + for (i = 0; i < CHILDREN_COUNT; i++) { + ret = waitpid(-1, &status, 0); + if (ret == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) { + printf("FAIL: cannot wait child\n"); + ksft_exit_fail(); + } + } + + 
printf("SUCCESS\n"); + return ksft_exit_pass(); +} -- 2.5.0 ^ permalink raw reply related [flat|nested] 108+ messages in thread
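The CHECK_CGROUP and CHECK_CGROUP_WITH_ROOT_PREFIX branch above decides which path a step expects to read back from /proc/<pid>/cgroup. A minimal sketch of that decision as a standalone function follows; the name expected_cgroup_path() and the -1 return on truncation are assumptions for illustration and are not part of the posted patch, which uses sprintf() into fixed buffers.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): compose the path a CHECK_* step
 * expects to see. "root" is the cgroup the test started in, "path" is the
 * entry from the test table.
 * Returns 0 on success, -1 if the result does not fit in "out".
 */
static int expected_cgroup_path(const char *root, const char *path,
				int with_root_prefix, char *out, size_t len)
{
	int n;

	if (!with_root_prefix || strcmp(root, "/") == 0)
		/* Inside a cgroupns, or started in "/": path is used as is. */
		n = snprintf(out, len, "%s", path);
	else if (strcmp(path, "/") == 0)
		/* "/" relative to the starting cgroup is that cgroup itself. */
		n = snprintf(out, len, "%s", root);
	else
		n = snprintf(out, len, "%s%s", root, path);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With root "/user.slice" and table entry "/cgroup-a", a prefixed check expects "/user.slice/cgroup-a", while an unprefixed check (after unshare) expects "/cgroup-a" unchanged.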
* [PATCH] selftests/cgroupns: new test for cgroup namespaces
@ 2016-01-31 17:48 ` Alban Crequy
  0 siblings, 0 replies; 108+ messages in thread
From: Alban Crequy @ 2016-01-31 17:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: serge.hallyn, adityakali, linux-api, containers, hannes,
	lxc-devel, gregkh, tj, cgroups, akpm, iago, Alban Crequy

From: Alban Crequy <alban@kinvolk.io>

This adds the selftest "cgroupns_test" in order to test the CGroup
Namespace patchset.

cgroupns_test creates two child processes. They perform a list of
actions defined by the array cgroupns_tests. This array can easily be
extended to more scenarios without adding much code. They are
synchronized with eventfds to ensure only one action is performed at a
time.

The memory is shared between the processes (CLONE_VM) so each child
process can know the pid of its siblings without extra IPC.

The output explains the scenario being played. Short extract:

> current cgroup: /user.slice/user-0.slice/session-1.scope
> child process #0: check that process #self (pid=482) has cgroup /user.slice/user-0.slice/session-1.scope
> child process #0: unshare cgroupns
> child process #0: move process #self (pid=482) to cgroup cgroup-a/subcgroup-a
> child process #0: join parent cgroupns

The test does not change the mount namespace and does not mount any
new cgroup2 filesystem. Therefore this does not test that the cgroup2
mount is correctly rooted to the cgroupns root at mount time.

Signed-off-by: Alban Crequy <alban@kinvolk.io>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
---

Changelog:
20160131 - rebase on sergeh/cgroupns.v10 and fix conflicts

20160115 - Detect where cgroup2 is mounted, don't assume
           /sys/fs/cgroup (suggested by sergeh)
         - Check more error conditions (from krnowak's review)
         - Coding style (from krnowak's review)
         - Update error message for Linux >= 4.5 (from krnowak's
           review)

20160104 - Fix coding style (from sergeh's review)
         - Fix printf formatting (from sergeh's review)
         - Fix parsing of /proc/pid/cgroup (from sergeh's review)
         - Fix concatenation of cgroup paths

20151219 - First version

This patch is available in the cgroupns.v10-tests branch of
https://github.com/kinvolk/linux.git
It is rebased on top of Serge Hallyn's cgroupns.v10 branch of
https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/

Test results:

- SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=1
- SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=0
---
 tools/testing/selftests/Makefile                 |   1 +
 tools/testing/selftests/cgroupns/Makefile        |  11 +
 tools/testing/selftests/cgroupns/cgroupns_test.c | 445 +++++++++++++++++++++++
 3 files changed, 457 insertions(+)
 create mode 100644 tools/testing/selftests/cgroupns/Makefile
 create mode 100644 tools/testing/selftests/cgroupns/cgroupns_test.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index b04afc3..b373135 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -1,5 +1,6 @@
 TARGETS = breakpoints
 TARGETS += capabilities
+TARGETS += cgroupns
 TARGETS += cpu-hotplug
 TARGETS += efivarfs
 TARGETS += exec
diff --git a/tools/testing/selftests/cgroupns/Makefile b/tools/testing/selftests/cgroupns/Makefile
new file mode 100644
index 0000000..0fdbe0a
--- /dev/null
+++ b/tools/testing/selftests/cgroupns/Makefile
@@ -0,0 +1,11 @@
+CFLAGS += -I../../../../usr/include/
+CFLAGS += -I../../../../include/uapi/
+
+all: cgroupns_test
+
+TEST_PROGS := cgroupns_test
+
+include ../lib.mk
+
+clean:
+	$(RM) cgroupns_test
diff --git a/tools/testing/selftests/cgroupns/cgroupns_test.c b/tools/testing/selftests/cgroupns/cgroupns_test.c
new file mode 100644
index 0000000..71e2336
--- /dev/null
+++ b/tools/testing/selftests/cgroupns/cgroupns_test.c
@@ -0,0 +1,445 @@
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/statfs.h>
+#include <inttypes.h>
+#include <sched.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/socket.h>
+#include <sys/wait.h>
+#include <sys/eventfd.h>
+#include <signal.h>
+#include <fcntl.h>
+
+#include <linux/magic.h>
+#include <linux/sched.h>
+
+#include "../kselftest.h"
+
+#define STACK_SIZE 65536
+
+static char cgroup_mountpoint[4096];
+static char root_cgroup[4096];
+
+#define CHILDREN_COUNT 2
+typedef struct {
+	pid_t pid;
+	uint8_t *stack;
+	int start_semfd;
+	int end_semfd;
+} cgroupns_child_t;
+cgroupns_child_t children[CHILDREN_COUNT];
+
+typedef enum {
+	UNSHARE_CGROUPNS,
+	JOIN_CGROUPNS,
+	CHECK_CGROUP,
+	CHECK_CGROUP_WITH_ROOT_PREFIX,
+	MOVE_CGROUP,
+	MOVE_CGROUP_WITH_ROOT_PREFIX,
+} cgroupns_action_t;
+
+static const struct {
+	int actor_id;
+	cgroupns_action_t action;
+	int target_id;
+	char *path;
+} cgroupns_tests[] = {
+	{ 0, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"},
+	{ 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"},
+	{ 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"},
+
+	{ 0, UNSHARE_CGROUPNS, -1, NULL},
+
+	{ 0, CHECK_CGROUP, -1, "/"},
+	{ 0, CHECK_CGROUP, 0, "/"},
+	{ 0, CHECK_CGROUP, 1, "/"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"},
+
+	{ 1, UNSHARE_CGROUPNS, -1, NULL},
+
+	{ 0, CHECK_CGROUP, -1, "/"},
+	{ 0, CHECK_CGROUP, 0, "/"},
+	{ 0, CHECK_CGROUP, 1, "/"},
+	{ 1, CHECK_CGROUP, -1, "/"},
+	{ 1, CHECK_CGROUP, 0, "/"},
+	{ 1, CHECK_CGROUP, 1, "/"},
+
+	{ 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a"},
+	{ 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b"},
+
+	{ 0, CHECK_CGROUP, -1, "/cgroup-a"},
+	{ 0, CHECK_CGROUP, 0, "/cgroup-a"},
+	{ 0, CHECK_CGROUP, 1, "/cgroup-b"},
+	{ 1, CHECK_CGROUP, -1, "/cgroup-b"},
+	{ 1, CHECK_CGROUP, 0, "/cgroup-a"},
+	{ 1, CHECK_CGROUP, 1, "/cgroup-b"},
+
+	{ 0, UNSHARE_CGROUPNS, -1, NULL},
+	{ 1, UNSHARE_CGROUPNS, -1, NULL},
+
+	{ 0, CHECK_CGROUP, -1, "/"},
+	{ 0, CHECK_CGROUP, 0, "/"},
+	{ 0, CHECK_CGROUP, 1, "/../cgroup-b"},
+	{ 1, CHECK_CGROUP, -1, "/"},
+	{ 1, CHECK_CGROUP, 0, "/../cgroup-a"},
+	{ 1, CHECK_CGROUP, 1, "/"},
+
+	{ 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a"},
+	{ 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b/sub1-b"},
+
+	{ 0, CHECK_CGROUP, 0, "/sub1-a"},
+	{ 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"},
+	{ 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a"},
+	{ 1, CHECK_CGROUP, 1, "/sub1-b"},
+
+	{ 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a"},
+	{ 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a"},
+	{ 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"},
+	{ 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a"},
+	{ 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a"},
+	{ 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"},
+	{ 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"},
+	{ 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"},
+	{ 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"},
+
+	{ 1, UNSHARE_CGROUPNS, -1, NULL},
+	{ 1, CHECK_CGROUP, 0, "/../../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"},
+	{ 0, UNSHARE_CGROUPNS, -1, NULL},
+	{ 0, CHECK_CGROUP, 1, "/../../../../../cgroup-b/sub1-b"},
+
+	{ 0, JOIN_CGROUPNS, -1, NULL},
+	{ 1, JOIN_CGROUPNS, -1, NULL},
+
+	{ 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"},
+	{ 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"},
+	{ 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"},
+};
+#define cgroupns_tests_len (sizeof(cgroupns_tests) / sizeof(cgroupns_tests[0]))
+
+static void
+get_cgroup_mountpoint(char *path, size_t len)
+{
+	char line[4096];
+	char dummy[4096];
+	char mountpoint[4096];
+	FILE *f;
+
+	f = fopen("/proc/self/mountinfo", "r");
+	if (!f) {
+		printf("FAIL: cannot open mountinfo\n");
+		ksft_exit_fail();
+	}
+
+	for (;;) {
+		if (!fgets(line, sizeof(line), f)) {
+			if (ferror(f)) {
+				printf("FAIL: cannot read mountinfo\n");
+				ksft_exit_fail();
+			}
+			printf("FAIL: cannot find cgroup2 mount in mountinfo\n");
+			ksft_exit_fail();
+		}
+
+		line[strcspn(line, "\n")] = 0;
+		/* 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
+		 * (1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)
+		 */
+		if (strstr(line, " - cgroup2 ") == NULL) /* (9)=cgroup2 */
+			continue;
+
+		if (sscanf(line, "%4095s %4095s %4095s %4095s %4095s", dummy, dummy, dummy, dummy, mountpoint) != 5)
+			continue;
+
+		strncpy(path, mountpoint, len);
+		path[len-1] = '\0';
+		break;
+	}
+
+	fclose(f);
+}
+
+static void
+get_cgroup(pid_t pid, char *path, size_t len)
+{
+	char proc_path[4096];
+	char line[4096];
+	FILE *f;
+
+	if (pid > 0) {
+		sprintf(proc_path, "/proc/%d/cgroup", pid);
+	} else {
+		sprintf(proc_path, "/proc/self/cgroup");
+	}
+
+	f = fopen(proc_path, "r");
+	if (!f) {
+		printf("FAIL: cannot open %s\n", proc_path);
+		ksft_exit_fail();
+	}
+
+	for (;;) {
+		if (!fgets(line, sizeof(line), f)) {
+			if (ferror(f)) {
+				printf("FAIL: cannot read %s\n", proc_path);
+				ksft_exit_fail();
+			}
+			printf("FAIL: could not parse %s\n", proc_path);
+			ksft_exit_fail();
+		}
+
+		line[strcspn(line, "\n")] = 0;
+		if (strncmp(line, "0::", 3) == 0) {
+			strncpy(path, line+3, len);
+			path[len-1] = '\0';
+			break;
+		}
+	}
+
+	fclose(f);
+}
+
+static void
+move_cgroup(pid_t target_pid, int prefix, char *cgroup)
+{
+	char knob_dir[4096];
+	char knob_path[4096];
+	char buf[128];
+	FILE *f;
+	int ret;
+
+	if (prefix) {
+		sprintf(knob_dir, "%s/%s/%s", cgroup_mountpoint, root_cgroup, cgroup);
+		sprintf(knob_path, "%s/cgroup.procs", knob_dir);
+	} else {
+		sprintf(knob_dir, "%s/%s", cgroup_mountpoint, cgroup);
+		sprintf(knob_path, "%s/cgroup.procs", knob_dir);
+	}
+
+	mkdir(knob_dir, 0755);
+
+	sprintf(buf, "%d\n", target_pid);
+
+	f = fopen(knob_path, "w");
+	ret = fwrite(buf, strlen(buf), 1, f);
+	if (ret != 1) {
+		printf("FAIL: cannot write to %s (ret=%d)\n", knob_path, ret);
+		ksft_exit_fail();
+	}
+	fclose(f);
+}
+
+static int
+child_func(void *arg)
+{
+	uintptr_t id = (uintptr_t) arg;
+	char child_cgroup[4096];
+	char expected_cgroup[4096];
+	char process_name[128];
+	char proc_path[128];
+	int step;
+	int ret;
+	int nsfd;
+
+	for (step = 0; step < cgroupns_tests_len; step++) {
+		uint64_t counter = 0;
+		pid_t target_pid;
+
+		/* wait a signal from the parent process before starting this step */
+		ret = read(children[id].start_semfd, &counter, sizeof(counter));
+		if (ret != sizeof(counter)) {
+			printf("FAIL: cannot read semaphore\n");
+			ksft_exit_fail();
+		}
+
+		/* only one process will do this step */
+		if (cgroupns_tests[step].actor_id == id) {
+			switch (cgroupns_tests[step].action) {
+			case UNSHARE_CGROUPNS:
+				printf("child process #%lu: unshare cgroupns\n", id);
+				ret = unshare(CLONE_NEWCGROUP);
+				if (ret != 0) {
+					printf("FAIL: cannot unshare cgroupns\n");
+					ksft_exit_fail();
+				}
+				break;
+
+			case JOIN_CGROUPNS:
+				printf("child process #%lu: join parent cgroupns\n", id);
+
+				sprintf(proc_path, "/proc/%d/ns/cgroup", getppid());
+				nsfd = open(proc_path, 0);
+				ret = setns(nsfd, CLONE_NEWCGROUP);
+				if (ret != 0) {
+					printf("FAIL: cannot join cgroupns\n");
+					ksft_exit_fail();
+				}
+				close(nsfd);
+				break;
+
+			case CHECK_CGROUP:
+			case CHECK_CGROUP_WITH_ROOT_PREFIX:
+				if (cgroupns_tests[step].action == CHECK_CGROUP || strcmp(root_cgroup, "/") == 0)
+					sprintf(expected_cgroup, "%s", cgroupns_tests[step].path);
+				else if (strcmp(cgroupns_tests[step].path, "/") == 0)
+					sprintf(expected_cgroup, "%s", root_cgroup);
+				else
+					sprintf(expected_cgroup, "%s%s", root_cgroup, cgroupns_tests[step].path);
+
+				if (cgroupns_tests[step].target_id >= 0) {
+					target_pid = children[cgroupns_tests[step].target_id].pid;
+					sprintf(process_name, "#%d (pid=%d)",
+						cgroupns_tests[step].target_id, target_pid);
+				} else {
+					target_pid = 0;
+					sprintf(process_name, "#self (pid=%d)", getpid());
+				}
+
+				printf("child process #%lu: check that process %s has cgroup %s\n",
+				       id, process_name, expected_cgroup);
+
+				get_cgroup(target_pid, child_cgroup, sizeof(child_cgroup));
+
+				if (strcmp(child_cgroup, expected_cgroup) != 0) {
+					printf("FAIL: child has cgroup %s\n", child_cgroup);
+					ksft_exit_fail();
+				}
+
+				break;
+
+			case MOVE_CGROUP:
+			case MOVE_CGROUP_WITH_ROOT_PREFIX:
+				if (cgroupns_tests[step].target_id >= 0) {
+					target_pid = children[cgroupns_tests[step].target_id].pid;
+					sprintf(process_name, "#%d (pid=%d)",
+						cgroupns_tests[step].target_id, target_pid);
+				} else {
+					target_pid = children[id].pid;
+					sprintf(process_name, "#self (pid=%d)", target_pid);
+				}
+
+				printf("child process #%lu: move process %s to cgroup %s\n",
+				       id, process_name, cgroupns_tests[step].path);
+
+				move_cgroup(target_pid,
+					    cgroupns_tests[step].action == MOVE_CGROUP_WITH_ROOT_PREFIX,
+					    cgroupns_tests[step].path);
+				break;
+
+			default:
+				printf("FAIL: invalid action\n");
+				ksft_exit_fail();
+			}
+		}
+
+
+		/* signal the parent process we've finished this step */
+		counter = 1;
+		ret = write(children[id].end_semfd, &counter, sizeof(counter));
+		if (ret != sizeof(counter)) {
+			printf("FAIL: cannot write semaphore\n");
+			ksft_exit_fail();
+		}
+	}
+
+	return 0;
+}
+
+int
+main(int argc, char **argv)
+{
+	struct statfs fs;
+	char child_cgroup[4096];
+	int ret;
+	int status;
+	uintptr_t i;
+	int step;
+
+	get_cgroup_mountpoint(cgroup_mountpoint, sizeof(cgroup_mountpoint));
+	printf("cgroup2 mounted on: %s\n", cgroup_mountpoint);
+
+	if (statfs(cgroup_mountpoint, &fs) < 0) {
+		printf("FAIL: statfs\n");
+		ksft_exit_fail();
+	}
+
+	if (fs.f_type != (typeof(fs.f_type)) CGROUP2_SUPER_MAGIC) {
+		printf("FAIL: this test is for Linux >= 4.5 with cgroup2 mounted\n");
+		ksft_exit_fail();
+	}
+
+	get_cgroup(0, root_cgroup, sizeof(root_cgroup));
+	printf("current cgroup: %s\n", root_cgroup);
+
+	for (i = 0; i < CHILDREN_COUNT; i++) {
+		children[i].start_semfd = eventfd(0, EFD_SEMAPHORE);
+		if (children[i].start_semfd == -1) {
+			printf("FAIL: cannot create eventfd\n");
+			ksft_exit_fail();
+		}
+
+		children[i].end_semfd = eventfd(0, EFD_SEMAPHORE);
+		if (children[i].end_semfd == -1) {
+			printf("FAIL: cannot create eventfd\n");
+			ksft_exit_fail();
+		}
+
+		children[i].stack = malloc(STACK_SIZE);
+		if (!children[i].stack) {
+			printf("FAIL: cannot allocate stack\n");
+			ksft_exit_fail();
+		}
+	}
+
+	for (i = 0; i < CHILDREN_COUNT; i++) {
+		children[i].pid = clone(child_func, children[i].stack + STACK_SIZE,
+					SIGCHLD|CLONE_VM|CLONE_FILES, (void *)i);
+		if (children[i].pid == -1) {
+			printf("FAIL: cannot clone\n");
+			ksft_exit_fail();
+		}
+	}
+
+	for (step = 0; step < cgroupns_tests_len; step++) {
+		uint64_t counter = 1;
+
+		/* signal the child processes they can start the current step */
+		for (i = 0; i < CHILDREN_COUNT; i++) {
+			ret = write(children[i].start_semfd, &counter, sizeof(counter));
+			if (ret != sizeof(counter)) {
+				printf("FAIL: cannot write semaphore\n");
+				ksft_exit_fail();
+			}
+		}
+
+		/* wait until all child processes finished the current step */
+		for (i = 0; i < CHILDREN_COUNT; i++) {
+			ret = read(children[i].end_semfd, &counter, sizeof(counter));
+			if (ret != sizeof(counter)) {
+				printf("FAIL: cannot read semaphore\n");
+				ksft_exit_fail();
+			}
+		}
+	}
+
+	for (i = 0; i < CHILDREN_COUNT; i++) {
+		ret = waitpid(-1, &status, 0);
+		if (ret == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
+			printf("FAIL: cannot wait child\n");
+			ksft_exit_fail();
+		}
+	}
+
+	printf("SUCCESS\n");
+	return ksft_exit_pass();
+}
-- 
2.5.0

^ permalink raw reply related [flat|nested] 108+ messages in thread
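get_cgroup() in the patch above finds the cgroup2 entry of /proc/<pid>/cgroup by looking for the line that starts with "0::". The same lookup can be sketched on an in-memory copy of the file, which makes it easy to try without a cgroup2 mount. The function name unified_cgroup_from_text() is hypothetical, and reporting truncation with -1 (rather than silently cutting the path as strncpy() does in the patch) is an assumption of this sketch.

```c
#include <string.h>

/*
 * Hypothetical helper (not in the patch): scan the text of a
 * /proc/<pid>/cgroup file for the cgroup2 entry, i.e. the line starting
 * with "0::", and copy the path that follows into "out".
 * Returns 0 on success, -1 if there is no such line or "out" is too small.
 */
static int unified_cgroup_from_text(const char *text, char *out, size_t len)
{
	const char *line = text;

	while (*line) {
		/* Length of the current line, excluding the newline. */
		size_t n = strcspn(line, "\n");

		if (n >= 3 && strncmp(line, "0::", 3) == 0) {
			size_t plen = n - 3;

			if (plen + 1 > len)
				return -1;
			memcpy(out, line + 3, plen);
			out[plen] = '\0';
			return 0;
		}

		/* Advance to the start of the next line, if any. */
		line += n;
		if (*line == '\n')
			line++;
	}

	return -1;
}
```

Fed the two lines "8:memory:/foo" and "0::/user.slice", it yields "/user.slice"; a v1-only listing with no "0::" line returns -1, which corresponds to the "could not parse" failure in the test.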
* Re: [PATCH] selftests/cgroupns: new test for cgroup namespaces
       [not found] ` <1454262492-6480-1-git-send-email-alban-lYLaGTFnO9sWenYVfaLwtA@public.gmane.org>
@ 2016-02-10 17:48   ` Serge E. Hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: Serge E. Hallyn @ 2016-02-10 17:48 UTC (permalink / raw)
  To: Alban Crequy
  Cc: linux-kernel, gregkh, linux-api, containers, iago, Alban Crequy,
	lxc-devel, hannes, tj, cgroups, akpm

On Sun, Jan 31, 2016 at 06:48:12PM +0100, Alban Crequy wrote:
> From: Alban Crequy <alban@kinvolk.io>
>
> This adds the selftest "cgroupns_test" in order to test the CGroup
> Namespace patchset.
>
> cgroupns_test creates two child processes. They perform a list of
> actions defined by the array cgroupns_tests. This array can easily be
> extended to more scenarios without adding much code. They are
> synchronized with eventfds to ensure only one action is performed at a
> time.
>
> The memory is shared between the processes (CLONE_VM) so each child
> process can know the pid of its siblings without extra IPC.
>
> The output explains the scenario being played. Short extract:
>
> > current cgroup: /user.slice/user-0.slice/session-1.scope
> > child process #0: check that process #self (pid=482) has cgroup /user.slice/user-0.slice/session-1.scope
> > child process #0: unshare cgroupns
> > child process #0: move process #self (pid=482) to cgroup cgroup-a/subcgroup-a
> > child process #0: join parent cgroupns
>
> The test does not change the mount namespace and does not mount any
> new cgroup2 filesystem. Therefore this does not test that the cgroup2
> mount is correctly rooted to the cgroupns root at mount time.
>
> Signed-off-by: Alban Crequy <alban@kinvolk.io>

Thanks, Alban!

> Acked-by: Serge E.
Hallyn <serge.hallyn@ubuntu.com> > > --- > > Changelog: > 20160131 - rebase on sergeh/cgroupns.v10 and fix conflicts > > 20160115 - Detect where cgroup2 is mounted, don't assume > /sys/fs/cgroup (suggested by sergeh) > - Check more error conditions (from krnowak's review) > - Coding style (from krnowak's review) > - Update error message for Linux >= 4.5 (from krnowak's > review) > > 20160104 - Fix coding style (from sergeh's review) > - Fix printf formatting (from sergeh's review) > - Fix parsing of /proc/pid/cgroup (from sergeh's review) > - Fix concatenation of cgroup paths > > 20151219 - First version > > This patch is available in the cgroupns.v10-tests branch of > https://github.com/kinvolk/linux.git > It is rebased on top of Serge Hallyn's cgroupns.v10 branch of > https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/ > > Test results: > > - SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=1 > - SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=0 > --- > tools/testing/selftests/Makefile | 1 + > tools/testing/selftests/cgroupns/Makefile | 11 + > tools/testing/selftests/cgroupns/cgroupns_test.c | 445 +++++++++++++++++++++++ > 3 files changed, 457 insertions(+) > create mode 100644 tools/testing/selftests/cgroupns/Makefile > create mode 100644 tools/testing/selftests/cgroupns/cgroupns_test.c > > diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile > index b04afc3..b373135 100644 > --- a/tools/testing/selftests/Makefile > +++ b/tools/testing/selftests/Makefile > @@ -1,5 +1,6 @@ > TARGETS = breakpoints > TARGETS += capabilities > +TARGETS += cgroupns > TARGETS += cpu-hotplug > TARGETS += efivarfs > TARGETS += exec > diff --git a/tools/testing/selftests/cgroupns/Makefile b/tools/testing/selftests/cgroupns/Makefile > new file mode 100644 > index 0000000..0fdbe0a > --- /dev/null > +++ b/tools/testing/selftests/cgroupns/Makefile > @@ -0,0 +1,11 @@ > +CFLAGS += 
-I../../../../usr/include/ > +CFLAGS += -I../../../../include/uapi/ > + > +all: cgroupns_test > + > +TEST_PROGS := cgroupns_test > + > +include ../lib.mk > + > +clean: > + $(RM) cgroupns_test > diff --git a/tools/testing/selftests/cgroupns/cgroupns_test.c b/tools/testing/selftests/cgroupns/cgroupns_test.c > new file mode 100644 > index 0000000..71e2336 > --- /dev/null > +++ b/tools/testing/selftests/cgroupns/cgroupns_test.c > @@ -0,0 +1,445 @@ > +#define _GNU_SOURCE > + > +#include <stdio.h> > +#include <stdlib.h> > +#include <unistd.h> > +#include <string.h> > +#include <sys/statfs.h> > +#include <inttypes.h> > +#include <sched.h> > +#include <sys/types.h> > +#include <sys/stat.h> > +#include <sys/socket.h> > +#include <sys/wait.h> > +#include <sys/eventfd.h> > +#include <signal.h> > +#include <fcntl.h> > + > +#include <linux/magic.h> > +#include <linux/sched.h> > + > +#include "../kselftest.h" > + > +#define STACK_SIZE 65536 > + > +static char cgroup_mountpoint[4096]; > +static char root_cgroup[4096]; > + > +#define CHILDREN_COUNT 2 > +typedef struct { > + pid_t pid; > + uint8_t *stack; > + int start_semfd; > + int end_semfd; > +} cgroupns_child_t; > +cgroupns_child_t children[CHILDREN_COUNT]; > + > +typedef enum { > + UNSHARE_CGROUPNS, > + JOIN_CGROUPNS, > + CHECK_CGROUP, > + CHECK_CGROUP_WITH_ROOT_PREFIX, > + MOVE_CGROUP, > + MOVE_CGROUP_WITH_ROOT_PREFIX, > +} cgroupns_action_t; > + > +static const struct { > + int actor_id; > + cgroupns_action_t action; > + int target_id; > + char *path; > +} cgroupns_tests[] = { > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, > + > + { 0, UNSHARE_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP, -1, "/"}, > + { 0, CHECK_CGROUP, 0, "/"}, > + { 0, CHECK_CGROUP, 1, "/"}, > + { 1, 
CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, > + > + { 1, UNSHARE_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP, -1, "/"}, > + { 0, CHECK_CGROUP, 0, "/"}, > + { 0, CHECK_CGROUP, 1, "/"}, > + { 1, CHECK_CGROUP, -1, "/"}, > + { 1, CHECK_CGROUP, 0, "/"}, > + { 1, CHECK_CGROUP, 1, "/"}, > + > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a"}, > + { 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b"}, > + > + { 0, CHECK_CGROUP, -1, "/cgroup-a"}, > + { 0, CHECK_CGROUP, 0, "/cgroup-a"}, > + { 0, CHECK_CGROUP, 1, "/cgroup-b"}, > + { 1, CHECK_CGROUP, -1, "/cgroup-b"}, > + { 1, CHECK_CGROUP, 0, "/cgroup-a"}, > + { 1, CHECK_CGROUP, 1, "/cgroup-b"}, > + > + { 0, UNSHARE_CGROUPNS, -1, NULL}, > + { 1, UNSHARE_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP, -1, "/"}, > + { 0, CHECK_CGROUP, 0, "/"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b"}, > + { 1, CHECK_CGROUP, -1, "/"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a"}, > + { 1, CHECK_CGROUP, 1, "/"}, > + > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a"}, > + { 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b/sub1-b"}, > + > + { 0, CHECK_CGROUP, 0, "/sub1-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a"}, > + { 1, CHECK_CGROUP, 1, "/sub1-b"}, > + > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + > + { 1, UNSHARE_CGROUPNS, -1, NULL}, > + { 1, CHECK_CGROUP, 0, 
"/../../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 0, UNSHARE_CGROUPNS, -1, NULL}, > + { 0, CHECK_CGROUP, 1, "/../../../../../cgroup-b/sub1-b"}, > + > + { 0, JOIN_CGROUPNS, -1, NULL}, > + { 1, JOIN_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"}, > +}; > +#define cgroupns_tests_len (sizeof(cgroupns_tests) / sizeof(cgroupns_tests[0])) > + > +static void > +get_cgroup_mountpoint(char *path, size_t len) > +{ > + char line[4096]; > + char dummy[4096]; > + char mountpoint[4096]; > + FILE *f; > + > + f = fopen("/proc/self/mountinfo", "r"); > + if (!f) { > + printf("FAIL: cannot open mountinfo\n"); > + ksft_exit_fail(); > + } > + > + for (;;) { > + if (!fgets(line, sizeof(line), f)) { > + if (ferror(f)) { > + printf("FAIL: cannot read mountinfo\n"); > + ksft_exit_fail(); > + } > + printf("FAIL: cannot find cgroup2 mount in mountinfo\n"); > + ksft_exit_fail(); > + } > + > + line[strcspn(line, "\n")] = 0; > + /* 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue > + * (1)(2)(3) (4) (5) (6) (7) (8) (9) (10) (11) > + */ > + if (strstr(line, " - cgroup2 ") == NULL) /* (9)=cgroup2 */ > + continue; > + > + if (sscanf(line, "%4095s %4095s %4095s %4095s %4095s", dummy, dummy, dummy, dummy, mountpoint) != 5) > + continue; > + > + strncpy(path, mountpoint, len); > + path[len-1] = '\0'; > + break; > + } > + > + fclose(f); > +} > + > +static void > +get_cgroup(pid_t pid, char *path, size_t len) > +{ > + char proc_path[4096]; > + char line[4096]; > + FILE *f; > + > + if (pid > 0) { > + sprintf(proc_path, "/proc/%d/cgroup", pid); > + } else { > + sprintf(proc_path, "/proc/self/cgroup"); > + } > + > + f = fopen(proc_path, "r"); > + if (!f) { > + printf("FAIL: cannot open %s\n", 
proc_path); > + ksft_exit_fail(); > + } > + > + for (;;) { > + if (!fgets(line, sizeof(line), f)) { > + if (ferror(f)) { > + printf("FAIL: cannot read %s\n", proc_path); > + ksft_exit_fail(); > + } > + printf("FAIL: could not parse %s\n", proc_path); > + ksft_exit_fail(); > + } > + > + line[strcspn(line, "\n")] = 0; > + if (strncmp(line, "0::", 3) == 0) { > + strncpy(path, line+3, len); > + path[len-1] = '\0'; > + break; > + } > + } > + > + fclose(f); > +} > + > +static void > +move_cgroup(pid_t target_pid, int prefix, char *cgroup) > +{ > + char knob_dir[4096]; > + char knob_path[4096]; > + char buf[128]; > + FILE *f; > + int ret; > + > + if (prefix) { > + sprintf(knob_dir, "%s/%s/%s", cgroup_mountpoint, root_cgroup, cgroup); > + sprintf(knob_path, "%s/cgroup.procs", knob_dir, cgroup); > + } else { > + sprintf(knob_dir, "%s/%s", cgroup_mountpoint, cgroup); > + sprintf(knob_path, "%s/cgroup.procs", knob_dir); > + } > + > + mkdir(knob_dir, 0755); > + > + sprintf(buf, "%d\n", target_pid); > + > + f = fopen(knob_path, "w"); > + ret = fwrite(buf, strlen(buf), 1, f); > + if (ret != 1) { > + printf("FAIL: cannot write to %s (ret=%d)\n", knob_path, ret); > + ksft_exit_fail(); > + } > + fclose(f); > +} > + > +static int > +child_func(void *arg) > +{ > + uintptr_t id = (uintptr_t) arg; > + char child_cgroup[4096]; > + char expected_cgroup[4096]; > + char process_name[128]; > + char proc_path[128]; > + int step; > + int ret; > + int nsfd; > + > + for (step = 0; step < cgroupns_tests_len; step++) { > + uint64_t counter = 0; > + pid_t target_pid; > + > + /* wait a signal from the parent process before starting this step */ > + ret = read(children[id].start_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot read semaphore\n"); > + ksft_exit_fail(); > + } > + > + /* only one process will do this step */ > + if (cgroupns_tests[step].actor_id == id) { > + switch (cgroupns_tests[step].action) { > + case UNSHARE_CGROUPNS: > + printf("child 
process #%lu: unshare cgroupns\n", id); > + ret = unshare(CLONE_NEWCGROUP); > + if (ret != 0) { > + printf("FAIL: cannot unshare cgroupns\n"); > + ksft_exit_fail(); > + } > + break; > + > + case JOIN_CGROUPNS: > + printf("child process #%lu: join parent cgroupns\n", id); > + > + sprintf(proc_path, "/proc/%d/ns/cgroup", getppid()); > + nsfd = open(proc_path, 0); > + ret = setns(nsfd, CLONE_NEWCGROUP); > + if (ret != 0) { > + printf("FAIL: cannot join cgroupns\n"); > + ksft_exit_fail(); > + } > + close(nsfd); > + break; > + > + case CHECK_CGROUP: > + case CHECK_CGROUP_WITH_ROOT_PREFIX: > + if (cgroupns_tests[step].action == CHECK_CGROUP || strcmp(root_cgroup, "/") == 0) > + sprintf(expected_cgroup, "%s", cgroupns_tests[step].path); > + else if (strcmp(cgroupns_tests[step].path, "/") == 0) > + sprintf(expected_cgroup, "%s", root_cgroup); > + else > + sprintf(expected_cgroup, "%s%s", root_cgroup, cgroupns_tests[step].path); > + > + if (cgroupns_tests[step].target_id >= 0) { > + target_pid = children[cgroupns_tests[step].target_id].pid; > + sprintf(process_name, "#%d (pid=%d)", > + cgroupns_tests[step].target_id, target_pid); > + } else { > + target_pid = 0; > + sprintf(process_name, "#self (pid=%d)", getpid()); > + } > + > + printf("child process #%lu: check that process %s has cgroup %s\n", > + id, process_name, expected_cgroup); > + > + get_cgroup(target_pid, child_cgroup, sizeof(child_cgroup)); > + > + if (strcmp(child_cgroup, expected_cgroup) != 0) { > + printf("FAIL: child has cgroup %s\n", child_cgroup); > + ksft_exit_fail(); > + } > + > + break; > + > + case MOVE_CGROUP: > + case MOVE_CGROUP_WITH_ROOT_PREFIX: > + if (cgroupns_tests[step].target_id >= 0) { > + target_pid = children[cgroupns_tests[step].target_id].pid; > + sprintf(process_name, "#%d (pid=%d)", > + cgroupns_tests[step].target_id, target_pid); > + } else { > + target_pid = children[id].pid; > + sprintf(process_name, "#self (pid=%d)", target_pid); > + } > + > + printf("child process #%lu: move 
process %s to cgroup %s\n", > + id, process_name, cgroupns_tests[step].path); > + > + move_cgroup(target_pid, > + cgroupns_tests[step].action == MOVE_CGROUP_WITH_ROOT_PREFIX, > + cgroupns_tests[step].path); > + break; > + > + default: > + printf("FAIL: invalid action\n"); > + ksft_exit_fail(); > + } > + } > + > + > + /* signal the parent process we've finished this step */ > + counter = 1; > + ret = write(children[id].end_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot write semaphore\n"); > + ksft_exit_fail(); > + } > + } > + > + return 0; > +} > + > +int > +main(int argc, char **argv) > +{ > + struct statfs fs; > + char child_cgroup[4096]; > + int ret; > + int status; > + uintptr_t i; > + int step; > + > + get_cgroup_mountpoint(cgroup_mountpoint, sizeof(cgroup_mountpoint)); > + printf("cgroup2 mounted on: %s\n", cgroup_mountpoint); > + > + if (statfs(cgroup_mountpoint, &fs) < 0) { > + printf("FAIL: statfs\n"); > + ksft_exit_fail(); > + } > + > + if (fs.f_type != (typeof(fs.f_type)) CGROUP2_SUPER_MAGIC) { > + printf("FAIL: this test is for Linux >= 4.5 with cgroup2 mounted\n"); > + ksft_exit_fail(); > + } > + > + get_cgroup(0, root_cgroup, sizeof(root_cgroup)); > + printf("current cgroup: %s\n", root_cgroup); > + > + for (i = 0; i < CHILDREN_COUNT; i++) { > + children[i].start_semfd = eventfd(0, EFD_SEMAPHORE); > + if (children[i].start_semfd == -1) { > + printf("FAIL: cannot create eventfd\n"); > + ksft_exit_fail(); > + } > + > + children[i].end_semfd = eventfd(0, EFD_SEMAPHORE); > + if (children[i].end_semfd == -1) { > + printf("FAIL: cannot create eventfd\n"); > + ksft_exit_fail(); > + } > + > + children[i].stack = malloc(STACK_SIZE); > + if (!children[i].stack) { > + printf("FAIL: cannot allocate stack\n"); > + ksft_exit_fail(); > + } > + } > + > + for (i = 0; i < CHILDREN_COUNT; i++) { > + children[i].pid = clone(child_func, children[i].stack + STACK_SIZE, > + SIGCHLD|CLONE_VM|CLONE_FILES, (void *)i); > + if 
(children[i].pid == -1) { > + printf("FAIL: cannot clone\n"); > + ksft_exit_fail(); > + } > + } > + > + for (step = 0; step < cgroupns_tests_len; step++) { > + uint64_t counter = 1; > + > + /* signal the child processes they can start the current step */ > + for (i = 0; i < CHILDREN_COUNT; i++) { > + ret = write(children[i].start_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot write semaphore\n"); > + ksft_exit_fail(); > + } > + } > + > + /* wait until all child processes finished the current step */ > + for (i = 0; i < CHILDREN_COUNT; i++) { > + ret = read(children[i].end_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot read semaphore\n"); > + ksft_exit_fail(); > + } > + } > + } > + > + for (i = 0; i < CHILDREN_COUNT; i++) { > + ret = waitpid(-1, &status, 0); > + if (ret == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) { > + printf("FAIL: cannot wait child\n"); > + ksft_exit_fail(); > + } > + } > + > + printf("SUCCESS\n"); > + return ksft_exit_pass(); > +} > -- > 2.5.0 > > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 108+ messages in thread
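The quoted get_cgroup_mountpoint() picks the cgroup2 mount out of /proc/self/mountinfo by matching " - cgroup2 " and then reading the fifth field. A minimal sketch of that per-line step follows, assuming the hypothetical name cgroup2_mountpoint_from_line(); it uses the standard %*s conversion to skip the first four fields instead of the patch's dummy buffer, and returns -1 rather than truncating when the destination is too small.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): given one line of
 * /proc/self/mountinfo, copy the mount point (field 5) into "out" if the
 * filesystem type after the " - " separator is cgroup2.
 * Returns 0 on a match, -1 otherwise or if "out" is too small.
 */
static int cgroup2_mountpoint_from_line(const char *line, char *out, size_t len)
{
	char mountpoint[4096];

	/* The filesystem type is the first field after the " - " separator. */
	if (strstr(line, " - cgroup2 ") == NULL)
		return -1;

	/* %*s parses and discards fields 1 to 4; only field 5 is stored. */
	if (sscanf(line, "%*s %*s %*s %*s %4095s", mountpoint) != 1)
		return -1;

	if (strlen(mountpoint) + 1 > len)
		return -1;

	strcpy(out, mountpoint);
	return 0;
}
```

Splitting on whitespace should be adequate here because mountinfo escapes spaces inside paths (as \040), so a field never contains a literal space; the sketch does not unescape such sequences.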
* Re: [PATCH] selftests/cgroupns: new test for cgroup namespaces @ 2016-02-10 17:48 ` Serge E. Hallyn 0 siblings, 0 replies; 108+ messages in thread From: Serge E. Hallyn @ 2016-02-10 17:48 UTC (permalink / raw) To: Alban Crequy Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, iago-lYLaGTFnO9sWenYVfaLwtA, Alban Crequy, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, hannes-druUgvl0LCNAfugRpC6u6w, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b On Sun, Jan 31, 2016 at 06:48:12PM +0100, Alban Crequy wrote: > From: Alban Crequy <alban-lYLaGTFnO9sWenYVfaLwtA@public.gmane.org> > > This adds the selftest "cgroupns_test" in order to test the CGroup > Namespace patchset. > > cgroupns_test creates two child processes. They perform a list of > actions defined by the array cgroupns_tests. This array can easily be > extended to more scenarios without adding much code. They are > synchronized with eventfds to ensure only one action is performed at a > time. > > The memory is shared between the processes (CLONE_VM) so each child > process can know the pid of its siblings without extra IPC. > > The output explains the scenario being played. Short extract: > > > current cgroup: /user.slice/user-0.slice/session-1.scope > > child process #0: check that process #self (pid=482) has cgroup /user.slice/user-0.slice/session-1.scope > > child process #0: unshare cgroupns > > child process #0: move process #self (pid=482) to cgroup cgroup-a/subcgroup-a > > child process #0: join parent cgroupns > > The test does not change the mount namespace and does not mount any > new cgroup2 filesystem. Therefore this does not test that the cgroup2 > mount is correctly rooted to the cgroupns root at mount time. > > Signed-off-by: Alban Crequy <alban-lYLaGTFnO9sWenYVfaLwtA@public.gmane.org> Thanks, Alban! > Acked-by: Serge E.
Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> > > --- > > Changelog: > 20160131 - rebase on sergeh/cgroupns.v10 and fix conflicts > > 20160115 - Detect where cgroup2 is mounted, don't assume > /sys/fs/cgroup (suggested by sergeh) > - Check more error conditions (from krnowak's review) > - Coding style (from krnowak's review) > - Update error message for Linux >= 4.5 (from krnowak's > review) > > 20160104 - Fix coding style (from sergeh's review) > - Fix printf formatting (from sergeh's review) > - Fix parsing of /proc/pid/cgroup (from sergeh's review) > - Fix concatenation of cgroup paths > > 20151219 - First version > > This patch is available in the cgroupns.v10-tests branch of > https://github.com/kinvolk/linux.git > It is rebased on top of Serge Hallyn's cgroupns.v10 branch of > https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/ > > Test results: > > - SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=1 > - SUCCESS on kernel cgroupns.v10 booted with systemd.unified_cgroup_hierarchy=0 > --- > tools/testing/selftests/Makefile | 1 + > tools/testing/selftests/cgroupns/Makefile | 11 + > tools/testing/selftests/cgroupns/cgroupns_test.c | 445 +++++++++++++++++++++++ > 3 files changed, 457 insertions(+) > create mode 100644 tools/testing/selftests/cgroupns/Makefile > create mode 100644 tools/testing/selftests/cgroupns/cgroupns_test.c > > diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile > index b04afc3..b373135 100644 > --- a/tools/testing/selftests/Makefile > +++ b/tools/testing/selftests/Makefile > @@ -1,5 +1,6 @@ > TARGETS = breakpoints > TARGETS += capabilities > +TARGETS += cgroupns > TARGETS += cpu-hotplug > TARGETS += efivarfs > TARGETS += exec > diff --git a/tools/testing/selftests/cgroupns/Makefile b/tools/testing/selftests/cgroupns/Makefile > new file mode 100644 > index 0000000..0fdbe0a > --- /dev/null > +++ b/tools/testing/selftests/cgroupns/Makefile > @@ -0,0 
+1,11 @@ > +CFLAGS += -I../../../../usr/include/ > +CFLAGS += -I../../../../include/uapi/ > + > +all: cgroupns_test > + > +TEST_PROGS := cgroupns_test > + > +include ../lib.mk > + > +clean: > + $(RM) cgroupns_test > diff --git a/tools/testing/selftests/cgroupns/cgroupns_test.c b/tools/testing/selftests/cgroupns/cgroupns_test.c > new file mode 100644 > index 0000000..71e2336 > --- /dev/null > +++ b/tools/testing/selftests/cgroupns/cgroupns_test.c > @@ -0,0 +1,445 @@ > +#define _GNU_SOURCE > + > +#include <stdio.h> > +#include <stdlib.h> > +#include <unistd.h> > +#include <string.h> > +#include <sys/statfs.h> > +#include <inttypes.h> > +#include <sched.h> > +#include <sys/types.h> > +#include <sys/stat.h> > +#include <sys/socket.h> > +#include <sys/wait.h> > +#include <sys/eventfd.h> > +#include <signal.h> > +#include <fcntl.h> > + > +#include <linux/magic.h> > +#include <linux/sched.h> > + > +#include "../kselftest.h" > + > +#define STACK_SIZE 65536 > + > +static char cgroup_mountpoint[4096]; > +static char root_cgroup[4096]; > + > +#define CHILDREN_COUNT 2 > +typedef struct { > + pid_t pid; > + uint8_t *stack; > + int start_semfd; > + int end_semfd; > +} cgroupns_child_t; > +cgroupns_child_t children[CHILDREN_COUNT]; > + > +typedef enum { > + UNSHARE_CGROUPNS, > + JOIN_CGROUPNS, > + CHECK_CGROUP, > + CHECK_CGROUP_WITH_ROOT_PREFIX, > + MOVE_CGROUP, > + MOVE_CGROUP_WITH_ROOT_PREFIX, > +} cgroupns_action_t; > + > +static const struct { > + int actor_id; > + cgroupns_action_t action; > + int target_id; > + char *path; > +} cgroupns_tests[] = { > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, > + > + { 0, UNSHARE_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP, -1, "/"}, > + { 0, CHECK_CGROUP, 0, "/"}, > + { 0, CHECK_CGROUP, 1, 
"/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, -1, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/"}, > + > + { 1, UNSHARE_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP, -1, "/"}, > + { 0, CHECK_CGROUP, 0, "/"}, > + { 0, CHECK_CGROUP, 1, "/"}, > + { 1, CHECK_CGROUP, -1, "/"}, > + { 1, CHECK_CGROUP, 0, "/"}, > + { 1, CHECK_CGROUP, 1, "/"}, > + > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a"}, > + { 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b"}, > + > + { 0, CHECK_CGROUP, -1, "/cgroup-a"}, > + { 0, CHECK_CGROUP, 0, "/cgroup-a"}, > + { 0, CHECK_CGROUP, 1, "/cgroup-b"}, > + { 1, CHECK_CGROUP, -1, "/cgroup-b"}, > + { 1, CHECK_CGROUP, 0, "/cgroup-a"}, > + { 1, CHECK_CGROUP, 1, "/cgroup-b"}, > + > + { 0, UNSHARE_CGROUPNS, -1, NULL}, > + { 1, UNSHARE_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP, -1, "/"}, > + { 0, CHECK_CGROUP, 0, "/"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b"}, > + { 1, CHECK_CGROUP, -1, "/"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a"}, > + { 1, CHECK_CGROUP, 1, "/"}, > + > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a"}, > + { 1, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-b/sub1-b"}, > + > + { 0, CHECK_CGROUP, 0, "/sub1-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a"}, > + { 1, CHECK_CGROUP, 1, "/sub1-b"}, > + > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + { 0, MOVE_CGROUP_WITH_ROOT_PREFIX, -1, "cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 1, CHECK_CGROUP, 0, "/../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 0, CHECK_CGROUP, 1, "/../cgroup-b/sub1-b"}, > + > + { 1, UNSHARE_CGROUPNS, -1, NULL}, > + { 1, 
CHECK_CGROUP, 0, "/../../cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 0, UNSHARE_CGROUPNS, -1, NULL}, > + { 0, CHECK_CGROUP, 1, "/../../../../../cgroup-b/sub1-b"}, > + > + { 0, JOIN_CGROUPNS, -1, NULL}, > + { 1, JOIN_CGROUPNS, -1, NULL}, > + > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 0, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 0, "/cgroup-a/sub1-a/sub2-a/sub3-a/sub4-a"}, > + { 1, CHECK_CGROUP_WITH_ROOT_PREFIX, 1, "/cgroup-b/sub1-b"}, > +}; > +#define cgroupns_tests_len (sizeof(cgroupns_tests) / sizeof(cgroupns_tests[0])) > + > +static void > +get_cgroup_mountpoint(char *path, size_t len) > +{ > + char line[4096]; > + char dummy[4096]; > + char mountpoint[4096]; > + FILE *f; > + > + f = fopen("/proc/self/mountinfo", "r"); > + if (!f) { > + printf("FAIL: cannot open mountinfo\n"); > + ksft_exit_fail(); > + } > + > + for (;;) { > + if (!fgets(line, sizeof(line), f)) { > + if (ferror(f)) { > + printf("FAIL: cannot read mountinfo\n"); > + ksft_exit_fail(); > + } > + printf("FAIL: cannot find cgroup2 mount in mountinfo\n"); > + ksft_exit_fail(); > + } > + > + line[strcspn(line, "\n")] = 0; > + /* 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue > + * (1)(2)(3) (4) (5) (6) (7) (8) (9) (10) (11) > + */ > + if (strstr(line, " - cgroup2 ") == NULL) /* (9)=cgroup2 */ > + continue; > + > + if (sscanf(line, "%4095s %4095s %4095s %4095s %4095s", dummy, dummy, dummy, dummy, mountpoint) != 5) > + continue; > + > + strncpy(path, mountpoint, len); > + path[len-1] = '\0'; > + break; > + } > + > + fclose(f); > +} > + > +static void > +get_cgroup(pid_t pid, char *path, size_t len) > +{ > + char proc_path[4096]; > + char line[4096]; > + FILE *f; > + > + if (pid > 0) { > + sprintf(proc_path, "/proc/%d/cgroup", pid); > + } else { > + sprintf(proc_path, "/proc/self/cgroup"); > + } > + > + f = fopen(proc_path, "r"); > + if (!f) { > + printf("FAIL: cannot open 
%s\n", proc_path); > + ksft_exit_fail(); > + } > + > + for (;;) { > + if (!fgets(line, sizeof(line), f)) { > + if (ferror(f)) { > + printf("FAIL: cannot read %s\n", proc_path); > + ksft_exit_fail(); > + } > + printf("FAIL: could not parse %s\n", proc_path); > + ksft_exit_fail(); > + } > + > + line[strcspn(line, "\n")] = 0; > + if (strncmp(line, "0::", 3) == 0) { > + strncpy(path, line+3, len); > + path[len-1] = '\0'; > + break; > + } > + } > + > + fclose(f); > +} > + > +static void > +move_cgroup(pid_t target_pid, int prefix, char *cgroup) > +{ > + char knob_dir[4096]; > + char knob_path[4096]; > + char buf[128]; > + FILE *f; > + int ret; > + > + if (prefix) { > + sprintf(knob_dir, "%s/%s/%s", cgroup_mountpoint, root_cgroup, cgroup); > + sprintf(knob_path, "%s/cgroup.procs", knob_dir); > + } else { > + sprintf(knob_dir, "%s/%s", cgroup_mountpoint, cgroup); > + sprintf(knob_path, "%s/cgroup.procs", knob_dir); > + } > + > + mkdir(knob_dir, 0755); > + > + sprintf(buf, "%d\n", target_pid); > + > + f = fopen(knob_path, "w"); > + ret = fwrite(buf, strlen(buf), 1, f); > + if (ret != 1) { > + printf("FAIL: cannot write to %s (ret=%d)\n", knob_path, ret); > + ksft_exit_fail(); > + } > + fclose(f); > +} > + > +static int > +child_func(void *arg) > +{ > + uintptr_t id = (uintptr_t) arg; > + char child_cgroup[4096]; > + char expected_cgroup[4096]; > + char process_name[128]; > + char proc_path[128]; > + int step; > + int ret; > + int nsfd; > + > + for (step = 0; step < cgroupns_tests_len; step++) { > + uint64_t counter = 0; > + pid_t target_pid; > + > + /* wait for a signal from the parent process before starting this step */ > + ret = read(children[id].start_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot read semaphore\n"); > + ksft_exit_fail(); > + } > + > + /* only one process will do this step */ > + if (cgroupns_tests[step].actor_id == id) { > + switch (cgroupns_tests[step].action) { > + case UNSHARE_CGROUPNS: > +
printf("child process #%lu: unshare cgroupns\n", id); > + ret = unshare(CLONE_NEWCGROUP); > + if (ret != 0) { > + printf("FAIL: cannot unshare cgroupns\n"); > + ksft_exit_fail(); > + } > + break; > + > + case JOIN_CGROUPNS: > + printf("child process #%lu: join parent cgroupns\n", id); > + > + sprintf(proc_path, "/proc/%d/ns/cgroup", getppid()); > + nsfd = open(proc_path, 0); > + ret = setns(nsfd, CLONE_NEWCGROUP); > + if (ret != 0) { > + printf("FAIL: cannot join cgroupns\n"); > + ksft_exit_fail(); > + } > + close(nsfd); > + break; > + > + case CHECK_CGROUP: > + case CHECK_CGROUP_WITH_ROOT_PREFIX: > + if (cgroupns_tests[step].action == CHECK_CGROUP || strcmp(root_cgroup, "/") == 0) > + sprintf(expected_cgroup, "%s", cgroupns_tests[step].path); > + else if (strcmp(cgroupns_tests[step].path, "/") == 0) > + sprintf(expected_cgroup, "%s", root_cgroup); > + else > + sprintf(expected_cgroup, "%s%s", root_cgroup, cgroupns_tests[step].path); > + > + if (cgroupns_tests[step].target_id >= 0) { > + target_pid = children[cgroupns_tests[step].target_id].pid; > + sprintf(process_name, "#%d (pid=%d)", > + cgroupns_tests[step].target_id, target_pid); > + } else { > + target_pid = 0; > + sprintf(process_name, "#self (pid=%d)", getpid()); > + } > + > + printf("child process #%lu: check that process %s has cgroup %s\n", > + id, process_name, expected_cgroup); > + > + get_cgroup(target_pid, child_cgroup, sizeof(child_cgroup)); > + > + if (strcmp(child_cgroup, expected_cgroup) != 0) { > + printf("FAIL: child has cgroup %s\n", child_cgroup); > + ksft_exit_fail(); > + } > + > + break; > + > + case MOVE_CGROUP: > + case MOVE_CGROUP_WITH_ROOT_PREFIX: > + if (cgroupns_tests[step].target_id >= 0) { > + target_pid = children[cgroupns_tests[step].target_id].pid; > + sprintf(process_name, "#%d (pid=%d)", > + cgroupns_tests[step].target_id, target_pid); > + } else { > + target_pid = children[id].pid; > + sprintf(process_name, "#self (pid=%d)", target_pid); > + } > + > + printf("child process 
#%lu: move process %s to cgroup %s\n", > + id, process_name, cgroupns_tests[step].path); > + > + move_cgroup(target_pid, > + cgroupns_tests[step].action == MOVE_CGROUP_WITH_ROOT_PREFIX, > + cgroupns_tests[step].path); > + break; > + > + default: > + printf("FAIL: invalid action\n"); > + ksft_exit_fail(); > + } > + } > + > + > + /* signal the parent process we've finished this step */ > + counter = 1; > + ret = write(children[id].end_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot write semaphore\n"); > + ksft_exit_fail(); > + } > + } > + > + return 0; > +} > + > +int > +main(int argc, char **argv) > +{ > + struct statfs fs; > + char child_cgroup[4096]; > + int ret; > + int status; > + uintptr_t i; > + int step; > + > + get_cgroup_mountpoint(cgroup_mountpoint, sizeof(cgroup_mountpoint)); > + printf("cgroup2 mounted on: %s\n", cgroup_mountpoint); > + > + if (statfs(cgroup_mountpoint, &fs) < 0) { > + printf("FAIL: statfs\n"); > + ksft_exit_fail(); > + } > + > + if (fs.f_type != (typeof(fs.f_type)) CGROUP2_SUPER_MAGIC) { > + printf("FAIL: this test is for Linux >= 4.5 with cgroup2 mounted\n"); > + ksft_exit_fail(); > + } > + > + get_cgroup(0, root_cgroup, sizeof(root_cgroup)); > + printf("current cgroup: %s\n", root_cgroup); > + > + for (i = 0; i < CHILDREN_COUNT; i++) { > + children[i].start_semfd = eventfd(0, EFD_SEMAPHORE); > + if (children[i].start_semfd == -1) { > + printf("FAIL: cannot create eventfd\n"); > + ksft_exit_fail(); > + } > + > + children[i].end_semfd = eventfd(0, EFD_SEMAPHORE); > + if (children[i].end_semfd == -1) { > + printf("FAIL: cannot create eventfd\n"); > + ksft_exit_fail(); > + } > + > + children[i].stack = malloc(STACK_SIZE); > + if (!children[i].stack) { > + printf("FAIL: cannot allocate stack\n"); > + ksft_exit_fail(); > + } > + } > + > + for (i = 0; i < CHILDREN_COUNT; i++) { > + children[i].pid = clone(child_func, children[i].stack + STACK_SIZE, > + SIGCHLD|CLONE_VM|CLONE_FILES, (void *)i); > 
+ if (children[i].pid == -1) { > + printf("FAIL: cannot clone\n"); > + ksft_exit_fail(); > + } > + } > + > + for (step = 0; step < cgroupns_tests_len; step++) { > + uint64_t counter = 1; > + > + /* signal the child processes they can start the current step */ > + for (i = 0; i < CHILDREN_COUNT; i++) { > + ret = write(children[i].start_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot write semaphore\n"); > + ksft_exit_fail(); > + } > + } > + > + /* wait until all child processes finished the current step */ > + for (i = 0; i < CHILDREN_COUNT; i++) { > + ret = read(children[i].end_semfd, &counter, sizeof(counter)); > + if (ret != sizeof(counter)) { > + printf("FAIL: cannot read semaphore\n"); > + ksft_exit_fail(); > + } > + } > + } > + > + for (i = 0; i < CHILDREN_COUNT; i++) { > + ret = waitpid(-1, &status, 0); > + if (ret == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) { > + printf("FAIL: cannot wait child\n"); > + ksft_exit_fail(); > + } > + } > + > + printf("SUCCESS\n"); > + return ksft_exit_pass(); > +} > -- > 2.5.0 > > _______________________________________________ > Containers mailing list > Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org > https://lists.linuxfoundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 108+ messages in thread
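For readers following the synchronization in the quoted test: the parent and the children hand each step back and forth through eventfds created with EFD_SEMAPHORE. Below is a minimal sketch of just that mechanism. The helper names step_post() and step_wait() are hypothetical and only for illustration; the patch itself open-codes the same read()/write() calls.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/eventfd.h>

/*
 * Hypothetical helper (not in the patch): release one waiter by adding 1
 * to the eventfd counter. Returns 0 on success, -1 on a failed or short write.
 */
int step_post(int fd)
{
	uint64_t one = 1;

	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Hypothetical helper (not in the patch): consume one post. On an eventfd
 * created with EFD_SEMAPHORE, each successful read() takes exactly one unit
 * off the counter. While the counter is zero, read() blocks, or fails with
 * EAGAIN if the descriptor was also opened with EFD_NONBLOCK.
 * Returns 0 on success, -1 otherwise.
 */
int step_wait(int fd)
{
	uint64_t value = 0;

	return read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value) ? 0 : -1;
}
```

In the test, the parent would do the equivalent of step_post() on every child's start_semfd and then step_wait() on every end_semfd, so all children finish step N before step N+1 begins.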
* Re: [lxc-devel] CGroup Namespaces (v10) [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> ` (8 preceding siblings ...) 2016-01-31 17:48 ` Alban Crequy @ 2016-02-11 23:18 ` Alban Crequy 2016-02-26 13:18 ` Alban Crequy 10 siblings, 0 replies; 108+ messages in thread From: Alban Crequy @ 2016-02-11 23:18 UTC (permalink / raw) To: LXC development mailing-list Cc: Linux API, Linux Containers, Johannes Weiner, linux-kernel-u79uwXL29TY76Z2rM5mHXA, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On 29 January 2016 at 09:54, <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> wrote: > Hi, > > following is a revised set of the CGroup Namespace patchset which Aditya > Kali has previously sent. The code can also be found in the cgroupns.v10 > branch of > > https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/ > > To summarize the semantics: > > 1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED > > 2. unsharing a cgroup namespace makes all your current cgroups your new > cgroup root. > > 3. /proc/pid/cgroup always shows cgroup paths relative to the reader's > cgroup namespace root. A task outside of your cgroup looks like > > 8:memory:/../../.. > > 4. when a task mounts a cgroupfs, the cgroup which shows up as root depends > on the mounting task's cgroup namespace. > > 5. setns to a cgroup namespace switches your cgroup namespace but not > your cgroups. > > With this, using github.com/hallyn/lxc #2015-11-09/cgns (and > github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full > proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts. > > This is completely backward compatible and will be completely invisible > to any existing cgroup users (except for those running inside a cgroup > namespace and looking at /proc/pid/cgroup of tasks outside their > namespace.)
Hi, I just noticed commit c38c4597e4bf ("netfilter: implement xt_cgroup cgroup2 path match") which, as far as I understand, introduces a new userland facing API containing the full cgroup path. Does it mean that the cgroupns patchset should include cgroup path translation in xt_cgroup? > Changes from V9: > 1. Update to latest Linus tree > 2. A few locking fixes > > Changes from V8: > 1. Incorporate updated documentation from tj. > 2. Put lookup_one_len() under inode lock > 3. Make cgroup_path non-namespaced, so only calls to cgroup_path_ns() are > namespaced. > 4. Make cgroup_path{,_ns} take the needed locks, since external callers cannot > do so. > 5. Fix the bisectability problem of to_cg_ns() being defined after use > > Changes from V7: > 1. Rework kernfs_path_from_node_locked to return the string length > 2. Rename and reorder args to kernfs_path_from_node > 3. cgroup.c: undo accidental conversions to inline > 4. cgroup.h: move ns declarations to bottom. > 5. Rework the documentation to fit the style of the rest of cgroup.txt > > Changes from V6: > 1. Switch to some WARN_ONs to provide stack traces > 2. Rename kernfs_node_distance to kernfs_depth > 3. Make sure kernfs_common_ancestor() nodes are from same root > 4. Split kernfs changes for cgroup_mount into separate patch > 5. Rename kernfs_obtain_root to kernfs_node_dentry > (And more, see patch changelogs) > > Changes from V5: > 1. To get a root dentry for cgroup namespace mount, walk the path from the > kernfs root dentry. > > Changes from V4: > 1. Move the FS_USERNS_MOUNT flag to last patch > 2. Rebase onto cgroup/for-4.5 > 3. Don't allow non-init user namespaces to bind new subsystems when mounting. > 4. Address feedback from Tejun (thanks). Specifically, not addressed: > . kernfs_obtain_root - walking dentry from kernfs root. > (I think that's the only piece) > 5. Dropped unused get_task_cgroup fn/patch. > 6. Reworked kernfs_path_from_node_locked() to try to simplify the logic.
> It now finds a common ancestor, walks from the source to it, then back > up to the target. > > Changes from V3: > 1. Rebased onto latest cgroup changes. In particular switch to > css_set_lock and ns_common. > 2. Support all hierarchies. > > Changes from V2: > 1. Added documentation in Documentation/cgroups/namespace.txt > 2. Fixed a bug that caused crash > 3. Incorporated some other suggestions from last patchset: > - removed use of threadgroup_lock() while creating new cgroupns > - use task_lock() instead of rcu_read_lock() while accessing > task->nsproxy > - optimized setns() to own cgroupns > - simplified code around sane-behavior mount option parsing > 4. Restored ACKs from Serge Hallyn from v1 on a few patches that have > not changed since then. > > Changes from V1: > 1. No pinning of processes within cgroupns. Tasks can be freely moved > across cgroups even outside of their cgroupns-root. Usual DAC/MAC policies > apply as before. > 2. Path in /proc/<pid>/cgroup is now always shown and is relative to > cgroupns-root. So path can contain '/..' strings depending on cgroupns-root > of the reader and cgroup of <pid>. > 3. setns() does not require the process to first move under target > cgroupns-root. > > Changes from RFC (V0): > 1. setns support for cgroupns > 2. 'mount -t cgroup cgroup <mntpt>' from inside a cgroupns now > mounts the cgroup hierarchy with cgroupns-root as the filesystem root. > 3. writes to cgroup files outside of cgroupns-root are not allowed > 4. visibility of /proc/<pid>/cgroup is further restricted by not showing > anything if the <pid> is in a sibling cgroupns and its cgroup falls outside > your cgroupns-root. > > > _______________________________________________ > lxc-devel mailing list > lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I@public.gmane.org > http://lists.linuxcontainers.org/listinfo/lxc-devel ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [lxc-devel] CGroup Namespaces (v10) [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> ` (9 preceding siblings ...) 2016-02-11 23:18 ` [lxc-devel] CGroup Namespaces (v10) Alban Crequy @ 2016-02-26 13:18 ` Alban Crequy 10 siblings, 0 replies; 108+ messages in thread From: Alban Crequy @ 2016-02-26 13:18 UTC (permalink / raw) To: LXC development mailing-list Cc: Linux API, Linux Containers, Johannes Weiner, linux-kernel-u79uwXL29TY76Z2rM5mHXA, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA, Andrew Morton Hi, On 29 January 2016 at 09:54, <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> wrote: > Hi, > > following is a revised set of the CGroup Namespace patchset which Aditya > Kali has previously sent. The code can also be found in the cgroupns.v10 > branch of > > https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/ > > To summarize the semantics: > > 1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED What's the best way for a userspace application to test at run-time whether the kernel supports cgroup namespaces? Would you recommend testing whether the file /proc/self/ns/cgroup exists? Thanks! Alban ^ permalink raw reply [flat|nested] 108+ messages in thread
* [PATCH 5/8] kernfs: define kernfs_node_dentry [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA ` (9 subsequent siblings) 10 siblings, 0 replies; 108+ messages in thread From: serge.hallyn @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn From: Aditya Kali <adityakali@google.com> Add a new kernfs API to look up the dentry for a particular kernfs path. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> --- Changelog: 20151116 - Don't allow user namespaces to bind new subsystems 20151118 - postpone the FS_USERNS_MOUNT flag until the last patch, until we can convince ourselves it is safe. 20151207 - Switch to walking up the kernfs path from kn root.
20151208 - Split out the kernfs change - Style changes - Switch from pr_crit to WARN_ON - Reorder arguments to kernfs_obtain_root - rename kernfs_obtain_root to kernfs_node_dentry 20160104 - kernfs_node_dentry: lock inode for lookup_one_len() --- fs/kernfs/mount.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++ include/linux/kernfs.h | 2 ++ 2 files changed, 71 insertions(+) diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c index 8eaf417..074bb8b 100644 --- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -14,6 +14,7 @@ #include <linux/magic.h> #include <linux/slab.h> #include <linux/pagemap.h> +#include <linux/namei.h> #include "kernfs-internal.h" @@ -62,6 +63,74 @@ struct kernfs_root *kernfs_root_from_sb(struct super_block *sb) return NULL; } +/* + * find the next ancestor in the path down to @child, where @parent was the + * ancestor whose descendant we want to find. + * + * Say the path is /a/b/c/d. @child is d, @parent is NULL. We return the root + * node. If @parent is b, then we return the node for c. + * Passing in d as @parent is not ok. 
+ */ +static struct kernfs_node * +find_next_ancestor(struct kernfs_node *child, struct kernfs_node *parent) +{ + if (child == parent) { + pr_crit_once("BUG in find_next_ancestor: called with parent == child"); + return NULL; + } + + while (child->parent != parent) { + if (!child->parent) + return NULL; + child = child->parent; + } + + return child; +} + +/** + * kernfs_node_dentry - get a dentry for the given kernfs_node + * @kn: kernfs_node for which a dentry is needed + * @sb: the kernfs super_block + */ +struct dentry *kernfs_node_dentry(struct kernfs_node *kn, + struct super_block *sb) +{ + struct dentry *dentry; + struct kernfs_node *knparent = NULL; + + BUG_ON(sb->s_op != &kernfs_sops); + + dentry = dget(sb->s_root); + + /* Check if this is the root kernfs_node */ + if (!kn->parent) + return dentry; + + knparent = find_next_ancestor(kn, NULL); + if (WARN_ON(!knparent)) + return ERR_PTR(-EINVAL); + + do { + struct dentry *dtmp; + struct kernfs_node *kntmp; + + if (kn == knparent) + return dentry; + kntmp = find_next_ancestor(kn, knparent); + if (WARN_ON(!kntmp)) + return ERR_PTR(-EINVAL); + mutex_lock(&d_inode(dentry)->i_mutex); + dtmp = lookup_one_len(kntmp->name, dentry, strlen(kntmp->name)); + mutex_unlock(&d_inode(dentry)->i_mutex); + dput(dentry); + if (IS_ERR(dtmp)) + return dtmp; + knparent = kntmp; + dentry = dtmp; + } while (1); +} + static int kernfs_fill_super(struct super_block *sb, unsigned long magic) { struct kernfs_super_info *info = kernfs_info(sb); diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 716bfde..c06c442 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -284,6 +284,8 @@ struct kernfs_node *kernfs_node_from_dentry(struct dentry *dentry); struct kernfs_root *kernfs_root_from_sb(struct super_block *sb); struct inode *kernfs_get_inode(struct super_block *sb, struct kernfs_node *kn); +struct dentry *kernfs_node_dentry(struct kernfs_node *kn, + struct super_block *sb); struct kernfs_root 
*kernfs_create_root(struct kernfs_syscall_ops *scops, unsigned int flags, void *priv); void kernfs_destroy_root(struct kernfs_root *root); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
* [PATCH 6/8] cgroup: mount cgroupns-root when inside non-init cgroupns [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA ` (9 subsequent siblings) 10 siblings, 0 replies; 108+ messages in thread From: serge.hallyn @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge Hallyn, Serge Hallyn From: Serge Hallyn <serge.hallyn@ubuntu.com> This patch enables cgroup mounting inside userns when a process has appropriate privileges. The cgroup filesystem mounted is rooted at the cgroupns-root. Thus, in a container-setup, only the hierarchy under the cgroupns-root is exposed inside the container. This allows container management tools to run inside the containers without depending on any global state. Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> --- Changelog: 20151116 - Don't allow user namespaces to bind new subsystems 20151118 - postpone the FS_USERNS_MOUNT flag until the last patch, until we can convince ourselves it is safe. 20151207 - Switch to walking up the kernfs path from kn root.
- Group initialized variables - Explain the capable(CAP_SYS_ADMIN) check - Style fixes 20160104 - kernfs_node_dentry: lock inode for lookup_one_len() 20160128 - grab needed lock in mount Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> --- kernel/cgroup.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 96e3dab..3e04df0 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -1983,6 +1983,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type, { bool is_v2 = fs_type == &cgroup2_fs_type; struct super_block *pinned_sb = NULL; + struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; struct cgroup_subsys *ss; struct cgroup_root *root; struct cgroup_sb_opts opts; @@ -1991,6 +1992,14 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type, int i; bool new_sb; + get_cgroup_ns(ns); + + /* Check if the caller has permission to mount. */ + if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN)) { + put_cgroup_ns(ns); + return ERR_PTR(-EPERM); + } + /* * The first time anyone tries to mount a cgroup, enable the list * linking each css_set to its tasks and fix up all existing tasks. @@ -2001,6 +2010,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type, if (is_v2) { if (data) { pr_err("cgroup2: unknown option \"%s\"\n", (char *)data); + put_cgroup_ns(ns); return ERR_PTR(-EINVAL); } cgrp_dfl_root_visible = true; @@ -2106,6 +2116,16 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type, goto out_unlock; } + /* + * We know this subsystem has not yet been bound. Users in a non-init + * user namespace may only mount hierarchies with no bound subsystems, + * i.e. 
'none,name=user1' + */ + if (!opts.none && !capable(CAP_SYS_ADMIN)) { + ret = -EPERM; + goto out_unlock; + } + root = kzalloc(sizeof(*root), GFP_KERNEL); if (!root) { ret = -ENOMEM; @@ -2124,12 +2144,37 @@ out_free: kfree(opts.release_agent); kfree(opts.name); - if (ret) + if (ret) { + put_cgroup_ns(ns); return ERR_PTR(ret); + } out_mount: dentry = kernfs_mount(fs_type, flags, root->kf_root, is_v2 ? CGROUP2_SUPER_MAGIC : CGROUP_SUPER_MAGIC, &new_sb); + + /* + * In non-init cgroup namespace, instead of root cgroup's + * dentry, we return the dentry corresponding to the + * cgroupns->root_cgrp. + */ + if (!IS_ERR(dentry) && ns != &init_cgroup_ns) { + struct dentry *nsdentry; + struct cgroup *cgrp; + + mutex_lock(&cgroup_mutex); + spin_lock_bh(&css_set_lock); + + cgrp = cset_cgroup_from_root(ns->root_cset, root); + + spin_unlock_bh(&css_set_lock); + mutex_unlock(&cgroup_mutex); + + nsdentry = kernfs_node_dentry(cgrp->kn, dentry->d_sb); + dput(dentry); + dentry = nsdentry; + } + if (IS_ERR(dentry) || !new_sb) cgroup_put(&root->cgrp); @@ -2142,6 +2187,7 @@ out_mount: deactivate_super(pinned_sb); } + put_cgroup_ns(ns); return dentry; } -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
* [PATCH 7/8] cgroup: Add documentation for cgroup namespaces [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2016-01-29 8:54 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2016-01-29 8:54 ` [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA ` (9 subsequent siblings) 10 siblings, 0 replies; 108+ messages in thread From: serge.hallyn @ 2016-01-29 8:54 UTC (permalink / raw) To: linux-kernel Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge Hallyn, Serge Hallyn From: Serge Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tejun Heo <tj@kernel.org> --- Changelog (2015-12-08): Merge into Documentation/cgroup.txt Changelog (2015-12-22): Reformat to try to follow the style of the rest of the cgroup.txt file. Changelog (2015-12-22): tj: Reorganized to better fit the documentation. --- Documentation/cgroup-v2.txt | 147 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 147 insertions(+) diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt index 65b3eac..eee9012 100644 --- a/Documentation/cgroup-v2.txt +++ b/Documentation/cgroup-v2.txt @@ -47,6 +47,11 @@ CONTENTS 5-3. IO 5-3-1. IO Interface Files 5-3-2. Writeback +6. Namespace + 6-1. Basics + 6-2. The Root and Views + 6-3. Migration and setns(2) + 6-4. Interaction with Other Namespaces P. Information on Kernel Programming P-1. Filesystem Support for Writeback D. Deprecated v1 Core Features @@ -1085,6 +1090,148 @@ writeback as follows. vm.dirty[_background]_ratio. +6. Namespace + +6-1. Basics + +cgroup namespace provides a mechanism to virtualize the view of the +"/proc/$PID/cgroup" file and cgroup mounts. The CLONE_NEWCGROUP clone +flag can be used with clone(2) and unshare(2) to create a new cgroup +namespace. 
The process running inside the cgroup namespace will have +its "/proc/$PID/cgroup" output restricted to cgroupns root. The +cgroupns root is the cgroup of the process at the time of creation of +the cgroup namespace. + +Without cgroup namespace, the "/proc/$PID/cgroup" file shows the +complete path of the cgroup of a process. In a container setup where +a set of cgroups and namespaces are intended to isolate processes the +"/proc/$PID/cgroup" file may leak potential system level information +to the isolated processes. For Example: + + # cat /proc/self/cgroup + 0::/batchjobs/container_id1 + +The path '/batchjobs/container_id1' can be considered as system-data +and undesirable to expose to the isolated processes. cgroup namespace +can be used to restrict visibility of this path. For example, before +creating a cgroup namespace, one would see: + + # ls -l /proc/self/ns/cgroup + lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835] + # cat /proc/self/cgroup + 0::/batchjobs/container_id1 + +After unsharing a new namespace, the view changes. + + # ls -l /proc/self/ns/cgroup + lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183] + # cat /proc/self/cgroup + 0::/ + +When some thread from a multi-threaded process unshares its cgroup +namespace, the new cgroupns gets applied to the entire process (all +the threads). This is natural for the v2 hierarchy; however, for the +legacy hierarchies, this may be unexpected. + +A cgroup namespace is alive as long as there are processes inside or +mounts pinning it. When the last usage goes away, the cgroup +namespace is destroyed. The cgroupns root and the actual cgroups +remain. + + +6-2. The Root and Views + +The 'cgroupns root' for a cgroup namespace is the cgroup in which the +process calling unshare(2) is running. For example, if a process in +/batchjobs/container_id1 cgroup calls unshare, cgroup +/batchjobs/container_id1 becomes the cgroupns root. 
For the +init_cgroup_ns, this is the real root ('/') cgroup. + +The cgroupns root cgroup does not change even if the namespace creator +process later moves to a different cgroup. + + # ~/unshare -c # unshare cgroupns in some cgroup + # cat /proc/self/cgroup + 0::/ + # mkdir sub_cgrp_1 + # echo 0 > sub_cgrp_1/cgroup.procs + # cat /proc/self/cgroup + 0::/sub_cgrp_1 + +Each process gets its namespace-specific view of "/proc/$PID/cgroup". + +Processes running inside the cgroup namespace will be able to see +cgroup paths (in /proc/self/cgroup) only inside their root cgroup. +From within an unshared cgroupns: + + # sleep 100000 & + [1] 7353 + # echo 7353 > sub_cgrp_1/cgroup.procs + # cat /proc/7353/cgroup + 0::/sub_cgrp_1 + +From the initial cgroup namespace, the real cgroup path will be +visible: + + $ cat /proc/7353/cgroup + 0::/batchjobs/container_id1/sub_cgrp_1 + +From a sibling cgroup namespace (that is, a namespace rooted at a +different cgroup), the cgroup path relative to its own cgroup +namespace root will be shown. For instance, if PID 7353's cgroup +namespace root is at '/batchjobs/container_id2', then it will see + + # cat /proc/7353/cgroup + 0::/../container_id2/sub_cgrp_1 + +Note that the relative path always starts with '/' to indicate that +it's relative to the cgroup namespace root of the caller. + + +6-3. Migration and setns(2) + +Processes inside a cgroup namespace can move into and out of the +namespace root if they have proper access to external cgroups. For +example, from inside a namespace with cgroupns root at +/batchjobs/container_id1, and assuming that the global hierarchy is +still accessible inside cgroupns: + + # cat /proc/7353/cgroup + 0::/sub_cgrp_1 + # echo 7353 > batchjobs/container_id2/cgroup.procs + # cat /proc/7353/cgroup + 0::/../container_id2 + +Note that this kind of setup is not encouraged. A task inside cgroup +namespace should only be exposed to its own cgroupns hierarchy.
+ +setns(2) to another cgroup namespace is allowed when: + +(a) the process has CAP_SYS_ADMIN against its current user namespace +(b) the process has CAP_SYS_ADMIN against the target cgroup + namespace's userns + +No implicit cgroup changes happen with attaching to another cgroup +namespace. It is expected that the someone moves the attaching +process under the target cgroup namespace root. + + +6-4. Interaction with Other Namespaces + +Namespace specific cgroup hierarchy can be mounted by a process +running inside a non-init cgroup namespace. + + # mount -t cgroup2 none $MOUNT_POINT + +This will mount the unified cgroup hierarchy with cgroupns root as the +filesystem root. The process needs CAP_SYS_ADMIN against its user and +mount namespaces. + +The virtualization of /proc/self/cgroup file combined with restricting +the view of cgroup hierarchy by namespace-private cgroupfs mount +provides a properly isolated cgroup view inside the container. + + P. Information on Kernel Programming This section contains kernel programming information in the areas -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
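The path-view rule described in sections 6-2 and 6-3 of the patch above (paths are shown relative to the reader's cgroupns root, with one "/.." per level that has to be climbed) can be sketched in plain userspace C. This is only an illustration, not the kernel code: the series does this work in kernfs_path_from_node(), which per the V4 changelog finds a common ancestor and walks between the two nodes. The names cgroup_path_in_ns(), norm_len() and append() are hypothetical, and the sketch assumes absolute paths without trailing slashes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of an absolute cgroup path, with the root "/" treated as empty. */
static size_t norm_len(const char *path)
{
	size_t n = strlen(path);

	return (n == 1 && path[0] == '/') ? 0 : n;
}

/* Append n bytes of s to buf, keeping it NUL-terminated; -1 if it won't fit. */
static int append(char *buf, size_t buflen, size_t *used,
		  const char *s, size_t n)
{
	if (*used + n + 1 > buflen)
		return -1;
	memcpy(buf + *used, s, n);
	*used += n;
	buf[*used] = '\0';
	return 0;
}

/*
 * Hypothetical helper: write into buf how the cgroup `target` would be
 * displayed to a reader whose cgroup namespace root is `ns_root`.
 * Both arguments are absolute paths ("/" or "/a/b", no trailing slash).
 * Returns the string length, or -1 if buf is too small.
 */
static int cgroup_path_in_ns(const char *ns_root, const char *target,
			     char *buf, size_t buflen)
{
	size_t rlen, tlen, common = 0, used = 0, i;

	assert(ns_root[0] == '/' && target[0] == '/');
	rlen = norm_len(ns_root);
	tlen = norm_len(target);

	/* Common ancestor: longest shared prefix ending on a '/' boundary. */
	for (i = 0; ; i++) {
		int r_end = (i == rlen) || ns_root[i] == '/';
		int t_end = (i == tlen) || target[i] == '/';

		if (r_end && t_end)
			common = i;
		if (i == rlen || i == tlen || ns_root[i] != target[i])
			break;
	}

	/* One "/.." for every component of ns_root below the ancestor. */
	for (i = common; i < rlen; i++)
		if (ns_root[i] == '/' && append(buf, buflen, &used, "/..", 3))
			return -1;

	/* Then walk down from the ancestor to the target. */
	if (append(buf, buflen, &used, target + common, tlen - common))
		return -1;

	/* Reader root and target coincide: shown as "/". */
	if (used == 0 && append(buf, buflen, &used, "/", 1))
		return -1;

	return (int)used;
}
```

With ns_root "/batchjobs/container_id1" and target "/batchjobs/container_id2" this yields "/../container_id2", matching the section 6-3 example; a reader three levels deep looking at a task in the real root gets "/../../..", as in the cover letter summary.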
* [PATCH 8/8] Add FS_USERNS_FLAG to cgroup fs
From: serge.hallyn @ 2016-01-29  8:54 UTC
To: linux-kernel
Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm,
    ebiederm, gregkh, lizefan, hannes, Serge Hallyn, Serge Hallyn

From: Serge Hallyn <serge.hallyn@ubuntu.com>

allowing root in a non-init user namespace to mount it.  This should
now be safe, because

1. non-init-root cannot mount a previously unbound subsystem
2. the task doing the mount must be privileged with respect to the
   user namespace owning the cgroup namespace
3. the mounted subsystem will have its current cgroup as the root dentry.
   the permissions will be unchanged, so tasks will receive no new
   privilege over the cgroups which they did not have on the original
   mounts.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
---
 kernel/cgroup.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3e04df0..7a58749 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2216,12 +2216,14 @@ static struct file_system_type cgroup_fs_type = {
 	.name = "cgroup",
 	.mount = cgroup_mount,
 	.kill_sb = cgroup_kill_sb,
+	.fs_flags = FS_USERNS_MOUNT,
 };
 
 static struct file_system_type cgroup2_fs_type = {
 	.name = "cgroup2",
 	.mount = cgroup_mount,
 	.kill_sb = cgroup_kill_sb,
+	.fs_flags = FS_USERNS_MOUNT,
 };
 
 static char *
-- 
1.7.9.5
* Re: [PATCH 8/8] Add FS_USERNS_FLAG to cgroup fs
From: Tejun Heo @ 2016-02-16 18:05 UTC
To: serge.hallyn
Cc: linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel,
    akpm, ebiederm, gregkh, lizefan, hannes, Serge Hallyn

On Fri, Jan 29, 2016 at 02:54:11AM -0600, serge.hallyn@ubuntu.com wrote:
> From: Serge Hallyn <serge.hallyn@ubuntu.com>
>
> allowing root in a non-init user namespace to mount it.  This should
> now be safe, because
>
> 1. non-init-root cannot mount a previously unbound subsystem
> 2. the task doing the mount must be privileged with respect to the
>    user namespace owning the cgroup namespace
> 3. the mounted subsystem will have its current cgroup as the root dentry.
>    the permissions will be unchanged, so tasks will receive no new
>    privilege over the cgroups which they did not have on the original
>    mounts.
>
> Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>

Applied 1-8 to cgroup/for-4.6-ns w/ trivial stylistic updates.

Thanks.

-- 
tejun
* Re: [lxc-devel] CGroup Namespaces (v10)
From: Alban Crequy @ 2016-02-11 23:18 UTC
To: LXC development mailing-list
Cc: linux-kernel, Aditya Kali, Linux API, Linux Containers,
    Johannes Weiner, gregkh, Tejun Heo, cgroups, Andrew Morton

On 29 January 2016 at 09:54,  <serge.hallyn@ubuntu.com> wrote:
> Hi,
>
> following is a revised set of the CGroup Namespace patchset which Aditya
> Kali has previously sent.  The code can also be found in the cgroupns.v10
> branch of
>
> https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/
>
> To summarize the semantics:
>
> 1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED
>
> 2. unsharing a cgroup namespace makes all your current cgroups your new
> cgroup root.
>
> 3. /proc/pid/cgroup always shows cgroup paths relative to the reader's
> cgroup namespace root.  A task outside of your cgroup looks like
>
>         8:memory:/../../..
>
> 4. when a task mounts a cgroupfs, the cgroup which shows up as root depends
> on the mounting task's cgroup namespace.
>
> 5. setns to a cgroup namespace switches your cgroup namespace but not
> your cgroups.
>
> With this, using github.com/hallyn/lxc #2015-11-09/cgns (and
> github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full
> proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts.
>
> This is completely backward compatible and will be completely invisible
> to any existing cgroup users (except for those running inside a cgroup
> namespace and looking at /proc/pid/cgroup of tasks outside their
> namespace.)

Hi,

I just noticed commit c38c4597e4bf ("netfilter: implement xt_cgroup
cgroup2 path match") which, as far as I understand, introduces a new
userland facing API containing the full cgroup path. Does it mean that
the cgroupns patchset should include cgroup path translation in
xt_cgroup?

> Changes from V9:
> 1. Update to latest Linus tree
> 2. A few locking fixes
>
> Changes from V8:
> 1. Incorporate updated documentation from tj.
> 2. Put lookup_one_len() under inode lock
> 3. Make cgroup_path non-namespaced, so only calls to cgroup_path_ns() are
>    namespaced.
> 4. Make cgroup_path{,_ns} take the needed locks, since external callers cannot
>    do so.
> 5. Fix the bisectability problem of to_cg_ns() being defined after use
>
> Changes from V7:
> 1. Rework kernfs_path_from_node_locked to return the string length
> 2. Rename and reorder args to kernfs_path_from_node
> 3. cgroup.c: undo accidental conversions to inline
> 4. cgroup.h: move ns declarations to bottom.
> 5. Rework the documentation to fit the style of the rest of cgroup.txt
>
> Changes from V6:
> 1. Switch to some WARN_ONs to provide stack traces
> 2. Rename kernfs_node_distance to kernfs_depth
> 3. Make sure kernfs_common_ancestor() nodes are from same root
> 4. Split kernfs changes for cgroup_mount into separate patch
> 5. Rename kernfs_obtain_root to kernfs_node_dentry
> (And more, see patch changelogs)
>
> Changes from V5:
> 1. To get a root dentry for cgroup namespace mount, walk the path from the
>    kernfs root dentry.
>
> Changes from V4:
> 1. Move the FS_USERNS_MOUNT flag to last patch
> 2. Rebase onto cgroup/for-4.5
> 3. Don't allow non-init user namespaces to bind new subsystems when mounting.
> 4. Address feedback from Tejun (thanks).  Specifically, not addressed:
>    . kernfs_obtain_root - walking dentry from kernfs root.
>    (I think that's the only piece)
> 5. Dropped unused get_task_cgroup fn/patch.
> 6. Reworked kernfs_path_from_node_locked() to try to simplify the logic.
>    It now finds a common ancestor, walks from the source to it, then back
>    up to the target.
>
> Changes from V3:
> 1. Rebased onto latest cgroup changes.  In particular switch to
>    css_set_lock and ns_common.
> 2. Support all hierarchies.
>
> Changes from V2:
> 1. Added documentation in Documentation/cgroups/namespace.txt
> 2. Fixed a bug that caused crash
> 3. Incorporated some other suggestions from last patchset:
>    - removed use of threadgroup_lock() while creating new cgroupns
>    - use task_lock() instead of rcu_read_lock() while accessing
>      task->nsproxy
>    - optimized setns() to own cgroupns
>    - simplified code around sane-behavior mount option parsing
> 4. Restored ACKs from Serge Hallyn from v1 on few patches that have
>    not changed since then.
>
> Changes from V1:
> 1. No pinning of processes within cgroupns. Tasks can be freely moved
>    across cgroups even outside of their cgroupns-root. Usual DAC/MAC policies
>    apply as before.
> 2. Path in /proc/<pid>/cgroup is now always shown and is relative to
>    cgroupns-root. So path can contain '/..' strings depending on cgroupns-root
>    of the reader and cgroup of <pid>.
> 3. setns() does not require the process to first move under target
>    cgroupns-root.
>
> Changes from RFC (V0):
> 1. setns support for cgroupns
> 2. 'mount -t cgroup cgroup <mntpt>' from inside a cgroupns now
>    mounts the cgroup hierarchy with cgroupns-root as the filesystem root.
> 3. writes to cgroup files outside of cgroupns-root are not allowed
> 4. visibility of /proc/<pid>/cgroup is further restricted by not showing
>    anything if the <pid> is in a sibling cgroupns and its cgroup falls outside
>    your cgroupns-root.
* Re: [lxc-devel] CGroup Namespaces (v10)
From: Tejun Heo @ 2016-02-12 16:09 UTC
To: Alban Crequy
Cc: gregkh, Linux API, Linux Containers, linux-kernel,
    LXC development mailing-list, Johannes Weiner, cgroups, Andrew Morton

Hello,

On Fri, Feb 12, 2016 at 12:18:28AM +0100, Alban Crequy wrote:
> I just noticed commit c38c4597e4bf ("netfilter: implement xt_cgroup
> cgroup2 path match") which, as far as I understand, introduces a new
> userland facing API containing the full cgroup path. Does it mean that
> the cgroupns patchset should include cgroup path translation in
> xt_cgroup?

I don't think so.  None of netfilter configuration is namespaced in
any way.  They're system-global by nature.

Thanks.

-- 
tejun
* Re: [lxc-devel] CGroup Namespaces (v10)
From: Serge E. Hallyn @ 2016-02-12 23:22 UTC
To: Tejun Heo
Cc: gregkh, Linux API, Linux Containers, linux-kernel, Alban Crequy,
    LXC development mailing-list, Johannes Weiner, cgroups, Andrew Morton

On Fri, Feb 12, 2016 at 11:09:06AM -0500, Tejun Heo wrote:
> Hello,
>
> On Fri, Feb 12, 2016 at 12:18:28AM +0100, Alban Crequy wrote:
> > I just noticed commit c38c4597e4bf ("netfilter: implement xt_cgroup
> > cgroup2 path match") which, as far as I understand, introduces a new
> > userland facing API containing the full cgroup path. Does it mean that
> > the cgroupns patchset should include cgroup path translation in
> > xt_cgroup?
>
> I don't think so.  None of netfilter configuration is namespaced in
> any way.  They're system-global by nature.

I assume at some point you'll want the set ported onto for-4.6 or
linux-next?  My 2016-02-03/cgns set still cherry-picks cleanly onto
for-4.6 at the moment, but I haven't tried linux-next, and I haven't
done build+test since 4.5-rc1 came out.

thanks,
-serge
* Re: [lxc-devel] CGroup Namespaces (v10)
  [not found] ` <20160212232221.GA31062-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2016-02-15 21:17 ` Tejun Heo
  0 siblings, 0 replies; 108+ messages in thread
From: Tejun Heo @ 2016-02-15 21:17 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: gregkh, Linux API, Linux Containers, linux-kernel, Alban Crequy,
	LXC development mailing-list, Johannes Weiner, cgroups,
	Andrew Morton

On Fri, Feb 12, 2016 at 05:22:21PM -0600, Serge E. Hallyn wrote:
> On Fri, Feb 12, 2016 at 11:09:06AM -0500, Tejun Heo wrote:
> > Hello,
> >
> > On Fri, Feb 12, 2016 at 12:18:28AM +0100, Alban Crequy wrote:
> > > I just noticed commit c38c4597e4bf ("netfilter: implement xt_cgroup
> > > cgroup2 path match") which, as far as I understand, introduces a new
> > > userland facing API containing the full cgroup path. Does it mean that
> > > the cgroupns patchset should include cgroup path translation in
> > > xt_cgroup?
> >
> > I don't think so. None of netfilter configuration is namespaced in
> > any way. They're system-global by nature.
>
> I assume at some point you'll want the set ported onto for-4.6 or
> linux-next? My 2016-02-03/cgns set still cherry-picks cleanly onto
> for-4.6 at the moment, but I haven't tried linux-next, and I haven't
> done build+test since 4.5-rc1 came out.

I'm getting the following on top of the current for-4.6. Can you
please look into it?

 [kernel/cgroup.c:219:13: error: ‘cgroupns_operations’ undeclared here (not in a function)
   .ns.ops = &cgroupns_operations,
             ^

Thanks.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [lxc-devel] CGroup Namespaces (v10)
  [not found] ` <20160215211705.GQ3965-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
@ 2016-02-15 21:20 ` Tejun Heo
  0 siblings, 0 replies; 108+ messages in thread
From: Tejun Heo @ 2016-02-15 21:20 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: gregkh, Linux API, Linux Containers, linux-kernel, Alban Crequy,
	LXC development mailing-list, Johannes Weiner, cgroups,
	Andrew Morton

On Mon, Feb 15, 2016 at 04:17:05PM -0500, Tejun Heo wrote:
> I'm getting the following on top of the current for-4.6. Can you
> please look into it?
>
> [kernel/cgroup.c:219:13: error: ‘cgroupns_operations’ undeclared here (not in a function)
>   .ns.ops = &cgroupns_operations,
>             ^

Never mind. That was me being stupid with trivial conflict
resolution.

Thanks.

-- 
tejun

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [lxc-devel] CGroup Namespaces (v10)
  [not found] ` <1454057651-23959-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
@ 2016-02-26 13:18 ` Alban Crequy
  2016-01-29 8:54 ` [PATCH 2/8] sched: new clone flag CLONE_NEWCGROUP for cgroup namespace serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA
  ` (9 subsequent siblings)
  10 siblings, 0 replies; 108+ messages in thread
From: Alban Crequy @ 2016-02-26 13:18 UTC (permalink / raw)
  To: LXC development mailing-list
  Cc: linux-kernel, Aditya Kali, Linux API, Linux Containers,
	Johannes Weiner, gregkh, Tejun Heo, cgroups, Andrew Morton

Hi,

On 29 January 2016 at 09:54, <serge.hallyn@ubuntu.com> wrote:
> Hi,
>
> following is a revised set of the CGroup Namespace patchset which Aditya
> Kali has previously sent. The code can also be found in the cgroupns.v10
> branch of
>
> https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/
>
> To summarize the semantics:
>
> 1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED

What's the best way for a userspace application to test at run-time
whether the kernel supports cgroup namespaces? Would you recommend
testing whether the file /proc/self/ns/cgroup exists?

Thanks!
Alban

^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [lxc-devel] CGroup Namespaces (v10)
  [not found] ` <CAMXgnP4Wss0ctx7mHzD0WHL4+-fC59iLZNkYONE5pAeHYr18+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-02-26 22:47 ` Serge Hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: Serge Hallyn @ 2016-02-26 22:47 UTC (permalink / raw)
  To: Alban Crequy
  Cc: LXC development mailing-list, Linux API, Linux Containers,
	Johannes Weiner, linux-kernel, gregkh, Tejun Heo, cgroups,
	Andrew Morton

Quoting Alban Crequy (alban.crequy@gmail.com):
> Hi,
>
> On 29 January 2016 at 09:54, <serge.hallyn@ubuntu.com> wrote:
> > Hi,
> >
> > following is a revised set of the CGroup Namespace patchset which Aditya
> > Kali has previously sent. The code can also be found in the cgroupns.v10
> > branch of
> >
> > https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/
> >
> > To summarize the semantics:
> >
> > 1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED
>
> What's the best way for a userspace application to test at run-time
> whether the kernel supports cgroup namespaces? Would you recommend to
> test if the file /proc/self/ns/cgroup exists?

Yup.

^ permalink raw reply [flat|nested] 108+ messages in thread
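For userspace that wants to act on the answer above, the run-time probe might look like the following. This is a minimal sketch and not code from the thread: the helper names ns_entry_exists() and have_cgroup_ns() are made up for illustration, and lstat() is an assumption chosen so that only the presence of the proc entry is tested, without following the link.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: true if @path exists (a symlink itself counts). */
bool ns_entry_exists(const char *path)
{
	struct stat st;

	return lstat(path, &st) == 0;
}

/*
 * Hypothetical helper: run-time probe for cgroup namespace support,
 * based on the /proc/self/ns/cgroup test confirmed in the reply above.
 */
bool have_cgroup_ns(void)
{
	return ns_entry_exists("/proc/self/ns/cgroup");
}
```

A caller would presumably pass CLONE_NEWCGROUP to clone() or unshare() only when have_cgroup_ns() returns true, and keep its existing cgroup setup otherwise.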
* CGroup Namespaces (v9)
@ 2016-01-04 19:54 serge.hallyn
  [not found] ` <1451937294-22589-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 108+ messages in thread
From: serge.hallyn @ 2016-01-04 19:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, containers, hannes, ebiederm, lxc-devel, gregkh, tj,
	cgroups, akpm

Hi,

following is a revised set of the CGroup Namespace patchset which Aditya
Kali has previously sent. The code can also be found in the cgroupns.v9
branch of

https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/

To summarize the semantics:

1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED

2. unsharing a cgroup namespace makes all your current cgroups your new
cgroup root.

3. /proc/pid/cgroup always shows cgroup paths relative to the reader's
cgroup namespace root. A task outside of your cgroup looks like

	8:memory:/../../..

4. when a task mounts a cgroupfs, the cgroup which shows up as root depends
on the mounting task's cgroup namespace.

5. setns to a cgroup namespace switches your cgroup namespace but not
your cgroups.

With this, using github.com/hallyn/lxc #2015-11-09/cgns (and
github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full
proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts.

This is completely backward compatible and will be completely invisible
to any existing cgroup users (except for those running inside a cgroup
namespace and looking at /proc/pid/cgroup of tasks outside their
namespace.)

Changes from V8:
1. Incorporate updated documentation from tj.
2. Put lookup_one_len() under inode lock
3. Make cgroup_path non-namespaced, so only calls to cgroup_path_ns() are
   namespaced.
4. Make cgroup_path{,_ns} take the needed locks, since external callers
   cannot do so.
5. Fix the bisectability problem of to_cg_ns() being defined after use

Changes from V7:
1. Rework kernfs_path_from_node_locked to return the string length
2. Rename and reorder args to kernfs_path_from_node
3. cgroup.c: undo accidental conversions to inline
4. cgroup.h: move ns declarations to bottom.
5. Rework the documentation to fit the style of the rest of cgroup.txt

Changes from V6:
1. Switch to some WARN_ONs to provide stack traces
2. Rename kernfs_node_distance to kernfs_depth
3. Make sure kernfs_common_ancestor() nodes are from same root
4. Split kernfs changes for cgroup_mount into separate patch
5. Rename kernfs_obtain_root to kernfs_node_dentry
(And more, see patch changelogs)

Changes from V5:
1. To get a root dentry for cgroup namespace mount, walk the path from the
   kernfs root dentry.

Changes from V4:
1. Move the FS_USERNS_MOUNT flag to last patch
2. Rebase onto cgroup/for-4.5
3. Don't allow non-init user namespaces to bind new subsystems when mounting.
4. Address feedback from Tejun (thanks). Specifically, not addressed:
   . kernfs_obtain_root - walking dentry from kernfs root.
   (I think that's the only piece)
5. Dropped unused get_task_cgroup fn/patch.
6. Reworked kernfs_path_from_node_locked() to try to simplify the logic.
   It now finds a common ancestor, walks from the source to it, then back
   up to the target.

Changes from V3:
1. Rebased onto latest cgroup changes. In particular switch to
   css_set_lock and ns_common.
2. Support all hierarchies.

Changes from V2:
1. Added documentation in Documentation/cgroups/namespace.txt
2. Fixed a bug that caused a crash
3. Incorporated some other suggestions from last patchset:
   - removed use of threadgroup_lock() while creating new cgroupns
   - use task_lock() instead of rcu_read_lock() while accessing
     task->nsproxy
   - optimized setns() to own cgroupns
   - simplified code around sane-behavior mount option parsing
4. Restored ACKs from Serge Hallyn from v1 on a few patches that have
   not changed since then.

Changes from V1:
1. No pinning of processes within cgroupns. Tasks can be freely moved
   across cgroups even outside of their cgroupns-root. Usual DAC/MAC policies
   apply as before.
2. Path in /proc/<pid>/cgroup is now always shown and is relative to
   cgroupns-root. So path can contain '/..' strings depending on cgroupns-root
   of the reader and cgroup of <pid>.
3. setns() does not require the process to first move under target
   cgroupns-root.

Changes from RFC (V0):
1. setns support for cgroupns
2. 'mount -t cgroup cgroup <mntpt>' from inside a cgroupns now
   mounts the cgroup hierarchy with cgroupns-root as the filesystem root.
3. writes to cgroup files outside of cgroupns-root are not allowed
4. visibility of /proc/<pid>/cgroup is further restricted by not showing
   anything if the <pid> is in a sibling cgroupns and its cgroup falls outside
   your cgroupns-root.

^ permalink raw reply [flat|nested] 108+ messages in thread
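Semantics 1 and 2 in the summary above can be exercised from userspace roughly as follows. This is a hedged sketch rather than code from the patchset: unshare_cgroup_ns() is a hypothetical helper name, the flag value is the 0x02000000 quoted in point 1 and is defined locally only if the headers do not provide it, and the raw syscall() form is an assumption made to keep the example independent of libc feature-test macros.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000	/* value quoted in the cover letter */
#endif

/*
 * Hypothetical helper: move the calling task into a new cgroup namespace.
 * Returns 0 on success or a negative errno value on failure.
 */
int unshare_cgroup_ns(void)
{
	if (syscall(SYS_unshare, CLONE_NEWCGROUP) == -1)
		return -errno;
	return 0;
}
```

If the call succeeds, point 2 implies that the task's current cgroups become its namespace root, so a later read of /proc/self/cgroup should show "/" for each hierarchy. A failure is expected on kernels without the patchset, and most likely also for callers without sufficient privilege.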
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path
  2016-01-04 19:54 CGroup Namespaces (v9) serge.hallyn
@ 2016-01-04 19:54 ` serge.hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: serge.hallyn @ 2016-01-04 19:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-api, containers, hannes, ebiederm, lxc-devel, gregkh, tj,
	cgroups, akpm

From: Aditya Kali <adityakali@google.com>

The new function kernfs_path_from_node() generates and returns kernfs
path of a given kernfs_node relative to a given parent kernfs_node.

Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Changelog 20151125:
  - Fully-wing multiline comments
  - Rework kernfs_path_from_node_locked() logic
  - Replace BUG_ONs with returning NULL
  - Use a const char* for /.. and precalculate its size
Changelog 20151130:
  - Update kernfs_path_from_node_locked comment
Changelog 20151208:
  - kernfs_node_distance:
    * Remove BUG_ON(NULL)s
    * Rename kernfs_node_distance to kernfs_depth
  - kernfs_common_ancestor:
    * Remove useless checks for depth == 0
    * Add check to ensure nodes are from same root
  - kernfs_path_from_node_locked:
    * Remove needless __must_check
    * Put p;len on its own decl line.
    * Fix wrong WARN_ONCE usage
Changelog 20151209:
  - kernfs_path_from_node: change arguments to 'to' and 'from', and
    change their order.
Changelog 20151222:
  - kernfs_path_from_node{,_locked}: return the string length.
    kernfs_path is gpl-exported, so changing their return value
    seemed ill-advised, but if no one minds I can update it too.
Changelog 20151223:
  - don't allocate memory in pr_cont_kernfs_path() under spinlock
---
 fs/kernfs/dir.c        |  192 ++++++++++++++++++++++++++++++++++++++++--------
 include/linux/kernfs.h |    9 ++-
 2 files changed, 166 insertions(+), 35 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 742bf4a..f2b2187 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -44,28 +44,123 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
 	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
 }
 
-static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+/* kernfs_node_depth - compute depth from @from to @to */
+static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to)
 {
-	char *p = buf + buflen;
-	int len;
+	size_t depth = 0;
 
-	*--p = '\0';
+	while (to->parent && to != from) {
+		depth++;
+		to = to->parent;
+	}
+	return depth;
+}
 
-	do {
-		len = strlen(kn->name);
-		if (p - buf < len + 1) {
-			buf[0] = '\0';
-			p = NULL;
-			break;
-		}
-		p -= len;
-		memcpy(p, kn->name, len);
-		*--p = '/';
-		kn = kn->parent;
-	} while (kn && kn->parent);
+static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a,
+						  struct kernfs_node *b)
+{
+	size_t da, db;
+	struct kernfs_root *ra = kernfs_root(a), *rb = kernfs_root(b);
+
+	if (ra != rb)
+		return NULL;
+
+	da = kernfs_depth(ra->kn, a);
+	db = kernfs_depth(rb->kn, b);
+
+	while (da > db) {
+		a = a->parent;
+		da--;
+	}
+	while (db > da) {
+		b = b->parent;
+		db--;
+	}
+
+	/* worst case b and a will be the same at root */
+	while (b != a) {
+		b = b->parent;
+		a = a->parent;
+	}
+
+	return a;
+}
+
+/**
+ * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to,
+ * where kn_from is treated as root of the path.
+ * @kn_from: kernfs node which should be treated as root for the path
+ * @kn_to: kernfs node to which path is needed
+ * @buf: buffer to copy the path into
+ * @buflen: size of @buf
+ *
+ * We need to handle couple of scenarios here:
+ * [1] when @kn_from is an ancestor of @kn_to at some level
+ * kn_from: /n1/n2/n3
+ * kn_to: /n1/n2/n3/n4/n5
+ * result: /n4/n5
+ *
+ * [2] when @kn_from is on a different hierarchy and we need to find common
+ * ancestor between @kn_from and @kn_to.
+ * kn_from: /n1/n2/n3/n4
+ * kn_to: /n1/n2/n5
+ * result: /../../n5
+ * OR
+ * kn_from: /n1/n2/n3/n4/n5 [depth=5]
+ * kn_to: /n1/n2/n3 [depth=3]
+ * result: /../..
+ *
+ * return value: length of the string. If greater than buflen,
+ * then contents of buf are undefined. On error, -1 is returned.
+ */
+static int
+kernfs_path_from_node_locked(struct kernfs_node *kn_to,
+			     struct kernfs_node *kn_from, char *buf,
+			     size_t buflen)
+{
+	struct kernfs_node *kn, *common;
+	const char parent_str[] = "/..";
+	size_t depth_from, depth_to, len = 0, nlen = 0;
+	char *p;
+	int i;
+
+	if (!kn_from)
+		kn_from = kernfs_root(kn_to)->kn;
+
+	if (kn_from == kn_to)
+		return strlcpy(buf, "/", buflen);
+
+	common = kernfs_common_ancestor(kn_from, kn_to);
+	if (WARN_ON(!common))
+		return -1;
+
+	depth_to = kernfs_depth(common, kn_to);
+	depth_from = kernfs_depth(common, kn_from);
+
+	if (buf)
+		buf[0] = '\0';
 
-	return p;
+	for (i = 0; i < depth_from; i++)
+		len += strlcpy(buf + len, parent_str,
+			       len < buflen ? buflen - len : 0);
+
+	/* Calculate how many bytes we need for the rest */
+	for (kn = kn_to; kn != common; kn = kn->parent)
+		nlen += strlen(kn->name) + 1;
+
+	if (len + nlen >= buflen)
+		return len + nlen;
+
+	p = buf + len + nlen;
+	*p = '\0';
+	for (kn = kn_to; kn != common; kn = kn->parent) {
+		nlen = strlen(kn->name);
+		p -= nlen;
+		memcpy(p, kn->name, nlen);
+		*(--p) = '/';
+	}
+
+	return len + nlen;
 }
 
 /**
@@ -115,6 +210,34 @@ size_t kernfs_path_len(struct kernfs_node *kn)
 }
 
 /**
+ * kernfs_path_from_node - build path of node @to relative to @from.
+ * @from: parent kernfs_node relative to which we need to build the path
+ * @to: kernfs_node of interest
+ * @buf: buffer to copy @to's path into
+ * @buflen: size of @buf
+ *
+ * Builds @to's path relative to @from in @buf. @from and @to must
+ * be on the same kernfs-root. If @from is not parent of @to, then a relative
+ * path (which includes '..'s) as needed to reach from @from to @to is
+ * returned.
+ *
+ * If @buf isn't long enough, the return value will be greater than @buflen
+ * and @buf contents are undefined.
+ */
+int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
+			  char *buf, size_t buflen)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	ret = kernfs_path_from_node_locked(to, from, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kernfs_path_from_node);
+
+/**
  * kernfs_path - build full path of a given node
  * @kn: kernfs_node of interest
  * @buf: buffer to copy @kn's name into
@@ -127,13 +250,12 @@ size_t kernfs_path_len(struct kernfs_node *kn)
  */
 char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
 {
-	unsigned long flags;
-	char *p;
+	int ret;
 
-	spin_lock_irqsave(&kernfs_rename_lock, flags);
-	p = kernfs_path_locked(kn, buf, buflen);
-	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
-	return p;
+	ret = kernfs_path_from_node(kn, NULL, buf, buflen);
+	if (ret < 0 || ret >= buflen)
+		return NULL;
+	return buf;
 }
 EXPORT_SYMBOL_GPL(kernfs_path);
 
@@ -164,17 +286,25 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
 void pr_cont_kernfs_path(struct kernfs_node *kn)
 {
 	unsigned long flags;
-	char *p;
+	int sz;
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
 
-	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
-			       sizeof(kernfs_pr_cont_buf));
-	if (p)
-		pr_cont("%s", p);
-	else
-		pr_cont("<name too long>");
+	sz = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
+					  sizeof(kernfs_pr_cont_buf));
+	if (sz < 0) {
+		pr_cont("(error)");
+		goto out;
+	}
+
+	if (sz >= sizeof(kernfs_pr_cont_buf)) {
+		pr_cont("(name too long)");
+		goto out;
+	}
+
+	pr_cont("%s", kernfs_pr_cont_buf);
 
+out:
 	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
 }
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index af51df3..716bfde 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -267,8 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 
 int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
 size_t kernfs_path_len(struct kernfs_node *kn);
-char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-				size_t buflen);
+int kernfs_path_from_node(struct kernfs_node *root_kn, struct kernfs_node *kn,
+			  char *buf, size_t buflen);
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen);
 void pr_cont_kernfs_name(struct kernfs_node *kn);
 void pr_cont_kernfs_path(struct kernfs_node *kn);
 struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn);
@@ -338,8 +339,8 @@ static inline int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
 static inline size_t kernfs_path_len(struct kernfs_node *kn)
 { return 0; }
 
-static inline char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+static inline char *kernfs_path(struct kernfs_node *kn, char *buf,
+				size_t buflen)
 { return NULL; }
 
 static inline void pr_cont_kernfs_name(struct kernfs_node *kn) { }
-- 
1.7.9.5

^ permalink raw reply related [flat|nested] 108+ messages in thread
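The path construction in the patch above (find the common ancestor, emit one "/.." per level from @from up to it, then append the names leading down to @to) can be tried out in userspace with plain strings. The sketch below is an illustration only, under stated assumptions: rel_path_from() is a hypothetical name, it works on absolute path strings rather than kernfs_node pointers, it assumes normalized input ("/" or "/a/b" with no trailing slash), and it mirrors the patch's convention of returning the needed length and writing nothing when the buffer is too small.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of kernfs_path_from_node(): writes the
 * path of @to as seen with @from as root.  Returns the length of the
 * result; if that is >= @buflen nothing is written.  Returns -1 on
 * malformed input.
 */
int rel_path_from(const char *to, const char *from, char *buf, size_t buflen)
{
	size_t tlen, flen, i, common = 0, ups = 0, need;
	char *p;

	if (!to || !from || to[0] != '/' || from[0] != '/')
		return -1;

	/* Treat the root "/" as an empty component list. */
	tlen = strcmp(to, "/") ? strlen(to) : 0;
	flen = strcmp(from, "/") ? strlen(from) : 0;

	/* Longest shared prefix that ends on a component boundary. */
	for (i = 0; ; i++) {
		int f_end = (i == flen) || from[i] == '/';
		int t_end = (i == tlen) || to[i] == '/';

		if (f_end && t_end)
			common = i;
		if (i == flen || i == tlen || from[i] != to[i])
			break;
	}

	/* One "/.." per component of @from below the common ancestor. */
	for (i = common; i < flen; i++)
		if (from[i] == '/')
			ups++;

	need = 3 * ups + (tlen - common);
	if (need == 0) {		/* @to is @from itself */
		if (buflen < 2)
			return 1;
		buf[0] = '/';
		buf[1] = '\0';
		return 1;
	}
	if (need >= buflen)
		return (int)need;

	p = buf;
	for (i = 0; i < ups; i++) {
		memcpy(p, "/..", 3);
		p += 3;
	}
	memcpy(p, to + common, tlen - common);
	p[tlen - common] = '\0';
	return (int)need;
}
```

With the inputs from scenario [1] in the patch comment (from /n1/n2/n3, to /n1/n2/n3/n4/n5) this yields /n4/n5, and with the two scenario [2] inputs it yields /../../n5 and /../.. respectively. The kernel code reaches the same results by walking ->parent pointers instead of comparing strings.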
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path
From: serge.hallyn @ 2016-01-04 19:54 UTC
To: linux-kernel
Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm,
    ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn

From: Aditya Kali <adityakali@google.com>

The new function kernfs_path_from_node() generates and returns the kernfs
path of a given kernfs_node relative to a given parent kernfs_node.

Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Changelog 20151125:
  - Fully-wing multiline comments
  - Rework kernfs_path_from_node_locked() logic
  - Replace BUG_ONs with returning NULL
  - Use a const char* for /.. and precalculate its size
Changelog 20151130:
  - Update kernfs_path_from_node_locked comment
Changelog 20151208:
  - kernfs_node_distance:
    * Remove BUG_ON(NULL)s
    * Rename kernfs_node_distance to kernfs_depth
  - kernfs_common_ancestor:
    * Remove useless checks for depth == 0
    * Add check to ensure nodes are from same root
  - kernfs_path_from_node_locked:
    * Remove needless __must_check
    * Put p;len on its own decl line.
    * Fix wrong WARN_ONCE usage
Changelog 20151209:
  - kernfs_path_from_node: change arguments to 'to' and 'from', and
    change their order.
Changelog 20151222:
  - kernfs_path_from_node{,_locked}: return the string length.
    kernfs_path is gpl-exported, so changing its return value
    seemed ill-advised, but if no one minds I can update it too.
Changelog 20151223:
  - don't allocate memory in pr_cont_kernfs_path() under spinlock
---
 fs/kernfs/dir.c        | 192 ++++++++++++++++++++++++++++++++++++++++--------
 include/linux/kernfs.h |   9 ++-
 2 files changed, 166 insertions(+), 35 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 742bf4a..f2b2187 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -44,28 +44,123 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
 	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
 }
 
-static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+/* kernfs_node_depth - compute depth from @from to @to */
+static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to)
 {
-	char *p = buf + buflen;
-	int len;
+	size_t depth = 0;
 
-	*--p = '\0';
+	while (to->parent && to != from) {
+		depth++;
+		to = to->parent;
+	}
+	return depth;
+}
 
-	do {
-		len = strlen(kn->name);
-		if (p - buf < len + 1) {
-			buf[0] = '\0';
-			p = NULL;
-			break;
-		}
-		p -= len;
-		memcpy(p, kn->name, len);
-		*--p = '/';
-		kn = kn->parent;
-	} while (kn && kn->parent);
+static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a,
+						  struct kernfs_node *b)
+{
+	size_t da, db;
+	struct kernfs_root *ra = kernfs_root(a), *rb = kernfs_root(b);
+
+	if (ra != rb)
+		return NULL;
+
+	da = kernfs_depth(ra->kn, a);
+	db = kernfs_depth(rb->kn, b);
+
+	while (da > db) {
+		a = a->parent;
+		da--;
+	}
+	while (db > da) {
+		b = b->parent;
+		db--;
+	}
+
+	/* worst case b and a will be the same at root */
+	while (b != a) {
+		b = b->parent;
+		a = a->parent;
+	}
+
+	return a;
+}
+
+/**
+ * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to,
+ * where kn_from is treated as root of the path.
+ * @kn_from: kernfs node which should be treated as root for the path
+ * @kn_to: kernfs node to which path is needed
+ * @buf: buffer to copy the path into
+ * @buflen: size of @buf
+ *
+ * We need to handle couple of scenarios here:
+ * [1] when @kn_from is an ancestor of @kn_to at some level
+ * kn_from: /n1/n2/n3
+ * kn_to:   /n1/n2/n3/n4/n5
+ * result:  /n4/n5
+ *
+ * [2] when @kn_from is on a different hierarchy and we need to find common
+ * ancestor between @kn_from and @kn_to.
+ * kn_from: /n1/n2/n3/n4
+ * kn_to:   /n1/n2/n5
+ * result:  /../../n5
+ * OR
+ * kn_from: /n1/n2/n3/n4/n5   [depth=5]
+ * kn_to:   /n1/n2/n3         [depth=3]
+ * result:  /../..
+ *
+ * return value: length of the string.  If greater than buflen,
+ * then contents of buf are undefined.  On error, -1 is returned.
+ */
+static int
+kernfs_path_from_node_locked(struct kernfs_node *kn_to,
+			     struct kernfs_node *kn_from, char *buf,
+			     size_t buflen)
+{
+	struct kernfs_node *kn, *common;
+	const char parent_str[] = "/..";
+	size_t depth_from, depth_to, len = 0, nlen = 0;
+	char *p;
+	int i;
+
+	if (!kn_from)
+		kn_from = kernfs_root(kn_to)->kn;
+
+	if (kn_from == kn_to)
+		return strlcpy(buf, "/", buflen);
+
+	common = kernfs_common_ancestor(kn_from, kn_to);
+	if (WARN_ON(!common))
+		return -1;
+
+	depth_to = kernfs_depth(common, kn_to);
+	depth_from = kernfs_depth(common, kn_from);
+
+	if (buf)
+		buf[0] = '\0';
 
-	return p;
+	for (i = 0; i < depth_from; i++)
+		len += strlcpy(buf + len, parent_str,
+			       len < buflen ? buflen - len : 0);
+
+	/* Calculate how many bytes we need for the rest */
+	for (kn = kn_to; kn != common; kn = kn->parent)
+		nlen += strlen(kn->name) + 1;
+
+	if (len + nlen >= buflen)
+		return len + nlen;
+
+	p = buf + len + nlen;
+	*p = '\0';
+	for (kn = kn_to; kn != common; kn = kn->parent) {
+		nlen = strlen(kn->name);
+		p -= nlen;
+		memcpy(p, kn->name, nlen);
+		*(--p) = '/';
+	}
+
+	return len + nlen;
 }
 
 /**
@@ -115,6 +210,34 @@ size_t kernfs_path_len(struct kernfs_node *kn)
 }
 
 /**
+ * kernfs_path_from_node - build path of node @to relative to @from.
+ * @from: parent kernfs_node relative to which we need to build the path
+ * @to: kernfs_node of interest
+ * @buf: buffer to copy @to's path into
+ * @buflen: size of @buf
+ *
+ * Builds @to's path relative to @from in @buf. @from and @to must
+ * be on the same kernfs-root. If @from is not parent of @to, then a relative
+ * path (which includes '..'s) as needed to reach from @from to @to is
+ * returned.
+ *
+ * If @buf isn't long enough, the return value will be greater than @buflen
+ * and @buf contents are undefined.
+ */
+int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
+			  char *buf, size_t buflen)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	ret = kernfs_path_from_node_locked(to, from, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kernfs_path_from_node);
+
+/**
  * kernfs_path - build full path of a given node
  * @kn: kernfs_node of interest
  * @buf: buffer to copy @kn's name into
@@ -127,13 +250,12 @@ size_t kernfs_path_len(struct kernfs_node *kn)
  */
 char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
 {
-	unsigned long flags;
-	char *p;
+	int ret;
 
-	spin_lock_irqsave(&kernfs_rename_lock, flags);
-	p = kernfs_path_locked(kn, buf, buflen);
-	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
-	return p;
+	ret = kernfs_path_from_node(kn, NULL, buf, buflen);
+	if (ret < 0 || ret >= buflen)
+		return NULL;
+	return buf;
 }
 EXPORT_SYMBOL_GPL(kernfs_path);
 
@@ -164,17 +286,25 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
 void pr_cont_kernfs_path(struct kernfs_node *kn)
 {
 	unsigned long flags;
-	char *p;
+	int sz;
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
 
-	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
-			       sizeof(kernfs_pr_cont_buf));
-	if (p)
-		pr_cont("%s", p);
-	else
-		pr_cont("<name too long>");
+	sz = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
+					  sizeof(kernfs_pr_cont_buf));
+	if (sz < 0) {
+		pr_cont("(error)");
+		goto out;
+	}
+
+	if (sz >= sizeof(kernfs_pr_cont_buf)) {
+		pr_cont("(name too long)");
+		goto out;
+	}
+
+	pr_cont("%s", kernfs_pr_cont_buf);
 
+out:
 	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
 }
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index af51df3..716bfde 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -267,8 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 
 int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
 size_t kernfs_path_len(struct kernfs_node *kn);
-char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-				size_t buflen);
+int kernfs_path_from_node(struct kernfs_node *root_kn, struct kernfs_node *kn,
+			  char *buf, size_t buflen);
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen);
 void pr_cont_kernfs_name(struct kernfs_node *kn);
 void pr_cont_kernfs_path(struct kernfs_node *kn);
 struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn);
@@ -338,8 +339,8 @@ static inline int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
 static inline size_t kernfs_path_len(struct kernfs_node *kn)
 { return 0; }
 
-static inline char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+static inline char *kernfs_path(struct kernfs_node *kn, char *buf,
+				size_t buflen)
 { return NULL; }
 
 static inline void pr_cont_kernfs_name(struct kernfs_node *kn) { }
-- 
1.7.9.5
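For readers who want to try the path logic outside the kernel, here is a minimal userspace sketch of the same idea: find the common ancestor, emit one "/.." per level between @from and that ancestor, then append the components leading down to @to. It is not the kernel code. `struct node`, `node_depth()`, `common_ancestor()` and `rel_path()` are hypothetical names, both nodes are assumed to be in the same tree, and the sketch computes the total length before writing anything, which differs from the strlcpy-based loop in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kernfs_node: a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;	/* NULL for the root */
};

/* Number of parent hops from @n up to the root. */
static size_t node_depth(const struct node *n)
{
	size_t d = 0;

	for (; n->parent; n = n->parent)
		d++;
	return d;
}

/* Deepest node that is an ancestor of (or equal to) both @a and @b.
 * Assumes @a and @b live in the same tree. */
static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	size_t da = node_depth(a), db = node_depth(b);

	for (; da > db; da--)
		a = a->parent;
	for (; db > da; db--)
		b = b->parent;
	while (a != b) {
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/*
 * Write the path of @to as seen from @from into @buf.  Returns the full
 * length of the path, excluding the terminating NUL.  A return value
 * >= @buflen means the buffer was too small and was left untouched.
 */
static size_t rel_path(const struct node *to, const struct node *from,
		       char *buf, size_t buflen)
{
	const struct node *common, *n;
	size_t ups = 0, names = 0, total, i;
	char *p;

	assert(to && from);

	if (to == from) {
		if (buflen > 1)
			memcpy(buf, "/", 2);
		return 1;
	}

	common = common_ancestor(from, to);
	for (n = from; n != common; n = n->parent)
		ups++;				/* one "/.." per level */
	for (n = to; n != common; n = n->parent)
		names += strlen(n->name) + 1;	/* "/" plus the name */

	total = 3 * ups + names;
	if (total >= buflen)
		return total;

	for (i = 0; i < ups; i++)
		memcpy(buf + 3 * i, "/..", 3);

	/* Fill in the name components back to front. */
	p = buf + total;
	*p = '\0';
	for (n = to; n != common; n = n->parent) {
		size_t l = strlen(n->name);

		p -= l;
		memcpy(p, n->name, l);
		*--p = '/';
	}
	return total;
}
```

Built over the tree from the kernel-doc comment (/n1/n2/n3/n4/n5 plus a sibling /n1/n2/n5), the sketch gives "/n4/n5", "/../../n5" and "/../.." for the three documented cases. As with the patch, the caller compares the return value against the buffer size to detect truncation.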
* CGroup Namespaces (v8)
From: serge.hallyn @ 2015-12-23 4:23 UTC
To: linux-kernel
Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm,
    ebiederm, gregkh, lizefan, hannes

Hi,

following is a revised set of the CGroup Namespace patchset which Aditya
Kali has previously sent. The code can also be found in the cgroupns.v8
branch of

https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/

To summarize the semantics:

1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED

2. unsharing a cgroup namespace makes all your current cgroups your new
   cgroup root.

3. /proc/pid/cgroup always shows cgroup paths relative to the reader's
   cgroup namespace root. A task outside of your cgroup looks like

	8:memory:/../../..

4. when a task mounts a cgroupfs, the cgroup which shows up as root depends
   on the mounting task's cgroup namespace.

5. setns to a cgroup namespace switches your cgroup namespace but not
   your cgroups.

With this, using github.com/hallyn/lxc #2015-11-09/cgns (and
github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full
proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts.

This is completely backward compatible and will be completely invisible
to any existing cgroup users (except for those running inside a cgroup
namespace and looking at /proc/pid/cgroup of tasks outside their
namespace.)

Changes from V7:
1. Rework kernfs_path_from_node_locked to return the string length
2. Rename and reorder args to kernfs_path_from_node
3. cgroup.c: undo accidental conversions to inline
4. cgroup.h: move ns declarations to bottom.
5. Rework the documentation to fit the style of the rest of cgroup.txt

Changes from V6:
1. Switch to some WARN_ONs to provide stack traces
2. Rename kernfs_node_distance to kernfs_depth
3. Make sure kernfs_common_ancestor() nodes are from same root
4. Split kernfs changes for cgroup_mount into separate patch
5. Rename kernfs_obtain_root to kernfs_node_dentry
(And more, see patch changelogs)

Changes from V5:
1. To get a root dentry for cgroup namespace mount, walk the path from the
   kernfs root dentry.

Changes from V4:
1. Move the FS_USERNS_MOUNT flag to last patch
2. Rebase onto cgroup/for-4.5
3. Don't allow non-init user namespaces to bind new subsystems when mounting.
4. Address feedback from Tejun (thanks). Specifically, not addressed:
   . kernfs_obtain_root - walking dentry from kernfs root.
   (I think that's the only piece)
5. Dropped unused get_task_cgroup fn/patch.
6. Reworked kernfs_path_from_node_locked() to try to simplify the logic.
   It now finds a common ancestor, walks from the source to it, then back
   up to the target.

Changes from V3:
1. Rebased onto latest cgroup changes. In particular switch to
   css_set_lock and ns_common.
2. Support all hierarchies.

Changes from V2:
1. Added documentation in Documentation/cgroups/namespace.txt
2. Fixed a bug that caused crash
3. Incorporated some other suggestions from last patchset:
   - removed use of threadgroup_lock() while creating new cgroupns
   - use task_lock() instead of rcu_read_lock() while accessing
     task->nsproxy
   - optimized setns() to own cgroupns
   - simplified code around sane-behavior mount option parsing
4. Restored ACKs from Serge Hallyn from v1 on few patches that have
   not changed since then.

Changes from V1:
1. No pinning of processes within cgroupns. Tasks can be freely moved
   across cgroups even outside of their cgroupns-root. Usual DAC/MAC policies
   apply as before.
2. Path in /proc/<pid>/cgroup is now always shown and is relative to
   cgroupns-root. So path can contain '/..' strings depending on cgroupns-root
   of the reader and cgroup of <pid>.
3. setns() does not require the process to first move under target
   cgroupns-root.

Changes from RFC (V0):
1. setns support for cgroupns
2. 'mount -t cgroup cgroup <mntpt>' from inside a cgroupns now
   mounts the cgroup hierarchy with cgroupns-root as the filesystem root.
3. writes to cgroup files outside of cgroupns-root are not allowed
4. visibility of /proc/<pid>/cgroup is further restricted by not showing
   anything if the <pid> is in a sibling cgroupns and its cgroup falls outside
   your cgroupns-root.
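Semantics 1 to 3 can be observed from userspace with something like the sketch below. It is only an illustration under stated assumptions: `dump_self_cgroup()` and `enter_new_cgroupns()` are hypothetical helper names, the fallback definition of CLONE_NEWCGROUP uses the value given in the summary above, and the unshare() call is expected to fail (typically with EPERM or EINVAL) on kernels without this patchset or without sufficient privilege.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000	/* value quoted in the cover letter */
#endif

/* Print every line of /proc/self/cgroup prefixed with @tag.
 * Returns the number of lines printed, or -1 if the file is unavailable. */
static int dump_self_cgroup(const char *tag)
{
	char line[512];
	int n = 0;
	FILE *f;

	assert(tag != NULL);

	f = fopen("/proc/self/cgroup", "r");
	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		printf("%s: %s", tag, line);
		n++;
	}
	fclose(f);
	return n;
}

/* Move the calling process into a new cgroup namespace.
 * Returns 0 on success or a negative errno value. */
static int enter_new_cgroupns(void)
{
	if (unshare(CLONE_NEWCGROUP) < 0)
		return -errno;
	return 0;
}
```

A caller would run dump_self_cgroup("before"), then enter_new_cgroupns(), then dump_self_cgroup("after"). Under the semantics described above, the paths in the second dump should be relative to the new cgroup namespace root, since the caller's current cgroups become that root.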
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path
From: serge.hallyn @ 2015-12-23 4:23 UTC
To: linux-kernel
Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm,
    ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn

From: Aditya Kali <adityakali@google.com>

The new function kernfs_path_from_node() generates and returns the kernfs
path of a given kernfs_node relative to a given parent kernfs_node.

Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
---
Changelog 20151125:
  - Fully-wing multiline comments
  - Rework kernfs_path_from_node_locked() logic
  - Replace BUG_ONs with returning NULL
  - Use a const char* for /.. and precalculate its size
Changelog 20151130:
  - Update kernfs_path_from_node_locked comment
Changelog 20151208:
  - kernfs_node_distance:
    * Remove BUG_ON(NULL)s
    * Rename kernfs_node_distance to kernfs_depth
  - kernfs_common_ancestor:
    * Remove useless checks for depth == 0
    * Add check to ensure nodes are from same root
  - kernfs_path_from_node_locked:
    * Remove needless __must_check
    * Put p;len on its own decl line.
    * Fix wrong WARN_ONCE usage
Changelog 20151209:
  - kernfs_path_from_node: change arguments to 'to' and 'from', and
    change their order.
Changelog 20151222:
  - kernfs_path_from_node{,_locked}: return the string length.
    kernfs_path is gpl-exported, so changing its return value
    seemed ill-advised, but if no one minds I can update it too.
---
 fs/kernfs/dir.c        | 205 ++++++++++++++++++++++++++++++++++++++++--------
 include/linux/kernfs.h |   9 ++-
 2 files changed, 179 insertions(+), 35 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 742bf4a..e82b9a1 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -44,28 +44,123 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
 	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
 }
 
-static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+/* kernfs_node_depth - compute depth from @from to @to */
+static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to)
 {
-	char *p = buf + buflen;
-	int len;
+	size_t depth = 0;
 
-	*--p = '\0';
+	while (to->parent && to != from) {
+		depth++;
+		to = to->parent;
+	}
+	return depth;
+}
 
-	do {
-		len = strlen(kn->name);
-		if (p - buf < len + 1) {
-			buf[0] = '\0';
-			p = NULL;
-			break;
-		}
-		p -= len;
-		memcpy(p, kn->name, len);
-		*--p = '/';
-		kn = kn->parent;
-	} while (kn && kn->parent);
+static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a,
+						  struct kernfs_node *b)
+{
+	size_t da, db;
+	struct kernfs_root *ra = kernfs_root(a), *rb = kernfs_root(b);
+
+	if (ra != rb)
+		return NULL;
+
+	da = kernfs_depth(ra->kn, a);
+	db = kernfs_depth(rb->kn, b);
+
+	while (da > db) {
+		a = a->parent;
+		da--;
+	}
+	while (db > da) {
+		b = b->parent;
+		db--;
+	}
+
+	/* worst case b and a will be the same at root */
+	while (b != a) {
+		b = b->parent;
+		a = a->parent;
+	}
+
+	return a;
+}
+
+/**
+ * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to,
+ * where kn_from is treated as root of the path.
+ * @kn_from: kernfs node which should be treated as root for the path
+ * @kn_to: kernfs node to which path is needed
+ * @buf: buffer to copy the path into
+ * @buflen: size of @buf
+ *
+ * We need to handle couple of scenarios here:
+ * [1] when @kn_from is an ancestor of @kn_to at some level
+ * kn_from: /n1/n2/n3
+ * kn_to:   /n1/n2/n3/n4/n5
+ * result:  /n4/n5
+ *
+ * [2] when @kn_from is on a different hierarchy and we need to find common
+ * ancestor between @kn_from and @kn_to.
+ * kn_from: /n1/n2/n3/n4
+ * kn_to:   /n1/n2/n5
+ * result:  /../../n5
+ * OR
+ * kn_from: /n1/n2/n3/n4/n5   [depth=5]
+ * kn_to:   /n1/n2/n3         [depth=3]
+ * result:  /../..
+ *
+ * return value: length of the string.  If greater than buflen,
+ * then contents of buf are undefined.  On error, -1 is returned.
+ */
+static int
+kernfs_path_from_node_locked(struct kernfs_node *kn_to,
+			     struct kernfs_node *kn_from, char *buf,
+			     size_t buflen)
+{
+	struct kernfs_node *kn, *common;
+	const char parent_str[] = "/..";
+	size_t depth_from, depth_to, len = 0, nlen = 0;
+	char *p;
+	int i;
+
+	if (!kn_from)
+		kn_from = kernfs_root(kn_to)->kn;
+
+	if (kn_from == kn_to)
+		return strlcpy(buf, "/", buflen);
+
+	common = kernfs_common_ancestor(kn_from, kn_to);
+	if (WARN_ON(!common))
+		return -1;
+
+	depth_to = kernfs_depth(common, kn_to);
+	depth_from = kernfs_depth(common, kn_from);
+
+	if (buf)
+		buf[0] = '\0';
+
+	for (i = 0; i < depth_from; i++)
+		len += strlcpy(buf + len, parent_str,
+			       len < buflen ? buflen - len : 0);
+
+	/* Calculate how many bytes we need for the rest */
+	for (kn = kn_to; kn != common; kn = kn->parent)
+		nlen += strlen(kn->name) + 1;
 
-	return p;
+	if (len + nlen >= buflen)
+		return len + nlen;
+
+	p = buf + len + nlen;
+	*p = '\0';
+	for (kn = kn_to; kn != common; kn = kn->parent) {
+		nlen = strlen(kn->name);
+		p -= nlen;
+		memcpy(p, kn->name, nlen);
+		*(--p) = '/';
+	}
+
+	return len + nlen;
 }
 
 /**
@@ -115,6 +210,34 @@ size_t kernfs_path_len(struct kernfs_node *kn)
 }
 
 /**
+ * kernfs_path_from_node - build path of node @to relative to @from.
+ * @from: parent kernfs_node relative to which we need to build the path
+ * @to: kernfs_node of interest
+ * @buf: buffer to copy @to's path into
+ * @buflen: size of @buf
+ *
+ * Builds @to's path relative to @from in @buf. @from and @to must
+ * be on the same kernfs-root. If @from is not parent of @to, then a relative
+ * path (which includes '..'s) as needed to reach from @from to @to is
+ * returned.
+ *
+ * If @buf isn't long enough, the return value will be greater than @buflen
+ * and @buf contents are undefined.
+ */
+int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
+			  char *buf, size_t buflen)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	ret = kernfs_path_from_node_locked(to, from, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kernfs_path_from_node);
+
+/**
  * kernfs_path - build full path of a given node
  * @kn: kernfs_node of interest
  * @buf: buffer to copy @kn's name into
@@ -127,13 +250,12 @@ size_t kernfs_path_len(struct kernfs_node *kn)
  */
 char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
 {
-	unsigned long flags;
-	char *p;
+	int ret;
 
-	spin_lock_irqsave(&kernfs_rename_lock, flags);
-	p = kernfs_path_locked(kn, buf, buflen);
-	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
-	return p;
+	ret = kernfs_path_from_node(kn, NULL, buf, buflen);
+	if (ret < 0 || ret >= buflen)
+		return NULL;
+	return buf;
 }
 EXPORT_SYMBOL_GPL(kernfs_path);
 
@@ -164,18 +286,39 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
 void pr_cont_kernfs_path(struct kernfs_node *kn)
 {
 	unsigned long flags;
-	char *p;
+	char *p = NULL;
+	int sz1, sz2;
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
 
-	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
-			       sizeof(kernfs_pr_cont_buf));
-	if (p)
-		pr_cont("%s", p);
-	else
-		pr_cont("<name too long>");
+	sz1 = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
+					   sizeof(kernfs_pr_cont_buf));
+	if (sz1 < 0) {
+		pr_cont("(error)");
+		goto out;
+	}
+
+	if (sz1 < sizeof(kernfs_pr_cont_buf)) {
+		pr_cont("%s", kernfs_pr_cont_buf);
+		goto out;
+	}
+
+	p = kmalloc(sz1 + 1, GFP_NOFS);
+	if (!p) {
+		pr_cont("(out of memory)");
+		goto out;
+	}
+	sz2 = kernfs_path_from_node_locked(kn, NULL, p, sz1 + 1);
+	if (sz2 > sz1 || sz2 < 0) {
+		pr_cont("(error)");
+		goto out;
+	}
+
+	pr_cont("%s", p);
 
+out:
 	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	kfree(p);
 }
 
 /**
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index af51df3..716bfde 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -267,8 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 
 int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
 size_t kernfs_path_len(struct kernfs_node *kn);
-char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-				size_t buflen);
+int kernfs_path_from_node(struct kernfs_node *root_kn, struct kernfs_node *kn,
+			  char *buf, size_t buflen);
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen);
 void pr_cont_kernfs_name(struct kernfs_node *kn);
 void pr_cont_kernfs_path(struct kernfs_node *kn);
 struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn);
@@ -338,8 +339,8 @@ static inline int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
 static inline size_t kernfs_path_len(struct kernfs_node *kn)
 { return 0; }
 
-static inline char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+static inline char *kernfs_path(struct kernfs_node *kn, char *buf,
+				size_t buflen)
 { return NULL; }
 
 static inline void pr_cont_kernfs_name(struct kernfs_node *kn) { }
-- 
1.7.9.5
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
From: Tejun Heo @ 2015-12-23 16:08 UTC
To: serge.hallyn
Cc: linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel,
    akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn

Hello, Serge.

On Tue, Dec 22, 2015 at 10:23:22PM -0600, serge.hallyn@ubuntu.com wrote:
> @@ -164,18 +286,39 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
>  void pr_cont_kernfs_path(struct kernfs_node *kn)
>  {
>  	unsigned long flags;
> -	char *p;
> +	char *p = NULL;
> +	int sz1, sz2;
>
>  	spin_lock_irqsave(&kernfs_rename_lock, flags);
>
> -	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
> -			       sizeof(kernfs_pr_cont_buf));
> -	if (p)
> -		pr_cont("%s", p);
> -	else
> -		pr_cont("<name too long>");
> +	sz1 = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
> +					   sizeof(kernfs_pr_cont_buf));
> +	if (sz1 < 0) {
> +		pr_cont("(error)");
> +		goto out;
> +	}
> +
> +	if (sz1 < sizeof(kernfs_pr_cont_buf)) {
> +		pr_cont("%s", kernfs_pr_cont_buf);
> +		goto out;
> +	}
> +
> +	p = kmalloc(sz1 + 1, GFP_NOFS);

We can't do GFP_NOFS allocation while holding a spinlock and we don't
want to do atomic allocation here either.  I think it'd be best to
keep using the static buffer.

Thanks.

-- 
tejun
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
From: Serge E. Hallyn @ 2015-12-23 16:36 UTC
To: Tejun Heo
Cc: serge.hallyn, linux-api, containers, hannes, linux-kernel, ebiederm,
    lxc-devel, gregkh, cgroups, akpm

On Wed, Dec 23, 2015 at 11:08:54AM -0500, Tejun Heo wrote:
> Hello, Serge.
>
> On Tue, Dec 22, 2015 at 10:23:22PM -0600, serge.hallyn@ubuntu.com wrote:
> > @@ -164,18 +286,39 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
> >  void pr_cont_kernfs_path(struct kernfs_node *kn)
> >  {
> >  	unsigned long flags;
> > -	char *p;
> > +	char *p = NULL;
> > +	int sz1, sz2;
> >
> >  	spin_lock_irqsave(&kernfs_rename_lock, flags);
> >
> > -	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
> > -			       sizeof(kernfs_pr_cont_buf));
> > -	if (p)
> > -		pr_cont("%s", p);
> > -	else
> > -		pr_cont("<name too long>");
> > +	sz1 = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
> > +					   sizeof(kernfs_pr_cont_buf));
> > +	if (sz1 < 0) {
> > +		pr_cont("(error)");
> > +		goto out;
> > +	}
> > +
> > +	if (sz1 < sizeof(kernfs_pr_cont_buf)) {
> > +		pr_cont("%s", kernfs_pr_cont_buf);
> > +		goto out;
> > +	}
> > +
> > +	p = kmalloc(sz1 + 1, GFP_NOFS);
>
> We can't do GFP_NOFS allocation while holding a spinlock and we don't
> want to do atomic allocation here either.  I think it'd be best to
> keep using the static buffer.

D'oh, right.  Will update.
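The outcome of this exchange (keep the static buffer, never allocate under the lock, and report an over-long path instead of growing the buffer) can be mimicked in userspace roughly as below. This is a sketch under stated assumptions, not kernel code: `pr_buf` and `render_path()` are hypothetical names, the buffer is deliberately tiny, snprintf() stands in for the length-returning path helper, and the locking is only indicated by comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-size scratch buffer, playing the role of a static
 * print buffer.  Kept small so truncation is easy to trigger. */
static char pr_buf[32];

/*
 * Return the text that would be printed for @path.  No memory is
 * allocated: if the path does not fit, a fixed marker is returned.
 */
static const char *render_path(const char *path)
{
	int sz;

	assert(path != NULL);

	/* In the kernel, the spinlock would be taken here. */
	sz = snprintf(pr_buf, sizeof(pr_buf), "%s", path);
	/* ...and released before returning on every branch below. */

	if (sz < 0)
		return "(error)";
	if ((size_t)sz >= sizeof(pr_buf))
		return "(name too long)";	/* would have been truncated */
	return pr_buf;
}
```

The check mirrors the convention used by the length-returning helper in the patch: a result greater than or equal to the buffer size means the contents must not be used.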
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <1450844609-9194-2-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2015-12-23 16:24 ` Tejun Heo 2015-12-23 16:24 ` Tejun Heo 1 sibling, 0 replies; 108+ messages in thread From: Tejun Heo @ 2015-12-23 16:24 UTC (permalink / raw) To: serge.hallyn Cc: linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn On Tue, Dec 22, 2015 at 10:23:22PM -0600, serge.hallyn@ubuntu.com wrote: > From: Aditya Kali <adityakali@google.com> > > The new function kernfs_path_from_node() generates and returns kernfs > path of a given kernfs_node relative to a given parent kernfs_node. > > Signed-off-by: Aditya Kali <adityakali@google.com> > Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Greg, can I route this together with other changes? Thanks. -- tejun ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <20151223162433.GH5003-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org> @ 2015-12-23 16:51 ` Greg KH 0 siblings, 0 replies; 108+ messages in thread From: Greg KH @ 2015-12-23 16:51 UTC (permalink / raw) To: Tejun Heo Cc: serge.hallyn, linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, lizefan, hannes, Serge E. Hallyn On Wed, Dec 23, 2015 at 11:24:33AM -0500, Tejun Heo wrote: > On Tue, Dec 22, 2015 at 10:23:22PM -0600, serge.hallyn@ubuntu.com wrote: > > From: Aditya Kali <adityakali@google.com> > > > > The new function kernfs_path_from_node() generates and returns kernfs > > path of a given kernfs_node relative to a given parent kernfs_node. > > > > Signed-off-by: Aditya Kali <adityakali@google.com> > > Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> > > Greg, can I route this together with other changes? Yes, please do: Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> ^ permalink raw reply [flat|nested] 108+ messages in thread
* CGroup Namespaces (v7) @ 2015-12-09 19:28 serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 2015-12-09 19:28 ` [PATCH 1/8] kernfs: Add API to generate relative kernfs path serge.hallyn [not found] ` <1449689341-28742-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> 0 siblings, 2 replies; 108+ messages in thread From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2015-12-09 19:28 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b Hi, following is a revised set of the CGroup Namespace patchset which Aditya Kali has previously sent. The code can also be found in the cgroupns.v7 branch of https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/ To summarize the semantics: 1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED 2. unsharing a cgroup namespace makes all your current cgroups your new cgroup root. 3. /proc/pid/cgroup always shows cgroup paths relative to the reader's cgroup namespace root. A task outside of your cgroup looks like 8:memory:/../../.. 4. when a task mounts a cgroupfs, the cgroup which shows up as root depends on the mounting task's cgroup namespace. 5. setns to a cgroup namespace switches your cgroup namespace but not your cgroups. With this, using github.com/hallyn/lxc #2015-11-09/cgns (and github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts. 
This is completely backward compatible and will be completely invisible to any existing cgroup users (except for those running inside a cgroup namespace and looking at /proc/pid/cgroup of tasks outside their namespace.) Changes from V6: 1. Switch to some WARN_ONs to provide stack traces 2. Rename kernfs_node_distance to kernfs_depth 3. Make sure kernfs_common_ancestor() nodes are from same root 4. Split kernfs changes for cgroup_mount into separate patch 5. Rename kernfs_obtain_root to kernfs_node_dentry (And more, see patch changelogs) Changes from V5: 1. To get a root dentry for cgroup namespace mount, walk the path from the kernfs root dentry. Changes from V4: 1. Move the FS_USERNS_MOUNT flag to last patch 2. Rebase onto cgroup/for-4.5 3. Don't allow non-init user namespaces to bind new subsystems when mounting. 4. Address feedback from Tejun (thanks). Specifically, not addressed: . kernfs_obtain_root - walking dentry from kernfs root. (I think that's the only piece) 5. Dropped unused get_task_cgroup fn/patch. 6. Reworked kernfs_path_from_node_locked() to try to simplify the logic. It now finds a common ancestor, walks from the source to it, then back up to the target. Changes from V3: 1. Rebased onto latest cgroup changes. In particular switch to css_set_lock and ns_common. 2. Support all hierarchies. Changes from V2: 1. Added documentation in Documentation/cgroups/namespace.txt 2. Fixed a bug that caused crash 3. Incorporated some other suggestions from last patchset: - removed use of threadgroup_lock() while creating new cgroupns - use task_lock() instead of rcu_read_lock() while accessing task->nsproxy - optimized setns() to own cgroupns - simplified code around sane-behavior mount option parsing 4. Restored ACKs from Serge Hallyn from v1 on few patches that have not changed since then. Changes from V1: 1. No pinning of processes within cgroupns. Tasks can be freely moved across cgroups even outside of their cgroupns-root. Usual DAC/MAC policies apply as before. 2. 
Path in /proc/<pid>/cgroup is now always shown and is relative to cgroupns-root. So path can contain '/..' strings depending on cgroupns-root of the reader and cgroup of <pid>. 3. setns() does not require the process to first move under target cgroupns-root. Changes from RFC (V0): 1. setns support for cgroupns 2. 'mount -t cgroup cgroup <mntpt>' from inside a cgroupns now mounts the cgroup hierarchy with cgroupns-root as the filesystem root. 3. writes to cgroup files outside of cgroupns-root are not allowed 4. visibility of /proc/<pid>/cgroup is further restricted by not showing anything if the <pid> is in a sibling cgroupns and its cgroup falls outside your cgroupns-root. ^ permalink raw reply [flat|nested] 108+ messages in thread
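A minimal user-space sketch of semantics 1 and 2 from the cover letter, assuming a Linux system whose kernel carries this series. `unshare_cgroup_ns()` is a hypothetical helper name. The fallback #define uses the flag value quoted in the cover letter, and the call goes through syscall(2) so that no libc support for the new flag is assumed:

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Older headers do not know the flag; the value is the one from the cover letter. */
#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

/*
 * Move the calling task into a new cgroup namespace.
 * Returns 0 on success or a negative errno value (for example -EPERM
 * without sufficient privilege, or -EINVAL on a kernel without cgroupns).
 */
static int unshare_cgroup_ns(void)
{
	/* Sanity check: the series reuses the old CLONE_STOPPED bit. */
	assert(CLONE_NEWCGROUP == 0x02000000);

	if (syscall(SYS_unshare, CLONE_NEWCGROUP) == -1)
		return -errno;
	return 0;
}
```

If the call succeeds, points 2 and 3 imply that the caller's own entries in /proc/self/cgroup should afterwards read as "/", since its current cgroups have become the namespace root. The call will likely need privilege (CAP_SYS_ADMIN) to succeed.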
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path 2015-12-09 19:28 CGroup Namespaces (v7) serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2015-12-09 19:28 ` serge.hallyn 2015-12-09 21:38 ` Tejun Heo [not found] ` <1449689341-28742-2-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> [not found] ` <1449689341-28742-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> 1 sibling, 2 replies; 108+ messages in thread From: serge.hallyn @ 2015-12-09 19:28 UTC (permalink / raw) To: linux-kernel Cc: adityakali, tj, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn From: Aditya Kali <adityakali@google.com> The new function kernfs_path_from_node() generates and returns kernfs path of a given kernfs_node relative to a given parent kernfs_node. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> --- Changelog 20151125: - Fully-wing multilinecomments - Rework kernfs_path_from_node_locked() logic - Replace BUG_ONs with returning NULL - Use a const char* for /.. and precalculate its size Changelog 20151130: - Update kernfs_path_from_node_locked comment Changelog 20151208: - kernfs_node_distance: * Remove BUG_ON(NULL)s * Rename kernfs_node_distance to kernfs_depth - kernfs_common-ancestor: * Remove useless checks for depth == 0 * Add check to ensure nodes are from same root - kernfs_path_from_node_locked: * Remove needless __must_check * Put p;len on its own decl line. * Fix wrong WARN_ONCE usage --- fs/kernfs/dir.c | 177 ++++++++++++++++++++++++++++++++++++++++-------- include/linux/kernfs.h | 3 + 2 files changed, 153 insertions(+), 27 deletions(-) diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 91e0045..d1a001a 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -44,28 +44,129 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen) return strlcpy(buf, kn->parent ? 
kn->name : "/", buflen); } -static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf, - size_t buflen) +/* kernfs_node_depth - compute depth from @from to @to */ +static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to) { - char *p = buf + buflen; - int len; + size_t depth = 0; - *--p = '\0'; + while (to->parent && to != from) { + depth++; + to = to->parent; + } + return depth; +} - do { - len = strlen(kn->name); - if (p - buf < len + 1) { - buf[0] = '\0'; - p = NULL; - break; - } - p -= len; - memcpy(p, kn->name, len); - *--p = '/'; - kn = kn->parent; - } while (kn && kn->parent); +static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a, + struct kernfs_node *b) +{ + size_t da, db; + struct kernfs_root *ra = kernfs_root(a), *rb = kernfs_root(b); - return p; + if (ra != rb) + return NULL; + + da = kernfs_depth(ra->kn, a); + db = kernfs_depth(rb->kn, b); + + while (da > db) { + a = a->parent; + da--; + } + while (db > da) { + b = b->parent; + db--; + } + + /* worst case b and a will be the same at root */ + while (b != a) { + b = b->parent; + a = a->parent; + } + + return a; +} + +/** + * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to, + * where kn_from is treated as root of the path. + * @kn_from: kernfs node which should be treated as root for the path + * @kn_to: kernfs node to which path is needed + * @buf: buffer to copy the path into + * @buflen: size of @buf + * + * We need to handle couple of scenarios here: + * [1] when @kn_from is an ancestor of @kn_to at some level + * kn_from: /n1/n2/n3 + * kn_to: /n1/n2/n3/n4/n5 + * result: /n4/n5 + * + * [2] when @kn_from is on a different hierarchy and we need to find common + * ancestor between @kn_from and @kn_to. + * kn_from: /n1/n2/n3/n4 + * kn_to: /n1/n2/n5 + * result: /../../n5 + * OR + * kn_from: /n1/n2/n3/n4/n5 [depth=5] + * kn_to: /n1/n2/n3 [depth=3] + * result: /../.. 
+ */ +static char * +kernfs_path_from_node_locked(struct kernfs_node *kn_to, + struct kernfs_node *kn_from, char *buf, + size_t buflen) +{ + char *p = buf; + struct kernfs_node *kn, *common; + const char parent_str[] = "/.."; + int i; + size_t depth_from, depth_to, len = 0, nlen = 0; + size_t plen = sizeof(parent_str) - 1; + + /* We atleast need 2 bytes to write "/\0". */ + if (buflen < 2) + return NULL; + + if (!kn_from) + kn_from = kernfs_root(kn_to)->kn; + + if (kn_from == kn_to) { + *p = '/'; + *(++p) = '\0'; + return buf; + } + + common = kernfs_common_ancestor(kn_from, kn_to); + if (WARN_ON(!common)) + return NULL; + + depth_to = kernfs_depth(common, kn_to); + depth_from = kernfs_depth(common, kn_from); + + for (i = 0; i < depth_from; i++) { + if (len + plen + 1 > buflen) + return NULL; + strcpy(p, parent_str); + p += plen; + len += plen; + } + + /* Calculate how many bytes we need for the rest */ + for (kn = kn_to; kn != common; kn = kn->parent) + nlen += strlen(kn->name) + 1; + + if (len + nlen + 1 > buflen) + return NULL; + + p += nlen; + *p = '\0'; + for (kn = kn_to; kn != common; kn = kn->parent) { + nlen = strlen(kn->name); + p -= nlen; + memcpy(p, kn->name, nlen); + *(--p) = '/'; + } + + return buf; } /** @@ -115,26 +216,48 @@ size_t kernfs_path_len(struct kernfs_node *kn) } /** - * kernfs_path - build full path of a given node + * kernfs_path_from_node - build path of node @kn relative to @kn_root. + * @kn_root: parent kernfs_node relative to which we need to build the path * @kn: kernfs_node of interest - * @buf: buffer to copy @kn's name into + * @buf: buffer to copy @kn's path into * @buflen: size of @buf * - * Builds and returns the full path of @kn in @buf of @buflen bytes. The - * path is built from the end of @buf so the returned pointer usually - * doesn't match @buf. If @buf isn't long enough, @buf is nul terminated + * Builds and returns @kn's path relative to @kn_root. @kn_root and @kn must + * be on the same kernfs-root. 
If @kn_root is not parent of @kn, then a relative + * path (which includes '..'s) as needed to reach from @kn_root to @kn is + * returned. + * The path may be built from the end of @buf so the returned pointer may not + * match @buf. If @buf isn't long enough, @buf is nul terminated * and %NULL is returned. */ -char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) +char *kernfs_path_from_node(struct kernfs_node *kn_root, struct kernfs_node *kn, + char *buf, size_t buflen) { unsigned long flags; char *p; spin_lock_irqsave(&kernfs_rename_lock, flags); - p = kernfs_path_locked(kn, buf, buflen); + p = kernfs_path_from_node_locked(kn, kn_root, buf, buflen); spin_unlock_irqrestore(&kernfs_rename_lock, flags); return p; } +EXPORT_SYMBOL_GPL(kernfs_path_from_node); + +/** + * kernfs_path - build full path of a given node + * @kn: kernfs_node of interest + * @buf: buffer to copy @kn's name into + * @buflen: size of @buf + * + * Builds and returns the full path of @kn in @buf of @buflen bytes. The + * path is built from the end of @buf so the returned pointer usually + * doesn't match @buf. If @buf isn't long enough, @buf is nul terminated + * and %NULL is returned. 
+ */ +char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) +{ + return kernfs_path_from_node(NULL, kn, buf, buflen); +} EXPORT_SYMBOL_GPL(kernfs_path); /** @@ -168,8 +291,8 @@ void pr_cont_kernfs_path(struct kernfs_node *kn) spin_lock_irqsave(&kernfs_rename_lock, flags); - p = kernfs_path_locked(kn, kernfs_pr_cont_buf, - sizeof(kernfs_pr_cont_buf)); + p = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf, + sizeof(kernfs_pr_cont_buf)); if (p) pr_cont("%s", p); else diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 5d4e9c4..d025ebd 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn) int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen); size_t kernfs_path_len(struct kernfs_node *kn); +char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn, + struct kernfs_node *kn, char *buf, + size_t buflen); char * __must_check kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen); void pr_cont_kernfs_name(struct kernfs_node *kn); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 108+ messages in thread
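The relative-path logic in this patch (find the common ancestor, emit one "/.." per level from the source up to it, then append the names leading down to the target) can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernfs code: `struct node`, `node_depth()`, `common_ancestor()` and `rel_path()` are hypothetical stand-ins, both nodes are assumed to be in the same tree, there is no locking, and the function returns 0 or -1 rather than a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kernfs_node: a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;	/* NULL for the root */
};

/* Number of parent hops from @n up to the root. */
static size_t node_depth(const struct node *n)
{
	size_t d = 0;

	while (n->parent) {
		d++;
		n = n->parent;
	}
	return d;
}

/* Deepest node that is an ancestor of, or equal to, both @a and @b. */
static const struct node *common_ancestor(const struct node *a,
					   const struct node *b)
{
	size_t da = node_depth(a), db = node_depth(b);

	/* Bring the deeper node up to the same level, then climb together. */
	for (; da > db; da--)
		a = a->parent;
	for (; db > da; db--)
		b = b->parent;
	while (a != b) {
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/*
 * Write the path of @to, as seen with @from treated as root, into @buf.
 * Returns 0 on success, -1 if @buf is too small.
 */
static int rel_path(const struct node *from, const struct node *to,
		    char *buf, size_t buflen)
{
	const struct node *common, *n;
	size_t ups = 0, nlen = 0;
	char *p = buf;

	assert(from && to && buf);
	if (buflen < 2)
		return -1;
	if (from == to) {
		strcpy(buf, "/");
		return 0;
	}

	common = common_ancestor(from, to);
	for (n = from; n != common; n = n->parent)
		ups++;				/* one "/.." per level */
	for (n = to; n != common; n = n->parent)
		nlen += strlen(n->name) + 1;	/* "/" plus the name */

	if (ups * 3 + nlen + 1 > buflen)
		return -1;

	for (; ups; ups--) {
		memcpy(p, "/..", 3);
		p += 3;
	}

	/* Names are filled in backwards, from @to up to the common ancestor. */
	p += nlen;
	*p = '\0';
	for (n = to; n != common; n = n->parent) {
		size_t l = strlen(n->name);

		p -= l;
		memcpy(p, n->name, l);
		*--p = '/';
	}
	return 0;
}
```

With the trees from the kernel-doc comment, this sketch yields "/n4/n5" for from=/n1/n2/n3 and to=/n1/n2/n3/n4/n5, "/../../n5" for from=/n1/n2/n3/n4 and to=/n1/n2/n5, and "/../.." for from=/n1/n2/n3/n4/n5 and to=/n1/n2/n3.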
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <1449689341-28742-2-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2015-12-09 21:38 ` Tejun Heo 0 siblings, 0 replies; 108+ messages in thread From: Tejun Heo @ 2015-12-09 21:38 UTC (permalink / raw) To: serge.hallyn Cc: linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn Hello, Serge. On Wed, Dec 09, 2015 at 01:28:54PM -0600, serge.hallyn@ubuntu.com wrote: > +/* kernfs_node_depth - compute depth from @from to @to */ > +static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to) ... > +char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) > +{ > + return kernfs_path_from_node(NULL, kn, buf, buflen); > +} ... > diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h > index 5d4e9c4..d025ebd 100644 > --- a/include/linux/kernfs.h > +++ b/include/linux/kernfs.h > @@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn) > > int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen); > size_t kernfs_path_len(struct kernfs_node *kn); > +char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn, > + struct kernfs_node *kn, char *buf, > + size_t buflen); I think I commented on the same thing before, but I think it'd make more sense to put @from after @to and the prototype is using @root_kn which is a bit confusing. Was converting the path functions to return length too much work? If so, that's fine but please explain what decisions were made. I skimmed through the series and spotted several other review points which didn't get addressed. Can you please go over the previous review cycle and address the review points? Thanks. -- tejun ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <20151209213806.GP30240-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org> @ 2015-12-09 22:13 ` Serge Hallyn 0 siblings, 0 replies; 108+ messages in thread From: Serge Hallyn @ 2015-12-09 22:13 UTC (permalink / raw) To: Tejun Heo Cc: linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn Quoting Tejun Heo (tj@kernel.org): > Hello, Serge. > > On Wed, Dec 09, 2015 at 01:28:54PM -0600, serge.hallyn@ubuntu.com wrote: > > +/* kernfs_node_depth - compute depth from @from to @to */ > > +static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to) > ... > > +char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) > > +{ > > + return kernfs_path_from_node(NULL, kn, buf, buflen); > > +} > ... > > diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h > > index 5d4e9c4..d025ebd 100644 > > --- a/include/linux/kernfs.h > > +++ b/include/linux/kernfs.h > > @@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn) > > > > int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen); > > size_t kernfs_path_len(struct kernfs_node *kn); > > +char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn, > > + struct kernfs_node *kn, char *buf, > > + size_t buflen); > > I think I commented on the same thing before, but I think it'd make > more sense to put @from after @to Oh. You said that for kernfs_path_from_node_locked(), and those were changed. kernfs_path_from_node() is a different fn, but > and the prototype is using @root_kn > which is a bit confusing. we can rename kn_root to from here if you think that's clearer (and change the order here as well). > Was converting the path functions to return > length too much work? If so, that's fine but please explain what > decisions were made. 
Yes, I had replied saying: |I can change that, but the callers right now don't re-try with |larger buffer anyway, so this would actually complicate them just |a smidgeon. Would you want them changed to do that? (pr_cont_kernfs_path |right now writes into a static char[] for instance) I can still make that change if you like. > I skimmed through the series and spotted several other review points > which didn't get addressed. Can you please go over the previous > review cycle and address the review points? I did go through every email twice, once while making changes (one branch per response) and once while making changelog for each patch, sorry about whatever I missed. I'll go through each again. I'm going to be out for a while after today, so next version will unfortunately take a while. thanks, -serge ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path 2015-12-09 22:13 ` Serge Hallyn @ 2015-12-09 22:36 ` Tejun Heo -1 siblings, 0 replies; 108+ messages in thread From: Tejun Heo @ 2015-12-09 22:36 UTC (permalink / raw) To: Serge Hallyn Cc: linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm, gregkh, lizefan, hannes, Serge E. Hallyn Hey, On Wed, Dec 09, 2015 at 10:13:27PM +0000, Serge Hallyn wrote: > we can rename kn_root to from here if you think that's clearer (and > change the order here as well). I think it'd be better for them to be consistent and in the same order - the target and then the optional base. > > Was converting the path functions to return > > length too much work? If so, that's fine but please explain what > > decisions were made. > > Yes, I had replied saying: > > |I can change that, but the callers right now don't re-try with > |larger buffer anyway, so this would actually complicate them just > |a smidgeon. Would you want them changed to do that? (pr_cont_kernfs_path > |right now writes into a static char[] for instance) > > I can still make that change if you like. Oops, sorry I forgot about that. The reason why kernfs_path() is written the current way was me being lazy. While I think it'd be better to make the functions behave like normal string handling functions if we're extending it, I don't think it's that important. If it's easy, please go ahead. If not, we can get back to it later when necessary. > > I skimmed through the series and spotted several other review points > > which didn't get addressed. Can you please go over the previous > > review cycle and address the review points? > > I did go through every email twice, once while making changes (one > branch per response) and once while making changelog for each patch, > sorry about whatever I missed. I'll go through each again. 
The other chunk I noticed was inline conversions of internal functions which didn't seem to belong to the patch. I asked whether those were stray chunks. Maybe the comment was too buried to notice? Anyways, that part actually causes conflicts when applying to cgroup/for-4.5. There are a couple more things. * Can you please put the ns related decls after the regular cgroup stuff in cgroup.h? * I think I might need to edit the documentation anyway but it'd be great if you can make the namespace section more in line with the rest of the documentation - e.g. s/CGroup/cgroup/ and more structured sectioning. At this point, it all generally looks good to me. Let's get the nits out of the way and merge it. Thanks. -- tejun ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path 2015-12-09 22:36 ` Tejun Heo (?) @ 2015-12-09 22:51 ` Serge E. Hallyn -1 siblings, 0 replies; 108+ messages in thread From: Serge E. Hallyn @ 2015-12-09 22:51 UTC (permalink / raw) To: Tejun Heo Cc: Serge Hallyn, linux-api, containers, hannes, linux-kernel, ebiederm, lxc-devel, gregkh, cgroups, akpm On Wed, Dec 09, 2015 at 05:36:51PM -0500, Tejun Heo wrote: > Hey, > > On Wed, Dec 09, 2015 at 10:13:27PM +0000, Serge Hallyn wrote: > > we can rename kn_root to from here if you think that's clearer (and > > change the order here as well). > > I think it'd be better for them to be consistent and in the same order > - the target and then the optional base. > > > > Was converting the path functions to return > > > length too much work? If so, that's fine but please explain what > > > decisions were made. > > > > Yes, I had replied saying: > > > > |I can change that, but the callers right now don't re-try with > > |larger buffer anyway, so this would actually complicate them just > > |a smidgeon. Would you want them changed to do that? (pr_cont_kernfs_path > > |right now writes into a static char[] for instance) > > > > I can still make that change if you like. > > Oops, sorry I forgot about that. The reason why kernfs_path() is > written the current way was me being lazy. While I think it'd be > better to make the functions behave like normal string handling > functions if we're extending it, I don't think it's that important. > If it's easy, please go ahead. If not, we can get back to it later > when necessary. Ok - I'm now gone until Dec 21 (and laptopping won't be an option :( ). I'll make the other changes then and do this as well. So pr_cont_kernfs_path() will dynamically allocate a longer buffer (only) if needed. > > > I skimmed through the series and spotted several other review points > > > which didn't get addressed. 
Can you please go over the previous > > > review cycle and address the review points? > > > > I did go through every email twice, once while making changes (one > > branch per response) and once while making changelog for each patch, > > sorry about whatever I missed. I'll go through each again. > > The other chunk I noticed was inline conversions of internal functions > which didn't seem to belong to the patch. I asked whether those were > stray chunks. Maybe the comment was too buried to notice? Anyways, > that part actually causes conflicts when applying to cgroup/for-4.5. Gah. I saw one and removed it. Grep tells me I missed some, will remove them all next time. > There are a couple more things. > > * Can you please put the ns related decls after the regular cgroup > stuff in cgroup.h? ok > * I think I might need to edit the documentation anyway but it'd be > great if you can make the namespace section more in line with the > rest of the documentation - e.g. s/CGroup/cgroup/ and more > structured sectioning. I'll read through it and look for patterns to change. > At this point, it all generally looks good to me. Let's get the > nits out of the way and merge it. > > Thanks. thanks, -serge ^ permalink raw reply [flat|nested] 108+ messages in thread
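The plan above (have the path helpers report the length they need, and let a caller such as pr_cont_kernfs_path() switch to a larger buffer only when its static one is too small) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: make_path() and path_for_print() are hypothetical names, the 16-byte static buffer is an arbitrary size, and snprintf()/malloc() stand in for whatever the kernel side would actually use.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical path builder with snprintf()-like semantics: it writes as
 * much of "/name0/name1/..." as fits into @buf (always NUL-terminated) and
 * returns the length the full path needs, excluding the trailing NUL.
 */
static size_t make_path(const char *const *names, size_t n,
			char *buf, size_t buflen)
{
	size_t len = 0, i;

	assert(buf && buflen > 0);
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		/* Only write while there is room left; keep counting regardless. */
		if (len < buflen)
			snprintf(buf + len, buflen - len, "/%s", names[i]);
		len += 1 + strlen(names[i]);
	}
	return len;
}

/*
 * Hypothetical caller: try a small static buffer first and allocate a
 * bigger one only if the returned length says the result was truncated.
 * If *@to_free is non-NULL on return, the caller must free() it.
 */
static const char *path_for_print(const char *const *names, size_t n,
				  char **to_free)
{
	static char small[16];	/* arbitrary size, chosen for the example */
	size_t need = make_path(names, n, small, sizeof(small));
	char *big;

	*to_free = NULL;
	if (need < sizeof(small))
		return small;		/* common case: no allocation */

	big = malloc(need + 1);
	if (!big)
		return small;		/* fall back to the truncated path */

	make_path(names, n, big, need + 1);
	*to_free = big;
	return big;
}
```

A caller would print the returned string and then free(to_free) (a no-op when the static buffer was enough). The static buffer makes this sketch non-reentrant, which is the same trade-off a static char[] implies in general.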
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path 2015-12-09 22:36 ` Tejun Heo @ 2015-12-10 1:28 ` Serge E. Hallyn -1 siblings, 0 replies; 108+ messages in thread From: Serge E. Hallyn @ 2015-12-10 1:28 UTC (permalink / raw) To: Tejun Heo Cc: gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Serge Hallyn, linux-kernel-u79uwXL29TY76Z2rM5mHXA, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b On Wed, Dec 09, 2015 at 05:36:51PM -0500, Tejun Heo wrote: > Hey, > > On Wed, Dec 09, 2015 at 10:13:27PM +0000, Serge Hallyn wrote: > > we can rename kn_root to from here if you think that's clearer (and > > change the order here as well). > > I think it'd be better for them to be consistent and in the same order > - the target and then the optional base. > > > > Was converting the path functions to return > > > length too much work? If so, that's fine but please explain what > > > decisions were made. > > > > Yes, I had replied saying: > > > > |I can change that, but the callers right now don't re-try with > > |larger buffer anyway, so this would actually complicate them just > > |a smidgeon. Would you want them changed to do that? (pr_cont_kernfs_path > > |right now writes into a static char[] for instance) > > > > I can still make that change if you like. > > Oops, sorry I forgot about that. The reason why kernfs_path() is > written the current way was me being lazy. While I think it'd be > better to make the functions behave like normal string handling > functions if we're extending it, I don't think it's that important. > If it's easy, please go ahead. If not, we can get back to it later > when necessary. > > > > I skimmed through the series and spotted several other review points > > > which didn't get addressed. 
Can you please go over the previous > > > review cycle and address the review points? > > > > I did go through every email twice, once while making changes (one > > branch per response) and once while making changelog for each patch, > > sorry about whatever I missed. I'll go through each again. > > The other chunk I noticed was inline conversions of internal functions > which didn't seem to belong to the patch. I asked whether those were > stray chunks. Maybe the comment was too buried to notice? Anyways, > that part actually causes conflicts when applying to cgroup/for-4.5. > > There are a couple more things. > > * Can you please put the ns related decls after the regular cgroup > stuff in cgroup.h? > > * I think I might need to edit the documentation anyway but it'd be > great if you can make the namespace section more in line with the > rest of the documentation - e.g. s/CGroup/cgroup/ and more > structured sectioning. Ok fwiw I've fixed up the arguments to kernfs_path_from_node, removed the inlines, and moved the ns related decls after the others in cgroup.h (i.e. done the easy stuff) in the 2015-12-09/cgroupns.3 branch of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security.git I'll address the rest either after next week or, hopefully, when I get a chance earlier. > At this point, it all generally looks good to me. Let's get the > nits out of the way and merge it. If you wanted to take the branch as is, then I'll do the documentation and pr_cont_kernfs_path() etc rewrite as separate patches, but I'll assume you'd like to at least wait for doc rewrite. -serge ^ permalink raw reply [flat|nested] 108+ messages in thread
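The argument order discussed above (the target node first, then the optional base) and the relative paths this patch series produces can be illustrated with a small userspace sketch. Everything in it is an assumption made for illustration: struct node, node_depth(), common_ancestor(), append() and path_from_node() are hypothetical stand-ins rather than the kernfs functions, and the sketch returns the needed length instead of a char pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a kernfs node: a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;	/* NULL for the root */
};

/* Number of parent links between @n and the root of its tree. */
static size_t node_depth(const struct node *n)
{
	size_t depth = 0;

	while (n->parent) {
		depth++;
		n = n->parent;
	}
	return depth;
}

/* Deepest node that is @a, @b, or an ancestor of both; NULL if the trees differ. */
static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	size_t da = node_depth(a), db = node_depth(b);

	/* Bring both nodes to the same depth, then climb in lockstep. */
	while (da > db) {
		a = a->parent;
		da--;
	}
	while (db > da) {
		b = b->parent;
		db--;
	}
	while (a != b) {
		if (!a->parent)		/* two distinct roots */
			return NULL;
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/* Append @s at offset @len if there is room; return strlen(@s) either way. */
static size_t append(char *buf, size_t buflen, size_t len, const char *s)
{
	if (len < buflen)
		snprintf(buf + len, buflen - len, "%s", s);
	return strlen(s);
}

/*
 * Write the path to @to as seen from @from (NULL means the root of @to's
 * tree). Returns the full length needed, excluding the NUL, or -1 if the
 * nodes are in different trees. A return >= @buflen means truncation.
 */
static int path_from_node(const struct node *to, const struct node *from,
			  char *buf, size_t buflen)
{
	const struct node *common, *kn;
	size_t len = 0, i, up;

	assert(to && buf && buflen > 0);
	buf[0] = '\0';

	if (!from)
		for (from = to; from->parent; from = from->parent)
			;

	common = common_ancestor(from, to);
	if (!common)
		return -1;

	/* One "/.." for every step from @from up to the common ancestor. */
	for (kn = from; kn != common; kn = kn->parent)
		len += append(buf, buflen, len, "/..");

	/* Then the names from just below the common ancestor down to @to. */
	for (i = node_depth(to) - node_depth(common); i > 0; i--) {
		for (kn = to, up = i - 1; up > 0; up--)
			kn = kn->parent;
		len += append(buf, buflen, len, "/");
		len += append(buf, buflen, len, kn->name);
	}

	if (len == 0)			/* @to and @from are the same node */
		len += append(buf, buflen, len, "/");
	return (int)len;
}
```

With a chain root/n1/n2/n3/n4/n5, calling path_from_node() with n5 as the target and n3 as the base gives "/n4/n5", while swapping the two gives "/../..". Walking down by re-climbing from the target on every iteration is quadratic in the depth, which seems acceptable for the shallow trees assumed here and avoids recursion or a scratch array.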
[parent not found: <1449689341-28742-2-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <1449689341-28742-2-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2015-12-09 21:38 ` Tejun Heo 0 siblings, 0 replies; 108+ messages in thread From: Tejun Heo @ 2015-12-09 21:38 UTC (permalink / raw) To: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, linux-kernel-u79uwXL29TY76Z2rM5mHXA, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b Hello, Serge. On Wed, Dec 09, 2015 at 01:28:54PM -0600, serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org wrote: > +/* kernfs_node_depth - compute depth from @from to @to */ > +static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to) ... > +char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) > +{ > + return kernfs_path_from_node(NULL, kn, buf, buflen); > +} ... > diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h > index 5d4e9c4..d025ebd 100644 > --- a/include/linux/kernfs.h > +++ b/include/linux/kernfs.h > @@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn) > > int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen); > size_t kernfs_path_len(struct kernfs_node *kn); > +char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn, > + struct kernfs_node *kn, char *buf, > + size_t buflen); I think I commented on the same thing before, but I think it'd make more sense to put @from after @to and the prototype is using @root_kn which is a bit confusing. Was converting the path functions to return length too much work? If so, that's fine but please explain what decisions were made. I skimmed through the series and spotted several other review points which didn't get addressed. 
Can you please go over the previous review cycle and address the review points? Thanks. -- tejun ^ permalink raw reply [flat|nested] 108+ messages in thread
[parent not found: <1449689341-28742-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>]
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <1449689341-28742-1-git-send-email-serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> @ 2015-12-09 19:28 ` serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA 0 siblings, 0 replies; 108+ messages in thread From: serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA @ 2015-12-09 19:28 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA Cc: linux-api-u79uwXL29TY76Z2rM5mHXA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, hannes-druUgvl0LCNAfugRpC6u6w, ebiederm-aS9lmoZGLiVWk0Htik3J/w, lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b From: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> The new function kernfs_path_from_node() generates and returns the kernfs path of a given kernfs_node relative to a given parent kernfs_node. Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Signed-off-by: Serge E. Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- Changelog 20151125: - Fully-wing multiline comments - Rework kernfs_path_from_node_locked() logic - Replace BUG_ONs with returning NULL - Use a const char* for /.. and precalculate its size Changelog 20151130: - Update kernfs_path_from_node_locked comment Changelog 20151208: - kernfs_node_distance: * Remove BUG_ON(NULL)s * Rename kernfs_node_distance to kernfs_depth - kernfs_common_ancestor: * Remove useless checks for depth == 0 * Add check to ensure nodes are from same root - kernfs_path_from_node_locked: * Remove needless __must_check * Put plen on its own decl line. 
* Fix wrong WARN_ONCE usage --- fs/kernfs/dir.c | 177 ++++++++++++++++++++++++++++++++++++++++-------- include/linux/kernfs.h | 3 + 2 files changed, 153 insertions(+), 27 deletions(-) diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 91e0045..d1a001a 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -44,28 +44,129 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen) return strlcpy(buf, kn->parent ? kn->name : "/", buflen); } -static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf, - size_t buflen) +/* kernfs_node_depth - compute depth from @from to @to */ +static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to) { - char *p = buf + buflen; - int len; + size_t depth = 0; - *--p = '\0'; + while (to->parent && to != from) { + depth++; + to = to->parent; + } + return depth; +} - do { - len = strlen(kn->name); - if (p - buf < len + 1) { - buf[0] = '\0'; - p = NULL; - break; - } - p -= len; - memcpy(p, kn->name, len); - *--p = '/'; - kn = kn->parent; - } while (kn && kn->parent); +static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a, + struct kernfs_node *b) +{ + size_t da, db; + struct kernfs_root *ra = kernfs_root(a), *rb = kernfs_root(b); - return p; + if (ra != rb) + return NULL; + + da = kernfs_depth(ra->kn, a); + db = kernfs_depth(rb->kn, b); + + while (da > db) { + a = a->parent; + da--; + } + while (db > da) { + b = b->parent; + db--; + } + + /* worst case b and a will be the same at root */ + while (b != a) { + b = b->parent; + a = a->parent; + } + + return a; +} + +/** + * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to, + * where kn_from is treated as root of the path. 
+ * @kn_from: kernfs node which should be treated as root for the path + * @kn_to: kernfs node to which path is needed + * @buf: buffer to copy the path into + * @buflen: size of @buf + * + * We need to handle couple of scenarios here: + * [1] when @kn_from is an ancestor of @kn_to at some level + * kn_from: /n1/n2/n3 + * kn_to: /n1/n2/n3/n4/n5 + * result: /n4/n5 + * + * [2] when @kn_from is on a different hierarchy and we need to find common + * ancestor between @kn_from and @kn_to. + * kn_from: /n1/n2/n3/n4 + * kn_to: /n1/n2/n5 + * result: /../../n5 + * OR + * kn_from: /n1/n2/n3/n4/n5 [depth=5] + * kn_to: /n1/n2/n3 [depth=3] + * result: /../.. + */ +static char * +kernfs_path_from_node_locked(struct kernfs_node *kn_to, + struct kernfs_node *kn_from, char *buf, + size_t buflen) +{ + char *p = buf; + struct kernfs_node *kn, *common; + const char parent_str[] = "/.."; + int i; + size_t depth_from, depth_to, len = 0, nlen = 0; + size_t plen = sizeof(parent_str) - 1; + + /* We atleast need 2 bytes to write "/\0". 
*/ + if (buflen < 2) + return NULL; + + if (!kn_from) + kn_from = kernfs_root(kn_to)->kn; + + if (kn_from == kn_to) { + *p = '/'; + *(++p) = '\0'; + return buf; + } + + common = kernfs_common_ancestor(kn_from, kn_to); + if (WARN_ON(!common)) + return NULL; + + depth_to = kernfs_depth(common, kn_to); + depth_from = kernfs_depth(common, kn_from); + + for (i = 0; i < depth_from; i++) { + if (len + plen + 1 > buflen) + return NULL; + strcpy(p, parent_str); + p += plen; + len += plen; + } + + /* Calculate how many bytes we need for the rest */ + for (kn = kn_to; kn != common; kn = kn->parent) + nlen += strlen(kn->name) + 1; + + if (len + nlen + 1 > buflen) + return NULL; + + p += nlen; + *p = '\0'; + for (kn = kn_to; kn != common; kn = kn->parent) { + nlen = strlen(kn->name); + p -= nlen; + memcpy(p, kn->name, nlen); + *(--p) = '/'; + } + + return buf; } /** @@ -115,26 +216,48 @@ size_t kernfs_path_len(struct kernfs_node *kn) } /** - * kernfs_path - build full path of a given node + * kernfs_path_from_node - build path of node @kn relative to @kn_root. + * @kn_root: parent kernfs_node relative to which we need to build the path * @kn: kernfs_node of interest - * @buf: buffer to copy @kn's name into + * @buf: buffer to copy @kn's path into * @buflen: size of @buf * - * Builds and returns the full path of @kn in @buf of @buflen bytes. The - * path is built from the end of @buf so the returned pointer usually - * doesn't match @buf. If @buf isn't long enough, @buf is nul terminated + * Builds and returns @kn's path relative to @kn_root. @kn_root and @kn must + * be on the same kernfs-root. If @kn_root is not parent of @kn, then a relative + * path (which includes '..'s) as needed to reach from @kn_root to @kn is + * returned. + * The path may be built from the end of @buf so the returned pointer may not + * match @buf. If @buf isn't long enough, @buf is nul terminated * and %NULL is returned. 
*/ -char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) +char *kernfs_path_from_node(struct kernfs_node *kn_root, struct kernfs_node *kn, + char *buf, size_t buflen) { unsigned long flags; char *p; spin_lock_irqsave(&kernfs_rename_lock, flags); - p = kernfs_path_locked(kn, buf, buflen); + p = kernfs_path_from_node_locked(kn, kn_root, buf, buflen); spin_unlock_irqrestore(&kernfs_rename_lock, flags); return p; } +EXPORT_SYMBOL_GPL(kernfs_path_from_node); + +/** + * kernfs_path - build full path of a given node + * @kn: kernfs_node of interest + * @buf: buffer to copy @kn's name into + * @buflen: size of @buf + * + * Builds and returns the full path of @kn in @buf of @buflen bytes. The + * path is built from the end of @buf so the returned pointer usually + * doesn't match @buf. If @buf isn't long enough, @buf is nul terminated + * and %NULL is returned. + */ +char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen) +{ + return kernfs_path_from_node(NULL, kn, buf, buflen); +} EXPORT_SYMBOL_GPL(kernfs_path); /** @@ -168,8 +291,8 @@ void pr_cont_kernfs_path(struct kernfs_node *kn) spin_lock_irqsave(&kernfs_rename_lock, flags); - p = kernfs_path_locked(kn, kernfs_pr_cont_buf, - sizeof(kernfs_pr_cont_buf)); + p = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf, + sizeof(kernfs_pr_cont_buf)); if (p) pr_cont("%s", p); else diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 5d4e9c4..d025ebd 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn) int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen); size_t kernfs_path_len(struct kernfs_node *kn); +char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn, + struct kernfs_node *kn, char *buf, + size_t buflen); char * __must_check kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen); void pr_cont_kernfs_name(struct kernfs_node *kn); -- 1.7.9.5 ^ 
permalink raw reply related [flat|nested] 108+ messages in thread
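The common-ancestor approach in the patch above can be tried outside the kernel. What follows is only a userspace sketch under stated assumptions: `struct node`, `node_depth()`, `common_ancestor()` and `rel_path()` are hypothetical stand-ins for `struct kernfs_node` and the patch's helpers, there is no locking, and both nodes are assumed to live in the same tree. It emits one "/.." per hop from the reference node up to the common ancestor, then the "/name" components down to the target, which reproduces the three results listed in the kernel-doc comment.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kernfs_node: a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;	/* NULL for the root */
};

/* Number of parent hops from @n up to the root. */
static size_t node_depth(const struct node *n)
{
	size_t depth = 0;

	while (n->parent) {
		depth++;
		n = n->parent;
	}
	return depth;
}

/* Deepest node that is @a, @b, or an ancestor of both (same tree assumed). */
static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	size_t da = node_depth(a), db = node_depth(b);

	while (da > db) {
		a = a->parent;
		da--;
	}
	while (db > da) {
		b = b->parent;
		db--;
	}
	/* worst case: both walks meet at the root */
	while (a != b) {
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/*
 * Write the path of @to, with @from treated as the root, into @buf.
 * Returns @buf, or NULL if @buflen is too small.
 */
static char *rel_path(const struct node *from, const struct node *to,
		      char *buf, size_t buflen)
{
	static const char parent_str[] = "/..";
	const size_t plen = sizeof(parent_str) - 1;
	const struct node *common, *n;
	size_t len = 0, nlen = 0;
	char *p;

	if (buflen < 2)
		return NULL;
	if (from == to) {
		buf[0] = '/';
		buf[1] = '\0';
		return buf;
	}

	common = common_ancestor(from, to);

	/* one "/.." per hop from @from up to the common ancestor */
	for (n = from; n != common; n = n->parent) {
		if (len + plen + 1 > buflen)
			return NULL;
		memcpy(buf + len, parent_str, plen);
		len += plen;
	}

	/* bytes needed for the "/name" components below the ancestor */
	for (n = to; n != common; n = n->parent)
		nlen += strlen(n->name) + 1;
	if (len + nlen + 1 > buflen)
		return NULL;

	/* fill the components in from the right, target first */
	p = buf + len + nlen;
	*p = '\0';
	for (n = to; n != common; n = n->parent) {
		size_t l = strlen(n->name);

		p -= l;
		memcpy(p, n->name, l);
		*--p = '/';
	}
	return buf;
}
```

With nodes for /n1/n2/n3/n4 and /n1/n2/n5, `rel_path()` writes "/../../n5"; unlike the older right-to-left builder, the result always starts at `buf`.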
* CGroup Namespaces (v4)
@ 2015-11-16 19:51 serge-A9i7LUbDfNHQT0dZR+AlfA
       [not found] ` <1447703505-29672-1-git-send-email-serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 108+ messages in thread
From: serge-A9i7LUbDfNHQT0dZR+AlfA @ 2015-11-16 19:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
	tj-DgEjT+Ai2ygdnm+yROfE0A,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

Hi,

following is a revised set of the CGroup Namespace patchset which Aditya
Kali has previously sent.  The code can also be found in the cgroupns.v4
branch of

https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/

To summarize the semantics:

1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED

2. unsharing a cgroup namespace makes all your current cgroups your new
   cgroup root.

3. /proc/pid/cgroup always shows cgroup paths relative to the reader's
   cgroup namespace root.  A task outside of your cgroup looks like

	8:memory:/../../..

4. when a task mounts a cgroupfs, the cgroup which shows up as root depends
   on the mounting task's cgroup namespace.

5. setns to a cgroup namespace switches your cgroup namespace but not
   your cgroups.

With this, using github.com/hallyn/lxc #2015-11-09/cgns (and
github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full
proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts.

This is completely backward compatible and will be completely invisible
to any existing cgroup users (except for those running inside a cgroup
namespace and looking at /proc/pid/cgroup of tasks outside their
namespace.)

Changes from V3:
1. Rebased onto latest cgroup changes.  In particular switch to
   css_set_lock and ns_common.
2. Support all hierarchies.

Changes from V2:
1. Added documentation in Documentation/cgroups/namespace.txt
2. Fixed a bug that caused crash
3. Incorporated some other suggestions from last patchset:
   - removed use of threadgroup_lock() while creating new cgroupns
   - use task_lock() instead of rcu_read_lock() while accessing
     task->nsproxy
   - optimized setns() to own cgroupns
   - simplified code around sane-behavior mount option parsing
4. Restored ACKs from Serge Hallyn from v1 on few patches that have
   not changed since then.

Changes from V1:
1. No pinning of processes within cgroupns.  Tasks can be freely moved
   across cgroups even outside of their cgroupns-root.  Usual DAC/MAC
   policies apply as before.
2. Path in /proc/<pid>/cgroup is now always shown and is relative to
   cgroupns-root.  So path can contain '/..' strings depending on
   cgroupns-root of the reader and cgroup of <pid>.
3. setns() does not require the process to first move under target
   cgroupns-root.

Changes from RFC (V0):
1. setns support for cgroupns
2. 'mount -t cgroup cgroup <mntpt>' from inside a cgroupns now
   mounts the cgroup hierarchy with cgroupns-root as the filesystem root.
3. writes to cgroup files outside of cgroupns-root are not allowed
4. visibility of /proc/<pid>/cgroup is further restricted by not showing
   anything if the <pid> is in a sibling cgroupns and its cgroup falls outside
   your cgroupns-root.

^ permalink raw reply	[flat|nested] 108+ messages in thread
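To make semantic 3 above concrete: a reader inside a cgroup namespace can tell that another task's cgroup lies outside its own cgroupns-root because the path field in /proc/<pid>/cgroup begins with a "/.." component. The helpers below are a hypothetical userspace sketch, not part of the patchset. `cgroup_line_path()` and `cgroup_path_outside_ns_root()` are names invented here, and the "id:controllers:path" line layout is assumed from the `8:memory:/../../..` example in the cover letter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the path field of one
 * "id:controllers:path" line, or NULL if the line has fewer than
 * two ':' separators.
 */
static const char *cgroup_line_path(const char *line)
{
	const char *p = strchr(line, ':');

	if (!p)
		return NULL;
	p = strchr(p + 1, ':');
	if (!p)
		return NULL;
	return p + 1;
}

/*
 * Hypothetical helper: true if the path starts with a "/.." component,
 * i.e. the task's cgroup is outside the reader's cgroupns-root.
 */
static bool cgroup_path_outside_ns_root(const char *line)
{
	const char *path = cgroup_line_path(line);

	if (!path || strncmp(path, "/..", 3) != 0)
		return false;
	/* "/.." must be a whole component, not the prefix of a name */
	return path[3] == '/' || path[3] == '\0' || path[3] == '\n';
}
```

A caller would feed each line read from /proc/<pid>/cgroup to `cgroup_path_outside_ns_root()`; a cgroup whose name merely starts with ".." (for example "/..foo") is not treated as outside.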
[parent not found: <1447703505-29672-1-git-send-email-serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>]
* [PATCH 1/8] kernfs: Add API to generate relative kernfs path
  2015-11-16 19:51 CGroup Namespaces (v4) serge-A9i7LUbDfNHQT0dZR+AlfA
@ 2015-11-16 19:51 ` serge
  0 siblings, 0 replies; 108+ messages in thread
From: serge-A9i7LUbDfNHQT0dZR+AlfA @ 2015-11-16 19:51 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
	tj-DgEjT+Ai2ygdnm+yROfE0A,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

From: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

The new function kernfs_path_from_node() generates and returns
kernfs path of a given kernfs_node relative to a given parent
kernfs_node.

Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Acked-by: Serge E. Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
---
 fs/kernfs/dir.c        |  195 ++++++++++++++++++++++++++++++++++++++++++------
 include/linux/kernfs.h |    3 +
 2 files changed, 177 insertions(+), 21 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 91e0045..dba0d42 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -44,28 +44,159 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
 	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
 }
 
-static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+/**
+ * kernfs_node_depth - compute depth of the kernfs node from root.
+ * The root node itself is considered to be at depth 0.
+ */
+static size_t kernfs_node_depth(struct kernfs_node *kn)
 {
-	char *p = buf + buflen;
+	size_t depth = 0;
+
+	BUG_ON(!kn);
+	while (kn->parent) {
+		depth++;
+		kn = kn->parent;
+	}
+	return depth;
+}
+
+/**
+ * kernfs_path_from_node_locked - find a relative path from @kn_from to @kn_to
+ * @kn_from: reference node of the path
+ * @kn_to: kernfs node to which path is needed
+ * @buf: buffer to copy the path into
+ * @buflen: size of @buf
+ *
+ * We need to handle couple of scenarios here:
+ * [1] when @kn_from is an ancestor of @kn_to at some level
+ *     kn_from: /n1/n2/n3
+ *     kn_to:   /n1/n2/n3/n4/n5
+ *     result:  /n4/n5
+ *
+ * [2] when @kn_from is on a different hierarchy and we need to find common
+ *     ancestor between @kn_from and @kn_to.
+ *     kn_from: /n1/n2/n3/n4
+ *     kn_to:   /n1/n2/n5
+ *     result:  /../../n5
+ *     OR
+ *     kn_from: /n1/n2/n3/n4/n5   [depth=5]
+ *     kn_to:   /n1/n2/n3         [depth=3]
+ *     result:  /../..
+ */
+static char * __must_check kernfs_path_from_node_locked(
+	struct kernfs_node *kn_from,
+	struct kernfs_node *kn_to,
+	char *buf,
+	size_t buflen)
+{
+	char *p = buf;
+	struct kernfs_node *kn;
+	size_t depth_from = 0, depth_to, d;
 	int len;
 
-	*--p = '\0';
+	/* We atleast need 2 bytes to write "/\0". */
+	BUG_ON(buflen < 2);
 
-	do {
-		len = strlen(kn->name);
-		if (p - buf < len + 1) {
-			buf[0] = '\0';
-			p = NULL;
-			break;
+	/* Short-circuit the easy case - kn_to is the root node. */
+	if ((kn_from == kn_to) || (!kn_from && !kn_to->parent)) {
+		*p = '/';
+		*(p + 1) = '\0';
+		return p;
+	}
+
+	/* We can find the relative path only if both the nodes belong to the
+	 * same kernfs root.
+	 */
+	if (kn_from) {
+		BUG_ON(kernfs_root(kn_from) != kernfs_root(kn_to));
+		depth_from = kernfs_node_depth(kn_from);
+	}
+
+	depth_to = kernfs_node_depth(kn_to);
+
+	/* We compose path from left to right. So first write out all possible
+	 * "/.." strings needed to reach from 'kn_from' to the common ancestor.
+	 */
+	if (kn_from) {
+		while (depth_from > depth_to) {
+			len = strlen("/..");
+			if ((buflen - (p - buf)) < len + 1) {
+				/* buffer not big enough. */
+				buf[0] = '\0';
+				return NULL;
+			}
+			memcpy(p, "/..", len);
+			p += len;
+			*p = '\0';
+			--depth_from;
+			kn_from = kn_from->parent;
 		}
+
+		d = depth_to;
+		kn = kn_to;
+		while (depth_from < d) {
+			kn = kn->parent;
+			d--;
+		}
+
+		/* Now we have 'depth_from == depth_to' at this point. Add more
+		 * "/.."s until we reach common ancestor. In the worst case,
+		 * root node will be the common ancestor.
+		 */
+		while (depth_from > 0) {
+			/* If we reached common ancestor, stop. */
+			if (kn_from == kn)
+				break;
+			len = strlen("/..");
+			if ((buflen - (p - buf)) < len + 1) {
+				/* buffer not big enough. */
+				buf[0] = '\0';
+				return NULL;
+			}
+			memcpy(p, "/..", len);
+			p += len;
+			*p = '\0';
+			--depth_from;
+			kn_from = kn_from->parent;
+			kn = kn->parent;
+		}
+	}
+
+	/* Figure out how many bytes we need to write the path.
+	 */
+	d = depth_to;
+	kn = kn_to;
+	len = 0;
+	while (depth_from < d) {
+		/* Account for "/<name>". */
+		len += strlen(kn->name) + 1;
+		kn = kn->parent;
+		--d;
+	}
+
+	if ((buflen - (p - buf)) < len + 1) {
+		/* buffer not big enough. */
+		buf[0] = '\0';
+		return NULL;
+	}
+
+	/* We have enough space. Move 'p' ahead by computed length and start
+	 * writing node names into buffer.
+	 */
+	p += len;
+	*p = '\0';
+	d = depth_to;
+	kn = kn_to;
+	while (d > depth_from) {
+		len = strlen(kn->name);
 		p -= len;
 		memcpy(p, kn->name, len);
 		*--p = '/';
 		kn = kn->parent;
-	} while (kn && kn->parent);
+		--d;
+	}
 
-	return p;
+	return buf;
 }
 
 /**
@@ -115,26 +246,48 @@ size_t kernfs_path_len(struct kernfs_node *kn)
 }
 
 /**
- * kernfs_path - build full path of a given node
+ * kernfs_path_from_node - build path of node @kn relative to @kn_root.
+ * @kn_root: parent kernfs_node relative to which we need to build the path
  * @kn: kernfs_node of interest
- * @buf: buffer to copy @kn's name into
+ * @buf: buffer to copy @kn's path into
  * @buflen: size of @buf
  *
- * Builds and returns the full path of @kn in @buf of @buflen bytes. The
- * path is built from the end of @buf so the returned pointer usually
- * doesn't match @buf. If @buf isn't long enough, @buf is nul terminated
+ * Builds and returns @kn's path relative to @kn_root. @kn_root and @kn must
+ * be on the same kernfs-root. If @kn_root is not parent of @kn, then a relative
+ * path (which includes '..'s) as needed to reach from @kn_root to @kn is
+ * returned.
+ * The path may be built from the end of @buf so the returned pointer may not
+ * match @buf. If @buf isn't long enough, @buf is nul terminated
  * and %NULL is returned.
  */
-char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+char *kernfs_path_from_node(struct kernfs_node *kn_root, struct kernfs_node *kn,
+			    char *buf, size_t buflen)
 {
 	unsigned long flags;
 	char *p;
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
-	p = kernfs_path_locked(kn, buf, buflen);
+	p = kernfs_path_from_node_locked(kn_root, kn, buf, buflen);
 	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
 	return p;
 }
+EXPORT_SYMBOL_GPL(kernfs_path_from_node);
+
+/**
+ * kernfs_path - build full path of a given node
+ * @kn: kernfs_node of interest
+ * @buf: buffer to copy @kn's name into
+ * @buflen: size of @buf
+ *
+ * Builds and returns the full path of @kn in @buf of @buflen bytes. The
+ * path is built from the end of @buf so the returned pointer usually
+ * doesn't match @buf. If @buf isn't long enough, @buf is nul terminated
+ * and %NULL is returned.
+ */
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	return kernfs_path_from_node(NULL, kn, buf, buflen);
+}
 EXPORT_SYMBOL_GPL(kernfs_path);
 
 /**
@@ -168,8 +321,8 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
 
-	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
-			       sizeof(kernfs_pr_cont_buf));
+	p = kernfs_path_from_node_locked(NULL, kn, kernfs_pr_cont_buf,
+					 sizeof(kernfs_pr_cont_buf));
 	if (p)
 		pr_cont("%s", p);
 	else
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 5d4e9c4..d025ebd 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 
 int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
 size_t kernfs_path_len(struct kernfs_node *kn);
+char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn,
+					  struct kernfs_node *kn, char *buf,
+					  size_t buflen);
 char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
 				size_t buflen);
 void pr_cont_kernfs_name(struct kernfs_node *kn);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 108+ messages in thread
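The kernel-doc kept for kernfs_path() says the path "is built from the end of @buf so the returned pointer usually doesn't match @buf". That describes the kernfs_path_locked() body this patch removes. The following is a rough userspace sketch of that right-to-left technique, assuming a hypothetical `struct node` in place of `struct kernfs_node`. `full_path_from_end()` is a name invented here, and unlike the kernel code it handles the root node with an explicit branch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kernfs_node. */
struct node {
	const char *name;
	struct node *parent;	/* NULL for the root */
};

/*
 * Build the absolute path of @n at the END of @buf, walking from @n up
 * to the root and prepending "/name" at each step.  Returns a pointer
 * into @buf where the path starts, or NULL (with @buf nul terminated)
 * if the buffer is too small.
 */
static char *full_path_from_end(const struct node *n, char *buf,
				size_t buflen)
{
	char *p;

	if (buflen == 0)
		return NULL;

	p = buf + buflen;
	*--p = '\0';

	/* the root itself is printed as "/" */
	if (!n->parent) {
		if (p == buf)
			return NULL;
		*--p = '/';
		return p;
	}

	for (; n->parent; n = n->parent) {
		size_t len = strlen(n->name);

		/* need room for the name plus its leading '/' */
		if ((size_t)(p - buf) < len + 1) {
			buf[0] = '\0';
			return NULL;
		}
		p -= len;
		memcpy(p, n->name, len);
		*--p = '/';
	}
	return p;
}
```

Because no length has to be known up front, a single upward walk is enough. The cost is that callers must use the returned pointer rather than `buf`, which is the property the reworked function gives up by returning `buf` itself.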
[parent not found: <1447703505-29672-2-git-send-email-serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found] ` <1447703505-29672-2-git-send-email-serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
@ 2015-11-24 16:16   ` Tejun Heo
  0 siblings, 0 replies; 108+ messages in thread
From: Tejun Heo @ 2015-11-24 16:16 UTC (permalink / raw)
  To: serge-A9i7LUbDfNHQT0dZR+AlfA
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

Hello,

On Mon, Nov 16, 2015 at 01:51:38PM -0600, serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org wrote:
> +static char * __must_check kernfs_path_from_node_locked(
> +	struct kernfs_node *kn_from,
> +	struct kernfs_node *kn_to,
> +	char *buf,
> +	size_t buflen)
> +{
> +	char *p = buf;
> +	struct kernfs_node *kn;
> +	size_t depth_from = 0, depth_to, d;
>  	int len;
>
> +	/* We atleast need 2 bytes to write "/\0". */
> +	BUG_ON(buflen < 2);

I don't think this is BUG worthy.  Just return NULL?  Also, the only
reason the original function returned char * was because the starting
point may not be the start of the buffer which helps keeping the
implementation simple.  If this function is gonna be complex anyway, a
better approach would be returning ssize_t and implement a similar
behavior to strlcpy().

> +	/* Short-circuit the easy case - kn_to is the root node. */
> +	if ((kn_from == kn_to) || (!kn_from && !kn_to->parent)) {
> +		*p = '/';
> +		*(p + 1) = '\0';

Hmm... so if kn_from == kn_to, the output is "/"?

> +		return p;
> +	}
> +
> +	/* We can find the relative path only if both the nodes belong to the
> +	 * same kernfs root.
> +	 */
> +	if (kn_from) {
> +		BUG_ON(kernfs_root(kn_from) != kernfs_root(kn_to));

Ditto, just return NULL and maybe trigger WARN_ON_ONCE().

> +		depth_from = kernfs_node_depth(kn_from);
> +	}
> +
> +	depth_to = kernfs_node_depth(kn_to);
> +
> +	/* We compose path from left to right. So first write out all possible
                                             ^
                                             , so
> +	 * "/.." strings needed to reach from 'kn_from' to the common ancestor.
> +	 */

Please fully-wing multiline comments.

> +	if (kn_from) {
> +		while (depth_from > depth_to) {
> +			len = strlen("/..");

Maybe do something like the following instead?

	const char parent_str[] = "/..";
	size_t len = sizeof(parent_str) - 1;

> +			if ((buflen - (p - buf)) < len + 1) {
> +				/* buffer not big enough. */
> +				buf[0] = '\0';
> +				return NULL;
> +			}
> +			memcpy(p, "/..", len);
> +			p += len;
> +			*p = '\0';
> +			--depth_from;
> +			kn_from = kn_from->parent;
>  		}
> +
> +		d = depth_to;
> +		kn = kn_to;
> +		while (depth_from < d) {
> +			kn = kn->parent;
> +			d--;
> +		}
> +
> +		/* Now we have 'depth_from == depth_to' at this point. Add more

Ditto with winging.

> +		 * "/.."s until we reach common ancestor. In the worst case,
> +		 * root node will be the common ancestor.
> +		 */
> +		while (depth_from > 0) {
> +			/* If we reached common ancestor, stop. */
> +			if (kn_from == kn)
> +				break;
> +			len = strlen("/..");
> +			if ((buflen - (p - buf)) < len + 1) {
> +				/* buffer not big enough. */
> +				buf[0] = '\0';
> +				return NULL;
> +			}
> +			memcpy(p, "/..", len);
> +			p += len;
> +			*p = '\0';
> +			--depth_from;
> +			kn_from = kn_from->parent;
> +			kn = kn->parent;
> +		}

Hmmm... I wonder whether this and the above block can be merged.
Wouldn't it be simpler to calculate common ancestor and generate
/.. till it reached that point?

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 108+ messages in thread
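The review above suggests two things: precomputing the "/.." length with `sizeof(parent_str) - 1`, and returning a length with strlcpy()-like behavior instead of a `char *`. A minimal userspace sketch of what that contract could look like follows. It is an illustration only: `append_trunc()` and `parent_path()` are hypothetical names, not functions from the series, and the caller is assumed to already know how many "/.." hops and which tail it wants.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Append @slen bytes of @src at offset @len, truncating to fit and
 * keeping @buf nul terminated.  Returns the length the string would
 * have had with unlimited space (strlcpy()-style accounting).
 */
static size_t append_trunc(char *buf, size_t buflen, size_t len,
			   const char *src, size_t slen)
{
	if (len < buflen) {
		size_t room = buflen - len - 1;	/* keep a byte for NUL */
		size_t n = slen < room ? slen : room;

		memcpy(buf + len, src, n);
		buf[len + n] = '\0';
	}
	return len + slen;
}

/*
 * Write @ups copies of "/.." followed by @tail into @buf.  Returns the
 * full length needed (excluding the NUL), or -1 on bad arguments; a
 * return value >= @buflen means the output was truncated.
 */
static ssize_t parent_path(size_t ups, const char *tail, char *buf,
			   size_t buflen)
{
	static const char parent_str[] = "/..";
	const size_t plen = sizeof(parent_str) - 1;	/* no strlen() per hop */
	size_t len = 0, i;

	if (!buf || buflen == 0)
		return -1;

	buf[0] = '\0';
	for (i = 0; i < ups; i++)
		len = append_trunc(buf, buflen, len, parent_str, plen);
	len = append_trunc(buf, buflen, len, tail, strlen(tail));
	return (ssize_t)len;
}
```

With this contract a caller compares the return value against its buffer size to detect truncation, the same way strlcpy() callers do, and can retry with a larger buffer of exactly the reported size plus one.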
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path [not found] ` <1447703505-29672-2-git-send-email-serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> @ 2015-11-24 16:16 ` Tejun Heo 0 siblings, 0 replies; 108+ messages in thread From: Tejun Heo @ 2015-11-24 16:16 UTC (permalink / raw) To: serge Cc: linux-kernel, adityakali, linux-api, containers, cgroups, lxc-devel, akpm, ebiederm Hello, On Mon, Nov 16, 2015 at 01:51:38PM -0600, serge@hallyn.com wrote: > +static char * __must_check kernfs_path_from_node_locked( > + struct kernfs_node *kn_from, > + struct kernfs_node *kn_to, > + char *buf, > + size_t buflen) > +{ > + char *p = buf; > + struct kernfs_node *kn; > + size_t depth_from = 0, depth_to, d; > int len; > > + /* We atleast need 2 bytes to write "/\0". */ > + BUG_ON(buflen < 2); I don't think this is BUG worthy. Just return NULL? Also, the only reason the original function returned char * was because the starting point may not be the start of the buffer which helps keeping the implementation simple. If this function is gonna be complex anyway, a better approach would be returning ssize_t and implement a simliar behavior to strlcpy(). > + /* Short-circuit the easy case - kn_to is the root node. */ > + if ((kn_from == kn_to) || (!kn_from && !kn_to->parent)) { > + *p = '/'; > + *(p + 1) = '\0'; Hmm... so if kn_from == kn_to, the output is "/"? > + return p; > + } > + > + /* We can find the relative path only if both the nodes belong to the > + * same kernfs root. > + */ > + if (kn_from) { > + BUG_ON(kernfs_root(kn_from) != kernfs_root(kn_to)); Ditto, just return NULL and maybe trigger WARN_ON_ONCE(). > + depth_from = kernfs_node_depth(kn_from); > + } > + > + depth_to = kernfs_node_depth(kn_to); > + > + /* We compose path from left to right. So first write out all possible ^ , so > + * "/.." strings needed to reach from 'kn_from' to the common ancestor. > + */ Please fully-wing multiline comments. 
> + if (kn_from) { > + while (depth_from > depth_to) { > + len = strlen("/.."); Maybe do something like the following instead? const char parent_str[] = "/.."; size_t len = sizeof(parent_str) - 1; > + if ((buflen - (p - buf)) < len + 1) { > + /* buffer not big enough. */ > + buf[0] = '\0'; > + return NULL; > + } > + memcpy(p, "/..", len); > + p += len; > + *p = '\0'; > + --depth_from; > + kn_from = kn_from->parent; > } > + > + d = depth_to; > + kn = kn_to; > + while (depth_from < d) { > + kn = kn->parent; > + d--; > + } > + > + /* Now we have 'depth_from == depth_to' at this point. Add more Ditto with winging. > + * "/.."s until we reach common ancestor. In the worst case, > + * root node will be the common ancestor. > + */ > + while (depth_from > 0) { > + /* If we reached common ancestor, stop. */ > + if (kn_from == kn) > + break; > + len = strlen("/.."); > + if ((buflen - (p - buf)) < len + 1) { > + /* buffer not big enough. */ > + buf[0] = '\0'; > + return NULL; > + } > + memcpy(p, "/..", len); > + p += len; > + *p = '\0'; > + --depth_from; > + kn_from = kn_from->parent; > + kn = kn->parent; > + } Hmmm... I wonder whether this and the above block can be merged. Wouldn't it be simpler to calculate common ancestor and generate /.. till it reached that point? Thanks. -- tejun ^ permalink raw reply [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]     ` <20151124161630.GL17033-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-11-24 16:17       ` Tejun Heo
  2015-11-27  5:25       ` Serge E. Hallyn
  1 sibling, 0 replies; 108+ messages in thread
From: Tejun Heo @ 2015-11-24 16:17 UTC (permalink / raw)
  To: serge
  Cc: linux-kernel, adityakali, linux-api, containers, cgroups,
	lxc-devel, akpm, ebiederm

Oops, also please cc Greg Kroah-Hartman <gregkh@linuxfoundation.org>
on kernfs changes.

Thanks.

-- 
tejun

^ permalink raw reply  [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]       ` <20151124161709.GM17033-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-11-24 17:43         ` Serge E. Hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: Serge E. Hallyn @ 2015-11-24 17:43 UTC (permalink / raw)
  To: Tejun Heo
  Cc: serge, linux-kernel, adityakali, linux-api, containers, cgroups,
	lxc-devel, akpm, ebiederm

On Tue, Nov 24, 2015 at 11:17:09AM -0500, Tejun Heo wrote:
> Oops, also please cc Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> on kernfs changes.

Will do.  Thank you for all the feedback.  I'll send out a new set when
I get it all addressed.

^ permalink raw reply  [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]     ` <20151124161630.GL17033-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-11-27  5:25       ` Serge E. Hallyn
  2015-11-27  5:25       ` Serge E. Hallyn
  1 sibling, 0 replies; 108+ messages in thread
From: Serge E. Hallyn @ 2015-11-27  5:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: serge, linux-kernel, adityakali, linux-api, containers, cgroups,
	lxc-devel, akpm, ebiederm

On Tue, Nov 24, 2015 at 11:16:30AM -0500, Tejun Heo wrote:
> Hello,
>
> On Mon, Nov 16, 2015 at 01:51:38PM -0600, serge@hallyn.com wrote:
> > +static char * __must_check kernfs_path_from_node_locked(

(Note I've rewritten this to find a common ancestor and walk back to and
from that, as you suggested later in this email)

> > +	/* Short-circuit the easy case - kn_to is the root node. */
> > +	if ((kn_from == kn_to) || (!kn_from && !kn_to->parent)) {
> > +		*p = '/';
> > +		*(p + 1) = '\0';
>
> Hmm... so if kn_from == kn_to, the output is "/"?

Yes, that's what seems to make the most sense for cgroup namespaces.  I
could see a case for '.' being used instead in general, but for cgroup
namespaces I think we'd have to convert those back to '/'.  Otherwise
we'll fail in being able to run legacy software, which would get
confused.

-serge

^ permalink raw reply  [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]       ` <20151127052511.GA25490-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2015-11-30 15:11         ` Tejun Heo
  0 siblings, 0 replies; 108+ messages in thread
From: Tejun Heo @ 2015-11-30 15:11 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: serge, linux-kernel, adityakali, linux-api, containers, cgroups,
	lxc-devel, akpm, ebiederm

Hello,

On Thu, Nov 26, 2015 at 11:25:11PM -0600, Serge E. Hallyn wrote:
> > > +	/* Short-circuit the easy case - kn_to is the root node. */
> > > +	if ((kn_from == kn_to) || (!kn_from && !kn_to->parent)) {
> > > +		*p = '/';
> > > +		*(p + 1) = '\0';
> >
> > Hmm... so if kn_from == kn_to, the output is "/"?
>
> Yes, that's what seems to make the most sense for cgroup namespaces.  I
> could see a case for '.' being used instead in general, but for cgroup
> namespaces I think we'd have to convert those back to '/'.  Otherwise
> we'll fail in being able to run legacy software, which would get
> confused.

Yeah, I agree but the name is kinda misleading tho.  The output isn't
really a relative path but rather absolute path against the specified
root.  Maybe updating the function and parameter names would be
helpful?

Thanks.

-- 
tejun

^ permalink raw reply  [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]         ` <20151130151147.GG3535-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-11-30 18:37           ` Serge E. Hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: Serge E. Hallyn @ 2015-11-30 18:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Serge E. Hallyn, serge, linux-kernel, adityakali, linux-api,
	containers, cgroups, lxc-devel, akpm, ebiederm

On Mon, Nov 30, 2015 at 10:11:47AM -0500, Tejun Heo wrote:
> Hello,
>
> On Thu, Nov 26, 2015 at 11:25:11PM -0600, Serge E. Hallyn wrote:
> > > > +	/* Short-circuit the easy case - kn_to is the root node. */
> > > > +	if ((kn_from == kn_to) || (!kn_from && !kn_to->parent)) {
> > > > +		*p = '/';
> > > > +		*(p + 1) = '\0';
> > >
> > > Hmm... so if kn_from == kn_to, the output is "/"?
> >
> > Yes, that's what seems to make the most sense for cgroup namespaces.  I
> > could see a case for '.' being used instead in general, but for cgroup
> > namespaces I think we'd have to convert those back to '/'.  Otherwise
> > we'll fail in being able to run legacy software, which would get
> > confused.
>
> Yeah, I agree but the name is kinda misleading tho.  The output isn't
> really a relative path but rather absolute path against the specified
> root.  Maybe updating the function and parameter names would be
> helpful?
>
> Thanks.

Ok - updating the comment is simple enough.  Though the name/params
kernfs_path_from_node_locked(from, to) still seem to make sense.  Would
you prefer something like kernfs_absolute_path_from_node_locked()?  I
hesitate to call 'from' 'root' since kernfs_root is a thing and this
is not that.

^ permalink raw reply  [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
  2015-11-30 18:37           ` Serge E. Hallyn
  (?)
@ 2015-11-30 22:53           ` Tejun Heo
       [not found]             ` <20151130225318.GD9039-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
  2015-12-01  2:08             ` Serge E. Hallyn
  -1 siblings, 2 replies; 108+ messages in thread
From: Tejun Heo @ 2015-11-30 22:53 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: serge, linux-kernel, adityakali, linux-api, containers, cgroups,
	lxc-devel, akpm, ebiederm

Hello, Serge.

On Mon, Nov 30, 2015 at 12:37:58PM -0600, Serge E. Hallyn wrote:
> > Yeah, I agree but the name is kinda misleading tho.  The output isn't
> > really a relative path but rather absolute path against the specified
> > root.  Maybe updating the function and parameter names would be
> > helpful?
> >
>
> Ok - updating the comment is simple enough.  Though the name/params
> kernfs_path_from_node_locked(from, to) still seem to make sense.  Would
> you prefer something like kernfs_absolute_path_from_node_locked()?  I
> hesitate to call 'from' 'root' since kernfs_root is a thing and this
> is not that.

Hmmm... I see.  Let's just make sure that the comment is clear about
the fact that it calculates a (pseudo) absolute path rather than a
relative path.

Thanks.

-- 
tejun

^ permalink raw reply  [flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]             ` <20151130225318.GD9039-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-12-01  2:08               ` Serge E. Hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: Serge E. Hallyn @ 2015-12-01  2:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Serge E. Hallyn, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

On Mon, Nov 30, 2015 at 05:53:18PM -0500, Tejun Heo wrote:
> Hello, Serge.
>
> On Mon, Nov 30, 2015 at 12:37:58PM -0600, Serge E. Hallyn wrote:
> > > Yeah, I agree but the name is kinda misleading tho.  The output isn't
> > > really a relative path but rather absolute path against the specified
> > > root.  Maybe updating the function and parameter names would be
> > > helpful?
> > >
> >
> > Ok - updating the comment is simple enough.  Though the name/params
> > kernfs_path_from_node_locked(from, to) still seem to make sense.  Would
> > you prefer something like kernfs_absolute_path_from_node_locked()?  I
> > hesitate to call 'from' 'root' since kernfs_root is a thing and this
> > is not that.
>
> Hmmm... I see.  Let's just make sure that the comment is clear about
> the fact that it calculates (pseudo) absolute path rather than
> relative path.
>
> Thanks.

Ok, new patch follows (and is pushed at
https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/log/?h=2015-11-30/cgroupns)

[PATCH 1/7] kernfs: Add API to generate relative kernfs path

The new function kernfs_path_from_node() generates and returns the kernfs
path of a given kernfs_node relative to a given parent kernfs_node.

Changelog 20151125:
  - Fully-wing multiline comments
  - Rework kernfs_path_from_node_locked() logic
  - Replace BUG_ONs with returning NULL
  - Use a const char* for /.. and precalculate its size
Changelog 20151130:
  - Update kernfs_path_from_node_locked comment

Signed-off-by: Aditya Kali <adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Acked-by: Serge E. Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
---
 fs/kernfs/dir.c        | 182 +++++++++++++++++++++++++++++++++++++++++--------
 include/linux/kernfs.h |   3 +
 2 files changed, 158 insertions(+), 27 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 91e0045..7cd4bb4 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -44,28 +44,134 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
 	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
 }
 
-static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+/* kernfs_node_depth - compute depth from @from to @to */
+static size_t kernfs_node_distance(struct kernfs_node *from, struct kernfs_node *to)
 {
-	char *p = buf + buflen;
-	int len;
+	size_t depth = 0;
 
-	*--p = '\0';
+	BUG_ON(!to);
+	BUG_ON(!from);
 
-	do {
-		len = strlen(kn->name);
-		if (p - buf < len + 1) {
-			buf[0] = '\0';
-			p = NULL;
-			break;
-		}
-		p -= len;
-		memcpy(p, kn->name, len);
-		*--p = '/';
-		kn = kn->parent;
-	} while (kn && kn->parent);
+	while (to->parent && to != from) {
+		depth++;
+		to = to->parent;
+	}
+	return depth;
+}
 
-	return p;
+static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a,
+						  struct kernfs_node *b)
+{
+	size_t da = kernfs_node_distance(kernfs_root(a)->kn, a);
+	size_t db = kernfs_node_distance(kernfs_root(b)->kn, b);
+
+	if (da == 0)
+		return a;
+	if (db == 0)
+		return b;
+
+	while (da > db) {
+		a = a->parent;
+		da--;
+	}
+	while (db > da) {
+		b = b->parent;
+		db--;
+	}
+
+	/* worst case b and a will be the same at root */
+	while (b != a) {
+		b = b->parent;
+		a = a->parent;
+	}
+
+	return a;
+}
+
+/**
+ * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to,
+ * where kn_from is treated as root of the path.
+ * @kn_from: kernfs node which should be treated as root for the path
+ * @kn_to: kernfs node to which path is needed
+ * @buf: buffer to copy the path into
+ * @buflen: size of @buf
+ *
+ * We need to handle couple of scenarios here:
+ * [1] when @kn_from is an ancestor of @kn_to at some level
+ * kn_from: /n1/n2/n3
+ * kn_to:   /n1/n2/n3/n4/n5
+ * result:  /n4/n5
+ *
+ * [2] when @kn_from is on a different hierarchy and we need to find common
+ * ancestor between @kn_from and @kn_to.
+ * kn_from: /n1/n2/n3/n4
+ * kn_to:   /n1/n2/n5
+ * result:  /../../n5
+ * OR
+ * kn_from: /n1/n2/n3/n4/n5   [depth=5]
+ * kn_to:   /n1/n2/n3         [depth=3]
+ * result:  /../..
+ */
+static char *
+__must_check kernfs_path_from_node_locked(struct kernfs_node *kn_from,
+					  struct kernfs_node *kn_to, char *buf,
+					  size_t buflen)
+{
+	char *p = buf;
+	struct kernfs_node *kn, *common;
+	const char parent_str[] = "/..";
+	int i;
+	size_t depth_from, depth_to, len = 0, nlen = 0,
+	       plen = sizeof(parent_str) - 1;
+
+	/* We atleast need 2 bytes to write "/\0". */
+	if (buflen < 2)
+		return NULL;
+
+	if (!kn_from)
+		kn_from = kernfs_root(kn_to)->kn;
+
+	if (kn_from == kn_to) {
+		*p = '/';
+		*(++p) = '\0';
+		return buf;
+	}
+
+	common = kernfs_common_ancestor(kn_from, kn_to);
+	if (!common) {
+		WARN_ONCE("%s: kn_from and kn_to on different roots\n",
+			  __func__);
+		return NULL;
+	}
+
+	depth_to = kernfs_node_distance(common, kn_to);
+	depth_from = kernfs_node_distance(common, kn_from);
+
+	for (i = 0; i < depth_from; i++) {
+		if (len + plen + 1 > buflen)
+			return NULL;
+		strcpy(p, parent_str);
+		p += plen;
+		len += plen;
+	}
+
+	/* Calculate how many bytes we need for the rest */
+	for (kn = kn_to; kn != common; kn = kn->parent)
+		nlen += strlen(kn->name) + 1;
+
+	if (len + nlen + 1 > buflen)
+		return NULL;
+
+	p += nlen;
+	*p = '\0';
+	for (kn = kn_to; kn != common; kn = kn->parent) {
+		nlen = strlen(kn->name);
+		p -= nlen;
+		memcpy(p, kn->name, nlen);
+		*(--p) = '/';
+	}
+
+	return buf;
 }
 
 /**
@@ -115,26 +221,48 @@ size_t kernfs_path_len(struct kernfs_node *kn)
 }
 
 /**
- * kernfs_path - build full path of a given node
+ * kernfs_path_from_node - build path of node @kn relative to @kn_root.
+ * @kn_root: parent kernfs_node relative to which we need to build the path
  * @kn: kernfs_node of interest
- * @buf: buffer to copy @kn's name into
+ * @buf: buffer to copy @kn's path into
  * @buflen: size of @buf
  *
- * Builds and returns the full path of @kn in @buf of @buflen bytes.  The
- * path is built from the end of @buf so the returned pointer usually
- * doesn't match @buf.  If @buf isn't long enough, @buf is nul terminated
+ * Builds and returns @kn's path relative to @kn_root. @kn_root and @kn must
+ * be on the same kernfs-root. If @kn_root is not parent of @kn, then a relative
+ * path (which includes '..'s) as needed to reach from @kn_root to @kn is
+ * returned.
+ * The path may be built from the end of @buf so the returned pointer may not
+ * match @buf.  If @buf isn't long enough, @buf is nul terminated
  * and %NULL is returned.
  */
-char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+char *kernfs_path_from_node(struct kernfs_node *kn_root, struct kernfs_node *kn,
+			    char *buf, size_t buflen)
 {
 	unsigned long flags;
 	char *p;
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
-	p = kernfs_path_locked(kn, buf, buflen);
+	p = kernfs_path_from_node_locked(kn_root, kn, buf, buflen);
 	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
 	return p;
 }
+EXPORT_SYMBOL_GPL(kernfs_path_from_node);
+
+/**
+ * kernfs_path - build full path of a given node
+ * @kn: kernfs_node of interest
+ * @buf: buffer to copy @kn's name into
+ * @buflen: size of @buf
+ *
+ * Builds and returns the full path of @kn in @buf of @buflen bytes.  The
+ * path is built from the end of @buf so the returned pointer usually
+ * doesn't match @buf.  If @buf isn't long enough, @buf is nul terminated
+ * and %NULL is returned.
+ */
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	return kernfs_path_from_node(NULL, kn, buf, buflen);
+}
 EXPORT_SYMBOL_GPL(kernfs_path);
 
 /**
@@ -168,8 +296,8 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
 
-	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
-			       sizeof(kernfs_pr_cont_buf));
+	p = kernfs_path_from_node_locked(NULL, kn, kernfs_pr_cont_buf,
+					 sizeof(kernfs_pr_cont_buf));
 	if (p)
 		pr_cont("%s", p);
 	else
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 5d4e9c4..d025ebd 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 
 int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
 size_t kernfs_path_len(struct kernfs_node *kn);
+char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn,
+					   struct kernfs_node *kn, char *buf,
+					   size_t buflen);
 char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
 				size_t buflen);
 void pr_cont_kernfs_name(struct kernfs_node *kn);
-- 
2.5.0

^ permalink raw reply related  [flat|nested] 108+ messages in thread
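For reference, a user-space sketch of the common-ancestor approach discussed in this thread. It is an illustration under stated assumptions, not the kernel implementation: struct node, node_depth(), common_ancestor() and path_from_node() are hypothetical stand-ins for the kernfs types and helpers, there is no locking, and the function returns 0 or -1 instead of a char *. It keeps the agreed behavior that identical nodes yield "/".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kernfs_node: a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;	/* NULL for the root */
};

/* Number of parent links between @n and its root. */
static size_t node_depth(const struct node *n)
{
	size_t depth = 0;

	for (; n->parent; n = n->parent)
		depth++;
	return depth;
}

/* Deepest node that is @a, @b, or an ancestor of both; NULL if unrelated. */
static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	size_t da = node_depth(a), db = node_depth(b);

	for (; da > db; da--)
		a = a->parent;
	for (; db > da; db--)
		b = b->parent;
	while (a != b) {
		if (!a->parent)		/* two distinct roots */
			return NULL;
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/*
 * Write the pseudo-absolute path to @to, treating @from as the root.
 * Returns 0 on success, -1 if @buf is too small or the nodes are unrelated.
 */
static int path_from_node(const struct node *from, const struct node *to,
			  char *buf, size_t buflen)
{
	static const char parent_str[] = "/..";
	const size_t plen = sizeof(parent_str) - 1;
	const struct node *common, *n;
	size_t len = 0;
	char *p = buf;

	assert(from && to && buf);

	if (from == to) {
		if (buflen < 2)
			return -1;
		memcpy(buf, "/", 2);
		return 0;
	}

	common = common_ancestor(from, to);
	if (!common)
		return -1;

	/* Total length: one "/.." per step up, one "/name" per step down. */
	for (n = from; n != common; n = n->parent)
		len += plen;
	for (n = to; n != common; n = n->parent)
		len += strlen(n->name) + 1;
	if (len + 1 > buflen)
		return -1;

	/* The "/.." components go left to right ... */
	for (n = from; n != common; n = n->parent) {
		memcpy(p, parent_str, plen);
		p += plen;
	}

	/* ... and the names are filled in right to left from the end. */
	p = buf + len;
	*p = '\0';
	for (n = to; n != common; n = n->parent) {
		size_t nlen = strlen(n->name);

		p -= nlen;
		memcpy(p, n->name, nlen);
		*--p = '/';
	}
	return 0;
}
```

Computing the full length before writing anything means a too-small buffer is rejected without leaving a partial path behind. With nodes /n1/n2/n3/n4 as the origin and /n1/n2/n5 as the target, the sketch produces "/../../n5", matching scenario [2] in the kernel-doc comment of the patch.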
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]               ` <20151130225318.GD9039-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-12-01  2:08                 ` Serge E. Hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: Serge E. Hallyn @ 2015-12-01  2:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Serge E. Hallyn, serge, linux-kernel, adityakali, linux-api,
	containers, cgroups, lxc-devel, akpm, ebiederm

On Mon, Nov 30, 2015 at 05:53:18PM -0500, Tejun Heo wrote:
> Hello, Serge.
>
> On Mon, Nov 30, 2015 at 12:37:58PM -0600, Serge E. Hallyn wrote:
> > > Yeah, I agree but the name is kinda misleading tho.  The output isn't
> > > really a relative path but rather absolute path against the specified
> > > root.  Maybe updating the function and parameter names would be
> > > helpful?
> > >
> >
> > Ok - updating the comment is simple enough.  Though the name/params
> > kernfs_path_from_node_locked(from, to) still seem to make sense.  Would
> > you prefer something like kernfs_absolute_path_from_node_locked()?  I
> > hesitate to call 'from' 'root' since kernfs_root is a thing and this
> > is not that.
>
> Hmmm... I see.  Let's just make sure that the comment is clear about
> the fact that it calculates (pseudo) absolute path rather than
> relative path.
>
> Thanks.

Ok, new patch follows (and is pushed at
https://git.kernel.org/cgit/linux/kernel/git/sergeh/linux-security.git/log/?h=2015-11-30/cgroupns)

[PATCH 1/7] kernfs: Add API to generate relative kernfs path

The new function kernfs_path_from_node() generates and returns
kernfs path of a given kernfs_node relative to a given parent
kernfs_node.

Changelog 20151125:
  - Fully-wing multiline comments
  - Rework kernfs_path_from_node_locked() logic
  - Replace BUG_ONs with returning NULL
  - Use a const char* for /.. and precalculate its size
Changelog 20151130:
  - Update kernfs_path_from_node_locked comment

Signed-off-by: Aditya Kali <adityakali@google.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
---
 fs/kernfs/dir.c        | 182 +++++++++++++++++++++++++++++++++++++++++--------
 include/linux/kernfs.h |   3 +
 2 files changed, 158 insertions(+), 27 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 91e0045..7cd4bb4 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -44,28 +44,134 @@ static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
 	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
 }
 
-static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
-					      size_t buflen)
+/* kernfs_node_distance - compute depth from @from to @to */
+static size_t kernfs_node_distance(struct kernfs_node *from, struct kernfs_node *to)
 {
-	char *p = buf + buflen;
-	int len;
+	size_t depth = 0;
 
-	*--p = '\0';
+	BUG_ON(!to);
+	BUG_ON(!from);
 
-	do {
-		len = strlen(kn->name);
-		if (p - buf < len + 1) {
-			buf[0] = '\0';
-			p = NULL;
-			break;
-		}
-		p -= len;
-		memcpy(p, kn->name, len);
-		*--p = '/';
-		kn = kn->parent;
-	} while (kn && kn->parent);
+	while (to->parent && to != from) {
+		depth++;
+		to = to->parent;
+	}
+	return depth;
+}
 
-	return p;
+static struct kernfs_node *kernfs_common_ancestor(struct kernfs_node *a,
+						  struct kernfs_node *b)
+{
+	size_t da = kernfs_node_distance(kernfs_root(a)->kn, a);
+	size_t db = kernfs_node_distance(kernfs_root(b)->kn, b);
+
+	if (da == 0)
+		return a;
+	if (db == 0)
+		return b;
+
+	while (da > db) {
+		a = a->parent;
+		da--;
+	}
+	while (db > da) {
+		b = b->parent;
+		db--;
+	}
+
+	/* worst case b and a will be the same at root */
+	while (b != a) {
+		b = b->parent;
+		a = a->parent;
+	}
+
+	return a;
+}
+
+/**
+ * kernfs_path_from_node_locked - find a pseudo-absolute path to @kn_to,
+ * where kn_from is treated as root of the path.
+ * @kn_from: kernfs node which should be treated as root for the path
+ * @kn_to: kernfs node to which path is needed
+ * @buf: buffer to copy the path into
+ * @buflen: size of @buf
+ *
+ * We need to handle a couple of scenarios here:
+ * [1] when @kn_from is an ancestor of @kn_to at some level
+ * kn_from: /n1/n2/n3
+ * kn_to:   /n1/n2/n3/n4/n5
+ * result:  /n4/n5
+ *
+ * [2] when @kn_from is on a different hierarchy and we need to find common
+ * ancestor between @kn_from and @kn_to.
+ * kn_from: /n1/n2/n3/n4
+ * kn_to:   /n1/n2/n5
+ * result:  /../../n5
+ * OR
+ * kn_from: /n1/n2/n3/n4/n5   [depth=5]
+ * kn_to:   /n1/n2/n3         [depth=3]
+ * result:  /../..
+ */
+static char *
+__must_check kernfs_path_from_node_locked(struct kernfs_node *kn_from,
+					  struct kernfs_node *kn_to, char *buf,
+					  size_t buflen)
+{
+	char *p = buf;
+	struct kernfs_node *kn, *common;
+	const char parent_str[] = "/..";
+	int i;
+	size_t depth_from, depth_to, len = 0, nlen = 0,
+	       plen = sizeof(parent_str) - 1;
+
+	/* We at least need 2 bytes to write "/\0". */
+	if (buflen < 2)
+		return NULL;
+
+	if (!kn_from)
+		kn_from = kernfs_root(kn_to)->kn;
+
+	if (kn_from == kn_to) {
+		*p = '/';
+		*(++p) = '\0';
+		return buf;
+	}
+
+	common = kernfs_common_ancestor(kn_from, kn_to);
+	if (!common) {
+		WARN_ONCE(1, "%s: kn_from and kn_to on different roots\n",
+			  __func__);
+		return NULL;
+	}
+
+	depth_to = kernfs_node_distance(common, kn_to);
+	depth_from = kernfs_node_distance(common, kn_from);
+
+	for (i = 0; i < depth_from; i++) {
+		if (len + plen + 1 > buflen)
+			return NULL;
+		strcpy(p, parent_str);
+		p += plen;
+		len += plen;
+	}
+
+	/* Calculate how many bytes we need for the rest */
+	for (kn = kn_to; kn != common; kn = kn->parent)
+		nlen += strlen(kn->name) + 1;
+
+	if (len + nlen + 1 > buflen)
+		return NULL;
+
+	p += nlen;
+	*p = '\0';
+	for (kn = kn_to; kn != common; kn = kn->parent) {
+		nlen = strlen(kn->name);
+		p -= nlen;
+		memcpy(p, kn->name, nlen);
+		*(--p) = '/';
+	}
+
+	return buf;
 }
 
 /**
@@ -115,26 +221,48 @@ size_t kernfs_path_len(struct kernfs_node *kn)
 }
 
 /**
- * kernfs_path - build full path of a given node
+ * kernfs_path_from_node - build path of node @kn relative to @kn_root.
+ * @kn_root: parent kernfs_node relative to which we need to build the path
  * @kn: kernfs_node of interest
- * @buf: buffer to copy @kn's name into
+ * @buf: buffer to copy @kn's path into
  * @buflen: size of @buf
  *
- * Builds and returns the full path of @kn in @buf of @buflen bytes.  The
- * path is built from the end of @buf so the returned pointer usually
- * doesn't match @buf.  If @buf isn't long enough, @buf is nul terminated
+ * Builds and returns @kn's path relative to @kn_root. @kn_root and @kn must
+ * be on the same kernfs-root. If @kn_root is not parent of @kn, then a relative
+ * path (which includes '..'s) as needed to reach from @kn_root to @kn is
+ * returned.
+ * The path may be built from the end of @buf so the returned pointer may not
+ * match @buf.  If @buf isn't long enough, @buf is nul terminated
  * and %NULL is returned.
  */
-char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+char *kernfs_path_from_node(struct kernfs_node *kn_root, struct kernfs_node *kn,
+			    char *buf, size_t buflen)
 {
 	unsigned long flags;
 	char *p;
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
-	p = kernfs_path_locked(kn, buf, buflen);
+	p = kernfs_path_from_node_locked(kn_root, kn, buf, buflen);
 	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
 	return p;
 }
+EXPORT_SYMBOL_GPL(kernfs_path_from_node);
+
+/**
+ * kernfs_path - build full path of a given node
+ * @kn: kernfs_node of interest
+ * @buf: buffer to copy @kn's name into
+ * @buflen: size of @buf
+ *
+ * Builds and returns the full path of @kn in @buf of @buflen bytes.  The
+ * path is built from the end of @buf so the returned pointer usually
+ * doesn't match @buf.  If @buf isn't long enough, @buf is nul terminated
+ * and %NULL is returned.
+ */
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	return kernfs_path_from_node(NULL, kn, buf, buflen);
+}
 EXPORT_SYMBOL_GPL(kernfs_path);
 
 /**
@@ -168,8 +296,8 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 
 	spin_lock_irqsave(&kernfs_rename_lock, flags);
 
-	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
-			       sizeof(kernfs_pr_cont_buf));
+	p = kernfs_path_from_node_locked(NULL, kn, kernfs_pr_cont_buf,
+					 sizeof(kernfs_pr_cont_buf));
 	if (p)
 		pr_cont("%s", p);
 	else
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 5d4e9c4..d025ebd 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -267,6 +267,9 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 
 int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
 size_t kernfs_path_len(struct kernfs_node *kn);
+char * __must_check kernfs_path_from_node(struct kernfs_node *root_kn,
+					  struct kernfs_node *kn, char *buf,
+					  size_t buflen);
 char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
 				size_t buflen);
 void pr_cont_kernfs_name(struct kernfs_node *kn);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 108+ messages in thread
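For readers who want to try the path logic outside the kernel, here is a minimal userspace sketch of the idea behind kernfs_path_from_node_locked(): find the common ancestor, emit one "/.." per level from the "from" node up to it, then append the names leading down to the "to" node. It is not the patch's code. struct node, node_depth(), common_ancestor() and path_from_node() are hypothetical stand-ins invented for illustration, there is no locking, and the byte counting is done up front rather than per step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct kernfs_node. */
struct node {
	const char *name;	/* "" for the root */
	struct node *parent;	/* NULL for the root */
};

/* Number of edges between @n and the root of its tree. */
static size_t node_depth(const struct node *n)
{
	size_t depth = 0;

	while (n->parent) {
		depth++;
		n = n->parent;
	}
	return depth;
}

/* Deepest node above (or equal to) both @a and @b; NULL if the trees differ. */
static const struct node *common_ancestor(const struct node *a,
					  const struct node *b)
{
	size_t da = node_depth(a), db = node_depth(b);

	/* Bring both nodes to the same depth, then climb in lockstep. */
	while (da > db) {
		a = a->parent;
		da--;
	}
	while (db > da) {
		b = b->parent;
		db--;
	}
	while (a != b) {
		/* Two distinct roots both step to NULL here and the loop ends. */
		a = a->parent;
		b = b->parent;
	}
	return a;
}

/*
 * Write the path of @to into @buf, treating @from as the root of the path.
 * A NULL @from means the real root of @to's tree.  Returns @buf, or NULL if
 * @buf is too small or the nodes are in different trees.
 */
static char *path_from_node(const struct node *from, const struct node *to,
			    char *buf, size_t buflen)
{
	const struct node *common, *n;
	size_t ups = 0, names = 0, len;
	char *p = buf;

	assert(to && buf);
	if (buflen < 2)
		return NULL;
	if (!from)
		for (from = to; from->parent; from = from->parent)
			;

	common = common_ancestor(from, to);
	if (!common)
		return NULL;

	for (n = from; n != common; n = n->parent)
		ups++;				/* one "/.." per level */
	for (n = to; n != common; n = n->parent)
		names += strlen(n->name) + 1;	/* '/' plus the name */

	len = ups * 3 + names;
	if (len == 0) {				/* from == to */
		strcpy(buf, "/");
		return buf;
	}
	if (len + 1 > buflen)
		return NULL;

	for (; ups; ups--) {
		memcpy(p, "/..", 3);
		p += 3;
	}
	/* Fill in the names back to front, walking up from @to. */
	p += names;
	*p = '\0';
	for (n = to; n != common; n = n->parent) {
		size_t nlen = strlen(n->name);

		p -= nlen;
		memcpy(p, n->name, nlen);
		*--p = '/';
	}
	return buf;
}
```

With a tree shaped like the one in the patch comment, calling path_from_node() with "from" at /n1/n2/n3/n4 and "to" at /n1/n2/n5 should produce "/../../n5", and passing the same node twice should produce "/", which is the convention discussed in this thread.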
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]             ` <20151130183758.GA25433-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2015-11-30 22:53               ` Tejun Heo
  0 siblings, 0 replies; 108+ messages in thread
From: Tejun Heo @ 2015-11-30 22:53 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

Hello, Serge.

On Mon, Nov 30, 2015 at 12:37:58PM -0600, Serge E. Hallyn wrote:
> > Yeah, I agree but the name is kinda misleading tho.  The output isn't
> > really a relative path but rather absolute path against the specified
> > root.  Maybe updating the function and parameter names would be
> > helpful?
> >
>
> Ok - updating the comment is simple enough.  Though the name/params
> kernfs_path_from_node_locked(from, to) still seem to make sense.  Would
> you prefer something like kernfs_absolute_path_from_node_locked()?  I
> hesitate to call 'from' 'root' since kernfs_root is a thing and this
> is not that.

Hmmm... I see.  Let's just make sure that the comment is clear about
the fact that it calculates (pseudo) absolute path rather than
relative path.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 108+ messages in thread
* Re: [PATCH 1/8] kernfs: Add API to generate relative kernfs path
       [not found]           ` <20151130151147.GG3535-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
@ 2015-11-30 18:37             ` Serge E. Hallyn
  0 siblings, 0 replies; 108+ messages in thread
From: Serge E. Hallyn @ 2015-11-30 18:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Serge E. Hallyn, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

On Mon, Nov 30, 2015 at 10:11:47AM -0500, Tejun Heo wrote:
> Hello,
>
> On Thu, Nov 26, 2015 at 11:25:11PM -0600, Serge E. Hallyn wrote:
> > > > +	/* Short-circuit the easy case - kn_to is the root node. */
> > > > +	if ((kn_from == kn_to) || (!kn_from && !kn_to->parent)) {
> > > > +		*p = '/';
> > > > +		*(p + 1) = '\0';
> > >
> > > Hmm... so if kn_from == kn_to, the output is "/"?
> >
> > Yes, that's what seems to make the most sense for cgroup namespaces.  I
> > could see a case for '.' being used instead in general, but for cgroup
> > namespaces I think we'd have to convert those back to '/'.  Otherwise
> > we'll fail in being able to run legacy software, which would get
> > confused.
>
> Yeah, I agree but the name is kinda misleading tho.  The output isn't
> really a relative path but rather absolute path against the specified
> root.  Maybe updating the function and parameter names would be
> helpful?
>
> Thanks.

Ok - updating the comment is simple enough.  Though the name/params
kernfs_path_from_node_locked(from, to) still seem to make sense.  Would
you prefer something like kernfs_absolute_path_from_node_locked()?  I
hesitate to call 'from' 'root' since kernfs_root is a thing and this
is not that.

^ permalink raw reply	[flat|nested] 108+ messages in thread
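To make the "legacy software would get confused" point concrete, here is a hedged sketch of how a userspace consumer might parse one line of /proc/<pid>/cgroup, whose fields are hierarchy-ID:controller-list:path (as in the "8:memory:/../../.." example from the cover letter). struct cgroup_entry and parse_cgroup_line() are hypothetical names, and the strict check for a leading '/' is an assumption about what existing parsers may do, not something stated in this thread. Under that assumption a "." result would be rejected, while both "/" and "/../../.." are accepted.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one parsed /proc/<pid>/cgroup line. */
struct cgroup_entry {
	int hierarchy_id;
	char controllers[64];
	char path[256];
};

/*
 * Parse "hierarchy-ID:controller-list:path".  Returns 0 on success and -1
 * if the line is malformed, a field does not fit, or the path does not
 * start with '/' (the assumed legacy expectation).
 */
static int parse_cgroup_line(const char *line, struct cgroup_entry *out)
{
	const char *c1, *c2;
	size_t clen, plen;
	char *end;
	long id;

	assert(line && out);

	c1 = strchr(line, ':');
	c2 = c1 ? strchr(c1 + 1, ':') : NULL;
	if (!c2)
		return -1;

	/* The ID must be a non-empty decimal number ending at the first ':'. */
	id = strtol(line, &end, 10);
	if (end == line || end != c1 || id < 0 || id > INT_MAX)
		return -1;

	clen = (size_t)(c2 - c1 - 1);
	plen = strcspn(c2 + 1, "\n");	/* drop a trailing newline, if any */
	if (clen >= sizeof(out->controllers) || plen >= sizeof(out->path))
		return -1;

	/* A path like "." would fail here. */
	if (c2[1] != '/')
		return -1;

	out->hierarchy_id = (int)id;
	memcpy(out->controllers, c1 + 1, clen);
	out->controllers[clen] = '\0';
	memcpy(out->path, c2 + 1, plen);
	out->path[plen] = '\0';
	return 0;
}
```

A parser written this way keeps working when the kernel reports "/" for a task sitting at the reader's cgroup namespace root, which is one way to read the compatibility argument made above.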