From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754403AbdCIL2g (ORCPT ); Thu, 9 Mar 2017 06:28:36 -0500
Received: from mail-qk0-f195.google.com ([209.85.220.195]:35243 "EHLO mail-qk0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752672AbdCIL2a (ORCPT ); Thu, 9 Mar 2017 06:28:30 -0500
MIME-Version: 1.0
In-Reply-To:
References: <20170221145746.GA31914@redhat.com> <20170306230515.GA3453@comp-core-i7-2640m-0182e6>
From: Djalal Harouni
Date: Thu, 9 Mar 2017 12:26:49 +0100
Message-ID:
Subject: Re: [RFC] Add option to mount only a pids subset
To: Andy Lutomirski
Cc: Alexey Gladkov , Linux Kernel Mailing List , Linux API , "Kirill A. Shutemov" , Vasiliy Kulikov , Al Viro , "Eric W. Biederman" , Oleg Nesterov , Pavel Emelyanov , James Bottomley
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 7, 2017 at 5:24 PM, Andy Lutomirski wrote:
>
> On Mon, Mar 6, 2017 at 3:05 PM, Alexey Gladkov wrote:
> >
> > After discussion with Oleg Nesterov I reimplement my patch as an additional
> > option for /proc. This option affects the mountpoint. It means that in one
> > pid namespace it possible to have both the whole traditional /proc and
> > /proc with only pids subset.
> >
>
> I like this. I think you should split it into two patches, though:
> one that reworks how procfs gets mounted and one that makes adds the
> new functionality.
>
> Djajal had some concerns about the first part breaking applications
> that use stat and expect certain behavior. This should be manageable,
> though, but making stat work appropriately.

I'm a bit lost between the two discussions. However, the main concern I
was discussing with Andy is this: if we have per-superblock proc mounts,
then each mount will end up with its own device ID (st_dev). Right now
they share the same ID if they are in the same pid namespace, but if we
change that then we may break the following:

http://man7.org/linux/man-pages/man7/namespaces.7.html

Both of the new ioctl() operations, NS_GET_PARENT and NS_GET_USERNS,
return an fd, and the man page suggests following up with fstat() to
identify the namespace:

"By applying fstat(2) to the returned file descriptor, one obtains a
stat structure whose st_dev (resident device) and st_ino (inode number)
fields together identify the owning/parent namespace."

The same goes for other /proc/self/ns/* comparison and stat() logic...

Andy suggested that we may keep the same st_dev for mounts in the same
pid namespace... I'm not sure what side effects this may bring!

Thanks!

-- 
tixxdz
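
For illustration, the kind of userspace comparison the man page
describes can be sketched roughly as below. This is only a minimal
sketch, not code from the patch or from any existing tool: the helper
names ns_identity() and same_namespace() are made up for this example,
and it simply treats the (st_dev, st_ino) pair as the identity, so it
relies on st_dev being stable for the comparison to mean anything.

```python
import os


def ns_identity(path):
    """Return the (st_dev, st_ino) pair for the object that path refers to.

    Hypothetical helper: the pair is what namespaces(7) says identifies
    a namespace when stat is applied to a namespace file.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_namespace(path_a, path_b):
    """Return True if both paths resolve to the same (st_dev, st_ino) pair."""
    return ns_identity(path_a) == ns_identity(path_b)


if __name__ == "__main__":
    # Assumes a Linux system with /proc mounted; skipped otherwise.
    self_ns = "/proc/self/ns/pid"
    if os.path.exists(self_ns):
        own_ns = "/proc/%d/ns/pid" % os.getpid()
        print(ns_identity(self_ns))
        print(same_namespace(self_ns, own_ns))
```

Code written along these lines is the sort of thing that depends on
both fields of the pair, not just the inode number.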