From mboxrd@z Thu Jan 1 00:00:00 1970
From: Serge Hallyn
Subject: Re: [lxc-devel] [RFC] Per-user namespace process accounting
Date: Thu, 12 Jun 2014 15:08:43 +0000
Message-ID: <20140612150842.GC4228@ubuntumail>
References: <5386D58D.2080809@1h.com> <5399BB42.60304@elastichosts.com>
In-Reply-To: <5399BB42.60304@elastichosts.com>
Reply-To: LXC development mailing-list
To: LXC development mailing-list
Cc: containers@lists.osdl.org, linux-kernel@vger.kernel.org
User-Agent: Mutt/1.5.21 (2010-09-15)

Quoting Alin Dobre (alin.dobre@elastichosts.com):
> On 29/05/14 07:37, Marian Marinov wrote:
> > Hello,
> >
> > I have the following proposition.
> >
> > Number of currently running processes is accounted at the root user namespace. The problem I'm facing is that multiple
> > containers in different user namespaces share the process counters.

Most people here probably are aware of this, but the previous, never-completed
user namespace implementation provided this and only this.  We (mostly Eric
and I) spent years looking for clean ways to make that implementation, which
had some advantages (including this one), complete.  We did have a few POCs
which worked but were unsatisfying.  The two things which were never convincing
were (a) conversion of all uid checks to be namespace-safe, and (b) storing
namespace identifiers on disk.  (As I say we did have solutions to these, but
not satisfying ones).  These are the two things which the new implementation
addresses *beautifully*.

> > So if containerX runs 100 with UID 99, containerY should have NPROC limit of above 100 in order to execute any
> > processes with its own UID 99.
> >
> > I know that some of you will tell me that I should not provision all of my containers with the same UID/GID maps, but
> > this brings another problem.
>
> If this matters, we also suffer from the same problem here. So we
> support any implementation that would address it.

ISTM the only reasonable answer here (at least for now) is to make it more
convenient to isolate uid ranges, by providing a way to shift uids at mount
time as has been discussed a bit.

If we go down the route of talking about uid 99 in ns 1 vs uid 99 in ns 2,
then people will also expect isolation at file access time, and we're back
to all the disadvantages of the first userns implementation.

(If someone proves me wrong by suggesting a clean solution, then awesome)

-serge
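
A minimal sketch of the collision discussed in this thread, assuming process counts are kept per host (root-namespace) uid and that each container's uid map is a list of (inside start, outside start, length) ranges, as in /proc/<pid>/uid_map. The names `UidMapEntry`, `to_host_uid` and `nproc_by_host_uid`, and the 100000 and 200000 base values, are hypothetical and chosen only for illustration; this is a userspace model, not kernel code.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Optional, Tuple


class UidMapEntry(NamedTuple):
    """One range of a uid map: inside start, outside (host) start, length."""
    inside_start: int
    outside_start: int
    length: int


def to_host_uid(uid_map: List[UidMapEntry], uid: int) -> Optional[int]:
    """Translate a uid seen inside a container to the host uid, or None if unmapped."""
    for entry in uid_map:
        if entry.inside_start <= uid < entry.inside_start + entry.length:
            return entry.outside_start + (uid - entry.inside_start)
    return None


def nproc_by_host_uid(
    containers: Iterable[Tuple[List[UidMapEntry], int, int]]
) -> Counter:
    """Sum process counts per host uid (assumption: accounting is keyed by host uid).

    Each item is (uid_map, uid inside the container, number of processes).
    """
    counts: Counter = Counter()
    for uid_map, inside_uid, nprocs in containers:
        host_uid = to_host_uid(uid_map, inside_uid)
        if host_uid is not None:
            counts[host_uid] += nprocs
    return counts


# Hypothetical maps: containerX and containerY provisioned with the same range.
shared_map = [UidMapEntry(0, 100000, 65536)]
# Hypothetical alternative: containerY shifted to a disjoint host range.
shifted_map = [UidMapEntry(0, 200000, 65536)]

# Same maps: uid 99 in both containers is one host uid, so the counts add up.
same = nproc_by_host_uid([(shared_map, 99, 100), (shared_map, 99, 1)])

# Disjoint maps: uid 99 in each container is a different host uid.
disjoint = nproc_by_host_uid([(shared_map, 99, 100), (shifted_map, 99, 1)])
```

With identical maps, containerY's first process as uid 99 is already counted alongside containerX's 100 processes, which is why its NPROC limit would have to be above 100. With disjoint ranges the counts stay separate, at the cost of needing a convenient way to shift uids on the container's filesystem.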