From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932419AbcFBHtZ (ORCPT );
	Thu, 2 Jun 2016 03:49:25 -0400
Received: from mx2.suse.de ([195.135.220.15]:51880 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932175AbcFBHtY (ORCPT );
	Thu, 2 Jun 2016 03:49:24 -0400
Date: Thu, 2 Jun 2016 09:49:20 +0200
From: Jan Kara
To: "Eric W. Biederman"
Cc: Nikolay Borisov, john@johnmccutchan.com, eparis@redhat.com,
	jack@suse.cz, linux-kernel@vger.kernel.org, gorcunov@openvz.org,
	avagin@openvz.org, netdev@vger.kernel.org,
	operations@siteground.com, Linux Containers
Subject: Re: [RFC PATCH 0/4] Make inotify instance/watches be accounted per userns
Message-ID: <20160602074920.GG19636@quack2.suse.cz>
References: <1464767580-22732-1-git-send-email-kernel@kyup.com>
	<8737ow7vcp.fsf@x220.int.ebiederm.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <8737ow7vcp.fsf@x220.int.ebiederm.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 01-06-16 11:00:06, Eric W. Biederman wrote:
> Cc'd the containers list.
> 
> Nikolay Borisov writes:
> 
> > Currently the inotify instances/watches are being accounted in the
> > user_struct structure. This means that in setups where multiple
> > users in unprivileged containers map to the same underlying
> > real user (e.g. user_struct), the inotify limits are going to be
> > shared as well, which can lead to unpleasantness. This is a problem
> > since any user inside any of the containers can potentially exhaust
> > the instance/watch limit, which in turn might prevent certain
> > services in other containers from starting.
> 
> On a high level this is a bit problematic, as it appears to escape the
> current limits and allows anyone creating a user namespace to have their
> own fresh set of limits.
> Given that anyone should be able to create a
> user namespace whenever they feel like it, escaping limits is a problem.
> That, however, is solvable.
> 
> A practical question: what kind of limits are we looking at here?
> 
> Are these loose limits for detecting buggy programs that have gone
> off the rails?
> 
> Are these tight limits to ensure multitasking is possible?

The original motivation for these limits is to bound resource usage. There
is an in-kernel data structure associated with each notification mark you
create, and we don't want users to be able to DoS the system by creating
too many of them. Thus we limit the number of notification marks for each
user. There is also a limit on the number of notification instances -
those are naturally limited by the number of open file descriptors, but an
admin may want to limit them further... So cgroups would probably be the
best fit for this, but I'm not sure whether that isn't overkill...

								Honza
-- 
Jan Kara
SUSE Labs, CR