From mboxrd@z Thu Jan 1 00:00:00 1970
From: Willy Tarreau
Subject: Re: [PATCH v3 2/3] x86/ldt: Make modify_ldt optional
Date: Fri, 24 Jul 2015 01:58:05 +0200
Message-ID: <20150723235805.GA3191__39819.1151828088$1437696034$gmane$org@1wt.eu>
References: <7bfde005b84a90a83bf668a320c7d4ad1b940065.1437592883.git.luto@kernel.org>
 <20150723102434.GA2929@1wt.eu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Andy Lutomirski
Cc: "security@kernel.org", Kees Cook, Peter Zijlstra, Andrew Cooper,
 X86 ML, LKML, Steven Rostedt, xen-devel, Jan Beulich, Borislav Petkov,
 Andy Lutomirski, Sasha Levin, Boris Ostrovsky
List-Id: xen-devel@lists.xenproject.org

On Thu, Jul 23, 2015 at 04:40:14PM -0700, Andy Lutomirski wrote:
> On Thu, Jul 23, 2015 at 4:36 PM, Kees Cook wrote:
> > I've been pondering something like this that is even MORE generic, for
> > any syscall. Something like a "syscalls" directory under
> > /proc/sys/kernel, with 1 entry per syscall. "0" is "available", "1" is
> > disabled, and "-1" disabled until next boot.
> >
>
> It might want to be /proc/sys/kernel/syscalls/[abi]/[name], possibly
> with more than just those options. We might want "disabled, returns
> ENOSYS", "disabled, returns EPERM", and a lock bit.
>
> On x86 at least, the implementation's easy -- we can just poke the
> syscall table.

I wouldn't do it these days. Around 2000-2001, a friend and I designed a
module, with a userland counterpart, called "overloader". The principle
was to intercept syscalls in order to enforce some form of policy, log
values, remap paths, etc. The first use was to log all file creations
during a "make install" to more easily build packages.
That was in the kernel 2.2 era, when it was easy to modify the syscall
table from a module. We quickly found that beyond logging/rewriting
syscall arguments, it had limited use as a "syscall firewall", because
many syscalls are too coarse to decide whether you want to enable or
disable them. I remember that socketcall() and ioctl() were among the
annoying ones: either you enable them entirely or you disable them
entirely.

In the end, the only valid use cases we found for enabling/disabling a
syscall were limited to a very small set for debugging purposes, in
order to force some application code to detect a missing implementation
and switch to an alternative (eg: these days if you suspect a bug in
epoll you could disable it and force the app to use poll instead). It
was still useful to disable module loading and FS mounting, but that was
about all back then.

All this to say that probably only a handful of tricky syscalls would
need an on/off switch, and clearly not all of them, so I'd rather add a
few entries just for the relevant ones, mainly to fix compatibility
issues and nothing more. Eg: what's the point of disabling exit(),
wait(), kill(), fork() or getpid()? It would only make bug reports
harder to sort out.

Just my opinion,
Willy
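
For illustration only, here is a minimal sketch of the epoll example
above: application code that notices a missing implementation and
switches to poll. It assumes a disabled syscall would surface as ENOSYS
or EPERM (the two return values suggested in the quoted proposal). The
names open_poller, epoll_factory and disabled_epoll are hypothetical and
exist only so the fallback path can be exercised without touching the
kernel.

```python
import errno
import select


def open_poller(epoll_factory=None):
    """Return ("epoll", obj) when epoll is usable, else ("poll", obj).

    epoll_factory is a hypothetical injection point used to simulate a
    disabled syscall; by default the real select.epoll is used.
    """
    if epoll_factory is None:
        # select.epoll is only present on Linux builds of Python.
        epoll_factory = getattr(select, "epoll", None)
    if epoll_factory is not None:
        try:
            return "epoll", epoll_factory()
        except OSError as exc:
            # Assumption: a switched-off syscall reports ENOSYS or EPERM.
            # Any other errno is a genuine failure and is re-raised.
            if exc.errno not in (errno.ENOSYS, errno.EPERM):
                raise
    # Fallback path: the older, always-available interface.
    return "poll", select.poll()


def disabled_epoll():
    """Simulate a kernel on which the epoll syscall has been turned off."""
    raise OSError(errno.ENOSYS, "Function not implemented")
```

Calling open_poller(disabled_epoll) takes the fallback branch, which is
the code path such a debugging switch would be meant to force.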