From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-2.5 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1
	autolearn=no autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 658D1C433E2
	for ; Fri, 26 Jun 2020 04:59:35 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 3E52620768
	for ; Fri, 26 Jun 2020 04:59:35 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726847AbgFZE7d (ORCPT ); Fri, 26 Jun 2020 00:59:33 -0400
Received: from www262.sakura.ne.jp ([202.181.97.72]:62634 "EHLO
	www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1725306AbgFZE7c (ORCPT );
	Fri, 26 Jun 2020 00:59:32 -0400
Received: from fsav107.sakura.ne.jp (fsav107.sakura.ne.jp [27.133.134.234])
	by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id 05Q4waSg074045;
	Fri, 26 Jun 2020 13:58:36 +0900 (JST)
	(envelope-from penguin-kernel@i-love.sakura.ne.jp)
Received: from www262.sakura.ne.jp (202.181.97.72)
	by fsav107.sakura.ne.jp (F-Secure/fsigk_smtp/550/fsav107.sakura.ne.jp);
	Fri, 26 Jun 2020 13:58:36 +0900 (JST)
X-Virus-Status: clean(F-Secure/fsigk_smtp/550/fsav107.sakura.ne.jp)
Received: from [192.168.1.9] (M106072142033.v4.enabler.ne.jp [106.72.142.33])
	(authenticated bits=0)
	by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTPSA id 05Q4wZ1f074034
	(version=TLSv1.2 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 26 Jun 2020 13:58:35 +0900 (JST)
	(envelope-from penguin-kernel@i-love.sakura.ne.jp)
Subject: Re: [RFC][PATCH] net/bpfilter: Remove this broken and apparently
	unmantained
To: Alexei Starovoitov, Linus Torvalds
Cc: David Miller, Greg Kroah-Hartman, "Eric W. Biederman", Kees Cook,
	Andrew Morton, Alexei Starovoitov, Al Viro, bpf, linux-fsdevel,
	Daniel Borkmann, Jakub Kicinski, Masahiro Yamada, Gary Lin,
	Bruno Meneguele, LSM List, Casey Schaufler
References: <20200625095725.GA3303921@kroah.com>
	<778297d2-512a-8361-cf05-42d9379e6977@i-love.sakura.ne.jp>
	<20200625120725.GA3493334@kroah.com>
	<20200625.123437.2219826613137938086.davem@davemloft.net>
	<20200626015121.qpxkdaqtsywe3zqx@ast-mbp.dhcp.thefacebook.com>
From: Tetsuo Handa
Message-ID:
Date: Fri, 26 Jun 2020 13:58:35 +0900
User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:68.0)
	Gecko/20100101 Thunderbird/68.9.0
MIME-Version: 1.0
In-Reply-To: <20200626015121.qpxkdaqtsywe3zqx@ast-mbp.dhcp.thefacebook.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

On 2020/06/26 10:51, Alexei Starovoitov wrote:
> On Thu, Jun 25, 2020 at 06:36:34PM -0700, Linus Torvalds wrote:
>> On Thu, Jun 25, 2020 at 12:34 PM David Miller wrote:
>>>
>>> It's kernel code executing in userspace. If you don't trust the
>>> signed code you don't trust the signed code.
>>>
>>> Nothing is magic about a piece of code executing in userspace.
>>
>> Well, there's one real issue: the most likely thing that code is going
>> to do is execute llvm to generate more code.

Wow! Are we going to allow execution of such complicated programs?
I was hoping that fork_usermode_blob() accepts only simple programs like the
content of "hello64" generated by

----------
; nasm -f elf64 hello64.asm && ld -s -m elf_x86_64 -o hello64 hello64.o
section .text
global _start

_start:
  mov rax, 1        ; write(
  mov rdi, 1        ;   1,
  mov rsi, msg      ;   "Hello world\n",
  mov rdx, 12       ;   12
  syscall           ; );
  mov rax, 231      ; _exit(
  mov rdi, 0        ;   0
  syscall           ; );

section .rodata
  msg: db "Hello world", 0x0a
----------

which can be contained by mechanisms like seccomp; there is no pathname
resolution, no networking access etc.

>>
>> And that's I think the real security issue here: the context in which
>> the code executes. It may be triggered in one namespace, but what
>> namespaces and what rules should the thing actually then execute in.
>>
>> So no, trying to dismiss this as "there are no security issues" is
>> bogus. There very much are security issues.
>
> I think you're referring to:
>
>>> We might need to invent built-in "protected userspace" because existing
>>> "unprotected userspace" is not trustworthy enough to run kernel modules.
>>> That's not just inventing fork_usermode_blob().
>
> Another root process can modify the memory of usermode_blob process.

I'm not familiar with ptrace(); I'm just using /usr/bin/strace and
/usr/bin/ltrace . What I'm worried about is that some root process tampers
with the memory which initially contained "hello64" above in order to make
that memory behave differently.

For example, if a usermode process started by fork_usermode_blob() which
initially contained

----------
while (read(0, &uid, sizeof(uid)) == sizeof(uid)) {
    if (uid == 0)
        write(1, "OK\n", 3);
    else
        write(1, "NG\n", 3);
}
----------

can somehow be tampered with to become

----------
while (read(0, &uid, sizeof(uid)) == sizeof(uid)) {
    if (uid != 0)
        write(1, "OK\n", 3);
    else
        write(1, "NG\n", 3);
}
----------

due to interference from the rest of the system, how can we say "we trust
kernel code executing in userspace" ?

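To make the tampering concern above concrete, here is a minimal sketch (not
code from this thread) of one process rewriting a word in another process's
memory via ptrace(2). Assumptions: trusted_uid and tamper_demo() are
hypothetical names used only for illustration; the tracer here is the parent
of a child that opted in with PTRACE_TRACEME, whereas a root process would
more likely use PTRACE_ATTACH or /proc/<pid>/mem against an unrelated
process; and the sketch flips data rather than the comparison instruction
itself.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for state that lives inside the usermode blob. */
static volatile long trusted_uid = 0;

/*
 * Fork a child, overwrite its copy of trusted_uid with new_value while it
 * is stopped, then let it run. The child exits with 0 if it still sees
 * trusted_uid == 0 and with 1 otherwise. Returns that exit status, or -1
 * on error (for example when ptrace is not permitted).
 */
int tamper_demo(long new_value)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: let the parent trace us, then stop until it is done. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(255);
        raise(SIGSTOP);
        /* The child never assigns to trusted_uid itself. */
        _exit(trusted_uid == 0 ? 0 : 1);
    }

    /* Parent: wait for the child to stop on SIGSTOP. */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* fork() duplicated the address space, so the address is the same. */
    if (ptrace(PTRACE_POKEDATA, pid, (void *)&trusted_uid,
               (void *)new_value) < 0 ||
        ptrace(PTRACE_DETACH, pid, NULL, NULL) < 0) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

On a system where ptrace is permitted, tamper_demo(1000) should return 1
even though the child's own code never modified trusted_uid, which is the
kind of outside interference the question above is about.
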
My question is: how is the byte array (which was copied from kernel space)
kept secure/intact in a "root can poke into kernel or any process memory."
environment? It is obvious that we can't say "we trust kernel code executing
in userspace" without some mechanism.

Currently fork_usermode_blob() does not provide a security context for the
byte array to be executed. We could modify fork_usermode_blob() to provide a
security context for LSMs, but I would be happier if we could implement that
mechanism without counting on in-tree LSMs, for SELinux is too complicated
to support.

> I think that's Tetsuo's point about lack of LSM hooks is kernel_sock_shutdown().
> Obviously, kernel_sock_shutdown() can be called by kernel only.

I don't understand what you mean. The kernel code executing in userspace
uses the syscall interface (e.g. the SYSCALL_DEFINE2(shutdown, int, fd, int, how)
path), doesn't it?

> I suspect he's imaging a hypothetical situation where kernel bits of kernel module
> interact with userblob bits of kernel module.
> Then another root process tampers with memory of userblob.

Yes, how to protect the memory of userblob is a concern. That the memory of
userblob can interfere with (or be interfered with by) the rest of the
system is a problem.

> Then userblob interaction with kernel module can do kernel_sock_shutdown()
> on something that initial design of kernel+userblob module didn't intend.

I don't understand what you mean.

> I think this is trivially enforceable without creating new features.
> Existing security_ptrace_access_check() LSM hook can prevent tampering with
> memory of userblob.

There is the security_ptrace_access_check() LSM hook, but no
zero-configuration method is available.

>
> As far as userblob calling llvm and other things in sequence.
> That is no different from systemd calling things.

Right.

> security label can carry that execution context.

If files get a chance to be associated with an appropriate pathname and
security label.

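Regarding the earlier point that security_ptrace_access_check() needs LSM
configuration: for comparison, the sketch below shows one knob that exists
today without any LSM, prctl(PR_SET_DUMPABLE, 0). As documented in prctl(2)
and ptrace(2), a non-dumpable process cannot be attached to by a tracer that
lacks CAP_SYS_PTRACE. It does not stop a root process that holds
CAP_SYS_PTRACE, so it does not answer the concern above. drop_dumpable() is
a hypothetical helper name, not something fork_usermode_blob() does.

```c
#include <sys/prctl.h>

/*
 * Mark the calling process as non-dumpable and verify that the flag took
 * effect. Returns 0 on success, -1 on failure.
 *
 * Assumption: this only narrows who may ptrace-attach; a tracer with
 * CAP_SYS_PTRACE is still allowed by the kernel's access check.
 */
int drop_dumpable(void)
{
    if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0)
        return -1;
    /* PR_GET_DUMPABLE returns the current flag value (0 = not dumpable). */
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) == 0 ? 0 : -1;
}
```

A blob would have to call something like this itself right after starting,
which is still configuration inside the blob rather than a guarantee
provided by the kernel when it launches the blob.
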
>
>> My personally strongest argument for remoiving this kernel code is
>> that it's been there for a couple of years now, and it has never
>> actually done anything useful, and there's no actual sign that it ever
>> will, or that there is a solid plan in place for it.
>
> you probably missed the detailed plan:
> https://lore.kernel.org/bpf/20200609235631.ukpm3xngbehfqthz@ast-mbp.dhcp.thefacebook.com/
>
> The project #3 is the above is the one we're working on right now.
> It should be ready to post in a week.

I have a question about project #3. Given that "cat /sys/fs/bpf/my_ipv6_route"
produces the same human-readable output as "cat /proc/net/ipv6_route", how
can the security checks which are done for "cat /proc/net/ipv6_route" be
enforced for "cat /sys/fs/bpf/my_ipv6_route" ? Unless the same security
checks (e.g. permission to read /proc/net/ipv6_route ) are enforced, such
bpf usage sounds like a method for bypassing existing security mechanisms.