From: ebiederm@xmission.com (Eric W. Biederman)
To: Jann Horn
Cc: Andy Lutomirski, Andrei Vagin, Linux Kernel Mailing List, Linux API,
    linux-um@lists.infradead.org, criu@openvz.org, avagin@google.com,
    Andrew Morton, Anton Ivanov, Christian Brauner,
    Dmitry Safonov <0x7f454c46@gmail.com>, Ingo Molnar, Jeff Dike,
    Mike Rapoport, Michael Kerrisk, Oleg Nesterov,
    "Peter Zijlstra (Intel)", Richard Weinberger, Thomas Gleixner
Subject: Re: [PATCH 2/4] arch/x86: implement the process_vm_exec syscall
Date: Mon, 28 Jun 2021 13:18:07 -0500
Message-ID: <87o8bpyhsw.fsf@disp2133>
References: <20210414055217.543246-1-avagin@gmail.com> <20210414055217.543246-3-avagin@gmail.com>
In-Reply-To: (Jann Horn's message of "Mon, 28 Jun 2021 19:14:31 +0200")
X-Mailing-List: linux-kernel@vger.kernel.org

Jann Horn writes:

> On Mon, Jun 28, 2021 at 6:30 PM Andy Lutomirski wrote:
>> On Mon, Jun 28, 2021, at 9:13 AM, Jann Horn wrote:
>> > On Wed, Apr 14, 2021 at 7:59 AM Andrei Vagin wrote:
>> > > This change introduces the new system call:
>> > > process_vm_exec(pid_t pid, struct sigcontext *uctx, unsigned long flags,
>> > >                 siginfo_t * uinfo, sigset_t *sigmask, size_t sizemask)
>> > >
>> > > process_vm_exec allows to execute the current process in an address
>> > > space of another process.
>> > [...]
>> >
>> > I still think that this whole API is fundamentally the wrong approach
>> > because it tries to shoehorn multiple usecases with different
>> > requirements into a single API. But that aside:
>> >
>> > > +static void swap_mm(struct mm_struct *prev_mm, struct mm_struct *target_mm)
>> > > +{
>> > > +       struct task_struct *tsk = current;
>> > > +       struct mm_struct *active_mm;
>> > > +
>> > > +       task_lock(tsk);
>> > > +       /* Hold off tlb flush IPIs while switching mm's */
>> > > +       local_irq_disable();
>> > > +
>> > > +       sync_mm_rss(prev_mm);
>> > > +
>> > > +       vmacache_flush(tsk);
>> > > +
>> > > +       active_mm = tsk->active_mm;
>> > > +       if (active_mm != target_mm) {
>> > > +               mmgrab(target_mm);
>> > > +               tsk->active_mm = target_mm;
>> > > +       }
>> > > +       tsk->mm = target_mm;
>> >
>> > I'm pretty sure you're not currently allowed to overwrite the ->mm
>> > pointer of a userspace thread. For example, zap_threads() assumes that
>> > all threads running under a process have the same ->mm. (And if you're
>> > fiddling with ->mm stuff, you should probably CC linux-mm@.)
>>
>> exec_mmap() does it, so it can’t be entirely impossible.
>
> Yeah, true, execve can do it - I guess the thing that makes that
> special is that it's running after de_thread(), so it's guaranteed to
> be single-threaded?

Even setting aside the implementation detail of swapping the mm, the
idea of swapping the mm is itself completely broken, as endless system
calls depend upon the state held in task_struct.  io_uring just tried
running system calls of a process in a different context, and we
ultimately had to make the threads part of the original process to make
enough things work to keep the problem tractable.

System calls deeply and fundamentally depend on task_struct and
signal_struct.

I can think of two possibilities.
1) Hijack an existing process thread.
2) Inject a new thread into an existing process.

Anything else is just an exercise in trouble.
Of these, I think hijacking an existing thread is the only one that
won't require lots of tracking down of special cases.  I seem to
remember audit is still struggling with how to properly audit io_uring
threads.

Eric
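
To make the objection above concrete, here is a toy userspace model, not
kernel code.  It only illustrates the two points raised in this thread:
overwriting ->mm on one thread breaks the "all threads of a process have
the same ->mm" assumption attributed to zap_threads(), and the thread
keeps the rest of its original per-process state.  Every toy_* name is
hypothetical and stands in for a far larger kernel structure; this is a
sketch under those assumptions, not how the patch or the kernel works.

```c
/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

struct toy_mm { int id; };              /* stand-in for mm_struct */
struct toy_signal { int nr_threads; };  /* stand-in for signal_struct */

struct toy_task {                       /* stand-in for task_struct */
	struct toy_mm *mm;
	struct toy_signal *signal;
};

/* Overwrite only ->mm, loosely mirroring the final step of the quoted swap_mm(). */
void toy_swap_mm(struct toy_task *tsk, struct toy_mm *target_mm)
{
	tsk->mm = target_mm;
}

/*
 * Count threads of one group whose ->mm differs from the first thread's.
 * A nonzero result breaks the "all threads share one mm" assumption.
 */
size_t toy_count_mm_mismatches(const struct toy_task *threads, size_t n)
{
	size_t i, bad = 0;

	for (i = 1; i < n; i++)
		if (threads[i].mm != threads[0].mm)
			bad++;
	return bad;
}

/* Two threads of process A; the second one borrows process B's mm. */
size_t toy_demo(void)
{
	struct toy_mm mm_a = { 1 }, mm_b = { 2 };
	struct toy_signal sig_a = { 2 };
	struct toy_task group[2] = {
		{ &mm_a, &sig_a },
		{ &mm_a, &sig_a },
	};

	/* Before the swap, the thread group is consistent. */
	assert(toy_count_mm_mismatches(group, 2) == 0);

	toy_swap_mm(&group[1], &mm_b);

	/* The thread now uses B's mm but still carries A's signal state. */
	assert(group[1].mm == &mm_b && group[1].signal == &sig_a);

	return toy_count_mm_mismatches(group, 2);
}
```

Calling toy_demo() from a small main() reports how many threads of the
group no longer share the first thread's mm after the swap.  The model
deliberately ignores locking, reference counting and active_mm; it only
shows why code that walks a thread group and expects a single mm would be
surprised.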