From: Matthew Wilcox
Date: Fri, 31 Jul 2020 16:27:36 +0100
To: Steven Sistare
Cc: "Eric W. Biederman", Anthony Yznaga, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, mhocko@kernel.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
 viro@zeniv.linux.org.uk, akpm@linux-foundation.org, arnd@arndb.de,
 keescook@chromium.org, gerg@linux-m68k.org, ktkhai@virtuozzo.com,
 christian.brauner@ubuntu.com, peterz@infradead.org, esyr@redhat.com,
 jgg@ziepe.ca, christian@kellner.me, areber@redhat.com, cyphar@cyphar.com
Subject: Re: [RFC PATCH 0/5] madvise MADV_DOEXEC
Message-ID: <20200731152736.GP23808@casper.infradead.org>
References: <1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com>
 <20200730152250.GG23808@casper.infradead.org>
 <20200730171251.GI23808@casper.infradead.org>
 <63a7404c-e4f6-a82e-257b-217585b0277f@oracle.com>
 <20200730174956.GK23808@casper.infradead.org>
 <87y2n03brx.fsf@x220.int.ebiederm.org>
 <689d6348-6029-5396-8de7-a26bc3c017e5@oracle.com>
In-Reply-To: <689d6348-6029-5396-8de7-a26bc3c017e5@oracle.com>

On Fri, Jul 31, 2020 at 10:57:44AM -0400, Steven Sistare wrote:
> Matthew's sileby/mshare proposal has the same issue. If a process opts-in
> and mmap's an address in the shared region, then content becomes mapped at
> a VA that was known to the pre-fork or pre-exec process. Trust must still
> be established.

It's up to the recipient whether they try to map it at the same address
or at a fresh address.
The intended use case is a "semi-shared" address space
between two processes (i.e. partway between a threaded, fully-shared
address space and a forked un-shared address space), in which case
there's a certain amount of trust and cooperation between the processes.

Your preservation-across-exec use-case might or might not need the
VMA to be mapped at the same address. I don't know whether qemu stores
pointers in this VMA which are absolute within the qemu address space.
If it's just the emulated process's address space, then everything will
be absolute within its own address space and everything will be opaque
to qemu. If qemu is storing its own pointers in it, then it has to be
mapped at the same address.

> > Here is another suggestion.
> >
> > Have a very simple program that does:
> >
> > 	for (;;) {
> > 		handle = dlopen("/my/real/program", RTLD_NOW);
> > 		real_main = dlsym(handle, "main");
> > 		real_main(argc, argv, envp);
> > 		dlclose(handle);
> > 	}
> >
> > With whatever obvious adjustments are needed to fit your usecase.
> >
> > That should give the same level of functionality, be portable to all
> > unices, and not require you to duplicate code. I believe it limits you
> > to not upgrading libc, or librt but that is a comparatively small
> > limitation.
> >
> >
> > Given that in general the interesting work is done in userspace and that
> > userspace has provided an interface for reusing that work already,
> > I don't see the justification for adding anything to exec at this point.
>
> Thanks for the suggestion. That is clever, and would make a fun project,
> but I would not trust it for production. These few lines are just
> the first of many that it would take to reset the environment to the
> well-defined post-exec initial conditions that all executables expect,
> and incrementally tearing down state will be prone to bugs. Getting a
> clean slate from a kernel exec is a much more reliable design. The use
> case is creating long-lived apps that never go down, and the simplest
> implementation will have the fewest bugs and is the best. MADV_DOEXEC is
> simple, and does not even require a new system call, and the kernel already
> knows how to exec without bugs.

It's a net increase of 200 lines of kernel code. If 4 lines of userspace
code removes 200 lines of kernel code, I think I know which I prefer ...
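
For concreteness, the loop quoted above might be fleshed out with error
handling roughly as follows. This is only a sketch under assumptions, not
anything from the patch set: run_once() and reload_loop() are hypothetical
helper names, and it assumes the real program is built as a shared object
that exports an entry symbol with a main()-like signature (recent glibc
versions refuse to dlopen() a position-independent executable).

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical entry-point type: the loaded object is assumed to export
 * a symbol with a main()-like signature. */
typedef int (*entry_fn)(int argc, char **argv, char **envp);

/* Load the object at `path`, call `symbol` once, then unload the object.
 * Returns 0 and stores the entry point's result in *rc_out on success;
 * returns -1 (leaving *rc_out untouched) if loading or lookup fails. */
int run_once(const char *path, const char *symbol,
             int argc, char **argv, char **envp, int *rc_out)
{
    assert(path != NULL && symbol != NULL);

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error state before dlsym() */
    /* POSIX permits converting dlsym()'s void * to a function pointer. */
    entry_fn entry = (entry_fn)dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || entry == NULL) {
        fprintf(stderr, "dlsym: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    int rc = entry(argc, argv, envp);
    dlclose(handle);
    if (rc_out != NULL)
        *rc_out = rc;
    return 0;
}

/* Reload and rerun the program each time its entry point returns.
 * max_runs <= 0 means loop forever; otherwise stop after max_runs runs.
 * Returns 0 once max_runs runs have completed, -1 on a load failure. */
int reload_loop(const char *path, const char *symbol, int max_runs,
                int argc, char **argv, char **envp)
{
    for (int runs = 0; max_runs <= 0 || runs < max_runs; runs++) {
        int rc = 0;
        if (run_once(path, symbol, argc, argv, envp, &rc) != 0)
            return -1;
        fprintf(stderr, "run %d returned %d, reloading\n", runs + 1, rc);
    }
    return 0;
}
```

A wrapper's main() would call reload_loop() with max_runs set to 0 and
link with -ldl where the C library needs it. Note that dlclose() only
drops a reference, so whether the object's global state is really reset
between runs depends on it actually being unloaded.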