From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Cyrill Gorcunov
Cc: Stanislav Kinsburskiy, peterz@infradead.org, mingo@redhat.com,
	mhocko@suse.com, keescook@chromium.org, linux-kernel@vger.kernel.org,
	mguzik@redhat.com, bsegall@google.com, john.stultz@linaro.org,
	oleg@redhat.com, matthltc@us.ibm.com, akpm@linux-foundation.org,
	luto@amacapital.net, vbabka@suse.cz, xemul@virtuozzo.com
Subject: Re: [PATCH] prctl: remove one-shot limitation for changing exe link
References: <20160712152940.24895.61315.stgit@localhost.localdomain>
	<8a863273-c571-63d6-c0c3-637dff5645a3@virtuozzo.com>
	<87y44pbmtc.fsf@x220.int.ebiederm.org>
	<20160725192242.GA26208@uranus>
	<87a8h58pac.fsf@x220.int.ebiederm.org>
	<20160726083445.GB26208@uranus>
Date: Sat, 30 Jul 2016 12:31:40 -0500
In-Reply-To: <20160726083445.GB26208@uranus> (Cyrill Gorcunov's message of
	"Tue, 26 Jul 2016 11:34:45 +0300")
Message-ID: <87y44j6nib.fsf@x220.int.ebiederm.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Cyrill Gorcunov writes:

> On Mon, Jul 25, 2016 at 02:56:43PM -0500, Eric W. Biederman wrote:
> ...
>> >>
>> >> Also there is a big fat bug in prctl_set_mm_exe_file. It doesn't
>> >> validate that the new file is a actually mmaped executable. We would
>> >> definitely need that to be fixed before even considering removing the
>> >> limit.
>> >
>> > Could you please elaborate? We check for inode being executable,
>> > what else needed?
>>
>> That the inode is mmaped into the process with executable mappings.
>>
>> Effectively what we check the old mapping for and refuse to remove the old
>> mm_exe_file if it exists.
>>
>> I think a reasonable argument can be made that if the file is
>> executable, and it is mmaped with executable pages that exe_file is not
>> a complete lie.
>
> I might be missing something obvious, so sorry for the question --
> when criu setups old exe link the inode we obtain from file open
> is not mapped into memory, the old exe not read by anyone because
> it's not even executed anyhow. So I don't really understand which
> mapping we should check here. Mind to point me?

That sounds like an out and out bug that should not be preserved. Of
course we should mmap the executable and set it up so that it can be
executed (at least as much as the executable was previously mapped).
Anything else is a buggy restart, and lying to userspace.

>> Which is the important part. At the end of the day how much can
>> userspace trust /proc/pid/exe? If we are too lax it is just a random
>> file descriptor we can not trust at all. At which point there is
>> exactly no point in preserving it in checkpoint/restart, because nothing
>> will trust or look at it.
>
> You know, I think we should not trust exe link much, and in real we
> never could: this link is rather a hint pointing which executable a
> process has been using on execve call, once the process start working
> one can't be sure if the code currently running is exactly from the
> file pointed by exe link. It just a hint suitable for debuggin and
> obtain clean view of which processes are running on noncompromised
> system. Monitoring exe link change won't help much if there are
> malicious software running on the system.

But it is not just a hint. It is a record of which executable we
called execve on. Knowing which file was executed doesn't guarantee
what is running now but it provides a very strong hint. At the end of
a restart the state of a process should be (by definition) exactly the
state the process was in before a checkpoint, and thus a state the
original executable could have gotten into.
I admit it is possible for an application to unmap itself. I honestly
have not met that application (except perhaps criu).

>> If the only user is checkpoint/restart perhaps it should be only ptrace
>> that can set this and not the process itself with a prctl. I don't
>> know. All I know is that we should work on making it a very trustable
>> value even though in some specific instances we can set it.
>
> Since as I said I suppose nobody except us using this feature, we can
> setup some sysctl trigger for it (I personally think this is an
> overkill, but OTOH if people rely on the exe link and not going
> to use criu at all, this trigger will help).

Some clarity of thought came to me, and I apologise for not replying
with it sooner.

My problem with the original patch submission is that it was justifying
changing prctl_set_mm_exe_file based on what prctl_set_mm_exe_file does
today. As prctl_set_mm_exe_file was added for the checkpoint/restart
case, that is justifying changing code based on a buggy implementation.

It is necessary to look at the ordinary situation. Without
prctl_set_mm_exe_file, /proc/[pid]/exe can be counted on as a record of
which executable was last passed to execve. Furthermore the state of a
process can be counted on to be a state reachable from calling execve on
/proc/[pid]/exe.

Which means that to preserve those expectations prctl_set_mm_exe_file
should in practice just be a nicer, less cumbersome interface to things
you can already achieve with execve.

Justifying removal of the one-shot nature of prctl_set_mm_exe_file is
as straightforward as noting that a process can call execve on any
executable file.

However when I compare the invariants that execve has on a file (such as
the executable being mmaped) I see some noticeable disparities between
what prctl_set_mm_exe_file allows and what execve allows, with
prctl_set_mm_exe_file being less strict.

So what I am requesting is very simple.
That the checks in prctl_set_mm_exe_file be tightened up to more closely
approach what execve requires, thus preserving the value of
/proc/[pid]/exe for the applications that want to use the exe link.

Once the checks in prctl_set_mm_exe_file are tightened up please feel
free to remove the one-shot test.

Eric
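
The invariant discussed in this message (the exe link should name a file
that is actually mmaped into the process with executable pages) can be
observed from userspace. The following is only a minimal sketch of that
idea, not the kernel-side check requested for prctl_set_mm_exe_file. The
function names has_exec_mapping and exe_link_is_mapped are hypothetical,
and comparing path strings is a simplifying assumption; a kernel check
would look at the mm's mappings directly rather than at text from /proc.

```python
import os


def has_exec_mapping(maps_text, exe_path):
    """Return True if maps_text (in /proc/[pid]/maps format) contains at
    least one mapping of exe_path that has the execute permission set."""
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing path
        perms, path = fields[1], fields[5]
        if path == exe_path and "x" in perms:
            return True
    return False


def exe_link_is_mapped(pid="self"):
    """Check whether the target of /proc/[pid]/exe appears in
    /proc/[pid]/maps with executable permission (Linux only)."""
    exe_path = os.readlink(f"/proc/{pid}/exe")
    with open(f"/proc/{pid}/maps") as maps:
        return has_exec_mapping(maps.read(), exe_path)
```

For a process started by an ordinary execve, exe_link_is_mapped() would
be expected to return True. A restore that sets the exe link to a file
it never maps is the case this message objects to, and this sketch would
report False for it.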