Dear Andrew,

Andrew Morton wrote:
> Well OK. Put all that on top of a patch, add suitable signoffs and
> cc's and send it along?

The purpose of this patch is to allow privileged processes to set
their own per-process memory-region fields:

	start_code, end_code, start_data, end_data, start_brk, brk,
	start_stack, arg_start, arg_end, env_start, env_end.

This functionality is needed by any application or package that needs
to reconstruct Linux processes, that is, to start them in any way
other than by means of an "execve()" from an executable file.

This includes:

1. Restoring processes from a checkpoint-file (by all potential
   user-level checkpointing packages, not only CRIU's).
2. Restarting processes on another node after process migration.
3. Starting duplicated copies of a running process (for reliability
   and high-availability).
4. Starting a process from an executable format that is not supported
   by Linux, thus requiring a "manual execve" by a user-level utility.
5. Similarly, starting a process from a networked and/or encrypted
   executable that, for confidentiality, licensing or other reasons,
   may not be written to the local file-systems.

The code that does this was already added to the Linux kernel by the
CRIU group, in the form of "prctl(PR_SET_MM)", but until now it has
been enclosed within their private "#ifdef CONFIG_CHECKPOINT_RESTORE",
which is normally disabled.

It was not clear from your answer, Andrew, whether you prefer to
remove the "#ifdef CONFIG_CHECKPOINT_RESTORE" from the said code
altogether, or to enclose the code in a new configuration option that
is enabled by default. I therefore attach two alternative patches to
choose from: the first removes the #ifdef altogether, while the second
introduces a new option.

Signed-off-by: Amnon Shiloh.

Best Regards,
Amnon.
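
For illustration only, this is roughly how a user-level restorer might
call the "prctl(PR_SET_MM)" interface for the eleven fields listed
above. It is a sketch and not part of either patch: the helper names
mm_field_count(), mm_field_opt() and set_mm_field() are hypothetical,
while the PR_SET_MM_* constants are the ones exported by
<linux/prctl.h>. The call is expected to fail with EPERM without
CAP_SYS_RESOURCE, and with EINVAL on a kernel built without this code;
the kernel also validates each address and may reject it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>

/* Field names as used in this mail, paired with their prctl sub-options. */
struct mm_field {
	const char *name;
	int opt;
};

static const struct mm_field mm_fields[] = {
	{ "start_code",  PR_SET_MM_START_CODE  },
	{ "end_code",    PR_SET_MM_END_CODE    },
	{ "start_data",  PR_SET_MM_START_DATA  },
	{ "end_data",    PR_SET_MM_END_DATA    },
	{ "start_brk",   PR_SET_MM_START_BRK   },
	{ "brk",         PR_SET_MM_BRK         },
	{ "start_stack", PR_SET_MM_START_STACK },
	{ "arg_start",   PR_SET_MM_ARG_START   },
	{ "arg_end",     PR_SET_MM_ARG_END     },
	{ "env_start",   PR_SET_MM_ENV_START   },
	{ "env_end",     PR_SET_MM_ENV_END     },
};

/* Number of fields in the table above (hypothetical helper). */
size_t mm_field_count(void)
{
	return sizeof(mm_fields) / sizeof(mm_fields[0]);
}

/* Map a field name to its PR_SET_MM sub-option, or -1 if unknown. */
int mm_field_opt(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < mm_field_count(); i++)
		if (strcmp(mm_fields[i].name, name) == 0)
			return mm_fields[i].opt;
	return -1;
}

/*
 * Set one memory-region field of the calling process.
 * Returns 0 on success or a negative errno value on failure.
 */
int set_mm_field(const char *name, unsigned long addr)
{
	int opt = mm_field_opt(name);

	if (opt < 0)
		return -EINVAL;
	if (prctl(PR_SET_MM, (unsigned long)opt, addr, 0UL, 0UL) < 0)
		return -errno;
	return 0;
}
```

A restorer would call set_mm_field() once per field after it has mapped
the corresponding regions, and treat a negative return value as "this
kernel or this process cannot use the interface".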

> On Fri, 22 Feb 2013 12:18:01 +1100 (EST)
> u3557@miso.sublimeip.com (Amnon Shiloh) wrote:
>
> > The code in "kernel/sys.c" that is currently within
> > CONFIG_CHECKPOINT_RESTORE is in fact, as I explain below,
> > one possible solution to a general issue, required by a wide
> > class of applications. It just so happened that the CRIU group
> > were the first to place this, or an equivalent code, in the kernel,
> > that allows a privileged process to set its 11 per-process memory-region
> > fields:
> > start_code, end_code, start_data, end_data, start_brk, brk,
> > start_stack, arg_start, arg_end, env_start, env_end.
> >
> >
> > Contrary to the rest of the CHECKPOINT_RESTORE code, which is specific
> > to the CRIU package, the code in "kernel/sys.c" (or its equivalent) is
> > needed by ANY application or package that needs to reconstruct Linux
> > processes, that means, starting them from the middle rather than from
> > an executable file.
> >
> > That includes user-level checkpointing (any, not just CRIU's),
> > process-migration (to other computers, as my own package does)
> > and process duplication (for high-availability/reliability) -
> > in fact even for starting a process from an executable format
> > that is not supported by Linux, thus requiring a "manual execve"
> > by a user-level utility.
> >
> > My first preference is to remove that "#ifdef CONFIG_CHECKPOINT_RESTORE"
> > altogether. Note that there are no security issues because this code
> > is already restricted to "capable(CAP_SYS_RESOURCE)".
> > Short of that is the proposed patch.
>
> Well OK. Put all that on top of a patch, add suitable signoffs and
> cc's and send it along?
>