From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] refpage_create.2: Document refpage_create(2)
To: Peter Collingbourne
Cc: Jann Horn, Linux ARM, linux-mm@kvack.org, kernel test robot,
 Matthew Wilcox, Linux API, "Kirill A . Shutemov",
 linux-doc@vger.kernel.org, linux-man@vger.kernel.org, John Hubbard,
 Andrew Morton, Catalin Marinas, Evgenii Stepanov, Michael Kerrisk
References: <20210717025951.3946505-1-pcc@google.com>
From: "Alejandro Colomar (man-pages)"
Date: Thu, 29 Jul 2021 14:09:54 +0200
In-Reply-To: <20210717025951.3946505-1-pcc@google.com>
X-Mailing-List: linux-doc@vger.kernel.org

Hi Peter,

On 7/17/21 4:59 AM, Peter Collingbourne wrote:
> ---
> The syscall has not landed in the kernel yet.
> Therefore, as usual, the patch should not be taken yet
> and I've used 5.x as the introducing kernel version for now.

Thanks!

Please see a few comments below.

Apart from formatting and code issues I noted, the text looks good to me.

Please, ping us when this is merged in the kernel :)

Regards,

Alex

>
> man2/refpage_create.2 | 167 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 167 insertions(+)
> create mode 100644 man2/refpage_create.2
>
> diff --git a/man2/refpage_create.2 b/man2/refpage_create.2
> new file mode 100644
> index 000000000..c0b928b92
> --- /dev/null
> +++ b/man2/refpage_create.2
> @@ -0,0 +1,167 @@
> +.\" Copyright (C) 2021 Google LLC
> +.\" Author: Peter Collingbourne
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein. The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH REFPAGE_CREATE 2 2021-07-16 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +refpage_create \- create a reference page file descriptor
> +.SH SYNOPSIS
> +.nf
> +.BR "#include "
> +.PP
> +.BI "int syscall(SYS_refpage_create, void *" content ", unsigned int " size ,
> +.BI " unsigned long " flags ");"
> +.fi
> +.PP
> +.IR Note :
> +glibc provides no wrapper for
> +.BR refpage_create (),
> +necessitating the use of
> +.BR syscall (2).
> +.SH DESCRIPTION
> +The
> +.BR refpage_create ()
> +system call is used to create a file descriptor
> +that conceptually refers to a read-only file
> +whose contents are an infinite repetition of
> +.I size
> +bytes of data read from the
> +.I content
> +argument to the system call,
> +and which may be mapped into memory with
> +.BR mmap (2).
> +The file descriptor is created as if by passing
> +.BR O_RDONLY | O_CLOEXEC
> +to
> +.BR open (2).
> +.PP
> +In reality, any read-only pages in the mapping are backed
> +by a so-called reference page,
> +whose contents are specified using the arguments to
> +.BR refpage_create ().
> +.PP
> +The reference page will consist of repetitions of
> +.I size
> +bytes read
> +from
> +.IR content ,
> +as many as are required to fill the page. The
> +.I size
> +argument must be a power of two less than or equal to the page size, and the
> +.I content
> +argument must have at least
> +.I size
> +alignment. The behavior is as if a copy of this data

s/\. /.\n/

Rationale: semantic newlines.

> +is made while servicing the system call;
> +any updates to the data after the system call has returned
> +will not be reflected in the reference page.
> +.PP
> +If the architecture specifies that // metadata may be associated /J/

Please, use semantic newlines (see man-pages(7))

> +with memory addresses, // that metadata if present is copied
> +into the reference page along with the data itself,
> +but only if the size argument is at least as large
> +as the granularity of the metadata.
> +For example, with the ARMv8.5 Memory Tagging Extension,
> +the memory tags are copied, // but only if the size is greater than /J/
> +or equal to // the architecturally specified tag granule size of 16 bytes.
> +.PP
> +Writable private mappings trigger specific copy-on-write behavior
> +when a page in the mapping is written to.
> +The behavior is as if the reference page is copied,
> +but the kernel may use a more efficient technique such as
> +.BR memset (3)
> +to produce the copy if the
> +.I size
> +argument originally used to create the reference page file descriptor
> +is sufficiently small.
> +For this reason it is recommended to specify as small of a
> +.I size
> +argument as possible
> +in order to activate any such optimizations implemented in the kernel.
> +.PP
> +The advantage of using this system call
> +over creating normal anonymous mappings
> +and manually initializing the pages from userspace
> +is that it is more efficient.
> +If it is not known that all of the pages in the mapping
> +will be faulted (for example, if the system call is used
> +by a general purpose memory allocator
> +where the behavior of the client program is unknown),
> +letting the pages be prepared on fault only if needed
> +is more efficient from both a performance
> +and memory consumption perspective.
> +Even if all of the pages would end up being faulted,
> +it would still be more efficient
> +to have the kernel initialize the pages with the required contents once
> +than to have the kernel zero initialize them on fault
> +and then have userspace initialize them again with different contents.
> +.SH EXAMPLES
> +The following program creates a 128KB memory mapping

The SI mandates that a space shall be inserted between a number and the
associated unit.
Also, if it really means 128 KiB, which I guess, please use KiB.
See units(7).

Use a non-breaking space to make sure that the unit goes with the number.

With all that, it would be:

... creates a 128\ KiB memory ...

> +preinitialized with the pattern byte 0xAA
> +and verifies that the contents of the mapping are correct.
> +.PP
> +.EX
> +#include
> +#include
> +#include
> +#include
> +
> +int main() {
> +    unsigned char pattern = 0xaa;

Please use capital AA to help visually differentiate x and a.

> +    unsigned long mmap_size = 131072;

Why that magic number?
Maybe a shift to indicate that it's a power of 2... or 128 * 1024...
I don't know off the top of my head powers of 2 that high :)

Also, why 'unsigned long'?
The SYNOPSIS says it's an 'unsigned int'.

> +
> +    int fd = syscall(SYS_refpage_create, &pattern, 1, 0);

Please use sizeof(pattern) instead of 1 to communicate the relationship
between them.

> +    if (fd < 0) {
> +        perror("refpage_create");
> +        return 1;

Please use EXIT_FAILURE ().
Also use exit(3) instead of return, as is common practice in manual pages.

> +    }
> +    unsigned char *p = mmap(0, mmap_size, PROT_READ | PROT_WRITE,

Use NULL instead of 0 for pointers.
The first argument of mmap(2) is 'void *addr'.

> +                            MAP_PRIVATE, fd, 0);
> +    if (p == MAP_FAILED) {
> +        perror("mmap");
> +        return 1;
> +    }
> +    for (unsigned i = 0; i != mmap_size; ++i) {

s/unsigned/unsigned int/

> +        if (p[i] != pattern) {
> +            fprintf(stderr, "refpage failed contents check @ %u: "
> +                    "0x%x != 0x%x\n",

I prefer 0x%X, which is already in use in some manual pages (seccomp(2)).

Also, 'i' may be more readable in hex, given it's an offset of an
address (actually the concept of a size_t, even if the kernel doesn't
use that type) don't you think?

> +                    i, p[i], pattern);
> +            return 1;

exit(3)

> +        }
> +    }
> +}
> +.EE
> +.SH NOTE
> +Reading from a reference page file descriptor, e.g. with
> +.BR read (2),
> +is not supported, nor would this be particularly useful.
> +.SH VERSIONS
> +This system call first appeared in Linux 5.x.
> +.SH CONFORMING TO
> +The
> +.BR refpage_create ()
> +system call is Linux-specific.
> +.SH SEE ALSO
> +.BR mmap (2),
> +.BR open (2).
>

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/
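
P.S. As a small illustration of the DESCRIPTION text quoted above (the
reference page is filled with repetitions of size bytes, size must be a
power of two no larger than the page size, and content must have at
least size alignment), the rule could be modelled in user space roughly
as follows. This is only a sketch under stated assumptions:
fill_reference_page() and its -EINVAL convention are hypothetical names
invented for illustration. They are not part of the proposed system
call, and this is not how the kernel implements it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space model of the documented fill rule.
 * NOT the kernel implementation of refpage_create().
 *
 * Fills page[0 .. page_size) with repetitions of the `size` bytes at
 * `content`.  Returns 0 on success, or -EINVAL (an assumed convention)
 * if `size` is not a power of two, is larger than `page_size`, does
 * not divide `page_size`, or if `content` is not aligned to `size`.
 */
int fill_reference_page(unsigned char *page, size_t page_size,
                        const void *content, size_t size)
{
    assert(page != NULL && content != NULL);

    /* size must be a nonzero power of two no larger than the page. */
    if (size == 0 || (size & (size - 1)) != 0 || size > page_size)
        return -EINVAL;

    /* Guard against a page_size that is not a multiple of size. */
    if (page_size % size != 0)
        return -EINVAL;

    /* content must have at least `size` alignment. */
    if (((uintptr_t)content & (size - 1)) != 0)
        return -EINVAL;

    /* Repeat the pattern as many times as needed to fill the page. */
    for (size_t off = 0; off < page_size; off += size)
        memcpy(page + off, content, size);

    return 0;
}
```

With a one-byte pattern, as in the EXAMPLES program, this degenerates
to what memset(3) would do, which matches the optimization that the
DESCRIPTION says the kernel may apply for small sizes.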