Subject: [PATCH v2 1/4] perf-security: document perf_events/Perf resource control
From: Alexey Budankov
To: Jonatan Corbet, Kees Cook, Thomas Gleixner, Ingo Molnar, Peter Zijlstra
Cc: Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
    Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
    "kernel-hardening@lists.openwall.com", "linux-doc@vger.kernel.org",
    linux-kernel
Organization: Intel Corp.
Date: Thu, 7 Feb 2019 16:29:14 +0300

Extend the perf-security.rst file with a perf_events/Perf resource control
section describing the RLIMIT_NOFILE and perf_event_mlock_kb settings for
performance monitoring user processes.

Signed-off-by: Alexey Budankov
---
Changes in v2:
- applied comments on v1

---
 Documentation/admin-guide/perf-security.rst | 36 +++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index f73ebfe9bfe2..3915f07b9dea 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -84,6 +84,40 @@ governed by perf_event_paranoid [2]_ setting:
 locking limit is imposed but ignored for unprivileged processes with
 CAP_IPC_LOCK capability.
 
+perf_events/Perf resource control
+---------------------------------
+
+The perf_events system call API [2]_ allocates file descriptors for every configured
+PMU event. Open file descriptors are a per-process accountable resource governed
+by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually inherited from the login
+shell process. When configuring Perf collection for a long list of events on a
+large server system, this limit can easily be hit, preventing the required monitoring
+configuration. On some systems, the RLIMIT_NOFILE limit can be increased on a per-user
+basis by modifying the content of the limits.conf file [12]_.
+Ordinarily, a Perf sampling session (perf record) requires a number of open
+perf_event file descriptors that is not less than the number of monitored events
+multiplied by the number of monitored CPUs.
+
+The amount of memory available to user processes for capturing performance monitoring
+data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
+resource setting defines the overall per-cpu limit of memory that user processes may
+map for performance monitoring. The setting essentially extends the
+RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specifically for
+capturing monitored performance events and related data.
+
+For example, if a machine has eight cores and the perf_event_mlock_kb limit is set
+to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
+above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
+a user who wants to start two or more performance monitoring processes has to
+distribute the available 4128 KiB among them manually, for example, using the
+--mmap-pages Perf record mode option. Otherwise, the first started process allocates
+all available 4128 KiB and the other processes fail to proceed for lack of memory.
+
+The RLIMIT_MEMLOCK and perf_event_mlock_kb resource constraints are ignored for
+processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users
+can be given memory above these constraints for performance monitoring purposes
+by providing the Perf executable with the CAP_IPC_LOCK capability.
+
 Bibliography
 ------------
 
@@ -94,4 +128,6 @@ Bibliography
 .. [5] ``_
 .. [6] ``_
 .. [7] ``_
+.. [11] ``_
+.. [12] ``_
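
The descriptor and memory arithmetic described in the patch text above can be sketched in Python. This is a minimal illustration, not part of the patch. The function names (min_perf_event_fds, perf_mmap_budget_kib, even_share_kib, exceeds_nofile_limit) are hypothetical, and the sketch assumes a Unix system where the standard resource module is available. It models only the events * CPUs lower bound and the per-CPU multiplication stated in the documentation. A real session also opens other descriptors, so passing the check is necessary but not sufficient.

```python
import resource
from typing import Optional


def min_perf_event_fds(n_events: int, n_cpus: int) -> int:
    """Lower bound on open perf_event descriptors for one sampling session."""
    return n_events * n_cpus


def perf_mmap_budget_kib(perf_event_mlock_kb: int, n_cpus: int) -> int:
    """KiB available above RLIMIT_MEMLOCK for perf_event mmap buffers."""
    return perf_event_mlock_kb * n_cpus


def even_share_kib(total_kib: int, n_processes: int) -> int:
    """Illustrative even split of the budget between monitoring processes.

    Translating a share into a --mmap-pages value is not covered here.
    """
    return total_kib // n_processes


def exceeds_nofile_limit(n_events: int, n_cpus: int,
                         soft_limit: Optional[int] = None) -> bool:
    """True if the lower bound alone is above the soft RLIMIT_NOFILE."""
    if soft_limit is None:
        # Read the calling process's current soft limit (ulimit -n).
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return min_perf_event_fds(n_events, n_cpus) > soft_limit


if __name__ == "__main__":
    # Figures from the example in the documentation: 8 cores, 516 KiB.
    budget = perf_mmap_budget_kib(516, 8)
    print("mmap budget above RLIMIT_MEMLOCK (KiB):", budget)  # 4128
    print("even share for two processes (KiB):", even_share_kib(budget, 2))
    # Hypothetical workload: 200 events on 8 CPUs against this process's limit.
    print("exceeds RLIMIT_NOFILE:", exceeds_nofile_limit(200, 8))
```

Passing soft_limit explicitly lets the check be evaluated for another user's configured limit without changing the limit of the calling process.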