From mboxrd@z Thu Jan 1 00:00:00 1970
From: Namhyung Kim
To: Ingo Molnar, Peter Zijlstra, Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Alexander Shishkin, Mark Rutland, Stephane Eranian,
    LKML, linux-perf-users@vger.kernel.org, Tejun Heo, Li Zefan,
    Johannes Weiner, Adrian Hunter
Subject: [PATCH 1/9] perf/core: Add PERF_RECORD_CGROUP event
Date: Mon, 23 Dec 2019 15:07:51 +0900
Message-Id: <20191223060759.841176-2-namhyung@kernel.org>
X-Mailer: git-send-email 2.24.1.735.g03f4e72817-goog
In-Reply-To: <20191223060759.841176-1-namhyung@kernel.org>
References: <20191223060759.841176-1-namhyung@kernel.org>
MIME-Version: 1.0
To support cgroup tracking, add a CGROUP event that records the link
between a cgroup path and its id number. This is needed because a cgroup
can go away before userspace tries to look up its info (from the id)
later. The attr.cgroup bit is also added so that userspace can enable
cgroup tracking.

This event is generated when a new cgroup becomes active. Userspace
might need to synthesize those events for cgroups that already exist.

Cc: Tejun Heo
Cc: Li Zefan
Cc: Johannes Weiner
Cc: Adrian Hunter
Signed-off-by: Namhyung Kim
---
 include/uapi/linux/perf_event.h |  13 +++-
 kernel/events/core.c            | 111 ++++++++++++++++++++++++++++++++
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 377d794d3105..de2ab87ca92c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -377,7 +377,8 @@ struct perf_event_attr {
 				ksymbol        :  1, /* include ksymbol events */
 				bpf_event      :  1, /* include bpf events */
 				aux_output     :  1, /* generate AUX records instead of events */
-				__reserved_1   : 32;
+				cgroup         :  1, /* include cgroup events */
+				__reserved_1   : 31;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -1006,6 +1007,16 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_BPF_EVENT			= 18,
 
+	/*
+	 * struct {
+	 *	struct perf_event_header	header;
+	 *	u64				id;
+	 *	char				path[];
+	 *	struct sample_id		sample_id;
+	 * };
+	 */
+	PERF_RECORD_CGROUP			= 19,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4ff86d57f9e5..b0aa1b921769 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -386,6 +386,7 @@ static atomic_t nr_freq_events __read_mostly;
 static atomic_t nr_switch_events __read_mostly;
 static atomic_t nr_ksymbol_events __read_mostly;
 static atomic_t nr_bpf_events __read_mostly;
+static atomic_t nr_cgroup_events __read_mostly;
 
 static LIST_HEAD(pmus);
 static DEFINE_MUTEX(pmus_lock);
@@ -4455,6 +4456,8 @@ static void unaccount_event(struct perf_event *event)
 		atomic_dec(&nr_comm_events);
 	if (event->attr.namespaces)
 		atomic_dec(&nr_namespaces_events);
+	if (event->attr.cgroup)
+		atomic_dec(&nr_cgroup_events);
 	if (event->attr.task)
 		atomic_dec(&nr_task_events);
 	if (event->attr.freq)
@@ -7564,6 +7567,105 @@ void perf_event_namespaces(struct task_struct *task)
 			NULL);
 }
 
+/*
+ * cgroup tracking
+ */
+#ifdef CONFIG_CGROUPS
+
+struct perf_cgroup_event {
+	char				*path;
+	int				path_size;
+	struct {
+		struct perf_event_header	header;
+		u64				id;
+		char				path[];
+	} event_id;
+};
+
+static int perf_event_cgroup_match(struct perf_event *event)
+{
+	return event->attr.cgroup;
+}
+
+static void perf_event_cgroup_output(struct perf_event *event, void *data)
+{
+	struct perf_cgroup_event *cgroup_event = data;
+	struct perf_output_handle handle;
+	struct perf_sample_data sample;
+	u16 header_size = cgroup_event->event_id.header.size;
+	int ret;
+
+	if (!perf_event_cgroup_match(event))
+		return;
+
+	perf_event_header__init_id(&cgroup_event->event_id.header,
+				   &sample, event);
+	ret = perf_output_begin(&handle, event,
+				cgroup_event->event_id.header.size);
+	if (ret)
+		goto out;
+
+	perf_output_put(&handle, cgroup_event->event_id);
+	__output_copy(&handle, cgroup_event->path, cgroup_event->path_size);
+
+	perf_event__output_id_sample(event, &handle, &sample);
+
+	perf_output_end(&handle);
+out:
+	cgroup_event->event_id.header.size = header_size;
+}
+
+void perf_event_cgroup(struct cgroup *cgrp)
+{
+	struct perf_cgroup_event cgroup_event;
+	char path_enomem[16] = "//enomem";
+	char *pathname;
+	size_t size;
+
+	if (!atomic_read(&nr_cgroup_events))
+		return;
+
+	cgroup_event = (struct perf_cgroup_event){
+		.event_id  = {
+			.header = {
+				.type = PERF_RECORD_CGROUP,
+				.misc = 0,
+				.size = sizeof(cgroup_event.event_id),
+			},
+			.id = cgroup_id(cgrp),
+		},
+	};
+
+	pathname = kmalloc(PATH_MAX, GFP_KERNEL);
+	if (pathname == NULL) {
+		cgroup_event.path = path_enomem;
+	} else {
+		/* just to be sure to have enough space for alignment */
+		cgroup_path(cgrp, pathname, PATH_MAX - sizeof(u64));
+		cgroup_event.path = pathname;
+	}
+
+	/*
+	 * Since our buffer works in 8 byte units we need to align our string
+	 * size to a multiple of 8. However, we must guarantee the tail end is
+	 * zero'd out to avoid leaking random bits to userspace.
+	 */
+	size = strlen(cgroup_event.path) + 1;
+	while (!IS_ALIGNED(size, sizeof(u64)))
+		cgroup_event.path[size++] = '\0';
+
+	cgroup_event.event_id.header.size += size;
+	cgroup_event.path_size = size;
+
+	perf_iterate_sb(perf_event_cgroup_output,
+			&cgroup_event,
+			NULL);
+
+	kfree(pathname);
+}
+
+#endif
+
 /*
  * mmap tracking
  */
@@ -10607,6 +10709,8 @@ static void account_event(struct perf_event *event)
 		atomic_inc(&nr_comm_events);
 	if (event->attr.namespaces)
 		atomic_inc(&nr_namespaces_events);
+	if (event->attr.cgroup)
+		atomic_inc(&nr_cgroup_events);
 	if (event->attr.task)
 		atomic_inc(&nr_task_events);
 	if (event->attr.freq)
@@ -12581,6 +12685,12 @@ static void perf_cgroup_css_free(struct cgroup_subsys_state *css)
 	kfree(jc);
 }
 
+static int perf_cgroup_css_online(struct cgroup_subsys_state *css)
+{
+	perf_event_cgroup(css->cgroup);
+	return 0;
+}
+
 static int __perf_cgroup_move(void *info)
 {
 	struct task_struct *task = info;
@@ -12602,6 +12712,7 @@ static void perf_cgroup_attach(struct cgroup_taskset *tset)
 struct cgroup_subsys perf_event_cgrp_subsys = {
 	.css_alloc	= perf_cgroup_css_alloc,
 	.css_free	= perf_cgroup_css_free,
+	.css_online	= perf_cgroup_css_online,
 	.attach		= perf_cgroup_attach,
 	/*
 	 * Implicitly enable on dfl hierarchy so that perf events can
-- 
2.24.1.735.g03f4e72817-goog