From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brendan Gregg
Subject: Re: perf segfault in docker container
Date: Tue, 21 Jun 2016 15:32:27 -0700
Message-ID:
References: <575A9660.4070907@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Return-path:
Received: from mail-io0-f180.google.com ([209.85.223.180]:35428 "EHLO
	mail-io0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751626AbcFUWdU (ORCPT ); Tue, 21 Jun 2016 18:33:20 -0400
Received: by mail-io0-f180.google.com with SMTP id f30so29232407ioj.2
	for ; Tue, 21 Jun 2016 15:32:57 -0700 (PDT)
In-Reply-To: <575A9660.4070907@linux.vnet.ibm.com>
Sender: linux-perf-users-owner@vger.kernel.org
List-ID:
To: Aravinda Prasad
Cc: "linux-perf-use." , Wang Nan , Hari Bathini , Ananth M , "Naveen N. Rao"

G'Day Aravinda,

Sorry for the delay; answers inline:

On Fri, Jun 10, 2016 at 3:28 AM, Aravinda Prasad wrote:
>
> Hi Brendan,
>
> I though of replying to your mail as I saw you running perf inside a
> docker container. I believe you would be interested in events specific
> to the container context as you are using "perf record -a".
>
> We are working on supporting "container-aware tracing" i.e., whenever
> you run "perf record -a" inside a container it should report
> container-wide events rather than system-wide events. Towards that goal,
> we posted an RFC patch in LKML [1] last year and also discussed possible
> ways to restrict events within a container in Plumbers (Container
> Microconf) [2].

Sounds great.

>
>
> Based on the discussion in Container Microconf, we are coming up with a
> new prototype which should be ready for review by next week. The new
> prototype introduces a new namespace "perf-namespace" (namespace name is
> just a placeholder. Suggestions welcome). If the container is created
> with perf-namespace, then "perf record -a" inside the container reports
> only those events that are triggered within the container.


I'd think that this restriction should be the default, rather than
needing to create a container with a perf-namespace. Why wouldn't it
make use of the existing pid namespace?

>
> We would like to know if you are looking for "container-aware tracing"
> and also like to know the scenarios/problems you are trying to debug by
> running perf inside a container.

Yes, perf needs to be container-aware.

To start with, we'd like to profile apps running inside Docker
containers, either by running perf in the container, or by running
perf from the host. As in, "perf record -F49 -a -g -- sleep 30". I've
tried both and had both approaches work, with some wrestling of
/tmp/perf-PID.map files and things. If perf was container-aware, then
running it in the container should be the easiest way to profile an
app, if it's only sampling that container.

Also, from within a container, I'd expect to be able to sample kernel
stacks that are running for the container processes (eg, syscalls),
but not asynchronous kernel threads that are running host-wide (eg,
background fsflush).

More advanced things would involve tracing syscall latency and using
BPF for latency histograms, from within a container. That should be
allowed.

What about tracepoints? Should a container be able to use the block
I/O tracepoints and see disk I/O latency histograms? Filtering this to
be just the container's block I/O would be tricky. Doing it
system-wide may be allowable, depending on a setting in
perf_event_paranoid. I think in some environments, having a container
trace all tracepoints (disk, tcp, etc) is ok, provided no data is
leaked from another container; whereas in other environments tracing
non-container events would not be ok. Hence making this a setting in
perf_event_paranoid.

Brendan
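
PS. On the /tmp/perf-PID.map wrestling mentioned above: when profiling
from the host, the map file written inside the container is named
after the container's PID and lives in the container's /tmp, while
perf on the host looks for a file named after the host PID. The sketch
below is one way that shuffle could be scripted, not what perf itself
does. It assumes a kernel that exposes the NSpid field in
/proc/PID/status (4.1 or later) and that the host can read
/proc/PID/root; the function names are made up for illustration.

```python
import shutil
from pathlib import PurePosixPath


def innermost_pid(status_text):
    """Return the PID as seen in the innermost pid namespace.

    status_text is the content of /proc/PID/status. The NSpid line lists
    the PID in each nested namespace, outermost (host) first.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return int(line.split()[-1])
    raise ValueError("no NSpid line found (kernel older than 4.1?)")


def map_file_paths(host_pid, status_text, proc_root="/proc", host_tmp="/tmp"):
    """Work out where the container wrote its map file and where
    perf on the host expects to find it."""
    container_pid = innermost_pid(status_text)
    # /proc/PID/root is the process's view of the filesystem root.
    src = (PurePosixPath(proc_root) / str(host_pid) / "root" / "tmp"
           / f"perf-{container_pid}.map")
    dst = PurePosixPath(host_tmp) / f"perf-{host_pid}.map"
    return src, dst


def copy_map_file(host_pid):
    """Copy the container's map file to the name perf on the host expects."""
    with open(f"/proc/{host_pid}/status") as f:
        status = f.read()
    src, dst = map_file_paths(host_pid, status)
    shutil.copyfile(str(src), str(dst))
    return dst
```

Run as root on the host before "perf report" or "perf script", eg
copy_map_file(12345) for host PID 12345. File ownership may also need
fixing up, which this sketch does not attempt.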