From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752811AbbC0NKM (ORCPT ); Fri, 27 Mar 2015 09:10:12 -0400 Received: from mx1.redhat.com ([209.132.183.28]:44891 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752214AbbC0NKJ (ORCPT ); Fri, 27 Mar 2015 09:10:09 -0400 Date: Fri, 27 Mar 2015 09:10:05 -0400 From: Don Zickus To: David Ahern Cc: acme@kernel.org, linux-kernel@vger.kernel.org, Joe Mario , Jiri Olsa Subject: Re: [PATCH v2] perf tool: Fix ppid for synthesized fork events Message-ID: <20150327131005.GA162412@redhat.com> References: <1427302270-10178-1-git-send-email-dsahern@gmail.com> <20150325191526.GX162412@redhat.com> <551312C0.4060706@gmail.com> <20150326211146.GZ162412@redhat.com> <55147C19.5090302@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <55147C19.5090302@gmail.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 26, 2015 at 03:37:29PM -0600, David Ahern wrote: > On 3/26/15 3:11 PM, Don Zickus wrote: > >Sorry for drawing this out. Originally the performance still seemed off. > >But as we split the patch up to see where the perf impact was, the problem > >seemed to have disappeared. So we are testing the original patch again. > > > >The only difference now is we were playing with the -BN option in perf based > >on your changelog, just because we never used it before. :-) > > I was beyond surprised that you were measuring a 50% hit with the > first patch. As mentioned in a previous response it only adds the > processing of 3 additional lines to the already opened and read > /proc/pid/status file. So, when I wrote this second version I wanted > to make sure we are only measuring the impact of this change. The > /proc/pid/status files are read on startup of the record -- before > any samples are taken. > > The intent of '-e cpu-clock -F 1000 -- usleep 1' is to avoid any > samples since we don't care about them. Really the -a should be > dropped as well -- no need to open per-cpu events. > > -B impacts processing done at the end of the run: > > builin-record.c, __cmd_record(): > > if (!rec->no_buildid) > process_buildids(rec); > > and -N says don't copying anything to ~/.debug. All together it > tries to focus the measurement to /proc walking. > > > > >One last test without the -BN option and if that looks fine, then we have no > >objections. Again sorry for dragging this out. I will let you know > >tomorrow EST. > > no problem; appreciate the heads up. I talked with Joe on my way out the door yesterday and he confirmed, just removing -BN from our test showed a performance hit with your patch. With the -BN option, there is no performance hit and we are perfectly fine with your patch. So, I guess I am confused how the -BN and your patch could change behaviour. Just to re-iterate what we did, Joe kicked off a specJBB run and he did 20 captures of two runs (one with the unpatched binary and one with a pached binary). for i in {1..20} do time perf.unpatched mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10 time perf.patched mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10 done then we repeat the above test but with -BN in both runs. We compare the log sizes to make sure they are similar for the random snapshots and compare the times. With the -BN option, the times are generally within +/- 0.5 seconds of each. Without the -BN option the patched perf binary is generally +20-40 seconds slower. However, based on your description above about what the -BN option does, I am scratching my head about our results. Thoughts? Cheers, Don