From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47681) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1biKbx-0000qv-1a for qemu-devel@nongnu.org; Fri, 09 Sep 2016 08:08:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1biKbq-0003NA-FM for qemu-devel@nongnu.org; Fri, 09 Sep 2016 08:08:47 -0400 Received: from mx1.redhat.com ([209.132.183.28]:55040) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1biKbq-0003N3-7R for qemu-devel@nongnu.org; Fri, 09 Sep 2016 08:08:42 -0400 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C38FFC0BAA2D for ; Fri, 9 Sep 2016 12:08:41 +0000 (UTC) Date: Fri, 9 Sep 2016 13:08:38 +0100 From: "Daniel P. Berrange" Message-ID: <20160909120838.GJ25802@redhat.com> Reply-To: "Daniel P. Berrange" References: <1470756748-18933-1-git-send-email-berrange@redhat.com> <87vay6ed9t.fsf@fimbulvetr.bsc.es> <20160908134305.GL30602@redhat.com> <87y431bai0.fsf@fimbulvetr.bsc.es> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <87y431bai0.fsf@fimbulvetr.bsc.es> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH for-2.8 v1 00/60] Modular build of trace event files List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: qemu-devel@nongnu.org, Stefan Hajnoczi On Fri, Sep 09, 2016 at 01:03:51PM +0200, Llu=C3=ADs Vilanova wrote: > Daniel P Berrange writes: >=20 > > On Thu, Sep 08, 2016 at 03:23:26PM +0200, Llu=C3=ADs Vilanova wrote: > >> Daniel P Berrange writes: > >>=20 > >> > I previously split the global trace-events file up into one file > >> > per-subdirectory to avoid merge conflict hell. > >> [...] > >>=20 > >> Sorry, I could not find the message where the infrastructure is modi= fied to > >> provide this. But I think there's a more efficient way to provide mo= dular > >> auto-generated tracing code without the hierarchical indexing you pr= oposed. >=20 > > [snip] >=20 > >> struct TraceEvent ___trace_events[] =3D { > >> { > >> .name =3D "eventname", > >> .sstate =3D 1, > >> .dstate =3D ___trace_eventname_dstate; > >> } > >> } > >>=20 > >> TraceEvent *TRACE_EVENTNAME =3D &___trace_events[...]; >=20 > > Life would be simpler if we had the 'bool dstate' as part of the > > TraceEvent struct, but doing so would essentially be reverting this > > previous change: >=20 > > commit 585ec7273e6fdab902b2128bc6c2a8136aafef04 > > Author: Paolo Bonzini > > Date: Wed Oct 28 07:06:27 2015 +0100 >=20 > > trace: track enabled events in a separate array > =20 > > This is more cache friendly on the fast path, where we already ha= ve > > the event id available. >=20 > > I asked Paolo about this previously and he indicated it was a notable > > performance improvement, so we can't put dstate back into the TraceEv= ent > > struct :-( >=20 > Sorry, there was a typo in my example code that led to this misundersta= nding. >=20 > I meant this for the global auto-generated .c: >=20 > struct TraceEvent { > /* ... */ > bool *dstate; > }; >=20 > bool ___TRACE_EVENTNAME_DSTATE; >=20 > struct TraceEvent ___trace_events[] =3D { > { > .name =3D "eventname", > .sstate =3D 1, > .dstate =3D &___TRACE_EVENTNAME_DSTATE; > } > } >=20 > TraceEvent *TRACE_EVENTNAME =3D &___trace_events[...]; >=20 >=20 > If you look at the modified macro I pasted for "trace/control-internal.= h": >=20 >=20 > #define trace_event_get_state_dynamic_by_id(id) \ > (unlikely(trace_events_enabled_count) && \ > (___ ## id ## _DSTATE)) >=20 > We're retaining the fast-path performance of the seggregated boolean ds= tate > array, while the event descriptor contains a pointer to the per-event d= state > global boolean variable in case you want to get/set the dstate based on= an event > pointer. The various _DSTATE variables are still arbitrarily scattered in memory, as opposed to in a contiguous cache friendly array, which is one of the goals of Paolo's original change. That said, I'm unclear on how much of the performance win from Paolo's change came from eliminating the struct field de-reference, vs having the contiguous array. I'm guessing most of the win is from the former, the latter only being important if we hit multiple related tracepoints on close succession. Regards, Daniel --=20 |: http://berrange.com -o- http://www.flickr.com/photos/dberrange= / :| |: http://libvirt.org -o- http://virt-manager.or= g :| |: http://autobuild.org -o- http://search.cpan.org/~danberr= / :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vn= c :|