From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:46943) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1azi6K-0003fh-1f for qemu-devel@nongnu.org; Mon, 09 May 2016 06:07:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1azi6F-0006Yo-ND for qemu-devel@nongnu.org; Mon, 09 May 2016 06:07:42 -0400 Received: from mail-wm0-x242.google.com ([2a00:1450:400c:c09::242]:34446) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1azi6F-0006YN-Bl for qemu-devel@nongnu.org; Mon, 09 May 2016 06:07:39 -0400 Received: by mail-wm0-x242.google.com with SMTP id n129so19782325wmn.1 for ; Mon, 09 May 2016 03:07:39 -0700 (PDT) Sender: Paolo Bonzini References: <1460147350-7601-1-git-send-email-pbonzini@redhat.com> <1460147350-7601-2-git-send-email-pbonzini@redhat.com> <87twiz5ah9.fsf@dusky.pond.sub.org> From: Paolo Bonzini Message-ID: <57306176.9050209@redhat.com> Date: Mon, 9 May 2016 12:07:50 +0200 MIME-Version: 1.0 In-Reply-To: <87twiz5ah9.fsf@dusky.pond.sub.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 8bit Subject: Re: [Qemu-devel] [PATCH 01/50] scripts: add script to build QEMU and analyze inclusions List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster Cc: qemu-devel@nongnu.org On 18/04/2016 15:10, Markus Armbruster wrote: > Paolo Bonzini writes: > >> Signed-off-by: Paolo Bonzini >> --- >> scripts/analyze-inclusions | 89 ++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 89 insertions(+) >> create mode 100644 scripts/analyze-inclusions >> >> diff --git a/scripts/analyze-inclusions b/scripts/analyze-inclusions >> new file mode 100644 >> index 0000000..e241bd4 >> --- /dev/null >> +++ b/scripts/analyze-inclusions >> @@ -0,0 +1,89 @@ >> +#! /bin/sh >> +# >> +# Copyright (C) 2016 Red Hat, Inc. >> +# >> +# Author: Paolo Bonzini >> +# >> +# Print statistics about header file inclusions. >> +# The script configures and builds QEMU itself in a "+build" >> +# subdirectory which is left around when the script exits. >> +# To run the statistics on a pre-existing "+build" directory, >> +# pass "--no-build" as the first argument on the command line. >> +# Any other command line is passed directly to "make" (so >> +# you can for example pass a "-j" argument suitable for your >> +# system). >> +# >> +# Inspired by a post by Markus Armbruster. >> + >> +mkdir -p +build >> +cd +build >> +if test "x$1" != "x--no-build"; then >> + test -f Makefile && make distclean >> + ../configure >> + make "$@" >> +fi > > Unfortunate: "mkdir -p +build" clobbers an existing symbolic link from > +build to the build tree of my choice. Changed to this: # The script has two modes of execution: # # 1) if invoked with a path on the command line (possibly # preceded by a "--" argument), it will run the analysis on # an existing build directory # # 2) otherwise, it will configure and builds QEMU itself in a # "+build" subdirectory which is left around when the script # exits. In this case the command line is passed directly to # "make" (typically used for a "-j" argument suitable for your # system). # # Inspired by a post by Markus Armbruster. case "x$1" in --) shift cd "$1" || exit $? ;; x-* | x) mkdir -p +build cd +build test -f Makefile && make distclean ../configure make "$@" ;; *) cd "$1" || exit $? esac >> + >> +QEMU_CFLAGS=$(sed -n s/^QEMU_CFLAGS=//p config-host.mak) >> +QEMU_INCLUDES=$(sed -n s/^QEMU_INCLUDES=//p config-host.mak | \ >> + sed 's/$(SRC_PATH)/../g' ) >> +CFLAGS=$(sed -n s/^CFLAGS=//p config-host.mak) >> + >> +grep_include() { >> + find . -name "*.d" | xargs grep -l "$@" | wc -l > > More robust against funny names would be: > > find . -name "*.d" -exec grep -l {} + | wc -l Missing a "$@", otherwise adopted your version. >> +echo Found $(find . -name "*.d" | wc -l) object files >> +echo $(grep_include -F 'include/qemu-common.h') files include qemu-common.h >> +echo $(grep_include -F 'hw/hw.h') files include hw/hw.h >> +echo $(grep_include 'target-[a-z0-9]*/cpu\.h') files include cpu.h >> +echo $(grep_include -F 'qapi-types.h') files include qapi-types.h >> +echo $(grep_include -F 'trace/generated-tracers.h') files include generated-tracers.h >> +echo $(grep_include -F 'qapi/error.h') files include qapi/error.h >> +echo $(grep_include -F 'qom/object.h') files include qom/object.h >> +echo $(grep_include -F 'block/aio.h') files include block/aio.h >> +echo $(grep_include -F 'exec/memory.h') files include exec/memory.h >> +echo $(grep_include -F 'fpu/softfloat.h') files include fpu/softfloat.h >> +echo $(grep_include -F 'qemu/bswap.h') files include qemu/bswap.h >> +echo > > How did you select these headers? >>From your post, mostly. A lot of these are files that we are planning to tackle one way or another, or that have a lot of indirect inclusions. >> + >> +awk1=' >> + /^# / { file = $3;next } >> + NR>1 { bytes[file]+=length; lines[file]++ } > > Your #bytes is off by one, because AWK chops off the newlines. I think > you want length() + 1. Fixed. > We want to watch commonly included big headers, especially the ones that > are prone to indirect inclusion. These will change as we go. That's valuable, but actually I wanted to check for something else. I'm looking at: - files that include the world: these are hw/hw.h, cpu.h, etc. - files included from anywhere, that probably shouldn't be included anywhere: these are the ones I cherry-picked in the first part of the script, such as block/aio.h, qemu/bswap.h, fpu/softfloat.h. > Output of my header counting bash script (first 64 lines appended) > provides possible additional initial candidates. > > 479379846 124806 3841 qapi-types.h > 199662236 55756 3581 /work/armbru/qemu/include/qom/object.h > 187691645 53857 3485 /work/armbru/qemu/include/exec/memory.h > 118894840 30643 3880 /work/armbru/qemu/include/fpu/softfloat.h > 109943680 124936 880 trace/generated-events.h These are examples of the second case. > 88524072 27022 3276 /work/armbru/qemu/include/qom/cpu.h This needs to be included by all target-*/cpu.h (which are in my list), so I'm tracking one of those instead. > 82757301 46519 1779 /work/armbru/qemu/include/migration/vmstate.h Almost always included through hw/hw.h, tracking that instead. The problem (if it is a problem) here is too many inclusions of hw/hw.h, and hw/hw.h including the kitchen sink. > 82340280 21510 3828 /work/armbru/qemu/include/qemu/queue.h Probably unavoidable, might as well move it to qemu/osdep.h?!? > 63362110 19259 3290 /work/armbru/qemu/include/disas/bfd.h For disas/bfd.h and (before this series) exec/exec-all.h, the problem is not too many inclusions of the header, but possibly unnecessary inclusions from heavily used headers. In particular I'm not sure why qom/cpu.h needs disas/bfd.h. Anyhow, my point is that the generic counting script tends to count things twice, which is why I went for a limited hand-written list based on your message and thus on your script. The obvious disadvantage is that the hand-written list may become obsolete. > 62800785 26667 2355 /work/armbru/qemu/include/qemu/timer.h > 52975068 13828 3831 /work/armbru/qemu/include/qemu/atomic.h > 51315482 16442 3121 /work/armbru/qemu/include/exec/exec-all.h Happy to say my patches fix this one. :) Paolo