* [ANNOUNCE] Git v2.19.0-rc0
@ 2018-08-20 22:13 Junio C Hamano
2018-08-20 22:41 ` Stefan Beller
2018-08-21 20:41 ` Derrick Stolee
0 siblings, 2 replies; 58+ messages in thread
From: Junio C Hamano @ 2018-08-20 22:13 UTC (permalink / raw)
To: git; +Cc: Linux Kernel, git-packagers
An early preview release Git v2.19.0-rc0 is now available for
testing at the usual places. It is comprised of 707 non-merge
commits since v2.18.0, contributed by 60 people, 14 of which are
new faces.
The tarballs are found at:
https://www.kernel.org/pub/software/scm/git/testing/
The following public repositories all have a copy of the
'v2.19.0-rc0' tag and the 'master' branch that the tag points at:
url = https://kernel.googlesource.com/pub/scm/git/git
url = git://repo.or.cz/alt-git.git
url = https://github.com/gitster/git
New contributors whose contributions weren't in v2.18.0 are as follows.
Welcome to the Git development community!
Aleksandr Makarov, Andrei Rybak, Chen Bin, Henning Schild,
Isabella Stephens, Josh Steadmon, Jules Maselbas, Kana Natsuno,
Marc Strapetz, Masaya Suzuki, Nicholas Guriev, Sebastian Kisela,
Vladimir Parfinenko, and William Chargin.
Returning contributors who helped this release are as follows.
Thanks for your continued support.
Aaron Schrab, Ævar Arnfjörð Bjarmason, Alban Gruin, Alejandro
R. Sedeño, Anthony Sottile, Antonio Ospite, Beat Bolli, Ben
Peart, Brandon Williams, brian m. carlson, Christian Couder,
Derrick Stolee, Elijah Newren, Eric Sunshine, Han-Wen Nienhuys,
Jameson Miller, Jeff Hostetler, Jeff King, Johannes Schindelin,
Johannes Sixt, Jonathan Nieder, Jonathan Tan, Junio C Hamano,
Kim Gybels, Kirill Smelkov, Luis Marsano, Łukasz Stelmach,
Luke Diamand, Martin Ågren, Max Kirillov, Michael Barabanov,
Mike Hommey, Nguyễn Thái Ngọc Duy, Olga Telezhnaya,
Phillip Wood, Prathamesh Chavan, Ramsay Jones, René Scharfe,
Stefan Beller, SZEDER Gábor, Taylor Blau, Thomas Rast, Tobias
Klauser, Todd Zullinger, Ville Skyttä, and Xiaolong Ye.
----------------------------------------------------------------
Git 2.19 Release Notes (draft)
==============================
Updates since v2.18
-------------------
UI, Workflows & Features
* "git diff" compares the index and the working tree. For paths
added with intent-to-add bit, the command shows the full contents
of them as added, but the paths themselves were not marked as new
files. They are now shown as new by default.
"git apply" learned the "--intent-to-add" option so that an
otherwise working-tree-only application of a patch will add new
paths to the index marked with the "intent-to-add" bit.
* "git grep" learned the "--column" option that gives not just the
line number but the column number of the hit.
* The "-l" option in "git branch -l" is an unfortunate short-hand for
"--create-reflog", but many users, both old and new, somehow expect
it to be something else, perhaps "--list". This step warns when "-l"
is used as a short-hand for "--create-reflog" and warns about the
future repurposing of the it when it is used.
* The userdiff pattern for .php has been updated.
* The content-transfer-encoding of the message "git send-email" sends
out by default was 8bit, which can cause trouble when there is an
overlong line to bust RFC 5322/2822 limit. A new option 'auto' to
automatically switch to quoted-printable when there is such a line
in the payload has been introduced and is made the default.
* "git checkout" and "git worktree add" learned to honor
checkout.defaultRemote when auto-vivifying a local branch out of a
remote tracking branch in a repository with multiple remotes that
have tracking branches that share the same names.
(merge 8d7b558bae ab/checkout-default-remote later to maint).
* "git grep" learned the "--only-matching" option.
* "git rebase --rebase-merges" mode now handles octopus merges as
well.
* Add a server-side knob to skip commits in exponential/fibbonacci
stride in an attempt to cover wider swath of history with a smaller
number of iterations, potentially accepting a larger packfile
transfer, instead of going back one commit a time during common
ancestor discovery during the "git fetch" transaction.
(merge 42cc7485a2 jt/fetch-negotiator-skipping later to maint).
* A new configuration variable core.usereplacerefs has been added,
primarily to help server installations that want to ignore the
replace mechanism altogether.
* Teach "git tag -s" etc. a few configuration variables (gpg.format
that can be set to "openpgp" or "x509", and gpg.<format>.program
that is used to specify what program to use to deal with the format)
to allow x.509 certs with CMS via "gpgsm" to be used instead of
openpgp via "gnupg".
* Many more strings are prepared for l10n.
* "git p4 submit" learns to ask its own pre-submit hook if it should
continue with submitting.
* The test performed at the receiving end of "git push" to prevent
bad objects from entering repository can be customized via
receive.fsck.* configuration variables; we now have gained a
counterpart to do the same on the "git fetch" side, with
fetch.fsck.* configuration variables.
* "git pull --rebase=interactive" learned "i" as a short-hand for
"interactive".
* "git instaweb" has been adjusted to run better with newer Apache on
RedHat based distros.
* "git range-diff" is a reimplementation of "git tbdiff" that lets us
compare individual patches in two iterations of a topic.
* The sideband code learned to optionally paint selected keywords at
the beginning of incoming lines on the receiving end.
Performance, Internal Implementation, Development Support etc.
* The bulk of "git submodule foreach" has been rewritten in C.
* The in-core "commit" object had an all-purpose "void *util" field,
which was tricky to use especially in library-ish part of the
code. All of the existing uses of the field has been migrated to a
more dedicated "commit-slab" mechanism and the field is eliminated.
* A less often used command "git show-index" has been modernized.
(merge fb3010c31f jk/show-index later to maint).
* The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* Continuing with the idea to programatically enumerate various
pieces of data required for command line completion, teach the
codebase to report the list of configuration variables
subcommands care about to help complete them.
* Separate "rebase -p" codepath out of "rebase -i" implementation to
slim down the latter and make it easier to manage.
* Make refspec parsing codepath more robust.
* Some flaky tests have been fixed.
* Continuing with the idea to programmatically enumerate various
pieces of data required for command line completion, the codebase
has been taught to enumerate options prefixed with "--no-" to
negate them.
* Build and test procedure for netrc credential helper (in contrib/)
has been updated.
* The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* Remove unused function definitions and declarations from ewah
bitmap subsystem.
* Code preparation to make "git p4" closer to be usable with Python 3.
* Tighten the API to make it harder to misuse in-tree .gitmodules
file, even though it shares the same syntax with configuration
files, to read random configuration items from it.
* "git fast-import" has been updated to avoid attempting to create
delta against a zero-byte-long string, which is pointless.
* The codebase has been updated to compile cleanly with -pedantic
option.
(merge 2b647a05d7 bb/pedantic later to maint).
* The character display width table has been updated to match the
latest Unicode standard.
(merge 570951eea2 bb/unicode-11-width later to maint).
* test-lint now looks for broken use of "VAR=VAL shell_func" in test
scripts.
* Conversion from uchar[40] to struct object_id continues.
* Recent "security fix" to pay attention to contents of ".gitmodules"
while accepting "git push" was a bit overly strict than necessary,
which has been adjusted.
* "git fsck" learns to make sure the optional commit-graph file is in
a sane state.
* "git diff --color-moved" feature has further been tweaked.
* Code restructuring and a small fix to transport protocol v2 during
fetching.
* Parsing of -L[<N>][,[<M>]] parameters "git blame" and "git log"
take has been tweaked.
* lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.
* Various glitches in the heuristics of merge-recursive strategy have
been documented in new tests.
* "git fetch" learned a new option "--negotiation-tip" to limit the
set of commits it tells the other end as "have", to reduce wasted
bandwidth and cycles, which would be helpful when the receiving
repository has a lot of refs that have little to do with the
history at the remote it is fetching from.
* For a large tree, the index needs to hold many cache entries
allocated on heap. These cache entries are now allocated out of a
dedicated memory pool to amortize malloc(3) overhead.
* Tests to cover various conflicting cases have been added for
merge-recursive.
* Tests to cover conflict cases that involve submodules have been
added for merge-recursive.
* Look for broken "&&" chains that are hidden in subshell, many of
which have been found and corrected.
* The singleton commit-graph in-core instance is made per in-core
repository instance.
* "make DEVELOPER=1 DEVOPTS=pedantic" allows developers to compile
with -pedantic option, which may catch more problematic program
constructs and potential bugs.
* Preparatory code to later add json output for telemetry data has
been added.
* Update the way we use Coccinelle to find out-of-style code that
need to be modernised.
* It is too easy to misuse system API functions such as strcat();
these selected functions are now forbidden in this codebase and
will cause a compilation failure.
* Add a script (in contrib/) to help users of VSCode work better with
our codebase.
* The Travis CI scripts were taught to ship back the test data from
failed tests.
(merge aea8879a6a sg/travis-retrieve-trash-upon-failure later to maint).
* The parse-options machinery learned to refrain from enclosing
placeholder string inside a "<bra" and "ket>" pair automatically
without PARSE_OPT_LITERAL_ARGHELP. Existing help text for option
arguments that are not formatted correctly have been identified and
fixed.
(merge 5f0df44cd7 rs/parse-opt-lithelp later to maint).
* Noiseword "extern" has been removed from function decls in the
header files.
* A few atoms like %(objecttype) and %(objectsize) in the format
specifier of "for-each-ref --format=<format>" can be filled without
getting the full contents of the object, but just with the object
header. These cases have been optimized by calling
oid_object_info() API (instead of reading and inspecting the data).
* The end result of documentation update has been made to be
inspected more easily to help developers.
* The API to iterate over all objects learned to optionally list
objects in the order they appear in packfiles, which helps locality
of access if the caller accesses these objects while as objects are
enumerated.
* Improve built-in facility to catch broken &&-chain in the tests.
* The more library-ish parts of the codebase learned to work on the
in-core index-state instance that is passed in by their callers,
instead of always working on the singleton "the_index" instance.
* A test prerequisite defined by various test scripts with slightly
different semantics has been consolidated into a single copy and
made into a lazily defined one.
(merge 6ec633059a wc/make-funnynames-shared-lazy-prereq later to maint).
* After a partial clone, repeated fetches from promisor remote would
have accumulated many packfiles marked with .promisor bit without
getting them coalesced into fewer packfiles, hurting performance.
"git repack" now learned to repack them.
Fixes since v2.18
-----------------
* "git remote update" can take both a single remote nickname and a
nickname for remote groups, and the completion script (in contrib/)
has been taught about it.
(merge 9cd4382ad5 ls/complete-remote-update-names later to maint).
* "git fetch --shallow-since=<cutoff>" that specifies the cut-off
point that is newer than the existing history used to end up
grabbing the entire history. Such a request now errors out.
(merge e34de73c56 nd/reject-empty-shallow-request later to maint).
* Fix for 2.17-era regression around `core.safecrlf`.
(merge 6cb09125be as/safecrlf-quiet-fix later to maint).
* The recent addition of "partial clone" experimental feature kicked
in when it shouldn't, namely, when there is no partial-clone filter
defined even if extensions.partialclone is set.
(merge cac1137dc4 jh/partial-clone later to maint).
* "git send-pack --signed" (hence "git push --signed" over the http
transport) did not read user ident from the config mechanism to
determine whom to sign the push certificate as, which has been
corrected.
(merge d067d98887 ms/send-pack-honor-config later to maint).
* "git fetch-pack --all" used to unnecessarily fail upon seeing an
annotated tag that points at an object other than a commit.
(merge c12c9df527 jk/fetch-all-peeled-fix later to maint).
* When user edits the patch in "git add -p" and the user's editor is
set to strip trailing whitespaces indiscriminately, an empty line
that is unchanged in the patch would become completely empty
(instead of a line with a sole SP on it). The code introduced in
Git 2.17 timeframe failed to parse such a patch, but now it learned
to notice the situation and cope with it.
(merge f4d35a6b49 pw/add-p-recount later to maint).
* The code to try seeing if a fetch is necessary in a submodule
during a fetch with --recurse-submodules got confused when the path
to the submodule was changed in the range of commits in the
superproject, sometimes showing "(null)". This has been corrected.
* "git submodule" did not correctly adjust core.worktree setting that
indicates whether/where a submodule repository has its associated
working tree across various state transitions, which has been
corrected.
(merge 984cd77ddb sb/submodule-core-worktree later to maint).
* Bugfix for "rebase -i" corner case regression.
(merge a9279c6785 pw/rebase-i-keep-reword-after-conflict later to maint).
* Recently added "--base" option to "git format-patch" command did
not correctly generate prereq patch ids.
(merge 15b76c1fb3 xy/format-patch-prereq-patch-id-fix later to maint).
* POSIX portability fix in Makefile to fix a glitch introduced a few
releases ago.
(merge 6600054e9b dj/runtime-prefix later to maint).
* "git filter-branch" when used with the "--state-branch" option
still attempted to rewrite the commits whose filtered result is
known from the previous attempt (which is recorded on the state
branch); the command has been corrected not to waste cycles doing
so.
(merge 709cfe848a mb/filter-branch-optim later to maint).
* Clarify that setting core.ignoreCase to deviate from reality would
not turn a case-incapable filesystem into a case-capable one.
(merge 48294b512a ms/core-icase-doc later to maint).
* "fsck.skipList" did not prevent a blob object listed there from
being inspected for is contents (e.g. we recently started to
inspect the contents of ".gitmodules" for certain malicious
patterns), which has been corrected.
(merge fb16287719 rj/submodule-fsck-skip later to maint).
* "git checkout --recurse-submodules another-branch" did not report
in which submodule it failed to update the working tree, which
resulted in an unhelpful error message.
(merge ba95d4e4bd sb/submodule-move-head-error-msg later to maint).
* "git rebase" behaved slightly differently depending on which one of
the three backends gets used; this has been documented and an
effort to make them more uniform has begun.
(merge b00bf1c9a8 en/rebase-consistency later to maint).
* The "--ignore-case" option of "git for-each-ref" (and its friends)
did not work correctly, which has been fixed.
(merge e674eb2528 jk/for-each-ref-icase later to maint).
* "git fetch" failed to correctly validate the set of objects it
received when making a shallow history deeper, which has been
corrected.
(merge cf1e7c0770 jt/connectivity-check-after-unshallow later to maint).
* Partial clone support of "git clone" has been updated to correctly
validate the objects it receives from the other side. The server
side has been corrected to send objects that are directly
requested, even if they may match the filtering criteria (e.g. when
doing a "lazy blob" partial clone).
(merge a7e67c11b8 jt/partial-clone-fsck-connectivity later to maint).
* Handling of an empty range by "git cherry-pick" was inconsistent
depending on how the range ended up to be empty, which has been
corrected.
(merge c5e358d073 jk/empty-pick-fix later to maint).
* "git reset --merge" (hence "git merge ---abort") and "git reset --hard"
had trouble working correctly in a sparsely checked out working
tree after a conflict, which has been corrected.
(merge b33fdfc34c mk/merge-in-sparse-checkout later to maint).
* Correct a broken use of "VAR=VAL shell_func" in a test.
(merge 650161a277 jc/t3404-one-shot-export-fix later to maint).
* "git rev-parse ':/substring'" did not consider the history leading
only to HEAD when looking for a commit with the given substring,
when the HEAD is detached. This has been fixed.
(merge 6b3351e799 wc/find-commit-with-pattern-on-detached-head later to maint).
* Build doc update for Windows.
(merge ede8d89bb1 nd/command-list later to maint).
* core.commentchar is now honored when preparing the list of commits
to replay in "rebase -i".
* "git pull --rebase" on a corrupt HEAD caused a segfault. In
general we substitute an empty tree object when running the in-core
equivalent of the diff-index command, and the codepath has been
corrected to do so as well to fix this issue.
(merge 3506dc9445 jk/has-uncommitted-changes-fix later to maint).
* httpd tests saw occasional breakage due to the way its access log
gets inspected by the tests, which has been updated to make them
less flaky.
(merge e8b3b2e275 sg/httpd-test-unflake later to maint).
* Tests to cover more D/F conflict cases have been added for
merge-recursive.
* "git gc --auto" opens file descriptors for the packfiles before
spawning "git repack/prune", which would upset Windows that does
not want a process to work on a file that is open by another
process. The issue has been worked around.
(merge 12e73a3ce4 kg/gc-auto-windows-workaround later to maint).
* The recursive merge strategy did not properly ensure there was no
change between HEAD and the index before performing its operation,
which has been corrected.
(merge 55f39cf755 en/dirty-merge-fixes later to maint).
* "git rebase" started exporting GIT_DIR environment variable and
exposing it to hook scripts when part of it got rewritten in C.
Instead of matching the old scripted Porcelains' behaviour,
compensate by also exporting GIT_WORK_TREE environment as well to
lessen the damage. This can harm existing hooks that want to
operate on different repository, but the current behaviour is
already broken for them anyway.
(merge ab5e67d751 bc/sequencer-export-work-tree-as-well later to maint).
* "git send-email" when using in a batched mode that limits the
number of messages sent in a single SMTP session lost the contents
of the variable used to choose between tls/ssl, unable to send the
second and later batches, which has been fixed.
(merge 636f3d7ac5 jm/send-email-tls-auth-on-batch later to maint).
* The lazy clone support had a few places where missing but promised
objects were not correctly tolerated, which have been fixed.
* One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by
"dimmed-zebra".
(merge e3f2f5f9cd es/diff-color-moved-fix later to maint).
* The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement. "git
clone" when learned to speak v2 forgot to do so, which has been
corrected.
(merge 402c47d939 bw/clone-ref-prefixes later to maint).
* "git diff --histogram" had a bad memory usage pattern, which has
been rearranged to reduce the peak usage.
(merge 79cb2ebb92 sb/histogram-less-memory later to maint).
* Code clean-up to use size_t/ssize_t when they are the right type.
(merge 7726d360b5 jk/size-t later to maint).
* The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement. "git
fetch $remote branch:branch" that asks tags that point into the
history leading to the "branch" automatically followed sent to
narrow prefix and broke the tag following, which has been fixed.
(merge 2b554353a5 jt/tag-following-with-proto-v2-fix later to maint).
* When the sparse checkout feature is in use, "git cherry-pick" and
other mergy operations lost the skip_worktree bit when a path that
is excluded from checkout requires content level merge, which is
resolved as the same as the HEAD version, without materializing the
merge result in the working tree, which made the path appear as
deleted. This has been corrected by preserving the skip_worktree
bit (and not materializing the file in the working tree).
(merge 2b75fb601c en/merge-recursive-skip-fix later to maint).
* The "author-script" file "git rebase -i" creates got broken when
we started to move the command away from shell script, which is
getting fixed now.
(merge 5522bbac20 es/rebase-i-author-script-fix later to maint).
* The automatic tree-matching in "git merge -s subtree" was broken 5
years ago and nobody has noticed since then, which is now fixed.
(merge 2ec4150713 jk/merge-subtree-heuristics later to maint).
* "git fetch $there refs/heads/s" ought to fetch the tip of the
branch 's', but when "refs/heads/refs/heads/s", i.e. a branch whose
name is "refs/heads/s" exists at the same time, fetched that one
instead by mistake. This has been corrected to honor the usual
disambiguation rules for abbreviated refnames.
(merge 60650a48c0 jt/refspec-dwim-precedence-fix later to maint).
* Futureproofing a helper function that can easily be misused.
(merge 65bb21e77e es/want-color-fd-defensive later to maint).
* The http-backend (used for smart-http transport) used to slurp the
whole input until EOF, without paying attention to CONTENT_LENGTH
that is supplied in the environment and instead expecting the Web
server to close the input stream. This has been fixed.
(merge eebfe40962 mk/http-backend-content-length later to maint).
* "git merge --abort" etc. did not clean things up properly when
there were conflicted entries in the index in certain order that
are involved in D/F conflicts. This has been corrected.
(merge ad3762042a en/abort-df-conflict-fixes later to maint).
* "git diff --indent-heuristic" had a bad corner case performance.
(merge 301ef85401 sb/indent-heuristic-optim later to maint).
* The "--exec" option to "git rebase --rebase-merges" placed the exec
commands at wrong places, which has been corrected.
* "git verify-tag" and "git verify-commit" have been taught to use
the exit status of underlying "gpg --verify" to signal bad or
untrusted signature they found.
(merge 4e5dc9ca17 jc/gpg-status later to maint).
* "git mergetool" stopped and gave an extra prompt to continue after
the last path has been handled, which did not make much sense.
(merge d651a54b8a ng/mergetool-lose-final-prompt later to maint).
* Among the three codepaths we use O_APPEND to open a file for
appending, one used for writing GIT_TRACE output requires O_APPEND
implementation that behaves sensibly when multiple processes are
writing to the same file. POSIX emulation used in the Windows port
has been updated to improve in this area.
(merge d641097589 js/mingw-o-append later to maint).
* "git pull --rebase -v" in a repository with a submodule barfed as
an intermediate process did not understand what "-v(erbose)" flag
meant, which has been fixed.
(merge e84c3cf3dc sb/pull-rebase-submodule later to maint).
* Recent update to "git config" broke updating variable in a
subsection, which has been corrected.
(merge bff7df7a87 sb/config-write-fix later to maint).
* When "git rebase -i" is told to squash two or more commits into
one, it labeled the log message for each commit with its number.
It correctly called the first one "1st commit", but the next one
was "commit #1", which was off-by-one. This has been corrected.
(merge dd2e36ebac pw/rebase-i-squash-number-fix later to maint).
* "git rebase -i", when a 'merge <branch>' insn in its todo list
fails, segfaulted, which has been (minimally) corrected.
(merge bc9238bb09 pw/rebase-i-merge-segv-fix later to maint).
* "git cherry-pick --quit" failed to remove CHERRY_PICK_HEAD even
though we won't be in a cherry-pick session after it returns, which
has been corrected.
(merge 3e7dd99208 nd/cherry-pick-quit-fix later to maint).
* Code cleanup, docfix, build fix, etc.
(merge aee9be2ebe sg/update-ref-stdin-cleanup later to maint).
(merge 037714252f jc/clean-after-sanity-tests later to maint).
(merge 5b26c3c941 en/merge-recursive-cleanup later to maint).
(merge 0dcbc0392e bw/config-refer-to-gitsubmodules-doc later to maint).
(merge bb4d000e87 bw/protocol-v2 later to maint).
(merge 928f0ab4ba vs/typofixes later to maint).
(merge d7f590be84 en/rebase-i-microfixes later to maint).
(merge 81d395cc85 js/rebase-recreate-merge later to maint).
(merge 51d1863168 tz/exclude-doc-smallfixes later to maint).
(merge a9aa3c0927 ds/commit-graph later to maint).
(merge 5cf8e06474 js/enhanced-version-info later to maint).
(merge 6aaded5509 tb/config-default later to maint).
(merge 022d2ac1f3 sb/blame-color later to maint).
(merge 5a06a20e0c bp/test-drop-caches-for-windows later to maint).
(merge dd61cc1c2e jk/ui-color-always-to-auto later to maint).
(merge 1e83b9bfdd sb/trailers-docfix later to maint).
(merge ab29f1b329 sg/fast-import-dump-refs-on-checkpoint-fix later to maint).
(merge 6a8ad880f0 jn/subtree-test-fixes later to maint).
(merge ffbd51cc60 nd/pack-objects-threading-doc later to maint).
(merge e9dac7be60 es/mw-to-git-chain-fix later to maint).
(merge fe583c6c7a rs/remote-mv-leakfix later to maint).
(merge 69885ab015 en/t3031-title-fix later to maint).
(merge 8578037bed nd/config-blame-sort later to maint).
(merge 8ad169c4ba hn/config-in-code-comment later to maint).
(merge b7446fcfdf ar/t4150-am-scissors-test-fix later to maint).
(merge a8132410ee js/typofixes later to maint).
(merge 388d0ff6e5 en/update-index-doc later to maint).
(merge e05aa688dd jc/update-index-doc later to maint).
(merge 10c600172c sg/t5310-empty-input-fix later to maint).
(merge 5641eb9465 jh/partial-clone-doc later to maint).
(merge 2711b1ad5e ab/submodule-relative-url-tests later to maint).
----------------------------------------------------------------
Changes since v2.18.0 are as follows:
Aaron Schrab (1):
sequencer: use configured comment character
Alban Gruin (4):
rebase: introduce a dedicated backend for --preserve-merges
rebase: strip unused code in git-rebase--preserve-merges.sh
rebase: use the new git-rebase--preserve-merges.sh
rebase: remove -p code from git-rebase--interactive.sh
Alejandro R. Sedeño (1):
Makefile: tweak sed invocation
Aleksandr Makarov (1):
for-each-ref: consistently pass WM_IGNORECASE flag
Andrei Rybak (2):
Documentation: fix --color option formatting
t4150: fix broken test for am --scissors
Anthony Sottile (1):
config.c: fix regression for core.safecrlf false
Antonio Ospite (6):
config: move config_from_gitmodules to submodule-config.c
submodule-config: add helper function to get 'fetch' config from .gitmodules
submodule-config: add helper to get 'update-clone' config from .gitmodules
submodule-config: make 'config_from_gitmodules' private
submodule-config: pass repository as argument to config_from_gitmodules
submodule-config: reuse config_from_gitmodules in repo_read_gitmodules
Beat Bolli (10):
builtin/config: work around an unsized array forward declaration
unicode: update the width tables to Unicode 11
connect.h: avoid forward declaration of an enum
refs/refs-internal.h: avoid forward declaration of an enum
convert.c: replace "\e" escapes with "\033".
sequencer.c: avoid empty statements at top level
string-list.c: avoid conversion from void * to function pointer
utf8.c: avoid char overflow
Makefile: add a DEVOPTS flag to get pedantic compilation
packfile: ensure that enum object_type is defined
Ben Peart (3):
convert log_ref_write_fd() to use strbuf
handle lower case drive letters on Windows
t3507: add a testcase showing failure with sparse checkout
Brandon Williams (15):
commit: convert commit_graft_pos() to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
test-pkt-line: add unpack-sideband subcommand
docs: link to gitsubmodules
upload-pack: implement ref-in-want
upload-pack: test negotiation with changing repository
fetch: refactor the population of peer ref OIDs
fetch: refactor fetch_refs into two functions
fetch: refactor to make function args narrower
fetch-pack: put shallow info in output parameter
fetch-pack: implement ref-in-want
clone: send ref-prefixes when using protocol v2
fetch-pack: mark die strings for translation
pack-protocol: mention and point to docs for protocol v2
Chen Bin (1):
git-p4: add the `p4-pre-submit` hook
Christian Couder (1):
t9104: kosherly remove remote refs
Derrick Stolee (43):
ref-filter: fix outdated comment on in_commit_list
commit: add generation number to struct commit
commit-graph: compute generation numbers
commit: use generations in paint_down_to_common()
commit-graph: always load commit-graph information
ref-filter: use generation number for --contains
commit: use generation numbers for in_merge_bases()
commit: add short-circuit to paint_down_to_common()
commit: use generation number in remove_redundant()
merge: check config before loading commits
commit-graph.txt: update design document
commit-graph: fix UX issue when .lock file exists
ewah/bitmap.c: delete unused 'bitmap_clear()'
ewah/bitmap.c: delete unused 'bitmap_each_bit()'
ewah_bitmap: delete unused 'ewah_and()'
ewah_bitmap: delete unused 'ewah_and_not()'
ewah_bitmap: delete unused 'ewah_not()'
ewah_bitmap: delete unused 'ewah_or()'
ewah_io: delete unused 'ewah_serialize()'
t5318-commit-graph.sh: use core.commitGraph
commit-graph: UNLEAK before die()
commit-graph: fix GRAPH_MIN_SIZE
commit-graph: parse commit from chosen graph
commit: force commit to parse from object database
commit-graph: load a root tree from specific graph
commit-graph: add 'verify' subcommand
commit-graph: verify catches corrupt signature
commit-graph: verify required chunks are present
commit-graph: verify corrupt OID fanout and lookup
commit-graph: verify objects exist
commit-graph: verify root tree OIDs
commit-graph: verify parent list
commit-graph: verify generation number
commit-graph: verify commit date
commit-graph: test for corrupted octopus edge
commit-graph: verify contents match checksum
fsck: verify commit-graph
commit-graph: use string-list API for input
commit-graph: add '--reachable' option
gc: automatically write commit-graph files
commit-graph: update design document
commit-graph: fix documentation inconsistencies
coccinelle: update commit.cocci
Elijah Newren (63):
t6036, t6042: use test_create_repo to keep tests independent
t6036, t6042: use test_line_count instead of wc -l
t6036, t6042: prefer test_path_is_file, test_path_is_missing
t6036, t6042: prefer test_cmp to sequences of test
t6036: prefer test_when_finished to manual cleanup in following test
merge-recursive: fix miscellaneous grammar error in comment
merge-recursive: fix numerous argument alignment issues
merge-recursive: align labels with their respective code blocks
merge-recursive: clarify the rename_dir/RENAME_DIR meaning
merge-recursive: rename conflict_rename_*() family of functions
merge-recursive: add pointer about unduly complex looking code
git-rebase.txt: document incompatible options
git-rebase.sh: update help messages a bit
t3422: new testcases for checking when incompatible options passed
git-rebase: error out when incompatible options passed
git-rebase.txt: address confusion between --no-ff vs --force-rebase
directory-rename-detection.txt: technical docs on abilities and limitations
git-rebase.txt: document behavioral differences between modes
t3401: add directory rename testcases for rebase and am
git-rebase: make --allow-empty-message the default
t3418: add testcase showing problems with rebase -i and strategy options
Fix use of strategy options with interactive rebases
git-rebase--merge: modernize "git-$cmd" to "git $cmd"
apply: fix grammar error in comment
t5407: fix test to cover intended arguments
read-cache.c: move index_has_changes() from merge.c
index_has_changes(): avoid assuming operating on the_index
t6044: verify that merges expected to abort actually abort
t6036: add a failed conflict detection case with symlink modify/modify
t6036: add a failed conflict detection case with symlink add/add
t6036: add a failed conflict detection case with submodule modify/modify
t6036: add a failed conflict detection case with submodule add/add
t6036: add a failed conflict detection case with conflicting types
t6042: add testcase covering rename/add/delete conflict type
t6042: add testcase covering rename/rename(2to1)/delete/delete conflict
t6042: add testcase covering long chains of rename conflicts
t6036: add lots of detail for directory/file conflicts in recursive case
t6036: add a failed conflict detection case: regular files, different modes
t6044: add a testcase for index matching head, when head doesn't match HEAD
merge-recursive: make sure when we say we abort that we actually abort
merge-recursive: fix assumption that head tree being merged is HEAD
t6044: add more testcases with staged changes before a merge is invoked
merge-recursive: enforce rule that index matches head before merging
merge: fix misleading pre-merge check documentation
t7405: add a file/submodule conflict
t7405: add a directory/submodule conflict
t7405: verify 'merge --abort' works after submodule/path conflicts
merge-recursive: preserve skip_worktree bit when necessary
t1015: demonstrate directory/file conflict recovery failures
read-cache: fix directory/file conflict handling in read_index_unmerged()
t3031: update test description to mention desired behavior
t7406: fix call that was failing for the wrong reason
t7406: simplify by using diff --name-only instead of diff --raw
t7406: avoid having git commands upstream of a pipe
t7406: prefer test_* helper functions to test -[feds]
t7406: avoid using test_must_fail for commands other than git
git-update-index.txt: reword possibly confusing example
Add missing includes and forward declarations
alloc: make allocate_alloc_state and clear_alloc_state more consistent
Move definition of enum branch_track from cache.h to branch.h
urlmatch.h: fix include guard
compat/precompose_utf8.h: use more common include guard style
Remove forward declaration of an enum
Eric Sunshine (53):
t: use test_might_fail() instead of manipulating exit code manually
t: use test_write_lines() instead of series of 'echo' commands
t: use sane_unset() rather than 'unset' with broken &&-chain
t: drop unnecessary terminating semicolon in subshell
t/lib-submodule-update: fix "absorbing" test
t5405: use test_must_fail() instead of checking exit code manually
t5406: use write_script() instead of birthing shell script manually
t5505: modernize and simplify hard-to-digest test
t6036: fix broken "merge fails but has appropriate contents" tests
t7201: drop pointless "exit 0" at end of subshell
t7400: fix broken "submodule add/reconfigure --force" test
t7810: use test_expect_code() instead of hand-rolled comparison
t9001: fix broken "invoke hook" test
t9814: simplify convoluted check that command correctly errors out
t0000-t0999: fix broken &&-chains
t1000-t1999: fix broken &&-chains
t2000-t2999: fix broken &&-chains
t3000-t3999: fix broken &&-chains
t3030: fix broken &&-chains
t4000-t4999: fix broken &&-chains
t5000-t5999: fix broken &&-chains
t6000-t6999: fix broken &&-chains
t7000-t7999: fix broken &&-chains
t9000-t9999: fix broken &&-chains
t9119: fix broken &&-chains
t6046/t9833: fix use of "VAR=VAL cmd" with a shell function
t/check-non-portable-shell: stop being so polite
t/check-non-portable-shell: make error messages more compact
t/check-non-portable-shell: detect "FOO=bar shell_func"
t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
t/Makefile: add machinery to check correctness of chainlint.sed
t/chainlint: add chainlint "basic" test cases
t/chainlint: add chainlint "whitespace" test cases
t/chainlint: add chainlint "one-liner" test cases
t/chainlint: add chainlint "nested subshell" test cases
t/chainlint: add chainlint "loop" and "conditional" test cases
t/chainlint: add chainlint "cuddled" test cases
t/chainlint: add chainlint "complex" test cases
t/chainlint: add chainlint "specialized" test cases
diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
mw-to-git/t9360: fix broken &&-chain
t/chainlint.sed: drop extra spaces from regex character class
sequencer: fix "rebase -i --root" corrupting author header
sequencer: fix "rebase -i --root" corrupting author header timezone
sequencer: fix "rebase -i --root" corrupting author header timestamp
sequencer: don't die() on bogus user-edited timestamp
color: protect against out-of-bounds reads and writes
chainlint: match arbitrary here-docs tags rather than hard-coded names
chainlint: match 'quoted' here-doc tags
chainlint: recognize multi-line $(...) when command cuddled with "$("
chainlint: let here-doc and multi-line string commence on same line
chainlint: recognize multi-line quoted strings more robustly
chainlint: add test of pathological case which triggered false positive
Han-Wen Nienhuys (2):
config: document git config getter return value
sideband: highlight keywords in remote sideband output
Henning Schild (9):
builtin/receive-pack: use check_signature from gpg-interface
gpg-interface: make parse_gpg_output static and remove from interface header
gpg-interface: add new config to select how to sign a commit
t/t7510: check the validation of the new config gpg.format
gpg-interface: introduce an abstraction for multiple gpg formats
gpg-interface: do not hardcode the key string len anymore
gpg-interface: introduce new config to select per gpg format program
gpg-interface: introduce new signature format "x509" using gpgsm
gpg-interface t: extend the existing GPG tests with GPGSM
Isabella Stephens (2):
blame: prevent error if range ends past end of file
log: prevent error if line range ends past end of file
Jameson Miller (8):
read-cache: teach refresh_cache_entry to take istate
read-cache: teach make_cache_entry to take object_id
block alloc: add lifecycle APIs for cache_entry structs
mem-pool: only search head block for available space
mem-pool: add life cycle management functions
mem-pool: fill out functionality
block alloc: allocate cache entries from mem_pool
block alloc: add validations around cache_entry lifecyle
Jeff Hostetler (1):
json_writer: new routines to create JSON data
Jeff King (48):
make show-index a builtin
show-index: update documentation for index v2
fetch-pack: don't try to fetch peel values with --all
ewah: drop ewah_deserialize function
ewah: drop ewah_serialize_native function
t3200: unset core.logallrefupdates when testing reflog creation
t: switch "branch -l" to "branch --create-reflog"
branch: deprecate "-l" option
config: turn die_on_error into caller-facing enum
config: add CONFIG_ERROR_SILENT handler
config: add options parameter to git_config_from_mem
fsck: silence stderr when parsing .gitmodules
t6300: add a test for --ignore-case
ref-filter: avoid backend filtering with --ignore-case
t5500: prettify non-commit tag tests
sequencer: handle empty-set cases consistently
sequencer: don't say BUG on bogus input
has_uncommitted_changes(): fall back to empty tree
fsck: split ".gitmodules too large" error from parse failure
fsck: downgrade gitmodulesParse default to "info"
blame: prefer xsnprintf to strcpy for colors
check_replace_refs: fix outdated comment
check_replace_refs: rename to read_replace_refs
add core.usereplacerefs config option
reencode_string: use st_add/st_mult helpers
reencode_string: use size_t for string lengths
strbuf: use size_t for length in intermediate variables
strbuf_readlink: use ssize_t
pass st.st_size as hint for strbuf_readlink()
strbuf_humanise: use unsigned variables
automatically ban strcpy()
banned.h: mark strcat() as banned
banned.h: mark sprintf() as banned
banned.h: mark strncpy() as banned
score_trees(): fix iteration over trees with missing entries
add a script to diff rendered documentation
t5552: suppress upload-pack trace output
for_each_*_object: store flag definitions in a single location
for_each_*_object: take flag arguments as enum
for_each_*_object: give more comprehensive docstrings
for_each_packed_object: support iterating in pack-order
t1006: test cat-file --batch-all-objects with duplicates
cat-file: rename batch_{loose,packed}_object callbacks
cat-file: support "unordered" output for --batch-all-objects
cat-file: use oidset check-and-insert
cat-file: split batch "buf" into two variables
cat-file: use a single strbuf for all output
for_each_*_object: move declarations to object-store.h
Johannes Schindelin (41):
Makefile: fix the "built from commit" code
merge: allow reading the merge commit message from a file
rebase --rebase-merges: add support for octopus merges
rebase --rebase-merges: adjust man page for octopus support
vcbuild/README: update to accommodate for missing common-cmds.h
t7406: avoid failures solely due to timing issues
contrib: add a script to initialize VS Code configuration
vscode: hard-code a couple defines
cache.h: extract enum declaration from inside a struct declaration
mingw: define WIN32 explicitly
vscode: only overwrite C/C++ settings
vscode: wrap commit messages at column 72 by default
vscode: use 8-space tabs, no trailing ws, etc for Git's source code
vscode: add a dictionary for cSpell
vscode: let cSpell work on commit messages, too
pull --rebase=<type>: allow single-letter abbreviations for the type
t3430: demonstrate what -r, --autosquash & --exec should do
git-compat-util.h: fix typo
remote-curl: remove spurious period
rebase --exec: make it work with --rebase-merges
linear-assignment: a function to solve least-cost assignment problems
Introduce `range-diff` to compare iterations of a topic branch
range-diff: first rudimentary implementation
range-diff: improve the order of the shown commits
range-diff: also show the diff between patches
range-diff: right-trim commit messages
range-diff: indent the diffs just like tbdiff
range-diff: suppress the diff headers
range-diff: adjust the output of the commit pairs
range-diff: do not show "function names" in hunk headers
range-diff: use color for the commit pairs
color: add the meta color GIT_COLOR_REVERSE
diff: add an internal option to dual-color diffs of diffs
range-diff: offer to dual-color the diffs
range-diff --dual-color: skip white-space warnings
range-diff: populate the man page
completion: support `git range-diff`
range-diff: left-pad patch numbers
range-diff: make --dual-color the default mode
range-diff: use dim/bold cues to improve dual color mode
chainlint: fix for core.autocrlf=true
Johannes Sixt (1):
mingw: enable atomic O_APPEND
Jonathan Nieder (11):
object: add repository argument to grow_object_hash
object: move grafts to object parser
commit: add repository argument to commit_graft_pos
commit: add repository argument to register_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to prepare_commit_graft
commit: add repository argument to lookup_commit_graft
subtree test: add missing && to &&-chain
subtree test: simplify preparation of expected results
doc hash-function-transition: pick SHA-256 as NewHash
partial-clone: render design doc using asciidoc
Jonathan Tan (28):
list-objects: check if filter is NULL before using
fetch-pack: split up everything_local()
fetch-pack: clear marks before re-marking
fetch-pack: directly end negotiation if ACK ready
fetch-pack: use ref adv. to prune "have" sent
fetch-pack: make negotiation-related vars local
fetch-pack: move common check and marking together
fetch-pack: introduce negotiator API
pack-bitmap: remove bitmap_git global variable
pack-bitmap: add free function
fetch-pack: write shallow, then check connectivity
fetch-pack: support negotiation tip whitelist
upload-pack: send refs' objects despite "filter"
clone: check connectivity even if clone is partial
revision: tolerate promised targets of tags
tag: don't warn if target is missing but promised
negotiator/skipping: skip commits during fetch
commit-graph: refactor preparing commit graph
object-store: add missing include
commit-graph: add missing forward declaration
commit-graph: add free_commit_graph
commit-graph: store graph in struct object_store
commit-graph: add repo arg to graph readers
t5702: test fetch with multiple refspecs at a time
fetch: send "refs/tags/" prefix upon CLI refspecs
fetch-pack: unify ref in and out param
repack: refactor setup of pack-objects cmd
repack: repack promisor objects if -a or -A is set
Josh Steadmon (1):
protocol-v2 doc: put HTTP headers after request
Jules Maselbas (1):
send-email: fix tls AUTH when sending batch
Junio C Hamano (18):
tests: clean after SANITY tests
ewah: delete unused 'rlwit_discharge_empty()'
Prepare to start 2.19 cycle
First batch for 2.19 cycle
Second batch for 2.19 cycle
fixup! connect.h: avoid forward declaration of an enum
fixup! refs/refs-internal.h: avoid forward declaration of an enum
t3404: fix use of "VAR=VAL cmd" with a shell function
Third batch for 2.19 cycle
Fourth batch for 2.19 cycle
remote: make refspec follow the same disambiguation rule as local refs
Fifth batch for 2.19 cycle
update-index: there no longer is `apply --index-info`
gpg-interface: propagate exit status from gpg back to the callers
Sixth batch for 2.19 cycle
Seventh batch for 2.19 cycle
sideband: do not read beyond the end of input
Git 2.19-rc0
Kana Natsuno (2):
t4018: add missing test cases for PHP
userdiff: support new keywords in PHP hunk header
Kim Gybels (1):
gc --auto: release pack files before auto packing
Kirill Smelkov (1):
fetch-pack: test explicitly that --all can fetch tag references pointing to non-commits
Luis Marsano (2):
git-credential-netrc: use in-tree Git.pm for tests
git-credential-netrc: fix exit status when tests fail
Luke Diamand (6):
git-p4: python3: replace <> with !=
git-p4: python3: replace dict.has_key(k) with "k in dict"
git-p4: python3: remove backticks
git-p4: python3: basestring workaround
git-p4: python3: use print() function
git-p4: python3: fix octal constants
Marc Strapetz (1):
Documentation: declare "core.ignoreCase" as internal variable
Martin Ågren (1):
refspec: initalize `refspec_item` in `valid_fetch_refspec()`
Masaya Suzuki (2):
builtin/send-pack: populate the default configs
doc: fix want-capability separator
Max Kirillov (4):
http-backend: cleanup writing to child process
http-backend: respect CONTENT_LENGTH as specified by rfc3875
unpack-trees: do not fail reset because of unmerged skipped entry
http-backend: respect CONTENT_LENGTH for receive-pack
Michael Barabanov (1):
filter-branch: skip commits present on --state-branch
Mike Hommey (1):
fast-import: do not call diff_delta() with empty buffer
Nguyễn Thái Ngọc Duy (98):
commit-slab.h: code split
commit-slab: support shared commit-slab
blame: use commit-slab for blame suspects instead of commit->util
describe: use commit-slab for commit names instead of commit->util
shallow.c: use commit-slab for commit depth instead of commit->util
sequencer.c: use commit-slab to mark seen commits
sequencer.c: use commit-slab to associate todo items to commits
revision.c: use commit-slab for show_source
bisect.c: use commit-slab for commit weight instead of commit->util
name-rev: use commit-slab for rev-name instead of commit->util
show-branch: use commit-slab for commit-name instead of commit->util
show-branch: note about its object flags usage
log: use commit-slab in prepare_bases() instead of commit->util
merge: use commit-slab in merge remote desc instead of commit->util
commit.h: delete 'util' field in struct commit
diff: ignore --ita-[in]visible-in-index when diffing worktree-to-tree
diff: turn --ita-invisible-in-index on by default
t2203: add a test about "diff HEAD" case
apply: add --intent-to-add
parse-options: option to let --git-completion-helper show negative form
completion: suppress some -no- options
Add and use generic name->id mapping code for color slot parsing
grep: keep all colors in an array
fsck: factor out msg_id_info[] lazy initialization code
help: add --config to list all available config
fsck: produce camelCase config key names
advice: keep config name in camelCase in advice_config[]
am: move advice.amWorkDir parsing back to advice.c
completion: drop the hard coded list of config vars
completion: keep other config var completion in camelCase
completion: support case-insensitive config vars
log-tree: allow to customize 'grafted' color
completion: complete general config vars in two steps
upload-pack: reject shallow requests that would return nothing
completion: collapse extra --no-.. options
Update messages in preparation for i18n
archive-tar.c: mark more strings for translation
archive-zip.c: mark more strings for translation
builtin/config.c: mark more strings for translation
builtin/grep.c: mark strings for translation
builtin/pack-objects.c: mark more strings for translation
builtin/replace.c: mark more strings for translation
commit-graph.c: mark more strings for translation
config.c: mark more strings for translation
connect.c: mark more strings for translation
convert.c: mark more strings for translation
dir.c: mark more strings for translation
environment.c: mark more strings for translation
exec-cmd.c: mark more strings for translation
object.c: mark more strings for translation
pkt-line.c: mark more strings for translation
refs.c: mark more strings for translation
refspec.c: mark more strings for translation
replace-object.c: mark more strings for translation
sequencer.c: mark more strings for translation
sha1-file.c: mark more strings for translation
transport.c: mark more strings for translation
transport-helper.c: mark more strings for translation
pack-objects: document about thread synchronization
apply.h: drop extern on func declaration
attr.h: drop extern from function declaration
blame.h: drop extern on func declaration
cache-tree.h: drop extern from function declaration
convert.h: drop 'extern' from function declaration
diffcore.h: drop extern from function declaration
diff.h: remove extern from function declaration
line-range.h: drop extern from function declaration
rerere.h: drop extern from function declaration
repository.h: drop extern from function declaration
revision.h: drop extern from function declaration
submodule.h: drop extern from function declaration
config.txt: reorder blame stuff to keep config keys sorted
Makefile: add missing dependency for command-list.h
diff.c: move read_index() code back to the caller
cache-tree: wrap the_index based wrappers with #ifdef
attr: remove an implicit dependency on the_index
convert.c: remove an implicit dependency on the_index
dir.c: remove an implicit dependency on the_index in pathspec code
preload-index.c: use the right index instead of the_index
ls-files: correct index argument to get_convert_attr_ascii()
unpack-trees: remove 'extern' on function declaration
unpack-trees: add a note about path invalidation
unpack-trees: don't shadow global var the_index
unpack-trees: convert clear_ce_flags* to avoid the_index
unpack-trees: avoid the_index in verify_absent()
pathspec.c: use the right index instead of the_index
submodule.c: use the right index instead of the_index
entry.c: use the right index instead of the_index
attr: remove index from git_attr_set_direction()
grep: use the right index instead of the_index
archive.c: avoid access to the_index
archive-*.c: use the right repository
resolve-undo.c: use the right index instead of the_index
apply.c: pass struct apply_state to more functions
apply.c: make init_apply_state() take a struct repository
apply.c: remove implicit dependency on the_index
blame.c: remove implicit dependency on the_index
cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD
Nicholas Guriev (1):
mergetool: don't suggest to continue after last file
Olga Telezhnaya (5):
ref-filter: add info_source to valid_atom
ref-filter: fill empty fields with empty values
ref-filter: initialize eaten variable
ref-filter: merge get_obj and get_object
ref-filter: use oid_object_info() to get object
Phillip Wood (5):
add -p: fix counting empty context lines in edited patches
sequencer: do not squash 'reword' commits when we hit conflicts
rebase -i: fix numbering in squash message
t3430: add conflicting commit
rebase -i: fix SIGSEGV when 'merge <branch>' fails
Prathamesh Chavan (4):
submodule foreach: correct '$path' in nested submodules from a subdirectory
submodule foreach: document '$sm_path' instead of '$path'
submodule foreach: document variable '$displaypath'
submodule: port submodule subcommand 'foreach' from shell to C
Ramsay Jones (3):
fsck: check skiplist for object in fsck_blob()
t6036: fix broken && chain in sub-shell
t5562: avoid non-portable "export FOO=bar" construct
René Scharfe (7):
remote: clear string_list after use in mv()
add, update-index: fix --chmod argument help
difftool: remove angular brackets from argument help
pack-objects: specify --index-version argument help explicitly
send-pack: specify --force-with-lease argument help explicitly
shortlog: correct option help for -w
parse-options: automatically infer PARSE_OPT_LITERAL_ARGHELP
SZEDER Gábor (19):
update-ref --stdin: use skip_prefix()
t7510-signed-commit: use 'test_must_fail'
tests: make forging GPG signed commits and tags more robust
t5541: clean up truncating access log
t/lib-httpd: add the strip_access_log() helper function
t/lib-httpd: avoid occasional failures when checking access.log
t5608: fix broken &&-chain
t9300: wait for background fast-import process to die after killing it
travis-ci: run Coccinelle static analysis with two parallel jobs
travis-ci: fail if Coccinelle static analysis found something to transform
coccinelle: mark the 'coccicheck' make target as .PHONY
coccinelle: use $(addsuffix) in 'coccicheck' make target
coccinelle: exclude sha1dc source files from static analysis
coccinelle: put sane filenames into output patches
coccinelle: extract dedicated make target to clean Coccinelle's results
travis-ci: include the trash directories of failed tests in the trace log
t5318: use 'test_cmp_bin' to compare commit-graph files
t5318: avoid unnecessary command substitutions
t5310-pack-bitmaps: fix bogus 'pack-objects to file can use bitmap' test
Sebastian Kisela (2):
git-instaweb: support Fedora/Red Hat apache module path
git-instaweb: fix apache2 config with apache >= 2.4
Stefan Beller (87):
repository: introduce parsed objects field
object: add repository argument to create_object
alloc: add repository argument to alloc_blob_node
alloc: add repository argument to alloc_tree_node
alloc: add repository argument to alloc_commit_node
alloc: add repository argument to alloc_tag_node
alloc: add repository argument to alloc_object_node
alloc: add repository argument to alloc_report
alloc: add repository argument to alloc_commit_index
object: allow grow_object_hash to handle arbitrary repositories
object: allow create_object to handle arbitrary repositories
alloc: allow arbitrary repositories for alloc functions
object-store: move object access functions to object-store.h
shallow: add repository argument to set_alternate_shallow_file
shallow: add repository argument to register_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to is_repository_shallow
cache: convert get_graft_file to handle arbitrary repositories
path.c: migrate global git_path_* to take a repository argument
shallow: migrate shallow information into the object parser
commit: allow prepare_commit_graft to handle arbitrary repositories
commit: allow lookup_commit_graft to handle arbitrary repositories
refs/packed-backend.c: close fd of empty file
submodule--helper: plug mem leak in print_default_remote
sequencer.c: plug leaks in do_pick_commit
submodule: fix NULL correctness in renamed broken submodules
t5526: test recursive submodules when fetching moved submodules
submodule: unset core.worktree if no working tree is present
submodule: ensure core.worktree is set after update
submodule deinit: unset core.worktree
submodule.c: report the submodule that an error occurs in
sequencer.c: plug mem leak in git_sequencer_config
.mailmap: merge different spellings of names
object: add repository argument to parse_object
object: add repository argument to lookup_object
object: add repository argument to parse_object_buffer
object: add repository argument to object_as_type
blob: add repository argument to lookup_blob
tree: add repository argument to lookup_tree
commit: add repository argument to lookup_commit_reference_gently
commit: add repository argument to lookup_commit_reference
commit: add repository argument to lookup_commit
commit: add repository argument to parse_commit_buffer
commit: add repository argument to set_commit_buffer
commit: add repository argument to get_cached_commit_buffer
tag: add repository argument to lookup_tag
tag: add repository argument to parse_tag_buffer
tag: add repository argument to deref_tag
object: allow object_as_type to handle arbitrary repositories
object: allow lookup_object to handle arbitrary repositories
blob: allow lookup_blob to handle arbitrary repositories
tree: allow lookup_tree to handle arbitrary repositories
commit: allow lookup_commit to handle arbitrary repositories
tag: allow lookup_tag to handle arbitrary repositories
tag: allow parse_tag_buffer to handle arbitrary repositories
commit.c: allow parse_commit_buffer to handle arbitrary repositories
commit-slabs: remove realloc counter outside of slab struct
commit.c: migrate the commit buffer to the parsed object store
commit.c: allow set_commit_buffer to handle arbitrary repositories
commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
object.c: allow parse_object_buffer to handle arbitrary repositories
object.c: allow parse_object to handle arbitrary repositories
tag.c: allow deref_tag to handle arbitrary repositories
commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
commit.c: allow lookup_commit_reference to handle arbitrary repositories
xdiff/xdiff.h: remove unused flags
xdiff/xdiffi.c: remove unneeded function declarations
t4015: avoid git as a pipe input
diff.c: do not pass diff options as keydata to hashmap
diff.c: adjust hash function signature to match hashmap expectation
diff.c: add a blocks mode for moved code detection
diff.c: decouple white space treatment from move detection algorithm
diff.c: factor advance_or_nullify out of mark_color_as_moved
diff.c: add white space mode to move detection that allows indent changes
diff.c: offer config option to control ws handling in move detection
xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diff
xdiff/xhistogram: factor out memory cleanup into free_index()
xdiff/xhistogram: move index allocation into find_lcs
Documentation/git-interpret-trailers: explain possible values
xdiff/histogram: remove tail recursion
t1300: document current behavior of setting options
xdiff: reduce indent heuristic overhead
config: fix case sensitive subsection names on writing
git-config: document accidental multi-line setting in deprecated syntax
git-submodule.sh: accept verbose flag in cmd_update to be non-quiet
t7410: update to new style
builtin/submodule--helper: remove stray new line
Taylor Blau (9):
Documentation/config.txt: camel-case lineNumber for consistency
grep.c: expose {,inverted} match column in match_line()
grep.[ch]: extend grep_opt to allow showing matched column
grep.c: display column number of first match
builtin/grep.c: add '--column' option to 'git-grep(1)'
grep.c: add configuration variables to show matched option
contrib/git-jump/git-jump: jump to exact location
grep.c: extract show_line_header()
grep.c: teach 'git grep --only-matching'
Thomas Rast (1):
range-diff: add tests
Tobias Klauser (1):
git-rebase--preserve-merges: fix formatting of todo help message
Todd Zullinger (4):
git-credential-netrc: minor whitespace cleanup in test script
git-credential-netrc: make "all" default target of Makefile
gitignore.txt: clarify default core.excludesfile path
dir.c: fix typos in core.excludesfile comment
Ville Skyttä (1):
Documentation: spelling and grammar fixes
Vladimir Parfinenko (1):
rebase: fix documentation formatting
William Chargin (2):
sha1-name.c: for ":/", find detached HEAD commits
t: factor out FUNNYNAMES as shared lazy prereq
Xiaolong Ye (1):
format-patch: clear UNINTERESTING flag before prepare_bases
brian m. carlson (21):
send-email: add an auto option for transfer encoding
send-email: accept long lines with suitable transfer encoding
send-email: automatically determine transfer-encoding
docs: correct RFC specifying email line length
sequencer: pass absolute GIT_WORK_TREE to exec commands
cache: update object ID functions for the_hash_algo
tree-walk: replace hard-coded constants with the_hash_algo
hex: switch to using the_hash_algo
commit: express tree entry constants in terms of the_hash_algo
strbuf: allocate space with GIT_MAX_HEXSZ
sha1-name: use the_hash_algo when parsing object names
refs/files-backend: use the_hash_algo for writing refs
builtin/update-index: convert to using the_hash_algo
builtin/update-index: simplify parsing of cacheinfo
builtin/fmt-merge-msg: make hash independent
builtin/merge: switch to use the_hash_algo
builtin/merge-recursive: make hash independent
diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz
sha1-file: convert constants to uses of the_hash_algo
pretty: switch hard-coded constants to the_hash_algo
Ævar Arnfjörð Bjarmason (36):
checkout tests: index should be clean after dwim checkout
checkout.h: wrap the arguments to unique_tracking_name()
checkout.c: introduce an *_INIT macro
checkout.c: change "unique" member to "num_matches"
checkout: pass the "num_matches" up to callers
builtin/checkout.c: use "ret" variable for return
checkout: add advice for ambiguous "checkout <branch>"
checkout & worktree: introduce checkout.defaultRemote
refspec: s/refspec_item_init/&_or_die/g
refspec: add back a refspec_item_init() function
doc hash-function-transition: note the lack of a changelog
receive.fsck.<msg-id> tests: remove dead code
config doc: don't describe *.fetchObjects twice
config doc: unify the description of fsck.* and receive.fsck.*
config doc: elaborate on what transfer.fsckObjects does
config doc: elaborate on fetch.fsckObjects security
transfer.fsckObjects tests: untangle confusing setup
fetch: implement fetch.fsck.*
fsck: test & document {fetch,receive}.fsck.* config fallback
fsck: add stress tests for fsck.skipList
fsck: test and document unknown fsck.<msg-id> values
tests: make use of the test_must_be_empty function
tests: make use of the test_must_be_empty function
fetch tests: change "Tag" test tag to "testTag"
push tests: remove redundant 'git push' invocation
push tests: fix logic error in "push" test assertion
push tests: add more testing for forced tag pushing
push tests: assert re-pushing annotated tags
negotiator: unknown fetch.negotiationAlgorithm should error out
fetch doc: cross-link two new negotiation options
sha1dc: update from upstream
push: use PARSE_OPT_LITERAL_ARGHELP instead of unbalanced brackets
fetch tests: correct a comment "remove it" -> "remove them"
pull doc: fix a long-standing grammar error
submodule: add more exhaustive up-path testing
t2024: mark test using "checkout -p" with PERL prerequisite
Łukasz Stelmach (1):
completion: complete remote names too
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-20 22:13 [ANNOUNCE] Git v2.19.0-rc0 Junio C Hamano
@ 2018-08-20 22:41 ` Stefan Beller
2018-08-20 23:39 ` Jonathan Nieder
2018-08-21 20:41 ` Derrick Stolee
1 sibling, 1 reply; 58+ messages in thread
From: Stefan Beller @ 2018-08-20 22:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Linux-Kernel@Vger. Kernel. Org, git-packagers
> * The conversion to pass "the_repository" and then "a_repository"
> throughout the object access API continues.
>
[...]
>
> * The conversion to pass "the_repository" and then "a_repository"
> throughout the object access API continues.
I guess it continues twice as two large series were merged? ;-)
sb/object-store-grafts
sb/object-store-lookup
The latter one is not the correct one, as later we'll have
* lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.
> * "git submodule" did not correctly adjust core.worktree setting that
> indicates whether/where a submodule repository has its associated
> working tree across various state transitions, which has been
> corrected.
> (merge 984cd77ddb sb/submodule-core-worktree later to maint).
Personally I do not view this as a bug fix but a feature
(but then again my thinking might be tainted of too much
submodule work) hence I would not merge it down.
Stefan
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-20 22:41 ` Stefan Beller
@ 2018-08-20 23:39 ` Jonathan Nieder
2018-08-21 0:27 ` Jonathan Nieder
0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Nieder @ 2018-08-20 23:39 UTC (permalink / raw)
To: Stefan Beller; +Cc: Junio C Hamano, git
(-cc: other lists)
Stefan Beller wrote:
> Junio C Hamano wrote:
>> * "git submodule" did not correctly adjust core.worktree setting that
>> indicates whether/where a submodule repository has its associated
>> working tree across various state transitions, which has been
>> corrected.
>> (merge 984cd77ddb sb/submodule-core-worktree later to maint).
>
> Personally I do not view this as a bug fix but a feature
> (but then again my thinking might be tainted of too much
> submodule work) hence I would not merge it down.
Can you elaborate?
The symptom that this series fixes was pretty bad, so I'm pretty glad
you wrote it.
Thanks,
Jonathan
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-20 23:39 ` Jonathan Nieder
@ 2018-08-21 0:27 ` Jonathan Nieder
2018-08-21 0:46 ` Stefan Beller
0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Nieder @ 2018-08-21 0:27 UTC (permalink / raw)
To: Stefan Beller; +Cc: Junio C Hamano, git
Jonathan Nieder wrote:
> Stefan Beller wrote:
>> Junio C Hamano wrote:
>>> * "git submodule" did not correctly adjust core.worktree setting that
>>> indicates whether/where a submodule repository has its associated
>>> working tree across various state transitions, which has been
>>> corrected.
>>> (merge 984cd77ddb sb/submodule-core-worktree later to maint).
>>
>> Personally I do not view this as a bug fix but a feature
>> (but then again my thinking might be tainted of too much
>> submodule work) hence I would not merge it down.
>
> Can you elaborate?
... ah, I figured it out. You are saying "would not merge it down to
maint". In that case, I agree, since this this is not a recent bug
(it's existed since before v1.7.10-rc1~14^2~2, 2012-03-02).
Thanks,
Jonathan
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-21 0:27 ` Jonathan Nieder
@ 2018-08-21 0:46 ` Stefan Beller
0 siblings, 0 replies; 58+ messages in thread
From: Stefan Beller @ 2018-08-21 0:46 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Junio C Hamano, git
On Mon, Aug 20, 2018 at 5:27 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> Jonathan Nieder wrote:
> > Stefan Beller wrote:
> >> Junio C Hamano wrote:
>
> >>> * "git submodule" did not correctly adjust core.worktree setting that
> >>> indicates whether/where a submodule repository has its associated
> >>> working tree across various state transitions, which has been
> >>> corrected.
> >>> (merge 984cd77ddb sb/submodule-core-worktree later to maint).
> >>
> >> Personally I do not view this as a bug fix but a feature
> >> (but then again my thinking might be tainted of too much
> >> submodule work) hence I would not merge it down.
> >
> > Can you elaborate?
>
> ... ah, I figured it out. You are saying "would not merge it down to
> maint". In that case, I agree, since this this is not a recent bug
> (it's existed since before v1.7.10-rc1~14^2~2, 2012-03-02).
Yeah; the behavior was the gold standard for submodules ever since,
so I am wary of changing it under the guise of fixing a bug.
The core.worktree setting doesn't harm the user by default; you
need to craft a very specific situation to benefit from this feature.
Stefan
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-20 22:13 [ANNOUNCE] Git v2.19.0-rc0 Junio C Hamano
2018-08-20 22:41 ` Stefan Beller
@ 2018-08-21 20:41 ` Derrick Stolee
2018-08-21 21:29 ` Jeff King
1 sibling, 1 reply; 58+ messages in thread
From: Derrick Stolee @ 2018-08-21 20:41 UTC (permalink / raw)
To: Junio C Hamano, git
On 8/20/2018 6:13 PM, Junio C Hamano wrote:
> An early preview release Git v2.19.0-rc0 is now available for
> testing at the usual places.
As part of testing the release candidate, I ran the performance suite
against a fresh clone of the Linux repository using v2.18.0 and
v2.19.0-rc0 (also: GIT_PERF_REPEAT_COUNT=10). I found a few nice
improvements, but I also found a possible regression in tree walking. I
say "tree walking" because it was revealed using p0001-rev-list.sh, but
only with the "--objects" flag. I also saw some similar numbers on 'git
log --raw'.
Test v2.18.0 v2.19.0-rc0
--------------------------------------------------------------------------------------------
0001.1: rev-list --all 6.69(6.33+0.35) 6.52(6.20+0.31) -2.5%
0001.2: rev-list --all --objects 52.14(47.43+1.02) 57.15(51.09+1.18) +9.6%
To me, 9.6% seems out of the range of just noise for this length of a
command, but I could be wrong. Could anyone else try to repro these results?
(This may also not just be tree-walking, but general pack-file loading
and decompression, since I computed and stored a commit-graph file.
Hence, commits are not being parsed from the pack-file by either command.)
Aside: the perf results were not all bad. Here was an interesting
improvement:
Test v2.18.0 v2.19.0-rc0
--------------------------------------------------------------------------------------------
0002.1: read_cache/discard_cache 1000 times 5.63(5.30+0.32)
3.34(3.03+0.30) -40.7%
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-21 20:41 ` Derrick Stolee
@ 2018-08-21 21:29 ` Jeff King
2018-08-22 0:48 ` brian m. carlson
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-21 21:29 UTC (permalink / raw)
To: Derrick Stolee; +Cc: brian m. carlson, Junio C Hamano, git
On Tue, Aug 21, 2018 at 04:41:02PM -0400, Derrick Stolee wrote:
> On 8/20/2018 6:13 PM, Junio C Hamano wrote:
> > An early preview release Git v2.19.0-rc0 is now available for
> > testing at the usual places.
>
> As part of testing the release candidate, I ran the performance suite
> against a fresh clone of the Linux repository using v2.18.0 and v2.19.0-rc0
> (also: GIT_PERF_REPEAT_COUNT=10).
Wow, you're a glutton for punishment. :)
> I found a few nice improvements, but I
> also found a possible regression in tree walking. I say "tree walking"
> because it was revealed using p0001-rev-list.sh, but only with the
> "--objects" flag. I also saw some similar numbers on 'git log --raw'.
>
> Test v2.18.0 v2.19.0-rc0
> --------------------------------------------------------------------------------------------
> 0001.1: rev-list --all 6.69(6.33+0.35) 6.52(6.20+0.31) -2.5%
> 0001.2: rev-list --all --objects 52.14(47.43+1.02) 57.15(51.09+1.18) +9.6%
>
> To me, 9.6% seems out of the range of just noise for this length of a
> command, but I could be wrong. Could anyone else try to repro these results?
I got:
0001.2: rev-list --all --objects 37.07(36.62+0.45) 39.11(38.58+0.51) +5.5%
Less change, but my overall times were smaller, too, so clearly our
hardware or exact repos are a little bit different. Those numbers seem
pretty consistent in further runs.
It bisects to 509f6f62a4 (cache: update object ID functions for
the_hash_algo, 2018-07-16). Which make sense. An "--objects" traversal
spends a huge amount of time checking each tree entry to see if we've
processed that object yet, which ends up as hashcmp() in the hash table.
I expect that a fixed 20-byte memcmp() can be optimized a lot more than
one with an arbitrary value.
Even if _we_ know the value can only take on one of a few values, I
don't know that we have an easy way to tell the compiler that. Possibly
we could improve things by jumping directly to an optimized code path.
Sort of a poor-man's JIT. ;)
Doing this:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..9c004a26c9 100644
--- a/cache.h
+++ b/cache.h
@@ -1023,7 +1023,10 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ if (the_hash_algo->rawsz == 20)
+ return memcmp(sha1, sha2, 20);
+ else
+ return memcmp(sha1, sha1, the_hash_algo->rawsz);
}
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
on top of v2.19-rc0 seems to give me about a 3% speedup (though I might
be imaging it, as there's a bit of noise). A function pointer in
the_hash_algo might make even more sense.
-Peff
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-21 21:29 ` Jeff King
@ 2018-08-22 0:48 ` brian m. carlson
2018-08-22 3:03 ` Jeff King
0 siblings, 1 reply; 58+ messages in thread
From: brian m. carlson @ 2018-08-22 0:48 UTC (permalink / raw)
To: Jeff King; +Cc: Derrick Stolee, Junio C Hamano, git
[-- Attachment #1: Type: text/plain, Size: 2253 bytes --]
On Tue, Aug 21, 2018 at 05:29:24PM -0400, Jeff King wrote:
> 0001.2: rev-list --all --objects 37.07(36.62+0.45) 39.11(38.58+0.51) +5.5%
>
> Less change, but my overall times were smaller, too, so clearly our
> hardware or exact repos are a little bit different. Those numbers seem
> pretty consistent in further runs.
>
> It bisects to 509f6f62a4 (cache: update object ID functions for
> the_hash_algo, 2018-07-16). Which make sense. An "--objects" traversal
> spends a huge amount of time checking each tree entry to see if we've
> processed that object yet, which ends up as hashcmp() in the hash table.
> I expect that a fixed 20-byte memcmp() can be optimized a lot more than
> one with an arbitrary value.
>
> Even if _we_ know the value can only take on one of a few values, I
> don't know that we have an easy way to tell the compiler that. Possibly
> we could improve things by jumping directly to an optimized code path.
> Sort of a poor-man's JIT. ;)
>
> Doing this:
>
> diff --git a/cache.h b/cache.h
> index b1fd3d58ab..9c004a26c9 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1023,7 +1023,10 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + if (the_hash_algo->rawsz == 20)
> + return memcmp(sha1, sha2, 20);
> + else
> + return memcmp(sha1, sha1, the_hash_algo->rawsz);
> }
>
> static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
> on top of v2.19-rc0 seems to give me about a 3% speedup (though I might
> be imaging it, as there's a bit of noise). A function pointer in
> the_hash_algo might make even more sense.
It's possible that might be a better solution. I looked into a GCC
assertion that the value was either 20 or 32, and that in itself didn't
seem to help, at least in the generated code. Your solution is likely
better in that regard.
We could wire it up to be either 20 or 32 and let people experimenting
with other sizes of algorithms just add another branch. I haven't
tested how that performs, though.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 0:48 ` brian m. carlson
@ 2018-08-22 3:03 ` Jeff King
2018-08-22 3:36 ` Jeff King
` (2 more replies)
0 siblings, 3 replies; 58+ messages in thread
From: Jeff King @ 2018-08-22 3:03 UTC (permalink / raw)
To: brian m. carlson, Derrick Stolee, Junio C Hamano, git
On Wed, Aug 22, 2018 at 12:48:16AM +0000, brian m. carlson wrote:
> > diff --git a/cache.h b/cache.h
> > index b1fd3d58ab..9c004a26c9 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1023,7 +1023,10 @@ extern const struct object_id null_oid;
> >
> > static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> > {
> > - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> > + if (the_hash_algo->rawsz == 20)
> > + return memcmp(sha1, sha2, 20);
> > + else
> > + return memcmp(sha1, sha1, the_hash_algo->rawsz);
> > }
> >
> > static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
> > on top of v2.19-rc0 seems to give me about a 3% speedup (though I might
> > be imaging it, as there's a bit of noise). A function pointer in
> > the_hash_algo might make even more sense.
>
> It's possible that might be a better solution. I looked into a GCC
> assertion that the value was either 20 or 32, and that in itself didn't
> seem to help, at least in the generated code. Your solution is likely
> better in that regard.
>
> We could wire it up to be either 20 or 32 and let people experimenting
> with other sizes of algorithms just add another branch. I haven't
> tested how that performs, though.
Here's a _really_ dirty one:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..a6750524ea 100644
--- a/cache.h
+++ b/cache.h
@@ -1023,6 +1023,7 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
+ assert(the_hash_algo->rawsz == 20);
return memcmp(sha1, sha2, the_hash_algo->rawsz);
}
We probably don't want to do that, because it makes experimenting with
new hash algos a bit painful, but it gives the same 3-4% speedup pretty
consistently. But I think it demonstrates pretty clearly that giving the
compiler the extra limit information is sufficient. Presumably the
fixed-size memcmp turns into a few multi-word compares.
And indeed, if I look at the generated asm for the call in lookup_object
(which is likely the one we're hitting a lot in this case), I see:
# cache.h:1027: return memcmp(sha1, sha2, the_hash_algo->rawsz);
.loc 4 1027 9 is_stmt 0 view .LVU86
movq (%rsi), %rcx # MEM[(void *)sha1_25(D)], MEM[(void *)sha1_25(D)]
movq 8(%rsi), %rdi # MEM[(void *)sha1_25(D)], tmp125
xorq 4(%rax), %rcx # MEM[(void *)_6], tmp116
xorq 8(%r8), %rdi # MEM[(void *)_6], tmp115
orq %rcx, %rdi # tmp116, tmp115
jne .L27 #,
movl 16(%r8), %ecx # MEM[(void *)_6], tmp129
cmpl %ecx, 16(%rsi) # tmp129, MEM[(void *)sha1_25(D)]
jne .L27 #,
So I wonder if there's some other way to tell the compiler that we'll
only have a few values. An enum comes to mind, though I don't think the
enum rules are strict enough to make this guarantee (after all, it's OK
to bitwise-OR enums, so they clearly don't specify all possible values).
Having a dedicate hashcmp function for each hash_algo seems like the
sanest approach. We pay for one indirect function call, but the function
itself will have the constants available. But it does introduce one
extra complication. We're benefiting here from knowing that the size is
always 20, but also the inline hashcmp knows that we only care about
equality, not comparison.
So if I do this:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..da56da7be2 100644
--- a/cache.h
+++ b/cache.h
@@ -1023,7 +1023,7 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ return the_hash_algo->cmp_fn(sha1, sha2);
}
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
diff --git a/hash.h b/hash.h
index 7c8238bc2e..ac22ba63b6 100644
--- a/hash.h
+++ b/hash.h
@@ -64,6 +64,7 @@ typedef union git_hash_ctx git_hash_ctx;
typedef void (*git_hash_init_fn)(git_hash_ctx *ctx);
typedef void (*git_hash_update_fn)(git_hash_ctx *ctx, const void *in, size_t len);
typedef void (*git_hash_final_fn)(unsigned char *hash, git_hash_ctx *ctx);
+typedef int (*git_hash_cmp_fn)(const void *a, const void *b);
struct git_hash_algo {
/*
@@ -90,6 +91,8 @@ struct git_hash_algo {
/* The hash finalization function. */
git_hash_final_fn final_fn;
+ git_hash_cmp_fn cmp_fn;
+
/* The OID of the empty tree. */
const struct object_id *empty_tree;
diff --git a/sha1-file.c b/sha1-file.c
index 97b7423848..7072e360d7 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -69,6 +69,11 @@ static void git_hash_sha1_final(unsigned char *hash, git_hash_ctx *ctx)
git_SHA1_Final(hash, &ctx->sha1);
}
+static int git_hash_sha1_cmp(const void *a, const void *b)
+{
+ return memcmp(a, b, 20);
+}
+
static void git_hash_unknown_init(git_hash_ctx *ctx)
{
BUG("trying to init unknown hash");
@@ -84,6 +89,11 @@ static void git_hash_unknown_final(unsigned char *hash, git_hash_ctx *ctx)
BUG("trying to finalize unknown hash");
}
+static int git_hash_unknown_cmp(const void *a, const void *b)
+{
+ BUG("trying to compare unknown hash");
+}
+
const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
{
NULL,
@@ -93,6 +103,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
git_hash_unknown_init,
git_hash_unknown_update,
git_hash_unknown_final,
+ git_hash_unknown_cmp,
NULL,
NULL,
},
@@ -105,6 +116,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
git_hash_sha1_init,
git_hash_sha1_update,
git_hash_sha1_final,
+ git_hash_sha1_cmp,
&empty_tree_oid,
&empty_blob_oid,
},
the result is actually _slower_ than the current code.
If I instead introduce an "eq" function, then that function can do the
optimized thing. So if I do something more like this:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..52533a9710 100644
--- a/cache.h
+++ b/cache.h
@@ -1026,6 +1026,11 @@ static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
return memcmp(sha1, sha2, the_hash_algo->rawsz);
}
+static inline int hasheq(const unsigned char *sha1, const unsigned char *sha2)
+{
+ return the_hash_algo->eq_fn(sha1, sha2);
+}
+
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
{
return hashcmp(oid1->hash, oid2->hash);
diff --git a/sha1-file.c b/sha1-file.c
index 97b7423848..a491ff5bef 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -69,6 +69,11 @@ static void git_hash_sha1_final(unsigned char *hash, git_hash_ctx *ctx)
git_SHA1_Final(hash, &ctx->sha1);
}
+static int git_hash_sha1_eq(const void *a, const void *b)
+{
+ return !memcmp(a, b, 20);
+}
+
static void git_hash_unknown_init(git_hash_ctx *ctx)
{
BUG("trying to init unknown hash");
> [omitting similar hunks from the last one]
plus converting this one callsite:
diff --git a/object.c b/object.c
index 51c4594515..e54160550c 100644
--- a/object.c
+++ b/object.c
@@ -95,7 +95,7 @@ struct object *lookup_object(struct repository *r, const unsigned char *sha1)
first = i = hash_obj(sha1, r->parsed_objects->obj_hash_size);
while ((obj = r->parsed_objects->obj_hash[i]) != NULL) {
- if (!hashcmp(sha1, obj->oid.hash))
+ if (hasheq(sha1, obj->oid.hash))
break;
i++;
if (i == r->parsed_objects->obj_hash_size)
I get about a 1.5% speedup. If I apply this coccinelle patch:
@@
expression a, b;
@@
- !hashcmp(a, b)
+ hasheq(a, b)
@@
expression a, b;
@@
- !oidcmp(a, b)
+ oideq(a, b)
with the obvious "oideq()" implementation added, that seems to get me to
2-3%. Not _quite_ as good as the original branching version I showed.
And we had to touch all the callsites (although arguably that kind of
"eq" function is a better interface anyway, since it obviously allows
for more optimization.
So maybe the branching thing is actually not so insane. It makes new
hash_algo's Just Work; they just won't be optimized. And the change is
very localized.
Or maybe it's crazy to spend any effort at all chasing a few percent.
It's not like people's large repositories aren't just going to grow by
that much after a few months anyway. ;) It just seems like if we can do
it for little cost, it's worth it.
-Peff
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 3:03 ` Jeff King
@ 2018-08-22 3:36 ` Jeff King
2018-08-22 11:11 ` Derrick Stolee
2018-08-22 5:36 ` brian m. carlson
2018-08-22 12:42 ` Paul Smith
2 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-22 3:36 UTC (permalink / raw)
To: brian m. carlson, Derrick Stolee, Junio C Hamano, git
On Tue, Aug 21, 2018 at 11:03:44PM -0400, Jeff King wrote:
> with the obvious "oideq()" implementation added, that seems to get me to
> 2-3%. Not _quite_ as good as the original branching version I showed.
> And we had to touch all the callsites (although arguably that kind of
> "eq" function is a better interface anyway, since it obviously allows
> for more optimization.
>
> So maybe the branching thing is actually not so insane. It makes new
> hash_algo's Just Work; they just won't be optimized. And the change is
> very localized.
Hmph. So I went back to double-check my measurements on that branching
version, and I couldn't replicate it!
It turns out what I showed (and measured) before has a bug. Can you see
it?
diff --git a/cache.h b/cache.h
index b1fd3d58ab..9c004a26c9 100644
--- a/cache.h
+++ b/cache.h
@@ -1023,7 +1023,10 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ if (the_hash_algo->rawsz == 20)
+ return memcmp(sha1, sha2, 20);
+ else
+ return memcmp(sha1, sha1, the_hash_algo->rawsz);
}
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
The problem is the fallback code compares "sha1" to "sha1". The compiler
realizes that's a noop and is able to treat it like a constant. Thus
essentially leaving only the first branch, which it then expands into a
few instructions.
If we fix that bug, then we really do memcmp on either side of the
conditional. And the compiler is smart enough to realize that hey,
that's the same as just calling memcmp with the_hash_algo->rawsz on
either side. And we end up with roughly the same code that we started
with.
So the assert() version really is the fastest. I didn't test, but I
suspect we could "trick" the compiler by having the fallback call an
opaque wrapper around memcmp(). That would prevent it from combining the
two paths, and presumably it would still optimize the constant-20 side.
Or maybe it would eventually decide our inline function is getting too
big and scrap it. Which probably crosses a line of craziness (if I
didn't already cross it two emails ago).
-Peff
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 3:03 ` Jeff King
2018-08-22 3:36 ` Jeff King
@ 2018-08-22 5:36 ` brian m. carlson
2018-08-22 6:07 ` Jeff King
2018-08-22 14:28 ` Derrick Stolee
2018-08-22 12:42 ` Paul Smith
2 siblings, 2 replies; 58+ messages in thread
From: brian m. carlson @ 2018-08-22 5:36 UTC (permalink / raw)
To: Jeff King; +Cc: Derrick Stolee, Junio C Hamano, git
[-- Attachment #1: Type: text/plain, Size: 2225 bytes --]
On Tue, Aug 21, 2018 at 11:03:44PM -0400, Jeff King wrote:
> So I wonder if there's some other way to tell the compiler that we'll
> only have a few values. An enum comes to mind, though I don't think the
> enum rules are strict enough to make this guarantee (after all, it's OK
> to bitwise-OR enums, so they clearly don't specify all possible values).
I was thinking about this:
diff --git a/cache.h b/cache.h
index 1398b2a4e4..1f5c6e9319 100644
--- a/cache.h
+++ b/cache.h
@@ -1033,7 +1033,14 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ switch (the_hash_algo->rawsz) {
+ case 20:
+ return memcmp(sha1, sha2, 20);
+ case 32:
+ return memcmp(sha1, sha2, 32);
+ default:
+ assert(0);
+ }
}
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
That would make it obvious that there are at most two options.
Unfortunately, gcc for me determines that the buffer in walker.c is 20
bytes in size and steadfastly refuses to compile because it doesn't know
that the value will never be 32 in our codebase currently. I'd need to
send in more patches before it would compile.
I don't know if something like this is an improvement or now, but this
seems to at least compile:
diff --git a/cache.h b/cache.h
index 1398b2a4e4..3207f74771 100644
--- a/cache.h
+++ b/cache.h
@@ -1033,7 +1033,13 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ switch (the_hash_algo->rawsz) {
+ case 20:
+ case 32:
+ return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ default:
+ assert(0);
+ }
}
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
I won't have time to sit down and test this out until tomorrow afternoon
at the earliest. If you want to send in something in the mean time,
even if that limits things to just 20 for now, that's fine.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 5:36 ` brian m. carlson
@ 2018-08-22 6:07 ` Jeff King
2018-08-22 7:39 ` Ævar Arnfjörð Bjarmason
2018-08-22 14:28 ` Derrick Stolee
1 sibling, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-22 6:07 UTC (permalink / raw)
To: brian m. carlson, Derrick Stolee, Junio C Hamano, git
On Wed, Aug 22, 2018 at 05:36:26AM +0000, brian m. carlson wrote:
> On Tue, Aug 21, 2018 at 11:03:44PM -0400, Jeff King wrote:
> > So I wonder if there's some other way to tell the compiler that we'll
> > only have a few values. An enum comes to mind, though I don't think the
> > enum rules are strict enough to make this guarantee (after all, it's OK
> > to bitwise-OR enums, so they clearly don't specify all possible values).
>
> I was thinking about this:
>
> diff --git a/cache.h b/cache.h
> index 1398b2a4e4..1f5c6e9319 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1033,7 +1033,14 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + switch (the_hash_algo->rawsz) {
> + case 20:
> + return memcmp(sha1, sha2, 20);
> + case 32:
> + return memcmp(sha1, sha2, 32);
> + default:
> + assert(0);
> + }
> }
Unfortunately this version doesn't seem to be any faster than the status
quo. And looking at the generated asm, it still looks to be calling
memcpy(). Removing the "case 32" branch switches it back to fast
assembly (this is all using gcc 8.2.0, btw). So I think we're deep into
guessing what the optimizer is going to do, and there's a good chance
that other versions are going to optimize it differently.
We might be better off just writing it out manually. Unfortunately, it's
a bit hard because the neg/0/pos return is more expensive to compute
than pure equality. And only the compiler knows at each inlined site
whether we actually want equality. So now we're back to switching every
caller to use hasheq() if that's what they want.
But _if_ we're OK with that, and _if_ we don't mind some ifdefs for
portability, then this seems as fast as the original (memcmp+constant)
code on my machine:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..c406105f3c 100644
--- a/cache.h
+++ b/cache.h
@@ -1023,7 +1023,16 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ switch (the_hash_algo->rawsz) {
+ case 20:
+ if (*(uint32_t *)sha1 == *(uint32_t *)sha2 &&
+ *(unsigned __int128 *)(sha1+4) == *(unsigned __int128 *)(sha2+4))
+ return 0;
+ case 32:
+ return memcmp(sha1, sha2, 32);
+ default:
+ assert(0);
+ }
}
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
Which is really no surprise, because the generated asm looks about the
same. There are obviously alignment questions there. It's possible it
could even be written portably as a simple loop. Or maybe not. We used
to do that, but modern compilers were able to optimize the memcmp
better. Maybe that's changed. Or maybe they were simply unwilling to
unroll a 20-length loop to find out that it could be turned into a few
quad-word compares.
> That would make it obvious that there are at most two options.
> Unfortunately, gcc for me determines that the buffer in walker.c is 20
> bytes in size and steadfastly refuses to compile because it doesn't know
> that the value will never be 32 in our codebase currently. I'd need to
> send in more patches before it would compile.
Yeah, I see that warning all over the place (everywhere that calls
is_null_oid(), which is passing in a 20-byte buffer).
> I don't know if something like this is an improvement or now, but this
> seems to at least compile:
>
> diff --git a/cache.h b/cache.h
> index 1398b2a4e4..3207f74771 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1033,7 +1033,13 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + switch (the_hash_algo->rawsz) {
> + case 20:
> + case 32:
> + return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + default:
> + assert(0);
> + }
I think that would end up with the same slow code, as gcc would rather
call memcmp than expand out the two sets of asm.
> I won't have time to sit down and test this out until tomorrow afternoon
> at the earliest. If you want to send in something in the mean time,
> even if that limits things to just 20 for now, that's fine.
I don't have a good option. The assert() thing works until I add in the
"32" branch, but that's just punting the issue off until you add support
for the new hash.
Hand-rolling our own asm or C is a portability headache, and we need to
change all of the callsites to use a new hasheq().
Hiding it behind a per-hash function is conceptually cleanest, but not
quite as fast. And it also requires hasheq().
So all of the solutions seem non-trivial. Again, I'm starting to wonder
if it's worth chasing this few percent.
-Peff
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 6:07 ` Jeff King
@ 2018-08-22 7:39 ` Ævar Arnfjörð Bjarmason
2018-08-22 11:14 ` Derrick Stolee
2018-08-22 15:14 ` Jeff King
0 siblings, 2 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-08-22 7:39 UTC (permalink / raw)
To: Jeff King
Cc: brian m. carlson, Derrick Stolee, Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 8:20 AM Jeff King <peff@peff.net> wrote:
>
> On Wed, Aug 22, 2018 at 05:36:26AM +0000, brian m. carlson wrote:
>
> > On Tue, Aug 21, 2018 at 11:03:44PM -0400, Jeff King wrote:
> > > So I wonder if there's some other way to tell the compiler that we'll
> > > only have a few values. An enum comes to mind, though I don't think the
> > > enum rules are strict enough to make this guarantee (after all, it's OK
> > > to bitwise-OR enums, so they clearly don't specify all possible values).
> >
> > I was thinking about this:
> >
> > diff --git a/cache.h b/cache.h
> > index 1398b2a4e4..1f5c6e9319 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1033,7 +1033,14 @@ extern const struct object_id null_oid;
> >
> > static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> > {
> > - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> > + switch (the_hash_algo->rawsz) {
> > + case 20:
> > + return memcmp(sha1, sha2, 20);
> > + case 32:
> > + return memcmp(sha1, sha2, 32);
> > + default:
> > + assert(0);
> > + }
> > }
>
> Unfortunately this version doesn't seem to be any faster than the status
> quo. And looking at the generated asm, it still looks to be calling
> memcpy(). Removing the "case 32" branch switches it back to fast
> assembly (this is all using gcc 8.2.0, btw). So I think we're deep into
> guessing what the optimizer is going to do, and there's a good chance
> that other versions are going to optimize it differently.
>
> We might be better off just writing it out manually. Unfortunately, it's
> a bit hard because the neg/0/pos return is more expensive to compute
> than pure equality. And only the compiler knows at each inlined site
> whether we actually want equality. So now we're back to switching every
> caller to use hasheq() if that's what they want.
>
> But _if_ we're OK with that, and _if_ we don't mind some ifdefs for
> portability, then this seems as fast as the original (memcmp+constant)
> code on my machine:
>
> diff --git a/cache.h b/cache.h
> index b1fd3d58ab..c406105f3c 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1023,7 +1023,16 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + switch (the_hash_algo->rawsz) {
> + case 20:
> + if (*(uint32_t *)sha1 == *(uint32_t *)sha2 &&
> + *(unsigned __int128 *)(sha1+4) == *(unsigned __int128 *)(sha2+4))
> + return 0;
> + case 32:
> + return memcmp(sha1, sha2, 32);
> + default:
> + assert(0);
> + }
> }
>
> static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
>
> Which is really no surprise, because the generated asm looks about the
> same. There are obviously alignment questions there. It's possible it
> could even be written portably as a simple loop. Or maybe not. We used
> to do that, but modern compilers were able to optimize the memcmp
> better. Maybe that's changed. Or maybe they were simply unwilling to
> unroll a 20-length loop to find out that it could be turned into a few
> quad-word compares.
>
> > That would make it obvious that there are at most two options.
> > Unfortunately, gcc for me determines that the buffer in walker.c is 20
> > bytes in size and steadfastly refuses to compile because it doesn't know
> > that the value will never be 32 in our codebase currently. I'd need to
> > send in more patches before it would compile.
>
> Yeah, I see that warning all over the place (everywhere that calls
> is_null_oid(), which is passing in a 20-byte buffer).
>
> > I don't know if something like this is an improvement or now, but this
> > seems to at least compile:
> >
> > diff --git a/cache.h b/cache.h
> > index 1398b2a4e4..3207f74771 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1033,7 +1033,13 @@ extern const struct object_id null_oid;
> >
> > static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> > {
> > - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> > + switch (the_hash_algo->rawsz) {
> > + case 20:
> > + case 32:
> > + return memcmp(sha1, sha2, the_hash_algo->rawsz);
> > + default:
> > + assert(0);
> > + }
>
> I think that would end up with the same slow code, as gcc would rather
> call memcmp than expand out the two sets of asm.
>
> > I won't have time to sit down and test this out until tomorrow afternoon
> > at the earliest. If you want to send in something in the mean time,
> > even if that limits things to just 20 for now, that's fine.
>
> I don't have a good option. The assert() thing works until I add in the
> "32" branch, but that's just punting the issue off until you add support
> for the new hash.
>
> Hand-rolling our own asm or C is a portability headache, and we need to
> change all of the callsites to use a new hasheq().
>
> Hiding it behind a per-hash function is conceptually cleanest, but not
> quite as fast. And it also requires hasheq().
>
> So all of the solutions seem non-trivial. Again, I'm starting to wonder
> if it's worth chasing this few percent.
Did you try __builtin_expect? It's a GCC builtin for these sorts of
situations, and sometimes helps:
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
I.e. you'd tell GCC we expect to have the 20 there with:
if (__builtin_expect(the_hash_algo->rawsz == 20, 1)) { ... }
The perl codebase has LIKELY() and UNLIKELY() macros for this which if
the feature isn't available fall back on just plain C code:
https://github.com/Perl/perl5/blob/v5.27.7/perl.h#L3335-L3344
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 3:36 ` Jeff King
@ 2018-08-22 11:11 ` Derrick Stolee
0 siblings, 0 replies; 58+ messages in thread
From: Derrick Stolee @ 2018-08-22 11:11 UTC (permalink / raw)
To: Jeff King, brian m. carlson, Junio C Hamano, git
On 8/21/2018 11:36 PM, Jeff King wrote:
> On Tue, Aug 21, 2018 at 11:03:44PM -0400, Jeff King wrote:
>
>> with the obvious "oideq()" implementation added, that seems to get me to
>> 2-3%. Not _quite_ as good as the original branching version I showed.
>> And we had to touch all the callsites (although arguably that kind of
>> "eq" function is a better interface anyway, since it obviously allows
>> for more optimization.
>>
>> So maybe the branching thing is actually not so insane. It makes new
>> hash_algo's Just Work; they just won't be optimized. And the change is
>> very localized.
> Hmph. So I went back to double-check my measurements on that branching
> version, and I couldn't replicate it!
I'm actually relieved to see this, as I couldn't either.
>
> It turns out what I showed (and measured) before has a bug. Can you see
> it?
I had rewritten the section from scratch instead of applying your diff,
so I didn't get the sha1-sha1 error. I decided to sleep on it instead of
sending my email.
> So the assert() version really is the fastest. I didn't test, but I
> suspect we could "trick" the compiler by having the fallback call an
> opaque wrapper around memcmp(). That would prevent it from combining the
> two paths, and presumably it would still optimize the constant-20 side.
> Or maybe it would eventually decide our inline function is getting too
> big and scrap it. Which probably crosses a line of craziness (if I
> didn't already cross it two emails ago).
I appreciate your effort here.
Thanks
-Stolee
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 7:39 ` Ævar Arnfjörð Bjarmason
@ 2018-08-22 11:14 ` Derrick Stolee
2018-08-22 15:17 ` Jeff King
2018-08-22 15:14 ` Jeff King
1 sibling, 1 reply; 58+ messages in thread
From: Derrick Stolee @ 2018-08-22 11:14 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason, Jeff King
Cc: brian m. carlson, Junio C Hamano, Git Mailing List
On 8/22/2018 3:39 AM, Ævar Arnfjörð Bjarmason wrote:
> On Wed, Aug 22, 2018 at 8:20 AM Jeff King <peff@peff.net> wrote:
>> On Wed, Aug 22, 2018 at 05:36:26AM +0000, brian m. carlson wrote:
>>
>>> On Tue, Aug 21, 2018 at 11:03:44PM -0400, Jeff King wrote:
>>> I don't know if something like this is an improvement or now, but this
>>> seems to at least compile:
>>>
>>> diff --git a/cache.h b/cache.h
>>> index 1398b2a4e4..3207f74771 100644
>>> --- a/cache.h
>>> +++ b/cache.h
>>> @@ -1033,7 +1033,13 @@ extern const struct object_id null_oid;
>>>
>>> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
>>> {
>>> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
>>> + switch (the_hash_algo->rawsz) {
>>> + case 20:
>>> + case 32:
>>> + return memcmp(sha1, sha2, the_hash_algo->rawsz);
>>> + default:
>>> + assert(0);
>>> + }
>> I think that would end up with the same slow code, as gcc would rather
>> call memcmp than expand out the two sets of asm.
>>
>>> I won't have time to sit down and test this out until tomorrow afternoon
>>> at the earliest. If you want to send in something in the mean time,
>>> even if that limits things to just 20 for now, that's fine.
>> I don't have a good option. The assert() thing works until I add in the
>> "32" branch, but that's just punting the issue off until you add support
>> for the new hash.
>>
>> Hand-rolling our own asm or C is a portability headache, and we need to
>> change all of the callsites to use a new hasheq().
>>
>> Hiding it behind a per-hash function is conceptually cleanest, but not
>> quite as fast. And it also requires hasheq().
>>
>> So all of the solutions seem non-trivial. Again, I'm starting to wonder
>> if it's worth chasing this few percent.
> Did you try __builtin_expect? It's a GCC builtin for these sorts of
> situations, and sometimes helps:
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
>
> I.e. you'd tell GCC we expect to have the 20 there with:
>
> if (__builtin_expect(the_hash_algo->rawsz == 20, 1)) { ... }
>
> The perl codebase has LIKELY() and UNLIKELY() macros for this which if
> the feature isn't available fall back on just plain C code:
> https://github.com/Perl/perl5/blob/v5.27.7/perl.h#L3335-L3344
The other thing I was going to recommend (and I'll try to test this out
myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
volatile variable, since it is being referenced through a pointer.
Perhaps storing the value locally and then casing on it would help?
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 3:03 ` Jeff King
2018-08-22 3:36 ` Jeff King
2018-08-22 5:36 ` brian m. carlson
@ 2018-08-22 12:42 ` Paul Smith
2018-08-22 15:23 ` Jeff King
2 siblings, 1 reply; 58+ messages in thread
From: Paul Smith @ 2018-08-22 12:42 UTC (permalink / raw)
To: Jeff King, git
On Tue, 2018-08-21 at 23:03 -0400, Jeff King wrote:
> static inline int hashcmp(const unsigned char *sha1, const unsigned
> char *sha2)
> {
> + assert(the_hash_algo->rawsz == 20);
> return memcmp(sha1, sha2, the_hash_algo->rawsz);
> }
I'm not familiar with Git code, but for most environments assert() is a
macro which is compiled out when built for "release mode" (whatever
that might mean). If that's the case for Git too, then relying on
assert() to provide a side-effect (even an optimizer hint side-effect)
won't work and this will actually get slower when built for "release
mode".
Just a thought...
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 5:36 ` brian m. carlson
2018-08-22 6:07 ` Jeff King
@ 2018-08-22 14:28 ` Derrick Stolee
2018-08-22 15:24 ` Jeff King
1 sibling, 1 reply; 58+ messages in thread
From: Derrick Stolee @ 2018-08-22 14:28 UTC (permalink / raw)
To: brian m. carlson, Jeff King, Junio C Hamano, git
On 8/22/2018 1:36 AM, brian m. carlson wrote:
> On Tue, Aug 21, 2018 at 11:03:44PM -0400, Jeff King wrote:
>> So I wonder if there's some other way to tell the compiler that we'll
>> only have a few values. An enum comes to mind, though I don't think the
>> enum rules are strict enough to make this guarantee (after all, it's OK
>> to bitwise-OR enums, so they clearly don't specify all possible values).
> I was thinking about this:
>
> diff --git a/cache.h b/cache.h
> index 1398b2a4e4..1f5c6e9319 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1033,7 +1033,14 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + switch (the_hash_algo->rawsz) {
> + case 20:
> + return memcmp(sha1, sha2, 20);
> + case 32:
> + return memcmp(sha1, sha2, 32);
> + default:
> + assert(0);
> + }
> }
>
> static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
>
> That would make it obvious that there are at most two options.
> Unfortunately, gcc for me determines that the buffer in walker.c is 20
> bytes in size and steadfastly refuses to compile because it doesn't know
> that the value will never be 32 in our codebase currently. I'd need to
> send in more patches before it would compile.
>
> I don't know if something like this is an improvement or now, but this
> seems to at least compile:
>
> diff --git a/cache.h b/cache.h
> index 1398b2a4e4..3207f74771 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1033,7 +1033,13 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + switch (the_hash_algo->rawsz) {
> + case 20:
> + case 32:
> + return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + default:
> + assert(0);
> + }
> }
In my testing, I've had the best luck with this change:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..6c8b51c390 100644
--- a/cache.h
+++ b/cache.h
@@ -1023,7 +1023,14 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned
char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ switch (the_hash_algo->rawsz) {
+ case 20:
+ return memcmp(sha1, sha2, 20);
+ case 32:
+ return memcmp(sha1, sha2, 32);
+ default:
+ assert(0);
+ }
}
The fact that '20' and '32' are constants here may be helpful to the
compiler. Can someone else test the perf?
Thanks,
-Stolee
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 7:39 ` Ævar Arnfjörð Bjarmason
2018-08-22 11:14 ` Derrick Stolee
@ 2018-08-22 15:14 ` Jeff King
1 sibling, 0 replies; 58+ messages in thread
From: Jeff King @ 2018-08-22 15:14 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: brian m. carlson, Derrick Stolee, Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 09:39:57AM +0200, Ævar Arnfjörð Bjarmason wrote:
> > I don't have a good option. The assert() thing works until I add in the
> > "32" branch, but that's just punting the issue off until you add support
> > for the new hash.
> >
> > Hand-rolling our own asm or C is a portability headache, and we need to
> > change all of the callsites to use a new hasheq().
> >
> > Hiding it behind a per-hash function is conceptually cleanest, but not
> > quite as fast. And it also requires hasheq().
> >
> > So all of the solutions seem non-trivial. Again, I'm starting to wonder
> > if it's worth chasing this few percent.
>
> Did you try __builtin_expect? It's a GCC builtin for these sorts of
> situations, and sometimes helps:
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
>
> I.e. you'd tell GCC we expect to have the 20 there with:
>
> if (__builtin_expect(the_hash_algo->rawsz == 20, 1)) { ... }
>
> The perl codebase has LIKELY() and UNLIKELY() macros for this which if
> the feature isn't available fall back on just plain C code:
> https://github.com/Perl/perl5/blob/v5.27.7/perl.h#L3335-L3344
Sadly, no, this doesn't seem to change anything. We still end up with a
single call to memcmp.
I also tried "hiding" the fallback call like this:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..7808bf3d6b 100644
--- a/cache.h
+++ b/cache.h
@@ -1021,9 +1021,13 @@ extern int find_unique_abbrev_r(char *hex, const struct object_id *oid, int len)
extern const unsigned char null_sha1[GIT_MAX_RAWSZ];
extern const struct object_id null_oid;
+int super_secret_memcmp(const void *a, const void *b, size_t len);
+
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ if (the_hash_algo->rawsz == 20)
+ return memcmp(sha1, sha2, 20);
+ return super_secret_memcmp(sha1, sha2, the_hash_algo->rawsz);
}
static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
diff --git a/sha1-file.c b/sha1-file.c
index 97b7423848..5cd0a4b73f 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -2280,3 +2280,8 @@ int read_loose_object(const char *path,
munmap(map, mapsize);
return ret;
}
+
+int super_secret_memcmp(const void *a, const void *b, size_t len)
+{
+ return memcmp(a, b, len);
+}
but that just results in calling memcmp and super_secret_memcmp on the
two codepaths (with or without the __builtin_expect).
-Peff
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 11:14 ` Derrick Stolee
@ 2018-08-22 15:17 ` Jeff King
2018-08-22 16:08 ` Duy Nguyen
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-22 15:17 UTC (permalink / raw)
To: Derrick Stolee
Cc: Ævar Arnfjörð Bjarmason, brian m. carlson,
Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 07:14:42AM -0400, Derrick Stolee wrote:
> The other thing I was going to recommend (and I'll try to test this out
> myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
> volatile variable, since it is being referenced through a pointer. Perhaps
> storing the value locally and then casing on it would help?
I tried various sprinkling of "const" around the declarations to make it
clear that the values wouldn't change once we saw them. But I couldn't
detect any difference. At most I think that would let us hoist the "if"
out of the loop, but gcc still seems unwilling to expand the memcmp when
there are other branches.
I think if that's the thing we want to have happen, we really do need to
just write it out on that branch rather than saying "memcmp".
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 12:42 ` Paul Smith
@ 2018-08-22 15:23 ` Jeff King
2018-08-23 1:23 ` Jonathan Nieder
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-22 15:23 UTC (permalink / raw)
To: Paul Smith; +Cc: git
On Wed, Aug 22, 2018 at 08:42:20AM -0400, Paul Smith wrote:
> On Tue, 2018-08-21 at 23:03 -0400, Jeff King wrote:
> > static inline int hashcmp(const unsigned char *sha1, const unsigned
> > char *sha2)
> > {
> > + assert(the_hash_algo->rawsz == 20);
> > return memcmp(sha1, sha2, the_hash_algo->rawsz);
> > }
>
> I'm not familiar with Git code, but for most environments assert() is a
> macro which is compiled out when built for "release mode" (whatever
> that might mean). If that's the case for Git too, then relying on
> assert() to provide a side-effect (even an optimizer hint side-effect)
> won't work and this will actually get slower when built for "release
> mode".
>
> Just a thought...
We don't have such a "release mode" in Git, though of course people may
pass -DNDEBUG to the compiler if they want.
However, to me how we spell the assert is mostly orthogonal to the
discussion. We can do "if (...) BUG(...)" to get a guaranteed-present
conditional. The bigger questions are:
- are we OK with such an assertion; and
- does the assertion still give us the desired behavior when we add in
a branch for rawsz==32?
And I think the answers for those are both "probably not".
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 14:28 ` Derrick Stolee
@ 2018-08-22 15:24 ` Jeff King
0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2018-08-22 15:24 UTC (permalink / raw)
To: Derrick Stolee; +Cc: brian m. carlson, Junio C Hamano, git
On Wed, Aug 22, 2018 at 10:28:56AM -0400, Derrick Stolee wrote:
> In my testing, I've had the best luck with this change:
>
> diff --git a/cache.h b/cache.h
> index b1fd3d58ab..6c8b51c390 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1023,7 +1023,14 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char
> *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + switch (the_hash_algo->rawsz) {
> + case 20:
> + return memcmp(sha1, sha2, 20);
> + case 32:
> + return memcmp(sha1, sha2, 32);
> + default:
> + assert(0);
> + }
> }
>
> The fact that '20' and '32' are constants here may be helpful to the
> compiler. Can someone else test the perf?
I tested that one last night (and just re-tested it now to be sure). It
seems to just generate two separate calls to memcmp, with no speed
improvement.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 15:17 ` Jeff King
@ 2018-08-22 16:08 ` Duy Nguyen
2018-08-22 16:14 ` Duy Nguyen
0 siblings, 1 reply; 58+ messages in thread
From: Duy Nguyen @ 2018-08-22 16:08 UTC (permalink / raw)
To: Jeff King
Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason,
brian m. carlson, Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 6:03 PM Jeff King <peff@peff.net> wrote:
>
> On Wed, Aug 22, 2018 at 07:14:42AM -0400, Derrick Stolee wrote:
>
> > The other thing I was going to recommend (and I'll try to test this out
> > myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
> > volatile variable, since it is being referenced through a pointer. Perhaps
> > storing the value locally and then casing on it would help?
>
> I tried various sprinkling of "const" around the declarations to make it
> clear that the values wouldn't change once we saw them. But I couldn't
> detect any difference. At most I think that would let us hoist the "if"
> out of the loop, but gcc still seems unwilling to expand the memcmp when
> there are other branches.
>
> I think if that's the thing we want to have happen, we really do need to
> just write it out on that branch rather than saying "memcmp".
This reminds me of an old discussion about memcpy() vs doing explicit
compare loop with lots of performance measurements.. Is that what you
meant by "write it out"?
--
Duy
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 16:08 ` Duy Nguyen
@ 2018-08-22 16:14 ` Duy Nguyen
2018-08-22 16:26 ` Jeff King
0 siblings, 1 reply; 58+ messages in thread
From: Duy Nguyen @ 2018-08-22 16:14 UTC (permalink / raw)
To: Jeff King
Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason,
brian m. carlson, Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 6:08 PM Duy Nguyen <pclouds@gmail.com> wrote:
>
> On Wed, Aug 22, 2018 at 6:03 PM Jeff King <peff@peff.net> wrote:
> >
> > On Wed, Aug 22, 2018 at 07:14:42AM -0400, Derrick Stolee wrote:
> >
> > > The other thing I was going to recommend (and I'll try to test this out
> > > myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
> > > volatile variable, since it is being referenced through a pointer. Perhaps
> > > storing the value locally and then casing on it would help?
> >
> > I tried various sprinkling of "const" around the declarations to make it
> > clear that the values wouldn't change once we saw them. But I couldn't
> > detect any difference. At most I think that would let us hoist the "if"
> > out of the loop, but gcc still seems unwilling to expand the memcmp when
> > there are other branches.
> >
> > I think if that's the thing we want to have happen, we really do need to
> > just write it out on that branch rather than saying "memcmp".
>
> This reminds me of an old discussion about memcpy() vs doing explicit
> compare loop with lots of performance measurements..
Ah found it. Not sure if it is still relevant in light of multiple hash support
https://public-inbox.org/git/20110427225114.GA16765@elte.hu/
--
Duy
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 16:14 ` Duy Nguyen
@ 2018-08-22 16:26 ` Jeff King
2018-08-22 16:49 ` Derrick Stolee
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-22 16:26 UTC (permalink / raw)
To: Duy Nguyen
Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason,
brian m. carlson, Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 06:14:24PM +0200, Duy Nguyen wrote:
> On Wed, Aug 22, 2018 at 6:08 PM Duy Nguyen <pclouds@gmail.com> wrote:
> >
> > On Wed, Aug 22, 2018 at 6:03 PM Jeff King <peff@peff.net> wrote:
> > >
> > > On Wed, Aug 22, 2018 at 07:14:42AM -0400, Derrick Stolee wrote:
> > >
> > > > The other thing I was going to recommend (and I'll try to test this out
> > > > myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
> > > > volatile variable, since it is being referenced through a pointer. Perhaps
> > > > storing the value locally and then casing on it would help?
> > >
> > > I tried various sprinkling of "const" around the declarations to make it
> > > clear that the values wouldn't change once we saw them. But I couldn't
> > > detect any difference. At most I think that would let us hoist the "if"
> > > out of the loop, but gcc still seems unwilling to expand the memcmp when
> > > there are other branches.
> > >
> > > I think if that's the thing we want to have happen, we really do need to
> > > just write it out on that branch rather than saying "memcmp".
> >
> > This reminds me of an old discussion about memcpy() vs doing explicit
> > compare loop with lots of performance measurements..
>
> Ah found it. Not sure if it is still relevant in light of multiple hash support
>
> https://public-inbox.org/git/20110427225114.GA16765@elte.hu/
Yes, that was what I meant. We actually did switch to that hand-rolled
loop, but later we went back to memcmp in 0b006014c8 (hashcmp: use
memcmp instead of open-coded loop, 2017-08-09).
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 16:26 ` Jeff King
@ 2018-08-22 16:49 ` Derrick Stolee
2018-08-22 16:58 ` Duy Nguyen
2018-08-22 16:59 ` Jeff King
0 siblings, 2 replies; 58+ messages in thread
From: Derrick Stolee @ 2018-08-22 16:49 UTC (permalink / raw)
To: Jeff King, Duy Nguyen
Cc: Ævar Arnfjörð Bjarmason, brian m. carlson,
Junio C Hamano, Git Mailing List
On 8/22/2018 12:26 PM, Jeff King wrote:
> On Wed, Aug 22, 2018 at 06:14:24PM +0200, Duy Nguyen wrote:
>
>> On Wed, Aug 22, 2018 at 6:08 PM Duy Nguyen <pclouds@gmail.com> wrote:
>>> On Wed, Aug 22, 2018 at 6:03 PM Jeff King <peff@peff.net> wrote:
>>>> On Wed, Aug 22, 2018 at 07:14:42AM -0400, Derrick Stolee wrote:
>>>>
>>>>> The other thing I was going to recommend (and I'll try to test this out
>>>>> myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
>>>>> volatile variable, since it is being referenced through a pointer. Perhaps
>>>>> storing the value locally and then casing on it would help?
>>>> I tried various sprinkling of "const" around the declarations to make it
>>>> clear that the values wouldn't change once we saw them. But I couldn't
>>>> detect any difference. At most I think that would let us hoist the "if"
>>>> out of the loop, but gcc still seems unwilling to expand the memcmp when
>>>> there are other branches.
>>>>
>>>> I think if that's the thing we want to have happen, we really do need to
>>>> just write it out on that branch rather than saying "memcmp".
>>> This reminds me of an old discussion about memcpy() vs doing explicit
>>> compare loop with lots of performance measurements..
>> Ah found it. Not sure if it is still relevant in light of multiple hash support
>>
>> https://public-inbox.org/git/20110427225114.GA16765@elte.hu/
> Yes, that was what I meant. We actually did switch to that hand-rolled
> loop, but later we went back to memcmp in 0b006014c8 (hashcmp: use
> memcmp instead of open-coded loop, 2017-08-09).
Looking at that commit, I'm surprised the old logic was just a for loop, instead of a word-based approach, such as the following:
diff --git a/cache.h b/cache.h
index b1fd3d58ab..5e5819ad49 100644
--- a/cache.h
+++ b/cache.h
@@ -1021,9 +1021,41 @@ extern int find_unique_abbrev_r(char *hex, const
struct object_id *oid, int len)
extern const unsigned char null_sha1[GIT_MAX_RAWSZ];
extern const struct object_id null_oid;
+static inline int word_cmp_32(uint32_t a, uint32_t b)
+{
+ return memcmp(&a, &b, sizeof(uint32_t));
+}
+
+static inline int word_cmp_64(uint64_t a, uint64_t b)
+{
+ return memcmp(&a, &b, sizeof(uint64_t));
+}
+
+struct object_id_20 {
+ uint64_t data0;
+ uint64_t data1;
+ uint32_t data2;
+};
+
static inline int hashcmp(const unsigned char *sha1, const unsigned
char *sha2)
{
- return memcmp(sha1, sha2, the_hash_algo->rawsz);
+ if (the_hash_algo->rawsz == 20) {
+ struct object_id_20 *obj1 = (struct object_id_20 *)sha1;
+ struct object_id_20 *obj2 = (struct object_id_20 *)sha2;
+
+ if (obj1->data0 == obj2->data0) {
+ if (obj1->data1 == obj2->data1) {
+ if (obj1->data2 == obj2->data2) {
+ return 0;
+ }
+ return word_cmp_32(obj1->data2,
obj2->data2);
+ }
+ return word_cmp_64(obj1->data1, obj2->data1);
+ }
+ return word_cmp_64(obj1->data0, obj2->data0);
+ }
+
+ assert(0);
}
static inline int oidcmp(const struct object_id *oid1, const struct
object_id *oid2)
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 16:49 ` Derrick Stolee
@ 2018-08-22 16:58 ` Duy Nguyen
2018-08-22 17:04 ` Derrick Stolee
2018-08-22 16:59 ` Jeff King
1 sibling, 1 reply; 58+ messages in thread
From: Duy Nguyen @ 2018-08-22 16:58 UTC (permalink / raw)
To: Derrick Stolee
Cc: Jeff King, Ævar Arnfjörð Bjarmason,
brian m. carlson, Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 6:49 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 8/22/2018 12:26 PM, Jeff King wrote:
> > On Wed, Aug 22, 2018 at 06:14:24PM +0200, Duy Nguyen wrote:
> >
> >> On Wed, Aug 22, 2018 at 6:08 PM Duy Nguyen <pclouds@gmail.com> wrote:
> >>> On Wed, Aug 22, 2018 at 6:03 PM Jeff King <peff@peff.net> wrote:
> >>>> On Wed, Aug 22, 2018 at 07:14:42AM -0400, Derrick Stolee wrote:
> >>>>
> >>>>> The other thing I was going to recommend (and I'll try to test this out
> >>>>> myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
> >>>>> volatile variable, since it is being referenced through a pointer. Perhaps
> >>>>> storing the value locally and then casing on it would help?
> >>>> I tried various sprinkling of "const" around the declarations to make it
> >>>> clear that the values wouldn't change once we saw them. But I couldn't
> >>>> detect any difference. At most I think that would let us hoist the "if"
> >>>> out of the loop, but gcc still seems unwilling to expand the memcmp when
> >>>> there are other branches.
> >>>>
> >>>> I think if that's the thing we want to have happen, we really do need to
> >>>> just write it out on that branch rather than saying "memcmp".
> >>> This reminds me of an old discussion about memcpy() vs doing explicit
> >>> compare loop with lots of performance measurements..
> >> Ah found it. Not sure if it is still relevant in light of multiple hash support
> >>
> >> https://public-inbox.org/git/20110427225114.GA16765@elte.hu/
> > Yes, that was what I meant. We actually did switch to that hand-rolled
> > loop, but later we went back to memcmp in 0b006014c8 (hashcmp: use
> > memcmp instead of open-coded loop, 2017-08-09).
>
> Looking at that commit, I'm surprised the old logic was just a for loop, instead of a word-based approach, such as the following:
Might work on x86 but it breaks on cpu architectures with stricter
alignment. I don't think we have a guarantee that object_id is always
8 byte aligned.
>
> diff --git a/cache.h b/cache.h
> index b1fd3d58ab..5e5819ad49 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1021,9 +1021,41 @@ extern int find_unique_abbrev_r(char *hex, const
> struct object_id *oid, int len)
> extern const unsigned char null_sha1[GIT_MAX_RAWSZ];
> extern const struct object_id null_oid;
>
> +static inline int word_cmp_32(uint32_t a, uint32_t b)
> +{
> + return memcmp(&a, &b, sizeof(uint32_t));
> +}
> +
> +static inline int word_cmp_64(uint64_t a, uint64_t b)
> +{
> + return memcmp(&a, &b, sizeof(uint64_t));
> +}
> +
> +struct object_id_20 {
> + uint64_t data0;
> + uint64_t data1;
> + uint32_t data2;
> +};
> +
> static inline int hashcmp(const unsigned char *sha1, const unsigned
> char *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + if (the_hash_algo->rawsz == 20) {
> + struct object_id_20 *obj1 = (struct object_id_20 *)sha1;
> + struct object_id_20 *obj2 = (struct object_id_20 *)sha2;
> +
> + if (obj1->data0 == obj2->data0) {
> + if (obj1->data1 == obj2->data1) {
> + if (obj1->data2 == obj2->data2) {
> + return 0;
> + }
> + return word_cmp_32(obj1->data2,
> obj2->data2);
> + }
> + return word_cmp_64(obj1->data1, obj2->data1);
> + }
> + return word_cmp_64(obj1->data0, obj2->data0);
> + }
> +
> + assert(0);
> }
>
> static inline int oidcmp(const struct object_id *oid1, const struct
> object_id *oid2)
>
>
--
Duy
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 16:49 ` Derrick Stolee
2018-08-22 16:58 ` Duy Nguyen
@ 2018-08-22 16:59 ` Jeff King
2018-08-22 17:02 ` Junio C Hamano
1 sibling, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-22 16:59 UTC (permalink / raw)
To: Derrick Stolee
Cc: Duy Nguyen, Ævar Arnfjörð Bjarmason,
brian m. carlson, Junio C Hamano, Git Mailing List
On Wed, Aug 22, 2018 at 12:49:34PM -0400, Derrick Stolee wrote:
> > Yes, that was what I meant. We actually did switch to that hand-rolled
> > loop, but later we went back to memcmp in 0b006014c8 (hashcmp: use
> > memcmp instead of open-coded loop, 2017-08-09).
>
> Looking at that commit, I'm surprised the old logic was just a for
> loop, instead of a word-based approach, such as the following:
> [...]
> +struct object_id_20 {
> + uint64_t data0;
> + uint64_t data1;
> + uint32_t data2;
> +};
> +
> static inline int hashcmp(const unsigned char *sha1, const unsigned char
> *sha2)
> {
> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
> + if (the_hash_algo->rawsz == 20) {
> + struct object_id_20 *obj1 = (struct object_id_20 *)sha1;
> + struct object_id_20 *obj2 = (struct object_id_20 *)sha2;
I wonder if you're potentially running afoul of alignment requirements
here.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 16:59 ` Jeff King
@ 2018-08-22 17:02 ` Junio C Hamano
0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2018-08-22 17:02 UTC (permalink / raw)
To: Jeff King
Cc: Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson,
Git Mailing List
Jeff King <peff@peff.net> writes:
> On Wed, Aug 22, 2018 at 12:49:34PM -0400, Derrick Stolee wrote:
>
>> > Yes, that was what I meant. We actually did switch to that hand-rolled
>> > loop, but later we went back to memcmp in 0b006014c8 (hashcmp: use
>> > memcmp instead of open-coded loop, 2017-08-09).
>>
>> Looking at that commit, I'm surprised the old logic was just a for
>> loop, instead of a word-based approach, such as the following:
>> [...]
>> +struct object_id_20 {
>> + uint64_t data0;
>> + uint64_t data1;
>> + uint32_t data2;
>> +};
>> +
>> static inline int hashcmp(const unsigned char *sha1, const unsigned char
>> *sha2)
>> {
>> - return memcmp(sha1, sha2, the_hash_algo->rawsz);
>> + if (the_hash_algo->rawsz == 20) {
>> + struct object_id_20 *obj1 = (struct object_id_20 *)sha1;
>> + struct object_id_20 *obj2 = (struct object_id_20 *)sha2;
>
> I wonder if you're potentially running afoul of alignment requirements
> here.
Yup, and I think that all was discussed in that old thread ;-)
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 16:58 ` Duy Nguyen
@ 2018-08-22 17:04 ` Derrick Stolee
0 siblings, 0 replies; 58+ messages in thread
From: Derrick Stolee @ 2018-08-22 17:04 UTC (permalink / raw)
To: Duy Nguyen
Cc: Jeff King, Ævar Arnfjörð Bjarmason,
brian m. carlson, Junio C Hamano, Git Mailing List
On 8/22/2018 12:58 PM, Duy Nguyen wrote:
> On Wed, Aug 22, 2018 at 6:49 PM Derrick Stolee <stolee@gmail.com> wrote:
>> On 8/22/2018 12:26 PM, Jeff King wrote:
>>> On Wed, Aug 22, 2018 at 06:14:24PM +0200, Duy Nguyen wrote:
>>>
>>>> On Wed, Aug 22, 2018 at 6:08 PM Duy Nguyen <pclouds@gmail.com> wrote:
>>>>> On Wed, Aug 22, 2018 at 6:03 PM Jeff King <peff@peff.net> wrote:
>>>>>> On Wed, Aug 22, 2018 at 07:14:42AM -0400, Derrick Stolee wrote:
>>>>>>
>>>>>>> The other thing I was going to recommend (and I'll try to test this out
>>>>>>> myself later) is to see if 'the_hash_algo->rawsz' is being treated as a
>>>>>>> volatile variable, since it is being referenced through a pointer. Perhaps
>>>>>>> storing the value locally and then casing on it would help?
>>>>>> I tried various sprinkling of "const" around the declarations to make it
>>>>>> clear that the values wouldn't change once we saw them. But I couldn't
>>>>>> detect any difference. At most I think that would let us hoist the "if"
>>>>>> out of the loop, but gcc still seems unwilling to expand the memcmp when
>>>>>> there are other branches.
>>>>>>
>>>>>> I think if that's the thing we want to have happen, we really do need to
>>>>>> just write it out on that branch rather than saying "memcmp".
>>>>> This reminds me of an old discussion about memcpy() vs doing explicit
>>>>> compare loop with lots of performance measurements..
>>>> Ah found it. Not sure if it is still relevant in light of multiple hash support
>>>>
>>>> https://public-inbox.org/git/20110427225114.GA16765@elte.hu/
>>> Yes, that was what I meant. We actually did switch to that hand-rolled
>>> loop, but later we went back to memcmp in 0b006014c8 (hashcmp: use
>>> memcmp instead of open-coded loop, 2017-08-09).
>> Looking at that commit, I'm surprised the old logic was just a for loop, instead of a word-based approach, such as the following:
> Might work on x86 but it breaks on cpu architectures with stricter
> alignment. I don't think we have a guarantee that object_id is always
> 8 byte aligned.
You (and Peff) are probably correct here, which is unfortunate. I'm not
familiar with alignment constraints, but assume that such a word-based
approach is best.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-22 15:23 ` Jeff King
@ 2018-08-23 1:23 ` Jonathan Nieder
2018-08-23 2:16 ` Jeff King
0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Nieder @ 2018-08-23 1:23 UTC (permalink / raw)
To: Jeff King
Cc: Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson
Hi,
Jeff King wrote:
>> On Tue, 2018-08-21 at 23:03 -0400, Jeff King wrote:
>>> static inline int hashcmp(const unsigned char *sha1, const unsigned
>>> char *sha2)
>>> {
>>> + assert(the_hash_algo->rawsz == 20);
>>> return memcmp(sha1, sha2, the_hash_algo->rawsz);
>>> }
[...]
> The bigger questions are:
>
> - are we OK with such an assertion; and
>
> - does the assertion still give us the desired behavior when we add in
> a branch for rawsz==32?
>
> And I think the answers for those are both "probably not".
At this point in the release process, I think the answer to the first
question is a pretty clear "yes".
A ~10% increase in latency of some operations is quite significant, in
exchange for no user benefit yet. We can continue to try to figure
out how to convince compilers to generate good code for this (and
that's useful), but in the meantime we should also do the simple thing
to avoid the regression for users.
Thanks,
Jonathan
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 1:23 ` Jonathan Nieder
@ 2018-08-23 2:16 ` Jeff King
2018-08-23 2:27 ` Jonathan Nieder
2018-08-23 3:47 ` brian m. carlson
0 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2018-08-23 2:16 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson
On Wed, Aug 22, 2018 at 06:23:43PM -0700, Jonathan Nieder wrote:
> Jeff King wrote:
> >> On Tue, 2018-08-21 at 23:03 -0400, Jeff King wrote:
>
> >>> static inline int hashcmp(const unsigned char *sha1, const unsigned
> >>> char *sha2)
> >>> {
> >>> + assert(the_hash_algo->rawsz == 20);
> >>> return memcmp(sha1, sha2, the_hash_algo->rawsz);
> >>> }
> [...]
> > The bigger questions are:
> >
> > - are we OK with such an assertion; and
> >
> > - does the assertion still give us the desired behavior when we add in
> > a branch for rawsz==32?
> >
> > And I think the answers for those are both "probably not".
>
> At this point in the release process, I think the answer to the first
> question is a pretty clear "yes".
>
> A ~10% increase in latency of some operations is quite significant, in
> exchange for no user benefit yet. We can continue to try to figure
> out how to convince compilers to generate good code for this (and
> that's useful), but in the meantime we should also do the simple thing
> to avoid the regression for users.
FWIW, it's not 10%. The best I measured was ~4% on a very
hashcmp-limited operation, and I suspect even that may be highly
dependent on the compiler. We might be able to improve more by
sprinkling more asserts around, but there are 75 mentions of
the_hash_algo->rawsz. I wouldn't want to an assert at each one.
I don't mind doing one or a handful of these asserts as part of v2.19 if
we want to try to reclaim those few percent. But I suspect the very
first commit in any further hash-transition work is just going to be to
rip them all out.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 2:16 ` Jeff King
@ 2018-08-23 2:27 ` Jonathan Nieder
2018-08-23 5:02 ` Jeff King
2018-08-23 3:47 ` brian m. carlson
1 sibling, 1 reply; 58+ messages in thread
From: Jonathan Nieder @ 2018-08-23 2:27 UTC (permalink / raw)
To: Jeff King
Cc: Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson
Jeff King wrote:
> FWIW, it's not 10%. The best I measured was ~4% on a very
> hashcmp-limited operation, and I suspect even that may be highly
> dependent on the compiler. We might be able to improve more by
> sprinkling more asserts around, but there are 75 mentions of
> the_hash_algo->rawsz. I wouldn't want to an assert at each one.
>
> I don't mind doing one or a handful of these asserts as part of v2.19 if
> we want to try to reclaim those few percent. But I suspect the very
> first commit in any further hash-transition work is just going to be to
> rip them all out.
I was thinking just hashcmp and hashcpy.
Ideally such a change would come with a performance test to help the
person writing that very first commit. Except we already have
performance tests that capture this. ;-)
For further hash-transition work, I agree someone may want to revert
this, and I don't mind such a revert appearing right away in "next".
And it's possible that we might have to do the equivalent of manual
template expansion to recover the performance in some
performance-sensitive areas. Maybe we can get the compiler to
cooperate with us in that and maybe we can't. That's okay with me.
Anyway, I'll resend your patch with a commit message added some time
this evening.
Thanks,
Jonathan
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 2:16 ` Jeff King
2018-08-23 2:27 ` Jonathan Nieder
@ 2018-08-23 3:47 ` brian m. carlson
2018-08-23 5:04 ` Jeff King
1 sibling, 1 reply; 58+ messages in thread
From: brian m. carlson @ 2018-08-23 3:47 UTC (permalink / raw)
To: Jeff King
Cc: Jonathan Nieder, Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason
[-- Attachment #1: Type: text/plain, Size: 1372 bytes --]
On Wed, Aug 22, 2018 at 10:16:18PM -0400, Jeff King wrote:
> FWIW, it's not 10%. The best I measured was ~4% on a very
> hashcmp-limited operation, and I suspect even that may be highly
> dependent on the compiler. We might be able to improve more by
> sprinkling more asserts around, but there are 75 mentions of
> the_hash_algo->rawsz. I wouldn't want to an assert at each one.
>
> I don't mind doing one or a handful of these asserts as part of v2.19 if
> we want to try to reclaim those few percent. But I suspect the very
> first commit in any further hash-transition work is just going to be to
> rip them all out.
I expect that's going to be the case as well. I have patches that
wire up actual SHA-256 support in my hash-impl branch.
However, having said that, I'm happy to defer to whatever everyone else
thinks is best for 2.19. The assert solution would be fine with me in
this situation, and if we need to pull it out in the future, that's okay
with me.
I don't really have a strong opinion on this either way, so if someone
else does, please say so. I have somewhat more limited availability
over the next couple days, as I'm travelling on business, but I'm happy
to review a patch (and it seems like Peff has one minus the actual
commit message).
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 2:27 ` Jonathan Nieder
@ 2018-08-23 5:02 ` Jeff King
2018-08-23 5:09 ` brian m. carlson
` (2 more replies)
0 siblings, 3 replies; 58+ messages in thread
From: Jeff King @ 2018-08-23 5:02 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson
On Wed, Aug 22, 2018 at 07:27:56PM -0700, Jonathan Nieder wrote:
> Jeff King wrote:
>
> > FWIW, it's not 10%. The best I measured was ~4% on a very
> > hashcmp-limited operation, and I suspect even that may be highly
> > dependent on the compiler. We might be able to improve more by
> > sprinkling more asserts around, but there are 75 mentions of
> > the_hash_algo->rawsz. I wouldn't want to an assert at each one.
> >
> > I don't mind doing one or a handful of these asserts as part of v2.19 if
> > we want to try to reclaim those few percent. But I suspect the very
> > first commit in any further hash-transition work is just going to be to
> > rip them all out.
>
> I was thinking just hashcmp and hashcpy.
>
> Ideally such a change would come with a performance test to help the
> person writing that very first commit. Except we already have
> performance tests that capture this. ;-)
>
> For further hash-transition work, I agree someone may want to revert
> this, and I don't mind such a revert appearing right away in "next".
> And it's possible that we might have to do the equivalent of manual
> template expansion to recover the performance in some
> performance-sensitive areas. Maybe we can get the compiler to
> cooperate with us in that and maybe we can't. That's okay with me.
>
> Anyway, I'll resend your patch with a commit message added some time
> this evening.
Here's the patch. For some reason my numbers aren't quite as large as
they were yesterday (I was very careful to keep the system unloaded
today, whereas yesterday I was doing a few other things, so perhaps that
is the difference).
-- >8 --
Subject: [PATCH] hashcmp: assert constant hash size
Prior to 509f6f62a4 (cache: update object ID functions for
the_hash_algo, 2018-07-16), hashcmp() called memcmp() with a
constant size of 20 bytes. Some compilers were able to turn
that into a few quad-word comparisons, which is faster than
actually calling memcmp().
In 509f6f62a4, we started using the_hash_algo->rawsz
instead. Even though this will always be 20, the compiler
doesn't know that while inlining hashcmp() and ends up just
generating a call to memcmp().
Eventually we'll have to deal with multiple hash sizes, but
for the upcoming v2.19, we can restore some of the original
performance by asserting on the size. That gives the
compiler enough information to know that the memcmp will
always be called with a length of 20, and it performs the
same optimization.
Here are numbers for p0001.2 run against linux.git on a few
versions. This is using -O2 with gcc 8.2.0.
Test v2.18.0 v2.19.0-rc0 HEAD
------------------------------------------------------------------------------
0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0%
You can see that v2.19 is a little slower than v2.18. This
commit ended up slightly faster than v2.18, but there's a
fair bit of run-to-run noise (the generated code in the two
cases is basically the same). This patch does seem to be
consistently 1-2% faster than v2.19.
I tried changing hashcpy(), which was also touched by
509f6f62a4, in the same way, but couldn't measure any
speedup. Which makes sense, at least for this workload. A
traversal of the whole commit graph requires looking up
every entry of every tree via lookup_object(). That's many
multiples of the numbers of objects in the repository (most
of the lookups just return "yes, we already saw that
object").
Reported-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
cache.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/cache.h b/cache.h
index b1fd3d58ab..4d014541ab 100644
--- a/cache.h
+++ b/cache.h
@@ -1023,6 +1023,16 @@ extern const struct object_id null_oid;
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
+ /*
+ * This is a temporary optimization hack. By asserting the size here,
+ * we let the compiler know that it's always going to be 20, which lets
+ * it turn this fixed-size memcmp into a few inline instructions.
+ *
+ * This will need to be extended or ripped out when we learn about
+ * hashes of different sizes.
+ */
+ if (the_hash_algo->rawsz != 20)
+ BUG("hash size not yet supported by hashcmp");
return memcmp(sha1, sha2, the_hash_algo->rawsz);
}
--
2.19.0.rc0.412.g7005db4e88
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 3:47 ` brian m. carlson
@ 2018-08-23 5:04 ` Jeff King
2018-08-23 10:26 ` Derrick Stolee
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-23 5:04 UTC (permalink / raw)
To: brian m. carlson, Jonathan Nieder, Paul Smith, git,
Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 03:47:07AM +0000, brian m. carlson wrote:
> I expect that's going to be the case as well. I have patches that
> wire up actual SHA-256 support in my hash-impl branch.
>
> However, having said that, I'm happy to defer to whatever everyone else
> thinks is best for 2.19. The assert solution would be fine with me in
> this situation, and if we need to pull it out in the future, that's okay
> with me.
>
> I don't really have a strong opinion on this either way, so if someone
> else does, please say so. I have somewhat more limited availability
> over the next couple days, as I'm travelling on business, but I'm happy
> to review a patch (and it seems like Peff has one minus the actual
> commit message).
I just posted the patch elsewhere in the thread. I think you can safely
ignore the rest of it if you are otherwise occupied. Even if v2.19 ships
without some mitigation, I don't know that it's all that big a deal,
given the numbers I generated (which for some reason are less dramatic
than Stolee's).
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 5:02 ` Jeff King
@ 2018-08-23 5:09 ` brian m. carlson
2018-08-23 5:10 ` Jonathan Nieder
2018-08-23 13:20 ` Junio C Hamano
2 siblings, 0 replies; 58+ messages in thread
From: brian m. carlson @ 2018-08-23 5:09 UTC (permalink / raw)
To: Jeff King
Cc: Jonathan Nieder, Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason
[-- Attachment #1: Type: text/plain, Size: 437 bytes --]
On Thu, Aug 23, 2018 at 01:02:25AM -0400, Jeff King wrote:
> Here's the patch. For some reason my numbers aren't quite as large as
> they were yesterday (I was very careful to keep the system unloaded
> today, whereas yesterday I was doing a few other things, so perhaps that
> is the difference).
This looks sane to me. Thanks for writing this up.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 5:02 ` Jeff King
2018-08-23 5:09 ` brian m. carlson
@ 2018-08-23 5:10 ` Jonathan Nieder
2018-08-23 13:20 ` Junio C Hamano
2 siblings, 0 replies; 58+ messages in thread
From: Jonathan Nieder @ 2018-08-23 5:10 UTC (permalink / raw)
To: Jeff King
Cc: Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson
Jeff King wrote:
> Here's the patch. For some reason my numbers aren't quite as large as
> they were yesterday (I was very careful to keep the system unloaded
> today, whereas yesterday I was doing a few other things, so perhaps that
> is the difference).
>
> -- >8 --
> Subject: [PATCH] hashcmp: assert constant hash size
>
> Prior to 509f6f62a4 (cache: update object ID functions for
> the_hash_algo, 2018-07-16), hashcmp() called memcmp() with a
> constant size of 20 bytes. Some compilers were able to turn
> that into a few quad-word comparisons, which is faster than
> actually calling memcmp().
>
> In 509f6f62a4, we started using the_hash_algo->rawsz
> instead. Even though this will always be 20, the compiler
> doesn't know that while inlining hashcmp() and ends up just
> generating a call to memcmp().
>
> Eventually we'll have to deal with multiple hash sizes, but
> for the upcoming v2.19, we can restore some of the original
> performance by asserting on the size. That gives the
> compiler enough information to know that the memcmp will
> always be called with a length of 20, and it performs the
> same optimization.
>
> Here are numbers for p0001.2 run against linux.git on a few
> versions. This is using -O2 with gcc 8.2.0.
>
> Test v2.18.0 v2.19.0-rc0 HEAD
> ------------------------------------------------------------------------------
> 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0%
>
> You can see that v2.19 is a little slower than v2.18. This
> commit ended up slightly faster than v2.18, but there's a
> fair bit of run-to-run noise (the generated code in the two
> cases is basically the same). This patch does seem to be
> consistently 1-2% faster than v2.19.
>
> I tried changing hashcpy(), which was also touched by
> 509f6f62a4, in the same way, but couldn't measure any
> speedup. Which makes sense, at least for this workload. A
> traversal of the whole commit graph requires looking up
> every entry of every tree via lookup_object(). That's many
> multiples of the numbers of objects in the repository (most
> of the lookups just return "yes, we already saw that
> object").
>
> Reported-by: Derrick Stolee <stolee@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> cache.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Verified using "make object.s" that the memcmp call goes away. Thank
you.
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 5:04 ` Jeff King
@ 2018-08-23 10:26 ` Derrick Stolee
2018-08-23 13:16 ` Junio C Hamano
` (3 more replies)
0 siblings, 4 replies; 58+ messages in thread
From: Derrick Stolee @ 2018-08-23 10:26 UTC (permalink / raw)
To: Jeff King, brian m. carlson, Jonathan Nieder, Paul Smith, git,
Duy Nguyen, Ævar Arnfjörð Bjarmason
On 8/23/2018 1:04 AM, Jeff King wrote:
> On Thu, Aug 23, 2018 at 03:47:07AM +0000, brian m. carlson wrote:
>
>> I expect that's going to be the case as well. I have patches that
>> wire up actual SHA-256 support in my hash-impl branch.
>>
>> However, having said that, I'm happy to defer to whatever everyone else
>> thinks is best for 2.19. The assert solution would be fine with me in
>> this situation, and if we need to pull it out in the future, that's okay
>> with me.
>>
>> I don't really have a strong opinion on this either way, so if someone
>> else does, please say so. I have somewhat more limited availability
>> over the next couple days, as I'm travelling on business, but I'm happy
>> to review a patch (and it seems like Peff has one minus the actual
>> commit message).
> I just posted the patch elsewhere in the thread.
Thank you for that!
> I think you can safely
> ignore the rest of it if you are otherwise occupied. Even if v2.19 ships
> without some mitigation, I don't know that it's all that big a deal,
> given the numbers I generated (which for some reason are less dramatic
> than Stolee's).
My numbers may be more dramatic because my Linux environment is a
virtual machine.
I was thinking that having a mitigation for 2.19 is best, and then we
can focus as part of the 2.20 cycle how we can properly avoid this cost,
especially when 32 is a valid option.
Around the time that my proposed approaches were getting vetoed for
alignment issues, I figured I was out of my depth here. I reached out to
Daniel Lemire (of EWAH bitmap fame) on Twitter [1]. His blog is full of
posts of word-based approaches to different problems, so I thought he
might know something off the top of his head that would be applicable.
His conclusion (after looking only a short time) was to take a 'hasheq'
approach [2] like Peff suggested [3]. Since that requires auditing all
callers of hashcmp to see if hasheq is appropriate, it is not a good
solution for 2.19 but (in my opinion) should be evaluated as part of the
2.20 cycle.
Of course, if someone with knowledge of word-alignment issues across the
platforms we support knows how to enforce an alignment for object_id,
then something word-based like [4] could be reconsidered.
Thanks, everyone!
-Stolee
[1] https://twitter.com/stolee/status/1032312965754748930
[2]
https://lemire.me/blog/2018/08/22/avoid-lexicographical-comparisons-when-testing-for-string-equality/
[3]
https://public-inbox.org/git/20180822030344.GA14684@sigill.intra.peff.net/
[4]
https://public-inbox.org/git/7ea416cf-b043-1274-e161-85a8780b8e1c@gmail.com/
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 10:26 ` Derrick Stolee
@ 2018-08-23 13:16 ` Junio C Hamano
2018-08-23 16:14 ` Jeff King
` (2 subsequent siblings)
3 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2018-08-23 13:16 UTC (permalink / raw)
To: Derrick Stolee
Cc: Jeff King, brian m. carlson, Jonathan Nieder, Paul Smith, git,
Duy Nguyen, Ævar Arnfjörð Bjarmason
Derrick Stolee <stolee@gmail.com> writes:
> I was thinking that having a mitigation for 2.19 is best, and then we
> can focus as part of the 2.20 cycle how we can properly avoid this
> cost, especially when 32 is a valid option.
> ...
> ... to
> take a 'hasheq' approach [2] like Peff suggested [3]. Since that
> requires auditing all callers of hashcmp to see if hasheq is
> appropriate, it is not a good solution for 2.19 but (in my opinion)
> should be evaluated as part of the 2.20 cycle.
Thanks for thoughtful comments. I think it makes sense to go with
the "tell compiler hashcmp() currently is only about 20-byte array"
for now, as it is trivial to see why it cannot break things.
During 2.20 cycle, we may find out that hasheq() abstraction
performs better than hashcmp() with multiple lengths, and end up
doing that "audit and replace to use hasheq()". But as you said,
it probably isnot a good idea to rush it in this cycle.
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 5:02 ` Jeff King
2018-08-23 5:09 ` brian m. carlson
2018-08-23 5:10 ` Jonathan Nieder
@ 2018-08-23 13:20 ` Junio C Hamano
2018-08-23 16:31 ` wide t/perf output, was " Jeff King
2 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2018-08-23 13:20 UTC (permalink / raw)
To: Jeff King
Cc: Jonathan Nieder, Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson
Jeff King <peff@peff.net> writes:
> Here are numbers for p0001.2 run against linux.git on a few
> versions. This is using -O2 with gcc 8.2.0.
>
> Test v2.18.0 v2.19.0-rc0 HEAD
> ------------------------------------------------------------------------------
> 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0%
I see what you did to the formatting here, which is a topic of
another thread ;-).
Thanks, as Derrick also noted, I agree this is an appropriate
workaround for the upcoming release and we may want to explore
hasheq() and other solutions as part of the effort during the next
cycle (which can start now if people are bored---after all working
on the codebase is the best way to hunt for recent bugs).
Thanks, will queue.
> diff --git a/cache.h b/cache.h
> index b1fd3d58ab..4d014541ab 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1023,6 +1023,16 @@ extern const struct object_id null_oid;
>
> static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> {
> + /*
> + * This is a temporary optimization hack. By asserting the size here,
> + * we let the compiler know that it's always going to be 20, which lets
> + * it turn this fixed-size memcmp into a few inline instructions.
> + *
> + * This will need to be extended or ripped out when we learn about
> + * hashes of different sizes.
> + */
> + if (the_hash_algo->rawsz != 20)
> + BUG("hash size not yet supported by hashcmp");
> return memcmp(sha1, sha2, the_hash_algo->rawsz);
> }
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 10:26 ` Derrick Stolee
2018-08-23 13:16 ` Junio C Hamano
@ 2018-08-23 16:14 ` Jeff King
2018-08-23 23:30 ` Jacob Keller
2018-08-23 18:53 ` Jeff King
2018-09-02 18:53 ` Kaartic Sivaraam
3 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-23 16:14 UTC (permalink / raw)
To: Derrick Stolee
Cc: brian m. carlson, Jonathan Nieder, Paul Smith, git, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 06:26:58AM -0400, Derrick Stolee wrote:
> Around the time that my proposed approaches were getting vetoed for
> alignment issues, I figured I was out of my depth here. I reached out to
> Daniel Lemire (of EWAH bitmap fame) on Twitter [1]. His blog is full of
> posts of word-based approaches to different problems, so I thought he might
> know something off the top of his head that would be applicable. His
> conclusion (after looking only a short time) was to take a 'hasheq' approach
> [2] like Peff suggested [3]. Since that requires auditing all callers of
> hashcmp to see if hasheq is appropriate, it is not a good solution for 2.19
> but (in my opinion) should be evaluated as part of the 2.20 cycle.
I think that audit isn't actually too bad (but definitely not something
we should do for v2.19). The cocci patch I showed earlier hits most of
them. It misses the negated ones (e.g., "if (oidcmp(...))"). I'm not
sure if there's a way to ask coccinelle only for oidcmp()
used in a boolean context.
Just skimming the grep output, it's basically every call except the ones
we use for binary search.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* wide t/perf output, was Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 13:20 ` Junio C Hamano
@ 2018-08-23 16:31 ` Jeff King
0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2018-08-23 16:31 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jonathan Nieder, Paul Smith, git, Derrick Stolee, Duy Nguyen,
Ævar Arnfjörð Bjarmason, brian m. carlson
On Thu, Aug 23, 2018 at 06:20:26AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Here are numbers for p0001.2 run against linux.git on a few
> > versions. This is using -O2 with gcc 8.2.0.
> >
> > Test v2.18.0 v2.19.0-rc0 HEAD
> > ------------------------------------------------------------------------------
> > 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0%
>
> I see what you did to the formatting here, which is a topic of
> another thread ;-).
Do you happen to have a link? I missed that one, and digging turned up
nothing.
A while ago I wrote the patch below. I don't recall why I never sent it
in (and it doesn't apply cleanly these days, though I'm sure it could be
forward-ported).
-- >8 --
Date: Wed, 20 Jan 2016 23:54:14 -0500
Subject: [PATCH] t/perf: add "tall" output format
When aggregating results, we usually show a list of tests,
one per line, with the tested revisions in columns across.
Like:
$ ./aggregate.perl 348d4f2^ 348d4f2 p7000-filter-branch.sh
Test 348d4f2^ 348d4f2
-------------------------------------------------------------------
7000.2: noop filter 295.32(269.61+14.36) 7.92(0.85+0.72) -97.3%
This is useful if you have a lot of tests to show, but few
revisions; you're effectively comparing the two items on
each line. But sometimes you have the opposite: few tests,
but a large number of revisions. In this case, the lines
get very long, and it's hard to compare values.
This patch introduces a "tall" format that shows the same
data in a more vertical manner:
$ ./aggregate.perl --tall \
348d4f2^ 348d4f2 \
jk/filter-branch-empty^ \
jk/filter-branch-empty \
p7000-filter-branch.sh
Test: p7000-filter-branch.2
348d4f2^ 295.32(269.61+14.36)
348d4f2 7.92(0.85+0.72) -97.3%
jk/filter-branch-empty^ 9.37(0.87+0.80) -96.8%
jk/filter-branch-empty 7.71(0.92+0.62) -97.4%
Signed-off-by: Jeff King <peff@peff.net>
---
t/perf/aggregate.perl | 124 ++++++++++++++++++++++++++++++------------
1 file changed, 88 insertions(+), 36 deletions(-)
diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index e401208488..d108a02ccd 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -17,29 +17,41 @@ sub get_times {
return ($rt, $4, $5);
}
-sub format_times {
+sub format_times_list {
my ($r, $u, $s, $firstr) = @_;
if (!defined $r) {
return "<missing>";
}
my $out = sprintf "%.2f(%.2f+%.2f)", $r, $u, $s;
+ my $pct;
if (defined $firstr) {
if ($firstr > 0) {
- $out .= sprintf " %+.1f%%", 100.0*($r-$firstr)/$firstr;
+ $pct = sprintf "%+.1f%%", 100.0*($r-$firstr)/$firstr;
} elsif ($r == 0) {
- $out .= " =";
+ $pct = "=";
} else {
- $out .= " +inf";
+ $pct = "+inf";
}
}
- return $out;
+ return ($out, $pct);
+}
+
+sub format_times {
+ my ($times, $pct) = format_times_list(@_);
+ return defined $pct ? "$times $pct" : $times;
}
my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests);
+my ($tall_format);
while (scalar @ARGV) {
my $arg = $ARGV[0];
my $dir;
last if -f $arg or $arg eq "--";
+ if ($arg eq "--tall") {
+ $tall_format = 1;
+ shift @ARGV;
+ next;
+ }
if (! -d $arg) {
my $rev = Git::command_oneline(qw(rev-parse --verify), $arg);
$dir = "build/".$rev;
@@ -122,6 +134,11 @@ sub have_slash {
return 0;
}
+sub printable_dir {
+ my ($d) = @_;
+ return exists $dirabbrevs{$d} ? $dirabbrevs{$d} : $dirnames{$d};
+}
+
my %newdirabbrevs = %dirabbrevs;
while (!have_duplicate(values %newdirabbrevs)) {
%dirabbrevs = %newdirabbrevs;
@@ -132,44 +149,79 @@ sub have_slash {
}
}
-my %times;
-my @colwidth = ((0)x@dirs);
-for my $i (0..$#dirs) {
- my $d = $dirs[$i];
- my $w = length (exists $dirabbrevs{$d} ? $dirabbrevs{$d} : $dirnames{$d});
- $colwidth[$i] = $w if $w > $colwidth[$i];
-}
-for my $t (@subtests) {
- my $firstr;
+binmode STDOUT, ":utf8" or die "PANIC on binmode: $!";
+
+if (!$tall_format) {
+ my %times;
+ my @colwidth = ((0)x@dirs);
for my $i (0..$#dirs) {
my $d = $dirs[$i];
- $times{$prefixes{$d}.$t} = [get_times("$resultsdir/$prefixes{$d}$t.times")];
- my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}};
- my $w = length format_times($r,$u,$s,$firstr);
+ my $w = length(printable_dir($d));
$colwidth[$i] = $w if $w > $colwidth[$i];
- $firstr = $r unless defined $firstr;
}
-}
-my $totalwidth = 3*@dirs+$descrlen;
-$totalwidth += $_ for (@colwidth);
-
-binmode STDOUT, ":utf8" or die "PANIC on binmode: $!";
+ for my $t (@subtests) {
+ my $firstr;
+ for my $i (0..$#dirs) {
+ my $d = $dirs[$i];
+ $times{$prefixes{$d}.$t} = [get_times("$resultsdir/$prefixes{$d}$t.times")];
+ my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}};
+ my $w = length format_times($r,$u,$s,$firstr);
+ $colwidth[$i] = $w if $w > $colwidth[$i];
+ $firstr = $r unless defined $firstr;
+ }
+ }
+ my $totalwidth = 3*@dirs+$descrlen;
+ $totalwidth += $_ for (@colwidth);
-printf "%-${descrlen}s", "Test";
-for my $i (0..$#dirs) {
- my $d = $dirs[$i];
- printf " %-$colwidth[$i]s", (exists $dirabbrevs{$d} ? $dirabbrevs{$d} : $dirnames{$d});
-}
-print "\n";
-print "-"x$totalwidth, "\n";
-for my $t (@subtests) {
- printf "%-${descrlen}s", $descrs{$t};
- my $firstr;
+ printf "%-${descrlen}s", "Test";
for my $i (0..$#dirs) {
my $d = $dirs[$i];
- my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}};
- printf " %-$colwidth[$i]s", format_times($r,$u,$s,$firstr);
- $firstr = $r unless defined $firstr;
+ printf " %-$colwidth[$i]s", printable_dir($d);
}
print "\n";
+ print "-"x$totalwidth, "\n";
+ for my $t (@subtests) {
+ printf "%-${descrlen}s", $descrs{$t};
+ my $firstr;
+ for my $i (0..$#dirs) {
+ my $d = $dirs[$i];
+ my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}};
+ printf " %-$colwidth[$i]s", format_times($r,$u,$s,$firstr);
+ $firstr = $r unless defined $firstr;
+ }
+ print "\n";
+ }
+} else {
+ my $shown = 0;
+ for my $t (@subtests) {
+ print "\n" if $shown++;
+ print "Test: $t\n";
+
+ my %times;
+ my $firstr;
+ for my $d (@dirs) {
+ my ($r, $u, $s) = get_times("test-results/$prefixes{$d}$t.times");
+ $times{$d} = [format_times_list($r, $u, $s, $firstr)];
+ $firstr = $r unless defined $firstr;
+ }
+
+ my $maxdirlen = 0;
+ my $maxtimelen = 0;
+ for my $d (@dirs) {
+ if (length($d) > $maxdirlen) {
+ $maxdirlen = length(printable_dir($d));
+ }
+ if (length($times{$d}->[0]) > $maxtimelen) {
+ $maxtimelen = length($times{$d}->[0]);
+ }
+ }
+ $maxdirlen++;
+
+ for my $d (@dirs) {
+ printf "%-${maxdirlen}s", printable_dir($d);
+ printf " %${maxtimelen}s", $times{$d}->[0];
+ print " $times{$d}->[1]" if defined $times{$d}->[1];
+ print "\n";
+ }
+ }
}
--
2.19.0.rc0.412.g7005db4e88
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 10:26 ` Derrick Stolee
2018-08-23 13:16 ` Junio C Hamano
2018-08-23 16:14 ` Jeff King
@ 2018-08-23 18:53 ` Jeff King
2018-08-23 20:59 ` Derrick Stolee
2018-09-02 18:53 ` Kaartic Sivaraam
3 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-23 18:53 UTC (permalink / raw)
To: Derrick Stolee
Cc: brian m. carlson, Jonathan Nieder, Paul Smith, git, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 06:26:58AM -0400, Derrick Stolee wrote:
> > I think you can safely
> > ignore the rest of it if you are otherwise occupied. Even if v2.19 ships
> > without some mitigation, I don't know that it's all that big a deal,
> > given the numbers I generated (which for some reason are less dramatic
> > than Stolee's).
> My numbers may be more dramatic because my Linux environment is a virtual
> machine.
If you have a chance, can you run p0001 on my patch (compared to
2.19-rc0, or to both v2.18 and v2.19-rc0)? It would be nice to double
check that it really is fixing the problem you saw.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 18:53 ` Jeff King
@ 2018-08-23 20:59 ` Derrick Stolee
2018-08-24 6:56 ` Jeff King
2018-08-24 16:45 ` Derrick Stolee
0 siblings, 2 replies; 58+ messages in thread
From: Derrick Stolee @ 2018-08-23 20:59 UTC (permalink / raw)
To: Jeff King
Cc: brian m. carlson, Jonathan Nieder, Paul Smith, git, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On 8/23/2018 2:53 PM, Jeff King wrote:
> On Thu, Aug 23, 2018 at 06:26:58AM -0400, Derrick Stolee wrote:
>
>>> I think you can safely
>>> ignore the rest of it if you are otherwise occupied. Even if v2.19 ships
>>> without some mitigation, I don't know that it's all that big a deal,
>>> given the numbers I generated (which for some reason are less dramatic
>>> than Stolee's).
>> My numbers may be more dramatic because my Linux environment is a virtual
>> machine.
> If you have a chance, can you run p0001 on my patch (compared to
> 2.19-rc0, or to both v2.18 and v2.19-rc0)? It would be nice to double
> check that it really is fixing the problem you saw.
Sure. Note: I had to create a new Linux VM on a different machine
between Tuesday and today, so the absolute numbers are different.
Using git/git:
Test v2.18.0 v2.19.0-rc0 HEAD
-------------------------------------------------------------------------
0001.2: 3.10(3.02+0.08) 3.27(3.17+0.09) +5.5% 3.14(3.02+0.11) +1.3%
Using torvalds/linux:
Test v2.18.0 v2.19.0-rc0 HEAD
------------------------------------------------------------------------------
0001.2: 56.08(45.91+1.50) 56.60(46.62+1.50) +0.9% 54.61(45.47+1.46) -2.6%
Now here is where I get on my soapbox (and create a TODO for myself
later). I ran the above with GIT_PERF_REPEAT_COUNT=10, which intuitively
suggests that the results should be _more_ accurate than the default of
3. However, I then remember that we only report the *minimum* time from
all the runs, which is likely to select an outlier from the
distribution. To test this, I ran a few tests manually and found the
variation between runs to be larger than 3%.
When I choose my own metrics for performance tests, I like to run at
least 10 runs, remove the largest AND smallest runs from the samples,
and then take the average. I did this manually for 'git rev-list --all
--objects' on git/git and got the following results:
v2.18.0 v2.19.0-rc0 HEAD
--------------------------------
3.126 s 3.308 s 3.170 s
For full disclosure, here is a full table including all samples:
| | v2.18.0 | v2.19.0-rc0 | HEAD |
|------|---------|-------------|---------|
| | 4.58 | 3.302 | 3.239 |
| | 3.13 | 3.337 | 3.133 |
| | 3.213 | 3.291 | 3.159 |
| | 3.219 | 3.318 | 3.131 |
| | 3.077 | 3.302 | 3.163 |
| | 3.074 | 3.328 | 3.119 |
| | 3.022 | 3.277 | 3.125 |
| | 3.083 | 3.259 | 3.203 |
| | 3.057 | 3.311 | 3.223 |
| | 3.155 | 3.413 | 3.225 |
| Max | 4.58 | 3.413 | 3.239 |
| Min | 3.022 | 3.259 | 3.119 |
| Avg* | 3.126 | 3.30825 | 3.17025 |
(Note that the largest one was the first run, on v2.18.0, which is due
to a cold disk.)
I just kicked off a script that will run this test on the Linux repo
while I drive home. I'll be able to report a similar table of data easily.
My TODO is to consider aggregating the data this way (or with a median)
instead of reporting the minimum.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 16:14 ` Jeff King
@ 2018-08-23 23:30 ` Jacob Keller
2018-08-23 23:40 ` Jeff King
0 siblings, 1 reply; 58+ messages in thread
From: Jacob Keller @ 2018-08-23 23:30 UTC (permalink / raw)
To: Jeff King
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, paul,
Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 9:30 AM Jeff King <peff@peff.net> wrote:
> I think that audit isn't actually too bad (but definitely not something
> we should do for v2.19). The cocci patch I showed earlier hits most of
> them. It misses the negated ones (e.g., "if (oidcmp(...))"). I'm not
> sure if there's a way to ask coccinelle only for oidcmp()
> used in a boolean context.
>
You can look for explicitly "if (oidcmp(...))" though. I don't know if
you can catch *any* use which degrades to boolean outside of an if
statement, but I wouldn't expect there to be too many of those?
Thanks,
Jake
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 23:30 ` Jacob Keller
@ 2018-08-23 23:40 ` Jeff King
2018-08-24 0:06 ` Jeff King
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-23 23:40 UTC (permalink / raw)
To: Jacob Keller
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, paul,
Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 04:30:10PM -0700, Jacob Keller wrote:
> On Thu, Aug 23, 2018 at 9:30 AM Jeff King <peff@peff.net> wrote:
> > I think that audit isn't actually too bad (but definitely not something
> > we should do for v2.19). The cocci patch I showed earlier hits most of
> > them. It misses the negated ones (e.g., "if (oidcmp(...))"). I'm not
> > sure if there's a way to ask coccinelle only for oidcmp()
> > used in a boolean context.
> >
>
> You can look for explicitly "if (oidcmp(...))" though. I don't know if
> you can catch *any* use which degrades to boolean outside of an if
> statement, but I wouldn't expect there to be too many of those?
Yeah, that was my thought, too. And I've been trying this all afternoon
without success. Why doesn't this work:
@@
expression a, b;
@@
- if (oidcmp(a, b))
+ if (!oideq(a, b))
I get:
Fatal error: exception Failure("minus: parse error: \n = File
\"contrib/coccinelle/oideq.cocci\", line 21, column 0, charpos =
221\n around = '', whole content = \n")
If I do:
- if (oidcmp(a, b)) { ... }
that seems to please the parser for the minus line. But I cannot include
the "..." on the plus line. Clearly the "..." part should be context,
but I can't seem to find the right syntax.
FWIW, I do have patches adding hasheq() and converting all of the
!oidcmp() cases. I may resort to hand-investigating each of the negated
ones, but I really feel like I should be able to do better with
coccinelle.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 23:40 ` Jeff King
@ 2018-08-24 0:06 ` Jeff King
2018-08-24 0:16 ` Jeff King
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-24 0:06 UTC (permalink / raw)
To: Jacob Keller
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, paul,
Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 07:40:49PM -0400, Jeff King wrote:
> > You can look for explicitly "if (oidcmp(...))" though. I don't know if
> > you can catch *any* use which degrades to boolean outside of an if
> > statement, but I wouldn't expect there to be too many of those?
>
> Yeah, that was my thought, too. And I've been trying this all afternoon
> without success. Why doesn't this work:
>
> @@
> expression a, b;
> @@
> - if (oidcmp(a, b))
> + if (!oideq(a, b))
>
> I get:
>
> Fatal error: exception Failure("minus: parse error: \n = File
> \"contrib/coccinelle/oideq.cocci\", line 21, column 0, charpos =
> 221\n around = '', whole content = \n")
>
> If I do:
>
> - if (oidcmp(a, b)) { ... }
>
> that seems to please the parser for the minus line. But I cannot include
> the "..." on the plus line. Clearly the "..." part should be context,
> but I can't seem to find the right syntax.
This almost works:
@@
expression a, b;
statement s;
@@
- if (oidcmp(a, b)) s
+ if (!oideq(a, b)) s
It generates this, for example:
diff -u -p a/bisect.c b/bisect.c
--- a/bisect.c
+++ b/bisect.c
@@ -595,7 +595,7 @@ static struct commit_list *skip_away(str
for (i = 0; cur; cur = cur->next, i++) {
if (i == index) {
- if (oidcmp(&cur->item->object.oid, current_bad_oid))
+ if (!oideq(&cur->item->object.oid, current_bad_oid))
return cur;
if (previous)
return previous;
which is what we want. But it also generates this:
diff -u -p a/bundle.c b/bundle.c
--- a/bundle.c
+++ b/bundle.c
@@ -369,25 +369,11 @@ static int write_bundle_refs(int bundle_
* commit that is referenced by the tag, and not the tag
* itself.
*/
- if (oidcmp(&oid, &e->item->oid)) {
- /*
- * Is this the positive end of a range expressed
- * in terms of a tag (e.g. v2.0 from the range
- * "v1.0..v2.0")?
- */
- struct commit *one = lookup_commit_reference(the_repository,
- &oid);
+ if (!oideq(&oid, &e->item->oid)) {
+ struct commit *one=lookup_commit_reference(the_repository,
+ &oid);
struct object *obj;
-
if (e->item == &(one->object)) {
- /*
- * Need to include e->name as an
- * independent ref to the pack-objects
- * input, so that the tag is included
- * in the output; otherwise we would
- * end up triggering "empty bundle"
- * error.
- */
obj = parse_object_or_die(&oid, e->name);
obj->flags |= SHOWN;
add_pending_object(revs, obj, e->name);
So I really do want some way of saying "all of the block, no matter what
it is". Or of leaving it out as context.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 0:06 ` Jeff King
@ 2018-08-24 0:16 ` Jeff King
2018-08-24 2:48 ` Jacob Keller
0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-24 0:16 UTC (permalink / raw)
To: Jacob Keller
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, paul,
Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 08:06:37PM -0400, Jeff King wrote:
> This almost works:
>
> @@
> expression a, b;
> statement s;
> @@
> - if (oidcmp(a, b)) s
> + if (!oideq(a, b)) s
>
> [...]
> So I really do want some way of saying "all of the block, no matter what
> it is". Or of leaving it out as context.
Aha. The magic invocation is:
@@
expression a, b;
statement s;
@@
- if (oidcmp(a, b))
+ if (!oideq(a, b))
s
I would have expected that you could replace "s" with "...", but that
does not seem to work.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 0:16 ` Jeff King
@ 2018-08-24 2:48 ` Jacob Keller
2018-08-24 2:59 ` Jeff King
0 siblings, 1 reply; 58+ messages in thread
From: Jacob Keller @ 2018-08-24 2:48 UTC (permalink / raw)
To: Jeff King
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, Paul Smith,
Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 5:16 PM Jeff King <peff@peff.net> wrote:
>
> On Thu, Aug 23, 2018 at 08:06:37PM -0400, Jeff King wrote:
>
> > This almost works:
> >
> > @@
> > expression a, b;
> > statement s;
> > @@
> > - if (oidcmp(a, b)) s
> > + if (!oideq(a, b)) s
> >
> > [...]
>
> > So I really do want some way of saying "all of the block, no matter what
> > it is". Or of leaving it out as context.
>
> Aha. The magic invocation is:
>
> @@
> expression a, b;
> statement s;
> @@
> - if (oidcmp(a, b))
> + if (!oideq(a, b))
> s
>
> I would have expected that you could replace "s" with "...", but that
> does not seem to work.
>
> -Peff
Odd...
What about..
- if (oidcmp(a,b))
+ if(!oideq(a,b))
{ ... }
Or maybe you need to use something like
<...
- if (oidcmp(a,b))
+ if (!oideq(a,b))
...>
Hmm. Yea, semantic patches are a bit confusing overall sometimes.
But it looks like you got something which works?
Thanks,
Jake
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 2:48 ` Jacob Keller
@ 2018-08-24 2:59 ` Jeff King
2018-08-24 6:45 ` Jeff King
2018-08-27 19:36 ` Junio C Hamano
0 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2018-08-24 2:59 UTC (permalink / raw)
To: Jacob Keller
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, Paul Smith,
Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 07:48:42PM -0700, Jacob Keller wrote:
> Odd...
>
> What about..
>
> - if (oidcmp(a,b))
> + if(!oideq(a,b))
> { ... }
Nope, it doesn't like that syntactically.
> Or maybe you need to use something like
>
> <...
> - if (oidcmp(a,b))
> + if (!oideq(a,b))
> ...>
Nor that (I also tried finding documentation on what exactly the angle
brackets mean, but couldn't).
> Hmm. Yea, semantic patches are a bit confusing overall sometimes.
>
> But it looks like you got something which works?
Actually, what I showed earlier does seem to have some weirdness with
else-if. But I finally stumbled on something even better:
- oidcmp(a, b) != 0
+ !oideq(a, b)
Because of the isomorphisms that coccinelle knows about, that catches
everything we want. Obvious ones like:
diff --git a/bisect.c b/bisect.c
index 41c56a665e..7c1d8f1a6d 100644
--- a/bisect.c
+++ b/bisect.c
@@ -595,7 +595,7 @@ static struct commit_list *skip_away(struct commit_list *list, int count)
for (i = 0; cur; cur = cur->next, i++) {
if (i == index) {
- if (oidcmp(&cur->item->object.oid, current_bad_oid))
+ if (!oideq(&cur->item->object.oid, current_bad_oid))
return cur;
if (previous)
return previous;
and compound conditionals like:
diff --git a/blame.c b/blame.c
index 10d72e36dd..538d0ab1aa 100644
--- a/blame.c
+++ b/blame.c
@@ -1834,7 +1834,7 @@ void setup_scoreboard(struct blame_scoreboard *sb,
sb->revs->children.name = "children";
while (c->parents &&
- oidcmp(&c->object.oid, &sb->final->object.oid)) {
+ !oideq(&c->object.oid, &sb->final->object.oid)) {
struct commit_list *l = xcalloc(1, sizeof(*l));
l->item = c;
and even non-if contexts, like:
diff --git a/sha1-file.c b/sha1-file.c
index 631f6b9dc2..d85f4e93e1 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -825,7 +825,7 @@ int check_object_signature(const struct object_id *oid, void *map,
if (map) {
hash_object_file(map, size, type, &real_oid);
- return oidcmp(oid, &real_oid) ? -1 : 0;
+ return !oideq(oid, &real_oid) ? -1 : 0;
}
So I think we have a winner. I'll polish that up into patches and send
it out later tonight.
-Peff
^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 2:59 ` Jeff King
@ 2018-08-24 6:45 ` Jeff King
2018-08-24 11:04 ` Derrick Stolee
2018-08-27 19:36 ` Junio C Hamano
1 sibling, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-24 6:45 UTC (permalink / raw)
To: Jacob Keller
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, Paul Smith,
Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 10:59:55PM -0400, Jeff King wrote:
> So I think we have a winner. I'll polish that up into patches and send
> it out later tonight.
Oof. This rabbit hole keeps going deeper and deeper. I wrote up my
coccinelle findings separately in:
https://public-inbox.org/git/20180824064228.GA3183@sigill.intra.peff.net/
which is possibly a coccinelle bug (there I talked about oidcmp, since
it can be demonstrated with the existing transformations, but the same
thing happens with my hasheq patches). I'll wait to see how that
discussion plays out, but I do otherwise have hasheq() patches ready to
go, so it's probably not worth anybody else digging in further.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 20:59 ` Derrick Stolee
@ 2018-08-24 6:56 ` Jeff King
2018-08-24 7:57 ` Ævar Arnfjörð Bjarmason
2018-08-24 16:45 ` Derrick Stolee
1 sibling, 1 reply; 58+ messages in thread
From: Jeff King @ 2018-08-24 6:56 UTC (permalink / raw)
To: Derrick Stolee
Cc: brian m. carlson, Jonathan Nieder, Paul Smith, git, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, Aug 23, 2018 at 04:59:27PM -0400, Derrick Stolee wrote:
> Using git/git:
>
> Test v2.18.0 v2.19.0-rc0 HEAD
> -------------------------------------------------------------------------
> 0001.2: 3.10(3.02+0.08) 3.27(3.17+0.09) +5.5% 3.14(3.02+0.11) +1.3%
>
>
> Using torvalds/linux:
>
> Test v2.18.0 v2.19.0-rc0 HEAD
> ------------------------------------------------------------------------------
> 0001.2: 56.08(45.91+1.50) 56.60(46.62+1.50) +0.9% 54.61(45.47+1.46) -2.6%
Interesting that these timings aren't as dramatic as the ones you got
the other day (mine seemed to shift, too; for whatever reason it seems
like under load the difference is larger).
> Now here is where I get on my soapbox (and create a TODO for myself later).
> I ran the above with GIT_PERF_REPEAT_COUNT=10, which intuitively suggests
> that the results should be _more_ accurate than the default of 3. However, I
> then remember that we only report the *minimum* time from all the runs,
> which is likely to select an outlier from the distribution. To test this, I
> ran a few tests manually and found the variation between runs to be larger
> than 3%.
Yes, I agree it's not a great system. The whole "best of 3" thing is
OK for throwing out cold-cache warmups, but it's really bad for teasing
out the significance of small changes, or even understanding how much
run-to-run noise there is.
> When I choose my own metrics for performance tests, I like to run at least
> 10 runs, remove the largest AND smallest runs from the samples, and then
> take the average. I did this manually for 'git rev-list --all --objects' on
> git/git and got the following results:
I agree that technique is better. I wonder if there's something even
more statistically rigorous we could do. E.g., to compute the variance
and throw away outliers based on standard deviations. And also to report
the variance to give a sense of the significance of any changes.
Obviously more runs gives greater confidence in the results, but 10
sounds like a lot. Many of these tests take minutes to run. Letting it
go overnight is OK if you're doing a once-per-release mega-run, but it's
pretty painful if you just want to generate some numbers to show off
your commit.
> v2.18.0 v2.19.0-rc0 HEAD
> --------------------------------
> 3.126 s 3.308 s 3.170 s
So that's 5% worsening in 2.19, and we reclaim all but 1.4% of it. Those
numbers match what I expect (and what I was seeing in some of my earlier
timings).
> I just kicked off a script that will run this test on the Linux repo while I
> drive home. I'll be able to report a similar table of data easily.
Thanks, I'd expect it to come up with similar percentages. So we'll see
if that holds true. :)
> My TODO is to consider aggregating the data this way (or with a median)
> instead of reporting the minimum.
Yes, I think that would be a great improvement for t/perf.
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 6:56 ` Jeff King
@ 2018-08-24 7:57 ` Ævar Arnfjörð Bjarmason
0 siblings, 0 replies; 58+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-08-24 7:57 UTC (permalink / raw)
To: Jeff King
Cc: Derrick Stolee, brian m. carlson, Jonathan Nieder, Paul Smith,
git, Duy Nguyen, Steffen Mueller
On Fri, Aug 24 2018, Jeff King wrote:
> On Thu, Aug 23, 2018 at 04:59:27PM -0400, Derrick Stolee wrote:
>
>> Using git/git:
>>
>> Test v2.18.0 v2.19.0-rc0 HEAD
>> -------------------------------------------------------------------------
>> 0001.2: 3.10(3.02+0.08) 3.27(3.17+0.09) +5.5% 3.14(3.02+0.11) +1.3%
>>
>>
>> Using torvalds/linux:
>>
>> Test v2.18.0 v2.19.0-rc0 HEAD
>> ------------------------------------------------------------------------------
>> 0001.2: 56.08(45.91+1.50) 56.60(46.62+1.50) +0.9% 54.61(45.47+1.46) -2.6%
>
> Interesting that these timings aren't as dramatic as the ones you got
> the other day (mine seemed to shift, too; for whatever reason it seems
> like under load the difference is larger).
>
>> Now here is where I get on my soapbox (and create a TODO for myself later).
>> I ran the above with GIT_PERF_REPEAT_COUNT=10, which intuitively suggests
>> that the results should be _more_ accurate than the default of 3. However, I
>> then remember that we only report the *minimum* time from all the runs,
>> which is likely to select an outlier from the distribution. To test this, I
>> ran a few tests manually and found the variation between runs to be larger
>> than 3%.
>
> Yes, I agree it's not a great system. The whole "best of 3" thing is
> OK for throwing out cold-cache warmups, but it's really bad for teasing
> out the significance of small changes, or even understanding how much
> run-to-run noise there is.
>
>> When I choose my own metrics for performance tests, I like to run at least
>> 10 runs, remove the largest AND smallest runs from the samples, and then
>> take the average. I did this manually for 'git rev-list --all --objects' on
>> git/git and got the following results:
>
> I agree that technique is better. I wonder if there's something even
> more statistically rigorous we could do. E.g., to compute the variance
> and throw away outliers based on standard deviations. And also to report
> the variance to give a sense of the significance of any changes.
>
> Obviously more runs gives greater confidence in the results, but 10
> sounds like a lot. Many of these tests take minutes to run. Letting it
> go overnight is OK if you're doing a once-per-release mega-run, but it's
> pretty painful if you just want to generate some numbers to show off
> your commit.
An ex-coworker who's a *lot* smarter about these things than I am wrote
this module: https://metacpan.org/pod/Dumbbench
I while ago I dabbled briefly with integrating it into t/perf/ but got
distracted by something else:
The module currently works similar to [more traditional iterative
benchmark modules], except (in layman terms) it will run the command
many times, estimate the uncertainty of the result and keep
iterating until a certain user-defined precision has been
reached. Then, it calculates the resulting uncertainty and goes
through some pain to discard bad runs and subtract overhead from the
timings. The reported timing includes an uncertainty, so that
multiple benchmarks can more easily be compared.
Details of how it works here:
https://metacpan.org/pod/Dumbbench#HOW-IT-WORKS-AND-WHY-IT-DOESN'T
Something like that seems to me to be an inherently better
approach. I.e. we have lots of test cases that take 500ms, and some that
take maybe 5 minutes (depending on the size of the repository).
Indiscriminately repeating all of those for GIT_PERF_REPEAT_COUNT must
be dumber than something like the above method.
We could also speed up the runtime of the perf tests a lot with such a
method, by e.g. saying that we're OK with less certainty on tests that
take a longer time than those that take a shorter time.
>> v2.18.0 v2.19.0-rc0 HEAD
>> --------------------------------
>> 3.126 s 3.308 s 3.170 s
>
> So that's 5% worsening in 2.19, and we reclaim all but 1.4% of it. Those
> numbers match what I expect (and what I was seeing in some of my earlier
> timings).
>
>> I just kicked off a script that will run this test on the Linux repo while I
>> drive home. I'll be able to report a similar table of data easily.
>
> Thanks, I'd expect it to come up with similar percentages. So we'll see
> if that holds true. :)
>
>> My TODO is to consider aggregating the data this way (or with a median)
>> instead of reporting the minimum.
>
> Yes, I think that would be a great improvement for t/perf.
>
> -Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 6:45 ` Jeff King
@ 2018-08-24 11:04 ` Derrick Stolee
0 siblings, 0 replies; 58+ messages in thread
From: Derrick Stolee @ 2018-08-24 11:04 UTC (permalink / raw)
To: Jeff King, Jacob Keller
Cc: brian m. carlson, Jonathan Nieder, Paul Smith, Git mailing list,
Duy Nguyen, Ævar Arnfjörð Bjarmason
On 8/24/2018 2:45 AM, Jeff King wrote:
> On Thu, Aug 23, 2018 at 10:59:55PM -0400, Jeff King wrote:
>
>> So I think we have a winner. I'll polish that up into patches and send
>> it out later tonight.
> Oof. This rabbit hole keeps going deeper and deeper. I wrote up my
> coccinelle findings separately in:
>
> https://public-inbox.org/git/20180824064228.GA3183@sigill.intra.peff.net/
>
> which is possibly a coccinelle bug (there I talked about oidcmp, since
> it can be demonstrated with the existing transformations, but the same
> thing happens with my hasheq patches). I'll wait to see how that
> discussion plays out, but I do otherwise have hasheq() patches ready to
> go, so it's probably not worth anybody else digging in further.
>
I was trying this myself yesterday, and found a doc _somewhere_ that
said what we should do is this:
@@
expression e1, e2;
@@
if (
- oidcmp(e1, e2)
+ !oideq(e1, e2)
)
(Don't forget the space before the last end-paren!)
-Stolee
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 20:59 ` Derrick Stolee
2018-08-24 6:56 ` Jeff King
@ 2018-08-24 16:45 ` Derrick Stolee
2018-08-25 8:26 ` Jeff King
1 sibling, 1 reply; 58+ messages in thread
From: Derrick Stolee @ 2018-08-24 16:45 UTC (permalink / raw)
To: Jeff King
Cc: brian m. carlson, Jonathan Nieder, Paul Smith, git, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On 8/23/2018 4:59 PM, Derrick Stolee wrote:
>
> When I choose my own metrics for performance tests, I like to run at
> least 10 runs, remove the largest AND smallest runs from the samples,
> and then take the average. I did this manually for 'git rev-list --all
> --objects' on git/git and got the following results:
>
> v2.18.0 v2.19.0-rc0 HEAD
> --------------------------------
> 3.126 s 3.308 s 3.170 s
>
> For full disclosure, here is a full table including all samples:
>
> | | v2.18.0 | v2.19.0-rc0 | HEAD |
> |------|---------|-------------|---------|
> | | 4.58 | 3.302 | 3.239 |
> | | 3.13 | 3.337 | 3.133 |
> | | 3.213 | 3.291 | 3.159 |
> | | 3.219 | 3.318 | 3.131 |
> | | 3.077 | 3.302 | 3.163 |
> | | 3.074 | 3.328 | 3.119 |
> | | 3.022 | 3.277 | 3.125 |
> | | 3.083 | 3.259 | 3.203 |
> | | 3.057 | 3.311 | 3.223 |
> | | 3.155 | 3.413 | 3.225 |
> | Max | 4.58 | 3.413 | 3.239 |
> | Min | 3.022 | 3.259 | 3.119 |
> | Avg* | 3.126 | 3.30825 | 3.17025 |
>
> (Note that the largest one was the first run, on v2.18.0, which is due
> to a cold disk.)
>
> I just kicked off a script that will run this test on the Linux repo
> while I drive home. I'll be able to report a similar table of data
> easily.
Here are the numbers for Linux:
| | v2.18.0 | v2.19.0-rc0 | HEAD |
|------|----------|-------------|--------|
| | 86.5 | 70.739 | 57.266 |
| | 60.582 | 101.928 | 56.641 |
| | 58.964 | 60.139 | 60.258 |
| | 59.47 | 61.141 | 58.213 |
| | 62.554 | 60.73 | 84.54 |
| | 59.139 | 85.424 | 57.745 |
| | 58.487 | 59.31 | 59.979 |
| | 58.653 | 69.845 | 60.181 |
| | 58.085 | 102.777 | 61.455 |
| | 58.304 | 60.459 | 62.551 |
| Max | 86.5 | 102.777 | 84.54 |
| Min | 58.085 | 59.31 | 56.641 |
| Avg* | 59.51913 | 71.30063 | 59.706 |
| Med | 59.0515 | 65.493 | 60.08 |
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 16:45 ` Derrick Stolee
@ 2018-08-25 8:26 ` Jeff King
0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2018-08-25 8:26 UTC (permalink / raw)
To: Derrick Stolee
Cc: brian m. carlson, Jonathan Nieder, Paul Smith, git, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Fri, Aug 24, 2018 at 12:45:10PM -0400, Derrick Stolee wrote:
> Here are the numbers for Linux:
>
> | | v2.18.0 | v2.19.0-rc0 | HEAD |
> |------|----------|-------------|--------|
> | | 86.5 | 70.739 | 57.266 |
> | | 60.582 | 101.928 | 56.641 |
> | | 58.964 | 60.139 | 60.258 |
> | | 59.47 | 61.141 | 58.213 |
> | | 62.554 | 60.73 | 84.54 |
> | | 59.139 | 85.424 | 57.745 |
> | | 58.487 | 59.31 | 59.979 |
> | | 58.653 | 69.845 | 60.181 |
> | | 58.085 | 102.777 | 61.455 |
> | | 58.304 | 60.459 | 62.551 |
> | Max | 86.5 | 102.777 | 84.54 |
> | Min | 58.085 | 59.31 | 56.641 |
> | Avg* | 59.51913 | 71.30063 | 59.706 |
> | Med | 59.0515 | 65.493 | 60.08 |
Thanks. The median ones are the most interesting, I think (and show a
very satisfying recovery from my patch).
I'm surprised at the variance, especially in your v2.19 runs. It makes
me distrust the mean (and implies to me we could do better by throwing
away outliers based on value, not just the single high/low; or just
using the median also solves that).
The variance in the v2.18 column is what I'd expect based on past
experience (slow cold cache to start, then a few percent change
run-to-run).
-Peff
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-24 2:59 ` Jeff King
2018-08-24 6:45 ` Jeff King
@ 2018-08-27 19:36 ` Junio C Hamano
1 sibling, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2018-08-27 19:36 UTC (permalink / raw)
To: Jeff King
Cc: Jacob Keller, Derrick Stolee, brian m. carlson, Jonathan Nieder,
Paul Smith, Git mailing list, Duy Nguyen,
Ævar Arnfjörð Bjarmason
Jeff King <peff@peff.net> writes:
> Actually, what I showed earlier does seem to have some weirdness with
> else-if. But I finally stumbled on something even better:
>
> - oidcmp(a, b) != 0
> + !oideq(a, b)
>
> Because of the isomorphisms that coccinelle knows about, that catches
> everything we want.
Nice ;-)
^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [ANNOUNCE] Git v2.19.0-rc0
2018-08-23 10:26 ` Derrick Stolee
` (2 preceding siblings ...)
2018-08-23 18:53 ` Jeff King
@ 2018-09-02 18:53 ` Kaartic Sivaraam
3 siblings, 0 replies; 58+ messages in thread
From: Kaartic Sivaraam @ 2018-09-02 18:53 UTC (permalink / raw)
To: Derrick Stolee, Jeff King, brian m. carlson, Jonathan Nieder,
Paul Smith, git, Duy Nguyen,
Ævar Arnfjörð Bjarmason
On Thu, 2018-08-23 at 06:26 -0400, Derrick Stolee wrote:
>
> Around the time that my proposed approaches were getting vetoed for
> alignment issues, I figured I was out of my depth here. I reached out to
> Daniel Lemire (of EWAH bitmap fame) on Twitter [1]. His blog is full of
> posts of word-based approaches to different problems, so I thought he
> might know something off the top of his head that would be applicable.
> His conclusion (after looking only a short time) was to take a 'hasheq'
> approach [2] like Peff suggested [3]. Since that requires auditing all
> callers of hashcmp to see if hasheq is appropriate, it is not a good
> solution for 2.19 but (in my opinion) should be evaluated as part of the
> 2.20 cycle.
>
That was an interesting blog post, indeed. It had an interesting
comments section. One comment especially caught my eyes was [a]:
"So the way gcc (and maybe clang) handles this is specifically by
recognizing memcmp and checking whether a only a 2-way result is needed
and then essentially replacing it with a memcmp_eq call.
..."
I find this to be an interesting note. It seems GCC does optimize when
we clearly indicate that we use the result of the memcmp as a boolean.
So would that help in anyway? Maybe it would help in writing a `hasheq`
method easily? I'm not sure.
> [1] https://twitter.com/stolee/status/1032312965754748930
>
> [2]
> https://lemire.me/blog/2018/08/22/avoid-lexicographical-comparisons-when-testing-for-string-equality/
>
> [3]
> https://public-inbox.org/git/20180822030344.GA14684@sigill.intra.peff.net/
>
> [4]
> https://public-inbox.org/git/7ea416cf-b043-1274-e161-85a8780b8e1c@gmail.com/
[a]:
https://lemire.me/blog/2018/08/22/avoid-lexicographical-comparisons-when-testing-for-string-equality/#comment-344073
--
Sivaraam
^ permalink raw reply [flat|nested] 58+ messages in thread
end of thread, other threads:[~2018-09-02 18:54 UTC | newest]
Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-20 22:13 [ANNOUNCE] Git v2.19.0-rc0 Junio C Hamano
2018-08-20 22:41 ` Stefan Beller
2018-08-20 23:39 ` Jonathan Nieder
2018-08-21 0:27 ` Jonathan Nieder
2018-08-21 0:46 ` Stefan Beller
2018-08-21 20:41 ` Derrick Stolee
2018-08-21 21:29 ` Jeff King
2018-08-22 0:48 ` brian m. carlson
2018-08-22 3:03 ` Jeff King
2018-08-22 3:36 ` Jeff King
2018-08-22 11:11 ` Derrick Stolee
2018-08-22 5:36 ` brian m. carlson
2018-08-22 6:07 ` Jeff King
2018-08-22 7:39 ` Ævar Arnfjörð Bjarmason
2018-08-22 11:14 ` Derrick Stolee
2018-08-22 15:17 ` Jeff King
2018-08-22 16:08 ` Duy Nguyen
2018-08-22 16:14 ` Duy Nguyen
2018-08-22 16:26 ` Jeff King
2018-08-22 16:49 ` Derrick Stolee
2018-08-22 16:58 ` Duy Nguyen
2018-08-22 17:04 ` Derrick Stolee
2018-08-22 16:59 ` Jeff King
2018-08-22 17:02 ` Junio C Hamano
2018-08-22 15:14 ` Jeff King
2018-08-22 14:28 ` Derrick Stolee
2018-08-22 15:24 ` Jeff King
2018-08-22 12:42 ` Paul Smith
2018-08-22 15:23 ` Jeff King
2018-08-23 1:23 ` Jonathan Nieder
2018-08-23 2:16 ` Jeff King
2018-08-23 2:27 ` Jonathan Nieder
2018-08-23 5:02 ` Jeff King
2018-08-23 5:09 ` brian m. carlson
2018-08-23 5:10 ` Jonathan Nieder
2018-08-23 13:20 ` Junio C Hamano
2018-08-23 16:31 ` wide t/perf output, was " Jeff King
2018-08-23 3:47 ` brian m. carlson
2018-08-23 5:04 ` Jeff King
2018-08-23 10:26 ` Derrick Stolee
2018-08-23 13:16 ` Junio C Hamano
2018-08-23 16:14 ` Jeff King
2018-08-23 23:30 ` Jacob Keller
2018-08-23 23:40 ` Jeff King
2018-08-24 0:06 ` Jeff King
2018-08-24 0:16 ` Jeff King
2018-08-24 2:48 ` Jacob Keller
2018-08-24 2:59 ` Jeff King
2018-08-24 6:45 ` Jeff King
2018-08-24 11:04 ` Derrick Stolee
2018-08-27 19:36 ` Junio C Hamano
2018-08-23 18:53 ` Jeff King
2018-08-23 20:59 ` Derrick Stolee
2018-08-24 6:56 ` Jeff King
2018-08-24 7:57 ` Ævar Arnfjörð Bjarmason
2018-08-24 16:45 ` Derrick Stolee
2018-08-25 8:26 ` Jeff King
2018-09-02 18:53 ` Kaartic Sivaraam
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).