* [RFH] gpg --import entropy while running tests
@ 2016-12-28 7:23 Jeff King
2016-12-28 8:02 ` Jeff King
0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2016-12-28 7:23 UTC (permalink / raw)
To: git
I noticed a few of our test scripts taking a long time to run, even
though they used to be quick. Here's one:
$ time ./t7612-merge-verify-signatures.sh
ok 1 - create signed commits
ok 2 - merge unsigned commit with verification
ok 3 - merge commit with bad signature with verification
ok 4 - merge commit with untrusted signature with verification
ok 5 - merge signed commit with verification
ok 6 - merge commit with bad signature without verification
# passed all 6 test(s)
1..6
real 0m12.285s
user 0m0.048s
sys 0m0.044s
That's a lot of time not using any CPU. What's going on? Running with
"sh -x" shows that we spend most of the time in this line from
lib-gpg.sh:
gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
"$TEST_DIRECTORY"/lib-gpg/keyring.gpg
And running gpg with "--debug-level guru" shows that we are blocking
while waiting for entropy. Has anybody else seen this? I feel like I
noticed it starting a few weeks ago, and indeed dropping back to gpg
2.0.26 (from 2.1.17) makes the problem go away.
Is it a bug in gpg (oddly, the kernel reports lots of entropy available,
and generating the signatures themselves is quite fast)? Or is the new
version doing something special in the import process that we need to
work around or disable?
The reason we use --import at all is to handle differences in the .gnupg
format between versions 1 and 2. So the nuclear option would be to just
carry pre-made .gnupg directories for _both_ versions in our test suite,
and pick the appropriate one based on the output of "gpg --version".
That feels like a hack, but it gives us a lot of control.
I'd love it if we could figure out a way of making --import reliably
fast, though.
-Peff
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFH] gpg --import entropy while running tests
2016-12-28 7:23 [RFH] gpg --import entropy while running tests Jeff King
@ 2016-12-28 8:02 ` Jeff King
2016-12-28 8:39 ` Jeff King
0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2016-12-28 8:02 UTC (permalink / raw)
To: git
On Wed, Dec 28, 2016 at 02:23:03AM -0500, Jeff King wrote:
> That's a lot of time not using any CPU. What's going on? Running with
> "sh -x" shows that we spend most of the time in this line from
> lib-gpg.sh:
>
> gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
> "$TEST_DIRECTORY"/lib-gpg/keyring.gpg
>
> And running gpg with "--debug-level guru" shows that we are blocking
> while waiting for entropy. Has anybody else seen this? I feel like I
> noticed it starting a few weeks ago, and indeed dropping back to gpg
> 2.0.26 (from 2.1.17) makes the problem go away.
>
> Is it a bug in gpg (oddly, the kernel reports lots of entropy available,
> and generating the signatures themselves is quite fast)? Or is the new
> version doing something special in the import process that we need to
> work around or disable?
Answering my own question (somewhat): this is bisectable in the gnupg
repository, and it turns out to be caused by their 4473db1ef (agent:
Kludge to mitigate blocking calls in Libgcrypt., 2016-11-11), which
introduces a 100ms sleep (yuck) that is presumably triggering way more
than it needs to. More details at:
https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=commit;h=4473db1ef24031ff4e26c9a9de95dbe898ed2b97
So this does seem like a gpg bug.
-Peff
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFH] gpg --import entropy while running tests
2016-12-28 8:02 ` Jeff King
@ 2016-12-28 8:39 ` Jeff King
2016-12-28 16:05 ` Theodore Ts'o
0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2016-12-28 8:39 UTC (permalink / raw)
To: git
On Wed, Dec 28, 2016 at 03:02:30AM -0500, Jeff King wrote:
> > Is it a bug in gpg (oddly, the kernel reports lots of entropy available,
> > and generating the signatures themselves is quite fast)? Or is the new
> > version doing something special in the import process that we need to
> > work around or disable?
>
> Answering my own question (somewhat): this is bisectable in the gnupg
> repository, and it turns out to be caused by their 4473db1ef (agent:
> Kludge to mitigate blocking calls in Libgcrypt., 2016-11-11), which
> introduces a 100ms sleep (yuck) that is presumably triggering way more
> than it needs to. More details at:
>
> https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=commit;h=4473db1ef24031ff4e26c9a9de95dbe898ed2b97
>
> So this does seem like a gpg bug.
I've submitted a bug report to gpg:
https://bugs.gnupg.org/gnupg/issue2897
so we'll see what they say.
-Peff
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFH] gpg --import entropy while running tests
2016-12-28 8:39 ` Jeff King
@ 2016-12-28 16:05 ` Theodore Ts'o
0 siblings, 0 replies; 4+ messages in thread
From: Theodore Ts'o @ 2016-12-28 16:05 UTC (permalink / raw)
To: Jeff King; +Cc: git
On Wed, Dec 28, 2016 at 03:39:30AM -0500, Jeff King wrote:
> > https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=commit;h=4473db1ef24031ff4e26c9a9de95dbe898ed2b97
> >
> > So this does seem like a gpg bug.
>
> I've submitted a bug report to gpg:
>
> https://bugs.gnupg.org/gnupg/issue2897
>
> so we'll see what they say.
Yeah, they are definitely doing something very.... hard to explain.
Pid 8348 is the gpg-agent process which the main gpg program (pid
8344) connected to. It starts trying to get randomness in response to
a KEYWRAP command:
8348 10:58:57.882909 access("/dev/random", R_OK) = 0
8348 10:58:57.883205 access("/dev/urandom", R_OK) = 0
8348 10:58:57.883472 open("/dev/urandom", O_RDONLY) = 9
8348 10:58:57.883729 fcntl(9, F_GETFD) = 0
8348 10:58:57.883914 fcntl(9, F_SETFD, FD_CLOEXEC) = 0
It opens /dev/urandom, but then never uses fd 9 ever again. Instead,
it uses getrandom, but in a pretty silly fashion, with lots of sleeps
in between, and not between each progress report, either:
8348 10:58:57.884129 write(8, "S PROGRESS need_entropy X 30 120", 32 <unfinished ...>
8344 10:58:57.884338 <... read resumed> "S PROGRESS need_entropy X 30 120", 1002) = 32
8348 10:58:57.884424 <... write resumed> ) = 32
8344 10:58:57.884488 read(5, <unfinished ...>
8348 10:58:57.884550 write(8, "\n", 1 <unfinished ...>
8344 10:58:57.884715 <... read resumed> "\n", 970) = 1
8348 10:58:57.884800 <... write resumed> ) = 1
8344 10:58:57.884883 read(5, <unfinished ...>
8348 10:58:57.884951 nanosleep({0, 100000000}, NULL) = 0
8348 10:58:57.985363 select(10, [9], NULL, NULL, {0, 100000}) = 1 (in [9], left {0, 99994})
8348 10:58:57.985593 getrandom("&\275\354^\256\320\3w\21:R]`eJ\t\t\350\245\202>\255\237\324\324\340\24^c\323\210\376"..., 90, 0) = 90
8348 10:58:57.985751 write(8, "S PROGRESS need_entropy X 120 12"..., 33) = 33
8344 10:58:57.985885 <... read resumed> "S PROGRESS need_entropy X 120 12"..., 1002) = 33
8348 10:58:57.985934 write(8, "\n", 1 <unfinished ...>
8344 10:58:57.985982 read(5, <unfinished ...>
8348 10:58:57.986015 <... write resumed> ) = 1
8344 10:58:57.986048 <... read resumed> "\n", 969) = 1
8348 10:58:57.986090 nanosleep({0, 100000000}, <unfinished ...>
8344 10:58:57.986142 read(5, <unfinished ...>
8348 10:58:58.086253 <... nanosleep resumed> NULL) = 0
8348 10:58:58.086370 write(8, "S PROGRESS need_entropy X 30 120", 32) = 32
8344 10:58:58.086502 <... read resumed> "S PROGRESS need_entropy X 30 120", 1002) = 32
8348 10:58:58.086541 write(8, "\n", 1 <unfinished ...>
8344 10:58:58.086579 read(5, <unfinished ...>
8348 10:58:58.086604 <... write resumed> ) = 1
8344 10:58:58.086630 <... read resumed> "\n", 970) = 1
8348 10:58:58.086661 nanosleep({0, 100000000}, <unfinished ...>
8344 10:58:58.086703 read(5, <unfinished ...>
8348 10:58:58.186815 <... nanosleep resumed> NULL) = 0
8348 10:58:58.186894 select(10, [9], NULL, NULL, {0, 100000}) = 1 (in [9], left {0, 99995})
8348 10:58:58.187038 getrandom("\365\221\374m\360\235\27\330\264\223\365\363<6\302\324F\5\354Q|,\366\253\337u\226\265\345\250CA"..., 90, 0) = 90
The worst part of this is that the commit description claims this is a
workaround for libgcrypt using /dev/random, but it's not using
/dev/random --- it's using getrandom, and it pointlessly opened
/dev/urandom first (having never opened /dev/random).
This looks like a classic case of Lotus Notes / Websphere disease ---
to many d*mned layers of abstraction....
- Ted
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2016-12-28 17:21 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-28 7:23 [RFH] gpg --import entropy while running tests Jeff King
2016-12-28 8:02 ` Jeff King
2016-12-28 8:39 ` Jeff King
2016-12-28 16:05 ` Theodore Ts'o
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.