All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFH] gpg --import entropy while running tests
@ 2016-12-28  7:23 Jeff King
  2016-12-28  8:02 ` Jeff King
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2016-12-28  7:23 UTC (permalink / raw)
  To: git

I noticed a few of our test scripts taking a long time to run, even
though they used to be quick. Here's one:

  $ time ./t7612-merge-verify-signatures.sh 
  ok 1 - create signed commits
  ok 2 - merge unsigned commit with verification
  ok 3 - merge commit with bad signature with verification
  ok 4 - merge commit with untrusted signature with verification
  ok 5 - merge signed commit with verification
  ok 6 - merge commit with bad signature without verification
  # passed all 6 test(s)
  1..6

  real    0m12.285s
  user    0m0.048s
  sys     0m0.044s


That's a lot of time not using any CPU. What's going on? Running with
"sh -x" shows that we spend most of the time in this line from
lib-gpg.sh:

  gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
	  "$TEST_DIRECTORY"/lib-gpg/keyring.gpg

And running gpg with "--debug-level guru" shows that we are blocking
while waiting for entropy. Has anybody else seen this? I feel like I
noticed it starting a few weeks ago, and indeed dropping back to gpg
2.0.26 (from 2.1.17) makes the problem go away.

Is it a bug in gpg (oddly, the kernel reports lots of entropy available,
and generating the signatures themselves is quite fast)? Or is the new
version doing something special in the import process that we need to
work around or disable?

The reason we use --import at all is to handle differences in the .gnupg
format between versions 1 and 2. So the nuclear option would be to just
carry pre-made .gnupg directories for _both_ versions in our test suite,
and pick the appropriate one based on the output of "gpg --version".
That feels like a hack, but it gives us a lot of control.

I'd love it if we could figure out a way of making --import reliably
fast, though.

-Peff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFH] gpg --import entropy while running tests
  2016-12-28  7:23 [RFH] gpg --import entropy while running tests Jeff King
@ 2016-12-28  8:02 ` Jeff King
  2016-12-28  8:39   ` Jeff King
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2016-12-28  8:02 UTC (permalink / raw)
  To: git

On Wed, Dec 28, 2016 at 02:23:03AM -0500, Jeff King wrote:

> That's a lot of time not using any CPU. What's going on? Running with
> "sh -x" shows that we spend most of the time in this line from
> lib-gpg.sh:
> 
>   gpg --homedir "${GNUPGHOME}" 2>/dev/null --import \
> 	  "$TEST_DIRECTORY"/lib-gpg/keyring.gpg
> 
> And running gpg with "--debug-level guru" shows that we are blocking
> while waiting for entropy. Has anybody else seen this? I feel like I
> noticed it starting a few weeks ago, and indeed dropping back to gpg
> 2.0.26 (from 2.1.17) makes the problem go away.
> 
> Is it a bug in gpg (oddly, the kernel reports lots of entropy available,
> and generating the signatures themselves is quite fast)? Or is the new
> version doing something special in the import process that we need to
> work around or disable?

Answering my own question (somewhat): this is bisectable in the gnupg
repository, and it turns out to be caused by their 4473db1ef (agent:
Kludge to mitigate blocking calls in Libgcrypt., 2016-11-11), which
introduces a 100ms sleep (yuck) that is presumably triggering way more
than it needs to. More details at:

  https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=commit;h=4473db1ef24031ff4e26c9a9de95dbe898ed2b97

So this does seem like a gpg bug.

-Peff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFH] gpg --import entropy while running tests
  2016-12-28  8:02 ` Jeff King
@ 2016-12-28  8:39   ` Jeff King
  2016-12-28 16:05     ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2016-12-28  8:39 UTC (permalink / raw)
  To: git

On Wed, Dec 28, 2016 at 03:02:30AM -0500, Jeff King wrote:

> > Is it a bug in gpg (oddly, the kernel reports lots of entropy available,
> > and generating the signatures themselves is quite fast)? Or is the new
> > version doing something special in the import process that we need to
> > work around or disable?
> 
> Answering my own question (somewhat): this is bisectable in the gnupg
> repository, and it turns out to be caused by their 4473db1ef (agent:
> Kludge to mitigate blocking calls in Libgcrypt., 2016-11-11), which
> introduces a 100ms sleep (yuck) that is presumably triggering way more
> than it needs to. More details at:
> 
>   https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=commit;h=4473db1ef24031ff4e26c9a9de95dbe898ed2b97
> 
> So this does seem like a gpg bug.

I've submitted a bug report to gpg:

  https://bugs.gnupg.org/gnupg/issue2897

so we'll see what they say.

-Peff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFH] gpg --import entropy while running tests
  2016-12-28  8:39   ` Jeff King
@ 2016-12-28 16:05     ` Theodore Ts'o
  0 siblings, 0 replies; 4+ messages in thread
From: Theodore Ts'o @ 2016-12-28 16:05 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Dec 28, 2016 at 03:39:30AM -0500, Jeff King wrote:
> >   https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=commit;h=4473db1ef24031ff4e26c9a9de95dbe898ed2b97
> > 
> > So this does seem like a gpg bug.
> 
> I've submitted a bug report to gpg:
> 
>   https://bugs.gnupg.org/gnupg/issue2897
> 
> so we'll see what they say.

Yeah, they are definitely doing something very.... hard to explain.

Pid 8348 is the gpg-agent process which the main gpg program (pid
8344) connected to.  It starts trying to get randomness in response to
a KEYWRAP command:

8348  10:58:57.882909 access("/dev/random", R_OK) = 0
8348  10:58:57.883205 access("/dev/urandom", R_OK) = 0
8348  10:58:57.883472 open("/dev/urandom", O_RDONLY) = 9
8348  10:58:57.883729 fcntl(9, F_GETFD) = 0
8348  10:58:57.883914 fcntl(9, F_SETFD, FD_CLOEXEC) = 0

It opens /dev/urandom, but then never uses fd 9 ever again.  Instead,
it uses getrandom, but in a pretty silly fashion, with lots of sleeps
in between, and not between each progress report, either:

8348  10:58:57.884129 write(8, "S PROGRESS need_entropy X 30 120", 32 <unfinished ...>
8344  10:58:57.884338 <... read resumed> "S PROGRESS need_entropy X 30 120", 1002) = 32
8348  10:58:57.884424 <... write resumed> ) = 32
8344  10:58:57.884488 read(5,  <unfinished ...>
8348  10:58:57.884550 write(8, "\n", 1 <unfinished ...>
8344  10:58:57.884715 <... read resumed> "\n", 970) = 1
8348  10:58:57.884800 <... write resumed> ) = 1
8344  10:58:57.884883 read(5,  <unfinished ...>
8348  10:58:57.884951 nanosleep({0, 100000000}, NULL) = 0
8348  10:58:57.985363 select(10, [9], NULL, NULL, {0, 100000}) = 1 (in [9], left {0, 99994})
8348  10:58:57.985593 getrandom("&\275\354^\256\320\3w\21:R]`eJ\t\t\350\245\202>\255\237\324\324\340\24^c\323\210\376"..., 90, 0) = 90
8348  10:58:57.985751 write(8, "S PROGRESS need_entropy X 120 12"..., 33) = 33
8344  10:58:57.985885 <... read resumed> "S PROGRESS need_entropy X 120 12"..., 1002) = 33
8348  10:58:57.985934 write(8, "\n", 1 <unfinished ...>
8344  10:58:57.985982 read(5,  <unfinished ...>
8348  10:58:57.986015 <... write resumed> ) = 1
8344  10:58:57.986048 <... read resumed> "\n", 969) = 1
8348  10:58:57.986090 nanosleep({0, 100000000},  <unfinished ...>
8344  10:58:57.986142 read(5,  <unfinished ...>
8348  10:58:58.086253 <... nanosleep resumed> NULL) = 0
8348  10:58:58.086370 write(8, "S PROGRESS need_entropy X 30 120", 32) = 32
8344  10:58:58.086502 <... read resumed> "S PROGRESS need_entropy X 30 120", 1002) = 32
8348  10:58:58.086541 write(8, "\n", 1 <unfinished ...>
8344  10:58:58.086579 read(5,  <unfinished ...>
8348  10:58:58.086604 <... write resumed> ) = 1
8344  10:58:58.086630 <... read resumed> "\n", 970) = 1
8348  10:58:58.086661 nanosleep({0, 100000000},  <unfinished ...>
8344  10:58:58.086703 read(5,  <unfinished ...>
8348  10:58:58.186815 <... nanosleep resumed> NULL) = 0
8348  10:58:58.186894 select(10, [9], NULL, NULL, {0, 100000}) = 1 (in [9], left {0, 99995})
8348  10:58:58.187038 getrandom("\365\221\374m\360\235\27\330\264\223\365\363<6\302\324F\5\354Q|,\366\253\337u\226\265\345\250CA"..., 90, 0) = 90

The worst part of this is that the commit description claims this is a
workaround for libgcrypt using /dev/random, but it's not using
/dev/random --- it's using getrandom, and it pointlessly opened
/dev/urandom first (having never opened /dev/random).

This looks like a classic case of Lotus Notes / Websphere disease ---
to many d*mned layers of abstraction....

						- Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-12-28 17:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-28  7:23 [RFH] gpg --import entropy while running tests Jeff King
2016-12-28  8:02 ` Jeff King
2016-12-28  8:39   ` Jeff King
2016-12-28 16:05     ` Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.