All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicolai Stange <nstange@suse.de>
To: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: linux-crypto@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Alexander E. Patrakov" <patrakov@gmail.com>,
	"Ahmed S. Darwish" <darwish.07@gmail.com>,
	"Willy Tarreau" <w@1wt.eu>,
	"Matthew Garrett" <mjg59@srcf.ucam.org>,
	"Vito Caputo" <vcaputo@pengaru.com>,
	"Andreas Dilger" <adilger.kernel@dilger.ca>,
	"Jan Kara" <jack@suse.cz>, "Ray Strode" <rstrode@redhat.com>,
	"William Jon McCann" <mccann@jhu.edu>,
	zhangjs <zachary@baishancloud.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Florian Weimer" <fweimer@redhat.com>,
	"Lennart Poettering" <mzxreary@0pointer.de>,
	"Peter Matthias" <matthias.peter@bsi.bund.de>,
	"Marcelo Henrique Cerri" <marcelo.cerri@canonical.com>,
	"Roman Drahtmueller" <draht@schaltsekun.de>,
	"Neil Horman" <nhorman@redhat.com>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Julia Lawall" <julia.lawall@inria.fr>,
	"Dan Carpenter" <dan.carpenter@oracle.com>,
	"Andy Lavr" <andy.lavr@gmail.com>,
	"Eric Biggers" <ebiggers@kernel.org>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	"Stephan Müller" <smueller@chronox.de>,
	"Torsten Duwe" <duwe@suse.de>, "Petr Tesarik" <ptesarik@suse.cz>,
	"Nicolai Stange" <nstange@suse.de>
Subject: [RFC PATCH 36/41] random: optimize the APT's presearch
Date: Mon, 21 Sep 2020 09:58:52 +0200	[thread overview]
Message-ID: <20200921075857.4424-37-nstange@suse.de> (raw)
In-Reply-To: <20200921075857.4424-1-nstange@suse.de>

The Adaptive Proportion Test's presearch phase is supposed to determine the
sample value occurring more than half of the times among the first n1=7
events in a sequence, if there is any such one.

To this end, it maintains eight counters at struct health_test for counting
the numbers of ones observed at each of the eight resp. bit positions
within the sample values' octets. The idea is that if any sample value had
been encountered more than half of the time, it would dominate all these
counters and its value could be restored unambiguously from them.

For better reviewability, this had been implemented in the most
straightforward way:
- the counters had been represented as an array of eight u8s at struct
  health_test and
- both, the counter updating as well as the candidate extracion code had
  been implemented by means of a loop over the eight bit positions.

As this is all accessed from add_interrupt_randomness(), optimizations
won't harm. In particular, using a total of eight bytes for the bit
counters is wasteful and can be reduced to half of that, optimizing the
size of struct health_test, which in turn is stored as part of the per-CPU
struct fast_pool.

For counts up to n1=7, 3 bits would suffice. Rather than using an array of
eight u8s, store the bit counter within an u32's eight four-bit nibbles.

Even though it probably won't matter on average in practice, because the
APT presearch is run only for a small fraction of all IRQ events, reduce
the number of instructions issued from the bit counter maintenance and
candidate extraction code as well. If nothing else, this can potentially
reduce the maximum IRQ latency by a few cycles.

Namely, make the bit counter updating code in health_apt_presearch_update()
spread the eight bits from the input sample evenly across an u32. That is,
the bit at position i will end up at position 4*i. This can be achieved
within five binops and four shifts. This intermediate value can then get
subsequently added in a single operation to the "packed" bit counters kept
in struct health_test in order to conclude the operation.

As for the final candidate extraction in health_apt_presearch_finalize(),
remember that the i'th counter needs to get compared against
n1 / 2 = 7 / 2 in order to restore the i'th bit from the resulting
candidate value. The condition that one such bit counter is >= 4 is
equivalent to testing its bit at the 2nd position, counted from zero.
Thus, (->apt_presearch_bit_counters & 0x44444444) >> 2
will yield a value where the LSB from the i'th nibble, equals the i'th bit
from the result and everything else is unset. The final result can then be
obtained by "shrinking" this intermediate representation back into an u8.
In total, the candidate extraction can be achieved within a sequence of
seven binops and six shifts.

Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 drivers/char/random.c | 71 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 53 insertions(+), 18 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 75a103f24fea..2c744d2a9b26 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -882,7 +882,7 @@ static void discard_queued_entropy(struct entropy_store *r,
 struct health_test {
 	unsigned short apt_event_count;
 	union {
-		u8 apt_presearch_bit_counters[8];
+		u32 apt_presearch_bit_counters;
 		struct {
 			unsigned short apt_candidate_count;
 			u8 apt_candidate;
@@ -904,40 +904,75 @@ enum health_result {
 
 static void health_apt_presearch_update(struct health_test *h, u8 sample_delta)
 {
-	int i;
+	u32 encoded;
 
-	for (i = 0; i < 8; ++i) {
-		h->apt_presearch_bit_counters[i] = sample_delta & 0x1;
-		sample_delta >>= 1;
-	}
+	/*
+	 * Encode the sample octet by "widening" it into 8
+	 * nibbles. That is, bit i from the source will be assigned to
+	 * bit 4*i in the result. All other bits are set to zero.
+	 */
+	encoded = sample_delta;
+	encoded = (encoded & 0xf) | ((encoded >> 4) << 16);
+	encoded |= (encoded << 6);
+	encoded |= (encoded << 3);
+	encoded &= 0x11111111;
+
+	/*
+	 * The nibbles from ->apt_presearch_bit_counters, each
+	 * counting the number of occurences of 1s at the
+	 * corresponding bit position, don't overflow into each other.
+	 */
+	BUILD_BUG_ON(ilog2(HEALTH_APT_PRESEARCH_EVENT_COUNT) >= 4);
+	h->apt_presearch_bit_counters += encoded;
 }
 
 static void health_apt_presearch_finalize(struct health_test *h)
 {
-	int i;
+	u32 majority, decoded;
 
 	/*
 	 * If some event octet occurred more than half of the time,
 	 * i.e. more than HEALTH_APT_PRESEARCH_EVENT_COUNT / 2 times,
 	 * then its value can be restored unambigiously from the eight
-	 * ->apt_presearch_bit_counters each holding the count of 1s
-	 * encountered at the corresponding bit positions.
+	 * ->apt_presearch_bit_counters nibbles each holding the count
+	 * of 1s encountered at the corresponding bit positions.
+	 *
+	 * Because HEALTH_APT_PRESEARCH_EVENT_COUNT is a power of two
+	 * minus one, the condition
+	 * nibble >=  HEALTH_APT_PRESEARCH_EVENT_COUNT / 2
+	 * is true iff the nibble's bit at postion
+	 * ilog2(HEALTH_APT_PRESEARCH_EVENT_COUNT) is set.
 	 */
-	h->apt_candidate = 0;
-	for (i = 0; i < 8; ++i) {
-		if (h->apt_presearch_bit_counters[i] >=
-		    (HEALTH_APT_PRESEARCH_EVENT_COUNT + 1) / 2) {
-			h->apt_candidate |= 1 << i;
-		}
-	}
+	BUILD_BUG_ON(!is_power_of_2(HEALTH_APT_PRESEARCH_EVENT_COUNT + 1));
+#define MAJORITY_BIT ilog2(HEALTH_APT_PRESEARCH_EVENT_COUNT)
+#define SHL_AND_OR(a, shl) ((a) | ((a) << shl))
+#define SET_ALL_NIBBLES_TO(to) SHL_AND_OR(SHL_AND_OR(SHL_AND_OR(to, 16), 8), 4)
+#define MAJORITY_MASK SET_ALL_NIBBLES_TO(1 << MAJORITY_BIT)
+	majority = (h->apt_presearch_bit_counters & MAJORITY_MASK);
+	majority >>= MAJORITY_BIT;
+#undef MAJORITY_MASK
+#undef SET_ALL_NIBBLES_TO
+#undef MAJORITY_BIT
+
+	/*
+	 * Reverse the encoding from health_apt_update_presearch().
+	 * That is, "shrink" the eight nibbles back into an octet such
+	 * that the result's i'th bit is set to the i'th nibble's LSB.
+	 */
+	decoded = majority;
+	decoded |= (decoded >> 3);
+	decoded |= (decoded >> 3);
+	decoded |= (decoded >> 3);
+	decoded = (((decoded >> 16) << 4) | (decoded & 0xf)) & 0xff;
+
+	h->apt_candidate = decoded;
 	h->apt_candidate_count = 0;
 };
 
 static void health_apt_reset(struct health_test *h)
 {
 	h->apt_event_count = 0;
-	memset(h->apt_presearch_bit_counters, 0,
-		sizeof(h->apt_presearch_bit_counters));
+	h->apt_presearch_bit_counters = 0;
 }
 
 static enum health_result
-- 
2.26.2


  parent reply	other threads:[~2020-09-21  8:01 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-21  7:58 [DISCUSSION PATCH 00/41] random: possible ways towards NIST SP800-90B compliance Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 01/41] random: remove dead code in credit_entropy_bits() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 02/41] random: remove dead code for nbits < 0 " Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 03/41] random: prune dead assignment to entropy_bits " Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 04/41] random: drop 'reserved' parameter from extract_entropy() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 05/41] random: don't reset entropy to zero on overflow Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 06/41] random: factor the exponential approximation in credit_entropy_bits() out Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 07/41] random: let pool_entropy_delta() take nbits in units of 2^-ENTROPY_SHIFT Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 08/41] random: introduce __credit_entropy_bits_fast() for hot paths Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 09/41] random: protect ->entropy_count with the pool spinlock Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 10/41] random: implement support for delayed entropy dispatching Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 11/41] random: convert add_timer_randomness() to queued_entropy API Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 12/41] random: convert add_interrupt_randomness() " Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 13/41] random: convert try_to_generate_entropy() " Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 14/41] random: drop __credit_entropy_bits_fast() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 15/41] random: convert add_hwgenerator_randomness() to queued_entropy API Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 16/41] random: convert random_ioctl() " Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 17/41] random: drop credit_entropy_bits() and credit_entropy_bits_safe() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 18/41] random: move arch_get_random_seed() calls in crng_reseed() into own loop Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 19/41] random: reintroduce arch_has_random() + arch_has_random_seed() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 20/41] random: provide min_crng_reseed_pool_entropy() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 21/41] random: don't invoke arch_get_random_long() from add_interrupt_randomness() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 22/41] random: introduce arch_has_sp800_90b_random_seed() Nicolai Stange
2020-09-21 12:18   ` kernel test robot
2020-09-21  7:58 ` [RFC PATCH 23/41] random: don't award entropy to non-SP800-90B arch RNGs in FIPS mode Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 24/41] init: call time_init() before rand_initialize() Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 25/41] random: probe cycle counter resolution at initialization Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 26/41] random: implement support for evaluating larger fast_pool entropies Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 27/41] random: increase per-IRQ event entropy estimate if in FIPS mode Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 28/41] random: don't award entropy to disk + input events " Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 29/41] random: move definition of struct queued_entropy and related API upwards Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 30/41] random: add a queued_entropy instance to struct fast_pool Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 31/41] random: introduce struct health_test + health_test_reset() placeholders Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 32/41] random: introduce health test stub and wire it up Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 33/41] random: make health_test_process() maintain the get_cycles() delta Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 34/41] random: implement the "Adaptive Proportion" NIST SP800-90B health test Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 35/41] random: improve the APT's statistical power Nicolai Stange
2020-09-21  7:58 ` Nicolai Stange [this message]
2020-09-21  7:58 ` [RFC PATCH 37/41] random: implement the "Repetition Count" NIST SP800-90B health test Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 38/41] random: enable NIST SP800-90B startup tests Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 39/41] random: make the startup tests include muliple APT invocations Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 40/41] random: trigger startup health test on any failure of the health tests Nicolai Stange
2020-09-21  7:58 ` [RFC PATCH 41/41] random: lower per-IRQ entropy estimate upon health test failure Nicolai Stange
2020-09-21  8:09 ` [DISCUSSION PATCH 00/41] random: possible ways towards NIST SP800-90B compliance Jason A. Donenfeld
2020-09-21  8:40 ` Stephan Mueller
2020-09-22 13:23   ` Torsten Duwe
2020-09-22 16:21     ` Greg Kroah-Hartman
2020-09-22 17:48       ` Torsten Duwe
2020-10-02 12:38 ` Torsten Duwe
2020-10-02 13:15   ` Willy Tarreau
2020-10-02 13:33     ` Greg Kroah-Hartman
2020-10-02 14:05       ` Torsten Duwe
2020-10-02 13:56     ` Stephan Mueller
2020-10-16 17:26       ` Torsten Duwe
2020-10-19 19:28         ` [PATCH v36 00/13] /dev/random - a new approach Stephan Müller
2020-10-19 19:30           ` [PATCH v36 01/13] Linux Random Number Generator Stephan Müller
2020-10-19 19:31           ` [PATCH v36 02/13] LRNG - allocate one DRNG instance per NUMA node Stephan Müller
2020-10-19 19:32           ` [PATCH v36 03/13] LRNG - sysctls and /proc interface Stephan Müller
2020-10-19 19:32           ` [PATCH v36 04/13] LRNG - add switchable DRNG support Stephan Müller
2020-10-19 19:33           ` [PATCH v36 05/13] LRNG - add common generic hash support Stephan Müller
2020-10-19 19:34           ` [PATCH v36 06/13] crypto: DRBG - externalize DRBG functions for LRNG Stephan Müller
2020-10-19 19:34           ` [PATCH v36 07/13] LRNG - add SP800-90A DRBG extension Stephan Müller
2020-10-19 19:35           ` [PATCH v36 08/13] LRNG - add kernel crypto API PRNG extension Stephan Müller
2020-10-19 19:35           ` [PATCH v36 09/13] crypto: provide access to a static Jitter RNG state Stephan Müller
2020-10-19 19:36           ` [PATCH v36 10/13] LRNG - add Jitter RNG fast noise source Stephan Müller
2020-10-19 19:37           ` [PATCH v36 11/13] LRNG - add SP800-90B compliant health tests Stephan Müller
2020-10-19 19:37           ` [PATCH v36 12/13] LRNG - add interface for gathering of raw entropy Stephan Müller
2020-10-19 19:38           ` [PATCH v36 13/13] LRNG - add power-on and runtime self-tests Stephan Müller
2020-10-28 17:51           ` [PATCH v36 00/13] /dev/random - a new approach Torsten Duwe
2020-10-28 18:07             ` Greg Kroah-Hartman
2020-11-02 13:44               ` Torsten Duwe
2020-11-04 14:26                 ` Marcelo Henrique Cerri
2020-11-17 14:01                 ` Torsten Duwe
2020-11-10 10:22           ` Stephan Mueller
2020-10-02 13:35   ` [DISCUSSION PATCH 00/41] random: possible ways towards NIST SP800-90B compliance Van Leeuwen, Pascal
2020-10-02 14:04     ` Greg Kroah-Hartman
2020-10-02 14:34       ` Van Leeuwen, Pascal
2020-10-02 15:13         ` Greg Kroah-Hartman
2020-10-02 15:39           ` Van Leeuwen, Pascal
2020-10-02 16:30             ` Randy Dunlap
2020-10-02 18:14             ` Theodore Y. Ts'o
2020-10-02 19:09               ` Van Leeuwen, Pascal
2020-10-07  4:24   ` Eric Biggers
2020-10-07  5:52     ` Stephan Mueller
2020-10-07 10:38     ` Nicolai Stange

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200921075857.4424-37-nstange@suse.de \
    --to=nstange@suse.de \
    --cc=Jason@zx2c4.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=andy.lavr@gmail.com \
    --cc=arnd@arndb.de \
    --cc=dan.carpenter@oracle.com \
    --cc=darwish.07@gmail.com \
    --cc=draht@schaltsekun.de \
    --cc=duwe@suse.de \
    --cc=ebiederm@xmission.com \
    --cc=ebiggers@kernel.org \
    --cc=fweimer@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jack@suse.cz \
    --cc=julia.lawall@inria.fr \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=marcelo.cerri@canonical.com \
    --cc=matthias.peter@bsi.bund.de \
    --cc=mccann@jhu.edu \
    --cc=mjg59@srcf.ucam.org \
    --cc=mzxreary@0pointer.de \
    --cc=nhorman@redhat.com \
    --cc=patrakov@gmail.com \
    --cc=ptesarik@suse.cz \
    --cc=rdunlap@infradead.org \
    --cc=rstrode@redhat.com \
    --cc=smueller@chronox.de \
    --cc=tytso@mit.edu \
    --cc=vcaputo@pengaru.com \
    --cc=w@1wt.eu \
    --cc=zachary@baishancloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.