All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul
@ 2021-05-31  2:01 hv90
  2021-06-01  9:56 ` Edward Cree
  2021-06-01 11:40 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: hv90 @ 2021-05-31  2:01 UTC (permalink / raw)
  To: ast
  Cc: bpf, ecree, Harishankar Vishwanathan, Matan Shachnai,
	Srinivas Narayana, Santosh Nagarakatte

From: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu>

This patch introduces a new algorithm for multiplication of tristate
numbers (tnums) that is provably sound. It is faster and more precise when
compared to the existing method.

Like the existing method, this new algorithm follows the long
multiplication algorithm. The idea is to generate partial products by
multiplying each bit in the multiplier (tnum a) with the multiplicand
(tnum b), and adding the partial products after appropriately bit-shifting
them. The new algorithm, however, uses just a single loop over the bits of
the multiplier (tnum a) and accumulates only the uncertain components of
the multiplicand (tnum b) into a mask-only tnum. The following paper
explains the algorithm in more detail: https://arxiv.org/abs/2105.05398.

A natural way to construct the tnum product is by performing a tnum
addition on all the partial products. This algorithm presents another
method of doing this: decompose each partial product into two tnums,
consisting of the values and the masks separately. The mask-sum is
accumulated within the loop in acc_m. The value-sum tnum is generated
using a.value * b.value. The tnum constructed by tnum addition of the
value-sum and the mask-sum contains all possible summations of concrete
values drawn from the partial product tnums pairwise. We prove this result
in the paper.

Our evaluations show that the new algorithm is overall more precise
(producing tnums with less uncertain components) than the existing method.
As an illustrative example, consider the input tnums A and B. The numbers
in the paranthesis correspond to (value;mask).

A                = 000000x1 (1;2)
B                = 0010011x (38;1)
A * B (existing) = xxxxxxxx (0;255)
A * B (new)      = 0x1xxxxx (32;95)

Importantly, we present a proof of soundness of the new algorithm in the
aforementioned paper. Additionally, we show that this new algorithm is
empirically faster than the existing method.

Co-developed-by: Matan Shachnai <m.shachnai@rutgers.edu>
Signed-off-by: Matan Shachnai <m.shachnai@rutgers.edu>
Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu>
---
 kernel/bpf/tnum.c | 41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index ceac5281bd31..3d7127f439a1 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -111,28 +111,31 @@ struct tnum tnum_xor(struct tnum a, struct tnum b)
 	return TNUM(v & ~mu, mu);
 }
 
-/* half-multiply add: acc += (unknown * mask * value).
- * An intermediate step in the multiply algorithm.
+/* Generate partial products by multiplying each bit in the multiplier (tnum a)
+ * with the multiplicand (tnum b), and add the partial products after
+ * appropriately bit-shifting them. Instead of directly performing tnum addition
+ * on the generated partial products, equivalenty, decompose each partial
+ * product into two tnums, consisting of the value-sum (acc_v) and the
+ * mask-sum (acc_m) and then perform tnum addition on them. The following paper
+ * explains the algorithm in more detail: https://arxiv.org/abs/2105.05398.
  */
-static struct tnum hma(struct tnum acc, u64 value, u64 mask)
-{
-	while (mask) {
-		if (mask & 1)
-			acc = tnum_add(acc, TNUM(0, value));
-		mask >>= 1;
-		value <<= 1;
-	}
-	return acc;
-}
-
 struct tnum tnum_mul(struct tnum a, struct tnum b)
 {
-	struct tnum acc;
-	u64 pi;
-
-	pi = a.value * b.value;
-	acc = hma(TNUM(pi, 0), a.mask, b.mask | b.value);
-	return hma(acc, b.mask, a.value);
+	u64 acc_v = a.value * b.value;
+	struct tnum acc_m = TNUM(0, 0);
+
+	while (a.value || a.mask) {
+		/* LSB of tnum a is a certain 1 */
+		if (a.value & 1)
+			acc_m = tnum_add(acc_m, TNUM(0, b.mask));
+		/* LSB of tnum a is uncertain */
+		else if (a.mask & 1)
+			acc_m = tnum_add(acc_m, TNUM(0, b.value | b.mask));
+		/* Note: no case for LSB is certain 0 */
+		a = tnum_rshift(a, 1);
+		b = tnum_lshift(b, 1);
+	}
+	return tnum_add(TNUM(acc_v, 0), acc_m);
 }
 
 /* Note that if a and b disagree - i.e. one has a 'known 1' where the other has
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul
  2021-05-31  2:01 [PATCH v2 bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul hv90
@ 2021-06-01  9:56 ` Edward Cree
  2021-06-01 11:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Edward Cree @ 2021-06-01  9:56 UTC (permalink / raw)
  To: hv90, ast
  Cc: bpf, ecree, Harishankar Vishwanathan, Matan Shachnai,
	Srinivas Narayana, Santosh Nagarakatte

On 31/05/2021 03:01, hv90@scarletmail.rutgers.edu wrote:
> From: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu>
> 
> This patch introduces a new algorithm for multiplication of tristate
> numbers (tnums) that is provably sound. It is faster and more precise when
> compared to the existing method.
> 
> Like the existing method, this new algorithm follows the long
> multiplication algorithm. The idea is to generate partial products by
> multiplying each bit in the multiplier (tnum a) with the multiplicand
> (tnum b), and adding the partial products after appropriately bit-shifting
> them. The new algorithm, however, uses just a single loop over the bits of
> the multiplier (tnum a) and accumulates only the uncertain components of
> the multiplicand (tnum b) into a mask-only tnum. The following paper
> explains the algorithm in more detail: https://arxiv.org/abs/2105.05398.
> 
> A natural way to construct the tnum product is by performing a tnum
> addition on all the partial products. This algorithm presents another
> method of doing this: decompose each partial product into two tnums,
> consisting of the values and the masks separately. The mask-sum is
> accumulated within the loop in acc_m. The value-sum tnum is generated
> using a.value * b.value. The tnum constructed by tnum addition of the
> value-sum and the mask-sum contains all possible summations of concrete
> values drawn from the partial product tnums pairwise. We prove this result
> in the paper.
> 
> Our evaluations show that the new algorithm is overall more precise
> (producing tnums with less uncertain components) than the existing method.
> As an illustrative example, consider the input tnums A and B. The numbers
> in the paranthesis correspond to (value;mask).
> 
> A                = 000000x1 (1;2)
> B                = 0010011x (38;1)
> A * B (existing) = xxxxxxxx (0;255)
> A * B (new)      = 0x1xxxxx (32;95)
> 
> Importantly, we present a proof of soundness of the new algorithm in the
> aforementioned paper. Additionally, we show that this new algorithm is
> empirically faster than the existing method.
> 
> Co-developed-by: Matan Shachnai <m.shachnai@rutgers.edu>
> Signed-off-by: Matan Shachnai <m.shachnai@rutgers.edu>
> Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
> Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
> Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
> Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
> Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu>

Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul
  2021-05-31  2:01 [PATCH v2 bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul hv90
  2021-06-01  9:56 ` Edward Cree
@ 2021-06-01 11:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-06-01 11:40 UTC (permalink / raw)
  To: HARISHANKAR VISHWANATHAN
  Cc: ast, bpf, ecree, harishankar.vishwanathan, m.shachnai,
	srinivas.narayana, santosh.nagarakatte

Hello:

This patch was applied to bpf/bpf-next.git (refs/heads/master):

On Sun, 30 May 2021 22:01:57 -0400 you wrote:
> From: Harishankar Vishwanathan <harishankar.vishwanathan@rutgers.edu>
> 
> This patch introduces a new algorithm for multiplication of tristate
> numbers (tnums) that is provably sound. It is faster and more precise when
> compared to the existing method.
> 
> Like the existing method, this new algorithm follows the long
> multiplication algorithm. The idea is to generate partial products by
> multiplying each bit in the multiplier (tnum a) with the multiplicand
> (tnum b), and adding the partial products after appropriately bit-shifting
> them. The new algorithm, however, uses just a single loop over the bits of
> the multiplier (tnum a) and accumulates only the uncertain components of
> the multiplicand (tnum b) into a mask-only tnum. The following paper
> explains the algorithm in more detail: https://arxiv.org/abs/2105.05398.
> 
> [...]

Here is the summary with links:
  - [v2,bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul
    https://git.kernel.org/bpf/bpf-next/c/05924717ac70

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-06-01 11:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-31  2:01 [PATCH v2 bpf-next] bpf: tnums: Provably sound, faster, and more precise algorithm for tnum_mul hv90
2021-06-01  9:56 ` Edward Cree
2021-06-01 11:40 ` patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.