linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] rslib: Several improvements
@ 2022-06-20  6:20 Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 1/6] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Zhang Boyang @ 2022-06-20  6:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook, Randy Dunlap


Hello,

I made several improvements to reed-solomon library.
Please have a look. :)

Thanks!

Changes in [PATCH v3]:
Fixed kernel-doc style. Thanks to Randy Dunlap :)
(But I decide to keep "a*b" instead of "a * b" because I think it's more
readable)
Reordered some patches to group similar things together.

Best Regards,
Zhang Boyang

===

Changes in [PATCH v2]:
Added more patches. Removed init_rs16(), since init_rs_non_canonical()
can do the same job.
Link: https://lore.kernel.org/all/20220617144624.158973-1-zhangboyang.id@gmail.com/

Changes in [PATCH v1]:
Added init_rs16().
Link: https://lore.kernel.org/all/20220606101901.83538-1-zhangboyang.id@gmail.com/

[RFC PATCH]:
Link: https://lore.kernel.org/all/20220605073857.126497-1-zhangboyang.id@gmail.com/



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/6] rslib: Fix incorrect documentation of rs_modnn()
  2022-06-20  6:20 [PATCH v3] rslib: Several improvements Zhang Boyang
@ 2022-06-20  6:20 ` Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 2/6] rslib: Replace hard-coded 1 with alpha_to[0] Zhang Boyang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Zhang Boyang @ 2022-06-20  6:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook, Randy Dunlap,
	Zhang Boyang

Previous documentation of rs_modnn() states simple arithmetic modulo
return a wrong result for values >= (3 * rs->nn). However, that is not
true. The rs_modnn() does the exactly same job as (x % rs->nn). This can
be proved from following loop invariants:

  while (x >= rs->nn) {
    x -= rs->nn; // (1)
    x = (x >> rs->mm) + (x & rs->nn); // (2)
  }

Let x0 denote the value of x before assignment. At (1), it is obvious
that x % nn == x0 % nn. At (2), because nn == ((1 << mm) - 1), we have

  x0 % nn == x0 % nn
  x0 % nn == (((x0 >> mm) << mm) + (x0 & nn)) % nn
  x0 % nn == ((x0 >> mm) * (nn + 1) + (x0 & nn)) % nn
  x0 % nn == ((x0 >> mm) * ((nn + 1) % nn) + (x0 & nn)) % nn
  x0 % nn == ((x0 >> mm) * 1 + (x0 & nn)) % nn   // let's assume nn > 1
  x0 % nn == ((x0 >> mm) + (x0 & nn)) % nn
  x0 % nn == x % nn

When the loop exits, it is obvious that 0 <= x < nn, so the return value
must equal to (x % rs->nn).

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index 238bb85243d3..507fa14c03b2 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -116,8 +116,7 @@ void free_rs(struct rs_control *rs);
  *  rs->mm = number of bits per symbol
  *  rs->nn = (2^rs->mm) - 1
  *
- *  Simple arithmetic modulo would return a wrong result for values
- *  >= 3 * rs->nn
+ *  Calculate (x % rs->nn), without using a div instruction
 */
 static inline int rs_modnn(struct rs_codec *rs, int x)
 {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/6] rslib: Replace hard-coded 1 with alpha_to[0]
  2022-06-20  6:20 [PATCH v3] rslib: Several improvements Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 1/6] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
@ 2022-06-20  6:20 ` Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 3/6] rslib: Fix obvious documentation mistakes Zhang Boyang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Zhang Boyang @ 2022-06-20  6:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook, Randy Dunlap,
	Zhang Boyang

Currently the rslib allows customizing the finite field by the `gffunc'
parameter of init_rs_non_canonical(). However, there are several places
in rslib use hard-coded 1 instead of alpha_to[0], leading to errors if
gffunc(0) != 1. This patch fixes the problem. One of such `gffunc' might
be gffunc'(x) = swab16(gffunc(swab16(x))), as gffunc'(0) = swab16(1).
This special gffunc'(x) is useful when implementing RS coder for
16 bit foreign-endian symbols.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 lib/reed_solomon/decode_rs.c    | 4 ++--
 lib/reed_solomon/reed_solomon.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c
index 805de84ae83d..6c1d53d1b702 100644
--- a/lib/reed_solomon/decode_rs.c
+++ b/lib/reed_solomon/decode_rs.c
@@ -104,7 +104,7 @@
 
  decode:
 	memset(&lambda[1], 0, nroots * sizeof(lambda[0]));
-	lambda[0] = 1;
+	lambda[0] = alpha_to[0];
 
 	if (no_eras > 0) {
 		/* Init lambda to be the erasure locator polynomial */
@@ -198,7 +198,7 @@
 	memcpy(&reg[1], &lambda[1], nroots * sizeof(reg[0]));
 	count = 0;		/* Number of roots of lambda(x) */
 	for (i = 1, k = iprim - 1; i <= nn; i++, k = rs_modnn(rs, k + iprim)) {
-		q = 1;		/* lambda[0] is always 0 */
+		q = alpha_to[0];	/* lambda[0] is always 0 */
 		for (j = deg_lambda; j > 0; j--) {
 			if (reg[j] != nn) {
 				reg[j] = rs_modnn(rs, reg[j] + j);
diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c
index bbc01bad3053..bb4f44c8edba 100644
--- a/lib/reed_solomon/reed_solomon.c
+++ b/lib/reed_solomon/reed_solomon.c
@@ -131,9 +131,9 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 	rs->iprim = iprim / prim;
 
 	/* Form RS code generator polynomial from its roots */
-	rs->genpoly[0] = 1;
+	rs->genpoly[0] = rs->alpha_to[0];
 	for (i = 0, root = fcr * prim; i < nroots; i++, root += prim) {
-		rs->genpoly[i + 1] = 1;
+		rs->genpoly[i + 1] = rs->alpha_to[0];
 		/* Multiply rs->genpoly[] by  @**(root + x) */
 		for (j = i; j > 0; j--) {
 			if (rs->genpoly[j] != 0) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 3/6] rslib: Fix obvious documentation mistakes
  2022-06-20  6:20 [PATCH v3] rslib: Several improvements Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 1/6] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 2/6] rslib: Replace hard-coded 1 with alpha_to[0] Zhang Boyang
@ 2022-06-20  6:20 ` Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 4/6] rslib: Fix kernel-doc style for rs_modnn() Zhang Boyang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Zhang Boyang @ 2022-06-20  6:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook, Randy Dunlap,
	Zhang Boyang

This patch fixes some obvious documentation mistakes.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h           | 4 ++--
 lib/reed_solomon/reed_solomon.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index 507fa14c03b2..cd0b5a7a5698 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -19,8 +19,8 @@
  *
  * @mm:		Bits per symbol
  * @nn:		Symbols per block (= (1<<mm)-1)
- * @alpha_to:	log lookup table
- * @index_of:	Antilog lookup table
+ * @alpha_to:	exp() lookup table
+ * @index_of:	log() lookup table
  * @genpoly:	Generator polynomial
  * @nroots:	Number of generator roots = number of parity symbols
  * @fcr:	First consecutive root, index form
diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c
index bb4f44c8edba..da46026a60b8 100644
--- a/lib/reed_solomon/reed_solomon.c
+++ b/lib/reed_solomon/reed_solomon.c
@@ -56,7 +56,7 @@ static DEFINE_MUTEX(rslistlock);
 
 /**
  * codec_init - Initialize a Reed-Solomon codec
- * @symsize:	symbol size, bits (1-8)
+ * @symsize:	the symbol size (number of bits)
  * @gfpoly:	Field generator polynomial coefficients
  * @gffunc:	Field generator function
  * @fcr:	first root of RS code generator polynomial, index form
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 4/6] rslib: Fix kernel-doc style for rs_modnn()
  2022-06-20  6:20 [PATCH v3] rslib: Several improvements Zhang Boyang
                   ` (2 preceding siblings ...)
  2022-06-20  6:20 ` [PATCH v3 3/6] rslib: Fix obvious documentation mistakes Zhang Boyang
@ 2022-06-20  6:20 ` Zhang Boyang
  2022-06-21  0:18   ` Randy Dunlap
  2022-06-20  6:20 ` [PATCH v3 5/6] rslib: Improve the performance of encode_rs.c Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 6/6] rslib: Fix integer overflow on large fcr or prim Zhang Boyang
  5 siblings, 1 reply; 8+ messages in thread
From: Zhang Boyang @ 2022-06-20  6:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook, Randy Dunlap,
	Zhang Boyang

This patch fixes the style of kernel-doc of rs_modnn().

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index cd0b5a7a5698..e92923fff3bc 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -107,7 +107,8 @@ struct rs_control *init_rs_non_canonical(int symsize, int (*func)(int),
 /* Release a rs control structure */
 void free_rs(struct rs_control *rs);
 
-/** modulo replacement for galois field arithmetics
+/**
+ * rs_modnn() - Modulo replacement for galois field arithmetics
  *
  *  @rs:	Pointer to the RS codec
  *  @x:		the value to reduce
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 5/6] rslib: Improve the performance of encode_rs.c
  2022-06-20  6:20 [PATCH v3] rslib: Several improvements Zhang Boyang
                   ` (3 preceding siblings ...)
  2022-06-20  6:20 ` [PATCH v3 4/6] rslib: Fix kernel-doc style for rs_modnn() Zhang Boyang
@ 2022-06-20  6:20 ` Zhang Boyang
  2022-06-20  6:20 ` [PATCH v3 6/6] rslib: Fix integer overflow on large fcr or prim Zhang Boyang
  5 siblings, 0 replies; 8+ messages in thread
From: Zhang Boyang @ 2022-06-20  6:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook, Randy Dunlap,
	Zhang Boyang

This patch enhances the performance of RS encoder by following points:

1) Avoid memmove(). The shifting operation done by memmove() can be
   merged into the calculation loop above.

2) Introduce rs_modnn_fast(). The original rs_modnn() contains a loop
   which may be slow. Since (fb + genpoly[...]) is always strictly less
   than (2 * rs->nn), we can use a ternary operator to do the same
   calculation. The new faster function is named rs_modnn_fast(). The
   new rs_modnn_fast(x) requires 0 <= x < 2*nn, in contrast, original
   rs_modnn(x) only requires x >= 0. To make things clear, the
   documentation of original rs_modnn() is also updated.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h        | 15 ++++++++++++++-
 lib/reed_solomon/encode_rs.c | 21 ++++++++++-----------
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index e92923fff3bc..a277a178157b 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -111,7 +111,7 @@ void free_rs(struct rs_control *rs);
  * rs_modnn() - Modulo replacement for galois field arithmetics
  *
  *  @rs:	Pointer to the RS codec
- *  @x:		the value to reduce
+ *  @x:		x >= 0 ; the value to reduce
  *
  *  where
  *  rs->mm = number of bits per symbol
@@ -128,4 +128,17 @@ static inline int rs_modnn(struct rs_codec *rs, int x)
 	return x;
 }
 
+/**
+ * rs_modnn_fast() - Modulo replacement for galois field arithmetics
+ *
+ *  @rs:	Pointer to the RS codec
+ *  @x:		0 <= x < 2*nn ; the value to reduce
+ *
+ *  Same as rs_modnn(x), but faster, at the cost of limited value range of @x
+*/
+static inline int rs_modnn_fast(struct rs_codec *rs, int x)
+{
+	return x - rs->nn < 0 ? x : x - rs->nn;
+}
+
 #endif
diff --git a/lib/reed_solomon/encode_rs.c b/lib/reed_solomon/encode_rs.c
index 9112d46e869e..6e3847b17ad4 100644
--- a/lib/reed_solomon/encode_rs.c
+++ b/lib/reed_solomon/encode_rs.c
@@ -27,19 +27,18 @@
 
 	for (i = 0; i < len; i++) {
 		fb = index_of[((((uint16_t) data[i])^invmsk) & msk) ^ par[0]];
-		/* feedback term is non-zero */
 		if (fb != nn) {
-			for (j = 1; j < nroots; j++) {
-				par[j] ^= alpha_to[rs_modnn(rs, fb +
-							 genpoly[nroots - j])];
-			}
-		}
-		/* Shift */
-		memmove(&par[0], &par[1], sizeof(uint16_t) * (nroots - 1));
-		if (fb != nn) {
-			par[nroots - 1] = alpha_to[rs_modnn(rs,
-							    fb + genpoly[0])];
+			/* feedback term is non-zero */
+			for (j = 1; j < nroots; j++)
+				par[j - 1] = par[j] ^ alpha_to[rs_modnn_fast(rs,
+						      fb +
+						      genpoly[nroots - j])];
+			par[nroots - 1] = alpha_to[rs_modnn_fast(rs,
+					  fb +
+					  genpoly[0])];
 		} else {
+			for (j = 1; j < nroots; j++)
+				par[j - 1] = par[j];
 			par[nroots - 1] = 0;
 		}
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 6/6] rslib: Fix integer overflow on large fcr or prim
  2022-06-20  6:20 [PATCH v3] rslib: Several improvements Zhang Boyang
                   ` (4 preceding siblings ...)
  2022-06-20  6:20 ` [PATCH v3 5/6] rslib: Improve the performance of encode_rs.c Zhang Boyang
@ 2022-06-20  6:20 ` Zhang Boyang
  5 siblings, 0 replies; 8+ messages in thread
From: Zhang Boyang @ 2022-06-20  6:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook, Randy Dunlap,
	Zhang Boyang

Current rslib support symsize up to 16, so the max value of rs->nn can
be 0xFFFF. Since fcr <= nn, prim <= nn, multiplications on them can
overflow easily, e.g. fcr*root[j], fcr*prim.

This patch fixes these problems by introducing rs_modnn_mul(a, b). This
function is same as rs_modnn(a*b) but it will avoid overflow when
calculating a*b. It requires 0 <= a <= nn && 0 <= b <= nn, because it
use uint32_t to do the multiplication internally, so there will be no
overflow as long as 0 <= a <= nn <= 0xFFFF && 0 <= b <= nn <= 0xFFFF. In
fact, if we use `unsigned int' everywhere, there is no need to have
rs_modnn_mul(). But the `unsigned int' approach has poor scalability and
it may bring us to the mess of signed and unsigned integers.

With rs_modnn(), the intermediate result is now restricted to [0, nn).
This enables us to use rs_modnn_fast(a+b) to replace rs_modnn(a+b), as
long as 0 <= a+b < 2*nn. The most common case is one addend in [0, nn]
and the other addend in [0, nn). The examples of values in [0, nn] are
fcr, prim, indexes taken from rs->index_of[0...nn], etc. The examples of
values in [0, nn) are results from rs_modnn(), indexes taken from
rs->index_of[1...nn], etc.

Since the roots of RS generator polynomial, i.e. (fcr+i)*prim%nn, is
often used. It's now precomputed into rs->genroot[], to avoid writing
rs_modnn_mul(rs, rs_modnn_fast(rs, fcr + i), prim) everywhere.

The algorithm of searching for rs->iprim is also changed. Instead of
searching for (1+what*nn)%prim == 0, then iprim = (1+what*nn)/prim, it
now searches for iprim*prim%nn == 1 directly.

A new test case is also added to test_rslib.c to ensure correctness.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h           | 23 +++++++++++++
 lib/reed_solomon/decode_rs.c    | 60 +++++++++++++++++++--------------
 lib/reed_solomon/reed_solomon.c | 30 ++++++++++++-----
 lib/reed_solomon/test_rslib.c   |  8 ++---
 4 files changed, 83 insertions(+), 38 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index a277a178157b..a11ea5e8eb14 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -22,6 +22,7 @@
  * @alpha_to:	exp() lookup table
  * @index_of:	log() lookup table
  * @genpoly:	Generator polynomial
+ * @genroot:	Roots of generator polynomial, index form
  * @nroots:	Number of generator roots = number of parity symbols
  * @fcr:	First consecutive root, index form
  * @prim:	Primitive element, index form
@@ -37,6 +38,7 @@ struct rs_codec {
 	uint16_t	*alpha_to;
 	uint16_t	*index_of;
 	uint16_t	*genpoly;
+	uint16_t	*genroot;
 	int		nroots;
 	int		fcr;
 	int		prim;
@@ -128,6 +130,27 @@ static inline int rs_modnn(struct rs_codec *rs, int x)
 	return x;
 }
 
+/**
+ * rs_modnn_mul() - Modulo replacement for galois field arithmetics
+ *
+ *  @rs:	Pointer to the RS codec
+ *  @a:		0 <= a <= nn ; a*b is the value to reduce
+ *  @b:		0 <= b <= nn ; a*b is the value to reduce
+ *
+ *  Same as rs_modnn(a*b), but avoid integer overflow when calculating a*b
+*/
+static inline int rs_modnn_mul(struct rs_codec *rs, int a, int b)
+{
+	/* nn <= 0xFFFF, so (a * b) will not overflow uint32_t */
+	uint32_t x = (uint32_t)a * (uint32_t)b;
+	uint32_t nn = (uint32_t)rs->nn;
+	while (x >= nn) {
+		x -= nn;
+		x = (x >> rs->mm) + (x & nn);
+	}
+	return (int)x;
+}
+
 /**
  * rs_modnn_fast() - Modulo replacement for galois field arithmetics
  *
diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c
index 6c1d53d1b702..3387465ab429 100644
--- a/lib/reed_solomon/decode_rs.c
+++ b/lib/reed_solomon/decode_rs.c
@@ -20,6 +20,7 @@
 	int iprim = rs->iprim;
 	uint16_t *alpha_to = rs->alpha_to;
 	uint16_t *index_of = rs->index_of;
+	uint16_t *genroot = rs->genroot;
 	uint16_t u, q, tmp, num1, num2, den, discr_r, syn_error;
 	int count = 0;
 	int num_corrected;
@@ -69,8 +70,8 @@
 			} else {
 				syn[i] = ((((uint16_t) data[j]) ^
 					   invmsk) & msk) ^
-					alpha_to[rs_modnn(rs, index_of[syn[i]] +
-						       (fcr + i) * prim)];
+					alpha_to[rs_modnn_fast(rs,
+						index_of[syn[i]] + genroot[i])];
 			}
 		}
 	}
@@ -81,8 +82,8 @@
 				syn[i] = ((uint16_t) par[j]) & msk;
 			} else {
 				syn[i] = (((uint16_t) par[j]) & msk) ^
-					alpha_to[rs_modnn(rs, index_of[syn[i]] +
-						       (fcr+i)*prim)];
+					alpha_to[rs_modnn_fast(rs,
+						index_of[syn[i]] + genroot[i])];
 			}
 		}
 	}
@@ -108,15 +109,17 @@
 
 	if (no_eras > 0) {
 		/* Init lambda to be the erasure locator polynomial */
-		lambda[1] = alpha_to[rs_modnn(rs,
-					prim * (nn - 1 - (eras_pos[0] + pad)))];
+		lambda[1] = alpha_to[rs_modnn_mul(rs,
+					 prim, (nn - 1 - (eras_pos[0] + pad)))];
 		for (i = 1; i < no_eras; i++) {
-			u = rs_modnn(rs, prim * (nn - 1 - (eras_pos[i] + pad)));
+			u = rs_modnn_mul(rs,
+					 prim, (nn - 1 - (eras_pos[i] + pad)));
 			for (j = i + 1; j > 0; j--) {
 				tmp = index_of[lambda[j - 1]];
 				if (tmp != nn) {
 					lambda[j] ^=
-						alpha_to[rs_modnn(rs, u + tmp)];
+						alpha_to[rs_modnn_fast(rs,
+							 u + tmp)];
 				}
 			}
 		}
@@ -137,9 +140,9 @@
 		for (i = 0; i < r; i++) {
 			if ((lambda[i] != 0) && (s[r - i - 1] != nn)) {
 				discr_r ^=
-					alpha_to[rs_modnn(rs,
-							  index_of[lambda[i]] +
-							  s[r - i - 1])];
+					alpha_to[rs_modnn_fast(rs,
+						 index_of[lambda[i]] +
+						 s[r - i - 1])];
 			}
 		}
 		discr_r = index_of[discr_r];	/* Index form */
@@ -153,8 +156,8 @@
 			for (i = 0; i < nroots; i++) {
 				if (b[i] != nn) {
 					t[i + 1] = lambda[i + 1] ^
-						alpha_to[rs_modnn(rs, discr_r +
-								  b[i])];
+						alpha_to[rs_modnn_fast(rs,
+							 discr_r + b[i])];
 				} else
 					t[i + 1] = lambda[i + 1];
 			}
@@ -166,8 +169,9 @@
 				 */
 				for (i = 0; i <= nroots; i++) {
 					b[i] = (lambda[i] == 0) ? nn :
-						rs_modnn(rs, index_of[lambda[i]]
-							 - discr_r + nn);
+						rs_modnn_fast(rs,
+						        index_of[lambda[i]] +
+							nn - discr_r);
 				}
 			} else {
 				/* 2 lines below: B(x) <-- x*B(x) */
@@ -197,11 +201,11 @@
 	/* Find roots of error+erasure locator polynomial by Chien search */
 	memcpy(&reg[1], &lambda[1], nroots * sizeof(reg[0]));
 	count = 0;		/* Number of roots of lambda(x) */
-	for (i = 1, k = iprim - 1; i <= nn; i++, k = rs_modnn(rs, k + iprim)) {
+	for (i = 1, k = iprim-1; i <= nn; i++, k = rs_modnn_fast(rs, k+iprim)) {
 		q = alpha_to[0];	/* lambda[0] is always 0 */
 		for (j = deg_lambda; j > 0; j--) {
 			if (reg[j] != nn) {
-				reg[j] = rs_modnn(rs, reg[j] + j);
+				reg[j] = rs_modnn_fast(rs, reg[j] + j);
 				q ^= alpha_to[reg[j]];
 			}
 		}
@@ -238,8 +242,8 @@
 		tmp = 0;
 		for (j = i; j >= 0; j--) {
 			if ((s[i - j] != nn) && (lambda[j] != nn))
-				tmp ^=
-				    alpha_to[rs_modnn(rs, s[i - j] + lambda[j])];
+				tmp ^= alpha_to[rs_modnn_fast(rs,
+						s[i - j] + lambda[j])];
 		}
 		omega[i] = index_of[tmp];
 	}
@@ -254,8 +258,9 @@
 		num1 = 0;
 		for (i = deg_omega; i >= 0; i--) {
 			if (omega[i] != nn)
-				num1 ^= alpha_to[rs_modnn(rs, omega[i] +
-							i * root[j])];
+				num1 ^= alpha_to[rs_modnn_fast(rs,
+						 omega[i] +
+						 rs_modnn_mul(rs, i, root[j]))];
 		}
 
 		if (num1 == 0) {
@@ -264,15 +269,18 @@
 			continue;
 		}
 
-		num2 = alpha_to[rs_modnn(rs, root[j] * (fcr - 1) + nn)];
+		num2 = alpha_to[rs_modnn_fast(rs,
+				rs_modnn_mul(rs, root[j], fcr) +
+				nn - root[j])];
 		den = 0;
 
 		/* lambda[i+1] for i even is the formal derivative
 		 * lambda_pr of lambda[i] */
 		for (i = min(deg_lambda, nroots - 1) & ~1; i >= 0; i -= 2) {
 			if (lambda[i + 1] != nn) {
-				den ^= alpha_to[rs_modnn(rs, lambda[i + 1] +
-						       i * root[j])];
+				den ^= alpha_to[rs_modnn_fast(rs,
+						lambda[i + 1] +
+						rs_modnn_mul(rs, i, root[j]))];
 			}
 		}
 
@@ -292,8 +300,8 @@
 			if (b[j] == 0)
 				continue;
 
-			k = (fcr + i) * prim * (nn-loc[j]-1);
-			tmp ^= alpha_to[rs_modnn(rs, index_of[b[j]] + k)];
+			k = rs_modnn_mul(rs, genroot[i], nn - loc[j] - 1);
+			tmp ^= alpha_to[rs_modnn_fast(rs, index_of[b[j]] + k)];
 		}
 
 		if (tmp != alpha_to[s[i]])
diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c
index da46026a60b8..2c86e4dfcbaa 100644
--- a/lib/reed_solomon/reed_solomon.c
+++ b/lib/reed_solomon/reed_solomon.c
@@ -100,6 +100,10 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 	if(rs->genpoly == NULL)
 		goto err;
 
+	rs->genroot = kmalloc_array(rs->nroots, sizeof(uint16_t), gfp);
+	if(rs->genroot == NULL)
+		goto err;
+
 	/* Generate Galois field lookup tables */
 	rs->index_of[0] = rs->nn;	/* log(zero) = -inf */
 	rs->alpha_to[rs->nn] = 0;	/* alpha**-inf = 0 */
@@ -126,26 +130,34 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 		goto err;
 
 	/* Find prim-th root of 1, used in decoding */
-	for(iprim = 1; (iprim % prim) != 0; iprim += rs->nn);
+	for (iprim = 1; rs_modnn_mul(rs, iprim, prim) != 1; iprim++);
 	/* prim-th root of 1, index form */
-	rs->iprim = iprim / prim;
+	rs->iprim = iprim;
+
+	/* Precompute generator polynomial roots */
+	root = rs_modnn_mul(rs, fcr, prim);
+	for (i = 0; i < nroots; i++) {
+		rs->genroot[i] = root; /*  = (fcr + i) * prim % nn  */
+		root = rs_modnn_fast(rs, root + prim);
+	}
 
 	/* Form RS code generator polynomial from its roots */
 	rs->genpoly[0] = rs->alpha_to[0];
-	for (i = 0, root = fcr * prim; i < nroots; i++, root += prim) {
+	for (i = 0; i < nroots; i++) {
+		root = rs->genroot[i];
 		rs->genpoly[i + 1] = rs->alpha_to[0];
 		/* Multiply rs->genpoly[] by  @**(root + x) */
 		for (j = i; j > 0; j--) {
 			if (rs->genpoly[j] != 0) {
-				rs->genpoly[j] = rs->genpoly[j -1] ^
-					rs->alpha_to[rs_modnn(rs,
+				rs->genpoly[j] = rs->genpoly[j - 1] ^
+					rs->alpha_to[rs_modnn_fast(rs,
 					rs->index_of[rs->genpoly[j]] + root)];
 			} else
 				rs->genpoly[j] = rs->genpoly[j - 1];
 		}
 		/* rs->genpoly[0] can never be zero */
 		rs->genpoly[0] =
-			rs->alpha_to[rs_modnn(rs,
+			rs->alpha_to[rs_modnn_fast(rs,
 				rs->index_of[rs->genpoly[0]] + root)];
 	}
 	/* convert rs->genpoly[] to index form for quicker encoding */
@@ -157,6 +169,7 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 	return rs;
 
 err:
+	kfree(rs->genroot);
 	kfree(rs->genpoly);
 	kfree(rs->index_of);
 	kfree(rs->alpha_to);
@@ -188,6 +201,7 @@ void free_rs(struct rs_control *rs)
 		kfree(cd->alpha_to);
 		kfree(cd->index_of);
 		kfree(cd->genpoly);
+		kfree(cd->genroot);
 		kfree(cd);
 	}
 	mutex_unlock(&rslistlock);
@@ -340,7 +354,7 @@ EXPORT_SYMBOL_GPL(encode_rs8);
  *  @data:	data field of a given type
  *  @par:	received parity data field
  *  @len:	data length
- *  @s: 	syndrome data field, must be in index form
+ *  @s: 	syndrome data field, must be in index form, 0 <= index <= nn
  *		(if NULL, syndrome is calculated)
  *  @no_eras:	number of erasures
  *  @eras_pos:	position of erasures, can be NULL
@@ -393,7 +407,7 @@ EXPORT_SYMBOL_GPL(encode_rs16);
  *  @data:	data field of a given type
  *  @par:	received parity data field
  *  @len:	data length
- *  @s: 	syndrome data field, must be in index form
+ *  @s: 	syndrome data field, must be in index form, 0 <= index <= nn
  *		(if NULL, syndrome is calculated)
  *  @no_eras:	number of erasures
  *  @eras_pos:	position of erasures, can be NULL
diff --git a/lib/reed_solomon/test_rslib.c b/lib/reed_solomon/test_rslib.c
index d9d1c33aebda..a03c7249f920 100644
--- a/lib/reed_solomon/test_rslib.c
+++ b/lib/reed_solomon/test_rslib.c
@@ -55,6 +55,7 @@ static struct etab Tab[] = {
 	{8,	0x11d,	1,	1,	30,	100	},
 	{8,	0x187,	112,	11,	32,	100	},
 	{9,	0x211,	1,	1,	33,	80	},
+	{16,  0x1ffed,	65534,	65534,	50,	5	},
 	{0, 0, 0, 0, 0, 0},
 };
 
@@ -232,9 +233,8 @@ static void compute_syndrome(struct rs_control *rsc, uint16_t *data,
 	struct rs_codec *rs = rsc->codec;
 	uint16_t *alpha_to = rs->alpha_to;
 	uint16_t *index_of = rs->index_of;
+	uint16_t *genroot = rs->genroot;
 	int nroots = rs->nroots;
-	int prim = rs->prim;
-	int fcr = rs->fcr;
 	int i, j;
 
 	/* Calculating syndrome */
@@ -245,8 +245,8 @@ static void compute_syndrome(struct rs_control *rsc, uint16_t *data,
 				syn[i] = data[j];
 			} else {
 				syn[i] = data[j] ^
-					alpha_to[rs_modnn(rs, index_of[syn[i]]
-						+ (fcr + i) * prim)];
+					alpha_to[rs_modnn_fast(rs,
+						index_of[syn[i]] + genroot[i])];
 			}
 		}
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 4/6] rslib: Fix kernel-doc style for rs_modnn()
  2022-06-20  6:20 ` [PATCH v3 4/6] rslib: Fix kernel-doc style for rs_modnn() Zhang Boyang
@ 2022-06-21  0:18   ` Randy Dunlap
  0 siblings, 0 replies; 8+ messages in thread
From: Randy Dunlap @ 2022-06-21  0:18 UTC (permalink / raw)
  To: Zhang Boyang, linux-kernel
  Cc: Ferdinand Blomqvist, Thomas Gleixner, Kees Cook



On 6/19/22 23:20, Zhang Boyang wrote:
> This patch fixes the style of kernel-doc of rs_modnn().
> 
> Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>

Thanks.

> ---
>  include/linux/rslib.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/rslib.h b/include/linux/rslib.h
> index cd0b5a7a5698..e92923fff3bc 100644
> --- a/include/linux/rslib.h
> +++ b/include/linux/rslib.h
> @@ -107,7 +107,8 @@ struct rs_control *init_rs_non_canonical(int symsize, int (*func)(int),
>  /* Release a rs control structure */
>  void free_rs(struct rs_control *rs);
>  
> -/** modulo replacement for galois field arithmetics
> +/**
> + * rs_modnn() - Modulo replacement for galois field arithmetics
>   *
>   *  @rs:	Pointer to the RS codec
>   *  @x:		the value to reduce

-- 
~Randy

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-06-21  0:20 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-20  6:20 [PATCH v3] rslib: Several improvements Zhang Boyang
2022-06-20  6:20 ` [PATCH v3 1/6] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
2022-06-20  6:20 ` [PATCH v3 2/6] rslib: Replace hard-coded 1 with alpha_to[0] Zhang Boyang
2022-06-20  6:20 ` [PATCH v3 3/6] rslib: Fix obvious documentation mistakes Zhang Boyang
2022-06-20  6:20 ` [PATCH v3 4/6] rslib: Fix kernel-doc style for rs_modnn() Zhang Boyang
2022-06-21  0:18   ` Randy Dunlap
2022-06-20  6:20 ` [PATCH v3 5/6] rslib: Improve the performance of encode_rs.c Zhang Boyang
2022-06-20  6:20 ` [PATCH v3 6/6] rslib: Fix integer overflow on large fcr or prim Zhang Boyang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).