* [PATCH][RFC] rslib: Remove VLAs by setting upper bound on nroots
@ 2018-03-09 14:46 Kees Cook
  2018-03-09 15:49 ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Kees Cook @ 2018-03-09 14:46 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, Segher Boessenkool, kernel-hardening

Avoid VLAs[1] by always allocating the upper bound of stack space
needed. The existing users of rslib appear to max out at 32 roots,
so use that as the upper bound.

Alternative: make init_rs() a true caller-instance and pre-allocate
the workspaces. Will this need locking or are the callers already
single-threaded in their use of librs?

Using kmalloc in this path doesn't look great, especially since at
least one caller (pstore) is sensitive to allocations during rslib
usage (it expects to run it during an Oops, for example).

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 lib/reed_solomon/decode_rs.c    | 7 ++++---
 lib/reed_solomon/reed_solomon.c | 5 ++++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c
index 0ec3f257ffdf..3e3becb836a6 100644
--- a/lib/reed_solomon/decode_rs.c
+++ b/lib/reed_solomon/decode_rs.c
@@ -31,9 +31,10 @@
 	 * of nroots is 8. So the necessary stack size will be about
 	 * 220 bytes max.
 	 */
-	uint16_t lambda[nroots + 1], syn[nroots];
-	uint16_t b[nroots + 1], t[nroots + 1], omega[nroots + 1];
-	uint16_t root[nroots], reg[nroots + 1], loc[nroots];
+	uint16_t lambda[RS_MAX_ROOTS + 1], syn[RS_MAX_ROOTS];
+	uint16_t b[RS_MAX_ROOTS + 1], t[RS_MAX_ROOTS + 1];
+	uint16_t omega[RS_MAX_ROOTS + 1], root[RS_MAX_ROOTS];
+	uint16_t reg[RS_MAX_ROOTS + 1], loc[RS_MAX_ROOTS];
 	int count = 0;
 	uint16_t msk = (uint16_t) rs->nn;
 
diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c
index 06d04cfa9339..1ad9094ddf66 100644
--- a/lib/reed_solomon/reed_solomon.c
+++ b/lib/reed_solomon/reed_solomon.c
@@ -51,6 +51,9 @@ static LIST_HEAD (rslist);
 /* Protection for the list */
 static DEFINE_MUTEX(rslistlock);
 
+/* Ultimately controls the upper bounds of the on-stack buffers. */
+#define RS_MAX_ROOTS	32
+
 /**
  * rs_init - Initialize a Reed-Solomon codec
  * @symsize:	symbol size, bits (1-8)
@@ -210,7 +213,7 @@ static struct rs_control *init_rs_internal(int symsize, int gfpoly,
     		return NULL;
 	if (prim <= 0 || prim >= (1<<symsize))
     		return NULL;
-	if (nroots < 0 || nroots >= (1<<symsize))
+	if (nroots < 0 || nroots >= (1<<symsize) || nroots > RS_MAX_ROOTS)
 		return NULL;
 
 	mutex_lock(&rslistlock);
-- 
2.7.4


-- 
Kees Cook
Pixel Security
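
For reference, a minimal user-space sketch of what the fixed bound costs in
stack space. It counts only the eight uint16_t work arrays touched by the
decode_rs.c hunk above (other locals are ignored). The helper names
rs_decode_work_bytes() and rs_nroots_valid() are hypothetical and exist only
for this illustration; the range test copies the condition added to
init_rs_internal().

```c
#include <stddef.h>
#include <stdint.h>

/* Bound proposed in the patch above. */
#define RS_MAX_ROOTS 32

/*
 * Bytes used by the eight uint16_t work arrays in decode_rs.c for a
 * given root count: five arrays hold nroots + 1 entries (lambda, b, t,
 * omega, reg) and three hold nroots entries (syn, root, loc).
 */
static size_t rs_decode_work_bytes(unsigned int nroots)
{
	return (5 * ((size_t)nroots + 1) + 3 * (size_t)nroots) *
	       sizeof(uint16_t);
}

/* Hypothetical helper mirroring the check added to init_rs_internal(). */
static int rs_nroots_valid(int symsize, int nroots)
{
	return !(nroots < 0 || nroots >= (1 << symsize) ||
		 nroots > RS_MAX_ROOTS);
}
```

Under these assumptions the arrays take 16 * nroots + 10 bytes, so a bound of
32 reserves 522 bytes regardless of how many roots the caller configured.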


* Re: [PATCH][RFC] rslib: Remove VLAs by setting upper bound on nroots
  2018-03-09 14:46 [PATCH][RFC] rslib: Remove VLAs by setting upper bound on nroots Kees Cook
@ 2018-03-09 15:49 ` Thomas Gleixner
  2018-03-09 20:57   ` Kees Cook
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Gleixner @ 2018-03-09 15:49 UTC (permalink / raw)
  To: Kees Cook; +Cc: linux-kernel, Segher Boessenkool, kernel-hardening

On Fri, 9 Mar 2018, Kees Cook wrote:

> Avoid VLAs[1] by always allocating the upper bound of stack space
> needed. The existing users of rslib appear to max out at 32 roots,
> so use that as the upper bound.

I think 32 is plenty. Do we actually have a user with 32?

> Alternative: make init_rs() a true caller-instance and pre-allocate
> the workspaces. Will this need locking or are the callers already
> single-threaded in their use of librs?

init_rs() is an init function which needs to be invoked _before_ the
decoder/encoder can be used.

The way it works today is that it can share the rs_control between users to
avoid duplicating the polynomial arrays and their setup.

So we might change how rs_control works and allocate rs_control for each
invocation of init_rs(). That means we need two data structures:

Rename rs_control to rs_poly and just use that internally for sharing the
polynomial arrays.

rs_control then becomes:

struct rs_control {
	struct rs_poly	*poly;
	uint16_t	lambda[MAX_ROOTS + 1];
	....
	uint16_t	loc[MAX_ROOTS];
};

But as you said that requires serialization or separation at the usage
sites.

drivers/mtd/nand/* would either need a mutex or allocate one rs_control per
instance. Simple enough to do.

drivers/md/dm-verity-fec.c looks like it's allocating a dm control struct
for each worker thread, so that should just require allocating one
rs_control per worker then.

pstore only has an issue in case of OOPS. A simple solution would be to
allocate two rs_control structs, one for regular usage and one for the OOPS
case. Not sure if that covers all possible problems, so that needs more
thought.

Thanks,

	tglx
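
A minimal user-space sketch of the split outlined above, assuming MAX_ROOTS
equals the bound from the patch and that the elided members are the work
arrays currently declared on the stack in decode_rs.c. The reference counter
and the rs_control_alloc()/rs_control_free() helpers are hypothetical, calloc()
stands in for the kernel allocator, and no locking is shown, so callers would
still have to serialize access or hold one rs_control each.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_ROOTS 32	/* assumption: same value as RS_MAX_ROOTS */

/* Shared and read-only after setup: the part of rs_control reused today. */
struct rs_poly {
	int symsize;
	int nroots;
	int users;	/* illustrative reference count, not locked */
	/* polynomial arrays elided, as in the outline above */
};

/* Per-caller state: pointer to the shared tables plus private work space. */
struct rs_control {
	struct rs_poly *poly;
	uint16_t lambda[MAX_ROOTS + 1];
	uint16_t syn[MAX_ROOTS];
	uint16_t b[MAX_ROOTS + 1];
	uint16_t t[MAX_ROOTS + 1];
	uint16_t omega[MAX_ROOTS + 1];
	uint16_t root[MAX_ROOTS];
	uint16_t reg[MAX_ROOTS + 1];
	uint16_t loc[MAX_ROOTS];
};

/* Hypothetical: one control per caller, all pointing at the same rs_poly. */
static struct rs_control *rs_control_alloc(struct rs_poly *poly)
{
	struct rs_control *rs;

	if (!poly || poly->nroots < 0 || poly->nroots > MAX_ROOTS)
		return NULL;
	rs = calloc(1, sizeof(*rs));
	if (!rs)
		return NULL;
	rs->poly = poly;
	poly->users++;
	return rs;
}

/* Hypothetical: drop the caller's work space and its reference. */
static void rs_control_free(struct rs_control *rs)
{
	if (!rs)
		return;
	rs->poly->users--;
	free(rs);
}
```

With this layout the work buffers are allocated once at init time, so the
decode path itself needs neither VLAs nor allocations.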


* Re: [PATCH][RFC] rslib: Remove VLAs by setting upper bound on nroots
  2018-03-09 15:49 ` Thomas Gleixner
@ 2018-03-09 20:57   ` Kees Cook
  2018-03-09 22:57     ` Thomas Gleixner
  0 siblings, 1 reply; 4+ messages in thread
From: Kees Cook @ 2018-03-09 20:57 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Segher Boessenkool, Kernel Hardening

On Fri, Mar 9, 2018 at 7:49 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 9 Mar 2018, Kees Cook wrote:
>
>> Avoid VLAs[1] by always allocating the upper bound of stack space
>> needed. The existing users of rslib appear to max out at 32 roots,
>> so use that as the upper bound.
>
> I think 32 is plenty. Do we actually have a user with 32?

I found 24 as the max, but thought maybe 32 would be better?

drivers/md/dm-verity-fec.h:#define DM_VERITY_FEC_RSM            255
drivers/md/dm-verity-fec.h:#define DM_VERITY_FEC_MAX_RSN                253
drivers/md/dm-verity-fec.h:#define DM_VERITY_FEC_MIN_RSN                231     /* ~10% space overhead */
drivers/md/dm-verity-fec.c:

                if (sscanf(arg_value, "%hhu%c", &num_c, &dummy) != 1 || !num_c ||
                    num_c < (DM_VERITY_FEC_RSM - DM_VERITY_FEC_MAX_RSN) ||
                    num_c > (DM_VERITY_FEC_RSM - DM_VERITY_FEC_MIN_RSN)) {
                        ti->error = "Invalid " DM_VERITY_OPT_FEC_ROOTS;
                        return -EINVAL;
                }
                v->fec->roots = num_c;
...
drivers/md/dm-verity-fec.c:     return init_rs(8, 0x11d, 0, 1, v->fec->roots);

So this can be as much as 24.

drivers/mtd/nand/diskonchip.c:#define NROOTS 4
drivers/mtd/nand/diskonchip.c:  rs_decoder = init_rs(10, 0x409, FCR, 1, NROOTS);

4.

fs/pstore/ram.c:static int ramoops_ecc;
fs/pstore/ram.c:module_param_named(ecc, ramoops_ecc, int, 0600);
fs/pstore/ram.c:MODULE_PARM_DESC(ramoops_ecc,
fs/pstore/ram.c:        dummy_data->ecc_info.ecc_size = ramoops_ecc == 1 ? 16 : ramoops_ecc;
...
fs/pstore/ram.c:        cxt->ecc_info = pdata->ecc_info;
...
fs/pstore/ram_core.c:   prz->rs_decoder = init_rs(prz->ecc_info.symsize, prz->ecc_info.poly,
fs/pstore/ram_core.c-                             0, 1, prz->ecc_info.ecc_size);

The default "ecc enabled" mode for pstore is 16, but was made dynamic
a while ago. However, I've only ever seen people use a smaller number
of roots.
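
The dm-verity bound quoted above can be checked with a small user-space
sketch. The three constants are copied from the grep output; the functions
fec_roots_in_range() and fec_max_roots() are hypothetical names used only
here, and the range test reproduces the quoted condition without the sscanf()
parsing.

```c
#include <stdint.h>

/* Values quoted from drivers/md/dm-verity-fec.h above. */
#define DM_VERITY_FEC_RSM	255
#define DM_VERITY_FEC_MAX_RSN	253
#define DM_VERITY_FEC_MIN_RSN	231

/* Returns 1 if num_c passes the same range test as the quoted parser. */
static int fec_roots_in_range(uint8_t num_c)
{
	if (!num_c ||
	    num_c < (DM_VERITY_FEC_RSM - DM_VERITY_FEC_MAX_RSN) ||
	    num_c > (DM_VERITY_FEC_RSM - DM_VERITY_FEC_MIN_RSN))
		return 0;
	return 1;
}

/* Largest root count the range test accepts, found by exhaustive scan. */
static int fec_max_roots(void)
{
	int max = 0;

	for (int n = 0; n <= UINT8_MAX; n++)
		if (fec_roots_in_range((uint8_t)n))
			max = n;
	return max;
}
```

The accepted range is 255 - 253 = 2 up to 255 - 231 = 24 roots, which makes
dm-verity the largest of the three users listed.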

>> Alternative: make init_rs() a true caller-instance and pre-allocate
>> the workspaces. Will this need locking or are the callers already
>> single-threaded in their use of librs?
>
> init_rs() is an init function which needs to be invoked _before_ the
> decoder/encoder can be used.
>
> The way it works today is that it can share the rs_control between users to
> avoid duplicating the polynomial arrays and their setup.
>
> So we might change how rs_control works and allocate rs_control for each
> invocation of init_rs(). That means we need two data structures:
>
> Rename rs_control to rs_poly and just use that internally for sharing the
> polynomial arrays.
>
> rs_control then becomes:
>
> struct rs_control {
>         struct rs_poly  *poly;
>         uint16_t        lambda[MAX_ROOTS + 1];
>         ....
>         uint16_t        loc[MAX_ROOTS];
> };
>
> But as you said that requires serialization or separation at the usage
> sites.

Right. Not my favorite idea. :P

> drivers/mtd/nand/* would either need a mutex or allocate one rs_control per
> instance. Simple enough to do.
>
> drivers/md/dm-verity-fec.c looks like it's allocating a dm control struct
> for each worker thread, so that should just require allocating one
> rs_control per worker then.
>
> pstore only has an issue in case of OOPS. A simple solution would be to
> allocate two rs_control structs, one for regular usage and one for the OOPS
> case. Not sure if that covers all possible problems, so that needs more
> thought.

Maybe I should just go with 24 as the max, and if we have a case where
we need more, address it then?

-Kees

-- 
Kees Cook
Pixel Security


* Re: [PATCH][RFC] rslib: Remove VLAs by setting upper bound on nroots
  2018-03-09 20:57   ` Kees Cook
@ 2018-03-09 22:57     ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2018-03-09 22:57 UTC (permalink / raw)
  To: Kees Cook; +Cc: LKML, Segher Boessenkool, Kernel Hardening

On Fri, 9 Mar 2018, Kees Cook wrote:
> On Fri, Mar 9, 2018 at 7:49 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Fri, 9 Mar 2018, Kees Cook wrote:
> 
> Maybe I should just go with 24 as the max, and if we have a case where
> we need more, address it then?

Works for me.

Thanks,

	tglx

