* [PATCH 2/2] zstd: use U16 data type for rankPos
From: Maninder Singh @ 2019-05-10 6:13 UTC
To: terrelln, herbert, davem, keescook, gustavo
Cc: linux-crypto, linux-kernel, a.sahrawat, pankaj.m, Maninder Singh,
Vaneet Narang
The values stored in the fields of the rankPos structure cannot exceed 512,
so the fields can be declared as U16 rather than U32.
This reduces the stack usage of HUF_sort from 256 bytes to 128 bytes.
original:
e92ddbf0 push {r4, r5, r6, r7, r8, r9, fp, ip, lr, pc}
e24cb004 sub fp, ip, #4
e24ddc01 sub sp, sp, #256 ; 0x100
changed:
e92ddbf0 push {r4, r5, r6, r7, r8, r9, fp, ip, lr, pc}
e24cb004 sub fp, ip, #4
e24dd080 sub sp, sp, #128 ; 0x80
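The saving follows directly from the size of the on-stack rank table. Below is a minimal sketch of the arithmetic, assuming the table in HUF_sort holds 32 rankPos entries (an assumption inferred from the 256-byte figure above). The names rankPos_u32, rankPos_u16, RANK_TABLE_ENTRIES, RANK_POS_MAX and the helper functions are hypothetical stand-ins for illustration, not identifiers from lib/zstd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two layouts; not the kernel's own typedefs. */
typedef struct { uint32_t base; uint32_t curr; } rankPos_u32; /* before the patch */
typedef struct { uint16_t base; uint16_t curr; } rankPos_u16; /* after the patch */

/* Assumed number of entries in the on-stack table (32 * 8 = 256 bytes). */
#define RANK_TABLE_ENTRIES 32

/* Upper bound on a field value, as stated in the commit message. */
#define RANK_POS_MAX 512

/* Bytes occupied by the table with 32-bit fields. */
size_t rank_table_bytes_u32(void)
{
	return sizeof(rankPos_u32[RANK_TABLE_ENTRIES]);
}

/* Bytes occupied by the table with 16-bit fields. */
size_t rank_table_bytes_u16(void)
{
	return sizeof(rankPos_u16[RANK_TABLE_ENTRIES]);
}

/* Narrow a position to 16 bits, asserting it stays within the stated bound. */
uint16_t rank_pos_narrow(uint32_t pos)
{
	assert(pos <= RANK_POS_MAX);
	return (uint16_t)pos;
}
```

Under these assumptions the two helpers return 256 and 128, matching the `sub sp, sp, #256` and `sub sp, sp, #128` instructions in the listings, and 512 is well below the 65535 limit of a 16-bit field.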
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
---
lib/zstd/huf_compress.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/zstd/huf_compress.c b/lib/zstd/huf_compress.c
index e727812..2203124 100644
--- a/lib/zstd/huf_compress.c
+++ b/lib/zstd/huf_compress.c
@@ -382,8 +382,8 @@ static U32 HUF_setMaxHeight(nodeElt *huffNode, U32 lastNonNull, U32 maxNbBits)
 }
 
 typedef struct {
-	U32 base;
-	U32 curr;
+	U16 base;
+	U16 curr;
 } rankPos;
 
 static void HUF_sort(nodeElt *huffNode, const U32 *count, U32 maxSymbolValue)
--
2.7.4
* RE: [PATCH 2/2] zstd: use U16 data type for rankPos
From: Vaneet Narang @ 2019-05-30 9:16 UTC
To: Maninder Singh, terrelln, herbert, davem, keescook, gustavo
Cc: linux-crypto, linux-kernel, AMIT SAHRAWAT, PANKAJ MISHRA, Vaneet Narang
[Reminder] Any Comments?
>rankPos structure variables value can not be more than 512.
>So it can easily be declared as U16 rather than U32.
>It will reduce stack usage of HUF_sort from 256 bytes to 128 bytes
>original:
>e24ddc01 sub sp, sp, #256 ; 0x100
>changed:
>e24dd080 sub sp, sp, #128 ; 0x80
Regards,
Vaneet Narang
* Re: [PATCH 2/2] zstd: use U16 data type for rankPos
From: Nick Terrell @ 2019-06-06 20:09 UTC
To: Maninder Singh
Cc: Herbert Xu, davem, keescook, gustavo, linux-crypto, linux-kernel,
a.sahrawat, pankaj.m, Vaneet Narang
> On May 9, 2019, at 11:13 PM, Maninder Singh <maninder1.s@samsung.com> wrote:
>
> rankPos structure variables value can not be more than 512.
> So it can easily be declared as U16 rather than U32.
>
> It will reduce stack usage of HUF_sort from 256 bytes to 128 bytes
>
> original:
> e92ddbf0 push {r4, r5, r6, r7, r8, r9, fp, ip, lr, pc}
> e24cb004 sub fp, ip, #4
> e24ddc01 sub sp, sp, #256 ; 0x100
>
> changed:
> e92ddbf0 push {r4, r5, r6, r7, r8, r9, fp, ip, lr, pc}
> e24cb004 sub fp, ip, #4
> e24dd080 sub sp, sp, #128 ; 0x80
>
>
> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
> ---
> lib/zstd/huf_compress.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/zstd/huf_compress.c b/lib/zstd/huf_compress.c
> index e727812..2203124 100644
> --- a/lib/zstd/huf_compress.c
> +++ b/lib/zstd/huf_compress.c
> @@ -382,8 +382,8 @@ static U32 HUF_setMaxHeight(nodeElt *huffNode, U32 lastNonNull, U32 maxNbBits)
> }
>
> typedef struct {
> - U32 base;
> - U32 curr;
> + U16 base;
> + U16 curr;
> } rankPos;
This seems fine to me. I measured zstd's performance in userspace with this change,
and there is a ~1% speed regression for level 1. We wouldn't take this patch there,
but in the kernel it makes sense to me.
This function is called by HUF_buildCTable_wksp() which takes a workspace parameter.
We could put this table into the workspace instead to reduce the stack usage by the whole
256 bytes. We'd just have to make sure that the workspace is large enough.
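The workspace idea could look roughly like the sketch below. It assumes the caller hands over a 4-byte-aligned buffer, its total size, and the number of bytes already claimed by other tables. The function rank_table_from_wksp and the constant RANK_TABLE_ENTRIES are hypothetical names for illustration only; the actual layout used by HUF_buildCTable_wksp() may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The table can keep 32-bit fields here, since it no longer lives on the stack. */
typedef struct {
	uint32_t base;
	uint32_t curr;
} rankPos;

#define RANK_TABLE_ENTRIES 32 /* assumed table length */

/*
 * Carve a zeroed rank table out of a caller-provided workspace, starting
 * `used` bytes into it. Returns NULL if the remaining space is too small
 * or the resulting pointer is not aligned for a 32-bit field.
 */
rankPos *rank_table_from_wksp(void *wksp, size_t wkspSize, size_t used)
{
	size_t const need = RANK_TABLE_ENTRIES * sizeof(rankPos);
	unsigned char *p;

	if (wksp == NULL || used > wkspSize || wkspSize - used < need)
		return NULL; /* workspace is not large enough */

	p = (unsigned char *)wksp + used;
	if ((uintptr_t)p % sizeof(uint32_t) != 0)
		return NULL; /* misaligned for rankPos */

	memset(p, 0, need);
	return (rankPos *)p;
}
```

The size check is the "make sure that the workspace is large enough" step: a caller that gets NULL back would have to report an error rather than fall back to a stack array.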
Eventually I will update the zstd in the kernel to the latest upstream version. I've opened
up https://github.com/facebook/zstd/issues/1636 to make sure we get this optimization in
before porting.
> static void HUF_sort(nodeElt *huffNode, const U32 *count, U32 maxSymbolValue)
> --
> 2.7.4
>