From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk, Alexey Dobriyan <adobriyan@gmail.com>,
Herbert Xu <herbert@gondor.hengli.com.au>
Subject: [21/89] crypto: sha512 - reduce stack usage to safe number
Date: Wed, 01 Feb 2012 12:59:45 -0800
Message-ID: <20120201210045.753760472@clark.kroah.org>
In-Reply-To: <20120201210505.GA26028@kroah.com>
3.2-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Dobriyan <adobriyan@gmail.com>
commit 51fc6dc8f948047364f7d42a4ed89b416c6cc0a3 upstream.
For rounds 16--79, W[i] depends only on W[i - 2], W[i - 7], W[i - 15] and W[i - 16].
Consequently, keeping the whole W[80] array on the stack is unnecessary;
only 16 values are really needed.
Using W[16] instead of W[80] greatly reduces stack usage
(~750 bytes to ~340 bytes on x86_64).
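The dependency argument can be checked with a small standalone sketch (not part of the patch). It assumes userspace C with <stdint.h>; the names ror64() and count_schedule_mismatches() are made up for illustration, and s0/s1 follow the small-sigma definitions in FIPS 180-2. The sketch builds the message schedule both ways and counts the rounds where the 16-word circular buffer disagrees with the full 80-word array. The array itself shrinks from 80 * 8 = 640 bytes to 16 * 8 = 128 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right; n is always in 1..63 here, so the shifts are well defined. */
static uint64_t ror64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* SHA-512 small sigma functions (FIPS 180-2). */
static uint64_t s0(uint64_t x) { return ror64(x, 1) ^ ror64(x, 8) ^ (x >> 7); }
static uint64_t s1(uint64_t x) { return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6); }

/*
 * Hypothetical helper: returns the number of rounds in 16..79 where the
 * circular W[16] schedule differs from the full W[80] schedule.
 */
static int count_schedule_mismatches(const uint64_t block[16])
{
    uint64_t full[80], ring[16];
    int i, mismatches = 0;

    for (i = 0; i < 16; i++)
        full[i] = ring[i] = block[i];

    for (i = 16; i < 80; i++) {
        full[i] = s1(full[i - 2]) + full[i - 7] + s0(full[i - 15]) + full[i - 16];

        /*
         * ring[i % 16] still holds W[i - 16], hence "+=". Since i >= 16,
         * i - 2, i - 7 and i - 15 are all non-negative, so % acts as a
         * true modulo.
         */
        ring[i % 16] += s1(ring[(i - 2) % 16]) + ring[(i - 7) % 16]
                        + s0(ring[(i - 15) % 16]);

        if (ring[i % 16] != full[i])
            mismatches++;
    }
    return mismatches;
}
```

For any 16-word input block, count_schedule_mismatches() is expected to return 0.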
Line by line explanation:
* BLEND_OP
  The array is "circular" now, so all indexes have to be taken modulo 16.
  The round number is positive, so the remainder operation should hold
  no surprises.
* The initial full message scheduling is trimmed to the first 16 values, which
  come from the data block; the rest are calculated just before they are needed.
* The original loop body is an unrolled version of the new SHA512_0_15 and
  SHA512_16_79 macros; the unrolling was done to avoid explicit variable
  renaming. Otherwise it's the very same code after preprocessing.
See sha1_transform() code which does the same trick.
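The "no explicit variable renaming" point can be illustrated with a standalone sketch (again not part of the patch). It runs eight rounds twice: once with the textbook renaming (h = g; g = f; ...) and once with a round macro whose arguments are rotated by one position per call. The names ROUND, rounds_renaming(), rounds_rotated() and the kw[] array (a stand-in for sha512_K[i] + W[i]) are hypothetical; e0, e1, Ch and Maj follow FIPS 180-2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t ror64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* SHA-512 big sigma, Ch and Maj functions (FIPS 180-2). */
static uint64_t e0(uint64_t x) { return ror64(x, 28) ^ ror64(x, 34) ^ ror64(x, 39); }
static uint64_t e1(uint64_t x) { return ror64(x, 14) ^ ror64(x, 18) ^ ror64(x, 41); }
static uint64_t Ch(uint64_t x, uint64_t y, uint64_t z) { return z ^ (x & (y ^ z)); }
static uint64_t Maj(uint64_t x, uint64_t y, uint64_t z) { return (x & y) | (z & (x | y)); }

/* Eight rounds with explicit renaming of the working variables. */
static void rounds_renaming(uint64_t s[8], const uint64_t kw[8])
{
    uint64_t a = s[0], b = s[1], c = s[2], d = s[3];
    uint64_t e = s[4], f = s[5], g = s[6], h = s[7], t1, t2;
    int i;

    for (i = 0; i < 8; i++) {
        t1 = h + e1(e) + Ch(e, f, g) + kw[i];
        t2 = e0(a) + Maj(a, b, c);
        h = g; g = f; f = e; e = d + t1;
        d = c; c = b; b = a; a = t1 + t2;
    }
    s[0] = a; s[1] = b; s[2] = c; s[3] = d;
    s[4] = e; s[5] = f; s[6] = g; s[7] = h;
}

/* One round that updates only two variables; the caller rotates the names. */
#define ROUND(i, a, b, c, d, e, f, g, h)          \
    do {                                          \
        t1 = h + e1(e) + Ch(e, f, g) + kw[i];     \
        t2 = e0(a) + Maj(a, b, c);                \
        d += t1;                                  \
        h = t1 + t2;                              \
    } while (0)

/* The same eight rounds, unrolled with rotated macro arguments. */
static void rounds_rotated(uint64_t s[8], const uint64_t kw[8])
{
    uint64_t a = s[0], b = s[1], c = s[2], d = s[3];
    uint64_t e = s[4], f = s[5], g = s[6], h = s[7], t1, t2;

    ROUND(0, a, b, c, d, e, f, g, h);
    ROUND(1, h, a, b, c, d, e, f, g);
    ROUND(2, g, h, a, b, c, d, e, f);
    ROUND(3, f, g, h, a, b, c, d, e);
    ROUND(4, e, f, g, h, a, b, c, d);
    ROUND(5, d, e, f, g, h, a, b, c);
    ROUND(6, c, d, e, f, g, h, a, b);
    ROUND(7, b, c, d, e, f, g, h, a);

    /* After eight rotations every name is back in its original role. */
    s[0] = a; s[1] = b; s[2] = c; s[3] = d;
    s[4] = e; s[5] = f; s[6] = g; s[7] = h;
}
```

Both functions should leave identical state for the same inputs, which is why the loops in the patch step by 8.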
Patch survives the in-tree crypto test and the original bug report test
(ping flood with hmac(sha512)).
See FIPS 180-2 for SHA-512 definition
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
crypto/sha512_generic.c | 58 ++++++++++++++++++++++++++++--------------------
1 file changed, 34 insertions(+), 24 deletions(-)
--- a/crypto/sha512_generic.c
+++ b/crypto/sha512_generic.c
@@ -78,7 +78,7 @@ static inline void LOAD_OP(int I, u64 *W
 
 static inline void BLEND_OP(int I, u64 *W)
 {
-	W[I] = s1(W[I-2]) + W[I-7] + s0(W[I-15]) + W[I-16];
+	W[I % 16] += s1(W[(I-2) % 16]) + W[(I-7) % 16] + s0(W[(I-15) % 16]);
 }
 
 static void
@@ -87,38 +87,48 @@ sha512_transform(u64 *state, const u8 *i
 	u64 a, b, c, d, e, f, g, h, t1, t2;
 
 	int i;
-	u64 W[80];
+	u64 W[16];
 
 	/* load the input */
 	for (i = 0; i < 16; i++)
 		LOAD_OP(i, W, input);
 
-	for (i = 16; i < 80; i++) {
-		BLEND_OP(i, W);
-	}
-
 	/* load the state into our registers */
 	a=state[0]; b=state[1]; c=state[2]; d=state[3];
 	e=state[4]; f=state[5]; g=state[6]; h=state[7];
 
-	/* now iterate */
-	for (i=0; i<80; i+=8) {
-		t1 = h + e1(e) + Ch(e,f,g) + sha512_K[i ] + W[i ];
-		t2 = e0(a) + Maj(a,b,c); d+=t1; h=t1+t2;
-		t1 = g + e1(d) + Ch(d,e,f) + sha512_K[i+1] + W[i+1];
-		t2 = e0(h) + Maj(h,a,b); c+=t1; g=t1+t2;
-		t1 = f + e1(c) + Ch(c,d,e) + sha512_K[i+2] + W[i+2];
-		t2 = e0(g) + Maj(g,h,a); b+=t1; f=t1+t2;
-		t1 = e + e1(b) + Ch(b,c,d) + sha512_K[i+3] + W[i+3];
-		t2 = e0(f) + Maj(f,g,h); a+=t1; e=t1+t2;
-		t1 = d + e1(a) + Ch(a,b,c) + sha512_K[i+4] + W[i+4];
-		t2 = e0(e) + Maj(e,f,g); h+=t1; d=t1+t2;
-		t1 = c + e1(h) + Ch(h,a,b) + sha512_K[i+5] + W[i+5];
-		t2 = e0(d) + Maj(d,e,f); g+=t1; c=t1+t2;
-		t1 = b + e1(g) + Ch(g,h,a) + sha512_K[i+6] + W[i+6];
-		t2 = e0(c) + Maj(c,d,e); f+=t1; b=t1+t2;
-		t1 = a + e1(f) + Ch(f,g,h) + sha512_K[i+7] + W[i+7];
-		t2 = e0(b) + Maj(b,c,d); e+=t1; a=t1+t2;
+#define SHA512_0_15(i, a, b, c, d, e, f, g, h) \
+	t1 = h + e1(e) + Ch(e, f, g) + sha512_K[i] + W[i]; \
+	t2 = e0(a) + Maj(a, b, c); \
+	d += t1; \
+	h = t1 + t2
+
+#define SHA512_16_79(i, a, b, c, d, e, f, g, h) \
+	BLEND_OP(i, W); \
+	t1 = h + e1(e) + Ch(e, f, g) + sha512_K[i] + W[(i)%16]; \
+	t2 = e0(a) + Maj(a, b, c); \
+	d += t1; \
+	h = t1 + t2
+
+	for (i = 0; i < 16; i += 8) {
+		SHA512_0_15(i, a, b, c, d, e, f, g, h);
+		SHA512_0_15(i + 1, h, a, b, c, d, e, f, g);
+		SHA512_0_15(i + 2, g, h, a, b, c, d, e, f);
+		SHA512_0_15(i + 3, f, g, h, a, b, c, d, e);
+		SHA512_0_15(i + 4, e, f, g, h, a, b, c, d);
+		SHA512_0_15(i + 5, d, e, f, g, h, a, b, c);
+		SHA512_0_15(i + 6, c, d, e, f, g, h, a, b);
+		SHA512_0_15(i + 7, b, c, d, e, f, g, h, a);
+	}
+	for (i = 16; i < 80; i += 8) {
+		SHA512_16_79(i, a, b, c, d, e, f, g, h);
+		SHA512_16_79(i + 1, h, a, b, c, d, e, f, g);
+		SHA512_16_79(i + 2, g, h, a, b, c, d, e, f);
+		SHA512_16_79(i + 3, f, g, h, a, b, c, d, e);
+		SHA512_16_79(i + 4, e, f, g, h, a, b, c, d);
+		SHA512_16_79(i + 5, d, e, f, g, h, a, b, c);
+		SHA512_16_79(i + 6, c, d, e, f, g, h, a, b);
+		SHA512_16_79(i + 7, b, c, d, e, f, g, h, a);
 	}
 
 	state[0] += a; state[1] += b; state[2] += c; state[3] += d;