From mboxrd@z Thu Jan 1 00:00:00 1970
From: =?ISO-8859-1?Q?Matthias_Wei=DFer?=
Date: Mon, 24 Jan 2011 20:24:28 +0100
Subject: [U-Boot] [PATCH] arm: Use optimized memcpy and memset from linux
In-Reply-To: <20110124161338.B0345D42A89@gemini.denx.de>
References: <1295884607-9044-1-git-send-email-weisserm@arcor.de>
 <20110124161338.B0345D42A89@gemini.denx.de>
Message-ID: <4D3DD1EC.7010506@arcor.de>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On 24.01.2011 17:13, Wolfgang Denk wrote:
> Dear Matthias Weisser,
>
> In message <1295884607-9044-1-git-send-email-weisserm@arcor.de> you wrote:
>> Using optimized versions of memset and memcpy from linux brings a quite
>> noticeable speed (x2 or better) improvement for these two functions.
>>
>> Size impact:
>>
>> C version:
>>    text    data     bss     dec     hex filename
>>  202862   18912  266456  488230   77326 u-boot
>>
>> ASM version:
>>    text    data     bss     dec     hex filename
>>  203798   18912  266288  488998   77626 u-boot
>
> How exactly did you measure the speed improvement?

I inserted a printf before and after calls to these functions with sizes
of 1MB or more each. I then measured the time between these printfs
using grabserial (http://elinux.org/Grabserial). In both cases caches
were enabled.

To be precise: as the memset test case I used the memset(.., 0, ..) of
the malloc pool (which was 4MB in my case), and as the memcpy test case
a copy of about 2.2MB from flash to RAM which I inserted in cmd_bootm.c
(see RFC patch http://patchwork.ozlabs.org/patch/79480/ for the exact
location of the memcpy).

Do you think a factor of 2 is not possible against the C version? Maybe
I have done something wrong while measuring these times. From my point
of view it should be possible to get such improvements, as the code
takes cache alignment into account and also uses the PLD instruction.
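For reference, timing a copy directly in code rather than via serial
timestamps could look roughly like the sketch below. This is a host-side
illustration only, not the code used for the numbers above: it assumes a
POSIX system with clock_gettime(), and the names memcpy_bytewise,
now_us, time_copy and throughput_kib_s are hypothetical helpers
introduced here. On the target one would presumably read U-Boot's
get_timer() instead of clock_gettime().

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Signature shared by memcpy and the baseline below. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Byte-wise copy, standing in for a plain C memcpy (hypothetical baseline). */
void *memcpy_bytewise(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Monotonic time in microseconds (POSIX assumption, not a U-Boot API). */
uint64_t now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Run `runs` copies of `n` bytes and return the elapsed microseconds. */
uint64_t time_copy(copy_fn fn, void *dst, const void *src, size_t n,
		   unsigned int runs)
{
	/* Calling through a volatile pointer keeps the compiler from
	 * inlining the copy and dropping repeated iterations. */
	copy_fn volatile f = fn;
	uint64_t start;
	unsigned int i;

	assert(fn != NULL && runs > 0);

	start = now_us();
	for (i = 0; i < runs; i++)
		f(dst, src, n);
	return now_us() - start;
}

/* Throughput in KiB/s; returns 0 if the elapsed time was too short to
 * be resolved by the timer. */
uint64_t throughput_kib_s(size_t bytes, unsigned int runs, uint64_t elapsed_us)
{
	if (elapsed_us == 0)
		return 0;
	return ((uint64_t)bytes * runs * 1000000u / 1024u) / elapsed_us;
}
```

Calling time_copy() once with memcpy_bytewise and once with the
optimized memcpy on the same buffers, then passing each result to
throughput_kib_s(), gives two numbers that can be compared directly.
Several runs over a buffer of a few MB should keep the timer resolution
from dominating the result.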
I can do some additional measurements tomorrow on two systems (jadecpu
with 32-bit DDR2 memory at 166MHz and an imx25-based one with 16-bit
LPDDR at 133MHz) and come up with some exact numbers. Maybe you can give
some more hints on what to measure and how the improvements of this
patch can be measured.

Matthias Weißer