From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefan Agner Date: Wed, 13 Aug 2014 13:20:35 +0200 Subject: [U-Boot] [PATCH 3/4] mtd: nand: add Freescale NFC driver In-Reply-To: <87wqadl3h6.fsf@nbsps.com> References: <4ea124ab817db6e3dee2067aadd6db14643990f5.1407312577.git.stefan@agner.ch> <1407796426.7427.100.camel@snotra.buserror.net> <1380dbe3110cbf9359220a78d69e1896@agner.ch> <1407881820.7427.137.camel@snotra.buserror.net> <87wqadl3h6.fsf@nbsps.com> Message-ID: <7b41b54fe3f11a943b0ec63f2e44f573@agner.ch> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de Am 2014-08-13 00:58, schrieb Bill Pringlemeir: [snip] >>>>> +static u32 nfc_read(struct mtd_info *mtd, uint reg) >>>>> +{ >>>>> + struct fsl_nfc *nfc = mtd_to_nfc(mtd); >>>>> + >>>>> + if (reg == NFC_FLASH_STATUS1 || >>>>> + reg == NFC_FLASH_STATUS2 || >>>>> + reg == NFC_IRQ_STATUS) >>>>> + return __raw_readl(nfc->regs + reg); >>>>> + /* Gang read/writes together for most registers. */ >>>>> + else >>>>> + return *(u32 *)(nfc->regs + reg); >>>>> +} >>>>> + >>>>> +static void nfc_write(struct mtd_info *mtd, uint reg, u32 val) >>>>> +{ >>>>> + struct fsl_nfc *nfc = mtd_to_nfc(mtd); >>>>> + >>>>> + if (reg == NFC_FLASH_STATUS1 || >>>>> + reg == NFC_FLASH_STATUS2 || >>>>> + reg == NFC_IRQ_STATUS) >>>>> + __raw_writel(val, nfc->regs + reg); >>>>> + /* Gang read/writes together for most registers. */ >>>>> + else >>>>> + *(u32 *)(nfc->regs + reg) = val; >>>>> +} >>>> >>>> You should always be using raw I/O accessors. If the intent is to >>>> bypass I/O ordering for certain registers, the raw accessors should >>>> already be skipping that. > >>> Ok, will do. > >> Sorry, the above should say "always be using I/O accesors", not always >> raw ones. > > This is probably because of me. There are lines like this for > configuration, > > /* Set configuration register. */ > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_ADDR_AUTO_INCR_BIT); > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_BUFNO_AUTO_INCR_BIT); > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_BOOT_MODE_BIT); > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_DMA_REQ_BIT); > nfc_set(mtd, NFC_FLASH_CONFIG, CONFIG_FAST_FLASH_BIT); > > If you use some sort of 'volatile', you are saying to the compiler that > these lines are *not*, > > ldr r0, [r1, #CONFIG] # read previous value > ldr r2, =~MASK > orr r0, #FAST_FLASH_BIT # set one bit. > and r0, r2 # clear all bits at once. > str r0, [r1, #CONFIG] # write it back. > > Instead it will change into five different load/modify/stores. The > memory itself is just like SRAM except a few registers that are actually > volatile; ie changed via the hardware. > > Particularly bad are the memcpy_io() on the ARM. Using these results > in about 1/2 to 1/4 of the performance of the drivers through-put on the > Vybrid. I can see that using accessors is good, but just replacing this > with 'writel' will result in significantly more code and slower > throughput. > > If people insist on this, then something like, > > val = nfc_read(mtd, NFC_REG); > val = nfc_clear(val, CONFIG_ADDR_AUTO_INCR_BIT); > val = nfc_clear(val, CONFIG_BUFNO_AUTO_INCR_BIT); > val = nfc_clear(val, CONFIG_BOOT_MODE_BIT); > val = nfc_clear(val, CONFIG_DMA_REQ_BIT); > val = nfc_set(val, CONFIG_FAST_FLASH_BIT); > nfc_write(mtd, NFC_REG, val); > > would result in the same code with 'nfc_read()' and 'nfc_write()' using > the I/O accessors. I found this more difficult to read for the higher > level functions. Instead some some standard macros for setting and > clearing bits could be used. The original driver was using 'nfc_set()' > and 'nfc_clear()'. Maybe they should just go. > I just applied this change: diff --git a/drivers/mtd/nand/fsl_nfc.c b/drivers/mtd/nand/fsl_nfc.c index df2c8be..01c010f 100644 --- a/drivers/mtd/nand/fsl_nfc.c +++ b/drivers/mtd/nand/fsl_nfc.c @@ -191,26 +191,14 @@ static u32 nfc_read(struct mtd_info *mtd, uint reg) { struct fsl_nfc *nfc = mtd_to_nfc(mtd); - if (reg == NFC_FLASH_STATUS1 || - reg == NFC_FLASH_STATUS2 || - reg == NFC_IRQ_STATUS) - return __raw_readl(nfc->regs + reg); - /* Gang read/writes together for most registers. */ - else - return *(u32 *)(nfc->regs + reg); + return __raw_readl(nfc->regs + reg); } static void nfc_write(struct mtd_info *mtd, uint reg, u32 val) { struct fsl_nfc *nfc = mtd_to_nfc(mtd); - if (reg == NFC_FLASH_STATUS1 || - reg == NFC_FLASH_STATUS2 || - reg == NFC_IRQ_STATUS) - __raw_writel(val, nfc->regs + reg); - /* Gang read/writes together for most registers. */ - else - *(u32 *)(nfc->regs + reg) = val; + __raw_writel(val, nfc->regs + reg); } static void nfc_set(struct mtd_info *mtd, uint reg, u32 bits) And did some performance measurement: => nand read ${loadaddr} 0x0 0x2000000 ===> With if/else NAND read: device 0 offset 0x0, size 0x2000000 33554432 bytes read in 8494 ms: OK (3.8 MiB/s) ===> With __raw_readl only... NAND read: device 0 offset 0x0, size 0x2000000 33554432 bytes read in 8184 ms: OK (3.9 MiB/s) => nand write ${loadaddr} rootfs-ubi 0x2000000 ===> With if/else NAND write: device 0 offset 0xa00000, size 0x2000000 33554432 bytes written in 18200 ms: OK (1.8 MiB/s) ===> With __raw_readl only... NAND write: device 0 offset 0xa00000, size 0x2000000 33554432 bytes written in 15095 ms: OK (2.1 MiB/s) These values are reproducible. I guess by removing the conditional branch in the nfc_read/nfc_write functions we even gain performance. IMHO we should use the raw_writel only and "hand optimize" for functions which are used often. For the initialization/configuration functions, there is little value to save some register access. -- Stefan