All of lore.kernel.org
 help / color / mirror / Atom feed
* [U-Boot] [PATCH 0/8] armv7: cache maintenance operations
@ 2010-12-22 11:54 Aneesh V
  2010-12-22 11:54 ` [U-Boot] [PATCH 1/8] arm: make default implementation of cache_flush() weakly linked Aneesh V
                   ` (9 more replies)
  0 siblings, 10 replies; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

With D-cache and MMU enabled for ARM in u-boot it becomes imperative to
support a minimal set of cache maintenance operations and necessary
initializations before enabling MMU.

This series of patches attempt to do the following for armv7:
* Necessary initialization sequence before enabling MMU that includes
  invalidation of TLB, data caches, branch predictor array etc.
* Framework for supporting SOC specific outer caches in a generic manner
  (using a structure of function pointers - inspired by the Linux
  implementation)
* Generic armv7 cache maintenance operations for caches known to the CPU
* Support for ARM PL310 L2 cache controller used in OMAP4
* Cleanup of the cleanup_before_linux() function
* Adapting all armv7 SOCs to use the new framework and removing
  duplicated code

Testing:
* Extensive testing on OMAP4430SDP and OMAP3430SDP by creating coherency
  issues and solving them using the maintenance routines
	- Eg: memfill a region of memory with a known pattern
	- Invalidate the region
	- Read back and compare the region with the original pattern
	- If match fails it means that invalidate is successful
	- Now add a flush call just before the invalidate
	- If match succeeds it means that flush was successful
	- Outer caches were tested with experiments involving making the
	  function pointers NULL
* Kernel booting on OMAP4430SDP and OMAP3430SDP


Aneesh V (8):
  arm: make default implementation of cache_flush() weakly linked
  armv7: cache maintenance operations for armv7
  armv7: integrate cache maintenance support
  arm: minor fixes for cache and mmu handling
  armv7: add PL310 support to u-boot
  armv7: adapt omap4 to the new cache maintenance framework
  armv7: adapt omap3 to the new cache maintenance framework
  armv7: adapt s5pc1xx to the new cache maintenance framework

 arch/arm/cpu/armv7/Makefile                   |    2 +-
 arch/arm/cpu/armv7/cache_v7.c                 |  359 +++++++++++++++++++++++++
 arch/arm/cpu/armv7/config.mk                  |    2 +-
 arch/arm/cpu/armv7/cpu.c                      |   47 ++--
 arch/arm/cpu/armv7/omap3/Makefile             |    1 -
 arch/arm/cpu/armv7/omap3/board.c              |  151 +++++++++--
 arch/arm/cpu/armv7/omap3/cache.S              |  263 ------------------
 arch/arm/cpu/armv7/omap3/lowlevel_init.S      |   30 ++
 arch/arm/cpu/armv7/omap4/board.c              |   25 ++-
 arch/arm/cpu/armv7/omap4/lowlevel_init.S      |   17 ++
 arch/arm/cpu/armv7/s5pc1xx/cache.S            |   86 +------
 arch/arm/cpu/armv7/s5pc1xx/clock.c            |   12 +
 arch/arm/cpu/armv7/start.S                    |   18 ++-
 arch/arm/include/asm/arch-omap3/omap3.h       |   20 ++
 arch/arm/include/asm/arch-omap3/sys_proto.h   |   10 +-
 arch/arm/include/asm/arch-omap4/omap4.h       |    6 +
 arch/arm/include/asm/arch-omap4/sys_proto.h   |    3 +-
 arch/arm/include/asm/arch-s5pc1xx/sys_proto.h |    4 +-
 arch/arm/include/asm/armv7.h                  |   63 +++++
 arch/arm/include/asm/pl310.h                  |   49 ++++
 arch/arm/lib/Makefile                         |    1 +
 arch/arm/lib/board.c                          |    6 +
 arch/arm/lib/cache-cp15.c                     |   18 ++-
 arch/arm/lib/cache-pl310.c                    |  114 ++++++++
 arch/arm/lib/cache.c                          |   20 +-
 include/common.h                              |    5 +-
 include/configs/omap4_panda.h                 |    6 +-
 include/configs/omap4_sdp4430.h               |    6 +-
 28 files changed, 929 insertions(+), 415 deletions(-)
 create mode 100644 arch/arm/cpu/armv7/cache_v7.c
 delete mode 100644 arch/arm/cpu/armv7/omap3/cache.S
 create mode 100644 arch/arm/include/asm/armv7.h
 create mode 100644 arch/arm/include/asm/pl310.h
 create mode 100644 arch/arm/lib/cache-pl310.c

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 1/8] arm: make default implementation of cache_flush() weakly linked
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2011-01-08  6:40   ` Albert ARIBAUD
  2010-12-22 11:54 ` [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7 Aneesh V
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

make default implementation of cache_flush() weakly linked so that
sub-architectures can override it

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/lib/cache.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c
index 30686fe..275b6e1 100644
--- a/arch/arm/lib/cache.c
+++ b/arch/arm/lib/cache.c
@@ -25,7 +25,7 @@
 
 #include <common.h>
 
-void  flush_cache (unsigned long dummy1, unsigned long dummy2)
+void  __flush_cache(unsigned long dummy1, unsigned long dummy2)
 {
 #if defined(CONFIG_OMAP2420) || defined(CONFIG_ARM1136)
 	void arm1136_cache_flush(void);
@@ -38,10 +38,7 @@ void  flush_cache (unsigned long dummy1, unsigned long dummy2)
 	/* disable write buffer as well (page 2-22) */
 	asm("mcr p15, 0, %0, c7, c10, 4" : : "r" (0));
 #endif
-#ifdef CONFIG_OMAP34XX
-	void v7_flush_cache_all(void);
-
-	v7_flush_cache_all();
-#endif
 	return;
 }
+void  flush_cache(unsigned long dummy1, unsigned long dummy2)
+	__attribute__((weak, alias("__flush_cache")));
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
  2010-12-22 11:54 ` [U-Boot] [PATCH 1/8] arm: make default implementation of cache_flush() weakly linked Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2011-01-08  6:36   ` Albert ARIBAUD
  2010-12-22 11:54 ` [U-Boot] [PATCH 3/8] armv7: integrate cache maintenance support Aneesh V
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

- Add a framework for layered cache maintenance
	- separate out SOC specific outer cache maintenance from
	  maintenance of caches known to CPU

- Add generic ARMv7 cache maintenance operations that affect all
  caches known to ARMv7 CPUs. For instance in Cortex-A8 these
  opertions will affect both L1 and L2 caches. In Cortex-A9
  these will affect only L1 cache

- D-cache operations supported:
	- Invalidate entire D-cache
	- Invalidate D-cache range
	- Flush(clean & invalidate) entire D-cache
	- Flush D-cache range
- I-cache operations supported:
	- Invalidate entire I-cache

- Add maintenance functions for TLB, branch predictor array etc.

- Enable -march=armv7-a so that armv7 assembly instructions can be
  used

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/cpu/armv7/Makefile   |    2 +-
 arch/arm/cpu/armv7/cache_v7.c |  359 +++++++++++++++++++++++++++++++++++++++++
 arch/arm/cpu/armv7/config.mk  |    2 +-
 arch/arm/include/asm/armv7.h  |   63 +++++++
 include/common.h              |    5 +-
 5 files changed, 428 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm/cpu/armv7/cache_v7.c
 create mode 100644 arch/arm/include/asm/armv7.h

diff --git a/arch/arm/cpu/armv7/Makefile b/arch/arm/cpu/armv7/Makefile
index 8c0e915..299792a 100644
--- a/arch/arm/cpu/armv7/Makefile
+++ b/arch/arm/cpu/armv7/Makefile
@@ -26,7 +26,7 @@ include $(TOPDIR)/config.mk
 LIB	= $(obj)lib$(CPU).o
 
 START	:= start.o
-COBJS	:= cpu.o
+COBJS	:= cpu.o cache_v7.o
 COBJS  += syslib.o
 
 SRCS	:= $(START:.o=.S) $(COBJS:.o=.c)
diff --git a/arch/arm/cpu/armv7/cache_v7.c b/arch/arm/cpu/armv7/cache_v7.c
new file mode 100644
index 0000000..0521d66
--- /dev/null
+++ b/arch/arm/cpu/armv7/cache_v7.c
@@ -0,0 +1,359 @@
+/*
+ * (C) Copyright 2010
+ * Texas Instruments Incorporated - http://www.ti.com/
+ * Aneesh V <aneesh@ti.com>
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+#include <linux/types.h>
+#include <common.h>
+#include <asm/armv7.h>
+
+#define ARMV7_DCACHE_INVAL_ALL		1
+#define ARMV7_DCACHE_CLEAN_INVAL_ALL	2
+#define ARMV7_DCACHE_INVAL_RANGE	3
+#define ARMV7_DCACHE_CLEAN_INVAL_RANGE	4
+
+struct v7_outer_cache_ops v7_outer_cache;
+
+#ifndef CONFIG_SYS_NO_DCACHE
+/*
+ * Write the level and type you want to Cache Size Selection Register(CSSELR)
+ * to get size details from Current Cache Size ID Register(CCSIDR)
+ */
+static void set_csselr(u32 level, u32 type)
+{	u32 csselr = level << 1 | type;
+	/* Write to Cache Size Selection Register(CSSELR) */
+	asm volatile ("mcr p15, 2, %0, c0, c0, 0" : : "r" (csselr));
+}
+
+static u32 get_ccsidr(void)
+{
+	u32 ccsidr;
+	/* Read current CP15 Cache Size ID Register */
+	asm volatile ("mrc p15, 1, %0, c0, c0, 0" : "=r" (ccsidr));
+	return ccsidr;
+}
+
+static u32 get_clidr(void)
+{
+	u32 clidr;
+	/* Read current CP15 Cache Level ID Register */
+	asm volatile ("mrc p15,1,%0,c0,c0,1" : "=r" (clidr));
+	return clidr;
+}
+
+/* round up the input number to a power of 2 and get the log2 */
+static inline u32 log_2_round_up(u32 num)
+{
+	/* count leading zeros */
+	asm volatile ("CLZ %0, %0" : "+r" (num));
+
+	/* position of most significant 1 */
+	num = 31 - num;
+
+	return num;
+}
+
+static void v7_inval_dcache_level_setway(u32 level, u32 num_sets,
+					 u32 num_ways, u32 way_shift,
+					 u32 log2_line_len)
+{
+	int way, set, setway;
+	/*
+	 * For optimal assembly code:
+	 *	a. count down
+	 *	b. have bigger loop inside
+	 */
+	for (way = num_ways - 1; way >= 0 ; way--)
+		for (set = num_sets - 1; set >= 0; set--) {
+			setway = (level << 1) | (set << log2_line_len) |
+				 (way << way_shift);
+			/* Invalidate data/unified cache line by set/way */
+			asm volatile ("	mcr p15, 0, %0, c7, c6, 2"
+					: : "r" (setway));
+		}
+	/* Make sure the operation is complete */
+	asm volatile ("DMB");
+}
+
+static void v7_clean_inval_dcache_level_setway(u32 level, u32 num_sets,
+					       u32 num_ways, u32 way_shift,
+					       u32 log2_line_len)
+{
+	int way, set, setway;
+	/*
+	 * For optimal assembly code:
+	 *	a. count down
+	 *	b. have bigger loop inside
+	 */
+	for (way = num_ways - 1; way >= 0 ; way--)
+		for (set = num_sets - 1; set >= 0; set--) {
+			setway = (level << 1) | (set << log2_line_len) |
+				 (way << way_shift);
+			/*
+			 * Clean & Invalidate data/unified
+			 * cache line by set/way
+			 */
+			asm volatile ("	mcr p15, 0, %0, c7, c14, 2"
+					: : "r" (setway));
+		}
+	/* Make sure the operation is complete */
+	asm volatile ("DMB");
+}
+
+static void v7_maint_dcache_level_setway(u32 level, u32 operation)
+{
+	u32 ccsidr;
+	u32 num_sets, num_ways, log2_line_len, log2_num_ways;
+	u32 way_shift;
+	set_csselr(level, ARMV7_CSSELR_IND_DATA_UNIFIED);
+
+	ccsidr = get_ccsidr();
+
+	log2_line_len = mask_n_get(ccsidr, 0, 2) + 2;
+	/* Converting from words to bytes */
+	log2_line_len += 2;
+
+	num_ways  = mask_n_get(ccsidr, 3, 12) + 1;
+	num_sets  = mask_n_get(ccsidr, 13, 27) + 1;
+	/*
+	 * According to ARMv7 ARM number of sets and number of ways need
+	 * not be a power of 2
+	 */
+	log2_num_ways = log_2_round_up(num_ways);
+
+	way_shift = (32 - log2_num_ways);
+	if (operation == ARMV7_DCACHE_INVAL_ALL)
+		v7_inval_dcache_level_setway(level, num_sets, num_ways,
+				      way_shift, log2_line_len);
+	else if (operation == ARMV7_DCACHE_CLEAN_INVAL_ALL)
+		v7_clean_inval_dcache_level_setway(level, num_sets, num_ways,
+						   way_shift, log2_line_len);
+}
+
+static void v7_maint_dcache_all(u32 operation)
+{
+	u32 level, cache_type, level_start_bit = 0;
+
+	u32 clidr = get_clidr();
+
+	for (level = 0; level < 7; level++) {
+		cache_type = mask_n_get(clidr, level_start_bit,
+					level_start_bit + 2);
+		if ((cache_type == ARMV7_CLIDR_CTYPE_DATA_ONLY) ||
+		    (cache_type == ARMV7_CLIDR_CTYPE_INSTRUCTION_DATA) ||
+		    (cache_type == ARMV7_CLIDR_CTYPE_UNIFIED))
+			v7_maint_dcache_level_setway(level, operation);
+		level_start_bit += 3;
+	}
+}
+
+static void v7_dcache_clean_inval_range(u32 start,
+					u32 stop, u32 line_len)
+{
+	u32 mva;
+	/* Align start to cache line boundary */
+	start &= ~(line_len - 1);
+	for (mva = start; mva < stop; mva = mva + line_len)
+		/* DCCIMVAC - Clean & Invalidate data cache by MVA to PoC */
+		asm volatile ("mcr p15, 0, %0, c7, c14, 1" : : "r" (mva));
+}
+
+static void v7_dcache_inval_range(u32 start, u32 stop, u32 line_len)
+{
+	u32 mva;
+
+	/*
+	 * If start address is not aligned to cache-line flush the first
+	 * line to prevent affecting somebody else's buffer
+	 */
+	if (start & (line_len - 1)) {
+		v7_dcache_clean_inval_range(start, start + 1, line_len);
+		/* move to next cache line */
+		start = (start + line_len - 1) & ~(line_len - 1);
+	}
+
+	/*
+	 * If stop address is not aligned to cache-line flush the last
+	 * line to prevent affecting somebody else's buffer
+	 */
+	if (stop & (line_len - 1)) {
+		v7_dcache_clean_inval_range(stop, stop + 1, line_len);
+		/* align to the beginning of this cache line */
+		stop &= ~(line_len - 1);
+	}
+
+	for (mva = start; mva < stop; mva = mva + line_len)
+		/* DCIMVAC - Invalidate data cache by MVA to PoC */
+		asm volatile ("mcr p15, 0, %0, c7, c6, 1" : : "r" (mva));
+}
+
+static void v7_dcache_maint_range(u32 start, u32 stop, u32 range_op)
+{
+	u32 line_len, ccsidr;
+	ccsidr = get_ccsidr();
+	line_len = mask_n_get(ccsidr, 0, 2) + 2;
+	/* Converting from words to bytes */
+	line_len += 2;
+	/* converting from log2(linelen) to linelen */
+	line_len = 1 << line_len;
+
+	switch (range_op) {
+	case ARMV7_DCACHE_CLEAN_INVAL_RANGE:
+		v7_dcache_clean_inval_range(start, stop, line_len);
+		break;
+	case ARMV7_DCACHE_INVAL_RANGE:
+		v7_dcache_inval_range(start, stop, line_len);
+		break;
+	}
+
+	/* Make sure the operation is complete */
+	asm volatile ("DMB");
+}
+
+/* Invalidate TLB */
+static void v7_inval_tlb(void)
+{
+	/* Invalidate entire unified TLB */
+	asm volatile ("mcr p15, 0, %0, c8, c7, 0" : : "r" (0));
+	/* Invalidate entire data TLB */
+	asm volatile ("mcr p15, 0, %0, c8, c6, 0" : : "r" (0));
+	/* Invalidate entire instruction TLB */
+	asm volatile ("mcr p15, 0, %0, c8, c5, 0" : : "r" (0));
+	/* Full system DSB - make sure that the invalidation is complete */
+	asm volatile ("DSB");
+	/* Full system ISB - make sure the instruction stream sees it */
+	asm volatile ("ISB");
+}
+
+void invalidate_dcache_all(void)
+{
+	v7_maint_dcache_all(ARMV7_DCACHE_INVAL_ALL);
+	if (v7_outer_cache.inval_all)
+		v7_outer_cache.inval_all();
+}
+
+/*
+ * Performs a clean & invalidation of the entire data cache
+ * at all levels
+ */
+void flush_dcache_all(void)
+{
+	v7_maint_dcache_all(ARMV7_DCACHE_CLEAN_INVAL_ALL);
+	if (v7_outer_cache.flush_all)
+		v7_outer_cache.flush_all();
+}
+
+/*
+ * Invalidates range in all levels of D-cache/unified cache used:
+ * Affects the range [start, stop - 1]
+ */
+void invalidate_dcache_range(unsigned long start, unsigned long stop)
+{
+
+	v7_dcache_maint_range(start, stop, ARMV7_DCACHE_INVAL_RANGE);
+	if (v7_outer_cache.inval_range)
+		/* physical address is same as virtual address */
+		v7_outer_cache.inval_range(start, stop);
+}
+
+/*
+ * Flush range(clean & invalidate) from all levels of D-cache/unified
+ * cache used:
+ * Affects the range [start, stop - 1]
+ */
+void flush_dcache_range(unsigned long start, unsigned long stop)
+{
+	v7_dcache_maint_range(start, stop, ARMV7_DCACHE_CLEAN_INVAL_RANGE);
+	if (v7_outer_cache.flush_range)
+		/* physical address is same as virtual address */
+		v7_outer_cache.flush_range(start, stop);
+}
+static void __v7_setup_outer_cache_ops(void)
+{
+	puts("v7_setup_outer_cache_ops: dummy implementation! "
+	     "real implementation not available!!\n");
+}
+
+void v7_setup_outer_cache_ops(void)
+	__attribute__((weak, alias("__v7_setup_outer_cache_ops")));
+
+void arm_init_before_mmu(void)
+{
+	v7_setup_outer_cache_ops();
+	if (v7_outer_cache.enable)
+		v7_outer_cache.enable();
+	invalidate_dcache_all();
+	v7_inval_tlb();
+}
+#else
+void invalidate_dcache_all(void)
+{
+}
+
+void flush_dcache_all(void)
+{
+}
+
+void invalidate_dcache_range(unsigned long start, unsigned long stop)
+{
+}
+
+void flush_dcache_range(unsigned long start, unsigned long stop)
+{
+}
+
+void arm_init_before_mmu(void)
+{
+}
+#endif
+
+#ifndef CONFIG_SYS_NO_ICACHE
+/* Invalidate entire I-cache and branch predictor array */
+void invalidate_icache_all(void)
+{
+	/*
+	 * Invalidate all instruction caches to PoU.
+	 * Also flushes branch target cache.
+	 */
+	asm volatile ("mcr p15, 0, %0, c7, c5, 0" : : "r" (0));
+
+	/* Invalidate entire branch predictor array */
+	asm volatile ("mcr p15, 0, %0, c7, c5, 6" : : "r" (0));
+
+	/* Full system DSB - make sure that the invalidation is complete */
+	asm volatile ("DSB");
+	/* Full system ISB - make sure the instruction stream sees it */
+	asm volatile ("ISB");
+}
+#else
+void invalidate_icache_all(void)
+{
+}
+#endif
+
+/*
+ * Flush range from all levels of d-cache/unified-cache used:
+ * Affects the range [start, start + size - 1]
+ */
+void  flush_cache(unsigned long start, unsigned long size)
+{
+	flush_dcache_range(start, start + size);
+}
diff --git a/arch/arm/cpu/armv7/config.mk b/arch/arm/cpu/armv7/config.mk
index 49ac9c7..7f9b171 100644
--- a/arch/arm/cpu/armv7/config.mk
+++ b/arch/arm/cpu/armv7/config.mk
@@ -23,7 +23,7 @@
 PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
 
 # Make ARMv5 to allow more compilers to work, even though its v7a.
-PLATFORM_CPPFLAGS += -march=armv5
+PLATFORM_CPPFLAGS += -march=armv7-a
 # =========================================================================
 #
 # Supply options according to compiler version
diff --git a/arch/arm/include/asm/armv7.h b/arch/arm/include/asm/armv7.h
new file mode 100644
index 0000000..57409b6
--- /dev/null
+++ b/arch/arm/include/asm/armv7.h
@@ -0,0 +1,63 @@
+/*
+ * (C) Copyright 2010
+ * Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * Aneesh V <aneesh@ti.com>
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+#ifndef ARMV7_H
+#define ARMV7_H
+
+#include <linux/types.h>
+
+/*
+ * Values for InD field in CSSELR
+ * Selects the type of cache
+ */
+#define ARMV7_CSSELR_IND_DATA_UNIFIED	0
+#define ARMV7_CSSELR_IND_INSTRUCTION	1
+
+/* Values for Ctype fields in CLIDR */
+#define ARMV7_CLIDR_CTYPE_NO_CACHE		0
+#define ARMV7_CLIDR_CTYPE_INSTRUCTION_ONLY	1
+#define ARMV7_CLIDR_CTYPE_DATA_ONLY		2
+#define ARMV7_CLIDR_CTYPE_INSTRUCTION_DATA	3
+#define ARMV7_CLIDR_CTYPE_UNIFIED		4
+
+/* some utility macros */
+#define mask(start, end) \
+	(((1 << ((end) - (start) + 1)) - 1) << (start))
+
+#define mask_n_get(reg, start, end) \
+	(((reg) & mask(start, end)) >> (start))
+
+struct v7_outer_cache_ops {
+	void (*enable)(void);
+	void (*disable)(void);
+	void (*flush_all)(void);
+	void (*inval_all)(void);
+	void (*flush_range)(u32 start, u32 end);
+	void (*inval_range)(u32 start, u32 end);
+};
+
+extern struct v7_outer_cache_ops v7_outer_cache;
+
+void v7_setup_outer_cache_ops(void);
+#endif
diff --git a/include/common.h b/include/common.h
index 189ad81..d750ff9 100644
--- a/include/common.h
+++ b/include/common.h
@@ -411,6 +411,7 @@ void	icache_disable(void);
 int	dcache_status (void);
 void	dcache_enable (void);
 void	dcache_disable(void);
+void	mmu_disable(void);
 void	relocate_code (ulong, gd_t *, ulong) __attribute__ ((noreturn));
 ulong	get_endaddr   (void);
 void	trap_init     (ulong);
@@ -603,9 +604,11 @@ ulong	video_setmem (ulong);
 
 /* arch/$(ARCH)/lib/cache.c */
 void	flush_cache   (unsigned long, unsigned long);
+void	flush_dcache_all(void);
 void	flush_dcache_range(unsigned long start, unsigned long stop);
 void	invalidate_dcache_range(unsigned long start, unsigned long stop);
-
+void	invalidate_dcache_all(void);
+void	invalidate_icache_all(void);
 
 /* arch/$(ARCH)/lib/ticks.S */
 unsigned long long get_ticks(void);
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 3/8] armv7: integrate cache maintenance support
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
  2010-12-22 11:54 ` [U-Boot] [PATCH 1/8] arm: make default implementation of cache_flush() weakly linked Aneesh V
  2010-12-22 11:54 ` [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7 Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2011-01-08  6:54   ` Albert ARIBAUD
  2010-12-22 11:54 ` [U-Boot] [PATCH 4/8] arm: minor fixes for cache and mmu handling Aneesh V
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

- Enable I-cache on bootup
- Enable MMU and D-cache immediately after relocation
	- Do necessary initialization before enabling d-cache and MMU
- Changes to cleanup_before_linux()
	- Make changes according to the new framework

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/cpu/armv7/cpu.c   |   47 +++++++++++++++++++------------------------
 arch/arm/cpu/armv7/start.S |   18 +++++++++++++++-
 arch/arm/lib/board.c       |    6 +++++
 arch/arm/lib/cache-cp15.c  |    9 ++++++++
 4 files changed, 53 insertions(+), 27 deletions(-)

diff --git a/arch/arm/cpu/armv7/cpu.c b/arch/arm/cpu/armv7/cpu.c
index a01e0d6..b418304 100644
--- a/arch/arm/cpu/armv7/cpu.c
+++ b/arch/arm/cpu/armv7/cpu.c
@@ -38,13 +38,10 @@
 #ifndef CONFIG_L2_OFF
 #include <asm/arch/sys_proto.h>
 #endif
-
-static void cache_flush(void);
+#include <asm/armv7.h>
 
 int cleanup_before_linux(void)
 {
-	unsigned int i;
-
 	/*
 	 * this function is called just before we call linux
 	 * it prepares the processor for linux
@@ -53,31 +50,29 @@ int cleanup_before_linux(void)
 	 */
 	disable_interrupts();
 
-	/* turn off I/D-cache */
+	/*
+	 * Turn off I-cache and invalidate it
+	 */
 	icache_disable();
-	dcache_disable();
+	invalidate_icache_all();
 
-	/* invalidate I-cache */
-	cache_flush();
-
-#ifndef CONFIG_L2_OFF
-	/* turn off L2 cache */
-	l2_cache_disable();
-	/* invalidate L2 cache also */
-	invalidate_dcache(get_device_type());
-#endif
-	i = 0;
-	/* mem barrier to sync up things */
-	asm("mcr p15, 0, %0, c7, c10, 4": :"r"(i));
+	/*
+	 * turn off D-cache
+	 * dcache_disable() in turn flushes the d-cache and disables MMU
+	 */
+	dcache_disable();
 
-#ifndef CONFIG_L2_OFF
-	l2_cache_enable();
-#endif
+	/*
+	 * After D-cache is flushed and before it is disabled there may
+	 * be some new valid entries brought into the cache. We are sure
+	 * that these lines are not dirty and will not affect our execution.
+	 * (because unwinding the call-stack and setting a bit in CP15 SCTRL
+	 * is all we did during this. We have not pushed anything on to the
+	 * stack. Neither have we affected any static data) 
+	 * So just invalidate the entire d-cache again to avoid coherency
+	 * problems for kernel
+	 */
+	invalidate_dcache_all();
 
 	return 0;
 }
-
-static void cache_flush(void)
-{
-	asm ("mcr p15, 0, %0, c7, c5, 0": :"r" (0));
-}
diff --git a/arch/arm/cpu/armv7/start.S b/arch/arm/cpu/armv7/start.S
index 684f2d2..7d17f1e 100644
--- a/arch/arm/cpu/armv7/start.S
+++ b/arch/arm/cpu/armv7/start.S
@@ -241,6 +241,14 @@ clbss_l:str	r2, [r0]		/* clear loop...		    */
  * initialization, now running from RAM.
  */
 jump_2_ram:
+/*
+ * If I-cache is enabled invalidate it
+ */
+#ifndef CONFIG_SYS_NO_ICACHE
+	mcr	p15, 0, r0, c7, c5, 0	@ invalidate icache
+	dsb
+	isb
+#endif
 	ldr	r0, _board_init_r_ofs
 	adr	r1, _start
 	add	lr, r0, r1
@@ -276,6 +284,9 @@ cpu_init_crit:
 	mov	r0, #0			@ set up for MCR
 	mcr	p15, 0, r0, c8, c7, 0	@ invalidate TLBs
 	mcr	p15, 0, r0, c7, c5, 0	@ invalidate icache
+	mcr	p15, 0, r0, c7, c5, 6  /* invalidate BP array */
+	dsb
+	isb
 
 	/*
 	 * disable MMU stuff and caches
@@ -284,7 +295,12 @@ cpu_init_crit:
 	bic	r0, r0, #0x00002000	@ clear bits 13 (--V-)
 	bic	r0, r0, #0x00000007	@ clear bits 2:0 (-CAM)
 	orr	r0, r0, #0x00000002	@ set bit 1 (--A-) Align
-	orr	r0, r0, #0x00000800	@ set bit 12 (Z---) BTB
+	orr	r0, r0, #0x00000800	@ set bit 11 (Z---) BTB
+#ifdef CONFIG_SYS_NO_ICACHE
+	bic	r0, r0, #0x00001000	@ clear bit 12 (I) I-cache
+#else
+	orr	r0, r0, #0x00001000	@ set bit 12 (I) I-cache
+#endif
 	mcr	p15, 0, r0, c1, c0, 0
 
 	/*
diff --git a/arch/arm/lib/board.c b/arch/arm/lib/board.c
index 7266381..bef32a6 100644
--- a/arch/arm/lib/board.c
+++ b/arch/arm/lib/board.c
@@ -459,6 +459,12 @@ void board_init_r (gd_t *id, ulong dest_addr)
 
 	gd->flags |= GD_FLG_RELOC;	/* tell others: relocation done */
 
+	/*
+	 * Enable D$:
+	 * I$, if needed, must be already enabled in start.S
+	 */
+	dcache_enable();
+
 	monitor_flash_len = _bss_start_ofs;
 	debug ("monitor flash len: %08lX\n", monitor_flash_len);
 	board_init();	/* Setup chipselects */
diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
index d9175f0..ca526fb 100644
--- a/arch/arm/lib/cache-cp15.c
+++ b/arch/arm/lib/cache-cp15.c
@@ -34,6 +34,14 @@
 
 DECLARE_GLOBAL_DATA_PTR;
 
+void __arm_init_before_mmu(void)
+{
+	puts("__arm_init_before_mmu: dummy implementation! "
+	     "real implementation missing!!\n");
+}
+void arm_init_before_mmu(void)
+	__attribute__((weak, alias("__arm_init_before_mmu")));
+
 static void cp_delay (void)
 {
 	volatile int i;
@@ -65,6 +73,7 @@ static inline void mmu_setup(void)
 	int i;
 	u32 reg;
 
+	arm_init_before_mmu();
 	/* Set up an identity-mapping for all 4GB, rw for everyone */
 	for (i = 0; i < 4096; i++)
 		page_table[i] = i << 20 | (3 << 10) | 0x12;
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 4/8] arm: minor fixes for cache and mmu handling
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
                   ` (2 preceding siblings ...)
  2010-12-22 11:54 ` [U-Boot] [PATCH 3/8] armv7: integrate cache maintenance support Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2011-01-08  7:04   ` Albert ARIBAUD
  2010-12-22 11:54 ` [U-Boot] [PATCH 5/8] armv7: add PL310 support to u-boot Aneesh V
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

1. make sure that page table setup is not done multiple times
2. flush_dcache_all() is more appropriate while disabling cache
   than a range flush on the entire memory(flush_cache())

   Provide a default implementation for flush_dcache_all()
   for backward compatibility and to avoid build issues.

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/lib/cache-cp15.c |    9 +++++++--
 arch/arm/lib/cache.c      |   13 ++++++++++++-
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
index ca526fb..20aa993 100644
--- a/arch/arm/lib/cache-cp15.c
+++ b/arch/arm/lib/cache-cp15.c
@@ -94,13 +94,18 @@ static inline void mmu_setup(void)
 	set_cr(reg | CR_M);
 }
 
+static int mmu_enabled(void)
+{
+	return get_cr() & CR_M;
+}
+
 /* cache_bit must be either CR_I or CR_C */
 static void cache_enable(uint32_t cache_bit)
 {
 	uint32_t reg;
 
 	/* The data cache is not active unless the mmu is enabled too */
-	if (cache_bit == CR_C)
+	if ((cache_bit == CR_C) && !mmu_enabled())
 		mmu_setup();
 	reg = get_cr();	/* get control reg. */
 	cp_delay();
@@ -119,7 +124,7 @@ static void cache_disable(uint32_t cache_bit)
 			return;
 		/* if disabling data cache, disable mmu too */
 		cache_bit |= CR_M;
-		flush_cache(0, ~0);
+		flush_dcache_all();
 	}
 	reg = get_cr();
 	cp_delay();
diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c
index 275b6e1..363609a 100644
--- a/arch/arm/lib/cache.c
+++ b/arch/arm/lib/cache.c
@@ -25,7 +25,7 @@
 
 #include <common.h>
 
-void  __flush_cache(unsigned long dummy1, unsigned long dummy2)
+void  __flush_cache(unsigned long start, unsigned long size)
 {
 #if defined(CONFIG_OMAP2420) || defined(CONFIG_ARM1136)
 	void arm1136_cache_flush(void);
@@ -42,3 +42,14 @@ void  __flush_cache(unsigned long dummy1, unsigned long dummy2)
 }
 void  flush_cache(unsigned long dummy1, unsigned long dummy2)
 	__attribute__((weak, alias("__flush_cache")));
+
+/*
+ * Default implementation:
+ * do a range flush for the entire range
+ */
+void	__flush_dcache_all(void)
+{
+	flush_cache(0, ~0);
+}
+void	flush_dcache_all(void)
+	__attribute__((weak, alias("__flush_dcache_all")));
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 5/8] armv7: add PL310 support to u-boot
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
                   ` (3 preceding siblings ...)
  2010-12-22 11:54 ` [U-Boot] [PATCH 4/8] arm: minor fixes for cache and mmu handling Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2011-01-09 22:48   ` Wolfgang Denk
  2010-12-22 11:54 ` [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework Aneesh V
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

Add support for some of the key maintenance operations
	- Invalidate all
	- Invalidate range
	- Flush(clean & invalidate) all
	- Flush range

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/include/asm/pl310.h |   49 ++++++++++++++++++
 arch/arm/lib/Makefile        |    1 +
 arch/arm/lib/cache-pl310.c   |  114 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 164 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/include/asm/pl310.h
 create mode 100644 arch/arm/lib/cache-pl310.c

diff --git a/arch/arm/include/asm/pl310.h b/arch/arm/include/asm/pl310.h
new file mode 100644
index 0000000..8f5b33e
--- /dev/null
+++ b/arch/arm/include/asm/pl310.h
@@ -0,0 +1,49 @@
+/*
+ * (C) Copyright 2010
+ * Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * Aneesh V <aneesh@ti.com>
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+#ifndef PL310_H
+#define PL310_H
+
+#include <linux/types.h>
+
+/* Register offsets */
+#define PL310_CACHE_TYPE		0x004
+#define PL310_AUX_CTRL			0x104
+
+#define PL310_CACHE_SYNC		0x730
+#define PL310_INVAL_LINE_PA		0x770
+#define PL310_INVAL_WAY			0x77C
+#define PL310_CLEAN_LINE_PA		0x7B0
+#define PL310_CLEAN_INVAL_WAY		0x7FC
+#define PL310_CLEAN_INVAL_LINE_PA	0x7F0
+
+/* Register bit fields */
+#define PL310_AUX_CTRL_ASSOCIATIVITY_MASK	(1 << 16)
+
+void pl310_inval_all(void);
+void pl310_clean_inval_all(void);
+void pl310_inval_range(u32 start, u32 end);
+void pl310_clean_inval_range(u32 start, u32 end);
+
+#endif
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 454440c..dc108a6 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -42,6 +42,7 @@ COBJS-y	+= cache.o
 ifndef CONFIG_SYS_NO_CP15_CACHE
 COBJS-y	+= cache-cp15.o
 endif
+COBJS-$(CONFIG_SYS_USE_PL310) += cache-pl310.o
 COBJS-y	+= interrupts.o
 COBJS-y	+= reset.o
 
diff --git a/arch/arm/lib/cache-pl310.c b/arch/arm/lib/cache-pl310.c
new file mode 100644
index 0000000..4f062f8
--- /dev/null
+++ b/arch/arm/lib/cache-pl310.c
@@ -0,0 +1,114 @@
+/*
+ * (C) Copyright 2010
+ * Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * Aneesh V <aneesh@ti.com>
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+#include <linux/types.h>
+#include <asm/io.h>
+#include <asm/armv7.h>
+#include <asm/pl310.h>
+#include <config.h>
+
+static void pl310_cache_sync(void)
+{
+	__raw_writel(0, CONFIG_SYS_PL310_BASE + PL310_CACHE_SYNC);
+}
+
+static void pl310_background_op_all_ways(u32 op_reg_offset)
+{
+	u32 assoc_16, associativity, way_mask;
+	assoc_16 = __raw_readl(CONFIG_SYS_PL310_BASE + PL310_AUX_CTRL) &
+		   PL310_AUX_CTRL_ASSOCIATIVITY_MASK;
+
+	if (assoc_16)
+		associativity = 16;
+	else
+		associativity = 8;
+
+	way_mask = (1 << associativity) - 1;
+	/* Invalidate all ways */
+	__raw_writel(way_mask, CONFIG_SYS_PL310_BASE + op_reg_offset);
+	/* Wait for all ways to be invalidated */
+	while (__raw_readl(CONFIG_SYS_PL310_BASE + op_reg_offset) && way_mask)
+		;
+	pl310_cache_sync();
+}
+
+void pl310_inval_all(void)
+{
+	pl310_background_op_all_ways(PL310_INVAL_WAY);
+}
+
+void pl310_clean_inval_all(void)
+{
+	pl310_background_op_all_ways(PL310_CLEAN_INVAL_WAY);
+}
+
+/* Flush(clean invalidate) memory from start to stop-1 */
+void pl310_clean_inval_range(u32 start, u32 stop)
+{
+	/* PL310 currently supports only 32 bytes cache line */
+	u32 pa, line_size = 32;
+
+	/*
+	 * Align to the beginning of cache-line - this ensures that
+	 * the first 5 bits are 0 as required by PL310 TRM
+	 */
+	start &= ~(line_size - 1);
+
+	for (pa = start; pa < stop; pa = pa + line_size)
+		__raw_writel(pa, CONFIG_SYS_PL310_BASE +
+			     PL310_CLEAN_INVAL_LINE_PA);
+	pl310_cache_sync();
+}
+
+/* invalidate memory from start to stop-1 */
+void pl310_inval_range(u32 start, u32 stop)
+{
+	/* PL310 currently supports only 32 bytes cache line */
+	u32 pa, line_size = 32;
+
+	/*
+	 * If start address is not aligned to cache-line flush the first
+	 * line to prevent affecting somebody else's buffer
+	 */
+	if (start & (line_size - 1)) {
+		pl310_clean_inval_range(start, start + 1);
+		/* move to next cache line */
+		start = (start + line_size - 1) & ~(line_size - 1);
+	}
+
+	/*
+	 * If stop address is not aligned to cache-line flush the last
+	 * line to prevent affecting somebody else's buffer
+	 */
+	if (stop & (line_size - 1)) {
+		pl310_clean_inval_range(stop, stop + 1);
+		/* align to the beginning of this cache line */
+		stop &= ~(line_size - 1);
+	}
+
+	for (pa = start; pa < stop; pa = pa + line_size)
+		__raw_writel(pa, CONFIG_SYS_PL310_BASE +
+			     PL310_INVAL_LINE_PA);
+	pl310_cache_sync();
+}
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
                   ` (4 preceding siblings ...)
  2010-12-22 11:54 ` [U-Boot] [PATCH 5/8] armv7: add PL310 support to u-boot Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2011-01-09 22:52   ` Wolfgang Denk
  2010-12-22 11:54 ` [U-Boot] [PATCH 7/8] armv7: adapt omap3 " Aneesh V
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

adapt omap4 to the new layered cache maintenance framework

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/cpu/armv7/omap4/board.c            |   25 ++++++++++++++++++++++++-
 arch/arm/cpu/armv7/omap4/lowlevel_init.S    |   17 +++++++++++++++++
 arch/arm/include/asm/arch-omap4/omap4.h     |    6 ++++++
 arch/arm/include/asm/arch-omap4/sys_proto.h |    3 ++-
 include/configs/omap4_panda.h               |    6 +++---
 include/configs/omap4_sdp4430.h             |    6 +++---
 6 files changed, 55 insertions(+), 8 deletions(-)

diff --git a/arch/arm/cpu/armv7/omap4/board.c b/arch/arm/cpu/armv7/omap4/board.c
index fcd29a7..09612e5 100644
--- a/arch/arm/cpu/armv7/omap4/board.c
+++ b/arch/arm/cpu/armv7/omap4/board.c
@@ -31,7 +31,8 @@
 #include <asm/arch/cpu.h>
 #include <asm/arch/sys_proto.h>
 #include <asm/sizes.h>
-
+#include <asm/armv7.h>
+#include <asm/pl310.h>
 DECLARE_GLOBAL_DATA_PTR;
 
 /*
@@ -127,3 +128,25 @@ int arch_cpu_init(void)
 	set_muxconf_regs();
 	return 0;
 }
+
+/*
+ * Outer cache related functions
+ */
+#ifndef CONFIG_SYS_NO_DCACHE
+void v7_setup_outer_cache_ops(void)
+{
+#ifndef CONFIG_L2_OFF
+	v7_outer_cache.enable = omap4_enable_pl310;
+	v7_outer_cache.disable = omap4_disable_pl310;
+	v7_outer_cache.flush_all = pl310_clean_inval_all;
+	v7_outer_cache.flush_range = pl310_clean_inval_range;
+	v7_outer_cache.inval_range = pl310_inval_range;
+	/*
+	 * Do not setup v7_outer_cache.inval_all
+	 * Not needed in OMAP4 because ROM code invalidates entire L2$ at
+	 * bootup.
+	 */
+	v7_outer_cache.inval_all = NULL;
+#endif
+}
+#endif
diff --git a/arch/arm/cpu/armv7/omap4/lowlevel_init.S b/arch/arm/cpu/armv7/omap4/lowlevel_init.S
index 026dfa4..4beed53 100644
--- a/arch/arm/cpu/armv7/omap4/lowlevel_init.S
+++ b/arch/arm/cpu/armv7/omap4/lowlevel_init.S
@@ -29,6 +29,8 @@
 #include <asm/arch/omap4.h>
 
 .globl lowlevel_init
+.globl omap4_enable_pl310
+.globl omap4_disable_pl310
 lowlevel_init:
 	/*
 	 * Setup a temporary stack
@@ -45,3 +47,18 @@ lowlevel_init:
 	 */
 	bl	s_init
 	pop	{ip, pc}
+
+set_pl310_ctrl_reg:
+        PUSH	{r4-r11, lr}	@ save registers - ROM code may pollute
+				@ our registers
+        LDR	r12, =0x102	@ Set PL310 control register - value in R0
+        SMC	#0		@ call ROM Code API to set control register
+        POP	{r4-r11, pc}
+
+omap4_enable_pl310:
+	MOV	r0, #1
+	B	set_pl310_ctrl_reg
+
+omap4_disable_pl310:
+	MOV	r0, #0
+	B	set_pl310_ctrl_reg
diff --git a/arch/arm/include/asm/arch-omap4/omap4.h b/arch/arm/include/asm/arch-omap4/omap4.h
index a30bb33..9c6ec6f 100644
--- a/arch/arm/include/asm/arch-omap4/omap4.h
+++ b/arch/arm/include/asm/arch-omap4/omap4.h
@@ -100,6 +100,12 @@
 #define PRM_RSTCTRL		PRM_DEVICE_BASE
 #define PRM_RSTCTRL_RESET	0x01
 
+/* PL310 base */
+#define CONFIG_SYS_PL310_BASE	0x48242000
+
+/* PL310 associativity */
+#define OMAP44XX_PL310_ASSOCIATIVITY	16
+
 #ifndef __ASSEMBLY__
 
 struct s32ktimer {
diff --git a/arch/arm/include/asm/arch-omap4/sys_proto.h b/arch/arm/include/asm/arch-omap4/sys_proto.h
index 4813e9e..79c373d 100644
--- a/arch/arm/include/asm/arch-omap4/sys_proto.h
+++ b/arch/arm/include/asm/arch-omap4/sys_proto.h
@@ -31,7 +31,8 @@ struct omap_sysinfo {
 void gpmc_init(void);
 void watchdog_init(void);
 u32 get_device_type(void);
-void invalidate_dcache(u32);
+void omap4_enable_pl310(void);
+void omap4_disable_pl310(void);
 void set_muxconf_regs(void);
 void sr32(void *, u32, u32, u32);
 u32 wait_on_value(u32, u32, void *, u32);
diff --git a/include/configs/omap4_panda.h b/include/configs/omap4_panda.h
index 2b03b0f..0ec306a 100644
--- a/include/configs/omap4_panda.h
+++ b/include/configs/omap4_panda.h
@@ -45,9 +45,6 @@
 #define CONFIG_DISPLAY_CPUINFO		1
 #define CONFIG_DISPLAY_BOARDINFO	1
 
-/* Keep L2 Cache Disabled */
-#define CONFIG_L2_OFF			1
-
 /* Clock Defines */
 #define V_OSCK			38400000	/* Clock output from T2 */
 #define V_SCLK                   V_OSCK
@@ -234,4 +231,7 @@
 					 CONFIG_SYS_INIT_RAM_SIZE - \
 					 GENERATED_GBL_DATA_SIZE)
 
+#ifndef CONFIG_L2_OFF
+#define CONFIG_SYS_USE_PL310		1
+#endif
 #endif /* __CONFIG_H */
diff --git a/include/configs/omap4_sdp4430.h b/include/configs/omap4_sdp4430.h
index d288333..0e5840e 100644
--- a/include/configs/omap4_sdp4430.h
+++ b/include/configs/omap4_sdp4430.h
@@ -46,9 +46,6 @@
 #define CONFIG_DISPLAY_CPUINFO		1
 #define CONFIG_DISPLAY_BOARDINFO	1
 
-/* Keep L2 Cache Disabled */
-#define CONFIG_L2_OFF			1
-
 /* Clock Defines */
 #define V_OSCK			38400000	/* Clock output from T2 */
 #define V_SCLK                   V_OSCK
@@ -239,4 +236,7 @@
 					 CONFIG_SYS_INIT_RAM_SIZE - \
 					 GENERATED_GBL_DATA_SIZE)
 
+#ifndef CONFIG_L2_OFF
+#define CONFIG_SYS_USE_PL310		1
+#endif
 #endif /* __CONFIG_H */
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 7/8] armv7: adapt omap3 to the new cache maintenance framework
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
                   ` (5 preceding siblings ...)
  2010-12-22 11:54 ` [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2011-01-09 22:57   ` Wolfgang Denk
  2010-12-22 11:54 ` [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx " Aneesh V
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

adapt omap3 to the new layered cache maintenance framework

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/cpu/armv7/omap3/Makefile           |    1 -
 arch/arm/cpu/armv7/omap3/board.c            |  151 ++++++++++++++--
 arch/arm/cpu/armv7/omap3/cache.S            |  263 ---------------------------
 arch/arm/cpu/armv7/omap3/lowlevel_init.S    |   30 +++
 arch/arm/include/asm/arch-omap3/omap3.h     |   20 ++
 arch/arm/include/asm/arch-omap3/sys_proto.h |   10 +-
 6 files changed, 190 insertions(+), 285 deletions(-)
 delete mode 100644 arch/arm/cpu/armv7/omap3/cache.S

diff --git a/arch/arm/cpu/armv7/omap3/Makefile b/arch/arm/cpu/armv7/omap3/Makefile
index 7164d50..522bcd2 100644
--- a/arch/arm/cpu/armv7/omap3/Makefile
+++ b/arch/arm/cpu/armv7/omap3/Makefile
@@ -26,7 +26,6 @@ include $(TOPDIR)/config.mk
 LIB	=  $(obj)lib$(SOC).o
 
 SOBJS	:= lowlevel_init.o
-SOBJS	+= cache.o
 
 COBJS	+= board.o
 COBJS	+= clock.o
diff --git a/arch/arm/cpu/armv7/omap3/board.c b/arch/arm/cpu/armv7/omap3/board.c
index 6c2a132..ea7ac3e 100644
--- a/arch/arm/cpu/armv7/omap3/board.c
+++ b/arch/arm/cpu/armv7/omap3/board.c
@@ -37,8 +37,12 @@
 #include <asm/arch/sys_proto.h>
 #include <asm/arch/mem.h>
 #include <asm/cache.h>
+#include <asm/armv7.h>
 
+/* Declarations */
 extern omap3_sysinfo sysinfo;
+static void omap3_setup_aux_cr(void);
+static void omap3_invalidate_l2_cache_secure(void);
 
 /******************************************************************************
  * Routine: delay
@@ -166,27 +170,15 @@ void s_init(void)
 
 	try_unlock_memory();
 
-	/*
-	 * Right now flushing at low MPU speed.
-	 * Need to move after clock init
-	 */
-	invalidate_dcache(get_device_type());
-#ifndef CONFIG_ICACHE_OFF
-	icache_enable();
-#endif
+	/* Errata workarounds */
+	omap3_setup_aux_cr();
 
-#ifdef CONFIG_L2_OFF
-	l2_cache_disable();
-#else
-	l2_cache_enable();
-#endif
+#ifndef CONFIG_L2_OFF
 	/*
-	 * Writing to AuxCR in U-boot using SMI for GP DEV
-	 * Currently SMI in Kernel on ES2 devices seems to have an issue
-	 * Once that is resolved, we can postpone this config to kernel
+	 * Invalidate L2-cache from secure mode
 	 */
-	if (get_device_type() == GP_DEVICE)
-		setup_auxcr();
+	omap3_invalidate_l2_cache_secure();
+#endif
 
 	set_muxconf_regs();
 	delay(100);
@@ -292,3 +284,126 @@ int checkboard (void)
 	return 0;
 }
 #endif	/* CONFIG_DISPLAY_BOARDINFO */
+
+static void omap3_emu_romcode_call(u32 service_id, u32 *parameters)
+{
+	u32 i, num_params = *parameters;
+	u32 *sram_scratch_space = (u32 *)OMAP3_PUBLIC_SRAM_SCRATCH_AREA;
+	/*
+	 * copy the parameters to an un-cached area to avoid coherency
+	 * issues
+	 */
+	for (i = 0; i < num_params; i++) {
+		__raw_writel(*parameters, sram_scratch_space);
+		parameters++;
+		sram_scratch_space++;
+	}
+	/* Now make the PPA call */
+	do_omap3_emu_romcode_call(service_id, OMAP3_PUBLIC_SRAM_SCRATCH_AREA);
+}
+
+static void omap3_update_aux_cr_secure(u32 set_bits, u32 clear_bits)
+{
+	u32 acr;
+	/* Read ACR */
+	asm volatile ("mrc p15, 0, %0, c1, c0, 1" : "=r" (acr));
+	acr &= ~clear_bits;
+	acr |= set_bits;
+
+	if (get_device_type() == GP_DEVICE) {
+		omap3_gp_romcode_call(OMAP3_GP_ROMCODE_API_WRITE_ACR,
+				       acr);
+	} else {
+		struct emu_hal_params emu_romcode_params;
+		emu_romcode_params.num_params = 1;
+		emu_romcode_params.param1 = acr;
+		omap3_emu_romcode_call(OMAP3_EMU_HAL_API_WRITE_ACR,
+				       (u32 *)&emu_romcode_params);
+	}
+}
+
+static void omap3_update_aux_cr(u32 set_bits, u32 clear_bits)
+{
+	u32 acr;
+	/* Read ACR */
+	asm volatile ("mrc p15, 0, %0, c1, c0, 1" : "=r" (acr));
+	acr &= ~clear_bits;
+	acr |= set_bits;
+	/* Write ACR - affects non-secure banked bits */
+	asm volatile ("mcr p15, 0, %0, c1, c0, 1" : : "r" (acr));
+}
+
+static void omap3_setup_aux_cr(void)
+{
+	/* Workaround for Cortex-A8 errata: #454179 #430973
+	 *	Set "IBE" bit
+	 *	Set "Disable Brach Size Mispredicts" bit
+	 * Workaround for erratum #621766
+	 *	Enable L1NEON bit
+	 * ACR |= (IBE | DBSM | L1NEON) => ACR |= 0xE0
+	 */
+	omap3_update_aux_cr_secure(0xE0, 0);
+}
+
+#ifndef CONFIG_L2_OFF
+/*
+ * Invalidate the entire L2 cache from secure mode
+ * This affects secure entries. Please note that this is needed only
+ * once. After this the regular CP15 based maintenance operations(that
+ * affects only non-secure entries) is enough.
+ * It's assumed that this function is called before D-cache is enabled.
+ * Otherwise there could be coherency issues between secure and
+ * non-secure sw especially in case of EMU where arguments in memory are
+ * passed
+ */
+static void omap3_invalidate_l2_cache_secure(void)
+{
+	if (get_device_type() == GP_DEVICE) {
+		omap3_gp_romcode_call(OMAP3_GP_ROMCODE_API_L2_INVAL,
+				      0);
+	} else {
+		struct emu_hal_params emu_romcode_params;
+		emu_romcode_params.num_params = 1;
+		emu_romcode_params.param1 = 0;
+		omap3_emu_romcode_call(OMAP3_EMU_HAL_API_L2_INVAL,
+				       (u32 *)&emu_romcode_params);
+	}
+}
+
+static void omap3_enable_l2_cache(void)
+{
+	/* Set L2EN */
+	omap3_update_aux_cr_secure(0x2, 0);
+	/*
+	 * On some revisions L2EN bit is banked on some revisions it's not
+	 * No harm in setting both banked bits(in fact this is required
+	 * by an erratum)
+	 */
+	omap3_update_aux_cr(0x2, 0);
+}
+
+static void omap3_disable_l2_cache(void)
+{
+	/* Clear L2EN */
+	omap3_update_aux_cr_secure(0, 0x2);
+	/*
+	 * On some revisions L2EN bit is banked on some revisions it's not
+	 * No harm in setting both banked bits(in fact this is required
+	 * by an erratum)
+	 */
+	omap3_update_aux_cr(0, 0x2);
+}
+#endif /* CONFIG_L2_OFF */
+
+void v7_setup_outer_cache_ops(void)
+{
+	/*
+	 * In OMAP3 L2 cache maintenance is done through CP15.
+	 * Only enabling disabling needs ROM code support and
+	 * is OMAP specific.
+	 */
+#ifndef CONFIG_L2_OFF
+	v7_outer_cache.enable = omap3_enable_l2_cache;
+	v7_outer_cache.disable = omap3_disable_l2_cache;
+#endif
+}
diff --git a/arch/arm/cpu/armv7/omap3/cache.S b/arch/arm/cpu/armv7/omap3/cache.S
deleted file mode 100644
index cda87ba..0000000
--- a/arch/arm/cpu/armv7/omap3/cache.S
+++ /dev/null
@@ -1,263 +0,0 @@
-/*
- * Copyright (c) 2009 Wind River Systems, Inc.
- * Tom Rix <Tom.Rix@windriver.com>
- *
- * This file is based on and replaces the existing cache.c file
- * The copyrights for the cache.c file are:
- *
- * (C) Copyright 2008 Texas Insturments
- *
- * (C) Copyright 2002
- * Sysgo Real-Time Solutions, GmbH <www.elinos.com>
- * Marius Groeger <mgroeger@sysgo.de>
- *
- * (C) Copyright 2002
- * Gary Jennejohn, DENX Software Engineering, <gj@denx.de>
- *
- * See file CREDITS for list of people who contributed to this
- * project.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation; either version 2 of
- * the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
- * MA 02111-1307 USA
- */
-
-#include <asm/arch/omap3.h>
-
-/*
- * omap3 cache code
- */
-
-.align 5
-.global invalidate_dcache
-.global l2_cache_enable
-.global l2_cache_disable
-.global setup_auxcr
-
-/*
- *	invalidate_dcache()
- *
- *	Invalidate the whole D-cache.
- *
- *	Corrupted registers: r0-r5, r7, r9-r11
- *
- *	- mm	- mm_struct describing address space
- */
-invalidate_dcache:
-	stmfd	r13!, {r0 - r5, r7, r9 - r12, r14}
-
-	mov	r7, r0				@ take a backup of device type
-	cmp	r0, #0x3			@ check if the device type is
-						@ GP
-	moveq r12, #0x1				@ set up to invalide L2
-smi:	.word 0x01600070			@ Call SMI monitor (smieq)
-	cmp	r7, #0x3			@ compare again in case its
-						@ lost
-	beq	finished_inval			@ if GP device, inval done
-						@ above
-
-	mrc	p15, 1, r0, c0, c0, 1		@ read clidr
-	ands	r3, r0, #0x7000000		@ extract loc from clidr
-	mov	r3, r3, lsr #23			@ left align loc bit field
-	beq	finished_inval			@ if loc is 0, then no need to
-						@ clean
-	mov	r10, #0				@ start clean at cache level 0
-inval_loop1:
-	add	r2, r10, r10, lsr #1		@ work out 3x current cache
-						@ level
-	mov	r1, r0, lsr r2			@ extract cache type bits from
-						@ clidr
-	and	r1, r1, #7			@ mask of the bits for current
-						@ cache only
-	cmp	r1, #2				@ see what cache we have at
-						@ this level
-	blt	skip_inval			@ skip if no cache, or just
-						@ i-cache
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level
-						@ in cssr
-	mov	r2, #0				@ operand for mcr SBZ
-	mcr	p15, 0, r2, c7, c5, 4		@ flush prefetch buffer to
-						@ sych the new cssr&csidr,
-						@ with armv7 this is 'isb',
-						@ but we compile with armv5
-	mrc	p15, 1, r1, c0, c0, 0		@ read the new csidr
-	and	r2, r1, #7			@ extract the length of the
-						@ cache lines
-	add	r2, r2, #4			@ add 4 (line length offset)
-	ldr	r4, =0x3ff
-	ands	r4, r4, r1, lsr #3		@ find maximum number on the
-						@ way size
-	clz	r5, r4				@ find bit position of way
-						@ size increment
-	ldr	r7, =0x7fff
-	ands	r7, r7, r1, lsr #13		@ extract max number of the
-						@ index size
-inval_loop2:
-	mov	r9, r4				@ create working copy of max
-						@ way size
-inval_loop3:
-	orr	r11, r10, r9, lsl r5		@ factor way and cache number
-						@ into r11
-	orr	r11, r11, r7, lsl r2		@ factor index number into r11
-	mcr	p15, 0, r11, c7, c6, 2		@ invalidate by set/way
-	subs	r9, r9, #1			@ decrement the way
-	bge	inval_loop3
-	subs	r7, r7, #1			@ decrement the index
-	bge	inval_loop2
-skip_inval:
-	add	r10, r10, #2			@ increment cache number
-	cmp	r3, r10
-	bgt	inval_loop1
-finished_inval:
-	mov	r10, #0				@ swith back to cache level 0
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level
-						@ in cssr
-	mcr	p15, 0, r10, c7, c5, 4		@ flush prefetch buffer,
-						@ with armv7 this is 'isb',
-						@ but we compile with armv5
-
-	ldmfd	r13!, {r0 - r5, r7, r9 - r12, pc}
-
-l2_cache_set:
-	stmfd	r13!, {r4 - r6, lr}
-	mov	r5,  r0
-	bl	get_cpu_rev
-	mov	r4,  r0
-	bl	get_cpu_family
-	@ ES2 onwards we can disable/enable L2 ourselves
-	cmp	r0,  #CPU_OMAP34XX
-	cmpeq	r4,  #CPU_3XX_ES10
-	mrc	15, 0, r0, cr1, cr0, 1
-	bic	r0, r0, #2
-	orr	r0, r0, r5, lsl #1
-	mcreq	15, 0, r0, cr1, cr0, 1
-	@ GP Device ROM code API usage here
-	@ r12 = AUXCR Write function and r0 value
-	mov	ip, #3
-	@ SMCNE instruction to call ROM Code API
-	.word	0x11600070
-	ldmfd	r13!, {r4 - r6, pc}
-
-l2_cache_enable:
-	mov	r0, #1
-	b	l2_cache_set
-
-l2_cache_disable:
-	mov	r0, #0
-	b	l2_cache_set
-
-/******************************************************************************
- * Routine: setup_auxcr()
- * Description: Write to AuxCR desired value using SMI.
- *              general use.
- *****************************************************************************/
-setup_auxcr:
-	mrc	p15, 0, r0, c0, c0, 0		@ read main ID register
-	and	r2, r0, #0x00f00000		@ variant
-	and	r3, r0, #0x0000000f		@ revision
-	orr	r1, r3, r2, lsr #20-4		@ combine variant and revision
-	mov	r12, #0x3
-	mrc	p15, 0, r0, c1, c0, 1
-	orr	r0, r0, #0x10			@ Enable ASA
-	@ Enable L1NEON on pre-r2p1 (erratum 621766 workaround)
-	cmp	r1, #0x21
-	orrlt	r0, r0, #1 << 5
-	.word 0xE1600070			@ SMC
-	mov	r12, #0x2
-	mrc	p15, 1, r0, c9, c0, 2
-	@ Set PLD_FWD bit in L2AUXCR on pre-r2p1 (erratum 725233 workaround)
-	cmp	r1, #0x21
-	orrlt	r0, r0, #1 << 27
-	.word 0xE1600070			@ SMC
-	bx	lr
-
-.align 5
-.global v7_flush_dcache_all
-.global v7_flush_cache_all
-
-/*
- *	v7_flush_dcache_all()
- *
- *	Flush the whole D-cache.
- *
- *	Corrupted registers: r0-r7, r9-r11 (r6 only in Thumb mode)
- *
- *	- mm    - mm_struct describing address space
- */
-v7_flush_dcache_all:
-#	dmb					@ ensure ordering with previous memory accesses
-	mrc	p15, 1, r0, c0, c0, 1		@ read clidr
-	ands	r3, r0, #0x7000000		@ extract loc from clidr
-	mov	r3, r3, lsr #23			@ left align loc bit field
-	beq	finished			@ if loc is 0, then no need to clean
-	mov	r10, #0				@ start clean at cache level 0
-loop1:
-	add	r2, r10, r10, lsr #1		@ work out 3x current cache level
-	mov	r1, r0, lsr r2			@ extract cache type bits from clidr
-	and	r1, r1, #7			@ mask of the bits for current cache only
-	cmp	r1, #2				@ see what cache we have@this level
-	blt	skip				@ skip if no cache, or just i-cache
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level in cssr
-	mcr	p15, 0, r10, c7, c5, 4		@ flush prefetch buffer,
-						@ with armv7 this is 'isb',
-						@ but we compile with armv5
-	mrc	p15, 1, r1, c0, c0, 0		@ read the new csidr
-	and	r2, r1, #7			@ extract the length of the cache lines
-	add	r2, r2, #4			@ add 4 (line length offset)
-	ldr	r4, =0x3ff
-	ands	r4, r4, r1, lsr #3		@ find maximum number on the way size
-	clz	r5, r4				@ find bit position of way size increment
-	ldr	r7, =0x7fff
-	ands	r7, r7, r1, lsr #13		@ extract max number of the index size
-loop2:
-	mov	r9, r4				@ create working copy of max way size
-loop3:
-	orr	r11, r10, r9, lsl r5		@ factor way and cache number into r11
-	orr	r11, r11, r7, lsl r2		@ factor index number into r11
-	mcr	p15, 0, r11, c7, c14, 2		@ clean & invalidate by set/way
-	subs	r9, r9, #1			@ decrement the way
-	bge	loop3
-	subs	r7, r7, #1			@ decrement the index
-	bge	loop2
-skip:
-	add	r10, r10, #2			@ increment cache number
-	cmp	r3, r10
-	bgt	loop1
-finished:
-	mov	r10, #0				@ swith back to cache level 0
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level in cssr
-#	dsb
-	mcr	p15, 0, r10, c7, c5, 4		@ flush prefetch buffer,
-						@ with armv7 this is 'isb',
-						@ but we compile with armv5
-	mov	pc, lr
-
-/*
- *	v7_flush_cache_all()
- *
- *	Flush the entire cache system.
- *  The data cache flush is now achieved using atomic clean / invalidates
- *  working outwards from L1 cache. This is done using Set/Way based cache
- *  maintainance instructions.
- *  The instruction cache can still be invalidated back to the point of
- *  unification in a single instruction.
- *
- */
-v7_flush_cache_all:
-	stmfd	sp!, {r0-r7, r9-r11, lr}
-	bl	v7_flush_dcache_all
-	mov	r0, #0
-	mcr	p15, 0, r0, c7, c5, 0		@ I+BTB cache invalidate
-	ldmfd	sp!, {r0-r7, r9-r11, lr}
-	mov	pc, lr
diff --git a/arch/arm/cpu/armv7/omap3/lowlevel_init.S b/arch/arm/cpu/armv7/omap3/lowlevel_init.S
index 109481e..c2feeff 100644
--- a/arch/arm/cpu/armv7/omap3/lowlevel_init.S
+++ b/arch/arm/cpu/armv7/omap3/lowlevel_init.S
@@ -35,6 +35,36 @@
 _TEXT_BASE:
 	.word	CONFIG_SYS_TEXT_BASE	/* sdram load addr from config.mk */
 
+.global omap3_gp_romcode_call
+omap3_gp_romcode_call:
+	PUSH {r4-r12, lr} @ Save all registers from ROM code!
+	MOV r12, r0	@ Copy the Service ID in R12
+	MOV r0, r1	@ Copy parameter to R0
+	DSB
+	DMB
+	SMC #0		@ Use the SMI #1 to enter Monitor
+	POP {r4-r12, pc}
+
+/*
+ * Funtion for making PPA HAL API calls in secure devices
+ * Input:
+ *	R0 - Service ID
+ *	R1 - paramer list
+ */
+.global do_omap3_emu_romcode_call
+do_omap3_emu_romcode_call:
+	PUSH {r4-r12, lr} @ Save all registers from ROM code!
+	MOV r12, r0	@ Copy the Secure Service ID in R12
+	MOV r3, r1	@ Copy the pointer to va_list in R3
+	MOV r1, #0	@ Process ID - 0
+	MOV r2, #OMAP3_EMU_HAL_START_HAL_CRITICAL	@ Copy the pointer
+							@ to va_list in R3
+	MOV r6, #0xFF	@ Indicate new Task call
+	DSB
+	DMB
+	SMC #1		@ Use the SMI #1 to enter Monitor
+	POP {r4-r12, pc}
+
 #if !defined(CONFIG_SYS_NAND_BOOT) && !defined(CONFIG_SYS_NAND_BOOT)
 /**************************************************************************
  * cpy_clk_code: relocates clock code into SRAM where its safer to execute
diff --git a/arch/arm/include/asm/arch-omap3/omap3.h b/arch/arm/include/asm/arch-omap3/omap3.h
index 3957c79..7b861a3 100644
--- a/arch/arm/include/asm/arch-omap3/omap3.h
+++ b/arch/arm/include/asm/arch-omap3/omap3.h
@@ -145,8 +145,14 @@ struct gpio {
 #define SRAM_VECT_CODE			(SRAM_OFFSET0 | SRAM_OFFSET1 | \
 					 SRAM_OFFSET2)
 
+#define OMAP3_PUBLIC_SRAM_BASE		0x40208000 /* Works for GP & EMU */
+#define OMAP3_PUBLIC_SRAM_END		0x40210000
+
 #define LOW_LEVEL_SRAM_STACK		0x4020FFFC
 
+/* scratch area - accessible on both EMU and GP */
+#define OMAP3_PUBLIC_SRAM_SCRATCH_AREA	OMAP3_PUBLIC_SRAM_BASE
+
 #define DEBUG_LED1			149	/* gpio */
 #define DEBUG_LED2			150	/* gpio */
 
@@ -213,4 +219,18 @@ struct gpio {
 
 #define OMAP3730		0x0c00
 
+/*
+ * ROM code API related flags
+ */
+#define OMAP3_GP_ROMCODE_API_L2_INVAL		1
+#define OMAP3_GP_ROMCODE_API_WRITE_ACR		3
+
+/*
+ * EMU device PPA HAL related flags
+ */
+#define OMAP3_EMU_HAL_API_L2_INVAL		40
+#define OMAP3_EMU_HAL_API_WRITE_ACR		42
+
+#define OMAP3_EMU_HAL_START_HAL_CRITICAL	4
+
 #endif
diff --git a/arch/arm/include/asm/arch-omap3/sys_proto.h b/arch/arm/include/asm/arch-omap3/sys_proto.h
index 4a28ba1..25f54ea 100644
--- a/arch/arm/include/asm/arch-omap3/sys_proto.h
+++ b/arch/arm/include/asm/arch-omap3/sys_proto.h
@@ -27,6 +27,11 @@ typedef struct {
 	char *nand_string;
 } omap3_sysinfo;
 
+struct __attribute__ ((__packed__)) emu_hal_params {
+	u32 num_params;
+	u32 param1;
+};
+
 void prcm_init(void);
 void per_clocks_enable(void);
 
@@ -53,9 +58,7 @@ u32 is_running_in_sdram(void);
 u32 is_running_in_sram(void);
 u32 is_running_in_flash(void);
 u32 get_device_type(void);
-void l2cache_enable(void);
 void secureworld_exit(void);
-void setup_auxcr(void);
 void try_unlock_memory(void);
 u32 get_boot_type(void);
 void invalidate_dcache(u32);
@@ -66,5 +69,6 @@ void make_cs1_contiguous(void);
 void omap_nand_switch_ecc(int);
 void power_init_r(void);
 void dieid_num_r(void);
-
+void do_omap3_emu_romcode_call(u32 service_id, u32 parameters);
+void omap3_gp_romcode_call(u32 service_id, u32 parameter);
 #endif
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx to the new cache maintenance framework
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
                   ` (6 preceding siblings ...)
  2010-12-22 11:54 ` [U-Boot] [PATCH 7/8] armv7: adapt omap3 " Aneesh V
@ 2010-12-22 11:54 ` Aneesh V
  2010-12-27  7:25   ` Minkyu Kang
  2010-12-23  4:53 ` [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Steve Sakoman
  2010-12-28 19:51 ` Paulraj, Sandeep
  9 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-22 11:54 UTC (permalink / raw)
  To: u-boot

adapt s5pc1xx to the new layered cache maintenance framework

Signed-off-by: Aneesh V <aneesh@ti.com>
---
 arch/arm/cpu/armv7/s5pc1xx/cache.S            |   86 +-----------------------
 arch/arm/cpu/armv7/s5pc1xx/clock.c            |   12 ++++
 arch/arm/include/asm/arch-s5pc1xx/sys_proto.h |    4 +-
 3 files changed, 18 insertions(+), 84 deletions(-)

diff --git a/arch/arm/cpu/armv7/s5pc1xx/cache.S b/arch/arm/cpu/armv7/s5pc1xx/cache.S
index 7734b32..e62ad75 100644
--- a/arch/arm/cpu/armv7/s5pc1xx/cache.S
+++ b/arch/arm/cpu/armv7/s5pc1xx/cache.S
@@ -23,96 +23,18 @@
  * MA 02111-1307 USA
  */
 
-#include <asm/arch/cpu.h>
-
 .align 5
-.global invalidate_dcache
-.global l2_cache_enable
-.global l2_cache_disable
-
-/*
- * invalidate_dcache()
- * Invalidate the whole D-cache.
- *
- * Corrupted registers: r0-r5, r7, r9-r11
- */
-invalidate_dcache:
-	stmfd	r13!, {r0 - r5, r7, r9 - r12, r14}
-
-	cmp	r0, #0xC100			@ check if the cpu is s5pc100
-
-	beq	finished_inval			@ s5pc100 doesn't need this
-						@ routine
-	mrc	p15, 1, r0, c0, c0, 1		@ read clidr
-	ands	r3, r0, #0x7000000		@ extract loc from clidr
-	mov	r3, r3, lsr #23			@ left align loc bit field
-	beq	finished_inval			@ if loc is 0, then no need to
-						@ clean
-	mov	r10, #0				@ start clean at cache level 0
-inval_loop1:
-	add	r2, r10, r10, lsr #1		@ work out 3x current cache
-						@ level
-	mov	r1, r0, lsr r2			@ extract cache type bits from
-						@ clidr
-	and	r1, r1, #7			@ mask of the bits for current
-						@ cache only
-	cmp	r1, #2				@ see what cache we have at
-						@ this level
-	blt	skip_inval			@ skip if no cache, or just
-						@ i-cache
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level
-						@ in cssr
-	mov	r2, #0				@ operand for mcr SBZ
-	mcr	p15, 0, r2, c7, c5, 4		@ flush prefetch buffer to
-						@ sych the new cssr&csidr,
-						@ with armv7 this is 'isb',
-						@ but we compile with armv5
-	mrc	p15, 1, r1, c0, c0, 0		@ read the new csidr
-	and	r2, r1, #7			@ extract the length of the
-						@ cache lines
-	add	r2, r2, #4			@ add 4 (line length offset)
-	ldr	r4, =0x3ff
-	ands	r4, r4, r1, lsr #3		@ find maximum number on the
-						@ way size
-	clz	r5, r4				@ find bit position of way
-						@ size increment
-	ldr	r7, =0x7fff
-	ands	r7, r7, r1, lsr #13		@ extract max number of the
-						@ index size
-inval_loop2:
-	mov	r9, r4				@ create working copy of max
-						@ way size
-inval_loop3:
-	orr	r11, r10, r9, lsl r5		@ factor way and cache number
-						@ into r11
-	orr	r11, r11, r7, lsl r2		@ factor index number into r11
-	mcr	p15, 0, r11, c7, c6, 2		@ invalidate by set/way
-	subs	r9, r9, #1			@ decrement the way
-	bge	inval_loop3
-	subs	r7, r7, #1			@ decrement the index
-	bge	inval_loop2
-skip_inval:
-	add	r10, r10, #2			@ increment cache number
-	cmp	r3, r10
-	bgt	inval_loop1
-finished_inval:
-	mov	r10, #0				@ swith back to cache level 0
-	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level
-						@ in cssr
-	mcr	p15, 0, r10, c7, c5, 4		@ flush prefetch buffer,
-						@ with armv7 this is 'isb',
-						@ but we compile with armv5
-
-	ldmfd	r13!, {r0 - r5, r7, r9 - r12, pc}
+.global ca8_l2_cache_enable
+.global ca8_l2_cache_disable
 
-l2_cache_enable:
+ca8_l2_cache_enable:
 	push	{r0, r1, r2, lr}
 	mrc	15, 0, r3, cr1, cr0, 1
 	orr	r3, r3, #2
 	mcr	15, 0, r3, cr1, cr0, 1
 	pop	{r1, r2, r3, pc}
 
-l2_cache_disable:
+ca8_l2_cache_disable:
 	push	{r0, r1, r2, lr}
 	mrc	15, 0, r3, cr1, cr0, 1
 	bic	r3, r3, #2
diff --git a/arch/arm/cpu/armv7/s5pc1xx/clock.c b/arch/arm/cpu/armv7/s5pc1xx/clock.c
index 98a27e5..da8b2d7 100644
--- a/arch/arm/cpu/armv7/s5pc1xx/clock.c
+++ b/arch/arm/cpu/armv7/s5pc1xx/clock.c
@@ -26,6 +26,8 @@
 #include <asm/io.h>
 #include <asm/arch/clock.h>
 #include <asm/arch/clk.h>
+#include <asm/arch/sys_proto.h>
+#include <asm/armv7.h>
 
 #define CLK_M	0
 #define CLK_D	1
@@ -328,3 +330,13 @@ void s5p_clock_init(void)
 	get_uart_clk = s5pc1xx_get_uart_clk;
 	get_pwm_clk = s5pc1xx_get_pwm_clk;
 }
+
+#ifndef CONFIG_SYS_NO_DCACHE
+void v7_setup_outer_cache_ops(void)
+{
+#ifndef CONFIG_L2_OFF
+	v7_outer_cache.enable = ca8_l2_cache_enable;
+	v7_outer_cache.disable = ca8_l2_cache_disable;
+#endif
+}
+#endif
diff --git a/arch/arm/include/asm/arch-s5pc1xx/sys_proto.h b/arch/arm/include/asm/arch-s5pc1xx/sys_proto.h
index 3078aaf..23144da 100644
--- a/arch/arm/include/asm/arch-s5pc1xx/sys_proto.h
+++ b/arch/arm/include/asm/arch-s5pc1xx/sys_proto.h
@@ -26,7 +26,7 @@
 
 u32 get_device_type(void);
 void invalidate_dcache(u32);
-void l2_cache_disable(void);
-void l2_cache_enable(void);
+void ca8_l2_cache_disable(void);
+void ca8_l2_cache_enable(void);
 
 #endif
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 0/8] armv7: cache maintenance operations
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
                   ` (7 preceding siblings ...)
  2010-12-22 11:54 ` [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx " Aneesh V
@ 2010-12-23  4:53 ` Steve Sakoman
  2010-12-28 19:51 ` Paulraj, Sandeep
  9 siblings, 0 replies; 50+ messages in thread
From: Steve Sakoman @ 2010-12-23  4:53 UTC (permalink / raw)
  To: u-boot

On Wed, Dec 22, 2010 at 3:54 AM, Aneesh V <aneesh@ti.com> wrote:
> With D-cache and MMU enabled for ARM in u-boot it becomes imperative to
> support a minimal set of cache maintenance operations and necessary
> initializations before enabling MMU.
>
> This series of patches attempt to do the following for armv7:
> * Necessary initialization sequence before enabling MMU that includes
> ?invalidation of TLB, data caches, branch predictor array etc.
> * Framework for supporting SOC specific outer caches in a generic manner
> ?(using a structure of function pointers - inspired by the Linux
> ?implementation)
> * Generic armv7 cache maintenance operations for caches known to the CPU
> * Support for ARM PL310 L2 cache controller used in OMAP4
> * Cleanup of the cleanup_before_linux() function
> * Adapting all armv7 SOCs to use the new framework and removing
> ?duplicated code
>
> Testing:
> * Extensive testing on OMAP4430SDP and OMAP3430SDP by creating coherency
> ?issues and solving them using the maintenance routines
> ? ? ? ?- Eg: memfill a region of memory with a known pattern
> ? ? ? ?- Invalidate the region
> ? ? ? ?- Read back and compare the region with the original pattern
> ? ? ? ?- If match fails it means that invalidate is successful
> ? ? ? ?- Now add a flush call just before the invalidate
> ? ? ? ?- If match succeeds it means that flush was successful
> ? ? ? ?- Outer caches were tested with experiments involving making the
> ? ? ? ? ?function pointers NULL
> * Kernel booting on OMAP4430SDP and OMAP3430SDP

I build tested for Beagle, Overo, Panda, and OMAP4430SDP - no issues.

I run tested on Overo and Panda to evaluate the changes on both OMAP3
and OMAP4 - no issues.

Tested-by: Steve Sakoman <steve.sakoman@linaro.org>

Steve

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx to the new cache maintenance framework
  2010-12-22 11:54 ` [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx " Aneesh V
@ 2010-12-27  7:25   ` Minkyu Kang
  2010-12-27 11:22     ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Minkyu Kang @ 2010-12-27  7:25 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

On 22 December 2010 20:54, Aneesh V <aneesh@ti.com> wrote:
> adapt s5pc1xx to the new layered cache maintenance framework
>
> Signed-off-by: Aneesh V <aneesh@ti.com>
> ---
> ?arch/arm/cpu/armv7/s5pc1xx/cache.S ? ? ? ? ? ?| ? 86 +-----------------------
> ?arch/arm/cpu/armv7/s5pc1xx/clock.c ? ? ? ? ? ?| ? 12 ++++
> ?arch/arm/include/asm/arch-s5pc1xx/sys_proto.h | ? ?4 +-
> ?3 files changed, 18 insertions(+), 84 deletions(-)
>
> diff --git a/arch/arm/cpu/armv7/s5pc1xx/clock.c b/arch/arm/cpu/armv7/s5pc1xx/clock.c
> index 98a27e5..da8b2d7 100644
> --- a/arch/arm/cpu/armv7/s5pc1xx/clock.c
> +++ b/arch/arm/cpu/armv7/s5pc1xx/clock.c
> @@ -26,6 +26,8 @@
> ?#include <asm/io.h>
> ?#include <asm/arch/clock.h>
> ?#include <asm/arch/clk.h>
> +#include <asm/arch/sys_proto.h>
> +#include <asm/armv7.h>
>
> ?#define CLK_M ?0
> ?#define CLK_D ?1
> @@ -328,3 +330,13 @@ void s5p_clock_init(void)
> ? ? ? ?get_uart_clk = s5pc1xx_get_uart_clk;
> ? ? ? ?get_pwm_clk = s5pc1xx_get_pwm_clk;
> ?}
> +
> +#ifndef CONFIG_SYS_NO_DCACHE
> +void v7_setup_outer_cache_ops(void)
> +{
> +#ifndef CONFIG_L2_OFF
> + ? ? ? v7_outer_cache.enable = ca8_l2_cache_enable;
> + ? ? ? v7_outer_cache.disable = ca8_l2_cache_disable;
> +#endif
> +}
> +#endif

I don't agree with add this function at clock.c.
If need we can make new file as omap3/4 case.

Thanks
Minkyu Kang
-- 
from. prom.
www.promsoft.net

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx to the new cache maintenance framework
  2010-12-27  7:25   ` Minkyu Kang
@ 2010-12-27 11:22     ` Aneesh V
  2011-01-07  5:27       ` Minkyu Kang
  0 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2010-12-27 11:22 UTC (permalink / raw)
  To: u-boot

Dear Minkyu Kang,

On Monday 27 December 2010 12:55 PM, Minkyu Kang wrote:
< snip >
>> +
>> +#ifndef CONFIG_SYS_NO_DCACHE
>> +void v7_setup_outer_cache_ops(void)
>> +{
>> +#ifndef CONFIG_L2_OFF
>> +       v7_outer_cache.enable = ca8_l2_cache_enable;
>> +       v7_outer_cache.disable = ca8_l2_cache_disable;
>> +#endif
>> +}
>> +#endif
>
> I don't agree with add this function at clock.c.
> If need we can make new file as omap3/4 case.

I didn't want to add a new file just for this small function. But no
problem, I will do that in the next revision.

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 0/8] armv7: cache maintenance operations
  2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
                   ` (8 preceding siblings ...)
  2010-12-23  4:53 ` [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Steve Sakoman
@ 2010-12-28 19:51 ` Paulraj, Sandeep
  2011-01-08  7:07   ` Albert ARIBAUD
  9 siblings, 1 reply; 50+ messages in thread
From: Paulraj, Sandeep @ 2010-12-28 19:51 UTC (permalink / raw)
  To: u-boot



> Subject: [U-Boot] [PATCH 0/8] armv7: cache maintenance operations
> 
> With D-cache and MMU enabled for ARM in u-boot it becomes imperative to
> support a minimal set of cache maintenance operations and necessary
> initializations before enabling MMU.
> 
> This series of patches attempt to do the following for armv7:
> * Necessary initialization sequence before enabling MMU that includes
>   invalidation of TLB, data caches, branch predictor array etc.
> * Framework for supporting SOC specific outer caches in a generic manner
>   (using a structure of function pointers - inspired by the Linux
>   implementation)
> * Generic armv7 cache maintenance operations for caches known to the CPU
> * Support for ARM PL310 L2 cache controller used in OMAP4
> * Cleanup of the cleanup_before_linux() function
> * Adapting all armv7 SOCs to use the new framework and removing
>   duplicated code
> 
> Testing:
> * Extensive testing on OMAP4430SDP and OMAP3430SDP by creating coherency
>   issues and solving them using the maintenance routines
> 	- Eg: memfill a region of memory with a known pattern
> 	- Invalidate the region
> 	- Read back and compare the region with the original pattern
> 	- If match fails it means that invalidate is successful
> 	- Now add a flush call just before the invalidate
> 	- If match succeeds it means that flush was successful
> 	- Outer caches were tested with experiments involving making the
> 	  function pointers NULL
> * Kernel booting on OMAP4430SDP and OMAP3430SDP
> 
> 
> Aneesh V (8):
>   arm: make default implementation of cache_flush() weakly linked
>   armv7: cache maintenance operations for armv7
>   armv7: integrate cache maintenance support
>   arm: minor fixes for cache and mmu handling
>   armv7: add PL310 support to u-boot
>   armv7: adapt omap4 to the new cache maintenance framework
>   armv7: adapt omap3 to the new cache maintenance framework
>   armv7: adapt s5pc1xx to the new cache maintenance framework


As I mentioned to John Rigby in anoterh e-mail, I will be on vacation till the 20th of January. So there might be a little delay in merging this patch series after due review by the community.

Regards,
Sandeep

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx to the new cache maintenance framework
  2010-12-27 11:22     ` Aneesh V
@ 2011-01-07  5:27       ` Minkyu Kang
  0 siblings, 0 replies; 50+ messages in thread
From: Minkyu Kang @ 2011-01-07  5:27 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

On 27 December 2010 20:22, Aneesh V <aneesh@ti.com> wrote:
> Dear Minkyu Kang,
>
> On Monday 27 December 2010 12:55 PM, Minkyu Kang wrote:
> < snip >
>>>
>>> +
>>> +#ifndef CONFIG_SYS_NO_DCACHE
>>> +void v7_setup_outer_cache_ops(void)
>>> +{
>>> +#ifndef CONFIG_L2_OFF
>>> + ? ? ? v7_outer_cache.enable = ca8_l2_cache_enable;
>>> + ? ? ? v7_outer_cache.disable = ca8_l2_cache_disable;
>>> +#endif
>>> +}
>>> +#endif
>>
>> I don't agree with add this function at clock.c.
>> If need we can make new file as omap3/4 case.
>
> I didn't want to add a new file just for this small function. But no
> problem, I will do that in the next revision.
>

OK, please make soc.c.
I have plan to move reset.S to this file.

Thanks.
Minkyu Kang
-- 
from. prom.
www.promsoft.net

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2010-12-22 11:54 ` [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7 Aneesh V
@ 2011-01-08  6:36   ` Albert ARIBAUD
  2011-01-08  8:40     ` Albert ARIBAUD
                       ` (3 more replies)
  0 siblings, 4 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-08  6:36 UTC (permalink / raw)
  To: u-boot

Hi Aneesh,

Le 22/12/2010 12:54, Aneesh V a ?crit :
> - Add a framework for layered cache maintenance
> 	- separate out SOC specific outer cache maintenance from
> 	  maintenance of caches known to CPU
>
> - Add generic ARMv7 cache maintenance operations that affect all
>    caches known to ARMv7 CPUs. For instance in Cortex-A8 these
>    opertions will affect both L1 and L2 caches. In Cortex-A9
>    these will affect only L1 cache
>
> - D-cache operations supported:
> 	- Invalidate entire D-cache
> 	- Invalidate D-cache range
> 	- Flush(clean&  invalidate) entire D-cache
> 	- Flush D-cache range
> - I-cache operations supported:
> 	- Invalidate entire I-cache
>
> - Add maintenance functions for TLB, branch predictor array etc.
>
> - Enable -march=armv7-a so that armv7 assembly instructions can be
>    used
>
> Signed-off-by: Aneesh V<aneesh@ti.com>
> ---
>   arch/arm/cpu/armv7/Makefile   |    2 +-
>   arch/arm/cpu/armv7/cache_v7.c |  359 +++++++++++++++++++++++++++++++++++++++++
>   arch/arm/cpu/armv7/config.mk  |    2 +-
>   arch/arm/include/asm/armv7.h  |   63 +++++++
>   include/common.h              |    5 +-
>   5 files changed, 428 insertions(+), 3 deletions(-)
>   create mode 100644 arch/arm/cpu/armv7/cache_v7.c
>   create mode 100644 arch/arm/include/asm/armv7.h
>
> diff --git a/arch/arm/cpu/armv7/Makefile b/arch/arm/cpu/armv7/Makefile
> index 8c0e915..299792a 100644
> --- a/arch/arm/cpu/armv7/Makefile
> +++ b/arch/arm/cpu/armv7/Makefile
> @@ -26,7 +26,7 @@ include $(TOPDIR)/config.mk
>   LIB	= $(obj)lib$(CPU).o
>
>   START	:= start.o
> -COBJS	:= cpu.o
> +COBJS	:= cpu.o cache_v7.o
>   COBJS  += syslib.o
>
>   SRCS	:= $(START:.o=.S) $(COBJS:.o=.c)
> diff --git a/arch/arm/cpu/armv7/cache_v7.c b/arch/arm/cpu/armv7/cache_v7.c
> new file mode 100644
> index 0000000..0521d66
> --- /dev/null
> +++ b/arch/arm/cpu/armv7/cache_v7.c
> @@ -0,0 +1,359 @@
> +/*
> + * (C) Copyright 2010
> + * Texas Instruments Incorporated - http://www.ti.com/
> + * Aneesh V<aneesh@ti.com>
> + *
> + * See file CREDITS for list of people who contributed to this
> + * project.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 of
> + * the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
> + * MA 02111-1307 USA
> + */
> +#include<linux/types.h>
> +#include<common.h>
> +#include<asm/armv7.h>
> +
> +#define ARMV7_DCACHE_INVAL_ALL		1
> +#define ARMV7_DCACHE_CLEAN_INVAL_ALL	2
> +#define ARMV7_DCACHE_INVAL_RANGE	3
> +#define ARMV7_DCACHE_CLEAN_INVAL_RANGE	4
> +
> +struct v7_outer_cache_ops v7_outer_cache;
> +
> +#ifndef CONFIG_SYS_NO_DCACHE
> +/*
> + * Write the level and type you want to Cache Size Selection Register(CSSELR)
> + * to get size details from Current Cache Size ID Register(CCSIDR)
> + */
> +static void set_csselr(u32 level, u32 type)
> +{	u32 csselr = level<<  1 | type;
> +	/* Write to Cache Size Selection Register(CSSELR) */
> +	asm volatile ("mcr p15, 2, %0, c0, c0, 0" : : "r" (csselr));
> +}
> +
> +static u32 get_ccsidr(void)
> +{
> +	u32 ccsidr;
> +	/* Read current CP15 Cache Size ID Register */
> +	asm volatile ("mrc p15, 1, %0, c0, c0, 0" : "=r" (ccsidr));
> +	return ccsidr;
> +}
> +
> +static u32 get_clidr(void)
> +{
> +	u32 clidr;
> +	/* Read current CP15 Cache Level ID Register */
> +	asm volatile ("mrc p15,1,%0,c0,c0,1" : "=r" (clidr));
> +	return clidr;
> +}
> +
> +/* round up the input number to a power of 2 and get the log2 */
> +static inline u32 log_2_round_up(u32 num)
> +{
> +	/* count leading zeros */
> +	asm volatile ("CLZ %0, %0" : "+r" (num));
> +
> +	/* position of most significant 1 */
> +	num = 31 - num;
> +
> +	return num;
> +}
> +
> +static void v7_inval_dcache_level_setway(u32 level, u32 num_sets,
> +					 u32 num_ways, u32 way_shift,
> +					 u32 log2_line_len)
> +{
> +	int way, set, setway;
> +	/*
> +	 * For optimal assembly code:
> +	 *	a. count down
> +	 *	b. have bigger loop inside
> +	 */

Out of curiosity, can you elaborate on why the compiler would optimize 
better in these cases?

> +	for (way = num_ways - 1; way>= 0 ; way--)
> +		for (set = num_sets - 1; set>= 0; set--) {

Please fix whitespacing around operators. The best way to ''catch'em 
all'' is to run Linux' checkpatch.pl (I do this with option --no-tree) 
on all patches that you submit to u-boot and, fix all warning and errors 
and if some are left that you think should not be fixed, mention them 
and explain why they're wrongly emitted.

> +			setway = (level<<  1) | (set<<  log2_line_len) |
> +				 (way<<  way_shift);
> +			/* Invalidate data/unified cache line by set/way */
> +			asm volatile ("	mcr p15, 0, %0, c7, c6, 2"
> +					: : "r" (setway));
> +		}
> +	/* Make sure the operation is complete */
> +	asm volatile ("DMB");
> +}
> +
> +static void v7_clean_inval_dcache_level_setway(u32 level, u32 num_sets,
> +					       u32 num_ways, u32 way_shift,
> +					       u32 log2_line_len)
> +{
> +	int way, set, setway;
> +	/*
> +	 * For optimal assembly code:
> +	 *	a. count down
> +	 *	b. have bigger loop inside
> +	 */
> +	for (way = num_ways - 1; way>= 0 ; way--)
> +		for (set = num_sets - 1; set>= 0; set--) {
> +			setway = (level<<  1) | (set<<  log2_line_len) |
> +				 (way<<  way_shift);
> +			/*
> +			 * Clean&  Invalidate data/unified
> +			 * cache line by set/way
> +			 */
> +			asm volatile ("	mcr p15, 0, %0, c7, c14, 2"
> +					: : "r" (setway));
> +		}
> +	/* Make sure the operation is complete */
> +	asm volatile ("DMB");
> +}
> +
> +static void v7_maint_dcache_level_setway(u32 level, u32 operation)
> +{
> +	u32 ccsidr;
> +	u32 num_sets, num_ways, log2_line_len, log2_num_ways;
> +	u32 way_shift;
> +	set_csselr(level, ARMV7_CSSELR_IND_DATA_UNIFIED);
> +
> +	ccsidr = get_ccsidr();
> +
> +	log2_line_len = mask_n_get(ccsidr, 0, 2) + 2;
> +	/* Converting from words to bytes */
> +	log2_line_len += 2;
> +
> +	num_ways  = mask_n_get(ccsidr, 3, 12) + 1;
> +	num_sets  = mask_n_get(ccsidr, 13, 27) + 1;
> +	/*
> +	 * According to ARMv7 ARM number of sets and number of ways need
> +	 * not be a power of 2
> +	 */
> +	log2_num_ways = log_2_round_up(num_ways);
> +
> +	way_shift = (32 - log2_num_ways);
> +	if (operation == ARMV7_DCACHE_INVAL_ALL)
> +		v7_inval_dcache_level_setway(level, num_sets, num_ways,
> +				      way_shift, log2_line_len);
> +	else if (operation == ARMV7_DCACHE_CLEAN_INVAL_ALL)
> +		v7_clean_inval_dcache_level_setway(level, num_sets, num_ways,
> +						   way_shift, log2_line_len);
> +}
> +
> +static void v7_maint_dcache_all(u32 operation)
> +{
> +	u32 level, cache_type, level_start_bit = 0;
> +
> +	u32 clidr = get_clidr();
> +
> +	for (level = 0; level<  7; level++) {
> +		cache_type = mask_n_get(clidr, level_start_bit,
> +					level_start_bit + 2);
> +		if ((cache_type == ARMV7_CLIDR_CTYPE_DATA_ONLY) ||
> +		    (cache_type == ARMV7_CLIDR_CTYPE_INSTRUCTION_DATA) ||
> +		    (cache_type == ARMV7_CLIDR_CTYPE_UNIFIED))
> +			v7_maint_dcache_level_setway(level, operation);
> +		level_start_bit += 3;
> +	}
> +}
> +
> +static void v7_dcache_clean_inval_range(u32 start,
> +					u32 stop, u32 line_len)
> +{
> +	u32 mva;
> +	/* Align start to cache line boundary */
> +	start&= ~(line_len - 1);
> +	for (mva = start; mva<  stop; mva = mva + line_len)
> +		/* DCCIMVAC - Clean&  Invalidate data cache by MVA to PoC */
> +		asm volatile ("mcr p15, 0, %0, c7, c14, 1" : : "r" (mva));
> +}
> +
> +static void v7_dcache_inval_range(u32 start, u32 stop, u32 line_len)
> +{
> +	u32 mva;
> +
> +	/*
> +	 * If start address is not aligned to cache-line flush the first
> +	 * line to prevent affecting somebody else's buffer
> +	 */
> +	if (start&  (line_len - 1)) {
> +		v7_dcache_clean_inval_range(start, start + 1, line_len);
> +		/* move to next cache line */
> +		start = (start + line_len - 1)&  ~(line_len - 1);
> +	}
> +
> +	/*
> +	 * If stop address is not aligned to cache-line flush the last
> +	 * line to prevent affecting somebody else's buffer
> +	 */
> +	if (stop&  (line_len - 1)) {
> +		v7_dcache_clean_inval_range(stop, stop + 1, line_len);
> +		/* align to the beginning of this cache line */
> +		stop&= ~(line_len - 1);
> +	}
> +
> +	for (mva = start; mva<  stop; mva = mva + line_len)
> +		/* DCIMVAC - Invalidate data cache by MVA to PoC */
> +		asm volatile ("mcr p15, 0, %0, c7, c6, 1" : : "r" (mva));
> +}
> +
> +static void v7_dcache_maint_range(u32 start, u32 stop, u32 range_op)
> +{
> +	u32 line_len, ccsidr;
> +	ccsidr = get_ccsidr();
> +	line_len = mask_n_get(ccsidr, 0, 2) + 2;
> +	/* Converting from words to bytes */
> +	line_len += 2;
> +	/* converting from log2(linelen) to linelen */
> +	line_len = 1<<  line_len;
> +
> +	switch (range_op) {
> +	case ARMV7_DCACHE_CLEAN_INVAL_RANGE:
> +		v7_dcache_clean_inval_range(start, stop, line_len);
> +		break;
> +	case ARMV7_DCACHE_INVAL_RANGE:
> +		v7_dcache_inval_range(start, stop, line_len);
> +		break;
> +	}
> +
> +	/* Make sure the operation is complete */
> +	asm volatile ("DMB");
> +}
> +
> +/* Invalidate TLB */
> +static void v7_inval_tlb(void)
> +{
> +	/* Invalidate entire unified TLB */
> +	asm volatile ("mcr p15, 0, %0, c8, c7, 0" : : "r" (0));
> +	/* Invalidate entire data TLB */
> +	asm volatile ("mcr p15, 0, %0, c8, c6, 0" : : "r" (0));
> +	/* Invalidate entire instruction TLB */
> +	asm volatile ("mcr p15, 0, %0, c8, c5, 0" : : "r" (0));
> +	/* Full system DSB - make sure that the invalidation is complete */
> +	asm volatile ("DSB");
> +	/* Full system ISB - make sure the instruction stream sees it */
> +	asm volatile ("ISB");
> +}
> +
> +void invalidate_dcache_all(void)
> +{
> +	v7_maint_dcache_all(ARMV7_DCACHE_INVAL_ALL);
> +	if (v7_outer_cache.inval_all)
> +		v7_outer_cache.inval_all();

Why use pointers here rather than weak functions?

> +}
> +
> +/*
> + * Performs a clean&  invalidation of the entire data cache
> + * at all levels
> + */
> +void flush_dcache_all(void)
> +{
> +	v7_maint_dcache_all(ARMV7_DCACHE_CLEAN_INVAL_ALL);
> +	if (v7_outer_cache.flush_all)
> +		v7_outer_cache.flush_all();
> +}
> +
> +/*
> + * Invalidates range in all levels of D-cache/unified cache used:
> + * Affects the range [start, stop - 1]
> + */
> +void invalidate_dcache_range(unsigned long start, unsigned long stop)
> +{
> +
> +	v7_dcache_maint_range(start, stop, ARMV7_DCACHE_INVAL_RANGE);
> +	if (v7_outer_cache.inval_range)
> +		/* physical address is same as virtual address */
> +		v7_outer_cache.inval_range(start, stop);
> +}
> +
> +/*
> + * Flush range(clean&  invalidate) from all levels of D-cache/unified
> + * cache used:
> + * Affects the range [start, stop - 1]
> + */
> +void flush_dcache_range(unsigned long start, unsigned long stop)
> +{
> +	v7_dcache_maint_range(start, stop, ARMV7_DCACHE_CLEAN_INVAL_RANGE);
> +	if (v7_outer_cache.flush_range)
> +		/* physical address is same as virtual address */
> +		v7_outer_cache.flush_range(start, stop);
> +}
> +static void __v7_setup_outer_cache_ops(void)
> +{
> +	puts("v7_setup_outer_cache_ops: dummy implementation! "
> +	     "real implementation not available!!\n");
> +}
> +
> +void v7_setup_outer_cache_ops(void)
> +	__attribute__((weak, alias("__v7_setup_outer_cache_ops")));
> +
> +void arm_init_before_mmu(void)
> +{
> +	v7_setup_outer_cache_ops();
> +	if (v7_outer_cache.enable)
> +		v7_outer_cache.enable();
> +	invalidate_dcache_all();
> +	v7_inval_tlb();
> +}
> +#else
> +void invalidate_dcache_all(void)
> +{
> +}
> +
> +void flush_dcache_all(void)
> +{
> +}
> +
> +void invalidate_dcache_range(unsigned long start, unsigned long stop)
> +{
> +}
> +
> +void flush_dcache_range(unsigned long start, unsigned long stop)
> +{
> +}
> +
> +void arm_init_before_mmu(void)
> +{
> +}
> +#endif
> +
> +#ifndef CONFIG_SYS_NO_ICACHE
> +/* Invalidate entire I-cache and branch predictor array */
> +void invalidate_icache_all(void)
> +{
> +	/*
> +	 * Invalidate all instruction caches to PoU.
> +	 * Also flushes branch target cache.
> +	 */
> +	asm volatile ("mcr p15, 0, %0, c7, c5, 0" : : "r" (0));
> +
> +	/* Invalidate entire branch predictor array */
> +	asm volatile ("mcr p15, 0, %0, c7, c5, 6" : : "r" (0));
> +
> +	/* Full system DSB - make sure that the invalidation is complete */
> +	asm volatile ("DSB");
> +	/* Full system ISB - make sure the instruction stream sees it */
> +	asm volatile ("ISB");
> +}
> +#else
> +void invalidate_icache_all(void)
> +{
> +}
> +#endif
> +
> +/*
> + * Flush range from all levels of d-cache/unified-cache used:
> + * Affects the range [start, start + size - 1]
> + */
> +void  flush_cache(unsigned long start, unsigned long size)
> +{
> +	flush_dcache_range(start, start + size);
> +}

This function is the only one which is defined to something non-empty 
when CONFIG_SYS_NO_DCACHE is defined. Why is it not in the big #ifndef 
for dcache above ?

> diff --git a/arch/arm/cpu/armv7/config.mk b/arch/arm/cpu/armv7/config.mk
> index 49ac9c7..7f9b171 100644
> --- a/arch/arm/cpu/armv7/config.mk
> +++ b/arch/arm/cpu/armv7/config.mk
> @@ -23,7 +23,7 @@
>   PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
>
>   # Make ARMv5 to allow more compilers to work, even though its v7a.
> -PLATFORM_CPPFLAGS += -march=armv5
> +PLATFORM_CPPFLAGS += -march=armv7-a

Did you check that this does not break any board using armv7?

>   # =========================================================================
>   #
>   # Supply options according to compiler version
> diff --git a/arch/arm/include/asm/armv7.h b/arch/arm/include/asm/armv7.h
> new file mode 100644
> index 0000000..57409b6
> --- /dev/null
> +++ b/arch/arm/include/asm/armv7.h
> @@ -0,0 +1,63 @@
> +/*
> + * (C) Copyright 2010
> + * Texas Instruments Incorporated - http://www.ti.com/
> + *
> + * Aneesh V<aneesh@ti.com>
> + *
> + * See file CREDITS for list of people who contributed to this
> + * project.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 of
> + * the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
> + * MA 02111-1307 USA
> + */
> +#ifndef ARMV7_H
> +#define ARMV7_H
> +
> +#include<linux/types.h>
> +
> +/*
> + * Values for InD field in CSSELR
> + * Selects the type of cache
> + */
> +#define ARMV7_CSSELR_IND_DATA_UNIFIED	0
> +#define ARMV7_CSSELR_IND_INSTRUCTION	1
> +
> +/* Values for Ctype fields in CLIDR */
> +#define ARMV7_CLIDR_CTYPE_NO_CACHE		0
> +#define ARMV7_CLIDR_CTYPE_INSTRUCTION_ONLY	1
> +#define ARMV7_CLIDR_CTYPE_DATA_ONLY		2
> +#define ARMV7_CLIDR_CTYPE_INSTRUCTION_DATA	3
> +#define ARMV7_CLIDR_CTYPE_UNIFIED		4
> +
> +/* some utility macros */
> +#define mask(start, end) \
> +	(((1<<  ((end) - (start) + 1)) - 1)<<  (start))
> +
> +#define mask_n_get(reg, start, end) \
> +	(((reg)&  mask(start, end))>>  (start))

Seeing as these functions are only used in the ARMv7 cache C file, they 
should be moved there.

> +struct v7_outer_cache_ops {
> +	void (*enable)(void);
> +	void (*disable)(void);
> +	void (*flush_all)(void);
> +	void (*inval_all)(void);
> +	void (*flush_range)(u32 start, u32 end);
> +	void (*inval_range)(u32 start, u32 end);
> +};
> +
> +extern struct v7_outer_cache_ops v7_outer_cache;
> +
> +void v7_setup_outer_cache_ops(void);
> +#endif
> diff --git a/include/common.h b/include/common.h
> index 189ad81..d750ff9 100644
> --- a/include/common.h
> +++ b/include/common.h
> @@ -411,6 +411,7 @@ void	icache_disable(void);
>   int	dcache_status (void);
>   void	dcache_enable (void);
>   void	dcache_disable(void);
> +void	mmu_disable(void);
>   void	relocate_code (ulong, gd_t *, ulong) __attribute__ ((noreturn));
>   ulong	get_endaddr   (void);
>   void	trap_init     (ulong);
> @@ -603,9 +604,11 @@ ulong	video_setmem (ulong);
>
>   /* arch/$(ARCH)/lib/cache.c */
>   void	flush_cache   (unsigned long, unsigned long);
> +void	flush_dcache_all(void);
>   void	flush_dcache_range(unsigned long start, unsigned long stop);
>   void	invalidate_dcache_range(unsigned long start, unsigned long stop);
> -
> +void	invalidate_dcache_all(void);
> +void	invalidate_icache_all(void);
>
>   /* arch/$(ARCH)/lib/ticks.S */
>   unsigned long long get_ticks(void);

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 1/8] arm: make default implementation of cache_flush() weakly linked
  2010-12-22 11:54 ` [U-Boot] [PATCH 1/8] arm: make default implementation of cache_flush() weakly linked Aneesh V
@ 2011-01-08  6:40   ` Albert ARIBAUD
  0 siblings, 0 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-08  6:40 UTC (permalink / raw)
  To: u-boot

Le 22/12/2010 12:54, Aneesh V a ?crit :
> make default implementation of cache_flush() weakly linked so that
> sub-architectures can override it
>
> Signed-off-by: Aneesh V<aneesh@ti.com>

Acked-by: Albert Aribaud <albert.aribaud@free.fr>

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 3/8] armv7: integrate cache maintenance support
  2010-12-22 11:54 ` [U-Boot] [PATCH 3/8] armv7: integrate cache maintenance support Aneesh V
@ 2011-01-08  6:54   ` Albert ARIBAUD
  2011-01-08  8:15     ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-08  6:54 UTC (permalink / raw)
  To: u-boot

Hi Aneesh,

Le 22/12/2010 12:54, Aneesh V a ?crit :
> - Enable I-cache on bootup
> - Enable MMU and D-cache immediately after relocation
> 	- Do necessary initialization before enabling d-cache and MMU
> - Changes to cleanup_before_linux()
> 	- Make changes according to the new framework
>
> Signed-off-by: Aneesh V<aneesh@ti.com>
> ---
>   arch/arm/cpu/armv7/cpu.c   |   47 +++++++++++++++++++------------------------
>   arch/arm/cpu/armv7/start.S |   18 +++++++++++++++-
>   arch/arm/lib/board.c       |    6 +++++
>   arch/arm/lib/cache-cp15.c  |    9 ++++++++
>   4 files changed, 53 insertions(+), 27 deletions(-)
>
> diff --git a/arch/arm/cpu/armv7/cpu.c b/arch/arm/cpu/armv7/cpu.c
> index a01e0d6..b418304 100644
> --- a/arch/arm/cpu/armv7/cpu.c
> +++ b/arch/arm/cpu/armv7/cpu.c
> @@ -38,13 +38,10 @@
>   #ifndef CONFIG_L2_OFF
>   #include<asm/arch/sys_proto.h>
>   #endif
> -
> -static void cache_flush(void);
> +#include<asm/armv7.h>
>
>   int cleanup_before_linux(void)
>   {
> -	unsigned int i;
> -
>   	/*
>   	 * this function is called just before we call linux
>   	 * it prepares the processor for linux
> @@ -53,31 +50,29 @@ int cleanup_before_linux(void)
>   	 */
>   	disable_interrupts();
>
> -	/* turn off I/D-cache */
> +	/*
> +	 * Turn off I-cache and invalidate it
> +	 */
>   	icache_disable();
> -	dcache_disable();

(1) -- see below

> +	invalidate_icache_all();
>
> -	/* invalidate I-cache */
> -	cache_flush();

(2) -- see below

> -
> -#ifndef CONFIG_L2_OFF
> -	/* turn off L2 cache */
> -	l2_cache_disable();
> -	/* invalidate L2 cache also */
> -	invalidate_dcache(get_device_type());
> -#endif
> -	i = 0;
> -	/* mem barrier to sync up things */
> -	asm("mcr p15, 0, %0, c7, c10, 4": :"r"(i));
> +	/*
> +	 * turn off D-cache
> +	 * dcache_disable() in turn flushes the d-cache and disables MMU
> +	 */
> +	dcache_disable();

(3) -- see below

> -#ifndef CONFIG_L2_OFF
> -	l2_cache_enable();
> -#endif
> +	/*
> +	 * After D-cache is flushed and before it is disabled there may
> +	 * be some new valid entries brought into the cache. We are sure
> +	 * that these lines are not dirty and will not affect our execution.
> +	 * (because unwinding the call-stack and setting a bit in CP15 SCTRL
> +	 * is all we did during this. We have not pushed anything on to the
> +	 * stack. Neither have we affected any static data)
> +	 * So just invalidate the entire d-cache again to avoid coherency
> +	 * problems for kernel
> +	 */
> +	invalidate_dcache_all();

I'm not sure about the logic here, but I am no cache guru, so don't 
hesitate to make me... flush... with shame. If dcache_disable stayed in 
(1) instead of being moved to (3), wouldn't the cache_flush in (2) 
ensure the no new valid entries would appear in the dcache?
  Put another way, I'd have naively thought the sequence would be:

- disable L2 cache
- disable L1 I and D cache
- invalidate L1 I cache and flush L1 D cache
- flush L2 D cache

>   	return 0;
>   }
> -
> -static void cache_flush(void)
> -{
> -	asm ("mcr p15, 0, %0, c7, c5, 0": :"r" (0));
> -}
> diff --git a/arch/arm/cpu/armv7/start.S b/arch/arm/cpu/armv7/start.S
> index 684f2d2..7d17f1e 100644
> --- a/arch/arm/cpu/armv7/start.S
> +++ b/arch/arm/cpu/armv7/start.S
> @@ -241,6 +241,14 @@ clbss_l:str	r2, [r0]		/* clear loop...		    */
>    * initialization, now running from RAM.
>    */
>   jump_2_ram:
> +/*
> + * If I-cache is enabled invalidate it
> + */
> +#ifndef CONFIG_SYS_NO_ICACHE
> +	mcr	p15, 0, r0, c7, c5, 0	@ invalidate icache
> +	dsb
> +	isb
> +#endif
>   	ldr	r0, _board_init_r_ofs
>   	adr	r1, _start
>   	add	lr, r0, r1
> @@ -276,6 +284,9 @@ cpu_init_crit:
>   	mov	r0, #0			@ set up for MCR
>   	mcr	p15, 0, r0, c8, c7, 0	@ invalidate TLBs
>   	mcr	p15, 0, r0, c7, c5, 0	@ invalidate icache
> +	mcr	p15, 0, r0, c7, c5, 6  /* invalidate BP array */
> +	dsb
> +	isb
>
>   	/*
>   	 * disable MMU stuff and caches
> @@ -284,7 +295,12 @@ cpu_init_crit:
>   	bic	r0, r0, #0x00002000	@ clear bits 13 (--V-)
>   	bic	r0, r0, #0x00000007	@ clear bits 2:0 (-CAM)
>   	orr	r0, r0, #0x00000002	@ set bit 1 (--A-) Align
> -	orr	r0, r0, #0x00000800	@ set bit 12 (Z---) BTB
> +	orr	r0, r0, #0x00000800	@ set bit 11 (Z---) BTB
> +#ifdef CONFIG_SYS_NO_ICACHE
> +	bic	r0, r0, #0x00001000	@ clear bit 12 (I) I-cache
> +#else
> +	orr	r0, r0, #0x00001000	@ set bit 12 (I) I-cache
> +#endif
>   	mcr	p15, 0, r0, c1, c0, 0
>
>   	/*
> diff --git a/arch/arm/lib/board.c b/arch/arm/lib/board.c
> index 7266381..bef32a6 100644
> --- a/arch/arm/lib/board.c
> +++ b/arch/arm/lib/board.c
> @@ -459,6 +459,12 @@ void board_init_r (gd_t *id, ulong dest_addr)
>
>   	gd->flags |= GD_FLG_RELOC;	/* tell others: relocation done */
>
> +	/*
> +	 * Enable D$:
> +	 * I$, if needed, must be already enabled in start.S
> +	 */
> +	dcache_enable();
> +
>   	monitor_flash_len = _bss_start_ofs;
>   	debug ("monitor flash len: %08lX\n", monitor_flash_len);
>   	board_init();	/* Setup chipselects */
> diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
> index d9175f0..ca526fb 100644
> --- a/arch/arm/lib/cache-cp15.c
> +++ b/arch/arm/lib/cache-cp15.c
> @@ -34,6 +34,14 @@
>
>   DECLARE_GLOBAL_DATA_PTR;
>
> +void __arm_init_before_mmu(void)
> +{
> +	puts("__arm_init_before_mmu: dummy implementation! "
> +	     "real implementation missing!!\n");
> +}

If the function is absolutely needed, please make sure booting does not 
continue. If the function is not absolutely needed, please rewrite 
message since nothing would actually be 'missing', but rather 'could be 
added'.

> +void arm_init_before_mmu(void)
> +	__attribute__((weak, alias("__arm_init_before_mmu")));
> +
>   static void cp_delay (void)
>   {
>   	volatile int i;
> @@ -65,6 +73,7 @@ static inline void mmu_setup(void)
>   	int i;
>   	u32 reg;
>
> +	arm_init_before_mmu();
>   	/* Set up an identity-mapping for all 4GB, rw for everyone */
>   	for (i = 0; i<  4096; i++)
>   		page_table[i] = i<<  20 | (3<<  10) | 0x12;

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 4/8] arm: minor fixes for cache and mmu handling
  2010-12-22 11:54 ` [U-Boot] [PATCH 4/8] arm: minor fixes for cache and mmu handling Aneesh V
@ 2011-01-08  7:04   ` Albert ARIBAUD
  2011-01-08  9:13     ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-08  7:04 UTC (permalink / raw)
  To: u-boot

Hi Aneesh,

Le 22/12/2010 12:54, Aneesh V a ?crit :
> 1. make sure that page table setup is not done multiple times
> 2. flush_dcache_all() is more appropriate while disabling cache
>     than a range flush on the entire memory(flush_cache())
>
>     Provide a default implementation for flush_dcache_all()
>     for backward compatibility and to avoid build issues.
>
> Signed-off-by: Aneesh V<aneesh@ti.com>
> ---
>   arch/arm/lib/cache-cp15.c |    9 +++++++--
>   arch/arm/lib/cache.c      |   13 ++++++++++++-
>   2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
> index ca526fb..20aa993 100644
> --- a/arch/arm/lib/cache-cp15.c
> +++ b/arch/arm/lib/cache-cp15.c
> @@ -94,13 +94,18 @@ static inline void mmu_setup(void)
>   	set_cr(reg | CR_M);
>   }
>
> +static int mmu_enabled(void)
> +{
> +	return get_cr()&  CR_M;
> +}
> +
>   /* cache_bit must be either CR_I or CR_C */
>   static void cache_enable(uint32_t cache_bit)
>   {
>   	uint32_t reg;
>
>   	/* The data cache is not active unless the mmu is enabled too */
> -	if (cache_bit == CR_C)
> +	if ((cache_bit == CR_C)&&  !mmu_enabled())
>   		mmu_setup();
>   	reg = get_cr();	/* get control reg. */
>   	cp_delay();

Do you know why double MMU setups happen? Can we not fix the execution 
path and remove the second MMU setup call there, rather that catching t 
on the fly to ignore it?

> @@ -119,7 +124,7 @@ static void cache_disable(uint32_t cache_bit)
>   			return;
>   		/* if disabling data cache, disable mmu too */
>   		cache_bit |= CR_M;
> -		flush_cache(0, ~0);
> +		flush_dcache_all();
>   	}
>   	reg = get_cr();
>   	cp_delay();
> diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c
> index 275b6e1..363609a 100644
> --- a/arch/arm/lib/cache.c
> +++ b/arch/arm/lib/cache.c
> @@ -25,7 +25,7 @@
>
>   #include<common.h>
>
> -void  __flush_cache(unsigned long dummy1, unsigned long dummy2)
> +void  __flush_cache(unsigned long start, unsigned long size)
>   {
>   #if defined(CONFIG_OMAP2420) || defined(CONFIG_ARM1136)
>   	void arm1136_cache_flush(void);
> @@ -42,3 +42,14 @@ void  __flush_cache(unsigned long dummy1, unsigned long dummy2)
>   }
>   void  flush_cache(unsigned long dummy1, unsigned long dummy2)

Please fix also parameters in this declaration.

>   	__attribute__((weak, alias("__flush_cache")));
> +
> +/*
> + * Default implementation:
> + * do a range flush for the entire range
> + */
> +void	__flush_dcache_all(void)
> +{
> +	flush_cache(0, ~0);
> +}
> +void	flush_dcache_all(void)
> +	__attribute__((weak, alias("__flush_dcache_all")));

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 0/8] armv7: cache maintenance operations
  2010-12-28 19:51 ` Paulraj, Sandeep
@ 2011-01-08  7:07   ` Albert ARIBAUD
  0 siblings, 0 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-08  7:07 UTC (permalink / raw)
  To: u-boot

Le 28/12/2010 20:51, Paulraj, Sandeep a ?crit :
>
>
>> Subject: [U-Boot] [PATCH 0/8] armv7: cache maintenance operations
>>
>> With D-cache and MMU enabled for ARM in u-boot it becomes imperative to
>> support a minimal set of cache maintenance operations and necessary
>> initializations before enabling MMU.
>>
>> This series of patches attempt to do the following for armv7:
>> * Necessary initialization sequence before enabling MMU that includes
>>    invalidation of TLB, data caches, branch predictor array etc.
>> * Framework for supporting SOC specific outer caches in a generic manner
>>    (using a structure of function pointers - inspired by the Linux
>>    implementation)
>> * Generic armv7 cache maintenance operations for caches known to the CPU
>> * Support for ARM PL310 L2 cache controller used in OMAP4
>> * Cleanup of the cleanup_before_linux() function
>> * Adapting all armv7 SOCs to use the new framework and removing
>>    duplicated code
>>
>> Testing:
>> * Extensive testing on OMAP4430SDP and OMAP3430SDP by creating coherency
>>    issues and solving them using the maintenance routines
>> 	- Eg: memfill a region of memory with a known pattern
>> 	- Invalidate the region
>> 	- Read back and compare the region with the original pattern
>> 	- If match fails it means that invalidate is successful
>> 	- Now add a flush call just before the invalidate
>> 	- If match succeeds it means that flush was successful
>> 	- Outer caches were tested with experiments involving making the
>> 	  function pointers NULL
>> * Kernel booting on OMAP4430SDP and OMAP3430SDP
>>
>>
>> Aneesh V (8):
>>    arm: make default implementation of cache_flush() weakly linked
>>    armv7: cache maintenance operations for armv7
>>    armv7: integrate cache maintenance support
>>    arm: minor fixes for cache and mmu handling
>>    armv7: add PL310 support to u-boot
>>    armv7: adapt omap4 to the new cache maintenance framework
>>    armv7: adapt omap3 to the new cache maintenance framework
>>    armv7: adapt s5pc1xx to the new cache maintenance framework
>
>
> As I mentioned to John Rigby in anoterh e-mail, I will be on vacation till the 20th of January. So there might be a little delay in merging this patch series after due review by the community.
>
> Regards,
> Sandeep

Hi Sandeep,

I have assigned these to me in patchwork but feel free to reclaim them 
if you prefer.

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 3/8] armv7: integrate cache maintenance support
  2011-01-08  6:54   ` Albert ARIBAUD
@ 2011-01-08  8:15     ` Aneesh V
  0 siblings, 0 replies; 50+ messages in thread
From: Aneesh V @ 2011-01-08  8:15 UTC (permalink / raw)
  To: u-boot

Hi Albert,

On Saturday 08 January 2011 12:24 PM, Albert ARIBAUD wrote:
> Hi Aneesh,
>
> Le 22/12/2010 12:54, Aneesh V a ?crit :
>> - Enable I-cache on bootup
>> - Enable MMU and D-cache immediately after relocation
>> 	- Do necessary initialization before enabling d-cache and MMU
>> - Changes to cleanup_before_linux()
>> 	- Make changes according to the new framework
>>
>> Signed-off-by: Aneesh V<aneesh@ti.com>
>> ---
>>    arch/arm/cpu/armv7/cpu.c   |   47 +++++++++++++++++++------------------------
>>    arch/arm/cpu/armv7/start.S |   18 +++++++++++++++-
>>    arch/arm/lib/board.c       |    6 +++++
>>    arch/arm/lib/cache-cp15.c  |    9 ++++++++
>>    4 files changed, 53 insertions(+), 27 deletions(-)
>>
>> diff --git a/arch/arm/cpu/armv7/cpu.c b/arch/arm/cpu/armv7/cpu.c
>> index a01e0d6..b418304 100644
>> --- a/arch/arm/cpu/armv7/cpu.c
>> +++ b/arch/arm/cpu/armv7/cpu.c
>> @@ -38,13 +38,10 @@
>>    #ifndef CONFIG_L2_OFF
>>    #include<asm/arch/sys_proto.h>
>>    #endif
>> -
>> -static void cache_flush(void);
>> +#include<asm/armv7.h>
>>
>>    int cleanup_before_linux(void)
>>    {
>> -	unsigned int i;
>> -
>>    	/*
>>    	 * this function is called just before we call linux
>>    	 * it prepares the processor for linux
>> @@ -53,31 +50,29 @@ int cleanup_before_linux(void)
>>    	 */
>>    	disable_interrupts();
>>
>> -	/* turn off I/D-cache */
>> +	/*
>> +	 * Turn off I-cache and invalidate it
>> +	 */
>>    	icache_disable();
>> -	dcache_disable();
>
> (1) -- see below
>
>> +	invalidate_icache_all();
>>
>> -	/* invalidate I-cache */
>> -	cache_flush();
>
> (2) -- see below
>
>> -
>> -#ifndef CONFIG_L2_OFF
>> -	/* turn off L2 cache */
>> -	l2_cache_disable();
>> -	/* invalidate L2 cache also */
>> -	invalidate_dcache(get_device_type());
>> -#endif
>> -	i = 0;
>> -	/* mem barrier to sync up things */
>> -	asm("mcr p15, 0, %0, c7, c10, 4": :"r"(i));
>> +	/*
>> +	 * turn off D-cache
>> +	 * dcache_disable() in turn flushes the d-cache and disables MMU
>> +	 */
>> +	dcache_disable();
>
> (3) -- see below
>
>> -#ifndef CONFIG_L2_OFF
>> -	l2_cache_enable();
>> -#endif
>> +	/*
>> +	 * After D-cache is flushed and before it is disabled there may
>> +	 * be some new valid entries brought into the cache. We are sure
>> +	 * that these lines are not dirty and will not affect our execution.
>> +	 * (because unwinding the call-stack and setting a bit in CP15 SCTRL
>> +	 * is all we did during this. We have not pushed anything on to the
>> +	 * stack. Neither have we affected any static data)
>> +	 * So just invalidate the entire d-cache again to avoid coherency
>> +	 * problems for kernel
>> +	 */
>> +	invalidate_dcache_all();
>
> I'm not sure about the logic here, but I am no cache guru, so don't
> hesitate to make me... flush... with shame. If dcache_disable stayed in
> (1) instead of being moved to (3), wouldn't the cache_flush in (2)
> ensure the no new valid entries would appear in the dcache?
>    Put another way, I'd have naively thought the sequence would be:
>
> - disable L2 cache
> - disable L1 I and D cache
> - invalidate L1 I cache and flush L1 D cache
> - flush L2 D cache

This is what I did too in the beginning. But I ran into problems with
it. Here is how:

1 - disable L2 cache
2 - disable L1 I and D cache
3 - invalidate L1 I cache and flush L1 D cache
4 - flush L2 D cache

- At step 3 and 4 we call the functions to flush the call. These
functions in their prologue push some registers(including r14) to the
stack.
- Please note that the cache is disabled at this point in time. So
the new stack contents go directly into the memory.
- Also, note that the stack may be already cached in L1 or L2(before
the caches were disabled). So we have a coherency issue here.
- Now when the caches are flushed the stack in the cache over-writes
the stack in memory thus losing the latest stack contents.
- When we try to return from the function there is a crash!

Conclusion:
1. If you disable first and then flush you have to be careful about
memory coherency in the period between disable and flush. You must not
write anything to the memory and you must be sure about what you read
from memory. This is difficult unless you program the entire thing in
assembly.
2. If you flush first and then disable you have to take care of the new
entries coming to the cache after the flush and before disabling. This
is what we are doing.

>
>>    	return 0;
>>    }
>> -
>> -static void cache_flush(void)
>> -{
>> -	asm ("mcr p15, 0, %0, c7, c5, 0": :"r" (0));
>> -}
>> diff --git a/arch/arm/cpu/armv7/start.S b/arch/arm/cpu/armv7/start.S
>> index 684f2d2..7d17f1e 100644
>> --- a/arch/arm/cpu/armv7/start.S
>> +++ b/arch/arm/cpu/armv7/start.S
>> @@ -241,6 +241,14 @@ clbss_l:str	r2, [r0]		/* clear loop...		    */
>>     * initialization, now running from RAM.
>>     */
>>    jump_2_ram:
>> +/*
>> + * If I-cache is enabled invalidate it
>> + */
>> +#ifndef CONFIG_SYS_NO_ICACHE
>> +	mcr	p15, 0, r0, c7, c5, 0	@ invalidate icache
>> +	dsb
>> +	isb
>> +#endif
>>    	ldr	r0, _board_init_r_ofs
>>    	adr	r1, _start
>>    	add	lr, r0, r1
>> @@ -276,6 +284,9 @@ cpu_init_crit:
>>    	mov	r0, #0			@ set up for MCR
>>    	mcr	p15, 0, r0, c8, c7, 0	@ invalidate TLBs
>>    	mcr	p15, 0, r0, c7, c5, 0	@ invalidate icache
>> +	mcr	p15, 0, r0, c7, c5, 6  /* invalidate BP array */
>> +	dsb
>> +	isb
>>
>>    	/*
>>    	 * disable MMU stuff and caches
>> @@ -284,7 +295,12 @@ cpu_init_crit:
>>    	bic	r0, r0, #0x00002000	@ clear bits 13 (--V-)
>>    	bic	r0, r0, #0x00000007	@ clear bits 2:0 (-CAM)
>>    	orr	r0, r0, #0x00000002	@ set bit 1 (--A-) Align
>> -	orr	r0, r0, #0x00000800	@ set bit 12 (Z---) BTB
>> +	orr	r0, r0, #0x00000800	@ set bit 11 (Z---) BTB
>> +#ifdef CONFIG_SYS_NO_ICACHE
>> +	bic	r0, r0, #0x00001000	@ clear bit 12 (I) I-cache
>> +#else
>> +	orr	r0, r0, #0x00001000	@ set bit 12 (I) I-cache
>> +#endif
>>    	mcr	p15, 0, r0, c1, c0, 0
>>
>>    	/*
>> diff --git a/arch/arm/lib/board.c b/arch/arm/lib/board.c
>> index 7266381..bef32a6 100644
>> --- a/arch/arm/lib/board.c
>> +++ b/arch/arm/lib/board.c
>> @@ -459,6 +459,12 @@ void board_init_r (gd_t *id, ulong dest_addr)
>>
>>    	gd->flags |= GD_FLG_RELOC;	/* tell others: relocation done */
>>
>> +	/*
>> +	 * Enable D$:
>> +	 * I$, if needed, must be already enabled in start.S
>> +	 */
>> +	dcache_enable();
>> +
>>    	monitor_flash_len = _bss_start_ofs;
>>    	debug ("monitor flash len: %08lX\n", monitor_flash_len);
>>    	board_init();	/* Setup chipselects */
>> diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
>> index d9175f0..ca526fb 100644
>> --- a/arch/arm/lib/cache-cp15.c
>> +++ b/arch/arm/lib/cache-cp15.c
>> @@ -34,6 +34,14 @@
>>
>>    DECLARE_GLOBAL_DATA_PTR;
>>
>> +void __arm_init_before_mmu(void)
>> +{
>> +	puts("__arm_init_before_mmu: dummy implementation! "
>> +	     "real implementation missing!!\n");
>> +}
>
> If the function is absolutely needed, please make sure booting does not
> continue. If the function is not absolutely needed, please rewrite
> message since nothing would actually be 'missing', but rather 'could be
> added'.
>

Before ARMv7 caches are guaranteed to be invalidated at reset. But I
couldn't find conclusive information about TLB. So I am not sure, it
may or may not be absolutely necessary.

I shall just remove that message!

>> +void arm_init_before_mmu(void)
>> +	__attribute__((weak, alias("__arm_init_before_mmu")));
>> +
>>    static void cp_delay (void)
>>    {
>>    	volatile int i;
>> @@ -65,6 +73,7 @@ static inline void mmu_setup(void)
>>    	int i;
>>    	u32 reg;
>>
>> +	arm_init_before_mmu();
>>    	/* Set up an identity-mapping for all 4GB, rw for everyone */
>>    	for (i = 0; i<   4096; i++)
>>    		page_table[i] = i<<   20 | (3<<   10) | 0x12;
>
> Amicalement,

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08  6:36   ` Albert ARIBAUD
@ 2011-01-08  8:40     ` Albert ARIBAUD
  2011-01-08 10:06     ` Aneesh V
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-08  8:40 UTC (permalink / raw)
  To: u-boot

Le 08/01/2011 07:36, Albert ARIBAUD a ?crit :

>> --- a/arch/arm/cpu/armv7/config.mk
>> +++ b/arch/arm/cpu/armv7/config.mk
>> @@ -23,7 +23,7 @@
>>    PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
>>
>>    # Make ARMv5 to allow more compilers to work, even though its v7a.
>> -PLATFORM_CPPFLAGS += -march=armv5
>> +PLATFORM_CPPFLAGS += -march=armv7-a
>
> Did you check that this does not break any board using armv7?

Actually I should have said "Did you check that it does not break 
building with any common toolchain", considering the comment above the 
line. And indeed, it breaks building with ELDK 4.2 toolchain.

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 4/8] arm: minor fixes for cache and mmu handling
  2011-01-08  7:04   ` Albert ARIBAUD
@ 2011-01-08  9:13     ` Aneesh V
  0 siblings, 0 replies; 50+ messages in thread
From: Aneesh V @ 2011-01-08  9:13 UTC (permalink / raw)
  To: u-boot

Hi Albert,

On Saturday 08 January 2011 12:34 PM, Albert ARIBAUD wrote:
> Hi Aneesh,
>
> Le 22/12/2010 12:54, Aneesh V a ?crit :
>> 1. make sure that page table setup is not done multiple times
>> 2. flush_dcache_all() is more appropriate while disabling cache
>>      than a range flush on the entire memory(flush_cache())
>>
>>      Provide a default implementation for flush_dcache_all()
>>      for backward compatibility and to avoid build issues.
>>
>> Signed-off-by: Aneesh V<aneesh@ti.com>
>> ---
>>    arch/arm/lib/cache-cp15.c |    9 +++++++--
>>    arch/arm/lib/cache.c      |   13 ++++++++++++-
>>    2 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
>> index ca526fb..20aa993 100644
>> --- a/arch/arm/lib/cache-cp15.c
>> +++ b/arch/arm/lib/cache-cp15.c
>> @@ -94,13 +94,18 @@ static inline void mmu_setup(void)
>>    	set_cr(reg | CR_M);
>>    }
>>
>> +static int mmu_enabled(void)
>> +{
>> +	return get_cr()&   CR_M;
>> +}
>> +
>>    /* cache_bit must be either CR_I or CR_C */
>>    static void cache_enable(uint32_t cache_bit)
>>    {
>>    	uint32_t reg;
>>
>>    	/* The data cache is not active unless the mmu is enabled too */
>> -	if (cache_bit == CR_C)
>> +	if ((cache_bit == CR_C)&&   !mmu_enabled())
>>    		mmu_setup();
>>    	reg = get_cr();	/* get control reg. */
>>    	cp_delay();
>
> Do you know why double MMU setups happen? Can we not fix the execution
> path and remove the second MMU setup call there, rather that catching t
> on the fly to ignore it?

Please note that mmu_setup() was getting called unconditionally from
dcache_enable(). I see that some drivers are calling dcache_enable()
and there are u-boot commands for enabling disabling cache.
Consequently mmu_setup() may get called multiple times.
Do we want to prevent dcache_enable() from being called multiple times?

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08  6:36   ` Albert ARIBAUD
  2011-01-08  8:40     ` Albert ARIBAUD
@ 2011-01-08 10:06     ` Aneesh V
  2011-01-12 19:18       ` Albert ARIBAUD
  2011-01-08 13:17     ` Aneesh V
  2011-03-01 11:54     ` Aneesh V
  3 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-08 10:06 UTC (permalink / raw)
  To: u-boot

Hi Albert,

On Saturday 08 January 2011 12:06 PM, Albert ARIBAUD wrote:
> Hi Aneesh,
>
> Le 22/12/2010 12:54, Aneesh V a ?crit :
>> - Add a framework for layered cache maintenance
>> 	- separate out SOC specific outer cache maintenance from
>> 	  maintenance of caches known to CPU
>>
>> - Add generic ARMv7 cache maintenance operations that affect all
>>     caches known to ARMv7 CPUs. For instance in Cortex-A8 these
>>     opertions will affect both L1 and L2 caches. In Cortex-A9
>>     these will affect only L1 cache
>>
>> - D-cache operations supported:
>> 	- Invalidate entire D-cache
>> 	- Invalidate D-cache range
>> 	- Flush(clean&   invalidate) entire D-cache
>> 	- Flush D-cache range
>> - I-cache operations supported:
>> 	- Invalidate entire I-cache
>>
>> - Add maintenance functions for TLB, branch predictor array etc.
>>
>> - Enable -march=armv7-a so that armv7 assembly instructions can be
>>     used
>>
>> Signed-off-by: Aneesh V<aneesh@ti.com>
>> ---
>>    arch/arm/cpu/armv7/Makefile   |    2 +-
>>    arch/arm/cpu/armv7/cache_v7.c |  359 +++++++++++++++++++++++++++++++++++++++++
>>    arch/arm/cpu/armv7/config.mk  |    2 +-
>>    arch/arm/include/asm/armv7.h  |   63 +++++++
>>    include/common.h              |    5 +-
>>    5 files changed, 428 insertions(+), 3 deletions(-)
>>    create mode 100644 arch/arm/cpu/armv7/cache_v7.c
>>    create mode 100644 arch/arm/include/asm/armv7.h
>>
>> diff --git a/arch/arm/cpu/armv7/Makefile b/arch/arm/cpu/armv7/Makefile
>> index 8c0e915..299792a 100644
>> --- a/arch/arm/cpu/armv7/Makefile
>> +++ b/arch/arm/cpu/armv7/Makefile
>> @@ -26,7 +26,7 @@ include $(TOPDIR)/config.mk
>>    LIB	= $(obj)lib$(CPU).o
>>
>>    START	:= start.o
>> -COBJS	:= cpu.o
>> +COBJS	:= cpu.o cache_v7.o
>>    COBJS  += syslib.o
>>
>>    SRCS	:= $(START:.o=.S) $(COBJS:.o=.c)
>> diff --git a/arch/arm/cpu/armv7/cache_v7.c b/arch/arm/cpu/armv7/cache_v7.c
>> new file mode 100644
>> index 0000000..0521d66
>> --- /dev/null
>> +++ b/arch/arm/cpu/armv7/cache_v7.c
>> @@ -0,0 +1,359 @@
>> +/*
>> + * (C) Copyright 2010
>> + * Texas Instruments Incorporated - http://www.ti.com/
>> + * Aneesh V<aneesh@ti.com>
>> + *
>> + * See file CREDITS for list of people who contributed to this
>> + * project.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation; either version 2 of
>> + * the License, or (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
>> + * MA 02111-1307 USA
>> + */
>> +#include<linux/types.h>
>> +#include<common.h>
>> +#include<asm/armv7.h>
>> +
>> +#define ARMV7_DCACHE_INVAL_ALL		1
>> +#define ARMV7_DCACHE_CLEAN_INVAL_ALL	2
>> +#define ARMV7_DCACHE_INVAL_RANGE	3
>> +#define ARMV7_DCACHE_CLEAN_INVAL_RANGE	4
>> +
>> +struct v7_outer_cache_ops v7_outer_cache;
>> +
>> +#ifndef CONFIG_SYS_NO_DCACHE
>> +/*
>> + * Write the level and type you want to Cache Size Selection Register(CSSELR)
>> + * to get size details from Current Cache Size ID Register(CCSIDR)
>> + */
>> +static void set_csselr(u32 level, u32 type)
>> +{	u32 csselr = level<<   1 | type;
>> +	/* Write to Cache Size Selection Register(CSSELR) */
>> +	asm volatile ("mcr p15, 2, %0, c0, c0, 0" : : "r" (csselr));
>> +}
>> +
>> +static u32 get_ccsidr(void)
>> +{
>> +	u32 ccsidr;
>> +	/* Read current CP15 Cache Size ID Register */
>> +	asm volatile ("mrc p15, 1, %0, c0, c0, 0" : "=r" (ccsidr));
>> +	return ccsidr;
>> +}
>> +
>> +static u32 get_clidr(void)
>> +{
>> +	u32 clidr;
>> +	/* Read current CP15 Cache Level ID Register */
>> +	asm volatile ("mrc p15,1,%0,c0,c0,1" : "=r" (clidr));
>> +	return clidr;
>> +}
>> +
>> +/* round up the input number to a power of 2 and get the log2 */
>> +static inline u32 log_2_round_up(u32 num)
>> +{
>> +	/* count leading zeros */
>> +	asm volatile ("CLZ %0, %0" : "+r" (num));
>> +
>> +	/* position of most significant 1 */
>> +	num = 31 - num;
>> +
>> +	return num;
>> +}
>> +
>> +static void v7_inval_dcache_level_setway(u32 level, u32 num_sets,
>> +					 u32 num_ways, u32 way_shift,
>> +					 u32 log2_line_len)
>> +{
>> +	int way, set, setway;
>> +	/*
>> +	 * For optimal assembly code:
>> +	 *	a. count down
>> +	 *	b. have bigger loop inside
>> +	 */
>
> Out of curiosity, can you elaborate on why the compiler would optimize
> better in these cases?

While counting down the termination condition check is against 0. So
you can just decrement the loop count using a 'subs' and do a 'bne'.
When you count up you have to do a comparison with a non-zero value. So
you will need one 'cmp' instruction extra:-)

bigger loop inside because that reduces the frequency at which your
outer parameter changes and hence the overall number of instructions 
executed. Consider this:
1. We encode both the loop counts along with other data into a register
that is finally written to CP15 register.
2. outer loop has the code for shifting and ORing the outer variable to
this register.
3. Inner loop has the code for shifting and ORing the inner variable.
Step (3) has to be executed 'way x set' number of times anyways.
But having bigger loop inside makes sure that 2 is executed fewer times!

With these tweaks the assembly code generated by this C code is as good
as the original hand-written assembly code with my compiler.


>
>> +	for (way = num_ways - 1; way>= 0 ; way--)
>> +		for (set = num_sets - 1; set>= 0; set--) {
>
> Please fix whitespacing around operators. The best way to ''catch'em
> all'' is to run Linux' checkpatch.pl (I do this with option --no-tree)
> on all patches that you submit to u-boot and, fix all warning and errors
> and if some are left that you think should not be fixed, mention them
> and explain why they're wrongly emitted.

I religiously do checkpatch whenever I send out a patch. Please note 
that my original mail seems to be fine. I saved it and ran checkpatch 
again. No errors, no warnings! Something amiss?

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08  6:36   ` Albert ARIBAUD
  2011-01-08  8:40     ` Albert ARIBAUD
  2011-01-08 10:06     ` Aneesh V
@ 2011-01-08 13:17     ` Aneesh V
  2011-01-08 14:06       ` Albert ARIBAUD
  2011-03-01 11:54     ` Aneesh V
  3 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-08 13:17 UTC (permalink / raw)
  To: u-boot

On Saturday 08 January 2011 12:06 PM, Albert ARIBAUD wrote:
> Hi Aneesh,
Pressed the Send button too fast last time. Missed answering the last
few questions.

<snip..>
>> +
>> +void invalidate_dcache_all(void)
>> +{
>> +	v7_maint_dcache_all(ARMV7_DCACHE_INVAL_ALL);
>> +	if (v7_outer_cache.inval_all)
>> +		v7_outer_cache.inval_all();
>
> Why use pointers here rather than weak functions?

In fact, I hadn't thought about it. Maybe I was biased by the Linux
implementation.The only reason I can think of is that pointer gives the
flexibility of doing this assignment at run-time. Let's say we had a
multi-platform u-boot that detects the SOC at run-time?

>
>> +}
>> +
>> +/*
>> + * Performs a clean&   invalidation of the entire data cache
>> + * at all levels
>> + */
>> +void flush_dcache_all(void)
>> +{
>> +	v7_maint_dcache_all(ARMV7_DCACHE_CLEAN_INVAL_ALL);
>> +	if (v7_outer_cache.flush_all)
>> +		v7_outer_cache.flush_all();
>> +}
>> +
>> +/*
>> + * Invalidates range in all levels of D-cache/unified cache used:
>> + * Affects the range [start, stop - 1]
>> + */
>> +void invalidate_dcache_range(unsigned long start, unsigned long stop)
>> +{
>> +
>> +	v7_dcache_maint_range(start, stop, ARMV7_DCACHE_INVAL_RANGE);
>> +	if (v7_outer_cache.inval_range)
>> +		/* physical address is same as virtual address */
>> +		v7_outer_cache.inval_range(start, stop);
>> +}
>> +
>> +/*
>> + * Flush range(clean&   invalidate) from all levels of D-cache/unified
>> + * cache used:
>> + * Affects the range [start, stop - 1]
>> + */
>> +void flush_dcache_range(unsigned long start, unsigned long stop)
>> +{
>> +	v7_dcache_maint_range(start, stop, ARMV7_DCACHE_CLEAN_INVAL_RANGE);
>> +	if (v7_outer_cache.flush_range)
>> +		/* physical address is same as virtual address */
>> +		v7_outer_cache.flush_range(start, stop);
>> +}
>> +static void __v7_setup_outer_cache_ops(void)
>> +{
>> +	puts("v7_setup_outer_cache_ops: dummy implementation! "
>> +	     "real implementation not available!!\n");
>> +}
>> +
>> +void v7_setup_outer_cache_ops(void)
>> +	__attribute__((weak, alias("__v7_setup_outer_cache_ops")));
>> +
>> +void arm_init_before_mmu(void)
>> +{
>> +	v7_setup_outer_cache_ops();
>> +	if (v7_outer_cache.enable)
>> +		v7_outer_cache.enable();
>> +	invalidate_dcache_all();
>> +	v7_inval_tlb();
>> +}
>> +#else
>> +void invalidate_dcache_all(void)
>> +{
>> +}
>> +
>> +void flush_dcache_all(void)
>> +{
>> +}
>> +
>> +void invalidate_dcache_range(unsigned long start, unsigned long stop)
>> +{
>> +}
>> +
>> +void flush_dcache_range(unsigned long start, unsigned long stop)
>> +{
>> +}
>> +
>> +void arm_init_before_mmu(void)
>> +{
>> +}
>> +#endif
>> +
>> +#ifndef CONFIG_SYS_NO_ICACHE
>> +/* Invalidate entire I-cache and branch predictor array */
>> +void invalidate_icache_all(void)
>> +{
>> +	/*
>> +	 * Invalidate all instruction caches to PoU.
>> +	 * Also flushes branch target cache.
>> +	 */
>> +	asm volatile ("mcr p15, 0, %0, c7, c5, 0" : : "r" (0));
>> +
>> +	/* Invalidate entire branch predictor array */
>> +	asm volatile ("mcr p15, 0, %0, c7, c5, 6" : : "r" (0));
>> +
>> +	/* Full system DSB - make sure that the invalidation is complete */
>> +	asm volatile ("DSB");
>> +	/* Full system ISB - make sure the instruction stream sees it */
>> +	asm volatile ("ISB");
>> +}
>> +#else
>> +void invalidate_icache_all(void)
>> +{
>> +}
>> +#endif
>> +
>> +/*
>> + * Flush range from all levels of d-cache/unified-cache used:
>> + * Affects the range [start, start + size - 1]
>> + */
>> +void  flush_cache(unsigned long start, unsigned long size)
>> +{
>> +	flush_dcache_range(start, start + size);
>> +}
>
> This function is the only one which is defined to something non-empty
> when CONFIG_SYS_NO_DCACHE is defined. Why is it not in the big #ifndef
> for dcache above ?

Although this function is non-empty, flush_dcache_range() is in turn
empty. Effect will be the same, right?

>
>> diff --git a/arch/arm/cpu/armv7/config.mk b/arch/arm/cpu/armv7/config.mk
>> index 49ac9c7..7f9b171 100644
>> --- a/arch/arm/cpu/armv7/config.mk
>> +++ b/arch/arm/cpu/armv7/config.mk
>> @@ -23,7 +23,7 @@
>>    PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
>>
>>    # Make ARMv5 to allow more compilers to work, even though its v7a.
>> -PLATFORM_CPPFLAGS += -march=armv5
>> +PLATFORM_CPPFLAGS += -march=armv7-a
>
> Did you check that this does not break any board using armv7?

I checked only Codesourcery tool chain.
Linux kernel build for a v7 architecture processor uses armv7-a. Is it
not fair to assume that the toolchain used for bootloader also supports
it? Do we have to support those out-dated compilers?

>>    # =========================================================================
>>    #
>>    # Supply options according to compiler version
>> diff --git a/arch/arm/include/asm/armv7.h b/arch/arm/include/asm/armv7.h
>> new file mode 100644
>> index 0000000..57409b6
>> --- /dev/null
>> +++ b/arch/arm/include/asm/armv7.h
>> @@ -0,0 +1,63 @@
>> +/*
>> + * (C) Copyright 2010
>> + * Texas Instruments Incorporated - http://www.ti.com/
>> + *
>> + * Aneesh V<aneesh@ti.com>
>> + *
>> + * See file CREDITS for list of people who contributed to this
>> + * project.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation; either version 2 of
>> + * the License, or (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
>> + * MA 02111-1307 USA
>> + */
>> +#ifndef ARMV7_H
>> +#define ARMV7_H
>> +
>> +#include<linux/types.h>
>> +
>> +/*
>> + * Values for InD field in CSSELR
>> + * Selects the type of cache
>> + */
>> +#define ARMV7_CSSELR_IND_DATA_UNIFIED	0
>> +#define ARMV7_CSSELR_IND_INSTRUCTION	1
>> +
>> +/* Values for Ctype fields in CLIDR */
>> +#define ARMV7_CLIDR_CTYPE_NO_CACHE		0
>> +#define ARMV7_CLIDR_CTYPE_INSTRUCTION_ONLY	1
>> +#define ARMV7_CLIDR_CTYPE_DATA_ONLY		2
>> +#define ARMV7_CLIDR_CTYPE_INSTRUCTION_DATA	3
>> +#define ARMV7_CLIDR_CTYPE_UNIFIED		4
>> +
>> +/* some utility macros */
>> +#define mask(start, end) \
>> +	(((1<<   ((end) - (start) + 1)) - 1)<<   (start))
>> +
>> +#define mask_n_get(reg, start, end) \
>> +	(((reg)&   mask(start, end))>>   (start))
>
> Seeing as these functions are only used in the ARMv7 cache C file, they
> should be moved there.

I plan to use a modified version of mask_n_get() and its set couterpart
mask_n_set() in my subsequent works in more files.

Can I keep it here itself or should I move it to an OMAP specific
header file or can I move it to a more generic header file? Please
suggest.

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08 13:17     ` Aneesh V
@ 2011-01-08 14:06       ` Albert ARIBAUD
  2011-01-09 22:41         ` Wolfgang Denk
  2011-01-12  9:08         ` Aneesh V
  0 siblings, 2 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-08 14:06 UTC (permalink / raw)
  To: u-boot

Le 08/01/2011 14:17, Aneesh V a ?crit :

>> Why use pointers here rather than weak functions?
>
> In fact, I hadn't thought about it. Maybe I was biased by the Linux
> implementation.The only reason I can think of is that pointer gives the
> flexibility of doing this assignment at run-time. Let's say we had a
> multi-platform u-boot that detects the SOC at run-time?

I know we consider multi-board u-boot binaries when boards are variant 
of a given SoC, that's one reason why we wanted relocation. I'm not sure 
about multi-SoC when SoC is a variant of a given cpu, though. Wolfgang, 
your opinion?

>>> +void flush_cache(unsigned long start, unsigned long size)
>>> +{
>>> + flush_dcache_range(start, start + size);
>>> +}
>>
>> This function is the only one which is defined to something non-empty
>> when CONFIG_SYS_NO_DCACHE is defined. Why is it not in the big #ifndef
>> for dcache above ?
>
> Although this function is non-empty, flush_dcache_range() is in turn
> empty. Effect will be the same, right?

Yes the effect is the same, but your patch results in a non-trivially 
empty function; I'd prefer it to be visibly empty when we compile 
without cache support.

>>> diff --git a/arch/arm/cpu/armv7/config.mk b/arch/arm/cpu/armv7/config.mk
>>> index 49ac9c7..7f9b171 100644
>>> --- a/arch/arm/cpu/armv7/config.mk
>>> +++ b/arch/arm/cpu/armv7/config.mk
>>> @@ -23,7 +23,7 @@
>>> PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
>>>
>>> # Make ARMv5 to allow more compilers to work, even though its v7a.
>>> -PLATFORM_CPPFLAGS += -march=armv5
>>> +PLATFORM_CPPFLAGS += -march=armv7-a
>>
>> Did you check that this does not break any board using armv7?
>
> I checked only Codesourcery tool chain.
> Linux kernel build for a v7 architecture processor uses armv7-a. Is it
> not fair to assume that the toolchain used for bootloader also supports
> it? Do we have to support those out-dated compilers?

Just because Linux uses armv7-a for a v7 arch does not mean we must have 
it for u-boot. For starters, U-boot does not always boot Linux. :)

As for out-dated compilers, that's the question I'm asking: do we 
consider e.g. ELDK 4.2 as outdated or not? It won't accept armv7-a.

>>> +/* some utility macros */
>>> +#define mask(start, end) \
>>> + (((1<< ((end) - (start) + 1)) - 1)<< (start))
>>> +
>>> +#define mask_n_get(reg, start, end) \
>>> + (((reg)& mask(start, end))>> (start))
>>
>> Seeing as these functions are only used in the ARMv7 cache C file, they
>> should be moved there.
>
> I plan to use a modified version of mask_n_get() and its set couterpart
> mask_n_set() in my subsequent works in more files.
>
> Can I keep it here itself or should I move it to an OMAP specific
> header file or can I move it to a more generic header file? Please
> suggest.

They're very generic actually. I think they should go to a gereric bit 
manipulation header, and be named a... bit... more explicitly. For 
instance, the name 'mask' does not show that the macro creates a range 
of 'one' bits from start to end.

> Best regards,
> Aneesh

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08 14:06       ` Albert ARIBAUD
@ 2011-01-09 22:41         ` Wolfgang Denk
  2011-01-10  4:56           ` Aneesh V
  2011-01-12  9:08         ` Aneesh V
  1 sibling, 1 reply; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-09 22:41 UTC (permalink / raw)
  To: u-boot

Dear Albert ARIBAUD,

In message <4D286F58.9010605@free.fr> you wrote:
> 
> I know we consider multi-board u-boot binaries when boards are variant 
> of a given SoC, that's one reason why we wanted relocation. I'm not sure 
> about multi-SoC when SoC is a variant of a given cpu, though. Wolfgang, 
> your opinion?

Unless we see a specific example which uses this feature, we should
not add provisions that make the code more complicated than needed.

And when we start supporting such a feature, we should probably do
this based on a device tree approach.

> > Although this function is non-empty, flush_dcache_range() is in turn
> > empty. Effect will be the same, right?
> 
> Yes the effect is the same, but your patch results in a non-trivially 
> empty function; I'd prefer it to be visibly empty when we compile 
> without cache support.

Yes, that's my opinion, too.


> Just because Linux uses armv7-a for a v7 arch does not mean we must have 
> it for u-boot. For starters, U-boot does not always boot Linux. :)
> 
> As for out-dated compilers, that's the question I'm asking: do we 
> consider e.g. ELDK 4.2 as outdated or not? It won't accept armv7-a.

That's a catch question.

Yes, ELDK 4.2 is outdated.  But it is still one of the most stable
versions of a tool chain known to me, especially when it comes to
using the very same tool versions across several architectures.

I cannot see any benefits of this code change that would justiify such
a breakage.


Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
"The bad reputation UNIX has gotten is totally undeserved, laid on by
people who don't understand, who have not gotten in there  and  tried
anything."          -- Jim Joyce, owner of Jim Joyce's UNIX Bookstore

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 5/8] armv7: add PL310 support to u-boot
  2010-12-22 11:54 ` [U-Boot] [PATCH 5/8] armv7: add PL310 support to u-boot Aneesh V
@ 2011-01-09 22:48   ` Wolfgang Denk
  2011-01-10 13:41     ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-09 22:48 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

In message <1293018898-13253-6-git-send-email-aneesh@ti.com> you wrote:
> Add support for some of the key maintenance operations
> 	- Invalidate all
> 	- Invalidate range
> 	- Flush(clean & invalidate) all
> 	- Flush range

Can you please use a more descriptive subject, and commit message?

I have no idea what a "PL310" might be - is this a new board? Or a new
SoC? or a new Ethernet controller?


And what exactly are "key maintenance operations"?  Looks as if you
were talking about basic cache operations?

> --- /dev/null
> +++ b/arch/arm/include/asm/pl310.h
...
> +/* Register offsets */
> +#define PL310_CACHE_TYPE		0x004
> +#define PL310_AUX_CTRL			0x104
> +
> +#define PL310_CACHE_SYNC		0x730
> +#define PL310_INVAL_LINE_PA		0x770
> +#define PL310_INVAL_WAY			0x77C
> +#define PL310_CLEAN_LINE_PA		0x7B0
> +#define PL310_CLEAN_INVAL_WAY		0x7FC
> +#define PL310_CLEAN_INVAL_LINE_PA	0x7F0

NAK.  Please use a C struct instead.


> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -42,6 +42,7 @@ COBJS-y	+= cache.o
>  ifndef CONFIG_SYS_NO_CP15_CACHE
>  COBJS-y	+= cache-cp15.o
>  endif
> +COBJS-$(CONFIG_SYS_USE_PL310) += cache-pl310.o
>  COBJS-y	+= interrupts.o
>  COBJS-y	+= reset.o

There is no documentation for CONFIG_SYS_USE_PL310, and there is no
use of this variable.

Also, it seems CONFIG_SYS_PL310 would be more appropriate.

...
> +static void pl310_cache_sync(void)
> +{
> +	__raw_writel(0, CONFIG_SYS_PL310_BASE + PL310_CACHE_SYNC);
> +}

Please use a proper C struct instead of base address plus offset.
Please fix globally.

...
> +	for (pa = start; pa < stop; pa = pa + line_size)
> +		__raw_writel(pa, CONFIG_SYS_PL310_BASE +
> +			     PL310_CLEAN_INVAL_LINE_PA);

Please use braces for multiline statements.


Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
No one can guarantee the actions of another.
	-- Spock, "Day of the Dove", stardate unknown

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework
  2010-12-22 11:54 ` [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework Aneesh V
@ 2011-01-09 22:52   ` Wolfgang Denk
  2011-01-10 14:33     ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-09 22:52 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

In message <1293018898-13253-7-git-send-email-aneesh@ti.com> you wrote:
> adapt omap4 to the new layered cache maintenance framework
> 
> Signed-off-by: Aneesh V <aneesh@ti.com>

> +/*
> + * Outer cache related functions
> + */
> +#ifndef CONFIG_SYS_NO_DCACHE

Do we really need such a #define?  Where is CONFIG_SYS_NO_DCACHE
documented?

> +void v7_setup_outer_cache_ops(void)
> +{
> +#ifndef CONFIG_L2_OFF

Should this be CONFIG_L2_OFF or CONFIG_SYS_L2_OFF?  Where is
CONFIG_SYS_L2_OFF documented?


> @@ -45,3 +47,18 @@ lowlevel_init:
>  	 */
>  	bl	s_init
>  	pop	{ip, pc}
> +
> +set_pl310_ctrl_reg:
> +        PUSH	{r4-r11, lr}	@ save registers - ROM code may pollute
> +				@ our registers
> +        LDR	r12, =0x102	@ Set PL310 control register - value in R0
> +        SMC	#0		@ call ROM Code API to set control register
> +        POP	{r4-r11, pc}

Indentation by TAB, please.


Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
It's not an optical illusion, it just looks like one.   -- Phil White

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 7/8] armv7: adapt omap3 to the new cache maintenance framework
  2010-12-22 11:54 ` [U-Boot] [PATCH 7/8] armv7: adapt omap3 " Aneesh V
@ 2011-01-09 22:57   ` Wolfgang Denk
  2011-01-10 14:41     ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-09 22:57 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

In message <1293018898-13253-8-git-send-email-aneesh@ti.com> you wrote:
> adapt omap3 to the new layered cache maintenance framework
...

> +/* Declarations */

Please drop this comment.  Everybody sees what this is.

> +#ifndef CONFIG_L2_OFF
>  	/*
> -	 * Writing to AuxCR in U-boot using SMI for GP DEV
> -	 * Currently SMI in Kernel on ES2 devices seems to have an issue
> -	 * Once that is resolved, we can postpone this config to kernel
> +	 * Invalidate L2-cache from secure mode
>  	 */

Why not change this into simple

	/* Invalidate L2-cache from secure mode */

?

...
> +static void omap3_emu_romcode_call(u32 service_id, u32 *parameters)
> +{
> +	u32 i, num_params = *parameters;
> +	u32 *sram_scratch_space = (u32 *)OMAP3_PUBLIC_SRAM_SCRATCH_AREA;
> +	/*
> +	 * copy the parameters to an un-cached area to avoid coherency
> +	 * issues
> +	 */
> +	for (i = 0; i < num_params; i++) {
> +		__raw_writel(*parameters, sram_scratch_space);
> +		parameters++;
> +		sram_scratch_space++;
> +	}

Do you have unlimited storage there?  Or should you add some check not
to exceed some maximum size?

> +	} else {
> +		struct emu_hal_params emu_romcode_params;
> +		emu_romcode_params.num_params = 1;
> +		emu_romcode_params.param1 = acr;
> +		omap3_emu_romcode_call(OMAP3_EMU_HAL_API_WRITE_ACR,
> +				       (u32 *)&emu_romcode_params);

Please add a blank line between declarations and code (fix globally).

> +static void omap3_setup_aux_cr(void)
> +{
> +	/* Workaround for Cortex-A8 errata: #454179 #430973
> +	 *	Set "IBE" bit
...
Incorrect multiline comment style.

...
> diff --git a/arch/arm/include/asm/arch-omap3/sys_proto.h b/arch/arm/include/asm/arch-omap3/sys_proto.h
> index 4a28ba1..25f54ea 100644
> --- a/arch/arm/include/asm/arch-omap3/sys_proto.h
> +++ b/arch/arm/include/asm/arch-omap3/sys_proto.h
> @@ -27,6 +27,11 @@ typedef struct {
>  	char *nand_string;
>  } omap3_sysinfo;
>  
> +struct __attribute__ ((__packed__)) emu_hal_params {
> +	u32 num_params;
> +	u32 param1;
> +};

Why exactly do we need the "__attribute__ ((__packed__))" here?


Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
The years of peak mental activity are undoubtedly between the ages of
four and eighteen. At four we know all the questions, at eighteen all
the answers.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-09 22:41         ` Wolfgang Denk
@ 2011-01-10  4:56           ` Aneesh V
  2011-01-17 21:47             ` Wolfgang Denk
  0 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-10  4:56 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Monday 10 January 2011 04:11 AM, Wolfgang Denk wrote:
> Dear Albert ARIBAUD,
>
> In message<4D286F58.9010605@free.fr>  you wrote:
>>
>> I know we consider multi-board u-boot binaries when boards are variant
>> of a given SoC, that's one reason why we wanted relocation. I'm not sure
>> about multi-SoC when SoC is a variant of a given cpu, though. Wolfgang,
>> your opinion?
>
> Unless we see a specific example which uses this feature, we should
> not add provisions that make the code more complicated than needed.

Agree. But do you think the pointer based approach makes it overly
complex?

>
> And when we start supporting such a feature, we should probably do
> this based on a device tree approach.
>
>>> Although this function is non-empty, flush_dcache_range() is in turn
>>> empty. Effect will be the same, right?
>>
>> Yes the effect is the same, but your patch results in a non-trivially
>> empty function; I'd prefer it to be visibly empty when we compile
>> without cache support.
>
> Yes, that's my opinion, too.

Agree.

>
>
>> Just because Linux uses armv7-a for a v7 arch does not mean we must have
>> it for u-boot. For starters, U-boot does not always boot Linux. :)
>>
>> As for out-dated compilers, that's the question I'm asking: do we
>> consider e.g. ELDK 4.2 as outdated or not? It won't accept armv7-a.
>
> That's a catch question.
>
> Yes, ELDK 4.2 is outdated.  But it is still one of the most stable
> versions of a tool chain known to me, especially when it comes to
> using the very same tool versions across several architectures.
>
> I cannot see any benefits of this code change that would justiify such
> a breakage.

Agree. The only benefit is that I can use some memory barrier
instructions without hand-coding the respective machine instructions.
But that can be done if it helps avoiding compiler breakage.

>
>
> Best regards,
>
> Wolfgang Denk
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 5/8] armv7: add PL310 support to u-boot
  2011-01-09 22:48   ` Wolfgang Denk
@ 2011-01-10 13:41     ` Aneesh V
  0 siblings, 0 replies; 50+ messages in thread
From: Aneesh V @ 2011-01-10 13:41 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Monday 10 January 2011 04:18 AM, Wolfgang Denk wrote:
> Dear Aneesh V,
>
> In message<1293018898-13253-6-git-send-email-aneesh@ti.com>  you wrote:
>> Add support for some of the key maintenance operations
>> 	- Invalidate all
>> 	- Invalidate range
>> 	- Flush(clean&  invalidate) all
>> 	- Flush range
>
> Can you please use a more descriptive subject, and commit message?
>
> I have no idea what a "PL310" might be - is this a new board? Or a new
> SoC? or a new Ethernet controller?

Sure. I will add a more descriptive message.
PL310 is an L2 cache controller from ARM.

>
>
> And what exactly are "key maintenance operations"?  Looks as if you
> were talking about basic cache operations?

Yes. I was talking about basic cache operations. In ARM terminology the
cache operations are called "cache maintenance operations".

>
>> --- /dev/null
>> +++ b/arch/arm/include/asm/pl310.h
> ...
>> +/* Register offsets */
>> +#define PL310_CACHE_TYPE		0x004
>> +#define PL310_AUX_CTRL			0x104
>> +
>> +#define PL310_CACHE_SYNC		0x730
>> +#define PL310_INVAL_LINE_PA		0x770
>> +#define PL310_INVAL_WAY			0x77C
>> +#define PL310_CLEAN_LINE_PA		0x7B0
>> +#define PL310_CLEAN_INVAL_WAY		0x7FC
>> +#define PL310_CLEAN_INVAL_LINE_PA	0x7F0
>
> NAK.  Please use a C struct instead.

Ok.

>
>
>> --- a/arch/arm/lib/Makefile
>> +++ b/arch/arm/lib/Makefile
>> @@ -42,6 +42,7 @@ COBJS-y	+= cache.o
>>   ifndef CONFIG_SYS_NO_CP15_CACHE
>>   COBJS-y	+= cache-cp15.o
>>   endif
>> +COBJS-$(CONFIG_SYS_USE_PL310) += cache-pl310.o
>>   COBJS-y	+= interrupts.o
>>   COBJS-y	+= reset.o
>
> There is no documentation for CONFIG_SYS_USE_PL310, and there is no
> use of this variable.
>
> Also, it seems CONFIG_SYS_PL310 would be more appropriate.

I shall make it CONFIG_SYS_PL310 and add documentation.

>
> ...
>> +static void pl310_cache_sync(void)
>> +{
>> +	__raw_writel(0, CONFIG_SYS_PL310_BASE + PL310_CACHE_SYNC);
>> +}
>
> Please use a proper C struct instead of base address plus offset.
> Please fix globally.
>
> ...
>> +	for (pa = start; pa<  stop; pa = pa + line_size)
>> +		__raw_writel(pa, CONFIG_SYS_PL310_BASE +
>> +			     PL310_CLEAN_INVAL_LINE_PA);
>
> Please use braces for multiline statements.

ok.

>
>
> Best regards,
>
> Wolfgang Denk
>

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework
  2011-01-09 22:52   ` Wolfgang Denk
@ 2011-01-10 14:33     ` Aneesh V
  2011-01-17 21:52       ` Wolfgang Denk
  0 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-10 14:33 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Monday 10 January 2011 04:22 AM, Wolfgang Denk wrote:
> Dear Aneesh V,
>
> In message<1293018898-13253-7-git-send-email-aneesh@ti.com>  you wrote:
>> adapt omap4 to the new layered cache maintenance framework
>>
>> Signed-off-by: Aneesh V<aneesh@ti.com>
>
>> +/*
>> + * Outer cache related functions
>> + */
>> +#ifndef CONFIG_SYS_NO_DCACHE
>
> Do we really need such a #define?  Where is CONFIG_SYS_NO_DCACHE
> documented?

I shall add documentation if it doesn't exist. But I see that it's
already being used.
IMO, it's good to have this option:
1. Not all boards may like to enable data cache.
2. Disabling caches may be one thing you might want to try while
debugging issues.
3. CPUs that do not have cache operations and initialization defined
yet might want to keep it disabled.

>
>> +void v7_setup_outer_cache_ops(void)
>> +{
>> +#ifndef CONFIG_L2_OFF
>
> Should this be CONFIG_L2_OFF or CONFIG_SYS_L2_OFF?  Where is
> CONFIG_SYS_L2_OFF documented?

CONFIG_L2_OFF is also aready being used. That's why I used it.

A handful of board config files are using it, but in actual code it's
only referenced by armv7 code.

Shall I change them all to CONFIG_SYS_L2_OFF
I shall add documentation.

>
>
>> @@ -45,3 +47,18 @@ lowlevel_init:
>>   	 */
>>   	bl	s_init
>>   	pop	{ip, pc}
>> +
>> +set_pl310_ctrl_reg:
>> +        PUSH	{r4-r11, lr}	@ save registers - ROM code may pollute
>> +				@ our registers
>> +        LDR	r12, =0x102	@ Set PL310 control register - value in R0
>> +        SMC	#0		@ call ROM Code API to set control register
>> +        POP	{r4-r11, pc}
>
> Indentation by TAB, please.

Sure. I will take care in future. BTW, that was not caught by
checkpatch.pl. Is there anyway to catch such errors using some tool?

>
>
> Best regards,
>
> Wolfgang Denk
>

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 7/8] armv7: adapt omap3 to the new cache maintenance framework
  2011-01-09 22:57   ` Wolfgang Denk
@ 2011-01-10 14:41     ` Aneesh V
  2011-01-17 21:55       ` Wolfgang Denk
  0 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-10 14:41 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Monday 10 January 2011 04:27 AM, Wolfgang Denk wrote:
> Dear Aneesh V,
>
> In message<1293018898-13253-8-git-send-email-aneesh@ti.com>  you wrote:
>> adapt omap3 to the new layered cache maintenance framework
> ...
>
>> +/* Declarations */
>
> Please drop this comment.  Everybody sees what this is.

ok.

>
>> +#ifndef CONFIG_L2_OFF
>>   	/*
>> -	 * Writing to AuxCR in U-boot using SMI for GP DEV
>> -	 * Currently SMI in Kernel on ES2 devices seems to have an issue
>> -	 * Once that is resolved, we can postpone this config to kernel
>> +	 * Invalidate L2-cache from secure mode
>>   	 */
>
> Why not change this into simple
>
> 	/* Invalidate L2-cache from secure mode */
>
> ?

ok.

>
> ...
>> +static void omap3_emu_romcode_call(u32 service_id, u32 *parameters)
>> +{
>> +	u32 i, num_params = *parameters;
>> +	u32 *sram_scratch_space = (u32 *)OMAP3_PUBLIC_SRAM_SCRATCH_AREA;
>> +	/*
>> +	 * copy the parameters to an un-cached area to avoid coherency
>> +	 * issues
>> +	 */
>> +	for (i = 0; i<  num_params; i++) {
>> +		__raw_writel(*parameters, sram_scratch_space);
>> +		parameters++;
>> +		sram_scratch_space++;
>> +	}
>
> Do you have unlimited storage there?  Or should you add some check not
> to exceed some maximum size?

Number of params is typically 1 or 2. We should have enough space
unless the usage is wrong.

>
>> +	} else {
>> +		struct emu_hal_params emu_romcode_params;
>> +		emu_romcode_params.num_params = 1;
>> +		emu_romcode_params.param1 = acr;
>> +		omap3_emu_romcode_call(OMAP3_EMU_HAL_API_WRITE_ACR,
>> +				       (u32 *)&emu_romcode_params);
>
> Please add a blank line between declarations and code (fix globally).

ok.

>
>> +static void omap3_setup_aux_cr(void)
>> +{
>> +	/* Workaround for Cortex-A8 errata: #454179 #430973
>> +	 *	Set "IBE" bit
> ...
> Incorrect multiline comment style.

Will correct it.

>
> ...
>> diff --git a/arch/arm/include/asm/arch-omap3/sys_proto.h b/arch/arm/include/asm/arch-omap3/sys_proto.h
>> index 4a28ba1..25f54ea 100644
>> --- a/arch/arm/include/asm/arch-omap3/sys_proto.h
>> +++ b/arch/arm/include/asm/arch-omap3/sys_proto.h
>> @@ -27,6 +27,11 @@ typedef struct {
>>   	char *nand_string;
>>   } omap3_sysinfo;
>>
>> +struct __attribute__ ((__packed__)) emu_hal_params {
>> +	u32 num_params;
>> +	u32 param1;
>> +};
>
> Why exactly do we need the "__attribute__ ((__packed__))" here?

Because a pointer to it has to be passed to ROM code and ROM code
wouldn't expect any padding.

>
>
> Best regards,
>
> Wolfgang Denk
>

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08 14:06       ` Albert ARIBAUD
  2011-01-09 22:41         ` Wolfgang Denk
@ 2011-01-12  9:08         ` Aneesh V
  2011-01-12 19:23           ` Albert ARIBAUD
  1 sibling, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-12  9:08 UTC (permalink / raw)
  To: u-boot

On Saturday 08 January 2011 07:36 PM, Albert ARIBAUD wrote:
> Le 08/01/2011 14:17, Aneesh V a ?crit :
>
<snip..>

>>>> +/* some utility macros */
>>>> +#define mask(start, end) \
>>>> + (((1<< ((end) - (start) + 1)) - 1)<< (start))
>>>> +
>>>> +#define mask_n_get(reg, start, end) \
>>>> + (((reg)& mask(start, end))>> (start))
>>>
>>> Seeing as these functions are only used in the ARMv7 cache C file, they
>>> should be moved there.
>>
>> I plan to use a modified version of mask_n_get() and its set couterpart
>> mask_n_set() in my subsequent works in more files.
>>
>> Can I keep it here itself or should I move it to an OMAP specific
>> header file or can I move it to a more generic header file? Please
>> suggest.
>
> They're very generic actually. I think they should go to a gereric bit
> manipulation header, and be named a... bit... more explicitly. For
> instance, the name 'mask' does not show that the macro creates a range
> of 'one' bits from start to end.
>

What I need is something like below:

#define get_bit_field(nr, start, mask)\
	(((nr) & (mask)) >> (start))

#define set_bit_field(nr, start, mask, val)\
	(nr) = ((nr) & ~(mask)) | (((val) << (start)) & (mask))

Can these go in a generic header? If so, can I add them to
"include/linux/bitops.h"


Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08 10:06     ` Aneesh V
@ 2011-01-12 19:18       ` Albert ARIBAUD
  2011-01-13 11:10         ` Aneesh V
  2011-01-13 12:14         ` Aneesh V
  0 siblings, 2 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-12 19:18 UTC (permalink / raw)
  To: u-boot

(I realize I did not answer the other ones)

Le 08/01/2011 11:06, Aneesh V a ?crit :

>> Out of curiosity, can you elaborate on why the compiler would optimize
>> better in these cases?
>
> While counting down the termination condition check is against 0. So
> you can just decrement the loop count using a 'subs' and do a 'bne'.
> When you count up you have to do a comparison with a non-zero value. So
> you will need one 'cmp' instruction extra:-)

I would not try to be too smart about what instructions are generated 
and how by a compiler such as gcc which has rather complex code 
generation optimizations.

> bigger loop inside because that reduces the frequency at which your
> outer parameter changes and hence the overall number of instructions
> executed. Consider this:
> 1. We encode both the loop counts along with other data into a register
> that is finally written to CP15 register.
> 2. outer loop has the code for shifting and ORing the outer variable to
> this register.
> 3. Inner loop has the code for shifting and ORing the inner variable.
> Step (3) has to be executed 'way x set' number of times anyways.
> But having bigger loop inside makes sure that 2 is executed fewer times!

Here too it seems like you're underestimating the compiler's optimizing 
capabilities -- your explanation seems to amount to extracting a 
constant calculation from a loop, something that is rather usual in code 
optimizing.

> With these tweaks the assembly code generated by this C code is as good
> as the original hand-written assembly code with my compiler.

How about other compilers?

>>> +	for (way = num_ways - 1; way>= 0 ; way--)
>>> +		for (set = num_sets - 1; set>= 0; set--) {
>>
>> Please fix whitespacing around operators. The best way to ''catch'em
>> all'' is to run Linux' checkpatch.pl (I do this with option --no-tree)
>> on all patches that you submit to u-boot and, fix all warning and errors
>> and if some are left that you think should not be fixed, mention them
>> and explain why they're wrongly emitted.
>
> I religiously do checkpatch whenever I send out a patch. Please note
> that my original mail seems to be fine. I saved it and ran checkpatch
> again. No errors, no warnings! Something amiss?

Well, something like "set>= 0" is quite surprising as it has 
inconsistent spacing around a binary operators. But you're right, 
checkpatch does not detect it. Can you fix them manually?

> Best regards,
> Aneesh

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-12  9:08         ` Aneesh V
@ 2011-01-12 19:23           ` Albert ARIBAUD
  2011-01-13 12:05             ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-12 19:23 UTC (permalink / raw)
  To: u-boot

Le 12/01/2011 10:08, Aneesh V a ?crit :
> On Saturday 08 January 2011 07:36 PM, Albert ARIBAUD wrote:
>> Le 08/01/2011 14:17, Aneesh V a ?crit :
>>
> <snip..>
>
>>>>> +/* some utility macros */
>>>>> +#define mask(start, end) \
>>>>> + (((1<< ((end) - (start) + 1)) - 1)<< (start))
>>>>> +
>>>>> +#define mask_n_get(reg, start, end) \
>>>>> + (((reg)& mask(start, end))>> (start))
>>>>
>>>> Seeing as these functions are only used in the ARMv7 cache C file, they
>>>> should be moved there.
>>>
>>> I plan to use a modified version of mask_n_get() and its set couterpart
>>> mask_n_set() in my subsequent works in more files.
>>>
>>> Can I keep it here itself or should I move it to an OMAP specific
>>> header file or can I move it to a more generic header file? Please
>>> suggest.
>>
>> They're very generic actually. I think they should go to a gereric bit
>> manipulation header, and be named a... bit... more explicitly. For
>> instance, the name 'mask' does not show that the macro creates a range
>> of 'one' bits from start to end.
>
> What I need is something like below:
>
> #define get_bit_field(nr, start, mask)\
> (((nr) & (mask)) >> (start))
>
> #define set_bit_field(nr, start, mask, val)\
> (nr) = ((nr) & ~(mask)) | (((val) << (start)) & (mask))
>
> Can these go in a generic header? If so, can I add them to
> "include/linux/bitops.h"

After some more thought, I am wondering if a *generic* field setting and 
getting macro is really useful. So far everyone is fine with at most 
defining field-specific macros.

> Best regards,
> Aneesh

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-12 19:18       ` Albert ARIBAUD
@ 2011-01-13 11:10         ` Aneesh V
  2011-01-13 12:14         ` Aneesh V
  1 sibling, 0 replies; 50+ messages in thread
From: Aneesh V @ 2011-01-13 11:10 UTC (permalink / raw)
  To: u-boot

On Thursday 13 January 2011 12:48 AM, Albert ARIBAUD wrote:
> (I realize I did not answer the other ones)
>
> Le 08/01/2011 11:06, Aneesh V a ?crit :
>
>>> Out of curiosity, can you elaborate on why the compiler would optimize
>>> better in these cases?
>>
>> While counting down the termination condition check is against 0. So
>> you can just decrement the loop count using a 'subs' and do a 'bne'.
>> When you count up you have to do a comparison with a non-zero value. So
>> you will need one 'cmp' instruction extra:-)
>
> I would not try to be too smart about what instructions are generated
> and how by a compiler such as gcc which has rather complex code
> generation optimizations.

IMHO, on ARM comparing with 0 is always going to be efficient than
comparing with a non-zero number for a termination condition, assuming
a decent compiler.

>
>> bigger loop inside because that reduces the frequency at which your
>> outer parameter changes and hence the overall number of instructions
>> executed. Consider this:
>> 1. We encode both the loop counts along with other data into a register
>> that is finally written to CP15 register.
>> 2. outer loop has the code for shifting and ORing the outer variable to
>> this register.
>> 3. Inner loop has the code for shifting and ORing the inner variable.
>> Step (3) has to be executed 'way x set' number of times anyways.
>> But having bigger loop inside makes sure that 2 is executed fewer times!
>

It's not a constant calculation. It's based on loop index. And this
optimization is not relying on compiler specifics. This is a logic
level optimization. It should generally give good results with all
compilers. Perhaps I was wrong in stating that it helps in getting
better assembly. It just helps in better run-time efficiency.

 > Here too it seems like you're underestimating the compiler's optimizing
 > capabilities -- your explanation seems to amount to extracting a
 > constant calculation from a loop, something that is rather usual in code
 > optimizing.

Actually, in my experience(in this same context) GCC does a terrible
job at this! For instance:

+		for (set = num_sets - 1; set >= 0; set--) {
+			setway = (level << 1) | (set << log2_line_len) |
+				 (way << way_shift);

Here, way_shift = 32 - log2_num_ways

But if you substitute way_shift with the latter, GCC will put the 
subtraction instruction inside the loop! - where as it is clearly loop 
invariant. So, I had to move it explicitly out of the loop!
In fact, I was thinking of giving this feedback to GCC.

>
>> With these tweaks the assembly code generated by this C code is as good
>> as the original hand-written assembly code with my compiler.
>
> How about other compilers?
>

I haven't tested other compilers. However, as I mentioned above the
latter one is a logic optimization. The former hopefully should help
all ARM compilers.

As you must be knowing, existing code for cache maintenance was in
assembly. When I moved it to C I wanted to make sure that the generated
code is as good as the original assembly for this critical piece of
code (I expected some criticism about moving it to C :-)). That's
why I checked the generated code and did these ,hopefully, minor tweaks
to make it better. I hope they don't have any serious drawbacks.

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-12 19:23           ` Albert ARIBAUD
@ 2011-01-13 12:05             ` Aneesh V
  2011-01-13 13:14               ` Albert ARIBAUD
  0 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-13 12:05 UTC (permalink / raw)
  To: u-boot

On Thursday 13 January 2011 12:53 AM, Albert ARIBAUD wrote:
> Le 12/01/2011 10:08, Aneesh V a ?crit :
>> On Saturday 08 January 2011 07:36 PM, Albert ARIBAUD wrote:
>>> Le 08/01/2011 14:17, Aneesh V a ?crit :
>>>
>> <snip..>
>>
>>>>>> +/* some utility macros */
>>>>>> +#define mask(start, end) \
>>>>>> + (((1<< ((end) - (start) + 1)) - 1)<< (start))
>>>>>> +
>>>>>> +#define mask_n_get(reg, start, end) \
>>>>>> + (((reg)& mask(start, end))>> (start))
>>>>>
>>>>> Seeing as these functions are only used in the ARMv7 cache C file,
>>>>> they
>>>>> should be moved there.
>>>>
>>>> I plan to use a modified version of mask_n_get() and its set couterpart
>>>> mask_n_set() in my subsequent works in more files.
>>>>
>>>> Can I keep it here itself or should I move it to an OMAP specific
>>>> header file or can I move it to a more generic header file? Please
>>>> suggest.
>>>
>>> They're very generic actually. I think they should go to a gereric bit
>>> manipulation header, and be named a... bit... more explicitly. For
>>> instance, the name 'mask' does not show that the macro creates a range
>>> of 'one' bits from start to end.
>>
>> What I need is something like below:
>>
>> #define get_bit_field(nr, start, mask)\
>> (((nr) & (mask)) >> (start))
>>
>> #define set_bit_field(nr, start, mask, val)\
>> (nr) = ((nr) & ~(mask)) | (((val) << (start)) & (mask))
>>
>> Can these go in a generic header? If so, can I add them to
>> "include/linux/bitops.h"
>
> After some more thought, I am wondering if a *generic* field setting and
> getting macro is really useful. So far everyone is fine with at most
> defining field-specific macros.

Is it going to be easy if you have many fields to deal with?

However, I agree that the above may be specific to our needs.

What may be of more generic interest may be something like this with
the mask automatically generated:
#define get_bit_field(nr, start, end)
#define set_bit_field(nr, start, end, val)

However, in our case I am already given the mask and start position for
each field (automatically generated from hw database). So, I prefer the
former versions.

If that doesn't look useful for generic use I will put them in
OMAP specific headers.

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-12 19:18       ` Albert ARIBAUD
  2011-01-13 11:10         ` Aneesh V
@ 2011-01-13 12:14         ` Aneesh V
  2011-01-13 17:12           ` Albert ARIBAUD
  1 sibling, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-13 12:14 UTC (permalink / raw)
  To: u-boot

On Thursday 13 January 2011 12:48 AM, Albert ARIBAUD wrote:
<snip ..>

>>>> + for (way = num_ways - 1; way>= 0 ; way--)
>>>> + for (set = num_sets - 1; set>= 0; set--) {
>>>
>>> Please fix whitespacing around operators. The best way to ''catch'em
>>> all'' is to run Linux' checkpatch.pl (I do this with option --no-tree)
>>> on all patches that you submit to u-boot and, fix all warning and errors
>>> and if some are left that you think should not be fixed, mention them
>>> and explain why they're wrongly emitted.
>>
>> I religiously do checkpatch whenever I send out a patch. Please note
>> that my original mail seems to be fine. I saved it and ran checkpatch
>> again. No errors, no warnings! Something amiss?
>
> Well, something like "set>= 0" is quite surprising as it has
> inconsistent spacing around a binary operators. But you're right,
> checkpatch does not detect it. Can you fix them manually?

Checkpatch does find such issues. I was trying to say that my original
mail doesn't have any spacing issues. The problem seems to have
appeared in your reply. Is your mail client doing something funny?

>
>> Best regards,
>> Aneesh
>
> Amicalement,

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-13 12:05             ` Aneesh V
@ 2011-01-13 13:14               ` Albert ARIBAUD
  2011-01-13 14:30                 ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-13 13:14 UTC (permalink / raw)
  To: u-boot

Le 13/01/2011 13:05, Aneesh V a ?crit :

>>> What I need is something like below:
>>>
>>> #define get_bit_field(nr, start, mask)\
>>> (((nr) & (mask)) >> (start))
>>>
>>> #define set_bit_field(nr, start, mask, val)\
>>> (nr) = ((nr) & ~(mask)) | (((val) << (start)) & (mask))
>>>
>>> Can these go in a generic header? If so, can I add them to
>>> "include/linux/bitops.h"
>>
>> After some more thought, I am wondering if a *generic* field setting and
>> getting macro is really useful. So far everyone is fine with at most
>> defining field-specific macros.
>
> Is it going to be easy if you have many fields to deal with?

I don't see how the generic macros ease anything. Instead of defining say

#define get_field_F(x)  ((x >> F_start) & F_mask)
#define set_field_F(x,v)  { x = (x ~ F_mask ) | (v << F_start) }

You'd have

#define get_field_F(x)  get_bit_field(x, F_start, F_mask)
#define set_field_F(x,v) set_bit_field(x, F_start, F_mask, v);

Which does not seem to bring any simplicity to me.

> However, I agree that the above may be specific to our needs.
>
> What may be of more generic interest may be something like this with
> the mask automatically generated:
> #define get_bit_field(nr, start, end)
> #define set_bit_field(nr, start, end, val)
>
> However, in our case I am already given the mask and start position for
> each field (automatically generated from hw database). So, I prefer the
> former versions.
>
> If that doesn't look useful for generic use I will put them in
> OMAP specific headers.
>
> Best regards,
> Aneesh

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-13 13:14               ` Albert ARIBAUD
@ 2011-01-13 14:30                 ` Aneesh V
  2011-01-13 17:06                   ` Albert ARIBAUD
  0 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-13 14:30 UTC (permalink / raw)
  To: u-boot

On Thursday 13 January 2011 06:44 PM, Albert ARIBAUD wrote:
> Le 13/01/2011 13:05, Aneesh V a ?crit :
>
>>>> What I need is something like below:
>>>>
>>>> #define get_bit_field(nr, start, mask)\
>>>> (((nr) & (mask)) >> (start))
>>>>
>>>> #define set_bit_field(nr, start, mask, val)\
>>>> (nr) = ((nr) & ~(mask)) | (((val) << (start)) & (mask))
>>>>
>>>> Can these go in a generic header? If so, can I add them to
>>>> "include/linux/bitops.h"
>>>
>>> After some more thought, I am wondering if a *generic* field setting and
>>> getting macro is really useful. So far everyone is fine with at most
>>> defining field-specific macros.
>>
>> Is it going to be easy if you have many fields to deal with?
>
> I don't see how the generic macros ease anything. Instead of defining say
>
> #define get_field_F(x) ((x >> F_start) & F_mask)
> #define set_field_F(x,v) { x = (x ~ F_mask ) | (v << F_start) }
>
> You'd have
>
> #define get_field_F(x) get_bit_field(x, F_start, F_mask)
> #define set_field_F(x,v) set_bit_field(x, F_start, F_mask, v);
>
> Which does not seem to bring any simplicity to me.

I wouldn't define get_field_F.
Instead I would just use set_bit_field(x, F_start, F_mask, v) directly
in the code and I have F_start and F_mask defined in the header files
(automatically generated)

Even if it was manual isn't it easier to define just F_start and F_mask
per field than defining a get_field_F per field?

Perhaps my requirement is different. If this scheme is not used by
many, I shall put these macros in OMAP specific headers.

best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-13 14:30                 ` Aneesh V
@ 2011-01-13 17:06                   ` Albert ARIBAUD
  0 siblings, 0 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-13 17:06 UTC (permalink / raw)
  To: u-boot

Le 13/01/2011 15:30, Aneesh V a ?crit :

> Perhaps my requirement is different. If this scheme is not used by
> many, I shall put these macros in OMAP specific headers.

Yes, I'd prefer that, finally.

> best regards,
> Aneesh

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-13 12:14         ` Aneesh V
@ 2011-01-13 17:12           ` Albert ARIBAUD
  0 siblings, 0 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-01-13 17:12 UTC (permalink / raw)
  To: u-boot

Le 13/01/2011 13:14, Aneesh V a ?crit :
> On Thursday 13 January 2011 12:48 AM, Albert ARIBAUD wrote:
> <snip ..>
>
>>>>> + for (way = num_ways - 1; way>= 0 ; way--)
>>>>> + for (set = num_sets - 1; set>= 0; set--) {
>>>>
>>>> Please fix whitespacing around operators. The best way to ''catch'em
>>>> all'' is to run Linux' checkpatch.pl (I do this with option --no-tree)
>>>> on all patches that you submit to u-boot and, fix all warning and
>>>> errors
>>>> and if some are left that you think should not be fixed, mention them
>>>> and explain why they're wrongly emitted.
>>>
>>> I religiously do checkpatch whenever I send out a patch. Please note
>>> that my original mail seems to be fine. I saved it and ran checkpatch
>>> again. No errors, no warnings! Something amiss?
>>
>> Well, something like "set>= 0" is quite surprising as it has
>> inconsistent spacing around a binary operators. But you're right,
>> checkpatch does not detect it. Can you fix them manually?
>
> Checkpatch does find such issues. I was trying to say that my original
> mail doesn't have any spacing issues. The problem seems to have
> appeared in your reply. Is your mail client doing something funny?

You're right. The issue appears when doing a "reply" with Thunderbird. 
Seems to me like a recent change in behavior, as I recall having replied 
to mails with this kind of spacing preserved. Sorry for the noise.

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-10  4:56           ` Aneesh V
@ 2011-01-17 21:47             ` Wolfgang Denk
  0 siblings, 0 replies; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-17 21:47 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

In message <4D2A9164.5020107@ti.com> you wrote:
> 
> > Unless we see a specific example which uses this feature, we should
> > not add provisions that make the code more complicated than needed.
> 
> Agree. But do you think the pointer based approach makes it overly
> complex?

Not overly complex, but more complex than needed.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
A consultant is a person who borrows your watch, tells you what  time
it is, pockets the watch, and sends you a bill for it.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework
  2011-01-10 14:33     ` Aneesh V
@ 2011-01-17 21:52       ` Wolfgang Denk
  0 siblings, 0 replies; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-17 21:52 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

In message <4D2B18C6.2010407@ti.com> you wrote:
> 
> A handful of board config files are using it, but in actual code it's
> only referenced by armv7 code.
> 
> Shall I change them all to CONFIG_SYS_L2_OFF

Yes, please do (in a separate patch).  Thanks.

> I shall add documentation.

Thanks.

> Sure. I will take care in future. BTW, that was not caught by
> checkpatch.pl. Is there anyway to catch such errors using some tool?

I use just visual inspection, and vi when in doubt.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
Sorry, but my karma just ran over your dogma.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 7/8] armv7: adapt omap3 to the new cache maintenance framework
  2011-01-10 14:41     ` Aneesh V
@ 2011-01-17 21:55       ` Wolfgang Denk
  2011-01-18  5:31         ` Aneesh V
  0 siblings, 1 reply; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-17 21:55 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

In message <4D2B1A90.9030109@ti.com> you wrote:
> 
> >> +struct __attribute__ ((__packed__)) emu_hal_params {
> >> +	u32 num_params;
> >> +	u32 param1;
> >> +};
> >
> > Why exactly do we need the "__attribute__ ((__packed__))" here?
> 
> Because a pointer to it has to be passed to ROM code and ROM code
> wouldn't expect any padding.

But padding would only be needed if there were any alignment
requirements requiring it. There are none here.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
"The combination of a number of things to make existence worthwhile."
"Yes, the philosophy of 'none,' meaning 'all.'"
	-- Spock and Lincoln, "The Savage Curtain", stardate 5906.4

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 7/8] armv7: adapt omap3 to the new cache maintenance framework
  2011-01-17 21:55       ` Wolfgang Denk
@ 2011-01-18  5:31         ` Aneesh V
  2011-01-18  9:23           ` Wolfgang Denk
  0 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-01-18  5:31 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Tuesday 18 January 2011 03:25 AM, Wolfgang Denk wrote:
> Dear Aneesh V,
>
> In message<4D2B1A90.9030109@ti.com>  you wrote:
>>
>>>> +struct __attribute__ ((__packed__)) emu_hal_params {
>>>> +	u32 num_params;
>>>> +	u32 param1;
>>>> +};
>>>
>>> Why exactly do we need the "__attribute__ ((__packed__))" here?
>>
>> Because a pointer to it has to be passed to ROM code and ROM code
>> wouldn't expect any padding.
>
> But padding would only be needed if there were any alignment
> requirements requiring it. There are none here.

Yes. But, is that guaranteed?  I didn't want to make it compiler
dependent. Isn't it safer to use '__attribute__ ((__packed__))'

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 7/8] armv7: adapt omap3 to the new cache maintenance framework
  2011-01-18  5:31         ` Aneesh V
@ 2011-01-18  9:23           ` Wolfgang Denk
  0 siblings, 0 replies; 50+ messages in thread
From: Wolfgang Denk @ 2011-01-18  9:23 UTC (permalink / raw)
  To: u-boot

Dear Aneesh V,

In message <4D3525CD.8090001@ti.com> you wrote:
> 
> >>>> +struct __attribute__ ((__packed__)) emu_hal_params {
> >>>> +	u32 num_params;
> >>>> +	u32 param1;
> >>>> +};
> >>>
> >>> Why exactly do we need the "__attribute__ ((__packed__))" here?
> >>
> >> Because a pointer to it has to be passed to ROM code and ROM code
> >> wouldn't expect any padding.
> >
> > But padding would only be needed if there were any alignment
> > requirements requiring it. There are none here.
> 
> Yes. But, is that guaranteed?  I didn't want to make it compiler

Well, there has never been a guarantee agains broken tools doing
stupid or plain wrong things.

> dependent. Isn't it safer to use '__attribute__ ((__packed__))'

'__attribute__ ((__packed__))' has (especially on ARM) certain
side-effects that are neither needed nor wanted here, I think.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
Knowledge, sir, should be free to all!
	-- Harry Mudd, "I, Mudd", stardate 4513.3

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-01-08  6:36   ` Albert ARIBAUD
                       ` (2 preceding siblings ...)
  2011-01-08 13:17     ` Aneesh V
@ 2011-03-01 11:54     ` Aneesh V
  2011-03-01 13:36       ` Albert ARIBAUD
  3 siblings, 1 reply; 50+ messages in thread
From: Aneesh V @ 2011-03-01 11:54 UTC (permalink / raw)
  To: u-boot

Hi Albert,

On Saturday 08 January 2011 12:06 PM, Albert ARIBAUD wrote:
> Hi Aneesh,
[snip ..]
>> +/* some utility macros */
>> +#define mask(start, end) \
>> +	(((1<<   ((end) - (start) + 1)) - 1)<<   (start))
>> +
>> +#define mask_n_get(reg, start, end) \
>> +	(((reg)&   mask(start, end))>>   (start))
>
> Seeing as these functions are only used in the ARMv7 cache C file, they
> should be moved there.
>

I am working on v2 of this series.

We had aligned on moving these macros to omap specific headers. But I
now realize that I want to use them from cache_v7.c, so it can not be
in omap header. I have used them in other places too in my recent work
so it needs to be in a header file. I am putting these functions and
some other utils I need in a new file "arch/arm/include/asm/utils.h"
Is that fine?

Best regards,
Aneesh

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7
  2011-03-01 11:54     ` Aneesh V
@ 2011-03-01 13:36       ` Albert ARIBAUD
  0 siblings, 0 replies; 50+ messages in thread
From: Albert ARIBAUD @ 2011-03-01 13:36 UTC (permalink / raw)
  To: u-boot

Le 01/03/2011 12:54, Aneesh V a ?crit :
> Hi Albert,
>
> On Saturday 08 January 2011 12:06 PM, Albert ARIBAUD wrote:
>> Hi Aneesh,
> [snip ..]
>>> +/* some utility macros */
>>> +#define mask(start, end) \
>>> + (((1<< ((end) - (start) + 1)) - 1)<< (start))
>>> +
>>> +#define mask_n_get(reg, start, end) \
>>> + (((reg)& mask(start, end))>> (start))
>>
>> Seeing as these functions are only used in the ARMv7 cache C file, they
>> should be moved there.
>>
>
> I am working on v2 of this series.
>
> We had aligned on moving these macros to omap specific headers. But I
> now realize that I want to use them from cache_v7.c, so it can not be
> in omap header. I have used them in other places too in my recent work
> so it needs to be in a header file. I am putting these functions and
> some other utils I need in a new file "arch/arm/include/asm/utils.h"
> Is that fine?

I need to look at the code that uses these macros again. I'll do that 
tonight.

> Best regards,
> Aneesh

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2011-03-01 13:36 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-22 11:54 [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Aneesh V
2010-12-22 11:54 ` [U-Boot] [PATCH 1/8] arm: make default implementation of cache_flush() weakly linked Aneesh V
2011-01-08  6:40   ` Albert ARIBAUD
2010-12-22 11:54 ` [U-Boot] [PATCH 2/8] armv7: cache maintenance operations for armv7 Aneesh V
2011-01-08  6:36   ` Albert ARIBAUD
2011-01-08  8:40     ` Albert ARIBAUD
2011-01-08 10:06     ` Aneesh V
2011-01-12 19:18       ` Albert ARIBAUD
2011-01-13 11:10         ` Aneesh V
2011-01-13 12:14         ` Aneesh V
2011-01-13 17:12           ` Albert ARIBAUD
2011-01-08 13:17     ` Aneesh V
2011-01-08 14:06       ` Albert ARIBAUD
2011-01-09 22:41         ` Wolfgang Denk
2011-01-10  4:56           ` Aneesh V
2011-01-17 21:47             ` Wolfgang Denk
2011-01-12  9:08         ` Aneesh V
2011-01-12 19:23           ` Albert ARIBAUD
2011-01-13 12:05             ` Aneesh V
2011-01-13 13:14               ` Albert ARIBAUD
2011-01-13 14:30                 ` Aneesh V
2011-01-13 17:06                   ` Albert ARIBAUD
2011-03-01 11:54     ` Aneesh V
2011-03-01 13:36       ` Albert ARIBAUD
2010-12-22 11:54 ` [U-Boot] [PATCH 3/8] armv7: integrate cache maintenance support Aneesh V
2011-01-08  6:54   ` Albert ARIBAUD
2011-01-08  8:15     ` Aneesh V
2010-12-22 11:54 ` [U-Boot] [PATCH 4/8] arm: minor fixes for cache and mmu handling Aneesh V
2011-01-08  7:04   ` Albert ARIBAUD
2011-01-08  9:13     ` Aneesh V
2010-12-22 11:54 ` [U-Boot] [PATCH 5/8] armv7: add PL310 support to u-boot Aneesh V
2011-01-09 22:48   ` Wolfgang Denk
2011-01-10 13:41     ` Aneesh V
2010-12-22 11:54 ` [U-Boot] [PATCH 6/8] armv7: adapt omap4 to the new cache maintenance framework Aneesh V
2011-01-09 22:52   ` Wolfgang Denk
2011-01-10 14:33     ` Aneesh V
2011-01-17 21:52       ` Wolfgang Denk
2010-12-22 11:54 ` [U-Boot] [PATCH 7/8] armv7: adapt omap3 " Aneesh V
2011-01-09 22:57   ` Wolfgang Denk
2011-01-10 14:41     ` Aneesh V
2011-01-17 21:55       ` Wolfgang Denk
2011-01-18  5:31         ` Aneesh V
2011-01-18  9:23           ` Wolfgang Denk
2010-12-22 11:54 ` [U-Boot] [PATCH 8/8] armv7: adapt s5pc1xx " Aneesh V
2010-12-27  7:25   ` Minkyu Kang
2010-12-27 11:22     ` Aneesh V
2011-01-07  5:27       ` Minkyu Kang
2010-12-23  4:53 ` [U-Boot] [PATCH 0/8] armv7: cache maintenance operations Steve Sakoman
2010-12-28 19:51 ` Paulraj, Sandeep
2011-01-08  7:07   ` Albert ARIBAUD

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.