All of lore.kernel.org
 help / color / mirror / Atom feed
* [U-Boot] [PATCH 0/22] x86: Add support for MTRRs
@ 2014-12-28  2:20 Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 02/22] x86: Drop RAMTOP Kconfig Simon Glass
                   ` (20 more replies)
  0 siblings, 21 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

One of the four items left to fix up for bare ivybridge support in U-Boot
is MTRRs. These are an important part of the x86 platform since they provide
a means to control the cache behaviour for areas of memory.

Memory areas can be marked as uncacheable or cacheable. For cacheable the
write behaviour can be controlled:

- write-back: data is not written back from the cache until evicted
- write-though: data goes into the cache but is also written to memory
- write-protect: data cannot be written to this area
- write-combining: multiple writes to this area can be combined

This series adds support for deciding the MTRR setup that should be used,
commiting it to registers before relocation, updating it for the video frame
buffer, disabling CAR (cache-as-RAM) when needed and making sure that the
faster possible execution speed is provided. In addition an 'mtrr' command
is added to permit inspection and adjustment of MTRR registers from the
command line.


Simon Glass (22):
  x86: Correct XIP_ROM_SIZE
  x86: Drop RAMTOP Kconfig
  x86: Correct ifdtool microcode calculation
  x86: video: Add support for CONFIG_CONSOLE_SCROLL_LINES
  x86: config: Always scroll the display by 5 lines, for speed
  x86: video: Add a debug() to display the frame buffer address
  x86: pci: Don't return a vesa mode when there is not video
  x86: video: Add debug option to time the BIOS copy
  x86: ivybridge: Only run the Video BIOS when video is enabled
  x86: Use cache, don't clear the display in video BIOS
  x86: Tidy up VESA mode numbers
  x86: pci: Display vesa modes in hex
  x86: ivybridge: Drop support for ROM caching
  x86: Add support for MTRRs
  x86: ivybridge: Set up an MTRR for the video frame buffer
  x86: board_f: Adjust x86 boot order for performance
  x86: ivybridge: Request MTRRs for DRAM regions
  x86: Commit the current MTRRs before relocation
  x86: ivybridge: Add a way to turn off the CAR
  x86: Disable CAR before relocation on platforms that need it
  x86: ivybridge: Update microcode early in boot
  x86: Add an 'mtrr' command to list and adjust MTRRs

 arch/x86/Kconfig                         |   6 +-
 arch/x86/cpu/Makefile                    |   1 +
 arch/x86/cpu/coreboot/coreboot.c         |  22 ++---
 arch/x86/cpu/ivybridge/car.S             |  78 +++++++++++++--
 arch/x86/cpu/ivybridge/cpu.c             |  27 +----
 arch/x86/cpu/ivybridge/gma.c             |  25 ++++-
 arch/x86/cpu/ivybridge/microcode_intel.c |   9 +-
 arch/x86/cpu/ivybridge/sdram.c           |  10 ++
 arch/x86/cpu/mtrr.c                      |  81 +++++++++++++++
 arch/x86/cpu/start.S                     |  10 ++
 arch/x86/dts/link.dts                    |   3 -
 arch/x86/include/asm/global_data.h       |  15 +++
 arch/x86/include/asm/mtrr.h              | 163 ++++++++++++++-----------------
 arch/x86/lib/bios.c                      |  14 +--
 arch/x86/lib/init_helpers.c              |   8 ++
 common/Makefile                          |   1 +
 common/board_f.c                         |   8 +-
 common/cmd_mtrr.c                        | 138 ++++++++++++++++++++++++++
 doc/README.x86                           |  18 +++-
 drivers/pci/pci_rom.c                    |   9 +-
 drivers/video/cfb_console.c              |  26 +++--
 drivers/video/x86_fb.c                   |   1 +
 include/configs/x86-common.h             |   1 +
 tools/ifdtool.c                          |   4 +-
 24 files changed, 509 insertions(+), 169 deletions(-)
 create mode 100644 arch/x86/cpu/mtrr.c
 create mode 100644 common/cmd_mtrr.c

-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 02/22] x86: Drop RAMTOP Kconfig
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-30 15:05   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 03/22] x86: Correct ifdtool microcode calculation Simon Glass
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

We don't need this in U-Boot since we calculate it based on available memory.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/Kconfig | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7d007bb..e992e64 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -47,10 +47,6 @@ config RAMBASE
 	hex
 	default 0x100000
 
-config RAMTOP
-	hex
-	default 0x200000
-
 config XIP_ROM_SIZE
 	hex
 	default ROM_SIZE
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 03/22] x86: Correct ifdtool microcode calculation
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 02/22] x86: Drop RAMTOP Kconfig Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  2:47   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 04/22] x86: video: Add support for CONFIG_CONSOLE_SCROLL_LINES Simon Glass
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

This currently assumes that U-Boot resides at the start of ROM. Update
it to remove this assumption.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 tools/ifdtool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/ifdtool.c b/tools/ifdtool.c
index fe8366b..590ccc9 100644
--- a/tools/ifdtool.c
+++ b/tools/ifdtool.c
@@ -788,9 +788,9 @@ static int write_uboot(char *image, int size, struct input_file *uboot,
 			      fdt_strerror(data_size));
 			return -ENOENT;
 		}
-		offset = ucode_ptr - uboot->addr;
+		offset = (uint32_t)(ucode_ptr + size);
 		ptr = (void *)image + offset;
-		ptr[0] = uboot->addr + (data - image);
+		ptr[0] = (data - image) - size;
 		ptr[1] = data_size;
 		debug("Wrote microcode pointer at %x: addr=%x, size=%x\n",
 		      ucode_ptr, ptr[0], ptr[1]);
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 04/22] x86: video: Add support for CONFIG_CONSOLE_SCROLL_LINES
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 02/22] x86: Drop RAMTOP Kconfig Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 03/22] x86: Correct ifdtool microcode calculation Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 05/22] x86: config: Always scroll the display by 5 lines, for speed Simon Glass
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

Some machines are very slow to scroll their displays. To cope with this,
support the CONFIG_CONSOLE_SCROLL_LINES option. Setting this to 5 allows
the display to operate at an acceptable speed by scrolling 5 lines at
a time.

This same option is available for LCDs so when these systems are unified
this code can be unified also.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 drivers/video/cfb_console.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/video/cfb_console.c b/drivers/video/cfb_console.c
index a653bb4..75cdf61 100644
--- a/drivers/video/cfb_console.c
+++ b/drivers/video/cfb_console.c
@@ -312,7 +312,11 @@ void console_cursor(int state);
 #define CONSOLE_ROW_SECOND	(video_console_address + CONSOLE_ROW_SIZE)
 #define CONSOLE_ROW_LAST	(video_console_address + CONSOLE_SIZE - CONSOLE_ROW_SIZE)
 #define CONSOLE_SIZE		(CONSOLE_ROW_SIZE * CONSOLE_ROWS)
-#define CONSOLE_SCROLL_SIZE	(CONSOLE_SIZE - CONSOLE_ROW_SIZE)
+
+/* By default we scroll by a single line */
+#ifndef CONFIG_CONSOLE_SCROLL_LINES
+#define CONFIG_CONSOLE_SCROLL_LINES 1
+#endif
 
 /* Macros */
 #ifdef	VIDEO_FB_LITTLE_ENDIAN
@@ -753,26 +757,33 @@ static void console_clear_line(int line, int begin, int end)
 
 static void console_scrollup(void)
 {
+	const int rows = CONFIG_CONSOLE_SCROLL_LINES;
+	int i;
+
 	/* copy up rows ignoring the first one */
 
 #ifdef VIDEO_HW_BITBLT
 	video_hw_bitblt(VIDEO_PIXEL_SIZE,	/* bytes per pixel */
 			0,			/* source pos x */
 			video_logo_height +
-				VIDEO_FONT_HEIGHT, /* source pos y */
+				VIDEO_FONT_HEIGHT * rows, /* source pos y */
 			0,			/* dest pos x */
 			video_logo_height,	/* dest pos y */
 			VIDEO_VISIBLE_COLS,	/* frame width */
 			VIDEO_VISIBLE_ROWS
 			- video_logo_height
-			- VIDEO_FONT_HEIGHT	/* frame height */
+			- VIDEO_FONT_HEIGHT * rows	/* frame height */
 		);
 #else
-	memcpyl(CONSOLE_ROW_FIRST, CONSOLE_ROW_SECOND,
-		CONSOLE_SCROLL_SIZE >> 2);
+	memcpyl(CONSOLE_ROW_FIRST, CONSOLE_ROW_FIRST + rows * CONSOLE_ROW_SIZE,
+		(CONSOLE_SIZE - CONSOLE_ROW_SIZE * rows) >> 2);
 #endif
 	/* clear the last one */
-	console_clear_line(CONSOLE_ROWS - 1, 0, CONSOLE_COLS - 1);
+	for (i = 1; i <= rows; i++)
+		console_clear_line(CONSOLE_ROWS - i, 0, CONSOLE_COLS - 1);
+
+	/* Decrement row number */
+	console_row -= rows;
 }
 
 static void console_back(void)
@@ -884,9 +895,6 @@ static void console_newline(int n)
 	if (console_row >= CONSOLE_ROWS) {
 		/* Scroll everything up */
 		console_scrollup();
-
-		/* Decrement row number */
-		console_row = CONSOLE_ROWS - 1;
 	}
 }
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 05/22] x86: config: Always scroll the display by 5 lines, for speed
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (2 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 04/22] x86: video: Add support for CONFIG_CONSOLE_SCROLL_LINES Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 06/22] x86: video: Add a debug() to display the frame buffer address Simon Glass
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

Scrolling a line at a time is very slow for reasons that I don't understand.
It seems to take about 100ms to copy 4MB of RAM in the frame buffer. To cope
with this, scroll 5 lines each time.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 include/configs/x86-common.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/configs/x86-common.h b/include/configs/x86-common.h
index f16ae32..4f0a3c5 100644
--- a/include/configs/x86-common.h
+++ b/include/configs/x86-common.h
@@ -179,6 +179,7 @@
 #define VIDEO_FB_16BPP_WORD_SWAP
 #define CONFIG_I8042_KBD
 #define CONFIG_CFB_CONSOLE
+#define CONFIG_CONSOLE_SCROLL_LINES 5
 
 /*-----------------------------------------------------------------------
  * CPU Features
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 06/22] x86: video: Add a debug() to display the frame buffer address
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (3 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 05/22] x86: config: Always scroll the display by 5 lines, for speed Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 07/22] x86: pci: Don't return a vesa mode when there is not video Simon Glass
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

Provide a way to display this address when booting.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 drivers/video/x86_fb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/video/x86_fb.c b/drivers/video/x86_fb.c
index 8743a8c..6641033 100644
--- a/drivers/video/x86_fb.c
+++ b/drivers/video/x86_fb.c
@@ -32,6 +32,7 @@ void *video_hw_init(void)
 	sprintf(gdev->modeIdent, "%dx%dx%d", gdev->winSizeX, gdev->winSizeY,
 		bits_per_pixel);
 	printf("%s\n", gdev->modeIdent);
+	debug("Frame buffer at %x\n", gdev->frameAdrs);
 
 	return (void *)gdev;
 }
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 07/22] x86: pci: Don't return a vesa mode when there is not video
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (4 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 06/22] x86: video: Add a debug() to display the frame buffer address Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 08/22] x86: video: Add debug option to time the BIOS copy Simon Glass
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

If the video has not been set up, we should not return a success code. This
can be detected by seeing if any of the variables are non-zero.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 drivers/pci/pci_rom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci_rom.c b/drivers/pci/pci_rom.c
index af6a3ae..9808bb3 100644
--- a/drivers/pci/pci_rom.c
+++ b/drivers/pci/pci_rom.c
@@ -205,7 +205,7 @@ int vbe_get_video_info(struct graphic_device *gdev)
 	gdev->vprBase = vesa->phys_base_ptr;
 	gdev->cprBase = vesa->phys_base_ptr;
 
-	return 0;
+	return gdev->winSizeX ? 0 : -ENOSYS;
 #else
 	return -ENOSYS;
 #endif
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 08/22] x86: video: Add debug option to time the BIOS copy
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (5 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 07/22] x86: pci: Don't return a vesa mode when there is not video Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  3:09   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 09/22] x86: ivybridge: Only run the Video BIOS when video is enabled Simon Glass
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

This can be very slow - typically 80ms even on a fast machine since it uses
the SPI flash to read the data. Add an option to display the time taken.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 drivers/pci/pci_rom.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/pci_rom.c b/drivers/pci/pci_rom.c
index 9808bb3..5ba315b 100644
--- a/drivers/pci/pci_rom.c
+++ b/drivers/pci/pci_rom.c
@@ -156,6 +156,8 @@ int pci_rom_load(uint16_t class, struct pci_rom_header *rom_header,
 
 	target = (void *)PCI_VGA_RAM_IMAGE_START;
 	if (target != rom_header) {
+		ulong start = get_timer(0);
+
 		debug("Copying VGA ROM Image from %p to %p, 0x%x bytes\n",
 		      rom_header, target, rom_size);
 		memcpy(target, rom_header, rom_size);
@@ -163,6 +165,7 @@ int pci_rom_load(uint16_t class, struct pci_rom_header *rom_header,
 			printf("VGA ROM copy failed\n");
 			return -EFAULT;
 		}
+		debug("Copy took %lums\n", get_timer(start));
 	}
 	*ram_headerp = target;
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 09/22] x86: ivybridge: Only run the Video BIOS when video is enabled
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (6 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 08/22] x86: video: Add debug option to time the BIOS copy Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  3:14   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 10/22] x86: Use cache, don't clear the display in video BIOS Simon Glass
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

This takes about about 700ms on link when running natively and 900ms when
running using the emulator. It is a waste of time if video is not enabled,
so don't bother running the video BIOS in that case.

We could add a command to run the video BIOS later when needed, but this is
not considered at present.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/ivybridge/gma.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/cpu/ivybridge/gma.c b/arch/x86/cpu/ivybridge/gma.c
index 3d7f740..125021b 100644
--- a/arch/x86/cpu/ivybridge/gma.c
+++ b/arch/x86/cpu/ivybridge/gma.c
@@ -15,6 +15,13 @@
 #include <asm/pci.h>
 #include <asm/arch/pch.h>
 #include <asm/arch/sandybridge.h>
+#include <linux/kconfig.h>
+
+#ifdef CONFIG_VIDEO
+#define RUN_VIDEO_BIOS		1
+#else
+#define RUN_VIDEO_BIOS		1
+#endif
 
 struct gt_powermeter {
 	u16 reg;
@@ -745,7 +752,15 @@ int gma_func0_init(pci_dev_t dev, struct pci_controller *hose,
 	if (ret)
 		return ret;
 
-	ret = pci_run_vga_bios(dev, int15_handler, false);
+	/*
+	 * TODO: Change to IS_ENABLED(CONFIG_VIDEO) when Kconfig supports
+	 * CONFIG_VIDEO.
+	 */
+	if (RUN_VIDEO_BIOS) {
+		start = get_timer(0);
+		ret = pci_run_vga_bios(dev, int15_handler, false);
+		debug("BIOS ran in %lums\n", get_timer(start));
+	}
 
 	/* Post VBIOS init */
 	ret = gma_pm_init_post_vbios(gtt_bar, blob, node);
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 10/22] x86: Use cache, don't clear the display in video BIOS
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (7 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 09/22] x86: ivybridge: Only run the Video BIOS when video is enabled Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 11/22] x86: Tidy up VESA mode numbers Simon Glass
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

There is no need to run with the cache disabled, and there is no point in
clearing the display frame buffer since U-Boot does it later.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/lib/bios.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/lib/bios.c b/arch/x86/lib/bios.c
index d1f8933..4285348 100644
--- a/arch/x86/lib/bios.c
+++ b/arch/x86/lib/bios.c
@@ -210,8 +210,8 @@ static u8 vbe_set_mode(struct vbe_mode_info *mi)
 	debug("VBE: Setting VESA mode %#04x\n", mi->video_mode);
 	/* request linear framebuffer mode */
 	mi->video_mode |= (1 << 14);
-	/* request clearing of framebuffer */
-	mi->video_mode &= ~(1 << 15);
+	/* don't clear the framebuffer, we do that later */
+	mi->video_mode |= (1 << 15);
 	realmode_interrupt(0x10, VESA_SET_MODE, mi->video_mode,
 			   0x0000, 0x0000, 0x0000, 0x0000);
 
@@ -262,7 +262,6 @@ void bios_run_on_x86(pci_dev_t pcidev, unsigned long addr, int vesa_mode,
 	/* Make sure the code is placed. */
 	setup_realmode_code();
 
-	disable_caches();
 	debug("Calling Option ROM at %lx, pci device %#x...", addr, num_dev);
 
 	/* Option ROM entry point is at OPROM start + 3 */
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 11/22] x86: Tidy up VESA mode numbers
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (8 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 10/22] x86: Use cache, don't clear the display in video BIOS Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 12/22] x86: pci: Display vesa modes in hex Simon Glass
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

There are some bits which should be ignored when displaying the mode number.
Make sure that they are not included in the mode that is displayed.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/lib/bios.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/lib/bios.c b/arch/x86/lib/bios.c
index 4285348..1d75cfc 100644
--- a/arch/x86/lib/bios.c
+++ b/arch/x86/lib/bios.c
@@ -207,12 +207,14 @@ static u8 vbe_get_mode_info(struct vbe_mode_info *mi)
 
 static u8 vbe_set_mode(struct vbe_mode_info *mi)
 {
-	debug("VBE: Setting VESA mode %#04x\n", mi->video_mode);
+	int video_mode = mi->video_mode;
+
+	debug("VBE: Setting VESA mode %#04x\n", video_mode);
 	/* request linear framebuffer mode */
-	mi->video_mode |= (1 << 14);
+	video_mode |= (1 << 14);
 	/* don't clear the framebuffer, we do that later */
-	mi->video_mode |= (1 << 15);
-	realmode_interrupt(0x10, VESA_SET_MODE, mi->video_mode,
+	video_mode |= (1 << 15);
+	realmode_interrupt(0x10, VESA_SET_MODE, video_mode,
 			   0x0000, 0x0000, 0x0000, 0x0000);
 
 	return 0;
@@ -236,6 +238,7 @@ static void vbe_set_graphics(int vesa_mode, struct vbe_mode_info *mode_info)
 		return;
 	}
 
+	mode_info->video_mode &= 0x3ff;
 	vbe_set_mode(mode_info);
 }
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 12/22] x86: pci: Display vesa modes in hex
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (9 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 11/22] x86: Tidy up VESA mode numbers Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  3:18   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 13/22] x86: ivybridge: Drop support for ROM caching Simon Glass
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

The hex value is more commonly understood, so use that instead of decimal.
Add a 0x prefix to avoid confusion.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 drivers/pci/pci_rom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci_rom.c b/drivers/pci/pci_rom.c
index 5ba315b..7d25cc9 100644
--- a/drivers/pci/pci_rom.c
+++ b/drivers/pci/pci_rom.c
@@ -247,7 +247,7 @@ int pci_run_vga_bios(pci_dev_t dev, int (*int15_handler)(void), bool emulate)
 		defined(CONFIG_FRAMEBUFFER_VESA_MODE)
 	vesa_mode = CONFIG_FRAMEBUFFER_VESA_MODE;
 #endif
-	debug("Selected vesa mode %d\b", vesa_mode);
+	debug("Selected vesa mode %#x\n", vesa_mode);
 	if (emulate) {
 #ifdef CONFIG_BIOSEMU
 		BE_VGAInfo *info;
@@ -275,7 +275,7 @@ int pci_run_vga_bios(pci_dev_t dev, int (*int15_handler)(void), bool emulate)
 		return -ENOSYS;
 #endif
 	}
-	debug("Final vesa mode %d\n", mode_info.video_mode);
+	debug("Final vesa mode %#x\n", mode_info.video_mode);
 
 	return 0;
 }
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 13/22] x86: ivybridge: Drop support for ROM caching
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (10 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 12/22] x86: pci: Display vesa modes in hex Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 14/22] x86: Add support for MTRRs Simon Glass
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

This is set up along with CAR (Cache-as-RAM) anyway. When we relocate we
don't really need ROM caching (we read the VGA BIOS from ROM but that is
about it)

Drop it.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/ivybridge/cpu.c | 25 -------------------------
 1 file changed, 25 deletions(-)

diff --git a/arch/x86/cpu/ivybridge/cpu.c b/arch/x86/cpu/ivybridge/cpu.c
index 969b07b..0543e06 100644
--- a/arch/x86/cpu/ivybridge/cpu.c
+++ b/arch/x86/cpu/ivybridge/cpu.c
@@ -49,27 +49,6 @@ static void enable_spi_prefetch(struct pci_controller *hose, pci_dev_t dev)
 	pci_hose_write_config_byte(hose, dev, 0xdc, reg8);
 }
 
-static void set_var_mtrr(
-	unsigned reg, unsigned base, unsigned size, unsigned type)
-
-{
-	/* Bit Bit 32-35 of MTRRphysMask should be set to 1 */
-	/* FIXME: It only support 4G less range */
-	wrmsr(MTRRphysBase_MSR(reg), base | type, 0);
-	wrmsr(MTRRphysMask_MSR(reg), ~(size - 1) | MTRRphysMaskValid,
-	      (1 << (CONFIG_CPU_ADDR_BITS - 32)) - 1);
-}
-
-static void enable_rom_caching(void)
-{
-	disable_caches();
-	set_var_mtrr(1, 0xffc00000, 4 << 20, MTRR_TYPE_WRPROT);
-	enable_caches();
-
-	/* Enable Variable MTRRs */
-	wrmsr(MTRRdefType_MSR, 0x800, 0);
-}
-
 static int set_flex_ratio_to_tdp_nominal(void)
 {
 	msr_t flex_ratio, msr;
@@ -165,10 +144,6 @@ int arch_cpu_init(void)
 	/* This is already done in start.S, but let's do it in C */
 	enable_port80_on_lpc(hose, PCH_LPC_DEV);
 
-	/* already done in car.S */
-	if (false)
-		enable_rom_caching();
-
 	set_spi_speed();
 
 	/*
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 14/22] x86: Add support for MTRRs
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (11 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 13/22] x86: ivybridge: Drop support for ROM caching Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  3:24   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 15/22] x86: ivybridge: Set up an MTRR for the video frame buffer Simon Glass
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

Memory Type Range Registers are used to tell the CPU whether memory is
cacheable and if so the cache write mode to use.

Clean up the existing header file to follow style, and remove the unneeded
code.

These can speed up booting so should be supported. Add these to global_data
so they can be requested while booting. We will apply the changes during
relocation (in a later commit).

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/Makefile              |   1 +
 arch/x86/cpu/coreboot/coreboot.c   |  22 +++--
 arch/x86/cpu/ivybridge/car.S       |  12 +--
 arch/x86/cpu/mtrr.c                |  81 ++++++++++++++++++
 arch/x86/include/asm/global_data.h |  15 ++++
 arch/x86/include/asm/mtrr.h        | 163 +++++++++++++++++--------------------
 6 files changed, 186 insertions(+), 108 deletions(-)
 create mode 100644 arch/x86/cpu/mtrr.c

diff --git a/arch/x86/cpu/Makefile b/arch/x86/cpu/Makefile
index 5033d2b..62e43c0 100644
--- a/arch/x86/cpu/Makefile
+++ b/arch/x86/cpu/Makefile
@@ -17,5 +17,6 @@ obj-$(CONFIG_NORTHBRIDGE_INTEL_SANDYBRIDGE) += ivybridge/
 obj-$(CONFIG_NORTHBRIDGE_INTEL_IVYBRIDGE) += ivybridge/
 obj-$(CONFIG_INTEL_QUEENSBAY) += queensbay/
 obj-y += lapic.o
+obj-y += mtrr.o
 obj-$(CONFIG_PCI) += pci.o
 obj-y += turbo.o
diff --git a/arch/x86/cpu/coreboot/coreboot.c b/arch/x86/cpu/coreboot/coreboot.c
index cfacc05..6d06d5a 100644
--- a/arch/x86/cpu/coreboot/coreboot.c
+++ b/arch/x86/cpu/coreboot/coreboot.c
@@ -15,6 +15,7 @@
 #include <asm/cache.h>
 #include <asm/cpu.h>
 #include <asm/io.h>
+#include <asm/mtrr.h>
 #include <asm/arch/tables.h>
 #include <asm/arch/sysinfo.h>
 #include <asm/arch/timestamp.h>
@@ -64,11 +65,6 @@ int board_eth_init(bd_t *bis)
 	return pci_eth_init(bis);
 }
 
-#define MTRR_TYPE_WP          5
-#define MTRRcap_MSR           0xfe
-#define MTRRphysBase_MSR(reg) (0x200 + 2 * (reg))
-#define MTRRphysMask_MSR(reg) (0x200 + 2 * (reg) + 1)
-
 void board_final_cleanup(void)
 {
 	/* Un-cache the ROM so the kernel has one
@@ -77,15 +73,17 @@ void board_final_cleanup(void)
 	 * Coreboot should have assigned this to the
 	 * top available variable MTRR.
 	 */
-	u8 top_mtrr = (native_read_msr(MTRRcap_MSR) & 0xff) - 1;
-	u8 top_type = native_read_msr(MTRRphysBase_MSR(top_mtrr)) & 0xff;
+	u8 top_mtrr = (native_read_msr(MTRR_CAP_MSR) & 0xff) - 1;
+	u8 top_type = native_read_msr(MTRR_PHYS_BASE_MSR(top_mtrr)) & 0xff;
 
 	/* Make sure this MTRR is the correct Write-Protected type */
-	if (top_type == MTRR_TYPE_WP) {
-		disable_caches();
-		wrmsrl(MTRRphysBase_MSR(top_mtrr), 0);
-		wrmsrl(MTRRphysMask_MSR(top_mtrr), 0);
-		enable_caches();
+	if (top_type == MTRR_TYPE_WRPROT) {
+		struct mtrr_state state;
+
+		mtrr_open(&state);
+		wrmsrl(MTRR_PHYS_BASE_MSR(top_mtrr), 0);
+		wrmsrl(MTRR_PHYS_MASK_MSR(top_mtrr), 0);
+		mtrr_close(&state);
 	}
 
 	/* Issue SMI to Coreboot to lock down ME and registers */
diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
index dca68e4..72b22ea 100644
--- a/arch/x86/cpu/ivybridge/car.S
+++ b/arch/x86/cpu/ivybridge/car.S
@@ -61,7 +61,7 @@ clear_mtrrs:
 
 	post_code(POST_CAR_MTRR)
 	/* Configure the default memory type to uncacheable */
-	movl	$MTRRdefType_MSR, %ecx
+	movl	$MTRR_DEF_TYPE_MSR, %ecx
 	rdmsr
 	andl	$(~0x00000cff), %eax
 	wrmsr
@@ -76,16 +76,16 @@ clear_mtrrs:
 	post_code(POST_CAR_BASE_ADDRESS)
 	/* Set Cache-as-RAM mask */
 	movl	$(MTRR_PHYS_MASK_MSR(0)), %ecx
-	movl	$(~(CACHE_AS_RAM_SIZE - 1) | MTRRphysMaskValid), %eax
+	movl	$(~(CACHE_AS_RAM_SIZE - 1) | MTRR_PHYS_MASK_VALID), %eax
 	movl	$CPU_PHYSMASK_HI, %edx
 	wrmsr
 
 	post_code(POST_CAR_MASK)
 
 	/* Enable MTRR */
-	movl	$MTRRdefType_MSR, %ecx
+	movl	$MTRR_DEF_TYPE_MSR, %ecx
 	rdmsr
-	orl	$MTRRdefTypeEn, %eax
+	orl	$MTRR_DEF_TYPE_EN, %eax
 	wrmsr
 
 	/* Enable cache (CR0.CD = 0, CR0.NW = 0) */
@@ -130,7 +130,7 @@ clear_mtrrs:
 
 	movl	$MTRR_PHYS_MASK_MSR(1), %ecx
 	movl	$CPU_PHYSMASK_HI, %edx
-	movl	$(~(CONFIG_XIP_ROM_SIZE - 1) | MTRRphysMaskValid), %eax
+	movl	$(~(CONFIG_XIP_ROM_SIZE - 1) | MTRR_PHYS_MASK_VALID), %eax
 	wrmsr
 
 	post_code(POST_CAR_ROM_CACHE)
@@ -141,7 +141,7 @@ clear_mtrrs:
 	xorl	%edx, %edx
 	wrmsr
 	movl	$MTRR_PHYS_MASK_MSR(2), %ecx
-	movl	$(CACHE_MRC_MASK | MTRRphysMaskValid), %eax
+	movl	$(CACHE_MRC_MASK | MTRR_PHYS_MASK_VALID), %eax
 	movl	$CPU_PHYSMASK_HI, %edx
 	wrmsr
 #endif
diff --git a/arch/x86/cpu/mtrr.c b/arch/x86/cpu/mtrr.c
new file mode 100644
index 0000000..d5a825d1
--- /dev/null
+++ b/arch/x86/cpu/mtrr.c
@@ -0,0 +1,81 @@
+/*
+ * (C) Copyright 2014 Google, Inc
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ * Memory Type Range Regsters - these are used to tell the CPU whether
+ * memory is cacheable and if so the cache write mode to use.
+ *
+ * These can speed up booting. See the mtrr command.
+ *
+ * Reference: Intel Architecture Software Developer's Manual, Volume 3:
+ * System Programming
+ */
+
+#include <common.h>
+#include <asm/io.h>
+#include <asm/msr.h>
+#include <asm/mtrr.h>
+
+/* Prepare to adjust MTRRs */
+void mtrr_open(struct mtrr_state *state)
+{
+	state->enable_cache = dcache_status();
+
+	if (state->enable_cache)
+		disable_caches();
+	state->deftype = native_read_msr(MTRR_DEF_TYPE_MSR);
+	wrmsrl(MTRR_DEF_TYPE_MSR, state->deftype & ~MTRR_DEF_TYPE_EN);
+}
+
+/* Clean up after adjusting MTRRs, and enable them */
+void mtrr_close(struct mtrr_state *state)
+{
+	wrmsrl(MTRR_DEF_TYPE_MSR, state->deftype | MTRR_DEF_TYPE_EN);
+	if (state->enable_cache)
+		enable_caches();
+}
+
+int mtrr_commit(bool do_caches)
+{
+	struct mtrr_request *req = gd->arch.mtrr_req;
+	struct mtrr_state state;
+	uint64_t mask;
+	int i;
+
+	mtrr_open(&state);
+	for (i = 0; i < gd->arch.mtrr_req_count; i++, req++) {
+		mask = ~(req->size - 1);
+		mask &= (1ULL << CONFIG_CPU_ADDR_BITS) - 1;
+		wrmsrl(MTRR_PHYS_BASE_MSR(i), req->start | req->type);
+		wrmsrl(MTRR_PHYS_MASK_MSR(i), mask | MTRR_PHYS_MASK_VALID);
+	}
+
+	/* Clear the ones that are unused */
+	for (; i < MTRR_COUNT; i++)
+		wrmsrl(MTRR_PHYS_MASK_MSR(i), 0);
+	mtrr_close(&state);
+
+	return 0;
+}
+
+int mtrr_add_request(int type, uint64_t start, uint64_t size)
+{
+	struct mtrr_request *req;
+	uint64_t mask;
+
+	if (gd->arch.mtrr_req_count == MAX_MTRR_REQUESTS)
+		return -ENOSPC;
+	req = &gd->arch.mtrr_req[gd->arch.mtrr_req_count++];
+	req->type = type;
+	req->start = start;
+	req->size = size;
+	debug("%d: type=%d, %08llx  %08llx\n", gd->arch.mtrr_req_count - 1,
+	      req->type, req->start, req->size);
+	mask = ~(req->size - 1);
+	mask &= (1ULL << CONFIG_CPU_ADDR_BITS) - 1;
+	mask |= MTRR_PHYS_MASK_VALID;
+	debug("   %016llx %016llx\n", req->start | req->type, mask);
+
+	return 0;
+}
diff --git a/arch/x86/include/asm/global_data.h b/arch/x86/include/asm/global_data.h
index 03d491a..15e76f6 100644
--- a/arch/x86/include/asm/global_data.h
+++ b/arch/x86/include/asm/global_data.h
@@ -29,6 +29,19 @@ struct memory_info {
 	struct memory_area area[CONFIG_NR_DRAM_BANKS];
 };
 
+#define MAX_MTRR_REQUESTS	8
+
+/**
+ * A request for a memory region to be set up in a particular way. These
+ * requests are processed before board_init_r() is called. They are generally
+ * optional and can be ignored with some performance impact.
+ */
+struct mtrr_request {
+	int type;		/* MTRR_TYPE_... */
+	uint64_t start;
+	uint64_t size;
+};
+
 /* Architecture-specific global data */
 struct arch_global_data {
 	struct global_data *gd_addr;		/* Location of Global Data */
@@ -50,6 +63,8 @@ struct arch_global_data {
 #ifdef CONFIG_HAVE_FSP
 	void	*hob_list;		/* FSP HOB list */
 #endif
+	struct mtrr_request mtrr_req[MAX_MTRR_REQUESTS];
+	int mtrr_req_count;
 };
 
 #endif
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 5f05a48..3c11740 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -9,99 +9,86 @@
 #ifndef _ASM_MTRR_H
 #define _ASM_MTRR_H
 
-/*  These are the region types  */
-#define MTRR_TYPE_UNCACHEABLE 0
-#define MTRR_TYPE_WRCOMB     1
-/*#define MTRR_TYPE_         2*/
-/*#define MTRR_TYPE_         3*/
-#define MTRR_TYPE_WRTHROUGH  4
-#define MTRR_TYPE_WRPROT     5
-#define MTRR_TYPE_WRBACK     6
-#define MTRR_NUM_TYPES       7
-
-#define MTRRcap_MSR     0x0fe
-#define MTRRdefType_MSR 0x2ff
-
-#define MTRRdefTypeEn		(1 << 11)
-#define MTRRdefTypeFixEn	(1 << 10)
-
-#define SMRRphysBase_MSR 0x1f2
-#define SMRRphysMask_MSR 0x1f3
-
-#define MTRRphysBase_MSR(reg) (0x200 + 2 * (reg))
-#define MTRRphysMask_MSR(reg) (0x200 + 2 * (reg) + 1)
-
-#define MTRRphysMaskValid	(1 << 11)
-
-#define NUM_FIXED_RANGES 88
-#define RANGES_PER_FIXED_MTRR 8
-#define MTRRfix64K_00000_MSR 0x250
-#define MTRRfix16K_80000_MSR 0x258
-#define MTRRfix16K_A0000_MSR 0x259
-#define MTRRfix4K_C0000_MSR 0x268
-#define MTRRfix4K_C8000_MSR 0x269
-#define MTRRfix4K_D0000_MSR 0x26a
-#define MTRRfix4K_D8000_MSR 0x26b
-#define MTRRfix4K_E0000_MSR 0x26c
-#define MTRRfix4K_E8000_MSR 0x26d
-#define MTRRfix4K_F0000_MSR 0x26e
-#define MTRRfix4K_F8000_MSR 0x26f
+/* MTRR region types */
+#define MTRR_TYPE_UNCACHEABLE	0
+#define MTRR_TYPE_WRCOMB	1
+#define MTRR_TYPE_WRTHROUGH	4
+#define MTRR_TYPE_WRPROT	5
+#define MTRR_TYPE_WRBACK	6
+
+#define MTRR_TYPE_COUNT		7
+
+#define MTRR_CAP_MSR		0x0fe
+#define MTRR_DEF_TYPE_MSR	0x2ff
+
+#define MTRR_DEF_TYPE_EN	(1 << 11)
+#define MTRR_DEF_TYPE_FIX_EN	(1 << 10)
+
+#define MTRR_PHYS_BASE_MSR(reg)	(0x200 + 2 * (reg))
+#define MTRR_PHYS_MASK_MSR(reg)	(0x200 + 2 * (reg) + 1)
+
+#define MTRR_PHYS_MASK_VALID	(1 << 11)
+
+#define MTRR_BASE_TYPE_MASK	0x7
+
+/* Number of MTRRs supported */
+#define MTRR_COUNT		8
 
 #if !defined(__ASSEMBLER__)
 
-/*
- * The MTRR code has some side effects that the callers should be aware for.
- * 1. The call sequence matters. x86_setup_mtrrs() calls
- *    x86_setup_fixed_mtrrs_no_enable() then enable_fixed_mtrrs() (equivalent
- *    of x86_setup_fixed_mtrrs()) then x86_setup_var_mtrrs(). If the callers
- *    want to call the components of x86_setup_mtrrs() because of other
- *    rquirements the ordering should still preserved.
- * 2. enable_fixed_mtrr() will enable both variable and fixed MTRRs because
- *    of the nature of the global MTRR enable flag. Therefore, all direct
- *    or indirect callers of enable_fixed_mtrr() should ensure that the
- *    variable MTRR MSRs do not contain bad ranges.
- * 3. If CONFIG_CACHE_ROM is selected an MTRR is allocated for enabling
- *    the caching of the ROM. However, it is set to uncacheable (UC). It
- *    is the responsiblity of the caller to enable it by calling
- *    x86_mtrr_enable_rom_caching().
+/**
+ * Information about the previous MTRR state, set up by mtrr_open()
+ *
+ * @deftype:		Previous value of MTRR_DEF_TYPE_MSR
+ * @enable_cache:	true if cache was enabled
  */
-void x86_setup_mtrrs(void);
-/*
- * x86_setup_var_mtrrs() parameters:
- * address_bits - number of physical address bits supported by cpu
- * above4gb - 2 means dynamically detect number of variable MTRRs available.
- *            non-zero means handle memory ranges above 4GiB.
- *            0 means ignore memory ranges above 4GiB
+struct mtrr_state {
+	uint64_t deftype;
+	bool enable_cache;
+};
+
+/**
+ * mtrr_open() - Prepare to adjust MTRRs
+ *
+ * Use mtrr_open() passing in a structure - this function will init it. Then
+ * when done, pass the same structure to mtrr_close() to re-enable MTRRs and
+ * possibly the cache.
+ *
+ * @state:	Empty structure to pass in to hold settings
  */
-void x86_setup_var_mtrrs(unsigned int address_bits, unsigned int above4gb);
-void enable_fixed_mtrr(void);
-void x86_setup_fixed_mtrrs(void);
-/* Set up fixed MTRRs but do not enable them. */
-void x86_setup_fixed_mtrrs_no_enable(void);
-int x86_mtrr_check(void);
-/* ROM caching can be used after variable MTRRs are set up. Beware that
- * enabling CONFIG_CACHE_ROM will eat through quite a few MTRRs based on
- * one's IO hole size and WRCOMB resources. Be sure to check the console
- * log when enabling CONFIG_CACHE_ROM or adding WRCOMB resources. Beware that
- * on CPUs with core-scoped MTRR registers such as hyperthreaded CPUs the
- * rom caching will be disabled if all threads run the MTRR code. Therefore,
- * one needs to call x86_mtrr_enable_rom_caching() after all threads of the
- * same core have run the MTRR code. */
-#if CONFIG_CACHE_ROM
-void x86_mtrr_enable_rom_caching(void);
-void x86_mtrr_disable_rom_caching(void);
-/* Return the variable range MTRR index of the ROM cache. */
-long x86_mtrr_rom_cache_var_index(void);
-#else
-static inline void x86_mtrr_enable_rom_caching(void) {}
-static inline void x86_mtrr_disable_rom_caching(void) {}
-static inline long x86_mtrr_rom_cache_var_index(void) { return -1; }
-#endif /* CONFIG_CACHE_ROM */
+void mtrr_open(struct mtrr_state *state);
 
-#endif
+/**
+ * mtrr_open() - Clean up after adjusting MTRRs, and enable them
+ *
+ * This uses the structure containing information returned from mtrr_open().
+ *
+ * @state:	Structure from mtrr_open()
+ */
+/*  */
+void mtrr_close(struct mtrr_state *state);
+
+/**
+ * mtrr_add_request() - Add a new MTRR request
+ *
+ * This adds a request for a memory region to be set up in a particular way.
+ *
+ * @type:	Requested type (MTRR_TYPE_)
+ * @start:	Start address
+ * @size:	Size
+ */
+int mtrr_add_request(int type, uint64_t start, uint64_t size);
+
+/**
+ * mtrr_commit() - set up the MTRR registers based on current requests
+ *
+ * This sets up MTRRs for the available DRAM and the requests received so far.
+ * It must be called with caches disabled.
+ *
+ * @do_caches:	true if caches are currently on
+ */
+int mtrr_commit(bool do_caches);
 
-#if !defined(CONFIG_RAMTOP)
-# error "CONFIG_RAMTOP not defined"
 #endif
 
 #if ((CONFIG_XIP_ROM_SIZE & (CONFIG_XIP_ROM_SIZE - 1)) != 0)
@@ -114,8 +101,4 @@ static inline long x86_mtrr_rom_cache_var_index(void) { return -1; }
 
 #define CACHE_ROM_BASE	(((1 << 20) - (CONFIG_CACHE_ROM_SIZE >> 12)) << 12)
 
-#if (CONFIG_RAMTOP & (CONFIG_RAMTOP - 1)) != 0
-# error "CONFIG_RAMTOP must be a power of 2"
-#endif
-
 #endif
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 15/22] x86: ivybridge: Set up an MTRR for the video frame buffer
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (12 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 14/22] x86: Add support for MTRRs Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  3:27   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance Simon Glass
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

Set the frame buffer to write-combining. This makes it faster, although for
scrolling write-through is even faster for U-Boot.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/ivybridge/gma.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/cpu/ivybridge/gma.c b/arch/x86/cpu/ivybridge/gma.c
index 125021b..03b2ebd 100644
--- a/arch/x86/cpu/ivybridge/gma.c
+++ b/arch/x86/cpu/ivybridge/gma.c
@@ -12,6 +12,7 @@
 #include <fdtdec.h>
 #include <pci_rom.h>
 #include <asm/io.h>
+#include <asm/mtrr.h>
 #include <asm/pci.h>
 #include <asm/arch/pch.h>
 #include <asm/arch/sandybridge.h>
@@ -738,6 +739,8 @@ int gma_func0_init(pci_dev_t dev, struct pci_controller *hose,
 		   const void *blob, int node)
 {
 	void *gtt_bar;
+	ulong start;
+	ulong base;
 	u32 reg32;
 	int ret;
 
@@ -746,6 +749,11 @@ int gma_func0_init(pci_dev_t dev, struct pci_controller *hose,
 	reg32 |= PCI_COMMAND_MASTER | PCI_COMMAND_MEMORY | PCI_COMMAND_IO;
 	pci_write_config32(dev, PCI_COMMAND, reg32);
 
+	/* Use write-combining for the graphics memory, 256MB */
+	base = pci_read_bar32(hose, dev, 2);
+	mtrr_add_request(MTRR_TYPE_WRCOMB, base, 256 << 20);
+	mtrr_commit(true);
+
 	gtt_bar = (void *)pci_read_bar32(pci_bus_to_hose(0), dev, 0);
 	debug("GT bar %p\n", gtt_bar);
 	ret = gma_pm_init_pre_vbios(gtt_bar);
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (13 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 15/22] x86: ivybridge: Set up an MTRR for the video frame buffer Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  5:51   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 17/22] x86: ivybridge: Request MTRRs for DRAM regions Simon Glass
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

For bare platforms we turn off ROM-caching before calling board_init_f_r()
It is then very slow to copy U-Boot from ROM to RAM. So adjust the order so
that the copying happens before we turn off ROM-caching.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 common/board_f.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/common/board_f.c b/common/board_f.c
index 98c9c72..1b65575 100644
--- a/common/board_f.c
+++ b/common/board_f.c
@@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
 	INIT_FUNC_WATCHDOG_RESET
 	reloc_fdt,
 	setup_reloc,
+#ifdef CONFIG_X86
+	copy_uboot_to_ram,
+	clear_bss,
+	do_elf_reloc_fixups,
+#endif
 #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
 	jump_to_copy,
 #endif
@@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
  */
 static init_fnc_t init_sequence_f_r[] = {
 	init_cache_f_r,
-	copy_uboot_to_ram,
-	clear_bss,
-	do_elf_reloc_fixups,
 
 	NULL,
 };
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 17/22] x86: ivybridge: Request MTRRs for DRAM regions
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (14 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 18/22] x86: Commit the current MTRRs before relocation Simon Glass
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

We should use MTRRs to speed up execution. Add a list of MTRR requests which
will dealt with when we relocate and run from RAM.

We set RAM as cacheable (with write-back) and registers as non-cacheable.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/ivybridge/sdram.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/cpu/ivybridge/sdram.c b/arch/x86/cpu/ivybridge/sdram.c
index b95e781..9504735 100644
--- a/arch/x86/cpu/ivybridge/sdram.c
+++ b/arch/x86/cpu/ivybridge/sdram.c
@@ -17,6 +17,7 @@
 #include <asm/processor.h>
 #include <asm/gpio.h>
 #include <asm/global_data.h>
+#include <asm/mtrr.h>
 #include <asm/pci.h>
 #include <asm/arch/me.h>
 #include <asm/arch/pei_data.h>
@@ -430,6 +431,15 @@ static int sdram_find(pci_dev_t dev)
 	add_memory_area(info, (2 << 28) + (2 << 20), 4 << 28);
 	add_memory_area(info, (4 << 28) + (2 << 20), tseg_base);
 	add_memory_area(info, 1ULL << 32, touud);
+
+	/* Add MTRRs for memory */
+	mtrr_add_request(MTRR_TYPE_WRBACK, 0, 2ULL << 30);
+	mtrr_add_request(MTRR_TYPE_WRBACK, 2ULL << 30, 512 << 20);
+	mtrr_add_request(MTRR_TYPE_WRBACK, 0xaULL << 28, 256 << 20);
+	mtrr_add_request(MTRR_TYPE_UNCACHEABLE, tseg_base, 16 << 20);
+	mtrr_add_request(MTRR_TYPE_UNCACHEABLE, tseg_base + (16 << 20),
+			 32 << 20);
+
 	/*
 	 * If >= 4GB installed then memory from TOLUD to 4GB
 	 * is remapped above TOM, TOUUD will account for both
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 18/22] x86: Commit the current MTRRs before relocation
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (15 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 17/22] x86: ivybridge: Request MTRRs for DRAM regions Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 19/22] x86: ivybridge: Add a way to turn off the CAR Simon Glass
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

Once we stop running from ROM we should set up the MTTRs to speed up
execution. This is only needed for platforms that don't have an FSP.
Also in the Coreboot case, the MTRRs are set up for us.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/lib/init_helpers.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/lib/init_helpers.c b/arch/x86/lib/init_helpers.c
index be4eb12..fc211d9 100644
--- a/arch/x86/lib/init_helpers.c
+++ b/arch/x86/lib/init_helpers.c
@@ -7,6 +7,7 @@
 #include <common.h>
 #include <fdtdec.h>
 #include <spi.h>
+#include <asm/mtrr.h>
 #include <asm/sections.h>
 
 DECLARE_GLOBAL_DATA_PTR;
@@ -66,6 +67,13 @@ int calculate_relocation_address(void)
 
 int init_cache_f_r(void)
 {
+#if defined(CONFIG_X86_RESET_VECTOR) & !defined(CONFIG_HAVE_FSP)
+	int ret;
+
+	ret = mtrr_commit(false);
+	if (ret)
+		return ret;
+#endif
 	/* Initialise the CPU cache(s) */
 	return init_cache();
 }
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 19/22] x86: ivybridge: Add a way to turn off the CAR
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (16 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 18/22] x86: Commit the current MTRRs before relocation Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  5:56   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 20/22] x86: Disable CAR before relocation on platforms that need it Simon Glass
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

Cache-as-RAM should be turned off when we relocate since we want to run from
RAM. Add a function to perform this task.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/ivybridge/car.S | 52 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
index 72b22ea..6e7e1e4 100644
--- a/arch/x86/cpu/ivybridge/car.S
+++ b/arch/x86/cpu/ivybridge/car.S
@@ -163,6 +163,58 @@ wait_for_sipi:
 	/* return */
 	jmp	car_init_ret
 
+.globl car_uninit
+car_uninit:
+	/* Disable cache */
+	movl	%cr0, %eax
+	orl	$X86_CR0_CD, %eax
+	movl	%eax, %cr0
+
+	/* Disable MTRRs */
+	movl	$MTRR_DEF_TYPE_MSR, %ecx
+	rdmsr
+	andl	$(~MTRR_DEF_TYPE_EN), %eax
+	wrmsr
+
+	/* Disable the no-eviction run state */
+	movl    NOEVICTMOD_MSR, %ecx
+	rdmsr
+	andl    $~2, %eax
+	wrmsr
+
+	invd
+
+	/* Disable the no-eviction mode */
+	rdmsr
+	andl    $~1, %eax
+	wrmsr
+
+#ifdef CONFIG_CACHE_MRC_BIN
+	/* Clear the MTRR that was used to cache MRC */
+	xorl	%eax, %eax
+	xorl	%edx, %edx
+	movl	$MTRR_PHYS_BASE_MSR(2), %ecx
+	wrmsr
+	movl	$MTRR_PHYS_MASK_MSR(2), %ecx
+	wrmsr
+#endif
+
+	/* Enable MTRRs */
+	movl	$MTRR_DEF_TYPE_MSR, %ecx
+	rdmsr
+	orl	$MTRR_DEF_TYPE_EN, %eax
+	wrmsr
+
+	invd
+
+	/* Return to the caller */
+	jmp	*%ebx
+
+.Lhlt:
+	post_code(0xee)
+	hlt
+	jmp	.Lhlt
+
 mtrr_table:
 	/* Fixed MTRRs */
 	.word 0x250, 0x258, 0x259
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 20/22] x86: Disable CAR before relocation on platforms that need it
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (17 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 19/22] x86: ivybridge: Add a way to turn off the CAR Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  6:21   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot Simon Glass
  2014-12-28  2:20 ` [U-Boot] [PATCH 22/22] x86: Add an 'mtrr' command to list and adjust MTRRs Simon Glass
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

For platforms with CAR we should disable it before relocation. Check if
this function is available and call it if so.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/start.S | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/cpu/start.S b/arch/x86/cpu/start.S
index 125782c..8cebde1 100644
--- a/arch/x86/cpu/start.S
+++ b/arch/x86/cpu/start.S
@@ -205,6 +205,16 @@ board_init_f_r_trampoline:
 	/* Setup global descriptor table so gd->xyz works */
 	call	setup_gdt
 
+	/* Set if we need to disable CAR */
+	movl	$car_uninit, %eax
+	cmpl	$0, %eax
+	jz	car_ret
+
+	/* Pass return address in ebx */
+.weak	car_uninit
+	movl	$car_ret, %ebx
+	jmp	car_uninit
+car_ret:
 	/* Re-enter U-Boot by calling board_init_f_r */
 	call	board_init_f_r
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (18 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 20/22] x86: Disable CAR before relocation on platforms that need it Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  6:45   ` Bin Meng
  2014-12-28  2:20 ` [U-Boot] [PATCH 22/22] x86: Add an 'mtrr' command to list and adjust MTRRs Simon Glass
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

At present the normal update (which happens much later) does not work. This
seems to have something to do with the 'no eviction' mode in the CAR, or at
least moving the microcode update after that causes it not to work.

For now, do an update early on so that it definitely works. Also refuse to
continue unless the microcode update check (later in boot) is successful.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 arch/x86/cpu/ivybridge/car.S             | 14 ++++++++++++++
 arch/x86/cpu/ivybridge/cpu.c             |  2 +-
 arch/x86/cpu/ivybridge/microcode_intel.c |  9 +++++++--
 arch/x86/dts/link.dts                    |  3 ---
 4 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
index 6e7e1e4..95da087 100644
--- a/arch/x86/cpu/ivybridge/car.S
+++ b/arch/x86/cpu/ivybridge/car.S
@@ -45,6 +45,14 @@ car_init:
 	movl	$0xFEE00300, %esi
 	movl	%eax, (%esi)
 
+	/* TODO: Load microcode later - the 'no eviction' mode breaks this */
+	movl	$0x79, %ecx
+	xorl	%edx, %edx
+	movl	$_dt_ucode_base_size, %eax
+	movl	(%eax), %eax
+	addl	$0x30, %eax
+	wrmsr
+
 	post_code(POST_CAR_SIPI)
 	/* Zero out all fixed range and variable range MTRRs */
 	movl	$mtrr_table, %esi
@@ -228,3 +236,9 @@ mtrr_table:
 	.word 0x20C, 0x20D, 0x20E, 0x20F
 	.word 0x210, 0x211, 0x212, 0x213
 mtrr_table_end:
+
+	.align 4
+_dt_ucode_base_size:
+	/* These next two fields are filled in by ifdtool */
+	.long	0			/* microcode base */
+	.long	0			/* microcode size */
diff --git a/arch/x86/cpu/ivybridge/cpu.c b/arch/x86/cpu/ivybridge/cpu.c
index 0543e06..e925310 100644
--- a/arch/x86/cpu/ivybridge/cpu.c
+++ b/arch/x86/cpu/ivybridge/cpu.c
@@ -263,7 +263,7 @@ int print_cpuinfo(void)
 	enable_lapic();
 
 	ret = microcode_update_intel();
-	if (ret && ret != -ENOENT && ret != -EEXIST)
+	if (ret)
 		return ret;
 
 	/* Enable upper 128bytes of CMOS */
diff --git a/arch/x86/cpu/ivybridge/microcode_intel.c b/arch/x86/cpu/ivybridge/microcode_intel.c
index 0817751..c012820 100644
--- a/arch/x86/cpu/ivybridge/microcode_intel.c
+++ b/arch/x86/cpu/ivybridge/microcode_intel.c
@@ -120,6 +120,7 @@ int microcode_update_intel(void)
 	int count;
 	int node;
 	int ret;
+	int rev;
 
 	microcode_read_cpu(&cpu);
 	node = 0;
@@ -147,12 +148,16 @@ int microcode_update_intel(void)
 			skipped++;
 			continue;
 		}
-		ret = microcode_read_rev();
 		wrmsr(0x79, (ulong)update.data, 0);
+		rev = microcode_read_rev();
 		debug("microcode: updated to revision 0x%x date=%04x-%02x-%02x\n",
-		      microcode_read_rev(), update.date_code & 0xffff,
+		      rev, update.date_code & 0xffff,
 		      (update.date_code >> 24) & 0xff,
 		      (update.date_code >> 16) & 0xff);
+		if (update.update_revision != rev) {
+			printf("Microcode update failed\n");
+			return -EFAULT;
+		}
 		count++;
 	} while (1);
 }
diff --git a/arch/x86/dts/link.dts b/arch/x86/dts/link.dts
index a739080..1ebc334 100644
--- a/arch/x86/dts/link.dts
+++ b/arch/x86/dts/link.dts
@@ -214,9 +214,6 @@
 
 	microcode {
 		update at 0 {
-#include "microcode/m12206a7_00000029.dtsi"
-		};
-		update at 1 {
 #include "microcode/m12306a9_0000001b.dtsi"
 		};
 	};
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 22/22] x86: Add an 'mtrr' command to list and adjust MTRRs
  2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
                   ` (19 preceding siblings ...)
  2014-12-28  2:20 ` [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot Simon Glass
@ 2014-12-28  2:20 ` Simon Glass
  2014-12-31  7:03   ` Bin Meng
  20 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2014-12-28  2:20 UTC (permalink / raw)
  To: u-boot

It is useful to be able to see the MTRR setup in U-Boot. Add a command
to list the state of the variable MTRR registers and allow them to be
changed.

Update the documentation to list some of the available commands.

This does not support fixed MTRRs as yet.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

 common/Makefile   |   1 +
 common/cmd_mtrr.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 doc/README.x86    |  18 ++++++-
 3 files changed, 156 insertions(+), 1 deletion(-)
 create mode 100644 common/cmd_mtrr.c

diff --git a/common/Makefile b/common/Makefile
index c668a2f..39e4e5f 100644
--- a/common/Makefile
+++ b/common/Makefile
@@ -135,6 +135,7 @@ obj-$(CONFIG_CMD_MMC) += cmd_mmc.o
 obj-$(CONFIG_CMD_MMC_SPI) += cmd_mmc_spi.o
 obj-$(CONFIG_MP) += cmd_mp.o
 obj-$(CONFIG_CMD_MTDPARTS) += cmd_mtdparts.o
+obj-$(CONFIG_X86) += cmd_mtrr.o
 obj-$(CONFIG_CMD_NAND) += cmd_nand.o
 obj-$(CONFIG_CMD_NET) += cmd_net.o
 obj-$(CONFIG_CMD_ONENAND) += cmd_onenand.o
diff --git a/common/cmd_mtrr.c b/common/cmd_mtrr.c
new file mode 100644
index 0000000..7e0506b
--- /dev/null
+++ b/common/cmd_mtrr.c
@@ -0,0 +1,138 @@
+/*
+ * (C) Copyright 2014 Google, Inc
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <common.h>
+#include <asm/msr.h>
+#include <asm/mtrr.h>
+
+static const char *const mtrr_type_name[MTRR_TYPE_COUNT] = {
+	"Uncacheable",
+	"Combine",
+	"2",
+	"3",
+	"Through",
+	"Protect",
+	"Back",
+};
+
+static int do_mtrr_list(void)
+{
+	int i;
+
+	printf("Reg Valid Write-type   %-16s %-16s %-16s\n", "Base   ||",
+	       "Mask   ||", "Size   ||");
+	for (i = 0; i < MTRR_COUNT; i++) {
+		const char *type = "Invalid";
+		uint64_t base, mask, size;
+		bool valid;
+
+		base = native_read_msr(MTRR_PHYS_BASE_MSR(i));
+		mask = native_read_msr(MTRR_PHYS_MASK_MSR(i));
+		size = ~mask & ((1ULL << CONFIG_CPU_ADDR_BITS) - 1);
+		size |= (1 << 12) - 1;
+		size += 1;
+		valid = mask & MTRR_PHYS_MASK_VALID;
+		type = mtrr_type_name[base & MTRR_BASE_TYPE_MASK];
+		printf("%d   %-5s %-12s %016llx %016llx %016llx\n", i,
+		       valid ? "Y" : "N", type, base, mask, size);
+	}
+
+	return 0;
+}
+
+static int do_mtrr_set(uint reg, int argc, char * const argv[])
+{
+	const char *typename = argv[0];
+	struct mtrr_state state;
+	uint32_t start, size;
+	uint64_t base, mask;
+	int i, type = -1;
+	bool valid;
+
+	if (argc < 3)
+		return CMD_RET_USAGE;
+	for (i = 0; i < MTRR_TYPE_COUNT; i++) {
+		if (*typename == *mtrr_type_name[i])
+			type = i;
+	}
+	if (type == -1) {
+		printf("Invalid type name %s\n", typename);
+		return CMD_RET_USAGE;
+	}
+	start = simple_strtoul(argv[1], NULL, 16);
+	size = simple_strtoul(argv[2], NULL, 16);
+
+	base = start | type;
+	valid = native_read_msr(MTRR_PHYS_MASK_MSR(reg)) & MTRR_PHYS_MASK_VALID;
+	mask = ~((uint64_t)size - 1);
+	mask &= (1ULL << CONFIG_CPU_ADDR_BITS) - 1;
+	if (valid)
+		mask |= MTRR_PHYS_MASK_VALID;
+
+	printf("base=%llx, mask=%llx\n", base, mask);
+	mtrr_open(&state);
+	wrmsrl(MTRR_PHYS_BASE_MSR(reg), base);
+	wrmsrl(MTRR_PHYS_MASK_MSR(reg), mask);
+	mtrr_close(&state);
+
+	return 0;
+}
+
+static int mtrr_set_valid(int reg, bool valid)
+{
+	struct mtrr_state state;
+	uint64_t mask;
+
+	mtrr_open(&state);
+	mask = native_read_msr(MTRR_PHYS_MASK_MSR(reg));
+	if (valid)
+		mask |= MTRR_PHYS_MASK_VALID;
+	else
+		mask &= ~MTRR_PHYS_MASK_VALID;
+	wrmsrl(MTRR_PHYS_MASK_MSR(reg), mask);
+	mtrr_close(&state);
+
+	return 0;
+}
+
+static int do_mtrr(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
+{
+	const char *cmd;
+	uint reg;
+
+	cmd = argv[1];
+	if (argc < 2 || *cmd == 'l')
+		return do_mtrr_list();
+	argc -= 2;
+	argv += 2;
+	if (argc <= 0)
+		return CMD_RET_USAGE;
+	reg = simple_strtoul(argv[0], NULL, 16);
+	if (reg >= MTRR_COUNT) {
+		printf("Invalid register number\n");
+		return CMD_RET_USAGE;
+	}
+	if (*cmd == 'e')
+		return mtrr_set_valid(reg, true);
+	else if (*cmd == 'd')
+		return mtrr_set_valid(reg, false);
+	else if (*cmd == 's')
+		return do_mtrr_set(reg, argc - 1, argv + 1);
+	else
+		return CMD_RET_USAGE;
+
+	return 0;
+}
+
+U_BOOT_CMD(
+	mtrr,	6,	1,	do_mtrr,
+	"Use x86 memory type range registers (32-bit only)",
+	"[list]        - list current registers\n"
+	"set <reg> <type> <start> <size>   - set a register\n"
+	"\t<type> is Uncacheable, Combine, Through, Protect, Back\n"
+	"disable <reg>      - disable a register\n"
+	"ensable <reg>      - enable a register"
+);
diff --git a/doc/README.x86 b/doc/README.x86
index 5fab044..6fbfb90 100644
--- a/doc/README.x86
+++ b/doc/README.x86
@@ -110,9 +110,25 @@ be turned on. Not every device on the board is configured via devie tree, but
 more and more devices will be added as time goes by. Check out the directory
 arch/x86/dts/ for these device tree source files.
 
+Useful Commands
+---------------
+
+In keeping with the U-Boot philosophy of providing functions to check and
+adjust internal settings, there are several x86-specific commands that may be
+useful:
+
+hob  - Display information about Firmware Support Package (FSP) Hand-off
+	 Block. This is only available on platform which use FSP, mostly
+	 Atom.
+iod  - Display I/O memory
+iow  - Write I/O memory
+mtrr - List and set the Memory Type Range Registers (MTRR). These are used to
+	 tell the CPU whether memory is cacheable and if so the cache write
+	 mode to use. U-Boot sets up some reasonable values but you can
+	 adjust then with this command.
+
 TODO List
 ---------
-- MTRR support (for performance)
 - Audio
 - Chrome OS verified boot
 - SMI and ACPI support, to provide platform info and facilities to Linux
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 02/22] x86: Drop RAMTOP Kconfig
  2014-12-28  2:20 ` [U-Boot] [PATCH 02/22] x86: Drop RAMTOP Kconfig Simon Glass
@ 2014-12-30 15:05   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-30 15:05 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> We don't need this in U-Boot since we calculate it based on available memory.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  arch/x86/Kconfig | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 7d007bb..e992e64 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -47,10 +47,6 @@ config RAMBASE
>         hex
>         default 0x100000
>
> -config RAMTOP
> -       hex
> -       default 0x200000
> -
>  config XIP_ROM_SIZE
>         hex
>         default ROM_SIZE
> --

What about mtrr.h which is still referring to CONFIG_RAMTOP?

~/work/u-boot-x86/arch/x86$ grep -nr RAMTOP *
include/asm/mtrr.h:103:#if !defined(CONFIG_RAMTOP)
include/asm/mtrr.h:104:# error "CONFIG_RAMTOP not defined"
include/asm/mtrr.h:117:#if (CONFIG_RAMTOP & (CONFIG_RAMTOP - 1)) != 0
include/asm/mtrr.h:118:# error "CONFIG_RAMTOP must be a power of 2"
Kconfig:50:config RAMTOP

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 03/22] x86: Correct ifdtool microcode calculation
  2014-12-28  2:20 ` [U-Boot] [PATCH 03/22] x86: Correct ifdtool microcode calculation Simon Glass
@ 2014-12-31  2:47   ` Bin Meng
  2014-12-31  7:05     ` Bin Meng
  0 siblings, 1 reply; 46+ messages in thread
From: Bin Meng @ 2014-12-31  2:47 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> This currently assumes that U-Boot resides at the start of ROM. Update
> it to remove this assumption.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  tools/ifdtool.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/ifdtool.c b/tools/ifdtool.c
> index fe8366b..590ccc9 100644
> --- a/tools/ifdtool.c
> +++ b/tools/ifdtool.c
> @@ -788,9 +788,9 @@ static int write_uboot(char *image, int size, struct input_file *uboot,
>                               fdt_strerror(data_size));
>                         return -ENOENT;
>                 }
> -               offset = ucode_ptr - uboot->addr;
> +               offset = (uint32_t)(ucode_ptr + size);
>                 ptr = (void *)image + offset;
> -               ptr[0] = uboot->addr + (data - image);
> +               ptr[0] = (data - image) - size;
>                 ptr[1] = data_size;
>                 debug("Wrote microcode pointer at %x: addr=%x, size=%x\n",
>                       ucode_ptr, ptr[0], ptr[1]);
> --

Reviewed-by: Bin Meng <bmeng.cn@gmail.com>

Tested on Intel Crown Bay by adjusting ROM_SIZE to 2MB

Tested-by:  Bin Meng <bmeng.cn@gmail.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 08/22] x86: video: Add debug option to time the BIOS copy
  2014-12-28  2:20 ` [U-Boot] [PATCH 08/22] x86: video: Add debug option to time the BIOS copy Simon Glass
@ 2014-12-31  3:09   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  3:09 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> This can be very slow - typically 80ms even on a fast machine since it uses
> the SPI flash to read the data. Add an option to display the time taken.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  drivers/pci/pci_rom.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pci_rom.c b/drivers/pci/pci_rom.c
> index 9808bb3..5ba315b 100644
> --- a/drivers/pci/pci_rom.c
> +++ b/drivers/pci/pci_rom.c
> @@ -156,6 +156,8 @@ int pci_rom_load(uint16_t class, struct pci_rom_header *rom_header,
>
>         target = (void *)PCI_VGA_RAM_IMAGE_START;
>         if (target != rom_header) {
> +               ulong start = get_timer(0);
> +
>                 debug("Copying VGA ROM Image from %p to %p, 0x%x bytes\n",
>                       rom_header, target, rom_size);
>                 memcpy(target, rom_header, rom_size);
> @@ -163,6 +165,7 @@ int pci_rom_load(uint16_t class, struct pci_rom_header *rom_header,
>                         printf("VGA ROM copy failed\n");
>                         return -EFAULT;
>                 }
> +               debug("Copy took %lums\n", get_timer(start));
>         }
>         *ram_headerp = target;
>
> --

Reviewed-by: Bin Meng <bmeng.cn@gmail.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 09/22] x86: ivybridge: Only run the Video BIOS when video is enabled
  2014-12-28  2:20 ` [U-Boot] [PATCH 09/22] x86: ivybridge: Only run the Video BIOS when video is enabled Simon Glass
@ 2014-12-31  3:14   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  3:14 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> This takes about about 700ms on link when running natively and 900ms when
> running using the emulator. It is a waste of time if video is not enabled,
> so don't bother running the video BIOS in that case.
>
> We could add a command to run the video BIOS later when needed, but this is
> not considered at present.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  arch/x86/cpu/ivybridge/gma.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/cpu/ivybridge/gma.c b/arch/x86/cpu/ivybridge/gma.c
> index 3d7f740..125021b 100644
> --- a/arch/x86/cpu/ivybridge/gma.c
> +++ b/arch/x86/cpu/ivybridge/gma.c
> @@ -15,6 +15,13 @@
>  #include <asm/pci.h>
>  #include <asm/arch/pch.h>
>  #include <asm/arch/sandybridge.h>
> +#include <linux/kconfig.h>
> +
> +#ifdef CONFIG_VIDEO
> +#define RUN_VIDEO_BIOS         1
> +#else
> +#define RUN_VIDEO_BIOS         1

Intentional? Should this be 0?

> +#endif
>
>  struct gt_powermeter {
>         u16 reg;
> @@ -745,7 +752,15 @@ int gma_func0_init(pci_dev_t dev, struct pci_controller *hose,
>         if (ret)
>                 return ret;
>
> -       ret = pci_run_vga_bios(dev, int15_handler, false);
> +       /*
> +        * TODO: Change to IS_ENABLED(CONFIG_VIDEO) when Kconfig supports
> +        * CONFIG_VIDEO.
> +        */

Or maybe just simply do this? The "if (RUN_VIDEO_BIOS)" looks strange.

#ifdef CONFIG_VIDEO
               start = get_timer(0);
               ret = pci_run_vga_bios(dev, int15_handler, false);
               debug("BIOS ran in %lums\n", get_timer(start));
#endif

> +       if (RUN_VIDEO_BIOS) {
> +               start = get_timer(0);
> +               ret = pci_run_vga_bios(dev, int15_handler, false);
> +               debug("BIOS ran in %lums\n", get_timer(start));
> +       }
>
>         /* Post VBIOS init */
>         ret = gma_pm_init_post_vbios(gtt_bar, blob, node);
> --

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 12/22] x86: pci: Display vesa modes in hex
  2014-12-28  2:20 ` [U-Boot] [PATCH 12/22] x86: pci: Display vesa modes in hex Simon Glass
@ 2014-12-31  3:18   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  3:18 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> The hex value is more commonly understood, so use that instead of decimal.
> Add a 0x prefix to avoid confusion.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  drivers/pci/pci_rom.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pci_rom.c b/drivers/pci/pci_rom.c
> index 5ba315b..7d25cc9 100644
> --- a/drivers/pci/pci_rom.c
> +++ b/drivers/pci/pci_rom.c
> @@ -247,7 +247,7 @@ int pci_run_vga_bios(pci_dev_t dev, int (*int15_handler)(void), bool emulate)
>                 defined(CONFIG_FRAMEBUFFER_VESA_MODE)
>         vesa_mode = CONFIG_FRAMEBUFFER_VESA_MODE;
>  #endif
> -       debug("Selected vesa mode %d\b", vesa_mode);
> +       debug("Selected vesa mode %#x\n", vesa_mode);
>         if (emulate) {
>  #ifdef CONFIG_BIOSEMU
>                 BE_VGAInfo *info;
> @@ -275,7 +275,7 @@ int pci_run_vga_bios(pci_dev_t dev, int (*int15_handler)(void), bool emulate)
>                 return -ENOSYS;
>  #endif
>         }
> -       debug("Final vesa mode %d\n", mode_info.video_mode);
> +       debug("Final vesa mode %#x\n", mode_info.video_mode);
>
>         return 0;
>  }
> --

Reviewed-by: Bin Meng <bmeng.cn@gmail.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 14/22] x86: Add support for MTRRs
  2014-12-28  2:20 ` [U-Boot] [PATCH 14/22] x86: Add support for MTRRs Simon Glass
@ 2014-12-31  3:24   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  3:24 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> Memory Type Range Registers are used to tell the CPU whether memory is
> cacheable and if so the cache write mode to use.
>
> Clean up the existing header file to follow style, and remove the unneeded
> code.
>
> These can speed up booting so should be supported. Add these to global_data
> so they can be requested while booting. We will apply the changes during
> relocation (in a later commit).
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  arch/x86/cpu/Makefile              |   1 +
>  arch/x86/cpu/coreboot/coreboot.c   |  22 +++--
>  arch/x86/cpu/ivybridge/car.S       |  12 +--
>  arch/x86/cpu/mtrr.c                |  81 ++++++++++++++++++
>  arch/x86/include/asm/global_data.h |  15 ++++
>  arch/x86/include/asm/mtrr.h        | 163 +++++++++++++++++--------------------
>  6 files changed, 186 insertions(+), 108 deletions(-)
>  create mode 100644 arch/x86/cpu/mtrr.c
>
> diff --git a/arch/x86/cpu/Makefile b/arch/x86/cpu/Makefile
> index 5033d2b..62e43c0 100644
> --- a/arch/x86/cpu/Makefile
> +++ b/arch/x86/cpu/Makefile
> @@ -17,5 +17,6 @@ obj-$(CONFIG_NORTHBRIDGE_INTEL_SANDYBRIDGE) += ivybridge/
>  obj-$(CONFIG_NORTHBRIDGE_INTEL_IVYBRIDGE) += ivybridge/
>  obj-$(CONFIG_INTEL_QUEENSBAY) += queensbay/
>  obj-y += lapic.o
> +obj-y += mtrr.o
>  obj-$(CONFIG_PCI) += pci.o
>  obj-y += turbo.o
> diff --git a/arch/x86/cpu/coreboot/coreboot.c b/arch/x86/cpu/coreboot/coreboot.c
> index cfacc05..6d06d5a 100644
> --- a/arch/x86/cpu/coreboot/coreboot.c
> +++ b/arch/x86/cpu/coreboot/coreboot.c
> @@ -15,6 +15,7 @@
>  #include <asm/cache.h>
>  #include <asm/cpu.h>
>  #include <asm/io.h>
> +#include <asm/mtrr.h>
>  #include <asm/arch/tables.h>
>  #include <asm/arch/sysinfo.h>
>  #include <asm/arch/timestamp.h>
> @@ -64,11 +65,6 @@ int board_eth_init(bd_t *bis)
>         return pci_eth_init(bis);
>  }
>
> -#define MTRR_TYPE_WP          5
> -#define MTRRcap_MSR           0xfe
> -#define MTRRphysBase_MSR(reg) (0x200 + 2 * (reg))
> -#define MTRRphysMask_MSR(reg) (0x200 + 2 * (reg) + 1)
> -
>  void board_final_cleanup(void)
>  {
>         /* Un-cache the ROM so the kernel has one
> @@ -77,15 +73,17 @@ void board_final_cleanup(void)
>          * Coreboot should have assigned this to the
>          * top available variable MTRR.
>          */
> -       u8 top_mtrr = (native_read_msr(MTRRcap_MSR) & 0xff) - 1;
> -       u8 top_type = native_read_msr(MTRRphysBase_MSR(top_mtrr)) & 0xff;
> +       u8 top_mtrr = (native_read_msr(MTRR_CAP_MSR) & 0xff) - 1;
> +       u8 top_type = native_read_msr(MTRR_PHYS_BASE_MSR(top_mtrr)) & 0xff;
>
>         /* Make sure this MTRR is the correct Write-Protected type */
> -       if (top_type == MTRR_TYPE_WP) {
> -               disable_caches();
> -               wrmsrl(MTRRphysBase_MSR(top_mtrr), 0);
> -               wrmsrl(MTRRphysMask_MSR(top_mtrr), 0);
> -               enable_caches();
> +       if (top_type == MTRR_TYPE_WRPROT) {
> +               struct mtrr_state state;
> +
> +               mtrr_open(&state);
> +               wrmsrl(MTRR_PHYS_BASE_MSR(top_mtrr), 0);
> +               wrmsrl(MTRR_PHYS_MASK_MSR(top_mtrr), 0);
> +               mtrr_close(&state);
>         }
>
>         /* Issue SMI to Coreboot to lock down ME and registers */
> diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
> index dca68e4..72b22ea 100644
> --- a/arch/x86/cpu/ivybridge/car.S
> +++ b/arch/x86/cpu/ivybridge/car.S
> @@ -61,7 +61,7 @@ clear_mtrrs:
>
>         post_code(POST_CAR_MTRR)
>         /* Configure the default memory type to uncacheable */
> -       movl    $MTRRdefType_MSR, %ecx
> +       movl    $MTRR_DEF_TYPE_MSR, %ecx
>         rdmsr
>         andl    $(~0x00000cff), %eax
>         wrmsr
> @@ -76,16 +76,16 @@ clear_mtrrs:
>         post_code(POST_CAR_BASE_ADDRESS)
>         /* Set Cache-as-RAM mask */
>         movl    $(MTRR_PHYS_MASK_MSR(0)), %ecx
> -       movl    $(~(CACHE_AS_RAM_SIZE - 1) | MTRRphysMaskValid), %eax
> +       movl    $(~(CACHE_AS_RAM_SIZE - 1) | MTRR_PHYS_MASK_VALID), %eax
>         movl    $CPU_PHYSMASK_HI, %edx
>         wrmsr
>
>         post_code(POST_CAR_MASK)
>
>         /* Enable MTRR */
> -       movl    $MTRRdefType_MSR, %ecx
> +       movl    $MTRR_DEF_TYPE_MSR, %ecx
>         rdmsr
> -       orl     $MTRRdefTypeEn, %eax
> +       orl     $MTRR_DEF_TYPE_EN, %eax
>         wrmsr
>
>         /* Enable cache (CR0.CD = 0, CR0.NW = 0) */
> @@ -130,7 +130,7 @@ clear_mtrrs:
>
>         movl    $MTRR_PHYS_MASK_MSR(1), %ecx
>         movl    $CPU_PHYSMASK_HI, %edx
> -       movl    $(~(CONFIG_XIP_ROM_SIZE - 1) | MTRRphysMaskValid), %eax
> +       movl    $(~(CONFIG_XIP_ROM_SIZE - 1) | MTRR_PHYS_MASK_VALID), %eax
>         wrmsr
>
>         post_code(POST_CAR_ROM_CACHE)
> @@ -141,7 +141,7 @@ clear_mtrrs:
>         xorl    %edx, %edx
>         wrmsr
>         movl    $MTRR_PHYS_MASK_MSR(2), %ecx
> -       movl    $(CACHE_MRC_MASK | MTRRphysMaskValid), %eax
> +       movl    $(CACHE_MRC_MASK | MTRR_PHYS_MASK_VALID), %eax
>         movl    $CPU_PHYSMASK_HI, %edx
>         wrmsr
>  #endif
> diff --git a/arch/x86/cpu/mtrr.c b/arch/x86/cpu/mtrr.c
> new file mode 100644
> index 0000000..d5a825d1
> --- /dev/null
> +++ b/arch/x86/cpu/mtrr.c
> @@ -0,0 +1,81 @@
> +/*
> + * (C) Copyright 2014 Google, Inc
> + *
> + * SPDX-License-Identifier:    GPL-2.0+
> + *
> + * Memory Type Range Regsters - these are used to tell the CPU whether
> + * memory is cacheable and if so the cache write mode to use.
> + *
> + * These can speed up booting. See the mtrr command.
> + *
> + * Reference: Intel Architecture Software Developer's Manual, Volume 3:
> + * System Programming
> + */
> +
> +#include <common.h>
> +#include <asm/io.h>
> +#include <asm/msr.h>
> +#include <asm/mtrr.h>
> +
> +/* Prepare to adjust MTRRs */
> +void mtrr_open(struct mtrr_state *state)
> +{
> +       state->enable_cache = dcache_status();
> +
> +       if (state->enable_cache)
> +               disable_caches();
> +       state->deftype = native_read_msr(MTRR_DEF_TYPE_MSR);
> +       wrmsrl(MTRR_DEF_TYPE_MSR, state->deftype & ~MTRR_DEF_TYPE_EN);
> +}
> +
> +/* Clean up after adjusting MTRRs, and enable them */
> +void mtrr_close(struct mtrr_state *state)
> +{
> +       wrmsrl(MTRR_DEF_TYPE_MSR, state->deftype | MTRR_DEF_TYPE_EN);
> +       if (state->enable_cache)
> +               enable_caches();
> +}
> +
> +int mtrr_commit(bool do_caches)
> +{
> +       struct mtrr_request *req = gd->arch.mtrr_req;
> +       struct mtrr_state state;
> +       uint64_t mask;
> +       int i;
> +
> +       mtrr_open(&state);
> +       for (i = 0; i < gd->arch.mtrr_req_count; i++, req++) {
> +               mask = ~(req->size - 1);
> +               mask &= (1ULL << CONFIG_CPU_ADDR_BITS) - 1;
> +               wrmsrl(MTRR_PHYS_BASE_MSR(i), req->start | req->type);
> +               wrmsrl(MTRR_PHYS_MASK_MSR(i), mask | MTRR_PHYS_MASK_VALID);
> +       }
> +
> +       /* Clear the ones that are unused */
> +       for (; i < MTRR_COUNT; i++)
> +               wrmsrl(MTRR_PHYS_MASK_MSR(i), 0);
> +       mtrr_close(&state);
> +
> +       return 0;
> +}
> +
> +int mtrr_add_request(int type, uint64_t start, uint64_t size)
> +{
> +       struct mtrr_request *req;
> +       uint64_t mask;
> +
> +       if (gd->arch.mtrr_req_count == MAX_MTRR_REQUESTS)
> +               return -ENOSPC;
> +       req = &gd->arch.mtrr_req[gd->arch.mtrr_req_count++];
> +       req->type = type;
> +       req->start = start;
> +       req->size = size;
> +       debug("%d: type=%d, %08llx  %08llx\n", gd->arch.mtrr_req_count - 1,
> +             req->type, req->start, req->size);
> +       mask = ~(req->size - 1);
> +       mask &= (1ULL << CONFIG_CPU_ADDR_BITS) - 1;
> +       mask |= MTRR_PHYS_MASK_VALID;
> +       debug("   %016llx %016llx\n", req->start | req->type, mask);
> +
> +       return 0;
> +}
> diff --git a/arch/x86/include/asm/global_data.h b/arch/x86/include/asm/global_data.h
> index 03d491a..15e76f6 100644
> --- a/arch/x86/include/asm/global_data.h
> +++ b/arch/x86/include/asm/global_data.h
> @@ -29,6 +29,19 @@ struct memory_info {
>         struct memory_area area[CONFIG_NR_DRAM_BANKS];
>  };
>
> +#define MAX_MTRR_REQUESTS      8
> +
> +/**
> + * A request for a memory region to be set up in a particular way. These
> + * requests are processed before board_init_r() is called. They are generally
> + * optional and can be ignored with some performance impact.
> + */
> +struct mtrr_request {
> +       int type;               /* MTRR_TYPE_... */
> +       uint64_t start;
> +       uint64_t size;
> +};
> +
>  /* Architecture-specific global data */
>  struct arch_global_data {
>         struct global_data *gd_addr;            /* Location of Global Data */
> @@ -50,6 +63,8 @@ struct arch_global_data {
>  #ifdef CONFIG_HAVE_FSP
>         void    *hob_list;              /* FSP HOB list */
>  #endif
> +       struct mtrr_request mtrr_req[MAX_MTRR_REQUESTS];
> +       int mtrr_req_count;
>  };
>
>  #endif
> diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
> index 5f05a48..3c11740 100644
> --- a/arch/x86/include/asm/mtrr.h
> +++ b/arch/x86/include/asm/mtrr.h
> @@ -9,99 +9,86 @@
>  #ifndef _ASM_MTRR_H
>  #define _ASM_MTRR_H
>
> -/*  These are the region types  */
> -#define MTRR_TYPE_UNCACHEABLE 0
> -#define MTRR_TYPE_WRCOMB     1
> -/*#define MTRR_TYPE_         2*/
> -/*#define MTRR_TYPE_         3*/
> -#define MTRR_TYPE_WRTHROUGH  4
> -#define MTRR_TYPE_WRPROT     5
> -#define MTRR_TYPE_WRBACK     6
> -#define MTRR_NUM_TYPES       7
> -
> -#define MTRRcap_MSR     0x0fe
> -#define MTRRdefType_MSR 0x2ff
> -
> -#define MTRRdefTypeEn          (1 << 11)
> -#define MTRRdefTypeFixEn       (1 << 10)
> -
> -#define SMRRphysBase_MSR 0x1f2
> -#define SMRRphysMask_MSR 0x1f3
> -
> -#define MTRRphysBase_MSR(reg) (0x200 + 2 * (reg))
> -#define MTRRphysMask_MSR(reg) (0x200 + 2 * (reg) + 1)
> -
> -#define MTRRphysMaskValid      (1 << 11)
> -
> -#define NUM_FIXED_RANGES 88
> -#define RANGES_PER_FIXED_MTRR 8
> -#define MTRRfix64K_00000_MSR 0x250
> -#define MTRRfix16K_80000_MSR 0x258
> -#define MTRRfix16K_A0000_MSR 0x259
> -#define MTRRfix4K_C0000_MSR 0x268
> -#define MTRRfix4K_C8000_MSR 0x269
> -#define MTRRfix4K_D0000_MSR 0x26a
> -#define MTRRfix4K_D8000_MSR 0x26b
> -#define MTRRfix4K_E0000_MSR 0x26c
> -#define MTRRfix4K_E8000_MSR 0x26d
> -#define MTRRfix4K_F0000_MSR 0x26e
> -#define MTRRfix4K_F8000_MSR 0x26f
> +/* MTRR region types */
> +#define MTRR_TYPE_UNCACHEABLE  0
> +#define MTRR_TYPE_WRCOMB       1
> +#define MTRR_TYPE_WRTHROUGH    4
> +#define MTRR_TYPE_WRPROT       5
> +#define MTRR_TYPE_WRBACK       6
> +
> +#define MTRR_TYPE_COUNT                7
> +
> +#define MTRR_CAP_MSR           0x0fe
> +#define MTRR_DEF_TYPE_MSR      0x2ff
> +
> +#define MTRR_DEF_TYPE_EN       (1 << 11)
> +#define MTRR_DEF_TYPE_FIX_EN   (1 << 10)
> +
> +#define MTRR_PHYS_BASE_MSR(reg)        (0x200 + 2 * (reg))
> +#define MTRR_PHYS_MASK_MSR(reg)        (0x200 + 2 * (reg) + 1)
> +
> +#define MTRR_PHYS_MASK_VALID   (1 << 11)
> +
> +#define MTRR_BASE_TYPE_MASK    0x7
> +
> +/* Number of MTRRs supported */
> +#define MTRR_COUNT             8
>
>  #if !defined(__ASSEMBLER__)
>
> -/*
> - * The MTRR code has some side effects that the callers should be aware for.
> - * 1. The call sequence matters. x86_setup_mtrrs() calls
> - *    x86_setup_fixed_mtrrs_no_enable() then enable_fixed_mtrrs() (equivalent
> - *    of x86_setup_fixed_mtrrs()) then x86_setup_var_mtrrs(). If the callers
> - *    want to call the components of x86_setup_mtrrs() because of other
> - *    rquirements the ordering should still preserved.
> - * 2. enable_fixed_mtrr() will enable both variable and fixed MTRRs because
> - *    of the nature of the global MTRR enable flag. Therefore, all direct
> - *    or indirect callers of enable_fixed_mtrr() should ensure that the
> - *    variable MTRR MSRs do not contain bad ranges.
> - * 3. If CONFIG_CACHE_ROM is selected an MTRR is allocated for enabling
> - *    the caching of the ROM. However, it is set to uncacheable (UC). It
> - *    is the responsiblity of the caller to enable it by calling
> - *    x86_mtrr_enable_rom_caching().
> +/**
> + * Information about the previous MTRR state, set up by mtrr_open()
> + *
> + * @deftype:           Previous value of MTRR_DEF_TYPE_MSR
> + * @enable_cache:      true if cache was enabled
>   */
> -void x86_setup_mtrrs(void);
> -/*
> - * x86_setup_var_mtrrs() parameters:
> - * address_bits - number of physical address bits supported by cpu
> - * above4gb - 2 means dynamically detect number of variable MTRRs available.
> - *            non-zero means handle memory ranges above 4GiB.
> - *            0 means ignore memory ranges above 4GiB
> +struct mtrr_state {
> +       uint64_t deftype;
> +       bool enable_cache;
> +};
> +
> +/**
> + * mtrr_open() - Prepare to adjust MTRRs
> + *
> + * Use mtrr_open() passing in a structure - this function will init it. Then
> + * when done, pass the same structure to mtrr_close() to re-enable MTRRs and
> + * possibly the cache.
> + *
> + * @state:     Empty structure to pass in to hold settings
>   */
> -void x86_setup_var_mtrrs(unsigned int address_bits, unsigned int above4gb);
> -void enable_fixed_mtrr(void);
> -void x86_setup_fixed_mtrrs(void);
> -/* Set up fixed MTRRs but do not enable them. */
> -void x86_setup_fixed_mtrrs_no_enable(void);
> -int x86_mtrr_check(void);
> -/* ROM caching can be used after variable MTRRs are set up. Beware that
> - * enabling CONFIG_CACHE_ROM will eat through quite a few MTRRs based on
> - * one's IO hole size and WRCOMB resources. Be sure to check the console
> - * log when enabling CONFIG_CACHE_ROM or adding WRCOMB resources. Beware that
> - * on CPUs with core-scoped MTRR registers such as hyperthreaded CPUs the
> - * rom caching will be disabled if all threads run the MTRR code. Therefore,
> - * one needs to call x86_mtrr_enable_rom_caching() after all threads of the
> - * same core have run the MTRR code. */
> -#if CONFIG_CACHE_ROM
> -void x86_mtrr_enable_rom_caching(void);
> -void x86_mtrr_disable_rom_caching(void);
> -/* Return the variable range MTRR index of the ROM cache. */
> -long x86_mtrr_rom_cache_var_index(void);
> -#else
> -static inline void x86_mtrr_enable_rom_caching(void) {}
> -static inline void x86_mtrr_disable_rom_caching(void) {}
> -static inline long x86_mtrr_rom_cache_var_index(void) { return -1; }
> -#endif /* CONFIG_CACHE_ROM */
> +void mtrr_open(struct mtrr_state *state);
>
> -#endif
> +/**
> + * mtrr_open() - Clean up after adjusting MTRRs, and enable them
> + *
> + * This uses the structure containing information returned from mtrr_open().
> + *
> + * @state:     Structure from mtrr_open()
> + */
> +/*  */
> +void mtrr_close(struct mtrr_state *state);
> +
> +/**
> + * mtrr_add_request() - Add a new MTRR request
> + *
> + * This adds a request for a memory region to be set up in a particular way.
> + *
> + * @type:      Requested type (MTRR_TYPE_)
> + * @start:     Start address
> + * @size:      Size
> + */
> +int mtrr_add_request(int type, uint64_t start, uint64_t size);
> +
> +/**
> + * mtrr_commit() - set up the MTRR registers based on current requests
> + *
> + * This sets up MTRRs for the available DRAM and the requests received so far.
> + * It must be called with caches disabled.
> + *
> + * @do_caches: true if caches are currently on
> + */
> +int mtrr_commit(bool do_caches);
>
> -#if !defined(CONFIG_RAMTOP)
> -# error "CONFIG_RAMTOP not defined"
>  #endif
>

Can we move the removal of CONFIG_RAMTOP to the patch#2 in this series?

>  #if ((CONFIG_XIP_ROM_SIZE & (CONFIG_XIP_ROM_SIZE - 1)) != 0)
> @@ -114,8 +101,4 @@ static inline long x86_mtrr_rom_cache_var_index(void) { return -1; }
>
>  #define CACHE_ROM_BASE (((1 << 20) - (CONFIG_CACHE_ROM_SIZE >> 12)) << 12)
>
> -#if (CONFIG_RAMTOP & (CONFIG_RAMTOP - 1)) != 0
> -# error "CONFIG_RAMTOP must be a power of 2"
> -#endif
> -

Ditto.

>  #endif
> --

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 15/22] x86: ivybridge: Set up an MTRR for the video frame buffer
  2014-12-28  2:20 ` [U-Boot] [PATCH 15/22] x86: ivybridge: Set up an MTRR for the video frame buffer Simon Glass
@ 2014-12-31  3:27   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  3:27 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> Set the frame buffer to write-combining. This makes it faster, although for
> scrolling write-through is even faster for U-Boot.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  arch/x86/cpu/ivybridge/gma.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/cpu/ivybridge/gma.c b/arch/x86/cpu/ivybridge/gma.c
> index 125021b..03b2ebd 100644
> --- a/arch/x86/cpu/ivybridge/gma.c
> +++ b/arch/x86/cpu/ivybridge/gma.c
> @@ -12,6 +12,7 @@
>  #include <fdtdec.h>
>  #include <pci_rom.h>
>  #include <asm/io.h>
> +#include <asm/mtrr.h>
>  #include <asm/pci.h>
>  #include <asm/arch/pch.h>
>  #include <asm/arch/sandybridge.h>
> @@ -738,6 +739,8 @@ int gma_func0_init(pci_dev_t dev, struct pci_controller *hose,
>                    const void *blob, int node)
>  {
>         void *gtt_bar;
> +       ulong start;
> +       ulong base;
>         u32 reg32;
>         int ret;
>
> @@ -746,6 +749,11 @@ int gma_func0_init(pci_dev_t dev, struct pci_controller *hose,
>         reg32 |= PCI_COMMAND_MASTER | PCI_COMMAND_MEMORY | PCI_COMMAND_IO;
>         pci_write_config32(dev, PCI_COMMAND, reg32);
>
> +       /* Use write-combining for the graphics memory, 256MB */
> +       base = pci_read_bar32(hose, dev, 2);
> +       mtrr_add_request(MTRR_TYPE_WRCOMB, base, 256 << 20);

To make the codes more generic, should we decode the memory size via
bar instead of hardcoded 256MB?

> +       mtrr_commit(true);
> +
>         gtt_bar = (void *)pci_read_bar32(pci_bus_to_hose(0), dev, 0);
>         debug("GT bar %p\n", gtt_bar);
>         ret = gma_pm_init_pre_vbios(gtt_bar);
> --

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2014-12-28  2:20 ` [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance Simon Glass
@ 2014-12-31  5:51   ` Bin Meng
  2015-01-01 22:23     ` Simon Glass
  0 siblings, 1 reply; 46+ messages in thread
From: Bin Meng @ 2014-12-31  5:51 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> For bare platforms we turn off ROM-caching before calling board_init_f_r()
> It is then very slow to copy U-Boot from ROM to RAM. So adjust the order so
> that the copying happens before we turn off ROM-caching.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  common/board_f.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/common/board_f.c b/common/board_f.c
> index 98c9c72..1b65575 100644
> --- a/common/board_f.c
> +++ b/common/board_f.c
> @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
>         INIT_FUNC_WATCHDOG_RESET
>         reloc_fdt,
>         setup_reloc,
> +#ifdef CONFIG_X86
> +       copy_uboot_to_ram,
> +       clear_bss,
> +       do_elf_reloc_fixups,
> +#endif
>  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
>         jump_to_copy,
>  #endif
> @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
>   */
>  static init_fnc_t init_sequence_f_r[] = {
>         init_cache_f_r,
> -       copy_uboot_to_ram,
> -       clear_bss,
> -       do_elf_reloc_fixups,
>
>         NULL,
>  };
> --

I don't understand. Isn't the init_cache_f_r() which turns on the cache?

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 19/22] x86: ivybridge: Add a way to turn off the CAR
  2014-12-28  2:20 ` [U-Boot] [PATCH 19/22] x86: ivybridge: Add a way to turn off the CAR Simon Glass
@ 2014-12-31  5:56   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  5:56 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> Cache-as-RAM should be turned off when we relocate since we want to run from
> RAM. Add a function to perform this task.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  arch/x86/cpu/ivybridge/car.S | 52 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
>
> diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
> index 72b22ea..6e7e1e4 100644
> --- a/arch/x86/cpu/ivybridge/car.S
> +++ b/arch/x86/cpu/ivybridge/car.S
> @@ -163,6 +163,58 @@ wait_for_sipi:
>         /* return */
>         jmp     car_init_ret
>
> +.globl car_uninit
> +car_uninit:
> +       /* Disable cache */
> +       movl    %cr0, %eax
> +       orl     $X86_CR0_CD, %eax
> +       movl    %eax, %cr0
> +
> +       /* Disable MTRRs */
> +       movl    $MTRR_DEF_TYPE_MSR, %ecx
> +       rdmsr
> +       andl    $(~MTRR_DEF_TYPE_EN), %eax
> +       wrmsr
> +
> +       /* Disable the no-eviction run state */
> +       movl    NOEVICTMOD_MSR, %ecx
> +       rdmsr
> +       andl    $~2, %eax
> +       wrmsr
> +
> +       invd
> +
> +       /* Disable the no-eviction mode */
> +       rdmsr
> +       andl    $~1, %eax
> +       wrmsr
> +
> +#ifdef CONFIG_CACHE_MRC_BIN
> +       /* Clear the MTRR that was used to cache MRC */
> +       xorl    %eax, %eax
> +       xorl    %edx, %edx
> +       movl    $MTRR_PHYS_BASE_MSR(2), %ecx
> +       wrmsr
> +       movl    $MTRR_PHYS_MASK_MSR(2), %ecx
> +       wrmsr
> +#endif
> +
> +       /* Enable MTRRs */
> +       movl    $MTRR_DEF_TYPE_MSR, %ecx
> +       rdmsr
> +       orl     $MTRR_DEF_TYPE_EN, %eax
> +       wrmsr
> +
> +       invd
> +
> +       /* Return to the caller */
> +       jmp     *%ebx
> +
> +.Lhlt:
> +       post_code(0xee)
> +       hlt
> +       jmp     .Lhlt

I don't see any codes could jump to this 4 lines above. Remove it?

>  mtrr_table:
>         /* Fixed MTRRs */
>         .word 0x250, 0x258, 0x259
> --

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 20/22] x86: Disable CAR before relocation on platforms that need it
  2014-12-28  2:20 ` [U-Boot] [PATCH 20/22] x86: Disable CAR before relocation on platforms that need it Simon Glass
@ 2014-12-31  6:21   ` Bin Meng
  2015-01-01 22:32     ` Simon Glass
  0 siblings, 1 reply; 46+ messages in thread
From: Bin Meng @ 2014-12-31  6:21 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> For platforms with CAR we should disable it before relocation. Check if
> this function is available and call it if so.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  arch/x86/cpu/start.S | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/cpu/start.S b/arch/x86/cpu/start.S
> index 125782c..8cebde1 100644
> --- a/arch/x86/cpu/start.S
> +++ b/arch/x86/cpu/start.S
> @@ -205,6 +205,16 @@ board_init_f_r_trampoline:
>         /* Setup global descriptor table so gd->xyz works */
>         call    setup_gdt
>
> +       /* Set if we need to disable CAR */
> +       movl    $car_uninit, %eax
> +       cmpl    $0, %eax
> +       jz      car_ret
> +
> +       /* Pass return address in ebx */
> +.weak  car_uninit
> +       movl    $car_ret, %ebx
> +       jmp     car_uninit

Can we use 'call' here instead of jmp and %ebx as the return address?

> +car_ret:

car_uninit_ret

>         /* Re-enter U-Boot by calling board_init_f_r */
>         call    board_init_f_r
>
> --

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot
  2014-12-28  2:20 ` [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot Simon Glass
@ 2014-12-31  6:45   ` Bin Meng
  2014-12-31  7:07     ` Bin Meng
  2015-01-01 22:37     ` Simon Glass
  0 siblings, 2 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  6:45 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> At present the normal update (which happens much later) does not work. This
> seems to have something to do with the 'no eviction' mode in the CAR, or at
> least moving the microcode update after that causes it not to work.
>
> For now, do an update early on so that it definitely works. Also refuse to
> continue unless the microcode update check (later in boot) is successful.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  arch/x86/cpu/ivybridge/car.S             | 14 ++++++++++++++
>  arch/x86/cpu/ivybridge/cpu.c             |  2 +-
>  arch/x86/cpu/ivybridge/microcode_intel.c |  9 +++++++--
>  arch/x86/dts/link.dts                    |  3 ---
>  4 files changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
> index 6e7e1e4..95da087 100644
> --- a/arch/x86/cpu/ivybridge/car.S
> +++ b/arch/x86/cpu/ivybridge/car.S
> @@ -45,6 +45,14 @@ car_init:
>         movl    $0xFEE00300, %esi
>         movl    %eax, (%esi)
>
> +       /* TODO: Load microcode later - the 'no eviction' mode breaks this */
> +       movl    $0x79, %ecx

Replace 0x79 to MSR_IA32_UCODE_WRITE from msr-index.h

> +       xorl    %edx, %edx
> +       movl    $_dt_ucode_base_size, %eax
> +       movl    (%eax), %eax
> +       addl    $0x30, %eax

And here 0x30 to something like MICROCODE_HEADER_LEN.

> +       wrmsr
> +
>         post_code(POST_CAR_SIPI)
>         /* Zero out all fixed range and variable range MTRRs */
>         movl    $mtrr_table, %esi
> @@ -228,3 +236,9 @@ mtrr_table:
>         .word 0x20C, 0x20D, 0x20E, 0x20F
>         .word 0x210, 0x211, 0x212, 0x213
>  mtrr_table_end:
> +
> +       .align 4
> +_dt_ucode_base_size:
> +       /* These next two fields are filled in by ifdtool */
> +       .long   0                       /* microcode base */
> +       .long   0                       /* microcode size */
> diff --git a/arch/x86/cpu/ivybridge/cpu.c b/arch/x86/cpu/ivybridge/cpu.c
> index 0543e06..e925310 100644
> --- a/arch/x86/cpu/ivybridge/cpu.c
> +++ b/arch/x86/cpu/ivybridge/cpu.c
> @@ -263,7 +263,7 @@ int print_cpuinfo(void)
>         enable_lapic();
>
>         ret = microcode_update_intel();

Since we already did the microcode update in car_init, why should we
do this again here?

> -       if (ret && ret != -ENOENT && ret != -EEXIST)
> +       if (ret)
>                 return ret;
>
>         /* Enable upper 128bytes of CMOS */
> diff --git a/arch/x86/cpu/ivybridge/microcode_intel.c b/arch/x86/cpu/ivybridge/microcode_intel.c
> index 0817751..c012820 100644
> --- a/arch/x86/cpu/ivybridge/microcode_intel.c
> +++ b/arch/x86/cpu/ivybridge/microcode_intel.c
> @@ -120,6 +120,7 @@ int microcode_update_intel(void)
>         int count;
>         int node;
>         int ret;
> +       int rev;
>
>         microcode_read_cpu(&cpu);
>         node = 0;
> @@ -147,12 +148,16 @@ int microcode_update_intel(void)
>                         skipped++;
>                         continue;
>                 }
> -               ret = microcode_read_rev();
>                 wrmsr(0x79, (ulong)update.data, 0);
> +               rev = microcode_read_rev();
>                 debug("microcode: updated to revision 0x%x date=%04x-%02x-%02x\n",
> -                     microcode_read_rev(), update.date_code & 0xffff,
> +                     rev, update.date_code & 0xffff,
>                       (update.date_code >> 24) & 0xff,
>                       (update.date_code >> 16) & 0xff);
> +               if (update.update_revision != rev) {
> +                       printf("Microcode update failed\n");
> +                       return -EFAULT;
> +               }
>                 count++;
>         } while (1);
>  }
> diff --git a/arch/x86/dts/link.dts b/arch/x86/dts/link.dts
> index a739080..1ebc334 100644
> --- a/arch/x86/dts/link.dts
> +++ b/arch/x86/dts/link.dts
> @@ -214,9 +214,6 @@
>
>         microcode {
>                 update at 0 {
> -#include "microcode/m12206a7_00000029.dtsi"
> -               };
> -               update at 1 {
>  #include "microcode/m12306a9_0000001b.dtsi"
>                 };
>         };
> --

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 22/22] x86: Add an 'mtrr' command to list and adjust MTRRs
  2014-12-28  2:20 ` [U-Boot] [PATCH 22/22] x86: Add an 'mtrr' command to list and adjust MTRRs Simon Glass
@ 2014-12-31  7:03   ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  7:03 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> It is useful to be able to see the MTRR setup in U-Boot. Add a command
> to list the state of the variable MTRR registers and allow them to be
> changed.

Nice work! This will be a useful debug command.

> Update the documentation to list some of the available commands.
>
> This does not support fixed MTRRs as yet.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
> ---
>
>  common/Makefile   |   1 +
>  common/cmd_mtrr.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  doc/README.x86    |  18 ++++++-
>  3 files changed, 156 insertions(+), 1 deletion(-)
>  create mode 100644 common/cmd_mtrr.c
>
> diff --git a/common/Makefile b/common/Makefile
> index c668a2f..39e4e5f 100644
> --- a/common/Makefile
> +++ b/common/Makefile
> @@ -135,6 +135,7 @@ obj-$(CONFIG_CMD_MMC) += cmd_mmc.o
>  obj-$(CONFIG_CMD_MMC_SPI) += cmd_mmc_spi.o
>  obj-$(CONFIG_MP) += cmd_mp.o
>  obj-$(CONFIG_CMD_MTDPARTS) += cmd_mtdparts.o
> +obj-$(CONFIG_X86) += cmd_mtrr.o
>  obj-$(CONFIG_CMD_NAND) += cmd_nand.o
>  obj-$(CONFIG_CMD_NET) += cmd_net.o
>  obj-$(CONFIG_CMD_ONENAND) += cmd_onenand.o
> diff --git a/common/cmd_mtrr.c b/common/cmd_mtrr.c
> new file mode 100644
> index 0000000..7e0506b
> --- /dev/null
> +++ b/common/cmd_mtrr.c

Should we put cmd_mtrr.c to arch/x86/lib?

> @@ -0,0 +1,138 @@
> +/*
> + * (C) Copyright 2014 Google, Inc
> + *
> + * SPDX-License-Identifier:    GPL-2.0+
> + */
> +
> +#include <common.h>
> +#include <asm/msr.h>
> +#include <asm/mtrr.h>
> +
> +static const char *const mtrr_type_name[MTRR_TYPE_COUNT] = {
> +       "Uncacheable",
> +       "Combine",
> +       "2",
> +       "3",
> +       "Through",
> +       "Protect",
> +       "Back",
> +};
> +
> +static int do_mtrr_list(void)
> +{
> +       int i;
> +
> +       printf("Reg Valid Write-type   %-16s %-16s %-16s\n", "Base   ||",
> +              "Mask   ||", "Size   ||");
> +       for (i = 0; i < MTRR_COUNT; i++) {
> +               const char *type = "Invalid";
> +               uint64_t base, mask, size;
> +               bool valid;
> +
> +               base = native_read_msr(MTRR_PHYS_BASE_MSR(i));
> +               mask = native_read_msr(MTRR_PHYS_MASK_MSR(i));
> +               size = ~mask & ((1ULL << CONFIG_CPU_ADDR_BITS) - 1);
> +               size |= (1 << 12) - 1;
> +               size += 1;
> +               valid = mask & MTRR_PHYS_MASK_VALID;
> +               type = mtrr_type_name[base & MTRR_BASE_TYPE_MASK];
> +               printf("%d   %-5s %-12s %016llx %016llx %016llx\n", i,
> +                      valid ? "Y" : "N", type, base, mask, size);
> +       }
> +
> +       return 0;
> +}
> +
> +static int do_mtrr_set(uint reg, int argc, char * const argv[])
> +{
> +       const char *typename = argv[0];
> +       struct mtrr_state state;
> +       uint32_t start, size;
> +       uint64_t base, mask;
> +       int i, type = -1;
> +       bool valid;
> +
> +       if (argc < 3)
> +               return CMD_RET_USAGE;
> +       for (i = 0; i < MTRR_TYPE_COUNT; i++) {
> +               if (*typename == *mtrr_type_name[i])
> +                       type = i;
> +       }
> +       if (type == -1) {
> +               printf("Invalid type name %s\n", typename);
> +               return CMD_RET_USAGE;
> +       }
> +       start = simple_strtoul(argv[1], NULL, 16);
> +       size = simple_strtoul(argv[2], NULL, 16);
> +
> +       base = start | type;
> +       valid = native_read_msr(MTRR_PHYS_MASK_MSR(reg)) & MTRR_PHYS_MASK_VALID;
> +       mask = ~((uint64_t)size - 1);
> +       mask &= (1ULL << CONFIG_CPU_ADDR_BITS) - 1;
> +       if (valid)
> +               mask |= MTRR_PHYS_MASK_VALID;
> +
> +       printf("base=%llx, mask=%llx\n", base, mask);
> +       mtrr_open(&state);
> +       wrmsrl(MTRR_PHYS_BASE_MSR(reg), base);
> +       wrmsrl(MTRR_PHYS_MASK_MSR(reg), mask);
> +       mtrr_close(&state);
> +
> +       return 0;
> +}
> +
> +static int mtrr_set_valid(int reg, bool valid)
> +{
> +       struct mtrr_state state;
> +       uint64_t mask;
> +
> +       mtrr_open(&state);
> +       mask = native_read_msr(MTRR_PHYS_MASK_MSR(reg));
> +       if (valid)
> +               mask |= MTRR_PHYS_MASK_VALID;
> +       else
> +               mask &= ~MTRR_PHYS_MASK_VALID;
> +       wrmsrl(MTRR_PHYS_MASK_MSR(reg), mask);
> +       mtrr_close(&state);
> +
> +       return 0;
> +}
> +
> +static int do_mtrr(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
> +{
> +       const char *cmd;
> +       uint reg;
> +
> +       cmd = argv[1];
> +       if (argc < 2 || *cmd == 'l')
> +               return do_mtrr_list();
> +       argc -= 2;
> +       argv += 2;
> +       if (argc <= 0)
> +               return CMD_RET_USAGE;
> +       reg = simple_strtoul(argv[0], NULL, 16);
> +       if (reg >= MTRR_COUNT) {
> +               printf("Invalid register number\n");
> +               return CMD_RET_USAGE;
> +       }
> +       if (*cmd == 'e')
> +               return mtrr_set_valid(reg, true);
> +       else if (*cmd == 'd')
> +               return mtrr_set_valid(reg, false);
> +       else if (*cmd == 's')
> +               return do_mtrr_set(reg, argc - 1, argv + 1);
> +       else
> +               return CMD_RET_USAGE;
> +
> +       return 0;
> +}
> +
> +U_BOOT_CMD(
> +       mtrr,   6,      1,      do_mtrr,
> +       "Use x86 memory type range registers (32-bit only)",
> +       "[list]        - list current registers\n"
> +       "set <reg> <type> <start> <size>   - set a register\n"
> +       "\t<type> is Uncacheable, Combine, Through, Protect, Back\n"
> +       "disable <reg>      - disable a register\n"
> +       "ensable <reg>      - enable a register"
> +);
> diff --git a/doc/README.x86 b/doc/README.x86
> index 5fab044..6fbfb90 100644
> --- a/doc/README.x86
> +++ b/doc/README.x86
> @@ -110,9 +110,25 @@ be turned on. Not every device on the board is configured via devie tree, but
>  more and more devices will be added as time goes by. Check out the directory
>  arch/x86/dts/ for these device tree source files.
>
> +Useful Commands
> +---------------
> +
> +In keeping with the U-Boot philosophy of providing functions to check and
> +adjust internal settings, there are several x86-specific commands that may be
> +useful:
> +
> +hob  - Display information about Firmware Support Package (FSP) Hand-off
> +        Block. This is only available on platform which use FSP, mostly

use -> uses

> +        Atom.
> +iod  - Display I/O memory
> +iow  - Write I/O memory
> +mtrr - List and set the Memory Type Range Registers (MTRR). These are used to
> +        tell the CPU whether memory is cacheable and if so the cache write
> +        mode to use. U-Boot sets up some reasonable values but you can
> +        adjust then with this command.
> +
>  TODO List
>  ---------
> -- MTRR support (for performance)
>  - Audio
>  - Chrome OS verified boot
>  - SMI and ACPI support, to provide platform info and facilities to Linux
> --

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 03/22] x86: Correct ifdtool microcode calculation
  2014-12-31  2:47   ` Bin Meng
@ 2014-12-31  7:05     ` Bin Meng
  0 siblings, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  7:05 UTC (permalink / raw)
  To: u-boot

On Wed, Dec 31, 2014 at 10:47 AM, Bin Meng <bmeng.cn@gmail.com> wrote:
> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
>> This currently assumes that U-Boot resides at the start of ROM. Update
>> it to remove this assumption.
>>
>> Signed-off-by: Simon Glass <sjg@chromium.org>
>> ---
>>
>>  tools/ifdtool.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/ifdtool.c b/tools/ifdtool.c
>> index fe8366b..590ccc9 100644
>> --- a/tools/ifdtool.c
>> +++ b/tools/ifdtool.c
>> @@ -788,9 +788,9 @@ static int write_uboot(char *image, int size, struct input_file *uboot,
>>                               fdt_strerror(data_size));
>>                         return -ENOENT;
>>                 }
>> -               offset = ucode_ptr - uboot->addr;
>> +               offset = (uint32_t)(ucode_ptr + size);
>>                 ptr = (void *)image + offset;
>> -               ptr[0] = uboot->addr + (data - image);
>> +               ptr[0] = (data - image) - size;
>>                 ptr[1] = data_size;
>>                 debug("Wrote microcode pointer at %x: addr=%x, size=%x\n",
>>                       ucode_ptr, ptr[0], ptr[1]);
>> --
>
> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
>
> Tested on Intel Crown Bay by adjusting ROM_SIZE to 2MB
>
> Tested-by:  Bin Meng <bmeng.cn@gmail.com>

Oops, one additional space after 'Tested-by:' should be removed.

Tested-by: Bin Meng <bmeng.cn@gmail.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot
  2014-12-31  6:45   ` Bin Meng
@ 2014-12-31  7:07     ` Bin Meng
  2015-01-01 22:37     ` Simon Glass
  1 sibling, 0 replies; 46+ messages in thread
From: Bin Meng @ 2014-12-31  7:07 UTC (permalink / raw)
  To: u-boot

On Wed, Dec 31, 2014 at 2:45 PM, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Simon,
>
> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
>> At present the normal update (which happens much later) does not work. This
>> seems to have something to do with the 'no eviction' mode in the CAR, or at
>> least moving the microcode update after that causes it not to work.
>>
>> For now, do an update early on so that it definitely works. Also refuse to
>> continue unless the microcode update check (later in boot) is successful.
>>
>> Signed-off-by: Simon Glass <sjg@chromium.org>
>> ---
>>
>>  arch/x86/cpu/ivybridge/car.S             | 14 ++++++++++++++
>>  arch/x86/cpu/ivybridge/cpu.c             |  2 +-
>>  arch/x86/cpu/ivybridge/microcode_intel.c |  9 +++++++--
>>  arch/x86/dts/link.dts                    |  3 ---
>>  4 files changed, 22 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
>> index 6e7e1e4..95da087 100644
>> --- a/arch/x86/cpu/ivybridge/car.S
>> +++ b/arch/x86/cpu/ivybridge/car.S
>> @@ -45,6 +45,14 @@ car_init:
>>         movl    $0xFEE00300, %esi
>>         movl    %eax, (%esi)
>>
>> +       /* TODO: Load microcode later - the 'no eviction' mode breaks this */
>> +       movl    $0x79, %ecx
>
> Replace 0x79 to MSR_IA32_UCODE_WRITE from msr-index.h
>
>> +       xorl    %edx, %edx
>> +       movl    $_dt_ucode_base_size, %eax
>> +       movl    (%eax), %eax
>> +       addl    $0x30, %eax
>
> And here 0x30 to something like MICROCODE_HEADER_LEN.
>

I realized UCODE_HEADER_LEN might be better. At least shorter than
microcode :-), and is consistent with the MSR_IA32_UCODE_WRITE.

[snip]

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2014-12-31  5:51   ` Bin Meng
@ 2015-01-01 22:23     ` Simon Glass
  2015-01-04  3:26       ` Bin Meng
  0 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2015-01-01 22:23 UTC (permalink / raw)
  To: u-boot

Hi Bin,

On 30 December 2014 at 22:51, Bin Meng <bmeng.cn@gmail.com> wrote:
>
> Hi Simon,
>
> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
> > For bare platforms we turn off ROM-caching before calling board_init_f_r()
> > It is then very slow to copy U-Boot from ROM to RAM. So adjust the order so
> > that the copying happens before we turn off ROM-caching.
> >
> > Signed-off-by: Simon Glass <sjg@chromium.org>
> > ---
> >
> >  common/board_f.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/common/board_f.c b/common/board_f.c
> > index 98c9c72..1b65575 100644
> > --- a/common/board_f.c
> > +++ b/common/board_f.c
> > @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
> >         INIT_FUNC_WATCHDOG_RESET
> >         reloc_fdt,
> >         setup_reloc,
> > +#ifdef CONFIG_X86
> > +       copy_uboot_to_ram,
> > +       clear_bss,
> > +       do_elf_reloc_fixups,
> > +#endif
> >  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
> >         jump_to_copy,
> >  #endif
> > @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
> >   */
> >  static init_fnc_t init_sequence_f_r[] = {
> >         init_cache_f_r,
> > -       copy_uboot_to_ram,
> > -       clear_bss,
> > -       do_elf_reloc_fixups,
> >
> >         NULL,
> >  };
> > --
>
> I don't understand. Isn't the init_cache_f_r() which turns on the cache?
>

Yes it turns on the cache, but turns off the ROM cache (they are
different things). So this ordering is much faster. I think I might
have more tuning to do though.

Regards,
Simon

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 20/22] x86: Disable CAR before relocation on platforms that need it
  2014-12-31  6:21   ` Bin Meng
@ 2015-01-01 22:32     ` Simon Glass
  0 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2015-01-01 22:32 UTC (permalink / raw)
  To: u-boot

Hi Bin,

On 30 December 2014 at 23:21, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Simon,
>
> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
>> For platforms with CAR we should disable it before relocation. Check if
>> this function is available and call it if so.
>>
>> Signed-off-by: Simon Glass <sjg@chromium.org>
>> ---
>>
>>  arch/x86/cpu/start.S | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/x86/cpu/start.S b/arch/x86/cpu/start.S
>> index 125782c..8cebde1 100644
>> --- a/arch/x86/cpu/start.S
>> +++ b/arch/x86/cpu/start.S
>> @@ -205,6 +205,16 @@ board_init_f_r_trampoline:
>>         /* Setup global descriptor table so gd->xyz works */
>>         call    setup_gdt
>>
>> +       /* Set if we need to disable CAR */
>> +       movl    $car_uninit, %eax
>> +       cmpl    $0, %eax
>> +       jz      car_ret
>> +
>> +       /* Pass return address in ebx */
>> +.weak  car_uninit
>> +       movl    $car_ret, %ebx
>> +       jmp     car_uninit
>
> Can we use 'call' here instead of jmp and %ebx as the return address?

Yes let's do that. The stack must be working so we might as well use it.

>
>> +car_ret:
>
> car_uninit_ret
>

OK

>>         /* Re-enter U-Boot by calling board_init_f_r */
>>         call    board_init_f_r

Regards,
Simon

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot
  2014-12-31  6:45   ` Bin Meng
  2014-12-31  7:07     ` Bin Meng
@ 2015-01-01 22:37     ` Simon Glass
  1 sibling, 0 replies; 46+ messages in thread
From: Simon Glass @ 2015-01-01 22:37 UTC (permalink / raw)
  To: u-boot

Hi  Bin,

On 30 December 2014 at 23:45, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Simon,
>
> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
>> At present the normal update (which happens much later) does not work. This
>> seems to have something to do with the 'no eviction' mode in the CAR, or at
>> least moving the microcode update after that causes it not to work.
>>
>> For now, do an update early on so that it definitely works. Also refuse to
>> continue unless the microcode update check (later in boot) is successful.
>>
>> Signed-off-by: Simon Glass <sjg@chromium.org>
>> ---
>>
>>  arch/x86/cpu/ivybridge/car.S             | 14 ++++++++++++++
>>  arch/x86/cpu/ivybridge/cpu.c             |  2 +-
>>  arch/x86/cpu/ivybridge/microcode_intel.c |  9 +++++++--
>>  arch/x86/dts/link.dts                    |  3 ---
>>  4 files changed, 22 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/cpu/ivybridge/car.S b/arch/x86/cpu/ivybridge/car.S
>> index 6e7e1e4..95da087 100644
>> --- a/arch/x86/cpu/ivybridge/car.S
>> +++ b/arch/x86/cpu/ivybridge/car.S
>> @@ -45,6 +45,14 @@ car_init:
>>         movl    $0xFEE00300, %esi
>>         movl    %eax, (%esi)
>>
>> +       /* TODO: Load microcode later - the 'no eviction' mode breaks this */
>> +       movl    $0x79, %ecx
>
> Replace 0x79 to MSR_IA32_UCODE_WRITE from msr-index.h
>
>> +       xorl    %edx, %edx
>> +       movl    $_dt_ucode_base_size, %eax
>> +       movl    (%eax), %eax
>> +       addl    $0x30, %eax
>
> And here 0x30 to something like MICROCODE_HEADER_LEN.
>
>> +       wrmsr
>> +
>>         post_code(POST_CAR_SIPI)
>>         /* Zero out all fixed range and variable range MTRRs */
>>         movl    $mtrr_table, %esi
>> @@ -228,3 +236,9 @@ mtrr_table:
>>         .word 0x20C, 0x20D, 0x20E, 0x20F
>>         .word 0x210, 0x211, 0x212, 0x213
>>  mtrr_table_end:
>> +
>> +       .align 4
>> +_dt_ucode_base_size:
>> +       /* These next two fields are filled in by ifdtool */
>> +       .long   0                       /* microcode base */
>> +       .long   0                       /* microcode size */
>> diff --git a/arch/x86/cpu/ivybridge/cpu.c b/arch/x86/cpu/ivybridge/cpu.c
>> index 0543e06..e925310 100644
>> --- a/arch/x86/cpu/ivybridge/cpu.c
>> +++ b/arch/x86/cpu/ivybridge/cpu.c
>> @@ -263,7 +263,7 @@ int print_cpuinfo(void)
>>         enable_lapic();
>>
>>         ret = microcode_update_intel();
>
> Since we already did the microcode update in car_init, why should we
> do this again here?

The car_init() version is a work-around that I would like to remove.

This update actually checks that it happened and fails with an error
if not. For car we don't have a way yet of displaying an error.

>
>> -       if (ret && ret != -ENOENT && ret != -EEXIST)
>> +       if (ret)
>>                 return ret;
>>
>>         /* Enable upper 128bytes of CMOS */
>> diff --git a/arch/x86/cpu/ivybridge/microcode_intel.c b/arch/x86/cpu/ivybridge/microcode_intel.c
>> index 0817751..c012820 100644
>> --- a/arch/x86/cpu/ivybridge/microcode_intel.c
>> +++ b/arch/x86/cpu/ivybridge/microcode_intel.c
>> @@ -120,6 +120,7 @@ int microcode_update_intel(void)
>>         int count;
>>         int node;
>>         int ret;
>> +       int rev;
>>
>>         microcode_read_cpu(&cpu);
>>         node = 0;
>> @@ -147,12 +148,16 @@ int microcode_update_intel(void)
>>                         skipped++;
>>                         continue;
>>                 }
>> -               ret = microcode_read_rev();
>>                 wrmsr(0x79, (ulong)update.data, 0);
>> +               rev = microcode_read_rev();
>>                 debug("microcode: updated to revision 0x%x date=%04x-%02x-%02x\n",
>> -                     microcode_read_rev(), update.date_code & 0xffff,
>> +                     rev, update.date_code & 0xffff,
>>                       (update.date_code >> 24) & 0xff,
>>                       (update.date_code >> 16) & 0xff);
>> +               if (update.update_revision != rev) {
>> +                       printf("Microcode update failed\n");
>> +                       return -EFAULT;
>> +               }
>>                 count++;
>>         } while (1);
>>  }
>> diff --git a/arch/x86/dts/link.dts b/arch/x86/dts/link.dts
>> index a739080..1ebc334 100644
>> --- a/arch/x86/dts/link.dts
>> +++ b/arch/x86/dts/link.dts
>> @@ -214,9 +214,6 @@
>>
>>         microcode {
>>                 update at 0 {
>> -#include "microcode/m12206a7_00000029.dtsi"
>> -               };
>> -               update at 1 {
>>  #include "microcode/m12306a9_0000001b.dtsi"
>>                 };
>>         };
>> --
>

Regards,
Simon

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2015-01-01 22:23     ` Simon Glass
@ 2015-01-04  3:26       ` Bin Meng
  2015-01-05  1:44         ` Simon Glass
  0 siblings, 1 reply; 46+ messages in thread
From: Bin Meng @ 2015-01-04  3:26 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Fri, Jan 2, 2015 at 6:23 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Bin,
>
> On 30 December 2014 at 22:51, Bin Meng <bmeng.cn@gmail.com> wrote:
>>
>> Hi Simon,
>>
>> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
>> > For bare platforms we turn off ROM-caching before calling board_init_f_r()
>> > It is then very slow to copy U-Boot from ROM to RAM. So adjust the order so
>> > that the copying happens before we turn off ROM-caching.
>> >
>> > Signed-off-by: Simon Glass <sjg@chromium.org>
>> > ---
>> >
>> >  common/board_f.c | 8 +++++---
>> >  1 file changed, 5 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/common/board_f.c b/common/board_f.c
>> > index 98c9c72..1b65575 100644
>> > --- a/common/board_f.c
>> > +++ b/common/board_f.c
>> > @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
>> >         INIT_FUNC_WATCHDOG_RESET
>> >         reloc_fdt,
>> >         setup_reloc,
>> > +#ifdef CONFIG_X86
>> > +       copy_uboot_to_ram,
>> > +       clear_bss,
>> > +       do_elf_reloc_fixups,
>> > +#endif
>> >  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
>> >         jump_to_copy,
>> >  #endif
>> > @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
>> >   */
>> >  static init_fnc_t init_sequence_f_r[] = {
>> >         init_cache_f_r,
>> > -       copy_uboot_to_ram,
>> > -       clear_bss,
>> > -       do_elf_reloc_fixups,
>> >
>> >         NULL,
>> >  };
>> > --
>>
>> I don't understand. Isn't the init_cache_f_r() which turns on the cache?
>>
>
> Yes it turns on the cache, but turns off the ROM cache (they are
> different things). So this ordering is much faster. I think I might
> have more tuning to do though.
>

Got it. Since we moved these 3 routines from init_sequence_f_r[] to
init_sequence_f[], how about we remove the whole init_sequence_f_r[]
completely? If not possible, the comment block above
init_sequence_f_r[] needs to be fixed?

 *
 * The '_f_r' sequence must, as a minimum, copy U-Boot to RAM (if
 * supported).  It _should_, if possible, copy global data to RAM and
 * initialise the CPU caches (to speed up the relocation process)
 *
 * NOTE: At present only x86 uses this route, but it is intended that
 * all archs will move to this when generic relocation is implemented.
 */

So looks that we should either drop this _f_r stage, or make other
arch use this _f_r?

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2015-01-04  3:26       ` Bin Meng
@ 2015-01-05  1:44         ` Simon Glass
  2015-01-05 13:40           ` Bin Meng
  0 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2015-01-05  1:44 UTC (permalink / raw)
  To: u-boot

Hi Bin,

On 3 January 2015 at 20:26, Bin Meng <bmeng.cn@gmail.com> wrote:
> Hi Simon,
>
> On Fri, Jan 2, 2015 at 6:23 AM, Simon Glass <sjg@chromium.org> wrote:
>> Hi Bin,
>>
>> On 30 December 2014 at 22:51, Bin Meng <bmeng.cn@gmail.com> wrote:
>>>
>>> Hi Simon,
>>>
>>> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
>>> > For bare platforms we turn off ROM-caching before calling board_init_f_r()
>>> > It is then very slow to copy U-Boot from ROM to RAM. So adjust the order so
>>> > that the copying happens before we turn off ROM-caching.
>>> >
>>> > Signed-off-by: Simon Glass <sjg@chromium.org>
>>> > ---
>>> >
>>> >  common/board_f.c | 8 +++++---
>>> >  1 file changed, 5 insertions(+), 3 deletions(-)
>>> >
>>> > diff --git a/common/board_f.c b/common/board_f.c
>>> > index 98c9c72..1b65575 100644
>>> > --- a/common/board_f.c
>>> > +++ b/common/board_f.c
>>> > @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
>>> >         INIT_FUNC_WATCHDOG_RESET
>>> >         reloc_fdt,
>>> >         setup_reloc,
>>> > +#ifdef CONFIG_X86
>>> > +       copy_uboot_to_ram,
>>> > +       clear_bss,
>>> > +       do_elf_reloc_fixups,
>>> > +#endif
>>> >  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
>>> >         jump_to_copy,
>>> >  #endif
>>> > @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
>>> >   */
>>> >  static init_fnc_t init_sequence_f_r[] = {
>>> >         init_cache_f_r,
>>> > -       copy_uboot_to_ram,
>>> > -       clear_bss,
>>> > -       do_elf_reloc_fixups,
>>> >
>>> >         NULL,
>>> >  };
>>> > --
>>>
>>> I don't understand. Isn't the init_cache_f_r() which turns on the cache?
>>>
>>
>> Yes it turns on the cache, but turns off the ROM cache (they are
>> different things). So this ordering is much faster. I think I might
>> have more tuning to do though.
>>
>
> Got it. Since we moved these 3 routines from init_sequence_f_r[] to
> init_sequence_f[], how about we remove the whole init_sequence_f_r[]
> completely? If not possible, the comment block above
> init_sequence_f_r[] needs to be fixed?

I'll remove the comment.

>
>  *
>  * The '_f_r' sequence must, as a minimum, copy U-Boot to RAM (if
>  * supported).  It _should_, if possible, copy global data to RAM and
>  * initialise the CPU caches (to speed up the relocation process)
>  *
>  * NOTE: At present only x86 uses this route, but it is intended that
>  * all archs will move to this when generic relocation is implemented.
>  */
>
> So looks that we should either drop this _f_r stage, or make other
> arch use this _f_r?

I think we should move to generic relocation, meaning that all archs
do relocation the same way with the same code. Then only arch-specific
stuff is the then ELF fixup function, which adjusts a relocation site,
and the code to actually change the stack pointer.

This board_init_f_r() code is part of one attempt to do this - the
premise was that turning the cache on before relocating U-Boot will
save time. That's true, but it would be even better to turn the cache
on much earlier. With pit (Chromebook 2) we turn on the cache in SPL.
So I think turning it on here is too late if we are trying to save
time. Still it is a good start and if we could make it happen
generally it would be nice.

See here for my original attempt, which was tied up with generic
board. In the end I untied them and we got one but not the other.

http://lists.denx.de/pipermail/u-boot/2012-February/118409.html

Since then Albert has tidied up ARM start.S a lot which makes this much easier.

So that's the background. One of these days I might take another look
at it if it doesn't get someone's attention.

Regards,
Simon

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2015-01-05  1:44         ` Simon Glass
@ 2015-01-05 13:40           ` Bin Meng
  2015-01-05 17:22             ` Graeme Russ
  0 siblings, 1 reply; 46+ messages in thread
From: Bin Meng @ 2015-01-05 13:40 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On Mon, Jan 5, 2015 at 9:44 AM, Simon Glass <sjg@chromium.org> wrote:
> Hi Bin,
>
> On 3 January 2015 at 20:26, Bin Meng <bmeng.cn@gmail.com> wrote:
>> Hi Simon,
>>
>> On Fri, Jan 2, 2015 at 6:23 AM, Simon Glass <sjg@chromium.org> wrote:
>>> Hi Bin,
>>>
>>> On 30 December 2014 at 22:51, Bin Meng <bmeng.cn@gmail.com> wrote:
>>>>
>>>> Hi Simon,
>>>>
>>>> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org> wrote:
>>>> > For bare platforms we turn off ROM-caching before calling board_init_f_r()
>>>> > It is then very slow to copy U-Boot from ROM to RAM. So adjust the order so
>>>> > that the copying happens before we turn off ROM-caching.
>>>> >
>>>> > Signed-off-by: Simon Glass <sjg@chromium.org>
>>>> > ---
>>>> >
>>>> >  common/board_f.c | 8 +++++---
>>>> >  1 file changed, 5 insertions(+), 3 deletions(-)
>>>> >
>>>> > diff --git a/common/board_f.c b/common/board_f.c
>>>> > index 98c9c72..1b65575 100644
>>>> > --- a/common/board_f.c
>>>> > +++ b/common/board_f.c
>>>> > @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
>>>> >         INIT_FUNC_WATCHDOG_RESET
>>>> >         reloc_fdt,
>>>> >         setup_reloc,
>>>> > +#ifdef CONFIG_X86
>>>> > +       copy_uboot_to_ram,
>>>> > +       clear_bss,
>>>> > +       do_elf_reloc_fixups,
>>>> > +#endif
>>>> >  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
>>>> >         jump_to_copy,
>>>> >  #endif
>>>> > @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
>>>> >   */
>>>> >  static init_fnc_t init_sequence_f_r[] = {
>>>> >         init_cache_f_r,
>>>> > -       copy_uboot_to_ram,
>>>> > -       clear_bss,
>>>> > -       do_elf_reloc_fixups,
>>>> >
>>>> >         NULL,
>>>> >  };
>>>> > --
>>>>
>>>> I don't understand. Isn't the init_cache_f_r() which turns on the cache?
>>>>
>>>
>>> Yes it turns on the cache, but turns off the ROM cache (they are
>>> different things). So this ordering is much faster. I think I might
>>> have more tuning to do though.
>>>
>>
>> Got it. Since we moved these 3 routines from init_sequence_f_r[] to
>> init_sequence_f[], how about we remove the whole init_sequence_f_r[]
>> completely? If not possible, the comment block above
>> init_sequence_f_r[] needs to be fixed?
>
> I'll remove the comment.
>
>>
>>  *
>>  * The '_f_r' sequence must, as a minimum, copy U-Boot to RAM (if
>>  * supported).  It _should_, if possible, copy global data to RAM and
>>  * initialise the CPU caches (to speed up the relocation process)
>>  *
>>  * NOTE: At present only x86 uses this route, but it is intended that
>>  * all archs will move to this when generic relocation is implemented.
>>  */
>>
>> So looks that we should either drop this _f_r stage, or make other
>> arch use this _f_r?
>
> I think we should move to generic relocation, meaning that all archs
> do relocation the same way with the same code. Then only arch-specific
> stuff is the then ELF fixup function, which adjusts a relocation site,
> and the code to actually change the stack pointer.
>
> This board_init_f_r() code is part of one attempt to do this - the
> premise was that turning the cache on before relocating U-Boot will
> save time. That's true, but it would be even better to turn the cache
> on much earlier. With pit (Chromebook 2) we turn on the cache in SPL.
> So I think turning it on here is too late if we are trying to save
> time. Still it is a good start and if we could make it happen
> generally it would be nice.
>
> See here for my original attempt, which was tied up with generic
> board. In the end I untied them and we got one but not the other.
>
> http://lists.denx.de/pipermail/u-boot/2012-February/118409.html
>
> Since then Albert has tidied up ARM start.S a lot which makes this much easier.
>
> So that's the background. One of these days I might take another look
> at it if it doesn't get someone's attention.
>
> Regards,
> Simon

Thanks for the background information. I will take a look. Hope we can
achieve generic board support as soon as we can.

Regards,
Bin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2015-01-05 13:40           ` Bin Meng
@ 2015-01-05 17:22             ` Graeme Russ
  2015-01-05 17:33               ` Simon Glass
  0 siblings, 1 reply; 46+ messages in thread
From: Graeme Russ @ 2015-01-05 17:22 UTC (permalink / raw)
  To: u-boot

Hi Simon & Bin,

On Tue, Jan 6, 2015 at 12:40 AM, Bin Meng <bmeng.cn@gmail.com> wrote:

> Hi Simon,
>
> On Mon, Jan 5, 2015 at 9:44 AM, Simon Glass <sjg@chromium.org> wrote:
> > Hi Bin,
> >
> > On 3 January 2015 at 20:26, Bin Meng <bmeng.cn@gmail.com> wrote:
> >> Hi Simon,
> >>
> >> On Fri, Jan 2, 2015 at 6:23 AM, Simon Glass <sjg@chromium.org> wrote:
> >>> Hi Bin,
> >>>
> >>> On 30 December 2014 at 22:51, Bin Meng <bmeng.cn@gmail.com> wrote:
> >>>>
> >>>> Hi Simon,
> >>>>
> >>>> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org>
> wrote:
> >>>> > For bare platforms we turn off ROM-caching before calling
> board_init_f_r()
> >>>> > It is then very slow to copy U-Boot from ROM to RAM. So adjust the
> order so
> >>>> > that the copying happens before we turn off ROM-caching.
> >>>> >
> >>>> > Signed-off-by: Simon Glass <sjg@chromium.org>
> >>>> > ---
> >>>> >
> >>>> >  common/board_f.c | 8 +++++---
> >>>> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >>>> >
> >>>> > diff --git a/common/board_f.c b/common/board_f.c
> >>>> > index 98c9c72..1b65575 100644
> >>>> > --- a/common/board_f.c
> >>>> > +++ b/common/board_f.c
> >>>> > @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
> >>>> >         INIT_FUNC_WATCHDOG_RESET
> >>>> >         reloc_fdt,
> >>>> >         setup_reloc,
> >>>> > +#ifdef CONFIG_X86
> >>>> > +       copy_uboot_to_ram,
> >>>> > +       clear_bss,
> >>>> > +       do_elf_reloc_fixups,
> >>>> > +#endif
> >>>> >  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
> >>>> >         jump_to_copy,
> >>>> >  #endif
> >>>> > @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
> >>>> >   */
> >>>> >  static init_fnc_t init_sequence_f_r[] = {
> >>>> >         init_cache_f_r,
> >>>> > -       copy_uboot_to_ram,
> >>>> > -       clear_bss,
> >>>> > -       do_elf_reloc_fixups,
> >>>> >
> >>>> >         NULL,
> >>>> >  };
> >>>> > --
>

Wow, doesn't this bring back some memories. I've had a look over this code
as it is now and it took a while to sink in. Things have moved on in the
past 2 years :)


> >>>>
> >>>> I don't understand. Isn't the init_cache_f_r() which turns on the
> cache?
> >>>>
> >>>
> >>> Yes it turns on the cache, but turns off the ROM cache (they are
> >>> different things). So this ordering is much faster. I think I might
> >>> have more tuning to do though.
> >>>
> >>
> >> Got it. Since we moved these 3 routines from init_sequence_f_r[] to
> >> init_sequence_f[], how about we remove the whole init_sequence_f_r[]
> >> completely? If not possible, the comment block above
> >> init_sequence_f_r[] needs to be fixed?
> >
> > I'll remove the comment.
>

I think init_sequence_f_r can go... but I need to have a better look at the
code

If turning off the ROM cache by init_cache_f_r is the problem, then perhaps
the following order would be better:

  copy_uboot_to_ram,
  init_cache_f_r,
  clear_bss,
  do_elf_reloc_fixups,

Without enabling the CPU's cache, clear_bss and do_elf_reloc_fixups will
suffer.

>
> >>
> >>  *
> >>  * The '_f_r' sequence must, as a minimum, copy U-Boot to RAM (if
> >>  * supported).  It _should_, if possible, copy global data to RAM and
> >>  * initialise the CPU caches (to speed up the relocation process)
> >>  *
> >>  * NOTE: At present only x86 uses this route, but it is intended that
> >>  * all archs will move to this when generic relocation is implemented.
> >>  */
> >>
> >> So looks that we should either drop this _f_r stage, or make other
> >> arch use this _f_r?
> >
> > I think we should move to generic relocation, meaning that all archs
> > do relocation the same way with the same code. Then only arch-specific
> > stuff is the then ELF fixup function, which adjusts a relocation site,
> > and the code to actually change the stack pointer.
>

This was always my plan - have arch specific do_reloc_fixups and the rest
would be generic


> >
> > This board_init_f_r() code is part of one attempt to do this - the
> > premise was that turning the cache on before relocating U-Boot will
> > save time. That's true, but it would be even better to turn the cache
> > on much earlier. With pit (Chromebook 2) we turn on the cache in SPL.
> > So I think turning it on here is too late if we are trying to save
> > time. Still it is a good start and if we could make it happen
> > generally it would be nice.
>

And now you have me thinking board_init_f_r is not needed at all...

If we can setup the stack, clear BSS, copy U-Boot to RAM and perform
relocation fixups while running from ROM, what is left for board_init_f_r
to do?

The only thing I can think of is the caveats mentioned in the comment
('static' variables are read-only / Global Data (gd->xxx) is read/write).
But aren't these true when running from ROM?

Looking at non-x86 arches, relocate_code() looks to do what
copy_uboot_to_ram and clear_bss does in x86 land (not sure about
do_elf_reloc_fixups - seems to be arch specific as to whether
relocate_code() does this or it is done later (hence
the CONFIG_NEEDS_MANUAL_RELOC #define?)

>
> > See here for my original attempt, which was tied up with generic
> > board. In the end I untied them and we got one but not the other.
> >
> > http://lists.denx.de/pipermail/u-boot/2012-February/118409.html
> >
>

Ah, generic relocation... I really wish my life hadn't taken a hard-left
turn when it did. We got so close.

From where I'm looking (fresh eyes - I might be missing something big) we
should be able to do the ROM->RAM copy, BSS clearing, and relocation fixups
in board_init_f.

When we return from board_init_f it should be a fairly simple matter of:
 - Copying the contents of the Global Data structure from CAR to RAM (or
from RAM to RAM if your on a platform which initialises RAM before U-Boot)
 - Set the gd pointer (reserved register) to point to the new copy
 - Figure out where board_init_r is and jump to it

board_init_r should be able to do any remaining cache tweaks. If cache
tweaks cannot be done while executing from RAM then they need to be done in
board_init_f

I just cannot see the point of board_init_f_r any more


> > Since then Albert has tidied up ARM start.S a lot which makes this much
> easier.
> >
> > So that's the background. One of these days I might take another look
> > at it if it doesn't get someone's attention.
>

Oh dear - it looks like I just put my hand up :)


> >
> > Regards,
> > Simon
>
> Thanks for the background information. I will take a look. Hope we can
> achieve generic board support as soon as we can.
>
> Regards,
> Bin
>
>
Regards,

Graeme

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2015-01-05 17:22             ` Graeme Russ
@ 2015-01-05 17:33               ` Simon Glass
  2015-01-05 17:46                 ` Graeme Russ
  0 siblings, 1 reply; 46+ messages in thread
From: Simon Glass @ 2015-01-05 17:33 UTC (permalink / raw)
  To: u-boot

Hi,

On 5 January 2015 at 10:22, Graeme Russ <gruss@tss-engineering.com> wrote:
> Hi Simon & Bin,
>
> On Tue, Jan 6, 2015 at 12:40 AM, Bin Meng <bmeng.cn@gmail.com> wrote:
>>
>> Hi Simon,
>>
>> On Mon, Jan 5, 2015 at 9:44 AM, Simon Glass <sjg@chromium.org> wrote:
>> > Hi Bin,
>> >
>> > On 3 January 2015 at 20:26, Bin Meng <bmeng.cn@gmail.com> wrote:
>> >> Hi Simon,
>> >>
>> >> On Fri, Jan 2, 2015 at 6:23 AM, Simon Glass <sjg@chromium.org> wrote:
>> >>> Hi Bin,
>> >>>
>> >>> On 30 December 2014 at 22:51, Bin Meng <bmeng.cn@gmail.com> wrote:
>> >>>>
>> >>>> Hi Simon,
>> >>>>
>> >>>> On Sun, Dec 28, 2014 at 10:20 AM, Simon Glass <sjg@chromium.org>
>> >>>> wrote:
>> >>>> > For bare platforms we turn off ROM-caching before calling
>> >>>> > board_init_f_r()
>> >>>> > It is then very slow to copy U-Boot from ROM to RAM. So adjust the
>> >>>> > order so
>> >>>> > that the copying happens before we turn off ROM-caching.
>> >>>> >
>> >>>> > Signed-off-by: Simon Glass <sjg@chromium.org>
>> >>>> > ---
>> >>>> >
>> >>>> >  common/board_f.c | 8 +++++---
>> >>>> >  1 file changed, 5 insertions(+), 3 deletions(-)
>> >>>> >
>> >>>> > diff --git a/common/board_f.c b/common/board_f.c
>> >>>> > index 98c9c72..1b65575 100644
>> >>>> > --- a/common/board_f.c
>> >>>> > +++ b/common/board_f.c
>> >>>> > @@ -983,6 +983,11 @@ static init_fnc_t init_sequence_f[] = {
>> >>>> >         INIT_FUNC_WATCHDOG_RESET
>> >>>> >         reloc_fdt,
>> >>>> >         setup_reloc,
>> >>>> > +#ifdef CONFIG_X86
>> >>>> > +       copy_uboot_to_ram,
>> >>>> > +       clear_bss,
>> >>>> > +       do_elf_reloc_fixups,
>> >>>> > +#endif
>> >>>> >  #if !defined(CONFIG_ARM) && !defined(CONFIG_SANDBOX)
>> >>>> >         jump_to_copy,
>> >>>> >  #endif
>> >>>> > @@ -1042,9 +1047,6 @@ void board_init_f(ulong boot_flags)
>> >>>> >   */
>> >>>> >  static init_fnc_t init_sequence_f_r[] = {
>> >>>> >         init_cache_f_r,
>> >>>> > -       copy_uboot_to_ram,
>> >>>> > -       clear_bss,
>> >>>> > -       do_elf_reloc_fixups,
>> >>>> >
>> >>>> >         NULL,
>> >>>> >  };
>> >>>> > --
>
>
> Wow, doesn't this bring back some memories. I've had a look over this code
> as it is now and it took a while to sink in. Things have moved on in the
> past 2 years :)

Nice to hear from you again!

>
>>
>> >>>>
>> >>>> I don't understand. Isn't the init_cache_f_r() which turns on the
>> >>>> cache?
>> >>>>
>> >>>
>> >>> Yes it turns on the cache, but turns off the ROM cache (they are
>> >>> different things). So this ordering is much faster. I think I might
>> >>> have more tuning to do though.
>> >>>
>> >>
>> >> Got it. Since we moved these 3 routines from init_sequence_f_r[] to
>> >> init_sequence_f[], how about we remove the whole init_sequence_f_r[]
>> >> completely? If not possible, the comment block above
>> >> init_sequence_f_r[] needs to be fixed?
>> >
>> > I'll remove the comment.
>
>
> I think init_sequence_f_r can go... but I need to have a better look at the
> code
>
> If turning off the ROM cache by init_cache_f_r is the problem, then perhaps
> the following order would be better:
>
>   copy_uboot_to_ram,
>   init_cache_f_r,
>   clear_bss,
>   do_elf_reloc_fixups,
>
> Without enabling the CPU's cache, clear_bss and do_elf_reloc_fixups will
> suffer.

Better would be to have init_cache_f_r after all this I think.

>
>> >
>> >>
>> >>  *
>> >>  * The '_f_r' sequence must, as a minimum, copy U-Boot to RAM (if
>> >>  * supported).  It _should_, if possible, copy global data to RAM and
>> >>  * initialise the CPU caches (to speed up the relocation process)
>> >>  *
>> >>  * NOTE: At present only x86 uses this route, but it is intended that
>> >>  * all archs will move to this when generic relocation is implemented.
>> >>  */
>> >>
>> >> So looks that we should either drop this _f_r stage, or make other
>> >> arch use this _f_r?
>> >
>> > I think we should move to generic relocation, meaning that all archs
>> > do relocation the same way with the same code. Then only arch-specific
>> > stuff is the then ELF fixup function, which adjusts a relocation site,
>> > and the code to actually change the stack pointer.
>
>
> This was always my plan - have arch specific do_reloc_fixups and the rest
> would be generic
>
>>
>> >
>> > This board_init_f_r() code is part of one attempt to do this - the
>> > premise was that turning the cache on before relocating U-Boot will
>> > save time. That's true, but it would be even better to turn the cache
>> > on much earlier. With pit (Chromebook 2) we turn on the cache in SPL.
>> > So I think turning it on here is too late if we are trying to save
>> > time. Still it is a good start and if we could make it happen
>> > generally it would be nice.
>
>
> And now you have me thinking board_init_f_r is not needed at all...
>
> If we can setup the stack, clear BSS, copy U-Boot to RAM and perform
> relocation fixups while running from ROM, what is left for board_init_f_r to
> do?
>
> The only thing I can think of is the caveats mentioned in the comment
> ('static' variables are read-only / Global Data (gd->xxx) is read/write).
> But aren't these true when running from ROM?
>
> Looking at non-x86 arches, relocate_code() looks to do what
> copy_uboot_to_ram and clear_bss does in x86 land (not sure about
> do_elf_reloc_fixups - seems to be arch specific as to whether
> relocate_code() does this or it is done later (hence the
> CONFIG_NEEDS_MANUAL_RELOC #define?)

Yes this can be unified. There is still something in there though for
board_init_f_r(), at least as things are now. It just so happens on
x86 that we are running from ROM and CAR so as I understand it we
sort-of have the cache on before relocation. That doesn't apply for
coreboot though, so there is probably an optimisation to be made
somewhere.

>
>> >
>> > See here for my original attempt, which was tied up with generic
>> > board. In the end I untied them and we got one but not the other.
>> >
>> > http://lists.denx.de/pipermail/u-boot/2012-February/118409.html
>> >
>
>
> Ah, generic relocation... I really wish my life hadn't taken a hard-left
> turn when it did. We got so close.
>
> From where I'm looking (fresh eyes - I might be missing something big) we
> should be able to do the ROM->RAM copy, BSS clearing, and relocation fixups
> in board_init_f.
>
> When we return from board_init_f it should be a fairly simple matter of:
>  - Copying the contents of the Global Data structure from CAR to RAM (or
> from RAM to RAM if your on a platform which initialises RAM before U-Boot)
>  - Set the gd pointer (reserved register) to point to the new copy
>  - Figure out where board_init_r is and jump to it
>
> board_init_r should be able to do any remaining cache tweaks. If cache
> tweaks cannot be done while executing from RAM then they need to be done in
> board_init_f
>
> I just cannot see the point of board_init_f_r any more

Yes, it's hard to see, I'm not sure either.

Anyway I'm going to apply this patch while we figure out what further
work can be done.

Regards,
Simon

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2015-01-05 17:33               ` Simon Glass
@ 2015-01-05 17:46                 ` Graeme Russ
  2015-01-05 18:02                   ` Simon Glass
  0 siblings, 1 reply; 46+ messages in thread
From: Graeme Russ @ 2015-01-05 17:46 UTC (permalink / raw)
  To: u-boot

Hi Simon,

On 06/01/15 04:33, Simon Glass wrote:

>
> Nice to hear from you again!
>

Thanks - Good to be back in the game

>>>>>>> I don't understand. Isn't the init_cache_f_r() which turns on the
>>>>>>> cache?
>>>>>>>
>>>>>>
>>>>>> Yes it turns on the cache, but turns off the ROM cache (they are
>>>>>> different things). So this ordering is much faster. I think I might
>>>>>> have more tuning to do though.
>>>>>>
>>>>>
>>>>> Got it. Since we moved these 3 routines from init_sequence_f_r[] to
>>>>> init_sequence_f[], how about we remove the whole init_sequence_f_r[]
>>>>> completely? If not possible, the comment block above
>>>>> init_sequence_f_r[] needs to be fixed?
>>>>
>>>> I'll remove the comment.
>>
>>
>> I think init_sequence_f_r can go... but I need to have a better look at the
>> code
>>
>> If turning off the ROM cache by init_cache_f_r is the problem, then perhaps
>> the following order would be better:
>>
>>    copy_uboot_to_ram,
>>    init_cache_f_r,
>>    clear_bss,
>>    do_elf_reloc_fixups,
>>
>> Without enabling the CPU's cache, clear_bss and do_elf_reloc_fixups will
>> suffer.
>
> Better would be to have init_cache_f_r after all this I think.

Why so? After copying the U-Boot code from ROM to RAM, we would want RAM 
based operations to be as fast as possible. By this point, we have 
passed most of the XIP code, so XIP performance _should_ be a non-issue.

>>>> This board_init_f_r() code is part of one attempt to do this - the
>>>> premise was that turning the cache on before relocating U-Boot will
>>>> save time. That's true, but it would be even better to turn the cache
>>>> on much earlier. With pit (Chromebook 2) we turn on the cache in SPL.
>>>> So I think turning it on here is too late if we are trying to save
>>>> time. Still it is a good start and if we could make it happen
>>>> generally it would be nice.
>>
>>
>> And now you have me thinking board_init_f_r is not needed at all...
>>
>> If we can setup the stack, clear BSS, copy U-Boot to RAM and perform
>> relocation fixups while running from ROM, what is left for board_init_f_r to
>> do?
>>
>> The only thing I can think of is the caveats mentioned in the comment
>> ('static' variables are read-only / Global Data (gd->xxx) is read/write).
>> But aren't these true when running from ROM?
>>
>> Looking at non-x86 arches, relocate_code() looks to do what
>> copy_uboot_to_ram and clear_bss does in x86 land (not sure about
>> do_elf_reloc_fixups - seems to be arch specific as to whether
>> relocate_code() does this or it is done later (hence the
>> CONFIG_NEEDS_MANUAL_RELOC #define?)
>
> Yes this can be unified. There is still something in there though for
> board_init_f_r(), at least as things are now. It just so happens on
> x86 that we are running from ROM and CAR so as I understand it we
> sort-of have the cache on before relocation. That doesn't apply for
> coreboot though, so there is probably an optimisation to be made
> somewhere.

U-Boot runs from RAM right from the start in coreboot correct? In which 
case, coreboot builds can probably do away with all the relocation code 
I assume coreboot can relocate U-Boot to any place in memory. Last time 
I looked, coreboot had some ELF processing code.

>>>> See here for my original attempt, which was tied up with generic
>>>> board. In the end I untied them and we got one but not the other.
>>>>
>>>> http://lists.denx.de/pipermail/u-boot/2012-February/118409.html
>>>>
>>
>>
>> Ah, generic relocation... I really wish my life hadn't taken a hard-left
>> turn when it did. We got so close.
>>
>>  From where I'm looking (fresh eyes - I might be missing something big) we
>> should be able to do the ROM->RAM copy, BSS clearing, and relocation fixups
>> in board_init_f.
>>
>> When we return from board_init_f it should be a fairly simple matter of:
>>   - Copying the contents of the Global Data structure from CAR to RAM (or
>> from RAM to RAM if your on a platform which initialises RAM before U-Boot)
>>   - Set the gd pointer (reserved register) to point to the new copy
>>   - Figure out where board_init_r is and jump to it
>>
>> board_init_r should be able to do any remaining cache tweaks. If cache
>> tweaks cannot be done while executing from RAM then they need to be done in
>> board_init_f
>>
>> I just cannot see the point of board_init_f_r any more
>
> Yes, it's hard to see, I'm not sure either.
>
> Anyway I'm going to apply this patch while we figure out what further
> work can be done.

Sounds good to me.

The pity is that I don't have ANY hardware to test any of this any more, 
and it looks like any development I do in the foreseeable future will be 
on ARM. So hacking and testing x86 might be a bit of a problem

Regards,

Graeme

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance
  2015-01-05 17:46                 ` Graeme Russ
@ 2015-01-05 18:02                   ` Simon Glass
  0 siblings, 0 replies; 46+ messages in thread
From: Simon Glass @ 2015-01-05 18:02 UTC (permalink / raw)
  To: u-boot

Hi Graeme,

On 5 January 2015 at 10:46, Graeme Russ <graeme.russ@gmail.com> wrote:
> Hi Simon,
>
> On 06/01/15 04:33, Simon Glass wrote:
>
>>
>> Nice to hear from you again!
>>
>
> Thanks - Good to be back in the game
>
>>>>>>>> I don't understand. Isn't the init_cache_f_r() which turns on the
>>>>>>>> cache?
>>>>>>>>
>>>>>>>
>>>>>>> Yes it turns on the cache, but turns off the ROM cache (they are
>>>>>>> different things). So this ordering is much faster. I think I might
>>>>>>> have more tuning to do though.
>>>>>>>
>>>>>>
>>>>>> Got it. Since we moved these 3 routines from init_sequence_f_r[] to
>>>>>> init_sequence_f[], how about we remove the whole init_sequence_f_r[]
>>>>>> completely? If not possible, the comment block above
>>>>>> init_sequence_f_r[] needs to be fixed?
>>>>>
>>>>>
>>>>> I'll remove the comment.
>>>
>>>
>>>
>>> I think init_sequence_f_r can go... but I need to have a better look at
>>> the
>>> code
>>>
>>> If turning off the ROM cache by init_cache_f_r is the problem, then
>>> perhaps
>>> the following order would be better:
>>>
>>>    copy_uboot_to_ram,
>>>    init_cache_f_r,
>>>    clear_bss,
>>>    do_elf_reloc_fixups,
>>>
>>> Without enabling the CPU's cache, clear_bss and do_elf_reloc_fixups will
>>> suffer.
>>
>>
>> Better would be to have init_cache_f_r after all this I think.
>
>
> Why so? After copying the U-Boot code from ROM to RAM, we would want RAM
> based operations to be as fast as possible. By this point, we have passed
> most of the XIP code, so XIP performance _should_ be a non-issue.

Well I figure that the code is still running from ROM, and with the
ROM cache off this is almost exquisitely slow. On the other hand we
don't access the same DRAM location twice when clearing BSS and fixing
up relocations.

That said, I have not actually benchmarked this.

>
>
>>>>> This board_init_f_r() code is part of one attempt to do this - the
>>>>> premise was that turning the cache on before relocating U-Boot will
>>>>> save time. That's true, but it would be even better to turn the cache
>>>>> on much earlier. With pit (Chromebook 2) we turn on the cache in SPL.
>>>>> So I think turning it on here is too late if we are trying to save
>>>>> time. Still it is a good start and if we could make it happen
>>>>> generally it would be nice.
>>>
>>>
>>>
>>> And now you have me thinking board_init_f_r is not needed at all...
>>>
>>> If we can setup the stack, clear BSS, copy U-Boot to RAM and perform
>>> relocation fixups while running from ROM, what is left for board_init_f_r
>>> to
>>> do?
>>>
>>> The only thing I can think of is the caveats mentioned in the comment
>>> ('static' variables are read-only / Global Data (gd->xxx) is read/write).
>>> But aren't these true when running from ROM?
>>>
>>> Looking at non-x86 arches, relocate_code() looks to do what
>>> copy_uboot_to_ram and clear_bss does in x86 land (not sure about
>>> do_elf_reloc_fixups - seems to be arch specific as to whether
>>> relocate_code() does this or it is done later (hence the
>>> CONFIG_NEEDS_MANUAL_RELOC #define?)
>>
>>
>> Yes this can be unified. There is still something in there though for
>> board_init_f_r(), at least as things are now. It just so happens on
>> x86 that we are running from ROM and CAR so as I understand it we
>> sort-of have the cache on before relocation. That doesn't apply for
>> coreboot though, so there is probably an optimisation to be made
>> somewhere.
>
>
> U-Boot runs from RAM right from the start in coreboot correct? In which
> case, coreboot builds can probably do away with all the relocation code I
> assume coreboot can relocate U-Boot to any place in memory. Last time I
> looked, coreboot had some ELF processing code.

Sure but U-Boot expects to relocate, and wants to decide where to put
itself (e.g. allowing room for other things). We use the 'flat binary'
method for Coreboot, so U-Boot is not actually an ELF image.

I think there are some changes that could be made to speed things up
with Coreboot (e.g. leaving the cache on).

>
>>>>> See here for my original attempt, which was tied up with generic
>>>>> board. In the end I untied them and we got one but not the other.
>>>>>
>>>>> http://lists.denx.de/pipermail/u-boot/2012-February/118409.html
>>>>>
>>>
>>>
>>> Ah, generic relocation... I really wish my life hadn't taken a hard-left
>>> turn when it did. We got so close.
>>>
>>>  From where I'm looking (fresh eyes - I might be missing something big)
>>> we
>>> should be able to do the ROM->RAM copy, BSS clearing, and relocation
>>> fixups
>>> in board_init_f.
>>>
>>> When we return from board_init_f it should be a fairly simple matter of:
>>>   - Copying the contents of the Global Data structure from CAR to RAM (or
>>> from RAM to RAM if your on a platform which initialises RAM before
>>> U-Boot)
>>>   - Set the gd pointer (reserved register) to point to the new copy
>>>   - Figure out where board_init_r is and jump to it
>>>
>>> board_init_r should be able to do any remaining cache tweaks. If cache
>>> tweaks cannot be done while executing from RAM then they need to be done
>>> in
>>> board_init_f
>>>
>>> I just cannot see the point of board_init_f_r any more
>>
>>
>> Yes, it's hard to see, I'm not sure either.
>>
>> Anyway I'm going to apply this patch while we figure out what further
>> work can be done.
>
>
> Sounds good to me.
>
> The pity is that I don't have ANY hardware to test any of this any more, and
> it looks like any development I do in the foreseeable future will be on ARM.
> So hacking and testing x86 might be a bit of a problem

ARM is a bit saner anyway :-)  The binary blob thing on x86 is
painful. I still do most stuff on ARM. But it's been interesting for
me learning about x86 again after all these years.

Regards,
Simon

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2015-01-05 18:02 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-28  2:20 [U-Boot] [PATCH 0/22] x86: Add support for MTRRs Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 02/22] x86: Drop RAMTOP Kconfig Simon Glass
2014-12-30 15:05   ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 03/22] x86: Correct ifdtool microcode calculation Simon Glass
2014-12-31  2:47   ` Bin Meng
2014-12-31  7:05     ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 04/22] x86: video: Add support for CONFIG_CONSOLE_SCROLL_LINES Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 05/22] x86: config: Always scroll the display by 5 lines, for speed Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 06/22] x86: video: Add a debug() to display the frame buffer address Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 07/22] x86: pci: Don't return a vesa mode when there is not video Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 08/22] x86: video: Add debug option to time the BIOS copy Simon Glass
2014-12-31  3:09   ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 09/22] x86: ivybridge: Only run the Video BIOS when video is enabled Simon Glass
2014-12-31  3:14   ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 10/22] x86: Use cache, don't clear the display in video BIOS Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 11/22] x86: Tidy up VESA mode numbers Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 12/22] x86: pci: Display vesa modes in hex Simon Glass
2014-12-31  3:18   ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 13/22] x86: ivybridge: Drop support for ROM caching Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 14/22] x86: Add support for MTRRs Simon Glass
2014-12-31  3:24   ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 15/22] x86: ivybridge: Set up an MTRR for the video frame buffer Simon Glass
2014-12-31  3:27   ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 16/22] x86: board_f: Adjust x86 boot order for performance Simon Glass
2014-12-31  5:51   ` Bin Meng
2015-01-01 22:23     ` Simon Glass
2015-01-04  3:26       ` Bin Meng
2015-01-05  1:44         ` Simon Glass
2015-01-05 13:40           ` Bin Meng
2015-01-05 17:22             ` Graeme Russ
2015-01-05 17:33               ` Simon Glass
2015-01-05 17:46                 ` Graeme Russ
2015-01-05 18:02                   ` Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 17/22] x86: ivybridge: Request MTRRs for DRAM regions Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 18/22] x86: Commit the current MTRRs before relocation Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 19/22] x86: ivybridge: Add a way to turn off the CAR Simon Glass
2014-12-31  5:56   ` Bin Meng
2014-12-28  2:20 ` [U-Boot] [PATCH 20/22] x86: Disable CAR before relocation on platforms that need it Simon Glass
2014-12-31  6:21   ` Bin Meng
2015-01-01 22:32     ` Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 21/22] x86: ivybridge: Update microcode early in boot Simon Glass
2014-12-31  6:45   ` Bin Meng
2014-12-31  7:07     ` Bin Meng
2015-01-01 22:37     ` Simon Glass
2014-12-28  2:20 ` [U-Boot] [PATCH 22/22] x86: Add an 'mtrr' command to list and adjust MTRRs Simon Glass
2014-12-31  7:03   ` Bin Meng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.